This tool scans your source code for comments with a specific syntax and generates cross-referenced AsciiDoc documentation based on templates that you define.
|The Developer Guide explains how to build and modify the source code; refer to the User Guide to learn how to use this software.|
|This document, like the User Guide, has been generated using docra itself and can also serve as a demonstration of how this tool can be used.|
This project uses CMake as its build system. Version 3.0 or newer is required, since at least the following CMake features are expected:
- command mode (cmake -E)
- automatic download of a prebuilt flex (only on Windows)
- CPACK_RPM_FILE_NAME in CPack
Docra is written in C++11, so a compatible compiler is required; GCC, Clang, and Visual Studio are known to work. In particular, the compiler must support the C++11 features discussed in the implementation notes below.
The flex lexer generator is required for compiling; when building on Windows, a prebuilt version will be downloaded automatically.
If asciidoctor is installed, docra will use it to render its own documentation as HTML and man pages. The older asciidoc can also be used as a fallback, but currently only to generate the man pages.
If pygmentize is installed, the docra source code will be rendered as HTML, and the generated HTML cross-reference will link to these local files instead of the public repository. (To make this work, compile the project once, then re-run cmake in the build directory so that it can find the docra executable you just compiled and run it on its own source code.)
Once you have all the prerequisites, you can launch the build system.
Create a build directory
mkdir -p build/dir && cd "$_"
Configure the build system in it.
Visual Studio, as of version 15.5, is capable of opening and configuring CMake projects directly: just open the source code folder and, once configured, build from the IDE.
The build system provides the following targets:
- Run docra on its own source code to render documentation templates into AsciiDoc files.
- Generate HTML documentation from the generated AsciiDoc source.
- Generate man pages from the generated AsciiDoc source.
- Build lexer library for AsciiDoc input.
- Build lexer library for #-commented languages.
- Build lexer library for C-like languages.
- Build lexer library for INI-like input.
- Build lexer library for Verilog input.
- Build lexer library for VHDL input.
- Build lexer library for XML.
- Build lexer library for YACC-like grammars.
- Build the main executable.
- Default CMake installation target.
- Remove files copied by the install target.
The software is written in C++. Since this tool depends on asciidoctor, an initial idea was to write it in Ruby; however, I had never written anything in Ruby before and decided to stick with a known language instead. This also means that the docra executable runs standalone with no runtime dependencies (although you will need asciidoctor anyway to turn its output into anything usable). Being plain C++ also means the main tool can already be compiled to JS (via emscripten), allowing it, in the future, to be distributed and run as an extension inside editors like Visual Studio Code and Atom.
More specifically, we use C++11 (this is in fact my first C++11 project); the main reason for moving on from C++98 was to finally have access to the <filesystem> library (at the time still the filesystem TS, standardized later in C++17). Other than that and the occasional auto, we do not use many C++11 features.
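As a concrete illustration of what <filesystem> buys over C++98 string splicing, here is a minimal sketch. The fs alias matches the one used in the signatures quoted later in this guide, but the doc_output function itself is invented for illustration; the header spelling below is the C++17 one, while the project predates it and used the filesystem TS.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;  // C++17 spelling; the project used the filesystem TS

// Turn a source file path into the name of its rendered documentation file.
// Portable path manipulation of this kind is what C++98 lacked entirely.
fs::path doc_output(const fs::path &source) {
    fs::path out = source;           // e.g. "src/main.cpp"
    out.replace_extension(".adoc");  // "src/main.adoc"
    return out.filename();           // "main.adoc"
}
```

The same logic with raw strings would need manual handling of separators and extensions on each platform.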
|Like many side projects, the quality of this code suffered a lot from being worked on only in tiny chunks of an hour or less after work. A lot could be cleaned up.|
This section has been generated using the
The low-level input to the generic
Hierarchy level. This field is also set by the low-level lexers, since they know how to deconstruct the hierarchy for each language.
Current lexing position within a chunk. This iterator is repositioned by the tokenizer.
Line within the input file. This field is set by the low-level lexers.
The text contained by the chunk. This field is filled by the specific low-level lexer.
Only set in pre-parsed tags; the tag will be just copied into the resulting token.
ConfigSections::iterator begin(const std::string &sec);
Returns → an iterator over all sections called sec
const ConfigEntries &get(const std::string &sec, const std::string &sub);
Map from a file path to the root of its scope tree (populated by the parser)
typedef std::map<fs::path, Scope> FileScopes;
List of tokens describing this parameter
Code block describing this parameter
This param is listed in a group (and should not be rendered by itself but inside the group)
Takes a stream of tokens to create a parse tree.
Reference to the current
Reference to the current
Document-wide state for a given prefix
Representation of a generic lexical scope within a file
List of scopes semantically contained in this one
Source location of scope start
Scope containing this one
Points to the suffix corresponding to the scope tag (optional)
Scope* add_child(int lineno);
Insert a new scope below the current one in the hierarchy
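A minimal sketch of what such a scope tree could look like; apart from add_child(), whose declaration appears above, the member layout here is an assumption, not the project's actual one.

```cpp
#include <memory>
#include <vector>

// Hypothetical reduction of the Scope node described above.
struct Scope {
    int lineno = 0;                                // source location of scope start
    Scope *parent = nullptr;                       // scope containing this one
    std::vector<std::unique_ptr<Scope>> children;  // scopes semantically contained here

    // Insert a new scope below the current one and return it.
    Scope *add_child(int line) {
        children.emplace_back(new Scope);
        Scope *c = children.back().get();
        c->lineno = line;
        c->parent = this;
        return c;
    }
};
```

Owning children through unique_ptr keeps raw parent pointers stable while the tree grows.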
All data attached to a given (prefix, id, suffix) tuple
Code block describing this entity
Long description of this entity
Short description of this entity
If this tag refers to code, the corresponding line number
Local symbols valid within the scope of this full tag
Path to the file that this tag belongs to
bool render_codeblock(std::ostream &out, const Tokens &desc, Tokens::const_iterator &it) const;
bool render_description(const Config &cfg, std::ostream &out, const Tokens &desc) const;
Pointer into the scope tree to where this suffix was defined
The full resolved tag identifying this suffix
All data assigned to a given tag id
All the suffixes, in the order they were inserted
bool render_xref(const Config &cfg, std::ostream &out) const;
bool render_xref_yacc(const Config &cfg, std::ostream &out) const;
Map from an optional suffix to the associated data
Comparison operator for source locations
inline bool operator == (const Location &a, const Location &b);
Returns → 'true' if both operands point to the same source location
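A sketch of what the Location record and its comparison might look like; only the operator's signature is taken from the declaration above, the field names are assumptions.

```cpp
#include <string>

// Hypothetical shape of the Location record; the real struct may differ.
struct Location {
    std::string file;  // path of the source file
    int line;          // line within it
};

// Comparison operator for source locations: equal only when both
// operands point to the same line of the same file.
inline bool operator==(const Location &a, const Location &b) {
    return a.file == b.file && a.line == b.line;
}
```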
Language-agnostic base class for each lexer
Chunk produced by the low-level lexer
This can briefly differ from chunk.depth; the lexer will generate SCOPE_XXX tokens until they are balanced.
Cumulative depth variation to be applied at the next eol
Handle to the input file. We use stdio's FILE since that's what flex expects.
virtual void init(const fs::path &path, FILE *file) = 0;
This is called by the lower-level lexers.
virtual bool next_chunk() = 0;
Returns → 'false' on EOF, 'true' otherwise
Produce a new token.
Returns → 'false' at the end of input, 'true' otherwise
The last token can be accessed at
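The next_chunk()/next_token() contract can be illustrated with a toy stand-in. The class and its members below are invented for the example; the real base class is richer (its init() takes an fs::path and a FILE*, since the input side is driven by flex).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Minimal stand-in mirroring the interface described above.
struct MiniLexer {
    std::vector<std::string> chunks;  // pretend low-level lexer output
    std::size_t pos = 0;
    std::string last;                 // last token produced

    // Returns false on EOF, true otherwise.
    bool next_chunk() { return pos < chunks.size(); }

    // Produce a new token; returns false at the end of input.
    bool next_token() {
        if (!next_chunk()) return false;
        last = chunks[pos++];
        return true;
    }
};
```

A driver then simply loops: while (lx.next_token()) { /* consume lx.last */ }.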
Tokens are matched using C++11 regular expressions. Two regexes are used; the first one is:
static std::regex re_comment(
    "(<\\?)([a-z]*)"                        // (1)
    "|"
    "(\\?>)"                                // (2)
    "|"
    "``([^`]+)``"                           // (3)
    "|"
    "(\r?\n)"                               // (4)
    "|"
    "("                                     // (5)
        "(([^\\s`]*)`([^`]*)`([^\\s:]*))(:)?"
    ")"
);
Tags are formatted as prefix`id`suffix: (alternative (5) above). The id can be empty; the suffix and the colon can be omitted. The full syntax is explained in (TODO).
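The grouping above can be exercised directly. In this sketch the helper function and the sample tag fn`main`: are invented, but the pattern is the one shown, with sub-matches 8, 9 and 10 carrying prefix, id and suffix respectively.

```cpp
#include <regex>
#include <string>

// The same pattern as re_comment, written as one raw string literal.
static const std::regex re_comment(
    R"((<\?)([a-z]*)|(\?>)|``([^`]+)``|(\r?\n)|((([^\s`]*)`([^`]*)`([^\s:]*))(:)?))");

// Extract prefix/id/suffix from the first tag found in 'text'.
// Sub-match numbering follows the grouping above: 8 = prefix, 9 = id, 10 = suffix.
bool match_tag(const std::string &text, std::string &prefix,
               std::string &id, std::string &suffix) {
    std::smatch m;
    if (!std::regex_search(text, m, re_comment) || !m[7].matched)
        return false;  // no match, or a different alternative matched
    prefix = m[8].str();
    id = m[9].str();
    suffix = m[10].str();
    return true;
}
```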
The second regex is:
static std::regex re_comment_boc(
    "(<\\?)([a-z]*)"                        // (1)
    "|"
    "(\\?>)"                                // (2)
    "|"
    "``([^`]*)``"                           // (3)
    "|"
    "(\r?\n)"                               // (4)
    "|"
    "("                                     // (5)
        "(([^\\s`]*)`([^`]*)`([^\\s:]*))(:)?"
        "|"
        "^(([A-Z]+):)"  // get rid of this exception! It only creates problems!
    ")"
);
This syntax matches tags in the form ABCD: at the beginning of a comment.
In this case, ABCD becomes the prefix, the id is empty, and the colon is not optional anymore. This syntax is only used to mark positions in the code as TODOs, FIXMEs and so forth.
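The extra alternative can be tried in isolation; the helper name below is invented.

```cpp
#include <regex>
#include <string>

// Just the extra alternative added in re_comment_boc.
static const std::regex re_upper("^(([A-Z]+):)");

// Return the uppercase prefix ("TODO", "FIXME", ...) or "" if there is none.
std::string upper_prefix(const std::string &comment) {
    std::smatch m;
    if (std::regex_search(comment, m, re_upper))
        return m[2].str();
    return "";
}
```

The ^ anchor is what restricts this form to the beginning of a comment.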
For more details, see (TODO).
It can happen that this returns 'true' while the last chunk is invalid; this is in fact what is returned on EOF.
Code chunks are passed on to the parser as code tokens with no modification; only comments are parsed.
Some rules only apply at the beginning of a comment; in those cases, the corresponding flag is set accordingly.
In C++11 regex, anything that comes after the end of the match constitutes the 'suffix'.
If a sub-match's 'matched' attribute is true, then it contains a valid value.
Setting the iterator to the beginning of the prefix will move the tokenizer forward.
If there's nothing left to parse, the tokenizer will consume the next chunk.
'true' is returned because one token was already successfully produced.
The only way to reset a match variable seems to be to swap it with a new, local variable that will be cleaned up outside this scope.
If the regex matches somewhere after the beginning of the string, 'prefix' will be set accordingly. This part of the string will be returned as a simple text token.
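The overall tokenizer mechanics described in these notes (emit the regex prefix as plain text, emit the match, restart from the suffix, reset the match variable by swapping) can be sketched with a deliberately simplified tag pattern; the token spellings below are invented.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Tokenize a comment line: tag matches become TAG tokens, everything
// before a match (the regex 'prefix') is emitted as plain text.
std::vector<std::string> tokenize(std::string text) {
    static const std::regex re("(\\w+)`(\\w*)`");  // simplified tag regex
    std::vector<std::string> out;
    std::smatch m;
    while (std::regex_search(text, m, re)) {
        if (m.prefix().length())
            out.push_back("TEXT:" + m.prefix().str());
        out.push_back("TAG:" + m[1].str() + "/" + m[2].str());
        text = m.suffix().str();  // move the tokenizer forward
        std::smatch fresh;
        std::swap(m, fresh);      // reset the match variable, as noted above
    }
    if (!text.empty()) out.push_back("TEXT:" + text);
    return out;
}
```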
Lexer recognizing AsciiDoc comment syntax
struct Lexer_Adoc : public Lexer
Lexer for C-like languages
struct Lexer_CLike : public Lexer
Dummy lexer to test generic logic only
struct Lexer_Dummy : public Lexer
Lexer for #-commented languages
struct Lexer_Hash : public Lexer
Lexer for 'ini-like' configuration files
struct Lexer_Ini : public Lexer
Specific lexer for Verilog and SystemVerilog
struct Lexer_Verilog : public Lexer
Specific lexer for VHDL
struct Lexer_Vhdl : public Lexer
Lexer for Yacc/Bison grammars
struct Lexer_Yacc : public Lexer
Since a terminal symbol can be identified both by an identifier and by an arbitrary string, this can be used to find the identifier starting from the string.
The identifier for the token or rule currently being processed
A pattern is a list of terminals and nonterminals; a rule can match several patterns
Points to a token and holds its (optional) string representation
void init(const fs::path &path, FILE *file);
- l (I)
- p (I)
Parse the current document
- root_scope (I) Scope representing the document
Returns → always 'true' (TODO: should actually return false on error)
First tag inside a scope that inherited the tag from its parent: if we are on the same line as the scope opening, then we are the actual tag for this scope!
First tag inside a scope that does not have one assigned yet is assigned to the scope itself (if within a couple of lines from the start).
If this tag follows some code, AND that code is not already tagged, then consider this tag to belong to that line of code.
If this tag does NOT have some code already associated with it, and it's not a '#' tag, then the code will come later, but we must already create a suffix for it! (Or we have nowhere to put any subsequent "? " comments.)
If the scope starts on a line that was tagged, and that tag is NOT already matched, use that tag for the scope itself. (Add an exception for the case where the tag is on the SAME line as the scope, because in this case the two clearly belong together.)
A DOC_STR inside a block is just added verbatim to that block.
If a comment starts with a tag, it starts a tagged block. Comments on the lines immediately following it should be considered part of the same block (but they must start with a space! This is a simple way to separate them from commented code).
If a "? " comment occurs inside a block, it will be rendered in AsciiDoc as a numbered code block callout (but all I have to do here is add it to the description; the renderer will do the rest).
If a comment starts with "? ", it starts a hierarchical block (or it continues the previous one; there is no difference). As such, it should associate with the innermost containing scope that has a suffix associated with it. Recurse up the scope stack to find where this belongs.
Comments that end up here did NOT start with "? "; they will be accepted as docstrings only if they are preceded by a line belonging to a documentation block.
Additional conditions for valid comments are that:
- either they are empty (and will count as an empty line in the resulting AsciiDoc),
- or they start with an arrow sign, in which case they are documenting a function return value.
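The acceptance rules above boil down to a few string checks. Here is a hedged sketch of them as one predicate; the function name is invented, and the arrow is spelled "->" here even though the rendered documentation uses "→".

```cpp
#include <string>

// Hypothetical reduction of the rules above: a comment continues a
// documentation block if it is empty, starts with a space (continuation
// line), starts with "? " (hierarchical block), or starts with an arrow
// (return-value documentation).
bool continues_doc_block(const std::string &comment) {
    if (comment.empty()) return true;                   // counts as an empty line
    if (comment[0] == ' ') return true;                 // continuation line
    if (comment.compare(0, 2, "? ") == 0) return true;  // hierarchical block
    if (comment.compare(0, 2, "->") == 0) return true;  // return value doc
    return false;
}
```

The leading-space rule is what separates doc continuations from commented-out code.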
Aggregates comments collected from all files found in the command line input
Suffix *insert_tag(const Tag &tag, Token &&token, Scope *scope);
Suffix *locate(const std::string &prefix, const std::string &symbol, const std::string &suffix = "");
Suffix *locate(const Tag &tag);
Symbol *locate_symbol(const std::string &prefix, const std::string &symbol);
Symbol *locate_symbol(const Tag &tag);
Recursively scan a directory looking for docra tags
bool analyze_directory(int verb, const fs::path &path, int level);
Returns → 'false' on error, 'true' otherwise
Lex and parse the given file
bool analyze_file(int verb, const fs::path &path, int level);
Returns → 'false' on error, 'true' otherwise
Association of each file path in the project with its syntax tree
Identifier of a documented entity
A piece of input recognized by the lexer
A single line of code
kind goes through the
Entry point of the program
int main(int argc, char *argv[]);