Website Navigation Generator

This page contains documentation and notes on the menugen.py script, written in 2/2011 during our attempt to finally get the new Lumiera website online. The initial draft version was written by cehteh in Lua.

The purpose of the menugen script is to maintain the navigation menu on the Lumiera website semi-automatically. In the usual setup, this script is triggered from a Git push; it walks the web subdirectories and discovers menu entries. The generated HTML page contains both visible elements and JavaScript snippets to display and highlight the menu appropriately on the client side.

Overview: how it works

Menu generation and display consist of several parts working together:

the build_website.sh is triggered as a Git post-receive hook whenever new commits are transferred to the website Git repository. After discovering new Asciidoc source files and generating the corresponding HTML files, the menu generator script is invoked
the menugen python script walks the subdirectories to discover possible menu contents. It visits Asciidoc source files (*.txt) and picks up
- the location / URL
- the title
- special //MENU: directives embedded in Asciidoc comments
after building a complete menu tree (actually a DAG), this data structure is walked to generate output HTML into a menu.html file in website root.
the page template (page.conf) for generated Asciidoc pages contains an <IFrame> to display this menu.html
when menu.html is loaded, JavaScript elements generated into the body alongside the visible content execute and populate a lookup table in client-side memory with the menu entries and their parent dependencies. Each individual menu entry carries a unique ID, originally generated by the server-side menugen script. The client-side JavaScript always addresses elements directly through these IDs, mostly ignoring the actual DOM structure
whenever a new webpage is loaded, the onload handler on the <IFrame> (or a similar mechanism) invokes the markPageInMenu() JavaScript function, which addresses the IFrame by its ID inavi and calls into the JavaScript located there. This script in turn finds the menu entry corresponding to the current page with the help of the lookup table mentioned above; this makes it possible to highlight the current page and to fold all other branches of the menu, keeping the visible part small enough to fit on a single page
folding and highlighting changes are done by manipulating the style of these elements; the actual presentation is mostly controlled by a menu.css
any further JavaScript functions used to operate the menu are located in the statically served menu.js — the generated menu contains only the “moving parts”
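
The discovery step described in this overview can be sketched roughly as follows. This is an illustration only, not the actual menugen.py code: the function name `discover_pages`, the layout of the returned dictionaries, the `.html` suffix of the generated URL, and the assumption that the page title is the first non-empty, non-comment line of the Asciidoc source are all hypothetical.

```python
import os
import tempfile

MENU_PREFIX = "//MENU:"   # directive marker inside an Asciidoc comment line


def discover_pages(webroot):
    """Walk webroot and collect URL, title and //MENU: directives per page.

    Hypothetical helper for illustration; the real script may work differently.
    """
    pages = []
    for dirpath, dirnames, filenames in os.walk(webroot):
        dirnames.sort()                       # deterministic traversal order
        for name in sorted(filenames):
            if not name.endswith(".txt"):     # Asciidoc sources use *.txt
                continue
            source = os.path.join(dirpath, name)
            title, directives = None, []
            with open(source, encoding="utf-8") as lines:
                for line in lines:
                    text = line.strip()
                    if text.startswith(MENU_PREFIX):
                        directives.append(text[len(MENU_PREFIX):].strip())
                    elif title is None and text and not text.startswith("//"):
                        title = text          # assumption: first text line is the title
            rel = os.path.relpath(source, webroot)
            url = os.path.splitext(rel)[0].replace(os.sep, "/") + ".html"
            pages.append({"url": url, "title": title, "directives": directives})
    return pages


# Small demonstration on a throwaway directory tree
with tempfile.TemporaryDirectory() as webroot:
    os.makedirs(os.path.join(webroot, "project"))
    with open(os.path.join(webroot, "project", "faq.txt"), "w", encoding="utf-8") as out:
        out.write("Frequently Asked Questions\n==========================\n"
                  "//MENU: label FAQ\n\nSome text.\n")
    pages = discover_pages(webroot)

print(pages)
```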

Configuring menu generation

While, generally speaking, the script was written to remove the need to care for the menu most of the time, there are numerous extension points and configuration options to deal with special cases. Adjustments can be done on several levels:

the menugen python script contains an embedded set of predefined menu entries, forming the backbone of the generated menu. The use of this feature is optional and can be enabled with the -p or --predefined switch. These predefined configuration steps are done in a function addPredefined() right at the top; the configuration is written in the style of an internal DSL and should be fairly self-explanatory.
when discovering Asciidoc page sources, special //MENU: directives are processed (// marks an Asciidoc comment). The remainder of such a line is always parsed as a single directive; in case of a parsing error a warning is printed and the line will be ignored. The individual directives mostly correspond to similar functions usable in the aforementioned internal DSL; actually both kinds of configuration have the same effect: they attach some modification command to the menu element in question. Note especially that such directives can modify the discovery of further pages — pages can be attached, excluded, ordered; and the discovery can be redirected to another subdirectory.
the actual code generation is mostly based on python template code contained in a separate script menuformat.py — located alongside the main menu generator script. This code generation is driven by a classical recursive tree visitation over the menu data structure built up thus far; code generation hooks are called on each tree leaf and when entering and leaving inner nodes (submenu nodes).
the highlighting is done by the client side JavaScript in js/menu.js — mostly just by adding or removing CSS classes dynamically. The actual styling of the menu entries is thus largely independent of the menu generation (but of course the CSS selectors must line up with the basic structure of the generated code). The current version of this CSS stylesheet makes heavy use of contextual selectors and the general cascading mechanism to build up the final style; e.g. the indentation according to the menu level is done by attaching a style based on the number of nested HTML elements.
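
The recursive visitation used for code generation might look roughly like the sketch below. The classes `MenuNode` and `TextFormatter` and the hook names `leaf`, `enter_submenu` and `leave_submenu` are hypothetical stand-ins; the real hooks live in menuformat.py and produce HTML and JavaScript rather than indented text.

```python
class MenuNode:
    """Minimal stand-in for a menu node (hypothetical, not the real Node class)."""
    def __init__(self, node_id, label=None, children=()):
        self.id = node_id
        self.label = label or node_id
        self.children = list(children)


class TextFormatter:
    """Collects output lines; one hook per visitation event."""
    def __init__(self):
        self.lines = []

    def leaf(self, node, depth):
        self.lines.append("  " * depth + "- " + node.label)

    def enter_submenu(self, node, depth):
        self.lines.append("  " * depth + "+ " + node.label)

    def leave_submenu(self, node, depth):
        pass  # an HTML formatter would close the submenu container here


def visit(node, formatter, depth=0):
    """Classical recursive tree walk, calling the formatter hooks."""
    if not node.children:
        formatter.leaf(node, depth)
        return
    formatter.enter_submenu(node, depth)
    for child in node.children:
        visit(child, formatter, depth + 1)
    formatter.leave_submenu(node, depth)


menu = MenuNode("root", "Lumiera", [
    MenuNode("project", children=[MenuNode("faq"), MenuNode("press")]),
    MenuNode("download"),
])
fmt = TextFormatter()
visit(menu, fmt)
print("\n".join(fmt.lines))
```

Keeping the formatter separate from the walk means a plaintext and an HTML output can share the same traversal.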

Summary of menu placement directives

With the term placement directives we denote all the adjustments and configuration possible either through the internal DSL for the predefined menu structure, or through the //MENU: lines in the individual pages.

addressing menu nodes

Each menu entry corresponds to a menu node in the internal data structure. In the most general case, this structure is a Directed Acyclic Graph, because a node might be hooked up below several different parent nodes. In this case, such a node will also be visited multiple times for code generation — one time for each parent it is attached below. Amongst these parent nodes, the first parent node attached is called the primary parent, because this first attachment of a node defines the logical path uniquely describing this node. Note, this logical path can be different to the actual web paths / URLs generated, and also be different to the file system path where the source file resides. It is just defined by the chain of parent nodes leading to the root of the menu data structure.
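
A minimal sketch may help to picture the primary parent and the repeated visitation. The class `DagNode` and its methods are hypothetical and much simpler than the real node class; the sketch only assumes that the first attachment defines the logical path.

```python
class DagNode:
    """Hypothetical node that may hang below several parents."""
    def __init__(self, node_id):
        self.id = node_id
        self.parents = []     # first entry is the primary parent
        self.children = []

    def link_child(self, child):
        self.children.append(child)
        child.parents.append(self)
        return child

    def logical_path(self):
        """Chain of IDs up to the root, following only primary parents."""
        if not self.parents:
            return self.id
        return self.parents[0].logical_path() + "/" + self.id


def walk(node):
    """Yield node IDs in visitation order; shared nodes appear once per parent."""
    yield node.id
    for child in node.children:
        yield from walk(child)


root = DagNode("root")
project = root.link_child(DagNode("project"))
docs = root.link_child(DagNode("documentation"))
faq = project.link_child(DagNode("faq"))   # first attachment: primary parent
docs.link_child(faq)                       # second attachment: same node object

print(faq.logical_path())
print(list(walk(root)))
```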

The leaf element of this logical menu path is called the ID of the node. Typically this ID corresponds to the filename without the extension. But for the code generation and the client-side JavaScript, the full menu path is used as the HTML id attribute, because, generally speaking, only the full menu path denotes an element unambiguously.

When working with nodes, and especially when writing placement directives in the individual source files, in most cases it is not necessary to specify the full menu path of a node. Actually, nodes can be addressed by any path suffix, and even just by the bare node ID. But when there is an ambiguity, just the first node found is picked. Because nodes have a unique identity, this can sometimes yield rather weird results. To minimise the danger of ambiguities, the discovery of source pages always addresses the menu node to be populated with the full menu path.
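
Addressing by path suffix can be illustrated with a small lookup function. The function `find_node` and the plain list of paths are hypothetical, and comparing whole path components is an assumption of this sketch; it only mirrors the behaviour described here, where any path suffix addresses a node and the first match wins when the suffix is ambiguous.

```python
def find_node(paths, spec):
    """Return the first registered path ending with the components of spec."""
    wanted = spec.split("/")
    for path in paths:                       # registration order decides ties
        parts = path.split("/")
        if parts[-len(wanted):] == wanted:
            return path
    return None


registered = [
    "root/project/faq",
    "root/documentation/faq",
    "root/documentation/design/backend",
]

print(find_node(registered, "faq"))                 # ambiguous: first match is picked
print(find_node(registered, "documentation/faq"))   # longer suffix disambiguates
print(find_node(registered, "backend"))
```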

configuration example

def addPredefined():
    root = Node(TREE_ROOT, label='Lumiera')                                # (1)
    proj = root.linkChild('project')                                       # (2)
    proj.linkChild('faq')

    proj.prependChild ('screenshots')                                      # (3)
    proj.putChildLast ('press')
    proj.putChildAfter('faq', refPoint=Node('screenshots'))                # (4)

    proj.link('https://issues.lumiera.org/roadmap',label="Roadmap (Trac)") # (5)
    Node('rfc').sortChildren()

(1) the root node by convention uses a special ID token. Additional fields of the node object can be given as named parameters. Here we define the visual menu label to be “Lumiera”
(2) a child node `root/project` is attached. Note: this node will later be picked up when the actual page discovery descends into the project subdirectory and encounters an index.txt there. Index files are always searched within the directory; they may be called `index.txt` or use the same name as the enclosing directory.
(3) this placement directive defines that a node `screenshots` shall be prepended at the start of the list. Because such a node doesn’t yet exist, a new node `root/project/screenshots` is created as a side-effect.
(4) this directive places an entry after another entry, which is assumed to exist by the time this directive is finally applied. All placement directives get applied in order of definition, just before the output for a given node is generated. Note also the constructor syntax `Node('screenshots')`: here the constructor just acts as a general factory function; either it creates a new node, or it fetches an existing node with matching node path from the internal `NodeIndex`
(5) here we create a submenu entry in the project menu, featuring an external link. The ID of that menu node will be derived from the name in the url (here `roadmap`); it can be defined explicitly if necessary (`id=…`)
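
The factory behaviour of the constructor could be achieved in Python roughly as follows. This is a sketch under assumptions: the class `IndexedNode`, its dictionary-based index and the lookup by bare ID only are hypothetical simplifications of `Node` and `NodeIndex`.

```python
class IndexedNode:
    """Hypothetical node class whose constructor doubles as a lookup."""
    _index = {}   # simplified stand-in for the NodeIndex: ID -> node

    def __new__(cls, node_id, **fields):
        node = cls._index.get(node_id)
        if node is None:                 # unknown ID: create and register
            node = super().__new__(cls)
            node.id = node_id
            node.label = node_id         # default label is the ID
            cls._index[node_id] = node
        return node

    def __init__(self, node_id, **fields):
        # __init__ runs on every constructor call, so only apply what was
        # passed explicitly and leave existing state untouched otherwise
        for name, value in fields.items():
            setattr(self, name, value)


first = IndexedNode("screenshots", label="Screenshots")
again = IndexedNode("screenshots")
print(first is again, again.label)
```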

supported placement directives

internal DSL	Asciidoc source	effect
`Node(<id>)`	(by discovering id.txt)	create new node or retrieve existing node
`linkChild(id)`		basic function for attaching child node
`linkParent(id)`		basic function to attach below parent
`putChildLast(id)`	`[attach] child <id>`	move child to current end of list
`appendChild(id)`	`[append] child <id>`
`putChildFirst(id)`		move child to current list start
`prependChild(id)`	`prepend [child] <id>`
`putChildAfter(id,ref)`	`[attach|put] child <id> after <ref>`	move child after the given ref entry
`link(url[,id][,label])`	`[child <id>] link ::<url>[<label>]`	attach an entry, holding an external link
`Node(<id>,label=<lbl>)`	`label|title <lbl>`	define the visible text in the menu entry
`sortChildren()`	`sort [children]`	sort all children currently in list
`enable(False)`	`off|disable|deactivate`	make node passive; any children/parents added later are ignored
`enable([True])`	`on|active|activate`	make node active again (this is the default)
`detach()`	`detach`	cut away any parents and children, disable the node
`discover(srcdirs=…)`	`include dir <token>[,<token>]`	instead of current dir, retrieve children from other dirs (relative)
`discover(includes=…)`	`include <token>[,<token>]`	explicitly use the listed elements as children
`discover(excludes=…)`	`exclude <token>[,<token>]`	after discovering, filter names matching the <token> (without extension)
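
The effect of the list-oriented directives in this table can be sketched on a plain Python list. The class `ChildList` is hypothetical; it queues directives and applies them in order of definition, and it assumes that moving a child which is not yet present simply adds it.

```python
class ChildList:
    """Hypothetical child list with deferred placement directives."""
    def __init__(self, children=()):
        self.children = list(children)
        self.pending = []                      # directives, in definition order

    def _move(self, child, position):
        if child in self.children:
            self.children.remove(child)
        # the target position is computed after the child was taken out
        self.children.insert(position(self.children), child)

    def putChildFirst(self, child):
        self.pending.append(lambda: self._move(child, lambda cs: 0))

    def putChildLast(self, child):
        self.pending.append(lambda: self._move(child, lambda cs: len(cs)))

    def putChildAfter(self, child, ref):
        # the reference entry must exist by the time the directive is applied
        self.pending.append(lambda: self._move(child, lambda cs: cs.index(ref) + 1))

    def sortChildren(self):
        self.pending.append(lambda: self.children.sort())

    def apply(self):
        """Run all queued directives, e.g. right before generating output."""
        for directive in self.pending:
            directive()
        self.pending.clear()
        return self.children


proj = ChildList(["faq", "download"])
proj.putChildFirst("screenshots")
proj.putChildLast("press")
proj.putChildAfter("faq", "screenshots")
print(proj.apply())
```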

commandline options

The behaviour of the menugen script can be influenced by some options:

predefined: using the built-in predefined nodes
scan: discover nodes
debug: dump data structure after discovery
text: generate plaintext version of the menu
webpage: actually generate HTML / JavaScript

A positional parameter denotes the start directory for discovery (default: the current directory). This directory is also assumed to be the web root; all URLs are generated relative to it.
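
For orientation, these options could be declared with the standard argparse module as sketched below. Only the -p / --predefined switch is confirmed by this page; the long option spellings `--scan`, `--debug`, `--text` and `--webpage` and the parameter name `startdir` are assumptions derived from the list above, and the real script may parse its options differently.

```python
import argparse


def build_parser():
    """Hypothetical option parser mirroring the options listed above."""
    parser = argparse.ArgumentParser(prog="menugen.py")
    parser.add_argument("-p", "--predefined", action="store_true",
                        help="use the built-in predefined nodes")
    # The long option names below are assumed from the option list.
    parser.add_argument("--scan", action="store_true", help="discover nodes")
    parser.add_argument("--debug", action="store_true",
                        help="dump data structure after discovery")
    parser.add_argument("--text", action="store_true",
                        help="generate plaintext version of the menu")
    parser.add_argument("--webpage", action="store_true",
                        help="actually generate HTML / JavaScript")
    parser.add_argument("startdir", nargs="?", default=".",
                        help="start directory for discovery, also the web root")
    return parser


options = build_parser().parse_args(["-p", "--scan", "--webpage", "web"])
print(options.predefined, options.scan, options.webpage, options.startdir)
```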

Design and Implementation notes

The initial observation was that we’re actually parsing and processing some kind of Domain Specific Language here. Thus the general advice for such undertakings does apply: we should try to handle the actual language just as a thin layer on top of some kind of semantic model. In our case, this model is the menu tree to be generated, while the actual “syntax tree” is the real filesystem, holding Asciidoc files with embedded comments. Thus, the semantic model was developed first, separately from the syntax of the specifications; it was tested to generate suitable HTML and CSS.

The syntactic elements were then added as a collection of parser or matcher objects, each with the ability to recognise and implement one kind of placement specification. Each such Placement subclass exposes an acceptVerb() function for handling invocations of the internal DSL functions, and an acceptDSL() function to parse and accept a //MENU: line from some Asciidoc source file. This approach makes adding further configuration options simple.
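
The matcher-object approach can be sketched as follows. Apart from the names `Placement`, `acceptVerb()` and `acceptDSL()`, everything here is hypothetical: the signatures, the command tuples returned, the two example subclasses and the `parse_directive` helper are assumptions made for illustration, with the regular expressions modelled on the directive table.

```python
import re


class Placement:
    """Hypothetical base class: one subclass per kind of placement directive."""
    verb = None          # name of the internal DSL function
    pattern = None       # regular expression for the //MENU: syntax

    def acceptVerb(self, verb, *args):
        """Accept an internal DSL call; return a command tuple or None."""
        if verb == self.verb:
            return (self.verb,) + args
        return None

    def acceptDSL(self, line):
        """Accept the text of a //MENU: directive; return a command tuple or None."""
        match = self.pattern.fullmatch(line.strip())
        if match:
            return (self.verb,) + match.groups()
        return None


class PutChildAfter(Placement):
    verb = "putChildAfter"
    # syntax from the directive table: [attach|put] child <id> after <ref>
    pattern = re.compile(r"(?:(?:attach|put)\s+)?child\s+(\S+)\s+after\s+(\S+)")


class SortChildren(Placement):
    verb = "sortChildren"
    pattern = re.compile(r"sort(?:\s+children)?")   # sort [children]


PLACEMENTS = [PutChildAfter(), SortChildren()]


def parse_directive(line):
    """Try each matcher in turn; warn and ignore the line if none accepts it."""
    for placement in PLACEMENTS:
        command = placement.acceptDSL(line)
        if command:
            return command
    print("WARNING: unable to parse //MENU: " + line)
    return None


print(parse_directive("put child faq after screenshots"))
print(parse_directive("sort children"))
```

Adding a further directive then amounts to adding one more subclass to the list of matchers.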

Another interesting question is to what extent the actual path handling and file discovery logic should be configurable. My reasoning is that any attempts towards larger flexibility are mostly moot, because we can’t overcome the fact that this is logic to be cast into program code. Extension points or strategy objects would just tear the actual code apart and thus make it harder to read. Thus I confined myself to making only the index file name and the file extensions configurable.

Known issues

for the sake of simplicity, there is one generated container HTML element per menu entry. In case this entry is a submenu, the <ul> element is used, not the preceding headline <li>. This is because the submenu entry is eventually going to be collapsed, but it has the side-effect of highlighting only that submenu block, not the preceding headline.
the acceptable DSL syntax needs to be documented manually; there is no way to generate this information. Doing so would require adding specific information methods to the Placement subclasses, and it would result in duplicated information between the regular expressions and the information returned by such methods. This was deemed dangerous.
the __repr__ of the Placement subclasses is not a proper representation but rather a __str__; unfortunately, the debugger in PyDev invokes __repr__
the startdir for automatic discovery is a global variable
when, through the use of redirection, the same file is encountered multiple times during discovery, it is processed repeatedly and associated with a different node each time. This happens because, on discovery, the node-ID is generated as parentPath/fileID, to avoid mixing up similarly named files in different directories. (The NodeIndex allows a node to be retrieved just by its bare ID, without path, anyway)
no escaping: currently any variable text is written to the generated HTML without any sanitising or escaping. This might be a security issue, especially because Git pushes immediately trigger menu generation.
the method Node.matches() is implemented sloppily: it uses just a mutual postfix match, while actually it should line up full path components and check equality on components, starting from the path end. This cheesy implementation can yield surprising side-effects: e.g. a not-yet attached node 'end' could match a new menu page 'documentation/backend'
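
The difference between the two matching strategies named in the last item can be shown in a few lines. Both functions are hypothetical: `sloppy_match` reconstructs the mutual postfix match from the description above rather than from the source of Node.matches(), and `component_match` is one possible way to line up path components from the end.

```python
def sloppy_match(a, b):
    """Mutual postfix match on the raw strings."""
    return a.endswith(b) or b.endswith(a)


def component_match(a, b):
    """Compare whole path components, starting from the path end."""
    parts_a, parts_b = a.split("/"), b.split("/")
    shorter = min(len(parts_a), len(parts_b))
    return parts_a[-shorter:] == parts_b[-shorter:]


print(sloppy_match("end", "documentation/backend"))        # the surprising match
print(component_match("end", "documentation/backend"))
print(component_match("backend", "documentation/backend"))
```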
