Re: [mwlib] Using mwlib for parsing wikitext (but not PDF generation)

2012-07-19 Thread Lars Jørgen Solberg
Hi, I'm sorry for hijacking an old thread. When you mention token offsets, are those character offsets into the raw wiki markup? That is, do they make it possible to say that a given node in the parse tree represents the markup from position a to position b? If so, is this a capability in mwlib or

Re: [mwlib] Using mwlib for parsing wikitext (but not PDF generation)

2012-04-26 Thread Travis Briggs
Hi Joel, My needs are pretty simple. The basic 'algorithm' of what I want to do is to identify section headers by their names:

    if isinstance(node, Section) and node.name == "External Links":
        finish_node = node

Then, given the location in the document of a section header with a given name, I
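
A minimal sketch of the header-matching step described in this message, written against a generic tree so that it runs without mwlib installed. The `Node` and `Section` classes, the `walk` and `find_section` helpers, and the sample document are all hypothetical stand-ins, not mwlib's own classes or API. The `name` attribute is taken from the snippet above and has not been checked against mwlib.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a parse-tree node (not mwlib's class)."""
    children: List["Node"] = field(default_factory=list)


@dataclass
class Section(Node):
    """Hypothetical section node; `name` is assumed to hold the heading text."""
    name: str = ""


def walk(node: Node) -> Iterator[Node]:
    """Yield `node` and all of its descendants in document (pre-order) order."""
    yield node
    for child in node.children:
        yield from walk(child)


def find_section(root: Node, name: str) -> Optional[Section]:
    """Return the first Section whose name equals `name`, or None if absent."""
    for node in walk(root):
        if isinstance(node, Section) and node.name == name:
            return node
    return None


# Tiny hand-built tree standing in for a parsed article.
doc = Node(children=[
    Section(name="History"),
    Section(name="External Links", children=[Node()]),
])

finish_node = find_section(doc, "External Links")
```

With a real parse tree, the same traversal would presumably be applied to whatever root object the parser returns, with the class and attribute names adjusted to match.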

Re: [mwlib] Using mwlib for parsing wikitext (but not PDF generation)

2012-04-25 Thread Joel Nothman
I have been using mwlib for exactly that since 2008, but I haven't checked if my scripts work with a more recent version of mwlib. (I mostly use mwlib.refine.compat.parse_text.) I and others may be able to help you with more detail if you give us some idea what you would like to get out