On Tue, Jan 25, 2011 at 10:27 AM, Platonides <platoni...@gmail.com> wrote:

> Had LST used <section name=foo> </section> to mark sections,
> instead of <section begin=foo />content<section end=foo />, it
> would be as easy as traversing the preprocessor output, which would
> already have the sections splitted.
>

It was done this way in order to allow overlapping sections: LST was created
so arbitrary parts of a document on Wikisource can be quoted while retaining
a direct link to the original document as it continues to be edited.

Basically, the section markers are permanent markers for the source of a
copy-and-paste operation. One person might be copying from paragraph 1 to
paragraph 4; another might copy from paragraph 3 to paragraph 5; your page
structure looks like this:

  [page]
    [section-open 1/]
    [para 1/] <!-- in section 1 only -->
    [para 2/] <!-- in section 1 only -->
    [section-open 2/]
    [para 3/] <!-- in both section 1 and 2 -->
    [para 4/] <!-- in both section 1 and 2 -->
    [section-close 1/]
    [para 5/] <!-- in section 2 only -->
    [section-close 2/]
  [/page]

Since the LST sections overlap, they don't really fit well in the
hierarchical structures that the preprocessor deals in except as standalone
start/end markers.
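
To make the overlap concrete, here is a minimal sketch in Python that treats the page above as a flat stream of standalone markers and paragraphs. The token names and the extract_section helper are made up for illustration; this is not how LST is implemented.

```python
# Hypothetical flat token stream mirroring the page structure above.
# ("open", name) and ("close", name) are standalone section markers;
# ("para", n) stands in for paragraph content.
PAGE = [
    ("open", "1"),
    ("para", 1),
    ("para", 2),
    ("open", "2"),
    ("para", 3),
    ("para", 4),
    ("close", "1"),
    ("para", 5),
    ("close", "2"),
]


def extract_section(tokens, name):
    """Collect the paragraphs between the open and close markers for `name`."""
    collected = []
    inside = False
    for kind, value in tokens:
        if kind == "open" and value == name:
            inside = True
        elif kind == "close" and value == name:
            inside = False
        elif inside and kind == "para":
            collected.append(value)
    return collected


section_1 = extract_section(PAGE, "1")
section_2 = extract_section(PAGE, "2")
print(section_1)  # [1, 2, 3, 4]
print(section_2)  # [3, 4, 5]
```

Paragraphs 3 and 4 come back for both sections, which is exactly what a properly nested open/close element cannot express.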

*BUT* ... it's probably possible to redo things so that LST works on the
above structure in a sensible way, instead of running regexes over the text:

  iterate through the node tree:
    if found desired section start node:
      start saving our spot
    if found desired section end node:
      if start node was at same level:
        grab everything in between
        RETURN that to upstream parser
      else:
        find the closest common parent node of start and end
        build a node tree that has the parts of the start's parent before
          the start trimmed, and the parts of the end's parent after the end
          trimmed
        RETURN that to upstream parser
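
A rough Python sketch of that walk, to show it is workable. The Node class, the "start"/"end" marker kinds, and the function names are all assumptions made for this example; the real preprocessor tree looks different, and this only handles a start marker that comes before its end marker.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                       # e.g. "root", "div", "text", "start", "end"
    value: str = ""                 # text content, or the section name on markers
    children: List["Node"] = field(default_factory=list)


def find_path(node, kind, name, path=()):
    """Return the tuple of child indices leading to the marker, or None."""
    if node.kind == kind and node.value == name:
        return path
    for i, child in enumerate(node.children):
        found = find_path(child, kind, name, path + (i,))
        if found is not None:
            return found
    return None


def trim(node, path, keep_after):
    """Copy `node`, keeping only what lies after (or before) the marker at `path`."""
    i = path[0]
    if len(path) == 1:
        kept = node.children[i + 1:] if keep_after else node.children[:i]
        return Node(node.kind, node.value, list(kept))
    inner = trim(node.children[i], path[1:], keep_after)
    kids = [inner] + node.children[i + 1:] if keep_after else node.children[:i] + [inner]
    return Node(node.kind, node.value, kids)


def extract(root, name):
    """Return the list of nodes covered by section `name` (empty if not found)."""
    start = find_path(root, "start", name)
    end = find_path(root, "end", name)
    if start is None or end is None:
        return []
    # Walk down while both markers sit under the same child: this finds
    # the closest common parent of the two markers.
    depth = 0
    while depth < min(len(start), len(end)) - 1 and start[depth] == end[depth]:
        depth += 1
    parent = root
    for i in start[:depth]:
        parent = parent.children[i]
    s, e = start[depth], end[depth]
    if s >= e:
        return []  # end marker precedes start marker; not handled here
    result = []
    if len(start) > depth + 1:      # start is nested deeper: trim its branch
        result.append(trim(parent.children[s], start[depth + 1:], keep_after=True))
    result.extend(parent.children[s + 1:e])
    if len(end) > depth + 1:        # end is nested deeper: trim its branch
        result.append(trim(parent.children[e], end[depth + 1:], keep_after=False))
    return result


def texts(nodes):
    """Flatten the text values of a node list, in document order."""
    out = []
    for n in nodes:
        if n.kind == "text":
            out.append(n.value)
        out.extend(texts(n.children))
    return out


tree = Node("root", children=[
    Node("div", children=[Node("text", "a1"), Node("start", "foo"), Node("text", "a2")]),
    Node("text", "mid"),
    Node("div", children=[Node("text", "b1"), Node("end", "foo"), Node("text", "b2")]),
])
result = extract(tree, "foo")
print(texts(result))  # ['a2', 'mid', 'b1']
```

When both markers share a parent, the two trimming branches are skipped and the result is just the siblings in between. The trimmed copies are new nodes, so the original tree is left untouched.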

One could also pull the markers out of the original text and store them as
separate metadata in some way, which seems to be part of the suggestions
earlier in the thread. The main problem here is that we could easily end up
losing track of the markers during editing; we have no persistent identity
for pieces of text, so if there's not a visible node in there for editors to
move & copy along with their alterations, the markers will not be able to
persist automatically.
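
A toy Python illustration of that failure mode, assuming the out-of-band metadata is nothing more than a pair of character offsets (a made-up storage scheme, chosen only because it is the simplest one to break):

```python
import re

text = "Para one. Para two. Para three."

# Out-of-band metadata: the section is remembered only as character offsets.
start = text.index("Para two.")
end = start + len("Para two.")
before_edit = text[start:end]       # correct while the text is unchanged

# An editor adds a sentence at the top; nothing updates the stored offsets.
edited = "New intro. " + text
stale = edited[start:end]           # now points at the wrong characters

# Inline markers travel with the text, so the same edit does not lose them.
marked = "Para one. <section begin=foo />Para two.<section end=foo /> Para three."
edited_marked = "New intro. " + marked
match = re.search(r"<section begin=foo />(.*?)<section end=foo />", edited_marked)
inline = match.group(1)

print(repr(before_edit), repr(stale), repr(inline))
```

Smarter metadata could be re-anchored through a diff on save, but that still depends on guessing which piece of text is "the same" after the edit.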

-- brion
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
