I wrote:
> ... the current implementation discards
> expat_characterData too soon.

After thinking about this...

Conceptually, character data appears between elements. If we had something like

   <div>text1<span>text2</span>text3<br />text4</div>

that would be valid xml. The xml standard does require a top level element, so we don't have anything preceding or following that element. Other than that, though, we wind up encountering a text fragment between each appearance of an element tag. And, because xml is hierarchical, we may encounter arbitrary complexity between an opening and a closing tag.

In the current design of api/expat, character data is captured on encountering a closing tag, so in the above example, we would only capture text2. (Actually... something else is broken here that I've not adequately studied. For whatever reason, api/expat stops when it encounters the <br /> tag.)

So.... I think the current implementation is broken enough that it should be completely replaced. I propose:

(1) We should have a character data callback, which is called immediately before each open / close tag (with some kind of exception for the first tag).

(2) The parser should continue to function on encountering self-closing tags. (There will not be a closing tag, nor any character data, so probably we should have a distinct "self closing callback".)

Also: it's probably the case that the y argument to the current expat_start_elementx callback is useless, so I think it would make sense to discard that y argument and use what's currently the x argument in its place.

These changes are significant enough that the addon should probably get a new name, just in case someone is using the current implementation and has somehow found it useful.

I guess I'm proposing that I should build this replacement addon -- though I would not object if someone else carried the load here.

Still... it needs a name. I think api/jexpat should be reserved for a native j implementation -- one which doesn't require shared library support.
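
As a reference point for the callback order discussed above, here is a minimal sketch using Python's standard xml.parsers.expat binding (not the J addon). The function name trace_events is made up for illustration. It only shows the order in which the underlying expat library reports events for the example document. How api/expat or a replacement addon should surface those events in J is still an open design question.

```python
import xml.parsers.expat


def trace_events(xml_text):
    """Return the (kind, value) events expat reports for xml_text, in order."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    # Coalesce adjacent character data so each text fragment arrives as one event.
    parser.buffer_text = True
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.CharacterDataHandler = lambda data: events.append(("text", data))
    # The second argument marks this as the final (and only) chunk of input.
    parser.Parse(xml_text, True)
    return events


if __name__ == "__main__":
    sample = "<div>text1<span>text2</span>text3<br />text4</div>"
    for event in trace_events(sample):
        print(event)
```

In this binding, each text fragment is delivered just before the tag that follows it, and <br /> shows up as a start event immediately followed by an end event with no text in between. If the J wrapper sees the same pair, a separate "self closing callback" might turn out to be optional rather than required.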

So maybe api/cexpat would be a good name for this rebuild of the addon?

Comments? Am I way off base on anything here?

Thanks,

--
Raul

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
