I found VTD-XML while researching this. It looks like an interesting alternative, and it reminds me of the work you did with segmented strings.
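To make the comparison a bit more concrete, here is a rough sketch of the non-extractive idea as I understand it: instead of copying names and text into nested boxes, keep the original string and build a flat integer table of (kind, offset, length, depth) rows that point into it. It is written in Python rather than J only so that it is easy to run. The kind codes, the `tokenize` name, and the two regular expressions are my own simplifications and are not VTD-XML's actual record layout or API. It assumes double-quoted attributes and ignores comments, CDATA, processing instructions, entities, and namespaces.

```python
import re

# Hypothetical token kinds for this sketch (not VTD-XML's real type codes).
START, ATTR_NAME, ATTR_VAL, TEXT = 0, 1, 2, 3

# Deliberately simplified patterns: plain names, double-quoted attribute values.
TAG = re.compile(r'<(/?)([A-Za-z_][\w.-]*)([^<>]*)>')
ATTR = re.compile(r'([A-Za-z_][\w.-]*)\s*=\s*"([^"]*)"')

def tokenize(xml):
    """Return (kind, offset, length, depth) rows that index into xml."""
    rows = []
    depth = -1  # the root element ends up at depth 0
    pos = 0     # end of the previous tag
    for m in TAG.finditer(xml):
        if m.start() > pos:
            # Character data between the previous tag and this one.
            rows.append((TEXT, pos, m.start() - pos, depth))
        if m.group(1):
            depth -= 1  # closing tag: just step back out
        else:
            depth += 1
            rows.append((START, m.start(2), len(m.group(2)), depth))
            # Scan only the attribute region of this tag.
            for a in ATTR.finditer(xml, m.start(3), m.end(3)):
                rows.append((ATTR_NAME, a.start(1), len(a.group(1)), depth))
                rows.append((ATTR_VAL, a.start(2), len(a.group(2)), depth))
        pos = m.end()
    return rows

# A well-formed version of the example element from the thread.
doc = '<ab cd="ef" gh="ijk">lmnop</ab>'
rows = tokenize(doc)

# Nothing was copied out of doc; each row is just a slice descriptor.
print([doc[off:off + n] for _, off, n, _ in rows])
# ['ab', 'cd', 'ef', 'gh', 'ijk', 'lmnop']
```

The result is a rectangular table of integers alongside one character vector, which would presumably map onto an unboxed integer matrix in J. That is the part that reminded me of segmented strings.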
http://vtd-xml.sourceforge.net/VTD.html
http://jsoftware.2058.n7.nabble.com/quot-Segmented-Strings-quot-td59863.html

On Dec 5, 2014 4:28 PM, "Raul Miller" <[email protected]> wrote:
> I would like to revisit the idea of using J to parse xml.
>
> The xml/sax addon was a nice idea, but not very stable. It represented
> xml as a series of events (function calls), and left it up to the user
> how they would structure the result. Unfortunately, it also rather
> reliably crashes J.
>
> This can be mitigated in various ways. If what you are parsing is
> simple enough, and you can live with 32 bit j602, xml/sax can work
> great. But those are not always ideal constraints to work with.
>
> But... what's a good data structure in J, to represent xml?
>
> A problem is that xml is something of a living example of "the nice
> thing about standards is that there are so many to choose from". The
> standards documents describing xml are voluminous, and there are many
> alternatives which are physically different but logically similar to
> wade through.
>
> Still, at a basic level, xml is something of a nested sequence type of
> a thing. So one approach might leverage boxed character arrays. This
> will not be particularly efficient, but it's a start.
>
> For example, this xml snippet:
>
>    <ab cd="ef" gh="ijk">lmnop</ab>
>
> Might be represented in J as:
>    'ab';<('cd';'ef'),('gh';'ijk'),:'';<<'lmnop'
>
> (The extra boxing on the text is because that might in the general
> case actually be a sequence of elements).
>
> Another approach might be:
>    'ab';(('cd';'ef'),:('gh';'ijk'));<<'lmnop'
>
> Here, the [textual, in this case] content of the element is stored in
> a separate box from the attributes, instead of treating it as a
> blank-named attribute.
>
> But perhaps there are good non-boxed ways of representing the structure?
>
> Has anyone else been working with xml in J?
>
> Thanks,
>
> --
> Raul
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
