Many years ago I started the Lux project, which was designed to build an XML-aware index using Solr; see https://github.com/msokolov/lux/tree/master/src/main/java/lux/index/analysis for the analysis chain I used. Maybe you'll find something useful in it? It has been dormant for years and predates interval queries, but the code is still the code, and XML has not really changed...

On Fri, May 6, 2022 at 5:23 AM Mikhail Khludnev <m...@apache.org> wrote:
>
> Hi Devs!
>
> I found intervals quite nice and natural for retrieving scoped data (thanks,
> Alan!):
> <tag>foo stuff bar</tag>
> I.containing(I.ordered(I.term("<tag>"), I.term("<tag>")),
>              I.unordered(I.term("bar"), I.term("foo")));
> It works like a charm until it encounter ill nested tags:
> <tag>foo <tag>bug</tag> bar</tag>
> Due to intrinsic minimalizations it picks the internal tag. I feel like plain
> intervals backed on positions lack tag scoping information.
> Do you know any approaches for retrieving XML in Lucene?
>
> --
> Sincerely yours
> Mikhail Khludnev
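
To make the nesting problem in the quoted message concrete, here is a rough, stdlib-only sketch of minimal-interval semantics. It is not Lucene code and does not use the Lucene API. The class and method names (MinimalIntervalSketch, minimalOrdered, contains) are made up for illustration, and it assumes each tag is indexed as a single token at its own position, with the closing tag indexed as "</tag>".

```java
import java.util.ArrayList;
import java.util.List;

public class MinimalIntervalSketch {

    // Positions at which `term` occurs in the token stream.
    public static List<Integer> positions(String[] tokens, String term) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) out.add(i);
        }
        return out;
    }

    // Every [open, close] pair with open before close, reduced to the
    // minimal pairs: those that do not contain another candidate pair.
    public static List<int[]> minimalOrdered(String[] tokens, String open, String close) {
        List<int[]> candidates = new ArrayList<>();
        for (int s : positions(tokens, open)) {
            for (int e : positions(tokens, close)) {
                if (s < e) candidates.add(new int[] {s, e});
            }
        }
        List<int[]> minimal = new ArrayList<>();
        for (int[] a : candidates) {
            boolean containsOther = false;
            for (int[] b : candidates) {
                if (a != b && a[0] <= b[0] && b[1] <= a[1]) {
                    containsOther = true;
                    break;
                }
            }
            if (!containsOther) minimal.add(a);
        }
        return minimal;
    }

    // True if interval `big` fully covers interval `small`.
    public static boolean contains(int[] big, int[] small) {
        return big[0] <= small[0] && small[1] <= big[1];
    }

    public static void main(String[] args) {
        // <tag>foo <tag>bug</tag> bar</tag>, one token per position
        String[] nested = {"<tag>", "foo", "<tag>", "bug", "</tag>", "bar", "</tag>"};
        // Span covering "foo" ... "bar" in this stream
        int[] fooBar = {
            positions(nested, "foo").get(0),
            positions(nested, "bar").get(0)
        };
        for (int[] iv : minimalOrdered(nested, "<tag>", "</tag>")) {
            System.out.println("[" + iv[0] + "," + iv[1] + "] contains foo..bar: "
                    + contains(iv, fooBar));
        }
    }
}
```

Under these assumptions the only minimal tag interval is the inner one, [2,4], which does not cover the foo..bar span [1,5]; the outer element [0,6] is discarded by minimization, and position information alone cannot bring it back.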