Not quite what I had in mind, more like level1-1/level2-1/level3-1/Term1 level1-1/level2-1/level3-1/Term2 level1-1/level2-1/level3-2/Term3 level1-1/level2-1/level3-2/Term4
With an increment gap 0f 100 and an analyzer that split on slashes, the term positions would be something like: term term pos 0 level1-1 1 level2-1 2 level3-1 3 Term1 104 level1-1 105 level2-1 106 level3-1 107 Term2 208 level1-1 209 level2-1 210 level3-2 211 Term3 312 level1-1 313 level2-1 314 level3-2 315 Term4 As you see, a lot or repetition, but perhaps acceptable... Or, you could choose an analyzer that didn't break up the terms (although this would make your index somewhat bigger due to more unique terms). term term pos 0 level1-1/level2-1/level3-1/Term1 101 level1-1/level2-1/level3-1/Term2 202 level1-1/level2-1/level3-2/Term3 303 level1-1/level2-1/level3-2/Term4 Although I don't know if you really need an increment gap here..... This latter would make gathering all the documents with specific levels easier although the former would also work if you didn't need partial terms (that is, wildcards inside of phrases are new, see JIRA-1486, ComplexPhraseQueryParser). Best Erick On Mon, Mar 15, 2010 at 5:09 PM, Rene Hackl-Sommer <rene.a.ha...@gmx.de>wrote: > Hi Erick, > >> What about indexing >> the triplets with a small increment gap between? That is: >> ... >> >> gets indexed as: >> >> level1-1/level2-1/level3-1 +gap 100 >> level1-1/level2-1/level3-2 +gap 100 >> level1-1/level2-2/level3-3 +gap 100 >> level1-1/level2-2/level3-4 >> >> > > If I understand this correctly, the field would look like > "level1-1/level2-1/level3-1 Term1 Term2 level1-1/level2-1/level3-2 Term3 > Term4 "? > > I think, the problem here is the same like in the Payloads approach I wrote > of in my response to Steve's mail. We cannot test for equality at search > time (please correct me if we actually can do this). So if we have > > > level1-1/level2-1/level3-1 > ... > level1-1/level2-1/level3-244 > level1-1/level2-2/level3-1 > level1-1/level2-2/level3-105 > > and I search for T1 and T2 on level3, but want them to be in the same > level2, this cannot be done satisfactorily. > > > Or you could think about *documents* being your level1, that is each >> document has one and only one level1 element but many documents >> may have the same level1 token. Combining this with your increment >> gap notion for level2-3 might work for you. >> >> > > I was thinking about this, yet the trouble is that the issue at hand is > just one field in an already not quite trivial scenario involving 200+ > fields. If I add say 50 level1-documents per real document, I would still > need to be able to relate these level1-documents to the real documents to > which they belong, and, during retrieval, there are use cases where I need > to look into each of the level1-documents to see if they fulfill certain > criteria and then, in a further step, ascertain whether I can gather the > needed level1-documents to fulfill the query on a "MyField"-Level (not > existant here per se). I feel this might get somewhat unwieldy. > > > You might also search the list for "Heirarchal" or "tree" indexing, >> this is a variant of such I think. >> >> > > Thank you, I'll look into this. > > > Cheers > Rene > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >