Re: Increase number of available positions?

2010-03-18 Thread Rene Hackl-Sommer
Hi Steve, I'm not sure what's wrong with the above (have you tried each of the two nested SpanNot clauses independently?), but here's another thing to try: Your query works. And as it turns out, if I don't commit the same embarrassing lower case / upper case inconsistency over and over

Re: Increase number of available positions?

2010-03-17 Thread Rene Hackl-Sommer
Hi, I was looking at SpanNotQuery to see if I could make do without the position increment gaps. A search requirement that I'm having some trouble implementing is when two terms are supposed to be on the same L_2, yet on different L_3's (L_3's are hierarchically below L_2). With the

RE: Increase number of available positions?

2010-03-17 Thread Steven A Rowe
Hi Rene, On 03/17/2010 at 11:17 AM, Rene Hackl-Sommer wrote: <SpanNot fieldName="MyField"> <Include> <!-- Gets all the matching spans within L_2 boundaries and includes them --> <SpanNot> <Include> <SpanNear slop="2147483647" inOrder="false"> <SpanTerm>t293</SpanTerm> <SpanTerm>t4979</SpanTerm> </SpanNear>

Re: Increase number of available positions?

2010-03-16 Thread Rene Hackl-Sommer
Hi Guys, Thanks for the input! I am now going to put in some work to see how things fare. Should I post the question about substituting int with long on lucene-dev again, if the need arises? Thanks again, Rene On 15.03.2010 at 23:04, Steven A Rowe wrote: Hi Rene, Have you seen

Re: Increase number of available positions?

2010-03-16 Thread Erick Erickson
Sure. I'd start a new thread though, referencing this one and outlining why none of the solutions you tried worked. Erick On Tue, Mar 16, 2010 at 4:35 AM, Rene Hackl-Sommer rene.a.ha...@gmx.de wrote: Hi Guys, Thanks for the input! I am now going to put in some work to see how things

Increase number of available positions?

2010-03-15 Thread Rene Hackl-Sommer
Hello, I am working on a use case that is very demanding regarding the number of token positions. For one special field in the index, I need to represent different hierarchy levels, like this: MyField Level_1 Level_2 Level_3 Please note that I need to do this with Lucene, not an XML search

Re: Increase number of available positions?

2010-03-15 Thread Erick Erickson
Is your entire corpus a single document? Because I'm having trouble imagining a single document where this would be a problem, unless your increment gap is huge. The term positions are relative to a single document... You say that your levels have fewer than 1,000 elements each. With an increment
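The point about positions being per-document can be checked with some quick arithmetic. The following is only a rough sketch under a simplified model: the class name `PositionBudget`, the token counts, and the gap values are all hypothetical and are not taken from the thread. It estimates the highest position a single document's field would need and compares it with `Integer.MAX_VALUE`, the limit of an int position.

```java
final class PositionBudget {
    /**
     * Highest position used when `values` field values of `tokensPerValue`
     * tokens each are separated by `gap` unused positions (simplified model,
     * positions start at 0). Returns -1 if nothing is indexed.
     */
    static long highestPosition(long values, long tokensPerValue, long gap) {
        if (values <= 0 || tokensPerValue <= 0) {
            return -1;
        }
        // Every value occupies tokensPerValue positions; one gap sits
        // between each pair of neighbouring values.
        return values * tokensPerValue + (values - 1) * gap - 1;
    }

    /** True if the highest position still fits into a signed 32-bit int. */
    static boolean fitsInInt(long values, long tokensPerValue, long gap) {
        return highestPosition(values, tokensPerValue, gap) <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: one million values of 4 tokens, gap 100.
        System.out.println(highestPosition(1_000_000L, 4, 100)); // 103999899
        System.out.println(fitsInInt(1_000_000L, 4, 100));       // true
        // 1,000 x 1,000 x 1,000 leaf values in ONE document would not fit.
        System.out.println(fitsInInt(1_000_000_000L, 4, 100));   // false
    }
}
```

Long arithmetic is used on purpose so that the check itself cannot overflow for inputs of this size.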

Re: Increase number of available positions?

2010-03-15 Thread Rene Hackl-Sommer
Is your entire corpus a single document? Because I'm having trouble imagining a single document where this would be a problem, unless your increment gap is huge. The term positions are relative to a single document... It is getting pretty huge, yes (see below). The term positions are also

RE: Increase number of available positions?

2010-03-15 Thread Steven A Rowe
Hi Rene, Why can't you use a different field for each of the Level_X's, i.e. MyLevel1Field, MyLevel2Field, MyLevel3Field? On 03/15/2010 at 9:59 AM, Rene Hackl-Sommer wrote: Search in MyField: Terms T1 and T2 on Level_2 and T3, T4, and T5 on Level_3, which should both be in the same

Re: Increase number of available positions?

2010-03-15 Thread Erick Erickson
I was wondering about Steven's approach too; have you considered it? I don't know the internals of whether you could go to a 64-bit quantity for term positions. I suspect it would be *very* involved, but perhaps people more familiar with the code could comment. How big is your corpus?

Re: Increase number of available positions?

2010-03-15 Thread Rene Hackl-Sommer
Hi Steve, Why can't you use a different field for each of the Level_X's, i.e. MyLevel1Field, MyLevel2Field, MyLevel3Field? Well, the hierarchical structure needs to be maintained. As hundreds of Level_X entities can be found on levels 2 and 3, I need to be able to tell for instance

Re: Increase number of available positions?

2010-03-15 Thread Rene Hackl-Sommer
Hi Erick, What about indexing the triplets with a small increment gap between? That is: ... gets indexed as: level1-1/level2-1/level3-1 +gap 100 level1-1/level2-1/level3-2 +gap 100 level1-1/level2-2/level3-3 +gap 100 level1-1/level2-2/level3-4 If I understand this correctly, the field

Re: Increase number of available positions?

2010-03-15 Thread Erick Erickson
Not quite what I had in mind, more like level1-1/level2-1/level3-1/Term1 level1-1/level2-1/level3-1/Term2 level1-1/level2-1/level3-2/Term3 level1-1/level2-1/level3-2/Term4 With an increment gap of 100 and an analyzer that split on slashes, the term positions would be something like: term term
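To make the resulting layout concrete, here is a small stand-alone sketch of how positions might be assigned under this scheme. It does not call Lucene. The class `GapPositions` is hypothetical, and the model is simplified: tokens inside a value get consecutive positions and the gap is added once between values. Exact numbers in a real index may differ slightly depending on the Lucene version and analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class GapPositions {
    /**
     * Splits each value on '/' and returns "token@position" entries.
     * Assumption: `gap` unused positions are inserted between two values.
     */
    static List<String> assign(List<String> values, int gap) {
        List<String> out = new ArrayList<>();
        int next = 0; // next free position
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                next += gap; // skip the increment gap between values
            }
            for (String token : values.get(i).split("/")) {
                out.add(token + "@" + next);
                next++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList(
                "level1-1/level2-1/level3-1/Term1",
                "level1-1/level2-1/level3-1/Term2");
        // First value occupies 0..3, second value 104..107.
        for (String entry : assign(values, 100)) {
            System.out.println(entry);
        }
    }
}
```

With a gap of 100, any proximity query whose slop is below the gap cannot match across two different values in this model.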

RE: Increase number of available positions?

2010-03-15 Thread Steven A Rowe
Hi Rene, Have you seen SpanNotQuery?: http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/search/spans/SpanNotQuery.html For a document that looks like: <Level_1 id="1"> <Level_2 id="1"> <Level_3 id="1">T1 T2 T3</Level_3> <Level_3 id="2">T4 T5 T6</Level_3> <Level_3 id="3">T7 T8 T9</Level_3>
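SpanNotQuery is documented as removing matches that overlap with matches of another span query. The sketch below is a toy model of that filtering rule in plain Java and is not the Lucene implementation. The class `SpanNotModel`, the half-open span representation, and the idea of a boundary token at position 3 are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SpanNotModel {
    /** A match covering positions start (inclusive) to end (exclusive). */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }

        boolean overlaps(Span other) {
            return start < other.end && other.start < end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Keeps only the include spans that overlap no exclude span. */
    static List<Span> spanNot(List<Span> include, List<Span> exclude) {
        List<Span> kept = new ArrayList<>();
        for (Span candidate : include) {
            boolean blocked = false;
            for (Span ex : exclude) {
                if (candidate.overlaps(ex)) {
                    blocked = true;
                    break;
                }
            }
            if (!blocked) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical layout: T1 T2 T3 at 0..2, a boundary token at 3, T4 at 4.
        List<Span> near = Arrays.asList(new Span(0, 3), new Span(1, 5));
        List<Span> boundary = Arrays.asList(new Span(3, 4));
        // Only the span that stays on one side of the boundary survives.
        System.out.println(spanNot(near, boundary)); // [[0,3)]
    }
}
```

In this model, a wide-slop proximity match is kept only if it does not cross a position occupied by an excluded match, which is the property that could stand in for large position gaps.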