Hi

my original problem is to index large number of documents which
contains 360 integers in rage from 0-90K. Searching it's a little bit
complicated - I need to find most similar documents where query data
is also 360 numbers in range 0-90K. But (there is always 'but') i need
to create score with some predefined weight table. Here is example:

Index contains:

DOC1 : 1, 3, 5
DOC2 : 1, 100
DOC3 : 1, 5

I need to find all documents which are 'like' this:

SEARCH: 1,5,100

And suppose that i'm having table which says: "if value is larger than
10 wight hit as 0.5, else as 1" (in real application this is more
complicated weight table).

So for Query 1,5,100 i will have:

DOC1: SCORE=2    [1,5]
DOC3: SCORE=2    [1,5]
DOC2: SCORE=1.5 [1,100 (100>10- wight 0.5]

Searching is just: if hits occurs on field, increments score by 1*weight(value)

My first step was to create index with one field which contains all
360 values and to remove normals from it.

Now when i'm doing search like:

"F:1 F:5 F:100"

I'm getting results ok but score is not correct. Of course it gives me
score sorted by 'number of hits' (am I right?) but score value is not
calculated by increments of 1 nor i'm using wights at all.

So, my question is - is this even possible with lucene and if can, can
you point me into some directions (i already looked a little bit at
DefaultSimilarity overriding).

Thanks

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to