Hi, my original problem is to index a large number of documents, each of which contains 360 integers in the range 0-90K. Searching is a little complicated: I need to find the most similar documents, where the query is also 360 numbers in the range 0-90K. But (there is always a 'but') I need to compute the score using a predefined weight table. Here is an example:
Index contains:

  DOC1: 1, 3, 5
  DOC2: 1, 100
  DOC3: 1, 5

I need to find all documents which are 'like' this:

  SEARCH: 1, 5, 100

Suppose I have a table which says: "if the value is larger than 10, weight the hit as 0.5, else as 1" (in the real application the weight table is more complicated). So for the query 1, 5, 100 I should get:

  DOC1: SCORE=2   [1, 5]
  DOC3: SCORE=2   [1, 5]
  DOC2: SCORE=1.5 [1, 100 (100 > 10, so weight 0.5)]

The scoring is just: for each query value that occurs in the field, increment the score by 1 * weight(value).

My first step was to create an index with one field which contains all 360 values, and to disable norms on it. Now when I run a search like "F:1 F:5 F:100", I get the right documents back, but the score is not correct. It looks like the results are sorted by number of hits (am I right?), but the score value is not computed in increments of 1, and my weights are not used at all.

So my question is: is this even possible with Lucene, and if so, can you point me in some direction? (I have already looked a little at overriding DefaultSimilarity.)

Thanks
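To make the expected numbers concrete, here is a minimal sketch in plain Java of the scoring I am after. It does not use Lucene at all. The class name WeightedHitScore and the methods weight and score are made up for illustration, and the weight rule is only the simplified "larger than 10" example from above, not my real table.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not a Lucene API: computes the target score directly.
class WeightedHitScore {

    // Simplified example weight table: values above 10 count 0.5, others count 1.
    static double weight(int value) {
        return value > 10 ? 0.5 : 1.0;
    }

    // Adds 1 * weight(value) for every query value that is present in the document.
    static double score(Set<Integer> docValues, Set<Integer> queryValues) {
        double score = 0.0;
        for (int value : queryValues) {
            if (docValues.contains(value)) {
                score += 1.0 * weight(value);
            }
        }
        return score;
    }

    public static void main(String[] args) {
        // The three example documents, kept in insertion order.
        Map<String, Set<Integer>> index = new LinkedHashMap<>();
        index.put("DOC1", Set.of(1, 3, 5));
        index.put("DOC2", Set.of(1, 100));
        index.put("DOC3", Set.of(1, 5));

        Set<Integer> query = Set.of(1, 5, 100);
        for (Map.Entry<String, Set<Integer>> entry : index.entrySet()) {
            System.out.println(entry.getKey() + ": SCORE=" + score(entry.getValue(), query));
        }
    }
}
```

With the example data this gives 2.0 for DOC1 and DOC3 and 1.5 for DOC2, which is the ranking and the score values I would like the search to return.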
