[jira] Updated: (LUCENE-1260) Norm codec strategy in Similarity
[ https://issues.apache.org/jira/browse/LUCENE-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Kindgren updated LUCENE-1260:
-----------------------------------
    Attachment: Lucene-1260-2.patch

I've added the old static methods again, but made them deprecated.

In contrib/misc there is still a reference to the static encodeNorm method, maybe that should be replaced with Similarity.getDefaultSimilarity().encodeNormValue(f)? This call to the static method is only done if no similarity is passed to the FieldNormModifier.

I added a short javadoc description to the static methods, not sure if that is enough? (I guess they will be removed, so the relevant javadoc is probably in the instance methods?)

> Norm codec strategy in Similarity
> ---------------------------------
>
>                 Key: LUCENE-1260
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1260
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.3.1
>            Reporter: Karl Wettin
>            Assignee: Michael McCandless
>             Fix For: 3.1
>
>         Attachments: Lucene-1260-1.patch, Lucene-1260-2.patch, Lucene-1260.patch, LUCENE-1260.txt, LUCENE-1260.txt, LUCENE-1260.txt
>
>
> The static span and resolution of the 8 bit norms codec might not fit with all applications.
> My use case requires that 100f-250f is discretized in 60 bags instead of the default.. 10?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
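The deprecation approach described in this message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the patch itself: SketchSimilarity, SketchNormModifier, getDefault, setDefault and the placeholder codec are all hypothetical stand-ins. Only the names encodeNorm and encodeNormValue are taken from the message.

```java
/** Stand-in for Lucene's Similarity; every name in this class is illustrative. */
class SketchSimilarity {
    private static SketchSimilarity defaultImpl = new SketchSimilarity();

    static SketchSimilarity getDefault() { return defaultImpl; }

    static void setDefault(SketchSimilarity similarity) { defaultImpl = similarity; }

    /** Instance method that subclasses can override to plug in another codec. */
    byte encodeNormValue(float f) {
        // Placeholder codec (an assumption, not Lucene's): clamp to [0, 1], scale to 0..255.
        float clamped = Math.max(0f, Math.min(1f, f));
        return (byte) Math.round(clamped * 255f);
    }

    /** @deprecated call {@link #encodeNormValue(float)} on an instance instead. */
    @Deprecated
    static byte encodeNorm(float f) {
        // The old static entry point keeps working by delegating to the default instance.
        return getDefault().encodeNormValue(f);
    }
}

/** Hypothetical caller that mirrors the fallback described for FieldNormModifier. */
class SketchNormModifier {
    private final SketchSimilarity similarity;

    SketchNormModifier(SketchSimilarity similarityOrNull) {
        // Fall back to the default instance only when no similarity was passed in.
        this.similarity = similarityOrNull != null ? similarityOrNull : SketchSimilarity.getDefault();
    }

    byte normFor(float value) {
        return similarity.encodeNormValue(value);
    }
}
```

With this shape, a caller that receives no similarity never needs the static method, which is the replacement the message asks about for contrib/misc.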
[jira] Updated: (LUCENE-1260) Norm codec strategy in Similarity
[ https://issues.apache.org/jira/browse/LUCENE-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Kindgren updated LUCENE-1260:
-----------------------------------
    Attachment: Lucene-1260-1.patch

Added the 'final' modifier to the Similarity field wherever it is used. The norm array in Similarity was already 'final', so there's no change there.

I think the use of the Similarity instance could be refactored further, but that is perhaps out of scope for this issue.

I hope this will pass the performance tests!
[jira] Updated: (LUCENE-1260) Norm codec strategy in Similarity
[ https://issues.apache.org/jira/browse/LUCENE-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1260:
---------------------------------------
    Fix Version/s: 3.1

I think this is a reasonable change, but we probably should wait for 3.1 as long as 3.0 comes out soonish.
[jira] Updated: (LUCENE-1260) Norm codec strategy in Similarity
[ https://issues.apache.org/jira/browse/LUCENE-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Kindgren updated LUCENE-1260:
-----------------------------------
    Attachment: Lucene-1260.patch

Removed the 'static' keyword to enable pluggable behavior for encoding/decoding norms.

Our business case for this is to fix scoring when using NGrams. If a word is split into three parts, the norm for these parts would then become ~0.3125 (don't remember exactly) in the current implementation. A search for the exact same word would then generate a score of less than 1.0. With a pluggable norm calculation, we could use a norm table with values 0-100 and get better scoring.

Minor changes in 11 core classes and some tests. Also minor changes in analyzers, instantiated, memory and miscellaneous.
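As a rough illustration of the "norm table with values 0-100" idea, the sketch below reads it as a table of hundredths (byte i decodes to i / 100). That reading, the class name LinearNormTable, and the clamping behavior are assumptions made for illustration and are not taken from the patch. The method names follow the instance methods mentioned elsewhere in this thread. The sample input assumes a length norm of 1/sqrt(3) for a three-term field.

```java
/** Hypothetical linear norm table: byte i (0..100) decodes to i / 100f. */
final class LinearNormTable {
    private static final int MAX_SLOT = 100;
    private static final float[] TABLE = new float[MAX_SLOT + 1];

    static {
        for (int i = 0; i <= MAX_SLOT; i++) {
            TABLE[i] = i / 100f;
        }
    }

    private LinearNormTable() {}

    /** Rounds a norm to the nearest hundredth; values outside [0, 1] are clamped. */
    static byte encodeNormValue(float f) {
        int slot = Math.round(f * 100f);
        return (byte) Math.max(0, Math.min(MAX_SLOT, slot));
    }

    /** Looks the stored byte up in the table; unexpected bytes map to the last slot. */
    static float decodeNormValue(byte b) {
        return TABLE[Math.min(b & 0xFF, MAX_SLOT)];
    }

    public static void main(String[] args) {
        float norm = (float) (1.0 / Math.sqrt(3.0)); // about 0.577
        byte stored = encodeNormValue(norm);
        // The round trip keeps two decimal places instead of a coarser bucket.
        System.out.println(norm + " -> " + stored + " -> " + decodeNormValue(stored));
    }
}
```

A table like this trades range for resolution: it cannot represent norms above 1.0, which may or may not matter depending on the boosts in use.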
[jira] Updated: (LUCENE-1260) Norm codec strategy in Similarity
[ https://issues.apache.org/jira/browse/LUCENE-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wettin updated LUCENE-1260:
--------------------------------
    Attachment: LUCENE-1260.txt

New patch additionally includes:

* Lots of javadocs with warnings
* Similarity#readNormCodec(Directory):NormCodec
* Similarity#writeNormCodec(Directory, NormCodec)
[jira] Updated: (LUCENE-1260) Norm codec strategy in Similarity
[ https://issues.apache.org/jira/browse/LUCENE-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wettin updated LUCENE-1260:
--------------------------------
    Attachment: LUCENE-1260.txt

Fixed some typos and added some tests. Perhaps it needs new javadocs too?
[jira] Updated: (LUCENE-1260) Norm codec strategy in Similarity
[ https://issues.apache.org/jira/browse/LUCENE-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wettin updated LUCENE-1260:
--------------------------------
    Attachment: LUCENE-1260.txt

* Similarity#getNormCodec()
* Similarity#setNormCodec(NormCodec)
* Similarity$NormCodec
* Similarity$DefaultNormCodec
* Similarity$SimpleNormCodec (binsearches over a sorted float[])

I also deprecated Similarity#getNormsTable() and replaced the only use of it I could find, in TermScorer. I could not spot any performance problems or other issues with that change.
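A codec that binary-searches a sorted float[], as the SimpleNormCodec bullet describes, might look roughly like the sketch below. It is a standalone approximation and not the attached patch: the NormCodec interface shape, the class name TableNormCodec, the linear factory and the nearest-neighbor rounding are all assumptions. The sample range follows the use case from the issue description (100f-250f in 60 bags).

```java
import java.util.Arrays;

/** Hypothetical codec contract, loosely named after Similarity$NormCodec in the patch. */
interface NormCodec {
    byte encode(float norm);
    float decode(byte b);
}

/** Table-backed codec: the stored byte is an index into a sorted array of norm values. */
final class TableNormCodec implements NormCodec {
    private final float[] table; // strictly ascending, 1..256 entries

    TableNormCodec(float[] sortedValues) {
        if (sortedValues.length == 0 || sortedValues.length > 256) {
            throw new IllegalArgumentException("table must hold between 1 and 256 values");
        }
        table = sortedValues.clone();
        for (int i = 1; i < table.length; i++) {
            if (!(table[i - 1] < table[i])) {
                throw new IllegalArgumentException("table must be strictly ascending");
            }
        }
    }

    /** Builds a table of evenly spaced values from min to max (illustrative helper). */
    static TableNormCodec linear(float min, float max, int bags) {
        if (bags < 2 || !(min < max)) {
            throw new IllegalArgumentException("need at least two bags and min < max");
        }
        float[] values = new float[bags];
        float step = (max - min) / (bags - 1);
        for (int i = 0; i < bags; i++) {
            values[i] = min + i * step;
        }
        return new TableNormCodec(values);
    }

    @Override
    public byte encode(float norm) {
        int i = Arrays.binarySearch(table, norm);
        if (i < 0) {
            int ins = -i - 1; // index of the first entry greater than norm
            if (ins == 0) {
                i = 0;                                   // below the table: clamp to first bag
            } else if (ins == table.length) {
                i = table.length - 1;                    // above the table: clamp to last bag
            } else {
                // Pick whichever neighbor is closer to the requested value.
                i = (norm - table[ins - 1] <= table[ins] - norm) ? ins - 1 : ins;
            }
        }
        return (byte) i;
    }

    @Override
    public float decode(byte b) {
        return table[Math.min(b & 0xFF, table.length - 1)];
    }
}
```

For example, `TableNormCodec.linear(100f, 250f, 60)` spaces the bags about 2.54 apart, so a value inside the range decodes to within roughly 1.3 of what was encoded.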