Re: SpanNearQuery: All matches within slop

2008-09-02 Thread Paul Elschot
A bit late in reacting, but you may also want to take a look at this: Paolo Boldi, Sebastiano Vigna, "Efficient Optimally Lazy Algorithms for Minimal-Interval Semantics", Oct 2007, arXiv:0710.1525v1. The algorithms used in the Lucene spans package are surprisingly similar. Nevertheless, there are

[jira] Commented: (LUCENE-1313) Ocean Realtime Search

2008-09-02 Thread Karl Wettin (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627612#action_12627612 ] Karl Wettin commented on LUCENE-1313: - Hi Jason, I took an initial look at your code l

Re: Summer of Code idea for lucene

2008-09-02 Thread Joaquin Perez Iglesias
Hi all, I finally got some time to finish the BM25/BM25F implementation for Lucene. You can find more details at http://nlp.uned.es/~jperezi/Lucene-BM25/. It has been tested, but I cannot guarantee that it is bug-free. It would be great to receive some feedback about it. There are some details abou

Moving SweetSpotSimilarity out of contrib

2008-09-02 Thread Shai Erera
Hi, Following Doron's quality work enhancements in TREC 2007 (http://wiki.apache.org/lucene-java/TREC_2007_Million_Queries_Track_-_IBM_Haifa_Team), I was wondering if it's possible to move the SweetSpotSimilarity to Lucene's main code stream (out of "contrib" that is). It shows significant improv

Re: Moving SweetSpotSimilarity out of contrib

2008-09-02 Thread Grant Ingersoll
On Sep 2, 2008, at 6:07 AM, Shai Erera wrote: Hi, Following Doron's quality work enhancements in TREC 2007 (http://wiki.apache.org/lucene-java/TREC_2007_Million_Queries_Track_-_IBM_Haifa_Team ), I was wondering if it's possible to move the SweetSpotSimilarity to Lucene's main code stream (o

[jira] Commented: (LUCENE-1313) Ocean Realtime Search

2008-09-02 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627642#action_12627642 ] Jason Rutherglen commented on LUCENE-1313: -- Hi Karl, Thanks for taking a look at

Re: Moving SweetSpotSimilarity out of contrib

2008-09-02 Thread Shai Erera
From a legal standpoint, whenever we need to use open-source code, somebody has to inspect the code and 'approve' it. This inspection makes sure there's no use of 3rd party libraries, to which we'd need to get open-source clearance as well. This process was done for Lucene core, but not for contr

Re: Moving SweetSpotSimilarity out of contrib

2008-09-02 Thread Chris Hostetter
: From a legal standpoint, whenever we need to use open-source code, somebody : has to inspect the code and 'approve' it. This inspection makes sure there's : no use of 3rd party libraries, to which we'd need to get open-source : clearance as well. : : This process was done for Lucene core, but

[jira] Commented: (LUCENE-1243) A few new benchmark tasks

2008-09-02 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627773#action_12627773 ] Grant Ingersoll commented on LUCENE-1243: - My last look at it seemed like it was i

Re: [jira] Commented: (LUCENE-1243) A few new benchmark tasks

2008-09-02 Thread Mark Miller
And all that has been added is a better way to do the sort testing - it's a new doc maker that lets you pick a range of random numbers to generate in a known sort field. The range of random ints to be generated can be specified. It's fairly simple, but it's a start that works, and if we need more

[jira] Assigned: (LUCENE-1370) Patch to make ShingleFilter output a unigram if no ngrams can be generated

2008-09-02 Thread Karl Wettin (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wettin reassigned LUCENE-1370: --- Assignee: Karl Wettin > Patch to make ShingleFilter output a unigram if no ngrams can be gen

[jira] Created: (LUCENE-1373) Most of the contributed Analyzers suffer from invalid recognition of acronyms.

2008-09-02 Thread Mark Lassau (JIRA)
Most of the contributed Analyzers suffer from invalid recognition of acronyms. -- Key: LUCENE-1373 URL: https://issues.apache.org/jira/browse/LUCENE-1373 Project: Lucene - Jav

[jira] Updated: (LUCENE-1373) Most of the contributed Analyzers suffer from invalid recognition of acronyms.

2008-09-02 Thread Mark Lassau (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Lassau updated LUCENE-1373: Description: LUCENE-1068 describes a bug in StandardTokenizer whereby a string like "www.apache.o

[jira] Commented: (LUCENE-1373) Most of the contributed Analyzers suffer from invalid recognition of acronyms.

2008-09-02 Thread Mark Lassau (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627873#action_12627873 ] Mark Lassau commented on LUCENE-1373: - I would be willing to contribute a patch to mak

[jira] Commented: (LUCENE-1373) Most of the contributed Analyzers suffer from invalid recognition of acronyms.

2008-09-02 Thread Mark Lassau (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627875#action_12627875 ] Mark Lassau commented on LUCENE-1373: - Causes JIRA issue [JRA-15484|http://jira.atlass

[jira] Commented: (LUCENE-1068) Invalid behavior of StandardTokenizerImpl

2008-09-02 Thread Mark Lassau (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627876#action_12627876 ] Mark Lassau commented on LUCENE-1068: - Causes JIRA issue [JRA-15484|http://jira.atlass

[jira] Commented: (LUCENE-1373) Most of the contributed Analyzers suffer from invalid recognition of acronyms.

2008-09-02 Thread Mark Lassau (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627903#action_12627903 ] Mark Lassau commented on LUCENE-1373: - Had a closer look at the code, including change