Re: Potential bug in StandardTokenizerImpl

2007-11-26 Thread Shai Erera
I understand it would change the behavior of existing search solutions; however, the current behavior is just wrong. An ACRONYM cannot be ABC.DEF. If you look up "acronym" in Wikipedia, you find only examples like I.B.M. / U.S.A., or NATO, IBM, USA, but nothing of the form StandardAnalyzer currentl

Re: Potential bug in StandardTokenizerImpl

2007-11-26 Thread Chris Hostetter
: If you pass "www.abc.com", the output is (www.abc.com,0,11,type=<HOST>) : (which is correct in my opinion). : However, if you pass "www.abc.com." (notice the extra '.' at the end), the : output is (wwwabccom,0,12,type=<ACRONYM>). see also... http://www.nabble.com/Inconsistent-StandardTokenizer-behaviour-tf596

Potential bug in StandardTokenizerImpl

2007-11-26 Thread Shai Erera
Hi, this question was asked on the users mailing list, but I think it's a bug, so I'll describe it here. The following code should print the output of the StandardAnalyzer: Analyzer analyzer = new StandardAnalyzer(); TokenStream ts = analyzer.tokenStream("content", new StringReade

[jira] Updated: (LUCENE-1058) New Analyzer for buffering tokens

2007-11-26 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-1058: Fix Version/s: 2.3 > New Analyzer for buffering tokens > -

[jira] Updated: (LUCENE-1058) New Analyzer for buffering tokens

2007-11-26 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-1058: Attachment: LUCENE-1058.patch A new version of this with the following changes/additions:

Occasional failure in TestStressIndexing.java

2007-11-26 Thread Grant Ingersoll
OK, I have seen this twice in the last two days: Testsuite: org.apache.lucene.index.TestStressIndexing [junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 18.58 sec [junit] [junit] - Standard Output --- [junit] java.lang.NullPointerException [

[jira] Closed: (LUCENE-921) IndexReader.FieldOption has incomplete Javadocs

2007-11-26 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch closed LUCENE-921. Resolution: Fixed Cool! Thanks Daniel, keep going! ;) > IndexReader.FieldOption has incomplete Jav

[jira] Commented: (LUCENE-921) IndexReader.FieldOption has incomplete Javadocs

2007-11-26 Thread Daniel Naber (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545645 ] Daniel Naber commented on LUCENE-921: - I added some javadoc comments. Not much, but I think this can be closed.

[jira] Commented: (LUCENE-935) Improve maven artifacts

2007-11-26 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545639 ] Michael Busch commented on LUCENE-935: -- I think in nightly.sh we should call "ant generate-maven-artifacts" befo

[jira] Closed: (LUCENE-920) IndexModifier has incomplete Javadocs

2007-11-26 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch closed LUCENE-920. Resolution: Fixed I agree. > IndexModifier has incomplete Javadocs > -

[jira] Commented: (LUCENE-920) IndexModifier has incomplete Javadocs

2007-11-26 Thread Daniel Naber (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545634 ] Daniel Naber commented on LUCENE-920: - I think this bug can be closed, as IndexModifier is deprecated. > IndexMo

[jira] Updated: (LUCENE-1046) Dead code in SpellChecker.java (branch never executes)

2007-11-26 Thread Daniel Naber (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Naber updated LUCENE-1046: - Attachment: LUCENE-1046.diff Thanks for your report, could you try out this patch? > Dead code

[jira] Commented: (LUCENE-1045) SortField.AUTO doesn't work with long

2007-11-26 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545612 ] Grant Ingersoll commented on LUCENE-1045: - That's fine by me, I think we just need to document it clearly in

[jira] Commented: (LUCENE-1045) SortField.AUTO doesn't work with long

2007-11-26 Thread Doug Cutting (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545601 ] Doug Cutting commented on LUCENE-1045: -- > True, it isn't all that useful of an interface. Perhaps it should be

[jira] Commented: (LUCENE-982) Create new method optimize(int maxNumSegments) in IndexWriter

2007-11-26 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545594 ] Michael McCandless commented on LUCENE-982: --- OK I plan to commit this in a day or two. > Create new method

[jira] Commented: (LUCENE-1045) SortField.AUTO doesn't work with long

2007-11-26 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545592 ] Yonik Seeley commented on LUCENE-1045: -- My only concern is that ExtendedFieldCache(Impl) adds more public class

[jira] Updated: (LUCENE-1045) SortField.AUTO doesn't work with long

2007-11-26 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-1045: Attachment: LUCENE-1045.patch Refactoring to remove duplicated code and use the ExtendedFi

[jira] Commented: (LUCENE-1045) SortField.AUTO doesn't work with long

2007-11-26 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545584 ] Grant Ingersoll commented on LUCENE-1045: - {quote} Normally right, but a user can't provide their own implem

[jira] Reopened: (LUCENE-1045) SortField.AUTO doesn't work with long

2007-11-26 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reopened LUCENE-1045: - Lucene Fields: [New, Patch Available] (was: [Patch Available, New]) There is a cleaner

[jira] Commented: (LUCENE-1045) SortField.AUTO doesn't work with long

2007-11-26 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545580 ] Yonik Seeley commented on LUCENE-1045: -- Normally right, but a user can't provide their own implementation for l

[jira] Commented: (LUCENE-1045) SortField.AUTO doesn't work with long

2007-11-26 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545578 ] Grant Ingersoll commented on LUCENE-1045: - Because it is an interface and that could break implementations.

[jira] Commented: (LUCENE-1045) SortField.AUTO doesn't work with long

2007-11-26 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545575 ] Yonik Seeley commented on LUCENE-1045: -- Hmmm, I didn't realize that ExtendedFieldCache was added until you just

[jira] Commented: (LUCENE-1045) SortField.AUTO doesn't work with long

2007-11-26 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545567 ] Grant Ingersoll commented on LUCENE-1045: - This patch seems a bit strange to me (sorry for getting to it so

Re: [jira] Commented: (LUCENE-1044) Behavior on hard power shutdown

2007-11-26 Thread robert engels
Thanks. On Nov 26, 2007, at 1:34 PM, Michael McCandless wrote: It's the "write" method in o.a.l.index.SegmentInfos It's called from IndexWriter/DirectoryIndexReader. Mike "robert engels" <[EMAIL PROTECTED]> wrote: Can you point me to the code that does the actual writing of the SEGMENTS.XX

Re: [jira] Commented: (LUCENE-1044) Behavior on hard power shutdown

2007-11-26 Thread Michael McCandless
It's the "write" method in o.a.l.index.SegmentInfos It's called from IndexWriter/DirectoryIndexReader. Mike "robert engels" <[EMAIL PROTECTED]> wrote: > Can you point me to the code that does the actual writing of the > SEGMENTS.XXX file? > > On Nov 26, 2007, at 1:16 PM, Michael McCandless w

Re: [jira] Commented: (LUCENE-1044) Behavior on hard power shutdown

2007-11-26 Thread robert engels
Can you point me to the code that does the actual writing of the SEGMENTS.XXX file? On Nov 26, 2007, at 1:16 PM, Michael McCandless wrote: This is correct. This just means the DeletionPolicy cannot delete a commit point until all files referenced by a future (the next) commit point are done

Re: [jira] Commented: (LUCENE-1044) Behavior on hard power shutdown

2007-11-26 Thread Michael McCandless
This is correct. This just means the DeletionPolicy cannot delete a commit point until all files referenced by a future (the next) commit point are done being sync'd (DeletionPolicy needs to query the Directory to find out which files are on stable storage). However before we even go there, I'm

Re: svn commit: r598382 - /lucene/java/trunk/CHANGES.txt

2007-11-26 Thread Yonik Seeley
FYI, I think we've decided to try and not go back and reformat CHANGES.txt http://www.nabble.com/Re%3A-svn-commit%3A-r468289lucene-java-trunk-CHANGES.txt-tf2653620.html#a7438247 -Yonik On Nov 26, 2007 1:53 PM, <[EMAIL PROTECTED]> wrote: > Author: dnaber > Date: Mon Nov 26 10:53:26 2007 > Ne

[jira] Closed: (LUCENE-1045) SortField.AUTO doesn't work with long

2007-11-26 Thread Daniel Naber (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Naber closed LUCENE-1045. Resolution: Fixed Fix Version/s: 2.3 Lucene Fields: [New, Patch Available] (was: [Patch

Re: [jira] Commented: (LUCENE-1044) Behavior on hard power shutdown

2007-11-26 Thread robert engels
I am not sure all of this effort is going to work anyway... I think you need to sync all of the segment files, THEN write the segments. file and sync it. It does you no good if there is a valid segments.XXX file but some of the dependent files may not have been written successfully to disk.

[jira] Commented: (LUCENE-1044) Behavior on hard power shutdown

2007-11-26 Thread Doug Cutting (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545535 ] Doug Cutting commented on LUCENE-1044: -- > I found out however that delaying the syncs (but intending to sync) a

[jira] Resolved: (LUCENE-1059) bad java practices which affect performance (result of code inspection)

2007-11-26 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1059. Resolution: Fixed Lucene Fields: [New, Patch Available] (was: [Patch Availa

[jira] Commented: (LUCENE-794) Extend contrib Highlighter to properly support phrase queries and span queries

2007-11-26 Thread Michael Goddard (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545462 ] Michael Goddard commented on LUCENE-794: Mark, I did a little bit more with this since I needed support for

[jira] Updated: (LUCENE-1001) Add Payload retrieval to Spans

2007-11-26 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-1001: Attachment: LUCENE-1001.patch Fixes the unordered problem. Still needs more testing, but

[jira] Commented: (LUCENE-1001) Add Payload retrieval to Spans

2007-11-26 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545419 ] Grant Ingersoll commented on LUCENE-1001: - There is an issue w/ this patch related to unordered, overlapping