Re: Fuzzy search change

2009-06-20 Thread Varun Dhussa
Hi, I can port the code to Java. I do not know the Lucene file structures etc. as of now, so if someone with experience storing and indexing trigrams can work on that part, I can port the rest of the code. Regards, Varun Dhussa, Product Architect, CE InfoSystems (P) Ltd
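For readers unfamiliar with the term, the trigrams mentioned above are overlapping three-character slices of a term. The following is only a minimal sketch of how they could be produced before indexing; the class and method names are hypothetical, it is not the code being ported in this thread, and it does not touch any Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Lucene): splits a term into overlapping character trigrams. */
final class TrigramSketch {
    private TrigramSketch() {}

    /** Returns every 3-character window of the term, in order of appearance. */
    static List<String> trigrams(String term) {
        List<String> grams = new ArrayList<>();
        // Assumption of this sketch: terms shorter than three characters yield no trigrams.
        for (int i = 0; i + 3 <= term.length(); i++) {
            grams.add(term.substring(i, i + 3));
        }
        return grams;
    }
}
```

Each trigram could then be indexed as its own token, so that a misspelled query still shares most of its trigrams with the intended term.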

[jira] Commented: (LUCENE-1630) Mating Collector and Scorer on doc Id orderness

2009-06-20 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722159#action_12722159 ] Michael McCandless commented on LUCENE-1630: {quote} bq. CustomScorer's

[jira] Updated: (LUCENE-1630) Mating Collector and Scorer on doc Id orderness

2009-06-20 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1630: --- Attachment: LUCENE-1630.patch Patch looks good! I attached a new version w/ some

[jira] Commented: (LUCENE-1701) Add NumericField and NumericSortField, make plain text numeric parsers public in FieldCache, move trie parsers to FieldCache

2009-06-20 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722166#action_12722166 ] Michael McCandless commented on LUCENE-1701: {quote} bq. Someday maybe I'll

[jira] Commented: (LUCENE-1703) Add a waitForMerges() method to IndexWriter

2009-06-20 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722168#action_12722168 ] Michael McCandless commented on LUCENE-1703: bq. After I added a number of

[jira] Commented: (LUCENE-1703) Add a waitForMerges() method to IndexWriter

2009-06-20 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722169#action_12722169 ] Michael McCandless commented on LUCENE-1703: bq. or .. deprecate all the

[jira] Commented: (LUCENE-1705) Add deleteAllDocuments() method to IndexWriter

2009-06-20 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722172#action_12722172 ] Michael McCandless commented on LUCENE-1705: This should be simple to

Re: Shouldn't IndexWriter.commit(Map) accept Properties instead?

2009-06-20 Thread Michael McCandless
The javadocs state clearly it must be Map<String,String>. Plus, the type checking is in fact enforced (you hit an exception if you violate it), dynamically (like Python). And then I was thinking with 1.5 (3.0 -- huh, neat how it's exactly 2X) we'd statically type it (change Map to
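The difference between the dynamic check described above and a statically typed signature can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and IllegalArgumentException is an assumption, since the message does not say which exception IndexWriter actually throws.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration, not Lucene code: runtime validation of an untyped map. */
final class CommitUserDataSketch {
    private CommitUserDataSketch() {}

    /**
     * Accepts any map (including java.util.Properties) and verifies at runtime
     * that every key and value is a String, as a pre-generics API would have to.
     */
    static Map<String, String> checked(Map<?, ?> raw) {
        Map<String, String> typed = new HashMap<>();
        for (Map.Entry<?, ?> e : raw.entrySet()) {
            if (!(e.getKey() instanceof String) || !(e.getValue() instanceof String)) {
                // Assumed exception type; the real one is not named in the thread.
                throw new IllegalArgumentException("expected String keys and values, got " + e);
            }
            typed.put((String) e.getKey(), (String) e.getValue());
        }
        return typed;
    }
}
```

With a parameter declared as Map<String,String>, the compiler would reject a mistyped map up front and the loop above would be unnecessary. A Properties object is a Map<Object,Object>, so it would pass the runtime check only when every entry happens to be a String pair.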

[jira] Created: (LUCENE-1706) Site search powered by Lucene/Solr

2009-06-20 Thread Grant Ingersoll (JIRA)
Site search powered by Lucene/Solr -- Key: LUCENE-1706 URL: https://issues.apache.org/jira/browse/LUCENE-1706 Project: Lucene - Java Issue Type: New Feature Reporter: Grant Ingersoll

[jira] Commented: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722190#action_12722190 ] Uwe Schindler commented on LUCENE-1687: --- bq. True, but you know how we are about

[jira] Updated: (LUCENE-1706) Site search powered by Lucene/Solr

2009-06-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-1706: Attachment: LUCENE-1706.patch Patch attached. Site search powered by Lucene/Solr

[jira] Commented: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722193#action_12722193 ] Grant Ingersoll commented on LUCENE-1687: - Go for it. No need to deprecate EFC,

[jira] Commented: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722196#action_12722196 ] Uwe Schindler commented on LUCENE-1687: --- Removing ExtendedFieldCache completely would

[jira] Assigned: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-1687: - Assignee: Uwe Schindler (was: Grant Ingersoll) Remove ExtendedFieldCache by rolling

[jira] Commented: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722197#action_12722197 ] Grant Ingersoll commented on LUCENE-1687: - The whole point of this issue is that

[jira] Commented: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722198#action_12722198 ] Uwe Schindler commented on LUCENE-1687: --- It breaks backwards compatibility in the

[jira] Issue Comment Edited: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722197#action_12722197 ] Grant Ingersoll edited comment on LUCENE-1687 at 6/20/09 7:16 AM:

[jira] Issue Comment Edited: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722198#action_12722198 ] Uwe Schindler edited comment on LUCENE-1687 at 6/20/09 7:19 AM:

[jira] Commented: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722199#action_12722199 ] Grant Ingersoll commented on LUCENE-1687: - Uwe, The ENTIRE point of this issue is

[jira] Commented: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722201#action_12722201 ] Yonik Seeley commented on LUCENE-1687: -- Uwe is right - EFC has been around since 2.3,

[jira] Commented: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722202#action_12722202 ] Grant Ingersoll commented on LUCENE-1687: - Why? So much for case-by-case back

[jira] Commented: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722203#action_12722203 ] Grant Ingersoll commented on LUCENE-1687: - And Yonik, if your argument is b/c

[jira] Commented: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722204#action_12722204 ] Yonik Seeley commented on LUCENE-1687: -- bq. And Yonik, if your argument is b/c Solr

[jira] Commented: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722206#action_12722206 ] Grant Ingersoll commented on LUCENE-1687: - OK Remove ExtendedFieldCache by

[jira] Updated: (LUCENE-1701) Add NumericField and NumericSortField, make plain text numeric parsers public in FieldCache, move trie parsers to FieldCache

2009-06-20 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1701: -- Attachment: LUCENE-1701.patch Patch with all changes, including LUCENE-1687 (it is easier to

[jira] Commented: (LUCENE-1687) Remove ExtendedFieldCache by rolling functionality into FieldCache

2009-06-20 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722217#action_12722217 ] Uwe Schindler commented on LUCENE-1687: --- Patch available in LUCENE-1701! I close

[jira] Commented: (LUCENE-1701) Add NumericField and NumericSortField, make plain text numeric parsers public in FieldCache, move trie parsers to FieldCache

2009-06-20 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722218#action_12722218 ] Uwe Schindler commented on LUCENE-1701: --- I know you will kill me, Yonik, and Mike

[jira] Updated: (LUCENE-1701) Add NumericField and NumericSortField, make plain text numeric parsers public in FieldCache, move trie parsers to FieldCache

2009-06-20 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1701: -- Attachment: LUCENE-1701-test-tag-special.patch LUCENE-1701.patch The last

[jira] Commented: (LUCENE-1703) Add a waitForMerges() method to IndexWriter

2009-06-20 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722231#action_12722231 ] Shai Erera commented on LUCENE-1703: bq. Though we have a migration challenge (if we

Excessive use of ensureOpen()

2009-06-20 Thread Shai Erera
Hi, I noticed both IndexReader and IndexWriter call ensureOpen in almost every method. How critical is this check? Why would someone call any of these when the writer or reader is closed? If we add isOpen() to both, then I think we can remove this code from IndexWriter/Reader, and tell folks to

Re: Excessive use of ensureOpen()

2009-06-20 Thread Yonik Seeley
On Sat, Jun 20, 2009 at 2:10 PM, Shai Erera <ser...@gmail.com> wrote: I noticed both IndexReader and IndexWriter call ensureOpen in almost every method. How critical is this check? Why would someone call any of these when the writer or reader is closed? It's to catch user errors, calling methods
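The guard pattern under discussion can be sketched roughly like this. It is a simplified stand-in, not the IndexReader or IndexWriter source: the class name and numDocs() body are hypothetical, IllegalStateException is an assumed exception type, and isOpen() is the method proposed in the original message rather than an existing one.

```java
/** Hypothetical stand-in for a reader/writer that guards its methods against use after close. */
final class GuardedResourceSketch implements AutoCloseable {
    // Plain (non-volatile) flag: another thread's close() may be noticed late.
    private boolean closed;

    /** Fails fast when the caller uses the object after close(). */
    private void ensureOpen() {
        if (closed) {
            // Assumed exception type for this sketch.
            throw new IllegalStateException("this resource is closed");
        }
    }

    /** Example guarded operation; the return value is a placeholder. */
    int numDocs() {
        ensureOpen();
        return 0;
    }

    /** The alternative proposed in the thread: let callers ask before calling. */
    boolean isOpen() {
        return !closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

The trade-off is that ensureOpen() costs one branch per call but turns a misuse into an immediate, descriptive failure, whereas relying on isOpen() leaves the check to the caller.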

[jira] Updated: (LUCENE-1699) Field tokenStream should be usable with stored fields.

2009-06-20 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated LUCENE-1699: - Attachment: LUCENE-1699.patch Patch attached. I've attempted to clean up some of the semantics

Parallelize tests

2009-06-20 Thread Jason Rutherglen
I was looking at how to parallelize the tests. It seems like this Ant task would work; is there an open issue to do this? http://ant.apache.org/manual/CoreTasks/parallel.html

[jira] Commented: (LUCENE-1701) Add NumericField and NumericSortField, make plain text numeric parsers public in FieldCache, move trie parsers to FieldCache

2009-06-20 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722271#action_12722271 ] Michael McCandless commented on LUCENE-1701: bq. there is a possibility to

Re: Parallelize tests

2009-06-20 Thread Michael McCandless
We've touched on this before: http://www.gossamer-threads.com/lists/lucene/java-dev/69669 I'd love to see a clean solution here (the tests are embarrassingly parallelizable, and we all have machines with good concurrency these days)... I have a rather hacked up solution now, that uses

Re: Excessive use of ensureOpen()

2009-06-20 Thread Michael McCandless
I agree we should only promise best-effort, not immediate detection on close, so we shouldn't be checking volatile refCount. Mike On Sat, Jun 20, 2009 at 2:32 PM, Yonik Seeley <yo...@lucidimagination.com> wrote: On Sat, Jun 20, 2009 at 2:10 PM, Shai Erera <ser...@gmail.com> wrote: I noticed both

[jira] Commented: (LUCENE-1703) Add a waitForMerges() method to IndexWriter

2009-06-20 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722274#action_12722274 ] Michael McCandless commented on LUCENE-1703: bq. If we change the default to

[jira] Updated: (LUCENE-1466) CharFilter - normalize characters before tokenizer

2009-06-20 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1466: --- Attachment: LUCENE-1466-back.patch LUCENE-1466.patch I think we

[jira] Commented: (LUCENE-1466) CharFilter - normalize characters before tokenizer

2009-06-20 Thread Koji Sekiguchi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722283#action_12722283 ] Koji Sekiguchi commented on LUCENE-1466: Oops. Thanks for the updated patch, Mike!

Re: Parallelize tests

2009-06-20 Thread Mark Miller
I was looking at the Ant parallelize stuff too - I think that only the very latest release has the built-in parallelize tests functionality though. It just came out a bit ago. - Mark Michael McCandless wrote: We've touched on this before:

[jira] Commented: (LUCENE-1703) Add a waitForMerges() method to IndexWriter

2009-06-20 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722292#action_12722292 ] Shai Erera commented on LUCENE-1703: I'm not against the default, just thought that

[jira] Created: (LUCENE-1707) Don't use ensureOpen() excessively in IndexReader and IndexWriter

2009-06-20 Thread Shai Erera (JIRA)
Don't use ensureOpen() excessively in IndexReader and IndexWriter - Key: LUCENE-1707 URL: https://issues.apache.org/jira/browse/LUCENE-1707 Project: Lucene - Java Issue Type:

3MB lucene-analyzers.jar?

2009-06-20 Thread Ryan McKinley
With the added analyzer for LUCENE-1629, it seems the jar file is now ~3.5MB. Given the size, does it make sense to put it in its own jar file? That way programs can easily exclude it if space is a concern. thanks ryan