[jira] Updated: (LUCENE-2351) optimize automatonquery

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2351: Attachment: LUCENE-2351.patch attached is the same patch as before, except it includes a random

[jira] Resolved: (LUCENE-2351) optimize automatonquery

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-2351. - Resolution: Fixed Committed revision 929065. optimize automatonquery ---

Re: Incremental Field Updates

2010-03-30 Thread Grant Ingersoll
On Mar 29, 2010, at 10:11 AM, mark harwood wrote: Of course, but what about the Lucene doc id doesn't provide that? The question being how you determine the correct doc id to use in the first place (especially when they are known to be volatile) - the current answer is to use a stable

[jira] Commented: (LUCENE-2302) Replacement for TermAttribute+Impl with extended capabilities (byte[] support, CharSequence, Appendable)

2010-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851346#action_12851346 ] Michael McCandless commented on LUCENE-2302: Uwe is this issue done?

[jira] Commented: (LUCENE-2126) Split up IndexInput and IndexOutput into DataInput and DataOutput

2010-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851347#action_12851347 ] Michael McCandless commented on LUCENE-2126: Michael will this land on flex or

new facet parameter: facet.exists=true

2010-03-30 Thread Gregor Kaczor
Faceting in indexes with document volumes exceeding twenty million documents is a time-consuming and, in particular, memory-consuming search. In such huge indexes I am not interested in whether there are 4 or 5 million documents of a special type, I just want to know there are some and if I choose that facet will
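
The difference between counting and an existence check can be sketched in a few lines of plain Python. This is only an illustration of the idea in the message above, under the assumption that facet.exists would mean "stop at the first matching document"; it is not Solr code, and `value_exists` and the sample documents are hypothetical names made up for the sketch.

```python
def value_exists(docs, field, value):
    """Return (found, inspected): whether any doc has `value` in `field`,
    and how many docs were looked at before the answer was known."""
    inspected = 0
    for doc in docs:
        inspected += 1
        if doc.get(field) == value:
            # Existence is settled by the first hit; no need to count the rest.
            return True, inspected
    return False, inspected


# Hypothetical result set: five documents with a "type" field.
docs = [
    {"id": 1, "type": "pdf"},
    {"id": 2, "type": "pdf"},
    {"id": 3, "type": "html"},
    {"id": 4, "type": "pdf"},
    {"id": 5, "type": "html"},
]

found_pdf = value_exists(docs, "type", "pdf")
found_doc = value_exists(docs, "type", "doc")
```

A full count has to visit every document, while the existence check can return as soon as one match is seen; only a value that never occurs forces a full scan.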

[jira] Commented: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851372#action_12851372 ] Michael McCandless commented on LUCENE-2111: Towards wrapping up flex, I ran a

Re: new facet parameter: facet.exists=true

2010-03-30 Thread Erik Hatcher
One trick to doing this is to index a field that lists the facet field names that each document possesses. Then you can facet on the field of field names (sounds confusing, sorry) and you'll know if there are any documents in a result set that have values in, say, a category field.
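
The "field of field names" trick described above might look roughly like the following in plain Python. This is a sketch under stated assumptions, not Solr or Lucene code: the field name `facet_fields` is borrowed from the follow-up message in this thread, and `FACETABLE`, `add_field_names`, and `facet_on_field_names` are hypothetical names chosen for the illustration.

```python
from collections import Counter

# Hypothetical set of fields we would otherwise facet on directly.
FACETABLE = ("category", "author", "format")


def add_field_names(doc):
    """Index-time step: return a copy of `doc` with an extra 'facet_fields'
    entry listing which facetable fields actually have a value."""
    present = [name for name in FACETABLE if doc.get(name)]
    return {**doc, "facet_fields": present}


def facet_on_field_names(result_set):
    """Search-time step: facet on the small 'facet_fields' field instead of
    on the (potentially huge) value space of each individual field."""
    counts = Counter()
    for doc in result_set:
        counts.update(doc["facet_fields"])
    return counts


docs = [
    {"id": 1, "category": "book", "author": "smith"},
    {"id": 2, "category": "dvd"},
    {"id": 3, "format": "pdf"},
]
indexed = [add_field_names(d) for d in docs]
counts = facet_on_field_names(indexed)

# A non-zero count means some document in the result set has a category.
has_category = counts["category"] > 0
```

The faceted field has at most as many distinct values as there are facetable fields, so the answer to "does any document in this result set have a category?" comes from a very small facet.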

[jira] Commented: (LUCENE-2071) Allow updating of IndexWriter SegmentReaders

2010-03-30 Thread Tim Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851388#action_12851388 ] Tim Smith commented on LUCENE-2071: --- +1 I have a special subclassed IndexSearcher that

[jira] Commented: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851400#action_12851400 ] Robert Muir commented on LUCENE-2111: - bq. I think net/net we are good to land flex!

Re: new facet parameter: facet.exists=true

2010-03-30 Thread Erik Hatcher
Faceting on a facet_fields field will most likely have only a handful of values or fewer, so you'd be able to have that particular faceting cached to use quickly. I'm not sure how much memory it'd take up, but certainly not as much as actually faceting on the fields themselves. However,

[jira] Updated: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2111: --- Attachment: LUCENE-2111.patch Small fixes for flex -- fixes SpanTermQuery to throw

Landing the flex branch

2010-03-30 Thread Michael McCandless
I think the time has finally come! Pending one issue (LUCENE-2354 -- Uwe), I think flex is ready to land. I think the other issues with Fix Version = Flex Branch can be moved to 3.1 after we land. We still use the pre-flex APIs in a number of places... I think this is actually good (so we

[jira] Commented: (LUCENE-2126) Split up IndexInput and IndexOutput into DataInput and DataOutput

2010-03-30 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851451#action_12851451 ] Michael Busch commented on LUCENE-2126: --- I'll try to commit tonight to flex, but

[jira] Commented: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851452#action_12851452 ] Michael Busch commented on LUCENE-2111: --- bq. Flex is generally faster. Awesome

[jira] Commented: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851456#action_12851456 ] Robert Muir commented on LUCENE-2111: - {quote} There are certain specific wildcard

Re: Query modifier

2010-03-30 Thread David Smiley (@MITRE.org)
I observed this problem when I started using Lucene (ages ago) and it's a shame this situation persists. In summary, it would be tremendously useful if Query objects were fully mutable and offered a visitor pattern to allow walking the query tree to facilitate rewriting. I could open a JIRA
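
A visitor over a query tree, as suggested above, could be sketched like this. The classes below are hypothetical stand-ins written for the illustration and do not reflect Lucene's actual Query API; the sketch also rebuilds the tree instead of mutating it, which is a different design choice from the fully mutable queries the message asks for.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TermQuery:
    """Leaf node: a single term in a field (hypothetical stand-in class)."""
    field: str
    text: str

    def accept(self, visitor):
        return visitor.visit_term(self)


@dataclass(frozen=True)
class BooleanQuery:
    """Inner node: a group of sub-queries (hypothetical stand-in class)."""
    clauses: Tuple[object, ...]

    def accept(self, visitor):
        return visitor.visit_boolean(self)


class LowercaseTerms:
    """Example visitor: walks the tree and returns a rewritten copy in which
    every term text is lowercased."""

    def visit_term(self, query):
        return TermQuery(query.field, query.text.lower())

    def visit_boolean(self, query):
        # Recurse into each clause and rebuild the boolean node around the results.
        return BooleanQuery(tuple(c.accept(self) for c in query.clauses))


query = BooleanQuery((
    TermQuery("title", "Lucene"),
    BooleanQuery((TermQuery("body", "FLEX"),)),
))
rewritten = query.accept(LowercaseTerms())
```

Each node type only needs an `accept` method; new rewrites are added as new visitor classes without touching the query classes themselves.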

Re: Query modifier

2010-03-30 Thread David Smiley (@MITRE.org)
I observed this problem when I started using Lucene (ages ago) and it's a shame this situation persists. In summary, it would be tremendously useful if Query objects were fully mutable and offered a visitor pattern to allow walking the query tree to facilitate rewriting. It would also be nice

Re: Query modifier

2010-03-30 Thread Jason Rutherglen
David, I totally agree with this idea. On Tue, Mar 30, 2010 at 9:58 AM, David Smiley (@MITRE.org) dsmi...@mitre.org wrote: I observed this problem when I started using Lucene (ages ago) and it's a shame this situation persists.  In summary, it would be tremendously useful if Query objects

[jira] Commented: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851508#action_12851508 ] Michael McCandless commented on LUCENE-2111: bq. Awesome work! What changes

[jira] Commented: (LUCENE-2111) Wrapup flexible indexing

2010-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851511#action_12851511 ] Michael McCandless commented on LUCENE-2111: {quote} The term dictionary

[jira] Commented: (LUCENE-2071) Allow updating of IndexWriter SegmentReaders

2010-03-30 Thread Tim Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851528#action_12851528 ] Tim Smith commented on LUCENE-2071: --- found a couple of small issues with the patch

[jira] Commented: (LUCENE-2354) Convert NumericUtils and NumericTokenStream to use BytesRef instead of Strings/char[]

2010-03-30 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851598#action_12851598 ] Uwe Schindler commented on LUCENE-2354: --- Will work on this over the next few days and rewrite

[jira] Commented: (LUCENE-2302) Replacement for TermAttribute+Impl with extended capabilities (byte[] support, CharSequence, Appendable)

2010-03-30 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851596#action_12851596 ] Uwe Schindler commented on LUCENE-2302: --- Will add the javadocs and think about the

[jira] Created: (LUCENE-2358) rename KeywordMarkerTokenFilter

2010-03-30 Thread Robert Muir (JIRA)
rename KeywordMarkerTokenFilter --- Key: LUCENE-2358 URL: https://issues.apache.org/jira/browse/LUCENE-2358 Project: Lucene - Java Issue Type: Task Components: Analysis Reporter: Robert Muir

[jira] Updated: (LUCENE-2358) rename KeywordMarkerTokenFilter

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2358: Attachment: LUCENE-2358.patch attached is a patch (really svn move of KeywordMarkerTokenFilter

[jira] Commented: (LUCENE-2358) rename KeywordMarkerTokenFilter

2010-03-30 Thread Steven Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851652#action_12851652 ] Steven Rowe commented on LUCENE-2358: - Hi Robert, I'm working on a change to

[jira] Commented: (LUCENE-2358) rename KeywordMarkerTokenFilter

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851656#action_12851656 ] Robert Muir commented on LUCENE-2358: - {quote} I needed to be able to mark cached

[jira] Resolved: (LUCENE-2126) Split up IndexInput and IndexOutput into DataInput and DataOutput

2010-03-30 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch resolved LUCENE-2126. --- Resolution: Fixed Committed revision 929340. Split up IndexInput and IndexOutput into

[jira] Commented: (LUCENE-2358) rename KeywordMarkerTokenFilter

2010-03-30 Thread Steven Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851659#action_12851659 ] Steven Rowe commented on LUCENE-2358: - Sorry for cluttering this issue... {quote} I'm

[jira] Commented: (LUCENE-1488) multilingual analyzer based on icu

2010-03-30 Thread David Bowen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851713#action_12851713 ] David Bowen commented on LUCENE-1488: - I have a possibly naive question on the bigram

[jira] Commented: (LUCENE-1488) multilingual analyzer based on icu

2010-03-30 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851714#action_12851714 ] Robert Muir commented on LUCENE-1488: - bq. I have a possibly naive question on the