[jira] Commented: (LUCENE-1582) Make TrieRange completely independent from Document/Field with TokenStream of prefix encoded values

2009-04-02 Thread Shalin Shekhar Mangar (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695261#action_12695261 ] Shalin Shekhar Mangar commented on LUCENE-1582: --- bq. trieCodeLong/Int() retu

IndexWriter.addIndexesNoOptimize(IndexReader[] readers)

2009-04-02 Thread Jason Rutherglen
This seems like something that's tenable? It would be useful for merging ram indexes to disk where if a directory is passed, the directory may be changed.

[jira] Updated: (LUCENE-1584) Callback for intercepting merging segments in IndexWriter

2009-04-02 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated LUCENE-1584: - Attachment: LUCENE-1584.patch Patch is combined with LUCENE-1516. IndexWriter has a se

[jira] Commented: (LUCENE-1516) Integrate IndexReader with IndexWriter

2009-04-02 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695185#action_12695185 ] Jason Rutherglen commented on LUCENE-1516: -- In ReaderPool.get(SegmentInfo info, b

[jira] Created: (LUCENE-1584) Callback for intercepting merging segments in IndexWriter

2009-04-02 Thread Jason Rutherglen (JIRA)
Callback for intercepting merging segments in IndexWriter - Key: LUCENE-1584 URL: https://issues.apache.org/jira/browse/LUCENE-1584 Project: Lucene - Java Issue Type: Improvement

Re: Future projects

2009-04-02 Thread John Wang
Just to clarify, approach 1 and approach 2 are both currently performing OK for us. -John On Thu, Apr 2, 2009 at 2:41 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > On Thu, Apr 2, 2009 at 4:43 PM, Jason Rutherglen > wrote: > >> What does Bobo use the cached bitsets for? >

Lucene filter

2009-04-02 Thread addman
How do you create a Lucene Filter to check if a field has a value? It is part of a ChainedFilter that I am creating.

Re: Future projects

2009-04-02 Thread Jason Rutherglen
> I think I need to understand better why delete by Query isn't viable in your situation... The delete by query is a separate problem which I haven't fully explored yet. Tracking the segment genealogy is really an interim step for merging field caches before column stride fields gets implemented.

Re: Future projects

2009-04-02 Thread Michael McCandless
On Thu, Apr 2, 2009 at 4:43 PM, Jason Rutherglen wrote: >> What does Bobo use the cached bitsets for? > > Bobo is a faceting engine that uses custom field caches and sometimes cached > bitsets rather than relying exclusively on bitsets to calculate facets.  It > is useful where many facets (50+) n

[jira] Commented: (LUCENE-1574) PooledSegmentReader, pools SegmentReader underlying byte arrays

2009-04-02 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695130#action_12695130 ] Jason Rutherglen commented on LUCENE-1574: -- True the pool would hold onto spares,

Re: Future projects

2009-04-02 Thread Jason Rutherglen
> What does Bobo use the cached bitsets for? Bobo is a faceting engine that uses custom field caches and sometimes cached bitsets rather than relying exclusively on bitsets to calculate facets. It is useful where many facets (50+) need to be calculated and bitset caching, loading and intersection

[jira] Commented: (LUCENE-1574) PooledSegmentReader, pools SegmentReader underlying byte arrays

2009-04-02 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695115#action_12695115 ] Michael McCandless commented on LUCENE-1574: Presumably it wouldn't save on me

Re: Future projects

2009-04-02 Thread Michael McCandless
On Thu, Apr 2, 2009 at 2:29 PM, Jason Rutherglen wrote: >> What is "passing filters to the SegmentReader level"? EG as of > LUCENE-1483, we now ask a Filter for its DocIdSet once per > SegmentReader. > > The patch I was thinking of is LUCENE-1536. I wasn't sure what > the next steps are for it, i

Re: Future projects

2009-04-02 Thread Michael McCandless
I'm not sure how big a win this'd be, since the OS will cache those in RAM and the CPU cost there (to pull from OS's cache and reprocess) is maybe not high. Optimizing search is interesting, because it's the wicked slow queries that you need to make faster even when it's at the expense of wicked f

Re: Future projects

2009-04-02 Thread Michael McCandless
On Thu, Apr 2, 2009 at 2:07 PM, Jason Rutherglen wrote: > I'm interested in merging cached bitsets and field caches.  While this may > be something related to LUCENE-831, in Bobo there are custom field caches > which we want to merge in RAM (rather than reload from the reader using > termenum + te

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-04-02 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695098#action_12695098 ] Michael McCandless commented on LUCENE-1575: Super, all tests pass for me too.

Re: Future projects

2009-04-02 Thread Jason Rutherglen
> What is "passing filters to the SegmentReader level"? EG as of LUCENE-1483, we now ask a Filter for its DocIdSet once per SegmentReader. The patch I was thinking of is LUCENE-1536. I wasn't sure what the next steps are for it, i.e. the JumpScorer, Scorer.skipToButNotNext, or simply implementing

Re: Future projects

2009-04-02 Thread Jason Rutherglen
I'm interested in merging cached bitsets and field caches. While this may be something related to LUCENE-831, in Bobo there are custom field caches which we want to merge in RAM (rather than reload from the reader using termenum + termdocs). This could somehow lead to delete by doc id. Tracking

Re: Future projects

2009-04-02 Thread Jason Rutherglen
4) An additional, possibly contrib, module would cache the results of TermQueries. Looking at the TermQuery code, would we need to cache the entire docs and freqs as arrays, which would be a memory hog? On Wed, Apr 1, 2009 at 4:05 PM, Jason Rutherglen wrote: > Now that LUCENE-1516 is close to bei

Re: Future projects

2009-04-02 Thread John Wang
Michael: I love your suggestion on 3)! This really opens doors for flexible indexing. -John On Thu, Apr 2, 2009 at 1:40 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > On Wed, Apr 1, 2009 at 7:05 PM, Jason Rutherglen > wrote: > > Now that LUCENE-1516 is close to being commit

Re: Atomic optimize() + commit()

2009-04-02 Thread Michael McCandless
With ConcurrentMergeScheduler, IndexWriter has gained a lot of concurrency, such that an optimize (or normal BG merge) could be running at the same time as deletes/adds. I think this is a good thing and we should keep improving it (there are still places that block, eg while a flush is running a me

[jira] Commented: (LUCENE-1582) Make TrieRange completely independent from Document/Field with TokenStream of prefix encoded values

2009-04-02 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695016#action_12695016 ] Michael McCandless commented on LUCENE-1582: This sounds like a great improvem

Re: ant test should include test-tag

2009-04-02 Thread Michael McCandless
OK, I just left that new one off. So you have to run "ant test-core test-contrib". Mike On Thu, Apr 2, 2009 at 7:21 AM, Mark Miller wrote: > Wouldn't hurt I suppose - but test-core and test-contrib are probably > sufficient. I wasn't very clear with that comment. I was just saying, as > long a

[jira] Updated: (LUCENE-1583) SpanOrQuery skipTo() doesn't always move forwards

2009-04-02 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1583: --- Fix Version/s: 2.9 LUCENE-1327 was a similar issue. > SpanOrQuery skipTo() doesn't

[jira] Created: (LUCENE-1583) SpanOrQuery skipTo() doesn't always move forwards

2009-04-02 Thread Moti Nisenson (JIRA)
SpanOrQuery skipTo() doesn't always move forwards - Key: LUCENE-1583 URL: https://issues.apache.org/jira/browse/LUCENE-1583 Project: Lucene - Java Issue Type: Bug Components: Search

Atomic optimize() + commit()

2009-04-02 Thread Shai Erera
Hi I've run into a problem in my code when I upgraded to 2.4. I am not sure if it is a real problem, but I thought I'd let you know anyway. The following is a background of how I ran into the issue, but I think the discussion does not necessarily involve my use of Lucene. I have a class which wra

[jira] Updated: (LUCENE-1582) Make TrieRange completely independent from Document/Field with TokenStream of prefix encoded values

2009-04-02 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1582: -- Description: TrieRange currently has the following problem: - To add a field, that uses a trie

[jira] Created: (LUCENE-1582) Make TrieRange completely independent from Document/Field with TokenStream of prefix encoded values

2009-04-02 Thread Uwe Schindler (JIRA)
Make TrieRange completely independent from Document/Field with TokenStream of prefix encoded values --- Key: LUCENE-1582 URL: https://issues.apache.org/jira/browse/LUC

[jira] Updated: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-04-02 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-1575: --- Attachment: LUCENE-1575.6.patch Changes: # TimeLimitedCollector, TestTimeLimitedCollector and CHANGE

Re: ant test should include test-tag

2009-04-02 Thread Mark Miller
Wouldn't hurt I suppose - but test-core and test-contrib are probably sufficient. I wasn't very clear with that comment. I was just saying, as long as I can still run the tests a bit quicker than running through everything twice - which is already available. I should have just said +1. On the o

Re: ant test should include test-tag

2009-04-02 Thread Michael McCandless
OK I'll add a "test-core-contrib" target. Mike On Thu, Apr 2, 2009 at 6:45 AM, Mark Miller wrote: > Shai Erera wrote: >> >> I definitely agree. It would have saved me another patch submission in >> 1575 :) >> >> On Thu, Apr 2, 2009 at 12:44 PM, Michael McCandless >> mailto:luc...@mikemccandless.

Re: ant test should include test-tag

2009-04-02 Thread Mark Miller
Shai Erera wrote: I definitely agree. It would have saved me another patch submission in 1575 :) On Thu, Apr 2, 2009 at 12:44 PM, Michael McCandless <luc...@mikemccandless.com> wrote: I think back-compat tests ("ant test-tag") should run when you run "ant test". Any objec

Re: ant test should include test-tag

2009-04-02 Thread Shai Erera
I definitely agree. It would have saved me another patch submission in 1575 :) On Thu, Apr 2, 2009 at 12:44 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > I think back-compat tests ("ant test-tag") should run when you run "ant > test". > > Any objections? > > If not I'll commit soon

ant test should include test-tag

2009-04-02 Thread Michael McCandless
I think back-compat tests ("ant test-tag") should run when you run "ant test". Any objections? If not I'll commit soon... Mike

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-04-02 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12694938#action_12694938 ] Michael McCandless commented on LUCENE-1575: bq. I thought that ant test runs

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-04-02 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12694927#action_12694927 ] Shai Erera commented on LUCENE-1575: I thought that ant test runs all tests. Thanks fo

[jira] Updated: (LUCENE-1516) Integrate IndexReader with IndexWriter

2009-04-02 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1516: --- Attachment: LUCENE-1516.patch Added another test case to TestIndexWriterReader, stre

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-04-02 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12694919#action_12694919 ] Michael McCandless commented on LUCENE-1575: Could you also run "ant test-tag"

[jira] Commented: (LUCENE-1313) Realtime Search

2009-04-02 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12694917#action_12694917 ] Michael McCandless commented on LUCENE-1313: Jason, your last patch looks like

Re: Future projects

2009-04-02 Thread Michael McCandless
On Wed, Apr 1, 2009 at 7:05 PM, Jason Rutherglen wrote: > Now that LUCENE-1516 is close to being committed perhaps we can > figure out the priority of other issues: > > 1. Searchable IndexWriter RAM buffer I think first priority is to get a good assessment of the performance of the current implem

Re: Problem using Lucene RangeQuery

2009-04-02 Thread Danil ŢORIN
Lucene stores and searches STRINGS, so the range [0..2] may return 0, 1, 101, ..., 109, 11, 110, ..., 119, 12, ..., 2. Prefix and normalize your numbers, like: 001, 002, ..., 011, 012, ..., 113, etc.; if you'll have bigger numbers, put more 0's. All of these and much more are documented on the wiki, javadocs and so on, pl
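
The string-ordering point in the RangeQuery reply can be seen without Lucene at all. The following is a minimal sketch in plain Java, not Lucene code: the class `ZeroPadDemo` and the helpers `pad` and `stringRange` are hypothetical names made up for illustration, and `stringRange` only imitates an inclusive range over terms by using `String.compareTo`. It assumes non-negative integers and a fixed width chosen large enough for the biggest value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class ZeroPadDemo {

    // Hypothetical helper: left-pad a non-negative number with zeros to a fixed width.
    static String pad(int value, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", value);
    }

    // Hypothetical helper: keep the terms that lie in [lower, upper] under plain
    // String comparison, which is how an inclusive range over string terms behaves.
    static List<String> stringRange(List<String> terms, String lower, String upper) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0) {
                hits.add(term);
            }
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) {
        int[] numbers = {0, 1, 2, 3, 11, 101};

        List<String> raw = new ArrayList<>();
        List<String> padded = new ArrayList<>();
        for (int n : numbers) {
            raw.add(Integer.toString(n));
            padded.add(pad(n, 3)); // width 3 is an assumption: values stay below 1000
        }

        // Unpadded terms: "101" and "11" sort between "1" and "2", so they match.
        System.out.println(stringRange(raw, "0", "2"));

        // Padded terms: string order now agrees with numeric order.
        System.out.println(stringRange(padded, pad(0, 3), pad(2, 3)));
    }
}
```

The query bounds have to be padded with the same width as the indexed terms, otherwise the comparison falls back to the unpadded behaviour.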