[jira] Commented: (LUCENE-1567) New flexible query parser

2009-06-09 Thread Adriano Crestani (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717588#action_12717588 ] Adriano Crestani commented on LUCENE-1567: -- > We also have to change "List change

[jira] Commented: (LUCENE-1453) When reopen returns a new IndexReader, both IndexReaders may now control the lifecycle of the underlying Directory which is managed by reference counting

2009-06-09 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717654#action_12717654 ] Uwe Schindler commented on LUCENE-1453: --- I forgot to mention: bq. Oh, you mean ther

[jira] Commented: (LUCENE-1453) When reopen returns a new IndexReader, both IndexReaders may now control the lifecycle of the underlying Directory which is managed by reference counting

2009-06-09 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717657#action_12717657 ] Earwin Burrfoot commented on LUCENE-1453: - Patch looks fine. I read the last one,

[jira] Commented: (LUCENE-1453) When reopen returns a new IndexReader, both IndexReaders may now control the lifecycle of the underlying Directory which is managed by reference counting

2009-06-09 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717660#action_12717660 ] Uwe Schindler commented on LUCENE-1453: --- Both patches are the same, the second one i

[jira] Commented: (LUCENE-1453) When reopen returns a new IndexReader, both IndexReaders may now control the lifecycle of the underlying Directory which is managed by reference counting

2009-06-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717709#action_12717709 ] Michael McCandless commented on LUCENE-1453: OK patch looks good Uwe! This is

bulk fixing svn eol-style?

2009-06-09 Thread Michael McCandless
We have a number of sources that don't have eol-style set to "native"... This causes problems, eg, patches to such files become degenerate (remove all lines, add all lines), which of course hides what really changed. So... are there any objections if I go through all our sources and set the eol-s

Re: Some thoughts around the use of reader.isDeleted and hasDeletions

2009-06-09 Thread Michael McCandless
> I recently read CHANGES to learn more about the readOnly parameter > IndexReader now supports, and came across LUCENE-1329 with a comment that > isDeleted was made not synchronized if readOnly=true (e.g. > ReadOnlyIndexReader), which can affect search code, as it is usually the > bottleneck for s

Re: HitCollector#collect(int,float,Collection)

2009-06-09 Thread Michael McCandless
My guess is such an approach could be made to work... But I think I'd rather directly improve *Scorer so that they provide such details (and you pay no performance cost if you don't ask for these details). Likewise for positional details of matching, which highlighter could use. And, then, we co

Re: Some thoughts around the use of reader.isDeleted and hasDeletions

2009-06-09 Thread Yonik Seeley
On Mon, Jun 8, 2009 at 4:07 PM, Shai Erera wrote: > if the > reader has no deletions, there's no point checking for each document if > there are deletions and/or if the document was deleted. If there are no deletions, it's just a null pointer check, right? Or would there be other benefits? -Yoni

Re: Some thoughts around the use of reader.isDeleted and hasDeletions

2009-06-09 Thread Yonik Seeley
On Tue, Jun 9, 2009 at 11:52 AM, Michael McCandless wrote: > Actually: I think we should also change IndexReader.document to not > check if it's deleted?  (Renaming it to something like rawDocument(), > storedDocument(), something, in the process, and deprecating the old > one). +1 -Yonik http:/

Re: Some thoughts around the use of reader.isDeleted and hasDeletions

2009-06-09 Thread Earwin Burrfoot
> Actually: I think we should also change IndexReader.document to not > check if it's deleted?  (Renaming it to something like rawDocument(), > storedDocument(), something, in the process, and deprecating the old > one). Yup. After all the most common use-case is to load a document after finding it

[jira] Assigned: (LUCENE-1677) Remove GCJ IndexReader specializations

2009-06-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1677: -- Assignee: Michael McCandless > Remove GCJ IndexReader specializations > --

[jira] Commented: (LUCENE-1677) Remove GCJ IndexReader specializations

2009-06-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717740#action_12717740 ] Michael McCandless commented on LUCENE-1677: bq. Mike, you said you are going

Re: bulk fixing svn eol-style?

2009-06-09 Thread DM Smith
Michael McCandless wrote: We have a number of sources that don't have eol-style set to "native"... This causes problems, eg, patches to such files become degenerate (remove all lines, add all lines), which of course hides what really changed. So... are there any objections if I go through al

[jira] Created: (LUCENE-1678) Deprecate Analyzer.tokenStream

2009-06-09 Thread Michael McCandless (JIRA)
Deprecate Analyzer.tokenStream -- Key: LUCENE-1678 URL: https://issues.apache.org/jira/browse/LUCENE-1678 Project: Lucene - Java Issue Type: Bug Components: Analysis Reporter: Michael McCandl

Re: Question on CachingWrapperFilter

2009-06-09 Thread Michael McCandless
I think, once we can efficiently apply cheap random-access docIDSets the way deleted docs are applied (ie, distribute down to all SegmentTermDocs) then it'd be useful for this filter manager to also pre-fold deletes in, such that SegmentTermDocs would only have a single random-access docIDSet to ch

[jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717754#action_12717754 ] Michael McCandless commented on LUCENE-1673: {quote} In Solr there are three d

Re: Some thoughts around the use of reader.isDeleted and hasDeletions

2009-06-09 Thread Shai Erera
> > Be careful: checkAbort needs to be called "fairly" frequently so > IndexWriter.close(false) doesn't take too long. > Of course - I meant check up front if checkAbort != null, and then always call work() if it's not null. But I also agree that a dummy impl is the better approach, since it's not
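The "dummy impl" idea mentioned here can be sketched as a no-op object that replaces the null check. This is only an illustration: `CheckAbortSketch`, `NO_OP`, `CountingCheckAbort` and `copyDocs` are hypothetical names, not Lucene's actual classes, and the `work(double)` signature is an assumption for the example.

```java
// Hypothetical sketch of the "dummy impl" approach; not Lucene's real code.
public class CheckAbortSketch {

    /** Assumed callback shape: report that some units of work were done. */
    public interface CheckAbort {
        void work(double units);
    }

    /** Dummy implementation that does nothing, so callers never test for null. */
    public static final CheckAbort NO_OP = units -> { };

    /** Simple implementation that only accumulates the reported work. */
    public static final class CountingCheckAbort implements CheckAbort {
        public double total;

        @Override
        public void work(double units) {
            total += units;
        }
    }

    /** Stand-in for a per-document loop; work() is called unconditionally. */
    public static int copyDocs(int numDocs, CheckAbort checkAbort) {
        int copied = 0;
        for (int doc = 0; doc < numDocs; doc++) {
            copied++;
            checkAbort.work(1.0); // no "if (checkAbort != null)" needed
        }
        return copied;
    }
}
```

Callers that do not care about aborting would pass `CheckAbortSketch.NO_OP`; the loop body stays the same either way.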

[jira] Commented: (LUCENE-1453) When reopen returns a new IndexReader, both IndexReaders may now control the lifecycle of the underlying Directory which is managed by reference counting

2009-06-09 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717761#action_12717761 ] Uwe Schindler commented on LUCENE-1453: --- Only two discussion points (the first one c

[jira] Commented: (LUCENE-1453) When reopen returns a new IndexReader, both IndexReaders may now control the lifecycle of the underlying Directory which is managed by reference counting

2009-06-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717766#action_12717766 ] Michael McCandless commented on LUCENE-1453: bq. If DirectoryReader.close() th

[jira] Commented: (LUCENE-1453) When reopen returns a new IndexReader, both IndexReaders may now control the lifecycle of the underlying Directory which is managed by reference counting

2009-06-09 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717769#action_12717769 ] Earwin Burrfoot commented on LUCENE-1453: - bq. I think it should (be closed in a f

[jira] Commented: (LUCENE-1453) When reopen returns a new IndexReader, both IndexReaders may now control the lifecycle of the underlying Directory which is managed by reference counting

2009-06-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717781#action_12717781 ] Michael McCandless commented on LUCENE-1453: Good question :) Exceptions in c

Re: [jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-09 Thread Jason Rutherglen
> I wonder if we could handle this by adding a setting in FieldInfo? Do we have an issue open that allows any metadata on a per field basis? This seems like something flexible indexing will require? On Tue, Jun 9, 2009 at 10:15 AM, Michael McCandless (JIRA) wrote: > >[ > https://issues.apach

Re: Some thoughts around the use of reader.isDeleted and hasDeletions

2009-06-09 Thread Jason Rutherglen
> I searched the code and was surprised to see isDeleted and hasDeletions are not called from any search code. It was weeded out over time, MatchAllDocsQuery for example used to call it. I think it was to offer users (who are using isDeleted) a way to access deleted docs without a performance hit.

Re: Some thoughts around the use of reader.isDeleted and hasDeletions

2009-06-09 Thread Yonik Seeley
2009/6/9 Shai Erera : >> If there are no deletions, it's just a null pointer check, right? > > Well ... one null pointer check here, one null pointer check there and at > some point you will see a difference. My point wasn't the null pointer check > itself, but the pointer check for *every* documen
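To make the two positions in this exchange concrete, here is a minimal sketch. It assumes deletions are held in a `java.util.BitSet` that is `null` when nothing is deleted; the class and method names are hypothetical and this is not Lucene's IndexReader code. The first method pays the null check for every document, the second tests once up front.

```java
import java.util.BitSet;

// Hypothetical illustration of per-document vs. hoisted deletion checks.
public class DeletionCheckSketch {

    /** Tests for null and for the deleted bit on every document. */
    public static int countLivePerDocCheck(BitSet deletedDocs, int maxDoc) {
        int live = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (deletedDocs == null || !deletedDocs.get(doc)) {
                live++;
            }
        }
        return live;
    }

    /** Tests "are there any deletions?" once, outside the loop. */
    public static int countLiveHoisted(BitSet deletedDocs, int maxDoc) {
        if (deletedDocs == null) {
            return maxDoc; // no deletions: every document is live
        }
        int live = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (!deletedDocs.get(doc)) {
                live++;
            }
        }
        return live;
    }
}
```

Both methods return the same count; whether the hoisted form is measurably faster is exactly what the thread is debating, and the sketch makes no claim either way.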

[jira] Commented: (LUCENE-1678) Deprecate Analyzer.tokenStream

2009-06-09 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717819#action_12717819 ] Grant Ingersoll commented on LUCENE-1678: - I frankly don't like renaming something

[jira] Commented: (LUCENE-1678) Deprecate Analyzer.tokenStream

2009-06-09 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717823#action_12717823 ] Earwin Burrfoot commented on LUCENE-1678: - Second this. Though I lost any hope for

[jira] Updated: (LUCENE-1453) When reopen returns a new IndexReader, both IndexReaders may now control the lifecycle of the underlying Directory which is managed by reference counting

2009-06-09 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1453: -- Attachment: LUCENE-1453.patch Here is an updated patch: - Factored out the FilterIndexReader -

[jira] Commented: (LUCENE-1453) When reopen returns a new IndexReader, both IndexReaders may now control the lifecycle of the underlying Directory which is managed by reference counting

2009-06-09 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717830#action_12717830 ] Uwe Schindler commented on LUCENE-1453: --- Do we need to backport to 2.4.2? It's not s

[jira] Issue Comment Edited: (LUCENE-1453) When reopen returns a new IndexReader, both IndexReaders may now control the lifecycle of the underlying Directory which is managed by reference counting

2009-06-09 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717830#action_12717830 ] Uwe Schindler edited comment on LUCENE-1453 at 6/9/09 2:43 PM: -

[jira] Commented: (LUCENE-1678) Deprecate Analyzer.tokenStream

2009-06-09 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717831#action_12717831 ] Mark Miller commented on LUCENE-1678: - >>Second this. Though I lost any hope for sane

RE: [jira] Commented: (LUCENE-1673) Move TrieRange to core

2009-06-09 Thread Uwe Schindler
No we do not have such an issue, as far as I know. Storing some version/field type info would be great. In this case we could maybe extend TrieRange in future to use a different encoding or e.g. CSF for the highest precision (as Michael Busch suggested in Amsterdam). Because TrieRange was and is

Common Bottlenecks

2009-06-09 Thread Vico Marziale
Hello all. I am new to Lucene as well as this list. I am a PhD student at the University of New Orleans. My current research is in leveraging highly-multicore processors to speed computer forensics tools. For the moment I am trying to figure out what the most common performance bottleneck inside of

[jira] Commented: (LUCENE-1678) Deprecate Analyzer.tokenStream

2009-06-09 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717862#action_12717862 ] Earwin Burrfoot commented on LUCENE-1678: - bq. If there are sane/smart ways to cha

[jira] Commented: (LUCENE-1453) When reopen returns a new IndexReader, both IndexReaders may now control the lifecycle of the underlying Directory which is managed by reference counting

2009-06-09 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717866#action_12717866 ] Earwin Burrfoot commented on LUCENE-1453: - Two suggestions: Factor out RefCount c

Re: Commented: (LUCENE-1678) Deprecate Analyzer.tokenStream

2009-06-09 Thread Mark Miller
Earwin Burrfoot (JIRA) wrote: - bq. If there are sane/smart ways to change our back compat policy, I think you have seen that no one would object. It's not a matter of finding a smart way. It is a matter of sacrifice that has to be made and readiness to

Re: [jira] Commented: (LUCENE-1678) Deprecate Analyzer.tokenStream

2009-06-09 Thread Yonik Seeley
On Tue, Jun 9, 2009 at 7:19 PM, Earwin Burrfoot (JIRA) wrote: > You go zealously for back-compat - you sacrifice readability/maintainability > of your code but free users from any troubles when they want to 'simply > upgrade'. You adopt more relaxed policy - you sacrifice users' time, but in >

Re: Commented: (LUCENE-1678) Deprecate Analyzer.tokenStream

2009-06-09 Thread Earwin Burrfoot
@Mark: >> Okay, there's an escape hatch I (and someone else) mentioned on the list >> before. Adopting a fixed release cycle with small intervals between releases >> (compared to what we have now). Fixed - as in, releases are made each N >> months instead of when everyone feels they finished and po

Re: Commented: (LUCENE-1678) Deprecate Analyzer.tokenStream

2009-06-09 Thread Yonik Seeley
On Tue, Jun 9, 2009 at 8:23 PM, Earwin Burrfoot wrote: >> IMO, changes to interfaces should be clearly better than what existed before. > Recent changes to DISI? Were they clearly for the better? Recent *proposed* changes yes, for 3.0. If you include the scorer changes, it's a bigger change t

Payloads and TrieRangeQuery

2009-06-09 Thread Jason Rutherglen
At the SF Lucene User's group, Michael Busch mentioned using payloads with TrieRangeQueries. Is this something that's being worked on? I'm interested in what sort of performance benefits there would be to this method?

[jira] Commented: (LUCENE-1678) Deprecate Analyzer.tokenStream

2009-06-09 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717888#action_12717888 ] Grant Ingersoll commented on LUCENE-1678: - bq. If there are sane/smart ways to cha