Re: Lucene's default settings & back compatibility

2009-05-21 Thread Shai Erera
> > Your example confused me. You're right. I wrote it with one eye closed already. I meant to say that if I'm a 2.4 user and something gets deprecated in trunk (afterwards), it is carried through 2.4.X and 2.5 and then removed in 2.6. So only 1 full minor release. It's somewhat crazy, but what

[jira] Commented: (LUCENE-1636) TokenFilters with a null value in the constructor fail

2009-05-21 Thread Wouter Heijke (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711933#action_12711933 ] Wouter Heijke commented on LUCENE-1636: --- I'm on holiday now, but as far as I recolle

[jira] Issue Comment Edited: (LUCENE-1636) TokenFilters with a null value in the constructor fail

2009-05-21 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711920#action_12711920 ] Uwe Schindler edited comment on LUCENE-1636 at 5/21/09 6:53 PM:

[jira] Commented: (LUCENE-1636) TokenFilters with a null value in the constructor fail

2009-05-21 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711920#action_12711920 ] Uwe Schindler commented on LUCENE-1636: --- Mike: Would this affect backwards compatibi

[jira] Commented: (LUCENE-1636) TokenFilters with a null value in the constructor fail

2009-05-21 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711919#action_12711919 ] Uwe Schindler commented on LUCENE-1636: --- Hi Wouter, I still want to find out, what y

[jira] Commented: (LUCENE-1636) TokenFilters with a null value in the constructor fail

2009-05-21 Thread Wouter Heijke (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711915#action_12711915 ] Wouter Heijke commented on LUCENE-1636: --- I only hope users will understand this and

[jira] Issue Comment Edited: (LUCENE-1474) Incorrect SegmentInfo.delCount when IndexReader.flush() is used

2009-05-21 Thread Erik van Zijst (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711868#action_12711868 ] Erik van Zijst edited comment on LUCENE-1474 at 5/21/09 4:47 PM: ---

[jira] Commented: (LUCENE-1474) Incorrect SegmentInfo.delCount when IndexReader.flush() is used

2009-05-21 Thread Erik van Zijst (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711868#action_12711868 ] Erik van Zijst commented on LUCENE-1474: I have attached the output of CheckIndex

[jira] Updated: (LUCENE-1474) Incorrect SegmentInfo.delCount when IndexReader.flush() is used

2009-05-21 Thread Erik van Zijst (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik van Zijst updated LUCENE-1474: --- Attachment: CheckIndex.txt > Incorrect SegmentInfo.delCount when IndexReader.flush() is used

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Marvin Humphrey
On Thu, May 21, 2009 at 05:19:43PM -0400, Michael McCandless wrote: > Marvin, which solution would you prefer? Between the two, I'd prefer settings constructor arguments, though I would be inclined to have settings classes that are specific to individual classes rather than Lucene-wide. At lea

[jira] Commented: (LUCENE-1646) QueryParser throws new exceptions even if custom parsing logic threw a better one

2009-05-21 Thread Trejkaz (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711850#action_12711850 ] Trejkaz commented on LUCENE-1646: - Our improvements are (so far) specific to our subclass

RE: Lucene's default settings & back compatibility

2009-05-21 Thread Steven A Rowe
On 5/21/2009 at 7:17 AM, Michael McCandless wrote: > OK so it sounds like we've boiled the proposal down to two concrete > changes to the back-compat policy: > > 1) Default settings can change; we will always choose defaults > based on "latest & greatest for new users". This only > af

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Robert Muir
On Thu, May 21, 2009 at 5:55 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > On Thu, May 21, 2009 at 5:44 PM, Robert Muir wrote: > > and what if your analyzer needs a third-party library (or two)? > > In such cases the back-compat of your analyzer is your responsibility, > right? I

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Michael McCandless
On Thu, May 21, 2009 at 5:44 PM, Robert Muir wrote: > and what if your analyzer needs a third-party library (or two)? In such cases the back-compat of your analyzer is your responsibility, right? > i mean this isn't unique to analyzers, if something changes/bug is fixed in > the guts of some que

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Robert Muir
and what if your analyzer needs a third-party library (or two)? i mean this isn't unique to analyzers, if something changes/bug is fixed in the guts of some query/scorer that affects scoring in the slightest then that's a potential issue too, right? for a big index burying a result deep is effecti

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Michael McCandless
On Thu, May 21, 2009 at 5:19 PM, Earwin Burrfoot wrote: >> Why not store an "actsAs" in the index, just for the changes that >> affect what's in the index?  Ie the index records the >> version that created it, and by default TokenStreams emulate their >> behavior as of that version? > > Because yo

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Michael McCandless
On Thu, May 21, 2009 at 1:59 PM, Marvin Humphrey wrote: > That bug has led to 'base' having a compromised reputation among elite users > because of intermittent, inexplicable flakiness. Is that what you want for > Lucene? While I agree a single global default is not great, I do think it's the l

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Earwin Burrfoot
> Why not store an "actsAs" in the index, just for the changes that > affect what's in the index?  Ie the index records the > version that created it, and by default TokenStreams emulate their > behavior as of that version? Because you don't always have access to index at the time you create your T

[jira] Commented: (LUCENE-1436) Make ReqExclScorer package private, and use DocIdSetIterator for excluded part.

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711810#action_12711810 ] Michael McCandless commented on LUCENE-1436: OK why don't we make both package

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Michael McCandless
On Thu, May 21, 2009 at 4:34 PM, Shai Erera wrote: > Changes to the index file formats need to be supported for 2 major releases. > I.e. 2.X indexes need to be read by 3.Y code, but not by 4.0. Agreed. > Method deprecations last for one full minor release. Your example confused me. I think i

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Earwin Burrfoot
Sounds like a good proposition. There's one problem I'd like to address. Good names for classes/members matter, and matter much. They directly affect how fast a newcomer is able to understand that particular API, it also affects how comfortable you work with it once you did understand. When we're

[jira] Created: (LUCENE-1653) Change DateTools to not create a Calendar in every call to dateToString or timeToString

2009-05-21 Thread Shai Erera (JIRA)
Change DateTools to not create a Calendar in every call to dateToString or timeToString --- Key: LUCENE-1653 URL: https://issues.apache.org/jira/browse/LUCENE-1653 Pr

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711791#action_12711791 ] Shai Erera commented on LUCENE-1614: {quote} Are both new in 2.9? Yes. {quote} Oh th

[jira] Commented: (LUCENE-1595) Split DocMaker into ContentSource and DocMaker

2009-05-21 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711792#action_12711792 ] Shai Erera commented on LUCENE-1595: Ok I'll make sure it's 1.4 compatible then. > Sp

[jira] Commented: (LUCENE-1436) Make ReqExclScorer package private, and use DocIdSetIterator for excluded part.

2009-05-21 Thread Paul Elschot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711793#action_12711793 ] Paul Elschot commented on LUCENE-1436: -- The reason to make things package private is

[jira] Created: (LUCENE-1652) Enhancements to Scorers following the changes to DocIdSetIterator

2009-05-21 Thread Shai Erera (JIRA)
Enhancements to Scorers following the changes to DocIdSetIterator - Key: LUCENE-1652 URL: https://issues.apache.org/jira/browse/LUCENE-1652 Project: Lucene - Java Issue Type: Im

[jira] Commented: (LUCENE-1436) Make ReqExclScorer package private, and use DocIdSetIterator for excluded part.

2009-05-21 Thread Paul Elschot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711790#action_12711790 ] Paul Elschot commented on LUCENE-1436: -- This should only affect external code that us

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Shai Erera
I thought we were actually on the track towards not introducing any Settings and/or actAs, but instead just change the policy? Can we agree on the following: * Changes to the index file formats need to be supported for 2 major releases. I.e. 2.X indexes need to be read by 3.Y code, but not by 4.0

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711785#action_12711785 ] Michael McCandless commented on LUCENE-1614: bq. Are you sure about it? Yes.

[jira] Commented: (LUCENE-1436) Make ReqExclScorer package private, and use DocIdSetIterator for excluded part.

2009-05-21 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711783#action_12711783 ] Shai Erera commented on LUCENE-1436: I just hope this does not collide with LUCENE-161

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711782#action_12711782 ] Shai Erera commented on LUCENE-1614: bq. Oh, it turns out OBSI.nextDoc is new in 2.9!

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Jason Rutherglen
I'm having trouble visualizing the various methods people are talking about. It seems like we could open an issue and post patches with code illustrating what each person is talking about? On Thu, May 21, 2009 at 10:02 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > Actually, we sta

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711773#action_12711773 ] Earwin Burrfoot commented on LUCENE-1614: - bq. Oh, it turns out OBSI.nextDoc is ne

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Earwin Burrfoot
> That bug has led to 'base' having a compromised reputation among elite users > because of intermittent, inexplicable flakiness.  Is that what you want for > Lucene? While I agree with that point, Lucene already has lots and lots of static configuration. Having actsAsVersion won't add any new woes

[jira] Commented: (LUCENE-1436) Make ReqExclScorer package private, and use DocIdSetIterator for excluded part.

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711753#action_12711753 ] Michael McCandless commented on LUCENE-1436: Paul, this is technically a chang

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Marvin Humphrey
Mike McCandless: > Well this is what I love about the actsAsVersion solution. There's no > pain for our back-compat users (besides the one-time effort to set > actsAsVersion), and new users always get the best settings. When some mad-as-hell user complains to this list after spending an inordina

[jira] Resolved: (LUCENE-1636) TokenFilters with a null value in the constructor fail

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1636. Resolution: Fixed Fix Version/s: 2.9 > TokenFilters with a null value in th

[jira] Commented: (LUCENE-1636) TokenFilters with a null value in the constructor fail

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711704#action_12711704 ] Michael McCandless commented on LUCENE-1636: I think we should change this in

[jira] Assigned: (LUCENE-1636) TokenFilters with a null value in the constructor fail

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1636: -- Assignee: Michael McCandless > TokenFilters with a null value in the construct

[jira] Commented: (LUCENE-1637) Getting an IndexReader from a committed IndexWriter

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711701#action_12711701 ] Michael McCandless commented on LUCENE-1637: Couldn't you simply call IW.getRe

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711695#action_12711695 ] Michael McCandless commented on LUCENE-1614: Oh, it turns out OBSI.nextDoc is

Re: Lucene's default settings & back compatibility

2009-05-21 Thread DM Smith
Michael McCandless wrote: On Thu, May 21, 2009 at 12:19 PM, Robert Muir wrote: even as simple as changing default stopword list for some analyzer could be an issue, if the user doesn't re-index in response to that change. OK, right. So say we forgot to include "the" in the default En

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Michael McCandless
Actually, we started with the *Settings classes (to hold defaults), but then realized a simple actsAsVersion (single static method) would suffice for just the back-compat settings and then pushed further and thought perhaps we should relax our back-compat policy entirely so emulating older versions
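For readers trying to picture the "single static method" idea mentioned above, here is a minimal sketch. Every name in it (ActsAsVersionSketch, setActsAsVersion, sortTracksScoresByDefault, the version constants) is hypothetical and chosen for illustration; none of it is Lucene API. The one behaviour it models is the example given elsewhere in this thread, that in 2.9 sorting by field no longer returns scores by default.

```java
// Hypothetical illustration only; these names are NOT part of Lucene.
public final class ActsAsVersionSketch {
    // Assumed encoding of release numbers, for illustration.
    public static final int V2_4 = 24;
    public static final int V2_9 = 29;

    // Single process-wide switch; new users get the latest defaults.
    private static volatile int actsAsVersion = V2_9;

    private ActsAsVersionSketch() {}

    /** A back-compat user calls this once to keep older default behaviour. */
    public static void setActsAsVersion(int version) {
        actsAsVersion = version;
    }

    public static int getActsAsVersion() {
        return actsAsVersion;
    }

    /** Default that changes in 2.9: sorting by field no longer tracks scores. */
    public static boolean sortTracksScoresByDefault() {
        return actsAsVersion < V2_9;
    }

    public static void main(String[] args) {
        System.out.println(sortTracksScoresByDefault()); // false
        setActsAsVersion(V2_4);
        System.out.println(sortTracksScoresByDefault()); // true
    }
}
```

Because the switch is global static state, every component in the same JVM sees the same value, which is the drawback raised elsewhere in this thread; per-class settings passed to constructors would avoid that at the cost of a larger API.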

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Robert Muir
yeah, i was thinking the more likely case of where something like "teh" is in the list... On Thu, May 21, 2009 at 12:25 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > On Thu, May 21, 2009 at 12:19 PM, Robert Muir wrote: > > even as simple as changing default stopword list for some

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Michael McCandless
On Thu, May 21, 2009 at 12:46 PM, DM Smith wrote: > I'm looking forward to the repackaging effort. I'm looking forward to it too! I can't wait for NumericRangeQuery... But: someone with serious ant skill set, and some time, needs to get the itch here and start iterating... Mike --

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Michael McCandless
On Thu, May 21, 2009 at 12:43 PM, Mark Miller wrote: > Hmmm - that's starting to sound nastier. It's another barrier to upgrading to > a new jar. I have to monitor/hunt down and not miss all these little flags > so that docs/terms don't disappear from my index? There is already some of > that and I

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Matthew Hall
Sorry, I wasn't quite sure what to call this new class you guys have been talking about. I was referring to the class that's being discussed to encapsulate all of the defaults for a given lucene release. (Its caching strategies etc etc) I'm just not certain that something like a static list

Re: Lucene's default settings & back compatibility

2009-05-21 Thread DM Smith
Michael McCandless wrote: On Thu, May 21, 2009 at 8:24 AM, DM Smith wrote: On May 21, 2009, at 7:17 AM, Michael McCandless wrote: 1) Default settings can change; we will always choose defaults based on "latest & greatest for new users". This only affects "runtime behavior". E

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Mark Miller
Michael McCandless wrote: On Thu, May 21, 2009 at 12:19 PM, Robert Muir wrote: even as simple as changing default stopword list for some analyzer could be an issue, if the user doesn't re-index in response to that change. OK, right. So say we forgot to include "the" in the default En

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Michael McCandless
What is the "lucene defaults class"? Mike On Thu, May 21, 2009 at 12:37 PM, Matthew Hall wrote: > For extreme examples like this, couldn't the stopword list be encapsulated > into a single class that's used by the lucene defaults class. > > That way if you folks released updates to mostly static

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711682#action_12711682 ] Michael McCandless commented on LUCENE-1614: bq. If they are all on -1 to star

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Matthew Hall
For extreme examples like this, couldn't the stopword list be encapsulated into a single class that's used by the lucene defaults class. That way if you folks released updates to mostly static content like a stopword list, new or old users could get it easily with a simple drop in fix? Just

[jira] Commented: (LUCENE-1595) Split DocMaker into ContentSource and DocMaker

2009-05-21 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711678#action_12711678 ] Mark Miller commented on LUCENE-1595: - Right - the back compat for each contrib is com

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Michael McCandless
On Thu, May 21, 2009 at 12:19 PM, Robert Muir wrote: > even as simple as changing default stopword list for some analyzer could be > an issue, if the user doesn't re-index in response to that change. OK, right. So say we forgot to include "the" in the default English stopwords list (yes, an extr

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711676#action_12711676 ] Yonik Seeley commented on LUCENE-1614: -- bq. But: wouldn't ConjunctionScorer still nee

[jira] Commented: (LUCENE-1595) Split DocMaker into ContentSource and DocMaker

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711674#action_12711674 ] Michael McCandless commented on LUCENE-1595: Probably it's best to stick w/ 1.

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Robert Muir
even as simple as changing default stopword list for some analyzer could be an issue, if the user doesn't re-index in response to that change. > Can you give an example of such changes? EG if we fix a bug in > StandardAnalyzer, we will default it to fixed for new users and expect > you on upgrad

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711670#action_12711670 ] Michael McCandless commented on LUCENE-1614: We could also consider adding DIS

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Michael McCandless
On Thu, May 21, 2009 at 8:24 AM, DM Smith wrote: > > On May 21, 2009, at 7:17 AM, Michael McCandless wrote: > >>  1) Default settings can change; we will always choose defaults based >>    on "latest & greatest for new users".  This only affects "runtime >>    behavior".  EG in 2.9, when sorting b

[jira] Updated: (LUCENE-1651) Make IndexReader.open() always return MSR to simplify (re-)opens.

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1651: --- Fix Version/s: 2.9 > Make IndexReader.open() always return MSR to simplify (re-)open

[jira] Commented: (LUCENE-1651) Make IndexReader.open() always return MSR to simplify (re-)opens.

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711668#action_12711668 ] Michael McCandless commented on LUCENE-1651: Excellent! Thanks Earwin. bq. t

[jira] Assigned: (LUCENE-1651) Make IndexReader.open() always return MSR to simplify (re-)opens.

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1651: -- Assignee: Michael McCandless > Make IndexReader.open() always return MSR to si

Re: SegmentReader instantiation

2009-05-21 Thread DM Smith
Michael McCandless wrote: On Thu, May 21, 2009 at 10:53 AM, Earwin Burrfoot wrote: I agree we should probably remove it, unless there are users relying on this. Maintaining side-by-side sources is difficult with time. As I said in the initial message, this feature introduces no run

[jira] Updated: (LUCENE-1651) Make IndexReader.open() always return MSR to simplify (re-)opens.

2009-05-21 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Earwin Burrfoot updated LUCENE-1651: Attachment: LUCENE-1651.patch Okay, here's the first patch, against latest trunk. test-cor

Re: SegmentReader instantiation

2009-05-21 Thread Michael McCandless
On Thu, May 21, 2009 at 10:53 AM, Earwin Burrfoot wrote: >> I agree we should probably remove it, unless there are users relying >> on this.  Maintaining side-by-side sources is difficult with time. > > As I said in the initial message, this feature introduces no runtime > behaviour changes, so y

[jira] Created: (LUCENE-1651) Make IndexReader.open() always return MSR to simplify (re-)opens.

2009-05-21 Thread Earwin Burrfoot (JIRA)
Make IndexReader.open() always return MSR to simplify (re-)opens. - Key: LUCENE-1651 URL: https://issues.apache.org/jira/browse/LUCENE-1651 Project: Lucene - Java Issue Type: Ta

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711653#action_12711653 ] Michael McCandless commented on LUCENE-1614: {quote} On the other end of the s

Re: SegmentReader instantiation

2009-05-21 Thread Earwin Burrfoot
2009/5/21 Michael McCandless : > It looks like this was done in order to implement > SegmentTermDocs.read(int[], int[]) natively, when using a gcj > environment, since that gave performance improvements? Yup, you're right. But something tells me, since Lucene 1.9 many things changed and this is no

Re: SegmentReader instantiation

2009-05-21 Thread Michael McCandless
It looks like this was done in order to implement SegmentTermDocs.read(int[], int[]) natively, when using a gcj environment, since that gave performance improvements? I agree we should probably remove it, unless there are users relying on this. Maintaining side-by-side sources is difficult with t

SegmentReader instantiation

2009-05-21 Thread Earwin Burrfoot
Right now a set of system properties and Class.newInstance() is used to create SegmentReader. I've tracked down this code's origins to: r150531 | cutting | 2004-09-22 22:32:27 +0400 (Wed, 22 Sep 2004) | 2 lines Add GCJ native code for SegmentTermDocs.read(int[],int[]) to accellerate TermScorer. Te

Re: DateTools performance

2009-05-21 Thread Michael McCandless
Yes, please fix :) I think there may already be an issue open on the single instance / synchronization / ThreadLocal issue. Mike On Thu, May 21, 2009 at 9:52 AM, Shai Erera wrote: > How much is DateTools in use? I noticed a couple of potential improvements > to it, which at least for the benchm

[jira] Updated: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-1614: --- Attachment: LUCENE-1614.patch MAX_VAL as sentinel + the documentation changes + a new entry to CHANG

DateTools performance

2009-05-21 Thread Shai Erera
How much is DateTools in use? I noticed a couple of potential improvements to it, which at least for the benchmark package can improve performance: 1. timeToString calls Calendar.getInstance on every call? That's a very expensive call to make. Why not store it as a static member? We always call it
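A rough sketch of the kind of change being suggested, assuming nothing about the actual DateTools source. Since Calendar is not thread-safe, a plain shared static instance would need synchronization; the sketch uses one instance per thread via ThreadLocal instead. The class and method names (CachedCalendarSketch, dayString) are made up for illustration, and the day-resolution yyyyMMdd output is only an example format.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration; this is not the DateTools implementation.
public class CachedCalendarSketch {
    private static final TimeZone GMT = TimeZone.getTimeZone("GMT");

    // One Calendar per thread, created lazily, instead of
    // calling Calendar.getInstance() on every conversion.
    private static final ThreadLocal<Calendar> CALENDAR = new ThreadLocal<Calendar>() {
        @Override
        protected Calendar initialValue() {
            return new GregorianCalendar(GMT);
        }
    };

    /** Formats a timestamp as yyyyMMdd in GMT, reusing the per-thread Calendar. */
    public static String dayString(long timeMillis) {
        Calendar cal = CALENDAR.get();
        cal.setTimeInMillis(timeMillis);
        return String.format(Locale.ROOT, "%04d%02d%02d",
                cal.get(Calendar.YEAR),
                cal.get(Calendar.MONTH) + 1, // Calendar months are zero-based
                cal.get(Calendar.DAY_OF_MONTH));
    }

    public static void main(String[] args) {
        System.out.println(dayString(0L)); // 19700101
    }
}
```

Whether this is faster in practice would need measuring with the benchmark package; the sketch only shows the shape of the reuse.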

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711615#action_12711615 ] Shai Erera commented on LUCENE-1614: I plan to open another issue for 3.0 to take adva

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711611#action_12711611 ] Yonik Seeley commented on LUCENE-1614: -- I'm warming to some of the simplifications th

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711605#action_12711605 ] Michael McCandless commented on LUCENE-1614: bq. So Mike - does that mean I ca

[jira] Commented: (LUCENE-1648) when you clone or reopen an IndexReader with pending changes, the new reader doesn't commit the changes

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711604#action_12711604 ] Michael McCandless commented on LUCENE-1648: OK -- good catch! I've reopened

[jira] Reopened: (LUCENE-1648) when you clone or reopen an IndexReader with pending changes, the new reader doesn't commit the changes

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reopened LUCENE-1648: > when you clone or reopen an IndexReader with pending changes, the new reader > does

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711597#action_12711597 ] Shai Erera commented on LUCENE-1614: bq. I don't see any calls to OpenBitSetIterator.n

[jira] Updated: (LUCENE-1648) when you clone or reopen an IndexReader with pending changes, the new reader doesn't commit the changes

2009-05-21 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Earwin Burrfoot updated LUCENE-1648: Attachment: LUCENE-1648-followup.patch And here's the fix. The problem - it's not elegant

Re: Lucene's default settings & back compatibility

2009-05-21 Thread DM Smith
On May 21, 2009, at 7:17 AM, Michael McCandless wrote: 1) Default settings can change; we will always choose defaults based on "latest & greatest for new users". This only affects "runtime behavior". EG in 2.9, when sorting by field you won't get scores by default. When we do th

[jira] Updated: (LUCENE-1648) when you clone or reopen an IndexReader with pending changes, the new reader doesn't commit the changes

2009-05-21 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Earwin Burrfoot updated LUCENE-1648: Attachment: LUCENE-1648-followup.patch bq. Bad news is something is wrong w/ your patch, b

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Shalin Shekhar Mangar (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711588#action_12711588 ] Shalin Shekhar Mangar commented on LUCENE-1614: --- bq. Perhaps the Solr guys c

[jira] Commented: (LUCENE-1595) Split DocMaker into ContentSource and DocMaker

2009-05-21 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711586#action_12711586 ] Shai Erera commented on LUCENE-1595: BTW, am I allowed to use Java 5 generics in bench

[jira] Updated: (LUCENE-1647) IndexReader.undeleteAll can mess up the deletion count stored in the segments file

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1647: --- Attachment: LUCENE-1647.patch Attached patch w/ test showing the issue, and fix that

[jira] Commented: (LUCENE-1595) Split DocMaker into ContentSource and DocMaker

2009-05-21 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711583#action_12711583 ] Shai Erera commented on LUCENE-1595: bq. Maybe make the seed an optional config? If it

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711580#action_12711580 ] Shai Erera commented on LUCENE-1614: bq. My guess is eg Solr probably relies heavily o

[jira] Commented: (LUCENE-1595) Split DocMaker into ContentSource and DocMaker

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711581#action_12711581 ] Michael McCandless commented on LUCENE-1595: bq. While I change SortableSingle

[jira] Resolved: (LUCENE-1648) when you clone or reopen an IndexReader with pending changes, the new reader doesn't commit the changes

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1648. Resolution: Fixed > when you clone or reopen an IndexReader with pending changes,

[jira] Commented: (LUCENE-1648) when you clone or reopen an IndexReader with pending changes, the new reader doesn't commit the changes

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711578#action_12711578 ] Michael McCandless commented on LUCENE-1648: {quote} Or to be more exact, it f

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Michael McCandless
On Thu, May 21, 2009 at 7:21 AM, Shai Erera wrote: > I thought that the index file format is supposed to be supported until the > 2nd major release. I.e. 3.0 will still read 2.0 indexes, but 4.0 won't. Is > that what you meant, or am I wrong? Woops, you're correct: http://wiki.apache.org/jaka

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711576#action_12711576 ] Michael McCandless commented on LUCENE-1614: bq. I think I'll emphasize that i

[jira] Commented: (LUCENE-1595) Split DocMaker into ContentSource and DocMaker

2009-05-21 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711577#action_12711577 ] Shai Erera commented on LUCENE-1595: While I change SortableSingleDocMaker I noticed i

[jira] Commented: (LUCENE-1648) when you clone or reopen an IndexReader with pending changes, the new reader doesn't commit the changes

2009-05-21 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711575#action_12711575 ] Earwin Burrfoot commented on LUCENE-1648: - Or to be more exact, it fixed the tests

[jira] Commented: (LUCENE-1648) when you clone or reopen an IndexReader with pending changes, the new reader doesn't commit the changes

2009-05-21 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711571#action_12711571 ] Earwin Burrfoot commented on LUCENE-1648: - bq. Try the patch? Yup, it fixed everyt

[jira] Commented: (LUCENE-1646) QueryParser throws new exceptions even if custom parsing logic threw a better one

2009-05-21 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711570#action_12711570 ] Michael McCandless commented on LUCENE-1646: bq. I guess that's true if you lo

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Shai Erera
I thought that the index file format is supposed to be supported until the 2nd major release. I.e. 3.0 will still read 2.0 indexes, but 4.0 won't. Is that what you meant, or am I wrong? Shai On Thu, May 21, 2009 at 2:17 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > OK so it sounds

Re: Lucene's default settings & back compatibility

2009-05-21 Thread Michael McCandless
OK so it sounds like we've boiled the proposal down to two concrete changes to the back-compat policy: 1) Default settings can change; we will always choose defaults based on "latest & greatest for new users". This only affects "runtime behavior". EG in 2.9, when sorting by field you

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711555#action_12711555 ] Shai Erera commented on LUCENE-1614: BTW, regarding SortedVIntList - even though it ex

[jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean

2009-05-21 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711553#action_12711553 ] Shai Erera commented on LUCENE-1614: bq. SortedVIntList subclasses DocIdSet Sorry, di
