SnapshotDeletionPolicy throws NPE if no commit happened

2010-04-15 Thread Shai Erera
SDP throws NPE if the index includes no commits, but snapshot() is called. This is an extreme case, but can happen if one takes snapshots (for backup purposes for example) in a code segment separate from indexing, and does not know if commit was called or not. I think we should throw an IllegalSta

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
We can remove Version, because all incompatible changes go straight to a new major release, which we release more often, yes. 3.x is going to be released after 4.0 if bugs are found and fixed, or if people ask to backport some (minor?) features, and some dev has time for this. The question of what

Re: SnapshotDeletionPolicy throws NPE if no commit happened

2010-04-15 Thread Earwin Burrfoot
We should just let IW create a null commit on an empty directory, like it always did ;) Then a whole class of such problems disappears. On Thu, Apr 15, 2010 at 11:16, Shai Erera wrote: > SDP throws NPE if the index includes no commits, but snapshot() is called. > This is an extreme case, but can

Re: SnapshotDeletionPolicy throws NPE if no commit happened

2010-04-15 Thread Shai Erera
Well ... one can still call commit() or close() right after IW creation. And this is a very rare case to be hit by. Was just asking about whether we want to add an explicit and clear protective code about it or not. Shai On Thu, Apr 15, 2010 at 10:26 AM, Earwin Burrfoot wrote: > We should just

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Shai Erera
Well ... I think that version numbers mean more than we'd like them to mean, as people perceive them. Let's discuss the format X.Y.Z: When X is changed, it should mean something 'big' happened - index structure has changed (e.g. the flexible scoring work), new Java version supported (Java 1.6) and

Re: SnapshotDeletionPolicy throws NPE if no commit happened

2010-04-15 Thread Shai Erera
BTW, even if it's a stupid thing to do, someone can today create SDP and call snapshot without ever creating IW. And it's not an impossible scenario. Consider a backup-aware application which creates SDP first, then passes it to the indexing process and the backup process, separately. The backup pr

Re: SnapshotDeletionPolicy throws NPE if no commit happened

2010-04-15 Thread Michael McCandless
Presumably you'd also hit this exception if the DP deletes all commit points, right? I like IllegalStateException. Mike 2010/4/15 Shai Erera : > BTW, even if it's a stupid thing to do, someone can today create SDP and > call snapshot without ever creating IW. And it's not an impossible scenario.

Re: TestCodecs running time

2010-04-15 Thread Michael McCandless
Yah :) TestStressIndexing2 is another slow one... I'll go fix it... Mike On Thu, Apr 15, 2010 at 2:15 AM, Shai Erera wrote: > See you already did that Mike :). Thanks ! now the tests run for 2s. > > Shai > > On Fri, Apr 9, 2010 at 12:49 PM, Michael McCandless > wrote: >> >> It's also slow beca

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Michael McCandless
2010/4/15 Shai Erera : > One way is to define 'major' as X and minor X.Y, and another is to define > major as 'X.Y' and minor as 'X.Y.Z'. I prefer the latter but don't have any > strong feelings against the former. I prefer X.Y, ie, changes to Y only is a minor release (mostly bug fixes but may

Merging the Mailing Lists

2010-04-15 Thread Grant Ingersoll
Looks like we are ready to go to merge the Lucene and Solr dev mailing lists. The new list will be [email protected]. All existing subscribers will automatically be subscribed to the new list. For more info, see https://issues.apache.org/jira/browse/INFRA-2567. -Grant --

[jira] Resolved: (LUCENE-1278) Add optional storing of document numbers in term dictionary

2010-04-15 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1278. Resolution: Won't Fix I think the pulsing codec (wraps any other codec, but inline

[jira] Created: (LUCENE-2395) Add a scoring DistanceQuery that does not need caches and separate filters

2010-04-15 Thread Uwe Schindler (JIRA)
Add a scoring DistanceQuery that does not need caches and separate filters -- Key: LUCENE-2395 URL: https://issues.apache.org/jira/browse/LUCENE-2395 Project: Lucene - Java

[jira] Commented: (LUCENE-2395) Add a scoring DistanceQuery that does not need caches and separate filters

2010-04-15 Thread Chris Male (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857278#action_12857278 ] Chris Male commented on LUCENE-2395: +1 This will replace the work I was doing on imp

[jira] Updated: (LUCENE-2395) Add a scoring DistanceQuery that does not need caches and separate filters

2010-04-15 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-2395: -- Description: In a chat with Chris Male and my own ideas when implementing for PANGAEA, I thou

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Shai Erera
Well ... I must say that I completely disagree w/ dropping index structure back-support. Our customers will simply not hear of reindexing 10s of TBs of content because of version upgrades. Such a decision is key to Lucene adoption in large-scale projects. It's entirely not about whether Lucene is a

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Robert Muir
On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera wrote: > Well ... I must say that I completely disagree w/ dropping index structure > back-support. Our customers will simply not hear of reindexing 10s of TBs of > content because of version upgrades. Such a decision is key to Lucene > adoption in larg

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Danil ŢORIN
Sometimes it's REALLY impossible to reindex, or it has an absolutely prohibitive cost in a running production system (I can't shut it down for maintenance, so I need a lot of hardware to reindex ~5 billion documents, I have no idea what the costs are to retrieve that data all over again, but I est

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Robert Muir
its open source, if you feel this way, you can put the work to add features to some version branch from trunk in a backwards compatible way. Then this branch can have a backwards-compatible minor release with new features, but nothing ground-breaking. but this kinda stuff shouldnt hinder developm

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
I think an index upgrade tool is okay? While you still definitely have to code it, things like "if idxVer==m doOneStuff elseif idxVer==n doOtherStuff else blowUp" are kept away from lucene innards and we all profit? On Thu, Apr 15, 2010 at 16:21, Robert Muir wrote: > its open source, if you feel t

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Shai Erera
Thanks Danil - you reminded me of another reason why reindexing is impossible - fetching the data, even if it's available is too damn costly. Robert, I think you're driven by Analyzers changes ... been too much around them I'm afraid :). A major version upgrade is a move to Java 1.5 for example.

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Shai Erera
I can live w/ that Earwin ... I prefer the ongoing upgrades still, but I won't hold off the back-compat policy change vote because of that. Shai On Thu, Apr 15, 2010 at 3:30 PM, Earwin Burrfoot wrote: > I think an index upgrade tool is okay? > While you still definitely have to code it, things l

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Danil ŢORIN
All I ask is a way to migrate existing indexes to newer format. On Thu, Apr 15, 2010 at 15:21, Robert Muir wrote: > its open source, if you feel this way, you can put the work to add features > to some version branch from trunk in a backwards compatible way. > > Then this branch can have a back

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
I like the idea of index conversion tool over silent online upgrade because it is 1. controllable - with online upgrade you never know for sure when your index is completely upgraded, even optimize() won't help here, as it is a noop for already-optimized indexes 2. way easier to write - as flex sho

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Robert Muir
I think you guys miss the entire point. The idea that you can keep getting "all the new features" without reindexing is merely an illusion. Instead, features simply aren't being added at all, because the policy makes it too cumbersome. Why is it problematic to have a different SVN branch/release,

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Yonik Seeley
Seamless online upgrades have their place too... say you are upgrading one server at a time in a cluster. -Yonik Apache Lucene Eurocon 2010 18-21 May 2010 | Prague On Thu, Apr 15, 2010 at 8:42 AM, Earwin Burrfoot wrote: > I like the idea of index conversion tool over silent online upgrade > beca

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Shai Erera
Well ... I could argue that it's you who miss the point :). I completely don't buy the "all the new features" comment --> how many new features are in a major release which force you to consider reindexing? Yet there are many of them that change the API. How will I know whether a release supports

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Danil ŢORIN
I realize that just transforming an old index won't give me anything new. Applications usually evolve. Let's take 2.9 as an example (relatively few changes in index structure, but Trie was a nice addition, and per-segment search and reload were a blessing): - There are 4 billion documents which don't hav

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
On Thu, Apr 15, 2010 at 17:17, Yonik Seeley wrote: > Seamless online upgrades have their place too... say you are upgrading > one server at a time in a cluster. Nothing here that can't be solved with an upgrade tool. Down one server, upgrade index, upgrade software, up. -- Kirill Zakharenko/Кири

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Danil ŢORIN
True. Just need the tool. On Thu, Apr 15, 2010 at 16:39, Earwin Burrfoot wrote: > > On Thu, Apr 15, 2010 at 17:17, Yonik Seeley > wrote: > > Seamless online upgrades have their place too... say you are upgrading > > one server at a time in a cluster. > > Nothing here that can't be solved with a

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Yonik Seeley
On Thu, Apr 15, 2010 at 9:39 AM, Earwin Burrfoot wrote: > On Thu, Apr 15, 2010 at 17:17, Yonik Seeley > wrote: >> Seamless online upgrades have their place too... say you are upgrading >> one server at a time in a cluster. > > Nothing here that can't be solved with an upgrade tool. Down one > se

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Robert Muir
wrong, it doesnt fix the analyzers problem. you need to reindex. On Thu, Apr 15, 2010 at 9:39 AM, Earwin Burrfoot wrote: > On Thu, Apr 15, 2010 at 17:17, Yonik Seeley > wrote: > > Seamless online upgrades have their place too... say you are upgrading > > one server at a time in a cluster. > >

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Grant Ingersoll
+1 On Apr 14, 2010, at 5:22 PM, Michael McCandless wrote: > On Wed, Apr 14, 2010 at 12:06 AM, Marvin Humphrey > wrote: > >> Essentially, we're free to break back compat within "Lucy" at any time, but >> we're not able to break back compat within a stable fork like "Lucy1", >> "Lucy2", etc. So

[jira] Created: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
remove version from contrib/analyzers. -- Key: LUCENE-2396 URL: https://issues.apache.org/jira/browse/LUCENE-2396 Project: Lucene - Java Issue Type: Task Components: contrib/analyzers Affects

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Grant Ingersoll
I do think major versions should be able to read the previous version index. Still, even being able to do that is no guarantee that it will produce correct results. Likewise, even having an upgrade tool is no guarantee that correct results will be produced. So, my take is that we strive for i

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857321#action_12857321 ] Robert Muir commented on LUCENE-2396: - Additionally, i would like to remove all "CHANG

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
On Thu, Apr 15, 2010 at 17:49, Robert Muir wrote: > wrong, it doesnt fix the analyzers problem. > you need to reindex. > > On Thu, Apr 15, 2010 at 9:39 AM, Earwin Burrfoot wrote: >> >> On Thu, Apr 15, 2010 at 17:17, Yonik Seeley >> wrote: >> > Seamless online upgrades have their place too... say

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857325#action_12857325 ] Robert Muir commented on LUCENE-2396: - Also, i would like to remove all deprecated met

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Danil ŢORIN
Agree. However, I don't see how lucene could suddenly change so much that even a conversion tool is impossible to create. After all it's all about terms, positions and frequencies. Yeah... some additions such as payloads may appear, disappear, or evolve into something new, but those are on the user's side anyway. A

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Yonik Seeley
On Wed, Apr 14, 2010 at 5:22 PM, Michael McCandless wrote: >  * There is no back compat across major releases (index nor APIs), >    but full back compat within branches. > > This would match how many other projects work (KS/Lucy, as Marvin > describes above; Apache Tomcat; Hibernate; log4J; FreeB

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Erick Erickson
Coming in late to the discussion, and without really understanding the underlying Lucene issues, but... The size of the problem of reindexing is under-appreciated I think. Somewhere in my company is the original data I indexed. But the effort it would take to resurrect it is O(unknown). An unfortu

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
> reasonable, but changing APIs around when there's not a good reason > behind it (other than someone liked the name a little better) should > still be approached with caution. Changing names is a good enough reason :) They make a darn difference between having to read a book to be able to use som

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Mark Miller
If you absolutely cannot re-index, and you have *no* access to the data again - you are one ballsy mofo to upgrade to a new version of Lucene for "features". It means you likely BASE jump in your free time? On 04/15/2010 10:14 AM, Erick Erickson wrote: Coming in late to the discussion, and wit

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
I think the need to upgrade to latest and greatest lucene for poor corporate users that lost all their data is somewhat overblown. Why the heck do you need to upgrade if your app rotted in neglect for years?? On Thu, Apr 15, 2010 at 18:14, Erick Erickson wrote: > Coming in late to the discussion,

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Erick Erickson
I never said finding oneself in this position was the result of careful planning and flawless execution . But that's the reality some of our users will find themselves in. Even worse... *I* may find myself in that position because of a decision someone *else* made before they were fired. Eric

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Erick Erickson
'Cause some exec finally noticed the product was losing market share. Or got a wild hair strategically placed. My point is only that we should be clear that some number of Lucene users *will* be in such a position. I'm actually fine with a decision that we're not going to support such a scenario,

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Danil ŢORIN
The app is not rotted, it's alive and kicking, and gets a lot of TLC. There are some older indexes that use some features and there are newer indexes that will benefit greatly from newer features. All running in one freaking big distributed application. Leveraging lucene version by updating to ne

[jira] Updated: (LUCENE-2395) Add a scoring DistanceQuery that does not need caches and separate filters

2010-04-15 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-2395: -- Attachment: DistanceQuery.java A first idea of the Query, it does not even compile as some cla

[jira] Assigned: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reassigned LUCENE-2396: --- Assignee: Robert Muir > remove version from contrib/analyzers. > ---

[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2010-04-15 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857373#action_12857373 ] Michael McCandless commented on LUCENE-2324: bq. The usual design is a queued

[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2010-04-15 Thread Tim Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857375#action_12857375 ] Tim Smith commented on LUCENE-2324: --- bq. But... could we allow an add/updateDocument cal

[jira] Updated: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2396: Attachment: LUCENE-2396.patch attached is a patch, including CHANGES rewording. All Lucene/Solr t

[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2010-04-15 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857380#action_12857380 ] Jason Rutherglen commented on LUCENE-2324: -- bq. only one DW flushes at a time (th

[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2010-04-15 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857381#action_12857381 ] Michael McCandless commented on LUCENE-2324: {quote} i would love to be able t

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857384#action_12857384 ] Uwe Schindler commented on LUCENE-2396: --- Are you sure you want to use LUCENE_CURRENT

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857386#action_12857386 ] Robert Muir commented on LUCENE-2396: - bq. Are you sure you want to use LUCENE_CURRENT

[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2010-04-15 Thread Tim Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857385#action_12857385 ] Tim Smith commented on LUCENE-2324: --- bq. Probably if you really want to keep the segment

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857388#action_12857388 ] Shai Erera commented on LUCENE-2396: Robert I think this is great! Can we move more an

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857395#action_12857395 ] Robert Muir commented on LUCENE-2396: - {quote} Robert I think this is great! Can we mo

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857396#action_12857396 ] Shai Erera commented on LUCENE-2396: Static? Weren't you against that!? But if we re

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857398#action_12857398 ] Robert Muir commented on LUCENE-2396: - {quote} Static? Weren't you against that!? But

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857402#action_12857402 ] Uwe Schindler commented on LUCENE-2396: --- bq. Static? Weren't you against that!? He

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857412#action_12857412 ] Robert Muir commented on LUCENE-2396: - bq. Until we have no more analyzers in core exs

Re: Proposal about Version API "relaxation"

2010-04-15 Thread DM Smith
On 04/15/2010 09:49 AM, Robert Muir wrote: wrong, it doesnt fix the analyzers problem. you need to reindex. On Thu, Apr 15, 2010 at 9:39 AM, Earwin Burrfoot wrote: On Thu, Apr 15, 2010 at 17:17, Yonik Seeley wrote:

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857427#action_12857427 ] DM Smith commented on LUCENE-2396: -- Robert, I think this is a red-herring. There has been

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
> First, the index format. IMHO, it is a good thing for a major release to be > able to read the prior major release's index. And the ability to convert it > to the current format via optimize is also good. Whatever is decided on this > thread should take this seriously. Optimize is a bad way to co

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Robert Muir
On Thu, Apr 15, 2010 at 1:30 PM, DM Smith wrote: > > Another behavior change is an upgrade in Java version. By forcing users to > go to Java 5 with Lucene 3, the version of Unicode changed. This in itself > causes a change in some token streams. > > ... > > It is my observation, though possibly

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857440#action_12857440 ] Robert Muir commented on LUCENE-2396: - bq. There has been an implicit bw compat policy

Re: Proposal about Version API "relaxation"

2010-04-15 Thread DM Smith
On 04/15/2010 01:50 PM, Earwin Burrfoot wrote: First, the index format. IMHO, it is a good thing for a major release to be able to read the prior major release's index. And the ability to convert it to the current format via optimize is also good. Whatever is decided on this thread should take th

[jira] Updated: (LUCENE-2393) Utility to output total term frequency and df from a lucene index

2010-04-15 Thread Tom Burton-West (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Burton-West updated LUCENE-2393: Attachment: LUCENE-2393.patch New patch includes a (pre-flex ) version of HighFreqTerms th

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857456#action_12857456 ] DM Smith commented on LUCENE-2396: -- {quote} So I think we should instead use real-version

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
I'd like to remind everyone that Mike's proposal has stable branches. We can branch off preflex trunk right now and wrap it up as 3.1. Current trunk is declared as future 4.0 and all backcompat cruft is removed from it. If some new features/bugfixes appear in trunk, and they don't break stuff - we backport

[jira] Issue Comment Edited: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857456#action_12857456 ] DM Smith edited comment on LUCENE-2396 at 4/15/10 2:16 PM: --- {quo

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857466#action_12857466 ] Robert Muir commented on LUCENE-2396: - bq. I could live with that... maybe. What guara

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857471#action_12857471 ] Robert Muir commented on LUCENE-2396: - bq. How can I use lucene-analyzers-3.0.jar on o

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Shai Erera
I seriously don't understand the fuss around index format back compat. How many times is this changed such that it is too much to ask to keep X support X-1? I prefer to have ongoing segment merging but can live w/ a manual converter tool. Thing is - I'll probably not be able to develop one myself

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Sanne Grinovero
Hello, I think some compatibility breaks should really be accepted, otherwise these requirements are going to kill technological advancement: the effort in backwards compatibility will grow and be more time-consuming and harder every day. A major release won't happen every day, likely not even

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish > manual migration on the segments that are still on old versions. > That's not the point about whether optimize() is good or not. It is > the difference between telling the customer to run a 5-day migration > process, or a coup

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857487#action_12857487 ] DM Smith commented on LUCENE-2396: -- bq. Well, I think asking for a well-defined backwards

Re: Proposal about Version API "relaxation"

2010-04-15 Thread DM Smith
On 04/15/2010 03:04 PM, Earwin Burrfoot wrote: BTW Earwin, we can come up w/ a migrate() method on IW to accomplish manual migration on the segments that are still on old versions. That's not the point about whether optimize() is good or not. It is the difference between telling the customer to r

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857490#action_12857490 ] Robert Muir commented on LUCENE-2396: - {quote} One mechanism that would work is versio

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
On Thu, Apr 15, 2010 at 23:07, DM Smith wrote: > On 04/15/2010 03:04 PM, Earwin Burrfoot wrote: >>> >>> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish >>> manual migration on the segments that are still on old versions. >>> That's not the point about whether optimize() is goo

Re: Proposal about Version API "relaxation"

2010-04-15 Thread jm
Not sure if plain users are allowed/encouraged to post in this list, but wanted to mention (just an opinion from a happy user), as other users have, that not all of us can reindex just like that. It would not be 10 min for one of our installations for sure... First, i would need to implement some

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857498#action_12857498 ] DM Smith commented on LUCENE-2396: -- {quote} bq. One mechanism that would work is versione

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Robert Muir
but seriously... are you moving across major lucene releases every single day? if you are using 3.x, how does it hurt you if there is a version 4.x that you can't use without re-indexing? why wouldn't you just stay on your stable branch (say 3.x)? 2010/4/15 jm > Not sure if plain users are all

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
> Not sure if plain users are allowed/encouraged to post in this list, > but wanted to mention (just an opinion from a happy user), as other > users have, that not all of us can reindex just like that. It would > not be 10 min for one of our installations for sure... > > First, i would need to impl

Re: Proposal about Version API "relaxation"

2010-04-15 Thread DM Smith
On 04/15/2010 03:12 PM, Earwin Burrfoot wrote: On Thu, Apr 15, 2010 at 23:07, DM Smith wrote: On 04/15/2010 03:04 PM, Earwin Burrfoot wrote: BTW Earwin, we can come up w/ a migrate() method on IW to accomplish manual migration on the segments that are still on old versions. That's no

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Shai Erera
The reason Earwin why online migration is faster is because when u finally need to *fully* migrate your index, most chances are that most of the segments are already on the newer format. Offline migration will just keep the application idle for some amount of time until ALL segments are migrated.

Re: Proposal about Version API "relaxation"

2010-04-15 Thread DM Smith
On 04/15/2010 03:25 PM, Shai Erera wrote: We should create a migrate() API on IW which will touch just those segments and not incur a full optimize. That API can also be used for an offline migration tool, if we decide that's what we want. What about an index that has already called optimize

RE: Proposal about Version API "relaxation"

2010-04-15 Thread Uwe Schindler
Hi Earwin, I am strongly +1 on this. I would also make the Release Manager for 3.1, if nobody else wants to do this. I would like to take the preflex tag or some revisions before (maybe without the IndexWriterConfig, which is a really new API) to be 3.1 branch. And after that port some of my po

#lucene-dev - a logged IRC channel

2010-04-15 Thread Steven A Rowe
I have created #lucene-dev on freenode: irc://freenode/lucene-dev The channel is logged, with an archive here: http://colabti.org/irclogger/irclogger_logs/lucene-dev I would like for #lucene-dev to be a place where people can have zero-latency on-the-record discussions about Lu

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
2010/4/15 Shai Erera : > The reason Earwin why online migration is faster is because when u > finally need to *fully* migrate your index, most chances are that most > of the segments are already on the newer format. Offline migration > will just keep the application idle for some amount of time unt

[jira] Commented: (LUCENE-2396) remove version from contrib/analyzers.

2010-04-15 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857507#action_12857507 ] Robert Muir commented on LUCENE-2396: - bq. I can go along with this. Cool! bq. I st

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Michael McCandless
Unfortunately, live searching against an old index can get very hairy. EG look at what I had to do for the "flex API on pre-flex index" flex emulation layer. It's also not great because it gives the illusion that all is good, yet, you've taken a silent hit (up to ~10% or so) in your search perf.

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Robert Muir
2010/4/15 Michael McCandless > > I realize the migration tool has issues -- it fixes the hard changes > but silently allows the soft changes to break (ie, your analyzers may > not produce the same tokens, until we move all core analyzers outside > of core, so they are separately versioned), but it

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Grant Ingersoll
From IRC: "why do I get the feeling that everyone is in "heated agreement" on the Version thread? there are some cases that mean people will have to reindex in those cases, we should tell people they will have to reindex then they can decide to upgrade or not all other cases, just do the sensible

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Earwin Burrfoot
I think this should split off the mega-thread :) On Thu, Apr 15, 2010 at 23:28, Uwe Schindler wrote: > Hi Earwin, > > I am strongly +1 on this. I would also make the Release Manager for 3.1, if > nobody else wants to do this. I would like to take the preflex tag or some > revisions before (mayb

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Andi Vajda
On Thu, 15 Apr 2010, Robert Muir wrote: 2010/4/15 Michael McCandless I realize the migration tool has issues -- it fixes the hard changes but silently allows the soft changes to break (ie, your analyzers may not produce the same tokens, until we move all core an

RE: Proposal about Version API "relaxation"

2010-04-15 Thread Uwe Schindler
I wish we could have a face to face talk like in the evenings at ApacheCon :( Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: [email protected] > -Original Message- > From: Grant Ingersoll [mailto:[email protected]] On Behalf Of Grant > Ingersoll

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Robert Muir
>> 3. put analyzers in their own versioned jar files. >> > > Yes, every analyzer needs to have its own version and thus, jar file. > Putting all analyzers into one versioned jar file joins them at the hip and > suffers from the same versioning and compat problems we're currently facing > in core. >

Re: Proposal about Version API "relaxation"

2010-04-15 Thread Michael McCandless
On Thu, Apr 15, 2010 at 3:50 PM, Robert Muir wrote: > for now simply moving analyzers to its own jar file would be a great step! +1 -- why not consolidate all analyzers now? (And fix indexer to require a minimal API = TokenStream minus reset & close). Mike -
