Re: Proposal about Version API "relaxation"

Grant Ingersoll Thu, 15 Apr 2010 12:46:29 -0700

From IRC:
"why do I get the feeling that everyone is in "heated agreement" on the Version 
thread?
there are some cases that mean people will have to reindex
in those cases, we should tell people they will have to reindex
then they can decide to upgrade or not
all other cases, just do the sensible thing and test first
I have yet to meet anyone who simply drops a new version into production and 
says go"


So, as I said earlier, why don't we just move forward with it, strive to 
support reading X-1 index format in X and let the user know the cases in which 
they will have to re-index. If a migration tool is necessary, then someone can 
write it at the appropriate time.  Just as was said w/ the Solr merge, it's 
software.  If it doesn't work, we can change it.  Thank goodness we don't have 
a back compatibility policy for our policies!

-Grant




On Apr 15, 2010, at 3:35 PM, Michael McCandless wrote:

> Unfortunately, live searching against an old index can get very hairy.
> EG look at what I had to do for the "flex API on pre-flex index" flex
> emulation layer.
> 
> It's also not great because it gives the illusion that all is good,
> yet, you've taken a silent hit (up to ~10% or so) in your search
> perf.
> 
> Whereas building & maintaining a one-time index migration tool, in
> contrast, is much less work.
> 
> I realize the migration tool has issues -- it fixes the hard changes
> but silently allows the soft changes to break (ie, your analyzers may
> not produce the same tokens, until we move all core analyzers outside
> of core, so they are separately versioned), but it seems like a good
> compromise here?
> 
> Mike
> 
> 2010/4/15 Shai Erera <ser...@gmail.com>:
>> The reason, Earwin, why online migration is faster is that by the time
>> you finally need to *fully* migrate your index, chances are that most
>> of the segments are already in the newer format. Offline migration
>> will just keep the application idle for some amount of time until ALL
>> segments are migrated.
>> 
>> During the lifecycle of the index, segments are merged anyway, so
>> migrating them on the fly costs virtually nothing. At the end, when you
>> upgrade to a Lucene version which doesn't support the previous index
>> format, you'll in the worst case need to migrate a few large segments
>> which were never merged. I don't know how many of those there will be
>> as it really depends on the application, but I'd bet this process will
>> touch just a few segments. And hence, throughput-wise it will be a lot
>> faster.
>> 
>> We should create a migrate() API on IW which will touch just those
>> segments and not incur a full optimize. That API can also be used for
>> an offline migration tool, if we decide that's what we want.
>> 
>> Shai
>> 
>> On Thursday, April 15, 2010, jm <jmugur...@gmail.com> wrote:
>>> Not sure if plain users are allowed/encouraged to post in this list,
>>> but wanted to mention (just an opinion from a happy user), as other
>>> users have, that not all of us can reindex just like that. It would
>>> not be 10 min for one of our installations for sure...
>>> 
>>> First, I would need to implement some code to reindex, because my source
>>> data is postprocessed/compressed/encrypted/moved after it arrives at
>>> the application, so I would need to retrieve it all, etc. And then
>>> reindexing it would take days.
>>> javier
>>> 
>>> On Thu, Apr 15, 2010 at 9:04 PM, Earwin Burrfoot <ear...@gmail.com> wrote:
>>>>> BTW Earwin, we can come up w/ a migrate() method on IW to accomplish
>>>>> manual migration on the segments that are still on old versions.
>>>>> That's not the point about whether optimize() is good or not. It is
>>>>> the difference between telling the customer to run a 5-day migration
>>>>> process, or a couple of hours. At the end of the day, the same
>>>>> migration code will need to be written whether for the manual or
>>>>> automatic case. And probably by the same developer who changed the
>>>>> index format. The difference is when it happens.
>>>> 
>>>> Converting stuff is easier than emulating, that's exactly why I want a
>>>> separate tool.
>>>> There's no need to support cross-version merging, nor to emulate old APIs.
>>>> 
>>>> I also don't understand why offline migration is going to take days
>>>> instead of hours for online migration??
>>>> WTF, it's gonna be even faster, as it doesn't have to merge things.
>>>> 
>>>> --
>>>> Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
>>>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>>>> ICQ: 104465785
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: java-dev-h...@lucene.apache.org
>>>> 
>>>> 

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: 
http://www.lucidimagination.com/search


