Hi Robert,

In my opinion, we *must* allow changing LiveDocs in any codec that exists for 
migration and is otherwise forced read-only (this also applies to 5.x). Of 
course, 5.x no longer needs to support 3.x indexes, but that’s another story. 
The reason for this *requirement* (and the reason the issue is a blocker) is 
the special case of LiveDocs:
- Every part of an index except LiveDocs is unmodifiable after the segment was 
flushed/committed/whatever. This is what makes a read-only codec like the one 
we currently have possible!
- BUT: The LiveDocs file is special, as it can and must be modified after the 
segment was created (e.g. for document updates). To allow correct migration of 
older indexes, we must also support this in read-only codecs. We must also add 
this to the "backwards guideline".
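
To illustrate the distinction between the two points above, here is a minimal 
toy model. It is only a sketch: the class and method names are hypothetical 
and are not Lucene APIs. It assumes a segment whose stored documents are fixed 
at creation, while the live-docs bits stay mutable.

```java
import java.util.BitSet;

// Toy model only: these classes are hypothetical and are NOT Lucene APIs.
final class LiveDocsSketch {

    /** A segment that was written by an old codec which is now read-only. */
    static final class ReadOnlySegment {
        private final String[] docs;   // fixed once the segment exists
        private final BitSet liveDocs; // the one part that must stay mutable

        ReadOnlySegment(String... docs) {
            this.docs = docs.clone();
            this.liveDocs = new BitSet(docs.length);
            this.liveDocs.set(0, docs.length); // every document starts out live
        }

        /** Writing new old-format data is refused. */
        void addDocument(String doc) {
            throw new UnsupportedOperationException(
                "read-only codec: cannot write new segment data");
        }

        /** Deleting only clears a bit, so it stays allowed. Returns true if the doc was live. */
        boolean delete(int docId) {
            boolean wasLive = liveDocs.get(docId);
            liveDocs.clear(docId);
            return wasLive;
        }

        int maxDoc() { return docs.length; }

        int numLiveDocs() { return liveDocs.cardinality(); }
    }

    public static void main(String[] args) {
        ReadOnlySegment seg = new ReadOnlySegment("a", "b", "c");
        seg.delete(1);
        System.out.println(seg.numLiveDocs() + " of " + seg.maxDoc() + " docs live");
    }
}
```

The point of the sketch is that refusing addDocument() and allowing delete() 
are independent decisions: the first protects the old format from new writes, 
the second only touches the live-docs bits.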

The important thing is: Lucene3xCodec (or any later backwards-read-only codec) 
should prevent creating new index segments with this codec, but the special 
case above, deleting documents, must be allowed. Otherwise the whole 
backwards strategy is useless, because you have no chance to migrate old, 
non-read-only indexes live. As noted in the issue, allowing only addDocument() 
[because it writes to a new segment] but not updateDocument() [because it also 
modifies the old index if the document already exists] is crazy. One problem 
is that the exception may not be visible on the first updateDocument() call, 
because that update was in fact an add. updateDocument() could also pass if 
the document it deletes has already been merged into a new 4.0 segment... 
(Hoss explained that very well.)
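
The inconsistent behaviour described above can be sketched with a toy writer. 
This is a hypothetical model, not Lucene's IndexWriter: it assumes one 
old-format segment that rejects deletes and one new segment that accepts 
everything.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: hypothetical classes, NOT Lucene's IndexWriter.
final class UpdateSketch {

    static final class ToyWriter {
        private final Set<String> oldSegmentIds;                   // old format, deletes rejected
        private final Set<String> newSegmentIds = new HashSet<>(); // current format

        ToyWriter(Set<String> oldSegmentIds) {
            this.oldSegmentIds = new HashSet<>(oldSegmentIds);
        }

        /** Always succeeds: it only writes to the new segment. */
        void addDocument(String id) {
            newSegmentIds.add(id);
        }

        /** Delete-then-add: fails only if the delete hits the old segment. */
        void updateDocument(String id) {
            if (oldSegmentIds.contains(id)) {
                throw new UnsupportedOperationException(
                    "cannot delete in read-only segment: " + id);
            }
            newSegmentIds.remove(id); // delete an existing copy, if any
            newSegmentIds.add(id);    // then add the new version
        }

        boolean inNewSegment(String id) {
            return newSegmentIds.contains(id);
        }
    }

    public static void main(String[] args) {
        Set<String> old = new HashSet<>();
        old.add("old-1");
        ToyWriter writer = new ToyWriter(old);
        writer.updateDocument("new-1"); // passes: effectively an add
        try {
            writer.updateDocument("old-1"); // fails: touches the old segment
        } catch (UnsupportedOperationException e) {
            System.out.println("update failed: " + e.getMessage());
        }
    }
}
```

In this model the same call succeeds or fails depending on where the target 
document currently lives, which the caller generally cannot know in advance.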

An alternative approach I would also favour: if IndexWriter detects that the 
segment it needs to delete documents from has no writeable LiveDocs, it could 
trigger a merge and apply the deletion after the merge. In that case the 
segment is migrated on the fly. This is heavier but cleaner than the patch on 
the issue. It is also trickier to do, because of concurrent merges.
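
A rough single-threaded sketch of that merge-then-delete idea follows. All 
names are hypothetical and this is not how IndexWriter is implemented; the 
"merge" here is just a copy into a writable segment that keeps document IDs, 
and the concurrency problem mentioned above is ignored entirely.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only: hypothetical classes, NOT Lucene APIs. No concurrency handling.
final class MergeOnDeleteSketch {

    static final class Segment {
        final List<String> docs;
        final BitSet liveDocs;
        final boolean writableLiveDocs;

        Segment(List<String> docs, boolean writableLiveDocs) {
            this.docs = new ArrayList<>(docs);
            this.liveDocs = new BitSet(docs.size());
            this.liveDocs.set(0, docs.size()); // every document starts out live
            this.writableLiveDocs = writableLiveDocs;
        }
    }

    /**
     * Applies a deletion. If the segment's live docs are not writable, the
     * segment is first rewritten ("merged") into a writable one and the
     * deletion is applied to the copy. Returns the segment holding the delete.
     */
    static Segment delete(Segment seg, int docId) {
        Segment target = seg;
        if (!seg.writableLiveDocs) {
            target = new Segment(seg.docs, true); // migrate on the fly
            target.liveDocs.and(seg.liveDocs);    // carry over existing deletions
        }
        target.liveDocs.clear(docId);
        return target;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        docs.add("a");
        docs.add("b");
        docs.add("c");
        Segment old = new Segment(docs, false);
        Segment migrated = delete(old, 1);
        System.out.println("migrated: " + (migrated != old)
            + ", live docs: " + migrated.liveDocs.cardinality());
    }
}
```

The old segment is never modified in this sketch; the caller would have to 
swap in the returned segment, which is exactly the step that gets hard when 
other merges run at the same time.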

I agree we need a better test, but I don't see any problems with the current 
patch. Please discuss this on the issue 
https://issues.apache.org/jira/browse/LUCENE-4339.

I agree that you cannot use an index that contains 2.x segments (for that, the 
migration tool was provided in Lucene 3.x, so you can download the 3.6.1 core 
JAR and run IndexUpgrader). In that case it will throw an exception on open, 
and that’s fine.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]


> -----Original Message-----
> From: Robert Muir [mailto:[email protected]]
> Sent: Wednesday, August 29, 2012 10:29 AM
> To: [email protected]
> Subject: Re: Lucene3xCodec doesn't allow deletions in 4x?
> 
> On Wed, Aug 29, 2012 at 3:13 AM, Uwe Schindler <[email protected]> wrote:
> > Hi,
> >
> > In the early days (I mean in the time when it was already read only until we
> > refactored the IndexReader.delete()/Codec stuff), this was working, because
> > the LiveDocs were always handled in a special way. Making it now 100%
> > read-only is in my opinion very bad, as it no longer allows updating
> > documents in a 3.x index, so you have no choice: you must run
> > IndexUpgrader.
> >
> 
> It didn't really go down like that. Instead, at one point, this was working.
> Then later, as the APIs changed, it was not really feasible anymore. I added
> the UOE for that reason. I knew exactly what the tradeoffs were when I did
> this.
> 
> It just happens that now it's (seemingly) easy and feasible to have it
> working again (due to changes in LUCENE-4050/LUCENE-4055), which is why I
> suggested the patch. But we should think it through, be careful, and make sure
> I'm not missing or forgetting anything: my test is very trivial.
> 
> I don't think we should add this back compat requirement/test to
> TestBackwardsCompatibility in trunk. If it goes in, it's 4.0 only, and only
> because we are agreeing to do this on a case-by-case basis.
> 
> In general you cannot 'seamlessly' upgrade from one version to the next. If
> you have a 3.x index, for example, it might contain some 2.x segments and be
> working fine in 3.x; that doesn't mean 4.x will read it, etc. This is nothing
> new. So you always must take some measures.
> 
> At one point we had decided an upgrade-tool approach to 4.x was fine.
> I don't think we should forget that either. We just have online back compat
> because Mike spent a ton of time to do the work. I don't want us to require
> this "feature" in the future. We might want to refactor codec APIs or
> something like that in 5.0 in a way where it's no longer feasible again.
> 
> --
> lucidworks.com
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected] For additional
> commands, e-mail: [email protected]


