Re: RFC: N-2 compatibility for file formats

Adrien Grand Wed, 13 Jan 2021 05:58:16 -0800

+1, this strikes me as a good balance between increasing backward
compatibility guarantees and still keeping room for innovation.


David, actually I would like to advocate in favor of still disallowing
opening N-2 indices by default, as they might not match Lucene's current
expectations (e.g. using a different encoding for norms due to
LUCENE-7730), and using Lucene's current analyzers/similarities/queries
might trigger surprising behavior. My preference would be to expose the
ability to open N-2 indices behind an expert API/flag that documents
limitations with N-2 indices.
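
To make the idea concrete, here is a minimal sketch of such an opt-in gate,
combined with the read-only rule for N-2 stated further down the thread. It is
only an illustration under stated assumptions: IndexVersionGate, Access and the
allowNMinus2Reads flag are hypothetical names, not an existing Lucene API, and
a real implementation would likely report the failure differently.

```java
// Hypothetical sketch: none of these names exist in Lucene.
final class IndexVersionGate {

  enum Access { READ, WRITE }

  private IndexVersionGate() {}

  /**
   * Throws IllegalArgumentException if an index created by major version
   * createdMajor may not be opened by a library at major version currentMajor.
   */
  static void check(int createdMajor, int currentMajor, Access access,
                    boolean allowNMinus2Reads) {
    if (createdMajor > currentMajor) {
      throw new IllegalArgumentException(
          "index was created by a newer major version: " + createdMajor);
    }
    int age = currentMajor - createdMajor; // 0 = N, 1 = N-1, 2 = N-2
    if (age <= 1) {
      return; // N and N-1 can always be opened, for reading and for writing
    }
    if (age == 2 && access == Access.READ && allowNMinus2Reads) {
      return; // N-2 is read-only, and only behind the explicit expert opt-in
    }
    throw new IllegalArgumentException(
        "index created by major version " + createdMajor
            + " cannot be opened for " + access
            + " by major version " + currentMajor);
  }

  /** Convenience wrapper around check() that returns a boolean instead of throwing. */
  static boolean isAllowed(int createdMajor, int currentMajor, Access access,
                           boolean allowNMinus2Reads) {
    try {
      check(createdMajor, currentMajor, access, allowNMinus2Reads);
      return true;
    } catch (IllegalArgumentException e) {
      return false;
    }
  }
}
```

With this sketch, a version 7 index is rejected by a version 9 reader unless
the flag is set, and it is rejected for writing regardless of the flag.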

Mike, I wondered about this question too. As you pointed out, I think that
we will generally be ok given that the N-2 compatibility layer will very
likely be the same as the N-1 compatibility layer that we need to develop
anyway. I tried to think of examples where that wouldn't work but couldn't
find any (which doesn't mean that there are none, but hopefully they would be
rare).



On Mon, Jan 11, 2021 at 4:57 PM Michael McCandless <
luc...@mikemccandless.com> wrote:

> +1, I like the idea in general.
>
> We will have to work out the details in practice as we come across "index
> breaking" changes, and where/how to draw the line of "best effort".  But I
> think this is an improvement for our users over the hard check we now have
> for "only N-1", and likely not so much development effort?
>
> I think where it might get interesting is if we want to make a Codec API
> change, maybe to optimize some interesting use cases, and then we must do some
> development to fix the N-2 BWC codec (as well as the N-1 BWC codec that we
> already must fix for such an example, today).
>
> Some users seem to keep their indices alive for a very long time!
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Sat, Jan 9, 2021 at 6:13 AM Simon Willnauer <simon.willna...@gmail.com>
> wrote:
>
>> I can provide some examples of BWC issues and what we would do if it
>> happened in the future:
>>
>> - negative offsets: in this case it would be best effort to add a
>> wrapper around the older formats that checks on the read side whether
>> the offsets go backwards and throws an exception, so that consumers
>> which assume that offsets only go forward are kept from failing or
>> going OOM etc.
>> - norms encoding: in this case it would be best effort in the older
>> norms formats to convert to the newer encodings.
>> - the removal of numeric field queries would not fall under the
>> promises we make with compatibility of N-2 and it would be the
>> responsibility of the user to keep the code around that understands
>> the value of a field.
>>
>> I hope this clarifies some of the aspects?
>>
>> We would only do all this for the reading end; for writing we would
>> reject indices that are older than N-1.
>>
>> simon
>>
>>
>> On Thu, Jan 7, 2021 at 8:04 PM jim ferenczi <jim.feren...@gmail.com>
>> wrote:
>> >
>> > The proposal is only about keeping the ability to read file formats up
>> > to N-2. Everything that is done on top of the file format is not
>> > guaranteed and should be supported on a best-effort basis.
>> > That's an important aspect if we don't want to block innovation. So in
>> > practice that means that queries that require some specific file format
>> > or analyzers that change behavior in major versions would not be part of
>> > the extended guarantee.
>> >
>> >
>> > On Wed, Jan 6, 2021 at 9:53 PM Yonik Seeley <ysee...@gmail.com> wrote:
>> >>
>> >> On Wed, Jan 6, 2021 at 4:40 AM Simon Willnauer
>> >> <simon.willna...@gmail.com> wrote:
>> >>>
>> >>>  You can open a reader on an index created by
>> >>> version N-2, but you cannot open an IndexWriter on it
>> >>
>> >>
>> >> +1
>> >> There should definitely be more consideration given to back compat in
>> >> general... it's caused a ton of pain to users over time.
>> >>
>> >> -Yonik
>> >>
>> >>
>>

-- 
Adrien
