[ https://issues.apache.org/jira/browse/LUCENE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130262#comment-14130262 ]
Robert Muir commented on LUCENE-5940:
-------------------------------------

{quote}
are there technical issues here i'm unaware of beyond creating and maintaining the backwards compat tests? something outside of the codec mechanism that causes problems?
{quote}

There are plenty. First of all, maintaining back compat codecs has a real cost to improving Lucene in the future, because if e.g. I want to make a change to the codec API, I have to deal with tons of medieval index formats. The same goes for structural changes like making docvalues updatable (Shai had to fight a lot here). Even simple code refactoring is expensive because it's just a ton of code.

Also, the old codecs lag behind on features. They might not support offsets in the postings, payloads in the term vectors, missing bitsets for docvalues, whole data structure types (SORTED_SET/SORTED_NUMERIC), or even whole parts of the index (3.x has no docvalues at all). They are missing various useful statistics, etc. These are just the ones I've worked on myself recently; there are more, and there are more coming (like Mike's range prefix feature). This makes things like testing difficult.

Backwards compat drags around a lot of stuff for a long time (see the packed ints API) that makes the code more complex and harder to work with and change. It prevents and discourages real improvements to Lucene.

There are plenty of bugs in the back compat; the last few indexes have been riddled with them, some of them bad. It's undertested, overcomplex, and undermaintained. Again, it's not sexy stuff to work on, so nobody wants to improve it.

Finally, users want to have more options, but until we can minimize this backwards compat, I'm personally going to push back very hard on any "options", because we simply cannot take on more back compat. So the codec API goes mostly wasted. Maybe we should rename it the "backcompat" API, because that's all it's currently good for.
Backcompat hurts the users in this case. If we didn't have so many ancient formats, we could instead provide (and actually support) "breadth", such as various options for the way to encode data, so users really can take advantage of it.

> change index backwards compatibility policy.
> --------------------------------------------
>
>                 Key: LUCENE-5940
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5940
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>
> Currently, our index backwards compatibility is unmanageable. The length of
> time in which we must support old indexes is simply too long.
> The index back compat works like this: everyone wants it, but there are
> frequently bugs, and when push comes to shove, it's not a very sexy thing to
> work on/fix, so it's hard to get any help.
> Currently our back compat "promise" is just a broken promise, because we
> cannot actually guarantee it for these reasons.
> I propose we scale back the length of time for which we must support old
> indexes.