Re: 6.0 Release

Joel Bernstein Thu, 31 Mar 2016 13:38:18 -0700

So what's the latest plan on the release?

I just backported SOLR-8888 to branch_6x. If the release is imminent I'll
hold off on releasing it until 6.1. But if this release continues to be
delayed I'd like to backport it to branch_6_0.


Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, Mar 31, 2016 at 12:16 PM, Jack Krupansky <[email protected]>
wrote:

> Reindexing for the proposed changes in numeric fields...
>
> We in Solr land have this split personality about reindexing - sometimes
> blithely telling users "oh, if you want to make that schema change, then
> you will have to reindex (all of your data)" and then insisting on index
> compatibility and that installing a new release will "not require
> reindexing." When I was at Lucid with their packaged version of Solr,
> automatically reindexing was a single-click operation, so it required nary
> a second thought. But in raw Solr land reindexing is a cause of great
> concern, anxiety, pain, and in many cases outright impossibility. The latter
> is typically because using stored="true" for all of your fields is considered
> a very bad thing.
>
> But what's the story in Elasticsearch land? First, they have this concept
> of a "_source" field which can keep a full copy of the entire original
> input document, so that any document can always be fully updated and...
> reindexed. They also have a scrolling feature to make it easy to bulk copy
> from one index into another. Again, making it easy to reindex or migrate
> from an old index to a fresh new one.
>
> And now, we read in a recent blog post that "Reindex is coming!", making
> reindexing even easier than ever in ES:
> https://www.elastic.co/blog/reindex-is-coming
>
> In short, reindexing is much less of a huge deal in ES land than it is in
> Solr. IOW, telling users "you must reindex" is not the end of the world for
> ES users.
>
> So, the point of all of this is to ask a question of the Elasticsearch
> guys (also known as Lucene guys): How does Elasticsearch plan on dealing
> with this transition from current numerics to dimensional points? The
> recent blog post is here:
> https://www.elastic.co/blog/lucene-points-6.0
>
> It merely says "As of this writing, Elasticsearch has not yet exposed
> points, but I expect that will change soon." The question I have is will ES
> simply tell users "you must reindex" to use a future release of ES that is
> based on Lucene 6.0, or... will ES offer some index migration tool, or...
> will ES automatically and transparently upgrade existing ES numeric fields
> to dimensional points, or... will ES support both numeric formats, or...
> will ES have some JSON syntax for selecting between the two numeric formats?
>
> Not that I actually expect ES to fully disclose future product plans here,
> but... do they actually have some kind of secret plan to make users fully
> happy with this transition of numeric formats, or do they simply plan to
> say "you must (manually) reindex", or... have they in fact not yet thought
> through these migration issues?
>
> The real point here is that it will be senseless for the Solr guys to work
> through and propose a sensible migration plan (including the decision as to
> when and how the current numerics will be deprecated) if ES staff have some
> hidden plan in play. If it really is simply a matter that ES hasn't thought
> through the migration process or that easy reindexing is the ES answer,
> then fine, but it would be helpful to state that explicitly. I'm not making
> any presumption about which of these scenarios is the truth, just trying to
> figure out what's really going on, and whether easy reindexing in ES is at
> the root of this mad push to deprecate current numerics before dimensional
> points are fully baked.
>
>
> -- Jack Krupansky
>
> On Thu, Mar 31, 2016 at 11:39 AM, Jack Krupansky <[email protected]
> > wrote:
>
>> Robert's detailing of the remaining work to get the rest of Lucene off of
>> current (current release, soon to be legacy) numerics is enlightening.
>>
>> Personally, I had thought that it was Solr that was holding up an
>> imminent Lucene/Solr 6.0 release, but now I'm thinking:
>>
>> 1. The new "point stuff" (did I mention that I didn't like or approve of
>> the current name?) seems more like a work in progress...
>> 2. I'd label the "point stuff" as experimental for 6.0.
>> 3. I wouldn't hold up 6.0 for any further baking of the "point stuff" or
>> migration of other Lucene features off current numeric types.
>> 4. The rest of Lucene can be weaned off the current numerics at a more
>> leisurely pace, like for 6.1 or 6.2.
>> 5. Once the new "point stuff" is finally fully baked, and the rest of
>> Lucene is migrated off current numerics, and... Solr has made "point stuff"
>> its default numeric type (6.1 or 6.2?), AND Lucene or Solr comes up with a
>> sound migration plan and/or index migration tool for current numerics, only
>> THEN should the current numerics become deprecated.
>> 6. I'm not absolutely certain, but I think the 6.0 changes to Solr to use
>> the Lucene LegacyXXX numeric field types should be fine for an initial 6.0
>> release, meaning backcompat is assured for 6.x.
>> 7. I'm imagining that with a manually-invoked index upgrade tool a
>> current (5.x) numeric field can be migrated to a "point stuff" field type.
>> A Lucene heavy will have to confirm that feasibility.
>> 8. I'm imagining that a typical Solr site would be okay with the
>> requirement that they have to explicitly, manually run such an index
>> upgrade tool to migrate from current (5.x) numerics to "point stuff". And
>> that they could either do that once Solr adds support for "point stuff"
>> fields or when they migrate from 6.x to 7.x. Bonus points if Solr can have
>> a variation of the index upgrade tool that discovers and upgrades all
>> current numeric fields.
>>
>> What else? (I'll ask some questions about Elasticsearch plans in a
>> separate message.)
>>
>>
>>
>>
>> -- Jack Krupansky
>>
>> On Thu, Mar 31, 2016 at 12:31 AM, David Smiley <[email protected]>
>> wrote:
>>
>>> That was an excellent summary Rob; thanks.
>>> Minor nit: BBoxSpatialStrategy isn't/wasn't deprecated.  It was enhanced
>>> to use PointValues.
>>>
>>> I too would like to see the legacy numerics stay in "backwards-codecs"
>>> as you describe with precisionStep specified on the Analyzer.
>>>
>>> I disagree with Shawn about #5, that a user with a Solr 6.0 index must
>>> be able to upgrade straight to 7.0.  Perhaps this has been the case for
>>> every major release in the past, and it would be nice if it continues if
>>> for no other reason than consistency.  But, IMO, that's kind of cosmetic --
>>> it isn't important.  What matters is that an eventual 6.x release occurs
>>> that allows someone to upgrade to 7.0 -- that there's a path forward.  And
>>> that one can always upgrade from one 6.x release to any greater 6.x release.
>>>
>>> Quoting Adrien:
>>> bq. Detour: In the future I wonder whether we should consider having
>>> separate release cycles again. In addition to giving Solr more time to use
>>> new Lucene features like here, it would also remove the issue that we had
>>> when releasing 5.3.2 after 5.4.0, which makes perfect sense from a Solr
>>> perspective but not from Lucene since it introduces blind spots in the
>>> testing of index backward compatibility.
>>>
>>> +1 to that!  I've had that thought.  It would be awesome for Solr to
>>> release when it feels it's right, independently of Lucene.  If that's too
>>> difficult/problematic then perhaps keep synchronizing releases but allow
>>> Lucene & Solr's release versions to vary. Then we'd be having a Solr 5.6
>>> release here.
>>>
>>> ~ David
>>>
>>> On Wed, Mar 30, 2016 at 9:39 PM Robert Muir <[email protected]> wrote:
>>>
>>>> On Wed, Mar 30, 2016 at 12:43 PM, Adrien Grand <[email protected]>
>>>> wrote:
>>>> > Hi Shawn,
>>>> >
>>>> > I think marking the legacy fields/queries as deprecated in Lucene in
>>>> > 6.0 is the right thing to do in order to encourage users to migrate to
>>>> > the new points API. If Solr needs to keep them around for 7.x, it would
>>>> > be fine to move them from lucene/ to solr/ instead of a hard removal.
>>>> > Given that it works on top of the postings API, it would not break.
>>>>
>>>> Also see my issue (https://issues.apache.org/jira/browse/LUCENE-7075)
>>>> where I proposed to at least get things headed to the backwards/ jar.
>>>> And the uninverting issue is still being discussed. If you look at
>>>> linked issues you will see the deprecated encoding is involved with
>>>> the following modules:
>>>>
>>>> * core (not just field/query/utils classes, but stuff like
>>>> precisionStep in the .document api!)
>>>> * spatial (Deprecated GeoPoint encoding etc)
>>>> * spatial-extras (Deprecated Bbox encoding etc)
>>>> * misc (UninvertingReader)
>>>> * queryparser (flexible and xml)
>>>> * join
>>>>
>>>> The purpose of that issue is to make sure people have the stuff they
>>>> need to move their code off the old encoding. I personally thought this
>>>> would make the transition easier, and it was finding bugs/problems in
>>>> points and improving the APIs. I imagined it would just be me, but I
>>>> created a ton of linked issues all up front just in case. I did not
>>>> think anyone else would really be excited to work on these, because
>>>> it's not particularly exciting stuff, but thanks Nick, David, Martijn,
>>>> etc who did. I didn't try to plan any grandiose schemes of *actually
>>>> pulling the old encoding out* because this was plenty on its own. I
>>>> tried to work on the fieldcache only because I was talking to Tomas
>>>> and he mentioned it as a difficulty in cutting over solr. But I bailed
>>>> after encountering complexity, and don't think it is the way to go,
>>>> read the issue for my explanation.
>>>>
>>>> To me, this is why we have a backwards compatibility policy for N-1,
>>>> it has to be a volunteer thing for some of this stuff: can't all be on
>>>> Mike.
>>>>
>>>> I do personally think it is enough to release, "removing" or "moving"
>>>> deprecations is something to worry about for master branch.
>>>>
>>>> I did mention in the issue an idea for a first step would be to get
>>>> the core/ stuff pulled out somewhere better.  Maybe the core/ stuff
>>>> should go to the backwards-codec jar if we can detangle the
>>>> deprecations from the .document api (e.g. maybe precisionStep can be a
>>>> parameter on a tokenizer or analyzer or something, so it's a little bit
>>>> harder to use, but still works and not holding back core/'s .document
>>>> api). But what to do about the other stuff?
>>>>
>>>> If I wanted to start removing deprecations now, I would be trying to
>>>> just factor out the core/ NumericRangeQuery/NumericField stuff out to
>>>> the backwards-codec jar. I hate modules depending on other ones, I
>>>> really do, but just to iterate, I'd temporarily make all those other
>>>> modules depend on backwards-codec/ jar and then remove deprecations
>>>> from each one-by-one. It's too much to do all at once. I think we can
>>>> do it this way iteratively without breaking solr.
>>>>
>>>> If solr wants to hang on to e.g. some spatial field with old numerics
>>>> for an additional time (since it was still using it for 6.0), then the
>>>> deprecated spatial field can be moved to solr. If not, let's nuke it.
>>>>
>>>> To me this seems the least controversial path, and it's something that
>>>> can be done iteratively. It has the downside of keeping "core"
>>>> deprecated legacy numerics around for an extra major release in the
>>>> backwards-codec jar. I think this "extra" back compat is ok in this
>>>> case. Uwe made clean code :)
>>>>
>>>> --
>>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>>> http://www.solrenterprisesearchserver.com
>>>
>>
>>
>
