Re: Future of FieldCache in Solr

Yonik Seeley Thu, 27 Oct 2016 11:46:08 -0700

On Thu, Oct 27, 2016 at 5:10 AM, Adrien Grand <jpou...@gmail.com> wrote:
> [...] the discrepancy between the best practices as recommended by Lucene

Different discrepancies have different reasons... one shouldn't
attempt to paint them all with the same brush.
- For some issues, it may simply be a lack of volunteers (see "Why
Solr doesn't use Point fields" below).
- For some issues, there may not actually be consensus (and I've been
a Lucene committer since before Solr even saw the light of day, so hopefully
my opinions would be included in the "recommended by Lucene").
- For some issues, there may be consensus to "recommend X", but that
doesn't imply a consensus to prohibit all alternatives.

With respect to "using doc values", I think most agree on that default
recommendation. It doesn't necessarily follow that everyone agrees
that FieldCache should be prohibited or should go away entirely.  The
fact that it was removed from Lucene so quietly under a JIRA
originally entitled "Move SlowCompositeReaderWrapper to solr sources"
(LUCENE-7283) was definitely surprising to many.

Why Solr doesn't use Point fields for numerics (yet):

Some of the same Lucene/Solr committers that worked on Points in
Lucene also added support in Elasticsearch as well as "Lucene Server"
( https://github.com/mikemccand/luceneserver ).
One might as well complain that it took until 2016 for Lucene to get
proper numeric index support.
This is volunteer development, and Tomas has been the only person to
find time to work on Points support.

Why doesn't Solr use SortedNumericDocValues (yet):

Same as above.  Some Lucene/Solr committers added support to
Elasticsearch and to luceneserver, but none have gotten around to
adding support to Solr yet.  I don't think it's anyone's fault.  No
one has argued against using SortedNumericDocValues (or Points for
that matter).  But neither has anyone stepped up and contributed the
work.

Finally, Solr is not just an "example" of how to use Lucene.  While
Lucene seems to rapidly change APIs every major release (and some
backward-incompatible changes are just because someone likes a name better),
Solr does not have that luxury.  We have many users that depend on us,
and we've already made it hard enough for people to move, and too many
people are stuck back on v4.x.

-Yonik

> such as using doc values, consuming readers per segment and using points for
> numerics and the fact that Solr, which should be the greatest example of how
> to use Lucene, is not totally following them. I agree that things are
> changing fast sometimes and changing is not easy, but some of the things we
> are talking about here are more than 4 years old. I am also concerned that
> these things are accumulating, for instance I can already see how the
> integration of points is made harder by the fact that fields are supposed to
> support uninverting. And even if we decide that the new point fields do not
> need to support uninverting, then it will give users the feeling that these
> new point fields are not feature-complete, which would be a pity. On a
> similar note, I hope that we can prevent new indices from using legacy
> numerics as of Solr 7 so that the implementation can be completely removed
> in Solr 8 and points will be the only way to index numerics.
>
> If we decide that Solr should keep supporting uninverting, I will accept it,
> but I genuinely think it would be better for Solr to eventually drop this
> feature and require doc values for sorting, facets and functions.
>
> On Wed, Oct 26, 2016 at 17:46, Yonik Seeley <ysee...@gmail.com> wrote:
>>
>> I understand that the existence of the FieldCache meant additional
>> work when changing the DocValues APIs.
>> Those APIs are core enough however, that hopefully they won't change
>> too much in the future!
>> In the event that they do though, I'll try and help out with any
>> required transition.
>>
>> -Yonik
>>
>>
>> On Wed, Oct 26, 2016 at 11:34 AM, Adrien Grand <jpou...@gmail.com> wrote:
>> > I hear you that FieldCache has different trade-offs. That said, doc
>> > values
>> > look superior to me for a vast majority of users so I wish that we spent
>> > energy on improving doc values, making IndexSearcher aware of index
>> > sorting,
>> > or integrating points into Solr, which are more interesting ways to make
>> > Solr faster to me than spending energy keeping FieldCache and
>> > uninverting
>> > alive.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>
