[
https://issues.apache.org/jira/browse/LUCENE-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135876#comment-17135876
]
Michael McCandless commented on LUCENE-9395:
--------------------------------------------
{quote}Hello! {{org.apache.lucene.search.DoubleValuesSource#getValues}} is
called about once per segment, and so it's not an allocation hotspot.
{quote}
Hmm it is true there are other possible allocation hot spots (e.g. anything
per-hit, or per collected hit, etc.), but 1) some indices have a long tail of
small segments (LUCENE-8962 is trying to help there), 2) if you use many
dynamic expressions, you might be creating quite a few constant
{{DoubleValues}} per segment and 3) every little bit of allocation does add
some (yes, tiny, but still some) load to GC. Likely most of these allocations
"die young" and so the added cost is really small, I agree.
{quote}There are tons of implementations of this method and similarly for
{{org.apache.lucene.search.Weight#scorer}} as well which is also called about
once per segment that typically allocate a bunch of stuff.
{quote}
It is true there are other places that allocate per-segment objects, but I do
not think that is a valid argument against fixing this one? That is like
saying "the world is already dirty so why should I pick up this trash lying on
the sidewalk myself and throw it away?".
{quote}I don't think it's worth bothering changing the code.
{quote}
Whose/what "bother" are you referring to here? We commiters who would actually
push the change? I would say the "bother" was really on [~hypothesisx86] who
has already taken the initiative here to contribute a small improvement. Thank
you [~hypothesisx86]!
{quote}There is a very slight help for the GC (that I doubt you could even
measure) and a very slight negative impact on complexity.
{quote}
+1.
I also doubt it is measurable over the noise limit, but I don't think for small
improvements like this that we really must be able to measure the gain as a
blocker to improving. Say I find a change that removes one multiplication or
addition per-hit at defaults. Likely I could not prove that change moves the
needle, yet, it is still an improvement that we should want to make (all else
being equal). And we should not flaunt waste in Lucene's sources.
Yes, there is a miniscule change in code complexity, but it's exceptionally
tiny in my opinion.
I think when a newish developer offers a small contribution rather than
replying with "let's not bother", we should strongly welcome and encourage it.
This is how a healthy open-source community grows!
> ConstantValuesSource creates more than one DoubleValues unnecessarily
> ----------------------------------------------------------------------
>
> Key: LUCENE-9395
> URL: https://issues.apache.org/jira/browse/LUCENE-9395
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 8.5.2
> Reporter: Tony Xu
> Priority: Minor
> Attachments: LUCENE-9395.patch
>
>
> At my day job, we use ConstantValuesSource to represent default values or a
> constant query-level feature by calling *_DoubleValuesSource.constant_*. I
> realized under the hood the _*ConstantValuesSource.getDoubleValues*_ creates
> a new _*DoubleValues*_ which simply return the specified value each time it
> is called.
> Unless I missed something, I don't see a risk of creating one
> _*DoubleValues*_ as use it as the return value of all _*getDoubleValues**()*_
> calls given that the constant _*DoubleValues*_ doesn't maintain any state.
> We can also offer the user flexibilities of how to initialize it.
> 1) _*DoubleValuesSource.constant(double constant)*_ – we can eagerly
> initialize an `DoubleValues` that returns the constant and make it the return
> value of all _*getDoubleValues()*_ calls.
> 2) _*DoubleValuesSource.constant(DoubleSupplier doubleSupplier)*_ – For lazy
> evaluation if the constant takes some time to compute and user expects the
> returned DVS will not be used in all code path.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]