[
https://issues.apache.org/jira/browse/SOLR-16531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17631331#comment-17631331
]
Noble Paul edited comment on SOLR-16531 at 11/9/22 11:04 PM:
-------------------------------------------------------------
{quote}A 7% slowdown that affects the 0.1% of users deployed in a particular
idiosyncratic way..{quote}
I don't think we should dismiss it by saying that it affects only 0.1% of the
users. Solr is designed to deal with large datasets. There are many tools if
you just want to process a few GB of data that fits on a laptop. But at
petabytes of data there are very few tools, and Solr is targeting that market.
Most of the problems we are trying to solve in Solr stem from the fact that
most of the design choices we made in the beginning were dictated by a few
cores and a few GB of data (e.g. the monolithic {{clusterstate.json}}).
Fullstory (or any big user) spends several million just on GCP/AWS/Azure
bills. A 5% slowdown usually means we need to spend hundreds of thousands
extra on GCP. So they just refuse to upgrade to a newer version of Solr and
keep patching their existing fork. Coincidentally, most of the developers are
sponsored by these big users (Apple, Salesforce, Fullstory, etc.). If the
latest version of Solr is unusable for them, the developers will be asked to
work on the custom fork, and that is not what we want.
We are trying to push the limits of what Solr can do and how much output we can
squeeze out of given hardware. If we add a few KB of heap usage to a core or
spend a few hundred ms per core, it adds up big time for us. If Solr adds a
new opt-in feature that is a bit slow, we don't care. But if it is not opt-in,
it hurts us.
TLDR: We are less concerned about per-node overhead and more concerned about
per-core overhead. Solr has too much unnecessary overhead per core, which
needs to be eliminated. Where possible, more objects/threads should be
shared/cached/eliminated/made optional.
> Performance degradation due to introduction of JAX-RS
> -----------------------------------------------------
>
> Key: SOLR-16531
> URL: https://issues.apache.org/jira/browse/SOLR-16531
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Ishan Chattopadhyaya
> Priority: Blocker
> Fix For: 9.2
>
> Attachments: Screenshot from 2022-11-09 11-20-44.png,
> results-with-patch.tar.gz
>
>
> During performance benchmarking on branch_9x, I observed a slowdown in
> restart performance since commits in SOLR-16347. See attached screenshot.
> CC [~gerlowskija].
> http://mostly.cool/cluster-test-with-patch.html
> The benchmark is here:
> https://github.com/fullstorydev/solr-bench/blob/ishan/repeatable-jenkins/suites/cluster-test.json.
> This suite was run after retroactively applying the parallelStream patch
> from SOLR-16414:
> https://github.com/apache/solr/commit/b33161d0cdd976fc0c3dc78c4afafceb4db671cf.diff
>
> Effort to automate these benchmarks is WIP and tracked here: SOLR-16525.