cpoerschke commented on PR #3648:
URL: https://github.com/apache/solr/pull/3648#issuecomment-3313059019
Thanks @ercsonusharma for starting to take a look here!
Yes, there's relatively little code but I totally appreciate that it's not
easy to understand.
> ... with distributed tracing enabled to a local Zipkin/Jaeger to visualize
> what's going on. ...
That's an interesting idea, thanks @dsmiley for sharing! @hossman's
_"Lifecycle of a Solr Search Request"_ talk slides and recording from
quite-a-while-ago-now also spring to mind for me here.
Brief replies to some of your initial observations:
> ... Creating a new stage in the distributed phase may create lots of
> complexity around other components. ...
Yes, for each component it will need to be considered how it should behave
in a fusion scenario and whether or not it should participate in the fusion
stage, or just skip it, which is the implied default.
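To make the "implied default" concrete, here is a minimal sketch in plain Java. It is not Solr's `SearchComponent` API: the `Component` interface, the `participatesInFusion` method and the component names are all hypothetical, invented only to illustrate "skip unless the component opts in".

```java
import java.util.ArrayList;
import java.util.List;

public class FusionStageSketch {

  /** Hypothetical stand-in for a search component; NOT Solr's SearchComponent API. */
  interface Component {
    String name();

    /** Implied default: a component skips the fusion stage unless it opts in. */
    default boolean participatesInFusion() {
      return false;
    }
  }

  /** Returns the names of the components that opted in to the fusion stage. */
  static List<String> fusionParticipants(List<Component> components) {
    List<String> names = new ArrayList<>();
    for (Component c : components) {
      if (c.participatesInFusion()) {
        names.add(c.name());
      }
    }
    return names;
  }

  public static void main(String[] args) {
    // componentA inherits the default and therefore skips the fusion stage.
    Component componentA = () -> "componentA";
    // componentB explicitly opts in.
    Component componentB = new Component() {
      @Override
      public String name() {
        return "componentB";
      }

      @Override
      public boolean participatesInFusion() {
        return true;
      }
    };
    System.out.println(fusionParticipants(List.of(componentA, componentB))); // prints [componentB]
  }
}
```

With a default like this, existing components would need no change to keep skipping the stage; only the ones that should take part would have to be touched.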
> ... Query work is already happening in parallel by design of the
> distributed process. ...
The distributed process provides parallelism at the shard level, e.g. if we
have 10 shards then all 10 will run in parallel. Within each shard, however, as
I understand the current #3418 code, the `CombinedQueryComponent.process`
method will run one sub-query after the other; the comment at line 181
currently says "// TODO: to be parallelized".
The parallelism here in #3648 is at the shard level _and also_ at the
sub-query level:
* `CombinedQueryComponent.distributedProcess` will, for one sub-query after
the other, _add a request_ to the outgoing queue of requests,
* the search handler will send each added request to the 10 shards (so with
two sub-queries we now have 20 things happening in parallel!!),
* the shards will process each request sent to them (running exactly one of
the sub-queries per request),
* the 20 responses will be received and handled (with the subtlety of
needing to make sure that the response is handled only by whoever added it).
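
The steps above can be sketched roughly as follows. This is a self-contained simulation in plain Java, not the actual Solr code: `SubRequest`, `ShardResponse`, `fanOut`, `handledBy`, the owner names and the sub-query names are all made up for illustration, and the "shard work" is just building a string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FanOutSketch {

  /** One outgoing request: who added it (hypothetical owner tag) and which sub-query it carries. */
  record SubRequest(String owner, String subQuery) {}

  /** One shard response, tagged with the request that produced it. */
  record ShardResponse(SubRequest request, int shard, String payload) {}

  /** Sends every added request to every shard; all (request, shard) pairs may run concurrently. */
  static List<ShardResponse> fanOut(List<SubRequest> outgoing, int numShards) {
    ExecutorService pool =
        Executors.newFixedThreadPool(Math.max(1, outgoing.size() * numShards));
    try {
      List<CompletableFuture<ShardResponse>> futures = new ArrayList<>();
      for (SubRequest req : outgoing) {
        for (int shard = 0; shard < numShards; shard++) {
          final int s = shard;
          // Each shard runs exactly one sub-query per request it receives.
          futures.add(CompletableFuture.supplyAsync(
              () -> new ShardResponse(req, s, "docs for " + req.subQuery() + " from shard " + s),
              pool));
        }
      }
      List<ShardResponse> responses = new ArrayList<>();
      for (CompletableFuture<ShardResponse> f : futures) {
        responses.add(f.join());
      }
      return responses;
    } finally {
      pool.shutdown();
    }
  }

  /** Keeps only the responses whose request was added by the given owner. */
  static List<ShardResponse> handledBy(String owner, List<ShardResponse> responses) {
    List<ShardResponse> mine = new ArrayList<>();
    for (ShardResponse r : responses) {
      if (r.request().owner().equals(owner)) {
        mine.add(r);
      }
    }
    return mine;
  }

  public static void main(String[] args) {
    List<SubRequest> outgoing = new ArrayList<>();
    // One request is added per sub-query (two sub-queries assumed here).
    for (String subQuery : List.of("q1", "q2")) {
      outgoing.add(new SubRequest("combined-query", subQuery));
    }
    // A request added by some other (hypothetical) component.
    outgoing.add(new SubRequest("other-component", "q3"));

    List<ShardResponse> responses = fanOut(outgoing, 10);
    // 2 sub-queries x 10 shards; the other component's responses are ignored.
    System.out.println(handledBy("combined-query", responses).size()); // prints 20
  }
}
```

The owner tag on each request is what stands in for the "handled only by whoever added it" subtlety: responses to requests added by another component are filtered out rather than fused.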
> ... how the faceting & highlighting would work.
Answering only on highlighting for now: I've advanced the
proof-of-concept further so that the test case does cover highlighting. The
highlighting test passes, and it may help in developing an understanding of how
that extra fusion stage fits into the picture.
> ... with distributed tracing enabled to a local Zipkin/Jaeger to visualize
> what's going on. I used to have a shelved code snippet to instrument a test to
> do this.
@dsmiley - I don't suppose you'd be in a position to dig up or dust off that
code snippet?