[ https://issues.apache.org/jira/browse/SOLR-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15785486#comment-15785486 ]
Joel Bernstein commented on SOLR-9636:
--------------------------------------
I added a new NullStream to test the performance of exporting and sorting on a
high-cardinality field. This is a much more realistic scenario for supporting
distributed joins on primary keys. The query looks like this:
{code}
parallel(collection2, workers=7, sort="count desc",
         null(search(collection1,
                     q=*:*,
                     fl="id",
                     sort="id desc",
                     qt="/export",
                     wt="javabin",
                     partitionKeys=id)))
{code}
Notice the new *null* function, which eats the tuples and returns a count to
verify the number of tuples processed.
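To make the role of *null* concrete, here is a minimal Python sketch of the idea. It is not Solr's NullStream code: it only shows a wrapper that drains an inner tuple stream, discards each tuple, and emits a single tuple holding the count. The names `null_stream` and `fake_search` are hypothetical, and the output field name `count` is an assumption taken from the `sort="count desc"` in the query.

```python
from typing import Dict, Iterable, Iterator


def fake_search(n: int) -> Iterator[Dict[str, str]]:
    """Hypothetical stand-in for search(...): yields n tuples with unique ids."""
    for i in range(n):
        yield {"id": str(i)}


def null_stream(inner: Iterable[Dict[str, str]]) -> Iterator[Dict[str, int]]:
    """Drain the inner stream, discard every tuple, and emit one count tuple."""
    count = 0
    for _ in inner:
        # The tuple itself is thrown away; only the tally is kept.
        count += 1
    # Assumed output shape: a single tuple with a "count" field.
    yield {"count": count}


result = list(null_stream(fake_search(1000)))
print(result)
```

Because nothing downstream has to handle the tuples, a wrapper like this lets a benchmark measure the cost of exporting and sorting on its own.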
The test query sorts on the id field, which has a unique value in each
record. Again, performance was impressive:
* With json: 1,210,000 tuples per second.
* With javabin: 1,350,000 tuples per second.
So the ExportWriter doesn't slow down when sorting on a high-cardinality field.
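For a rough sense of the gap between the two writers, the two reported throughput figures can be compared directly. The Python sketch below only does arithmetic on those numbers. The 10,000,000-tuple result set is a hypothetical size chosen for illustration, not something measured in this test, and the helper names are made up.

```python
# Throughput figures reported in this comment (tuples per second).
JSON_TPS = 1_210_000
JAVABIN_TPS = 1_350_000

# Hypothetical result-set size, used only for illustration.
HYPOTHETICAL_TUPLES = 10_000_000


def relative_speedup(baseline_tps: float, improved_tps: float) -> float:
    """Fractional throughput gain of improved_tps over baseline_tps."""
    return (improved_tps - baseline_tps) / baseline_tps


def seconds_to_export(num_tuples: int, tps: float) -> float:
    """Time to stream num_tuples at a constant rate of tps tuples/second."""
    return num_tuples / tps


speedup = relative_speedup(JSON_TPS, JAVABIN_TPS)
json_seconds = seconds_to_export(HYPOTHETICAL_TUPLES, JSON_TPS)
javabin_seconds = seconds_to_export(HYPOTHETICAL_TUPLES, JAVABIN_TPS)

print(f"javabin throughput gain over json: {speedup:.1%}")
print(f"json:    {json_seconds:.2f} s for {HYPOTHETICAL_TUPLES:,} tuples")
print(f"javabin: {javabin_seconds:.2f} s for {HYPOTHETICAL_TUPLES:,} tuples")
```

On these figures javabin is about 12% faster than json for this query.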
> Add support for javabin for /stream, /sql internode communication
> -----------------------------------------------------------------
>
> Key: SOLR-9636
> URL: https://issues.apache.org/jira/browse/SOLR-9636
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Noble Paul
> Assignee: Noble Paul
> Fix For: master (7.0), 6.4
>
> Attachments: SOLR-9636.patch
>
>
> currently it uses json, which is verbose and slow