[ 
https://issues.apache.org/jira/browse/SOLR-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15785486#comment-15785486
 ] 

Joel Bernstein commented on SOLR-9636:
--------------------------------------

I added a new NullStream to test the performance of exporting and sorting on a 
high-cardinality field. This is a much more realistic scenario for supporting 
distributed joins on primary keys. The query looks like this:
 
{code}
parallel(collection2, workers=7, sort="count desc", 
      null(search(collection1, 
                   q=*:*, 
                   fl="id", 
                   sort="id desc", 
                   qt="/export", 
                   wt="javabin", 
                   partitionKeys=id)))
{code}

Notice the new *null* function, which consumes the tuples and returns a count 
to verify the number of tuples processed.
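
To illustrate the idea behind *null*, here is a minimal sketch in plain Java. It is not the actual NullStream code from the patch: the class name `NullSinkSketch`, the method `drain`, and the use of `Map<String, Object>` as a stand-in for a tuple are all assumptions made for illustration. It only shows the pattern of reading every tuple, discarding it, and reporting how many were read.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Solr's NullStream implementation.
public class NullSinkSketch {

    /** Reads every tuple from the source, drops it, and returns the number read. */
    static long drain(Iterator<Map<String, Object>> source) {
        long count = 0;
        while (source.hasNext()) {
            source.next(); // the tuple is read and immediately discarded
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in for tuples coming back from the wrapped search(), sorted by id desc.
        List<Map<String, Object>> tuples = List.of(
            Map.<String, Object>of("id", "c"),
            Map.<String, Object>of("id", "b"),
            Map.<String, Object>of("id", "a"));

        System.out.println("count=" + drain(tuples.iterator()));
    }
}
```

Because the sink never buffers or forwards tuples, the measured time is dominated by the export, sort, and transport work upstream, which is what the test is meant to isolate.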

The test query sorts on the id field, which has a unique value in each 
record. Again, performance was impressive:

* With json: 1,210,000 tuples per second.
* With javabin: 1,350,000 tuples per second.

So the ExportWriter doesn't slow down when sorting on a high-cardinality field.
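
For a rough sense of the gap between the two formats, the arithmetic can be sketched as follows. The class and method names are hypothetical; the only inputs are the two throughput figures reported above. With those figures, javabin comes out about 11.6% faster than json in this test.

```java
// Hypothetical helper for comparing the two reported throughput figures.
public class ThroughputComparison {

    /** Percentage by which candidate throughput exceeds baseline throughput. */
    static double speedupPercent(double baseline, double candidate) {
        return (candidate / baseline - 1.0) * 100.0;
    }

    /** Average time spent per tuple, in nanoseconds, at a given throughput. */
    static double nanosPerTuple(double tuplesPerSecond) {
        return 1_000_000_000.0 / tuplesPerSecond;
    }

    public static void main(String[] args) {
        double json = 1_210_000;    // tuples per second, from the test above
        double javabin = 1_350_000; // tuples per second, from the test above

        System.out.printf("javabin vs json: +%.1f%%%n", speedupPercent(json, javabin));
        System.out.printf("json:    %.0f ns per tuple%n", nanosPerTuple(json));
        System.out.printf("javabin: %.0f ns per tuple%n", nanosPerTuple(javabin));
    }
}
```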





> Add support for javabin for /stream, /sql internode communication
> -----------------------------------------------------------------
>
>                 Key: SOLR-9636
>                 URL: https://issues.apache.org/jira/browse/SOLR-9636
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>             Fix For: master (7.0), 6.4
>
>         Attachments: SOLR-9636.patch
>
>
> currently it uses json, which is verbose and slow



