[
https://issues.apache.org/jira/browse/SOLR-8582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112517#comment-15112517
]
Noble Paul commented on SOLR-8582:
----------------------------------
JsonRecordReader keeps around all keys in a root object. In this case the
root happens to be a very large array. The fix is to create a new Set for
each object in the array.
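
To illustrate the retention pattern described above, here is a minimal
sketch. It is not the code from JsonRecordReader or from the attached patch;
the class and method names are hypothetical, and it only models how many
tracked entries stay alive when one Set spans the whole root array versus a
fresh Set per object.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model, not Solr code.
class KeyTrackingSketch {

    // One Set for the whole root array: entries from every element
    // accumulate, so the Set grows with the number of documents.
    static int peakEntriesWithSharedSet(List<Map<String, Object>> docs) {
        Set<String> seen = new HashSet<>();
        int peak = 0;
        for (int i = 0; i < docs.size(); i++) {
            for (String key : docs.get(i).keySet()) {
                // Assumption: each element's keys are tracked separately,
                // modelled here by prefixing the element index.
                seen.add(i + "/" + key);
            }
            peak = Math.max(peak, seen.size());
        }
        return peak;
    }

    // A new Set for each object in the array: the Set never holds more
    // than one document's keys and becomes garbage after each iteration.
    static int peakEntriesWithPerObjectSet(List<Map<String, Object>> docs) {
        int peak = 0;
        for (Map<String, Object> doc : docs) {
            Set<String> seen = new HashSet<>(doc.keySet());
            peak = Math.max(peak, seen.size());
        }
        return peak;
    }

    // Builds `count` small documents with two keys each.
    static List<Map<String, Object>> sampleDocs(int count) {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Map<String, Object> doc = new HashMap<>();
            doc.put("id", i);
            doc.put("title", "doc " + i);
            docs.add(doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = sampleDocs(3);
        System.out.println(peakEntriesWithSharedSet(docs));    // prints 6
        System.out.println(peakEntriesWithPerObjectSet(docs)); // prints 2
    }
}
```

With a shared Set the peak grows linearly with the number of documents, while
the per-object Set stays bounded by the size of a single document.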
> /update/json/docs is 4x slower than /update for indexing a list of json docs
> ----------------------------------------------------------------------------
>
> Key: SOLR-8582
> URL: https://issues.apache.org/jira/browse/SOLR-8582
> Project: Solr
> Issue Type: Bug
> Components: update
> Reporter: Shalin Shekhar Mangar
> Fix For: 5.5, Trunk
>
> Attachments: SOLR-8582.patch, SOLR-8582.patch
>
>
> Indexing a ~650 MB JSON file containing a list of 2.2 million JSON documents,
> I found that bin/post had become 4x slower after SOLR-7042. Memory
> consumption has also gone up, and I can no longer index this file with a
> 512 MB heap.
> The difference is because we now default to /update/json/docs instead of
> /update. This can be verified on trunk:
> {code}
> time curl 'http://localhost:8983/solr/gettingstarted/update' --data-binary @/hdd/solr-data/imdb.json
> {"responseHeader":{"status":0,"QTime":161869}}
>
> real 2m42.044s
> user 0m0.292s
> sys 0m0.493s
>
> time curl 'http://localhost:8983/solr/gettingstarted/update/json/docs' --data-binary @/hdd/solr-data/imdb.json
> {"responseHeader":{"status":0,"QTime":686264}}
>
> real 11m26.478s
> user 0m0.324s
> sys 0m0.552s
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)