The absolute time depends on the cluster resources, of course. On my laptop, for 1000 docs of ~1 KB each on average, the 'took' field of a scroll response usually shows ~200-500 ms. Processing the response hits takes additional time on top of that.
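To make those two timings concrete, here is a minimal Python sketch of a scan/scroll loop over the REST API. It assumes the ES 1.x endpoints (`search_type=scan` on the initial search, then `_search/scroll` with the raw scroll ID as the request body). The function names, the `fetch` parameter, and the defaults of 30s and 100 are chosen for illustration only. Each batch is yielded together with its 'took' value, so the client-side processing time can be measured separately.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def http_get_json(url, body=None):
    """Send a GET request (optionally with a body) and decode the JSON reply."""
    data = body.encode("utf-8") if body is not None else None
    req = Request(url, data=data, method="GET")
    with urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def scroll_batches(base_url, index, keep_alive="30s", size=100, fetch=http_get_json):
    """Yield (took_ms, hits) for every non-empty batch of a match-all scan/scroll.

    `fetch` is a hypothetical hook so the HTTP transport can be swapped out.
    """
    # Initial scan request: in scan mode it returns a scroll ID but no hits.
    params = urlencode({"search_type": "scan", "scroll": keep_alive, "size": size})
    query = json.dumps({"query": {"match_all": {}}})
    resp = fetch(f"{base_url}/{index}/_search?{params}", query)
    scroll_id = resp["_scroll_id"]

    scroll_url = f"{base_url}/_search/scroll?{urlencode({'scroll': keep_alive})}"
    while True:
        # Each scroll request renews the keep-alive for the search context.
        resp = fetch(scroll_url, scroll_id)
        hits = resp["hits"]["hits"]
        if not hits:
            break  # an empty batch means the scroll is exhausted
        yield resp.get("took"), hits
        # Always continue with the most recently returned scroll ID.
        scroll_id = resp["_scroll_id"]
```

Because every scroll request renews the keep-alive, the value only has to cover the time spent processing one batch, not the whole run. In scan mode, `size` applies per shard, so a batch can hold up to `size` times the number of shards in hits.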
I am not sure whether the number of shards is relevant. There are more important factors: number of shards per node, shard size, buffers and heap memory, network compression, network speed, node workload...

If you are interested in a Java scan/scroll example, you can peek into the knapsack plugin source:

https://github.com/jprante/elasticsearch-knapsack/blob/master/src/main/java/org/xbib/elasticsearch/action/RestExportAction.java#L310

A reasonable timeout is critical for a scalable scan/scroll. In the knapsack plugin, I use a default of 30 seconds. The ES docs use a timeout of 10 minutes:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-search-type.html

That does not seem very helpful, as it will put pressure on your heap in almost all cases of a long-running scan/scroll...

Jörg