Hey all,
  We use ES as our indexing subsystem.  Our canonical record store is 
Cassandra.  Due to the denormalization we perform for faster query speed, 
the state of documents in ES can occasionally lag behind the state of our 
Cassandra instance.  To accommodate this eventually consistent system, our 
query path is the following.


Query ES -> returns scrollId and document ids -> Load entities from 
Cassandra -> Drop "stale" documents from results and asynchronously remove 
them from ES.
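
To make the "drop stale documents" step concrete, here is a rough Python 
sketch of that filter. It is only an illustration, not our production code: 
the `version` field on each hit, the `load_version` callback standing in 
for the Cassandra lookup, and the staleness rule itself are all assumptions 
made for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical shape: each ES hit carries the document id and the version
# that was indexed. Neither field name comes from a real mapping.
Hit = Dict[str, object]


def split_current_and_stale(
    hits: List[Hit],
    load_version: Callable[[str], Optional[int]],
) -> Tuple[List[str], List[str]]:
    """Return (current_ids, stale_ids) for one page of ES hits."""
    current: List[str] = []
    stale: List[str] = []
    for hit in hits:
        doc_id = str(hit["id"])
        canonical = load_version(doc_id)  # stand-in for the Cassandra read
        # Assumed rule: a hit is stale if the entity no longer exists in the
        # canonical store, or its indexed version differs from the canonical one.
        if canonical is None or canonical != hit["version"]:
            stale.append(doc_id)
        else:
            current.append(doc_id)
    return current, stale


if __name__ == "__main__":
    # In-memory stand-in for the canonical store: id -> current version.
    canonical_versions = {"a": 2, "b": 1}
    page = [
        {"id": "a", "version": 2},  # matches the canonical version
        {"id": "b", "version": 0},  # indexed version is behind
        {"id": "c", "version": 1},  # missing from the canonical store
    ]
    current, stale = split_current_and_stale(page, canonical_versions.get)
    print(current, stale)
```

The stale ids are what we queue for asynchronous removal, and the length of 
the current list is what falls short of the requested page size.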


If the user requested 10 entities, and only 8 were current, we will be 
missing 2 results and need to make another trip to ES to satisfy the result 
size.  Is it possible to do the following?


//get first page with 10 results
POST /testindex/mytype/_search?scroll=1m {"from":0, "size": 10} 

POST /testindex/mytype/_search?scroll_id=<result from previous> {"size":2}


I always seem to get back 10 on my second request.  If no other option is 
available, I'd rather truncate our result set to fewer than the user's 
requested size than take the performance hit of using from and size to drop 
results.  Our document counts can be 50 million+, so I'm concerned about the 
performance implications of not using scroll.


Thoughts?


Thanks,
Todd



