Hi all,

I'm looking at using embedded Solr as the KeyValueStore, as that lets me 
extract ranked results from the state to publish as part of the task's 
operation.

Some of the methods defined by KeyValueStore are problematic, though - 
specifically the range() and all() methods that return iterators.

Iterating over lots of results in Solr, while more feasible with newer paging 
support, is still an abuse of its architecture :)

So I'm wondering whether I need to support those methods, or whether they are 
only called by task code (e.g. my task) and can thus be left unimplemented.
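
To make "optional" concrete, here is a minimal sketch of what I have in mind. 
SimpleStore is a hypothetical, simplified stand-in that I wrote for 
illustration; it is not Samza's actual KeyValueStore interface. 
PointLookupStore is likewise a made-up name, and a HashMap stands in for the 
embedded Solr index. Point lookups and writes work, while the iterator 
methods simply refuse to run.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a key-value store contract.
// This is NOT Samza's real KeyValueStore interface.
interface SimpleStore<K, V> {
    V get(K key);
    void put(K key, V value);
    void putAll(List<Map.Entry<K, V>> entries);
    Iterator<Map.Entry<K, V>> range(K from, K to);
    Iterator<Map.Entry<K, V>> all();
}

// Assumption: a HashMap stands in for the embedded Solr index.
class PointLookupStore implements SimpleStore<String, String> {
    private final Map<String, String> index = new HashMap<>();

    @Override
    public String get(String key) {
        return index.get(key);
    }

    @Override
    public void put(String key, String value) {
        index.put(key, value);
    }

    @Override
    public void putAll(List<Map.Entry<String, String>> entries) {
        // Batch writes are the path I assume a changelog restore would use.
        for (Map.Entry<String, String> entry : entries) {
            put(entry.getKey(), entry.getValue());
        }
    }

    @Override
    public Iterator<Map.Entry<String, String>> range(String from, String to) {
        // Bulk iteration is what I would like to avoid with Solr.
        throw new UnsupportedOperationException("range() is not supported by this store");
    }

    @Override
    public Iterator<Map.Entry<String, String>> all() {
        throw new UnsupportedOperationException("all() is not supported by this store");
    }
}
```

This only works if nothing other than my own task code ever calls range() or 
all(), which is exactly what I'm trying to confirm.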

I'm assuming that when state is being automatically restored from a changelog, 
the Samza system is calling putAll(list) repeatedly, but I haven't dug into 
those details. So that would be an example of a required method.

Thanks,

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr




