Just want to update you on this one. After some time spent debugging, I found that the actual problem was in our own code: it was calling next() on a range iterator twice :(. After removing the duplicate call, everything works as expected.
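For anyone who hits the same symptom, here is a minimal sketch of how a duplicate next() call can silently drop results. It is a hypothetical reconstruction, not our actual code: the class and method names are made up, and a plain java.util.Iterator over a TreeMap stands in for the iterator returned by store.range().

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; a TreeMap iterator stands in for the store's range iterator.
class DoubleNextDemo {

    // Buggy: next() is called twice per pass, so the key comes from one entry
    // and the value from the following one, and only half the entries are reported.
    // (With an odd number of entries the second call would throw NoSuchElementException.)
    static List<String> buggyPairs(Iterator<Map.Entry<String, String>> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            String key = it.next().getKey();
            String value = it.next().getValue(); // duplicate call advances the iterator again
            out.add(key + "=" + value);
        }
        return out;
    }

    // Fixed: call next() once and reuse the returned entry.
    static List<String> fixedPairs(Iterator<Map.Entry<String, String>> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            out.add(entry.getKey() + "=" + entry.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<String, String> edges = new TreeMap<>();
        edges.put("n1.a", "1");
        edges.put("n1.b", "2");
        edges.put("n1.c", "3");
        edges.put("n1.d", "4");

        System.out.println(buggyPairs(edges.entrySet().iterator()));
        System.out.println(fixedPairs(edges.entrySet().iterator()));
    }
}
```

With four entries the buggy loop reports only two mismatched pairs, which is the kind of undercount that can look like data missing from the store.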
Thank you!
Alex

On Mon, Nov 16, 2015 at 10:45 PM, Yi Pan <nickpa...@gmail.com> wrote:

> Hi, Alexander,
>
> Sorry to reply late on this one. I embedded my questions and comments
> in between the lines:
>
> On Sun, Nov 15, 2015 at 7:07 PM, Alexander Filipchik <afilipc...@gmail.com>
> wrote:
>
> >
> > nodeIterator = store.range(
> >     String.join(".", nodeId, String.valueOf(Character.MIN_VALUE)),
> >     String.join(".", nodeId, String.valueOf(Character.MAX_VALUE)));
> >
>
> Theoretically, what you want is a prefix scan: the start key should be
> nodeId + '.' and the end key should be nodeId + '.' + maxId, in which maxId
> should have each character = Character.MAX_VALUE, with a total length that
> is equal to or greater than that of the longest possible nodeId.
>
> > I restreamed the RocksDB changelog topic and I can see all these edges
> > stored there, but the query still returns only 4.3M nodes.
>
> Could you help to clarify what you did here to "see all these edges" and to
> "query still returns only 4.3M nodes"?
>
> > 1) Has anyone seen such a behaviour before?
>
> Not that I am aware of.
>
> > 2) What is the best way to debug it on a remote machine? Any particular
> > logs to look for? Any RocksDB config params that should be enabled?
>
> You can try to add the JMX debug port option to task.opts. With Samza 0.10
> (latest from trunk), the JMX server port is reported from the AppMaster's
> web API. As for the state store config, you can try to disable the
> CachedStore to prevent any potential issues w/ cache management.
>
> > 3) Is it a good idea to store a graph in such a format?
>
> As long as you can partition the data based on nodeId, it should be fine.
>
> > Thank you,
> > Alex
>
> Please let us know if you find any issues with your use case.
>
> -Yi
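
The prefix scan Yi describes above can be sketched roughly as follows. This is only an illustration under stated assumptions: a java.util.TreeMap with natural String ordering stands in for the key-value store, the helper names (maxId, prefixScan) and the maxIdLength parameter are hypothetical, and the upper bound is treated as exclusive. A real store compares serialized key bytes, which may not order keys exactly as Java Strings do.

```java
import java.util.Arrays;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration; a TreeMap stands in for the key-value store.
class PrefixScanDemo {

    // Builds a string of `length` characters, each equal to Character.MAX_VALUE.
    static String maxId(int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, Character.MAX_VALUE);
        return new String(chars);
    }

    // Returns every entry whose key starts with nodeId + ".",
    // assuming the part after the dot is shorter than maxIdLength characters.
    static SortedMap<String, String> prefixScan(
            TreeMap<String, String> store, String nodeId, int maxIdLength) {
        String from = nodeId + ".";                    // start key, inclusive
        String to = nodeId + "." + maxId(maxIdLength); // end key, exclusive
        return store.subMap(from, to);
    }

    public static void main(String[] args) {
        TreeMap<String, String> store = new TreeMap<>();
        store.put("n1", "node");
        store.put("n1.a", "edge");
        store.put("n1.b", "edge");
        store.put("n10.a", "edge");
        store.put("n2.a", "edge");

        // Only the "n1." keys fall inside the scanned range.
        System.out.println(prefixScan(store, "n1", 8).keySet());
    }
}
```

Including the '.' separator in the start key is what keeps a node such as n10 out of the results for n1.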