Samza 0.10 introduces YARN host affinity for exactly this reason. For jobs that need to bootstrap a lot of state, long downtime during bootstrapping is not acceptable. In our production use cases, we've seen bootstrap times drop from 25 minutes to about 30 seconds.
Please refer to https://samza.apache.org/learn/documentation/0.10/yarn/yarn-host-affinity.html for the configs needed to take advantage of this feature.

On Thu, Feb 18, 2016 at 8:35 AM, Leo Woessner <est...@gmail.com> wrote:

> We are starting to use the key-value store with rocksdb. We are trying to
> officially add Samza to our stack and functionally everything is great. But,
>
> I am seeing minutes to hours restore time. Does anyone have any benchmarks
> on data size versus restore time? My big question is how will this scale.
>
> Thanks in advance
>
> --
> Leo Woessner
>

--
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University