----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33453/#review81838 -----------------------------------------------------------
Overall looks good to me. It would be nice to add unit tests as well. Thanks! samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala <https://reviews.apache.org/r/33453/#comment132322> Question: shouldn't the base dir to be determined by container ID, not jobId? samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala <https://reviews.apache.org/r/33453/#comment132326> nit: I think that the readablity is a little better if: val storeBaseDir = if (changeLogSystemStreamPartition != null) { loggedStorageBaseDir } else { defaultStoreBaseDir } ... val storageEngine = ... ( storeName, TaskStorageManager.getStorePartitionDir(storeBaseDir, storeName, taskName), ... samza-core/src/main/scala/org/apache/samza/storage/TaskStorageManager.scala <https://reviews.apache.org/r/33453/#comment132329> How do we know that this store is using the default storeBaseDir not loggedStoreBaseDir? samza-core/src/main/scala/org/apache/samza/storage/TaskStorageManager.scala <https://reviews.apache.org/r/33453/#comment132331> qq: Is there any checksum in the offset file to make sure that we don't read in a corrupted value? samza-core/src/main/scala/org/apache/samza/storage/TaskStorageManager.scala <https://reviews.apache.org/r/33453/#comment132330> From the store initiation code, it seems that storeBaseDir won't have any logged store paths if logged store base dir is configured. Hence, here we may be deleting empty/non-existing storagePartitionDirs. - Yi Pan (Data Infrastructure) On April 22, 2015, 9:54 p.m., Navina Ramesh wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/33453/ > ----------------------------------------------------------- > > (Updated April 22, 2015, 9:54 p.m.) > > > Review request for samza, Yan Fang, Chris Riccomini, Naveen Somasundaram, and > Yi Pan (Data Infrastructure). > > > Repository: samza > > > Description > ------- > > Changed default to yarn cwd instead of io.tmpDir and refactored code > > > Diffs > ----- > > samza-core/src/main/scala/org/apache/samza/config/ShellCommandConfig.scala > 1a2dd4413f56e53dbeeb47b5637d7b0c50522f02 > samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala > 720fbdceafc4fe69b048d81a677e874d13e6d22f > samza-core/src/main/scala/org/apache/samza/storage/TaskStorageManager.scala > f68a7fee24614fce101e91c4f933d9b4e65dda0a > > samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala > 66c2a0dc2e38e21f951727a30f0987776ac52fe2 > > Diff: https://reviews.apache.org/r/33453/diff/ > > > Testing > ------- > > Tested locally using hello-samza. > Note: you have to set an environment variable LOGGED_STORE_BASE_DIR pointing > to the new location to persist the changelog attached stores. Otherwise, it > will default to YARN's cwd and will not re-use local state. > > > Thanks, > > Navina Ramesh > >