----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33146/ -----------------------------------------------------------
(Updated April 28, 2015, 10:20 a.m.) Review request for samza. Changes ------- * Fixed failure detected by check-all.sh (putAll metrics) * Fixed a test file path in README.md * Added perf tests to compare get vs. getAll performance (please take a close look at the perf test); here's a subset of the result: Msg Count Bytes/Msg get ms getAll ms 68519 135 3380 549 58746 564 3060 1076 30081 607 1483 436 25366 542 1230 255 93947 566 6369 1758 42270 610 3029 768 81058 763 5785 1662 19520 387 1250 179 83502 84 5454 451 32953 471 2623 434 68919 825 5106 1256 9263 481 627 62 13238 118 858 68 89747 341 6387 1337 85254 658 6645 1520 63351 301 4181 759 55124 416 3938 792 12201 335 825 80 76567 887 5762 2223 77492 439 5195 1185 97340 846 7698 5972 48609 445 3328 616 4261 654 310 27 22285 140 1420 125 19125 292 1298 157 5873 654 388 50 80877 999 5859 7983* slower than get here! 95384 776 6928 1616 13923 38 882 40 2362 752 146 30 13792 474 925 100 14080 106 899 71 533 207 36 2 87771 165 5904 884 13871 907 1102 261 31356 81 2148 169 59574 285 4142 655 92288 980 7044 2044 54501 33 3469 246 36294 662 2588 532 6066 254 430 49 32511 928 2608 652 49528 219 3335 405 87002 108 5776 489 72665 774 5193 1219 27576 314 1851 200 65125 271 4329 832 55917 154 3733 422 51372 781 3633 980 83189 602 6087 1307 85522 748 6909 1751 60153 793 4917 1009 16603 818 1333 248 9680 165 741 36 49536 165 3704 372 48008 258 3595 486 27473 973 2288 494 87476 507 7033 1463 22661 155 1780 177 30558 1001 2486 645 22971 641 2031 302 31181 650 2523 488 29879 202 2212 284 66481 716 5658 1975 26593 171 2058 184 15893 989 1595 574 33785 180 2490 265 49941 33 3484 206 23910 372 1712 209 94867 676 7828 6328 86260 136 6520 836 60997 181 4710 667 874 618 69 4 77245 51 6028 417 20520 871 1727 399 50780 789 3889 941 18757 57 1378 68 38911 651 3208 690 59447 585 5033 3316 24796 581 2017 360 51307 418 4106 727 6083 256 455 24 64912 205 4998 795 63663 282 5091 833 66707 481 5375 1600 94523 642 8161 4780 34111 797 2741 599 52607 275 4450 710 ... Bugs: SAMZA-647 https://issues.apache.org/jira/browse/SAMZA-647 Repository: samza Description ------- * Adding new KeyValueStore methods: Map<K, V> getAll(List<K>), and void deleteAll(List<K>). **Please note: "Backwards incompatible API changes, config changes, or library upgrades should only happen between major revision changes, or when the major revision is 0." -- since the latter is true, I found that adding the new methods to the already-existing interface to be a better solution, even though it breaks backward compatability (please see the first iteration for context).** Alternatively, to maintain backward compat., I would have added a new contract and a new class for each class that implements KeyValueStore (to implement the new contract and pass calls throught to the underlying store, whose type needs to change in the constructor to the new contract). * Improved the javadoc of KeyValueStore to have the same voice, to add API notes, and to follow the javadoc standards. * Removing stress tests from TestKeyValueStores.scala because: * * Unit tests are not meant to be stress-testing the system, * * The test load looked arbitrary, * * It wasn't measuring anything (just testing it doesn't crash), * * Stress testing requires extended periods of testing, and * * My machine, a MacBook Air, is not beefy enough to survive stress tests; they should be run on a typical production machine and not a dev one -- a flaky test is not a good test. * * There's a test class dedicated for KV stores' performance testing: TestKeyValuePerformance * Last, but definitely not least, the main motiviation behind this change: Allowing RocksDbKeyValueStore to implement getAll(List<K>) to call multiGet(List<K>); Preliminary tests showed that multiGet is at least 1.25x faster per key than get (see https://reviews.facebook.net/rROCKSDB4985a9f73b9fb8a0323fbbb06222ae1f758a6b1d). Diffs (updated) ----- README.md 7f92020726626e606dbd97b86dcd91f4157c9ea7 samza-kv-inmemory/src/main/scala/org/apache/samza/storage/kv/inmemory/InMemoryKeyValueStore.scala 217333c84c696c0cc1bc3eeabf1c4066a6e89795 samza-kv-rocksdb/src/main/scala/org/apache/samza/storage/kv/RocksDbKeyValueStore.scala 66c2a0dc2e38e21f951727a30f0987776ac52fe2 samza-kv/src/main/java/org/apache/samza/storage/kv/KeyValueStore.java b708341abed15aaad34df5934f5f310bc1feb87a samza-kv/src/main/scala/org/apache/samza/storage/kv/CachedStore.scala 61bb3f6acb080b653f8b11176538549738255acc samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStorageEngine.scala 3a23daf053f0b8dec3a7ec83a51c9c5527078a3b samza-kv/src/main/scala/org/apache/samza/storage/kv/KeyValueStoreMetrics.scala 79092b91c9498e55f1c4e28661b7280c6c19cef7 samza-kv/src/main/scala/org/apache/samza/storage/kv/LoggedStore.scala 26f4cd9cfef305546c85ef9330f3e8b8be5336f7 samza-kv/src/main/scala/org/apache/samza/storage/kv/NullSafeKeyValueStore.scala 4f48cf490d6c1012591a602c0d29dcc71473090f samza-kv/src/main/scala/org/apache/samza/storage/kv/SerializedKeyValueStore.scala 531e8bef2069a77fa9ceab36fa738bbaa162fe8c samza-test/src/main/config/perf/kv-perf.properties 33fcd8d1aea14ecea47bbadb24936f737feedb39 samza-test/src/main/scala/org/apache/samza/test/performance/TestKeyValuePerformance.scala 0858b981add6581230960356f65fe6f6e6ab108f samza-test/src/test/scala/org/apache/samza/storage/kv/TestKeyValueStores.scala 50dfc10bb053d74dba70fdbce0ef87609ba447ea Diff: https://reviews.apache.org/r/33146/diff/ Testing ------- Unit-tested. Thanks, Mohamed Mahmoud (El-Geish)