-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17603/#review33936
-----------------------------------------------------------

docs/learn/documentation/0.7.0/comparisons/storm.md
<https://reviews.apache.org/r/17603/#comment63750>

I think a spout is actually similar to a consumer (SystemConsumer) in Samza's parlance. In Storm, a spout is the thing that feeds messages from a stream into Storm's topologies. This is what a SystemConsumer does in Samza.


docs/learn/documentation/0.7.0/comparisons/storm.md
<https://reviews.apache.org/r/17603/#comment63751>

Even Storm's "exactly once" messaging is somewhat misleading.

First, Storm only guarantees exactly-once messaging within its own framework. That is, if a Kafka producer sends a message, then times out (but the message makes it to the broker before the timeout), and re-sends, Storm's spout will process both messages (duplicates). This isn't really Storm's fault, but the point is that you get duplicate messages processed by your bolts.

Second, what happens in the "exactly once" case when a bolt is mutating state while processing a batch and a failure occurs? As far as I know, Storm's state management requires idempotent operations, and only occurs outside of the topology, right?

It might be worth discussing this, as these are both things that Samza and Kafka are attempting to address.


docs/learn/documentation/0.7.0/comparisons/storm.md
<https://reviews.apache.org/r/17603/#comment63754>

This is somewhat confusing. Samza does not hold a single job per process. You can have N processes (SamzaContainers) for a single job. For YARN jobs, this is configured with yarn.container.count. Might be worth calling out that a single Storm process with 100 threads is equivalent to a Samza job with 100 containers.


docs/learn/documentation/0.7.0/comparisons/storm.md
<https://reviews.apache.org/r/17603/#comment63756>

You might want to call this out in the exactly-once discussion above. If you have two topologies communicating with each other, they need to send messages through an underlying system (Kafka, HDFS, Kestrel, etc.). This will break exactly-once messaging.


docs/learn/documentation/0.7.0/comparisons/storm.md
<https://reviews.apache.org/r/17603/#comment63758>

Can't this be done in Samza by running a web service in a container, using streams to pass messages, and then having the web service container block until it receives a response message?


docs/learn/documentation/0.7.0/introduction/background.md
<https://reviews.apache.org/r/17603/#comment63761>

Can you make these changes to Samza's index (landing) page as well? These two descriptions are identical, and should ideally be kept in sync.


- Chris Riccomini


On Feb. 6, 2014, 10:58 p.m., Martin Kleppmann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17603/
> -----------------------------------------------------------
> 
> (Updated Feb. 6, 2014, 10:58 p.m.)
> 
> 
> Review request for samza.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> Copy-edited the 'introduction' and 'comparisons' sections of the
> documentation, to make it more fluid to read.
> 
> Changed all uses of the word 'member' (which is quite LinkedIn-specific
> terminology) to refer to 'user' instead.
> 
> Rewrote the explanation of state management (in comparisons/introduction) as
> I found it confusing.
> 
> Rewrote the page comparing Samza with Storm, because it was outdated and no
> longer represented Storm accurately.
> 
> 
> Diffs
> -----
> 
> docs/img/0.7.0/learn/documentation/introduction/dag.png
>   bda85b2244df5f65f5472d557900fa2a65ea55c9
> docs/img/0.7.0/learn/documentation/introduction/group-by-example.png
>   1acd355c4565ee484540897c9c1712ae0c03d185
> docs/learn/documentation/0.7.0/api/overview.md
>   b2324a411e8929c03971fd64a94699e8f6ded809
> docs/learn/documentation/0.7.0/comparisons/introduction.md
>   b70697ba51604b6d6b1c49e4e8ff0376d5d92ec1
> docs/learn/documentation/0.7.0/comparisons/mupd8.md
>   bb0d5a11691ae80725e51b799ab56d65edcb36db
> docs/learn/documentation/0.7.0/comparisons/storm.md
>   b87c2077db2527041d8ed0397e2720772862dc60
> docs/learn/documentation/0.7.0/container/task-runner.md
>   27dab79f76a34385db5e6bebec42dd0964cbb878
> docs/learn/documentation/0.7.0/container/windowing.md
>   6058707e7d51986e8e36770303835673956a50b6
> docs/learn/documentation/0.7.0/introduction/architecture.md
>   ff8357dd0397156aebdc9fa30964b18c7a71c376
> docs/learn/documentation/0.7.0/introduction/background.md
>   52d8e41cccbeb5851578c95dd0edca24f2b8471f
> docs/learn/documentation/0.7.0/introduction/concepts.md
>   2736bf0985c78d0314ed2011dc768cbbc5453f49
> 
> Diff: https://reviews.apache.org/r/17603/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Martin Kleppmann
> 
>
