Yes, actually, when I use Samza I add this config to the profile to make my life easier. It is also true that Samza's config system is already very complicated :( -- I feel the same way... It is now actually very convenient for experienced users, but may scare new users away...
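
For anyone following along, the config-variable workaround could look roughly like the sketch below. It assumes the job-specific value has the form <system>.<stream> and that the system name itself contains no dot. LogicalStreams, StreamRef and resolve are made-up names for illustration, not Samza APIs, and a plain Map stands in for the job's config.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch only: LogicalStreams, StreamRef and resolve() are hypothetical
 * helpers for illustration, not part of Samza's API.
 */
public class LogicalStreams {

    /** A (system, stream) pair, e.g. ("kafka", "wikipedia-edits"). */
    public static final class StreamRef {
        public final String system;
        public final String stream;

        StreamRef(String system, String stream) {
            this.system = system;
            this.stream = stream;
        }
    }

    /**
     * Looks up a job-specific key such as "edits" and splits its value,
     * assumed to be "<system>.<stream>", at the first dot.
     */
    public static StreamRef resolve(Map<String, String> config, String logicalName) {
        String value = config.get(logicalName);
        if (value == null) {
            throw new IllegalArgumentException("No config entry for " + logicalName);
        }
        int dot = value.indexOf('.');
        if (dot <= 0 || dot == value.length() - 1) {
            throw new IllegalArgumentException("Expected <system>.<stream>, got " + value);
        }
        return new StreamRef(value.substring(0, dot), value.substring(dot + 1));
    }

    public static void main(String[] args) {
        // Stand-in for the job's config file; in a real job these values
        // would come from the config object handed to the task.
        Map<String, String> config = new HashMap<>();
        config.put("edits", "kafka.wikipedia-edits");

        StreamRef edits = resolve(config, "edits");
        System.out.println(edits.system + " / " + edits.stream);
    }
}
```

With something like this, the task code only ever mentions "edits", and renaming the topic means editing one config line instead of recompiling the job.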

Fang, Yan
[email protected]
+1 (206) 849-4108

On Mon, Sep 8, 2014 at 3:27 PM, Chris Riccomini <[email protected]> wrote:

> Hey Roger,
>
> Funnily enough, we actually used to have this feature in Samza 0.6.0,
> before it was open sourced. We called them "logical streams". The main
> reason that we removed them was really about usability. Samza's configs
> are already overly complicated (at least, I feel that way), and adding an
> extra level of indirection was leading to a lot of confusion from
> developers.
>
> You can still proxy this by using job-specific config, and a variable:
>
>   val edits = config.getString("edits")
>
> And then define:
>
>   edits=kafka.edit-stream-name
>
> in config. That seems a bit clunky, though.
>
> Taking a step back, I think Samza's config system is due for a revisit.
> David Chen was actually just discussing this with me today, and I expect a
> JIRA on it to pop up sometime soon. If we simplified things, it might be
> possible to add this feature back in.
>
> Cheers,
> Chris
>
> On 9/8/14 2:51 PM, "Roger Hoover" <[email protected]> wrote:
>
> >"It might be a huge deal." I mean "it might *not* be a huge deal."
> >
> >On Mon, Sep 8, 2014 at 2:50 PM, Roger Hoover <[email protected]>
> >wrote:
> >
> >> Hi,
> >>
> >> Wondering if people with more experience with Samza think it would be a
> >> good idea to keep topic names out of the code. You might want to be able
> >> to change topics by editing the config instead of having to recompile the
> >> job.
> >>
> >> Maybe introduce an indirection so that output streams have names?
> >>
> >> Config:
> >> #Define an input named "raw" which maps to Kafka topic "wikipedia-raw"
> >> task.inputs.kafka.raw=wikipedia-raw
> >> #Use raw as an input
> >> task.inputs=kafka.raw
> >> #Define an output named "edits" which maps to Kafka topic "wikipedia-edits"
> >> task.outputs.kafka.edits=wikipedia-edits
> >>
> >> Task code:
> >>
> >> //Input stream would be called "raw" here instead of "wikipedia-raw"
> >> String stream = envelope.getSystemStreamPartition().getSystemStream().getStream();
> >> if (stream.equals("raw")) {
> >>   processRawMsg(envelope, collector, coordinator);
> >> }
> >>
> >> //Send messages to locally named topic "edits"
> >> collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka", "edits"), parsedJsonObject));
> >>
> >> Thoughts? It might be a huge deal. I just found myself copying and pasting
> >> names a lot across config and code files while writing some test jobs.
> >>
> >> Cheers,
> >>
> >> Roger
