Hi,
Wondering if people with more experience with Samza think it would be a
good idea to keep topic names out of the code. You might want to be able
to change topics by editing the config instead of having to recompile the
job.
Maybe introduce an indirection so that input and output streams have local
names, which the config then maps to the actual topics?
Config:
#Define an input named "raw" which maps to Kafka topic "wikipedia-raw"
task.inputs.kafka.raw=wikipedia-raw
#Use raw as an input
task.inputs=kafka.raw
#Define an output named "edits" which maps to Kafka topic "wikipedia-edits"
task.outputs.kafka.edits=wikipedia-edits
Task code:
//Input stream would be called "raw" here instead of "wikipedia-raw"
String stream =
envelope.getSystemStreamPartition().getSystemStream().getStream();
if (stream.equals("raw")) {
processRawMsg(envelope, collector, coordinator);
}
//Send messages to locally named topic "edits"
collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka",
"edits"), parsedJsonObject));
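To make the idea a bit more concrete, here is a rough sketch of how the lookup could work. It is only an illustration: StreamAliases is a hypothetical helper, not something Samza provides, and the "task.inputs." / "task.outputs." key prefixes are just the ones proposed in the config above. It reads from a plain Map so it does not depend on any Samza class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Samza: maps local stream names to physical topics.
final class StreamAliases {
    // Keyed by "<system>.<localName>", e.g. "kafka.edits" -> "wikipedia-edits".
    private final Map<String, String> localToPhysical = new HashMap<>();
    // Keyed by "<system>.<physicalName>", e.g. "kafka.wikipedia-raw" -> "raw".
    private final Map<String, String> physicalToLocal = new HashMap<>();

    // prefix is an assumed config namespace such as "task.inputs." or "task.outputs."
    StreamAliases(Map<String, String> config, String prefix) {
        for (Map.Entry<String, String> entry : config.entrySet()) {
            String key = entry.getKey();
            if (!key.startsWith(prefix)) {
                continue; // e.g. the plain "task.inputs" key is skipped
            }
            String alias = key.substring(prefix.length()); // "<system>.<localName>"
            int dot = alias.indexOf('.');
            if (dot <= 0 || dot == alias.length() - 1) {
                continue; // not of the form <system>.<localName>
            }
            String system = alias.substring(0, dot);
            String localName = alias.substring(dot + 1);
            localToPhysical.put(alias, entry.getValue());
            physicalToLocal.put(system + "." + entry.getValue(), localName);
        }
    }

    // Local name -> physical topic; fails fast if the config has no mapping.
    String physical(String system, String localName) {
        String topic = localToPhysical.get(system + "." + localName);
        if (topic == null) {
            throw new IllegalArgumentException(
                "No mapping for " + system + "." + localName);
        }
        return topic;
    }

    // Physical topic -> local name; falls back to the physical name if unmapped.
    String local(String system, String physicalName) {
        String name = physicalToLocal.get(system + "." + physicalName);
        return name != null ? name : physicalName;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("task.inputs.kafka.raw", "wikipedia-raw");
        config.put("task.outputs.kafka.edits", "wikipedia-edits");
        StreamAliases outputs = new StreamAliases(config, "task.outputs.");
        System.out.println(outputs.physical("kafka", "edits"));
    }
}
```

With something like this, the task would only ever mention "raw" and "edits", for example new SystemStream("kafka", outputs.physical("kafka", "edits")), and renaming a topic would be a config-only change. Ideally the framework would do this translation itself so the task code stays exactly as in the snippet above.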
Thoughts? It might not be a huge deal. I just found myself copying and
pasting names a lot across config and code files while writing some test jobs.
Cheers,
Roger