Hi Casey,

Here's a link to a response from Chris on that thread, which I think gives the high-level picture pretty well:

http://mail-archives.apache.org/mod_mbox/incubator-samza-dev/201310.mbox/%3C1B43C7411DB20E47AB0FB62E7262B80179B2159F%40ESV4-MBX02.linkedin.biz%3E

Basically, it's a choice between building a system client within Samza that pulls the external data, or having an external process that pushes the data into a Kafka topic that is then the input to a Samza task. Which is most appropriate will depend on the nature of the data source, plus other considerations such as whether the raw data will have other consumers.

I'm not quite sure I follow what you mean regarding your custom system, though. If you look at the source for the Wikipedia feed task, it reads from the external source (Wikipedia) and then writes the output message directly to Kafka (the call to collector.send). Isn't this what you want?

Regards,
Garry

-----Original Message-----
From: Anh Thu Vu [mailto:[email protected]]
Sent: 27 February 2014 11:32
To: [email protected]
Subject: Read from a file and write to a Kafka stream with Samza

Hi everyone,

As the subject says, I want to read from an external source (the simplest case is reading from a file) and write to a Kafka stream. Another StreamTask will then start reading from that Kafka stream.

I've succeeded in running HelloSamza and have written a (VERY) similar app that has a custom Consumer reading from a file and then writes to a custom system (i.e. testsystem.mystream). I then have a StreamTask that reads from this custom stream and writes to a Kafka stream.

However, I want to bypass the custom stream and write the messages from the external source directly to the Kafka stream. I guess I will have to implement the SystemFactory so that it returns a Kafka producer from the getProducer() method, but I'm not sure how to do that yet.

Although I very much welcome and appreciate any guidance, advice, or suggestions, the main purpose of this mail is to ask for the link to the thread "Writing a simple KafkaProducer in Samza" that was mentioned in https://mail-archives.apache.org/mod_mbox/incubator-samza-dev/201311.mbox/%3CEA1B8C30F3B4C34EBA71F2AAF48F5990D612E028%40Mail3.impetus.co.in%3E

Thank you very much,
Casey
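
For the second option Garry describes (an external process that pushes the data into Kafka), the file-reading half might be sketched roughly as below. This is only an illustration under stated assumptions, not anything taken from Samza or the hello-samza source: the class name FileLinePusher and the method pushLines are hypothetical, and the Consumer<String> sink is a stand-in for whatever actually publishes a message to the Kafka topic. The Kafka producer call itself is deliberately left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Consumer;

/** Hypothetical helper: reads a text file line by line and hands each non-empty line to a sink. */
class FileLinePusher {

    /**
     * Pushes every non-empty line of the file to the sink and returns the
     * number of lines pushed. The sink is an assumed stand-in for the code
     * that would publish each line as a message to a Kafka topic.
     */
    static long pushLines(Path file, Consumer<String> sink) throws IOException {
        long pushed = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines rather than publishing empty messages
                }
                sink.accept(line);
                pushed++;
            }
        }
        return pushed;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in sink that just prints; a real process would publish to Kafka here.
        long count = pushLines(Paths.get(args[0]), line -> System.out.println("send: " + line));
        System.out.println(count + " lines pushed");
    }
}
```

Keeping the sink as a plain callback means the reading logic can be tried out without a running broker; the Samza task would then simply list the resulting Kafka topic as its input.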
