Re: Latest 200 messages per topic

2016-07-20 Thread Cody Koeninger
If they're files in a file system, and you don't actually need multiple kinds of consumers, have you considered streamingContext.fileStream instead of Kafka?

On Wed, Jul 20, 2016 at 5:40 AM, Rabin Banerjee wrote:
> Hi Cody,
>
> Thanks for your reply.
>
> Let

Re: Latest 200 messages per topic

2016-07-20 Thread Rabin Banerjee
Hi Cody, Thanks for your reply. Let me elaborate a bit. We have a directory where small XML files (90 KB) arrive continuously (pushed from another node). Each file has an ID and timestamp in its name and also inside the record. Data arriving in the directory has to be pushed to Kafka to finally get into

Re: Latest 200 messages per topic

2016-07-19 Thread Cody Koeninger
Unless you're using only 1 partition per topic, there's no reasonable way of doing this. Offsets for one topicpartition do not necessarily have anything to do with offsets for another topicpartition. You could do the last (200 / number of partitions) messages per topicpartition, but you have no
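
To make the "last (200 / number of partitions) messages per topicpartition" idea concrete, here is a minimal sketch in plain Python. It assumes you have already looked up the earliest and latest offsets for each partition by some other means; the function name `last_n_ranges` and the sample offsets are hypothetical, and no Kafka or Spark API is called.

```python
def last_n_ranges(earliest, latest, n):
    """Split a budget of n messages evenly across the partitions of one topic.

    earliest / latest map partition id -> offset. `latest` is treated as an
    exclusive upper bound (the offset the next message would receive).
    Returns partition id -> (from_offset, until_offset).
    """
    if not latest:
        return {}
    per_partition = n // len(latest)
    ranges = {}
    for partition, until in latest.items():
        # Never start before the earliest offset still retained.
        start = max(earliest.get(partition, 0), until - per_partition)
        ranges[partition] = (start, until)
    return ranges


# Hypothetical offsets for a topic with four partitions.
earliest = {0: 0, 1: 0, 2: 50, 3: 0}
latest = {0: 1000, 1: 30, 2: 400, 3: 0}

ranges = last_n_ranges(earliest, latest, 200)
total = sum(until - start for start, until in ranges.values())
print(ranges)
print(total)
```

With these sample offsets the budget is 50 messages per partition, but partitions 1 and 3 hold fewer than that, so only 130 messages are selected in total. Because offsets are independent per partition, the selected messages are also not guaranteed to be the 200 most recent ones across the whole topic.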

Re: Latest 200 messages per topic

2016-07-16 Thread Rabin Banerjee
Just to add, I want to read the MAX_OFFSET of a topic, then read from MAX_OFFSET-200, every time. Also, if I want to fetch a specific offset range for batch processing, is there any option for doing that? On Sat, Jul 16, 2016 at 9:08 PM, Rabin Banerjee <
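
For the single-partition case, the MAX_OFFSET-200 arithmetic can be sketched as below. This is plain Python with a hypothetical helper `tail_range` and a hypothetical topic name; it assumes the earliest and latest offsets are already known. The resulting (topic, partition, from, until) tuple has the same shape as the `OffsetRange` values that Spark's Kafka direct integration accepts in `KafkaUtils.createRDD` for batch reads, but that call is not made here.

```python
def tail_range(topic, partition, earliest, latest, n=200):
    """Return (topic, partition, from_offset, until_offset) for the last n messages.

    `latest` is treated as an exclusive upper bound. The start is clamped to
    `earliest` so the range never points at offsets that no longer exist.
    """
    start = max(earliest, latest - n)
    return (topic, partition, start, latest)


# Hypothetical topic name and offsets.
full = tail_range("device-42", 0, earliest=0, latest=1500)
short = tail_range("device-42", 0, earliest=0, latest=120)
print(full)
print(short)
```

When the partition holds fewer than 200 messages, the range simply covers everything that is available.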

Latest 200 messages per topic

2016-07-16 Thread Rabin Banerjee
Hi all, I have 1000 Kafka topics, each storing messages for different devices. I want to use the direct approach for connecting to Kafka from Spark, and I am only interested in the latest 200 messages in Kafka. How do I do that? Thanks.