Re: KafkaIO Windowing Fn

2016-09-02 Thread Chawla,Sumit
Most of the KafkaIO tests on DirectRunner use withMaxNumRecords, which creates a BoundedReadFromUnboundedSource. Regards Sumit Chawla On Fri, Sep 2, 2016 at 11:07 AM, Raghu Angadi wrote: > On Tue, Aug 30, 2016 at 12:01 AM, Chawla,Sumit > wrote: > > > Sorry I tried with DirectRunner but ra
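
A minimal sketch of that bounded, test-style read, using the 0.x-era KafkaIO builder methods quoted elsewhere in this thread; the broker address, topic name, and record limit are placeholders:

    // withMaxNumRecords turns the unbounded Kafka source into a
    // BoundedReadFromUnboundedSource: the read stops after the given number
    // of records, so it also works on runners without unbounded-read support.
    pipeline.apply(KafkaIO.read()
        .withBootstrapServers("localhost:9092")     // placeholder broker
        .withTopics(ImmutableList.of("mytopic"))    // placeholder topic
        .withMaxNumRecords(500));                   // bound the read for the test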

Re: KafkaIO Windowing Fn

2016-09-02 Thread Raghu Angadi
On Tue, Aug 30, 2016 at 12:01 AM, Chawla,Sumit wrote: > Sorry I tried with DirectRunner but ran into some Kafka issues. Following > is the snippet I am working on, and will post more details once I get it > working (as of now I am unable to read messages from Kafka using > DirectRunner) > woul

Re: KafkaIO Windowing Fn

2016-09-02 Thread Aljoscha Krettek
Ah, now I remember that the Flink runner never supported processing-time timers. I created a Jira issue for this: https://issues.apache.org/jira/browse/BEAM-615 On Thu, 1 Sep 2016 at 19:20 Chawla,Sumit wrote: > Thanks Aljoscha\Thomas > > I will explore the option to upgrade. Meanwhile here

Re: KafkaIO Windowing Fn

2016-09-01 Thread Chawla,Sumit
Thanks Aljoscha\Thomas I will explore the option to upgrade. Meanwhile, here is what I observed with the above code in my local Flink cluster. 1. To start, there are 0 records in Kafka. 2. Deploy the pipeline. Two records are received in Kafka at time 10:00:00 AM. 3. The Pane with 100 records w

Re: KafkaIO Windowing Fn

2016-08-31 Thread Aljoscha Krettek
Ah I see, the Flink Runner had quite some updates in 0.2.0-incubating and even more for the upcoming 0.3.0-incubating. On Thu, 1 Sep 2016 at 04:09 Thomas Groh wrote: > In 0.2.0-incubating and beyond we've replaced the DirectPipelineRunner with > the DirectRunner (formerly InProcessPipelineRunner

Re: KafkaIO Windowing Fn

2016-08-31 Thread Thomas Groh
In 0.2.0-incubating and beyond we've replaced the DirectPipelineRunner with the DirectRunner (formerly InProcessPipelineRunner), which is capable of handling Unbounded Pipelines. Is it possible for you to upgrade? On Wed, Aug 31, 2016 at 5:17 PM, Chawla,Sumit wrote: > @Ajioscha, My assumption i
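
For readers on 0.2.0-incubating or later, a hedged sketch of selecting the new runner programmatically; the DirectRunner class name below is the one used by later Beam releases and may need adjusting for a specific SDK version:

    import org.apache.beam.runners.direct.DirectRunner;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.options.PipelineOptions;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;

    PipelineOptions options = PipelineOptionsFactory.create();
    // DirectRunner (formerly InProcessPipelineRunner) can evaluate unbounded
    // sources such as KafkaIO, unlike the old DirectPipelineRunner.
    options.setRunner(DirectRunner.class);
    Pipeline pipeline = Pipeline.create(options);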

Re: KafkaIO Windowing Fn

2016-08-31 Thread Chawla,Sumit
@Aljoscha, my assumption here is that at least one trigger should fire: either the 100 elements, or the 30 seconds since the first element (whichever happens first). @Thomas - here is the error I get (I am using 0.1.0-incubating): java.lang.IllegalStateException: no evaluator registered for Read(Unbounde

Re: KafkaIO Windowing Fn

2016-08-31 Thread Gaurav Gupta
Sumit, I tried running the code you shared. I noticed that if MaxNumRecords is set to a number N, then KafkaIO doesn't return till it has read N messages. So either try setting a low value of MaxNumRecords or don't set it at all. Another thing I observed was that while using anonymous DoFns I got f

Re: KafkaIO Windowing Fn

2016-08-31 Thread Aljoscha Krettek
Hi, could the reason for the second part of the trigger never firing be that there are never at least 100 elements per key? The trigger would only fire if it saw 100 elements, and with only 540 elements that seems unlikely if you have more than 6 keys (540 elements over six or more keys averages at most 90 per key, below the 100-element threshold). Cheers, Aljoscha On Wed, 31 Aug 2016 at 17:47

Re: KafkaIO Windowing Fn

2016-08-31 Thread Thomas Groh
KafkaIO is implemented using the UnboundedRead API, which is supported by the DirectRunner. You should be able to run without withMaxNumRecords; if you can't, I'd be very interested to see the stack trace that you get when you try to run the Pipeline. On Tue, Aug 30, 2016 at 11:24 PM, Chawla,S
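
In other words, the read from the earlier snippets minus the bound; a sketch assuming the default byte[] key/value coders of KafkaIO.read():

    // No withMaxNumRecords: the source stays unbounded and is evaluated as an
    // unbounded read, which the DirectRunner supports.
    PCollection<KafkaRecord<byte[], byte[]>> records =
        pipeline.apply(KafkaIO.read()
            .withBootstrapServers("localhost:9092")
            .withTopics(ImmutableList.of("mytopic")));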

Re: KafkaIO Windowing Fn

2016-08-30 Thread Chawla,Sumit
Yes. I added it only for DirectRunner as it cannot translate Read(UnboundedSourceOfKafka) Regards Sumit Chawla On Tue, Aug 30, 2016 at 11:18 PM, Aljoscha Krettek wrote: > Ah ok, this might be a stupid question but did you remove this line when > running it with Flink: > .withMaxNumRec

Re: KafkaIO Windowing Fn

2016-08-30 Thread Aljoscha Krettek
Ah ok, this might be a stupid question but did you remove this line when running it with Flink: .withMaxNumRecords(500) Cheers, Aljoscha On Wed, 31 Aug 2016 at 08:14 Chawla,Sumit wrote: > Hi Aljoscha > > The code is not different while running on Flink. It have removed business > speci

Re: KafkaIO Windowing Fn

2016-08-30 Thread Chawla,Sumit
Hi Aljoscha, the code is not different when running on Flink; I have only removed business-specific transformations. Regards Sumit Chawla On Tue, Aug 30, 2016 at 4:49 AM, Aljoscha Krettek wrote: > Hi, > could you maybe also post the complete code that you're using with the > FlinkRunner? I could

Re: KafkaIO Windowing Fn

2016-08-30 Thread Aljoscha Krettek
Hi, could you maybe also post the complete code that you're using with the FlinkRunner? I could have a look into it. Cheers, Aljoscha On Tue, 30 Aug 2016 at 09:01 Chawla,Sumit wrote: > Hi Thomas > > Sorry I tried with DirectRunner but ran into some Kafka issues. Following > is the snippet I am work

Re: KafkaIO Windowing Fn

2016-08-30 Thread Chawla,Sumit
Hi Thomas Sorry, I tried with DirectRunner but ran into some Kafka issues. Following is the snippet I am working on, and I will post more details once I get it working (as of now I am unable to read messages from Kafka using DirectRunner). PipelineOptions pipelineOptions = PipelineOptionsFactory.c

Re: KafkaIO Windowing Fn

2016-08-26 Thread Thomas Groh
If you use the DirectRunner, do you observe the same behavior? On Thu, Aug 25, 2016 at 4:32 PM, Chawla,Sumit wrote: > Hi Thomas > > I am using FlinkRunner. Yes the second part of trigger never fires for me, > > Regards > Sumit Chawla > > > On Thu, Aug 25, 2016 at 4:18 PM, Thomas Groh > wrote:

Re: KafkaIO Windowing Fn

2016-08-25 Thread Chawla,Sumit
Hi Thomas I am using the FlinkRunner. Yes, the second part of the trigger never fires for me. Regards Sumit Chawla On Thu, Aug 25, 2016 at 4:18 PM, Thomas Groh wrote: > Hey Sumit; > > What runner are you using? I can set up a test with the same trigger > reading from an unbounded input using the Dire

Re: KafkaIO Windowing Fn

2016-08-25 Thread Thomas Groh
Hey Sumit; What runner are you using? I can set up a test with the same trigger reading from an unbounded input using the DirectRunner and I get the expected output panes. Just to clarify, the second half of the trigger ('when the first element has been there for at least 30+ seconds') simply nev

Re: KafkaIO Windowing Fn

2016-08-25 Thread Raghu Angadi
On Thu, Aug 25, 2016 at 1:13 PM, Thomas Groh wrote: > > We should still have a JIRA to improve the KafkaIO watermark tracking in > the absence of new records. > Filed https://issues.apache.org/jira/browse/BEAM-591 I don't want to hijack this thread from Sumit's primary issue, but want to mention re

Re: KafkaIO Windowing Fn

2016-08-25 Thread Chawla,Sumit
Hi Thomas That did not work. I tried the following instead:

    .triggering(
        Repeatedly.forever(
            AfterFirst.of(
                AfterProcessingTime.pastFirstElementInPane()
                    .plusDelayOf(Duration.standardSeconds(30)),
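
Putting that together with the behavior described earlier in the thread (fire on whichever comes first: 100 elements in the pane, or 30 seconds of processing time after the first element), the full windowing configuration would look roughly like the sketch below. The one-minute fixed windows, the KV element type, and the discarding mode are assumptions, not the original code:

    .apply(Window.<KV<String, String>>into(FixedWindows.of(Duration.standardMinutes(1)))
        .triggering(Repeatedly.forever(AfterFirst.of(
            // fire ~30s of processing time after the first element in the pane...
            AfterProcessingTime.pastFirstElementInPane()
                .plusDelayOf(Duration.standardSeconds(30)),
            // ...or as soon as the pane contains at least 100 elements
            AfterPane.elementCountAtLeast(100))))
        .withAllowedLateness(Duration.ZERO)
        .discardingFiredPanes())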

Re: KafkaIO Windowing Fn

2016-08-25 Thread Thomas Groh
You can adjust the trigger in the windowing transform if your sink can handle being written to multiple times for the same window. For example, if the sink appends to the output when it receives new data in a window, you could add something like Window.into(...).withAllowedLateness(...).triggering
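
A hedged sketch of the kind of configuration being suggested here; the window size, firing interval, lateness bound, and discarding mode are illustrative choices for a sink that appends per-window output rather than overwriting it:

    .apply(Window.<KV<String, Long>>into(FixedWindows.of(Duration.standardMinutes(1)))
        .withAllowedLateness(Duration.standardMinutes(5))
        // Fire periodically instead of waiting for the watermark, so the sink
        // receives incremental results for a window as data arrives.
        .triggering(Repeatedly.forever(
            AfterProcessingTime.pastFirstElementInPane()
                .plusDelayOf(Duration.standardSeconds(30))))
        // Each pane carries only the elements seen since the previous firing,
        // which suits a sink that appends new data for the same window.
        .discardingFiredPanes())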

Re: KafkaIO Windowing Fn

2016-08-25 Thread Chawla,Sumit
Thanks Raghu. I don't have much control over changing the KafkaIO properties. I added the KafkaIO code to complete the example. Are there any changes that can be made to the windowing to achieve the same behavior? Regards Sumit Chawla On Wed, Aug 24, 2016 at 5:06 PM, Raghu Angadi wrote: > The defaul

Re: KafkaIO Windowing Fn

2016-08-24 Thread Raghu Angadi
The default implementation returns the processing timestamp of the last record (in effect; more accurately, it returns the same as getTimestamp(), which might be overridden by the user). As a workaround, yes, you can provide your own watermarkFn that essentially returns Now() or Now()-1sec. (usage in javadoc
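
A sketch of that workaround, assuming the withWatermarkFn hook described in the KafkaIO javadoc Raghu refers to; the one-second margin and the byte[] key/value types are arbitrary choices:

    KafkaIO.read()
        .withBootstrapServers("localhost:9092")
        .withTopics(ImmutableList.of("mytopic"))
        .withWatermarkFn(new SerializableFunction<KV<byte[], byte[]>, Instant>() {
          @Override
          public Instant apply(KV<byte[], byte[]> record) {
            // Advance the watermark with wall-clock time (minus a small margin)
            // so windows can close even when no new records arrive on the topic.
            return Instant.now().minus(Duration.standardSeconds(1));
          }
        });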

KafkaIO Windowing Fn

2016-08-24 Thread Chawla,Sumit
Hi All, I am trying to do some simple batch processing on KafkaIO records. My Beam pipeline looks like the following:

    pipeline.apply(KafkaIO.read()
            .withTopics(ImmutableList.of("mytopic"))
            .withBootstrapServers("localhost:9200"))
        .apply("ExtractMessage", ParDo.of(new ExtractKVMessage
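
For readers following along, a hedged reconstruction of the overall shape of such a pipeline. ExtractKVMessagesFn, LogCountsFn, the one-minute windows, and the per-key count are hypothetical stand-ins for the business logic elided above, not the original code:

    Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.create());

    pipeline
        .apply(KafkaIO.read()
            .withBootstrapServers("localhost:9200")      // address from the original mail
            .withTopics(ImmutableList.of("mytopic"))
            .withoutMetadata())                          // emit plain KV<byte[], byte[]> pairs
        // Stand-in for the elided ExtractKVMessage step: parse each record
        // into a KV<String, String> keyed for the later per-key aggregation.
        .apply("ExtractMessage", ParDo.of(new ExtractKVMessagesFn()))
        .apply(Window.<KV<String, String>>into(FixedWindows.of(Duration.standardMinutes(1))))
        // Stand-in aggregation: count messages per key in each window.
        .apply(Count.<String, String>perKey())
        // Stand-in sink: log each KV<String, Long> count as it is emitted.
        .apply("LogCounts", ParDo.of(new LogCountsFn()));

    pipeline.run();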