Most of the KafkaIO tests on the DirectRunner use withMaxNumRecords, which
creates a BoundedReadFromUnboundedSource.
Regards
Sumit Chawla
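For reference, a bounded test read of that shape (a sketch only; the broker address, topic name, and record count here are illustrative, not from this thread) looks like:

```java
// withMaxNumRecords bounds the unbounded Kafka source, wrapping it in a
// BoundedReadFromUnboundedSource under the hood so bounded-only runners
// can translate it.
p.apply(KafkaIO.read()
    .withBootstrapServers("localhost:9092")
    .withTopics(ImmutableList.of("test-topic"))
    .withMaxNumRecords(500));
```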
On Fri, Sep 2, 2016 at 11:07 AM, Raghu Angadi wrote:
On Tue, Aug 30, 2016 at 12:01 AM, Chawla,Sumit wrote:
Ah, now I remember that the Flink runner never supported processing-time
timers. I created a Jira issue for this:
https://issues.apache.org/jira/browse/BEAM-615
On Thu, 1 Sep 2016 at 19:20 Chawla,Sumit wrote:
Thanks Aljoscha/Thomas
I will explore the option to upgrade. Meanwhile, here is what I observed
with the above code on my local Flink Cluster.
1. To start there are 0 records in Kafka
2. Deploy the pipeline. Two records are received in Kafka at time
10:00:00 AM
3. The Pane with 100 records w
Ah I see. The Flink runner had quite a few updates in 0.2.0-incubating,
and even more are coming in the upcoming 0.3.0-incubating.
On Thu, 1 Sep 2016 at 04:09 Thomas Groh wrote:
In 0.2.0-incubating and beyond we've replaced the DirectPipelineRunner with
the DirectRunner (formerly InProcessPipelineRunner), which is capable of
handling Unbounded Pipelines. Is it possible for you to upgrade?
On Wed, Aug 31, 2016 at 5:17 PM, Chawla,Sumit wrote:
@Aljoscha, my assumption here is that at least one trigger should fire:
either the 100 elements, or the 30 seconds since the first element
(whichever happens first).
@Thomas - here is the error I get (I am using 0.1.0-incubating):
java.lang.IllegalStateException: no evaluator registered for
Read(Unbounde
Sumit,
I tried running the code you shared. I noticed that if MaxNumRecords is set
to a number N, then KafkaIO doesn't return until it has read N messages. So
either try setting a low value of MaxNumRecords or don't set it at all.
Another thing I observed was that while using anonymous DoFns I got
f
Hi,
could the reason for the second part of the trigger never firing be that
there are never at least 100 elements per key? The trigger would only fire
once it saw 100 elements, and with only 540 elements that seems unlikely if
you have more than 6 keys.
Cheers,
Aljoscha
On Wed, 31 Aug 2016 at 17:47
KafkaIO is implemented using the UnboundedRead API, which is supported by
the DirectRunner. You should be able to run without the withMaxNumRecords;
if you can't, I'd be very interested to see the stack trace that you get
when you try to run the Pipeline.
On Tue, Aug 30, 2016 at 11:24 PM, Chawla,S
Yes. I added it only for DirectRunner as it cannot translate
Read(UnboundedSourceOfKafka)
Regards
Sumit Chawla
On Tue, Aug 30, 2016 at 11:18 PM, Aljoscha Krettek wrote:
Ah ok, this might be a stupid question but did you remove this line when
running it with Flink:
.withMaxNumRecords(500)
Cheers,
Aljoscha
On Wed, 31 Aug 2016 at 08:14 Chawla,Sumit wrote:
Hi Aljoscha
The code is not different while running on Flink; I have only removed the
business-specific transformations.
Regards
Sumit Chawla
On Tue, Aug 30, 2016 at 4:49 AM, Aljoscha Krettek wrote:
Hi,
could you maybe also post the complete code that you're using with the
FlinkRunner? I could have a look into it.
Cheers,
Aljoscha
On Tue, 30 Aug 2016 at 09:01 Chawla,Sumit wrote:
Hi Thomas
Sorry, I tried with DirectRunner but ran into some Kafka issues. Following
is the snippet I am working on, and I will post more details once I get it
working (as of now I am unable to read messages from Kafka using
DirectRunner).
PipelineOptions pipelineOptions = PipelineOptionsFactory.c
If you use the DirectRunner, do you observe the same behavior?
On Thu, Aug 25, 2016 at 4:32 PM, Chawla,Sumit wrote:
Hi Thomas
I am using FlinkRunner. Yes, the second part of the trigger never fires for me.
Regards
Sumit Chawla
On Thu, Aug 25, 2016 at 4:18 PM, Thomas Groh wrote:
Hey Sumit;
What runner are you using? I can set up a test with the same trigger
reading from an unbounded input using the DirectRunner and I get the
expected output panes.
Just to clarify, the second half of the trigger ('when the first element
has been there for at least 30+ seconds') simply nev
On Thu, Aug 25, 2016 at 1:13 PM, Thomas Groh
wrote:
>
> We should still have a JIRA to improve the KafkaIO watermark tracking in
> the absence of new records .
>
filed https://issues.apache.org/jira/browse/BEAM-591
I don't want to hijack this thread from Sumit's primary issue, but want to
mention re
Hi Thomas
That did not work.
I tried the following instead:
.triggering(
    Repeatedly.forever(
        AfterFirst.of(
            AfterProcessingTime.pastFirstElementInPane()
                .plusDelayOf(Duration.standardSeconds(30)),
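A complete version of that composite trigger might look like the sketch below. The AfterPane.elementCountAtLeast(100) branch is filled in from the "100 elements or 30 seconds" description earlier in the thread; the window size, lateness, and accumulation mode are assumptions for illustration, not Sumit's actual settings:

```java
// Sketch (Beam 0.x-era Java API): fire when either 100 elements arrive
// or 30s of processing time pass after the first element in the pane,
// whichever happens first, repeatedly for each window.
input.apply(
    Window.<KV<String, String>>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(
            Repeatedly.forever(
                AfterFirst.of(
                    AfterProcessingTime.pastFirstElementInPane()
                        .plusDelayOf(Duration.standardSeconds(30)),
                    AfterPane.elementCountAtLeast(100))))
        .withAllowedLateness(Duration.ZERO)
        .discardingFiredPanes());
```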
You can adjust the trigger in the windowing transform if your sink can
handle being written to multiple times for the same window. For example, if
the sink appends to the output when it receives new data in a window, you
could add something like
Window.into(...).withAllowedLateness(...).triggering
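Concretely, such a re-firing windowing transform could look like the following sketch. The window size, lateness, and delay are illustrative assumptions; accumulatingFiredPanes() is chosen here because an appending sink that tolerates repeated writes per window typically wants each pane to carry the full window contents so far:

```java
// Sketch: a window that fires repeatedly as new data arrives, for a sink
// that can handle being written to multiple times for the same window.
Window.<String>into(FixedWindows.of(Duration.standardMinutes(1)))
    .withAllowedLateness(Duration.standardMinutes(10))
    .triggering(
        Repeatedly.forever(
            AfterProcessingTime.pastFirstElementInPane()
                .plusDelayOf(Duration.standardSeconds(30))))
    .accumulatingFiredPanes();
```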
Thanks Raghu.
I don't have much control over changing KafkaIO properties. I added
KafkaIO code for completing the example. Are there any changes that can be
done to Windowing to achieve the same behavior?
Regards
Sumit Chawla
On Wed, Aug 24, 2016 at 5:06 PM, Raghu Angadi wrote:
The default implementation in effect returns the processing timestamp of
the last record (more accurately, it returns the same value as
getTimestamp(), which might be overridden by the user).
As a workaround, yes, you can provide your own watermarkFn that
essentially returns Now() or Now()-1sec. (usage in javadoc
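That workaround could be sketched as below. The topic and broker address are illustrative, and the exact method signature may differ between KafkaIO versions of that era, so treat this as a shape rather than exact API:

```java
// Sketch: a custom watermarkFn on KafkaIO that keeps the watermark
// advancing to roughly the current processing time, minus a small slack,
// even when no new records arrive.
KafkaIO.read()
    .withBootstrapServers("localhost:9092")
    .withTopics(ImmutableList.of("mytopic"))
    .withWatermarkFn(record ->
        Instant.now().minus(Duration.standardSeconds(1)));
```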
Hi All
I am trying to do some simple batch processing on KafkaIO records. My Beam
pipeline looks like the following:
pipeline.apply(KafkaIO.read()
    .withTopics(ImmutableList.of("mytopic"))
    .withBootstrapServers("localhost:9200")
    .apply("ExtractMessage", ParDo.of(new ExtractKVMessage
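A minimal compilable shape for such a pipeline might be as follows. This is a sketch only: the pass-through DoFn body stands in for the elided ExtractKVMessage, and method names follow the Beam 0.x-era API, which changed between incubating releases:

```java
// Sketch of the pipeline shape discussed in this thread.
Pipeline pipeline = Pipeline.create(pipelineOptions);
pipeline
    .apply(KafkaIO.read()
        .withBootstrapServers("localhost:9200")
        .withTopics(ImmutableList.of("mytopic"))
        .withoutMetadata())  // drop KafkaRecord metadata, keep the raw KV
    .apply("ExtractMessage", ParDo.of(new DoFn<KV<byte[], byte[]>, String>() {
        // Hypothetical stand-in for ExtractKVMessage: decode the value bytes.
        @ProcessElement
        public void processElement(ProcessContext c) {
            c.output(new String(c.element().getValue()));
        }
    }));
pipeline.run();
```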