Kafka LogAppendTimePolicy not throwing Exception when log.message.timestamp.type = CreateTime for the Kafka topic

2019-01-03 Thread rahul patwari
Hi, We are using KafkaIO.read() with LogAppendTimePolicy. When the topic is idle at the beginning of the pipeline, IllegalStateException is NOT thrown even when log.message.timestamp.type = CreateTime. This happens due to the statement: else if (currentWatermark.equals(BoundedWindow.TIMESTAMP_MI

Re: GroupByKey and number of workers

2019-01-03 Thread Mohamed Haseeb
This explains it. Thanks Reza! On Thu, Jan 3, 2019 at 1:19 AM Reza Ardeshir Rokni wrote: > Hi Mohamed, > > I believe this is related to fusion which is a feature of some of the > runners, you will be able to find more information on fusion on: > > > https://cloud.google.com/dataflow/docs/guides/

Re: Suggestion or Alternative simples to read file from FTP

2019-01-03 Thread Rui Wang
Calling an external service is described in [1] as a pattern, with a small code sample. However, why not write a script to prepare the data first and then write a pipeline to process it? 1. https://cloud.google.com/blog/products/gcp/guide-to-common-cloud-dataflow-use-ca

Re: Beam Summits!

2019-01-03 Thread Austin Bennett
Hi Matthias, etc, Trying to get thoughts on formalizing a process for getting proposals together. I look forward to the potential day when there are many people who want (rather than are just willing) to host a summit in a given region in a given year. Perhaps too forward looking. Also, you mentio

Child jobs not kicking off using SparkRunner in cluster mode

2019-01-03 Thread Shrijit Pillai
Hello, I'm trying to run the WordCount example using SparkRunner. In client mode, the child jobs are kicked off and the output is also produced. However, in cluster mode, the child jobs are not starting and no output is produced. I'm using Beam 2.9.0 and Spark 2.3.0. Here are the comman

[Go SDK] User Defined Coders

2019-01-03 Thread Robert Burke
One area where the Go SDK is currently lacking is the ability for users to specify their own coders for types. I've written a proposal document, and while I'm confident about the core, there are certainly some edge

Re: cyclic (non-DAG) computations

2019-01-03 Thread Kenneth Knowles
Hi Jan, This is a valuable but tricky problem. As you notice, it is an issue with watermarks. If you have a cycle, it may be convergent or not convergent (aka an infinite loop). Since a watermark measures completeness, it should advance only up to some event time where the cycle has converged. If

Suggestion or Alternative simples to read file from FTP

2019-01-03 Thread Henrique Molina
Hi folks, I'm a newbie in Beam, and I'm looking for a way to read a file stored on an FTP server. First, I could create a ParDo using FTPClient (Commons-net) that returns a Byte[] of the *.csv file; a second ParDo creates the CSV; a third ParDo uses TextIO to read lines. Somebody could share
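
The steps outlined in this question (fetch the file bytes over FTP, then turn them into CSV rows) can be sketched outside of Beam as a plain preparation script, along the lines of the "prepare the data first" suggestion in the reply thread. This is only a minimal sketch using Python's standard library, not Beam code and not the poster's implementation; the function names, host, path, and credentials are hypothetical placeholders.

```python
import csv
import io
from ftplib import FTP


def fetch_ftp_bytes(host, path, user="anonymous", password=""):
    """Download one file over FTP and return its contents as bytes.

    host, path, user and password are placeholders (assumptions),
    not values taken from the thread.
    """
    buf = io.BytesIO()
    with FTP(host) as ftp:
        ftp.login(user, password)
        # retrbinary calls buf.write once per received chunk
        ftp.retrbinary(f"RETR {path}", buf.write)
    return buf.getvalue()


def csv_rows(data, encoding="utf-8"):
    """Decode raw bytes and parse them as CSV, one list of fields per row."""
    text = io.StringIO(data.decode(encoding), newline="")
    return list(csv.reader(text))
```

A script built this way would call `csv_rows(fetch_ftp_bytes("ftp.example.com", "data/input.csv"))` and write the result somewhere a pipeline can read it, for example a local file or a bucket, so that the pipeline itself only needs a text-based source.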