Re: [EXTERNAL] Re: Help required - "BucketingSink" usage to write HDFS Files

2017-08-06 Thread Raja.Aravapalli
Hi Vinay, Thanks for the response. I have NOT enabled any checkpointing. Files are rolling correctly at every 2 MB, but the files are remaining as below: -rw-r--r-- 3 2097424 2017-08-06 21:10 ////Test/part-0-0.pending -rw-r--r-- 3 1431430 2017-08-06 21:12
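
A minimal sketch of the piece that appears to be missing, assuming the Flink 1.3-era DataStream API in use at the time; the 60-second checkpoint interval and class name are illustrative, not values from this thread:

    import org.apache.flink.streaming.api.CheckpointingMode;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class EnableCheckpointing {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // BucketingSink renames .pending part files to their final name only when
            // the checkpoint covering them completes, so without checkpointing the
            // rolled files stay in the pending state indefinitely.
            env.enableCheckpointing(60_000L, CheckpointingMode.EXACTLY_ONCE);
            // ... source, transformations and the BucketingSink would go here,
            // followed by env.execute() ...
        }
    }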

Re: Help required - "BucketingSink" usage to write HDFS Files

2017-08-06 Thread vinay patil
Hi Raja, Have you enabled checkpointing? The files will be rolled to the complete state when the batch size is reached (in your case 2 MB) or when the bucket is inactive for a certain amount of time. Regards, Vinay Patil On Mon, Aug 7, 2017 at 7:53 AM, Raja.Aravapalli [via Apache Flink User
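
A sketch of how those two rolling conditions map onto the BucketingSink configuration; the HDFS path, bucketer format and thresholds are made-up illustrations, not values from this thread, and checkpointing still has to be enabled (see the sketch above) for pending files to become final:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
    import org.apache.flink.streaming.connectors.fs.bucketing.DateTimeBucketer;

    public class HdfsSinkExample {
        // 'lines' stands in for whatever DataStream<String> the job already produces.
        static void addHdfsSink(DataStream<String> lines) {
            BucketingSink<String> sink = new BucketingSink<>("hdfs:///tmp/flink-output");
            sink.setBucketer(new DateTimeBucketer<String>("yyyy-MM-dd--HH"));
            // Roll the in-progress part file once it reaches 2 MB ...
            sink.setBatchSize(2L * 1024 * 1024);
            // ... or once the bucket has received no data for 60 seconds.
            sink.setInactiveBucketThreshold(60_000L);
            sink.setInactiveBucketCheckInterval(60_000L);
            lines.addSink(sink);
            // The rolled files still appear as .pending until a checkpoint completes.
        }
    }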

Help required - "BucketingSink" usage to write HDFS Files

2017-08-06 Thread Raja.Aravapalli
Hi, I am working on a POC to write HDFS files using the BucketingSink class. Even though the data is being written to HDFS files, the files remain with a “.pending” suffix on HDFS. Below is the code I am using. Can someone please help me identify the issue and fix it?

Flink - Handling late events - main vs late window firing

2017-08-06 Thread M Singh
Hi Folks: I am going through the Flink documentation and it states the following: "You should be aware that the elements emitted by a late firing should be treated as updated results of a previous computation, i.e., your data stream will contain multiple results for the same computation. Depending
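
A small sketch of the behaviour that sentence describes, with made-up key/field types and window sizes: every late element arriving within the allowed lateness re-fires the window, so the stream downstream carries several results for the same key and window and has to be treated as a stream of updates (e.g. upserted by window start) rather than as append-only:

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;

    public class LateFiringExample {
        // 'events' is an assumed DataStream of (key, count) pairs with timestamps
        // and watermarks already assigned upstream.
        static DataStream<Tuple2<String, Long>> countPerWindow(DataStream<Tuple2<String, Long>> events) {
            return events
                    .keyBy(0)
                    .window(TumblingEventTimeWindows.of(Time.minutes(10)))
                    // Elements up to 5 minutes late re-trigger the window, emitting an
                    // updated count for a window that has already fired once.
                    .allowedLateness(Time.minutes(5))
                    .sum(1);
        }
    }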

Re: FileNotFound Exception in Cluster Standalone

2017-08-06 Thread Jörn Franke
No, this is supposed to be the task of the filesystem. Here you can use HDFS (you just put the file on HDFS and it is accessible from everywhere, and Flink tries to execute the computation on the nodes where it is stored) or an object store, such as Swift or S3 (with the limitation that the file is then
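
A sketch of what that looks like in practice, assuming the DataStream API and a made-up path (the same idea applies to the DataSet API): instead of pointing the job at a local path that exists only on the submitting machine, refer to a URI on a filesystem every TaskManager can reach.

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ReadFromHdfs {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // Each TaskManager resolves this path itself, so it must be reachable from
            // all nodes: an HDFS or S3 URI works everywhere, while a local path only
            // works if the file exists on every machine in the cluster.
            DataStream<String> lines = env.readTextFile("hdfs:///user/flink/input.txt");
            lines.print();
            env.execute("Read input from HDFS");
        }
    }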

Re: FileNotFound Exception in Cluster Standalone

2017-08-06 Thread Kaepke, Marc
Thanks Jörn! I expected Flink would ship the input file to all workers. On 05.08.2017 at 16:25, Jörn Franke wrote: Probably you need to refer to the file on HDFS or manually make it available on each node as a local file. HDFS is