Re: Move files read by flink

2018-03-08 Thread Jörn Franke
Why don’t you let your flink job move them once it’s done? > On 9. Mar 2018, at 03:12, flinkuser101 wrote: > > I am reading files from a folder suppose > > /files/* > > Files are pushed into that folder. > > /files/file1_2018_03_09.csv >
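
One way to read this suggestion, sketched under the assumption that the job can learn the path of each file it has finished reading (for example from a custom source or a naming convention); the helper class and the target directory are hypothetical:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;

    // Hypothetical helper the job could call once a file has been fully read, e.g.
    // /files/file1_2018_03_09.csv -> /files/processed/file1_2018_03_09.csv
    public final class ProcessedFileMover {

        private ProcessedFileMover() {}

        public static void moveToProcessed(String filePath) throws IOException {
            Path source = Paths.get(filePath);
            Path targetDir = source.getParent().resolve("processed");
            Files.createDirectories(targetDir);
            // Atomic within the same file system, so Flink never sees a half-moved file.
            Files.move(source, targetDir.resolve(source.getFileName()),
                    StandardCopyOption.ATOMIC_MOVE);
        }
    }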

Re: Dynamic CEP https://issues.apache.org/jira/browse/FLINK-7129?subTaskView=all

2018-03-08 Thread Dawid Wysakowicz
Hi, Kostas is right; unfortunately I had to stop the work because we were missing BroadcastState. I hope I will get back to this feature soon and finish it for 1.6. > On 8 Mar 2018, at 17:28, Vishal Santoshi wrote: > > Perfect. Thanks. > > On Thu, Mar 8, 2018 at

Re: Emulate Tumbling window in Event Time Space

2018-03-08 Thread Xingcan Cui
Hi Dhruv, there’s no need to implement the window logic with the low-level `ProcessFunction` yourself. Flink has provided built-in window operators and you just need to implement the `WindowFunction` for that [1]. Best, Xingcan [1]
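
For illustration, a minimal sketch of the built-in approach described above: a keyed tumbling event-time window with a WindowFunction. The input data, timestamps, and the per-window count are hypothetical; only the operator wiring matters here.

    import org.apache.flink.api.java.tuple.Tuple;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.TimeCharacteristic;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.timestamps.AscendingTimestampExtractor;
    import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
    import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
    import org.apache.flink.util.Collector;

    public class TumblingWindowSketch {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

            // Hypothetical input: (key, epoch-millis timestamp) pairs.
            DataStream<Tuple2<String, Long>> input = env
                    .fromElements(Tuple2.of("a", 1000L), Tuple2.of("a", 2000L), Tuple2.of("b", 61000L))
                    .assignTimestampsAndWatermarks(new AscendingTimestampExtractor<Tuple2<String, Long>>() {
                        @Override
                        public long extractAscendingTimestamp(Tuple2<String, Long> element) {
                            return element.f1;
                        }
                    });

            input.keyBy(0)
                 .window(TumblingEventTimeWindows.of(Time.minutes(1)))
                 .apply(new WindowFunction<Tuple2<String, Long>, Tuple2<String, Long>, Tuple, TimeWindow>() {
                     @Override
                     public void apply(Tuple key, TimeWindow window,
                                       Iterable<Tuple2<String, Long>> elements,
                                       Collector<Tuple2<String, Long>> out) {
                         long count = 0;
                         for (Tuple2<String, Long> ignored : elements) {
                             count++;
                         }
                         String k = key.getField(0);
                         // Emitted once per key when the watermark passes the window end,
                         // which is the "emit at the end of every window" behavior asked about.
                         out.collect(Tuple2.of(k, count));
                     }
                 })
                 .print();

            env.execute("Tumbling event-time window sketch");
        }
    }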

Emulate Tumbling window in Event Time Space

2018-03-08 Thread Dhruv Kumar
Hi, I was trying to emulate a tumbling window in event-time space. Here is the link to my code. I am using the process function to do the custom processing which I want to do within every window. I am having an issue with how to emit results at the end of every window, since my watermark only gets

Move files read by flink

2018-03-08 Thread flinkuser101
I am reading files from a folder, suppose /files/* Files are pushed into that folder. /files/file1_2018_03_09.csv /files/file2_2018_03_09.csv Flink is reading files from the folder fine, but as the number of files grows, how do I move the files into another folder? Currently I am using a cronjob to

UUIDs generated by Flink SQL

2018-03-08 Thread Gregory Fee
Hello, from what I understand in the documentation it appears there is no way to assign UUIDs to operators added to the DAG by Flink SQL. Is my understanding correct? I'd very much like to be able to assign UUIDs to those operators. I want to run a program using some Flink SQL, create a save

Re: flink sql timed-window join throw "mismatched type" AssertionError on rowtime column

2018-03-08 Thread Yan Zhou [FDS Science]
Hi Xingcan, Timo, Thanks for the information. I am going to convert the result table to a DataStream and follow the logic of TimeBoundedStreamInnerJoin to do the timed-window join. Should I do this? Is there any concern from a performance or stability perspective? Best, Yan
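
A minimal sketch of the conversion step mentioned above, assuming a StreamTableEnvironment tEnv and a SQL result table resultTable already exist; both names are hypothetical.

    // Turn the Table back into a DataStream so a custom time-bounded join
    // (along the lines of TimeBoundedStreamInnerJoin) can be applied on it.
    DataStream<Row> resultStream = tEnv.toAppendStream(resultTable, Row.class);
    // If the query can also produce updates or deletions, use
    // tEnv.toRetractStream(resultTable, Row.class) instead.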

Re: Event time join

2018-03-08 Thread Vishal Santoshi
Yep. I think this leads to this general question and may not be pertinent to https://github.com/apache/flink/pull/5342. How do we throttle a source if the held-back data gets unreasonably large? I know that that is in itself a broader question, but delayed watermarks of a slow stream accentuate

Re: Event time join

2018-03-08 Thread Fabian Hueske
The join would not cause backpressure but rather put all events that cannot be processed yet into state to process them later. So this works well if the data that is provided by the streams is roughly aligned by event time. 2018-03-08 9:04 GMT-08:00 Vishal Santoshi : >
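
To make the idea concrete, here is a heavily simplified sketch of buffering unmatched events in keyed state and releasing them via event-time timers. It is not the actual TimeBoundedStreamInnerJoin implementation; the types, the join output, and the state clean-up are placeholders, and it assumes the two streams are connected and keyed on the join key with timestamps and watermarks assigned.

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.flink.api.common.state.ListState;
    import org.apache.flink.api.common.state.ListStateDescriptor;
    import org.apache.flink.api.common.state.MapState;
    import org.apache.flink.api.common.state.MapStateDescriptor;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
    import org.apache.flink.util.Collector;

    public class BufferingJoinSketch extends CoProcessFunction<String, String, String> {

        // Left events held back until the watermark reaches their timestamp.
        private transient MapState<Long, List<String>> leftBuffer;
        // Right events kept so that held-back left events can still join them.
        private transient ListState<String> rightBuffer;

        @Override
        public void open(Configuration parameters) {
            leftBuffer = getRuntimeContext().getMapState(
                    new MapStateDescriptor<>("left-buffer", Types.LONG, Types.LIST(Types.STRING)));
            rightBuffer = getRuntimeContext().getListState(
                    new ListStateDescriptor<>("right-buffer", Types.STRING));
        }

        @Override
        public void processElement1(String left, Context ctx, Collector<String> out) throws Exception {
            long ts = ctx.timestamp();
            List<String> atSameTime = leftBuffer.get(ts);
            if (atSameTime == null) {
                atSameTime = new ArrayList<>();
            }
            atSameTime.add(left);
            leftBuffer.put(ts, atSameTime);
            // Ask for a callback once the watermark (the minimum over both inputs)
            // has passed this event's timestamp.
            ctx.timerService().registerEventTimeTimer(ts);
        }

        @Override
        public void processElement2(String right, Context ctx, Collector<String> out) throws Exception {
            rightBuffer.add(right);
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
            // Both inputs have progressed past `timestamp`, so the left events held
            // back for it can be joined against what the right side has delivered so far.
            List<String> lefts = leftBuffer.get(timestamp);
            if (lefts == null) {
                return;
            }
            for (String left : lefts) {
                for (String right : rightBuffer.get()) {
                    out.collect(left + "," + right);
                }
            }
            leftBuffer.remove(timestamp);
            // A real implementation would also prune rightBuffer according to the
            // join's time bounds so that the state does not grow without limit.
        }
    }

It would be applied as something like leftStream.connect(rightStream).keyBy(leftKeySelector, rightKeySelector).process(new BufferingJoinSketch()).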

Re: Event time join

2018-03-08 Thread Vishal Santoshi
Aah we have it here https://docs.google.com/document/d/16GMH5VM6JJiWj_N0W8y3PtQ1aoJFxsKoOTSYOfqlsRE/edit#heading=h.bgl260hr56g6 On Thu, Mar 8, 2018 at 11:45 AM, Vishal Santoshi wrote: > This is very interesting. I would imagine that there will be high back > pressure

Re: Event time join

2018-03-08 Thread Vishal Santoshi
This is very interesting. I would imagine that there will be high back pressure on the LEFT source, effectively throttling it, but as is the current state, that is likely to affect other pipelines as the free o/p buffers on the source side and i/p buffers on the consumer side start blocking and get

Re: Event time join

2018-03-08 Thread Fabian Hueske
Hi Gytis, Flink does not currently support holding back individual streams; for example, it is not possible to align streams on (offset) event time. However, the Flink community is working on a windowed join for the DataStream API that only holds the relevant tail of the stream as state. If your

Re: Dynamic CEP https://issues.apache.org/jira/browse/FLINK-7129?subTaskView=all

2018-03-08 Thread Vishal Santoshi
Perfect. Thanks. On Thu, Mar 8, 2018 at 10:41 AM, Kostas Kloudas wrote: > Hi Vishal, > > Dawid (cc’ed) who was working on that stopped because in the past Flink > did not support broadcast state. > > This is now added (in the master) and the implementation of

Re: flink sql timed-window join throw "mismatched type" AssertionError on rowtime column

2018-03-08 Thread Xingcan Cui
Hi Yan & Timo, this is confirmed to be a bug and I’ve created an issue [1] for it. I’ll explain more about this query. In Flink SQL/Table API, the DISTINCT keyword will be implemented with an aggregation, which outputs a retract stream [2]. In that situation, all the time-related fields will

Re: flink sql timed-window join throw "mismatched type" AssertionError on rowtime column

2018-03-08 Thread Hequn Cheng
Hi Yan, This is a bug in Flink. As a workaround, you can cast eventTime to other basic SQL types (for example, cast eventTime as varchar). @Timo and @Xingcan, I think we have to materialize time indicators in conditions of LogicalFilter. I created an issue and we can have more discussions
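
A sketch of what the suggested workaround could look like, assuming a registered table `orders` with a rowtime attribute `eventTime`; the table, columns, and query are hypothetical, and the method is sqlQuery or sql depending on the Flink version.

    // Casting the time attribute to a basic SQL type materializes it, so the
    // DISTINCT/aggregation no longer trips over the rowtime column.
    Table deduplicated = tableEnv.sqlQuery(
        "SELECT DISTINCT orderId, CAST(eventTime AS VARCHAR) AS eventTimeStr FROM orders");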

Re: Dynamic CEP https://issues.apache.org/jira/browse/FLINK-7129?subTaskView=all

2018-03-08 Thread Kostas Kloudas
Hi Vishal, Dawid (cc’ed) who was working on that stopped because in the past Flink did not support broadcast state. This is now added (in the master) and the implementation of FLINK-7129 will continue hopefully soon. Cheers, Kostas > On Mar 8, 2018, at 4:09 PM, Vishal Santoshi
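
For context, a minimal sketch of what using the (then new) broadcast state looks like: one stream carries events, another broadcasts rule/pattern updates that every parallel instance sees. Stream names, types, and the matching logic are hypothetical, and the exact API may differ between versions.

    import java.util.Map;

    import org.apache.flink.api.common.state.MapStateDescriptor;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.streaming.api.datastream.BroadcastStream;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
    import org.apache.flink.util.Collector;

    public class BroadcastStateSketch {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Hypothetical inputs: `events` carries data, `patterns` carries rule updates.
            DataStream<String> events = env.socketTextStream("localhost", 9999);
            DataStream<String> patterns = env.socketTextStream("localhost", 9998);

            MapStateDescriptor<String, String> patternsDesc =
                    new MapStateDescriptor<>("patterns", Types.STRING, Types.STRING);

            BroadcastStream<String> broadcastPatterns = patterns.broadcast(patternsDesc);

            events.connect(broadcastPatterns)
                  .process(new BroadcastProcessFunction<String, String, String>() {

                      @Override
                      public void processElement(String event, ReadOnlyContext ctx,
                                                 Collector<String> out) throws Exception {
                          // Evaluate each event against the currently broadcast patterns.
                          for (Map.Entry<String, String> p :
                                  ctx.getBroadcastState(patternsDesc).immutableEntries()) {
                              if (event.contains(p.getValue())) {
                                  out.collect("pattern " + p.getKey() + " matched: " + event);
                              }
                          }
                      }

                      @Override
                      public void processBroadcastElement(String pattern, Context ctx,
                                                          Collector<String> out) throws Exception {
                          // Every parallel instance receives the update and stores it identically.
                          ctx.getBroadcastState(patternsDesc).put(pattern, pattern);
                      }
                  })
                  .print();

            env.execute("Broadcast state sketch");
        }
    }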

Re: Dynamic CEP https://issues.apache.org/jira/browse/FLINK-7129?subTaskView=all

2018-03-08 Thread Vishal Santoshi
Hello Fabian, What about https://issues.apache.org/jira/browse/FLINK-7129 ? Do you folks intend to conclude this ticket too? On Thu, Mar 8, 2018 at 1:08 AM, Fabian Hueske wrote: > We hope to pick up FLIP-20 after Flink 1.5.0 has been released. > > 2018-03-07 22:05

Re: Job is cancelled, but the stdout log still prints

2018-03-08 Thread sundy
I got it. That’s really a big problem. Thank you very much > On 8 Mar 2018, at 21:03, kedar mhaswade wrote: > > Also, in addition to what Gary said, if you take Flink completely out of > picture and wrote a simple Java class with a main method and the static block >

Re: Job is cancelled, but the stdout log still prints

2018-03-08 Thread kedar mhaswade
Also, in addition to what Gary said, if you take Flink completely out of the picture and write a simple Java class with a main method and the static block (!) which does some long-running task like getLiveInfo(), then chances are that your class will make the JVM hang! Basically what you are doing is
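
A Flink-free illustration of the point above, assuming getLiveInfo() is a periodically scheduled task as in the original job; the class and the task body are made up. The scheduler's worker thread is non-daemon, so the process keeps running after main() returns.

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class StaticBlockHang {

        static {
            // The static initializer starts a scheduler backed by a non-daemon thread.
            ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
            scheduler.scheduleAtFixedRate(StaticBlockHang::getLiveInfo, 0, 5, TimeUnit.SECONDS);
        }

        private static void getLiveInfo() {
            System.out.println("still running at " + System.currentTimeMillis());
        }

        public static void main(String[] args) {
            // The JVM will not exit here, because the scheduler thread is still alive.
            System.out.println("main() is done, but the process will not terminate");
        }
    }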

Re: flink sql timed-window join throw "mismatched type" AssertionError on rowtime column

2018-03-08 Thread Timo Walther
Hi Xingcan, thanks for looking into this. This definitely seems to be a bug, maybe in org.apache.flink.table.calcite.RelTimeIndicatorConverter. In any case, we should create an issue for it. Regards, Timo On 3/8/18 at 7:27 AM, Yan Zhou [FDS Science] wrote: Hi Xingcan, Thanks for

Re: Table Api and CSV builder

2018-03-08 Thread Timo Walther
Hi Karim, the CsvTableSource and its builder are currently not able to specify event time or processing time. I'm sure this will change in the near future. Until then, I would recommend either extending it yourself or using the DataStream API first to do the parsing and watermarking, and then
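
A sketch of the second suggestion (parse and watermark with the DataStream API, then register the stream as a table with a rowtime attribute). The file path, schema, and field names are hypothetical, and the TableEnvironment factory method differs between Flink versions.

    import org.apache.flink.api.common.typeinfo.TypeHint;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.TimeCharacteristic;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.timestamps.AscendingTimestampExtractor;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.TableEnvironment;
    import org.apache.flink.table.api.java.StreamTableEnvironment;

    public class CsvEventTimeSketch {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
            StreamTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);

            // Parse the CSV with the DataStream API: first column is an epoch-millis
            // timestamp, second column a payload string (both hypothetical).
            DataStream<Tuple2<Long, String>> parsed = env
                    .readTextFile("/path/to/input.csv")
                    .map(line -> {
                        String[] fields = line.split(",");
                        return Tuple2.of(Long.parseLong(fields[0]), fields[1]);
                    })
                    .returns(new TypeHint<Tuple2<Long, String>>() {})
                    .assignTimestampsAndWatermarks(new AscendingTimestampExtractor<Tuple2<Long, String>>() {
                        @Override
                        public long extractAscendingTimestamp(Tuple2<Long, String> element) {
                            return element.f0;
                        }
                    });

            // Expose the stream's event time as a rowtime attribute named `ts`,
            // replacing the first field, and keep the payload column as `payload`.
            Table events = tEnv.fromDataStream(parsed, "ts.rowtime, payload");
            tEnv.registerTable("events", events);
        }
    }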

Re: Flink UI not responding on Yarn + Flink

2018-03-08 Thread Gary Yao
Hi Samar, Can you share the JobManager and TaskManager logs returned by: yarn logs -applicationId ? Is your browser rendering a blank page, or does the HTTP request not finish? Can you show the output of one of the following commands: curl -v http://host:port curl -v

Re: Flink Kafka reads too many bytes .... Very rarely

2018-03-08 Thread Stephan Ewen
Double checking: The "deserialize(byte[] message)" already receives an additional byte[] with too many bytes? I wonder if this might be an issue in Kafka then, or in the specific way Kafka is configured. On Wed, Mar 7, 2018 at 5:40 PM, Philip Doctor wrote: > Hi

Re: Flink is looking for Kafka topic "n/a"

2018-03-08 Thread Nico Kruber
I think, I found a code path (race between threads) that may lead to two markers being in the list. I created https://issues.apache.org/jira/browse/FLINK-8896 to track this and will have a pull request ready (probably) today. Nico On 07/03/18 10:09, Mu Kong wrote: > Hi Gordon, > > Thanks for

Re: Job is cancelled, but the stdout log still prints

2018-03-08 Thread Gary Yao
Hi, You are not shutting down the ScheduledExecutorService [1], which means that after job cancelation the thread will continue running getLiveInfo(). The user code class loader and your classes won't be garbage collected. You should use the RichFunction#close callback to shut down your thread
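
A minimal sketch of this suggestion, using a RichMapFunction as a stand-in for the user function from the thread; the scheduled task and all names are made up.

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    public class LiveInfoMapper extends RichMapFunction<String, String> {

        private transient ScheduledExecutorService scheduler;

        @Override
        public void open(Configuration parameters) {
            // Start the periodic task when the operator is opened, not in a static block.
            scheduler = Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(this::getLiveInfo, 0, 5, TimeUnit.SECONDS);
        }

        @Override
        public String map(String value) {
            return value;
        }

        @Override
        public void close() {
            // Shut the executor down on cancelation so its thread stops and the
            // user-code class loader can be garbage collected.
            if (scheduler != null) {
                scheduler.shutdownNow();
            }
        }

        private void getLiveInfo() {
            System.out.println("live info at " + System.currentTimeMillis());
        }
    }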

Event time join

2018-03-08 Thread Gytis Žilinskas
Hi, we're considering Flink for a couple of our projects. I'm doing a trial implementation for one of them. So far, I like a lot of things; however, there are a couple of issues that I can't figure out how to resolve. Not sure if it's me misunderstanding the tool, or Flink just doesn't have a