Re: Timestamp and key preservation over operators

2019-04-30 Thread Averell
Hi Fabian, Guowei, I have some updates: 1. I added a timestamp & watermark extractor on all of my remaining sources (3 & 4), and the watermark does propagate to my final operator. 2. As I could not find a way to set my file sources as IDLE, I tried to tweak the class ContinuousFileReaderOperator to be
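
For reference, attaching such an extractor to a stream looks roughly like the sketch below (assuming the Flink 1.8 Scala API; the Event type and its ts field are illustrative):

    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
    import org.apache.flink.streaming.api.windowing.time.Time

    case class Event(ts: Long, value: String)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val stream: DataStream[Event] = env.fromElements(Event(1L, "a"), Event(2L, "b"))

    // Watermarks trail the highest timestamp seen so far by 10 seconds.
    val withTimestamps = stream.assignTimestampsAndWatermarks(
      new BoundedOutOfOrdernessTimestampExtractor[Event](Time.seconds(10)) {
        override def extractTimestamp(e: Event): Long = e.ts
      })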

RE: Can't build Flink for Scala 2.12

2019-04-30 Thread Visser, M.J.H. (Martijn)
In the meantime, I had a look at the Travis YAML file for examples of how the compilation for 2.12 happens there. It appears that 1) there might be a typo, because the build profile is 2.112 instead of 2.12 (see https://github.com/apache/flink/blob/master/.travis.yml#L143) and there's also no

Re: PatternFlatSelectAdapter - Serialization issue after 1.8 upgrade

2019-04-30 Thread Oytun Tez
Hi all, Making the tag a static element worked out, thank you! --- Oytun Tez *M O T A W O R D* The World's Fastest Human Translation Platform. oy...@motaword.com — www.motaword.com On Tue, Apr 23, 2019 at 10:37 AM Oytun Tez wrote: > Thank you Guowei and Dawid! I am trying your suggestions to
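
For anyone who hits the same serialization issue: the fix amounts to moving the OutputTag into a static context so it is not captured in the closure of the pattern-select function. A minimal Scala sketch, names illustrative:

    import org.apache.flink.streaming.api.scala.OutputTag

    // A companion object is Scala's "static" context: the tag is no longer a
    // field of the enclosing (non-serializable) instance, so the closure
    // cleaner has nothing problematic to serialize.
    object TimedOutPatterns {
      val timedOutTag: OutputTag[String] = OutputTag[String]("timed-out")
    }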

Re: Timestamp and key preservation over operators

2019-04-30 Thread Averell
Hi Fabian, Guowei, Thanks for the help. My flow is as in the attached photo, where (1) and (2) are the main data streams from file sources, while (3) and (4) are the enrichment data, also from file sources.

Filter push-down not working for a custom BatchTableSource

2019-04-30 Thread Josh Bradt
Hi all, I'm trying to implement filter push-down on a custom BatchTableSource that retrieves data from a REST API and returns it as POJO instances. I've implemented FilterableTableSource as described in the docs, returning a new instance of my table source containing the predicates that I've remov
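
The usual shape of that pattern, as a rough sketch against the Flink 1.8 FilterableTableSource interface (the helpers isSupported and copyWithPredicates are illustrative, not library API):

    import java.util.{List => JList}
    import org.apache.flink.table.expressions.Expression
    import org.apache.flink.table.sources.TableSource

    // Inside a BatchTableSource[MyPojo] that also mixes in
    // FilterableTableSource[MyPojo]; `pushed` is a field holding the
    // predicates accepted so far.
    override def applyPredicate(predicates: JList[Expression]): TableSource[MyPojo] = {
      val accepted = Seq.newBuilder[Expression]
      val it = predicates.iterator()
      while (it.hasNext) {
        val p = it.next()
        // Remove every predicate this source can evaluate itself; whatever
        // remains in the list is applied by Flink after the scan.
        if (isSupported(p)) { accepted += p; it.remove() }
      }
      copyWithPredicates(pushed ++ accepted.result()) // new source instance
    }

    override def isFilterPushedDown: Boolean = pushed.nonEmpty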

Re: Flink dashboard+ program arguments

2019-04-30 Thread Rad Rad
Thanks, Fabian. The problem was an incorrect Java path. Now everything works fine. I would like to ask about the command for running sql-client.sh. These commands don't work: ./sql-client.sh OR ./flink sql-client -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
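
For the archives: the SQL client is started from the bin directory of the distribution in embedded mode, e.g.:

    cd flink-1.5.0
    ./bin/sql-client.sh embedded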

Can't build Flink for Scala 2.12

2019-04-30 Thread Visser, M.J.H. (Martijn)
Hi all, I'm trying to build Flink (from current master branch) for Scala 2.12, using: mvn clean install -Pscala-2.12 -Dscala-2.12 -DskipTests It fails for me with this error: [ERROR] /home/pa35uq/Workspace/flink/flink-table/flink-table-planner-blink/src/main/scala/org/apache/flink/table

Re: Flink dashboard+ program arguments

2019-04-30 Thread Fabian Hueske
Hi, With Flink 1.5.0, we introduced a new distributed architecture (see release announcement [1] and FLIP-6 [2]). From what you describe, I cannot tell what is going wrong. How do you submit your application? Which action resulted in the error message you shared? Btw. why do you go for Flink 1.

Flink dashboard+ program arguments

2019-04-30 Thread Rad Rad
Hi, I am using Flink 1.4.2 and I can't see my running jobs on the Flink web dashboard. I downloaded Flink 1.5 and 1.6; I received this message when I tried to send my arguments like this: --topic sensor --bootstrap.servers localhost:9092 --zookeeper.connect localhost:2181 --group.id test-consumer-g
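
For context, program arguments are normally appended after the job jar when submitting from the CLI (the jar name and group id below are illustrative):

    ./bin/flink run my-kafka-job.jar \
      --topic sensor \
      --bootstrap.servers localhost:9092 \
      --zookeeper.connect localhost:2181 \
      --group.id my-consumer-group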

Preserve accumulators after failure in DataStream API

2019-04-30 Thread Wouter Zorgdrager
Hi all, In the documentation I read about UDF accumulators [1]: "Accumulators are automatically backup-ed by Flink's checkpointing mechanism and restored in case of a failure to ensure exactly-once semantics." So I assumed this was also the case for accumulators used in the DataStream API, but I not

Re: Emitting current state to a sink

2019-04-30 Thread M Singh
Thanks Avi for your help. Mans On Tuesday, April 30, 2019, 5:57:51 AM EDT, Avi Levi wrote: Sure! You get the context and the collector in the processBroadcastElement method; see the snippet below: override def processBroadcastElement(value: BroadcastRequest, ctx: KeyedBroadcastProcess

Re: Emitting current state to a sink

2019-04-30 Thread Avi Levi
Sure! You get the context and the collector in the processBroadcastElement method; see the snippet below: override def processBroadcastElement(value: BroadcastRequest, ctx: KeyedBroadcastProcessFunction[String, Request, BroadcastRequest, String]#Context, out: Collector[String]): Unit = {
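
The body then typically walks the keyed state with ctx.applyToKeyedState and emits each key's current value through the collector. A minimal sketch (Flink 1.8 Scala API; the state descriptor is illustrative):

    import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
    import org.apache.flink.runtime.state.KeyedStateFunction
    import org.apache.flink.util.Collector

    // Inside the KeyedBroadcastProcessFunction[String, Request, BroadcastRequest, String]:
    val stateDesc = new ValueStateDescriptor[String]("my-state", classOf[String])

    override def processBroadcastElement(
        value: BroadcastRequest,
        ctx: KeyedBroadcastProcessFunction[String, Request, BroadcastRequest, String]#Context,
        out: Collector[String]): Unit = {
      // Apply a function to the state of every key seen so far and emit it.
      ctx.applyToKeyedState(stateDesc, new KeyedStateFunction[String, ValueState[String]] {
        override def process(key: String, state: ValueState[String]): Unit =
          out.collect(s"$key=${state.value()}")
      })
    }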

Re: Flink heap memory

2019-04-30 Thread Rad Rad
Thanks a lot. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink orc file write example

2019-04-30 Thread Fabian Hueske
Hi, I had a look but couldn't find an ORC writer in flink-orc, only an InputFormat and TableSource to read ORC data into DataSet programs or Table / SQL queries. Where did you find the ORC writer? Thanks, Fabian On Tue, 30 Apr 2019 at 09:09, Hai wrote: > Hi, > > > I found flink now supp
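
What does exist on the read side looks roughly like the sketch below (the path and schema are illustrative, and tableEnv is assumed to be a TableEnvironment already in scope):

    import org.apache.flink.orc.OrcTableSource

    val orcSource = OrcTableSource.builder()
      .path("hdfs:///path/to/orc-files")
      .forOrcSchema("struct<name:string,age:int>")
      .build()

    // Register it and query it with the Table API / SQL.
    tableEnv.registerTableSource("OrcTable", orcSource)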

Re: Timestamp and key preservation over operators

2019-04-30 Thread Fabian Hueske
Hi, Actually, all operators should preserve record timestamps if you set the correct TimeCharacteristic to event time. A window operator will set the timestamp of all emitted records to the end timestamp of the window. Not sure what happens if you use a processing time window in an event time applicati
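
That is, the prerequisite is the usual one-liner on the environment (Scala sketch):

    import org.apache.flink.streaming.api.TimeCharacteristic
    import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // With event time set, operators propagate record timestamps downstream.
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)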

Re: can we do Flink CEP on event stream or batch or both?

2019-04-30 Thread Fabian Hueske
Hi, Stateful streaming applications are typically designed to run continuously (i.e., until forever, or until they are not needed anymore or are replaced). Many jobs run for weeks or months. IMO, using CEP for "simple" equality matches would add too much complexity for a use case that can be easily sol

Re: How to verify what maxParallelism is set to?

2019-04-30 Thread Fabian Hueske
Hi Sean, I was looking for the max-parallelism value in the UI, but couldn't find it. Also, the REST API does not seem to provide it. Would you mind opening a Jira issue for adding it to the REST API and the Web UI? Thank you, Fabian On Tue, 30 Apr 2019 at 06:36, Sean Bollin wrote: > Tha

Flink Kafka Connection Failure Notifications

2019-04-30 Thread Chirag Dewan
Hi, I am using Flink 1.7.2 with Kafka Connector 0.11 for consuming records from Kafka. I observed that if the broker is down, the Kafka consumer does nothing but log the connection error and keep reconnecting to the broker. And in fact the log level seems to be DEBUG. Is there any way to captu

Re: assignTimestampsAndWatermarks not work after KeyedStream.process

2019-04-30 Thread Fabian Hueske
An operator task broadcasts its current watermark to all downstream tasks that might receive its records. If you have the following code: DataStream a = ... a.map(A).map(B).keyBy().window(C) and execute this with parallelism 2, your plan looks like this: A.1 -- B.1 --\--/-- C.1

Re: Working around lack of SQL triggers

2019-04-30 Thread Fabian Hueske
You could implement aggregation functions that just do AVG, COUNT, etc. and a parameterizable aggregation function that can be configured to call the avg, count, etc. functions. When configuring, you would specify the input and output, for example like this: input: [int, int, double] key: input.1
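
One possible shape of such a parameterizable function, sketched against the DataStream AggregateFunction interface for brevity (a Table API AggregateFunction would follow the same delegation pattern; all names are illustrative):

    import org.apache.flink.api.common.functions.AggregateFunction

    sealed trait FieldAgg {
      def init: Double
      def add(acc: Double, v: Double): Double
    }
    case object Sum   extends FieldAgg { val init = 0.0; def add(acc: Double, v: Double): Double = acc + v }
    case object Count extends FieldAgg { val init = 0.0; def add(acc: Double, v: Double): Double = acc + 1 }

    // Configured with one FieldAgg per input field, e.g. Seq(Count, Sum).
    class ConfigurableAgg(aggs: Seq[FieldAgg])
        extends AggregateFunction[Seq[Double], Array[Double], Seq[Double]] {
      override def createAccumulator(): Array[Double] = aggs.map(_.init).toArray
      override def add(in: Seq[Double], acc: Array[Double]): Array[Double] = {
        for (i <- aggs.indices) acc(i) = aggs(i).add(acc(i), in(i))
        acc
      }
      override def getResult(acc: Array[Double]): Seq[Double] = acc.toVector
      override def merge(a: Array[Double], b: Array[Double]): Array[Double] =
        a.zip(b).map { case (x, y) => x + y } // Sum and Count both merge additively
    }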

Re: Data Locality in Flink

2019-04-30 Thread Fabian Hueske
Such a decision would require some distribution statistics, preferably stats on the actual data that needs to be rebalanced (or not). This data would only be available while a job is executing, and a component that changes a running program is very difficult to implement. Best, Fabian On Mon, 29 Ap

Re: kafka partitions, data locality

2019-04-30 Thread Fabian Hueske
Hi Sergey, You are right, keys are managed in key groups. Each key belongs to a key group and one or more key groups are assigned to each parallel task of an operator. Key groups are not exposed to users and the assignments of keys -> key-groups and key-groups -> tasks cannot be changed without ch
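
For reference, the assignment boils down to the following (a sketch mirroring Flink's KeyGroupRangeAssignment, not the exact source):

    import org.apache.flink.util.MathUtils

    // key -> key group: fixed once maxParallelism is chosen
    def assignToKeyGroup(key: Any, maxParallelism: Int): Int =
      MathUtils.murmurHash(key.hashCode()) % maxParallelism

    // key group -> parallel subtask of the operator
    def operatorIndex(keyGroup: Int, maxParallelism: Int, parallelism: Int): Int =
      keyGroup * parallelism / maxParallelism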

Flink orc file write example

2019-04-30 Thread Hai
Hi, I found that Flink now supports the ORC file writer in the module flink-connectors/flink-orc. Could anyone show me a basic usage of this module? I didn't find it on the official site or the internet. Many thanks.