Re: Checkpoint is not triggering as per configuration

2018-05-15 Thread Tao Xia
llelism > into account). Thus your source functions could complete not when splits > are generated, but when they have finished reading splits. > > Piotrek > > On 14 May 2018, at 20:29, Tao Xia wrote: > > Thanks for the reply Piotr. Which jira ticket were you refer

Re: Checkpoint is not triggering as per configuration

2018-05-14 Thread Tao Xia
Thanks for the reply, Piotr. Which Jira ticket were you referring to? We are trying to use the same code that we use for normal stream processing to process very old historical backfill data. The problem for me right now is that backfilling x years of data will be very slow, and I cannot have any checkpoint during the

Re: Init RocksDB state backend during startup

2018-05-04 Thread Tao Xia
I would also like to know how to do this, if it is possible. On Fri, May 4, 2018 at 9:31 AM, Peter Zende wrote: > Hi, > > We use RocksDB with FsStateBackend (HDFS) to store state used by the > mapWithState operator. Is it possible to initialize / populate this state > during the streaming applicati

Re: coordinate watermarks between jobs?

2018-05-04 Thread Tao Xia
uffer more events from the "faster" streams. > > Right now, it is not possible to throttle faster streams to the pace of > the slowest stream. > > Best, Fabian > > 2018-04-27 1:05 GMT+02:00 Tao Xia : > >> Hi All, >> I am trying to replay events from 3 d

merge/union fast and slow streams based on event timestamp

2018-04-30 Thread Tao Xia
I am running into a problem when processing the past 7 days of data from multiple streams. I am trying to union the streams based on event timestamp. The problem is that some streams are significantly bigger than others. For example, if one stream has 1,000 events/sec and the other stream ha

Re: RateLimit for Kinesis Producer

2018-04-27 Thread Tao Xia
Are you sure the rate limit is coming from KinesisProducer? If yes, Kinesis supports 1,000 record writes per second per shard. If you hit the limit, just increase your shard count. On Fri, Apr 27, 2018 at 8:58 AM, Urs Schoenenberger < urs.schoenenber...@tngtech.com> wrote: > Hi all, > > we are struggling with Ra

coordinate watermarks between jobs?

2018-04-26 Thread Tao Xia
Hi All, I am trying to replay events from 3 different sources, hopefully in time sequence, say Stream1, Stream2, Stream3. Since their sizes vary a lot, the watermark on one stream advances much faster than on the other streams. Is there any way to coordinate the watermarks between different input streams?

Re: Access logs for a running Flink app in YARN cluster

2018-04-13 Thread Tao Xia
5835493_0002_01_02 > taskmanager.err taskmanager.log taskmanager.out > > 2. You can download the logs via HTTP from Flink: > > http://host:port/jobmanager/log > http://host:port/taskmanagers//log > > To get a list of taskmanagers: > > http://host:port/taskmanag

Access logs for a running Flink app in YARN cluster

2018-04-12 Thread Tao Xia
Is there a good way to access container logs from a running Flink app in a YARN cluster on EMR? You can view the logs through the YARN UI, but you cannot programmatically access them and send them to other services. The log aggregator only runs when the application finishes or, at minimum, every 3600 seconds. Any way we can

Anyone got Flink working in EMR with KinesisConnector

2018-01-10 Thread Tao Xia
Hi All, I ran into an exception after deploying our app in EMR. It seems like the connection to Kinesis failed. Has anyone got the Flink KinesisConnector working in EMR? Release label: emr-5.11.0 Hadoop distribution: Amazon 2.7.3 Applications: Flink 1.3.2 java.lang.IllegalStateException: Socket not create

Pending parquet file with Bucking Sink

2017-12-18 Thread Tao Xia
Hi All, Do you guys write Parquet files using BucketingSink? I ran into an issue where all the Parquet files stay in the pending status. Any ideas? processedStream is a DataStream of NDEvent. The output files all look like this one: "_part-0-0.pending" val parquetSink = new BucketingSink[NDEvent]("/tmp

Re: Window function support on SQL

2017-12-04 Thread Tao Xia
; > [1] https://ci.apache.org/projects/flink/flink-docs- > release-1.3/dev/table/sql.html#group-windows > > 2017-12-04 22:17 GMT+01:00 Tao Xia : > >> Hi All, >> Do you know if window function supported on SQL yet? >> I got the error message when trying to us

Window function support on SQL

2017-12-04 Thread Tao Xia
Hi All, Do you know if window functions are supported in SQL yet? I got an error message when trying to use a group function in SQL. My query is below: val query = "SELECT nd_key, concept_rank, event_timestamp FROM "+streamName + " GROUP BY TUMBLE(event_timestamp, INTERVAL '1' HOUR), nd_key" Error M