llelism
> into account). Thus your source functions could complete not when splits
> are generated, but when they have finished reading splits.
>
> Piotrek
>
> On 14 May 2018, at 20:29, Tao Xia wrote:
>
> Thanks for the reply Piotr. Which jira ticket were you refer
Thanks for the reply Piotr. Which jira ticket were you refer to?
We were trying to use the same code for normal stream process to process
very old historical backfill data.
The problem for me right now is that, backfill x years of data will be very
slow. And I cannot have any checkpoint during the
Also would like to know how to do this if it is possible.
On Fri, May 4, 2018 at 9:31 AM, Peter Zende wrote:
> Hi,
>
> We use RocksDB with FsStateBackend (HDFS) to store state used by the
> mapWithState operator. Is it possible to initialize / populate this state
> during the streaming applicati
uffer more events from the "faster" streams.
>
> Right now, it is not possible to throttle faster streams to the pace of
> the slowest stream.
>
> Best, Fabian
>
> 2018-04-27 1:05 GMT+02:00 Tao Xia :
>
>> Hi All,
>> I am trying to reply events from 3 d
I am running into a problem when processing the past 7 days of data from
multiple streams. I am trying to union the streams based on event
timestamp.
The problem is that there are streams are significant big than other
streams. For example if one stream has 1,000 event/sec and the other stream
ha
Are you sure rate limit is coming from KinesisProducer?
If yes, Kinesis support 1000 record write per sec per shard. if you hit the
limit, just increase your shard.
On Fri, Apr 27, 2018 at 8:58 AM, Urs Schoenenberger <
urs.schoenenber...@tngtech.com> wrote:
> Hi all,
>
> we are struggling with Ra
Hi All,
I am trying to reply events from 3 different sources and hopefully in
time sequence, say Stream1, Stream2, Stream3. Since their size vary a lot,
the watermarks on one stream is much faster than other streams. Is there
any way to coordinate the watermarks between different input streams.
5835493_0002_01_02
> taskmanager.err taskmanager.log taskmanager.out
>
> 2. You can download the logs via HTTP from Flink:
>
> http://host:port/jobmanager/log
> http://host:port/taskmanagers//log
>
> To get a list of taskmanagers:
>
> http://host:port/taskmanag
Any good way to get access container logs from a running Flink app in YARN
cluster in EMR?
You can view the logs through YARN UI. But cannot programmatically access
it and send to other services.
The log aggregator only runs when the application finishes or a minimum
3600 secs copy. Any way we can
Hi All,
I ran into an exception after deployed our app in EMR. It seems like the
connection to Kinesis failed. Any one got Flink KinesisConnector working in
EMR?
Release label:emr-5.11.0
Hadoop distribution:Amazon 2.7.3
Applications:Flink 1.3.2
java.lang.IllegalStateException: Socket not create
Hi All,
Do you guys write parquet file using Bucking Sink? I run into an issue
with all the parquet files are in the pending status. Any ideas?
processedStream is a DataStream of NDEvent.
Output files are all like this one "_part-0-0.pending"
val parquetSink = new BucketingSink[NDEvent]("/tmp
;
> [1] https://ci.apache.org/projects/flink/flink-docs-
> release-1.3/dev/table/sql.html#group-windows
>
> 2017-12-04 22:17 GMT+01:00 Tao Xia :
>
>> Hi All,
>> Do you know if window function supported on SQL yet?
>> I got the error message when trying to us
Hi All,
Do you know if window function supported on SQL yet?
I got the error message when trying to use group function in SQL.
My query below:
val query = "SELECT nd_key, concept_rank, event_timestamp FROM
"+streamName + " GROUP BY TUMBLE(event_timestamp, INTERVAL '1' HOUR),
nd_key"
Error M
13 matches
Mail list logo