together into one directory, but
> the handles to those directories are sent to the CheckpointCoordinator which
> creates a checkpoint that stores handles to all the states stored in DFS.
>
> Best,
> Aljoscha
>
>> On 4. Jan 2018, at 15:06, Jinhua Luo wrote:
>>
>>
>
>> On 4. Jan 2018, at 14:56, Jinhua Luo wrote:
>>
>> ok, I see.
>>
>> But, as is known, one rocksdb instance occupies one directory, so I am
>> still wondering what the relationship is between the states and rocksdb
>> instances.
>>
>> 2018-01-
; CheckpointCoordinator figures out which handles need to be sent to which
> operators for restoring.
>
> Best,
> Aljoscha
>
>> On 4. Jan 2018, at 14:44, Jinhua Luo wrote:
>>
>> OK, I think I get the point.
>>
>> But another question arises: how t
e-a-java-transaction-manager-that-works-with-postgresql/
> .
>
> Does this help or do you really need async read-modify-update?
>
> Best,
> Stefan
>
>
> Am 03.01.2018 um 15:08 schrieb Jinhua Luo :
>
> No, I mean how to implement exactly-once db commit (given our async i
uctor.
>
> Best,
> Aljoscha
>
>
>> On 4. Jan 2018, at 14:23, Jinhua Luo wrote:
>>
>> I still do not understand the relationship between rocksdb backend and
>> the filesystem (here I refer to any filesystem impl, including local,
>> hdfs, s3).
>>
>
; would fail immediately afterwards.
2018-01-04 21:04 GMT+08:00 Aljoscha Krettek :
> Memory usage should grow linearly with the number of windows you have active
> at any given time, the number of keys and the number of different window
> operations you have.
But the memory usage is still too much, especially when the
incremental a
jects/flink/flink-docs-release-1.4/ops/state/state_backends.html
>
>
>
> Am 1/1/18 um 3:50 PM schrieb Jinhua Luo:
>
>> Hi All,
>>
>> I have two questions:
>>
>> a) are the records/elements themselves checkpointed, or just the
>> record offsets checkpointed?
> aggregate elements of the windows?
>
> Regards,
> Timo
>
>
>
> Am 1/1/18 um 6:00 AM schrieb Jinhua Luo:
>
>> I checked the logs, but no information indicates what happens.
>>
>> In fact, in the same app, there is another stream, but its kafka
>>
You can use non-keyed state, aka operator state, to store such information.
> See here:
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/state.html#using-managed-operator-state
> . It does not require a KeyedStream.
>
> Best,
> Stefan
>
>
>
> 2018-
operators, e.g.
fold, so it normally works on a DataStream rather than a KeyedStream, and
DataStream only supports ListState, right?
2018-01-03 18:43 GMT+08:00 Stefan Richter :
>
>
>> Am 01.01.2018 um 15:22 schrieb Jinhua Luo :
>>
>> 2017-12-08 18:25 GMT+08:00 Stefan Richter :
>>
Hi All,
I have two questions:
a) are the records/elements themselves checkpointed, or just the
record offsets? That is, what data is included in the
checkpoint besides the states?
b) where does flink store the state globally, so that the job manager
could restore it on each task manager?
2017-12-08 18:25 GMT+08:00 Stefan Richter :
> You need to be a bit careful if your sink needs exactly-once semantics. In
> this case things should either be idempotent or the db must support rolling
> back changes between checkpoints, e.g. via transactions. Commits should be
> triggered for conf
ook.
>
> have you tried to do a thread dump?
> How is the GC pause?
> do you see flink restart? check the exception tab in Flink web UI for your
> job.
>
>
>
> On Sun, Dec 31, 2017 at 6:20 AM, Jinhua Luo wrote:
>>
>> I take time to read some source codes ab
but I am
really frustrated about whether flink can fulfill its features just like
the doc said.
Thank you all.
2017-12-29 17:42 GMT+08:00 Jinhua Luo :
> I misuse the key selector. I checked the doc and found it must return
> deterministic key, so using random is wrong, but I still could not
> under
I misuse the key selector. I checked the doc and found it must return
deterministic key, so using random is wrong, but I still could not
understand why it would cause oom.
2017-12-28 21:57 GMT+08:00 Jinhua Luo :
> It's very strange, when I change the key selector to use random key,
>
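Why the random key selector discussed above leads to OOM can be shown in plain Java: Flink keeps one state entry per distinct key, so a deterministic selector maps repeated records onto the same entry while a random one creates a fresh entry for every record. This is a simulation, not the Flink API, and the CSV layout is illustrative:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.function.Function;

// Stand-in for keyed state: one counter per key. A deterministic selector
// keeps the key space bounded by the data; a random selector grows it
// with every record, which is the unbounded state growth behind the OOM.
public class KeySelectorSketch {
    static Map<String, Integer> countByKey(Iterable<String> records,
                                           Function<String, String> keySelector) {
        Map<String, Integer> state = new HashMap<>();
        for (String r : records) {
            state.merge(keySelector.apply(r), 1, Integer::sum);
        }
        return state;
    }

    // Deterministic: key by the first CSV field, same record -> same key.
    static String deterministicKey(String record) {
        return record.split(",")[0];
    }

    // Non-deterministic: violates the KeySelector contract.
    static String randomKey(String record) {
        return UUID.randomUUID().toString();
    }
}
```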
at
org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
Could anybody explain the internals of keyBy()?
2017-12-28 17:33 GMT+08:00 Ufuk Celebi :
> Hey Jinhua,
>
> On Thu, Dec 28, 2017 at 9:57 AM, Jinhua Luo wrote:
>> The keyby() upon the field would generate unique key as the f
?
2017-12-28 17:33 GMT+08:00 Ufuk Celebi :
> Hey Jinhua,
>
> On Thu, Dec 28, 2017 at 9:57 AM, Jinhua Luo wrote:
>> The keyby() upon the field would generate unique key as the field
>> value, so if the number of unique values is huge, flink would have
>> trouble both on c
Hi All,
I need to aggregate some fields of the events; at first I used keyby(),
but I found that flink performs very slowly (it even stops producing
results) because the number of keys is around half a million per minute.
So I use windowAll() instead, and flink works as expected then.
The keyby() upon the fi
Hi All,
If I add a new slave (start a taskmanager on a new host) which is not
included in conf/slaves, I see the logs below printed continuously:
...Trying to register at JobManager...(attempt 147,
timeout: 3 milliseconds)
Is it normal? And does the new slave successfully added in the cluster?
Hi All,
I start an app which contains multiple streams, and I found that some
of the streams get processed very slowly; they use keyBy() and the
number of keys can be up to a million, and I also found all streams
share only one task slot. Is it possible to assign more slots to the
app?
of
> windows for the same time have the same timestamp.
>
> 2017-12-18 11:30 GMT+01:00 Jinhua Luo :
>>
>> Thanks.
>>
>> The keyBy() splits the stream into multiple logical streams; if I do
>> timeWindow(), then how does flink merge all logical windows into one?
>
igned.
> If records are aggregated in a time window, the aggregation results have the
> maximum allowed timestamp of the window. For example a tumbling window of
> size 1 hour that starts at 14:00 emits its results with a timestamp of
> 14:59:59.999.
>
> Best, Fabian
>
> 2017
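Fabian's example can be put in numbers: a time window covers [start, end) and its results carry the window's maximum allowed timestamp, end - 1 ms, so a 1-hour tumbling window starting at 14:00 emits at 14:59:59.999. The start formula below matches Flink's TimeWindow start computation for a zero offset; this is a sketch, not Flink's class:

```java
// Tumbling-window timestamp arithmetic as a plain-Java sketch.
public class WindowTimestampSketch {
    // Start of the tumbling window containing `timestamp` (offset 0).
    static long windowStart(long timestamp, long sizeMillis) {
        return timestamp - (timestamp % sizeMillis);
    }

    // Timestamp attached to the window's emitted results: end - 1 ms.
    static long maxTimestamp(long start, long sizeMillis) {
        return start + sizeMillis - 1;
    }
}
```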
Hi All,
The timestamp assigner is for one type, normally the type from the
source, but after several operators the element type would change and
the elements would be aggregated; if I do timeWindow again, how does
flink extract timestamps from the elements? For example, the fold operators
aggregate 10
Hi All,
The io.netty package included in flink 1.3.2 is 4.0.23, while the
latest lettuce-core (4.4) depends on netty 4.0.35.
If I include netty 4.0.35 in the app jar, it would throw
java.nio.channels.UnresolvedAddressException. It seems the netty
classes are mixed between versions from app jar and
Jinhua Luo :
> If the window contains only one element and no more elements come in,
> then by default (with EventTimeTrigger) the window would be fired by the
> next element if that element advances the watermark past the end
> of the window, correct?
> That is, even if the windo
ow is created when the first
> element arrives".
> Otherwise, you'd have to fire empty windows for all possible keys (in case
> of a window operator on a keyed stream) which is obviously not possible.
>
> 2017-12-12 9:30 GMT+01:00 Jinhua Luo :
>>
>> OK, I see.
>
ss). If a late event arrives, you can update the result and emit an
> update. In this case your downstream operators/systems have to be able to
> deal with updates.
> 3) send the late events to a different channel via side outputs and handle
> them later.
>
>
>
> 2017-12-12 12:14
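Option 3 above can be sketched without the Flink API: an element whose timestamp is more than the allowed lateness behind the current watermark is routed to a side channel instead of the window, which is roughly what sideOutputLateData gives you on a window operator. The per-element check here is a simplification of Flink's per-window lateness test, and all names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Route late elements to a side channel instead of dropping them.
public class LateEventRouterSketch {
    final List<Long> onTime = new ArrayList<>();
    final List<Long> late = new ArrayList<>();   // the "side output"
    private final long allowedLateness;
    private long watermark = Long.MIN_VALUE;

    LateEventRouterSketch(long allowedLateness) {
        this.allowedLateness = allowedLateness;
    }

    void advanceWatermark(long wm) {
        watermark = Math.max(watermark, wm);     // watermarks never go back
    }

    void process(long elementTimestamp) {
        if (elementTimestamp + allowedLateness < watermark) {
            late.add(elementTimestamp);          // handle later, out of band
        } else {
            onTime.add(elementTimestamp);        // still joins its window
        }
    }
}
```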
t;
> [1]
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/event_timestamps_watermarks.html
> [2]
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/windows.html#allowed-lateness
> [3]
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/w
Hi All,
The watermark is monotonically increasing in a stream, correct?
Given an extremely out-of-order stream, e.g.
e4(12:04:33) --> e3 (15:00:22) --> e2(12:04:21) --> e1 (12:03:01)
Here e1 appears first, so the watermark starts from 12:03:01, so e3 is an
early event; it would be placed in another wind
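Yes, the watermark is monotonically non-decreasing. A typical generator tracks the maximum timestamp seen so far and subtracts a fixed out-of-orderness bound, which mirrors Flink's BoundedOutOfOrdernessTimestampExtractor; this plain-Java class is a sketch of that logic applied to the e1..e4 example above:

```java
// Watermark = max timestamp seen so far minus a fixed lateness bound,
// so it can only move forward as elements arrive.
public class WatermarkSketch {
    private final long maxOutOfOrderness;
    private long maxSeen = Long.MIN_VALUE;

    WatermarkSketch(long maxOutOfOrderness) {
        this.maxOutOfOrderness = maxOutOfOrderness;
    }

    // Called per element; returns the watermark after this element.
    long onEvent(long timestampMillis) {
        maxSeen = Math.max(maxSeen, timestampMillis);
        return maxSeen - maxOutOfOrderness;
    }
}
```

With the arrival order e1, e2, e3, e4 from above, e3 pushes the watermark up to 15:00:22 minus the bound, so e4 (12:04:33) then arrives behind the watermark and is treated as late; it does not make the watermark go backwards.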
uilt-in session window.
> You can also define custom windows like that.
>
> Best, Fabian
>
> 2017-12-12 7:57 GMT+01:00 Jinhua Luo :
>>
>> Hi All,
>>
>> The document said "a window is created as soon as the first element
>> that should belong to this
Hi All,
Given one stream source which generates 20k events/sec, I need to
aggregate the element count using a sliding window of 1 hour.
The problem is, the window may buffer too many elements (which may
cause a lot of block I/O because of checkpointing?), and in fact it
does not necessary
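Incremental aggregation avoids buffering every element of a 1-hour sliding window: the operator keeps only a running count per window pane. Flink exposes this via AggregateFunction (window(...).aggregate(...) instead of apply()); below is that interface's three-part contract in plain Java, as a sketch rather than the Flink type itself:

```java
// The AggregateFunction contract for a count, in plain Java: the state
// per window is a single long, not the buffered elements.
public class CountAggregateSketch {
    static long createAccumulator()           { return 0L; }
    static long add(long acc, Object element) { return acc + 1; }  // per element
    static long merge(long a, long b)         { return a + b; }    // pane merging
    static long getResult(long acc)           { return acc; }      // on window fire
}
```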
Hi All,
The document said "a window is created as soon as the first element
that should belong to this window arrives, and the window is
completely removed when the time (event or processing time) passes its
end timestamp plus the user-specified allowed lateness (see Allowed
Lateness).".
I am sti
t support rolling
> back changes between checkpoints, e.g. via transactions. Commits should be
> triggered for confirmed checkpoints („notifyCheckpointComplete“).
>
> Your assumptions about the blocking behavior of the non-async sinks is
> correct.
>
> Best,
> Stefan
>
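The pattern Stefan describes can be sketched in plain Java: stage writes per checkpoint and commit them to the database only when notifyCheckpointComplete fires for that checkpoint; anything still staged on failure is rolled back. Names are illustrative; Flink later packaged this idea as TwoPhaseCommitSinkFunction:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Exactly-once sink sketch: pre-commit on snapshot, commit on
// notifyCheckpointComplete, abort staged work on restore.
public class TransactionalSinkSketch {
    private final List<String> current = new ArrayList<>();         // in-flight writes
    private final Map<Long, List<String>> staged = new HashMap<>(); // per-checkpoint txns
    final List<String> committed = new ArrayList<>();               // stands in for the DB

    public void invoke(String record) {
        current.add(record);
    }

    // called when a checkpoint barrier arrives: pre-commit the open txn
    public void snapshotState(long checkpointId) {
        staged.put(checkpointId, new ArrayList<>(current));
        current.clear();
    }

    // called when the checkpoint is globally confirmed: commit
    public void notifyCheckpointComplete(long checkpointId) {
        List<String> txn = staged.remove(checkpointId);
        if (txn != null) {
            committed.addAll(txn);
        }
    }

    // called on restore: uncommitted work must not reach the DB twice
    public void abortPending() {
        staged.clear();
        current.clear();
    }
}
```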
Hi, all.
The invoke method of the sink seems to have no way to do async I/O, e.g. return a Future?
For example, the redis connector uses jedis lib to execute redis
command synchronously:
https://github.com/apache/bahir-flink/blob/master/flink-connector-redis/src/main/java/org/apache/flink/streaming/connecto
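One way around a synchronous invoke() is to issue the write asynchronously, remember the future, and block only at checkpoint time until every pending write is acknowledged. This is a plain-Java sketch; the async "client" here is a stand-in for e.g. lettuce's async Redis commands, and the names are illustrative:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Fire writes asynchronously; drain them when a checkpoint is taken so
// everything before the barrier is durable before the checkpoint completes.
public class AsyncSinkSketch {
    private final List<CompletableFuture<Void>> pending = new ArrayList<>();
    final List<String> written = Collections.synchronizedList(new ArrayList<>());

    // like SinkFunction.invoke, but non-blocking
    void invoke(String record) {
        pending.add(CompletableFuture.runAsync(() -> written.add(record)));
    }

    // block only here, at checkpoint time
    void flushOnCheckpoint() {
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        pending.clear();
    }
}
```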