date:20180418

Re: Efficiency with different approaches of aggregation in Flink

2018-04-18 Thread Puneet Kinra

Hi Teena If you are proceeding with point 3, no doubt it will add some overhead but major significance is that you are persisting the state as per some key. so there will not be data loss in case of the job failure. On Thu, Apr 19, 2018 at 11:45 AM, Teena Kappen // BPRISE < teena.kap...@bprise.

Efficiency with different approaches of aggregation in Flink

2018-04-18 Thread Teena Kappen // BPRISE

Hi, If I have to aggregate a value in a stream of records, which one of the below approaches will be the most/least efficient? 1. Using a Global Window to aggregate the value and emit the record when it reaches a particular threshold value. 2. Using a FlatMap with a State Variable which

Confusing debug level log output with Flink 1.5

2018-04-18 Thread Ken Krugler

Hi Till, I just saw https://issues.apache.org/jira/browse/FLINK-9215 I’ve been trying out 1.5, and noticed similar output in my logs, e.g. 18/04/18 17:33:47 DEBUG slotpool.SlotPool:751 - Releasing slot with slot request id 2c9fba45bd28940c4650

Re: Help with OneInputStreamOperatorTestHarness

2018-04-18 Thread Chris Schneider

Hi Ted, I should have written that we’re using Flink 1.4.0. Thanks for the suggestion re: FLINK-8268 ; it could well be the issue (though the pull request appears fairly complex so I’ll need som

Help with OneInputStreamOperatorTestHarness

2018-04-18 Thread Chris Schneider

Hi Gang, I’m having trouble getting my streaming unit test to work. The following code: @Test public void testDemo() throws Throwable { OneInputStreamOperatorTestHarness testHarness = new KeyedOneInputStreamOperatorTestHarness( new StreamFlatMap<>(new

debug for Flink

2018-04-18 Thread Qian Ye

Hi I’m wondering if new debugging methods/tools are urgent for Flink development. I know there already exists some debug methods for Flink, e.g., remote debugging of flink clusters(https://cwiki.apache.org/confluence/display/FLINK/Remote+Debugging+of+Flink+Clusters

Re: FlinkML

2018-04-18 Thread Christophe Salperwyck

Hi, You could try to plug MOA/Weka library too. I did some preliminary work with that: https://moa.cms.waikato.ac.nz/moa-with-apache-flink/ but then it is not anymore FlinkML algorithms. Best regards, Christophe 2018-04-18 21:13 GMT+02:00 shashank734 : > There are no active discussions or gui

Re: why doesn't the over-window-aggregation sort the element(considering watermark) before processing?

2018-04-18 Thread Yan Zhou [FDS Science]

nvm, I figure it out. The event is not process once it's arrived. It's registered to processed in event time. It make sense. best Yan From: Yan Zhou [FDS Science] Sent: Wednesday, April 18, 2018 12:56:58 PM To: Fabian Hueske Cc: user Subject: Re: why doesn't t

Re: why doesn't the over-window-aggregation sort the element(considering watermark) before processing?

2018-04-18 Thread Yan Zhou [FDS Science]

Hi Fabian, Thanks for the reply. I think here is the problem. Currently, the timestamp of an event is compared with previous processed element's timestamp, instead of watermark, to determine if it's late. To my understanding, even the order of emitted event in preceding operator is perfec

Re: FlinkML

2018-04-18 Thread shashank734

There are no active discussions or guide on that. But I found this example in the repo : https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/ml/IncrementalLearningSkeleton.java

Re: FlinkML

2018-04-18 Thread Christophe Jolif

Szymon, The short answer is no. See: http://mail-archives.apache.org/mod_mbox/flink-user/201802.mbox/%3ccaadrtt39ciiec1uzwthzgnbkjxs-_h5yfzowhzph_zbidux...@mail.gmail.com%3E On Mon, Apr 16, 2018 at 11:25 PM, Szymon Szczypiński wrote: > Hi, > > i wonder if there are possibility to build FlinkML

Re: Tracking deserialization errors

2018-04-18 Thread Elias Levy

Either proposal would work. In the later case, at a minimum we'd need a way to identify the source within the metric. The basic error metric would then allow us to go into the logs to determine the cause of the error, as we already record the message causing trouble in the log. On Mon, Apr 16,

Re: Substasks - Uneven allocation

2018-04-18 Thread Ken Krugler

Hi Pedro, That’s interesting, and something we’d like to be able to control as well. I did a little research, and it seems like (with some stunts) there could be a way to achieve this via CoLocationConstraint

Substasks - Uneven allocation

2018-04-18 Thread PedroMrChaves

Hello, I have a job that has one async operational node (i.e. implements AsyncFunction). This Operational node will spawn multiple threads that perform heavy tasks (cpu bound). I have a Flink Standalone cluster deployed on two machines of 32 cores and 128 gb of RAM, each machine has one task man

Re: State-machine-based search logic in Flink ?

2018-04-18 Thread Fabian Hueske

As I said before, this is work in progress and there is a pending pull request (PR) to add this feature. So no, MATCH_RECOGNIZE is not supported by Flink yet and hence also not documented. Best, Fabian 2018-04-18 10:12 GMT+02:00 Esa Heikkinen : > Hi > > > > I did mean like “finding series of con

Jars uploaded to jobmanager are deleted but not free'ed by OS

2018-04-18 Thread Jeroen Steggink | knowsy

Sorry, I meant the jobmanager, not the taskmanager. On 18-Apr-18 15:44, Jeroen Steggink | knowsy wrote: Hi, I'm having some troubles running the Flink taskmanager in a Docker container (OpenShift). The container's internal storage is filling up because the deleted jar files in blob storage a

Jars uploaded to taskmanager are deleted but not free'ed by OS

2018-04-18 Thread Jeroen Steggink | knowsy

Hi, I'm having some troubles running the Flink taskmanager in a Docker container (OpenShift). The container's internal storage is filling up because the deleted jar files in blob storage are probably still in use and therefore resources are not free'ed. We are using Apache Beam to start an A

Re: Tracking deserialization errors

2018-04-18 Thread Alexander Smirnov

ouch, i forgot to mention I opened https://issues.apache.org/jira/browse/FLINK-9155 to track this. Should it be a duplicate of 9204 then? On Wed, Apr 18, 2018 at 3:32 PM Tzu-Li (Gordon) Tai wrote: > Hi, > > These are valid concerns. And yes, AFAIK users have been writing to logs > within the des

Re: Flink & Kafka multi-node config

2018-04-18 Thread Tzu-Li (Gordon) Tai

Hi, The partition-to-subtask assignment of is not locality aware. There were discussions to expose functionality for custom user-defined assignment methods, which it might be possible to leverage that for a locality aware assignment. Unfortunately, this feature is not implemented, yet. The rrela

Outputting the content of in flight session windows

2018-04-18 Thread jelmer

I defined a session window and I would like to write the contents of the window to storage before the window closes Initially I was doing this by setting a CountTrigger.of(1) on the session window. But this leads to very frequent writes. To remedy this i switched to a ContinuousProcessingTimeTrig

Re: Flink job testing with

2018-04-18 Thread Tzu-Li (Gordon) Tai

Hi, The docs here [1] provide some example snippets of using the Kafka connector to consume from / write to Kafka topics. Once you consumed a `DataStream` from a Kafka topic using the Kafka consumer, you can use Flink transformations such as map, flatMap, etc. to perform processing on the records

Re: Tracking deserialization errors

2018-04-18 Thread Tzu-Li (Gordon) Tai

Hi, These are valid concerns. And yes, AFAIK users have been writing to logs within the deserialization schema to track this. The connectors as of now have no logging themselves in case of a skipped record. I think we can implement both logging and metrics to track this, most of which you have

Re: Managing state migrations with Flink and Avro

2018-04-18 Thread Timo Walther

Thank you. Maybe we already identified the issue (see https://issues.apache.org/jira/browse/FLINK-9202). I will use your code to verify it. Regards, Timo Am 18.04.18 um 14:07 schrieb Petter Arvidsson: Hi Timo, Please find the generated class (for the second schema) attached. Regards, Pette

Re: Managing state migrations with Flink and Avro

2018-04-18 Thread Petter Arvidsson

Hi Timo, Please find the generated class (for the second schema) attached. Regards, Petter On Wed, Apr 18, 2018 at 11:32 AM, Timo Walther wrote: > Hi Petter, > > could you share the source code of the class that Avro generates out of > this schema? > > Thank you. > > Regards, > Timo > > Am 18.

Re: CaseClassSerializer and/or TraversableSerializer may still not be threadsafe?

2018-04-18 Thread Stefan Richter

Hi, I agree that this looks like a serializer is shared between two threads, one of them being the event processing loop. I am doubting that the problem is with the async fs backend, because there is code in place that will duplicate all serializers for the async snapshot thread and this is als

Re: rest.port is reset to 0 by YarnEntrypointUtils

2018-04-18 Thread Dongwon Kim

Hi Gary, Thanks a lot for replay. Hope the issue is resolved soon. I have a suggestion regarding the rest port. Considering the role of dispatcher, it needs to have its own port range that is not shared by job managers spawned by dispatcher. If I understand FLIP-6 correctly, only a few dispatche

Re: rest.port is reset to 0 by YarnEntrypointUtils

2018-04-18 Thread Gary Yao

Hi Dongwon, I think the rationale was to avoid conflicts between multiple Flink instances running on the same YARN cluster. There is a ticket that proposes to allow configuring a port range instead [1]. Best, Gary [1] https://issues.apache.org/jira/browse/FLINK-5758 On Tue, Apr 17, 2018 at 9:56

Re: Managing state migrations with Flink and Avro

2018-04-18 Thread Timo Walther

Hi Petter, could you share the source code of the class that Avro generates out of this schema? Thank you. Regards, Timo Am 18.04.18 um 11:00 schrieb Petter Arvidsson: Hello everyone, I am trying to figure out how to set up Flink with Avro for state management (especially the content of s

Managing state migrations with Flink and Avro

2018-04-18 Thread Petter Arvidsson

Hello everyone, I am trying to figure out how to set up Flink with Avro for state management (especially the content of snapshots) to enable state migrations (e.g. adding a nullable fields to a class). So far, in version 1.4.2, I tried to explicitly provide an instance of "new AvroTypeInfo(Accumul

RE: State-machine-based search logic in Flink ?

2018-04-18 Thread Esa Heikkinen

Hi I did mean like “finding series of consecutive events”, as it was described in [2]. Are these features already in Flink and how well they are documented ? Can I use Scala or only Java ? I would like some example codes, it they are exist ? Best, Esa From: Fabian Hueske Sent: Tuesday, Apri

Re: why doesn't the over-window-aggregation sort the element(considering watermark) before processing?

2018-04-18 Thread Fabian Hueske

The over window operates on an unbounded stream of data. Hence it is not possible to sort the complete stream. Instead we can sort ranges of the stream. Flink uses watermarks to define these ranges. The operator processes the records in timestamp order that are not late, i.e., have timestamps larg

Re: assign time attribute after first window group when using Flink SQL

2018-04-18 Thread Fabian Hueske

This sounds like a windowed join between the raw stream and the aggregated stream. It might be possible to do the "lookup" in the second raw stream with another windowed join. If not, you can fall back to the DataStream API / ProcessFunction and implement the lookup logic as you need it. Best, Fab

Re: Efficiency with different approaches of aggregation in Flink

Efficiency with different approaches of aggregation in Flink

Confusing debug level log output with Flink 1.5

Re: Help with OneInputStreamOperatorTestHarness

Help with OneInputStreamOperatorTestHarness

debug for Flink

Re: FlinkML

Re: why doesn't the over-window-aggregation sort the element(considering watermark) before processing?

Re: why doesn't the over-window-aggregation sort the element(considering watermark) before processing?

Re: FlinkML

Re: FlinkML

Re: Tracking deserialization errors

Re: Substasks - Uneven allocation

Substasks - Uneven allocation

Re: State-machine-based search logic in Flink ?

Jars uploaded to jobmanager are deleted but not free'ed by OS

Jars uploaded to taskmanager are deleted but not free'ed by OS

Re: Tracking deserialization errors

Re: Flink & Kafka multi-node config

Outputting the content of in flight session windows

Re: Flink job testing with

Re: Tracking deserialization errors

Re: Managing state migrations with Flink and Avro

Re: Managing state migrations with Flink and Avro

Re: CaseClassSerializer and/or TraversableSerializer may still not be threadsafe?

Re: rest.port is reset to 0 by YarnEntrypointUtils

Re: rest.port is reset to 0 by YarnEntrypointUtils

Re: Managing state migrations with Flink and Avro

Managing state migrations with Flink and Avro

RE: State-machine-based search logic in Flink ?

Re: why doesn't the over-window-aggregation sort the element(considering watermark) before processing?

Re: assign time attribute after first window group when using Flink SQL

32 matches

Site Navigation

Mail list logo

Footer information