Error while trigger checkpoint due to Kyro Exception

2018-08-17 Thread Bruce Qiu
Hi Community, I am using Flink 1.4.2 to do streaming processing. I fetch data from Kafka and write the parquet file to HDFS. In the previous environment, the Kafka had 192 partitions and I set the source parallelism to 192, the application works fine. But recently we had increased the Kafka

Re: Operator metrics do not get unregistered after job finishes

2018-08-17 Thread vino yang
Hi Helmut, Is the metrics of all the sub task instances of a job not unregistered, or part of it is not unregistered. Is there any exception log information available? Please feel free to create a JIRA issue and clearly describe your problem. Thanks, vino. Helmut Zechmann 于2018年8月17日周五

Re: processWindowFunction

2018-08-17 Thread vino yang
Hi antonio, Yes, ProcessWindowFunction is a very low level window function. It allows you to access the data in the window and allows you to customize the output of the window. So if you use it, while giving you flexibility, you need to think about other things, which may require you to write

Error in KyroSerializer

2018-08-17 Thread Pankaj Chaudhary
Hi, I am on Flink 1.4.2 and as part of my operator logic (i.e. RichFlatMapFunction) I am collecting the values in the Collector object. But I am getting an error stating “Caused by: org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next

Re: Override CaseClassSerializer with custom serializer

2018-08-17 Thread Timo Walther
Hi Gerard, you are correct, Kryo serializers are only used when no built-in Flink serializer is available. Actually, the tuple and case class serializers are one of the most performant serializers in Flink (due to their fixed length, no null support). If you really want to reduce the

Re: Flink not rolling log files

2018-08-17 Thread Dominik Wosiński
I am using this *log4j.properties *file config for rolling files once per day and it is working perfectly. Maybe this will give You some hint: log4j.appender.file=org.apache.log4j.DailyRollingFileAppender log4j.appender.file.file=${log.file} log4j.appender.file.append=false

Re: Flink Jobmanager Failover in HA mode

2018-08-17 Thread Dominik Wosiński
I have faced this issue, but in 1.4.0 IIRC. This seems to be related to https://issues.apache.org/jira/browse/FLINK-10011. What was the status of the jobs when the main Job Manager has been stopped ? 2018-08-17 17:08 GMT+02:00 Helmut Zechmann : > Hi all, > > we have a problem with flink 1.5.2

Override CaseClassSerializer with custom serializer

2018-08-17 Thread gerardg
Hello, I can't seem to be able to override the CaseClassSerializer with my custom serializer. I'm using env.getConfig.addDefaultKryoSerializer() to add the custom serializer but I don't see it being used. I guess it is because it only uses Kryo based serializers if it can't find a Flink

Re: Flink not rolling log files

2018-08-17 Thread Gary Yao
Hello Navneet Kumar Pandey, org.apache.log4j.rolling.RollingFileAppender is part of Apache Extras Companion for Apache log4j [1]. Is that library in your classpath? Are there hints in taskmanager.err? Can you run: cat /usr/lib/flink/conf/log4j.properties on the EMR master node and show

Operator metrics do not get unregistered after job finishes

2018-08-17 Thread Helmut Zechmann
Hi all, we are using flink 1.5.2 in batch mode with prometheus monitoring. We noticed that a few metrics do not get unregistered after a job is finished: flink_taskmanager_job_task_operator_numRecordsIn flink_taskmanager_job_task_operator_numRecordsInPerSecond

Re: Looking for flink code example using flink-jpmml library over DataStream

2018-08-17 Thread sagar loke
Hi Hequn, Thanks for pointing that out. We were wondering if there is anything else other than these examples, that would help. Thanks, On Fri, Aug 17, 2018 at 5:33 AM Hequn Cheng wrote: > Hi sagar, > > There are some examples in flink-jpmml git library[1], for example[2]. > Hope it helps. >

Flink Jobmanager Failover in HA mode

2018-08-17 Thread Helmut Zechmann
Hi all, we have a problem with flink 1.5.2 high availability in standalone mode. We have two jobmanagers running. When I shut down the main job manager, the failover job manager encounters an error during failover. Logs: 2018-08-17 14:38:16,478 WARN akka.remote.ReliableDeliverySupervisor

Flink not rolling log files

2018-08-17 Thread Navneet Kumar Pandey
I am using Flink in EMR with following configuration. { "Classification": "flink-log4j", "Properties": { "log4j.logger.no":"DEBUG", "log4j.appender.file":"org.apache.log4j.rolling.RollingFileAppender",

Data loss when restoring from savepoint

2018-08-17 Thread Juho Autio
Some data is silently lost on my Flink stream job when state is restored from a savepoint. Do you have any debugging hints to find out where exactly the data gets dropped? My job gathers distinct values using a 24-hour window. It doesn't have any custom state management. When I cancel the job

Re: Looking for flink code example using flink-jpmml library over DataStream

2018-08-17 Thread Hequn Cheng
Hi sagar, There are some examples in flink-jpmml git library[1], for example[2]. Hope it helps. Best, Hequn [1] https://github.com/FlinkML/flink-jpmml [2] https://github.com/FlinkML/flink-jpmml/tree/master/flink-jpmml-examples/src/main/scala/io/radicalbit/examples On Fri, Aug 17, 2018 at 10:09

High CPU usage

2018-08-17 Thread Alexander Smirnov
Hello, I noticed CPU utilization went high and took a thread dump on the task manager node. Why would RocksDBMapState.entries() / seek0 call consumes CPU? It is Flink 1.4.2 "Co-Flat Map (3/4)" #16129 prio=5 os_prio=0 tid=0x7fefac029000 nid=0x338f runnable [0x7feed2002000]

ProcessingTimeSessionWindows and many other windowing pieces are built around Object

2018-08-17 Thread Andrew Roberts
I’m exploring moving some “manual” state management into Flink-managed state via Flink’s windowing paradigms, and I’m running into the surprise that many pieces of the windowing architecture require the stream be upcast to Object (AnyRef in scala). Is there a technical reason for this? I’m

Re: Initial value of ValueStates of primitive types

2018-08-17 Thread Averell
Thank you Dominik. So there's an implicit conversion, which means that getState().value() would always give a deteministic result (i.e: Boolean value would always be false, Int value would always be 0) I found another funny thing is even with ref type like Integer, there is also that implicit

Re: Initial value of ValueStates of primitive types

2018-08-17 Thread Dominik Wosiński
Hey, After you call, by default values you mean after you call : getRuntimeContext.getState() If so, the default value will be state with *value() *of null, as described in : /** * Returns the current value for the state. When the state is not * partitioned the returned value is the same for

Re: Need a clarification about removing a stateful operator

2018-08-17 Thread Tony Wei
Hi Stefan, Thanks for your detailed explanation. Best, Tony Wei 2018-08-17 15:56 GMT+08:00 Stefan Richter : > Hi, > > it will not be transported. The JM does the state assignment to create the > deployment information for all tasks. If will just exclude the state for > operators that are not

Initial value of ValueStates of primitive types

2018-08-17 Thread Averell
Hi, In Flink's documents, I couldn't find any example that uses primitive type when working with States. What would be the initial value of a ValueState of type Int/Boolean/...? The same question apply for MapValueState like [String, Int] Thanks and regards, Averell -- Sent from:

Re: InvalidTypesException: Type of TypeVariable 'K' in 'class X' could not be determined

2018-08-17 Thread Timo Walther
Hi Miguel, the issue that you are observing is due to Java's type erasure. "new MyClass()" is always erasured to "new MyClass()" by the Java compiler so it is impossible for Flink to extract something. For classes in declarations like class MyClass extends ... {    ... } the compiler adds

Re: Need a clarification about removing a stateful operator

2018-08-17 Thread Stefan Richter
Hi, it will not be transported. The JM does the state assignment to create the deployment information for all tasks. If will just exclude the state for operators that are not present. So in your next checkpoints they will no longer be contained. Best, Stefan > Am 17.08.2018 um 09:26 schrieb

Re: Need a clarification about removing a stateful operator

2018-08-17 Thread Tony Wei
Hi Chesnay, Thanks for your quick reply. I have another question. Will the state, which is ignored, be transported to TMs from DFS? Or will it be detected by JM's checkpoint coordinator and only those states reuired by operators be transported to each TM? Best, Tony Wei 2018-08-17 14:38

Re: OneInputStreamOperatorTestHarness.snapshot doesn't include timers?

2018-08-17 Thread Piotr Nowojski
No problem :) You motivated me to do a fix for that, since I stumbled across this bug/issue myself before and also took me some time in the debugger to find the cause. Piotrek > On 16 Aug 2018, at 20:05, Ken Krugler wrote: > > Hi Piotr, > > Thanks, and darn it that’s something I should have

Re: Need a clarification about removing a stateful operator

2018-08-17 Thread Chesnay Schepler
The state won't exist in the snapshot. On 17.08.2018 04:38, Tony Wei wrote: Hi all, I'm confused about the description in documentation. [1] * *Removing a stateful operator:*The state of the removed operator is lost unless another operator takes it over. When starting the upgraded