stack job on fail over

2019-11-26 Thread Nick Toker
Hi i have a standalone cluster with 3 nodes and rocksdb backend when one task manager fails ( the process is being killed) it takes very long time until the job is totally canceled and a new job is resubmitted i see that all slots on all nodes are being canceled except from the slots of the dead t

Problem loading JDBC driver

2019-11-26 Thread Nicholas Walton
Hi, I have a pipeline which is sinking into an Apache Derby database, but I’m constantly receiving the error java.lang.IllegalArgumentException: JDBC driver class not found. The Scala libraries I’m loading are val flinkDependencies = Seq( "org.apache.flink" %% "flink-scala" % flinkVersion , "o

Re: Pre-process data before it hits the Source

2019-11-26 Thread Felipe Gutierrez
I am afraid that this is not possible in FLink, since the entry point of all transformation is the source function. Everything that we can pre-process is in the source function or on the downstream operators. If you want to pre-process something before the data hits the source you will have to rely

Re: Question about to modify operator state on Flink1.7 with State Processor API

2019-11-26 Thread Kaihao Zhao
Hi Vino/Seth, Thanks Vino and Seth, changing the UID and setting offset manually is a solution, but the pin point is we have tons of applications(owned by other users) running on our platform, so it will be inefficient to do it manually, and the most difficult part is to let users to change their

Re: Pre-process data before it hits the Source

2019-11-26 Thread vino yang
Hi Felipe, Why do you think it's not possible. My thought is we can do the data pre-procession in the source function. If so, source function would contain consume upstream events then do pre-processing then emits to the downstream. Best, Vino Felipe Gutierrez 于2019年11月26日周二 下午4:56写道: > I am

Re: Problem loading JDBC driver

2019-11-26 Thread Caizhi Weng
Hi Nick, The "Test" after "org.apache.derby" % "derby" % "10.15.1.3" seems suspicious. Is that intended? Nicholas Walton 于2019年11月26日周二 下午4:46写道: > Hi, > > *I have a pipeline which is sinking into an Apache Derby database, but I’m > constantly receiving the error* > > java.lang.IllegalArgument

Re: Pre-process data before it hits the Source

2019-11-26 Thread Felipe Gutierrez
Hi Vino, yes, in the source function it is possible. But you said, "before it hits the Source". So, IMO I think it is outside of the flink workflow. Best, Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com

Re: How to custom (or use) a window to specify everyday's beginning as watermark?

2019-11-26 Thread Caizhi Weng
Hi Rock, I think you can write your own trigger which fires when the date of the process time of the current record is different from that of the last record. Pinging @Jark Wu for a more professional answer. Rock 于2019年11月26日周二 下午3:37写道: > I need my job to aggregator every device's mertic as d

Re: stack job on fail over

2019-11-26 Thread Biao Liu
Hi Nick, I guess the reason is your Flink job manager doesn't detect the task manager is lost until heartbeat timeout. You could check the job manager log to verify that. Maybe a more elegant way of shutting down task manager helps, like through "taskmanager.sh stop" or "kill" command without 9 s

Kafka Offset commit failed on partition

2019-11-26 Thread PedroMrChaves
Hello, Since the last update to the universal Kafka connector, I'm getting the following error fairly often. /2019-11-18 15:42:52,689 ERROR org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-10, groupId=srx-consumer-group] Offset commit failed on partit

Re: How to custom (or use) a window to specify everyday's beginning as watermark?

2019-11-26 Thread Jark Wu
Hi Rock, Sorry, I don't fully understand what you want. If you want a tumbling window which covers one day, you can use `KeyedStream#timeWindow(Time.days(1))` which covers from UTC 00:00~24:00. Best, Jark On Tue, 26 Nov 2019 at 17:20, Caizhi Weng wrote: > Hi Rock, > > I think you can write yo

Re: CoGroup SortMerger performance degradation from 1.6.4 - 1.9.1?

2019-11-26 Thread yingjie
The new BlockingSubpartition implementation in 1.9 uses mmap for data reading by default which means it steals memory from OS. The mmaped region memory is managed by JVM, so there should be no OutOfMemory problem reported by JVM and the OS memory is also not exhausted, so there should be no kernal

Re: How to custom (or use) a window to specify everyday's beginning as watermark?

2019-11-26 Thread Biao Liu
Hi Rock, >From my understanding, what you want is a one-day time based window which start at 0 clock. Actually the one-day time-based window (like Jack mentioned) starts at the beginning of day (0:00). You don't need to do anything special. If you are using event time window (since you mentioned

ArrayIndexOutOfBoundException on checkpoint creation

2019-11-26 Thread Theo Diefenthal
Hi, We have a pipeline with a custom ProcessFunction and state (see [1], implemented as suggested by Fabian with a ValueState and ValueState>) The behavior of that function works fine in our unittests and with low load in our test environment (100.000 records per minute). On the production e

Re: Pre-process data before it hits the Source

2019-11-26 Thread vino yang
Hi Felipe, >> But you said, "before it hits the Source". I did not say this. Vijay said it. About this question, he may not think about customizing the source connector. If he does not try to find a solution in the Flink domain. Why he asked Flink questions and pasted Flink program? IMO, It's j

Re: Pre-process data before it hits the Source

2019-11-26 Thread Felipe Gutierrez
ok. I am sorry, I thought that was you that said this. Maybe it is just a matter of expression that made the question confused. But, yes. In the source function something can be done. Not before. *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.co

SQL Performance

2019-11-26 Thread Nicholas Walton
I’m streaming records down to an Embedded Derby database, at a rate of around 200 records per second. I’m certain Derby can sustain a higher throughput than that, if I could buffer the records but it seems that I’m writing each record as soon as it arrives and as a single transaction which is in

Re: CoGroup SortMerger performance degradation from 1.6.4 - 1.9.1?

2019-11-26 Thread Piotr Nowojski
Thanks for the confirmation, I’ve created Jira ticket to track this issue [1] https://issues.apache.org/jira/browse/FLINK-14952 Piotrek > On 26 Nov 2019, at 11:10, yingjie wrote: > > The new BlockingSubpartition implementation in 1.9 uses mm

Re: SQL for Avro GenericRecords on Parquet

2019-11-26 Thread Peter Huang
Hi Hanan, After investigating the issue by using the test case you provided, I think there is a big in it. Currently, the parquet predicts push down use the predicate literal type to construct the FilterPredicate. The issue happens when the data type of value in predicate inferred from SQL doesn't

Re: Per Operator State Monitoring

2019-11-26 Thread Yu Li
Hi Aaron, I don't think we have such fine grained metrics on per operation state size, but from your description that "YARN kills containers who are exceeding their memory limits", I think the root cause is not the state size but related to the memory consumption of the state backend. My guess is

What happens to a Source's Operator State if it stops being initialized and snapshotted? Accidentally exponential?

2019-11-26 Thread Aaron Levin
Hi, Some context: after a refactoring, we were unable to start our jobs. They started fine and checkpointed fine, but once the job restarted owing to a transient failure, the application was unable to start. The Job Manager was OOM'ing (even when I gave them 256GB of ram!). The `_metadata` file fo

[DISCUSS] Make Managed Memory always off-heap (Adjustment to FLIP-49)

2019-11-26 Thread Stephan Ewen
Hi all! Yesterday, some of the people involved in FLIP-49 had a long discussion about managed memory in Flink. Particularly, the fact that we have managed memory either on heap or off heap and that FLIP-49 introduced having both of these types of memory at the same time. ==> What we want to sugge

Re: SQL Performance

2019-11-26 Thread vino yang
Hi Nick, Can you provide more details? Are you using JDBCOutputFormat? If yes, can `JDBCOutputFormatBuilder#setBatchInterval` help you? Best, Vino Nicholas Walton 于2019年11月26日周二 下午9:20写道: > I’m streaming records down to an Embedded Derby database, at a rate of > around 200 records per second.

Re: How to custom (or use) a window to specify everyday's beginning as watermark?

2019-11-26 Thread Biao Liu
Hi Rock, There is an inaccurate description in last response. I don't think a watermark of 0 clock is needed to get the accurate calculation result. The watermark of 0 clock only helps to generate the result you want immediately. Thanks, Biao /'bɪ.aʊ/ On Tue, 26 Nov 2019 at 18:10, Biao Liu wr

Re: stack job on fail over

2019-11-26 Thread Biao Liu
Hi Nick, Yes, reducing heartbeat timeout is not a perfect solution. It just alleviates the pain a bit. I'm wondering my guess is right or not. Is it caused by heartbeat detection? Does it help with an elegant way of shutting down? Thanks, Biao /'bɪ.aʊ/ On Tue, 26 Nov 2019 at 20:22, Nick Toker

Re: CoGroup SortMerger performance degradation from 1.6.4 - 1.9.1?

2019-11-26 Thread bupt_ljy
Hi, I’ve met the exactly same problem recently and solved it in Piotr’s way. @zhijiang, I didn’t see any oom error thrown by JVM (I’m not sure this can be thrown if yarn decides to kill it in a mandatory way). According to our monitoring system, the overusage of memory is from JVM directy memo

Re: [DISCUSS] Make Managed Memory always off-heap (Adjustment to FLIP-49)

2019-11-26 Thread Jingsong Li
Hi Stephan, +1 to default have off-heap managed memory. >From the perspective of batch, In our long-term performance test and online practice: - There is no significant difference in performance between heap and off-heap memory. If it is a heap object, the JVM has many opportunities to optimize i

Re: SQL Performance

2019-11-26 Thread Jingsong Li
Hi Nicholas, You can take a look to JDBCSinkFunction and JDBCUpsertSinkFunction. They can set something like flush batch max size just like vino mentioned. JDBCUpsertOutputFormat also has a async thread to flush batch to JDBC to avoid too high delay. In this way, the throughput can be improved wh

Re: CoGroup SortMerger performance degradation from 1.6.4 - 1.9.1?

2019-11-26 Thread Piotr Nowojski
Hi, > @yingjie Do you have any idea how much memory will be stolen from OS when > using mmap for data reading? I think this is bounded only by the size of the written data. Also it will not be “stolen from OS”, as kernel is controlling the amount of pages residing currently in the RAM depend

Re: ***UNCHECKED*** Error while confirming Checkpoint

2019-11-26 Thread Tony Wei
Hi, I want to raise this question again, since I have had this exception on my production job. The exception is as follows > 2019-11-27 14:47:29 java.lang.RuntimeException: Error while confirming checkpoint > at org.apache.flink.runtime.taskmanager.Task$2.run(Task.java:1205) > at jav