EOFException on attempt to scale up job with RocksDB state backend

2021-03-10 Thread Alexey Trenikhun
Hello, I was trying to scale the job up: I took a savepoint, changed the parallelism setting from 6 to 8, and started the job from the savepoint: switched from RUNNING to FAILED on 10.204.2.98:6122-2946e1 @ gsp-tm-0.gsp-headless.gsp.svc.cluster.local (dataPort=41409). java.lang.Exception: Exception while creating S
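For readers unfamiliar with what a parallelism change implies on restore: Flink splits keyed state into key groups and gives each subtask a contiguous range of them, so going from 6 to 8 subtasks means each new subtask reads a different slice than any old subtask wrote. Below is a minimal sketch of that redistribution. It assumes a max parallelism of 128 (the thread does not state the job's value) and a range formula modeled on Flink's KeyGroupRangeAssignment; the Python function names are hypothetical.

```python
def key_group_range(max_parallelism, parallelism, subtask_index):
    """Inclusive (start, end) key-group range assigned to one subtask."""
    start = (subtask_index * max_parallelism + parallelism - 1) // parallelism
    end = ((subtask_index + 1) * max_parallelism - 1) // parallelism
    return start, end


def all_ranges(max_parallelism, parallelism):
    """Key-group ranges for every subtask at the given parallelism."""
    return [key_group_range(max_parallelism, parallelism, i)
            for i in range(parallelism)]


MAX_PARALLELISM = 128  # assumption: not stated anywhere in the thread

old = all_ranges(MAX_PARALLELISM, 6)  # layout when the savepoint was taken
new = all_ranges(MAX_PARALLELISM, 8)  # layout after scaling up

for i, (start, end) in enumerate(new):
    # Old subtasks whose range overlaps the new subtask's range.
    sources = [j for j, (s, e) in enumerate(old) if s <= end and start <= e]
    print(f"new subtask {i}: key groups {start}-{end}, "
          f"read from old subtasks {sources}")
```

Under these assumed values every new subtask needs only part of the state written by one or two old subtasks, whereas with unchanged parallelism each subtask restores exactly the range it wrote.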

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-15 Thread Tzu-Li (Gordon) Tai
Hi, Could you provide info on the Flink version used? Cheers, Gordon

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-15 Thread Alexey Trenikhun
The savepoint was taken with 1.12.1; I've tried to scale up using the same version and 1.12.2.

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-15 Thread Yun Tang

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-15 Thread Alexey Trenikhun

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-16 Thread Alexey Trenikhun

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-16 Thread Yun Tang
On Tuesday, March 16, 2021 15:10, Alexey Trenikhun wrote: Also, restoring from the same savepoint without a change in parallelism works fine.

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-16 Thread Yun Tang
Hi Yun, I'm attaching a shorter version of the log; it looks like the full version didn't come through. Thanks, Alexey

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-17 Thread Yun Tang
Attached.

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-17 Thread Alexey Trenikhun
, not sure whether it has an effect on the savepoint or not) Thanks, Alexey

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-17 Thread Yun Tang
Yun Tang. On Thursday, March 18, 2021 0:45, Alexey Trenikhun wrote: Hi Yun, I've copied 77e77928-cb26-4543

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-17 Thread Yun Tang
Hi Yun, the Azure web UI shows the size of all files created by Flink as 128 MiB * X (128, 256, 640); see the attached screenshot. In my understanding, this is because Flink creates them as Page Blob
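To make the observation concrete: sizes that are whole multiples of 128 MiB are what one would expect if the stored length is rounded up to a fixed allocation step instead of reflecting the bytes actually written. The sketch below assumes round-up in 128 MiB steps, which is inferred only from the sizes quoted above and is not confirmed behaviour of the Azure filesystem connector; the function name and the sample byte counts are hypothetical.

```python
MIB = 1024 * 1024
STEP = 128 * MIB  # assumption: allocation step inferred from the reported sizes


def reported_size(bytes_written, step=STEP):
    """Round a byte count up to the next multiple of `step`."""
    if bytes_written <= 0:
        return 0
    return -(-bytes_written // step) * step  # ceiling division, then scale


# Hypothetical amounts of real data and the size a UI would then show.
for written in (1, 100 * MIB, 129 * MIB, 600 * MIB):
    shown = reported_size(written)
    print(f"{written:>12} bytes written -> {shown // MIB} MiB shown "
          f"({shown - written} bytes of padding)")
```

If this assumption holds, the length reported by the storage layer can exceed the number of bytes Flink wrote, so the displayed size alone says little about how much state a file contains.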

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-17 Thread Alexey Trenikhun
ed in [1] or without compaction? Thanks, Alexey

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-18 Thread Yun Tang
Hi Yun, how does the underlying storage explain the fact that I can restore from the savepoint without rescaling? Does Flink write a file once or many times? If many times, then potentially

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-18 Thread Alexey Trenikhun
Hi Alexey, Flink writes each checkpointed file only once. Could you try to write checkpointed files in block blob format

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-24 Thread Alexey Trenikhun
Hi Yun, I've changed the configuration to use block blobs; however, due to another issue [1], I can't take a savepoint. I hope eventually the job will be able to proces