Hi Yun,

I've copied 77e77928-cb26-4543-bd41-e785fcac49f0 and _metadata to Google Drive:
https://drive.google.com/drive/folders/1J3nwvQupLBT5ZaN_qEmc2y_-MgFz0cLb?usp=sharing

Compression was never enabled (the docs say that RocksDB's incremental 
checkpoints always use snappy compression; I'm not sure whether that has any 
effect on savepoints).

Thanks,
Alexey
________________________________
From: Yun Tang <myas...@live.com>
Sent: Wednesday, March 17, 2021 12:33 AM
To: Alexey Trenikhun <yen...@msn.com>; Tzu-Li (Gordon) Tai 
<tzuli...@apache.org>; user@flink.apache.org <user@flink.apache.org>
Subject: Re: EOFException on attempt to scale up job with RocksDB state backend

Hi Alexey,

Thanks for your quick response. I have checked two different logs and still 
cannot understand why this could happen.

Take 
"wasbs://gsp-st...@gspstatewestus2dev.blob.core.windows.net/gsp/savepoints/savepoint-000000-67de6690143a/77e77928-cb26-4543-bd41-e785fcac49f0"
 for example: the key group range offset was intersected correctly during 
rescale for task "Intake voice calls (6/7)". The only thing I can suspect is 
whether Azure Blob Storage worked as expected when seeking to the offset [1].

Have you ever enabled snappy compression [2] [3] for savepoints?
Could you also share the file 
"wasbs://gsp-st...@gspstatewestus2dev.blob.core.windows.net/gsp/savepoints/savepoint-000000-67de6690143a/77e77928-cb26-4543-bd41-e785fcac49f0"
so that I can seek in it locally and see whether it works as expected?
Moreover, could you also share the savepoint metadata file 
"wasbs://gsp-st...@gspstatewestus2dev.blob.core.windows.net/gsp/savepoints/savepoint-000000-67de6690143a/_metadata"?
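
A local seek check of the kind described above could be sketched roughly as 
follows. This is only an illustration under stated assumptions, not Flink's 
restore code: the helper name read_at_offset, the offset and the length are 
all hypothetical, and the real key group offsets would have to be taken from 
the savepoint's _metadata.

```python
import os
import tempfile


def read_at_offset(path, offset, length):
    """Seek to `offset` in a local file and read exactly `length` bytes.

    Hypothetical helper: raises EOFError if the offset lies outside the file
    or the file ends before `length` bytes are available, which is the local
    analogue of an EOFException during restore.
    """
    size = os.path.getsize(path)
    if offset < 0 or offset > size:
        raise EOFError(f"offset {offset} is outside a file of {size} bytes")
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(length)
    if len(data) < length:
        raise EOFError(
            f"wanted {length} bytes at offset {offset}, got {len(data)}"
        )
    return data


if __name__ == "__main__":
    # Stand-in for a downloaded state file; the real file and offsets
    # would come from the savepoint directory and its _metadata.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(range(10)))
        sample_path = tmp.name
    print(read_at_offset(sample_path, 4, 3))
    os.remove(sample_path)
```

If the same offsets can be read from a local copy of the file but fail against 
the wasbs:// path, that would point towards the storage layer rather than the 
recorded offsets.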


[1] 
https://github.com/apache/flink/blob/dc404e2538fdfbc98b9c565951f30f922bf7cedd/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/restore/RocksDBFullRestoreOperation.java#L211
[2] 
https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/large_state_tuning.html#compression
[3] 
https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html#execution-checkpointing-snapshot-compression

Best
Yun Tang
________________________________
From: Alexey Trenikhun <yen...@msn.com>
Sent: Wednesday, March 17, 2021 14:25
To: Yun Tang <myas...@live.com>; Tzu-Li (Gordon) Tai <tzuli...@apache.org>; 
user@flink.apache.org <user@flink.apache.org>
Subject: Re: EOFException on attempt to scale up job with RocksDB state backend

Attached.

________________________________
From: Yun Tang <myas...@live.com>
Sent: Tuesday, March 16, 2021 11:13 PM
To: Alexey Trenikhun <yen...@msn.com>; Tzu-Li (Gordon) Tai 
<tzuli...@apache.org>; user@flink.apache.org <user@flink.apache.org>
Subject: Re: EOFException on attempt to scale up job with RocksDB state backend

Hi Alexey,

Thanks for your reply. Could you also share the logs from a normal restore, as 
I asked in the previous thread, so that I can compare them?

Best
Yun Tang
________________________________
From: Alexey Trenikhun <yen...@msn.com>
Sent: Wednesday, March 17, 2021 13:55
To: Yun Tang <myas...@live.com>; Tzu-Li (Gordon) Tai <tzuli...@apache.org>; 
user@flink.apache.org <user@flink.apache.org>
Subject: Re: EOFException on attempt to scale up job with RocksDB state backend

Hi Yun,
I'm attaching a shorter version of the log; it looks like the full version 
didn't come through.

Thanks,
Alexey
________________________________
From: Yun Tang <myas...@live.com>
Sent: Tuesday, March 16, 2021 8:05 PM
To: Alexey Trenikhun <yen...@msn.com>; Tzu-Li (Gordon) Tai 
<tzuli...@apache.org>; user@flink.apache.org <user@flink.apache.org>
Subject: Re: EOFException on attempt to scale up job with RocksDB state backend

Hi Alexey,

Judging by the line numbers of the method calls, I believe your exception 
messages were printed by Flink-1.12.2, not Flink-1.12.1.

Could you share the exception message from Flink-1.12.1 when rescaling? 
Moreover, I hope you could share more logs from restoring and rescaling. I 
want to see the details of the key group handle [1].

[1] 
https://github.com/apache/flink/blob/dc404e2538fdfbc98b9c565951f30f922bf7cedd/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/restore/RocksDBFullRestoreOperation.java#L153

Best
________________________________
From: Alexey Trenikhun <yen...@msn.com>
Sent: Tuesday, March 16, 2021 15:10
To: Yun Tang <myas...@live.com>; Tzu-Li (Gordon) Tai <tzuli...@apache.org>; 
user@flink.apache.org <user@flink.apache.org>
Subject: Re: EOFException on attempt to scale up job with RocksDB state backend

Also, restoring from the same savepoint without changing the parallelism works 
fine.

________________________________
From: Alexey Trenikhun <yen...@msn.com>
Sent: Monday, March 15, 2021 9:51 PM
To: Yun Tang <myas...@live.com>; Tzu-Li (Gordon) Tai <tzuli...@apache.org>; 
user@flink.apache.org <user@flink.apache.org>
Subject: Re: EOFException on attempt to scale up job with RocksDB state backend

No, I believe the original exception was from 1.12.1 to 1.12.1.

Thanks,
Alexey

________________________________
From: Yun Tang <myas...@live.com>
Sent: Monday, March 15, 2021 8:07:07 PM
To: Alexey Trenikhun <yen...@msn.com>; Tzu-Li (Gordon) Tai 
<tzuli...@apache.org>; user@flink.apache.org <user@flink.apache.org>
Subject: Re: EOFException on attempt to scale up job with RocksDB state backend

Hi,

Can you scale the job on the same version, from 1.12.1 to 1.12.1?

Best
Yun Tang

________________________________
From: Alexey Trenikhun <yen...@msn.com>
Sent: Tuesday, March 16, 2021 4:46
To: Tzu-Li (Gordon) Tai <tzuli...@apache.org>; user@flink.apache.org 
<user@flink.apache.org>
Subject: Re: EOFException on attempt to scale up job with RocksDB state backend

The savepoint was taken with 1.12.1. I've tried to scale up using both the 
same version and 1.12.2.

________________________________
From: Tzu-Li (Gordon) Tai <tzuli...@apache.org>
Sent: Monday, March 15, 2021 12:06 AM
To: user@flink.apache.org <user@flink.apache.org>
Subject: Re: EOFException on attempt to scale up job with RocksDB state backend

Hi,

Could you provide info on the Flink version used?

Cheers,
Gordon



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
