Hi Yu,

I’ve set `fs.default-scheme` to hdfs, mainly to simplify the 
checkpoint / savepoint / HA paths.

I left the RocksDB local dir unset, so local snapshots still go to the 
YARN local cache dirs.
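
A rough sketch of what such a setup could look like in flink-conf.yaml is below. The host, port and directory names are placeholders rather than the actual values, and the scheme-less paths only illustrate the "simplified paths" idea described above:

```yaml
# Illustrative sketch only: host, port and directories are placeholders.
fs.default-scheme: hdfs://namenode:8020

# Paths written without a scheme, relying on fs.default-scheme
# (assumption: this is what "simplifying" the paths refers to).
state.checkpoints.dir: /flink/checkpoints
state.savepoints.dir: /flink/savepoints
high-availability.storageDir: /flink/ha

# state.backend.rocksdb.localdir is deliberately not set, so the RocksDB
# local files are expected to stay in the YARN local cache dirs.
```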

Hope that answers your question.

Best,
Paul Lam

> On 28 Mar 2019, at 15:34, Yu Li <l...@apache.org> wrote:
> 
> Hi Paul,
> 
> Regarding "mistakenly uses the default filesystem scheme, which is specified 
> to hdfs in the new cluster in my case", could you further clarify the 
> configuration property and value you're using? Do you mean you're using an 
> HDFS directory to store the local snapshot data? Thanks.
> 
> Best Regards,
> Yu
> 
> 
> On Thu, 28 Mar 2019 at 14:34, Paul Lam <paullin3...@gmail.com> wrote:
> Hi,
> 
> It turns out that under certain circumstances the RocksDB state backend mistakenly 
> uses the default filesystem scheme, which is specified to hdfs in the new 
> cluster in my case.
> 
> I’ve filed a Jira to track this[1]. 
> 
> [1] https://issues.apache.org/jira/browse/FLINK-12042
> 
> Best,
> Paul Lam
> 
>> On 27 Mar 2019, at 19:06, Paul Lam <paullin3...@gmail.com> wrote:
>> 
>> Hi,
>> 
>> I’m using Flink 1.6.4 and recently ran into a weird issue with the RocksDB 
>> state backend. A job that runs fine on one YARN cluster keeps failing on 
>> checkpoints after being migrated to a new one (with almost everything the 
>> same but better machines), and even a clean restart doesn’t help. 
>> 
>> The root cause is an IllegalStateException with no error message. The stack 
>> trace shows that when the RocksDB state backend runs the async part of the 
>> snapshot (runSnapshot), it finds that the local snapshot directory created 
>> earlier by RocksDB (takeSnapshot) does not exist. 
>> 
>> I added more logging to RocksDBKeyedStateBackend (see attachment) and found 
>> that the local snapshot was performed as expected and the .sst files were 
>> written, but when the async task accessed the directory, the whole snapshot 
>> directory was gone. 
>> 
>> What could possibly be the cause? Thanks a lot.
>> 
>> Best,
>> Paul Lam
>> 
>> <rocksdb_illegal_state.log.md>
>> 
> 
