Hi Till,

Thanks for sharing the pointers on the entropy injection feature in 1.11.
We did some investigation, and so far it looks like a bug in edge-case
handling.

Testing environment:
Flink 1.11.2 release with the S3 plugin installed under
plugins/s3-fs-hadoop/flink-s3-fs-hadoop

state.backend.rocksdb.timer-service.factory     rocksdb
state.checkpoints.dir   s3a://balabala/dev/checkpoints/_entropy_/test
state.checkpoints.num-retained  3
s3.entropy.key  _entropy_
s3.entropy.length       1

Observation (vs. 1.9, where the path has no entropy marker):
The checkpoint _metadata path points to a directory that still contains
the entropy marker:
s3a://balabala/dev/checkpoints/_entropy_/xenon/event-stream-splitter/3bec4b7d4ac5c116649a4f579b87628e/chk-1669
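
For reference, here is a rough string-level sketch of the behavior I would
expect from the configuration above. This is not Flink's code: the class
EntropyMarkerSketch and the helper resolve() are made up for illustration,
and the entropy value "a" is just an example of one generated character
(s3.entropy.length is 1). The metadata path should have the marker removed,
while data file paths get the marker replaced by entropy.

```java
public class EntropyMarkerSketch {

    // Hypothetical helper, not a Flink API.
    // entropy == null: strip the marker (what the _metadata path should get).
    // entropy != null: replace the marker with it (what data files should get).
    static String resolve(String path, String key, String entropy) {
        int idx = path.indexOf(key);
        if (idx < 0) {
            return path; // no marker in this path, nothing to do
        }
        String prefix = path.substring(0, idx);
        String suffix = path.substring(idx + key.length());
        if (entropy == null) {
            // Drop the slash that followed the marker so no "//" is left behind.
            if (suffix.startsWith("/")) {
                suffix = suffix.substring(1);
            }
            return prefix + suffix;
        }
        return prefix + entropy + suffix;
    }

    public static void main(String[] args) {
        String dir = "s3a://balabala/dev/checkpoints/_entropy_/test";
        System.out.println(resolve(dir, "_entropy_", null)); // marker removed
        System.out.println(resolve(dir, "_entropy_", "a"));  // marker replaced
    }
}
```

The path in the observation above matches neither case: the literal marker
is still there.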

Investigation:
I added a LOG.warn to EntropyInjector#removeEntropyMarkerIfPresent, built
a new binary, and copied the dist jar to the lib/ folder of the CLI
frontend on the dev cluster gateway. After rerunning the same job, the
warning logs showed that
final EntropyInjectingFileSystem efs = getEntropyFs(fs);
returns null.

After adding more logging inside getEntropyFs, I found that the fs passed
in from removeEntropyMarkerIfPresent is an instance of
org.apache.flink.core.fs.PluginFileSystemFactory$ClassLoaderFixingFileSystem

So I suspect there is a bug in the filesystem type check, and I applied a
small fix there:
patch.diff
<http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/file/t342/patch.diff>
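
To illustrate the kind of failure I mean, here is a minimal self-contained
sketch. It is not the content of patch.diff and it does not use Flink's
classes: every type below is a hypothetical stand-in, named only to mirror
the roles involved. It shows why a plain instanceof check returns null once
the entropy-capable filesystem is hidden behind a wrapper, and one possible
way to unwrap before checking.

```java
public class EntropyUnwrapSketch {

    // Stand-ins for the real types; assumptions for illustration only.
    interface FileSystem {}

    interface EntropyInjectingFileSystem extends FileSystem {
        String getEntropyInjectionKey();
    }

    // Plays the role of the S3 filesystem that supports entropy injection.
    static class S3LikeFileSystem implements EntropyInjectingFileSystem {
        public String getEntropyInjectionKey() {
            return "_entropy_";
        }
    }

    // Plays the role of the plugin wrapper: a FileSystem, but NOT an
    // EntropyInjectingFileSystem, even though its inner filesystem is one.
    static class WrapperFileSystem implements FileSystem {
        final FileSystem inner;

        WrapperFileSystem(FileSystem inner) {
            this.inner = inner;
        }
    }

    // Type check without unwrapping: misses a wrapped filesystem.
    static EntropyInjectingFileSystem directLookup(FileSystem fs) {
        return (fs instanceof EntropyInjectingFileSystem)
                ? (EntropyInjectingFileSystem) fs
                : null;
    }

    // Peel off wrappers first, then apply the same type check.
    static EntropyInjectingFileSystem unwrappingLookup(FileSystem fs) {
        while (fs instanceof WrapperFileSystem) {
            fs = ((WrapperFileSystem) fs).inner;
        }
        return directLookup(fs);
    }

    public static void main(String[] args) {
        FileSystem wrapped = new WrapperFileSystem(new S3LikeFileSystem());
        System.out.println(directLookup(wrapped) == null);     // true
        System.out.println(unwrappingLookup(wrapped) != null); // true
    }
}
```

With the direct check, the caller sees null and treats the filesystem as
one without entropy support, so the marker is left in the path.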
  

After repeating the steps above and running the same job, the entropy
marker appears to be removed correctly:
s3a://balabala/dev/checkpoints/xenon/event-stream-splitter/26081a69c330522e0b3f4fceb852401e/chk-27

Can we loop in someone to take a look at this patch?


Chen