[jira] [Updated] (FLINK-5908) Blob Cache can (rarely) get corrupted on failed blob downloads
[ https://issues.apache.org/jira/browse/FLINK-5908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nico Kruber updated FLINK-5908:
-------------------------------
    Component/s: Network

> Blob Cache can (rarely) get corrupted on failed blob downloads
> --------------------------------------------------------------
>
>                 Key: FLINK-5908
>                 URL: https://issues.apache.org/jira/browse/FLINK-5908
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination, Network
>    Affects Versions: 1.2.0, 1.3.0, 1.4.0
>            Reporter: Stephan Ewen
>
> The Blob Cache downloads files directly to the target file location.
> While it tries to clean up failed attempts, there is a chance that this
> cleanup does not complete.
> In that case, a corrupt file is left at the target location. The blob cache
> then assumes that it already has the file cached, and future requests
> do not attempt to re-download the file.
> The fix would be to download to a temp file name, validate the integrity, and
> rename to the target file path when the validation succeeds.
> For content-addressable blobs, the validation could even include verifying
> the checksum hash.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
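The fix proposed in the description (download to a temp file, validate, then rename) could look roughly like the sketch below. This is not Flink's BlobCache code: the class `SafeBlobFetch`, the method `fetch`, and the choice of SHA-1 as the checksum are assumptions made only for illustration. It relies on standard `java.nio.file` and `java.security` calls and assumes the caller already knows the expected hash of the blob.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper (not part of Flink): stage, validate, then rename. */
final class SafeBlobFetch {

    private SafeBlobFetch() {}

    /**
     * Copies {@code source} into a temp file next to {@code target}, compares the
     * SHA-1 of the received bytes with {@code expectedSha1}, and only then moves
     * the temp file onto {@code target}. Closes {@code source} when done.
     */
    static void fetch(InputStream source, Path target, byte[] expectedSha1) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Files.createDirectories(dir);
        // Same directory as the target, so the final rename stays on one file system.
        Path temp = Files.createTempFile(dir, "incoming-", ".tmp");
        try {
            MessageDigest sha1 = newSha1();
            try (InputStream in = new DigestInputStream(source, sha1)) {
                // The digest is updated as a side effect of the copy.
                Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
            }
            if (!MessageDigest.isEqual(sha1.digest(), expectedSha1)) {
                throw new IOException("Checksum mismatch for " + target.getFileName());
            }
            // The target path only ever receives complete, validated content.
            // Whether an existing target is replaced is platform-specific.
            Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; removes the partial file otherwise.
            Files.deleteIfExists(temp);
        }
    }

    private static MessageDigest newSha1() {
        try {
            return MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 should be available on every Java platform", e);
        }
    }
}
```

With this shape, a failed or interrupted download can at worst leave a stray `incoming-*.tmp` file behind. The cache's "file exists at the target path" check then never sees partial content, which is the failure mode the issue describes.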
[jira] [Updated] (FLINK-5908) Blob Cache can (rarely) get corrupted on failed blob downloads
[ https://issues.apache.org/jira/browse/FLINK-5908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nico Kruber updated FLINK-5908:
-------------------------------
    Affects Version/s: 1.4.0
                       1.3.0

> Blob Cache can (rarely) get corrupted on failed blob downloads
> --------------------------------------------------------------
>
>                 Key: FLINK-5908
>                 URL: https://issues.apache.org/jira/browse/FLINK-5908
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination, Network
>    Affects Versions: 1.2.0, 1.3.0, 1.4.0
>            Reporter: Stephan Ewen
[jira] [Updated] (FLINK-5908) Blob Cache can (rarely) get corrupted on failed blob downloads
[ https://issues.apache.org/jira/browse/FLINK-5908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nico Kruber updated FLINK-5908:
-------------------------------
    Fix Version/s:     (was: 1.4.0)
                       (was: 1.2.2)

> Blob Cache can (rarely) get corrupted on failed blob downloads
> --------------------------------------------------------------
>
>                 Key: FLINK-5908
>                 URL: https://issues.apache.org/jira/browse/FLINK-5908
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.2.0
>            Reporter: Stephan Ewen
[jira] [Updated] (FLINK-5908) Blob Cache can (rarely) get corrupted on failed blob downloads
[ https://issues.apache.org/jira/browse/FLINK-5908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Metzger updated FLINK-5908:
----------------------------------
    Fix Version/s:     (was: 1.3.0)
                   1.4.0

> Blob Cache can (rarely) get corrupted on failed blob downloads
> --------------------------------------------------------------
>
>                 Key: FLINK-5908
>                 URL: https://issues.apache.org/jira/browse/FLINK-5908
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.2.0
>            Reporter: Stephan Ewen
>             Fix For: 1.2.2, 1.4.0
[jira] [Updated] (FLINK-5908) Blob Cache can (rarely) get corrupted on failed blob downloads
[ https://issues.apache.org/jira/browse/FLINK-5908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Metzger updated FLINK-5908:
----------------------------------
    Fix Version/s:     (was: 1.2.1)
                   1.2.2

> Blob Cache can (rarely) get corrupted on failed blob downloads
> --------------------------------------------------------------
>
>                 Key: FLINK-5908
>                 URL: https://issues.apache.org/jira/browse/FLINK-5908
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.2.0
>            Reporter: Stephan Ewen
>             Fix For: 1.3.0, 1.2.2