[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171524#comment-17171524 ]

ming li commented on FLINK-9373:

cc [~aljoscha]

> Fix potential data losing for RocksDBBackend
>
> Key: FLINK-9373
> URL: https://issues.apache.org/jira/browse/FLINK-9373
> Project: Flink
> Issue Type: Bug
> Components: Runtime / State Backends
> Affects Versions: 1.5.0
> Reporter: Sihua Zhou
> Assignee: Sihua Zhou
> Priority: Blocker
> Fix For: 1.5.0, 1.6.0
>
> Currently, when using RocksIterator we only use _iterator.isValid()_ to check whether we have reached the end of the iterator. That is not enough: according to RocksDB's wiki (https://github.com/facebook/rocksdb/wiki/Iterator#error-handling), an internal error may exist even if _iterator.isValid()=true_. A safer way to use the _RocksIterator_ is to always call _iterator.status()_ to check for an internal _RocksDB_ error. A case reported on the user mailing list seems to have lost data because of this: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Missing-MapState-when-Timer-fires-after-restored-state-td20134.html

--
This message was sent by Atlassian Jira (v8.3.4#803005)
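The scan pattern the description asks for can be sketched as follows. This is only an illustration under stated assumptions: `KvIterator`, `FlakyIterator` and `SafeScan` are hypothetical names, not Flink or RocksDB classes, and the sketch models just one variant of the problem, where the iterator stops being valid after an error so that a loop driven only by `isValid()` ends early and silently. In the real RocksDB Java API, `RocksIterator.status()` throws the checked `RocksDBException`; an unchecked exception is used here to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an iterator API; this is NOT org.rocksdb.RocksIterator.
interface KvIterator {
    void seekToFirst();
    boolean isValid();
    void next();
    String key();
    // Assumption: throws if the iterator hit an internal error.
    void status();
}

// Test double: after `failAt` successful positions it reports "not valid",
// which a caller cannot tell apart from a normal end of data.
final class FlakyIterator implements KvIterator {
    private final List<String> keys;
    private final int failAt; // negative means "never fail"
    private int pos;
    private boolean failed;

    FlakyIterator(List<String> keys, int failAt) {
        this.keys = keys;
        this.failAt = failAt;
    }

    @Override public void seekToFirst() { pos = 0; failed = (failAt == 0); }
    @Override public boolean isValid() { return !failed && pos < keys.size(); }
    @Override public void next() { pos++; if (pos == failAt) { failed = true; } }
    @Override public String key() { return keys.get(pos); }
    @Override public void status() {
        if (failed) { throw new IllegalStateException("simulated internal iterator error"); }
    }
}

final class SafeScan {
    // Reads every key, then asks the iterator why the loop ended.
    static List<String> readAll(KvIterator it) {
        List<String> out = new ArrayList<>();
        for (it.seekToFirst(); it.isValid(); it.next()) {
            out.add(it.key());
        }
        it.status(); // without this call, a failed scan looks like a short but complete one
        return out;
    }
}
```

With the final `status()` call removed, a scan over a failing iterator would return a truncated list without any error, which is the kind of silent data loss the issue is about.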
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168554#comment-17168554 ] ming li commented on FLINK-9373: Hi [~sihuazhou]. Recently I was reading the code of flink-statebackend-rocksdb and found that the seekToLast method of org.apache.flink.contrib.streaming.state.RocksIteratorWrapper calls iterator.seekToFirst. I am puzzled why iterator.seekToLast is not called.
{code:java}
@Override
public void seekToLast() {
    iterator.seekToFirst();
    status();
}
{code}
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479462#comment-16479462 ] Sihua Zhou commented on FLINK-9373: [~srichter] FYI, we got a reply from RocksDB on [3558|https://github.com/facebook/rocksdb/issues/3558]. I think we chose the right approach ;), because the status can be reset.
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479035#comment-16479035 ] ASF GitHub Bot commented on FLINK-9373: Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/6020
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478954#comment-16478954 ] ASF GitHub Bot commented on FLINK-9373: Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/6020 Thanks @sihuazhou! LGTM. Will merge this.
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478945#comment-16478945 ] ASF GitHub Bot commented on FLINK-9373: Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 cc @StefanRRichter
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478866#comment-16478866 ] Stefan Richter commented on FLINK-9373: Great, thanks a lot!
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478865#comment-16478865 ] Sihua Zhou commented on FLINK-9373: I will update the PR quickly.
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478861#comment-16478861 ] Stefan Richter commented on FLINK-9373: [~sihuazhou] Will you be able to do this quickly, or can I take over? This is currently THE release blocker.
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478857#comment-16478857 ] Stefan Richter commented on FLINK-9373: Well, but then we also risk having only a partial fix. OK, I suggest that we introduce a wrapper and check the status there for all the methods mentioned in the documentation: {{Seek()}}, {{Next()}}, {{SeekToFirst()}}, {{SeekToLast()}}, {{SeekForPrev()}}, and {{Prev()}}. That puts us on the safe side, and we can see whether there is any performance change. If the overhead turns out to be too much, we can still drop it again.
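A wrapper of the kind suggested above could look roughly like this. It is a minimal sketch, not the code that was merged: `RawIterator`, `StatusCheckingIterator` and `RecordingIterator` are hypothetical names, and `status()` is assumed to throw when the underlying iterator has an internal error. Each positioning method delegates to the method of the same name and then checks the status, so an error cannot be skipped or cleared by a later call before anyone looks at it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal iterator API; names mirror the methods listed above,
// but this is NOT the real RocksDB Java interface.
interface RawIterator {
    void seek(byte[] target);
    void seekForPrev(byte[] target);
    void seekToFirst();
    void seekToLast();
    void next();
    void prev();
    boolean isValid();
    void status(); // assumption: throws on an internal error
}

// Wrapper that checks status() after every positioning call.
final class StatusCheckingIterator {
    private final RawIterator delegate;

    StatusCheckingIterator(RawIterator delegate) { this.delegate = delegate; }

    void seek(byte[] target)        { delegate.seek(target);        delegate.status(); }
    void seekForPrev(byte[] target) { delegate.seekForPrev(target); delegate.status(); }
    void seekToFirst()              { delegate.seekToFirst();       delegate.status(); }
    void seekToLast()               { delegate.seekToLast();        delegate.status(); }
    void next()                     { delegate.next();              delegate.status(); }
    void prev()                     { delegate.prev();              delegate.status(); }
    boolean isValid()               { return delegate.isValid(); }
}

// Test double that records which calls reached the underlying iterator.
final class RecordingIterator implements RawIterator {
    final List<String> calls = new ArrayList<>();

    @Override public void seek(byte[] target)        { calls.add("seek"); }
    @Override public void seekForPrev(byte[] target) { calls.add("seekForPrev"); }
    @Override public void seekToFirst()              { calls.add("seekToFirst"); }
    @Override public void seekToLast()               { calls.add("seekToLast"); }
    @Override public void next()                     { calls.add("next"); }
    @Override public void prev()                     { calls.add("prev"); }
    @Override public boolean isValid()               { return true; }
    @Override public void status()                   { calls.add("status"); }
}
```

A recording test double like `RecordingIterator` also makes it cheap to assert that every wrapper method reaches the delegate method of the same name, which guards against copy-and-paste slips between near-identical one-line methods.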
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478848#comment-16478848 ] Sihua Zhou commented on FLINK-9373: I think that makes sense.
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478842#comment-16478842 ] Stefan Richter commented on FLINK-9373: Maybe we can also take the PR basically "as is" for now so that we check the status for each iteration and add more checks in the next minor release if this turns out to be not enough. Does that make sense?
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478699#comment-16478699 ] ASF GitHub Bot commented on FLINK-9373: Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/6020 Let's wait a bit more for their response. It seems like this example is older than their corrected docs.
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478558#comment-16478558 ] ASF GitHub Bot commented on FLINK-9373: Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 I haven't received any response from RocksDB yet, but I found this example that uses `RocksIterator#status()`: https://github.com/facebook/rocksdb/blob/3453870677ee2648f38d70fe8aa7fa16a93a96d2/java/samples/src/main/java/RocksDBSample.java
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477592#comment-16477592 ] ASF GitHub Bot commented on FLINK-9373: Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 FYI, I found this issue, which is related to the problem: https://github.com/facebook/rocksdb/issues/3558
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477573#comment-16477573 ] ASF GitHub Bot commented on FLINK-9373: Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 Agreed!
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477560#comment-16477560 ] ASF GitHub Bot commented on FLINK-9373: Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/6020 Maybe we should ask them on their issue tracker what the best practice is? I cannot remember seeing such checks in their code examples. I have a hard time believing that this can be true, because it is not really documented in the Java API. And why wouldn't they always call `status` internally?
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477556#comment-16477556 ] ASF GitHub Bot commented on FLINK-9373: Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/6020 It depends. Maybe this is already covered, because we might always do an iteration attempt, with its check, right after the seek. But in general, if this is true, it is not very nice and it is fragile.
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477554#comment-16477554 ] ASF GitHub Bot commented on FLINK-9373: Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 Oh my God... Does that mean we need to wrap the `RocksIterator` and delegate its whole API?
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477549#comment-16477549 ] ASF GitHub Bot commented on FLINK-9373: Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/6020 After double checking with the RocksDB docs, I am afraid that we need to introduce more checks, because, for example, they point out that the iterator can also become corrupted after methods like `seek`. And if the status flag is potentially cleared, that means we need to check in all the places... crazy :-(
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477465#comment-16477465 ] ASF GitHub Bot commented on FLINK-9373: Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 Agreed, it should be correct first and fast second! Could you please have a look at this? I think it's ready for a look now~
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477459#comment-16477459 ] ASF GitHub Bot commented on FLINK-9373: --- Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/6020 It sounds cheap if they just `&` all the flags from the sub iterators. In the end, we can see if there is a performance drop, but it is better to be correct first before fast.
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477438#comment-16477438 ] ASF GitHub Bot commented on FLINK-9373: --- Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 @StefanRRichter I had a look at the implementation of the iterators in RocksDB. I found that `status()` just returns the `_status` flag as the result, without any complex computation. But some composite iterators, like the `MergeIterator` and `TwoLevelIterator`, need to check all the `InternalIterator`s they hold to decide the final status, and I also found that the status could be reset to `OK` in some cases...Hmm...do you think this is super cheap or not?
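To make the cost question in the comment above a little more concrete, here is a small model of the two cases it describes: a leaf iterator whose status is just a stored flag, and a composite whose status has to consult every child. This is a sketch only, written in Java for illustration. The names `Status`, `Leaf` and `Composite` are invented here and are not taken from the RocksDB sources, and the real native implementation may combine child statuses differently.

```java
import java.util.Arrays;
import java.util.List;

public class CompositeStatusSketch {

    /** Hypothetical status codes, for illustration only. */
    enum Status { OK, CORRUPTION, IO_ERROR }

    interface Child {
        Status status();
    }

    /** Leaf case: status() simply returns a stored flag, so it is O(1). */
    static final class Leaf implements Child {
        private final Status status;

        Leaf(Status status) {
            this.status = status;
        }

        public Status status() {
            return status;
        }
    }

    /** Composite case: status() walks all children, so it is O(number of children). */
    static final class Composite implements Child {
        private final List<Child> children;

        Composite(List<Child> children) {
            this.children = children;
        }

        public Status status() {
            for (Child child : children) {
                Status s = child.status();
                if (s != Status.OK) {
                    return s; // first non-OK child decides the result
                }
            }
            return Status.OK;
        }
    }

    public static void main(String[] args) {
        Child healthy = new Composite(Arrays.<Child>asList(new Leaf(Status.OK), new Leaf(Status.OK)));
        Child broken = new Composite(Arrays.<Child>asList(new Leaf(Status.OK), new Leaf(Status.CORRUPTION)));
        System.out.println(healthy.status());
        System.out.println(broken.status());
    }
}
```

Under this model the per-call cost grows with the number of child iterators but involves no I/O, which is roughly the "cheap if they just combine the flags" intuition voiced in the reply.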
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477414#comment-16477414 ] ASF GitHub Bot commented on FLINK-9373: --- Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 I'm going to check the native implementation and see whether `status()` is a super cheap operation...
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477413#comment-16477413 ] ASF GitHub Bot commented on FLINK-9373: --- Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 I think I am a bit torn here now...
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477412#comment-16477412 ] ASF GitHub Bot commented on FLINK-9373: --- Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 That is a good question, and I'm not sure...but I think that seems to be...
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477409#comment-16477409 ] ASF GitHub Bot commented on FLINK-9373: --- Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/6020 Oh you are right, this is confusing :-) So does this also mean the status flag is cleared when we simply continue iterating and only check at the end?
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477406#comment-16477406 ] ASF GitHub Bot commented on FLINK-9373: --- Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 I think that is the incorrect one, if I'm not confused by the wiki's content...
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477399#comment-16477399 ] ASF GitHub Bot commented on FLINK-9373: --- Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/6020 Also from the RocksDB docs: `In another word, if Iterator::Valid() is true, status() is guaranteed to be OK()`
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477397#comment-16477397 ] ASF GitHub Bot commented on FLINK-9373: --- Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/6020 Yes, but eventually it will also return `false`, which is essentially the same as waiting until the loop terminates. Anyways, I think after the loop is the nicer way.
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477393#comment-16477393 ] ASF GitHub Bot commented on FLINK-9373: --- Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 @StefanRRichter No, I don't think that would fix this issue. The problem here is that even if `iterator.isValid()` returns `true`, there may still be some internal error in RocksDB. What do you think?
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477389#comment-16477389 ] ASF GitHub Bot commented on FLINK-9373: --- Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/6020 If you like the helper method as a way to keep people from forgetting about this check, you could also run the check in `isRocksIteratorValid` only if the return value is `false`.
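The helper variant suggested in the comment above could look roughly like the following. This is a sketch under assumptions, not the code from the pull request: `StatusIterator` and `CountingIterator` are hypothetical stand-ins so that the example runs without RocksDB, and the helper is named `isIteratorValid` here to avoid implying it matches the real `isRocksIteratorValid` signature.

```java
public class ValidityHelperSketch {

    /** Hypothetical minimal iterator API; NOT the real RocksDB Java interface. */
    interface StatusIterator {
        boolean isValid();
        void next();
        /** Throws if the iterator stopped because of an internal error. */
        void status() throws Exception;
    }

    /**
     * Returns isValid(), and consults status() only when isValid() is false,
     * so the extra call happens once per loop instead of once per entry.
     */
    static boolean isIteratorValid(StatusIterator it) throws Exception {
        boolean valid = it.isValid();
        if (!valid) {
            it.status(); // distinguishes "end of data" from "stopped on error"
        }
        return valid;
    }

    /** Fake iterator that counts status() calls and can fail once it runs out. */
    static final class CountingIterator implements StatusIterator {
        private int remaining;
        private final boolean failAtEnd;
        int statusCalls;

        CountingIterator(int entries, boolean failAtEnd) {
            this.remaining = entries;
            this.failAtEnd = failAtEnd;
        }

        public boolean isValid() { return remaining > 0; }
        public void next() { remaining--; }

        public void status() throws Exception {
            statusCalls++;
            if (failAtEnd && remaining <= 0) {
                throw new Exception("simulated internal error");
            }
        }
    }

    /** Typical loop: callers only ever use the helper as the loop condition. */
    static int countEntries(StatusIterator it) throws Exception {
        int n = 0;
        while (isIteratorValid(it)) {
            n++;
            it.next();
        }
        return n;
    }

    public static void main(String[] args) throws Exception {
        CountingIterator healthy = new CountingIterator(3, false);
        System.out.println(countEntries(healthy) + " entries, status() calls: " + healthy.statusCalls);

        try {
            countEntries(new CountingIterator(2, true));
        } catch (Exception e) {
            System.out.println("detected: " + e.getMessage());
        }
    }
}
```

Because callers only see the helper, they cannot forget the status check, yet the check still runs only when the loop is about to terminate.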
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477387#comment-16477387 ] ASF GitHub Bot commented on FLINK-9373: --- Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 @StefanRRichter No, I haven't run any performance tests yet. I think you are right! Your proposal is the way I'm going to choose. Addressing this...
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477383#comment-16477383 ] ASF GitHub Bot commented on FLINK-9373: --- Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/6020 A quick general question: could you observe any performance impact from calling the `status()` method in the loops? It looks like a native method and I am not sure that it is inexpensive. Maybe the better idea is to only check `isValid()` in the loops and check `status()` only once after the loop to ensure that everything was well and complete. Maybe that is also the reason why this is split into two methods in the first place. What do you think?
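The pattern proposed in the comment above (test only `isValid()` inside the loop, call `status()` once afterwards) can be sketched as follows. The interface `KeyIterator` and the in-memory `FakeIterator` are hypothetical stand-ins introduced so the example is self-contained. They only assume that `status()` throws when the iterator stopped because of an internal error, and they are not the RocksDB Java API itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StatusAfterLoopSketch {

    /** Hypothetical stand-in for the small part of an iterator API used here. */
    interface KeyIterator {
        void seekToFirst();
        boolean isValid();
        void next();
        String key();
        /** Throws if the iterator hit an internal error. */
        void status() throws Exception;
    }

    /** In-memory fake that can simulate an internal error at a given index. */
    static final class FakeIterator implements KeyIterator {
        private final List<String> keys;
        private final int failAt; // index at which reading fails, -1 means never
        private int pos;
        private boolean failed;

        FakeIterator(List<String> keys, int failAt) {
            this.keys = keys;
            this.failAt = failAt;
        }

        public void seekToFirst() { pos = 0; failed = false; check(); }
        public boolean isValid() { return !failed && pos < keys.size(); }
        public void next() { pos++; check(); }
        public String key() { return keys.get(pos); }

        public void status() throws Exception {
            if (failed) {
                throw new Exception("simulated internal error");
            }
        }

        private void check() {
            if (pos == failAt) {
                failed = true;
            }
        }
    }

    /** Reads all keys; the loop checks only isValid(), status() is checked once at the end. */
    static List<String> readAll(KeyIterator it) throws Exception {
        List<String> out = new ArrayList<>();
        for (it.seekToFirst(); it.isValid(); it.next()) {
            out.add(it.key());
        }
        // isValid() == false means either "end of data" or "error"; status() tells them apart.
        it.status();
        return out;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readAll(new FakeIterator(Arrays.asList("a", "b", "c"), -1)));
        try {
            readAll(new FakeIterator(Arrays.asList("a", "b", "c"), 1));
        } catch (Exception e) {
            System.out.println("detected: " + e.getMessage());
        }
    }
}
```

Without the final `status()` call, the second read would return a truncated list that looks like a normal end of data, which is the kind of silent loss this issue is about.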
[jira] [Commented] (FLINK-9373) Fix potential data losing for RocksDBBackend
[ https://issues.apache.org/jira/browse/FLINK-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477309#comment-16477309 ] ASF GitHub Bot commented on FLINK-9373: --- Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/6020 The reason Travis gave a red light is unrelated...