[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-17 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/6020
  
Thanks @sihuazhou ! LGTM 👍 Will merge this.


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-17 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
cc @StefanRRichter 


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-17 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/6020
  
Let's wait a bit more for their response. It seems like this example is 
older than their corrected docs.


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
Have't received any response from RocksDB yet, but I found this example 
with using `RocksIterator#stats()`: 
https://github.com/facebook/rocksdb/blob/3453870677ee2648f38d70fe8aa7fa16a93a96d2/java/samples/src/main/java/RocksDBSample.java


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
FYI, I found this issue related to problem: 
https://github.com/facebook/rocksdb/issues/3558


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
Agreed!


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/6020
  
Maybe we should ask them on their issue tracker what the best practise is? 
I cannot remember seeing such checks in their code examples. Have a hard time 
to believe that this can be true, because it is not really documented on the 
Java API and also why wouldn't they always call `status` internally?


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/6020
  
It depends, maybe this is already covered currently because we might always 
do an iteration attempt that checks right after the seek. But in general, this 
is not very nice and fragile if true.


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
Oh My God...Is that means we need to wrap the `RocksIterator` to delegate 
all it API?


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/6020
  
After double checking with the RocksDB docs, I am afraid that we need to 
introduce more checks because for example the point out that also after methods 
like `seek` the iterator an become corrupted. And if the status flag is 
potentially cleared, that means we need to check in all the places...crazy :-(


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
Agree, should be correct first before fast! Could you please have a look at 
this? I think it's already for a look now~


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/6020
  
It sounds cheap if they just `&` all the flags from the sub iterators. In 
the end, we can see if there is a performance drop but better be correct first 
before fast.


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
@StefanRRichter I had a look at the implementation of the iterators in 
RocksDB, I found status just return the flag first `_status` as the result 
without any complex computation, But for some `composite Iterator` like the 
`MergeIteraor` and `TwoLevelIterator` it need to check all the 
`InternalIterator` they hold to decide the final status, and I also found the 
iterator could be reset to `OK` in some cases...Hmm...do you think this is 
super cheap or not?


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
I'm going to check the native implementation and see whether the `status()` 
is a super cheap option...


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
I think I am a bit torn here now...


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
That is a good question, and I'm not sure...but I think that seems to be...


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/6020
  
Oh you are right, this is confusing :-) So does this also mean the status 
flag is cleared when we simple continue iterating and only check in the end?


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
I think that is the incorrect one, If I'm not confused by the wiki's 
content...


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/6020
  
Also from the RocksDB docs: `In another word, if Iterator::Valid() is true, 
status() is guaranteed to be OK()`


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/6020
  
Yes, but eventually it will also return `false`, which is essentially the 
same as waiting until the loop terminates. Anyways, I think after the loop is 
the nicer way.


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
@StefanRRichter NO, I think that couldn't fix this issue, the problem here 
is that even `iterator.isValid()` return `true`, there may also some internal 
error in RocksDB. What do you think?


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/6020
  
You could also in `isRocksIteratorValid` run the check only if the return 
value is `false` if you like the helper method to avoid people forgetting about 
this check.


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
@StefanRRichter No, I didn't have any performance tests yet. I think you 
are right! Your proposal is the way I'm going to choose. Addressing this...


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/6020
  
A quick general question: could you observe any performance impact from 
calling the `status()` method in the loops. It looks like a native method and I 
am not sure that it is inexpensive. Maybe the better idea is to only check 
`isValid()` in the loops and check `status()` only once after the loop to 
ensure that everything was well and complete. Maybe that is also the reason why 
this is split into two methods in the first place. What do you think?


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
The reasons that the travis given red light is unrelated...


---


[GitHub] flink issue #6020: [FLINK-9373][state] Fix potential data losing for RocksDB...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/6020
  
cc @StefanRRichter 


---