[ 
https://issues.apache.org/jira/browse/FLINK-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561259#comment-16561259
 ] 

ASF GitHub Bot commented on FLINK-9981:
---------------------------------------

StefanRRichter commented on a change in pull request #6438: [FLINK-9981] Tune 
performance of RocksDB implementation
URL: https://github.com/apache/flink/pull/6438#discussion_r205985966
 
 

 ##########
 File path: 
flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBOrderedSetStore.java
 ##########
 @@ -116,7 +127,14 @@ public RocksDBOrderedSetStore(
 
        @Override
        public void add(@Nonnull T element) {
+
                byte[] elementBytes = serializeElement(element);
+
+               if (LEXICOGRAPIC_BYTE_COMPARATOR.compare(elementBytes, lowerBoundSeekKey) < 0) {
+                       // a smaller element means a new lower bound.
+                       lowerBoundSeekKey = elementBytes;
 
 Review comment:
   Yes, please note that the lower bound also changes whenever an iterator is 
requested: at that point it is set to the actual current lowest key. Afterwards, 
smaller keys can be inserted again, which is why this check is required. Keys 
may also get deleted again, so without becoming too expensive the value can 
only ever be a lower bound. In my experiments, though, this small change gave a 
huge improvement, because it helps to skip tombstones before compaction, and 
there are often many of them ahead of the first existing element because of the 
typical access pattern of this state. We might choose a different initial value 
for the lower bound than the prefix, but I would not introduce `null` and 
additional branching. Thinking about it, the prefix of the next key-group might 
actually be a sensible initial value, but in the end it doesn't matter too much, 
because the bound is properly "calibrated" when an iterator is requested for 
the first time.
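
To make the mechanism concrete, here is a minimal, self-contained sketch of the idea. It is not the code from the pull request: the class `LowerBoundSeekSketch`, the `TreeMap` that stands in for RocksDB, the boolean tombstone flag, and the `entriesScanned` counter are all hypothetical and exist only for illustration.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

/**
 * Toy model (assumption: a TreeMap stands in for RocksDB). Deletes leave
 * tombstones behind, and a seek has to step over them until they are compacted.
 */
class LowerBoundSeekSketch {

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    }

    static final Comparator<byte[]> LEXICOGRAPHIC = LowerBoundSeekSketch::compareBytes;

    // Value TRUE marks a live entry, FALSE marks a tombstone.
    private final TreeMap<byte[], Boolean> store = new TreeMap<>(LEXICOGRAPHIC);

    // Never greater than the smallest live key; starts at the key-group prefix.
    private byte[] lowerBoundSeekKey;

    // Hypothetical cost counter: entries touched by seeks, tombstones included.
    long entriesScanned;

    LowerBoundSeekSketch(byte[] prefix) {
        this.lowerBoundSeekKey = prefix;
    }

    void add(byte[] key) {
        store.put(key, Boolean.TRUE);
        if (compareBytes(key, lowerBoundSeekKey) < 0) {
            // A smaller element means a new lower bound.
            lowerBoundSeekKey = key;
        }
    }

    void remove(byte[] key) {
        // Deleting does not touch the bound, so it may become stale (too low).
        if (store.containsKey(key)) {
            store.put(key, Boolean.FALSE);
        }
    }

    /** Seeks from the lower bound and "calibrates" it to the first live key. */
    byte[] seekFirst() {
        for (Map.Entry<byte[], Boolean> e : store.tailMap(lowerBoundSeekKey, true).entrySet()) {
            entriesScanned++;
            if (e.getValue()) {
                lowerBoundSeekKey = e.getKey();
                return e.getKey();
            }
        }
        return null; // no live element at or after the bound
    }
}
```

In this model, the first `seekFirst()` after a run of deletes pays for every tombstone ahead of the first live key, while the next call starts directly at the calibrated bound. Inserting a smaller key afterwards lowers the bound again through the check in `add`, so the bound never overshoots a live element.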



> Tune performance of RocksDB implementation
> ------------------------------------------
>
>                 Key: FLINK-9981
>                 URL: https://issues.apache.org/jira/browse/FLINK-9981
>             Project: Flink
>          Issue Type: Sub-task
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.6.0
>            Reporter: Stefan Richter
>            Assignee: Stefan Richter
>            Priority: Major
>              Labels: pull-request-available
>
> General performance tuning/polishing for the RocksDB implementation. We can 
> figure out how caching/seeking can be improved.



