azagrebin commented on a change in pull request #7351: [FLINK-11008][State 
Backends, Checkpointing]SpeedUp upload state files using multithread
URL: https://github.com/apache/flink/pull/7351#discussion_r246340345
 
 

 ##########
 File path: 
flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDbStateDataTransfer.java
 ##########
 @@ -61,6 +69,88 @@ static void transferAllStateDataToDirectory(
                downloadDataForAllStateHandles(miscFiles, dest, 
restoringThreadNum, closeableRegistry);
        }
 
+       public static void uploadFilesToCheckpointFs(
+               @Nonnull Map<StateHandleID, Path> files,
+               int numberOfSnapshottingThreads,
+               CheckpointStreamFactory checkpointStreamFactory,
+               CloseableRegistry closeableRegistry,
+               Map<StateHandleID, StreamStateHandle> hanldes) throws Exception 
{
 
 Review comment:
   Not sure here, in general I think it is not obvious that function implicitly 
returns something (here `hanldes`, typo btw, should be `handles`) in its 
arguments. At least I would mention this fact in the function doc comment. 
`transferAllStateDataToDirectory` is now public and should also have a doc 
comment.
   
   On the other hand, if function returns explicitly `handles`, we have to 
reiterate them later where the result is used to add it to the final result map 
(e.g. `sstFiles.putAll(uploadFilesToCheckpointFs(..))`). Though, I would not 
expect the size of map to be performance critical for reiteration.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to