Murtadha Hubail has posted comments on this change.

Change subject: ASTERIXDB-1337: Dataset Memory Management on Multi-Partition NC
......................................................................


Patch Set 1:

(6 comments)

https://asterix-gerrit.ics.uci.edu/#/c/705/1/asterix-common/src/main/java/org/apache/asterix/common/context/AsterixVirtualBufferCacheProvider.java
File 
asterix-common/src/main/java/org/apache/asterix/common/context/AsterixVirtualBufferCacheProvider.java:

Line 40:                 .getVirtualBufferCaches(datasetID, 
ctx.getTaskAttemptId().getTaskId().getPartition());
This partition value is a task-based partition (the index of this task among 
the partitions the operator is split into, which always starts from 0), not the 
cluster storage partition id. For example, if you execute a query on a metadata 
dataset, this partition id will always be 0, whereas the storage partition of 
the metadata datasets could be something completely different. As a result, this 
call will allocate an extra VBC for the dataset and make it exceed its limit. 
Similarly, if fault tolerance is enabled, the NC will be responsible for extra 
storage partitions of the same dataset, which will also cause extra VBCs to be 
allocated. I believe the best thing to do here is to pass the IO device number. 
You can get it from the file split in the callers of this method as:

opDesc.getFileSplitProvider().getFileSplits()[partition].getIODeviceId()
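
To illustrate the concern, here is a minimal sketch of how mixing two kinds of 
keys leads to an extra allocation. VbcRegistrySketch, allocatedFor, and all the 
numeric ids are hypothetical stand-ins, not the actual DatasetLifecycleManager 
code; the sketch only assumes that a set of VBCs is allocated the first time a 
(datasetID, key) pair is seen.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-dataset VBC registry; not AsterixDB code.
class VbcRegistrySketch {
    // dataset id -> (lookup key -> placeholder for one set of VBCs)
    private final Map<Integer, Map<Integer, Object>> vbcsByDataset = new HashMap<>();

    // Allocates a new placeholder the first time a (datasetId, key) pair is seen.
    Object getVirtualBufferCaches(int datasetId, int key) {
        return vbcsByDataset
                .computeIfAbsent(datasetId, d -> new HashMap<>())
                .computeIfAbsent(key, k -> new Object());
    }

    // Number of distinct keys, i.e. VBC sets, allocated for a dataset.
    int allocatedFor(int datasetId) {
        Map<Integer, Object> perKey = vbcsByDataset.get(datasetId);
        return perKey == null ? 0 : perKey.size();
    }

    public static void main(String[] args) {
        int datasetId = 1;        // assumed metadata dataset id
        int storagePartition = 3; // assumed storage partition of that dataset
        int taskPartition = 0;    // task partitions always start from 0
        int ioDeviceId = 0;       // assumed IO device hosting the partition

        // One caller keys by storage partition, another by task partition.
        VbcRegistrySketch mixedKeys = new VbcRegistrySketch();
        mixedKeys.getVirtualBufferCaches(datasetId, storagePartition);
        mixedKeys.getVirtualBufferCaches(datasetId, taskPartition);
        System.out.println("mixed keys: " + mixedKeys.allocatedFor(datasetId));

        // Both callers key by the IO device that hosts the partition.
        VbcRegistrySketch deviceKeys = new VbcRegistrySketch();
        deviceKeys.getVirtualBufferCaches(datasetId, ioDeviceId);
        deviceKeys.getVirtualBufferCaches(datasetId, ioDeviceId);
        System.out.println("device keys: " + deviceKeys.allocatedFor(datasetId));
    }
}
```

With mixed keys the same dataset ends up with two entries; when every caller 
passes the IO device number, it ends up with one.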


https://asterix-gerrit.ics.uci.edu/#/c/705/1/asterix-common/src/main/java/org/apache/asterix/common/context/DatasetLifecycleManager.java
File 
asterix-common/src/main/java/org/apache/asterix/common/context/DatasetLifecycleManager.java:

Line 746:                     vbcs = initializeVirtualBufferCaches(partition);
You might want to add a check here to make sure the number of VBCs of a dataset 
is <= numPartitions.
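
A rough sketch of such a check, under the assumption that the manager can count 
the VBC sets already allocated for a dataset. VbcLimitSketch and its members are 
hypothetical names; the real check would live next to the 
initializeVirtualBufferCaches call.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical guard; class, field, and method names are illustrative only.
class VbcLimitSketch {
    private final int numPartitions;
    private final Set<Integer> allocatedKeys = new HashSet<>();

    VbcLimitSketch(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    // Registers a VBC set for the key, refusing to exceed numPartitions.
    void allocate(int key) {
        if (allocatedKeys.contains(key)) {
            return; // already allocated, nothing new to count
        }
        if (allocatedKeys.size() >= numPartitions) {
            throw new IllegalStateException(
                    "VBC count would exceed numPartitions=" + numPartitions);
        }
        allocatedKeys.add(key);
    }

    // Number of VBC sets allocated so far.
    int count() {
        return allocatedKeys.size();
    }
}
```

Failing fast here would surface a wrong partition argument at the call site 
instead of as a silent memory overrun.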


https://asterix-gerrit.ics.uci.edu/#/c/705/1/asterix-metadata/src/main/java/org/apache/asterix/metadata/bootstrap/MetadataBootstrap.java
File 
asterix-metadata/src/main/java/org/apache/asterix/metadata/bootstrap/MetadataBootstrap.java:

Line 353:                 .getVirtualBufferCaches(index.getDatasetId().getId(), 
metadataPartition.getPartitionId());
If you agree that the best thing is to pass the IO device number, this needs to 
be changed to the IO device number of the metadataPartition.


https://asterix-gerrit.ics.uci.edu/#/c/705/1/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMBTreeLocalResourceMetadata.java
File 
asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMBTreeLocalResourceMetadata.java:

Line 67:         List<IVirtualBufferCache> virtualBufferCaches = 
runtimeContextProvider.getVirtualBufferCaches(datasetID, partition);
You need to pass the IO device number of the localResource partition from 
RecoveryManager#startRecoveryRedoPhase. To do that, add a method to 
PersistentLocalResourceRepository that takes a partition and returns that 
partition's IO device number on this node (similar to 
PersistentLocalResourceRepository#getPartitionPath).
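
One possible shape for the lookup method suggested above, as a sketch only. 
PartitionDeviceLookupSketch, register, and getPartitionIODevice are hypothetical 
names, and the sketch assumes the repository can keep a partition-to-device 
mapping for the partitions hosted on this node.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the real PersistentLocalResourceRepository may differ.
class PartitionDeviceLookupSketch {
    // storage partition id -> IO device number on this node
    private final Map<Integer, Integer> partitionToIODevice = new HashMap<>();

    // Records which IO device hosts a storage partition on this node.
    void register(int partition, int ioDeviceNum) {
        partitionToIODevice.put(partition, ioDeviceNum);
    }

    // Hypothetical counterpart to getPartitionPath: partition in, device number out.
    int getPartitionIODevice(int partition) {
        Integer device = partitionToIODevice.get(partition);
        if (device == null) {
            throw new IllegalArgumentException(
                    "Partition " + partition + " is not hosted on this node");
        }
        return device;
    }
}
```

The redo phase could then resolve the device number once per local resource and 
pass it down instead of the partition.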


https://asterix-gerrit.ics.uci.edu/#/c/705/1/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMInvertedIndexLocalResourceMetadata.java
File 
asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMInvertedIndexLocalResourceMetadata.java:

Line 77:         List<IVirtualBufferCache> virtualBufferCaches = 
runtimeContextProvider.getVirtualBufferCaches(datasetID, partition);
You need to pass the IO device number of the localResource partition from 
RecoveryManager#startRecoveryRedoPhase. To do that, add a method to 
PersistentLocalResourceRepository that takes a partition and returns that 
partition's IO device number on this node (similar to 
PersistentLocalResourceRepository#getPartitionPath).


https://asterix-gerrit.ics.uci.edu/#/c/705/1/asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMRTreeLocalResourceMetadata.java
File 
asterix-transactions/src/main/java/org/apache/asterix/transaction/management/resource/LSMRTreeLocalResourceMetadata.java:

Line 79:         List<IVirtualBufferCache> virtualBufferCaches = 
runtimeContextProvider.getVirtualBufferCaches(datasetID, partition);
You need to pass the IO device number of the localResource partition from 
RecoveryManager#startRecoveryRedoPhase. To do that, add a method to 
PersistentLocalResourceRepository that takes a partition and returns that 
partition's IO device number on this node (similar to 
PersistentLocalResourceRepository#getPartitionPath).


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/705
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbf08f532c1210c30be6a51c73570a789174213b
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Blow <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Murtadha Hubail <[email protected]>
Gerrit-Reviewer: Till Westmann <[email protected]>
Gerrit-Reviewer: abdullah alamoudi <[email protected]>
Gerrit-HasComments: Yes
