Yingyi Bu has posted comments on this change. Change subject: Cleanup Buffer Cache ......................................................................
Patch Set 8: (15 comments) My concern is index creation. There're several possible scenarios that leave the files with the same names on the file system. I think the source of truth should come from Metadata, instead of coming from the file system. https://asterix-gerrit.ics.uci.edu/#/c/1840/8/asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/FileMapManager.java File asterixdb/asterix-common/src/main/java/org/apache/asterix/common/context/FileMapManager.java: PS8, Line 32: ConcurrentMap Since the call sites from BufferCache is synchronized on fileInfoMap for any method call of this class. Why do you need ConcurrentMap here? BTW, I remember that the big synchronized block of fileInfoMap in BufferCache reduced quite some multi-core parallelism, especially when you have a number of partitions on a node. PS8, Line 65: synchronized synchronized is not need since all call sites from BufferCache has synchronized on fileInfoMap? PS8, Line 75: synchronized synchronized is not need since all call sites from BufferCache has synchronized on fileInfoMap? https://asterix-gerrit.ics.uci.edu/#/c/1840/8/hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/IFileHandle.java File hyracks-fullstack/hyracks/hyracks-api/src/main/java/org/apache/hyracks/api/io/IFileHandle.java: PS8, Line 24: FileReference getFileReference(); Remove corresponding casts that are no longer needed at call sites? It seems to me that if getFileReference(...) is the only function of this interface, we probably doesn't need to have this yet-another interface. I think some more functionalities should be moved to this interface, .e.g, open(), close(), etc. https://asterix-gerrit.ics.uci.edu/#/c/1840/8/hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/io/FileHandle.java File hyracks-fullstack/hyracks/hyracks-control/hyracks-control-nc/src/main/java/org/apache/hyracks/control/nc/io/FileHandle.java: PS8, Line 56: if (!fileRef.getFile().exists()) { : throw HyracksDataException.create(ErrorCode.FILE_DOES_NOT_EXIST, fileRef.getAbsolutePath()); : } If we want to complain non existence, we probably should move this to be before the switch clause, for symmetry? I don't quite understand "read fails automatically" -- to me, a unified form of error messages for read/write is nice to me. https://asterix-gerrit.ics.uci.edu/#/c/1840/8/hyracks-fullstack/hyracks/hyracks-storage-am-common/src/main/java/org/apache/hyracks/storage/am/common/build/IndexBuilder.java File hyracks-fullstack/hyracks/hyracks-storage-am-common/src/main/java/org/apache/hyracks/storage/am/common/build/IndexBuilder.java: PS8, Line 94: // Index is not registered but the index file exists : // This is another big problem that we need to disallow soon : // We can only disallow this if we have a global cleanup after crash : // on reboot Can this happen if drop index is interrupted (cancelled)? https://asterix-gerrit.ics.uci.edu/#/c/1840/8/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIndexFileManager.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/api/ILSMIndexFileManager.java: PS8, Line 84: public boolean exists(); >From the call site, it looks that this method doesn't need to be exposed? It could be private/internal to createDirs()? https://asterix-gerrit.ics.uci.edu/#/c/1840/8/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractLSMIndex.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-common/src/main/java/org/apache/hyracks/storage/am/lsm/common/impls/AbstractLSMIndex.java: PS8, Line 158: if (fileManager.exists()) { : throw HyracksDataException.create(ErrorCode.CANNOT_CREATE_EXISTING_INDEX); : } I suspect this potentially might cause bugs/regressions. Scenarios: 1. For example, drop index was interrupted/cancelled; 2. the system was crashed while we're dropping index; 3. a user put the same file with the same name in the directory. To me, the source of truth should come from Metadata since we have transactional properties, rather than from the file system, which could be unreliable. PS8, Line 235: destroyDiskComponents(); Why do memory components not need to be destroyed any more? https://asterix-gerrit.ics.uci.edu/#/c/1840/8/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-invertedindex/src/main/java/org/apache/hyracks/storage/am/lsm/invertedindex/api/IInvertedIndex.java File hyracks-fullstack/hyracks/hyracks-storage-am-lsm-invertedindex/src/main/java/org/apache/hyracks/storage/am/lsm/invertedindex/api/IInvertedIndex.java: PS8, Line 50: void purge() throws HyracksDataException; Why only inverted index needs to purge? https://asterix-gerrit.ics.uci.edu/#/c/1840/8/hyracks-fullstack/hyracks/hyracks-storage-common/src/main/java/org/apache/hyracks/storage/common/file/TransientFileMapManager.java File hyracks-fullstack/hyracks/hyracks-storage-common/src/main/java/org/apache/hyracks/storage/common/file/TransientFileMapManager.java: PS8, Line 31: ConcurrentMap<Integer, FileReference> id2nameMap = new ConcurrentHashMap<>(); : private ConcurrentMap< Similar to FileMapManager, do we really need ConcurrentMap here? PS8, Line 64: synchronized Is it necessary here? PS8, Line 74: synchronized Is it necessary? https://asterix-gerrit.ics.uci.edu/#/c/1840/8/hyracks-fullstack/hyracks/hyracks-tests/hyracks-storage-common-test/src/test/java/org/apache/hyracks/storage/common/BufferCacheTest.java File hyracks-fullstack/hyracks/hyracks-tests/hyracks-storage-common-test/src/test/java/org/apache/hyracks/storage/common/BufferCacheTest.java: PS8, Line 105: WS PS8, Line 113: WS -- To view, visit https://asterix-gerrit.ics.uci.edu/1840 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I15565b07afdc94ac74c608bfe4480fa09dcf8f1c Gerrit-PatchSet: 8 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: abdullah alamoudi <[email protected]> Gerrit-Reviewer: Ian Maxon <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Michael Blow <[email protected]> Gerrit-Reviewer: Murtadha Hubail <[email protected]> Gerrit-Reviewer: Yingyi Bu <[email protected]> Gerrit-Reviewer: abdullah alamoudi <[email protected]> Gerrit-HasComments: Yes
