[kudu-CR] KUDU-3523 Use f bsize as Kudu block size instead of st blksize
Yingchun Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/20680 ) Change subject: KUDU-3523 Use f_bsize as Kudu block size instead of st_blksize .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/20680/1/src/kudu/util/env_posix.cc File src/kudu/util/env_posix.cc: http://gerrit.cloudera.org:8080/#/c/20680/1/src/kudu/util/env_posix.cc@1873 PS1, Line 1873: *block_size = sbuf.f_bsize; > It will not. In most systems, f_bsize is equal to st_blksize. In this case, Where will cause the Kudu could not run normally, could you please point out the code? Keep the compatibility is very important in the community, some users may have run Kudu with filesystem_block_size_bytes=st_blksize already. -- To view, visit http://gerrit.cloudera.org:8080/20680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I04343caf5fe377a4fa9b57e6e450307e6b995928 Gerrit-Change-Number: 20680 Gerrit-PatchSet: 2 Gerrit-Owner: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yifan Zhang Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Mon, 13 Nov 2023 06:34:55 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-3520 Fix file descriptor leak in encryption
Yingchun Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/20631 ) Change subject: KUDU-3520 Fix file descriptor leak in encryption .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/20631/2/src/kudu/util/env_posix.cc File src/kudu/util/env_posix.cc: http://gerrit.cloudera.org:8080/#/c/20631/2/src/kudu/util/env_posix.cc@2390 PS2, Line 2390: auto cleanup = MakeScopedCleanup([&]() { : DoClose(fd); : }); According to RAII, would it better to place the releasing the resource where the fd is created(i.e. in NewWritableFile() and NewTempWritableFile()), to make the construct and destruct in pairs? -- To view, visit http://gerrit.cloudera.org:8080/20631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2412429d4fe836b705296e9e30453d7c4d030cec Gerrit-Change-Number: 20631 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yingchun Lai Gerrit-Reviewer: Zoltan Chovan Gerrit-Comment-Date: Mon, 13 Nov 2023 04:51:39 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-3523 Use f bsize as Kudu block size instead of st blksize
Hello Yingchun Lai, Yifan Zhang, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20680 to look at the new patch set (#2). Change subject: KUDU-3523 Use f_bsize as Kudu block size instead of st_blksize .. KUDU-3523 Use f_bsize as Kudu block size instead of st_blksize Currently, Kudu uses st_blksize to decide Kudu block size. But st_blksize is preferred I/O block size, not the real filesystem block size. In some systems, st_blksize is 64K, and at the same time the file system block size is 4K. Kudu writes 64 bytes data into this file system, it expects the file on disk size is 64K, but actually, it's on disk size is 4K. Many unit tests run failed in these systems because of the st_blksize is not equal to the real filesystem block size. Therefore, it is better to use f_bsize (file system block size) as Kudu's block size. Change-Id: I04343caf5fe377a4fa9b57e6e450307e6b995928 --- M src/kudu/util/env_posix.cc 1 file changed, 6 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/20680/2 -- To view, visit http://gerrit.cloudera.org:8080/20680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I04343caf5fe377a4fa9b57e6e450307e6b995928 Gerrit-Change-Number: 20680 Gerrit-PatchSet: 2 Gerrit-Owner: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yifan Zhang Gerrit-Reviewer: Yingchun Lai
[kudu-CR] KUDU-3523 Use f bsize as Kudu block size instead of st blksize
Wang Xixu has posted comments on this change. ( http://gerrit.cloudera.org:8080/20680 ) Change subject: KUDU-3523 Use f_bsize as Kudu block size instead of st_blksize .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/20680/1/src/kudu/util/env_posix.cc File src/kudu/util/env_posix.cc: http://gerrit.cloudera.org:8080/#/c/20680/1/src/kudu/util/env_posix.cc@1869 PS1, Line 1869: // See doc: https://man7.org/linux/man-pages/man2/statfs.2.html. > nit: Maybe replace this link with: https://man7.org/linux/man-pages/man2/st Done -- To view, visit http://gerrit.cloudera.org:8080/20680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I04343caf5fe377a4fa9b57e6e450307e6b995928 Gerrit-Change-Number: 20680 Gerrit-PatchSet: 2 Gerrit-Owner: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yifan Zhang Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Mon, 13 Nov 2023 02:19:43 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-3371 [fs] Use RocksDB to store LBM metadata
Yingchun Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/18569 ) Change subject: KUDU-3371 [fs] Use RocksDB to store LBM metadata .. Patch Set 60: (12 comments) http://gerrit.cloudera.org:8080/#/c/18569/1/src/kudu/fs/dir_manager.h File src/kudu/fs/dir_manager.h: http://gerrit.cloudera.org:8080/#/c/18569/1/src/kudu/fs/dir_manager.h@46 PS1, Line 46: > RdbStatusToKuduStatus The return type indicate the target type, and the kudu::Status is the deault "Status" we use in Kudu, so I think it would be convenient and obvious to omit the KuduStatus. http://gerrit.cloudera.org:8080/#/c/18569/1/src/kudu/fs/dir_manager.cc File src/kudu/fs/dir_manager.cc: http://gerrit.cloudera.org:8080/#/c/18569/1/src/kudu/fs/dir_manager.cc@131 PS1, Line 131: } : : Status Dir::OpenRocksDB() { : CHECK_STREQ(FLAGS_block_manager.c_str(), "logr"); : if (db_) { : // Check 'db_' is only possible to be valid in test environments when OpenRocksDB(). : // Some unit tests (e.g. BlockManagerTest.PersistenceTest) will reopen the block manager, : // 'db_' is valid in these cases. : CHECK(!GetTestDataDirectory().empty()) << : Substitute("It's not allowed to reopen the RocksDB $0 except in tests", dir_); : return Status::OK(); : } : > It is better to read configure from Kudu gflagfile Of course, we can improve this in following patches, as the comments in the latest patches mentioned. http://gerrit.cloudera.org:8080/#/c/18569/1/src/kudu/fs/dir_manager.cc@172 PS1, Line 172: .table_f > better to rename it is_rdb_init It has been updated, see the latest patch please. http://gerrit.cloudera.org:8080/#/c/18569/58/src/kudu/fs/dir_manager.cc File src/kudu/fs/dir_manager.cc: http://gerrit.cloudera.org:8080/#/c/18569/58/src/kudu/fs/dir_manager.cc@146 PS58, Line 146: rocksdb::Options opts; > Could you define a flag for this parameter, and give some comments about th Comments have been added, but gflag is not necessary currently, it is always true. http://gerrit.cloudera.org:8080/#/c/18569/58/src/kudu/fs/dir_manager.cc@197 PS58, Line 197: > JoinPathSegments(dir_, "rdb")? Done http://gerrit.cloudera.org:8080/#/c/18569/58/src/kudu/fs/dir_manager.cc@198 PS58, Line 198: : if (db_) { > How about using unique_ptr for db_? Done http://gerrit.cloudera.org:8080/#/c/18569/1/src/kudu/fs/fs_manager.cc File src/kudu/fs/fs_manager.cc: http://gerrit.cloudera.org:8080/#/c/18569/1/src/kudu/fs/fs_manager.cc@340 PS1, Line 340: } else { > Use Enum type to represent file, log, logr There are much code assign the string flag FLAGS_block_manager to this option, it will introduce many changes. I think it's not necessary to this patch, we can do this refactor in following patch if you think it's meaningful. http://gerrit.cloudera.org:8080/#/c/18569/58/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: http://gerrit.cloudera.org:8080/#/c/18569/58/src/kudu/fs/log_block_manager.h@423 PS58, Line 423: count of containers a > Please give some comments. Done http://gerrit.cloudera.org:8080/#/c/18569/58/src/kudu/fs/log_block_manager.h@562 PS58, Line 562: size_t EstimateContainerCoun > Why can not get the correct container number? The data direcory is a common dirctory on the filesystem, Kudu users are possible to place any kind of files there. For example, a backup file will be generated if using 'kudu pbc edit' against a metadata file. http://gerrit.cloudera.org:8080/#/c/18569/1/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/18569/1/src/kudu/fs/log_block_manager.cc@1269 PS1, Line 1269: const st > The parameter 'id' is not used in this function. The design of this interfa Yes, it has been removed already in the latest patch set. http://gerrit.cloudera.org:8080/#/c/18569/59/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/18569/59/src/kudu/fs/log_block_manager.cc@1899 PS59, Line 1899: WARN_NOT_OK > Should it return when delete data file failed? Would it affects line 1902? Similar to line 1165, it doesn't matter, it is an operation in a do..while loop, another container name is generated in the next iteration. http://gerrit.cloudera.org:8080/#/c/18569/59/src/kudu/fs/log_block_manager.cc@3966 PS59, Line 3966: Initialize data direc > nit: Open rocksDB failed Done -- To view, visit http://gerrit.cloudera.org:8080/18569 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie72f6914eb5653a9c034766c6cd3741a8340711f Gerrit-Change-Number: 18569 Gerrit-PatchSet: 6
[kudu-CR] KUDU-3371 [fs] Use RocksDB to store LBM metadata
Hello Alexey Serbin, Yuqi Du, Yifan Zhang, Kudu Jenkins, Abhishek Chennaka, KeDeng, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/18569 to look at the new patch set (#60). Change subject: KUDU-3371 [fs] Use RocksDB to store LBM metadata .. KUDU-3371 [fs] Use RocksDB to store LBM metadata Since the LogBlockContainerNativeMeta stores block records sequentially in the metadata file, the live blocks maybe in a very low ratio, so it may cause serious disk space amplification and long time bootstrap consumption. This patch introduces a new class LogBlockContainerRdbMeta which uses RocksDB to store LBM metadata, a new item will be Put() into RocksDB when a new block is created in LBM, and the item will be Delete() from RocksDB when the block is removed from LBM. Data in RocksDB can be maintained by RocksDB itself, i.e. deleted items will be GCed so it's not needed to rewrite the metadata as how we do in LogBlockContainerNativeMeta. The implementation also reuses most logic of the base class LogBlockContainer, the main different with LogBlockContainerNativeMeta is LogBlockContainerRdbMeta stores block records metadata in RocksDB rather than a native file, the main implementation of interfaces from the base clase including: a. Create container Data file is created similar to LogBlockContainerNativeMeta, but the metadata part is stored in RocksDB with keys constructed as ".", and values are the same to the records stored in metadata file of LogBlockContainerNativeMeta. b. Open container Similar to LogBlockContainerNativeMeta, and it's not needed to check the metadata part, because it has been checked when load containers when bootstrap. c. Destroy container If the container is dead (full and no live blocks), remove the data file, and clean up metadata part, by deleting all the keys prefixed by "". d. Load container (by ProcessRecords()) Iterate the RocksDB in the key range [, ), because dead blocks have been deleted directly, thus only live block records will be populated, we can use them as LogBlockContainerNativeMeta. e. Create blocks in a container Put() serialized BlockRecordPB records into RocksDB, keys are contructed the same to the above. f. Remove blocks from a container Contruct the keys same to the above, Delete() them from RocksDB in batch. This patch contains the following changes: - Adds a new block manager type named 'logr', it use RocksDB to store LBM metadata, it is specified by flag '--block_manager'. - Related tests add new parameterized value to test the case of "--block_manager=logr". It's optional to use RocksDB, we can use the former LBM as before, we will introduce more tools to convert data between the two implementations in the future. The optimization is obvious as shown in JIRA KUDU-3371, it shows that reopen staged reduced upto 90% time cost. Change-Id: Ie72f6914eb5653a9c034766c6cd3741a8340711f --- M src/kudu/benchmarks/CMakeLists.txt M src/kudu/client/CMakeLists.txt M src/kudu/consensus/CMakeLists.txt M src/kudu/fs/CMakeLists.txt M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h M src/kudu/fs/data_dirs.cc M src/kudu/fs/dir_manager.cc M src/kudu/fs/dir_manager.h M src/kudu/fs/dir_util.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs_manager-test.cc M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_report.cc M src/kudu/fs/fs_report.h M src/kudu/fs/log_block_manager-test-util.cc M src/kudu/fs/log_block_manager-test-util.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/integration-tests/CMakeLists.txt M src/kudu/integration-tests/dense_node-itest.cc M src/kudu/integration-tests/ts_recovery-itest.cc M src/kudu/server/CMakeLists.txt M src/kudu/tablet/compaction-test.cc M src/kudu/tools/CMakeLists.txt M src/kudu/tools/kudu-tool-test.cc M src/kudu/tserver/tablet_server-test.cc M src/kudu/util/CMakeLists.txt M thirdparty/build-definitions.sh 31 files changed, 1,610 insertions(+), 166 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/69/18569/60 -- To view, visit http://gerrit.cloudera.org:8080/18569 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie72f6914eb5653a9c034766c6cd3741a8340711f Gerrit-Change-Number: 18569 Gerrit-PatchSet: 60 Gerrit-Owner: Yingchun Lai Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: KeDeng Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yifan Zhang Gerrit-Reviewer: Yingchun Lai Gerrit-Reviewer: Yuqi Du