Alex Behm has posted comments on this change. Change subject: IMPALA-4768: Improve logging of table loading. ......................................................................
Patch Set 3: (2 comments) I'm not set on implementing the logging this way, but shipping the current code as is will be a supportability nightmare. Bharath and I have felt the impact of the lack of log messages while debugging a problem in an internal cluster. I'm happy to implement these changes in a different way if you have a suggestion. http://gerrit.cloudera.org:8080/#/c/5709/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: PS3, Line 241: BlockMetadataLoadStats > I don't see any block-metadata-specific stats, so I am wondering if the nam The file counts are specific to the block metadata loading. For full loads of partitioned tables, the common case is that we recursively iterate over *all* files in the table's root directory and then map the files to partitions based on the their containing directory. This means that there could be files in the table's root dir that don't belong to any partition. So the number of files scanned and the number of files actually belonging to the table could be different, and a large discrepancy could explain loading perf regressions (if any). I'm ok with removing this change if you prefer. PS3, Line 244: // Number of files that were found to belong to the table being loaded. : public long tableFileCount = 0; > Not sure I understand what this field represents? Can you please explain ? See above comment, hope it explains it. Let me know if you want to keep or remove this. If you prefer to keep, I'll expand the comments. -- To view, visit http://gerrit.cloudera.org:8080/5709 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8de96d0cb6d09b2272b1925d42cb059367fe7196 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-HasComments: Yes