Gergely Fürnstáhl has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19850 )

Change subject: WIP IMPALA-11619: Improve Iceberg V2 reads with a custom 
Iceberg Position Delete operator
......................................................................


Patch Set 10:

(34 comments)

Thanks, went through the backend, build/probe logic refactor and frontend is 
still TODO

http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/hash-table.h@442
PS10, Line 442:   friend class IcebergDeleteBuilder;
              :   friend class IcebergDeleteNode;
> Are these lines needed?
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.h
File be/src/exec/iceberg-delete-builder.h:

http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.h@18
PS10, Line 18: #ifndef IMPALA_EXEC_ICEBERG_DELETE_BUILDER_H
             : #define IMPALA_EXEC_ICEBERG_DELETE_BUILDER_H
> nit: use #pragma once
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.h@57
PS10, Line 57: a
> nit: an
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.h@58
PS10, Line 58:   /// IcebergDeleteNode.
> nit: fits previous line
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.h@63
PS10, Line 63: a
> nit: an
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.h@110
PS10, Line 110: each probe thread working on the same set of
              : /// partitions
> Obsolete comment, as this operator doesn't partition its inputs.
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.h@120
PS10, Line 120: long long
> nit: we use int64_t instead of long long in the Impala code
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.h@120
PS10, Line 120: h;
> nit: missing '_' at and of member name.
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.h@157
PS10, Line 157: hash_partitions_
> There is no 'hash_partitions_'
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.h@160
PS10, Line 160: that that
> nit: multiple 'that'
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.h@164
PS10, Line 164: Also used by
> nit: end of sentence is missing
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.cc
File be/src/exec/iceberg-delete-builder.cc:

http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.cc@58
PS10, Line 58: using strings::Substitute
> nit: common/names.h already has this using.
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.cc@95
PS10, Line 95: eq_join_conjuncts
> We could add DCHECK that the size of it is 2.
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.cc@127
PS10, Line 127: Hash
> IcebergPositionDelete?
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.cc@150
PS10, Line 150: Hash
> IcebergPositionDelete?
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-builder.cc@206
PS10, Line 206: boost
> Can we use std::filesystem?
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.h
File be/src/exec/iceberg-delete-node.h:

http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.h@18
PS10, Line 18: #ifndef IMPALA_EXEC_ICEBERG_DELETE_NODE_H
             : #define IMPALA_EXEC_ICEBERG_DELETE_NODE_H
> nit: use #pragma once
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.h@37
PS10, Line 37: class BloomFilter;
> unused
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.h@39
PS10, Line 39: class IcebergDeleteNode;
> not needed here
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.h@78
PS10, Line 78: t
> nit: to early line break here
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.h@79
PS10, Line 79: build partitions
> This solution doesn't use build partitions AFAICT.
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.h@81
PS10, Line 81: phase
> nit: 'build phase', or 'this phase' instead of 'the phase'
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.h@89
PS10, Line 89: ///
             : ///
             : ///
> nit: too many empty lines
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.h@104
PS10, Line 104: virtual
> nit: 'virtual' keyword is redundant here
that is the more common style in the code base, I would keep it that way


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.h@186
PS10, Line 186: Get the next row batch from the probe (left) side (child(0))
> This operator always gets the next row bratch from the probe side, right?
Yes


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.h@210
PS10, Line 210:   /// Additional state for flushing build-side data. Needed for
              :   /// separate build.
              :   /// TODO: IMPALA-9411: this could be removed if we attached 
the buffers.
              :   bool flushed_unattachable_build_buffers_ = false;
> Not relevant for this operator?
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.cc
File be/src/exec/iceberg-delete-node.cc:

http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.cc@52
PS10, Line 52: using strings::Substitute
> Unnecessary because of common/names.h
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.cc@142
PS10, Line 142:
> nit: unnecessary empty line
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.cc@345
PS10, Line 345: (
> nit: I see no closing parenthesis.
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.cc@359
PS10, Line 359:   ;
> nit: unnecessary ;
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.cc@392
PS10, Line 392: {
              :       }
> empty block
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.inline.h
File be/src/exec/iceberg-delete-node.inline.h:

http://gerrit.cloudera.org:8080/#/c/19850/10/be/src/exec/iceberg-delete-node.inline.h@18
PS10, Line 18: #ifndef IMPALA_EXEC_ICEBERG_DELETE_NODE_INLINE_H
             : #define IMPALA_EXEC_ICEBERG_DELETE_NODE_INLINE_H
> nit: use #pragma once
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/19850/10/common/thrift/ImpalaService.thrift@795
PS10, Line 795: of
> nit: off
Done


http://gerrit.cloudera.org:8080/#/c/19850/10/common/thrift/Query.thrift
File common/thrift/Query.thrift:

http://gerrit.cloudera.org:8080/#/c/19850/10/common/thrift/Query.thrift@638
PS10, Line 638: 159: optional TJoinDistributionMode 
optimized_iceberg_v2_read_distribution_mode_override
> This is good for perf testing for now, but we should limit the number of qu
yes, that is the plan



--
To view, visit http://gerrit.cloudera.org:8080/19850
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I024a61573c83bda5584f243c879d9ff39dd2dcfa
Gerrit-Change-Number: 19850
Gerrit-PatchSet: 10
Gerrit-Owner: Gergely Fürnstáhl <gfurnst...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gfurnst...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Tue, 23 May 2023 10:29:47 +0000
Gerrit-HasComments: Yes

Reply via email to