Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18526 )

Change subject: IMPALA-10851: Codegen for structs
......................................................................


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/18526/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18526/2//COMMIT_MSG@10
PS2, Line 10: codegen turned off. This commit implements codegen for struct 
types.
Can you also mention that some code has been refactored?

As far as I understand the main reason for the refactor is to  add recursion 
inside the codegen logic to write structs in tuples to StructVal 
(SlotRef::GetCodegendComputeFnImpl()) and that write StructVal to tuples 
(CodegenAnyVal::WriteToSlot ).


http://gerrit.cloudera.org:8080/#/c/18526/2//COMMIT_MSG@11
PS2, Line 11:
Can you add some measurement about the codegen time of a deeply nested 
structure? I don't expect it to be bad, but just to check that it is not too 
slow.


http://gerrit.cloudera.org:8080/#/c/18526/2/be/src/codegen/codegen-anyval.h
File be/src/codegen/codegen-anyval.h:

http://gerrit.cloudera.org:8080/#/c/18526/2/be/src/codegen/codegen-anyval.h@52
PS2, Line 52: /// TYPE_ARRAY/TYPE_MAP/CollectionVal: { i64, i8* }
Can you mention StructVal here?


http://gerrit.cloudera.org:8080/#/c/18526/2/be/src/codegen/codegen-anyval.h@317
PS2, Line 317: SetToNull
Can you add a comment for this? So it sets the null indicatior bit to 0 + for 
structures it set all children's null indicator bit to 0


http://gerrit.cloudera.org:8080/#/c/18526/2/be/src/exprs/slot-ref.cc
File be/src/exprs/slot-ref.cc:

http://gerrit.cloudera.org:8080/#/c/18526/2/be/src/exprs/slot-ref.cc@328
PS2, Line 328: child
can you distinguish somehow between source and dest child, e.g. by using 
dstChild for destination?


http://gerrit.cloudera.org:8080/#/c/18526/2/be/src/exprs/slot-ref.cc@486
PS2, Line 486: /// The generated code can be conceptually divided into the 
following parts:
> just a brain dump:
I thought a bit more, and it seems feasible to extend this to structs: so this 
CodegenValueReader could also have child CodegenValueReaders, and the source 
would have to create this tree (so all of them would have a block for null and 
non-null case), while the destination would fill the null and non-null blocks. 
The logic seems the same in the AnyVal->slot and the slot->AnyVal case: the 
reading drives which block to branch to while checking nullness, and 
null/non-null write logic is implemented based on the destination, e.g. set all 
null bits to 0 if the destination is a slot.

I don't want to force this refactor during this patch, but I think that it 
would generally reduce duplications and some logic could be moved to the place 
where it actually belongs, e.g. CodegenAnyVal would no longer need a deep 
understanding of slots and tuples.



--
To view, visit http://gerrit.cloudera.org:8080/18526
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5272c3f095fd9f07877104ee03c8e43d0c4ec0b6
Gerrit-Change-Number: 18526
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Comment-Date: Mon, 23 May 2022 11:01:32 +0000
Gerrit-HasComments: Yes

Reply via email to