Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8146 )

Change subject: IMPALA-5307: Part 2: copy out strings in uncompressed Avro
......................................................................


Patch Set 11:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8146/11/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

http://gerrit.cloudera.org:8080/#/c/8146/11/be/src/runtime/descriptors.cc@87
PS11, Line 87:   return ConstantStruct::get(null_indicator_offset_type,
             :       {ConstantInt::get(codegen->int_type(), byte_offset),
             :       ConstantInt::get(codegen->tinyint_type(), bit_mask),
             :       zeroes->getStructElement(2)});
> This needs to be updated when we change the definition of NullIndicatorOffs
Added a comment to the class definition.

ConstantStruct::get() asserts if the number of elements doesn't match the 
number of elements in the struct (that's why I had to add the padding 
explicitly). This behaviour seems better than silently failing.

The code in LLVM is:

  Constant *ConstantStruct::get(StructType *ST, ArrayRef<Constant*> V) {
    assert((ST->isOpaque() || ST->getNumElements() == V.size()) &&
          "Incorrect # elements specified to ConstantStruct::get");


http://gerrit.cloudera.org:8080/#/c/8146/11/be/src/runtime/tuple.h
File be/src/runtime/tuple.h:

http://gerrit.cloudera.org:8080/#/c/8146/11/be/src/runtime/tuple.h@289
PS11, Line 289: );
Looking in the IR I noticed that there were "invoke" instruction emitted when 
calling this function.


http://gerrit.cloudera.org:8080/#/c/8146/11/be/src/runtime/tuple.cc
File be/src/runtime/tuple.cc:

http://gerrit.cloudera.org:8080/#/c/8146/11/be/src/runtime/tuple.cc@397
PS11, Line 397: Status(
> Status::Expected() ?
I think finalisation failing should be very unusual - either an Impala bug or a 
corrupt IR UDF. To be honest I don't know how robust our handling of this is, 
since we'll continue on and eventually try to optimise and compile the module.


http://gerrit.cloudera.org:8080/#/c/8146/11/be/src/runtime/tuple.cc@449
PS11, Line 449:   Value* result_val = builder.CreateCall(cross_compiled_fn,
> Just curious if you inspected the output IR to see if it's actually unrolli
I confirmed that it was unrolled for a handful of string columns. For 100s of 
string columns, e.g,  select count(*) from (select distinct * from 
widetable_1000_cols) v;, it is not unrolled.

I checked some intermediate values too and it seems pretty conservative - it 
seemed like the switch-over point is around materialising >=8 string columns. 
E.g. select * from tpch_avro did not unroll the loop.



--
To view, visit http://gerrit.cloudera.org:8080/8146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233
Gerrit-Change-Number: 8146
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Mon, 23 Oct 2017 23:36:23 +0000
Gerrit-HasComments: Yes

Reply via email to