Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8146 )
Change subject: IMPALA-5307: Part 2: copy out strings in uncompressed Avro
......................................................................

Patch Set 15:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner-ir.cc
File be/src/exec/hdfs-avro-scanner-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner-ir.cc@a213
PS15, Line 213:
> If I understand it correctly, this if branch was dead code before this chan
Yeah, I missed cleaning it up in an earlier commit. I mention this in my admittedly gigantic commit message.

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner-ir.cc@51
PS15, Line 51: !tuple->CopyStrings("HdfsAvroScanner::DecodeAvroData()",
: state_, string_slot_offsets_.data(), string_slot_offsets_.size(), pool,
: &parse_status_))
> nit: tuple->CopyStrings(...) == nullptr
It returns a bool though, since it doesn't reallocate the tuple itself.

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner.h
File be/src/exec/hdfs-avro-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner.h@133
PS15, Line 133: //
> nit: /// to be consistent.
Done

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner.cc
File be/src/exec/hdfs-avro-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner.cc@1066
PS15, Line 1066: HdfsScanNodeBase* node
> Should this be const HdfsScanNodeBase* ?
Done. This required propagating the const qualifier to a few more places.

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-scanner.h@406
PS15, Line 406: /// Codegen function to replace InitTuple(). The codegen'd version of InitTuple() is
: /// stored in 'init_tuple_fn' if codegen was successful.
> May help to also state the codegen'd version of the function has some const
Done

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-scanner.cc@535
PS15, Line 535:
> nit: indent 4
Done

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/descriptors.h@93
PS15, Line 93: if
> is
Done

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/descriptors.h@109
PS15, Line 109: llvm::Constant* ToIR(LlvmCodeGen* codegen) const;
> Comment: This needs to be updated should the layout of this struct change.
Done

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple-ir.cc
File be/src/runtime/tuple-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple-ir.cc@28
PS15, Line 28: for (int i = 0; i < num_string_slots; ++i) {
> Not sure if it will help but have you tried #pragma unroll hint here to see
I tried a couple of queries but didn't see a noticeable difference in perf.

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.h
File be/src/runtime/tuple.h:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.h@47
PS15, Line 47: /// Generate an LLVM Constant containing the offset values of this SlotOffsets instance.
> Please also comment that this function needs to be updated if the layout of
Done

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.h@204
PS15, Line 204: //
> nit: ///
Done

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.cc
File be/src/runtime/tuple.cc:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.cc@404
PS15, Line 404: materialize_strings_fn
> nit: Using the name copy_strings_fn will be more consistent.
Thanks for catching this, I missed this one place.
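
For anyone following the CopyStrings() discussion above, here is a rough, self-contained sketch of the general shape: loop over the string slots, skip NULL ones, copy each string's bytes into a pool, and return a bool because the tuple is patched in place rather than reallocated. Everything in it (StringValue, NullIndicatorOffset, SlotOffsets as laid out here, SimplePool, and this CopyStrings signature) is a hypothetical stand-in for illustration and not the actual Impala code, which also takes an error context, RuntimeState and Status.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; not Impala's real types.
struct StringValue {
  char* ptr;
  int len;
};

struct NullIndicatorOffset {
  int byte_offset;   // byte within the tuple that holds the null bit
  uint8_t bit_mask;  // mask selecting the null bit in that byte
};

struct SlotOffsets {
  NullIndicatorOffset null_indicator_offset;
  int tuple_offset;  // byte offset of the StringValue within the tuple
};

// Toy pool with a hard byte limit so that allocation failure can be shown.
class SimplePool {
 public:
  explicit SimplePool(size_t limit) : limit_(limit) {}

  // Returns nullptr instead of throwing when the limit would be exceeded.
  char* TryAllocate(size_t n) {
    if (used_ + n > limit_) return nullptr;
    chunks_.emplace_back(new char[n]);
    used_ += n;
    return chunks_.back().get();
  }

 private:
  size_t limit_;
  size_t used_ = 0;
  std::vector<std::unique_ptr<char[]>> chunks_;
};

// Copies the data of every non-NULL string slot into 'pool' and repoints the
// slot at the copy. The tuple itself is modified in place, so there is no new
// tuple pointer to return; the bool only reports whether allocation succeeded.
bool CopyStrings(uint8_t* tuple, const SlotOffsets* offsets, int num_string_slots,
    SimplePool* pool) {
  assert(num_string_slots >= 0);
  for (int i = 0; i < num_string_slots; ++i) {
    const SlotOffsets& o = offsets[i];
    const NullIndicatorOffset& n = o.null_indicator_offset;
    if ((tuple[n.byte_offset] & n.bit_mask) != 0) continue;  // slot is NULL

    StringValue sv;
    std::memcpy(&sv, tuple + o.tuple_offset, sizeof(sv));
    if (sv.len == 0) continue;  // nothing to copy

    char* copy = pool->TryAllocate(static_cast<size_t>(sv.len));
    if (copy == nullptr) return false;  // caller is expected to surface the error
    std::memcpy(copy, sv.ptr, static_cast<size_t>(sv.len));
    sv.ptr = copy;
    std::memcpy(tuple + o.tuple_offset, &sv, sizeof(sv));
  }
  return true;
}
```

With this shape the call site reads naturally as if (!CopyStrings(...)) { handle error }, which is why comparing the result against nullptr would not apply.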

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.cc@435
PS15, Line 435: Constant*
> Not your change but I feel it's generally less confusing to include llvm::
I removed the "using namespace llvm" in this file and added llvm:: to the appropriate places.

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.cc@435
PS15, Line 435: slot_offset_constants
> nit: 'slot_offset_ir_constants' may make it easier to follow.
Done

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.cc@436
PS15, Line 436: for (SlotDescriptor* slot_desc : desc.string_slots()) {
: SlotOffsets offsets = {slot_desc->null_indicator_offset(), slot_desc->tuple_offset()};
: slot_offset_constants.push_back(offsets.ToIR(codegen));
: }
:
: Constant* constant_slot_offsets = codegen->ConstantsToGVArrayPtr(
: slot_offsets_type, slot_offset_constants, "slot_offsets");
: Constant* num_string_slots =
: ConstantInt::get(codegen->int_type(), desc.string_slots().size());
> I think it may be helpful to add a comment on what line 435 - 444 is trying
Done

--
To view, visit http://gerrit.cloudera.org:8080/8146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233
Gerrit-Change-Number: 8146
Gerrit-PatchSet: 15
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Mon, 30 Oct 2017 22:03:05 +0000
Gerrit-HasComments: Yes