Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18433 )

Change subject: IMPALA-11141: Use exact data types in IN-list filter
......................................................................


Patch Set 3:

(14 comments)

Thank Qifan! Addressed the comments.

http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/exec/filter-context.cc
File be/src/exec/filter-context.cc:

http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/exec/filter-context.cc@451
PS2, Line 451: DCHECK
> DCHECK(false)
Done


http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter-ir.cc
File be/src/util/in-list-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter-ir.cc@32
PS2, Line 32: UNLIKELY(val
> UNLIKELY
Done


http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter-ir.cc@41
PS2, Line 41:       }                                                           
                    \
> probably can be moved to a reset() method.
Done


http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter-ir.cc@85
PS2, Line 85:  }                                                                
                     \
            :                                                                   
                      \
            :   template<>                                                      
                      \
            :   bool InListFilterImpl<StringValue, SLOT_TYPE>::Find(const void* 
val,                  \
            :       const ColumnType& col_typ
> should be part of a reset() method.
Done


http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter-test.cc
File be/src/util/in-list-filter-test.cc:

http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter-test.cc@173
PS2, Line 173: char str_buffer[] = "aabbccddeeff";
             :   const char* ptr = str_buffer;
             :   // Insert 3 values first
             :   for (int i = 0; i < 3; ++i) {
             :     filter->Insert(ptr);
             :     ptr += 2;
             :   }
> Some code duplication between CHAR test case and VARCHAR/STRING test case.
The trouble is they use different Find() and Insert() methods. VARCHAR/STRING 
expects their first argument to be "const StringValue*". CHAR expects their 
first argument to be "const char*". Maybe it's ok to keep the current code.


http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter.h
File be/src/util/in-list-filter.h:

http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter.h@70
PS2, Line 70:
> nit: excluding
Done


http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter.h@113
PS2, Line 113:   InListFilter(entry_limit, contains_null) {}
> I wonder if this can be moved to the base class with the modification:
sure


http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter.h@145
PS2, Line 145:     total_len += (res.second? v.len : 0);
             :     return res.second;
             :   }
> See the above comment about moving the method to base class.
Done


http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter.h@163
PS2, Line 163:
> nit. transferred
Done


http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter.h@165
PS2, Line 165:
> nit. maybe newly_inserted_values_ to make it clear.
Done


http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter.h@165
PS2, Line 165: template<PrimitiveType SLOT_TYPE>
             : class InListFilterImpl<StringValue,
> These two data members can be combined into a new struct
Good point!


http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter.cc
File be/src/util/in-list-filter.cc:

http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter.cc@65
PS2, Line 65: default:
> nit. At some point of time in future, we may need to support float and doub
I think it's not a best practise to use float/double as join keys since 
floating point computation is inaccurate. We can add them if users require them.


http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter.cc@143
PS2, Line 143:   for (const StringValue& s : newly_inserted_values_.values) {
> nit. May add a comment here: total_entries_ is updated during insert().
Done


http://gerrit.cloudera.org:8080/#/c/18433/2/be/src/util/in-list-filter.cc@144
PS2, Line 144:   Ubsan::MemCpy(buffer, s.ptr, s.len);
             :     values_.insert(StringVal
> this could be part of a method (say reset()) on the struct SetOfStringsWIth
Done



--
To view, visit http://gerrit.cloudera.org:8080/18433
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id434a542b2ced64efa3bfc974cb565b94a4193e9
Gerrit-Change-Number: 18433
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Mon, 25 Apr 2022 09:32:06 +0000
Gerrit-HasComments: Yes

Reply via email to