-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37985/#review97837
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 272)
<https://reviews.apache.org/r/37985/#comment153887>

    Who calls this method? If this is required later can you put this in that 
patch. Focus only on refactoring in this patch. Easy to review too :)



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 275)
<https://reviews.apache.org/r/37985/#comment153888>

    Split into two assignments?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 284)
<https://reviews.apache.org/r/37985/#comment153885>

    Split statement?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 348)
<https://reviews.apache.org/r/37985/#comment153877>

    nit: rename to neededColumnNames?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 362)
<https://reviews.apache.org/r/37985/#comment153880>

    I think this is messed up from the beginning. The second argument is list 
of all projected column names and not just sarg column names. So rename of this 
method will be helpful.



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 378)
<https://reviews.apache.org/r/37985/#comment153881>

    Same naming confusion here.



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 388)
<https://reviews.apache.org/r/37985/#comment153875>

    This function name is misleading. This config represents the columns that 
are needed to be read not necessarily the column names in the SARG. SARG 
columns will be subset of needed columns.



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 392)
<https://reviews.apache.org/r/37985/#comment153876>

    Similarly as above. None of these methods gets the sarg column names 
instead it gets the included column names.



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 1388)
<https://reviews.apache.org/r/37985/#comment153882>

    This is the actual internal column index corresponding to SARG columns. 
Rename suggestion again. mapSargColumnsToInternalColIdx()?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 1393)
<https://reviews.apache.org/r/37985/#comment153893>

    where is this used? I don't see it being used anywhere. Can this be moved 
to relevant jira?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java (line 1228)
<https://reviews.apache.org/r/37985/#comment153889>

    indexInSourceTable will never be null. Right?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java (line 1231)
<https://reviews.apache.org/r/37985/#comment153894>

    Whats this translation doing exactly?
    Is this trying to map between sarg column -> table column index -> orc 
internal column index?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 
(line 517)
<https://reviews.apache.org/r/37985/#comment153884>

    Create a bug?



serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java (line 
146)
<https://reviews.apache.org/r/37985/#comment153883>

    I also don't understand why would it contain duplicates. IIRC, this was 
probably caused by multiple concatenation to the READ_COLUMN_IDS_CONF_STR. I am 
not sure if this happens anymore, in any case we should create a bug and remove 
this code. Or may be remove in the next patch. Also use, guava 
Splitter.splitToList?



storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentFactory.java
 (line 28)
<https://reviews.apache.org/r/37985/#comment153895>

    I would prefer setting new name based on internal column names. 
    For example:
    Schema from filedump: struct<_col0:tinyint,_col1:smallint>
    
    Current SARG: leaf-0 = (EQUALS i1 100)
    
    After this patch: leaf-0 = (EQUALS _col1 100)
    
    If we can map to internal names, the it will be easy to map sarg column 
names to internal column index. i1 -> 1 (after ripping off _col)


- Prasanth_J


On Sept. 2, 2015, 1:06 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37985/
> -----------------------------------------------------------
> 
> (Updated Sept. 2, 2015, 1:06 a.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 05efc5f 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java 8beff4b 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java fcb3746 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java 
> 4480600 
>   ql/src/java/org/apache/hadoop/hive/ql/io/sarg/ConvertAstToSearchArg.java 
> e034650 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 
> 2dc15f9 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b809a23 
>   serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 
> 10086c5 
>   
> storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentFactory.java
>  0778935 
>   
> storage-api/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
>  d27ac16 
> 
> Diff: https://reviews.apache.org/r/37985/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>

Reply via email to