[jira] [Commented] (KYLIN-2116) when hive field delimitor exists in table field values, fields order is wrong

Shaofeng SHI (JIRA) Wed, 26 Oct 2016 02:24:07 -0700

    [ https://issues.apache.org/jira/browse/KYLIN-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15607966#comment-15607966 ]


Shaofeng SHI commented on KYLIN-2116:
-------------------------------------

No need, I guess. Kylin uses HCatalog (HCat) to read the table, so the delimiter 
is transparent to Kylin. Just removing the "delimited by" clause may fix that.
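
For readers unfamiliar with the failure mode, here is a minimal sketch of how an unescaped delimiter inside a value shifts every later column. It is an illustration only, not Kylin or Hive code: the three-column layout, the sample values, and the helper names write_row and read_row are all assumptions made up for this example. Only the '\177' delimiter and the stray value 神马搜索 come from the report.

```python
# Illustration only: hypothetical helpers, not Kylin or Hive code.
DELIM = "\x7f"  # same character as '\177' in the ROW FORMAT clause


def write_row(fields):
    """Serialize a row as delimited text, with no escaping of the delimiter."""
    return DELIM.join(fields)


def read_row(line):
    """Parse a row by splitting on the delimiter."""
    return line.split(DELIM)


# Hypothetical column layout: (channel, source, visit_hour)
VISIT_HOUR = 2

clean = ["c1", "神马搜索", "08"]
# Same row, but the first value happens to contain the delimiter character.
dirty = ["c1" + DELIM + "x", "神马搜索", "08"]

print(read_row(write_row(clean))[VISIT_HOUR])  # 08
print(read_row(write_row(dirty))[VISIT_HOUR])  # 神马搜索 (columns shifted by one)
```

In the second row the reader sees four fields instead of three, so a value from a neighbouring column lands in visit_hour, which matches the symptom in the report below.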

> when hive field delimitor exists in table field values, fields order is wrong
> -----------------------------------------------------------------------------
>
>                 Key: KYLIN-2116
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2116
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine
>    Affects Versions: v1.5.2
>            Reporter: yubo
>            Assignee: Dong Li
>
> In step #1, when the temporary Hive table is created, a field delimiter is 
> specified: 
>  
>  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\177' 
>  
> When this delimiter appears inside a field value, the field order is wrong. 
> Test details: 
> When we run the same SQL, different results are returned, as shown below: 
> 25 rows in Kylin and 24 in Hive. 
> We guess there may be a problem in step #2 (Extract Fact Table Distinct 
> Columns) when building the cube. 
> 1. Search in Kylin 
>  select distinct visit_hour from KYLIN_REPORT_DB.session_behavior_channel_oms 
> where  visit_date >= '2016-10-19' and visit_date <= '2016-10-19' 
>  Results (25) 
> 19 
> 17 
> 18 
> 15 
> 16 
> 13 
> 14 
> 11 
> 12 
> 21 
> 神马搜索 
> 20 
> 08 
> 09 
> 04 
> 22 
> 05 
> 23 
> 06 
> 07 
> 00 
> 01 
> 02 
> 03 
> 10 
> 2. Output of step #2 (Extract Fact Table Distinct Columns) 
> hadoop fs -cat /kylin/kylin_metadata/kylin-e8bb517d-6c29-4f89-a83e-871e142e3d48/channel_first_stage_flow_cube/fact_distinct_columns/VISIT_HOUR
>  
> 00 
> 01 
> 02 
> 03 
> 04 
> 05 
> 06 
> 07 
> 08 
> 09 
> 10 
> 11 
> 12 
> 13 
> 14 
> 15 
> 16 
> 17 
> 18 
> 19 
> 20 
> 21 
> 22 
> 23 
> 神马搜索 
> 3. Search in Hive 
> hive -e " select distinct visit_hour from 
> KYLIN_REPORT_DB.session_behavior_channel_oms where  visit_date >= 
> '2016-10-19' and visit_date <= '2016-10-19' " 
> WARNING: Use "yarn jar" to launch YARN applications. 
> Logging initialized using configuration in file:/etc/hive/2.3.4.0-3485/0/hive-log4j.properties 
> Query ID = hdfs_20161020164441_dcea3e55-1a8b-4f3a-9378-7dcda008001b 
> Total jobs = 1 
> Launching Job 1 out of 1 
> Status: Running (Executing on YARN cluster with App id application_1476342479107_13013) 
> --------------------------------------------------------------------------------
>         VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> --------------------------------------------------------------------------------
> Map 1 ..........   SUCCEEDED    564        564        0        0       0       0
> Reducer 2 ......   SUCCEEDED     15         15        0        0       0       0
> --------------------------------------------------------------------------------
> VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 168.70 s
> --------------------------------------------------------------------------------
> OK 
> 03 
> 12 
> 13 
> 22 
> 05 
> 14 
> 08 
> 17 
> 00 
> 02 
> 18 
> 06 
> 23 
> 01 
> 19 
> 07 
> 10 
> 15 
> 20 
> 16 
> 11 
> 04 
> 09 
> 21 
> Time taken: 172.907 seconds, Fetched: 24 row(s) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
