[ https://issues.apache.org/jira/browse/KYLIN-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15607966#comment-15607966 ]
Shaofeng SHI commented on KYLIN-2116:
-------------------------------------

No need, I guess; Kylin uses HCat to read the table, so the delimiter is transparent to Kylin. Just removing the "delimited by" clause may fix that.

> when hive field delimitor exists in table field values, fields order is wrong
> -----------------------------------------------------------------------------
>
>                 Key: KYLIN-2116
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2116
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine
>    Affects Versions: v1.5.2
>            Reporter: yubo
>            Assignee: Dong Li
>
> In step #1, when the temp hive table is created, a delimiter is specified:
>
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\177'
>
> When this delimiter exists in some field values, the field order is wrong.
> Test details:
> When we search with the same sql, different results are returned as below:
> 25 in kylin and 24 in hive.
> We guess there may be some problem within step #2 (Extract Fact Table Distinct Columns) when building the cube.
>
> 1. search in kylin
> select distinct visit_hour from KYLIN_REPORT_DB.session_behavior_channel_oms where visit_date >= '2016-10-19' and visit_date <= '2016-10-19'
> Results (25)
> 19
> 17
> 18
> 15
> 16
> 13
> 14
> 11
> 12
> 21
> 神马搜索
> 20
> 08
> 09
> 04
> 22
> 05
> 23
> 06
> 07
> 00
> 01
> 02
> 03
> 10
>
> 2. #2 Step Name: Extract Fact Table Distinct Columns output
> hadoop fs -cat /kylin/kylin_metadata/kylin-e8bb517d-6c29-4f89-a83e-871e142e3d48/channel_first_stage_flow_cube/fact_distinct_columns/VISIT_HOUR
> 00
> 01
> 02
> 03
> 04
> 05
> 06
> 07
> 08
> 09
> 10
> 11
> 12
> 13
> 14
> 15
> 16
> 17
> 18
> 19
> 20
> 21
> 22
> 23
> 神马搜索
>
> 3. hive table
> hive -e " select distinct visit_hour from KYLIN_REPORT_DB.session_behavior_channel_oms where visit_date >= '2016-10-19' and visit_date <= '2016-10-19' "
> WARNING: Use "yarn jar" to launch YARN applications.
> Logging initialized using configuration in file:/etc/hive/2.3.4.0-3485/0/hive-log4j.properties
> Query ID = hdfs_20161020164441_dcea3e55-1a8b-4f3a-9378-7dcda008001b
> Total jobs = 1
> Launching Job 1 out of 1
> Status: Running (Executing on YARN cluster with App id application_1476342479107_13013)
> --------------------------------------------------------------------------------
>         VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> --------------------------------------------------------------------------------
> Map 1 ..........   SUCCEEDED    564        564        0        0       0       0
> Reducer 2 ......   SUCCEEDED     15         15        0        0       0       0
> --------------------------------------------------------------------------------
> VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 168.70 s
> --------------------------------------------------------------------------------
> OK
> 03
> 12
> 13
> 22
> 05
> 14
> 08
> 17
> 00
> 02
> 18
> 06
> 23
> 01
> 19
> 07
> 10
> 15
> 20
> 16
> 11
> 04
> 09
> 21
> Time taken: 172.907 seconds, Fetched: 24 row(s)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
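
For readers following the thread, the failure mode reported in the issue can be reproduced in a few lines. The sketch below is only an illustration under stated assumptions, not Kylin or Hive code: the helpers `serialize_row` and `parse_row` and the `channel` column are hypothetical, and the column order is made up. It shows how a value that itself contains the '\177' delimiter shifts every later field by one position when the row is split again.

```python
# Hypothetical illustration only: mimics a text table written with
# ROW FORMAT DELIMITED FIELDS TERMINATED BY '\177'. Not Kylin or Hive code.
DELIM = "\x7f"  # octal 177, the DEL character


def serialize_row(fields):
    """Join field values with the delimiter, without any escaping."""
    return DELIM.join(fields)


def parse_row(line):
    """Split a stored line back into fields by position."""
    return line.split(DELIM)


# Assumed column order: visit_date, channel, visit_hour ("channel" is made up).
VISIT_HOUR = 2

clean = ["2016-10-19", "direct", "08"]
dirty = ["2016-10-19", "x" + DELIM + "神马搜索", "08"]  # delimiter inside a value

print(parse_row(serialize_row(clean))[VISIT_HOUR])  # prints 08
print(parse_row(serialize_row(dirty))[VISIT_HOUR])  # prints 神马搜索
```

In the second row the embedded delimiter produces four fields instead of three, so the position that should hold visit_hour picks up part of the preceding value. That matches the symptom in the issue, where a non-hour string shows up among the distinct visit_hour values.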