[jira] [Commented] (SPARK-25648) Spark 2.3.1 reads orc format files with native and hive, and return different results

Hyukjin Kwon (JIRA) Fri, 05 Oct 2018 06:55:16 -0700


    [ https://issues.apache.org/jira/browse/SPARK-25648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639824#comment-16639824 ]


Hyukjin Kwon commented on SPARK-25648:
--------------------------------------

[~justinnju] Which results were different, and how can we reproduce this? 
Was this data loss or a correctness issue?

> Spark 2.3.1 reads orc format  files with native and hive, and return 
> different results
> --------------------------------------------------------------------------------------
>
>                 Key: SPARK-25648
>                 URL: https://issues.apache.org/jira/browse/SPARK-25648
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.1
>            Reporter: Jun Zheng
>            Priority: Major
>
> Hi All
> I am testing TPCx-BB (www.tpc.org/tpcx-bb/default.asp) with the code from 
> https://github.com/BigData-Lab-Frankfurt/Big-Data-Benchmark-for-Big-Bench
>  # The test data are loaded by spark-sql with spark.sql.orc.impl set to 
> native.
>  # During the engine validation power test, q02 returns different results 
> depending on the read engine. With spark.sql.orc.impl = hive the result is 
> correct, but with spark.sql.orc.impl = native fewer rows are returned.
> Can someone help find out why this happens?
> Thanks in advance
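
One possible way to narrow down the discrepancy described in the report is to
read the same ORC data once per reader and diff the collected rows. The sketch
below is only an illustration, not the reporter's procedure: it assumes an
existing PySpark SparkSession passed in as `spark`, a caller-supplied `path`
to the ORC files, and rows whose values are hashable. The helper names
(collect_with_impl, diff_rows, compare_orc_readers) are hypothetical.

```python
from collections import Counter

ORC_IMPL_KEY = "spark.sql.orc.impl"  # the Spark SQL option named in the report


def collect_with_impl(spark, path, impl):
    """Read the ORC files at `path` with the given reader ("native" or "hive")."""
    spark.conf.set(ORC_IMPL_KEY, impl)
    # Convert each Row to a plain tuple so rows can be counted as dict keys.
    # Assumption: column values are hashable (no array or map columns).
    return [tuple(row) for row in spark.read.orc(path).collect()]


def diff_rows(hive_rows, native_rows):
    """Return (rows only in the hive result, rows only in the native result).

    Both values are Counters, so duplicated rows are compared by count.
    """
    hive_counts, native_counts = Counter(hive_rows), Counter(native_rows)
    return hive_counts - native_counts, native_counts - hive_counts


def compare_orc_readers(spark, path):
    """Read `path` with both readers and report the rows that differ."""
    hive_rows = collect_with_impl(spark, path, "hive")
    native_rows = collect_with_impl(spark, path, "native")
    return diff_rows(hive_rows, native_rows)
```

Calling compare_orc_readers(spark, path) on one of the benchmark's ORC tables
returns two Counters. If the first is non-empty and the second is empty, the
native reader is dropping rows at read time; if both are empty, the difference
more likely arises later in the q02 query itself.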



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
