[jira] [Updated] (SPARK-25648) Spark 2.3.1 reads orc format files with native and hive, and return different results

Jun Zheng (JIRA) Fri, 05 Oct 2018 06:57:09 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-25648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jun Zheng updated SPARK-25648:
------------------------------
    Description: 
Hi All

I am testing [TPCx-BB|www.tpc.org/tpcx-bb/default.asp] with the code from 
[https://github.com/BigData-Lab-Frankfurt/Big-Data-Benchmark-for-Big-Bench].
 # The test data is loaded with spark-sql, with spark.sql.orc.impl set to 
native.
 # During the engine validation power test, q22 returns different results 
depending on the read engine, that is, on whether spark.sql.orc.impl = hive 
or spark.sql.orc.impl = native. With hive the result is correct, but with 
native fewer results are returned. Can someone help find out why this 
happens?

Thanks in advance

  was:
Hi All

I am testing [TPCx-BB|www.tpc.org/tpcx-bb/default.asp] with the code from 
[https://github.com/BigData-Lab-Frankfurt/Big-Data-Benchmark-for-Big-Bench].
 # The test data is loaded with spark-sql, with spark.sql.orc.impl set to 
native.
 # During the engine validation power test, q02 returns different results 
depending on the read engine, that is, on whether spark.sql.orc.impl = hive 
or spark.sql.orc.impl = native. With hive the result is correct, but with 
native fewer results are returned. Can someone help find out why this 
happens?

Thanks in advance


> Spark 2.3.1 reads orc format files with native and hive, and return 
> different results
> --------------------------------------------------------------------------------------
>
>                 Key: SPARK-25648
>                 URL: https://issues.apache.org/jira/browse/SPARK-25648
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.1
>            Reporter: Jun Zheng
>            Priority: Major
>
> Hi All
> I am testing [TPCx-BB|www.tpc.org/tpcx-bb/default.asp] with the code from 
> [https://github.com/BigData-Lab-Frankfurt/Big-Data-Benchmark-for-Big-Bench].
>  # The test data is loaded with spark-sql, with spark.sql.orc.impl set to 
> native.
>  # During the engine validation power test, q22 returns different results 
> depending on the read engine, that is, on whether spark.sql.orc.impl = hive 
> or spark.sql.orc.impl = native. With hive the result is correct, but with 
> native fewer results are returned. Can someone help find out why this 
> happens?
> Thanks in advance
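
One way to narrow down a discrepancy like the one described above is to run the same query under each ORC reader and diff the returned rows. The following is a minimal PySpark-style sketch and is not taken from the report: the helper names (diff_rows, run_with_orc_impl, compare_orc_readers) are hypothetical, it assumes an existing SparkSession passed in as `spark`, assumes that spark.sql.orc.impl can be changed at runtime for the tables involved, and assumes every column value is hashable (no array or map columns).

```python
from collections import Counter


def diff_rows(hive_rows, native_rows):
    """Compare two result sets as multisets.

    Returns (missing, extra): rows seen only with the hive reader and
    rows seen only with the native reader, each with its surplus count.
    """
    hive_counts = Counter(map(tuple, hive_rows))
    native_counts = Counter(map(tuple, native_rows))
    missing = hive_counts - native_counts  # returned by hive, not by native
    extra = native_counts - hive_counts    # returned by native, not by hive
    return missing, extra


def run_with_orc_impl(spark, impl, query):
    """Run `query` with spark.sql.orc.impl set to `impl` ("hive" or "native")."""
    spark.conf.set("spark.sql.orc.impl", impl)
    # pyspark Row objects are tuple subclasses, so tuple(row) keeps the values.
    return [tuple(row) for row in spark.sql(query).collect()]


def compare_orc_readers(spark, query):
    """Run the same query under both readers and return the row-level diff."""
    hive_rows = run_with_orc_impl(spark, "hive", query)
    native_rows = run_with_orc_impl(spark, "native", query)
    return diff_rows(hive_rows, native_rows)
```

Calling compare_orc_readers(spark, query_text) with the text of the failing query would return the rows that the native reader drops in the first element of the tuple. Rows are collected to the driver, so this only suits results that fit in memory. If cached relations might hide the configuration change, a more cautious variant would use a separate session per setting.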



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
