[ https://issues.apache.org/jira/browse/SPARK-19102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

XiaodongCui updated SPARK-19102:
--------------------------------
    Description: 
The two queries below return different values for the second column
(sumprice): the second query's result is 10000 times larger than the first
query's. The bug only reproduces with queries of the form
SUM(a * b), COUNT(DISTINCT c).

        DataFrame df1 = sqlContext.read().parquet("hdfs://cdh01:8020/sandboxdata_A/test/a");
        df1.registerTempTable("hd_salesflat");
        // Query 1: SUM only.
        DataFrame cube5 = sqlContext.sql(
            "SELECT areacode1, SUM(quantity*unitprice) AS sumprice "
            + "FROM hd_salesflat GROUP BY areacode1");
        // Query 2: the same SUM plus COUNT(DISTINCT ...).
        DataFrame cube6 = sqlContext.sql(
            "SELECT areacode1, SUM(quantity*unitprice) AS sumprice, COUNT(DISTINCT transno) "
            + "FROM hd_salesflat GROUP BY areacode1");
        cube5.show(50);
        cube6.show(50);
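
To quantify the mismatch per group rather than reading it off two printed
tables, the sumprice values of both results can be compared key by key. The
following is only a minimal sketch in plain Java, not part of the original
report: the class name SumPriceRatioCheck, the ratios helper and the sample
values are hypothetical, and it assumes the rows of cube5 and cube6 have
already been collected into maps from areacode1 to sumprice.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.LinkedHashMap;
import java.util.Map;

public class SumPriceRatioCheck {

    /** For every key present in both results, returns second / first (4 decimal places). */
    static Map<String, BigDecimal> ratios(Map<String, BigDecimal> first,
                                          Map<String, BigDecimal> second) {
        Map<String, BigDecimal> out = new LinkedHashMap<>();
        for (Map.Entry<String, BigDecimal> e : first.entrySet()) {
            BigDecimal other = second.get(e.getKey());
            // Skip keys missing from the second result and zero sums (no ratio defined).
            if (other == null || e.getValue().signum() == 0) {
                continue;
            }
            out.put(e.getKey(), other.divide(e.getValue(), 4, RoundingMode.HALF_UP));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical values for illustration only, not taken from the report.
        Map<String, BigDecimal> cube5 = new LinkedHashMap<>();
        cube5.put("HDCN", new BigDecimal("25.0000"));
        Map<String, BigDecimal> cube6 = new LinkedHashMap<>();
        cube6.put("HDCN", new BigDecimal("250000.0000"));

        // A ratio of 10000 for a key matches the discrepancy described above.
        System.out.println(ratios(cube5, cube6));
    }
}
```

A ratio of exactly 10000 for every areacode1 would suggest a systematic
scaling problem rather than wrong input rows, but this check only measures
the discrepancy and does not identify its cause.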

My data:
| transno|   lineno|productid|netamount|netamountoperation|serviceamount|quantity|unitprice|taxamount|discountamount|discountamountoperation|saleshour|businessdate|salesdate|week|holidayname|holidayid|financialyear|financialmonth|          dateticket|calendaryear|calendarmonth|calendarmonthchr|memberno|salestype|covers|grossamountticket|netamountticket|netamountoperationticket|points|discountamountticket|discountamounroperationticket|serviceamountticket|invoicecount|taxamountticket|  shopno|shopid|tableno|areacode1|areaname1|areacode2|areaname2|areacode3|areaname3|areacode4|areaname4|   orgno|orgtype|  hdsino|shopname|   shopenname|shopbrname|commercial1|com1name|commercial2|   com2name|shoptype1|shoptype1name|shoptype2|shoptype2name|taxtype|floorlocation|        m2|deliverareano|deliverareaname|parentorgno|cityno|country|menutype|menutypename|costcenterno|costcentername|pricearea|priceareaname|            opendate|openyear|shopcategory|timeperiod|           closedate| sapshopno|cg5no|countryname|province|provincename|cityname|countrycode|categoryno|categoryname|categoryno2|categoryname2|categoryno3|categoryname3|categoryno4|categoryname4|productno|productname|productenname|salesprice|vouchertype|           startdate|             enddate|flavor|basicunit|discountno|discountdetailamountoperation|disdesctiption|promotionno|salestag|salestagname|usertype|usertypevalue|usercd|        grossavg|   netoperationavg|            netavg|dineincount|dayamttotal|daynetamttotal|daynetamtopttotal|daytctotal|tablecount|
+--------+---------+---------+---------+------------------+-------------+--------+---------+---------+--------------+-----------------------+---------+------------+---------+----+-----------+---------+-------------+--------------+--------------------+------------+-------------+----------------+--------+---------+------+-----------------+---------------+------------------------+------+--------------------+-----------------------------+-------------------+------------+---------------+--------+------+-------+---------+---------+---------+---------+---------+---------+---------+---------+--------+-------+--------+--------+-------------+----------+-----------+--------+-----------+-----------+---------+-------------+---------+-------------+-------+-------------+----------+-------------+---------------+-----------+------+-------+--------+------------+------------+--------------+---------+-------------+--------------------+--------+------------+----------+--------------------+----------+-----+-----------+--------+------------+--------+-----------+----------+------------+-----------+-------------+-----------+-------------+-----------+-------------+---------+-----------+-------------+----------+-----------+--------------------+--------------------+------+---------+----------+-----------------------------+--------------+-----------+--------+------------+--------+-------------+------+----------------+------------------+------------------+-----------+-----------+--------------+-----------------+----------+----------+
|76317828|121082663|     1392|  25.0000|           25.0000|         null|  1.0000|  25.0000|   1.4200|        0.0000|                 0.0000|        5|    20160920| 20160920| Tue|           |     null|         2017|             4|2016-09-20 17:03:...|        2016|            9|             Sep| 1329651|     SALE|     1|          25.0000|        25.0000|                 25.0000|  null|              0.0000|                       0.0000|             0.0000|           0|         1.4200|CNSHA006|   202|       |     HDCN|   哈根达斯中国|     CN01|     大华东区|   CN0001|     上海大区| CN000001|     上海1区|CNSHA006|      1|HDAS0251|   上海南东店|NAN DONG SHOP|      SHND|         01|  市级商业中心|          2|High Street|        1|     Flagship|        1|          无户外|     10|            1|298.000000|       DL0141|          上海A天天|   CN000001|   SHA|     CN|       1|        Full|        CN8X|          上海本地|   MK0004|    新外带菜单价格区域|2003-10-01 00:00:...|      -1|         old|   Morning|9999-12-31 00:00:...|C_CN8XNA04|  BBR|         中国|      SH|          上海|      上海|         CN|         1|       Sales|        101|     Products|      10104|       Coffee|       C006|         纸杯咖啡|     null|     美式咖啡TK|         null|      null|       null|2013-01-01 00:00:...|2020-11-12 00:00:...|  null|       BL|      null|                         null|          null|       null|      40|      月饼及月饼券|    null|         null|  null|25.0000000000000|25.000000000000000|25.000000000000000|       null|840000.0000|       84.0000|          84.0000|         1|        53|
+--------+---------+---------+---------+------------------+-------------+--------+---------+---------+--------------+-----------------------+---------+------------+---------+----+-----------+---------+-------------+--------------+--------------------+------------+-------------+----------------+--------+---------+------+-----------------+---------------+------------------------+------+--------------------+-----------------------------+-------------------+------------+---------------+--------+------+-------+---------+---------+---------+---------+---------+---------+---------+---------+--------+-------+--------+--------+-------------+----------+-----------+--------+-----------+-----------+---------+-------------+---------+-------------+-------+-------------+----------+-------------+---------------+-----------+------+-------+--------+------------+------------+--------------+---------+-------------+--------------------+--------+------------+----------+--------------------+----------+-----+-----------+--------+------------+--------+-----------+----------+------------+-----------+-------------+-----------+-------------+-----------+-------------+---------+-----------+-------------+----------+-----------+--------------------+--------------------+------+---------+----------+-----------------------------+--------------+-----------+--------+------------+--------+-------------+------+----------------+------------------+------------------+-----------+-----------+--------------+-----------------+----------+----------+



  was:
The two queries below return different values for the second column
(sumprice): the second query's result is 10000 times larger than the first
query's. The bug only reproduces with queries of the form
SUM(a * b), COUNT(DISTINCT c).

        DataFrame df1 = sqlContext.read().parquet("hdfs://cdh01:8020/sandboxdata_A/test/a");
        df1.registerTempTable("hd_salesflat");
        DataFrame cube5 = sqlContext.sql(
            "SELECT areacode1, SUM(quantity*unitprice) AS sumprice "
            + "FROM hd_salesflat GROUP BY areacode1");
        DataFrame cube6 = sqlContext.sql(
            "SELECT areacode1, SUM(quantity*unitprice) AS sumprice, COUNT(DISTINCT transno) "
            + "FROM hd_salesflat GROUP BY areacode1");
        cube5.show(50);
        cube6.show(50);




> Accuracy error of spark SQL results
> -----------------------------------
>
>                 Key: SPARK-19102
>                 URL: https://issues.apache.org/jira/browse/SPARK-19102
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 1.6.0, 1.6.1
>         Environment: Spark 1.6.0, Hadoop 2.6.0, JDK 1.8, CentOS 6.6
>            Reporter: XiaodongCui



