Re: Surprised!!!!!Spark-shell showing inconsistent results

2017-02-03 Thread Alex
Hi Team,

Actually, I figured something out.

When the Hive Java UDF is executed on Hive, it returns output with 10
decimal places of precision, but in Spark the same UDF returns results
rounded to 6 decimal places. How do I stop that? The same Java UDF jar
files are used in both Hive and Spark.
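
The thread does not establish why the digits are lost. One thing that may be worth ruling out (an assumption, not a diagnosis) is whether the value passes through a decimal type with scale 6 somewhere on the Spark side. The plain-Scala sketch below only shows what such a narrowing does to a 10-decimal value; the input number and the names `ScaleDemo`, `fromHive` and `narrow` are made up for illustration and do not come from the UDF.

```scala
import scala.math.BigDecimal.RoundingMode

object ScaleDemo {
  // Hypothetical 10-decimal value standing in for what the UDF returns on Hive.
  val fromHive: BigDecimal = BigDecimal("-15.9925831234")

  // Narrow a value to the given number of decimal places, the way a
  // decimal(p, scale) column type would when the value is stored in it.
  def narrow(v: BigDecimal, scale: Int): BigDecimal =
    v.setScale(scale, RoundingMode.HALF_UP)

  def main(args: Array[String]): Unit = {
    println(narrow(fromHive, 6))  // scale 6: the last four digits are dropped
    println(narrow(fromHive, 10)) // scale 10: the value is unchanged
  }
}
```

If the schema reported for the query result in spark-shell shows a decimal with scale 6 for that column, casting to a wider decimal in the query would be the next thing to try.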

[image: Inline image 1]



On Thu, Feb 2, 2017 at 3:33 PM, Alex  wrote:

> Hi,
>
> As shown below, the same query run back to back returns inconsistent
> results.
>
> testtable1 is an Avro SerDe table.
>
> [image: Inline image 1]
>
>
>
> scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
> res14: Array[org.apache.spark.sql.Row] =
> Array([1570,3364,201607,Y,APJ,PHILIPPINES,8518944,null,null,null,null,-15.992583,0.0,-15.992583,null,null,MONTH_ITEM_GROUP])
>
> scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
> res15: Array[org.apache.spark.sql.Row] =
> Array([1570,485888,20163,N,AMERICAS,BRAZIL,null,null,null,null,null,6019.2999,17198.0,6019.2999,null,null,QUARTER_GROUP])
>
> scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
> res16: Array[org.apache.spark.sql.Row] =
> Array([1570,3930,201607,Y,APJ,INDIA SUB-CONTINENT,8741220,null,null,null,null,-208.485216,0.0,-208.485216,null,null,MONTH_ITEM_GROUP])
>
>


Re: Surprised!!!!!Spark-shell showing inconsistent results

2017-02-02 Thread Marco Mistroni
Hi,
Have you tried sorting the results before comparing?




Re: Surprised!!!!!Spark-shell showing inconsistent results

2017-02-02 Thread Didac Gil
Is 1570 the value of col1?
If so, you have ordered by that column and selected only the first item. It
seems that all of the results have the same col1 value, so any of them would
be a valid answer to return. Right?
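
The point about ties can be sketched in plain Scala, without Spark. This is a minimal model under stated assumptions: the first two fields of each row are taken from the output in the thread, but calling the second field `col2` and using it as a tie-breaker is hypothetical, since the thread never shows the table's schema. `Rec` and `TieBreakDemo` are illustrative names as well.

```scala
// Minimal model of the situation: several rows share the same sort key.
final case class Rec(col1: Int, col2: Int, region: String)

object TieBreakDemo {
  // First two values of each row as shown in the thread; the field names
  // col2 and region are assumptions made for this sketch.
  val rows: Seq[Rec] = Seq(
    Rec(1570, 3364, "APJ"),
    Rec(1570, 485888, "AMERICAS"),
    Rec(1570, 3930, "APJ")
  )

  // Ordering by col1 alone: all three rows compare equal, so the "first" row
  // is simply whichever one arrived first in the input.
  def firstByCol1(input: Seq[Rec]): Rec =
    input.sortBy(_.col1).head

  // Ordering by (col1, col2): the tie is broken, so the answer no longer
  // depends on the order in which the rows arrive.
  def firstByCol1Col2(input: Seq[Rec]): Rec =
    input.sortBy(r => (r.col1, r.col2)).head

  def main(args: Array[String]): Unit = {
    val arrivalOrders = rows.permutations.toList
    // Number of distinct answers across every possible arrival order.
    println(arrivalOrders.map(firstByCol1).distinct.size)
    println(arrivalOrders.map(firstByCol1Col2).distinct.size)
  }
}
```

In the query itself, the equivalent change would be to extend the `order by` clause with enough additional columns to make each row's sort key unique.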






Surprised!!!!!Spark-shell showing inconsistent results

2017-02-02 Thread Alex
Hi,

As shown below, the same query run back to back returns inconsistent
results.

testtable1 is an Avro SerDe table.

[image: Inline image 1]



scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
res14: Array[org.apache.spark.sql.Row] =
Array([1570,3364,201607,Y,APJ,PHILIPPINES,8518944,null,null,null,null,-15.992583,0.0,-15.992583,null,null,MONTH_ITEM_GROUP])

scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
res15: Array[org.apache.spark.sql.Row] =
Array([1570,485888,20163,N,AMERICAS,BRAZIL,null,null,null,null,null,6019.2999,17198.0,6019.2999,null,null,QUARTER_GROUP])

scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
res16: Array[org.apache.spark.sql.Row] =
Array([1570,3930,201607,Y,APJ,INDIA SUB-CONTINENT,8741220,null,null,null,null,-208.485216,0.0,-208.485216,null,null,MONTH_ITEM_GROUP])