Re: Surprised!!!!! Spark-shell showing inconsistent results
Hi Team,

Actually I figured out something: when the Hive Java UDF is executed in Hive it returns output with 10 decimal digits of precision, but in Spark the same UDF's results come back rounded to 6 decimal places. How do I stop that? It is the same Java UDF jar used in both Hive and Spark.

On Thu, Feb 2, 2017 at 3:33 PM, Alex wrote:
> Hi, as shown below, the same query run back to back returns inconsistent
> results.
>
> testtable1 is an Avro SerDe table.
>
> hc.sql("select * from testtable1 order by col1 limit 1").collect;
> res14: Array[org.apache.spark.sql.Row] = Array([1570,3364,201607,Y,APJ,PHILIPPINES,8518944,null,null,null,null,-15.992583,0.0,-15.992583,null,null,MONTH_ITEM_GROUP])
>
> scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
> res15: Array[org.apache.spark.sql.Row] = Array([1570,485888,20163,N,AMERICAS,BRAZIL,null,null,null,null,null,6019.2999,17198.0,6019.2999,null,null,QUARTER_GROUP])
>
> scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
> res16: Array[org.apache.spark.sql.Row] = Array([1570,3930,201607,Y,APJ,INDIA SUB-CONTINENT,8741220,null,null,null,null,-208.485216,0.0,-208.485216,null,null,MONTH_ITEM_GROUP])
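[A hedged sketch for the precision question above. Loss of digits beyond roughly 6-7 decimal places is the signature of a value passing through a 32-bit float (about 7 significant digits) rather than a 64-bit double (about 15-16), so one thing worth trying is forcing the UDF's input and output through DOUBLE explicitly. The function name my_udf, its class, and the column name col12 are placeholders, not names from the original thread:]

    // If Spark is binding the UDF's argument or return type to float,
    // an explicit CAST to DOUBLE on both sides may preserve precision.
    hc.sql("CREATE TEMPORARY FUNCTION my_udf AS 'com.example.MyUDF'")
    val df = hc.sql(
      "SELECT CAST(my_udf(CAST(col12 AS DOUBLE)) AS DOUBLE) FROM testtable1")
    df.collect

[This is only a diagnostic sketch under that float-vs-double assumption; if the casts change the output, the fix belongs in the UDF's Java signature/ObjectInspector types rather than in every query.]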
Re: Surprised!!!!! Spark-shell showing inconsistent results
Hi,

Have you tried sorting the results before comparing them?

On 2 Feb 2017 10:03 am, "Alex" wrote:
> Hi, as shown below, the same query run back to back returns inconsistent
> results.
>
> testtable1 is an Avro SerDe table.
>
> hc.sql("select * from testtable1 order by col1 limit 1").collect;
> res14: Array[org.apache.spark.sql.Row] = Array([1570,3364,201607,Y,APJ,PHILIPPINES,8518944,null,null,null,null,-15.992583,0.0,-15.992583,null,null,MONTH_ITEM_GROUP])
>
> scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
> res15: Array[org.apache.spark.sql.Row] = Array([1570,485888,20163,N,AMERICAS,BRAZIL,null,null,null,null,null,6019.2999,17198.0,6019.2999,null,null,QUARTER_GROUP])
>
> scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
> res16: Array[org.apache.spark.sql.Row] = Array([1570,3930,201607,Y,APJ,INDIA SUB-CONTINENT,8741220,null,null,null,null,-208.485216,0.0,-208.485216,null,null,MONTH_ITEM_GROUP])
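[A small sketch of the suggestion above, assuming the point is to compare two result sets order-insensitively. Collecting and sorting a row's string form client-side is only sensible for small results; the variable names are illustrative:]

    // Sort both collected result sets before comparing, so that row
    // ordering alone cannot produce a spurious mismatch.
    val a = hc.sql("select * from testtable1").collect.map(_.toString).sorted
    val b = hc.sql("select * from testtable1").collect.map(_.toString).sorted
    println(a.sameElements(b))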
Re: Surprised!!!!! Spark-shell showing inconsistent results
Is 1570 the value of col1? If so, you have ordered by that column and selected only the first row. All three results have the same col1 value, so any of them is a correct row to return. Right?

> On 2 Feb 2017, at 11:03, Alex wrote:
>
> Hi, as shown below, the same query run back to back returns inconsistent
> results.
>
> testtable1 is an Avro SerDe table.
>
> hc.sql("select * from testtable1 order by col1 limit 1").collect;
> res14: Array[org.apache.spark.sql.Row] = Array([1570,3364,201607,Y,APJ,PHILIPPINES,8518944,null,null,null,null,-15.992583,0.0,-15.992583,null,null,MONTH_ITEM_GROUP])
>
> scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
> res15: Array[org.apache.spark.sql.Row] = Array([1570,485888,20163,N,AMERICAS,BRAZIL,null,null,null,null,null,6019.2999,17198.0,6019.2999,null,null,QUARTER_GROUP])
>
> scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
> res16: Array[org.apache.spark.sql.Row] = Array([1570,3930,201607,Y,APJ,INDIA SUB-CONTINENT,8741220,null,null,null,null,-208.485216,0.0,-208.485216,null,null,MONTH_ITEM_GROUP])
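[Following the tie explanation above: a sketch of how to make "the first row" deterministic by adding tie-breaking sort keys, so repeated runs return the same row. Which extra columns uniquely identify a row in testtable1 is an assumption; col2 here is a placeholder:]

    // Break ties on additional columns so ORDER BY defines a total order
    // over the rows, making LIMIT 1 deterministic across runs.
    hc.sql("select * from testtable1 order by col1, col2 limit 1").collect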
Surprised!!!!! Spark-shell showing inconsistent results
Hi, as shown below, the same query run back to back returns inconsistent results.

testtable1 is an Avro SerDe table.

hc.sql("select * from testtable1 order by col1 limit 1").collect;
res14: Array[org.apache.spark.sql.Row] = Array([1570,3364,201607,Y,APJ,PHILIPPINES,8518944,null,null,null,null,-15.992583,0.0,-15.992583,null,null,MONTH_ITEM_GROUP])

scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
res15: Array[org.apache.spark.sql.Row] = Array([1570,485888,20163,N,AMERICAS,BRAZIL,null,null,null,null,null,6019.2999,17198.0,6019.2999,null,null,QUARTER_GROUP])

scala> hc.sql("select * from testtable1 order by col1 limit 1").collect;
res16: Array[org.apache.spark.sql.Row] = Array([1570,3930,201607,Y,APJ,INDIA SUB-CONTINENT,8741220,null,null,null,null,-208.485216,0.0,-208.485216,null,null,MONTH_ITEM_GROUP])