Re: Weird results with Spark SQL Outer joins

2016-05-03 Thread Kevin Peng
counts. They are basically pulling out selected columns from the query, but there is no roll-up happening or anything that would possibly make it suspicious that there is any

Re: Weird results with Spark SQL Outer joins

2016-05-03 Thread Michael Segel
re producing the same counts, the natural suspicion is that the tables are identical, but when I run the following two queries: scala> sqlCont

Re: Weird results with Spark SQL Outer joins

2016-05-03 Thread Kevin Peng
tered by date being above 2016-01-03. Since all the joins are producing the same counts, the natural suspicion is that the tables are identical, but when I run the following two q

Re: Weird results with Spark SQL Outer joins

2016-05-03 Thread Davies Liu
n_promo_lt where date >='2016-01-03'").count res14: Long = 34158 scala> sqlContext.sql("select * from dps_pin_promo_lt where date >='2016-01-03'").count

Re: Weird results with Spark SQL Outer joins

2016-05-03 Thread Cesar Flores
The above two queries filter the data by the date used in the joins, 2016-01-03, and you can see that the row counts of the two tables are different, which is why I suspect something is wrong with the outer

Re: Weird results with Spark SQL Outer joins

2016-05-03 Thread Gourav Sengupta
wrong counts for dps.count, the real value is res16: Long = 42694 Thanks, KP On Mon, May 2, 2016 at 12:50 PM, Yong Zhang <java8...@hotmail.

Re: Weird results with Spark SQL Outer joins

2016-05-03 Thread Kevin Peng
For dps with 42632 rows and swig with 42034 rows, if dps full outer join with swig on 3 columns, with additional filters, gets the same resultSet row count as dps left outer join with swig on 3 columns, with additional filters,

Re: Weird results with Spark SQL Outer joins

2016-05-03 Thread Davies Liu
same resultSet row count as dps right outer join with swig on 3 columns, with the same additional filters. Without knowing your data, I cannot see why that has to be a bug in Spark. Am I misunderstanding your bug? Yong

Re: Weird results with Spark SQL Outer joins

2016-05-02 Thread Gourav Sengupta
Without knowing your data, I cannot see why that has to be a bug in Spark. Am I misunderstanding your bug? Yong -- From: kpe...@gmail.com Date: Mon, 2 May 201

Re: Weird results with Spark SQL Outer joins

2016-05-02 Thread Kevin Peng
ight outer join with swig on 3 columns, with the same additional filters. Without knowing your data, I cannot see why that has to be a bug in Spark. Am I misunderstanding your bug? Yong -- From: kpe...@gmail.com

RE: Weird results with Spark SQL Outer joins

2016-05-02 Thread Yong Zhang
: Mon, 2 May 2016 12:11:18 -0700 Subject: Re: Weird results with Spark SQL Outer joins To: gourav.sengu...@gmail.com CC: user@spark.apache.org Gourav, I wish that were the case, but I have done a select count on each of the two tables individually and they return different numbers of rows

Re: Weird results with Spark SQL Outer joins

2016-05-02 Thread Kevin Peng
s.ad = d.ad) WHERE s.date >= '2016-01-03' AND d.date >= '2016-01-03'").count() RESULT: 23747 -- View this message in context: http://ap

Re: Weird results with Spark SQL Outer joins

2016-05-02 Thread Gourav Sengupta
swig_pin_promo_lt s INNER JOIN dps_pin_promo_lt d ON (s.date = d.date AND s.account = d.account AND s.ad = d.ad) WHERE s.date >= '2016-01-03' AND d.date >= '2016-01-03'").count() RESULT: 23747 -- View this message in context: http://ap

Re: Weird results with Spark SQL Outer joins

2016-05-02 Thread kpeng1
.date = d.date AND s.account = d.account AND s.ad = d.ad) WHERE s.date >= '2016-01-03' AND d.date >= '2016-01-03'").count() RESULT: 23747 -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Weird-results-with-Spark-SQL-Outer-joins-tp26861p2686
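
A note on the query pattern quoted above, offered as one possible explanation rather than a confirmed diagnosis: the WHERE clause filters on columns from both sides of the join (s.date and d.date). In standard SQL, WHERE is evaluated after the join, so any row that an outer join padded with NULLs fails the comparison on the NULL side and is dropped, which can make an outer join return the same count as the inner join. The sketch below reproduces that effect with Python's built-in sqlite3 module instead of Spark; the tables s and d and their rows are made-up stand-ins for swig_pin_promo_lt and dps_pin_promo_lt, not data from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical stand-ins for swig_pin_promo_lt (s) and dps_pin_promo_lt (d).
cur.execute("CREATE TABLE s (date TEXT, account TEXT, ad TEXT)")
cur.execute("CREATE TABLE d (date TEXT, account TEXT, ad TEXT)")
cur.executemany("INSERT INTO s VALUES (?, ?, ?)", [
    ("2016-01-04", "a1", "x"),  # has a match in d
    ("2016-01-05", "a2", "y"),  # no match in d
    ("2015-12-31", "a3", "z"),  # before the cutoff date
])
cur.executemany("INSERT INTO d VALUES (?, ?, ?)", [
    ("2016-01-04", "a1", "x"),  # has a match in s
    ("2016-01-06", "a4", "w"),  # no match in s
])

ON = "ON (s.date = d.date AND s.account = d.account AND s.ad = d.ad)"
WHERE = "WHERE s.date >= '2016-01-03' AND d.date >= '2016-01-03'"


def count(sql):
    """Run a COUNT(*) query and return the single number it yields."""
    return cur.execute(sql).fetchone()[0]


# Inner join with the filter in WHERE.
inner = count(f"SELECT COUNT(*) FROM s INNER JOIN d {ON} {WHERE}")

# Left outer join with the same WHERE: unmatched s rows carry d.date = NULL,
# so the d.date comparison rejects them and the result collapses to the inner join.
left_where = count(f"SELECT COUNT(*) FROM s LEFT OUTER JOIN d {ON} {WHERE}")

# Left outer join where each side is filtered BEFORE joining:
# unmatched s rows on or after the cutoff survive.
left_prefiltered = count(
    "SELECT COUNT(*) FROM "
    "(SELECT * FROM s WHERE date >= '2016-01-03') AS s "
    "LEFT OUTER JOIN "
    "(SELECT * FROM d WHERE date >= '2016-01-03') AS d "
    f"{ON}"
)

print(inner, left_where, left_prefiltered)  # prints: 1 1 2
```

If this is what is happening, moving each date filter into a subquery on its own table (or into the ON clause for the NULL-padded side) would be the usual way to keep the unmatched rows. Whether that applies to the tables in this thread cannot be determined from the snippets shown here.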

Re: Weird results with Spark SQL Outer joins

2016-05-02 Thread Gourav Sengupta
S s_acc, d.account AS d_acc, s.ad AS s_ad, d.ad AS d_ad, s.spend AS s_spend, d.spend_in_dollar AS d_spend FROM swig_pin_promo_lt s RIGHT OUTER JOIN dps_pin_promo_lt d ON (s.date = d.dat

Re: Weird results with Spark SQL Outer joins

2016-05-02 Thread Kevin Peng
swig_pin_promo_lt s RIGHT OUTER JOIN dps_pin_promo_lt d ON (s.date = d.date AND s.account = d.account AND s.ad = d.ad) WHERE s.date >= '2016-01-03' AND d.date >= '2016-01-03'").count() RESULT

Re: Weird results with Spark SQL Outer joins

2016-05-02 Thread Gourav Sengupta
g if someone had encountered this issue before. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Weird-results-with-Spark-SQL-Outer-joins-tp26861.html Sent from the Apache Spark User List mailing list archive at Nabble.

Weird results with Spark SQL Outer joins

2016-05-02 Thread kpeng1