counts. They are basically
> pulling
> >>>>>> out
> >>>>>> selected columns from the query, but there is no roll up happening
> or
> >>>>>> anything that would possible make it suspicious that there is any
> >>>>
re
>>>>>> producing
>>>>>> the same counts, the natural suspicions is that the tables are
>>>>>> identical,
>>>>>> but I when I run the following two queries:
>>>>>>
>>>>>> scala> sqlCont
tered by date being above 2016-01-03. Since all the joins are
> >>>> > producing
> >>>> > the same counts, the natural suspicions is that the tables are
> >>>> > identical,
> >>>> > but I when I run the following two q
n_promo_lt where date
>>>> >>='2016-01-03'").count
>>>> >
>>>> > res14: Long = 34158
>>>> >
>>>> > scala> sqlContext.sql("select * from dps_pin_promo_lt where date
>>>> >>='2016-01-03'").count
>&
gt;
>>> > The above two queries filter out the data based on date used by the
>>> joins of
>>> > 2016-01-03 and you can see the row count between the two tables are
>>> > different, which is why I am suspecting something is wrong with the
>>> outer
&g
wrong counts for
>> > dps.count, the real value is res16: Long = 42694
>> >
>> >
>> > Thanks,
>> >
>> >
>> > KP
>> >
>> >
>> >
>> >
>> > On Mon, May 2, 2016 at 12:50 PM, Yong Zhang <java8...@hotmail.
gt; >> For dps with 42632 rows, and swig with 42034 rows, if dps full outer
> join
> >> with swig on 3 columns; with additional filters, get the same resultSet
> row
> >> count as dps lefter outer join with swig on 3 columns, with additional
> >> filters,
same resultSet row count as dps right outer join
>> with swig on 3 columns, with same additional filters.
>>
>> Without knowing your data, I cannot see the reason that has to be a bug in
>> the spark.
>>
>> Am I misunderstanding your bug?
>>
>> Yong
>
gt;>
>> Without knowing your data, I cannot see the reason that has to be a bug
>> in the spark.
>>
>> Am I misunderstanding your bug?
>>
>> Yong
>>
>> --
>> From: kpe...@gmail.com
>> Date: Mon, 2 May 201
ight outer join
> with swig on 3 columns, with same additional filters.
>
> Without knowing your data, I cannot see the reason that has to be a bug in
> the spark.
>
> Am I misunderstanding your bug?
>
> Yong
>
> ------
> From: kpe...@gmail.com
: Mon, 2 May 2016 12:11:18 -0700
Subject: Re: Weird results with Spark SQL Outer joins
To: gourav.sengu...@gmail.com
CC: user@spark.apache.org
Gourav,
I wish that was case, but I have done a select count on each of the two tables
individually and they return back different number of rows
>> s.ad =
>> d.ad) WHERE s.date >= '2016-01-03'AND d.date >=
>> '2016-01-03'").count()
>> RESULT:23747
>>
>>
>>
>> --
>> View this message in context:
>> http://ap
swig_pin_promo_lt s INNER JOIN
> dps_pin_promo_lt d ON (s.date = d.date AND s.account = d.account AND s.ad
> =
> d.ad) WHERE s.date >= '2016-01-03'AND d.date >= '2016-01-03'").count()
> RESULT:23747
>
>
>
> --
> View this message in context:
> http://ap
.date = d.date AND s.account = d.account AND s.ad =
d.ad) WHERE s.date >= '2016-01-03'AND d.date >= '2016-01-03'").count()
RESULT:23747
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Weird-results-with-Spark-SQL-Outer-joins-tp26861p2686
S s_acc ,
>>> d.account AS
>>> d_acc , s.ad as s_ad , d.ad as d_ad , s.spend AS s_spend ,
>>> d.spend_in_dollar AS d_spend FROM swig_pin_promo_lt s RIGHT OUTER JOIN
>>> dps_pin_promo_lt d ON (s.date = d.dat
swig_pin_promo_lt s RIGHT OUTER JOIN
>> dps_pin_promo_lt d ON (s.date = d.date AND s.account = d.account AND
>> s.ad =
>> d.ad) WHERE s.date >= '2016-01-03'AND d.date >=
>> '2016-01-03'").count()
>> RESULT
g if someone had encountered this issues before.
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Weird-results-with-Spark-SQL-Outer-joins-tp26861.html
> Sent from the Apache Spark User List mailing list archive at Nabble.
ntext:
http://apache-spark-user-list.1001560.n3.nabble.com/Weird-results-with-Spark-SQL-Outer-joins-tp26861.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
-
To unsubscribe, e-mail: user-unsubscr...@spark.ap
18 matches
Mail list logo