Oh, sorry, you’re right. I looked at the doc for join()
<http://spark.apache.org/docs/1.6.2/api/python/pyspark.sql.html#pyspark.sql.DataFrame.join>
and didn’t realize you could do a cartesian join. It turns out that
df1.join(df2), called with no join condition, does the job and matches
the SQL equivalent too.
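
For anyone skimming the thread, here is a rough plain-Python stand-in for what a cartesian join returns. It is not Spark code: the helper cartesian_join and the sample rows are made up for illustration, and it assumes the two sides have no column names in common. The point is only that every left row gets paired with every right row and the column names survive, which is the part you lose when you drop down to the underlying RDDs.

```python
from itertools import product


def cartesian_join(left, right):
    """Pair every row of `left` with every row of `right`.

    Rows are plain dicts keyed by column name (a hypothetical stand-in
    for DataFrame rows). Assumes the two sides share no column names.
    """
    return [{**l, **r} for l, r in product(left, right)]


# Made-up sample data: 2 rows on the left, 3 rows on the right.
df1_rows = [{"id": 1}, {"id": 2}]
df2_rows = [{"tag": "x"}, {"tag": "y"}, {"tag": "z"}]

joined = cartesian_join(df1_rows, df2_rows)

# One output row per (left, right) pair, each carrying both columns.
for row in joined:
    print(row)
```

The result has len(df1_rows) * len(df2_rows) rows, so this kind of join grows quickly with input size.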

On Mon, Jul 25, 2016 at 6:45 PM Reynold Xin <r...@databricks.com> wrote:

> DataFrame can do cartesian joins.
>
>
> On July 25, 2016 at 3:43:19 PM, Nicholas Chammas (
> nicholas.cham...@gmail.com) wrote:
>
> It appears that RDDs can do a cartesian join, but not DataFrames. Is there
> a fundamental reason why not, or is this just waiting for someone to
> implement?
>
> I know you can get the RDDs underlying the DataFrames and do the cartesian
> join that way, but you lose the schema of course.
>
> Nick
>
>
