Hi Jacek,

The difference is mentioned in the Spark docs themselves:
Note that if you perform a self-join using this function without aliasing the input
[[DataFrame]]s, you will NOT be able to reference any columns after the join, since
there is no way to disambiguate which side of the join you would like to reference.

On 26 March 2016 at 04:19, Jacek Laskowski <ja...@japila.pl> wrote:
> Hi,
>
> I've read the note about both columns included when DataFrames are
> joined, but don't think it differentiated between versions of join. Is
> this a feature or a bug that the following session shows one _1 column
> with Seq("_1") and two columns for ===?
>
> {code}
> scala> left.join(right, Seq("_1")).show
> +---+---+---+
> | _1| _2| _2|
> +---+---+---+
> |  1|  a|  a|
> |  2|  b|  b|
> +---+---+---+
>
> scala> left.join(right, left("_1") === right("_1")).show
> +---+---+---+---+
> | _1| _2| _1| _2|
> +---+---+---+---+
> |  1|  a|  1|  a|
> |  2|  b|  2|  b|
> +---+---+---+---+
> {code}
>
> Regards,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
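
For what it's worth, here is a minimal sketch of both join forms with aliased inputs, so that either side can still be referenced after the join. It assumes Spark 2.x or later (SparkSession) running in local mode; the object name `JoinColumnsSketch`, the aliases `l` and `r`, and the sample rows are made up to mirror the session quoted above. As far as I recall, the same scaladoc also says that with the Seq-of-column-names form the join columns appear only once in the output, similar to SQL's JOIN USING, which would explain the single _1 column.

```scala
import org.apache.spark.sql.SparkSession

object JoinColumnsSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: Spark 2.x+ in local mode. In a 1.6 spark-shell you would
    // use the provided sqlContext and `import sqlContext.implicits._` instead.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("join-columns-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-ins for `left` and `right` from the quoted session.
    val left  = Seq((1, "a"), (2, "b")).toDF("_1", "_2")
    val right = Seq((1, "a"), (2, "b")).toDF("_1", "_2")

    // Join on a Seq of column names: the join column _1 is kept once.
    val usingJoin = left.join(right, Seq("_1"))
    println(usingJoin.columns.mkString(", "))

    // Join on an expression: every column from both sides is kept.
    // Aliasing the inputs makes each side addressable afterwards.
    val exprJoin = left.as("l").join(right.as("r"), $"l._1" === $"r._1")
    println(exprJoin.columns.mkString(", "))

    // Pick columns from a specific side and rename the duplicates.
    exprJoin
      .select($"l._1", $"l._2".as("left_2"), $"r._2".as("right_2"))
      .show()

    spark.stop()
  }
}
```

The two `println` calls should list three and four columns respectively, matching the tables in the quoted session. The final `select` only works because of the `l` and `r` aliases, which is the point the note in the docs is making.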