Re: How to avoid duplicate column names after join with multiple conditions

2018-07-12 Thread Prem Sure
Yes Nirav, we could probably ask the devs for a config parameter that takes care of this automatically (internally); additional care is required from users when specifying column names and joining. Thanks, Prem On Thu, Jul 12, 2018 at 10:53 PM Nirav Patel wrote: > Hi Prem, dropping column,

Re: How to avoid duplicate column names after join with multiple conditions

2018-07-12 Thread Nirav Patel
Hi Prem, dropping or renaming the column works for me as a workaround. I just thought it would be nice to have a generic API that handles this for me, or some intelligence so that, since both columns are the same, it doesn't complain in a subsequent select clause that it doesn't know whether I mean a#12 or a#81.
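One way to picture the generic handling asked for here, as a plain-Python sketch rather than anything Spark provides: given the equality predicates of a join as (left column, right column) name pairs, list the right-side columns that merely repeat a same-named left column. The helper name `redundant_right_columns` and the pair representation are assumptions made for illustration.

```python
def redundant_right_columns(equi_pairs):
    """Return right-side column names that are equated with a same-named left column."""
    redundant = []
    for left_name, right_name in equi_pairs:
        # Same name on both sides of an equality: the right copy repeats the left value.
        if left_name == right_name and right_name not in redundant:
            redundant.append(right_name)
    return redundant


# The join from this thread: df1(a) === df2(a) and df1(b) === df2(c)
print(redundant_right_columns([("a", "a"), ("b", "c")]))
```

For an inner join, each returned name could then be removed from the joined result with a `.drop(...)` on the second data frame's column, which is the workaround discussed in this thread.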

Re: How to avoid duplicate column names after join with multiple conditions

2018-07-12 Thread Prem Sure
Hi Nirav, did you try .drop(df1(a)) after the join? Thanks, Prem On Thu, Jul 12, 2018 at 9:50 PM Nirav Patel wrote: > Hi Vamshi, > > That api is very restricted and not generic enough. It imposes that all > conditions of joins has to have same column on both side and it also has to > be equijoin. It

Re: How to avoid duplicate column names after join with multiple conditions

2018-07-12 Thread Nirav Patel
Hi Vamshi, That API is very restricted and not generic enough. It requires every join condition to use the same column name on both sides, and the join also has to be an equijoin. It doesn't serve my use case, where some join predicates don't have the same column names. Thanks On Sun, Jul 8, 2018 at 7:39 PM,

Re: How to avoid duplicate column names after join with multiple conditions

2018-07-08 Thread Vamshi Talla
Nirav, Spark does not create a duplicate column when you pass the join columns as an array of column name(s), as in the example below, but that requires the column name to be the same in both data frames. Example: df1.join(df2, ['a']) Thanks. Vamshi Talla On Jul 6, 2018, at 4:47 PM, Gokula Krishnan D

Re: How to avoid duplicate column names after join with multiple conditions

2018-07-06 Thread Gokula Krishnan D
Nirav, the withColumnRenamed() API might help, but it does not differentiate between columns and renames all occurrences of the given column. Alternatively, use the select() API and rename as you want. Thanks & Regards, Gokula Krishnan (Gokul) On Mon, Jul 2, 2018 at 5:52 PM, Nirav Patel wrote: > Expr is `df1(a)
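A minimal sketch of the rename idea in plain Python, assuming the renaming is done on the second data frame before the join and that both column lists are available (for example from `df.columns`). The helper `right_side_renames` and the `_right` suffix are hypothetical and not part of Spark.

```python
def right_side_renames(left_cols, right_cols, suffix="_right"):
    """Map each right-side column that also exists on the left to a new, unused name."""
    taken = set(left_cols) | set(right_cols)
    renames = {}
    for name in right_cols:
        if name in left_cols:
            candidate = name + suffix
            # Keep appending the suffix until the new name clashes with nothing.
            while candidate in taken:
                candidate += suffix
            renames[name] = candidate
            taken.add(candidate)
    return renames


# Columns from this thread's example: df1 has a and b, df2 has a and c.
print(right_side_renames(["a", "b"], ["a", "c"]))
```

Each (old, new) pair could then be applied to the second data frame with `withColumnRenamed(old, new)` before joining, so the joined result never contains two columns named `a`.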

How to avoid duplicate column names after join with multiple conditions

2018-07-02 Thread Nirav Patel
The join expr is `df1(a) === df2(a) and df1(b) === df2(c)`. How do I avoid the duplicate column 'a' in the result? I don't see any API that combines both. Rename manually?