Unsubscribe

2021-07-06 Thread kushagra deep
Unsubscribe

Unsubscribe

2021-07-01 Thread kushagra deep

Re: Max of multiple columns of a row in spark

2021-06-06 Thread kushagra deep
I think we can do it using the greatest function. Closing this ticket! On Mon, Jun 7, 2021 at 2:43 AM kushagra deep wrote: > Hi Guys, > I have a problem where I have a df as below: > Marks1 | Marks2 | Marks3 > 10 | 30 | 40
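
For reference, a minimal sketch of the greatest() approach mentioned above, assuming a DataFrame with the numeric columns Marks1, Marks2 and Marks3 from the thread (the variable names here are illustrative only):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.greatest

    val spark = SparkSession.builder().appName("greatest-example").getOrCreate()
    import spark.implicits._

    val df = Seq((10, 30, 40)).toDF("Marks1", "Marks2", "Marks3")

    // greatest() returns the row-wise maximum of the listed columns,
    // ignoring nulls unless every value in the row is null.
    val withMax = df.withColumn("Max", greatest($"Marks1", $"Marks2", $"Marks3"))
    withMax.show()
    // For the row (10, 30, 40) the Max column is 40, matching the expected
    // output shown in the original question.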

Max of multiple columns of a row in spark

2021-06-06 Thread kushagra deep
be: Marks1 | Marks2 | Marks3 | Max | 10 | 30 | 40 | 40. Thanks in advance. Reg, Kushagra Deep

Re: Merge two dataframes

2021-05-18 Thread kushagra deep
Thanks a lot Mich, this works, though I have to test for scalability. I have one question though: if we don't specify any column in partitionBy, will it shuffle all the records into one executor? Because this is what seems to be happening. Thanks once again! Regards, Kushagra Deep On Tue, May 18
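
A minimal sketch of the kind of positional merge being discussed, under the assumption that the suggested solution assigned row numbers over a Window; the column names amount_6m / amount_9m are taken from the thread, everything else is illustrative. A Window with no partitionBy columns moves every row into a single partition, which matches the single-executor shuffling the question describes:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions.row_number

    val spark = SparkSession.builder().appName("positional-merge").getOrCreate()
    import spark.implicits._

    val df6m = Seq(100, 200, 300, 400, 500).toDF("amount_6m")
    val df9m = Seq(500, 600, 700, 800, 900).toDF("amount_9m")

    // With no partitionBy, Spark logs a "No Partition Defined for Window operation"
    // warning and moves all rows to a single partition before computing row_number().
    val left  = df6m.withColumn("rn", row_number().over(Window.orderBy("amount_6m")))
    val right = df9m.withColumn("rn", row_number().over(Window.orderBy("amount_9m")))

    // Join the two frames on the generated row number to pair them positionally.
    left.join(right, "rn").drop("rn").show()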

Re: Merge two dataframes

2021-05-18 Thread kushagra deep
one column and you want to UNION them > in a certain way but the correlation is not known. In other words this > UNION is as is? > amount_6m | amount_9m > 100 | 500 > 200 | 600 > HTH > On Wed, 12 May 2021 at 13:51, ku

Re: Merge two dataframes

2021-05-12 Thread kushagra deep
into logical Spark partitions with the same cardinality for each partition? Reg, Kushagra Deep On Wed, May 12, 2021, 21:00 Raghavendra Ganesh wrote: > You can add an extra id column and perform an inner join. > val df1_with_id = df1.withColumn("id", monotonically_increasing_id())
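
A self-contained sketch of the id-and-join suggestion quoted above, with stand-in DataFrames for the two single-column frames from the original question; the caveat in the comments is what the partition-cardinality question above is getting at:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.monotonically_increasing_id

    val spark = SparkSession.builder().appName("id-join").getOrCreate()
    import spark.implicits._

    // Stand-ins for the two single-column DataFrames from the original question.
    val df1 = Seq(100, 200, 300, 400, 500).toDF("amount_6m")
    val df2 = Seq(500, 600, 700, 800, 900).toDF("amount_9m")

    // Tag each DataFrame with a synthetic id, then inner join on it.
    val df1_with_id = df1.withColumn("id", monotonically_increasing_id())
    val df2_with_id = df2.withColumn("id", monotonically_increasing_id())

    // Caveat: monotonically_increasing_id() is unique and increasing but not
    // consecutive across partitions, so the two id sequences only line up
    // row-for-row when both DataFrames have identical partitioning.
    df1_with_id.join(df2_with_id, "id").drop("id").show()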

Merge two dataframes

2021-05-12 Thread kushagra deep
500, 200 600, 300 700, 400 800, 500 900. Thanks in advance. Reg, Kushagra Deep

Spark Views Functioning

2021-03-26 Thread Kushagra Deep
Hi all, I just wanted to know: when we call 'createOrReplaceTempView' on a Spark dataset, where does the view reside? Does all the data come to the driver and the view get created there? Or do individual executors hold part of the view (based on the data each executor has), so that
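
As a point of reference for the question above: createOrReplaceTempView only registers the Dataset's logical plan under a name in the session catalog; it does not collect data to the driver or materialise anything on the executors. A short sketch (names illustrative):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("temp-view").getOrCreate()
    import spark.implicits._

    val ds = Seq((1, "a"), (2, "b")).toDF("id", "value")

    // Registers the logical plan under the name "my_view"; no data moves anywhere yet.
    ds.createOrReplaceTempView("my_view")

    // Only when a query over the view runs is the plan evaluated, with the work
    // distributed across the executors as for any other Dataset.
    spark.sql("SELECT count(*) FROM my_view").show()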

Re: Spark as computing engine vs spark cluster

2020-10-12 Thread Kushagra Deep
Kushagra Deep From: Mich Talebzadeh Date: Monday, 12 October 2020 at 11:23 PM To: Santosh74 Cc: "user @spark" Subject: Re: Spark as computing engine vs spark cluster Hi Santosh, Generally speaking, there are two ways of making a process faster: 1. Do more intelligent work by creati

Cogrouping in Streaming Datasets/DataFrames is not supported?

2019-08-23 Thread Kushagra Deep
Hi, I have a use case where I have to cogroup two streams in Structured Streaming. However, when I do so I get an exception that “Cogrouping in streaming is not supported in DataFrame/Dataset”. Please clarify. Regards, Kushagra Deep
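
For context, a sketch of the kind of query that hits this check, assuming two rate-source streams keyed by their value column (names and schemas here are illustrative). Structured Streaming's unsupported-operation analysis rejects cogroup on streaming Datasets when the query is started:

    import org.apache.spark.sql.{Row, SparkSession}

    val spark = SparkSession.builder().appName("streaming-cogroup").getOrCreate()
    import spark.implicits._

    // Two streaming Datasets; the rate source produces (timestamp, value) rows.
    val left  = spark.readStream.format("rate").load()
    val right = spark.readStream.format("rate").load()

    // Cogroup both streams on the value column.
    val cogrouped = left.groupByKey((r: Row) => r.getAs[Long]("value"))
      .cogroup(right.groupByKey((r: Row) => r.getAs[Long]("value"))) {
        (key, l, r) => Iterator(key)
      }

    // Starting the query triggers the unsupported-operations check, which fails
    // with an AnalysisException like the one quoted in the question above.
    cogrouped.writeStream.format("console").start()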