Re: Sorting on a streaming dataframe

2018-05-01 Thread Hemant Bhanawat
y this requirement is not supported ;) >>>> >>>> Best, >>>> >>>> Chayapan (A) >>>> >>>> >>>> On Apr 24, 2018, at 2:18 PM, Hemant Bhanawat <hemant9...@gmail.com> >>>> wrote: >>>> >>>

Re: Sorting on a streaming dataframe

2018-04-30 Thread Michael Armbrust
oupsWithState is probably sufficient >>>> and requires less working knowledge to make effective reuse of internals. >>>> Just group by foo and then sort accordingly and assign ids. The id counter >>>> can be stateful per group. Sometimes this problem
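
The suggestion above (group by a key, sort within the group, and keep a stateful id counter per group) can be modelled in plain Python. This is only a sketch of the idea, not Spark code: the class `PerGroupIdAssigner`, the `(key, value)` record shape, and the choice to sort by `value` are all hypothetical, and a sort here only orders the records of one micro-batch, not the whole stream.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, int]  # (group key, sort field); hypothetical record shape


class PerGroupIdAssigner:
    """Keeps one running id counter per group key across micro-batches."""

    def __init__(self) -> None:
        # State that survives between batches: the next id to hand out per key.
        self._next_id: Dict[str, int] = defaultdict(int)

    def process_batch(self, batch: Iterable[Record]) -> List[Tuple[str, int, int]]:
        # Bucket the batch by key, as a group-by would.
        groups: Dict[str, List[int]] = defaultdict(list)
        for key, value in batch:
            groups[key].append(value)

        out: List[Tuple[str, int, int]] = []
        for key in sorted(groups):
            # Sort only within the group, and only within this batch.
            for value in sorted(groups[key]):
                out.append((key, value, self._next_id[key]))
                self._next_id[key] += 1
        return out
```

Calling `process_batch` once per micro-batch on the same `PerGroupIdAssigner` instance continues each group's ids where the previous batch left off, while different groups are numbered independently.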

Re: Sorting on a streaming dataframe

2018-04-27 Thread Hemant Bhanawat
<hemant9...@gmail.com> >> wrote: >> >> Thanks Chris. There are many ways in which I can solve this problem but >> they are cumbersome. The easiest way would have been to sort the streaming >> dataframe. The reason I asked this question is because I could not find a

Re: Sorting on a streaming dataframe

2018-04-26 Thread Michael Armbrust
> message offsets may be “good enough”. >> ------ >> *From:* Hemant Bhanawat <hemant9...@gmail.com> >> *Sent:* Thursday, April 12, 2018 11:42:59 PM >> *To:* Reynold Xin >> *Cc:* dev >> *Subject:* Re: Sorting on a stream

Re: Sorting on a streaming dataframe

2018-04-24 Thread Chayapan Khannabha
would have been to sort the streaming > dataframe. The reason I asked this question is because I could not find a > reason why sorting on a streaming dataframe is disallowed. > > Hemant > > On Mon, Apr 16, 2018 at 6:09 PM, Bowden, Chris <chris.bow...@microfocus.com > <ma

Re: Sorting on a streaming dataframe

2018-04-24 Thread Arun Mahadevan
). Thanks, Arun From: Hemant Bhanawat <hemant9...@gmail.com> Date: Tuesday, April 24, 2018 at 12:18 AM To: "Bowden, Chris" <chris.bow...@microfocus.com> Cc: Reynold Xin <r...@databricks.com>, dev <dev@spark.apache.org> Subject: Re: Sorting on a streaming dataf

Re: Sorting on a streaming dataframe

2018-04-24 Thread Hemant Bhanawat
Thanks Chris. There are many ways in which I can solve this problem, but they are cumbersome. The easiest way would have been to sort the streaming dataframe. I asked this question because I could not find a reason why sorting on a streaming dataframe is disallowed. Hemant On Mon, Apr

Re: Sorting on a streaming dataframe

2018-04-13 Thread Hemant Bhanawat
Well, we want to assign snapshot ids (incrementing counters) to the incoming records. For that, we are zipping the streaming RDDs with that counter using a modified version of ZippedWithIndexRDD. We are ok if the records in the streaming dataframe get counters in random order but the counter
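
The counter assignment described above can be illustrated with a small plain-Python model. It is not the modified ZippedWithIndexRDD from the message, only a sketch of the general zip-with-index idea under stated assumptions: each batch is a list of partitions, each partition gets a starting offset from the sizes of the partitions before it, and the counter carries over from one batch to the next. The function name `zip_batch_with_index` and the `start` parameter are hypothetical.

```python
from itertools import accumulate
from typing import List, Tuple

Partition = List[str]  # hypothetical: a partition is a list of records


def zip_batch_with_index(
    partitions: List[Partition], start: int
) -> Tuple[List[List[Tuple[str, int]]], int]:
    """Number every record in one batch, starting at `start`.

    Returns the indexed partitions and the counter value for the next batch.
    """
    sizes = [len(p) for p in partitions]
    # Exclusive prefix sum: the offset of a partition is the total size of
    # all partitions before it, shifted by the carried-over counter.
    offsets = [start + s for s in accumulate([0] + sizes[:-1])]
    indexed = [
        [(record, offset + i) for i, record in enumerate(partition)]
        for partition, offset in zip(partitions, offsets)
    ]
    return indexed, start + sum(sizes)
```

Because each partition's offset depends only on the sizes of the earlier partitions, every partition can number its own records without looking at the others. The ids are unique and increase from batch to batch, but they say nothing about the order in which records arrived within a batch.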

Re: Sorting on a streaming dataframe

2018-04-13 Thread Reynold Xin
Can you describe your use case more? On Thu, Apr 12, 2018 at 11:12 PM Hemant Bhanawat wrote: > Hi Guys, > > Why is sorting on streaming dataframes not supported (unless it is complete > mode)? My downstream needs me to sort the streaming dataframe. > > Hemant >

Sorting on a streaming dataframe

2018-04-13 Thread Hemant Bhanawat
Hi Guys, Why is sorting on streaming dataframes not supported (unless it is complete mode)? My downstream needs me to sort the streaming dataframe. Hemant