your own sink(s).
>>>>> That is, just grabbing the parquet sink, etc. isn’t going to work out of
>>>>> the box. Alternatively map/flatMapGroupsWithState is probably sufficient
>>>>> and requires less working knowledge to make effective reuse of internals.
>>>> Just group by foo and then sort accordingly and assign ids. The id counter
>>>> can be stateful per group. Sometimes this problem may not need to be solved
>>>> at all. For example, if you are using kafka, a proper partitioning scheme
>>
>>> internals. Just
>>> group by foo and then sort accordingly and assign ids. The id counter can
>>> be stateful per group. Sometimes this problem may not need to be solved at
>>> all. For example, if you are using kafka, a proper partitioning scheme and
>>> message offsets may be “good enough”.
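
The "group by foo, sort, and keep a stateful id counter per group" suggestion can be sketched roughly as below. This is a plain-Python simulation of micro-batches, not Spark code: the class name PerGroupIdAssigner, the (key, value) record shape, and the sample batches are all hypothetical. In Spark itself the counter would presumably live in the per-key state that map/flatMapGroupsWithState provides.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape: (group key "foo", sortable value such as an event time).
Record = Tuple[str, int]


class PerGroupIdAssigner:
    """Keeps one running id counter per group across micro-batches."""

    def __init__(self) -> None:
        # Next id to hand out for each group key; this dict stands in for
        # the per-key state a streaming engine would persist between batches.
        self._next_id: Dict[str, int] = defaultdict(int)

    def assign(self, batch: Iterable[Record]) -> List[Tuple[str, int, int]]:
        # Group the records of this micro-batch by key.
        grouped: Dict[str, List[int]] = defaultdict(list)
        for key, value in batch:
            grouped[key].append(value)

        out: List[Tuple[str, int, int]] = []
        for key in sorted(grouped):
            # Sort only within the group and only within this batch, then
            # continue numbering from where the previous batch stopped.
            for value in sorted(grouped[key]):
                out.append((key, value, self._next_id[key]))
                self._next_id[key] += 1
        return out


assigner = PerGroupIdAssigner()
first = assigner.assign([("a", 30), ("b", 5), ("a", 10)])
second = assigner.assign([("a", 20), ("b", 1)])
```

Note that ids increase per group across batches, but ordering is only enforced inside a single batch: a late record (such as ("a", 20) above) still gets a higher id than earlier records with larger values.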
>>> -
>> --------------
>> *From:* Hemant Bhanawat
>> *Sent:* Thursday, April 12, 2018 11:42:59 PM
>> *To:* Reynold Xin
>> *Cc:* dev
>> *Subject:* Re: Sorting on a streaming dataframe
>>
>> Well, we want to assign snapshot ids (incrementing counters
partitioning scheme and message
> offsets may be “good enough”.
> From: Hemant Bhanawat <hemant9...@gmail.com>
> Sent: Thursday, April 12, 2018 11:42:59 PM
> To: Reynold Xin
> Cc: dev
> Subject: Re: Sorting on a streaming dataframe
>
> Well, we want to assign snap
).
Thanks,
Arun
From: Hemant Bhanawat
Date: Tuesday, April 24, 2018 at 12:18 AM
To: "Bowden, Chris"
Cc: Reynold Xin, dev
Subject: Re: Sorting on a streaming dataframe
Thanks Chris. There are many ways in which I can solve this problem but they
are cumbersome. The easiest way would
> all. For example, if you are using kafka, a proper partitioning scheme and
> message offsets may be “good enough”.
> --
> *From:* Hemant Bhanawat
> *Sent:* Thursday, April 12, 2018 11:42:59 PM
> *To:* Reynold Xin
> *Cc:* dev
> *Subject:* Re: Sorti
Well, we want to assign snapshot ids (incrementing counters) to the
incoming records. For that, we are zipping the streaming rdds with that
counter using a modified version of ZippedWithIndexRDD. We are ok if the
records in the streaming dataframe get counters in random order but the
counter shoul
Can you describe your use case more?
On Thu, Apr 12, 2018 at 11:12 PM Hemant Bhanawat
wrote:
> Hi Guys,
>
> Why is sorting on streaming dataframes not supported (unless it is complete
> mode)? My downstream needs me to sort the streaming dataframe.
>
> Hemant
>
Hi Guys,
Why is sorting on streaming dataframes not supported (unless it is complete
mode)? My downstream needs me to sort the streaming dataframe.
Hemant