It is okay to collect the iterator; that will not break Spark. However,
collecting it materializes all of the group's new values in executor memory,
so you may cause OOMs if a group has a lot of new data.
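
To make the trade-off concrete, here is a minimal sketch in plain Scala (no Spark dependency) that contrasts collecting the iterator with folding over it in a single pass. The `Event` and `Totals` types and both helper functions are hypothetical names made up for illustration; in a real job this logic would sit inside the function passed to flatMapGroupsWithState, with the old state read from and written back to the GroupState.

```scala
// Hypothetical event and state types, for illustration only.
final case class Event(user: String, amount: Long)
final case class Totals(count: Long, sum: Long)

object IteratorSketch {
  // Option 1: materialize the group's new values. Simple, and allows several
  // passes over the data, but holds every value of the group in memory at once.
  def updateByCollecting(old: Totals, values: Iterator[Event]): Totals = {
    val all = values.toList
    Totals(old.count + all.size, old.sum + all.map(_.amount).sum)
  }

  // Option 2: fold over the iterator in a single pass, keeping only the
  // running totals in memory.
  def updateByFolding(old: Totals, values: Iterator[Event]): Totals =
    values.foldLeft(old) { (acc, e) => Totals(acc.count + 1, acc.sum + e.amount) }

  def main(args: Array[String]): Unit = {
    // A fresh iterator per call, since an Iterator can only be consumed once.
    def events = Iterator(Event("a", 1), Event("a", 2), Event("a", 3))
    val start = Totals(0, 0)
    println(updateByCollecting(start, events)) // prints Totals(3,6)
    println(updateByFolding(start, events))    // prints Totals(3,6)
  }
}
```

Both versions give the same result; the fold is probably the safer choice when a single pass is enough, while collecting is fine if groups are known to stay small.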
On Wed, Oct 31, 2018 at 3:44 AM Antonio Murgia -
antonio.murg...@studio.unibo.it wrote:
Hi all,
I'm currently developing a Spark Structured Streaming job and I'm performing
flatMapGroupsWithState.
I'm concerned about the laziness of the Iterator[V] that is passed to my custom
function (func: (K, Iterator[V], GroupState[S]) => Iterator[U]).
Is it ok to collect that iterator (with