One way is split -> explode. These are available as Spark SQL Column functions and can be used in a DataFrame select.
Here are quick examples from the web:
https://www.google.com/amp/s/sparkbyexamples.com/spark/spark-split-dataframe-column-into-multiple-columns/amp/


https://www.google.com/amp/s/sparkbyexamples.com/spark/explode-spark-array-and-map-dataframe-column/amp/

On Wed, 9 Feb 2022, 01:55 frakass, <capitnfrak...@free.fr> wrote:

> Hello
>
> for the RDD I can apply flatMap method:
>
>  >>> sc.parallelize(["a few words","ba na ba na"]).flatMap(lambda x:
> x.split(" ")).collect()
> ['a', 'few', 'words', 'ba', 'na', 'ba', 'na']
>
>
> But for a dataframe table how can I flatMap that as above?
>
>  >>> df.show()
> +----------------+
> |           value|
> +----------------+
> |     a few lines|
> |hello world here|
> |     ba na ba na|
> +----------------+
>
>
> Thanks
>
>
