Assuming you know the number of elements in the list, this should work
(here "_1" is the name of the array column; in your example it would be
"numbers"):

df.withColumn('total', df["_1"].getItem(0) + df["_1"].getItem(1) +
df["_1"].getItem(2))
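
If the arrays can vary in length, one alternative is a Python UDF that sums
each row's array. This is only a minimal sketch: the helper names row_total
and add_total_column are made up for illustration, the column names
"numbers" and "total" are taken from the example, and it assumes the array
elements are integers (hence LongType).

```python
def row_total(numbers):
    """Sum one row's array; return None when the array itself is null."""
    if numbers is None:
        return None
    return sum(numbers)


def add_total_column(df, array_col="numbers", total_col="total"):
    """Return df with an extra column holding the per-row sum of array_col."""
    # Imported here so row_total stays usable without a Spark installation.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import LongType

    # Assumption: elements are integers; use DoubleType for floats instead.
    total_udf = udf(row_total, LongType())
    return df.withColumn(total_col, total_udf(df[array_col]))
```

Calling add_total_column(df).show() should then print the original column
next to the totals. A Python UDF is usually slower than built-in column
expressions, so the getItem version is preferable when the length is fixed.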

Mike

On Mon, Aug 15, 2016 at 12:02 PM, Javier Rey <jre...@gmail.com> wrote:

> Hi everyone,
>
> I have a dataframe with one column, which is an array of numbers.
> How can I sum each array by row and obtain a new column with the sum, in PySpark?
>
> Example:
>
> +------------+
> |     numbers|
> +------------+
> |[10, 20, 30]|
> |[40, 50, 60]|
> |[70, 80, 90]|
> +------------+
>
> The idea is to obtain the same df with a new column of totals:
>
> +------------+-----+
> |     numbers|     |
> +------------+-----+
> |[10, 20, 30]|60   |
> |[40, 50, 60]|150  |
> |[70, 80, 90]|240  |
> +------------+-----+
>
> Regards!
>
> Samir
