Assuming you know the number of elements in each array, this should work (here "_1" is the name of the array column; in your example it would be "numbers"):

    df.withColumn('total', df["_1"].getItem(0) + df["_1"].getItem(1) + df["_1"].getItem(2))
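
If the arrays are longer than three elements, writing out every getItem call gets tedious. Here is a rough sketch of one way to generate the same sum for a known, fixed length. The helper names sum_elements_expr and add_total are made up for illustration, and the column name "numbers" is taken from the example in the question. It assumes a plain column name (no spaces or dots) and that every array has at least `length` elements.

```python
def sum_elements_expr(column, length):
    """Build a Spark SQL expression string that adds the first `length`
    elements of an array column, e.g. "numbers[0] + numbers[1]".

    Hypothetical helper: assumes `column` is a simple identifier that
    needs no backtick quoting.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return " + ".join("{0}[{1}]".format(column, i) for i in range(length))


def add_total(df, column="numbers", length=3):
    """Return `df` with an extra 'total' column holding the per-row sum.

    Hypothetical wrapper: pyspark is imported lazily so the string helper
    above can be used and tested without a Spark installation.
    """
    from pyspark.sql import functions as F

    return df.withColumn("total", F.expr(sum_elements_expr(column, length)))
```

With the dataframe from the question, add_total(df) would be equivalent to the one-liner above with "numbers" in place of "_1". If the array length varies from row to row this approach does not apply as written; a UDF that calls Python's sum on the array is one alternative.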

Mike

On Mon, Aug 15, 2016 at 12:02 PM, Javier Rey <jre...@gmail.com> wrote:
> Hi everyone,
>
> I have a dataframe with one column; this column is an array of numbers.
> How can I sum each array by row and obtain a new column with the sum, in pyspark?
>
> Example:
>
> +------------+
> |     numbers|
> +------------+
> |[10, 20, 30]|
> |[40, 50, 60]|
> |[70, 80, 90]|
> +------------+
>
> The idea is to obtain the same df with a new column with totals:
>
> +------------+------
> |     numbers|     |
> +------------+------
> |[10, 20, 30]|60   |
> |[40, 50, 60]|150  |
> |[70, 80, 90]|240  |
> +------------+------
>
> Regards!
>
> Samir