Yes, I didn't work out how to solve that - sorry
On 3 February 2016 at 22:37, Devesh Raj Singh <raj.deves...@gmail.com> wrote:

> Hi,
>
> but "withColumn" will only add one column; if I want to add columns to the
> same dataframe in a loop, it keeps overwriting the added column and in the
> end only the last added column (in the loop) remains, as in my code above.
>
> On Wed, Feb 3, 2016 at 5:05 PM, Franc Carter <franc.car...@gmail.com> wrote:
>
>> I had problems doing this as well - I ended up using 'withColumn'. It's
>> not particularly graceful, but it worked (1.5.2 on AWS EMR).
>>
>> cheers
>>
>> On 3 February 2016 at 22:06, Devesh Raj Singh <raj.deves...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am trying to create dummy variables in SparkR by creating new columns
>>> for categorical variables, but the columns are not being appended.
>>>
>>> df <- createDataFrame(sqlContext, iris)
>>> class(dtypes(df))
>>>
>>> cat.column <- vector(mode = "character", length = nrow(df))
>>> cat.column <- collect(select(df, df$Species))
>>> lev <- length(levels(as.factor(unlist(cat.column))))
>>> varb.names <- vector(mode = "character", length = lev)
>>> for (i in 1:lev) {
>>>   varb.names[i] <- paste0(colnames(cat.column), i)
>>> }
>>>
>>> for (j in 1:lev) {
>>>   dummy.df.new <- withColumn(df, paste0(colnames(cat.column), j),
>>>     ifelse(df$Species == levels(as.factor(unlist(cat.column)))[j], 1, 0))
>>> }
>>>
>>> I am getting the below output for
>>>
>>> head(dummy.df.new)
>>>
>>> output:
>>>
>>>   Sepal_Length Sepal_Width Petal_Length Petal_Width Species Species1
>>> 1          5.1         3.5          1.4         0.2  setosa        1
>>> 2          4.9         3.0          1.4         0.2  setosa        1
>>> 3          4.7         3.2          1.3         0.2  setosa        1
>>> 4          4.6         3.1          1.5         0.2  setosa        1
>>> 5          5.0         3.6          1.4         0.2  setosa        1
>>> 6          5.4         3.9          1.7         0.4  setosa        1
>>>
>>> Problem: the Species2 and Species3 columns are not getting added to the
>>> dataframe.
>>>
>>> --
>>> Warm regards,
>>> Devesh.
>>
>> --
>> Franc
>
> --
> Warm regards,
> Devesh.

--
Franc
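
[Note: the thread itself does not reach a fix. One likely cause, though, is that the loop above passes the original df to withColumn on every iteration, so each returned DataFrame carries only that iteration's dummy column. A rough, untested sketch of the chained pattern, assuming SparkR 1.5+ (where withColumn returns a new DataFrame and ifelse works on Column objects); lev.names is just the levels vector from the code above pulled out of the loop:

# Start from df and feed each withColumn result back into the next call,
# so every iteration adds its dummy column to the already-extended DataFrame.
dummy.df.new <- df
lev.names <- levels(as.factor(unlist(cat.column)))
for (j in 1:lev) {
  dummy.df.new <- withColumn(dummy.df.new,
                             paste0(colnames(cat.column), j),
                             ifelse(dummy.df.new$Species == lev.names[j], 1, 0))
}

With the result reassigned each time, head(dummy.df.new) should show Species1, Species2 and Species3 together rather than only the last column added.]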