RE: Select all columns except some

2015-07-17 Thread Saif.A.Ellafi
Have you tried to examine what clean_cols contains -- I'm suspicious of this part: mkString(", "). Try this: val clean_cols : Seq[String] = df.columns... if you get a type error you need to work on clean_cols (I

Select all columns except some

2015-07-16 Thread Saif.A.Ellafi
Hi, In a hundred-column DataFrame, I wish to either select all of the columns except some, or drop the ones I don't want. I am failing at such a simple task; I tried two ways: val clean_cols = df.columns.filterNot(col_name => col_name.startsWith("STATE_")).mkString(", ") df.select(clean_cols) But this throws

Re: Select all columns except some

2015-07-16 Thread Yana Kadiyska
Have you tried to examine what clean_cols contains -- I'm suspicious of this part: mkString(", "). Try this: val clean_cols : Seq[String] = df.columns... if you get a type error you need to work on clean_cols (I suspect yours is of type String at the moment and presents itself to Spark as a single
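
One way to read the advice above: keep the filtered names as a Seq[String] rather than joining them into one String, then hand each name to select as its own column. The following is only a sketch under that reading; the object name SelectExcept and the helpers keptColumns and dropByPrefix are made up for illustration, and the "STATE_" prefix comes from the original question.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Hypothetical helper object, not part of Spark.
object SelectExcept {
  // Column names to keep: every name that does not start with `prefix`.
  // The result stays a Seq[String]; no mkString, so each name remains
  // a separate element.
  def keptColumns(all: Seq[String], prefix: String): Seq[String] =
    all.filterNot(_.startsWith(prefix))

  // Turn each kept name into a Column and expand the sequence into
  // select's varargs parameter with `: _*`.
  def dropByPrefix(df: DataFrame, prefix: String): DataFrame =
    df.select(keptColumns(df.columns.toSeq, prefix).map(col): _*)
}
```

With this in scope, SelectExcept.dropByPrefix(df, "STATE_") returns a new DataFrame and leaves df untouched. Note that col treats a dot in a name as nested-field access, so column names containing dots would need extra care.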

Re: Select all columns except some

2015-07-16 Thread Lars Albertsson
The snippet at the end worked for me. We run Spark 1.3.x, so DataFrame.drop is not available to us. As pointed out by Yana, DataFrame operations typically return a new DataFrame, so use it like this: import com.foo.sparkstuff.DataFrameOps._ ... val df = ... val prunedDf = df.dropColumns(one_col,
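
The snippet this message refers to is not shown here. As a rough guess at what a dropColumns extension could look like where DataFrame.drop is unavailable, here is a minimal sketch built only on columns and select. The names DataFrameOpsSketch, RichDataFrame, and remainingColumns are assumptions made for illustration and are not the poster's actual code.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Hypothetical stand-in for the DataFrameOps object imported in the message.
object DataFrameOpsSketch {
  // Names left after removing every entry of `drop`, in original order.
  def remainingColumns(all: Seq[String], drop: Set[String]): Seq[String] =
    all.filterNot(drop.contains)

  // Adds dropColumns to DataFrame via an implicit class.
  implicit class RichDataFrame(df: DataFrame) {
    // Returns a new DataFrame without the named columns; `df` is unchanged.
    // Names that do not exist in `df` are silently ignored.
    def dropColumns(names: String*): DataFrame =
      df.select(remainingColumns(df.columns.toSeq, names.toSet).map(col): _*)
  }
}
```

After import DataFrameOpsSketch._, a call such as df.dropColumns("a", "b") (placeholder column names) would yield a pruned copy, which matches the point that the result has to be assigned to a new val.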