user@spark.apache.org
Subject: Re: Select all columns except some
Hi,
In a DataFrame with a hundred columns, I wish to either select all of them
except some, or drop the ones I don't want.
I am failing at such a simple task; I tried two ways:
val clean_cols = df.columns.filterNot(col_name =>
col_name.startsWith("STATE_")).mkString(", ")
df.select(clean_cols)
But this throws
Have you tried examining what clean_cols contains? I'm suspicious of this
part: mkString(", ").
Try this:
val clean_cols : Seq[String] = df.columns...
If you get a type error, you need to work on clean_cols (I suspect yours is
of type String at the moment and presents itself to Spark as a single
column name).
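
To make the type issue concrete, here is a minimal sketch in plain Scala (no Spark needed). The column names and the object name CleanColsDemo are made up for illustration; the array stands in for df.columns, which is an Array[String].

```scala
object CleanColsDemo {
  // Hypothetical column names standing in for df.columns (an Array[String]).
  val columns: Array[String] = Array("id", "STATE_a", "name", "STATE_b")

  // mkString collapses the filtered names into ONE String ("id, name"),
  // which select would treat as the name of a single column.
  val joined: String =
    columns.filterNot(_.startsWith("STATE_")).mkString(", ")

  // Without mkString, the result keeps one entry per remaining column.
  val cleanCols: Seq[String] =
    columns.filterNot(_.startsWith("STATE_")).toSeq

  def main(args: Array[String]): Unit = {
    println(joined)         // a single String
    println(cleanCols.size) // number of remaining columns
  }
}
```

With Spark on the classpath, a Seq[String] like cleanCols could then be turned into columns and passed to select, for example df.select(cleanCols.map(org.apache.spark.sql.functions.col): _*), since select accepts Column varargs.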
The snippet at the end worked for me. We run Spark 1.3.x, so
DataFrame.drop is not available to us.
As pointed out by Yana, DataFrame operations typically return a new
DataFrame, so use it like this:
import com.foo.sparkstuff.DataFrameOps._
...
val df = ...
val prunedDf = df.dropColumns(one_col,
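
The snippet referred to above is not included in this message. As a rough guess at what such a helper might look like, here is a sketch of an implicit class that adds dropColumns to DataFrame. The names DataFrameOps, RichDataFrame and dropColumns are assumptions for illustration, not the poster's actual code; it relies only on DataFrame.columns, DataFrame.select and functions.col.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Hypothetical stand-in for the poster's com.foo.sparkstuff.DataFrameOps.
object DataFrameOps {
  implicit class RichDataFrame(val df: DataFrame) extends AnyVal {
    // Returns a NEW DataFrame without the named columns; df is left unchanged.
    def dropColumns(names: String*): DataFrame = {
      val toDrop = names.toSet
      // Keep every column whose name was not listed for removal.
      val kept = df.columns.filterNot(toDrop.contains)
      df.select(kept.map(col): _*)
    }
  }
}
```

After importing DataFrameOps._, the result has to be bound to a new value (as with prunedDf above), because the original DataFrame is not modified. Note that col may interpret names containing dots as nested fields, so this sketch assumes plain column names.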