You can use a UDF to convert one column to array type. Here's a sample:

val conf = new SparkConf().setMaster("local[4]").setAppName("test")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._
import sqlContext._

sqlContext.udf.register("f", (a: String) => Array(a, a))

val df1 = Seq(
  (1, "jeff", 12),
  (2, "andy", 34),
  (3, "pony", 23),
  (4, "jeff", 14)
).toDF("id", "name", "age")

val df2 = df1.withColumn("name", expr("f(name)"))
df2.printSchema()
df2.show()

On Fri, Dec 25, 2015 at 3:44 PM, zml张明磊 <mingleizh...@ctrip.com> wrote:

> Thanks, Jeff. It's not about choosing some columns of a Row. It's about
> choosing all the data in one column and converting it to an Array. Do you
> see what I mean?
>
> In Chinese (translated): I want to select all the data in this column by
> its name and then put it into an array.
>
>
> From: Jeff Zhang [mailto:zjf...@gmail.com]
> Sent: December 25, 2015, 3:39 PM
> To: zml张明磊
> Cc: dev@spark.apache.org
> Subject: Re: How can I get the column data based on specific column name and
> then stored these data in array or list ?
>
>
> Not sure what you mean. Do you want to choose some columns of a Row and
> convert them to an Array?
>
>
> On Fri, Dec 25, 2015 at 3:35 PM, zml张明磊 <mingleizh...@ctrip.com> wrote:
>
> Hi,
>
> I am new to Scala and Spark, and I am trying to find a DataFrame API to
> solve the problem in the subject line. However, the only relevant API I
> have found is DataFrame.col(colName: String): Column, which returns a
> Column object, not its contents. An API like Column.toArray: Type would
> be enough for me, but it doesn't exist. How can I achieve this?
>
> Thanks,
> Minglei.
>
>
> --
> Best Regards
>
> Jeff Zhang

--
Best Regards

Jeff Zhang
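A note on the original question (pulling every value of one column into a local array): in Spark 1.x this is typically done with something like `df1.select("name").rdd.map(_.getString(0)).collect()`, which yields an Array[String] on the driver (so it only suits columns small enough to fit in driver memory). The shape of that transformation can be sketched in plain Scala, without a Spark cluster; the `Person` case class and sample rows below are illustrative stand-ins for the `df1` data in the thread:

```scala
// Plain-Scala sketch of collecting one column into an array.
// The Spark 1.x equivalent would be:
//   val names = df1.select("name").rdd.map(_.getString(0)).collect()

// Stand-in for the rows of df1 ("id", "name", "age").
case class Person(id: Int, name: String, age: Int)

val people = Seq(
  Person(1, "jeff", 12),
  Person(2, "andy", 34),
  Person(3, "pony", 23),
  Person(4, "jeff", 14)
)

// Project the "name" column and materialize it as a local Array.
val names: Array[String] = people.map(_.name).toArray

println(names.mkString(", "))
```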