You can use a UDF to convert a column to an array type. Here's a sample:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.expr

val conf = new SparkConf().setMaster("local[4]").setAppName("test")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

// Register a UDF that wraps each value into an Array
sqlContext.udf.register("f", (a: String) => Array(a, a))

val df1 = Seq(
  (1, "jeff", 12),
  (2, "andy", 34),
  (3, "pony", 23),
  (4, "jeff", 14)
).toDF("id", "name", "age")

val df2 = df1.withColumn("name", expr("f(name)"))
df2.printSchema()
df2.show()
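If the goal is instead to collect all values of one column into a local array (as the original question asked), a minimal sketch, assuming Spark 1.x with a SQLContext as in the sample above:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val conf = new SparkConf().setMaster("local[4]").setAppName("collect-column")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

val df = Seq(
  (1, "jeff", 12),
  (2, "andy", 34),
  (3, "pony", 23),
  (4, "jeff", 14)
).toDF("id", "name", "age")

// Select one column and collect it to the driver as a local Scala Array.
// Note: collect() pulls all rows to the driver, so only do this for data
// small enough to fit in driver memory.
val names: Array[String] = df.select("name").collect().map(_.getString(0))

sc.stop()
```

For large columns, stay distributed instead (e.g. `df.select("name").rdd.map(_.getString(0))`) and avoid collecting to the driver.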


On Fri, Dec 25, 2015 at 3:44 PM, zml张明磊 <mingleizh...@ctrip.com> wrote:

> Thanks, Jeff. It's not about choosing some columns of a Row. It's about
> selecting all the data in one column and converting it to an Array. Do you
> understand what I mean?
>
>
>
> In Chinese (translated): I want to select all the data in this column
> based on the column name, and then put it into an array.
>
>
>
>
>
> *From:* Jeff Zhang [mailto:zjf...@gmail.com]
> *Sent:* December 25, 2015 15:39
> *To:* zml张明磊
> *Cc:* dev@spark.apache.org
> *Subject:* Re: How can I get the column data based on specific column name and
> then stored these data in array or list ?
>
>
>
> Not sure what you mean. Do you want to choose some columns of a Row and
> convert them to an Array?
>
>
>
> On Fri, Dec 25, 2015 at 3:35 PM, zml张明磊 <mingleizh...@ctrip.com> wrote:
>
>
>
> Hi,
>
>
>
>        I am new to Scala and Spark and am trying to find the relevant
> DataFrame API to solve the problem described in the subject. However, I have
> only found *DataFrame.col(colName: String): Column*, which returns a Column
> object, not its contents. An API like *Column.toArray: Type* would be enough
> for me, but DataFrame doesn't provide one. How can I achieve this?
>
>
>
> Thanks,
>
> Minglei.
>
>
>
>
>
> --
>
> Best Regards
>
> Jeff Zhang
>



-- 
Best Regards

Jeff Zhang