RE: get corrupted rows using columnNameOfCorruptRecord

2016-12-12 Thread Yehuda Finkelstein
Ok, got it. The destination column must already exist in the data frame's schema; I thought it would create a new column in the data frame. Thank you for your help.
Yehuda

From: Hyukjin Kwon [mailto:gurwls...@gmail.com]
Sent: Wednesday, December 07, 2016 12:19 PM
To: Yehuda Finkelstein
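The resolution above (the corrupt-record column must be part of the user-supplied schema, since Spark will not add it for you) can be sketched as follows. This is a minimal illustration, not code from the thread; it assumes a running Spark 2.x session bound to `spark`, and uses Spark's default column name `_corrupt_record` plus illustrative column names and path:

```scala
import org.apache.spark.sql.types._

// Explicitly include the corrupt-record column in the schema so Spark
// has somewhere to put lines that fail to parse.
val schema = StructType(Seq(
  StructField("c1", StringType, nullable = true),
  StructField("c2", IntegerType, nullable = true),
  StructField("_corrupt_record", StringType, nullable = true)
))

val df = spark.read
  .schema(schema)
  .option("columnNameOfCorruptRecord", "_corrupt_record")
  .json("/tmp/x")

// Malformed rows carry the raw line in _corrupt_record; valid rows have NULL there.
val corrupted = df.filter(df("_corrupt_record").isNotNull)
```

With this schema in place, `corrupted` isolates exactly the rows the original question was after.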

RE: get corrupted rows using columnNameOfCorruptRecord

2016-12-07 Thread Yehuda Finkelstein
.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala:2603)
  at org.apache.spark.sql.Dataset.select(Dataset.scala:969)
  at org.apache.spark.sql.Dataset.select(Dataset.scala:987)
  ... 48 elided

scala>

From: Michael Armbrust [mailto:mich...@databricks.com]
Sent: Tuesday, December 06, 2016 1
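The truncated trace above ends inside `Dataset.select`, which is consistent with selecting a corrupt-record column that was never declared in the schema passed to the reader. A hedged reconstruction of the failing pattern (assumed, not taken from the thread; `df_schema` is the schema-sampling data frame from the original post):

```scala
// If the supplied schema does NOT contain _corrupt_record, Spark has no
// column to populate, so it is simply absent from the result.
val df = spark.read
  .schema(df_schema.schema)                         // schema without _corrupt_record
  .option("columnNameOfCorruptRecord", "_corrupt_record")
  .json("/tmp/x")

// Selecting the missing column then fails analysis, producing a stack
// trace like the one above (AnalysisException: cannot resolve the column).
df.select("_corrupt_record")
```

Adding `_corrupt_record` as a `StringType` field to the schema, as noted in the 2016-12-12 reply, is what makes the `select` resolvable.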

get corrupted rows using columnNameOfCorruptRecord

2016-12-06 Thread Yehuda Finkelstein
Hi all,

I'm trying to parse JSON using an existing schema and got rows with NULLs:

// get schema
val df_schema = spark.sqlContext.sql("select c1,c2,…cn t1 limit 1")
// read json file
val f = sc.textFile("/tmp/x")
// load json into data frame using schema
var df =
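The snippet is cut off at the final assignment. One plausible continuation, labeled as an assumption rather than the author's actual code, applies the sampled schema to the raw JSON lines (`df_schema` and `f` come from the snippet; the rest is a sketch against the Spark 2.x API):

```scala
// Sketch only: apply the schema taken from the existing table to the
// JSON text loaded as an RDD[String] of one JSON document per line.
var df = spark.sqlContext.read
  .schema(df_schema.schema)
  .json(f)

// With a user-supplied schema and no corrupt-record column declared,
// unparseable lines surface as all-NULL rows, matching the symptom
// described above.
```

The follow-up messages in this thread resolve the NULL-rows symptom by adding the `columnNameOfCorruptRecord` column to the schema itself.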