Thanks, but it didn't work, because the folder contains files with 2 different schemas.
It fails with the following exception:
org.apache.spark.sql.AnalysisException: cannot resolve '`f2`' given input 
columns: [f1];
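
As the exception shows, marking f2 as nullable did not make the ORC reader fill in the column for files that lack it. One possible workaround, sketched here under assumptions rather than offered as the definitive fix, is to read each file with its own schema, add any missing column as a typed null, and union the results. The names ReadMixedOrc, alignToSchema and readAll are hypothetical helpers, not part of Spark. The sketch assumes Spark 2.x, that the ORC files sit directly under the folder, and that the folder holds at least one data file.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical helper object; the names here are illustrative only.
object ReadMixedOrc {
  // Target schema from the original question.
  val mySchema: StructType = StructType(Array(
    StructField("f1", StringType, nullable = true),
    StructField("f2", StringType, nullable = true)))

  // Project df onto the target schema; a column that df lacks
  // becomes a null literal cast to the declared type.
  def alignToSchema(df: DataFrame, schema: StructType): DataFrame = {
    val present = df.columns.toSet
    val projected = schema.fields.map { field =>
      if (present.contains(field.name))
        col(field.name).cast(field.dataType).as(field.name)
      else
        lit(null).cast(field.dataType).as(field.name)
    }
    df.select(projected: _*)
  }

  // Read every file directly under dir on its own, align it, then union.
  // Assumption: the directory is flat and contains at least one data file
  // (reduce throws on an empty collection).
  def readAll(spark: SparkSession, dir: String, schema: StructType): DataFrame = {
    val path = new Path(dir)
    val fs = path.getFileSystem(spark.sparkContext.hadoopConfiguration)
    val files = fs.listStatus(path)
      .filter(_.isFile)
      .map(_.getPath)
      // Skip marker and hidden files such as _SUCCESS.
      .filterNot(p => p.getName.startsWith("_") || p.getName.startsWith("."))
      .map(_.toString)
    files
      .map(f => alignToSchema(spark.read.format("orc").load(f), schema))
      .reduce(_ union _)
  }
}
```

It could be used like this:
val df = ReadMixedOrc.readAll(spark, "/user/hos/orc_files_test_together", ReadMixedOrc.mySchema)
df.show

Reading file by file is likely to be slow for a large folder; if the files can be separated by schema up front, loading each group in one call and aligning the two resulting dataframes should need only a single union.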


-----Original Message-----
From: smartzjp [mailto:zjp_j...@163.com] 
Sent: Tuesday, February 14, 2017 10:32 AM
To: Begar, Veena <veena.be...@hpe.com>; user@spark.apache.org
Subject: Re: How to specify default value for StructField?

You can try the below code.

val df = spark.read.format("orc").load("/user/hos/orc_files_test_together")
df.select("f1", "f2").show





On 2017/2/14 at 6:54 AM, "vbegar" <user-return-67879-zjp_jdev=163....@spark.apache.org
on behalf of veena.be...@hpe.com> wrote:

>Hello,
>
>I specified a StructType like this: 
>
>val mySchema = StructType(Array(StructField("f1", StringType, true),
>  StructField("f2", StringType, true)))
>
>I have many ORC files stored in this HDFS location:
>/user/hos/orc_files_test_together
>
>These files use different schemas: some of them have only the f1 column
>and others have both the f1 and f2 columns.
>
>I read the data from these files to a dataframe:
>val df = spark.read.format("orc").schema(mySchema).load("/user/hos/orc_files_test_together")
>
>But when I run the following command to see the data, it fails:
>df.show
>
>The error message says that the "f2" column doesn't exist.
>
>Since I have set the nullable attribute to true for the f2 column, why
>does it fail?
>
>Or is there any way to specify a default value for a StructField?
>
>In an AVRO schema, we can specify the default value in this way and can
>then read AVRO files in a folder that have 2 different schemas (either
>only the f1 column or both the f1 and f2 columns):
>
>{
>   "type": "record",
>   "name": "myrecord",
>   "fields": 
>   [
>      {
>         "name": "f1",
>         "type": "string",
>         "default": ""
>      },
>      {
>         "name": "f2",
>         "type": "string",
>         "default": ""
>      }
>   ]
>}
>
>I am wondering why this doesn't work with ORC files.
>
>Thanks.
>
>
>
>--
>View this message in context: 
>http://apache-spark-user-list.1001560.n3.nabble.com/How-to-specify-default-value-for-StructField-tp28386.html
>Sent from the Apache Spark User List mailing list archive at Nabble.com.
>

