OK, but what about on an action, like collect()? Since the rows are
actually materialized at that point, shouldn't Spark be able to check the
schema then?
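
To make the cost concrete, here is a minimal sketch (plain Scala, not the
Spark API; `Row`, `conforms`, and `firstMismatch` are hypothetical names
invented for illustration) of what eager schema validation would have to
do: compare every row's values against the expected column types, which
means scanning the data up to the first mismatch.

```scala
// Hypothetical sketch of eager row-vs-schema validation (not Spark's API).
object SchemaCheck {
  // Model a row as a sequence of values; a "schema" as the expected
  // runtime class for each column.
  type Row = Seq[Any]

  // True iff the row has the right arity and every value is an instance
  // of its column's expected class.
  def conforms(row: Row, schema: Seq[Class[_]]): Boolean =
    row.length == schema.length &&
      row.zip(schema).forall { case (value, cls) => cls.isInstance(value) }

  // Finding a bad row requires scanning rows until one fails the check --
  // in the worst case (all rows valid), the entire dataset.
  def firstMismatch(rows: Seq[Row], schema: Seq[Class[_]]): Option[Row] =
    rows.find(row => !conforms(row, schema))
}
```

Against the thread's example data, `firstMismatch` would flag `Row(0)` as
the first row that fails a `BooleanType` column, but only after touching
each row up to it, which is why a transformation like applySchema defers
the check rather than paying for a full scan eagerly.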

On Fri, Feb 13, 2015 at 4:49 PM, Yin Huai <yh...@databricks.com> wrote:

> Hi Justin,
>
> It is expected. We do not check whether the provided schema matches the
> rows, since every row would need to be scanned to give a correct answer.
>
> Thanks,
>
> Yin
>
> On Fri, Feb 13, 2015 at 1:33 PM, Justin Pihony <justin.pih...@gmail.com>
> wrote:
>
>> Per the documentation:
>>
>>   It is important to make sure that the structure of every Row of the
>> provided RDD matches the provided schema. Otherwise, there will be runtime
>> exception.
>>
>> However, it appears that this is not being enforced.
>>
>> import org.apache.spark.sql._
>> val sqlContext = new SQLContext(sc)
>> val struct = StructType(List(StructField("test", BooleanType, true)))
>> val myData = sc.parallelize(List(Row(0), Row(true), Row("stuff")))
>> val schemaData = sqlContext.applySchema(myData, struct) //No error
>> schemaData.collect()(0).getBoolean(0) //Only now will I receive an error
>>
>> Is this expected or a bug?
>>
>> Thanks,
>> Justin
>>
>>
>>
>>
>>
>
