OK, but what about on an action, like collect()? Shouldn't the mismatch be detectable at that point, once the rows are actually scanned?
On Fri, Feb 13, 2015 at 4:49 PM, Yin Huai <yh...@databricks.com> wrote:
> Hi Justin,
>
> It is expected. We do not check whether the provided schema matches the rows,
> since all rows would need to be scanned to give a correct answer.
>
> Thanks,
>
> Yin
>
> On Fri, Feb 13, 2015 at 1:33 PM, Justin Pihony <justin.pih...@gmail.com> wrote:
>
>> Per the documentation:
>>
>> It is important to make sure that the structure of every Row of the
>> provided RDD matches the provided schema. Otherwise, there will be a
>> runtime exception.
>>
>> However, it appears that this is not being enforced.
>>
>> import org.apache.spark.sql._
>> val sqlContext = new SQLContext(sc)
>> val struct = StructType(List(StructField("test", BooleanType, true)))
>> val myData = sc.parallelize(List(Row(0), Row(true), Row("stuff")))
>> val schemaData = sqlContext.applySchema(myData, struct) // No error
>> schemaData.collect()(0).getBoolean(0) // Only now will I receive an error
>>
>> Is this expected or a bug?
>>
>> Thanks,
>> Justin
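For anyone who wants to fail fast rather than wait for an action, one workaround is to validate the rows yourself before calling applySchema. Below is a minimal sketch in plain Scala (no Spark dependency): Row is modeled as a Seq[Any] and the schema as (name, expected runtime class) pairs, which are hypothetical stand-ins for Spark's Row and StructType, not Spark's own API.

```scala
// Sketch: eagerly check untyped rows against the declared field types,
// instead of deferring the failure to collect(). The names here
// (SchemaCheck, rowMatches) are illustrative, not part of Spark.
object SchemaCheck {
  type Row = Seq[Any]

  // A row conforms if it has the right arity and every value is an
  // instance of the expected (boxed) runtime class.
  def rowMatches(row: Row, schema: Seq[(String, Class[_])]): Boolean =
    row.length == schema.length &&
      row.zip(schema).forall { case (value, (_, cls)) =>
        value != null && cls.isInstance(value)
      }

  def main(args: Array[String]): Unit = {
    // Mirrors the example in the thread: one BooleanType field,
    // rows containing an Int, a Boolean, and a String.
    val schema = Seq(("test", classOf[java.lang.Boolean]))
    val rows: Seq[Row] = Seq(Seq(0), Seq(true), Seq("stuff"))
    rows.foreach(r => println(s"$r matches: ${rowMatches(r, schema)}"))
  }
}
```

Running such a check means scanning every row up front, which is exactly the cost Spark avoids by deferring validation; it only pays off when catching bad rows early is worth a full pass over the data.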