You can run Hive queries in Spark SQL, but you cannot query a Hive view through spark-avro, because the view definition is stored only in the Hive metastore. What do you mean by using the right version of Spark fixing the "can't determine table schema" problem? I faced this problem before, and my guess is that a Hive library mismatch caused it, but I am not sure. I never faced your second problem; can you post the whole stack trace for that error? Most of our datasets are also in Avro format.

Yong
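To illustrate the distinction above: a HiveContext resolves tables and views through the Hive metastore, while spark-avro reads .avro files straight off the filesystem and never consults the metastore, so a view is invisible to it. A minimal sketch, assuming Spark 1.x compiled with Hive support; the database, view, and path names are hypothetical:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("avro-view-query"))

// HiveContext pulls table AND view definitions from the Hive metastore,
// so a view defined over Avro-backed tables is queryable here.
val hiveContext = new HiveContext(sc)
val df = hiveContext.sql("SELECT * FROM mydb.my_view LIMIT 10")
df.show()

// spark-avro, by contrast, only reads Avro files directly from a path;
// there is no metastore lookup, hence no way to reach a view:
val rawFiles = hiveContext.read
  .format("com.databricks.spark.avro")
  .load("/data/events")   // hypothetical path to .avro files
```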
Date: Thu, 27 Aug 2015 09:45:45 -0700
Subject: Re: query avro hive table in spark sql
From: gpatc...@gmail.com
To: java8...@hotmail.com
CC: mich...@databricks.com; user@spark.apache.org

Can we run Hive queries using spark-avro? In our case it is not just reading an Avro file; we have a view in Hive which is based on multiple tables.

On Thu, Aug 27, 2015 at 9:41 AM, Giri P <gpatc...@gmail.com> wrote:

We are using Hive 1.1. I was able to fix the error below when I used the right version of Spark:

15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException determining schema. Returning signal schema to indicate problem
org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
    at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
    at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
    at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)
    at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)

But I still see this error when querying some Hive Avro tables:

15/08/26 17:51:27 WARN scheduler.TaskSetManager: Lost task 30.0 in stage 0.0 (TID 14, dtord01hdw0227p.dc.dotomi.net): org.apache.hadoop.hive.serde2.avro.BadSchemaException
    at org.apache.hadoop.hive.serde2.avro.AvroSerDe.deserialize(AvroSerDe.java:91)
    at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:321)
    at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:320)

I haven't tried spark-avro. We are using SQLContext to run queries in our application. Any idea if this issue might be because we are querying across different schema versions of the data?
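One common cause of the first stack trace (note the failing frame is Partition.getDeserializer) is a partitioned Avro table whose older partitions were created before avro.schema.url was set, so those partitions carry serde properties without any schema pointer. A hedged sketch of one possible fix, assuming that is the situation here; the table, partition, and schema-file names are hypothetical:

```scala
// Set the schema pointer at the table level so new reads can resolve it ...
hiveContext.sql(
  """ALTER TABLE mydb.events
    |SET TBLPROPERTIES ('avro.schema.url'='hdfs:///schemas/events.avsc')""".stripMargin)

// ... and, because Hive partitions snapshot their serde properties at
// creation time, also repair any pre-existing partition explicitly:
hiveContext.sql(
  """ALTER TABLE mydb.events PARTITION (dt='2015-08-26')
    |SET SERDEPROPERTIES ('avro.schema.url'='hdfs:///schemas/events.avsc')""".stripMargin)
```

The second error (BadSchemaException during deserialize) is consistent with reading data files written under a different schema version than the one the serde resolved, which is worth ruling out before anything else.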
Thanks
Giri

On Thu, Aug 27, 2015 at 5:39 AM, java8964 <java8...@hotmail.com> wrote:

What version of Hive are you using? And did you compile Spark against the right version of Hive? BTW, spark-avro works great in our experience, but still, some non-technical people just want to use Spark as a SQL shell, like the Hive CLI.

Yong

From: mich...@databricks.com
Date: Wed, 26 Aug 2015 17:48:44 -0700
Subject: Re: query avro hive table in spark sql
To: gpatc...@gmail.com
CC: user@spark.apache.org

I'd suggest looking at http://spark-packages.org/package/databricks/spark-avro

On Wed, Aug 26, 2015 at 11:32 AM, gpatcham <gpatc...@gmail.com> wrote:

Hi,

I'm trying to query a Hive table which is based on Avro in Spark SQL and I am seeing the errors below:

15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException determining schema. Returning signal schema to indicate problem
org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
    at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
    at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
    at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)
    at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)

It's not able to determine the schema, although the Hive table points to the Avro schema via a URL. I'm stuck and couldn't find more info on this. Any pointers?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/query-avro-hive-table-in-spark-sql-tp24462.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
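For reference, the spark-avro package suggested above bypasses the Hive serde entirely by reading the Avro files directly, which sidesteps the avro.schema.url problem for plain tables (though not for views). A minimal sketch, assuming spark-avro 2.x on Spark 1.4+; the package version and data path are assumptions:

```scala
// Launch the shell with the package on the classpath, e.g.:
//   spark-shell --packages com.databricks:spark-avro_2.10:2.0.1

// spark-avro infers the schema from the Avro files themselves,
// so no avro.schema.literal / avro.schema.url property is needed.
val events = sqlContext.read
  .format("com.databricks.spark.avro")
  .load("hdfs:///data/events/2015-08-26")   // hypothetical path

// Register as a temp table to query it with SQL, Hive-CLI style.
events.registerTempTable("events")
sqlContext.sql("SELECT COUNT(*) FROM events").show()
```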