Re: query avro hive table in spark sql

2015-08-28 Thread Giri P
Any idea what's causing this error?

15/08/28 21:03:03 WARN scheduler.TaskSetManager: Lost task 34.0 in stage
9.0 (TID 20, dtord01hdw0228p.dc.dotomi.net): java.lang.RuntimeException:
cannot find field message_campaign_id from
[0:error_error_error_error_error_error_error, 1:cannot_determine_schema,
2:check, 3:schema, 4:url, 5:and, 6:literal]
at
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:410)
at
org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:147)
at
org.apache.spark.sql.hive.HadoopTableReader$$anonfun$12.apply(TableReader.scala:278)
at
org.apache.spark.sql.hive.HadoopTableReader$$anonfun$12.apply(TableReader.scala:277)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at
scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at
scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at
org.apache.spark.sql.hive.HadoopTableReader$.fillObject(TableReader.scala:277)
at
org.apache.spark.sql.hive.HadoopTableReader$$anonfun$4$$anonfun$9.apply(TableReader.scala:194)
at
org.apache.spark.sql.hive.HadoopTableReader$$anonfun$4$$anonfun$9.apply(TableReader.scala:188)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
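
The struct in that message ([0:error_error_error_error_error_error_error, 1:cannot_determine_schema, ...]) is the AvroSerDe's "signal schema": schema determination failed for that partition, so the real columns (including message_campaign_id) were never resolved. As a hedged sketch (the table name, partition spec, and .avsc path are all hypothetical), the schema URL usually has to be visible both at the table level and on any partitions created before it was set:

```sql
-- Hypothetical table name and paths; adjust to the real table.
ALTER TABLE clicks
  SET TBLPROPERTIES ('avro.schema.url'='hdfs:///schemas/clicks.avsc');

-- Partitions keep the serde properties they were created with, so an
-- older partition may still be missing the schema URL:
ALTER TABLE clicks PARTITION (dt='2015-08-28')
  SET SERDEPROPERTIES ('avro.schema.url'='hdfs:///schemas/clicks.avsc');
```
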

On Thu, Aug 27, 2015 at 12:02 PM, Michael Armbrust mich...@databricks.com
wrote:

 BTW, spark-avro works great in our experience, but still, some non-technical
 people just want to use a SQL shell in Spark, like the Hive CLI.


 To clarify: you can still use the spark-avro library with pure SQL.  Just
 use the CREATE TABLE ... USING com.databricks.spark.avro OPTIONS (path
 '...') syntax.
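
A minimal sketch of that syntax (the table name and path here are hypothetical):

```sql
-- Registers the Avro files as a table through the spark-avro data
-- source, bypassing the Hive AvroSerDe and its schema properties.
CREATE TEMPORARY TABLE episodes
USING com.databricks.spark.avro
OPTIONS (path "hdfs:///data/episodes.avro");

SELECT count(*) FROM episodes;
```
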



RE: query avro hive table in spark sql

2015-08-27 Thread java8964
What version of Hive are you using? And did you compile Spark against the right
version of Hive?
BTW, spark-avro works great in our experience, but still, some non-technical people
just want to use a SQL shell in Spark, like the Hive CLI.
Yong

From: mich...@databricks.com
Date: Wed, 26 Aug 2015 17:48:44 -0700
Subject: Re: query avro hive table in spark sql
To: gpatc...@gmail.com
CC: user@spark.apache.org

I'd suggest looking at http://spark-packages.org/package/databricks/spark-avro
On Wed, Aug 26, 2015 at 11:32 AM, gpatcham gpatc...@gmail.com wrote:
Hi,

I'm trying to query a Hive table which is based on Avro in Spark SQL and
seeing the below errors.

15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException
determining schema. Returning signal schema to indicate problem
org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither
avro.schema.literal nor avro.schema.url specified, can't determine table
schema
at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)
at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)

It's not able to determine the schema. The Hive table points to the Avro schema
using a URL. I'm stuck and couldn't find more info on this.

Any pointers?







--

View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/query-avro-hive-table-in-spark-sql-tp24462.html

Sent from the Apache Spark User List mailing list archive at Nabble.com.



-

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org

For additional commands, e-mail: user-h...@spark.apache.org




  

Re: query avro hive table in spark sql

2015-08-27 Thread Giri P
I was using a different build of Spark, compiled against a different version of
Hive, before.

The error which I see now is:

org.apache.hadoop.hive.serde2.avro.BadSchemaException
at
org.apache.hadoop.hive.serde2.avro.AvroSerDe.deserialize(AvroSerDe.java:195)
at
org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:321)
at
org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:320)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at
org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$6.apply(Aggregate.scala:128)
at
org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$6.apply(Aggregate.scala:124)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
at
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at
scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at scala.Option.foreach(Option.scala:236)
at
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)


On Thu, Aug 27, 2015 at 10:38 AM, java8964 java8...@hotmail.com wrote:

 You can run Hive queries in spark-avro, but you cannot query a Hive
 view in spark-avro, as the view definition is stored in the Hive metastore.

 What do you mean by "the right version of Spark", after which the "can't
 determine table schema" problem was fixed? I faced this problem before, and my
 guess is a Hive library mismatch, but I'm not sure.

 I never faced your 2nd problem; can you post the whole stack trace for that
 error?

 Most of our datasets are also in Avro format.

 Yong

 --
 Date: Thu, 27 Aug 2015 09:45:45 -0700
 Subject: Re: query avro hive table in spark sql
 From: gpatc...@gmail.com
 To: java8...@hotmail.com
 CC: mich...@databricks.com; user@spark.apache.org


 Can we run Hive queries using spark-avro?

 In our case it's not just reading the Avro file; we have a view in Hive which
 is based on multiple tables.

 On Thu, Aug 27, 2015 at 9:41 AM, Giri P gpatc...@gmail.com wrote:

 We are using Hive 1.1.

 I was able to fix the below error once I used the right version of Spark:

 15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException
 determining schema. Returning signal schema to indicate problem
 org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither
 avro.schema.literal nor avro.schema.url specified, can't determine table
 schema
 at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
 at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
 at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
 at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)

RE: query avro hive table in spark sql

2015-08-27 Thread java8964
You can run Hive queries in spark-avro, but you cannot query a Hive view in
spark-avro, as the view definition is stored in the Hive metastore.
What do you mean by "the right version of Spark", after which the "can't determine
table schema" problem was fixed? I faced this problem before, and my guess is a
Hive library mismatch, but I'm not sure.
I never faced your 2nd problem; can you post the whole stack trace for that error?
Most of our datasets are also in Avro format.
Yong

Date: Thu, 27 Aug 2015 09:45:45 -0700
Subject: Re: query avro hive table in spark sql
From: gpatc...@gmail.com
To: java8...@hotmail.com
CC: mich...@databricks.com; user@spark.apache.org

Can we run Hive queries using spark-avro?
In our case it's not just reading the Avro file; we have a view in Hive which is
based on multiple tables.
On Thu, Aug 27, 2015 at 9:41 AM, Giri P gpatc...@gmail.com wrote:
We are using Hive 1.1.
I was able to fix the below error once I used the right version of Spark:

15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException
determining schema. Returning signal schema to indicate problem
org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither
avro.schema.literal nor avro.schema.url specified, can't determine table
schema
at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)
at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)



But I still see this error when querying some Hive Avro tables:

15/08/26 17:51:27 WARN scheduler.TaskSetManager: Lost task 30.0 in stage
0.0 (TID 14, dtord01hdw0227p.dc.dotomi.net):
org.apache.hadoop.hive.serde2.avro.BadSchemaException

at org.apache.hadoop.hive.serde2.avro.AvroSerDe.deserialize(AvroSerDe.java:91)
at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:321)
at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:320)

I haven't tried spark-avro. We are using SQLContext to run queries in our
application.

Any idea if this issue might be because of querying across different schema
versions of the data?

Thanks,
Giri
On Thu, Aug 27, 2015 at 5:39 AM, java8964 java8...@hotmail.com wrote:



What version of Hive are you using? And did you compile Spark against the right
version of Hive?
BTW, spark-avro works great in our experience, but still, some non-technical people
just want to use a SQL shell in Spark, like the Hive CLI.
Yong

From: mich...@databricks.com
Date: Wed, 26 Aug 2015 17:48:44 -0700
Subject: Re: query avro hive table in spark sql
To: gpatc...@gmail.com
CC: user@spark.apache.org

I'd suggest looking at http://spark-packages.org/package/databricks/spark-avro
On Wed, Aug 26, 2015 at 11:32 AM, gpatcham gpatc...@gmail.com wrote:
Hi,

I'm trying to query a Hive table which is based on Avro in Spark SQL and
seeing the below errors.

15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException
determining schema. Returning signal schema to indicate problem
org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither
avro.schema.literal nor avro.schema.url specified, can't determine table
schema
at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)
at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)

It's not able to determine the schema. The Hive table points to the Avro schema
using a URL. I'm stuck and couldn't find more info on this.

Any pointers?








Re: query avro hive table in spark sql

2015-08-27 Thread Giri P
Can we run Hive queries using spark-avro?

In our case it's not just reading the Avro file; we have a view in Hive which
is based on multiple tables.

On Thu, Aug 27, 2015 at 9:41 AM, Giri P gpatc...@gmail.com wrote:

 We are using Hive 1.1.

 I was able to fix the below error once I used the right version of Spark:

 15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException
 determining schema. Returning signal schema to indicate problem
 org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither
 avro.schema.literal nor avro.schema.url specified, can't determine table
 schema
 at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
 at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
 at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
 at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)
 at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)



 But I still see this error when querying some Hive Avro tables:

 15/08/26 17:51:27 WARN scheduler.TaskSetManager: Lost task 30.0 in stage
 0.0 (TID 14, dtord01hdw0227p.dc.dotomi.net):
 org.apache.hadoop.hive.serde2.avro.BadSchemaException

 at org.apache.hadoop.hive.serde2.avro.AvroSerDe.deserialize(AvroSerDe.java:91)
 at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:321)
 at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:320)

 I haven't tried spark-avro. We are using SQLContext to run queries in our
 application.

 Any idea if this issue might be because of querying across different schema
 versions of the data?

 Thanks
 Giri

 On Thu, Aug 27, 2015 at 5:39 AM, java8964 java8...@hotmail.com wrote:

 What version of Hive are you using? And did you compile Spark against the
 right version of Hive?

 BTW, spark-avro works great in our experience, but still, some non-technical
 people just want to use a SQL shell in Spark, like the Hive CLI.

 Yong

 --
 From: mich...@databricks.com
 Date: Wed, 26 Aug 2015 17:48:44 -0700
 Subject: Re: query avro hive table in spark sql
 To: gpatc...@gmail.com
 CC: user@spark.apache.org


 I'd suggest looking at
 http://spark-packages.org/package/databricks/spark-avro

 On Wed, Aug 26, 2015 at 11:32 AM, gpatcham gpatc...@gmail.com wrote:

 Hi,

 I'm trying to query a Hive table which is based on Avro in Spark SQL and
 seeing the below errors.

 15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException
 determining schema. Returning signal schema to indicate problem
 org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither
 avro.schema.literal nor avro.schema.url specified, can't determine table
 schema
 at

 org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
 at

 org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
 at
 org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
 at

 org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)
 at

 org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)


 It's not able to determine the schema. The Hive table points to the Avro schema
 using a URL. I'm stuck and couldn't find more info on this.

 Any pointers ?




Re: query avro hive table in spark sql

2015-08-27 Thread Giri P
We are using Hive 1.1.

I was able to fix the below error once I used the right version of Spark:

15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException
determining schema. Returning signal schema to indicate problem
org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither
avro.schema.literal nor avro.schema.url specified, can't determine table
schema
at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)
at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)



But I still see this error when querying some Hive Avro tables:

15/08/26 17:51:27 WARN scheduler.TaskSetManager: Lost task 30.0 in stage
0.0 (TID 14, dtord01hdw0227p.dc.dotomi.net):
org.apache.hadoop.hive.serde2.avro.BadSchemaException

at org.apache.hadoop.hive.serde2.avro.AvroSerDe.deserialize(AvroSerDe.java:91)
at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:321)
at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:320)

I haven't tried spark-avro. We are using SQLContext to run queries in our
application.

Any idea if this issue might be because of querying across different schema
versions of the data?

Thanks,
Giri
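
On the schema-version question: one general Avro rule that is easy to trip over is that a reader schema can only resolve records written without a given field if that field declares a "default". Whether that is what produces the BadSchemaException above is uncertain, but a column added in a newer schema version would typically be declared along these lines (hypothetical schema, for illustration only):

```json
{
  "type": "record",
  "name": "Click",
  "fields": [
    {"name": "user_id", "type": "long"},
    {"name": "message_campaign_id", "type": ["null", "long"], "default": null}
  ]
}
```

Without the union-with-null type and "default": null, data written under the older schema cannot be resolved against the newer one.
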

On Thu, Aug 27, 2015 at 5:39 AM, java8964 java8...@hotmail.com wrote:

 What version of Hive are you using? And did you compile Spark against the
 right version of Hive?

 BTW, spark-avro works great in our experience, but still, some non-technical
 people just want to use a SQL shell in Spark, like the Hive CLI.

 Yong

 --
 From: mich...@databricks.com
 Date: Wed, 26 Aug 2015 17:48:44 -0700
 Subject: Re: query avro hive table in spark sql
 To: gpatc...@gmail.com
 CC: user@spark.apache.org


 I'd suggest looking at
 http://spark-packages.org/package/databricks/spark-avro

 On Wed, Aug 26, 2015 at 11:32 AM, gpatcham gpatc...@gmail.com wrote:

 Hi,

 I'm trying to query a Hive table which is based on Avro in Spark SQL and
 seeing the below errors.

 15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException
 determining schema. Returning signal schema to indicate problem
 org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither
 avro.schema.literal nor avro.schema.url specified, can't determine table
 schema
 at

 org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
 at

 org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
 at
 org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
 at

 org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)
 at

 org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)


 It's not able to determine the schema. The Hive table points to the Avro schema
 using a URL. I'm stuck and couldn't find more info on this.

 Any pointers ?




Re: query avro hive table in spark sql

2015-08-27 Thread Michael Armbrust

 BTW, spark-avro works great in our experience, but still, some non-technical
 people just want to use a SQL shell in Spark, like the Hive CLI.


To clarify: you can still use the spark-avro library with pure SQL.  Just
use the CREATE TABLE ... USING com.databricks.spark.avro OPTIONS (path
'...') syntax.


Re: query avro hive table in spark sql

2015-08-27 Thread ponkin
Can you select something from this table using Hive? And also, could you post
the Spark code which leads to this exception?



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/query-avro-hive-table-in-spark-sql-tp24462p24468.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




query avro hive table in spark sql

2015-08-26 Thread gpatcham
Hi,

I'm trying to query a Hive table which is based on Avro in Spark SQL and
seeing the below errors.

15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException
determining schema. Returning signal schema to indicate problem
org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither
avro.schema.literal nor avro.schema.url specified, can't determine table
schema
at
org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
at
org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
at
org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
at
org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)
at
org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)


It's not able to determine the schema. The Hive table points to the Avro schema
using a URL. I'm stuck and couldn't find more info on this.

Any pointers?
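
For reference, the AvroSerDe reads avro.schema.url (or avro.schema.literal) from the table/serde properties, and the exception above fires when it can see neither. A table wired up via a URL typically looks like this hedged sketch (table name, location, and .avsc path are all hypothetical):

```sql
-- Hypothetical names and paths; the schema comes from the .avsc file.
CREATE EXTERNAL TABLE clicks
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/data/clicks'
TBLPROPERTIES ('avro.schema.url'='hdfs:///schemas/clicks.avsc');
```

It may be worth checking with DESCRIBE FORMATTED (on the table, and per partition) that the property actually shows up where Spark's Hive reader will look for it.
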






Re: query avro hive table in spark sql

2015-08-26 Thread Michael Armbrust
I'd suggest looking at
http://spark-packages.org/package/databricks/spark-avro

On Wed, Aug 26, 2015 at 11:32 AM, gpatcham gpatc...@gmail.com wrote:

 Hi,

 I'm trying to query a Hive table which is based on Avro in Spark SQL and
 seeing the below errors.

 15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException
 determining schema. Returning signal schema to indicate problem
 org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither
 avro.schema.literal nor avro.schema.url specified, can't determine table
 schema
 at

 org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
 at

 org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
 at
 org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
 at

 org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)
 at

 org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)


 It's not able to determine the schema. The Hive table points to the Avro schema
 using a URL. I'm stuck and couldn't find more info on this.

 Any pointers ?


