Re: spark-sql not coming up with Hive 0.10.0/CDH 4.6

2014-10-15 Thread Anurag Tangri
Hi Marcelo,
Exactly. I found it a few minutes ago.

I ran the Hive 0.12 MySQL metastore SQL against my Hive 0.10 metastore, which
created the missing tables, and spark-sql seems to be working now.
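
For the archives, what I ran was roughly the following; the database name and
paths are placeholders for wherever your metastore and Hive 0.12 distribution
live (the upgrade scripts ship under scripts/metastore/upgrade/mysql/ in the
Hive tarball), so treat this as a sketch, not a recipe:

# assumed: MySQL metastore DB named "metastore", Hive 0.12 tarball unpacked locally
cd hive-0.12.0/scripts/metastore/upgrade/mysql
mysql -u hive -p metastore < upgrade-0.10.0-to-0.11.0.mysql.sql
mysql -u hive -p metastore < upgrade-0.11.0-to-0.12.0.mysql.sql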

I'm not sure whether everything else in CDH 4.6/Hive 0.10 still works after
that change, though.

It looks like we cannot use Spark SQL cleanly with CDH4 unless we upgrade to
CDH5.
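
For reference, the build I used was along these lines; the exact CDH 4.6 MRv1
Hadoop version string is my assumption, so check the Spark building docs for
your layout before copying it:

# assumed: CDH4 MRv1; -Phive pulls in the Hive/Datanucleus support
mvn -Dhadoop.version=2.0.0-mr1-cdh4.6.0 -Phive -DskipTests clean package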


Thanks for your response!

Thanks,
Anurag Tangri


On Wed, Oct 15, 2014 at 12:02 PM, Marcelo Vanzin wrote:

> Hi Anurag,
>
> Spark SQL (from the Spark standard distribution / sources) currently
> requires Hive 0.12; as you mention, CDH4 has Hive 0.10, so that's not
> going to work.
>
> CDH 5.2 ships with Spark 1.1.0 and is modified so that Spark SQL can
> talk to the Hive 0.13.1 that is also bundled with CDH, so if that's an
> option for you, you could try it out.
>
>
> On Wed, Oct 15, 2014 at 11:23 AM, Anurag Tangri wrote:
> > I see the Hive 0.10.0 metastore schema does not have a VERSION table,
> > but Spark is looking for it.
> >
> > Has anyone else faced this issue, or any ideas on how to fix it?
> >
> >
> > Thanks,
> > Anurag Tangri

Re: spark-sql not coming up with Hive 0.10.0/CDH 4.6

2014-10-15 Thread Anurag Tangri
I see the Hive 0.10.0 metastore schema does not have a VERSION table, but
Spark is looking for it.

Has anyone else faced this issue, or any ideas on how to fix it?
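
For reference, the table Spark appears to want looks like the following, going
by my reading of the Hive 0.12 MySQL schema script
(hive-schema-0.12.0.mysql.sql). The database name is a placeholder and the
version row is an assumption, so back up the metastore before trying anything
like this:

# sketch only: assumed MySQL metastore DB named "metastore"; verify the DDL
# against your own hive-schema-*.mysql.sql first
mysql -u hive -p metastore <<'SQL'
CREATE TABLE VERSION (
  VER_ID BIGINT NOT NULL,
  SCHEMA_VERSION VARCHAR(127) NOT NULL,
  VERSION_COMMENT VARCHAR(255),
  PRIMARY KEY (VER_ID)
);
INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT)
  VALUES (1, '0.12.0', 'Hive release version 0.12.0');
SQL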


Thanks,
Anurag Tangri



On Wed, Oct 15, 2014 at 10:51 AM, Anurag Tangri wrote:

> Hi,
> I compiled Spark 1.1.0 against CDH 4.6, but when I try to bring up the
> spark-sql CLI, it gives this error:
>
>
> ==
>
> [atangri@pit-uat-hdputil1 bin]$ ./spark-sql
> Spark assembly has been built with Hive, including Datanucleus jars on
> classpath
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
> MaxPermSize=128m; support was removed in 8.0
> log4j:WARN No appenders could be found for logger
> (org.apache.hadoop.conf.Configuration).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
> more info.
> Unable to initialize logging using hive-log4j.properties, not found on
> CLASSPATH!
> Using Spark's default log4j profile:
> org/apache/spark/log4j-defaults.properties
> 14/10/15 17:45:17 INFO SecurityManager: Changing view acls to: atangri,
> 14/10/15 17:45:17 INFO SecurityManager: Changing modify acls to: atangri,
> 14/10/15 17:45:17 INFO SecurityManager: SecurityManager: authentication
> disabled; ui acls disabled; users with view permissions: Set(atangri, );
> users with modify permissions: Set(atangri, )
> 14/10/15 17:45:17 INFO Slf4jLogger: Slf4jLogger started
> 14/10/15 17:45:17 INFO Remoting: Starting remoting
> 14/10/15 17:45:17 INFO Remoting: Remoting started; listening on addresses
> :[akka.tcp://sparkDriver@pit-uat-hdputil1.snc1:54506]
> 14/10/15 17:45:17 INFO Remoting: Remoting now listens on addresses:
> [akka.tcp://sparkDriver@pit-uat-hdputil1.snc1:54506]
> 14/10/15 17:45:17 INFO Utils: Successfully started service 'sparkDriver'
> on port 54506.
> 14/10/15 17:45:17 INFO SparkEnv: Registering MapOutputTracker
> 14/10/15 17:45:17 INFO SparkEnv: Registering BlockManagerMaster
> 14/10/15 17:45:17 INFO DiskBlockManager: Created local directory at
> /tmp/spark-local-20141015174517-bdfa
> 14/10/15 17:45:17 INFO Utils: Successfully started service 'Connection
> manager for block manager' on port 58400.
> 14/10/15 17:45:17 INFO ConnectionManager: Bound socket to port 58400 with
> id = ConnectionManagerId(pit-uat-hdputil1.snc1,58400)
> 14/10/15 17:45:17 INFO MemoryStore: MemoryStore started with capacity
> 265.1 MB
> 14/10/15 17:45:17 INFO BlockManagerMaster: Trying to register BlockManager
> 14/10/15 17:45:17 INFO BlockManagerMasterActor: Registering block manager
> pit-uat-hdputil1.snc1:58400 with 265.1 MB RAM
> 14/10/15 17:45:17 INFO BlockManagerMaster: Registered BlockManager
> 14/10/15 17:45:17 INFO HttpFileServer: HTTP File server directory is
> /tmp/spark-c7f28004-6189-424f-a214-379d5dcc72b7
> 14/10/15 17:45:17 INFO HttpServer: Starting HTTP Server
> 14/10/15 17:45:17 INFO Utils: Successfully started service 'HTTP file
> server' on port 33666.
> 14/10/15 17:45:18 INFO Utils: Successfully started service 'SparkUI' on
> port 4040.
> 14/10/15 17:45:18 INFO SparkUI: Started SparkUI at
> http://pit-uat-hdputil1.snc1:4040
> 14/10/15 17:45:18 INFO AkkaUtils: Connecting to HeartbeatReceiver:
> akka.tcp://sparkDriver@pit-uat-hdputil1.snc1:54506/user/HeartbeatReceiver
> spark-sql> show tables;
> 14/10/15 17:45:22 INFO ParseDriver: Parsing command: show tables
> 14/10/15 17:45:22 INFO ParseDriver: Parse Completed
> 14/10/15 17:45:23 INFO Driver: 
> 14/10/15 17:45:23 INFO Driver: 
> 14/10/15 17:45:23 INFO Driver: 
> 14/10/15 17:45:23 INFO Driver: 
> 14/10/15 17:45:23 INFO ParseDriver: Parsing command: show tables
> 14/10/15 17:45:23 INFO ParseDriver: Parse Completed
> 14/10/15 17:45:23 INFO Driver:  end=1413395123539 duration=1>
> 14/10/15 17:45:23 INFO Driver: 
> 14/10/15 17:45:23 INFO Driver: Semantic Analysis Completed
> 14/10/15 17:45:23 INFO Driver:  start=1413395123539 end=1413395123641 duration=102>
> 14/10/15 17:45:23 INFO ListSinkOperator: Initializing Self 0 OP
> 14/10/15 17:45:23 INFO ListSinkOperator: Operator 0 OP initialized
> 14/10/15 17:45:23 INFO ListSinkOperator: Initialization Done 0 OP
> 14/10/15 17:45:23 INFO Driver: Returning Hive schema:
> Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from
> deserializer)], properties:null)
> 14/10/15 17:45:23 INFO Driver:  start=1413395123517 end=1413395123696 duration=179>
> 14/10/15 17:45:23 INFO Driver: 
> 14/10/15 17:45:23 INFO Driver: Starting command: show tables
> 14/10/15 17:45:23 INFO Driver:  start=1413395123517 end=1413395123698 duration=181>
> 14/10/15 17:45:23 INFO Driver: 
> 14/10/15 17:45:23 INFO Driver: 
> 14/10/15 17:45:23 INFO HiveMetaStore: 0: Opening raw store with
> 

spark-sql not coming up with Hive 0.10.0/CDH 4.6

2014-10-15 Thread Anurag Tangri
at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult(NativeCommand.scala:35)
at org.apache.spark.sql.hive.execution.NativeCommand.execute(NativeCommand.scala:38)
at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:360)
at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:360)
at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:103)
at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:98)
at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:58)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:291)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:226)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
14/10/15 17:45:25 ERROR RetryingRawStore: JDO datastore error. Retrying
metastore command after 1000 ms (attempt 1 of 1)
14/10/15 17:45:26 WARN Query: Query for candidates of
org.apache.hadoop.hive.metastore.model.MVersionTable and subclasses
resulted in no possible candidates
Required table missing : "`VERSION`" in Catalog "" Schema "". DataNucleus
requires this table to perform its persistence operations. Either your
MetaData is incorrect, or you need to enable "datanucleus.autoCreateTables"
org.datanucleus.store.rdbms.exceptions.MissingTableException: Required
table missing : "`VERSION`" in





Can somebody tell me what I am missing?


The same query works via the Hive shell.
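
One thing I have not tried yet: the DataNucleus error above points at
datanucleus.autoCreateTables. If spark-sql forwards Hive configuration options
the way the Hive CLI does (I have not verified that it does), something like
the line below might let DataNucleus create the missing table itself; setting
the same property in hive-site.xml should be equivalent:

# untested sketch: assumes spark-sql accepts Hive CLI-style --hiveconf options
./spark-sql --hiveconf datanucleus.autoCreateTables=true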


Thanks,
Anurag Tangri


Hive 0.11 / CDH 4.6 / Spark 0.9.1 dilemma

2014-08-06 Thread Anurag Tangri
I posted this to the cdh-user mailing list yesterday, but I think this is the
right audience for it:

=

Hi All,
Not sure if anyone else has faced this same issue.

We installed CDH 4.6 that uses Hive 0.10.

And we have Spark 0.9.1, which comes with Hive 0.11.

Now our Hive jobs that work on CDH fail in Shark.

Is anyone else facing the same issues, and are there any workarounds?

Can we recompile Shark 0.9.1 against Hive 0.10, or compile Hive 0.11 on CDH 4.6?
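
If anyone has been down this road: my understanding (unverified, so treat
every path below as an assumption) is that Shark 0.9.1 is meant to run against
its own patched Hive 0.11 jars, while hive-site.xml can keep pointing at the
existing metastore. In shark-env.sh that would look something like:

export HIVE_HOME=/opt/hive-0.11.0-bin   # the Hive build Shark 0.9.1 expects (assumed path)
export HIVE_CONF_DIR=/etc/hive/conf     # existing cluster hive-site.xml (assumed path)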



Thanks,
Anurag Tangri


Re: 1.0.0 Release Date?

2014-05-13 Thread Anurag Tangri
Hi All,
We are also waiting for this. Does anyone know a tentative date for this
release?

We are on Spark 0.8.0 right now. Should we wait for Spark 1.0 or upgrade to
Spark 0.9.1?


Thanks,
Anurag Tangri



On Tue, May 13, 2014 at 9:40 AM, bhusted wrote:

> Can anyone comment on the anticipated date or worst-case timeframe for when
> Spark 1.0.0 will be released?
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/1-0-0-Release-Date-tp5664.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>