Hi Jenny,

You can try passing |--files $SPARK_HOME/conf/hive-site.xml --driver-class-path hive-site.xml| when submitting your application. The problem is that when running in cluster mode, the driver actually runs in a container directory on an arbitrary executor node. |--files| uploads hive-site.xml into that container directory, and |--driver-class-path hive-site.xml| adds the file to the driver's classpath (the path is relative to the container directory).
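As a sketch, a submission command combining these two flags might look like the following (the class name |HiveSpark| and the jar path are placeholders for your own application; adjust them to your build):

```shell
# Ship hive-site.xml into each container's working directory via --files,
# then reference it by its container-relative name on the driver classpath.
# HiveSpark and /path/to/your-app.jar are placeholders.
spark-submit \
  --class HiveSpark \
  --master yarn-cluster \
  --files $SPARK_HOME/conf/hive-site.xml \
  --driver-class-path hive-site.xml \
  /path/to/your-app.jar
```

The key point is that |--driver-class-path| uses the bare file name, not an absolute path, because the file is localized into the container's working directory.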

When running in cluster mode, have you tried listing the tables inside the default database? If my guess is right, you should see an empty default database inside the local Derby metastore that HiveContext creates when hive-site.xml is missing.

Best,
Cheng

On 8/12/14 5:38 PM, Jenny Zhao wrote:


Hi Yin,

hive-site.xml was copied to spark/conf and is the same as the one under $HIVE_HOME/conf.

Through the Hive CLI, I don't see any problem. But for Spark in yarn-cluster mode, I am not able to switch to a database other than the default one; in yarn-client mode, it works fine.

Thanks!

Jenny


On Tue, Aug 12, 2014 at 12:53 PM, Yin Huai <huaiyin....@gmail.com> wrote:

    Hi Jenny,

    Have you copied hive-site.xml to spark/conf directory? If not, can
    you put it in conf/ and try again?

    Thanks,

    Yin


    On Mon, Aug 11, 2014 at 8:57 PM, Jenny Zhao <linlin200...@gmail.com> wrote:


        Thanks Yin!

        Here is my hive-site.xml, which I copied from $HIVE_HOME/conf. I didn't experience any problem connecting to the metastore through Hive, which uses DB2 as the metastore database.

        <?xml version="1.0"?>
        <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
        <!--
           Licensed to the Apache Software Foundation (ASF) under one or more
           contributor license agreements.  See the NOTICE file distributed with
           this work for additional information regarding copyright ownership.
           The ASF licenses this file to You under the Apache License, Version 2.0
           (the "License"); you may not use this file except in compliance with
           the License.  You may obtain a copy of the License at

               http://www.apache.org/licenses/LICENSE-2.0

           Unless required by applicable law or agreed to in writing, software
           distributed under the License is distributed on an "AS IS" BASIS,
           WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
           See the License for the specific language governing permissions and
           limitations under the License.
        -->
        <configuration>
         <property>
          <name>hive.hwi.listen.port</name>
          <value>9999</value>
         </property>
         <property>
          <name>hive.querylog.location</name>
          <value>/var/ibm/biginsights/hive/query/${user.name}</value>
         </property>
         <property>
          <name>hive.metastore.warehouse.dir</name>
          <value>/biginsights/hive/warehouse</value>
         </property>
         <property>
          <name>hive.hwi.war.file</name>
          <value>lib/hive-hwi-0.12.0.war</value>
         </property>
         <property>
          <name>hive.metastore.metrics.enabled</name>
          <value>true</value>
         </property>
         <property>
          <name>javax.jdo.option.ConnectionURL</name>
          <value>jdbc:db2://hdtest022.svl.ibm.com:50001/BIDB</value>
         </property>
         <property>
          <name>javax.jdo.option.ConnectionDriverName</name>
          <value>com.ibm.db2.jcc.DB2Driver</value>
         </property>
         <property>
          <name>hive.stats.autogather</name>
          <value>false</value>
         </property>
         <property>
          <name>javax.jdo.mapping.Schema</name>
          <value>HIVE</value>
         </property>
         <property>
          <name>javax.jdo.option.ConnectionUserName</name>
          <value>catalog</value>
         </property>
         <property>
          <name>javax.jdo.option.ConnectionPassword</name>
          <value>V2pJNWMxbFlVbWhaZHowOQ==</value>
         </property>
         <property>
          <name>hive.metastore.password.encrypt</name>
          <value>true</value>
         </property>
         <property>
          <name>org.jpox.autoCreateSchema</name>
          <value>true</value>
         </property>
         <property>
          <name>hive.server2.thrift.min.worker.threads</name>
          <value>5</value>
         </property>
         <property>
          <name>hive.server2.thrift.max.worker.threads</name>
          <value>100</value>
         </property>
         <property>
          <name>hive.server2.thrift.port</name>
          <value>10000</value>
         </property>
         <property>
          <name>hive.server2.thrift.bind.host</name>
          <value>hdtest022.svl.ibm.com</value>
         </property>
         <property>
          <name>hive.server2.authentication</name>
          <value>CUSTOM</value>
         </property>
         <property>
          <name>hive.server2.custom.authentication.class</name>
          <value>org.apache.hive.service.auth.WebConsoleAuthenticationProviderImpl</value>
         </property>
         <property>
          <name>hive.server2.enable.impersonation</name>
          <value>true</value>
         </property>
         <property>
          <name>hive.security.webconsole.url</name>
          <value>http://hdtest022.svl.ibm.com:8080</value>
         </property>
         <property>
          <name>hive.security.authorization.enabled</name>
          <value>true</value>
         </property>
         <property>
          <name>hive.security.authorization.createtable.owner.grants</name>
          <value>ALL</value>
         </property>
        </configuration>



        On Mon, Aug 11, 2014 at 4:29 PM, Yin Huai <huaiyin....@gmail.com> wrote:

            Hi Jenny,

            How's your metastore configured for both Hive and Spark SQL? Which metastore mode are you using (based on https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin)?

            Thanks,

            Yin


            On Mon, Aug 11, 2014 at 6:15 PM, Jenny Zhao <linlin200...@gmail.com> wrote:



                You can reproduce this issue with the following steps (assuming you have a YARN cluster + Hive 0.12):

                1) Using the hive shell, create a database, e.g.: create database ttt

                2) Write a simple Spark SQL program:

                import org.apache.spark.{SparkConf, SparkContext}
                import org.apache.spark.sql._
                import org.apache.spark.sql.hive.HiveContext

                object HiveSpark {
                  case class Record(key: Int, value: String)

                  def main(args: Array[String]) {
                    val sparkConf = new SparkConf().setAppName("HiveSpark")
                    val sc = new SparkContext(sparkConf)

                    // A HiveContext creates an in-process instance of the Hive metastore
                    val hiveContext = new HiveContext(sc)
                    import hiveContext._

                    hql("use ttt")
                    hql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
                    hql("LOAD DATA INPATH '/user/biadmin/kv1.txt' INTO TABLE src")

                    // Queries are expressed in HiveQL
                    println("Result of 'SELECT *': ")
                    hql("SELECT * FROM src").collect.foreach(println)
                    sc.stop()
                  }
                }
                3) run it in yarn-cluster mode.
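Concretely, steps 1 and 3 might look like the following (the jar name |hivespark.jar| is a placeholder for however the program above is packaged):

```shell
# Step 1: create the database through the Hive CLI
hive -e "CREATE DATABASE ttt"

# Step 3: submit the compiled program in yarn-cluster mode
# (hivespark.jar is a placeholder for your assembled jar)
spark-submit --class HiveSpark --master yarn-cluster hivespark.jar
```

In yarn-client mode the same submission works, which is what makes this a useful minimal reproduction.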


                On Mon, Aug 11, 2014 at 9:44 AM, Cheng Lian <lian.cs....@gmail.com> wrote:

                    Since you were using |hql(...)|, it's probably not related to the JDBC driver. But I failed to reproduce this issue locally with a single-node pseudo-distributed YARN cluster. Would you mind elaborating on the steps to reproduce this bug? Thanks



                    On Sun, Aug 10, 2014 at 9:36 PM, Cheng Lian <lian.cs....@gmail.com> wrote:

                        Hi Jenny, does this issue only happen when
                        running Spark SQL with YARN in your environment?


                        On Sat, Aug 9, 2014 at 3:56 AM, Jenny Zhao <linlin200...@gmail.com> wrote:


                            Hi,

                            I am able to run my hql query on yarn
                            cluster mode when connecting to the
                            default hive metastore defined in
                            hive-site.xml.

                            However, if I want to switch to a different database, like:

                              hql("use other-database")

                            it only works in yarn-client mode, but fails in yarn-cluster mode with the following stack:

                            14/08/08 12:09:11 INFO HiveMetaStore: 0: get_database: tt
                            14/08/08 12:09:11 INFO audit: ugi=biadmin ip=unknown-ip-addr cmd=get_database: tt
                            14/08/08 12:09:11 ERROR RetryingHMSHandler: NoSuchObjectException(message:There is no database named tt)
                                at org.apache.hadoop.hive.metastore.ObjectStore.getMDatabase(ObjectStore.java:431)
                                at org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:441)
                                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
                                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
                                at java.lang.reflect.Method.invoke(Method.java:611)
                                at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:124)
                                at $Proxy15.getDatabase(Unknown Source)
                                at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(HiveMetaStore.java:628)
                                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
                                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
                                at java.lang.reflect.Method.invoke(Method.java:611)
                                at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
                                at $Proxy17.get_database(Unknown Source)
                                at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:810)
                                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
                                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
                                at java.lang.reflect.Method.invoke(Method.java:611)
                                at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
                                at $Proxy18.getDatabase(Unknown Source)
                                at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1139)
                                at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1128)
                                at org.apache.hadoop.hive.ql.exec.DDLTask.switchDatabase(DDLTask.java:3479)
                                at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:237)
                                at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
                                at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
                                at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
                                at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
                                at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
                                at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
                                at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:208)
                                at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:182)
                                at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:272)
                                at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:269)
                                at org.apache.spark.sql.hive.HiveContext.hiveql(HiveContext.scala:86)
                                at org.apache.spark.sql.hive.HiveContext.hql(HiveContext.scala:91)
                                at org.apache.spark.examples.sql.hive.HiveSpark$.main(HiveSpark.scala:35)
                                at org.apache.spark.examples.sql.hive.HiveSpark.main(HiveSpark.scala)
                                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
                                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
                                at java.lang.reflect.Method.invoke(Method.java:611)
                                at org.apache.spark.deploy.yarn.ApplicationMaster$anon$2.run(ApplicationMaster.scala:186)

                            14/08/08 12:09:11 ERROR DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: Database does not exist: tt
                                at org.apache.hadoop.hive.ql.exec.DDLTask.switchDatabase(DDLTask.java:3480)
                                at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:237)
                                at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
                                at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
                                at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
                                at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
                                at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
                                at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
                                at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:208)
                                at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:182)
                                at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:272)
                                at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:269)
                                at org.apache.spark.sql.hive.HiveContext.hiveql(HiveContext.scala:86)
                                at org.apache.spark.sql.hive.HiveContext.hql(HiveContext.scala:91)
                                at org.apache.spark.examples.sql.hive.HiveSpark$.main(HiveSpark.scala:35)
                                at org.apache.spark.examples.sql.hive.HiveSpark.main(HiveSpark.scala)
                                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
                                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
                                at java.lang.reflect.Method.invoke(Method.java:611)
                                at org.apache.spark.deploy.yarn.ApplicationMaster$anon$2.run(ApplicationMaster.scala:186)

                            Why is that? I'm not sure if this has something to do with the Hive JDBC driver.

                            Thank you!

                            Jenny







