[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-11-26 Thread Thomas Sebastian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15698420#comment-15698420
 ] 

Thomas Sebastian commented on SPARK-15396:
--

Hi [~yizishou] and [~redlighter],
I am facing the same issue. I want the metastore of my local-mode Spark to 
point to the HDFS location /user/hive/warehouse.
Could you please suggest a solution that works here?
- I am using Spark in local mode.
- Hive already points to /user/hive/warehouse in HDFS.
- Spark does not point to /user/hive/warehouse in HDFS; instead it points to a 
local file location where Spark is installed.
- I already tried the following, and it still does not work:
  SparkSession.builder().appName("eg").config("spark.sql.warehouse.dir", 
"hdfs://localhost:9000/user/hive/warehouse").enableHiveSupport().getOrCreate()
- The error thrown is:
---
16/11/27 01:06:26 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/11/27 01:06:26 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
16/11/27 01:06:31 ERROR metastore.RetryingHMSHandler: AlreadyExistsException(message:Database default already exists)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:891)

> [Spark] [SQL] [DOC] It can't connect hive metastore database
> 
>
> Key: SPARK-15396
> URL: https://issues.apache.org/jira/browse/SPARK-15396
> Project: Spark
>  Issue Type: Documentation
>  Components: SQL
>Affects Versions: 2.0.0
>Reporter: Yi Zhou
>Assignee: Xiao Li
>Priority: Critical
> Fix For: 2.0.0
>
>
> I am trying to run Spark SQL using bin/spark-sql with Spark 2.0 master 
> code (commit ba181c0c7a32b0e81bbcdbe5eed94fc97b58c83e), but it always 
> connects to a local Derby database and cannot connect to my existing 
> Hive metastore database. Could you help me find the root cause? 
> What specific configuration is needed to integrate with the Hive metastore 
> in Spark 2.0? BTW, this case works in Spark 1.6. Thanks in advance!
> Build package command:
> ./dev/make-distribution.sh --tgz -Pyarn -Phadoop-2.6 
> -Dhadoop.version=2.6.0-cdh5.5.1 -Phive -Phive-thriftserver -DskipTests
> Key configurations in spark-defaults.conf:
> {code}
> spark.sql.hive.metastore.version=1.1.0
> spark.sql.hive.metastore.jars=/usr/lib/hive/lib/*:/usr/lib/hadoop/client/*
> spark.executor.extraClassPath=/etc/hive/conf
> spark.driver.extraClassPath=/etc/hive/conf
> spark.yarn.jars=local:/usr/lib/spark/jars/*
> {code}
> There is an existing Hive metastore database named "test_sparksql". I always 
> get the error "metastore.ObjectStore: Failed to get database test_sparksql, 
> returning NoSuchObjectException" after issuing 'use test_sparksql'. Please 
> see the steps below for details.
>  
> $ /usr/lib/spark/bin/spark-sql --master yarn --deploy-mode client
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/lib/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/lib/avro/avro-tools-1.7.6-cdh5.5.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 16/05/12 22:23:28 WARN conf.HiveConf: HiveConf of name 
> hive.enable.spark.execution.engine does not exist
> 16/05/12 22:23:30 INFO metastore.HiveMetaStore: 0: Opening raw store with 
> implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
> 16/05/12 22:23:30 INFO metastore.ObjectStore: ObjectStore, initialize called
> 16/05/12 22:23:30 WARN DataNucleus.General: Plugin (Bundle) 
> "org.datanucleus.store.rdbms" is already registered. Ensure you dont have 
> multiple JAR versions of the same plugin in the classpath. The URL 
> "file:/usr/lib/hive/lib/datanucleus-rdbms-3.2.9.jar" is already registered, 
> and you are trying to register an identical plugin located at URL 
> "file:/usr/lib/spark/jars/datanucleus-rdbms-3.2.9.jar."
> 16/05/12 22:23:30 WARN DataNucleus.General: Plugin (Bundle) "org.datanucleus" 
> is already registered. Ensure you dont have multiple JAR versions of the same 
> plugin in the classpath. The URL 
> "file:/usr/lib/hive/lib/datanucleus-core-3.2.10.jar" is already registered, 
> and you are trying to register an identical plugin located at URL 
> "file:/usr/lib/spark/jars/datanucleus-core-3.2.10.jar."
> 16/05/12 22:23:30 WARN DataNucleus.General: Plugin 

[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-23 Thread Yi Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296105#comment-15296105
 ] 

Yi Zhou commented on SPARK-15396:
-

Thanks a lot for your response!

> [Spark] [SQL] [DOC] It can't connect hive metastore database
> 
>
> Key: SPARK-15396
> URL: https://issues.apache.org/jira/browse/SPARK-15396
> Project: Spark
>  Issue Type: Documentation
>  Components: SQL
>Affects Versions: 2.0.0
>Reporter: Yi Zhou
>Assignee: Xiao Li
>Priority: Critical
> Fix For: 2.0.0
>
>
> I am trying to run Spark SQL using bin/spark-sql with Spark 2.0 master 
> code (commit ba181c0c7a32b0e81bbcdbe5eed94fc97b58c83e), but it always 
> connects to a local Derby database and cannot connect to my existing 
> Hive metastore database. Could you help me find the root cause? 
> What specific configuration is needed to integrate with the Hive metastore 
> in Spark 2.0? BTW, this case works in Spark 1.6. Thanks in advance!
> Build package command:
> ./dev/make-distribution.sh --tgz -Pyarn -Phadoop-2.6 
> -Dhadoop.version=2.6.0-cdh5.5.1 -Phive -Phive-thriftserver -DskipTests
> Key configurations in spark-defaults.conf:
> {code}
> spark.sql.hive.metastore.version=1.1.0
> spark.sql.hive.metastore.jars=/usr/lib/hive/lib/*:/usr/lib/hadoop/client/*
> spark.executor.extraClassPath=/etc/hive/conf
> spark.driver.extraClassPath=/etc/hive/conf
> spark.yarn.jars=local:/usr/lib/spark/jars/*
> {code}
> There is an existing Hive metastore database named "test_sparksql". I always 
> get the error "metastore.ObjectStore: Failed to get database test_sparksql, 
> returning NoSuchObjectException" after issuing 'use test_sparksql'. Please 
> see the steps below for details.
>  
> $ /usr/lib/spark/bin/spark-sql --master yarn --deploy-mode client
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/lib/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/lib/avro/avro-tools-1.7.6-cdh5.5.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 16/05/12 22:23:28 WARN conf.HiveConf: HiveConf of name 
> hive.enable.spark.execution.engine does not exist
> 16/05/12 22:23:30 INFO metastore.HiveMetaStore: 0: Opening raw store with 
> implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
> 16/05/12 22:23:30 INFO metastore.ObjectStore: ObjectStore, initialize called
> 16/05/12 22:23:30 WARN DataNucleus.General: Plugin (Bundle) 
> "org.datanucleus.store.rdbms" is already registered. Ensure you dont have 
> multiple JAR versions of the same plugin in the classpath. The URL 
> "file:/usr/lib/hive/lib/datanucleus-rdbms-3.2.9.jar" is already registered, 
> and you are trying to register an identical plugin located at URL 
> "file:/usr/lib/spark/jars/datanucleus-rdbms-3.2.9.jar."
> 16/05/12 22:23:30 WARN DataNucleus.General: Plugin (Bundle) "org.datanucleus" 
> is already registered. Ensure you dont have multiple JAR versions of the same 
> plugin in the classpath. The URL 
> "file:/usr/lib/hive/lib/datanucleus-core-3.2.10.jar" is already registered, 
> and you are trying to register an identical plugin located at URL 
> "file:/usr/lib/spark/jars/datanucleus-core-3.2.10.jar."
> 16/05/12 22:23:30 WARN DataNucleus.General: Plugin (Bundle) 
> "org.datanucleus.api.jdo" is already registered. Ensure you dont have 
> multiple JAR versions of the same plugin in the classpath. The URL 
> "file:/usr/lib/spark/jars/datanucleus-api-jdo-3.2.6.jar" is already 
> registered, and you are trying to register an identical plugin located at URL 
> "file:/usr/lib/hive/lib/datanucleus-api-jdo-3.2.6.jar."
> 16/05/12 22:23:30 INFO DataNucleus.Persistence: Property 
> datanucleus.cache.level2 unknown - will be ignored
> 16/05/12 22:23:30 INFO DataNucleus.Persistence: Property 
> hive.metastore.integral.jdo.pushdown unknown - will be ignored
> 16/05/12 22:23:31 WARN conf.HiveConf: HiveConf of name 
> hive.enable.spark.execution.engine does not exist
> 16/05/12 22:23:31 INFO metastore.ObjectStore: Setting MetaStore object pin 
> classes with 
> hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
> 16/05/12 22:23:32 INFO DataNucleus.Datastore: The class 
> "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as 
> "embedded-only" so does not have its own datastore table.
> 16/05/12 22:23:32 INFO DataNucleus.Datastore: The class 
> 

[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-23 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296092#comment-15296092
 ] 

Xiao Li commented on SPARK-15396:
-

Will try to reproduce it and see if it is caused by the recent code changes. 
Will keep you posted. Thanks!


[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-23 Thread Yi Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296088#comment-15296088
 ] 

Yi Zhou commented on SPARK-15396:
-

BTW, I only replaced the 1.6 package with the 2.0 package.


[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-23 Thread Yi Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296086#comment-15296086
 ] 

Yi Zhou commented on SPARK-15396:
-

hive-site.xml is in the right 'conf' folder in Spark home. My concern is that 
this worked well in Spark 1.6.


[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-23 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296078#comment-15296078
 ] 

Xiao Li commented on SPARK-15396:
-

Are you sure hive-site.xml is put in the right directory that your Spark 2.0 
will read? Do you know if the other parameters in hive-site.xml are effective? 
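
One way to check the first question is to list every hive-site.xml that exists in the directories Spark might pick up. The following is a minimal Python sketch under stated assumptions, not Spark's actual lookup logic: the helper names find_hive_site and default_candidates are hypothetical, the use of the SPARK_HOME and HIVE_CONF_DIR environment variables is an assumption about the local setup, and /etc/hive/conf is taken from the extraClassPath settings in the issue description.

```python
import os
from pathlib import Path


def find_hive_site(candidate_dirs):
    """Return every hive-site.xml found in candidate_dirs, in the order given."""
    found = []
    for directory in candidate_dirs:
        path = Path(directory) / "hive-site.xml"
        if path.is_file():
            found.append(path)
    return found


def default_candidates(env=None):
    """Directories worth checking (illustrative list, not Spark's real search order)."""
    env = os.environ if env is None else env
    dirs = []
    if env.get("SPARK_HOME"):
        # Spark's own conf/ directory
        dirs.append(os.path.join(env["SPARK_HOME"], "conf"))
    if env.get("HIVE_CONF_DIR"):
        # Assumption: the Hive installation exports HIVE_CONF_DIR
        dirs.append(env["HIVE_CONF_DIR"])
    # Directory added to the driver/executor classpath in the issue description
    dirs.append("/etc/hive/conf")
    return dirs


if __name__ == "__main__":
    for path in find_hive_site(default_candidates()):
        print(path)
```

If the script prints more than one path, the copies may disagree, which would be worth ruling out before looking for a code regression.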


[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-23 Thread Yi Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296010#comment-15296010
 ] 

Yi Zhou commented on SPARK-15396:
-

Sorry, I can't access 
http://www.hiregion.com/2010/01/hive-metastore-derby-db.html. 
In my hive-site.xml, it looks like this: 
{code}
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:postgresql://test-node1:7432/hive</value>
</property>
{code}
Is there any problem with this parameter?
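
A quick way to see what a given hive-site.xml actually declares is to parse it and print the connection URL. The sketch below is illustrative only: read_hive_properties and uses_embedded_derby are hypothetical helper names, and SAMPLE is a stand-in built from the single property quoted above rather than the reporter's real file. It does not show what Spark loads at runtime, only what the file says.

```python
import xml.etree.ElementTree as ET


def read_hive_properties(xml_text):
    """Parse Hadoop-style configuration XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def uses_embedded_derby(props):
    """True when the connection URL is missing or is a Derby JDBC URL."""
    url = props.get("javax.jdo.option.ConnectionURL", "")
    return url == "" or url.startswith("jdbc:derby:")


# Stand-in for the reporter's file, containing only the quoted property.
SAMPLE = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:postgresql://test-node1:7432/hive</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    props = read_hive_properties(SAMPLE)
    print(props["javax.jdo.option.ConnectionURL"])
    print("embedded Derby:", uses_embedded_derby(props))
```

If the file declares a PostgreSQL URL but the logs still show a local Derby metastore, the file is probably not the one being read, or its settings are being overridden.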

> hive.enable.spark.execution.engine does not exist
> 16/05/12 22:23:31 INFO metastore.ObjectStore: Setting MetaStore object pin 
> classes with 
> hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
> 16/05/12 22:23:32 INFO DataNucleus.Datastore: 

[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-22 Thread Yi Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295920#comment-15295920
 ] 

Yi Zhou commented on SPARK-15396:
-

These parameters worked well in Spark 1.6.1, but I am not sure why they do not 
work with the Hive metastore in 2.0.

[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-22 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295899#comment-15295899
 ] 

Xiao Li commented on SPARK-15396:
-

Uh, I know what you are talking about. 

You need to change the value of javax.jdo.option.ConnectionURL. 

See the link: http://www.hiregion.com/2010/01/hive-metastore-derby-db.html

Let me know if you still hit the issue.
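
To make the suggestion above concrete, here is a minimal Python sketch 
(standard library only) that reads a hive-site.xml and reports whether 
javax.jdo.option.ConnectionURL points at an embedded Derby database. The 
sample XML, the MySQL host name `metastore-host`, the database name 
`hive_metastore`, and the helper names are illustrative assumptions, not 
values taken from this issue.

```python
import xml.etree.ElementTree as ET

# Hypothetical hive-site.xml content; host and database name are placeholders.
SAMPLE_HIVE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://metastore-host:3306/hive_metastore</value>
  </property>
</configuration>
"""

# Hive falls back to this embedded Derby URL when the property is not set.
DERBY_DEFAULT = "jdbc:derby:;databaseName=metastore_db;create=true"


def metastore_connection_url(hive_site_xml: str) -> str:
    """Return javax.jdo.option.ConnectionURL from hive-site.xml text,
    or the embedded Derby default when the property is absent."""
    root = ET.fromstring(hive_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "javax.jdo.option.ConnectionURL":
            return prop.findtext("value", "").strip()
    return DERBY_DEFAULT


def uses_embedded_derby(url: str) -> bool:
    """True when the JDBC URL targets a Derby database."""
    return url.startswith("jdbc:derby:")


url = metastore_connection_url(SAMPLE_HIVE_SITE)
print(url, "-> embedded Derby" if uses_embedded_derby(url) else "-> external database")
```

If the URL that comes back starts with jdbc:derby:, the session is likely 
talking to a local Derby store rather than the shared metastore.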

[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-22 Thread Yi Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295874#comment-15295874
 ] 

Yi Zhou commented on SPARK-15396:
-

In addition:
1) Spark SQL can't find the existing Hive databases when I issue 
'show databases;' in the spark-sql shell.
2) It always tells me that the 'test_sparksql' database already exists:
{code}
16/05/23 09:48:24 ERROR metastore.RetryingHMSHandler: 
AlreadyExistsException(message:Database test_sparksql already exists)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:898)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:133)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
at com.sun.proxy.$Proxy34.create_database(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createDatabase(HiveMetaStoreClient.java:645)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:91)
at com.sun.proxy.$Proxy35.createDatabase(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createDatabase(Hive.java:341)
at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply$mcV$sp(HiveClientImpl.scala:289)
at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply(HiveClientImpl.scala:289)
at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply(HiveClientImpl.scala:289)
at 
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:260)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:207)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:206)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:249)
at 
org.apache.spark.sql.hive.client.HiveClientImpl.createDatabase(HiveClientImpl.scala:288)
at 
org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply$mcV$sp(HiveExternalCatalog.scala:94)
at 
org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply(HiveExternalCatalog.scala:94)
at 
org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply(HiveExternalCatalog.scala:94)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:68)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.createDatabase(HiveExternalCatalog.scala:93)
at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.createDatabase(SessionCatalog.scala:142)
at 
org.apache.spark.sql.execution.command.CreateDatabaseCommand.run(ddl.scala:58)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:57)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:55)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:69)
at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at 
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
at 
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:85)
at 
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:85)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:187)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:168)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:63)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:529)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:649)
at 

[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-22 Thread Yi Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295836#comment-15295836
 ] 

Yi Zhou commented on SPARK-15396:
-

Hi,
Even after setting 'spark.sql.warehouse.dir' in spark-defaults.conf, it still 
cannot connect to my existing Hive metastore database. I also saw that it 
creates a 'metastore_db' folder in my current working directory. Could you 
please point out the potential cause? Thanks!
spark.sql.hive.metastore.version=1.1.0
spark.sql.hive.metastore.jars=/usr/lib/hive/lib/*:/usr/lib/hadoop/client/*
spark.executor.extraClassPath=/etc/hive/conf
spark.driver.extraClassPath=/etc/hive/conf
spark.yarn.jars=local:/usr/lib/spark/jars/*
spark.sql.warehouse.dir=/usr/hive/warehouse
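
As a rough diagnostic aid for a setup like the one above, the following 
Python sketch (standard library only) parses the key=value settings and 
checks whether any directory on spark.driver.extraClassPath actually contains 
a hive-site.xml. It rests on two assumptions: that a 'metastore_db' folder in 
the working directory indicates the embedded Derby metastore was used, which 
typically happens when no hive-site.xml is picked up, and that 
spark.sql.warehouse.dir only sets the default location for table data rather 
than the metastore connection. The helper names are hypothetical.

```python
import os

# Settings copied from the comment above.
CONF_TEXT = """\
spark.sql.hive.metastore.version=1.1.0
spark.sql.hive.metastore.jars=/usr/lib/hive/lib/*:/usr/lib/hadoop/client/*
spark.executor.extraClassPath=/etc/hive/conf
spark.driver.extraClassPath=/etc/hive/conf
spark.yarn.jars=local:/usr/lib/spark/jars/*
spark.sql.warehouse.dir=/usr/hive/warehouse
"""


def parse_conf(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        conf[key.strip()] = value.strip()
    return conf


def dirs_with_hive_site(classpath):
    """Return the classpath entries that are directories holding hive-site.xml."""
    return [
        entry
        for entry in classpath.split(os.pathsep)
        if entry and os.path.isfile(os.path.join(entry, "hive-site.xml"))
    ]


conf = parse_conf(CONF_TEXT)
found = dirs_with_hive_site(conf.get("spark.driver.extraClassPath", ""))
print("hive-site.xml found in:", found or "no driver classpath entry")
print("metastore_db in working directory:", os.path.isdir("metastore_db"))
```

An empty result for the classpath check, together with a local metastore_db 
folder, would be consistent with Spark falling back to Derby.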

[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-20 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293603#comment-15293603
 ] 

Apache Spark commented on SPARK-15396:
--

User 'gatorsmile' has created a pull request for this issue:
https://github.com/apache/spark/pull/13225

[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-19 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292706#comment-15292706
 ] 

Xiao Li commented on SPARK-15396:
-

Based on your description, it sounds like the issue is caused by the parameter 
you used. If you can fix it by setting `spark.sql.warehouse.dir`, I think 
this is a documentation issue. I will submit a PR soon. Thanks!

[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-19 Thread Yi Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292672#comment-15292672
 ] 

Yi Zhou commented on SPARK-15396:
-

It seems this is not only a doc issue; it may be a functional issue.

> [Spark] [SQL] [DOC] It can't connect hive metastore database
> 
>
> Key: SPARK-15396
> URL: https://issues.apache.org/jira/browse/SPARK-15396
> Project: Spark
> Issue Type: Documentation
> Components: SQL
> Affects Versions: 2.0.0
> Reporter: Yi Zhou
> Priority: Critical
>
> I am trying to run Spark SQL using bin/spark-sql with Spark 2.0 master 
> code (commit ba181c0c7a32b0e81bbcdbe5eed94fc97b58c83e) but ran into an issue: 
> it always connects to a local Derby database and can't connect to my existing 
> Hive metastore database. Could you help me find the root cause? 
> What specific configuration is needed to integrate with the Hive metastore in 
> Spark 2.0? BTW, this case works in Spark 1.6. Thanks in advance!
> Build package command:
> ./dev/make-distribution.sh --tgz -Pyarn -Phadoop-2.6 
> -Dhadoop.version=2.6.0-cdh5.5.1 -Phive -Phive-thriftserver -DskipTests
> Key configurations in spark-defaults.conf:
> {code}
> spark.sql.hive.metastore.version=1.1.0
> spark.sql.hive.metastore.jars=/usr/lib/hive/lib/*:/usr/lib/hadoop/client/*
> spark.executor.extraClassPath=/etc/hive/conf
> spark.driver.extraClassPath=/etc/hive/conf
> spark.yarn.jars=local:/usr/lib/spark/jars/*
> {code}
> There is an existing Hive metastore database named "test_sparksql". I always 
> get the error "metastore.ObjectStore: Failed to get database test_sparksql, 
> returning NoSuchObjectException" after issuing 'use test_sparksql'. Please 
> see the steps below for details.
>  
> $ /usr/lib/spark/bin/spark-sql --master yarn --deploy-mode client
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/lib/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/lib/avro/avro-tools-1.7.6-cdh5.5.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 16/05/12 22:23:28 WARN conf.HiveConf: HiveConf of name 
> hive.enable.spark.execution.engine does not exist
> 16/05/12 22:23:30 INFO metastore.HiveMetaStore: 0: Opening raw store with 
> implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
> 16/05/12 22:23:30 INFO metastore.ObjectStore: ObjectStore, initialize called
> 16/05/12 22:23:30 WARN DataNucleus.General: Plugin (Bundle) 
> "org.datanucleus.store.rdbms" is already registered. Ensure you dont have 
> multiple JAR versions of the same plugin in the classpath. The URL 
> "file:/usr/lib/hive/lib/datanucleus-rdbms-3.2.9.jar" is already registered, 
> and you are trying to register an identical plugin located at URL 
> "file:/usr/lib/spark/jars/datanucleus-rdbms-3.2.9.jar."
> 16/05/12 22:23:30 WARN DataNucleus.General: Plugin (Bundle) "org.datanucleus" 
> is already registered. Ensure you dont have multiple JAR versions of the same 
> plugin in the classpath. The URL 
> "file:/usr/lib/hive/lib/datanucleus-core-3.2.10.jar" is already registered, 
> and you are trying to register an identical plugin located at URL 
> "file:/usr/lib/spark/jars/datanucleus-core-3.2.10.jar."
> 16/05/12 22:23:30 WARN DataNucleus.General: Plugin (Bundle) 
> "org.datanucleus.api.jdo" is already registered. Ensure you dont have 
> multiple JAR versions of the same plugin in the classpath. The URL 
> "file:/usr/lib/spark/jars/datanucleus-api-jdo-3.2.6.jar" is already 
> registered, and you are trying to register an identical plugin located at URL 
> "file:/usr/lib/hive/lib/datanucleus-api-jdo-3.2.6.jar."
> 16/05/12 22:23:30 INFO DataNucleus.Persistence: Property 
> datanucleus.cache.level2 unknown - will be ignored
> 16/05/12 22:23:30 INFO DataNucleus.Persistence: Property 
> hive.metastore.integral.jdo.pushdown unknown - will be ignored
> 16/05/12 22:23:31 WARN conf.HiveConf: HiveConf of name 
> hive.enable.spark.execution.engine does not exist
> 16/05/12 22:23:31 INFO metastore.ObjectStore: Setting MetaStore object pin 
> classes with 
> hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
> 16/05/12 22:23:32 INFO DataNucleus.Datastore: The class 
> "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as 
> "embedded-only" so does not have its own datastore table.
> 16/05/12 22:23:32 INFO DataNucleus.Datastore: The class 
> "org.apache.hadoop.hive.metastore.model.MOrder" is tagged 

[jira] [Commented] (SPARK-15396) [Spark] [SQL] [DOC] It can't connect hive metastore database

2016-05-19 Thread Yi Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292655#comment-15292655
 ] 

Yi Zhou commented on SPARK-15396:
-

Hi [~rxin],
I saw a bug fix at https://issues.apache.org/jira/browse/SPARK-15345. Does it 
also fix the issue in this JIRA?
