Hello Vinayak,

As I understand it, Spark creates an embedded Derby metastore database in the
metastore_db subdirectory of the current working directory the first time you
use an SQL context. This embedded database cannot be shared by multiple
instances. The location should be controlled by the
javax.jdo.option.ConnectionURL property.
I can imagine that switching to another kind of metastore database, such as an
in-memory or client-server one, would work around this specific problem.
However, I do not think it is advisable.
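For reference, pointing the metastore at a client-server Derby instance would mean something along these lines in a hive-site.xml on Spark's classpath. This is only a sketch; the host, port, and database name are placeholders, and it assumes a Derby network server is already running and its client driver is on the classpath:

```xml
<!-- Hypothetical hive-site.xml fragment; localhost:1527 is a placeholder
     for a separately started Derby network server. -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.ClientDriver</value>
  </property>
</configuration>
```

Unlike the default embedded URL (jdbc:derby:;databaseName=metastore_db), a network server can accept connections from more than one client, which is why it would sidestep the single-instance lock.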
Is there a specific reason you are creating a second SQL context? I believe it
is meant to be created only once per application and then passed around.
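If the second context is only there for convenience, a sketch of the reuse pattern (assuming an existing SparkContext `sc`; SQLContext.getOrCreate is available in both 1.6 and 2.x, though 2.x prefers SparkSession):

```python
from pyspark.sql import SQLContext

# Returns the existing SQLContext for this SparkContext if one exists,
# instead of constructing a fresh one (and a second metastore connection).
sqlContext = SQLContext.getOrCreate(sc)

# ... pass `sqlContext` to any code that needs Spark SQL ...
sqlContext2 = SQLContext.getOrCreate(sc)  # same underlying context, no new Derby lock
```

That way only one Hive client is ever instantiated per application.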
I also have no idea why the behavior changed between Spark 1.6 and Spark
2.0.

Michal Šenkýř

On Thu, Dec 1, 2016, 18:33 Vinayak Joshi5 <vijos...@in.ibm.com> wrote:

> This is the error received:
>
>
> 16/12/01 22:35:36 ERROR Schema: Failed initialising database.
> Unable to open a test connection to the given database. JDBC url =
> jdbc:derby:;databaseName=metastore_db;create=true, username = APP.
> Terminating connection pool (set lazyInit to true if you expect to start
> your database after your app). Original Exception: ------
> java.sql.SQLException: Failed to start database 'metastore_db' with class
> loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@4494053,
> see the next exception for details.
>         at
> org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown
> Source)
>         at
> org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown
> Source)
>         at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
> .
> .
> ------
>
> org.datanucleus.exceptions.NucleusDataStoreException: Unable to open a
> test connection to the given database. JDBC url =
> jdbc:derby:;databaseName=metastore_db;create=true, username = APP.
> Terminating connection pool (set lazyInit to true if you expect to start
> your database after your app). Original Exception: ------
> java.sql.SQLException: Failed to start database 'metastore_db' with class
> loader
> org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@519dabfd,
> see the next exception for details.
>         at org.apache.derby.impl.jdb
> .
> .
> .
> NestedThrowables:
> java.sql.SQLException: Unable to open a test connection to the given
> database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true,
> username = APP. Terminating connection pool (set lazyInit to true if you
> expect to start your database after your app). Original Exception: ------
> java.sql.SQLException: Failed to start database 'metastore_db' with class
> loader
> org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@519dabfd,
> see the next exception for details.
>         at
> org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown
> Source)
> .
> .
> .
> Caused by: java.sql.SQLException: Unable to open a test connection to the
> given database. JDBC url =
> jdbc:derby:;databaseName=metastore_db;create=true, username = APP.
> Terminating connection pool (set lazyInit to true if you expect to start
> your database after your app). Original Exception: ------
> java.sql.SQLException: Failed to start database 'metastore_db' with class
> loader
> org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@519dabfd,
> see the next exception for details.
>         at
> org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown
> Source)
>         at
> org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown
> Source)
>         at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
>         at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown
> Source)
>         at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown
> Source)
> .
> .
> .
> 16/12/01 22:48:09 ERROR Schema: Failed initialising database.
> Unable to open a test connection to the given database. JDBC url =
> jdbc:derby:;databaseName=metastore_db;create=true, username = APP.
> Terminating connection pool (set lazyInit to true if you expect to start
> your database after your app). Original Exception: ------
> java.sql.SQLException: Failed to start database 'metastore_db' with class
> loader
> org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@519dabfd,
> see the next exception for details.
>         at
> org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown
> Source)
>         at
> org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown
> Source)
>         at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
>         at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown
> Source)
>         at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown
> Source)
> .
> .
> .
> Caused by: java.sql.SQLException: Failed to start database 'metastore_db'
> with class loader
> org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@519dabfd,
> see the next exception for details.
>         at
> org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown
> Source)
>         at
> org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown
> Source)
>         at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
>         at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown
> Source)
>         at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown
> Source)
> .
> .
> .
>
> Caused by: ERROR XJ040: Failed to start database 'metastore_db' with class
> loader
> org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@519dabfd,
> see the next exception for details.
>         at
> org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
>         at
> org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown
> Source)
>         ... 111 more
> Caused by: ERROR XSDB6: Another instance of Derby may have already booted
> the database
> /Users/vinayak/devel/spark-stc/git_repo/spark-master-x/spark/metastore_db.
>         at
> org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
>         at
> org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
>         at
> org.apache.derby.impl.store.raw.data.BaseDataFileFactory.privGetJBMSLockOnDB(Unknown
> Source)
>         at
> org.apache.derby.impl.store.raw.data.BaseDataFileFactory.run(Unknown Source)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at
> org.apache.derby.impl.store.raw.data.BaseDataFileFactory.getJBMSLockOnDB(Unknown
> Source)
>         at
> org.apache.derby.impl.store.raw.data.BaseDataFileFactory.boot(Unknown
> Source)
>         at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown
> Source)
>
>
> Regards,
> Vinayak Joshi
>
>
>
> From:        Vinayak Joshi5/India/IBM@IBMIN
> To:        "user.spark" <user@spark.apache.org>
> Date:        01/12/2016 10:53 PM
> Subject:        Spark 2.x Pyspark Spark SQL createDataframe Error
> ------------------------------
>
>
>
> With a local Spark instance built with Hive support (-Pyarn -Phadoop-2.6
> -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver),
>
> The following script/sequence works in Pyspark without any error against
> 1.6.x, but fails with 2.x.
>
> people = sc.parallelize(["Michael,30", "Andy,12", "Justin,19"])
> peoplePartsRDD = people.map(lambda p: p.split(","))
> peopleRDD = peoplePartsRDD.map(lambda p: pyspark.sql.Row(name=p[0], age=int(p[1])))
> peopleDF = sqlContext.createDataFrame(peopleRDD)
> peopleDF.first()
>
> sqlContext2 = SQLContext(sc)
> people2 = sc.parallelize(["Abcd,40", "Efgh,14", "Ijkl,16"])
> peoplePartsRDD2 = people2.map(lambda l: l.split(","))
> peopleRDD2 = peoplePartsRDD2.map(lambda p: pyspark.sql.Row(fname=p[0], age=int(p[1])))
> peopleDF2 = sqlContext2.createDataFrame(peopleRDD2) # <==== error here
>
>
> The error goes away if sqlContext2 is replaced with sqlContext on the line
> marked above. Is this a regression, or has something changed that makes
> this the expected behavior in Spark 2.x?
>
> Regards,
> Vinayak
>
