[ https://issues.apache.org/jira/browse/SPARK-18687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li resolved SPARK-18687.
-----------------------------
       Resolution: Fixed
         Assignee: Vinayak Joshi
    Fix Version/s: 2.2.0, 2.1.1, 2.0.3

> Backward compatibility - creating a Dataframe on a new SQLContext object fails with a Derby error
> -------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-18687
>                 URL: https://issues.apache.org/jira/browse/SPARK-18687
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 2.0.0, 2.0.1, 2.0.2
>         Environment: Spark built with hive support
>            Reporter: Vinayak Joshi
>            Assignee: Vinayak Joshi
>             Fix For: 2.0.3, 2.1.1, 2.2.0
>
>
> With a local spark instance built with hive support, (-Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver)
> The following script/sequence works in Pyspark without any error in 1.6.x, but fails in 2.x.
> {code}
> people = sc.parallelize(["Michael,30", "Andy,12", "Justin,19"])
> peoplePartsRDD = people.map(lambda p: p.split(","))
> peopleRDD = peoplePartsRDD.map(lambda p: pyspark.sql.Row(name=p[0], age=int(p[1])))
> peopleDF= sqlContext.createDataFrame(peopleRDD)
> peopleDF.first()
> sqlContext2 = SQLContext(sc)
> people2 = sc.parallelize(["Abcd,40", "Efgh,14", "Ijkl,16"])
> peoplePartsRDD2 = people2.map(lambda l: l.split(","))
> peopleRDD2 = peoplePartsRDD2.map(lambda p: pyspark.sql.Row(fname=p[0], age=int(p[1])))
> peopleDF2 = sqlContext2.createDataFrame(peopleRDD2) # <==== error here
> {code}
> The error produced is:
> {noformat}
> 16/12/01 22:35:36 ERROR Schema: Failed initialising database.
> Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
> java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@4494053, see the next exception for details.
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>       at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
> .
> .
> ------
> org.datanucleus.exceptions.NucleusDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
> java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@519dabfd, see the next exception for details.
>       at org.apache.derby.impl.jdb
> .
> .
> .
> NestedThrowables:
> java.sql.SQLException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
> java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@519dabfd, see the next exception for details.
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
> .
> .
> .
> Caused by: java.sql.SQLException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
> java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@519dabfd, see the next exception for details.
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>       at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
>       at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
>       at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
> .
> .
> .
> 16/12/01 22:48:09 ERROR Schema: Failed initialising database.
> Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
> java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@519dabfd, see the next exception for details.
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>       at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
>       at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
>       at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
> .
> .
> .
> Caused by: java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@519dabfd, see the next exception for details.
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>       at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
>       at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
>       at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
> .
> .
> .
> Caused by: ERROR XJ040: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@519dabfd, see the next exception for details.
>       at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source)
>       ... 111 more
> Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /Users/vinayak/devel/spark-stc/git_repo/spark-master-x/spark/metastore_db.
>       at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
>       at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
>       at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.privGetJBMSLockOnDB(Unknown Source)
>       at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.run(Unknown Source)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.getJBMSLockOnDB(Unknown Source)
>       at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.boot(Unknown Source)
>       at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
> {noformat}
> The error goes away if sqlContext2 is replaced with sqlContext in the last (error) line. Since the SQLContext class is preserved for backward compatibility, the changes in 2.x break scripts/notebooks that follow the above pattern of calls and used to run fine with 1.6.x.
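
The report notes that the error disappears when the existing sqlContext is reused instead of a second SQLContext. On affected versions, one possible workaround is to hand out a single cached context per SparkContext. The sketch below is illustrative only: `get_shared_sql_context` and `_shared_contexts` are hypothetical names, not part of the Spark API, and the sketch assumes the `sc` object is hashable (true for ordinary Python objects).

```python
# Hypothetical helper (not part of Spark): cache one SQL context per
# SparkContext-like object so scripts never construct a second one.
_shared_contexts = {}


def get_shared_sql_context(sc, factory):
    """Return the cached context for `sc`, building it once via `factory`.

    `factory` is any callable that takes `sc` and returns a context,
    for example pyspark.sql.SQLContext in a PySpark session.
    """
    if sc not in _shared_contexts:
        # First request for this sc: build the context and remember it.
        _shared_contexts[sc] = factory(sc)
    return _shared_contexts[sc]
```

In a PySpark session this might be called as `sqlContext2 = get_shared_sql_context(sc, SQLContext)`. Note that the cache only knows about contexts it created itself, so the shell's pre-built sqlContext would have to be registered or reused directly. PySpark also ships `SQLContext.getOrCreate(sc)`, which may serve a similar purpose; whether it avoids this particular Derby error on the affected versions has not been verified here.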