[jira] [Commented] (SPARK-17172) pyspak hiveContext can not create UDF: Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
[ https://issues.apache.org/jira/browse/SPARK-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431409#comment-15431409 ]

Andrew Davidson commented on SPARK-17172:
-----------------------------------------

Hi Sean

I forgot about that older JIRA issue. I never resolved it. I am using Jupyter. I believe each notebook gets its own Spark context. I googled around and found some old issues that seem to suggest that both a Hive context and a SQL context were being created. I have not figured out how to either use a different database for the Hive context or prevent the original Spark context from being created.

> pyspak hiveContext can not create UDF: Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
> ------------------------------------------------------------
>
>                 Key: SPARK-17172
>                 URL: https://issues.apache.org/jira/browse/SPARK-17172
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.6.2
>         Environment: spark version: 1.6.2
> python version: 3.4.2 (v3.4.2:ab2c023a9432, Oct 5 2014, 20:42:22)
> [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
>            Reporter: Andrew Davidson
>         Attachments: hiveUDFBug.html, hiveUDFBug.ipynb
>
>
> from pyspark.sql import HiveContext
> sqlContext = HiveContext(sc)
> # Define udf
> from pyspark.sql.functions import udf
> def scoreToCategory(score):
>     if score >= 80: return 'A'
>     elif score >= 60: return 'B'
>     elif score >= 35: return 'C'
>     else: return 'D'
>
> udfScoreToCategory=udf(scoreToCategory, StringType())
> throws exception
> Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
> : java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
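On the question of using a different database for the Hive context, here is a minimal sketch of one possible approach, not a confirmed fix for this issue. It assumes no hive-site.xml overrides the metastore location, so that the embedded Derby metastore is created as `metastore_db` relative to the Derby system directory, which can be moved with the `derby.system.home` JVM property. The helper name `derby_submit_args` and the notebook name passed to it are hypothetical. The setting can only take effect if it is in place before the driver JVM starts, which may not be the case when the notebook kernel has already created `sc`.

```python
import os
import tempfile


def derby_submit_args(notebook_name, base_dir=None):
    """Build a PYSPARK_SUBMIT_ARGS value with a per-notebook Derby home.

    notebook_name and base_dir are illustrative parameters, not Spark options.
    """
    base_dir = base_dir or tempfile.gettempdir()
    derby_home = os.path.join(base_dir, "derby-" + notebook_name)
    # Create the directory up front so Derby has somewhere to write.
    os.makedirs(derby_home, exist_ok=True)
    java_opts = "-Dderby.system.home=" + derby_home
    # PySpark expects the value to end with "pyspark-shell".
    return '--driver-java-options "{}" pyspark-shell'.format(java_opts)


# Must run before the SparkContext (and therefore the JVM) is created.
os.environ["PYSPARK_SUBMIT_ARGS"] = derby_submit_args("hiveUDFBug")
```

With a separate Derby home per notebook, two notebooks would no longer try to boot the same `metastore_db` directory, at the cost of each notebook seeing its own empty metastore.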
[jira] [Commented] (SPARK-17172) pyspak hiveContext can not create UDF: Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
[ https://issues.apache.org/jira/browse/SPARK-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431384#comment-15431384 ]

Sean Owen commented on SPARK-17172:
-----------------------------------

That seems in order then, though there's an error about it. I think it's actually saying this because of the error, which you see farther down:

Another instance of Derby may have already booted the database

Isn't this the same then as a third JIRA you opened? https://issues.apache.org/jira/browse/SPARK-15506
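A quick way to look for evidence of the Derby message quoted above is to check for Derby lock files in the metastore directory. The sketch below assumes the default embedded metastore directory name `metastore_db` in the driver's working directory; the function name `derby_lock_files` is made up for illustration. A lock file can also be left behind by a JVM that crashed, so a hit is only a hint that another process may hold the database.

```python
import os


def derby_lock_files(work_dir=".", db_name="metastore_db"):
    """Return paths of Derby lock files found under work_dir/db_name."""
    db_dir = os.path.join(work_dir, db_name)
    # Derby writes db.lck (and on some platforms dbex.lck) while a database is booted.
    candidates = ("db.lck", "dbex.lck")
    return [os.path.join(db_dir, name)
            for name in candidates
            if os.path.exists(os.path.join(db_dir, name))]


if __name__ == "__main__":
    locks = derby_lock_files()
    if locks:
        print("metastore_db may be in use by another JVM:", locks)
    else:
        print("no Derby lock files found in the current directory")
```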
[jira] [Commented] (SPARK-17172) pyspak hiveContext can not create UDF: Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
[ https://issues.apache.org/jira/browse/SPARK-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431371#comment-15431371 ]

Andrew Davidson commented on SPARK-17172:
-----------------------------------------

Hi Sean

The data center was created using spark-ec2 from spark-1.6.1-bin-hadoop2.6

[ec2-user@ip-172-31-22-140 root]$ cat /root/spark/RELEASE
Spark 1.6.1 built for Hadoop 2.0.0-mr1-cdh4.2.0
Build flags: -Psparkr -Phadoop-1 -Phive -Phive-thriftserver -Dhadoop.version=2.0.0-mr1-cdh4.2.0 -DzincPort=3032
[ec2-user@ip-172-31-22-140 root]$
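The same check can be scripted. The sketch below parses the text of a RELEASE file like the one shown above and reports whether the `-Phive` profile is among the build flags. The function names `build_flags` and `built_with_hive` are hypothetical, and the parsing assumes the `Build flags:` line has the layout shown in this message.

```python
def build_flags(release_text):
    """Extract the flags from the 'Build flags:' line of a Spark RELEASE file."""
    for line in release_text.splitlines():
        if line.startswith("Build flags:"):
            return line[len("Build flags:"):].split()
    return []


def built_with_hive(release_text):
    """True if the -Phive profile appears as its own token among the build flags."""
    return "-Phive" in build_flags(release_text)


# RELEASE contents as pasted in the comment above.
release = """Spark 1.6.1 built for Hadoop 2.0.0-mr1-cdh4.2.0
Build flags: -Psparkr -Phadoop-1 -Phive -Phive-thriftserver -Dhadoop.version=2.0.0-mr1-cdh4.2.0 -DzincPort=3032
"""
print(built_with_hive(release))  # prints True
```

Matching whole tokens rather than substrings keeps `-Phive-thriftserver` alone from being counted as Hive support.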
[jira] [Commented] (SPARK-17172) pyspak hiveContext can not create UDF: Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
[ https://issues.apache.org/jira/browse/SPARK-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431030#comment-15431030 ]

Sean Owen commented on SPARK-17172:
-----------------------------------

It shows this error:

You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt assembly

I think that's the cause. Did you build with Hive support? (Despite the message, I think the more direct way to do it is -Phive.)
[jira] [Commented] (SPARK-17172) pyspak hiveContext can not create UDF: Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
[ https://issues.apache.org/jira/browse/SPARK-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15431018#comment-15431018 ]

Andrew Davidson commented on SPARK-17172:
-----------------------------------------

Hi Sean

It should be very easy to use the attached notebook to reproduce the Hive bug. I got the code example from a blog. The original code worked in Spark 1.5.x. I also attached an HTML version of the notebook so you can see the entire stack trace without having to start Jupyter.

thanks

Andy
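For anyone re-running the example outside the attached notebook: the snippet quoted in the issue description calls `StringType()` without showing an import for it (the notebook may import it in another cell). The sketch below is a self-contained variant with that import added. The names `score_to_category` and `make_score_udf` are renamed for illustration and are not from the report. This only tidies the reproduction and is not expected to change the metastore error described in the issue.

```python
def score_to_category(score):
    """Map a numeric score to a letter grade (same thresholds as the report)."""
    if score >= 80:
        return 'A'
    elif score >= 60:
        return 'B'
    elif score >= 35:
        return 'C'
    else:
        return 'D'


def make_score_udf():
    """Wrap score_to_category as a Spark SQL UDF; requires pyspark."""
    # Imported lazily so the plain function can be checked without Spark.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType
    return udf(score_to_category, StringType())
```

Keeping the grading logic in a plain function makes it possible to verify the thresholds separately from any Hive or SQL context problem.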
[jira] [Commented] (SPARK-17172) pyspak hiveContext can not create UDF: Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
[ https://issues.apache.org/jira/browse/SPARK-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430004#comment-15430004 ]

Andrew Davidson commented on SPARK-17172:
-----------------------------------------

Hi Sean

I do not think it is the same error. In the related bug, I could not create a UDF using sqlContext. The workaround was to change the permissions on hdfs:///tmp. The error message actually mentioned a problem with /tmp (I thought the message referred to file:///tmp). I am not sure how the permissions got messed up. Maybe someone deleted the directory by accident and Spark does not recreate it if it is missing?

So I am now able to create a UDF using sqlContext, but hiveContext does not work. Given that I fixed the hdfs:/// permission problem, I think it is probably something else. Hopefully the attached notebook makes it easy to reproduce.

thanks

Andy
[jira] [Commented] (SPARK-17172) pyspak hiveContext can not create UDF: Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
[ https://issues.apache.org/jira/browse/SPARK-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429680#comment-15429680 ]

Sean Owen commented on SPARK-17172:
-----------------------------------

Yeah, is it the same error? The actual error doesn't seem to be part of this. It sounds like a duplicate if you see the same cause.
[jira] [Commented] (SPARK-17172) pyspak hiveContext can not create UDF: Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
[ https://issues.apache.org/jira/browse/SPARK-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429465#comment-15429465 ]

Andrew Davidson commented on SPARK-17172:
-----------------------------------------

Attached a notebook that demonstrates the bug. Also attached an HTML version of the notebook.
[jira] [Commented] (SPARK-17172) pyspak hiveContext can not create UDF: Py4JJavaError: An error occurred while calling None.org.apache.spark.sql.hive.HiveContext.
[ https://issues.apache.org/jira/browse/SPARK-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429463#comment-15429463 ]

Andrew Davidson commented on SPARK-17172:
-----------------------------------------

Related bug report: https://issues.apache.org/jira/browse/SPARK-17143