[jira] [Updated] (SPARK-42752) Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
[ https://issues.apache.org/jira/browse/SPARK-42752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-42752:
----------------------------------
    Affects Version/s:     (was: 3.1.3)
                           (was: 3.2.4)
                           (was: 3.4.1)
                           (was: 3.3.3)

> Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-42752
>                 URL: https://issues.apache.org/jira/browse/SPARK-42752
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, SQL
>    Affects Versions: 3.5.0
>         Environment: local
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>            Priority: Minor
>             Fix For: 3.5.0
>
> Reproduction steps:
> 1. Download a standard "Hadoop Free" build.
> 2. Start the PySpark REPL with Hive support:
> {code:java}
> SPARK_DIST_CLASSPATH=$(~/dist/hadoop-3.4.0-SNAPSHOT/bin/hadoop classpath) \
>   ~/dist/spark-3.2.3-bin-without-hadoop/bin/pyspark \
>   --conf spark.sql.catalogImplementation=hive
> {code}
> 3. Execute any simple DataFrame operation:
> {code:java}
> >>> spark.range(100).show()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/session.py", line 416, in range
>     jdf = self._jsparkSession.range(0, int(start), int(step), int(numPartitions))
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/utils.py", line 117, in deco
>     raise converted from None
> pyspark.sql.utils.IllegalArgumentException:
> {code}
> 4. In fact, merely accessing spark.conf is enough to trigger the issue:
> {code:java}
> >>> spark.conf
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ...
> {code}
> There are probably two issues here:
> 1) Hive support should be gracefully disabled if the dependency is not on the classpath, as claimed by https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html
> 2) At the very least, the user should be able to see the exception message so they can understand the issue and take action.
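Editor's note: the converted pyspark.sql.utils.IllegalArgumentException above carries an empty message, which is what makes it "unprintable" in the REPL. Below is a minimal user-side sketch (not from the ticket) for surfacing the hidden JVM detail; it assumes a 3.2.x-style pyspark/sql/utils.py in which CapturedException exposes desc and stackTrace attributes (attribute names may differ in other releases), and that spark is the REPL session from the reproduction above.

{code:python}
# Sketch (not from the ticket): recover the JVM-side detail that the empty
# exception message hides. Assumes CapturedException exposes `desc` and
# `stackTrace`, as pyspark/sql/utils.py does in the 3.2.x line.
from pyspark.sql.utils import CapturedException

try:
    spark.range(100).show()   # any action that forces sessionState creation
except CapturedException as e:
    print(repr(e.desc))       # the message text; empty in this bug
    print(e.stackTrace)       # JVM stack trace, typically naming the missing
                              # Hive classes that caused the failure
{code}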
[jira] [Updated] (SPARK-42752) Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
[ https://issues.apache.org/jira/browse/SPARK-42752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen updated SPARK-42752:
---------------------------------
    Priority: Minor  (was: Major)

> Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-42752
>                 URL: https://issues.apache.org/jira/browse/SPARK-42752
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, SQL
>    Affects Versions: 3.1.3, 3.2.4, 3.3.3, 3.4.1, 3.5.0
>         Environment: local
>            Reporter: Gera Shegalov
>            Priority: Minor
>
> [description identical to the quote above]
[jira] [Updated] (SPARK-42752) Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
[ https://issues.apache.org/jira/browse/SPARK-42752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean R. Owen updated SPARK-42752:
---------------------------------
    Issue Type: Improvement  (was: Bug)

> Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-42752
>                 URL: https://issues.apache.org/jira/browse/SPARK-42752
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, SQL
>    Affects Versions: 3.1.3, 3.2.4, 3.3.3, 3.4.1, 3.5.0
>         Environment: local
>            Reporter: Gera Shegalov
>            Priority: Major
>
> [description identical to the quote above]
[jira] [Updated] (SPARK-42752) Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
[ https://issues.apache.org/jira/browse/SPARK-42752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gera Shegalov updated SPARK-42752:
----------------------------------
    Description: 
Reproduction steps:
1. Download a standard "Hadoop Free" build.
2. Start the PySpark REPL with Hive support:
{code:java}
SPARK_DIST_CLASSPATH=$(~/dist/hadoop-3.4.0-SNAPSHOT/bin/hadoop classpath) \
  ~/dist/spark-3.2.3-bin-without-hadoop/bin/pyspark \
  --conf spark.sql.catalogImplementation=hive
{code}
3. Execute any simple DataFrame operation:
{code:java}
>>> spark.range(100).show()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/session.py", line 416, in range
    jdf = self._jsparkSession.range(0, int(start), int(step), int(numPartitions))
  File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
  File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/utils.py", line 117, in deco
    raise converted from None
pyspark.sql.utils.IllegalArgumentException:
{code}
4. In fact, merely accessing spark.conf is enough to trigger the issue:
{code:java}
>>> spark.conf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
...
{code}
There are probably two issues here:
1) Hive support should be gracefully disabled if the dependency is not on the classpath, as claimed by https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html
2) At the very least, the user should be able to see the exception message so they can understand the issue and take action.

  was:
Reproduction steps:
1. Download a standard "Hadoop Free" build.
2. Start the PySpark REPL with Hive support:
{code:java}
SPARK_DIST_CLASSPATH=$(~/dist/hadoop-3.4.0-SNAPSHOT/bin/hadoop classpath) \
  ~/dist/spark-3.2.3-bin-without-hadoop/bin/pyspark \
  --conf spark.sql.catalogImplementation=hive
{code}
3. Execute any simple DataFrame operation:
{code:java}
>>> spark.range(100).show()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/session.py", line 416, in range
    jdf = self._jsparkSession.range(0, int(start), int(step), int(numPartitions))
  File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
  File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/utils.py", line 117, in deco
    raise converted from None
pyspark.sql.utils.IllegalArgumentException:
>>> spark.conf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/session.py", line 347, in conf
    self._conf = RuntimeConfig(self._jsparkSession.conf())
  File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
  File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/utils.py", line 117, in deco
    raise converted from None
pyspark.sql.utils.IllegalArgumentException:
{code}
4. In fact, merely accessing spark.conf is enough to trigger the issue:
{code:java}
>>> spark.conf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
...
{code}
There are probably two issues here:
1) Hive support should be gracefully disabled if the dependency is not on the classpath, as claimed by https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html
2) At the very least, the user should be able to see the exception message so they can understand the issue and take action.

> Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-42752
>                 URL: https://issues.apache.org/jira/browse/SPARK-42752
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 3.1.3, 3.2.4, 3.3.3, 3.4.1, 3.5.0
>         Environment: local
>            Reporter: Gera Shegalov
>            Priority: Major
>
> [description identical to the quote above]
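Editor's note: on issue 1), until Spark itself disables the Hive catalog gracefully, a user-side fallback is possible. The sketch below is hypothetical (the helper name get_session is mine, not from the ticket): it requests the Hive catalog, forces session-state initialization to see whether it works, and otherwise retries with Spark's default in-memory catalog. Recreating a stopped SparkContext in the same JVM generally works but can be environment-dependent.

{code:python}
# Hypothetical sketch: fall back to the in-memory catalog when the Hive
# classes are absent, instead of failing with an empty-message exception.
from pyspark.sql import SparkSession

def get_session(prefer_hive=True):
    impl = "hive" if prefer_hive else "in-memory"  # in-memory is the default
    spark = (SparkSession.builder
             .config("spark.sql.catalogImplementation", impl)
             .getOrCreate())
    try:
        # Reading a conf value forces sessionState creation -- the same cheap
        # trigger that step 4 of the reproduction uses via spark.conf.
        spark.conf.get("spark.sql.catalogImplementation")
        return spark
    except Exception:
        if not prefer_hive:
            raise                 # already on the fallback catalog; give up
        spark.stop()              # discard the broken session before retrying
        return get_session(prefer_hive=False)

spark = get_session()
spark.range(100).show()          # works even without Hive jars on the classpath
{code}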