[ https://issues.apache.org/jira/browse/SPARK-42752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun updated SPARK-42752:
----------------------------------
    Affects Version/s: (was: 3.1.3)
                       (was: 3.2.4)
                       (was: 3.4.1)
                       (was: 3.3.3)

> Unprintable IllegalArgumentException with Hive catalog enabled in "Hadoop Free" distribution
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-42752
>                 URL: https://issues.apache.org/jira/browse/SPARK-42752
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, SQL
>    Affects Versions: 3.5.0
>         Environment: local
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>            Priority: Minor
>             Fix For: 3.5.0
>
> Reproduction steps:
> 1. Download a standard "Hadoop Free" build.
> 2. Start the PySpark REPL with Hive support enabled:
> {code:java}
> SPARK_DIST_CLASSPATH=$(~/dist/hadoop-3.4.0-SNAPSHOT/bin/hadoop classpath) \
>   ~/dist/spark-3.2.3-bin-without-hadoop/bin/pyspark \
>   --conf spark.sql.catalogImplementation=hive
> {code}
> 3. Execute any simple DataFrame operation:
> {code:java}
> >>> spark.range(100).show()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/session.py", line 416, in range
>     jdf = self._jsparkSession.range(0, int(start), int(step), int(numPartitions))
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
>   File "/home/user/dist/spark-3.2.3-bin-without-hadoop/python/pyspark/sql/utils.py", line 117, in deco
>     raise converted from None
> pyspark.sql.utils.IllegalArgumentException: <exception str() failed>
> {code}
> 4. In fact, merely accessing spark.conf is enough to trigger the issue:
> {code:java}
> >>> spark.conf
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ...
> {code}
> There are probably two issues here:
> 1) Hive support should be gracefully disabled if the dependency is not on the classpath, as documented at https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html
> 2) At the very least, the user should be able to see the exception so they can understand the issue and take action.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
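The `<exception str() failed>` placeholder in the traceback above comes from CPython itself, not from Spark: whenever an exception's own `__str__` raises, the interpreter's traceback printer substitutes that placeholder for the message, hiding the real cause. A minimal, Spark-free sketch of the mechanism (`BrokenStrError` is a hypothetical stand-in for PySpark's captured exception, whose `__str__` calls back into the JVM and fails there):

```python
import subprocess
import sys
import textwrap

# An exception whose __str__ itself raises, mimicking PySpark's captured
# exception failing to fetch its message from the JVM over Py4J.
# BrokenStrError is a made-up class for illustration, not a PySpark class.
script = textwrap.dedent("""
    class BrokenStrError(Exception):
        def __str__(self):
            # Stand-in for the JVM-side message lookup throwing.
            raise RuntimeError("message lookup failed")

    raise BrokenStrError()
""")

# Run it in a child interpreter so CPython's own unhandled-exception
# printer renders the traceback, exactly as in the REPL session above.
result = subprocess.run([sys.executable, "-c", script],
                        capture_output=True, text=True)
print(result.stderr)
```

The final stderr line reads `BrokenStrError: <exception str() failed>`, mirroring the opaque `IllegalArgumentException` shown above; the fix for issue (2) is to make the message renderable so the underlying cause is visible.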
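Issue (1) amounts to probing for the optional dependency up front and degrading gracefully, rather than failing later with an opaque error. A rough plain-Python analogue of that probe-and-fall-back pattern (the module name `pyhive_support` is invented purely for illustration; Spark's Scala code performs the analogous check against the JVM classpath before enabling the Hive catalog):

```python
import importlib.util

def resolve_catalog(requested: str) -> str:
    """Pick a catalog implementation, falling back when Hive is unavailable.

    'pyhive_support' is a made-up module name used only to demonstrate the
    pattern; it is not a real dependency of Spark or PySpark.
    """
    if requested == "hive" and importlib.util.find_spec("pyhive_support") is None:
        # Warn loudly and degrade instead of deferring an unprintable failure.
        print("WARN: Hive support requested but its classes were not found; "
              "falling back to the in-memory catalog")
        return "in-memory"
    return requested
```

Since the probe module does not exist, `resolve_catalog("hive")` warns and returns `"in-memory"`, while other requests pass through unchanged.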