[ https://issues.apache.org/jira/browse/SPARK-40874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yuming Wang updated SPARK-40874:
--------------------------------
    Fix Version/s: 3.3.2
                       (was: 3.3.1)

> Fix broadcasts in Python UDFs when encryption is enabled
> --------------------------------------------------------
>
>                 Key: SPARK-40874
>                 URL: https://issues.apache.org/jira/browse/SPARK-40874
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.4.0
>            Reporter: Peter Toth
>            Assignee: Peter Toth
>            Priority: Major
>             Fix For: 3.1.4, 3.4.0, 3.2.3, 3.3.2
>
>
> The following PySpark script:
> {noformat}
> bin/pyspark --conf spark.io.encryption.enabled=true
> ...
> bar = {"a": "aa", "b": "bb"}
> foo = spark.sparkContext.broadcast(bar)
> spark.udf.register("MYUDF", lambda x: foo.value[x] if x else "")
> spark.sql("SELECT MYUDF('a') AS a, MYUDF('b') AS b").collect()
> {noformat}
> fails with:
> {noformat}
> 22/10/21 17:14:32 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> org.apache.spark.api.python.PythonException: Traceback (most recent call last):
>   File "/Users/petertoth/git/apache/spark/python/lib/pyspark.zip/pyspark/worker.py", line 811, in main
>     func, profiler, deserializer, serializer = read_command(pickleSer, infile)
>   File "/Users/petertoth/git/apache/spark/python/lib/pyspark.zip/pyspark/worker.py", line 87, in read_command
>     command = serializer._read_with_length(file)
>   File "/Users/petertoth/git/apache/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 173, in _read_with_length
>     return self.loads(obj)
>   File "/Users/petertoth/git/apache/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 471, in loads
>     return cloudpickle.loads(obj, encoding=encoding)
> EOFError: Ran out of input
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
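For context on the trace above: the final `EOFError: Ran out of input` is the generic failure `pickle`/`cloudpickle` raise when asked to deserialize a stream that ends before a complete object is read; in this bug, the encrypted broadcast path leaves the Python worker's command stream truncated. A minimal standalone illustration of that failure mode (plain `pickle`, no Spark involved):

```python
import pickle

# Deserializing an empty (i.e. fully truncated) byte stream reproduces the
# exact exception seen in the executor log: the unpickler hits end-of-input
# before it can read a complete object.
try:
    pickle.loads(b"")
except EOFError as exc:
    print(exc)  # prints: Ran out of input
```

This only demonstrates where the error message comes from, not the Spark-side cause; the actual fix addresses how encrypted broadcast data is delivered to the Python worker.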