Dr. Nikolaos Servos created SPARK-49484:
-------------------------------------------
             Summary: Windows 11, spark SDF creation issue
                 Key: SPARK-49484
                 URL: https://issues.apache.org/jira/browse/SPARK-49484
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 4.0.0
         Environment: Python 3.12
                      Windows 11
                      Java 17 LTS (also tested with Java 21 LTS, same problem)
            Reporter: Dr. Nikolaos Servos

When I create a Spark DataFrame from dicts or Row objects, a simple count or show fails. This worked perfectly on Spark 3.5.

Using sdf = spark.range(10) does not cause any issues. Usually spark.range also fails if your installation is wrong.

I used CMD for testing:

C:\Users\nikol>pyspark
Python 3.12.4 | packaged by Anaconda, Inc. | (main, Jun 18 2024, 15:03:56) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
WARNING: Using incubator modules: jdk.incubator.vector
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/09/01 08:10:17 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 4.0.0-preview1
      /_/

Using Python version 3.12.4 (main, Jun 18 2024 15:03:56)
Spark context Web UI available at http://NikkTheGreek.station:4041
Spark context available as 'sc' (master = local[*], app id = local-1725171017708).
SparkSession available as 'spark'.
>>> sdf = spark.range(10)
>>> sdf.count()
10
>>> sdf.show()
+---+
| id|
+---+
|  0|
|  1|
|  2|
|  3|
|  4|
|  5|
|  6|
|  7|
|  8|
|  9|
+---+

>>> from pyspark.sql import Row
>>> l = [Row(a=1, b=2),Row(a=3, b=4)]
>>> sdf = spark.createDataFrame(l)
>>> print(sdf.count())
24/09/01 08:11:17 ERROR Executor: Exception in task 16.0 in stage 4.0 (TID 73)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 18.0 in stage 4.0 (TID 75) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 20.0 in stage 4.0 (TID 77) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 8.0 in stage 4.0 (TID 65) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 19.0 in stage 4.0 (TID 76) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 4.0 in stage 4.0 (TID 61) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 14.0 in stage 4.0 (TID 71) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 11.0 in stage 4.0 (TID 68) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 1.0 in stage 4.0 (TID 58) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 21.0 in stage 4.0 (TID 78) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 3.0 in stage 4.0 (TID 60) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 2.0 in stage 4.0 (TID 59) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 15.0 in stage 4.0 (TID 72) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 26.0 in stage 4.0 (TID 83) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 5.0 in stage 4.0 (TID 62) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 10.0 in stage 4.0 (TID 67) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 17.0 in stage 4.0 (TID 74) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 27.0 in stage 4.0 (TID 84) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 23.0 in stage 4.0 (TID 80) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 22.0 in stage 4.0 (TID 79) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 7.0 in stage 4.0 (TID 64) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 0.0 in stage 4.0 (TID 57) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 13.0 in stage 4.0 (TID 70) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 9.0 in stage 4.0 (TID 66) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 24.0 in stage 4.0 (TID 81) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 12.0 in stage 4.0 (TID 69) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 25.0 in stage 4.0 (TID 82) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 ERROR Executor: Exception in task 6.0 in stage 4.0 (TID 63) java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more 24/09/01 08:11:17 WARN TaskSetManager: Lost task 16.0 in stage 4.0 (TID 73) (NikkTheGreek.station executor driver): java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 
36 more
24/09/01 08:11:17 ERROR TaskSetManager: Task 16 in stage 4.0 failed 1 times; aborting job
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Spark\spark-4.0.0-preview1-bin-hadoop3\python\pyspark\sql\classic\dataframe.py", line 441, in count
    return int(self._jdf.count())
               ^^^^^^^^^^^^^^^^^
  File "D:\Spark\spark-4.0.0-preview1-bin-hadoop3\python\lib\py4j-0.10.9.7-src.zip\py4j\java_gateway.py", line 1322, in __call__
  File "D:\Spark\spark-4.0.0-preview1-bin-hadoop3\python\pyspark\errors\exceptions\captured.py", line 239, in deco
    return f(*a, **kw)
           ^^^^^^^^^^^
  File "D:\Spark\spark-4.0.0-preview1-bin-hadoop3\python\lib\py4j-0.10.9.7-src.zip\py4j\protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o58.count.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 16 in stage 4.0 failed 1 times, most recent failure: Lost task 16.0 in stage 4.0 (TID 73) (NikkTheGreek.station executor driver): java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
    at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
    at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
    at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
    at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
    at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
    at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
    at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
    at
org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at 
java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 36 more Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$3(DAGScheduler.scala:2884) at scala.Option.getOrElse(Option.scala:201) at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2884) at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2876) at scala.collection.immutable.List.foreach(List.scala:334) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2876) at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1280) at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1280) at scala.Option.foreach(Option.scala:437) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1280) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:3155) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:3089) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:3078) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:50) Caused by: java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089) at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178) at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209) at 
org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369) at org.apache.spark.rdd.RDD.iterator(RDD.scala:333) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54) at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171) at org.apache.spark.scheduler.Task.run(Task.scala:146) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64) at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) at java.base/java.lang.Thread.run(Thread.java:1583) Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified at java.base/java.lang.ProcessImpl.create(Native Method) at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500) at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126) ... 36 more >>> -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
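A possible diagnostic and workaround, offered as a sketch and not as a confirmed fix for this ticket: the trace shows PythonWorkerFactory trying to start an executable literally named "python3", and CreateProcess error=2 means Windows could not find a file by that name. That would also fit spark.range(10) working, since it never reaches PythonRDD.compute and so never needs a Python worker. The helper below checks whether that name resolves on PATH and, if not, points the PYSPARK_PYTHON environment variable at the interpreter that is currently running. The function name pin_pyspark_python and its worker_name parameter are made up for illustration; PYSPARK_PYTHON and shutil.which are real. Whether this resolves the behaviour on 4.0.0-preview1 is an assumption and has not been verified here.

```python
import os
import shutil
import sys


def pin_pyspark_python(env, worker_name="python3"):
    """Set PYSPARK_PYTHON to the running interpreter when needed.

    `env` is a mutable mapping such as os.environ. `worker_name` is the
    executable name seen in the stack trace ("python3"). Returns True if
    `env` was modified, False otherwise. Hypothetical helper, not part
    of PySpark.
    """
    # Respect an explicit setting made by the user.
    if env.get("PYSPARK_PYTHON"):
        return False
    # shutil.which returns None when the name cannot be found on PATH,
    # which is the situation "CreateProcess error=2" points at.
    if shutil.which(worker_name) is None:
        env["PYSPARK_PYTHON"] = sys.executable
        return True
    return False


if __name__ == "__main__":
    changed = pin_pyspark_python(os.environ)
    print("python3 on PATH:", shutil.which("python3"))
    print("PYSPARK_PYTHON set by helper:", changed)
    # Assumption: the variable has to be set before the session is
    # created, e.g.
    #   from pyspark.sql import SparkSession
    #   spark = SparkSession.builder.getOrCreate()
```

In a standalone script the call would go before the SparkSession is built. For the interactive pyspark shell used above, the rough equivalent would be to run `set PYSPARK_PYTHON=python` in CMD before starting pyspark.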