Dr. Nikolaos Servos created SPARK-49484:
-------------------------------------------

             Summary: Windows 11, spark SDF creation issue
                 Key: SPARK-49484
                 URL: https://issues.apache.org/jira/browse/SPARK-49484
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 4.0.0
         Environment: Python 3.12

Windows 11

Java 17 LTS (also tested with Java 21 LTS, same problem)
            Reporter: Dr. Nikolaos Servos


When I create a Spark DataFrame from dicts or Row objects, a simple count() or 
show() fails. The same code worked perfectly on Spark 3.5. A DataFrame created 
with sdf = spark.range(10) does not show the issue. Usually spark.range fails 
as well if the installation is wrong, so the installation itself appears to be 
fine. I used CMD for testing.
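
A possible workaround sketch, not a confirmed fix: the traceback in this report 
shows the executor failing to launch a program named "python3", which a Windows 
Anaconda install may not provide (the interpreter there is usually python.exe). 
PySpark reads the PYSPARK_PYTHON environment variable to choose the worker 
interpreter, so pointing it at the running interpreter before the session is 
created may avoid the failed lookup. The helper name use_current_interpreter is 
hypothetical, and whether this resolves SPARK-49484 is an assumption.

```python
import os
import sys


def use_current_interpreter(env=None):
    """Point PySpark workers at the interpreter running this script.

    Hypothetical helper, not part of PySpark. An existing PYSPARK_PYTHON
    value is left untouched so an explicit user setting still wins.
    """
    env = os.environ if env is None else env
    env.setdefault("PYSPARK_PYTHON", sys.executable)
    return env["PYSPARK_PYTHON"]


if __name__ == "__main__":
    # Assumption: this runs before the SparkSession (and its JVM) is created.
    print("workers will use:", use_current_interpreter())
    # Then, in a script with pyspark installed:
    #   from pyspark.sql import SparkSession, Row
    #   spark = SparkSession.builder.master("local[*]").getOrCreate()
    #   spark.createDataFrame([Row(a=1, b=2), Row(a=3, b=4)]).count()
```

In the pyspark shell the session already exists when the prompt appears, so 
there the variable would have to be set in CMD before launching, for example 
with set PYSPARK_PYTHON=python.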

 

C:\Users\nikol>pyspark
Python 3.12.4 | packaged by Anaconda, Inc. | (main, Jun 18 2024, 15:03:56) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
WARNING: Using incubator modules: jdk.incubator.vector
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/09/01 08:10:17 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 4.0.0-preview1
      /_/

Using Python version 3.12.4 (main, Jun 18 2024 15:03:56)
Spark context Web UI available at http://NikkTheGreek.station:4041
Spark context available as 'sc' (master = local[*], app id = local-1725171017708).
SparkSession available as 'spark'.
>>> sdf = spark.range(10)
>>> sdf.count()
10
>>> sdf.show()
+---+
| id|
+---+
|  0|
|  1|
|  2|
|  3|
|  4|
|  5|
|  6|
|  7|
|  8|
|  9|
+---+

>>> from pyspark.sql import Row
>>> l = [Row(a=1, b=2),Row(a=3, b=4)]
>>> sdf = spark.createDataFrame(l)
>>> print(sdf.count())
24/09/01 08:11:17 ERROR Executor: Exception in task 16.0 in stage 4.0 (TID 73)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 18.0 in stage 4.0 (TID 75)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 20.0 in stage 4.0 (TID 77)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 8.0 in stage 4.0 (TID 65)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 19.0 in stage 4.0 (TID 76)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 4.0 in stage 4.0 (TID 61)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 14.0 in stage 4.0 (TID 71)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 11.0 in stage 4.0 (TID 68)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 1.0 in stage 4.0 (TID 58)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 21.0 in stage 4.0 (TID 78)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 3.0 in stage 4.0 (TID 60)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 2.0 in stage 4.0 (TID 59)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 15.0 in stage 4.0 (TID 72)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 26.0 in stage 4.0 (TID 83)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 5.0 in stage 4.0 (TID 62)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 10.0 in stage 4.0 (TID 67)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 17.0 in stage 4.0 (TID 74)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 27.0 in stage 4.0 (TID 84)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 23.0 in stage 4.0 (TID 80)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 22.0 in stage 4.0 (TID 79)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 7.0 in stage 4.0 (TID 64)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 0.0 in stage 4.0 (TID 57)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 13.0 in stage 4.0 (TID 70)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 9.0 in stage 4.0 (TID 66)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 24.0 in stage 4.0 (TID 81)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 12.0 in stage 4.0 (TID 69)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 25.0 in stage 4.0 (TID 82)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 ERROR Executor: Exception in task 6.0 in stage 4.0 (TID 63)
java.io.IOException: Cannot run program "python3": CreateProcess error=2, The 
system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more
24/09/01 08:11:17 WARN TaskSetManager: Lost task 16.0 in stage 4.0 (TID 73) 
(NikkTheGreek.station executor driver): java.io.IOException: Cannot run program 
"python3": CreateProcess error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more

24/09/01 08:11:17 ERROR TaskSetManager: Task 16 in stage 4.0 failed 1 times; 
aborting job
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File 
"D:\Spark\spark-4.0.0-preview1-bin-hadoop3\python\pyspark\sql\classic\dataframe.py",
 line 441, in count
    return int(self._jdf.count())
               ^^^^^^^^^^^^^^^^^
  File 
"D:\Spark\spark-4.0.0-preview1-bin-hadoop3\python\lib\py4j-0.10.9.7-src.zip\py4j\java_gateway.py",
 line 1322, in __call__
  File 
"D:\Spark\spark-4.0.0-preview1-bin-hadoop3\python\pyspark\errors\exceptions\captured.py",
 line 239, in deco
    return f(*a, **kw)
           ^^^^^^^^^^^
  File 
"D:\Spark\spark-4.0.0-preview1-bin-hadoop3\python\lib\py4j-0.10.9.7-src.zip\py4j\protocol.py",
 line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o58.count.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 16 in 
stage 4.0 failed 1 times, most recent failure: Lost task 16.0 in stage 4.0 (TID 
73) (NikkTheGreek.station executor driver): java.io.IOException: Cannot run 
program "python3": CreateProcess error=2, The system cannot find the file 
specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more

Driver stacktrace:
        at 
org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$3(DAGScheduler.scala:2884)
        at scala.Option.getOrElse(Option.scala:201)
        at 
org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2884)
        at 
org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2876)
        at scala.collection.immutable.List.foreach(List.scala:334)
        at 
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2876)
        at 
org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1280)
        at 
org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1280)
        at scala.Option.foreach(Option.scala:437)
        at 
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1280)
        at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:3155)
        at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:3089)
        at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:3078)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:50)
Caused by: java.io.IOException: Cannot run program "python3": CreateProcess 
error=2, The system cannot find the file specified
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1170)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1089)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSimpleWorker(PythonWorkerFactory.scala:195)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:118)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:158)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:178)
        at 
org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:209)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:369)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:333)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
        at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at 
org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
        at org.apache.spark.scheduler.Task.run(Task.scala:146)
        at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:640)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at 
org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:643)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find 
the file specified
        at java.base/java.lang.ProcessImpl.create(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:500)
        at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:159)
        at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1126)
        ... 36 more

>>>
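
A possible workaround, not verified against 4.0.0-preview1: every trace above ends in 'Cannot run program "python3"', so the executor is trying to launch an executable with that name, and a Windows Anaconda install typically provides only python.exe. Spark reads the PYSPARK_PYTHON environment variable to pick the worker interpreter, so setting it to the running interpreter before the SparkSession is created may avoid the failure. The helper name pin_pyspark_python below is hypothetical and only for illustration.

```python
import os
import shutil
import sys


def pin_pyspark_python(env=None):
    """Point Spark's Python workers at the interpreter running this script.

    Sketch only: assumes PYSPARK_PYTHON is honoured when the worker is
    launched, and that this runs before the SparkSession is created.
    """
    if env is None:
        env = os.environ
    # "python3" is the program name the stack trace shows Spark trying to start.
    if shutil.which("python3") is None:
        print('No "python3" on PATH; workers would fail with CreateProcess error=2')
    # sys.executable is the full path of the current interpreter (python.exe on Windows).
    env["PYSPARK_PYTHON"] = sys.executable
    return env["PYSPARK_PYTHON"]


if __name__ == "__main__":
    print("PYSPARK_PYTHON =", pin_pyspark_python())
```

In a standalone script this would be called before SparkSession.builder.getOrCreate(). For the pyspark shell, the equivalent is to set PYSPARK_PYTHON in the CMD window (for example with "set PYSPARK_PYTHON=python") before starting pyspark.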

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
