I saw this originally with 2.0.0 but as 2.0.1 is freshly out I thought I'd retry.

With pyspark apps, it seems that in 2.x you must call .stop() on the spark context as otherwise the application doesn't stop (see log below - first gap is when it finishes, second gap is when I hit ^C).

Thought I'd mention this in case a) it's a bug (certainly changed behaviour since 1.6.1) and b) anyone else hits it.


Unique devices= 1243123123123
16/10/05 16:52:30 INFO mesos.MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 0

(hangs here - spark UI still running - show all executors there)
(after a while dynamic allocation kicks in which cleans up a little)


16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_7_piece0 on ip-10-1-202-125.ec2.internal:18594 in memory (size: 3.7 KB, free: 10.5 GB) 16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_7_piece0 on 10.1.200.212:38722 in memory (size: 3.7 KB, free: 912.3 MB) 16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on 10.1.200.212:38722 in memory (size: 26.2 KB, free: 912.3 MB) 16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on ip-10-1-202-125.ec2.internal:18594 in memory (size: 26.2 KB, free: 10.5 GB) 16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on ip-10-1-201-35.ec2.internal:31353 in memory (size: 26.2 KB, free: 10.5 GB) 16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on ip-10-1-200-127.ec2.internal:11645 in memory (size: 26.2 KB, free: 10.5 GB) 16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on ip-10-1-200-126.ec2.internal:17810 in memory (size: 26.2 KB, free: 10.5 GB) 16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on ip-10-1-201-190.ec2.internal:37033 in memory (size: 26.2 KB, free: 10.5 GB) 16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on ip-10-1-201-8.ec2.internal:14506 in memory (size: 26.2 KB, free: 10.5 GB) 16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on ip-10-1-200-128.ec2.internal:36370 in memory (size: 26.2 KB, free: 10.5 GB) 16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on ip-10-1-202-128.ec2.internal:34654 in memory (size: 26.2 KB, free: 10.5 GB) 16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on ip-10-1-202-127.ec2.internal:28749 in memory (size: 26.2 KB, free: 10.5 GB)
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3471
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3472
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3473
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3474
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3475
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3476
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3477
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3478
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3479
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3480
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3481
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3482
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3483
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3484
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3485
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3486
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3487
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3488
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3489
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3490
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3491
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3492
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned accumulator 3493
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned shuffle 1
16/10/05 16:53:21 INFO spark.ContextCleaner: Cleaned shuffle 2
16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 on 10.1.200.212:38722 in memory (size: 9.9 KB, free: 912.3 MB) 16/10/05 16:53:21 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 on ip-10-1-202-125.ec2.internal:18594 in memory (size: 9.9 KB, free: 10.5 GB)

(but app is still hanging around - not sure what the import sys is doing here, but I've left it in case it explains anything!)




▽
import sys


▽
import sys
(and here is the ^C)
^C16/10/05 16:55:21 INFO spark.SparkContext: Invoking stop() from shutdown hook 16/10/05 16:55:21 INFO server.ServerConnector: Stopped ServerConnector@70087549{HTTP/1.1}{0.0.0.0:4040} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@22516b56{/stages/stage/kill,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6b5c5804{/api,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@98268{/,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@59865b04{/static,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5a5272d3{/executors/threadDump/json,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1ade2c09{/executors/threadDump,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7ff4c458{/executors/json,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@667cb568{/executors,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@60441110{/environment/json,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@39cb8097{/environment,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@16b1e403{/storage/rdd/json,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6f8d2666{/storage/rdd,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7455a06e{/storage/json,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5e0e2b61{/storage,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@5615ea0e{/stages/pool/json,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@57598349{/stages/pool,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6a38093f{/stages/stage/json,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@2b9460e5{/stages/stage,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6c3dbe02{/stages/json,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1079b7a{/stages,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@432049a9{/jobs/job/json,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6144d55d{/jobs/job,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7c4232ba{/jobs/json,null,UNAVAILABLE} 16/10/05 16:55:21 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@291d6f18{/jobs,null,UNAVAILABLE} 16/10/05 16:55:21 INFO ui.SparkUI: Stopped Spark web UI at http://10.1.200.212:4040 16/10/05 16:55:21 INFO mesos.MesosCoarseGrainedSchedulerBackend: Shutting down all executors 16/10/05 16:55:21 INFO cluster.CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down 16/10/05 16:55:21 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 8 is now TASK_FINISHED 16/10/05 16:55:21 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 6 is now TASK_FINISHED 16/10/05 16:55:21 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 7 is now TASK_FINISHED 16/10/05 16:55:21 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 1 is now TASK_FINISHED 16/10/05 16:55:21 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 5 is now TASK_FINISHED 16/10/05 16:55:21 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 2 is now TASK_FINISHED 16/10/05 16:55:21 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 4 is now TASK_FINISHED 16/10/05 16:55:21 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 3 is now TASK_FINISHED 16/10/05 16:55:21 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 0 is now TASK_FINISHED
I1005 16:55:21.862349 16747 sched.cpp:1903] Asked to stop the driver
I1005 16:55:21.862777 16390 sched.cpp:1143] Stopping framework 'cd7fd4fe-1eff-438c-87f9-348461e6a640-1683' 16/10/05 16:55:21 INFO mesos.MesosCoarseGrainedSchedulerBackend: driver.run() returned with code DRIVER_STOPPED 16/10/05 16:55:21 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/10/05 16:55:21 INFO memory.MemoryStore: MemoryStore cleared
16/10/05 16:55:21 INFO storage.BlockManager: BlockManager stopped
16/10/05 16:55:21 INFO storage.BlockManagerMaster: BlockManagerMaster stopped 16/10/05 16:55:21 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/10/05 16:55:21 INFO spark.SparkContext: Successfully stopped SparkContext
16/10/05 16:55:21 INFO util.ShutdownHookManager: Shutdown hook called
16/10/05 16:55:21 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-3d420117-9442-4375-91ac-d54038e2c18b/pyspark-28e7d692-a344-4d0f-9495-3304ccaa520e 16/10/05 16:55:21 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-3d420117-9442-4375-91ac-d54038e2c18b

And now a run with sc.stop():
16/10/05 16:57:52 INFO scheduler.DAGScheduler: Job 2 finished: count at NativeMethodAccessorImpl.java:-2, took 12.148737 s
Unique devices= 1243123123123
16/10/05 16:57:53 INFO server.ServerConnector: Stopped ServerConnector@6395b5{HTTP/1.1}{0.0.0.0:4040} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@41d00db1{/stages/stage/kill,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@6dc9bd9a{/api,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@62c55d05{/,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@69c56749{/static,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@2271d89a{/executors/threadDump/json,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7a7ebfb8{/executors/threadDump,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@4469ec58{/executors/json,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@231d0c59{/executors,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@352daf52{/environment/json,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@72798a1f{/environment,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1def5f95{/storage/rdd/json,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@697feb2d{/storage/rdd,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@7056f8aa{/storage/json,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@44056f98{/storage,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@48e1d5a{/stages/pool/json,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@54d9ad20{/stages/pool,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3921b8da{/stages/stage/json,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@630479fa{/stages/stage,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1e69dddc{/stages/json,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@37eb5fc9{/stages,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@3feca272{/jobs/job/json,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@fae47cc{/jobs/job,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@59587d7{/jobs/json,null,UNAVAILABLE} 16/10/05 16:57:53 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@1fe2fda3{/jobs,null,UNAVAILABLE} 16/10/05 16:57:53 INFO ui.SparkUI: Stopped Spark web UI at http://10.1.200.212:4040 16/10/05 16:57:53 INFO mesos.MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 0 16/10/05 16:57:53 INFO mesos.MesosCoarseGrainedSchedulerBackend: Shutting down all executors 16/10/05 16:57:53 INFO cluster.CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down 16/10/05 16:57:53 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 6 is now TASK_FINISHED 16/10/05 16:57:53 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 1 is now TASK_FINISHED 16/10/05 16:57:53 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 7 is now TASK_FINISHED 16/10/05 16:57:53 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 8 is now TASK_FINISHED 16/10/05 16:57:53 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 2 is now TASK_FINISHED 16/10/05 16:57:53 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 3 is now TASK_FINISHED 16/10/05 16:57:53 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 4 is now TASK_FINISHED 16/10/05 16:57:53 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 5 is now TASK_FINISHED 16/10/05 16:57:53 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 0 is now TASK_FINISHED
I1005 16:57:53.946076 17178 sched.cpp:1903] Asked to stop the driver
I1005 16:57:53.946187 17215 sched.cpp:1143] Stopping framework 'cd7fd4fe-1eff-438c-87f9-348461e6a640-1685' 16/10/05 16:57:53 INFO mesos.MesosCoarseGrainedSchedulerBackend: driver.run() returned with code DRIVER_STOPPED 16/10/05 16:57:53 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/10/05 16:57:53 INFO memory.MemoryStore: MemoryStore cleared
16/10/05 16:57:53 INFO storage.BlockManager: BlockManager stopped
16/10/05 16:57:53 INFO storage.BlockManagerMaster: BlockManagerMaster stopped 16/10/05 16:57:53 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/10/05 16:57:53 INFO spark.SparkContext: Successfully stopped SparkContext
16/10/05 16:57:54 INFO util.ShutdownHookManager: Shutdown hook called
16/10/05 16:57:54 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-db6e7cc5-21d1-4895-a51f-ad797989d711/pyspark-04c90784-4350-4660-bac8-106fdfc7e6cf 16/10/05 16:57:54 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-db6e7cc5-21d1-4895-a51f-ad797989d711

Code was this (sorry, C&P job - could kill off the sys of course!)

import sys
from pyspark.sql import SparkSession

sc= SparkSession\
        .builder\
        .appName("spark2test")\
        .getOrCreate()

df = sc.read.load('/prod/data/core/filtered/parquet/2016-08-01')
print("Rowcount=", df.count())
print("Unique devices=",df.select('device_id_time').distinct().count())
#sc.stop()


---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Reply via email to