What are the cores and memory settings of the driver?

On Wed, 23 Nov 2022, 12:56 Pralabh Kumar, <pralabhku...@gmail.com> wrote:
> How many cores are you running the driver with?
>
> On Tue, 22 Nov 2022, 21:00 Nikhil Goyal, <nownik...@gmail.com> wrote:
>
>> Hi folks,
>> We are running a job on our on-prem cluster on K8s but writing the output
>> to S3. We noticed that all the executors finish in < 1h but the driver
>> takes another 5h to finish. Logs:
>>
>> 22/11/22 02:08:29 INFO BlockManagerInfo: Removed broadcast_3_piece0 on 10.42.145.11:39001 in memory (size: 7.3 KiB, free: 9.4 GiB)
>> 22/11/22 *02:08:29* INFO BlockManagerInfo: Removed broadcast_3_piece0 on 10.42.137.10:33425 in memory (size: 7.3 KiB, free: 9.4 GiB)
>> 22/11/22 *04:57:46* INFO FileFormatWriter: Write Job 4f0051fc-dda9-457f-a072-26311fd5e132 committed.
>> 22/11/22 04:57:46 INFO FileFormatWriter: Finished processing stats for write job 4f0051fc-dda9-457f-a072-26311fd5e132.
>> 22/11/22 04:57:47 INFO FileUtils: Creating directory if it doesn't exist: s3://rbx.usr/masked/dw_pii/creator_analytics_user_universe_first_playsession_dc_ngoyal/ds=2022-10-21
>> 22/11/22 04:57:48 INFO SessionState: Could not get hdfsEncryptionShim, it is only applicable to hdfs filesystem.
>> 22/11/22 *04:57:48* INFO SessionState: Could not get hdfsEncryptionShim, it is only applicable to hdfs filesystem.
>> 22/11/22 *07:20:20* WARN ExecutorPodsWatchSnapshotSource: Kubernetes client has been closed (this is expected if the application is shutting down.)
>> 22/11/22 07:20:22 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
>> 22/11/22 07:20:22 INFO MemoryStore: MemoryStore cleared
>> 22/11/22 07:20:22 INFO BlockManager: BlockManager stopped
>> 22/11/22 07:20:22 INFO BlockManagerMaster: BlockManagerMaster stopped
>> 22/11/22 07:20:22 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
>> 22/11/22 07:20:22 INFO SparkContext: Successfully stopped SparkContext
>> 22/11/22 07:20:22 INFO ShutdownHookManager: Shutdown hook called
>> 22/11/22 07:20:22 INFO ShutdownHookManager: Deleting directory /tmp/spark-d9aa302f-86f2-4668-9c01-07b3e71cba82
>> 22/11/22 07:20:22 INFO ShutdownHookManager: Deleting directory /var/data/spark-5295849e-a0f3-4355-9a6a-b510616aefaa/spark-43772336-8c86-4e2b-839e-97b2442b2959
>> 22/11/22 07:20:22 INFO MetricsSystemImpl: Stopping s3a-file-system metrics system...
>> 22/11/22 07:20:22 INFO MetricsSystemImpl: s3a-file-system metrics system stopped.
>> 22/11/22 07:20:22 INFO MetricsSystemImpl: s3a-file-system metrics system shutdown complete.
>>
>> Seems like the job is taking time to write to S3. Any idea how to fix this
>> issue?
>>
>> Thanks
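
The starred timestamps in the quoted driver log bracket two long silent periods: one ending at the "Write Job ... committed" message and one following the last SessionState message. As a rough aid for pinning down where the "another 5h" goes, here is a minimal Python sketch that measures the gaps between consecutive log lines. It is not part of the original thread; the function names (`parse_timestamp`, `find_gaps`) and the 10-minute threshold are assumptions made for illustration, and only four of the quoted lines are included.

```python
from datetime import datetime

# Four driver log lines copied from the quoted message; only the
# leading timestamps matter for this calculation.
LOG_LINES = [
    "22/11/22 02:08:29 INFO BlockManagerInfo: Removed broadcast_3_piece0 on 10.42.137.10:33425 in memory (size: 7.3 KiB, free: 9.4 GiB)",
    "22/11/22 04:57:46 INFO FileFormatWriter: Write Job 4f0051fc-dda9-457f-a072-26311fd5e132 committed.",
    "22/11/22 04:57:48 INFO SessionState: Could not get hdfsEncryptionShim, it is only applicable to hdfs filesystem.",
    "22/11/22 07:20:20 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client has been closed (this is expected if the application is shutting down.)",
]


def parse_timestamp(line):
    """Parse the leading 'yy/MM/dd HH:mm:ss' stamp (first 17 characters)."""
    return datetime.strptime(line[:17], "%y/%m/%d %H:%M:%S")


def find_gaps(lines, threshold_seconds=600):
    """Return (gap_seconds, line_before, line_after) for each pair of
    consecutive lines whose timestamps differ by more than the threshold.
    The 600-second default is an arbitrary, hypothetical cut-off."""
    stamped = [(parse_timestamp(line), line) for line in lines]
    gaps = []
    for (t0, before), (t1, after) in zip(stamped, stamped[1:]):
        delta = (t1 - t0).total_seconds()
        if delta > threshold_seconds:
            gaps.append((delta, before, after))
    return gaps


gaps = find_gaps(LOG_LINES)
for seconds, before, after in gaps:
    print(f"{seconds / 3600:.2f} h of silence")
    print(f"  after:  {before[:70]}")
    print(f"  before: {after[:70]}")

total_hours = sum(g[0] for g in gaps) / 3600
print(f"total: {total_hours:.2f} h")
```

On these four lines the sketch reports two gaps, roughly 2.8 h and 2.4 h, which together account for the extra time the driver spends after the executors are done. Running it over the full driver log would show whether any other quiet periods exist.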