Dear Spark team,

I am Lorenzo from the University of Genoa. I am currently running the nextflow/sarek pipeline on Ubuntu 18.04 to analyse genomic data through a Singularity container. One of the pipeline steps uses GATK4, which runs on Spark. However, after some time I get the following error:
23:27:48.112 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-4.2.6.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
23:27:48.523 INFO ApplyBQSRSpark - ------------------------------------------------------------
23:27:48.524 INFO ApplyBQSRSpark - The Genome Analysis Toolkit (GATK) v4.2.6.1
23:27:48.524 INFO ApplyBQSRSpark - For support and documentation go to https://software.broadinstitute.org/gatk/
23:27:48.525 INFO ApplyBQSRSpark - Executing as ferrandl@alucard on Linux v5.4.0-91-generic amd64
23:27:48.525 INFO ApplyBQSRSpark - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_242-8u242-b08-0ubuntu3~18.04-b08
23:27:48.526 INFO ApplyBQSRSpark - Start Date/Time: March 24, 2023 11:27:47 PM GMT
23:27:48.526 INFO ApplyBQSRSpark - ------------------------------------------------------------
23:27:48.526 INFO ApplyBQSRSpark - ------------------------------------------------------------
23:27:48.527 INFO ApplyBQSRSpark - HTSJDK Version: 2.24.1
23:27:48.527 INFO ApplyBQSRSpark - Picard Version: 2.27.1
23:27:48.527 INFO ApplyBQSRSpark - Built for Spark Version: 2.4.5
23:27:48.527 INFO ApplyBQSRSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 2
23:27:48.527 INFO ApplyBQSRSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
23:27:48.527 INFO ApplyBQSRSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
23:27:48.527 INFO ApplyBQSRSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
23:27:48.527 INFO ApplyBQSRSpark - Deflater: IntelDeflater
23:27:48.528 INFO ApplyBQSRSpark - Inflater: IntelInflater
23:27:48.528 INFO ApplyBQSRSpark - GCS max retries/reopens: 20
23:27:48.528 INFO ApplyBQSRSpark - Requester pays: disabled
23:27:48.528 WARN ApplyBQSRSpark - !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Warning: ApplyBQSRSpark is a BETA tool and is not yet ready for use in production
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
23:27:48.528 INFO ApplyBQSRSpark - Initializing engine
23:27:48.528 INFO ApplyBQSRSpark - Done initializing engine
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
23/03/24 23:27:49 INFO SparkContext: Running Spark version 2.4.5
23/03/24 23:27:49 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/03/24 23:27:50 INFO SparkContext: Submitted application: ApplyBQSRSpark
23/03/24 23:27:50 INFO SecurityManager: Changing view acls to: ferrandl
23/03/24 23:27:50 INFO SecurityManager: Changing modify acls to: ferrandl
23/03/24 23:27:50 INFO SecurityManager: Changing view acls groups to:
23/03/24 23:27:50 INFO SecurityManager: Changing modify acls groups to:
23/03/24 23:27:50 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ferrandl); groups with view permissions: Set(); users with modify permissions: Set(ferrandl); groups with modify permissions: Set()
23/03/24 23:27:50 INFO Utils: Successfully started service 'sparkDriver' on port 46757.
23/03/24 23:27:50 INFO SparkEnv: Registering MapOutputTracker
23/03/24 23:27:50 INFO SparkEnv: Registering BlockManagerMaster
23/03/24 23:27:50 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
23/03/24 23:27:50 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
23/03/24 23:27:50 INFO DiskBlockManager: Created local directory at /home/ferrandl/projects/ribas_reanalysis/sarek/work/27/89b7451fcac6fd31461885b5774752/blockmgr-e76f7d59-da0b-4e62-8a99-3cdb23f11ae6
23/03/24 23:27:50 INFO MemoryStore: MemoryStore started with capacity 2004.6 MB
23/03/24 23:27:50 INFO SparkEnv: Registering OutputCommitCoordinator
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4042. Attempting port 4043.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4043. Attempting port 4044.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4044. Attempting port 4045.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4045. Attempting port 4046.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4046. Attempting port 4047.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4047. Attempting port 4048.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4048. Attempting port 4049.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4049. Attempting port 4050.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4050. Attempting port 4051.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4051. Attempting port 4052.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4052. Attempting port 4053.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4053. Attempting port 4054.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4054. Attempting port 4055.
23/03/24 23:27:51 WARN Utils: Service 'SparkUI' could not bind on port 4055. Attempting port 4056.
23/03/24 23:27:51 ERROR SparkUI: Failed to bind SparkUI
java.net.BindException: Address already in use: Service 'SparkUI' failed after 16 retries (starting from 4040)! Consider explicitly setting the appropriate port for the service 'SparkUI' (for example spark.ui.port for SparkUI) to an available port or increasing spark.port.maxRetries.
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:433)
        at sun.nio.ch.Net.bind(Net.java:425)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:220)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:85)
        at org.spark_project.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:351)
        at org.spark_project.jetty.server.ServerConnector.open(ServerConnector.java:319)
        at org.spark_project.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
        at org.spark_project.jetty.server.ServerConnector.doStart(ServerConnector.java:235)
        at org.spark_project.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
        at org.apache.spark.ui.JettyUtils$.org$apache$spark$ui$JettyUtils$$newConnector$1(JettyUtils.scala:353)
        at org.apache.spark.ui.JettyUtils$.org$apache$spark$ui$JettyUtils$$httpConnect$1(JettyUtils.scala:384)
        at org.apache.spark.ui.JettyUtils$$anonfun$7.apply(JettyUtils.scala:387)
        at org.apache.spark.ui.JettyUtils$$anonfun$7.apply(JettyUtils.scala:387)
        at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:2269)
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
        at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:2261)
        at org.apache.spark.ui.JettyUtils$.startJettyServer(JettyUtils.scala:387)
        at org.apache.spark.ui.WebUI.bind(WebUI.scala:147)
        at org.apache.spark.SparkContext$$anonfun$11.apply(SparkContext.scala:452)
        at org.apache.spark.SparkContext$$anonfun$11.apply(SparkContext.scala:452)
        at scala.Option.foreach(Option.scala:257)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:452)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
        at org.broadinstitute.hellbender.engine.spark.SparkContextFactory.createSparkContext(SparkContextFactory.java:185)
        at org.broadinstitute.hellbender.engine.spark.SparkContextFactory.getSparkContext(SparkContextFactory.java:117)
        at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:28)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
        at org.broadinstitute.hellbender.Main.main(Main.java:289)
23/03/24 23:27:51 INFO DiskBlockManager: Shutdown hook called
23/03/24 23:27:51 INFO ShutdownHookManager: Shutdown hook called
23/03/24 23:27:51 INFO ShutdownHookManager: Deleting directory /home/ferrandl/projects/ribas_reanalysis/sarek/work/27/89b7451fcac6fd31461885b5774752/spark-9cc6fe23-6115-40ef-b1f6-2084ad19150e
23/03/24 23:27:51 INFO ShutdownHookManager: Deleting directory /home/ferrandl/projects/ribas_reanalysis/sarek/work/27/89b7451fcac6fd31461885b5774752/spark-9cc6fe23-6115-40ef-b1f6-2084ad19150e/userFiles-1324f982-0cfa-4289-bdbb-274c0a0e374e
Using GATK jar /gatk/gatk-package-4.2.6.1-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx4g -jar /gatk/gatk-package-4.2.6.1-local.jar ApplyBQSRSpark --input UANG01HM24-1T.md.cram --output UANG01HM24-1T_chr3_69176524-69176656.recal.cram --reference
        Homo_sapiens_assembly38.fasta --bqsr-recal-file UANG01HM24-1T.recal.table --intervals chr3_69176524-69176656.bed --spark-master local[2] --tmp-dir .

Do you have any suggestions on how to solve this problem? Thank you so much.

Best,
Lorenzo

--
Engineering Fellow, Translational Genomic Lab / Life Science Computational Lab, IRCCS - Policlinico San Martino
Bioengineer, Information and Communication Technology (ICT), University of Genoa
Ph.D. in Translational Oncology, DiMI, University of Genoa
Email: lorenzo.ferra...@hsanmartino.it
Alt. Email: lorenzo.ferra...@edu.unige.it
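P.S. The end of the error message suggests setting spark.ui.port explicitly or increasing spark.port.maxRetries. If I read the GATK documentation correctly, the Spark tools accept a --conf argument for passing Spark properties, so I was wondering whether something like the sketch below would be a reasonable workaround (the specific property values are only my guess, and I have not yet checked whether sarek lets me inject extra flags into this step):

    # Hypothetical standalone invocation of the same step, with the Spark UI
    # disabled so it never tries to bind a port (alternatively, raise the
    # number of ports Spark will try before giving up).
    gatk ApplyBQSRSpark \
        --input UANG01HM24-1T.md.cram \
        --output UANG01HM24-1T_chr3_69176524-69176656.recal.cram \
        --reference Homo_sapiens_assembly38.fasta \
        --bqsr-recal-file UANG01HM24-1T.recal.table \
        --intervals chr3_69176524-69176656.bed \
        --spark-master local[2] \
        --tmp-dir . \
        --conf 'spark.ui.enabled=false'
        # or instead: --conf 'spark.port.maxRetries=64'

Would this be the recommended way to handle several Spark jobs competing for ports 4040+ on the same machine, or is there a better approach?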