RE: Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1
Hi,

We figured out the issue: it was caused by the high value of spark.network.timeout in our configuration. After reducing this parameter, the results are in line with Spark 3.0.1. Thank you for the support.

Thank you,
Prakash

From: Mich Talebzadeh
Sent: Tuesday, August 31, 2021 1:06 AM
To: Sharma, Prakash (Nokia - IN/Bangalore)
Cc: user@spark.apache.org
Subject: Re: Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1

The problem with these tickets is that they tend to generalise the performance issue rather than state specifics. The latter ticket states, and I quote, "Spark 3.1.1 is slower than 3.0.2 by 4-5 times". This is not what we observed migrating from 3.0.1 to 3.1.1. Unless it impacts your area of interest specifically, I would not worry too much about it.

Anyway, back to your point: as I understand it, you are using Spark 3.0.2 on Kubernetes, launched with spark-submit 3.0.2, correct? Your data is on HDFS. How is Spark accessing HDFS? Running Spark on Kubernetes gives me the impression that you are accessing cloud buckets.

HTH

view my Linkedin profile<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On Mon, 30 Aug 2021 at 11:53, Sharma, Prakash (Nokia - IN/Bangalore) wrote:

Hi,

We are not moving to 3.1.1 because of the open tickets mentioned below:
https://issues.apache.org/jira/browse/SPARK-30536
https://issues.apache.org/jira/browse/SPARK-35066
Please refer to the attached mail for SPARK-35066.

Thanks.
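For context, the thread's configuration had spark.network.timeout=2400 (interpreted as seconds), against Spark's documented default of 120s. The arithmetic below is illustrative only, but it shows why failed stages took so long to be detected and re-triggered:

```python
# Illustrative sketch: spark.network.timeout bounds how long Spark waits on an
# unresponsive remote endpoint before treating it as dead. 2400s is the value
# from the configuration in this thread; 120s is Spark's documented default.
def detection_delay_minutes(timeout_seconds: float) -> float:
    """Worst-case time (in minutes) before a hung executor is declared lost."""
    return timeout_seconds / 60

print(detection_delay_minutes(2400))  # thread's configuration -> 40.0 minutes
print(detection_delay_minutes(120))   # Spark default -> 2.0 minutes
```

With the 2400s setting, a hung executor could stall a stage for up to 40 minutes before Spark declared it lost and re-ran the tasks, which matches the "taking some time to realise this failure" symptom reported below.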
From: Mich Talebzadeh
Sent: Monday, August 30, 2021 1:15:07 PM
To: Sharma, Prakash (Nokia - IN/Bangalore)
Cc: user@spark.apache.org
Subject: Re: Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1

Hi,

Any particular reason why you are not using 3.1.1 on Kubernetes?

On Mon, 30 Aug 2021 at 06:10, Sharma, Prakash (Nokia - IN/Bangalore) wrote:

Greetings,

We are running TPC-DS query tests using Spark 3.0.2 on Kubernetes with data on HDFS, and we are observing longer query execution times compared to Spark 3.0.1 in the same environment. We have observed that some stages fail, but it appears to take some time for the failure to be detected and the stages to be re-triggered. I am attaching the configuration we used for the Spark driver. We observe the same behaviour with Spark 3.0.3 as well. Please let us know if anyone has observed similar issues.
RE: Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1
Yes, we are using spark-submit 3.0.2, and we are not accessing cloud buckets. The TPC-DS data is stored on HDFS, not in any cloud storage. External tables are created from this TPC-DS data, and we run queries against these tables; they are essentially SELECT queries. Please refer to the following link for how we generated the TPC-DS data: https://github.com/hortonworks/hive-testbench/

Thanks
Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1
Greetings,

We are running TPC-DS query tests using Spark 3.0.2 on Kubernetes with data on HDFS, and we are observing longer query execution times compared to Spark 3.0.1 in the same environment. We have observed that some stages fail, but it appears to take some time for the failure to be detected and the stages to be re-triggered. I am attaching the configuration we used for the Spark driver. We observe the same behaviour with Spark 3.0.3 as well. Please let us know if anyone has observed similar issues.

Configuration used for the Spark driver:

spark.io.compression.codec=snappy
spark.sql.parquet.filterPushdown=true
spark.sql.inMemoryColumnarStorage.batchSize=15000
spark.shuffle.file.buffer=1024k
spark.ui.retainedStages=1
spark.kerberos.keytab=
spark.speculation=false
spark.submit.deployMode=cluster
spark.kubernetes.driver.label.sparkoperator.k8s.io/launched-by-spark-operator=true
spark.sql.orc.filterPushdown=true
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.sql.crossJoin.enabled=true
spark.kubernetes.kerberos.keytab=
spark.sql.adaptive.enabled=true
spark.kryo.unsafe=true
spark.kubernetes.driver.label.sparkoperator.k8s.io/submission-id=
spark.executor.cores=2
spark.ui.retainedTasks=20
spark.network.timeout=2400
spark.rdd.compress=true
spark.executor.memoryoverhead=3G
spark.master=k8s\:
spark.kubernetes.driver.label.sparkoperator.k8s.io/app-name=
spark.kubernetes.driver.limit.cores=6144m
spark.kubernetes.submission.waitAppCompletion=false
spark.kerberos.principal=
spark.kubernetes.kerberos.enabled=true
spark.kubernetes.allocation.batch.size=5
spark.kubernetes.authenticate.driver.serviceAccountName=
spark.kubernetes.executor.label.sparkoperator.k8s.io/launched-by-spark-operator=true
spark.reducer.maxSizeInFlight=1024m
spark.storage.memoryFraction=0.25
spark.kubernetes.namespace=
spark.kubernetes.executor.label.sparkoperator.k8s.io/app-name=
spark.rpc.numRetries=5
spark.shuffle.consolidateFiles=true
spark.sql.shuffle.partitions=400
spark.kubernetes.kerberos.krb5.path=/
spark.sql.codegen=true
spark.ui.strictTransportSecurity=max-age\=31557600
spark.ui.retainedJobs=1
spark.driver.port=7078
spark.shuffle.io.backLog=256
spark.ssl.ui.enabled=true
spark.kubernetes.memoryOverheadFactor=0.1
spark.driver.blockManager.port=7079
spark.kubernetes.executor.limit.cores=4096m
spark.submit.pyFiles=
spark.kubernetes.container.image=
spark.shuffle.io.numConnectionsPerPeer=10
spark.sql.broadcastTimeout=7200
spark.driver.cores=3
spark.executor.memory=9g
spark.kubernetes.executor.label.sparkoperator.k8s.io/submission-id=dfbd9c75-3771-4392-928e-10bf28d94099
spark.driver.maxResultSize=4g
spark.sql.parquet.mergeSchema=false
spark.sql.inMemoryColumnarStorage.compressed=true
spark.rpc.retry.wait=5
spark.hadoop.parquet.enable.summary-metadata=false
spark.kubernetes.allocation.batch.delay=9
spark.driver.memory=16g
spark.sql.starJoinOptimization=true
spark.kubernetes.submitInDriver=true
spark.shuffle.compress=true
spark.memory.useLegacyMode=true
spark.jars=
spark.kubernetes.resource.type=java
spark.locality.wait=0s
spark.kubernetes.driver.ui.svc.port=4040
spark.sql.orc.splits.include.file.footer=true
spark.kubernetes.kerberos.principal=
spark.sql.orc.cache.stripe.details.size=1
spark.executor.instances=22
spark.hadoop.fs.hdfs.impl.disable.cache=true
spark.sql.hive.metastorePartitionPruning=true

Thanks and Regards,
Prakash
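When chasing a regression like this across Spark versions, it can help to first isolate the timeout- and retry-related settings from a flat configuration dump such as the one above, since those were ultimately implicated here. A small hypothetical helper (timeout_related is not a Spark API, just a sketch):

```python
# Hypothetical helper: pick out timeout/retry-related settings from a flat
# whitespace-separated "key=value" Spark configuration dump, so they can be
# compared between the fast and slow runs.
def timeout_related(conf_text):
    keywords = ("timeout", "retry", "retries", "heartbeat")
    settings = {}
    for token in conf_text.split():
        if "=" not in token:
            continue
        key, _, value = token.partition("=")
        if any(k in key.lower() for k in keywords):
            settings[key] = value
    return settings

conf = "spark.network.timeout=2400 spark.rpc.numRetries=5 spark.executor.cores=2"
print(timeout_related(conf))
# -> {'spark.network.timeout': '2400', 'spark.rpc.numRetries': '5'}
```

Run against the full dump above, this surfaces spark.network.timeout=2400, spark.rpc.numRetries=5, spark.rpc.retry.wait=5, and spark.sql.broadcastTimeout=7200 as the settings worth scrutinising first.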