RE: Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1
Hi,

We figured out the issue: it was caused by the high value of spark.network.timeout in our configuration. After reducing this parameter, the results are in line with Spark 3.0.1. Thank you for the support.

Thank you,
Prakash

From: Mich Talebzadeh
Sent: Tuesday, August 31, 2021 1:06 AM
To: Sharma, Prakash (Nokia - IN/Bangalore)
Cc: user@spark.apache.org
Subject: Re: Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1

The problem with these tickets is that they tend to generalise the performance issue rather than state specifics. The latter ticket states, and I quote, "Spark 3.1.1 is slower than 3.0.2 by 4-5 times". This is not what we observed migrating from 3.0.1 to 3.1.1. Unless it impacts your area of interest specifically, I would not worry too much about it.

Anyway, back to your point: as I understand it, you are using Spark 3.0.2 on Kubernetes, launched with spark-submit 3.0.2, correct? Your data is on HDFS. How is Spark accessing HDFS? Running Spark on Kubernetes gives me the impression that you are accessing cloud buckets.

HTH

view my Linkedin profile<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On Mon, 30 Aug 2021 at 11:53, Sharma, Prakash (Nokia - IN/Bangalore) wrote:

Hi,

We are not moving to 3.1.1 because of the open tickets mentioned below:
https://issues.apache.org/jira/browse/SPARK-30536
https://issues.apache.org/jira/browse/SPARK-35066
Please refer to the attached mail for SPARK-35066.

Thanks.
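For context, the thread's configuration had spark.network.timeout=2400 (interpreted as seconds), against Spark's documented default of 120s. The arithmetic below is illustrative only, but it shows why failed stages took so long to be detected and re-triggered:

```python
# Illustrative sketch: spark.network.timeout bounds how long Spark waits on an
# unresponsive remote endpoint before treating it as dead. 2400s is the value
# from the configuration in this thread; 120s is Spark's documented default.
def detection_delay_minutes(timeout_seconds: float) -> float:
    """Worst-case time (in minutes) before a hung executor is declared lost."""
    return timeout_seconds / 60

print(detection_delay_minutes(2400))  # thread's configuration -> 40.0 minutes
print(detection_delay_minutes(120))   # Spark default -> 2.0 minutes
```

With the 2400s setting, a hung executor could stall a stage for up to 40 minutes before Spark declared it lost and re-ran the tasks, which matches the "taking some time to realise this failure" symptom reported below.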
From: Mich Talebzadeh
Sent: Monday, August 30, 2021 1:15:07 PM
To: Sharma, Prakash (Nokia - IN/Bangalore)
Cc: user@spark.apache.org
Subject: Re: Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1

Hi,

Any particular reason why you are not using 3.1.1 on Kubernetes?

On Mon, 30 Aug 2021 at 06:10, Sharma, Prakash (Nokia - IN/Bangalore) wrote:

Greetings,

We are running TPC-DS query tests using Spark 3.0.2 on Kubernetes with data on HDFS, and we are observing longer query execution times compared to Spark 3.0.1 in the same environment. We have observed that some stages fail, but it appears to take some time for the failure to be detected and the stages to be re-triggered. I am attaching the configuration we used for the Spark driver. We observe the same behaviour with Spark 3.0.3 as well. Please let us know if anyone has observed similar issues.
RE: Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1
Yes, we are using spark-submit 3.0.2, and we are not accessing cloud buckets. The TPC-DS data is stored on HDFS, not in any cloud storage. External tables are created from this TPC-DS data, and we run queries against these tables; they are essentially SELECT queries. Please refer to the following link for how we generated the TPC-DS data: https://github.com/hortonworks/hive-testbench/

Thanks
Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1
Greetings,

We are running TPC-DS query tests using Spark 3.0.2 on Kubernetes with data on HDFS, and we are observing longer query execution times compared to Spark 3.0.1 in the same environment. We have observed that some stages fail, but it appears to take some time for the failure to be detected and the stages to be re-triggered. I am attaching the configuration we used for the Spark driver. We observe the same behaviour with Spark 3.0.3 as well. Please let us know if anyone has observed similar issues.

Configuration used for the Spark driver:

spark.io.compression.codec=snappy
spark.sql.parquet.filterPushdown=true
spark.sql.inMemoryColumnarStorage.batchSize=15000
spark.shuffle.file.buffer=1024k
spark.ui.retainedStages=1
spark.kerberos.keytab=
spark.speculation=false
spark.submit.deployMode=cluster
spark.kubernetes.driver.label.sparkoperator.k8s.io/launched-by-spark-operator=true
spark.sql.orc.filterPushdown=true
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.sql.crossJoin.enabled=true
spark.kubernetes.kerberos.keytab=
spark.sql.adaptive.enabled=true
spark.kryo.unsafe=true
spark.kubernetes.driver.label.sparkoperator.k8s.io/submission-id=
spark.executor.cores=2
spark.ui.retainedTasks=20
spark.network.timeout=2400
spark.rdd.compress=true
spark.executor.memoryoverhead=3G
spark.master=k8s\:
spark.kubernetes.driver.label.sparkoperator.k8s.io/app-name=
spark.kubernetes.driver.limit.cores=6144m
spark.kubernetes.submission.waitAppCompletion=false
spark.kerberos.principal=
spark.kubernetes.kerberos.enabled=true
spark.kubernetes.allocation.batch.size=5
spark.kubernetes.authenticate.driver.serviceAccountName=
spark.kubernetes.executor.label.sparkoperator.k8s.io/launched-by-spark-operator=true
spark.reducer.maxSizeInFlight=1024m
spark.storage.memoryFraction=0.25
spark.kubernetes.namespace=
spark.kubernetes.executor.label.sparkoperator.k8s.io/app-name=
spark.rpc.numRetries=5
spark.shuffle.consolidateFiles=true
spark.sql.shuffle.partitions=400
spark.kubernetes.kerberos.krb5.path=/
spark.sql.codegen=true
spark.ui.strictTransportSecurity=max-age\=31557600
spark.ui.retainedJobs=1
spark.driver.port=7078
spark.shuffle.io.backLog=256
spark.ssl.ui.enabled=true
spark.kubernetes.memoryOverheadFactor=0.1
spark.driver.blockManager.port=7079
spark.kubernetes.executor.limit.cores=4096m
spark.submit.pyFiles=
spark.kubernetes.container.image=
spark.shuffle.io.numConnectionsPerPeer=10
spark.sql.broadcastTimeout=7200
spark.driver.cores=3
spark.executor.memory=9g
spark.kubernetes.executor.label.sparkoperator.k8s.io/submission-id=dfbd9c75-3771-4392-928e-10bf28d94099
spark.driver.maxResultSize=4g
spark.sql.parquet.mergeSchema=false
spark.sql.inMemoryColumnarStorage.compressed=true
spark.rpc.retry.wait=5
spark.hadoop.parquet.enable.summary-metadata=false
spark.kubernetes.allocation.batch.delay=9
spark.driver.memory=16g
spark.sql.starJoinOptimization=true
spark.kubernetes.submitInDriver=true
spark.shuffle.compress=true
spark.memory.useLegacyMode=true
spark.jars=
spark.kubernetes.resource.type=java
spark.locality.wait=0s
spark.kubernetes.driver.ui.svc.port=4040
spark.sql.orc.splits.include.file.footer=true
spark.kubernetes.kerberos.principal=
spark.sql.orc.cache.stripe.details.size=1
spark.executor.instances=22
spark.hadoop.fs.hdfs.impl.disable.cache=true
spark.sql.hive.metastorePartitionPruning=true

Thanks and Regards,
Prakash
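When chasing a regression like this across Spark versions, it can help to first isolate the timeout- and retry-related settings from a flat configuration dump such as the one above, since those were ultimately implicated here. A small hypothetical helper (timeout_related is not a Spark API, just a sketch):

```python
# Hypothetical helper: pick out timeout/retry-related settings from a flat
# whitespace-separated "key=value" Spark configuration dump, so they can be
# compared between the fast and slow runs.
def timeout_related(conf_text):
    keywords = ("timeout", "retry", "retries", "heartbeat")
    settings = {}
    for token in conf_text.split():
        if "=" not in token:
            continue
        key, _, value = token.partition("=")
        if any(k in key.lower() for k in keywords):
            settings[key] = value
    return settings

conf = "spark.network.timeout=2400 spark.rpc.numRetries=5 spark.executor.cores=2"
print(timeout_related(conf))
# -> {'spark.network.timeout': '2400', 'spark.rpc.numRetries': '5'}
```

Run against the full dump above, this surfaces spark.network.timeout=2400, spark.rpc.numRetries=5, spark.rpc.retry.wait=5, and spark.sql.broadcastTimeout=7200 as the settings worth scrutinising first.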