Re: spark 3.2 release date

2021-08-30 Thread Gengliang Wang
Hi,

There is no exact release date yet. As per
https://spark.apache.org/release-process.html, we need a Release Candidate
which passes the release vote.
Spark 3.2 RC1 failed recently. I will cut RC2 after
https://issues.apache.org/jira/browse/SPARK-36619 is resolved.


Gengliang Wang




> On Aug 31, 2021, at 12:06 PM, infa elance  wrote:
> 
> What is the expected ballpark release date of spark 3.2 ? 
> 
> Thanks and Regards,
> Ajay.



spark 3.2 release date

2021-08-30 Thread infa elance
What is the expected ballpark release date of spark 3.2 ?

Thanks and Regards,
Ajay.


Re: Can’t write to PVC in K8S

2021-08-30 Thread Bjørn Jørgensen
OK, so when I use Spark on k8s I can only save files to S3 buckets or to a
database?

Note my setup: it's Spark with JupyterLab on top of k8s.

What are these settings for, if I can't write files from Spark in k8s to disk?

"spark.kubernetes.driver.volumes.persistentVolumeClaim.nfs100.mount.readOnly", 
"False"
"spark.kubernetes.executor.volumes.persistentVolumeClaim.nfs100.mount.readOnly",
 "False"

On 2021/08/30 20:50:22, Mich Talebzadeh  wrote: 
> Hi,
> 
> You are trying to write to work-dir inside the Docker container and create
> sub-directories there.
> 
> The error you are getting is this:
> 
> Mkdirs failed to create
> file:/opt/spark/work-dir/falk/F01test_df.parquet/_temporary/0/_temporary/attempt_202108291906304682784428756208427_0026_m_00_9563
> (exists=false, cwd=file:/opt/spark/work-dir)
> 
> That directory, /opt/spark/work-dir, is not recognised as a valid storage
> location; it is not an HDFS or HCFS (Hadoop Compatible File System) path.
> 
> 
> From Spark you can write to an external bucket as permanent storage.
> 
> HTH
> 
> 
>view my Linkedin profile
> 
> 
> 
> 
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
> 
> 
> 
> 
> On Mon, 30 Aug 2021 at 14:11, Bjørn Jørgensen 
> wrote:
> 
> > Hi, I have built and am running Spark on k8s. A link to my repo:
> > https://github.com/bjornjorgensen/jlpyk8s
> >
> > Everything seems to be running fine, but I can’t save to PVC.
> > If I convert the dataframe to pandas, then I can save it.
> >
> >
> >
> > from pyspark.sql import SparkSession
> > spark = SparkSession.builder \
> > .master("k8s://https://kubernetes.default.svc.cluster.local:443;) \
> > .config("spark.kubernetes.container.image",
> > "bjornjorgensen/spark-py:v3.2-290821") \
> > .config("spark.kubernetes.authenticate.caCertFile", "/var/run/secrets/
> > kubernetes.io/serviceaccount/ca.crt") \
> > .config("spark.kubernetes.authenticate.oauthTokenFile",
> > "/var/run/secrets/kubernetes.io/serviceaccount/token") \
> > .config("spark.kubernetes.authenticate.driver.serviceAccountName",
> > "my-pyspark-notebook") \
> > .config("spark.executor.instances", "10") \
> > .config("spark.driver.host",
> > "my-pyspark-notebook-spark-driver.default.svc.cluster.local") \
> > .config("spark.driver.port", "29413") \
> >
> > .config("spark.kubernetes.driver.volumes.persistentVolumeClaim.nfs100.options.claimName",
> > "nfs100") \
> >
> > .config("spark.kubernetes.driver.volumes.persistentVolumeClaim.nfs100.mount.path",
> > "/opt/spark/work-dir") \
> >
> > .config("spark.kubernetes.executor.volumes.persistentVolumeClaim.nfs100.options.claimName",
> > "nfs100") \
> >
> > .config("spark.kubernetes.executor.volumes.persistentVolumeClaim.nfs100.mount.path",
> > "/opt/spark/work-dir") \
> >
> > .config("spark.kubernetes.driver.volumes.persistentVolumeClaim.nfs100.mount.readOnly",
> > "False") \
> >
> > .config("spark.kubernetes.executor.volumes.persistentVolumeClaim.nfs100.mount.readOnly",
> > "False") \
> > .appName("myApp") \
> > .config("spark.sql.repl.eagerEval.enabled", "True") \
> > .config("spark.driver.memory", "4g") \
> > .config("spark.executor.memory", "4g") \
> > .getOrCreate()
> > sc = spark.sparkContext
> >
> > pdf.to_parquet("/opt/spark/work-dir/falk/test/F01test.parquet")
> >
> >
> > 21/08/30 12:20:34 WARN WindowExec: No Partition Defined for Window
> > operation! Moving all data to a single partition, this can cause serious
> > performance degradation.
> > 21/08/30 12:20:34 WARN WindowExec: No Partition Defined for Window
> > operation! Moving all data to a single partition, this can cause serious
> > performance degradation.
> > 21/08/30 12:20:37 WARN WindowExec: No Partition Defined for Window
> > operation! Moving all data to a single partition, this can cause serious
> > performance degradation.
> > 21/08/30 12:20:39 WARN TaskSetManager: Lost task 0.0 in stage 25.0 (TID
> > 9497) (10.42.0.16 executor 3): java.io.IOException: Mkdirs failed to create
> > file:/opt/spark/work-dir/falk/test/F01test.parquet/_temporary/0/_temporary/attempt_202108301220375889526593865835092_0025_m_00_9497
> > (exists=false, cwd=file:/opt/spark/work-dir)
> > at
> > org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:515)
> > at
> > org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:500)
> > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1195)
> > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1175)
> > at
> > org.apache.parquet.hadoop.util.HadoopOutputFile.create(HadoopOutputFile.java:74)
> > at
> > org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:329)
> >  

Re: Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1

2021-08-30 Thread Mich Talebzadeh
The problem with these tickets is that they tend to generalise performance
rather than state specifics.

The latter ticket states, and I quote:

 "Spark 3.1.1 is slower than 3.0.2 by 4-5 times".

This is not what we have observed migrating from 3.0.1 to 3.1.1. Unless it
impacts your area of interest specifically, I would not worry too much about it.

Anyway, back to your point: as I understand it, you are using Spark 3.0.2 on
Kubernetes, launched with spark-submit 3.0.2, right? Your data is on HDFS.
Are you reading it from HDFS directly, and how is Spark accessing it? Your
Spark-on-k8s setup gives me the impression that you are accessing cloud buckets.
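
To make that concrete, here is a small sketch of how the storage layer can be
checked from PySpark, using the underlying JVM Hadoop configuration. The
namenode host, port and bucket name are placeholders, not taken from this
thread.

# Print the file system Spark resolves for scheme-less paths.
print(spark.sparkContext._jsc.hadoopConfiguration().get("fs.defaultFS"))

# Reading with an explicit scheme removes any ambiguity about the storage layer.
df_hdfs = spark.read.parquet("hdfs://namenode:8020/tpcds/store_sales")
df_cloud = spark.read.parquet("s3a://my-bucket/tpcds/store_sales")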

HTH



   view my Linkedin profile




*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Mon, 30 Aug 2021 at 11:53, Sharma, Prakash (Nokia - IN/Bangalore) <
prakash.sha...@nokia.com> wrote:

> Hi ,
>
> We are not moving to 3.1.1 because of the open tickets I have mentioned
> below.
> https://issues.apache.org/jira/browse/SPARK-30536
>
> https://issues.apache.org/jira/browse/SPARK-35066
>
>
> Please refer to the attached mail for SPARK-35066.
>
> Thanks.
>
> --
> *From:* Mich Talebzadeh 
> *Sent:* Monday, August 30, 2021 1:15:07 PM
> *To:* Sharma, Prakash (Nokia - IN/Bangalore) 
> *Cc:* user@spark.apache.org 
> *Subject:* Re: Performance Degradation in Spark 3.0.2 compared to Spark
> 3.0.1
>
> Hi,
>
> Any particular reason why you are not using 3.1.1 on Kubernetes?
>
>
>
>view my Linkedin profile
> 
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Mon, 30 Aug 2021 at 06:10, Sharma, Prakash (Nokia - IN/Bangalore) <
> prakash.sha...@nokia.com> wrote:
>
> Seasonal Greetings,
>  We're doing TPC-DS query tests using Spark 3.0.2 on Kubernetes with
> data on HDFS, and we're observing delays in query execution time *when*
> compared to Spark 3.0.1 on the same environment. We've observed that some
> stages fail, but it looks like it takes some time to realise this failure
> and re-trigger these stages. I am also attaching the configuration which
> we used for the Spark driver. We observe the same behaviour with Spark
> 3.0.3 as well.
>
> *Please let us know if anyone has observed similar issues.*
>
> Configuration which we use for spark driver:
>
> spark.io.compression.codec=snappy
>
> spark.sql.parquet.filterPushdown=true
>
>
>
> spark.sql.inMemoryColumnarStorage.batchSize=15000
>
> spark.shuffle.file.buffer=1024k
>
> spark.ui.retainedStages=1
>
> spark.kerberos.keytab=
>
>
>
> spark.speculation=false
>
> spark.submit.deployMode=cluster
>
>
>
>
> spark.kubernetes.driver.label.sparkoperator.k8s.io/launched-by-spark-operator=true
>
>
>
> spark.sql.orc.filterPushdown=true
>
> spark.serializer=org.apache.spark.serializer.KryoSerializer
>
>
>
> spark.sql.crossJoin.enabled=true
>
> spark.kubernetes.kerberos.keytab=
>
>
>
> spark.sql.adaptive.enabled=true
>
> spark.kryo.unsafe=true
>
> spark.kubernetes.driver.label.sparkoperator.k8s.io/submission-id=<label>
>
> spark.executor.cores=2
>
> spark.ui.retainedTasks=20
>
> spark.network.timeout=2400
>
>
>
>
>
> spark.rdd.compress=true
>
> spark.executor.memoryoverhead=3G
>
> spark.master=k8s\:
>
>
>
> spark.kubernetes.driver.label.sparkoperator.k8s.io/app-name=<name>
>
> spark.kubernetes.driver.limit.cores=6144m
>
> spark.kubernetes.submission.waitAppCompletion=false
>
> spark.kerberos.principal=
>
> spark.kubernetes.kerberos.enabled=true
>
> spark.kubernetes.allocation.batch.size=5
>
>
>
> spark.kubernetes.authenticate.driver.serviceAccountName=<name>
>
>
>
>
> spark.kubernetes.executor.label.sparkoperator.k8s.io/launched-by-spark-operator=true
>
> spark.reducer.maxSizeInFlight=1024m
>
>
>
> spark.storage.memoryFraction=0.25
>
>
>
> spark.kubernetes.namespace=
>
> spark.kubernetes.executor.label.sparkoperator.k8s.io/app-name=<label>
>
> spark.rpc.numRetries=5
>
>
>
> spark.shuffle.consolidateFiles=true
>
> spark.sql.shuffle.partitions=400
>
> spark.kubernetes.kerberos.krb5.path=/
>
> spark.sql.codegen=true
>
> spark.ui.strictTransportSecurity=max-age\=31557600
>
> spark.ui.retainedJobs=1
>
>
>
> spark.driver.port=7078
>
> spark.shuffle.io.backLog=256
>
> spark.ssl.ui.enabled=true
>
> spark.kubernetes.memoryOverheadFactor=0.1
>
>
>
> spark.driver.blockManager.port=7079
>
> 

Re: Connection reset by peer : failed to remove cache rdd

2021-08-30 Thread Jacek Laskowski
Hi,

No idea what might be going on here, but I would not worry much about it;
simply monitor disk usage, as some broadcast blocks might have been left over.

Do you know when in your application lifecycle it happens? Spark SQL or
Structured Streaming? Do you use broadcast variables or are the errors
coming from broadcast joins perhaps?
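
If explicit broadcast variables are involved, here is a hedged sketch of
releasing them deterministically rather than leaving the cleanup to the
ContextCleaner, and of temporarily ruling out broadcast joins while debugging.
The lookup dictionary is only an illustration.

# Release an explicit broadcast variable as soon as it is no longer needed.
lookup = spark.sparkContext.broadcast({"a": 1, "b": 2})
# ... use lookup.value inside tasks ...
lookup.unpersist(blocking=True)  # drop cached copies on the executors
lookup.destroy()                 # free all resources; unusable afterwards

# To check whether the errors come from broadcast joins rather than broadcast
# variables, auto-broadcast joins can be switched off temporarily.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")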

Pozdrawiam,
Jacek Laskowski

https://about.me/JacekLaskowski
"The Internals Of" Online Books 
Follow me on https://twitter.com/jaceklaskowski




On Mon, Aug 30, 2021 at 3:26 PM Harsh Sharma 
wrote:

> We are facing an issue in production where we frequently get
>
> "Still have 1 request outstanding when connection with the hostname was
> closed"
>
> and "connection reset by peer" errors, as well as warnings that Spark failed
> to remove a cached RDD or failed to remove a broadcast variable.
>
> Please help us with how to mitigate this:
>
> Executor memory : 12g
>
> Network timeout :   60
>
> Heartbeat interval : 25
>
>
>
> [Stage 284:>(199 + 1) / 200][Stage 292:>  (1 + 3)
> / 200]
> [Stage 284:>(199 + 1) / 200][Stage 292:>  (2 + 3)
> / 200]
> [Stage 292:>  (2 + 4)
> / 200][14/06/21 10:46:17,006 WARN
> shuffle-server-4](TransportChannelHandler) Exception in connection from
> 
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> at sun.nio.ch.IOUtil.read(IOUtil.java:192)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378)
> at
> io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
> at
> io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
> at
> io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
> at
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> at java.lang.Thread.run(Thread.java:748)
> [14/06/21 10:46:17,010 ERROR shuffle-server-4](TransportResponseHandler)
> Still have 1 requests outstanding when connection from  is closed
> [14/06/21 10:46:17,012 ERROR Spark Context Cleaner](ContextCleaner) Error
> cleaning broadcast 159
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> at sun.nio.ch.IOUtil.read(IOUtil.java:192)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378)
> at
> io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
> at
> io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
> at
> io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
> at
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> at java.lang.Thread.run(Thread.java:748)
> [14/06/21 10:46:17,012 WARN
> block-manager-ask-thread-pool-69](BlockManagerMaster) Failed to remove
> broadcast 159 with removeFromMaster = true - Connection reset by peer
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> at sun.nio.ch.IOUtil.read(IOUtil.java:192)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378)
> at
> io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
> at
> io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
> at
> 

Connection reset by peer : failed to remove cache rdd

2021-08-30 Thread Harsh Sharma
We are facing an issue in production where we frequently get

"Still have 1 request outstanding when connection with the hostname was closed"

and "connection reset by peer" errors, as well as warnings that Spark failed to
remove a cached RDD or failed to remove a broadcast variable.

Please help us with how to mitigate this:

Executor memory : 12g

Network timeout :   60

Heartbeat interval : 25
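
For reference, a sketch of how those settings are usually expressed explicitly;
the values mirror the numbers above, and the units are an assumption on my
part. Per the Spark configuration docs, the heartbeat interval should stay
well below the network timeout.

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .config("spark.executor.memory", "12g")
         .config("spark.network.timeout", "60s")             # "60" above, seconds assumed
         .config("spark.executor.heartbeatInterval", "25s")  # "25" above, seconds assumed
         .getOrCreate())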

 

[Stage 284:>(199 + 1) / 200][Stage 292:>  (1 + 3) / 200]
[Stage 284:>(199 + 1) / 200][Stage 292:>  (2 + 3) / 200]
[Stage 292:>  (2 + 4) / 
200][14/06/21 10:46:17,006 WARN  shuffle-server-4](TransportChannelHandler) 
Exception in connection from 
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378)
at 
io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
at 
io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)
[14/06/21 10:46:17,010 ERROR shuffle-server-4](TransportResponseHandler) Still 
have 1 requests outstanding when connection from  is closed
[14/06/21 10:46:17,012 ERROR Spark Context Cleaner](ContextCleaner) Error 
cleaning broadcast 159
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378)
at 
io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
at 
io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)
[14/06/21 10:46:17,012 WARN  
block-manager-ask-thread-pool-69](BlockManagerMaster) Failed to remove 
broadcast 159 with removeFromMaster = true - Connection reset by peer
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378)
at 
io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
at 
io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)


Can’t write to PVC in K8S

2021-08-30 Thread Bjørn Jørgensen
Hi, I have built and am running Spark on k8s. A link to my repo:
https://github.com/bjornjorgensen/jlpyk8s

Everything seems to be running fine, but I can’t save to PVC. 
If I convert the dataframe to pandas, then I can save it. 
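
Assuming pdf is a pandas-on-Spark (pyspark.pandas) DataFrame, whose definition
is not shown in the post, the pandas workaround described above might look like
the sketch below. The write then runs only in the driver process rather than on
the executors, which may be why it succeeds where the distributed write fails;
that explanation is an assumption, not something confirmed in this thread.

# Collect the pandas-on-Spark DataFrame to a plain pandas DataFrame on the driver.
local_pdf = pdf.to_pandas()
# This write happens in the driver process only.
local_pdf.to_parquet("/opt/spark/work-dir/falk/test/F01test.parquet")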



from pyspark.sql import SparkSession
spark = SparkSession.builder \
.master("k8s://https://kubernetes.default.svc.cluster.local:443;) \
.config("spark.kubernetes.container.image", 
"bjornjorgensen/spark-py:v3.2-290821") \
.config("spark.kubernetes.authenticate.caCertFile", 
"/var/run/secrets/kubernetes.io/serviceaccount/ca.crt") \
.config("spark.kubernetes.authenticate.oauthTokenFile", 
"/var/run/secrets/kubernetes.io/serviceaccount/token") \
.config("spark.kubernetes.authenticate.driver.serviceAccountName", 
"my-pyspark-notebook") \
.config("spark.executor.instances", "10") \
.config("spark.driver.host", 
"my-pyspark-notebook-spark-driver.default.svc.cluster.local") \
.config("spark.driver.port", "29413") \

.config("spark.kubernetes.driver.volumes.persistentVolumeClaim.nfs100.options.claimName",
 "nfs100") \

.config("spark.kubernetes.driver.volumes.persistentVolumeClaim.nfs100.mount.path",
 "/opt/spark/work-dir") \

.config("spark.kubernetes.executor.volumes.persistentVolumeClaim.nfs100.options.claimName",
 "nfs100") \

.config("spark.kubernetes.executor.volumes.persistentVolumeClaim.nfs100.mount.path",
 "/opt/spark/work-dir") \

.config("spark.kubernetes.driver.volumes.persistentVolumeClaim.nfs100.mount.readOnly",
 "False") \

.config("spark.kubernetes.executor.volumes.persistentVolumeClaim.nfs100.mount.readOnly",
 "False") \
.appName("myApp") \
.config("spark.sql.repl.eagerEval.enabled", "True") \
.config("spark.driver.memory", "4g") \
.config("spark.executor.memory", "4g") \
.getOrCreate()
sc = spark.sparkContext

pdf.to_parquet("/opt/spark/work-dir/falk/test/F01test.parquet")


21/08/30 12:20:34 WARN WindowExec: No Partition Defined for Window operation! 
Moving all data to a single partition, this can cause serious performance 
degradation.
21/08/30 12:20:34 WARN WindowExec: No Partition Defined for Window operation! 
Moving all data to a single partition, this can cause serious performance 
degradation.
21/08/30 12:20:37 WARN WindowExec: No Partition Defined for Window operation! 
Moving all data to a single partition, this can cause serious performance 
degradation.
21/08/30 12:20:39 WARN TaskSetManager: Lost task 0.0 in stage 25.0 (TID 9497) 
(10.42.0.16 executor 3): java.io.IOException: Mkdirs failed to create 
file:/opt/spark/work-dir/falk/test/F01test.parquet/_temporary/0/_temporary/attempt_202108301220375889526593865835092_0025_m_00_9497
 (exists=false, cwd=file:/opt/spark/work-dir)
at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:515)
at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:500)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1195)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1175)
at 
org.apache.parquet.hadoop.util.HadoopOutputFile.create(HadoopOutputFile.java:74)
at 
org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:329)
at 
org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:482)
at 
org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:420)
at 
org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:409)
at 
org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetOutputWriter.scala:36)
at 
org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$1.newInstance(ParquetFileFormat.scala:150)
at 
org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.newOutputWriter(FileFormatDataWriter.scala:161)
at 
org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.<init>(FileFormatDataWriter.scala:146)
at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:290)
at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$16(FileFormatWriter.scala:229)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown 
Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown 
Source)
at java.base/java.lang.Thread.run(Unknown Source)

21/08/30 12:20:40 WARN TaskSetManager: Lost task 0.1 in stage 25.0 (TID 9498) 

Unsubscribe

2021-08-30 Thread Sandeep Patra
Unsubscribe


Unsubscribe

2021-08-30 Thread Dhaval Patel



Re: Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1

2021-08-30 Thread Sharma, Prakash (Nokia - IN/Bangalore)
Hi ,

We are not moving to 3.1.1 because of the open tickets I have mentioned below.
https://issues.apache.org/jira/browse/SPARK-30536

https://issues.apache.org/jira/browse/SPARK-35066


Please refer to the attached mail for SPARK-35066.


Thanks.


From: Mich Talebzadeh 
Sent: Monday, August 30, 2021 1:15:07 PM
To: Sharma, Prakash (Nokia - IN/Bangalore) 
Cc: user@spark.apache.org 
Subject: Re: Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1

Hi,

Any particular reason why you are not using 3.1.1 on Kubernetes?




 
view my Linkedin profile



Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction.




On Mon, 30 Aug 2021 at 06:10, Sharma, Prakash (Nokia - IN/Bangalore) <
prakash.sha...@nokia.com> wrote:

Seasonal Greetings,
 We're doing TPC-DS query tests using Spark 3.0.2 on Kubernetes with data
on HDFS, and we're observing delays in query execution time when compared to
Spark 3.0.1 on the same environment. We've observed that some stages fail, but
it looks like it takes some time to realise this failure and re-trigger these
stages. I am also attaching the configuration which we used for the Spark
driver. We observe the same behaviour with Spark 3.0.3 as well.

Please let us know if anyone has observed similar issues.

Configuration which we use for spark driver:

spark.io.compression.codec=snappy

spark.sql.parquet.filterPushdown=true



spark.sql.inMemoryColumnarStorage.batchSize=15000

spark.shuffle.file.buffer=1024k

spark.ui.retainedStages=1

spark.kerberos.keytab=



spark.speculation=false

spark.submit.deployMode=cluster



spark.kubernetes.driver.label.sparkoperator.k8s.io/launched-by-spark-operator=true



spark.sql.orc.filterPushdown=true

spark.serializer=org.apache.spark.serializer.KryoSerializer



spark.sql.crossJoin.enabled=true

spark.kubernetes.kerberos.keytab=



spark.sql.adaptive.enabled=true

spark.kryo.unsafe=true

spark.kubernetes.driver.label.sparkoperator.k8s.io/submission-id=

spark.executor.cores=2

spark.ui.retainedTasks=20

spark.network.timeout=2400





spark.rdd.compress=true

spark.executor.memoryoverhead=3G

spark.master=k8s\:



spark.kubernetes.driver.label.sparkoperator.k8s.io/app-name=

spark.kubernetes.driver.limit.cores=6144m

spark.kubernetes.submission.waitAppCompletion=false

spark.kerberos.principal=

spark.kubernetes.kerberos.enabled=true

spark.kubernetes.allocation.batch.size=5



spark.kubernetes.authenticate.driver.serviceAccountName=



spark.kubernetes.executor.label.sparkoperator.k8s.io/launched-by-spark-operator=true

spark.reducer.maxSizeInFlight=1024m



spark.storage.memoryFraction=0.25



spark.kubernetes.namespace=

spark.kubernetes.executor.label.sparkoperator.k8s.io/app-name=

spark.rpc.numRetries=5



spark.shuffle.consolidateFiles=true

spark.sql.shuffle.partitions=400

spark.kubernetes.kerberos.krb5.path=/

spark.sql.codegen=true

spark.ui.strictTransportSecurity=max-age\=31557600

spark.ui.retainedJobs=1



spark.driver.port=7078

spark.shuffle.io.backLog=256

spark.ssl.ui.enabled=true

spark.kubernetes.memoryOverheadFactor=0.1



spark.driver.blockManager.port=7079

spark.kubernetes.executor.limit.cores=4096m

spark.submit.pyFiles=

spark.kubernetes.container.image=

spark.shuffle.io.numConnectionsPerPeer=10



spark.sql.broadcastTimeout=7200



spark.driver.cores=3

spark.executor.memory=9g

spark.kubernetes.executor.label.sparkoperator.k8s.io/submission-id=dfbd9c75-3771-4392-928e-10bf28d94099



spark.driver.maxResultSize=4g

spark.sql.parquet.mergeSchema=false



spark.sql.inMemoryColumnarStorage.compressed=true

spark.rpc.retry.wait=5

spark.hadoop.parquet.enable.summary-metadata=false





spark.kubernetes.allocation.batch.delay=9

spark.driver.memory=16g

spark.sql.starJoinOptimization=true

spark.kubernetes.submitInDriver=true

spark.shuffle.compress=true

spark.memory.useLegacyMode=true

spark.jars=

spark.kubernetes.resource.type=java

spark.locality.wait=0s


Re: Performance Degradation in Spark 3.0.2 compared to Spark 3.0.1

2021-08-30 Thread Mich Talebzadeh
Hi,

Any particular reason why you are not using 3.1.1 on Kubernetes?



   view my Linkedin profile




*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Mon, 30 Aug 2021 at 06:10, Sharma, Prakash (Nokia - IN/Bangalore) <
prakash.sha...@nokia.com> wrote:

> Seasonal Greetings,
>  We're doing TPC-DS query tests using Spark 3.0.2 on Kubernetes with
> data on HDFS, and we're observing delays in query execution time *when*
> compared to Spark 3.0.1 on the same environment. We've observed that some
> stages fail, but it looks like it takes some time to realise this failure
> and re-trigger these stages. I am also attaching the configuration which
> we used for the Spark driver. We observe the same behaviour with Spark
> 3.0.3 as well.
>
> *Please let us know if anyone has observed similar issues.*
>
> Configuration which we use for spark driver:
>
> spark.io.compression.codec=snappy
>
> spark.sql.parquet.filterPushdown=true
>
>
>
> spark.sql.inMemoryColumnarStorage.batchSize=15000
>
> spark.shuffle.file.buffer=1024k
>
> spark.ui.retainedStages=1
>
> spark.kerberos.keytab=
>
>
>
> spark.speculation=false
>
> spark.submit.deployMode=cluster
>
>
>
>
> spark.kubernetes.driver.label.sparkoperator.k8s.io/launched-by-spark-operator=true
>
>
>
> spark.sql.orc.filterPushdown=true
>
> spark.serializer=org.apache.spark.serializer.KryoSerializer
>
>
>
> spark.sql.crossJoin.enabled=true
>
> spark.kubernetes.kerberos.keytab=
>
>
>
> spark.sql.adaptive.enabled=true
>
> spark.kryo.unsafe=true
>
> spark.kubernetes.driver.label.sparkoperator.k8s.io/submission-id=<label>
>
> spark.executor.cores=2
>
> spark.ui.retainedTasks=20
>
> spark.network.timeout=2400
>
>
>
>
>
> spark.rdd.compress=true
>
> spark.executor.memoryoverhead=3G
>
> spark.master=k8s\:
>
>
>
> spark.kubernetes.driver.label.sparkoperator.k8s.io/app-name=<name>
>
> spark.kubernetes.driver.limit.cores=6144m
>
> spark.kubernetes.submission.waitAppCompletion=false
>
> spark.kerberos.principal=
>
> spark.kubernetes.kerberos.enabled=true
>
> spark.kubernetes.allocation.batch.size=5
>
>
>
> spark.kubernetes.authenticate.driver.serviceAccountName=<name>
>
>
>
>
> spark.kubernetes.executor.label.sparkoperator.k8s.io/launched-by-spark-operator=true
>
> spark.reducer.maxSizeInFlight=1024m
>
>
>
> spark.storage.memoryFraction=0.25
>
>
>
> spark.kubernetes.namespace=
>
> spark.kubernetes.executor.label.sparkoperator.k8s.io/app-name=<label>
>
> spark.rpc.numRetries=5
>
>
>
> spark.shuffle.consolidateFiles=true
>
> spark.sql.shuffle.partitions=400
>
> spark.kubernetes.kerberos.krb5.path=/
>
> spark.sql.codegen=true
>
> spark.ui.strictTransportSecurity=max-age\=31557600
>
> spark.ui.retainedJobs=1
>
>
>
> spark.driver.port=7078
>
> spark.shuffle.io.backLog=256
>
> spark.ssl.ui.enabled=true
>
> spark.kubernetes.memoryOverheadFactor=0.1
>
>
>
> spark.driver.blockManager.port=7079
>
> spark.kubernetes.executor.limit.cores=4096m
>
> spark.submit.pyFiles=
>
> spark.kubernetes.container.image=
>
> spark.shuffle.io.numConnectionsPerPeer=10
>
>
>
> spark.sql.broadcastTimeout=7200
>
>
>
> spark.driver.cores=3
>
> spark.executor.memory=9g
>
>
> spark.kubernetes.executor.label.sparkoperator.k8s.io/submission-id=dfbd9c75-3771-4392-928e-10bf28d94099
>
>
>
> spark.driver.maxResultSize=4g
>
> spark.sql.parquet.mergeSchema=false
>
>
>
> spark.sql.inMemoryColumnarStorage.compressed=true
>
> spark.rpc.retry.wait=5
>
> spark.hadoop.parquet.enable.summary-metadata=false
>
>
>
>
>
> spark.kubernetes.allocation.batch.delay=9
>
> spark.driver.memory=16g
>
> spark.sql.starJoinOptimization=true
>
> spark.kubernetes.submitInDriver=true
>
> spark.shuffle.compress=true
>
> spark.memory.useLegacyMode=true
>
> spark.jars=
>
> spark.kubernetes.resource.type=java
>
> spark.locality.wait=0s
>
> spark.kubernetes.driver.ui.svc.port=4040
>
> spark.sql.orc.splits.include.file.footer=true
>
> spark.kubernetes.kerberos.principal=
>
>
>
> spark.sql.orc.cache.stripe.details.size=1
>
>
>
> spark.executor.instances=22
>
> spark.hadoop.fs.hdfs.impl.disable.cache=true
>
> spark.sql.hive.metastorePartitionPruning=true
>
>
>
> Thanks and Regards
> Prakash
>
>
>


Unsubscribe

2021-08-30 Thread Junior Alvarez



Unsubscribe

2021-08-30 Thread Lisa Fiedler







Unsubscribe

2021-08-30 Thread Agostino Calamita