from:"Nebi Aydin"

unsubscribe

2024-05-01 Thread Nebi Aydin

unsubscribe

About shuffle partition size

2023-12-20 Thread Nebi Aydin

Hi all, What happens when the number of unique join keys is less than the number of shuffle partitions? Are we going to end up with lots of empty partitions? If yes, is there any point in having more shuffle partitions than unique join keys?
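To make the question concrete, here is a minimal sketch in plain Python (not Spark code) of hash partitioning by join key. It assumes each row goes to partition `hash(key) mod num_partitions`; Python's built-in `hash` is only a stand-in for the hash Spark actually uses, and `simulate_shuffle` is a hypothetical helper written for this illustration. It also ignores adaptive query execution, which may coalesce small partitions where enabled.

```python
from collections import Counter

def simulate_shuffle(keys, num_partitions):
    """Return the row count per partition when rows are routed by hash(key) mod num_partitions.

    Python's hash() is a stand-in here; Spark uses its own hash function.
    """
    counts = Counter(hash(k) % num_partitions for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]

# 5 unique join keys with 1000 rows each, shuffled into 200 partitions.
rows = [key for key in range(5) for _ in range(1000)]
sizes = simulate_shuffle(rows, 200)

non_empty = sum(1 for s in sizes if s > 0)
print(f"{non_empty} non-empty partitions out of {len(sizes)}")
```

Under this assumption every row with the same key lands in the same partition, so the number of non-empty partitions can never exceed the number of unique keys, and the remaining partitions receive no rows.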

Thread dump only shows 10 shuffle clients

2023-09-28 Thread Nebi Aydin

Hi all, I set spark.shuffle.io.serverThreads and spark.shuffle.io.clientThreads to 800, but when I click Thread dump in the Spark UI for the executor, I only see 10 shuffle client threads. Is that normal, or am I missing something?

Files io threads vs shuffle io threads

2023-09-27 Thread Nebi Aydin

Hi all, Can someone explain the difference between files io threads and shuffle io threads, as I couldn't find any explanation. I'm specifically asking about these: spark.rpc.io.serverThreads, spark.rpc.io.clientThreads, spark.rpc.io.threads, spark.files.io.serverThreads, spark.files.io.clientThreads

About Peak Jvm Memory Onheap

2023-09-17 Thread Nebi Aydin

Hi all, I couldn't find any useful doc that explains the `Peak JVM Memory Onheap` field on the Spark UI. Most of the time my applications have very low *On heap storage memory* and *Peak execution memory on heap*, but a very big `Peak JVM Memory Onheap` on the Spark UI. Can someone please explain the

[Spark Core]: How does rpc threads influence shuffle?

2023-09-15 Thread Nebi Aydin

Hello all, I know that these parameters exist for shuffle tuning: *spark.shuffle.io.serverThreads, spark.shuffle.io.clientThreads, spark.shuffle.io.threads*. But we also have *spark.rpc.io.serverThreads, spark.rpc.io.clientThreads, spark.rpc.io.threads*. So specifically talking about *Shuffling,

Re: [External Email] Re: About /mnt/hdfs/current/BP directories

2023-09-08 Thread Nebi Aydin

>> ) >> >> On Fri, Sep 8, 2023 at 14:56 Jack Wells wrote: >> >>> Hi Nebi, can you share the code you’re using to read and write from S3? >>> >>> On Sep 8, 2023 at 10:59:59, Nebi Aydin >>> wrote: >>> >>>> H

Re: [External Email] Re: About /mnt/hdfs/current/BP directories

2023-09-08 Thread Nebi Aydin

> > On Sep 8, 2023 at 10:59:59, Nebi Aydin > wrote: > >> Hi all, >> I am using spark on EMR to process data. Basically i read data from AWS >> S3 and do the transformation and post transformation i am loading/writing >> data to s3. >> >> Recently we

About /mnt/hdfs/current/BP directories

2023-09-08 Thread Nebi Aydin

Hi all, I am using Spark on EMR to process data. Basically I read data from AWS S3, do the transformation, and after the transformation I write the data back to S3. Recently we have found that HDFS (/mnt/hdfs) utilization is going too high. I disabled `yarn.log-aggregation-enable` by setting it

Re: [External Email] Re: [Spark Core]: What's difference among spark.shuffle.io.threads

2023-08-19 Thread Nebi Aydin

r will in no case be liable for any monetary damages arising from > such loss, damage or destruction. > > > > > On Fri, 18 Aug 2023 at 23:30, Nebi Aydin wrote: > >> >> Hi, sorry for duplicates. First time user :) >> I keep getting fetchfailedexception 733

Re: [External Email] Re: [Spark Core]: What's difference among spark.shuffle.io.threads

2023-08-18 Thread Nebi Aydin

or any other property which may arise > from relying on this email's technical content is explicitly disclaimed. > The author will in no case be liable for any monetary damages arising from > such loss, damage or destruction. > > > > > On Fri, 18 Aug 2023 at 20:39, Nebi Aydin

[Spark Core]: What's difference among spark.shuffle.io.threads

2023-08-18 Thread Nebi Aydin

I want to learn the differences among the thread configurations below: spark.shuffle.io.serverThreads, spark.shuffle.io.clientThreads, spark.shuffle.io.threads, spark.rpc.io.serverThreads, spark.rpc.io.clientThreads, spark.rpc.io.threads. Thanks.

[Spark Core]: What's difference among spark.shuffle.io.threads

2023-08-18 Thread Nebi Aydin

I want to learn the differences among the thread configurations below: spark.shuffle.io.serverThreads, spark.shuffle.io.clientThreads, spark.shuffle.io.threads, spark.rpc.io.serverThreads, spark.rpc.io.clientThreads, spark.rpc.io.threads. Thanks.

13 matches
