Re: about cpu cores
We were using Yarn, thanks.

On Sun, Jul 10, 2022 at 9:02 PM Tufan Rakshit wrote:
> It mainly depends on your cluster manager: YARN or Kubernetes?
> Best
> Tufan
>
> On Sun, 10 Jul 2022 at 14:38, Sean Owen wrote:
>> Jobs consist of tasks, each of which consumes a core (this can be set
>> to more than one, but that's a different story). If there are more
>> tasks ready to execute than available cores, some tasks simply wait.
>>
>> On Sun, Jul 10, 2022 at 3:31 AM Yong Walt wrote:
>>> My Spark cluster has 128 cores in total. If the jobs I submit to the
>>> cluster (each job assigned only one core) exceed 128, what will
>>> happen?
>>>
>>> Thank you.
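A minimal sketch of the queueing behaviour Sean describes, assuming a local master capped at 4 cores (the app name, master setting, task count, and sleep duration are all illustrative, not from the thread):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("core-queue-demo")   // illustrative name
  .master("local[4]")           // the scheduler sees only 4 cores
  .getOrCreate()

// 8 tasks, each sleeping 5 seconds: with 4 cores they run in two
// waves of 4 (roughly 10s total). The excess tasks are not rejected
// and nothing fails; they simply wait until a core frees up.
spark.sparkContext
  .parallelize(1 to 8, numSlices = 8)
  .map { i => Thread.sleep(5000); i }
  .collect()

spark.stop()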
about cpu cores
My Spark cluster has 128 cores in total. If the jobs I submit to the cluster (each job assigned only one core) exceed 128, what will happen? Thank you.
Re: Will it lead to OOM error?
We have many cases like this; it won't cause an OOM. Thanks.

On Wed, Jun 22, 2022 at 8:28 PM Sid wrote:
> I have a 150 TB CSV file.
>
> I have a total of 100 TB RAM and 100 TB disk. So if I do something
> like this:
>
> spark.read.option("header","true").csv(filepath).show(false)
>
> will it lead to an OOM error, since there isn't enough memory? Or will
> it spill data onto the disk and process it?
>
> Thanks,
> Sid
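A sketch of why that particular line is safe (the path is hypothetical). Reading is lazy, and show() is an action that only fetches the first 20 rows, so Spark scans just enough of the input to produce them; operations that do exceed execution memory spill to local disk rather than failing outright:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("csv-show-demo")   // illustrative
  .getOrCreate()

// Lazy: this builds a query plan; it does not load 150 TB anywhere.
val df = spark.read
  .option("header", "true")
  .csv("/data/huge.csv")      // hypothetical path

// Pulls only the first 20 rows (truncate = false), so only a small
// prefix of the file is actually read from storage.
df.show(false)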
Re: Spark Doubts
These are basic concepts in Spark :) You may want to take a bit of time to read this small book: https://cloudcache.net/resume/PDDWS2-V2.pdf

regards

On Wed, Jun 22, 2022 at 3:17 AM Sid wrote:
> Hi Team,
>
> I have a few doubts about the questions below:
>
> 1) Where does a data frame reside: memory or disk? How is memory
>    allocated for a data frame?
> 2) How do you configure the size of each partition?
> 3) Is there any way to calculate the exact number of partitions needed
>    to load a specific file?
>
> Thanks,
> Sid
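A small sketch touching all three questions, assuming the default file-source splitting via spark.sql.files.maxPartitionBytes (128 MB by default; the input path is hypothetical):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// (1) A DataFrame is only a query plan until an action runs; rows live
//     in executor memory (spilling to disk as needed) while tasks
//     process them, or longer if you cache the frame explicitly.
// (2) For file sources, partition size is driven by this setting, so a
//     file yields roughly ceil(fileSize / maxPartitionBytes) input
//     partitions.
spark.conf.set("spark.sql.files.maxPartitionBytes", "134217728") // 128 MB

val df = spark.read.option("header", "true").csv("/data/input.csv")

// (3) Inspect how many partitions Spark actually created:
println(df.rdd.getNumPartitions)

// Keep it resident in memory (spills to disk if it doesn't fit):
df.cache()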
Re: input file size
import java.io.File

val someFile = new File("somefile.txt")
val fileSize = someFile.length

This one?

On Sun, Jun 19, 2022 at 4:33 AM mbreuer wrote:
> Hello Community,
>
> I am working on optimizations for file sizes and number of files. In
> the data frame API there is a function input_file_name which returns
> the file name. I am missing a counterpart to get the size of the file.
> Just the size, like "ls -l" returns. Is there something like that?
>
> Kind regards,
> Markus
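One caveat with the snippet above: java.io.File only sees the local filesystem. For the paths that input_file_name() actually returns on HDFS or S3, the Hadoop FileSystem API gives the size instead; a minimal sketch (the path is illustrative):

import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()
val hadoopConf = spark.sparkContext.hadoopConfiguration

// Works for hdfs://, s3a://, file:// and so on, unlike java.io.File.
val p = new Path("hdfs:///data/somefile.txt")   // illustrative path
val fs = p.getFileSystem(hadoopConf)
val sizeInBytes = fs.getFileStatus(p).getLen    // bytes, like "ls -l"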