Re: about cpu cores

2022-07-11 Thread Yong Walt
We were using YARN. Thanks.

On Sun, Jul 10, 2022 at 9:02 PM Tufan Rakshit  wrote:

> It mainly depends on your cluster manager: YARN or Kubernetes?
> Best
> Tufan
>
> On Sun, 10 Jul 2022 at 14:38, Sean Owen  wrote:
>
>> Jobs consist of tasks, each of which consumes a core (can be set to >1
>> too, but that's a different story). If there are more tasks ready to
>> execute than available cores, some tasks simply wait.
>>
>> On Sun, Jul 10, 2022 at 3:31 AM Yong Walt  wrote:
>>
>>> Given that my Spark cluster has 128 cores in total, what happens if I
>>> submit more than 128 jobs (each job assigned only one core)?
>>>
>>> Thank you.
>>>
>>
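To illustrate Sean's point about task scheduling, here is a minimal sketch (the app name and partition counts are made up for illustration): with 128 total cores and the default of one core per task, at most 128 tasks run at once, and any remaining tasks wait in the stage's pending queue rather than failing.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("core-demo")
  .config("spark.task.cpus", "1") // cores reserved per task (default is 1)
  .getOrCreate()

// A job over 300 partitions produces 300 tasks; with 128 cores, Spark
// schedules them in waves of up to 128. Excess tasks (or whole jobs)
// simply wait for free cores -- they do not error out.
val counted = spark.range(0L, 1000000L, 1L, numPartitions = 300).count()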


about cpu cores

2022-07-10 Thread Yong Walt
given my spark cluster has 128 cores totally.
If the jobs (each job was assigned only one core) I submitted to the
cluster are over 128, what will happen?

Thank you.


Re: Will it lead to OOM error?

2022-06-22 Thread Yong Walt
We have many cases like this; it won't cause OOM.

Thanks

On Wed, Jun 22, 2022 at 8:28 PM Sid  wrote:

> I have a 150TB CSV file.
>
> I have a total of 100 TB RAM and 100 TB disk. So if I do something like this
>
> spark.read.option("header","true").csv(filepath).show(false)
>
> Will it lead to an OOM error since there isn't enough memory? Or will it
> spill data onto the disk and process it?
>
> Thanks,
> Sid
>
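A sketch of why this does not OOM (the path is hypothetical): `show()` is lazy-friendly; it only runs a job that reads enough input partitions to produce the first 20 rows, so the full 150 TB file is never materialized in driver or executor memory.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Reading is lazy: this builds a plan, it does not load the file.
val df = spark.read.option("header", "true").csv("/data/huge.csv")

// show() pulls only the first 20 rows; truncate = false prints full cells.
df.show(false)

// For shuffling operations (groupBy, join, sort), Spark spills to local
// disk when execution memory is exhausted instead of failing with OOM.
```

Disk can still fill up during a very large shuffle, but simply displaying rows from a huge CSV is cheap.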


Re: Spark Doubts

2022-06-21 Thread Yong Walt
These are basic concepts in Spark :)
You may want to take a bit of time to read this short book:
https://cloudcache.net/resume/PDDWS2-V2.pdf

regards


On Wed, Jun 22, 2022 at 3:17 AM Sid  wrote:

> Hi Team,
>
> I have a few doubts about the below questions:
>
> 1) Where does a DataFrame reside? In memory? On disk? How is memory
> allocated for a DataFrame?
> 2) How do you configure the size of each partition?
> 3) Is there any way to calculate the exact number of partitions needed to
> load a specific file?
>
> Thanks,
> Sid
>
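A brief sketch touching the partitioning questions (the file size and setting values below are illustrative assumptions, not recommendations): input partition size for file sources is governed by `spark.sql.files.maxPartitionBytes` (default 128 MB), so a rough partition-count estimate is the file size divided by that setting.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Input split size for file-based sources (default 128 MB).
spark.conf.set("spark.sql.files.maxPartitionBytes", (128L * 1024 * 1024).toString)

// Rough estimate of input partitions for a file of known size:
val fileBytes         = 10L * 1024 * 1024 * 1024  // assume a 10 GB file
val maxPartitionBytes = 128L * 1024 * 1024
val estimatedParts    = math.ceil(fileBytes.toDouble / maxPartitionBytes).toInt
// ~80 partitions for 10 GB at 128 MB per split
```

DataFrames themselves are lazy: partitions are materialized in executor memory while being processed and are spilled or recomputed as needed, not pinned to disk.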


Re: input file size

2022-06-18 Thread Yong Walt
import java.io.File
val someFile = new File("somefile.txt")
val fileSize = someFile.length

This one?
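Note that `java.io.File` only sees the driver's local filesystem. For HDFS or S3 paths, the Hadoop FileSystem API is the usual counterpart (a sketch; the path is hypothetical and `spark` is assumed to be an active SparkSession):

```scala
import org.apache.hadoop.fs.Path

val path = new Path("hdfs:///data/somefile.txt")
val fs   = path.getFileSystem(spark.sparkContext.hadoopConfiguration)
val fileSize = fs.getFileStatus(path).getLen // size in bytes, like "ls -l"
```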



On Sun, Jun 19, 2022 at 4:33 AM mbreuer  wrote:

> Hello Community,
>
> I am working on optimizations for file sizes and number of files. In the
> data frame there is a function input_file_name which returns the file
> name. I miss a counterpart to get the size of the file. Just the size,
> like "ls -l" returns. Is there something like that?
>
> Kind regards,
> Markus
>
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>