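I don't think the number of partitions follows from having only one
Executor/Worker (i.e. your local machine, a single node). It depends on how
the RDD or DataFrame is created; as far as I know, spark.default.parallelism
defaults to the total number of executor cores, so you may well get 4
partitions on a 4-core server. Either way, your tasks (the smallest unit of
work in Spark) will be processed in parallel on your 4 cores, as Spark runs
one task per partition and, by default, each running task occupies one core.

You can also force a specific partition count at any time by calling
repartition on your DataFrame. With one Executor the partitions are still
processed in parallel (I think), up to the number of cores it has.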
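
If you would rather check than guess, here is a minimal sketch for reading
the partition count before and after an explicit repartition. The names
partition_report and demo are my own, and local[4] is only an assumption
standing in for the 4-core server from the question.

```python
def partition_report(df, target=4):
    """Return the partition counts before and after an explicit repartition."""
    before = df.rdd.getNumPartitions()
    # repartition() returns a new DataFrame; the original one is left unchanged.
    after = df.repartition(target).rdd.getNumPartitions()
    return {"before": before, "after": after}


def demo():
    # Imported here so the helper above has no import-time dependency on PySpark.
    from pyspark.sql import SparkSession

    # Assumption: local[4] stands in for a server with 4 cores.
    spark = (
        SparkSession.builder.master("local[4]")
        .appName("partition-check")  # hypothetical app name
        .getOrCreate()
    )
    print("defaultParallelism:", spark.sparkContext.defaultParallelism)
    print(partition_report(spark.range(10), target=4))
    spark.stop()
```

Calling demo() on a machine with PySpark installed prints the default
parallelism and both partition counts, so you can see what your own setup
does.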

On Sun, 2 Jan 2022 at 00:20, Bitfox <bit...@bitfox.top> wrote:

> One more question: for this big filter, given that my server has 4 cores,
> will Spark (standalone mode) split the RDD into 4 partitions automatically?
>
> Thanks
>
> On Sun, Jan 2, 2022 at 6:30 AM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Create a list of the values that you don't want and filter on those:
>>
>> >>> DF = spark.range(10)
>> >>> DF
>> DataFrame[id: bigint]
>> >>>
>> >>> array = [1, 2, 3, 8]  # don't want these
>> >>> DF.filter(DF.id.isin(array) == False).show()
>> +---+
>> | id|
>> +---+
>> |  0|
>> |  4|
>> |  5|
>> |  6|
>> |  7|
>> |  9|
>> +---+
>>
>> Or use the NOT operator (~), which is unary:
>>
>> >>> DF.filter(~DF.id.isin(array)).show()
>> +---+
>> | id|
>> +---+
>> |  0|
>> |  4|
>> |  5|
>> |  6|
>> |  7|
>> |  9|
>> +---+
>>
>> HTH
>>
>> On Sat, 1 Jan 2022 at 20:59, Bitfox <bit...@bitfox.top> wrote:
>>
>>> Using the DataFrame API, I need to implement a batch filter:
>>>
>>> DF.select(..).where((col(..) != 'a') & (col(..) != 'b') & …)
>>>
>>> There are a lot of keywords that should be filtered out for the same
>>> column in the where clause.
>>>
>>> How can I make it smarter? With a UDF or something else?
>>>
>>> Thanks & Happy New Year!
>>> Bitfox
>>>
>>
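
For the original question above, a UDF should not be needed. Below is a
rough sketch that wraps the isin approach from Mich's reply in a small
helper, plus a left anti join variant that may suit a very long keyword
list. The function names are my own, and the join variant assumes the
unwanted values already sit in a DataFrame with the same column name.

```python
def exclude_values(df, column, unwanted):
    """Keep only the rows whose `column` value is not in `unwanted`."""
    # One isin() call replaces a long chain of != comparisons.
    # Rows where the column is NULL are dropped as well, because
    # NOT (NULL IN (...)) evaluates to NULL and filter() keeps only true rows.
    return df.filter(~df[column].isin(list(unwanted)))


def exclude_values_by_join(df, column, unwanted_df):
    """Variant for a very long list: left anti join on the unwanted values."""
    # Assumption: unwanted_df has a column with the same name as `column`.
    # Unlike isin(), rows where the column is NULL are kept here,
    # since NULL never matches a join key.
    return df.join(unwanted_df, on=column, how="left_anti")


def demo():
    # Imported here so the helpers above have no import-time dependency on PySpark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[*]")
        .appName("exclude-demo")  # hypothetical app name
        .getOrCreate()
    )
    df = spark.range(10)
    exclude_values(df, "id", [1, 2, 3, 8]).show()

    unwanted_df = spark.createDataFrame([(1,), (2,), (3,), (8,)], ["id"])
    exclude_values_by_join(df, "id", unwanted_df).show()
    spark.stop()
```

Both calls in demo() use the same data as the session quoted above, so the
first one should show the same ids as Mich's output. Which variant is
faster will depend on the size of the list, so treat the join as something
to measure, not a guaranteed win.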
