Karthick, you can add the randomness yourself. Add a bolt that produces a
new key by combining your long id with a randomly generated number, and
then use fields grouping on that new key.
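
As a rough sketch of what I mean (plain Java, no Storm classes; the class
name SaltedKey, the ":" separator and the saltBuckets parameter are all
made up for illustration and are not Storm APIs), the intermediate bolt
could build the new key like this and emit it as the field you group on:

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not part of Storm: builds a salted grouping key.
final class SaltedKey {
    private SaltedKey() {}

    // Combine the original long id with a random salt in [0, saltBuckets).
    static String of(long id, int saltBuckets) {
        if (saltBuckets <= 0) {
            throw new IllegalArgumentException("saltBuckets must be positive");
        }
        int salt = ThreadLocalRandom.current().nextInt(saltBuckets);
        return id + ":" + salt;
    }

    // Recover the original id from a salted key on the downstream side.
    static long originalId(String saltedKey) {
        return Long.parseLong(saltedKey.substring(0, saltedKey.lastIndexOf(':')));
    }

    public static void main(String[] args) {
        // The same id can produce up to 4 different keys here.
        for (int i = 0; i < 5; i++) {
            String key = SaltedKey.of(1234567890123L, 4);
            System.out.println(key + " -> id " + SaltedKey.originalId(key));
        }
    }
}
```

Keep in mind that tuples with the same id but different salts can land on
different executors, so ordering is only kept per salted key, not per
original id. A small saltBuckets value limits how far one id is spread.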

We faced the same issue. In our case we found that some packets arrived
with a null key in the Kafka message, so we generated random strings for
those keys to avoid the skew.

You can also try computing a custom hash of the key in your bolt and
grouping on that.
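
For the custom hash, here is a minimal sketch in plain Java. It assumes
you emit the computed bucket as its own field and do the fields grouping
on that field. The class name MixedHash and its methods are invented for
this mail; the mixing step is the 64-bit finalizer from MurmurHash3, which
I picked only as an example of a hash that scatters nearby long ids.

```java
// Hypothetical helper, not part of Storm: maps a long id to a bucket.
final class MixedHash {
    private MixedHash() {}

    // MurmurHash3 64-bit finalizer (fmix64): scrambles the bits of the id.
    static long mix(long h) {
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    // Deterministic bucket in [0, buckets) for a given id.
    static int bucket(long id, int buckets) {
        return (int) Math.floorMod(mix(id), (long) buckets);
    }

    // Count how many ids fall into each bucket, to check the spread offline.
    static int[] histogram(long[] ids, int buckets) {
        int[] counts = new int[buckets];
        for (long id : ids) {
            counts[bucket(id, buckets)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up ids: 500 consecutive values spread over 96 buckets.
        long[] ids = new long[500];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = 1_000_000L + i;
        }
        int[] counts = histogram(ids, 96);
        int min = Integer.MAX_VALUE;
        int max = 0;
        for (int c : counts) {
            min = Math.min(min, c);
            max = Math.max(max, c);
        }
        System.out.println("min per bucket = " + min + ", max per bucket = " + max);
    }
}
```

Since the same id always gives the same bucket, per-id ordering is kept.
Running the histogram over your real 500 ids should show whether the
spread is any better than what you see in the Storm UI today. Note that a
hash only evens out the number of ids per executor; if a few ids carry
most of the traffic, it will not help with that.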

On Sat, 17 Aug 2024 at 6:26 PM, Karthick <[email protected]> wrote:

> Thanks Sahil, Aaron for the reply
>
> Are you encountering a performance problem?
>
> Yes, I am facing slowness on the executors that carry more load.
>
> Is the fields grouping absolutely required for your use case?
>
> Yes, it is required in my case to maintain the ordering.
>
> Can you suggest some hashes?
>
>
> Does your case needs packets of same hash to go to same executor or
> ordering of events ?
>
> Yes, ordering of events is important.
>
> Currently we are using a unique id whose data type is bigint (Long).
>
> Can you suggest any changes regarding this?
>
>
> On Sat, Aug 17, 2024 at 8:40 AM Sahil Kamboj <[email protected]>
> wrote:
>
>> Hey Karthick,
>>
>> Does your case need packets with the same hash to go to the same executor,
>> or ordering of events? If not, then go with local or shuffle grouping,
>> which will distribute the load in a round-robin manner across executors.
>> Otherwise, choose your key wisely to avoid skew when using fields grouping.
>>
>>
>>
>> On Sat, 17 Aug 2024 at 5:20 AM, Karthick <[email protected]>
>> wrote:
>>
>>> Hi Team,
>>>
>>> I'm using fields grouping for a bolt to maintain field-based ordering,
>>> but I'm facing data skew among the bolt's executors. I have 96
>>> executors, and I'm sending data with 500 distinct values of the field used
>>> in the fields grouping. While reviewing the Storm UI, I noticed that a few
>>> executors are underutilized while others are overutilized.
>>>
>>> This seems to be a hashing problem, I guess. Can anyone suggest a better
>>> hashing technique or approach to resolve this issue?
>>>
>>> Thanks in advance for your help.
>>>
>>
