Thiago, have you had any luck with this? I haven't seen this particular
issue before. Is this happening for all of your topologies, or just a
specific set? Which ones? Also, can you check for any errors in the Kafka
broker logs along with your Storm topology logs when the hang occurs? One
other thing to look at on each broker node:

# on HDP
/usr/hdp/current/kafka-broker/bin/kafka status
# If you installed Kafka manually
$KAFKA_HOME/bin/kafka status

I'd also take a peek at your ZooKeeper health.
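
For the ZooKeeper check, here is a minimal sketch. It assumes ZooKeeper
listens on the default client port 2181, that the four-letter-word commands
are enabled (newer ZooKeeper releases need them whitelisted), and that nc is
installed. check_zk is just a helper name made up for this example.

```shell
# Hypothetical helper: send ZooKeeper's "ruok" command and report the result.
# A healthy server answers "imok" and then closes the connection.
check_zk() {
  host="$1"
  port="${2:-2181}"   # assumption: default ZooKeeper client port
  resp=$(echo ruok | nc -w 5 "$host" "$port" 2>/dev/null)
  if [ "$resp" = "imok" ]; then
    echo "$host:$port ok"
  else
    echo "$host:$port no imok response"
    return 1
  fi
}
```

You could call it as `check_zk dn-01.mgpsoc.pe` (the host from your -z
argument) and repeat it for each node in the quorum.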

Best,
Mike

On Tue, Apr 9, 2019 at 7:23 AM Thiago Rahal Disposti <
[email protected]> wrote:

> Hi Michael,
>
> You are correct, we have a script that calls load_tool.sh with:
> /usr/metron/0.7.0/bin/load_tool.sh -p 1 -mt $index -z dn-01.mgpsoc.pe -md
> 3000 -tl 30000 -l 0 -c /tmp/measure-$index.csv
>
> This ran fine every 15 minutes for a few months. Now, when we try to run
> load_tool.sh, it stops after this output:
>
> Consumer Group: metron.load.group
> Thread pool size: 4
> Monitoring ossec every 10000 ms
> Summarizing over the last 5 monitoring periods (50000ms)
>
> We tried kafka-manager, but it showed nothing beyond what we had already
> seen. One interesting detail is what happens when we run the following
> command:
> /usr/hdp/2.6.5.0-292/kafka/bin/kafka-consumer-groups.sh --bootstrap-server
> $BROKERLIST --describe --group metron.load.group
> On the environment with the issue, we get no output until the command
> times out. On our other test environments, it collects and displays the
> information as expected.
>
>
> *Thiago Rahal*
> Cybersecurity
>
> +55 (19) 3112-5000
> [email protected]
>
> www.kryptus.com <http://www.kryptus.com.br>
>
>
> On Mon, Apr 8, 2019 at 7:40 PM James Meyer <[email protected]> wrote:
>
>> unsubscribe
>>
>> On Tue, 9 Apr 2019 at 02:39, Thiago Rahal Disposti <
>> [email protected]> wrote:
>>
>>> Hello all, How's it going?
>>>
>>> We've been seeing an issue with load_tool.sh (which we use to collect
>>> the EPS of our topics):
>>>
>>> After a few months of running every 15 minutes on the servers, it just
>>> stopped working, like this:
>>>
>>> [image: image.png]
>>>
>>> It does not write anything else after those messages.
>>>
>>> After a little bit of digging, we tried to check the kafka consumer
>>> group metron.load.group with:
>>> /usr/hdp/2.6.5.0-292/kafka/bin/kafka-consumer-groups.sh
>>> --bootstrap-server $BROKERLIST --describe --group metron.load.group
>>>
>>> but it times out every time.
>>>
>>> Have you guys ever seen something like this?
>>>
>>> Thanks!
