Are you able to turn on DEBUG logging, replicate the problem, then make a larger chunk of the logs available? Maybe I will get lucky and find something interesting that you are overlooking.

On Mon, Mar 16, 2020 at 11:14 AM Gonçalo Pedras <[email protected]> wrote:

> Doesn’t tell much either (I guess):
>
> Windowing debug before restarting:
>
> “2020-03-16 13:33:58.089 o.a.s.w.WindowManager Thread-14-builderBolt-executor[6 6] [DEBUG] Scan events, eviction policy WatermarkTimeEvictionPolicy{lag=60000} TimeEvictionPolicy{windowLength=30000, referenceTime=1584365250000}”
>
> “2020-03-16 13:33:58.090 o.a.s.w.WindowManager Thread-14-builderBolt-executor[6 6] [DEBUG] [0] events expired from window.”
>
> “2020-03-16 13:35:05.635 o.a.s.w.WindowManager Thread-14-builderBolt-executor[6 6] [DEBUG] Scan events, eviction policy WatermarkTimeEvictionPolicy{lag=60000} TimeEvictionPolicy{windowLength=30000, referenceTime=1584365250000}”
>
> “2020-03-16 13:35:05.635 o.a.s.w.WindowManager Thread-14-builderBolt-executor[6 6] [DEBUG] [0] events expired from window.”
>
> Storm debug before restarting:
>
> “2020-03-16 13:59:19.024 o.a.s.m.SystemBolt Thread-18-__system-executor[-1 -1] [DEBUG] Emitting aggregated metric tuple - taskInfoEntry: TaskInfo{srcWorkerHost='XXXXXXXXXXXXXXXXXXXXX', srcWorkerPort=6706, srcComponentId='__system', srcTaskId=-1, timestamp=1584367158, updateIntervalSecs=60} / aggregated data points: [[__transfer-count.__metrics_aggregate = [20]], [memory/nonHeap.usedBytes = [123928088]], [__transfer.sojourn_time_ms = [0.0]], [__recv-iconnection.enqueued = [{}]], [__transfer.read_pos = [-1]], [__sendqueue.capacity = [1024]], [memory/heap.initBytes = [528482304]], [memory/heap.virtualFreeBytes = [422777392]], [__sendqueue.sojourn_time_ms = [0.0]], [__transfer.tuple_population = [0]], [__sendqueue.arrival_rate_secs = [0.0]], [memory/heap.unusedBytes = [327881264]], [memory/nonHeap.maxBytes = [-1]], [__transfer.write_pos = [-1]], [__emit-count.__metrics_aggregate = [20]], [__receive.read_pos = [202]], [__receive.population = [1]], [__receive.arrival_rate_secs = [0.11095084877399312]], [GC/PSScavenge.count = [0]], [GC/PSScavenge.timeMs = [0]], [__execute-latency.__system:__metrics = [0.0]], [memory/heap.usedBytes = [3395087824]], [__receive.tuple_population = [0]], [__sendqueue.population = [0]], [memory/nonHeap.committedBytes = [126509056]], [__receive.sojourn_time_ms = [0.0]], [__receive.overflow = [2]], [memory/heap.committedBytes = [3722969088]], [GC/PSMarkSweep.timeMs = [60661]], [memory/nonHeap.initBytes = [2555904]], [GC/PSMarkSweep.count = [16]], [__transfer.arrival_rate_secs = [0.0]], [__sendqueue.overflow = [0]], [__receive.write_pos = [203]], [__sendqueue.read_pos = [127]], [memory/nonHeap.virtualFreeBytes = [-123928089]], [uptimeSecs = [1035.751]], [newWorkerEvent = [0]], [__sendqueue.tuple_population = [0]], [memory/nonHeap.unusedBytes = [2580968]], [__transfer.population = [0]], [__receive.capacity = [1024]], [__transfer.overflow = [0]], [startTimeSecs = [1.584366114667E9]], [memory/heap.maxBytes = [3817865216]], [__transfer.capacity = [1024]], [__recv-iconnection.dequeuedMessages = [0]], [__sendqueue.write_pos = [127]], [__execute-count.__system:__metrics = [0]]]”
>
> “2020-03-16 13:59:19.024 o.a.s.m.SystemBolt Thread-18-__system-executor[-1 -1] [DEBUG] Clearing sent metrics”
>
> “2020-03-16 13:59:41.692 o.a.s.u.LocalState heartbeat-timer [DEBUG] New Local State for /hadoop/storm/workers/b41f3744-522b-4f17-bc0d-35828556a5ea/heartbeats”
