[prometheus-users] Re: chunks_head space issue

Brian Candler Thu, 17 Feb 2022 14:10:01 -0800

Now would be a good time to do:

ls -l /var/lib/prometheus/data/chunks_head/
du -sck /var/lib/prometheus/data/chunks_head/*


My suspicion is your out-of-memory condition is messing up the writing of 
chunks.  Are you using cgroups/containers?

Also, is prometheus continually crashing and being restarted by systemd? 
Try looking in "journalctl -eu prometheus".  That might explain why you see 
lots of free memory most of the time (when prometheus is stopped).
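
To see how often the kernel is killing prometheus, the OOM lines in the kernel log can be counted. The following is only a minimal sketch: the function name count_oom_kills is made up for illustration, and the pattern assumes the "Out of memory: Kill process PID (name)" wording quoted further down in this thread. Other kernel versions may word the message differently, in which case the pattern needs adjusting.

```shell
#!/bin/sh
# Sketch: count kernel OOM kills of one process name in log lines read from stdin.
# count_oom_kills is a hypothetical helper name, not part of any tool.
# The pattern follows the "Out of memory: Kill process PID (name)" wording
# quoted in this thread; adjust it if your kernel words the message differently.
count_oom_kills() {
    name="${1:-prometheus}"
    # grep -c prints the number of matching lines; "|| true" keeps the exit
    # status zero when the count is 0.
    grep -E -c "Out of memory: Kill process [0-9]+ \(${name}\)" || true
}

# Example (not run here):
#   journalctl -k --since today | count_oom_kills prometheus
```

A count that grows every few minutes, together with matching restarts in "journalctl -eu prometheus", would support the crash-and-restart theory.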

On Thursday, 17 February 2022 at 14:57:25 UTC Senthil wrote:

> The issue started again. 
>
> 629G    chunks_head
> 0       lock
> 4.0K    queries.active
> 9.3G    wal
>
> There are numerous restarts of Prometheus:
> Feb 17 09:02:02 kernel: Out of memory: Kill process 36580 (prometheus) score 844 or sacrifice child
> Feb 17 09:08:36 kernel: Out of memory: Kill process 39001 (prometheus) score 846 or sacrifice child
> Feb 17 09:16:02 kernel: Out of memory: Kill process 41074 (prometheus) score 845 or sacrifice child
> Feb 17 09:22:17 kernel: Out of memory: Kill process 44665 (prometheus) score 844 or sacrifice child
> Feb 17 09:29:25 kernel: Out of memory: Kill process 47234 (prometheus) score 844 or sacrifice child
> Feb 17 09:36:06 kernel: Out of memory: Kill process 48970 (prometheus) score 846 or sacrifice child
> Feb 17 09:43:21 kernel: Out of memory: Kill process 50661 (prometheus) score 844 or sacrifice child
>
> but there is plenty of mem available in the servers.
>
>               total        used        free      shared  buff/cache   available
> Mem:             47           5          31           0          10          40
> Swap:             5           1           3
> Total:           52           7          35
>
> On Tuesday, February 1, 2022 at 5:21:32 PM UTC-5 Brian Candler wrote:
>
>> On Tuesday, 1 February 2022 at 21:52:30 UTC Senthil wrote:
>>
>>> I started on Jan 31, so it's a day.
>>>
>>> # du -sck chunks_head/*
>>> 54140   chunks_head/024326
>>> 4       chunks_head/024327
>>> 54144   total
>>>
>>
>> That's perfectly reasonable: it's only 54MB (which is a long way from 
>> 689GB!)
>>
>> Here's what I see on a moderately busy system:
>>
>> root@ldex-prometheus:~# du -sck /var/lib/prometheus/data/chunks_head/*
>> 81004        /var/lib/prometheus/data/chunks_head/006831
>> 77824        /var/lib/prometheus/data/chunks_head/006832
>> 158828        total
>>
>> That's comparable to yours.
>>
>> Therefore, I think you need to keep an eye on this periodically.  If only 
>> you had a monitoring system which could do this for you :-)
>>
>> If it does start to rise, that's when you'll need to check prometheus log 
>> output and find out what's happening.  But this is very strange, and it 
>> does seem to be something specific to your system.
>>
>
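
The periodic check suggested in the quoted message could be sketched roughly as follows, assuming node_exporter's textfile collector is in use. The function name, the metric name chunks_head_disk_bytes and the output directory are all hypothetical choices for illustration; none of them are names that Prometheus or node_exporter define.

```shell
#!/bin/sh
# Sketch: print the on-disk size of a directory as a Prometheus text-format gauge.
# chunks_head_metric and chunks_head_disk_bytes are hypothetical names.
chunks_head_metric() {
    dir="${1:-/var/lib/prometheus/data/chunks_head}"
    # Bail out rather than emit a bogus value if the directory is missing.
    [ -d "$dir" ] || return 1
    # du -sk reports kilobytes; the first tab-separated field is the size.
    kb=$(du -sk "$dir" | cut -f1)
    echo "# TYPE chunks_head_disk_bytes gauge"
    echo "chunks_head_disk_bytes $((kb * 1024))"
}

# Example cron usage (not run here; the output directory is an assumption and
# must match node_exporter's --collector.textfile.directory setting):
#   chunks_head_metric > /var/lib/node_exporter/chunks_head.prom.tmp &&
#     mv /var/lib/node_exporter/chunks_head.prom.tmp /var/lib/node_exporter/chunks_head.prom
```

Writing to a temporary file and renaming it means the collector never reads a half-written file. An alert on this gauge would then flag growth well before it reaches hundreds of gigabytes.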

