Prometheus was working fine until 2 days ago. Today, after I stopped and 
started it again, it shows the error below and fails to start.

I built Prometheus from the master branch and have been using that binary 
for 2 days. Below are the logs from when I start Prometheus.


[neel@localhost prometheus]$ ./prometheus --config.file=prometheus.yml --storage.tsdb.path=data/
level=info ts=2020-06-11T06:25:10.967Z caller=main.go:302 msg="No time or size retention was set so using the default time retention" duration=15d
level=info ts=2020-06-11T06:25:10.967Z caller=main.go:337 msg="Starting Prometheus" version="(version=2.18.1, branch=master, revision=18d9ebf0ffc26b8bd0e136f552c8e9886d29ade4)"
level=info ts=2020-06-11T06:25:10.967Z caller=main.go:338 build_context="(go=go1.14.3, [email protected], date=20200604-05:51:34)"
level=info ts=2020-06-11T06:25:10.967Z caller=main.go:339 host_details="(Linux 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6 15:49:49 UTC 2019 x86_64 localhost.localdomain (none))"
level=info ts=2020-06-11T06:25:10.967Z caller=main.go:340 fd_limits="(soft=1024, hard=4096)"
level=info ts=2020-06-11T06:25:10.967Z caller=main.go:341 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2020-06-11T06:25:10.973Z caller=main.go:678 msg="Starting TSDB ..."
level=info ts=2020-06-11T06:25:10.973Z caller=web.go:524 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2020-06-11T06:25:10.974Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=1591250134254 maxt=1591257600000 ulid=01EACBSE6H4EP5CS2K2G6KWJ0W
level=info ts=2020-06-11T06:25:10.974Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=1591682400000 maxt=1591747200000 ulid=01EAE4FPDZ932CDRBAGH019TK7
level=info ts=2020-06-11T06:25:10.974Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=1591747200000 maxt=1591768800000 ulid=01EAEQAB8X70893C6ZC2T2ZK38
level=info ts=2020-06-11T06:25:10.975Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=1591790400000 maxt=1591797600000 ulid=01EAGR8C1PMZZACQMN185NC254
level=info ts=2020-06-11T06:25:10.975Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=1591797600000 maxt=1591804800000 ulid=01EAGR8D3YVDV517XCMBMMKCS7
level=info ts=2020-06-11T06:25:10.975Z caller=repair.go:59 component=tsdb msg="Found healthy block" mint=1591768800000 maxt=1591790400000 ulid=01EAGR8ETQQC8MZ09SSC62N1D3
level=info ts=2020-06-11T06:25:10.982Z caller=main.go:547 msg="Stopping scrape discovery manager..."
level=info ts=2020-06-11T06:25:10.983Z caller=main.go:561 msg="Stopping notify discovery manager..."
level=info ts=2020-06-11T06:25:10.983Z caller=main.go:583 msg="Stopping scrape manager..."
level=info ts=2020-06-11T06:25:10.983Z caller=main.go:557 msg="Notify discovery manager stopped"
level=info ts=2020-06-11T06:25:10.983Z caller=main.go:543 msg="Scrape discovery manager stopped"
level=info ts=2020-06-11T06:25:10.983Z caller=manager.go:882 component="rule manager" msg="Stopping rule manager..."
level=info ts=2020-06-11T06:25:10.983Z caller=manager.go:892 component="rule manager" msg="Rule manager stopped"
level=info ts=2020-06-11T06:25:10.983Z caller=notifier.go:601 component=notifier msg="Stopping notification manager..."
level=info ts=2020-06-11T06:25:10.983Z caller=main.go:749 msg="Notifier manager stopped"
level=info ts=2020-06-11T06:25:10.983Z caller=main.go:577 msg="Scrape manager stopped"
level=error ts=2020-06-11T06:25:10.983Z caller=main.go:758 err="opening storage failed: found unsequential head chunk files 144 and 148"

#########################

The data directory contains the following:

01EACBSE6H4EP5CS2K2G6KWJ0W  01EAEQAB8X70893C6ZC2T2ZK38  01EAGR8D3YVDV517XCMBMMKCS7  chunks_head  queries.active
01EAE4FPDZ932CDRBAGH019TK7  01EAGR8C1PMZZACQMN185NC254  01EAGR8ETQQC8MZ09SSC62N1D3  lock         wal
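
The error names head chunk files 144 and 148, which would sit under 
chunks_head in the listing above. As a rough way to see which sequence 
numbers are missing, here is a minimal sketch. It assumes the files in 
data/chunks_head are named as zero-padded integers (for example 000144); 
that naming and the helper name find_gaps are my assumptions for 
illustration, not something shown in the logs. It only reads the directory 
and does not modify anything.

```python
import os


def find_gaps(chunk_dir):
    """Return (prev, next) pairs of adjacent file numbers that are not consecutive.

    Assumption: head chunk files are named as plain integers, e.g. "000144".
    Any file whose name is not purely numeric is ignored.
    """
    numbers = sorted(
        int(name) for name in os.listdir(chunk_dir) if name.isdigit()
    )
    # Compare each number with its successor and keep the non-consecutive pairs.
    return [(a, b) for a, b in zip(numbers, numbers[1:]) if b != a + 1]


# Hypothetical path, matching --storage.tsdb.path=data/ from the command above.
CHUNK_DIR = "data/chunks_head"

if os.path.isdir(CHUNK_DIR):
    for prev_num, next_num in find_gaps(CHUNK_DIR):
        print(f"gap between head chunk files {prev_num} and {next_num}")
```

Run from the Prometheus working directory, this prints one line per gap in 
the numbering, which should make it easier to tell whether files are 
missing from the directory or only skipped in the sequence.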

##########################



The Prometheus config file is as follows:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:9090']

  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'postgres-exporter'

    # Scrape targets from this job every 15 seconds (same as the global setting).
    scrape_interval: 15s

    static_configs:
      - targets: ['localhost:9187']
#############################################

Is the TSDB corrupted? If so, is there any way to recover it?

Thanks in advance.
