tonysun83 opened a new pull request #3416: URL: https://github.com/apache/couchdb/pull/3416
## Overview

This implements a new `_prometheus` endpoint outlined by https://github.com/apache/couchdb/issues/3377

Users can retrieve metrics info via `/_node/{node-name}/_prometheus`.

**Additional Capability not mentioned in the RFC**

Some users may not want to use authentication for scraping metrics, so I've added a configurable option that allows scraping via a different port.

The output looks like this:

```
# TYPE couchdb_couch_log_requests_total counter
couchdb_couch_log_requests_total{level="alert"} 0
couchdb_couch_log_requests_total{level="critical"} 0
couchdb_couch_log_requests_total{level="debug"} 0
couchdb_couch_log_requests_total{level="emergency"} 0
couchdb_couch_log_requests_total{level="error"} 1
couchdb_couch_log_requests_total{level="info"} 5
couchdb_couch_log_requests_total{level="notice"} 10
couchdb_couch_log_requests_total{level="warning"} 0
# TYPE couchdb_couch_replicator_changes_manager_deaths_total counter
couchdb_couch_replicator_changes_manager_deaths_total 0
# TYPE couchdb_couch_replicator_changes_queue_deaths_total counter
couchdb_couch_replicator_changes_queue_deaths_total 0
# TYPE couchdb_couch_replicator_changes_read_failures_total counter
couchdb_couch_replicator_changes_read_failures_total 0
# TYPE couchdb_couch_replicator_changes_reader_deaths_total counter
couchdb_couch_replicator_changes_reader_deaths_total 0
# TYPE couchdb_couch_replicator_checkpoints_failure_total counter
couchdb_couch_replicator_checkpoints_failure_total 0
# TYPE couchdb_couch_replicator_checkpoints_total counter
couchdb_couch_replicator_checkpoints_total 0
# TYPE couchdb_couch_replicator_connection_acquires_total counter
couchdb_couch_replicator_connection_acquires_total 0
# TYPE couchdb_couch_replicator_connection_closes_total counter
couchdb_couch_replicator_connection_closes_total 0
# TYPE couchdb_couch_replicator_connection_creates_total counter
couchdb_couch_replicator_connection_creates_total 0
# TYPE couchdb_couch_replicator_connection_owner_crashes_total counter
couchdb_couch_replicator_connection_owner_crashes_total 0
# TYPE couchdb_couch_replicator_connection_releases_total counter
couchdb_couch_replicator_connection_releases_total 0
# TYPE couchdb_couch_replicator_connection_worker_crashes_total counter
couchdb_couch_replicator_connection_worker_crashes_total 0
# TYPE couchdb_couch_replicator_docs_completed_state_updates_total counter
couchdb_couch_replicator_docs_completed_state_updates_total 0
# TYPE couchdb_couch_replicator_docs_db_changes_total counter
couchdb_couch_replicator_docs_db_changes_total 0
# TYPE couchdb_couch_replicator_docs_dbs_created_total counter
couchdb_couch_replicator_docs_dbs_created_total 0
# TYPE couchdb_couch_replicator_docs_dbs_deleted_total counter
couchdb_couch_replicator_docs_dbs_deleted_total 0
# TYPE couchdb_couch_replicator_docs_failed_state_updates_total counter
couchdb_couch_replicator_docs_failed_state_updates_total 0
# TYPE couchdb_couch_replicator_failed_starts_total counter
couchdb_couch_replicator_failed_starts_total 0
# TYPE couchdb_couch_replicator_jobs_accepting gauge
couchdb_couch_replicator_jobs_accepting 2
# TYPE couchdb_couch_replicator_jobs_accepts_total counter
couchdb_couch_replicator_jobs_accepts_total 16
# TYPE couchdb_couch_replicator_jobs_adds_total counter
couchdb_couch_replicator_jobs_adds_total 0
# TYPE couchdb_couch_replicator_jobs_crashes_total counter
couchdb_couch_replicator_jobs_crashes_total 0
# TYPE couchdb_couch_replicator_jobs_pending gauge
couchdb_couch_replicator_jobs_pending 0
# TYPE couchdb_couch_replicator_jobs_removes_total counter
couchdb_couch_replicator_jobs_removes_total 0
# TYPE couchdb_couch_replicator_jobs_reschedules_total counter
couchdb_couch_replicator_jobs_reschedules_total 2
# TYPE couchdb_couch_replicator_jobs_running gauge
couchdb_couch_replicator_jobs_running 0
# TYPE couchdb_couch_replicator_jobs_starts_total counter
couchdb_couch_replicator_jobs_starts_total 0
# TYPE couchdb_couch_replicator_jobs_stops_total counter
couchdb_couch_replicator_jobs_stops_total 0
# TYPE couchdb_couch_replicator_requests_total counter
couchdb_couch_replicator_requests_total 0
# TYPE couchdb_couch_replicator_responses_failure_total counter
couchdb_couch_replicator_responses_failure_total 0
# TYPE couchdb_couch_replicator_responses_total counter
couchdb_couch_replicator_responses_total 0
# TYPE couchdb_couch_replicator_stream_responses_failure_total counter
couchdb_couch_replicator_stream_responses_failure_total 0
# TYPE couchdb_couch_replicator_stream_responses_total counter
couchdb_couch_replicator_stream_responses_total 0
# TYPE couchdb_couch_replicator_worker_deaths_total counter
couchdb_couch_replicator_worker_deaths_total 0
# TYPE couchdb_couch_replicator_workers_started_total counter
couchdb_couch_replicator_workers_started_total 0
# TYPE couchdb_auth_cache_requests_total counter
couchdb_auth_cache_requests_total 0
# TYPE couchdb_auth_cache_misses_total counter
couchdb_auth_cache_misses_total 0
# TYPE couchdb_collect_results_time_seconds summary
couchdb_collect_results_time_seconds{quantile="0.5"} 0.0
couchdb_collect_results_time_seconds{quantile="0.75"} 0.0
couchdb_collect_results_time_seconds{quantile="0.9"} 0.0
couchdb_collect_results_time_seconds{quantile="0.95"} 0.0
couchdb_collect_results_time_seconds{quantile="0.99"} 0.0
couchdb_collect_results_time_seconds{quantile="0.999"} 0.0
couchdb_collect_results_time_seconds_sum 0.0
couchdb_collect_results_time_seconds_count 0
# TYPE couchdb_couch_server_lru_skip_total counter
couchdb_couch_server_lru_skip_total 0
# TYPE couchdb_database_purges_total counter
couchdb_database_purges_total 0
# TYPE couchdb_database_reads_total counter
couchdb_database_reads_total 0
# TYPE couchdb_database_writes_total counter
couchdb_database_writes_total 0
# TYPE couchdb_db_open_time_seconds summary
couchdb_db_open_time_seconds{quantile="0.5"} 0.0
couchdb_db_open_time_seconds{quantile="0.75"} 0.0
couchdb_db_open_time_seconds{quantile="0.9"} 0.0
couchdb_db_open_time_seconds{quantile="0.95"} 0.0
couchdb_db_open_time_seconds{quantile="0.99"} 0.0
couchdb_db_open_time_seconds{quantile="0.999"} 0.0
couchdb_db_open_time_seconds_sum 0.0
couchdb_db_open_time_seconds_count 0
# TYPE couchdb_dbinfo_seconds summary
couchdb_dbinfo_seconds{quantile="0.5"} 0.0
couchdb_dbinfo_seconds{quantile="0.75"} 0.0
couchdb_dbinfo_seconds{quantile="0.9"} 0.0
couchdb_dbinfo_seconds{quantile="0.95"} 0.0
couchdb_dbinfo_seconds{quantile="0.99"} 0.0
couchdb_dbinfo_seconds{quantile="0.999"} 0.0
couchdb_dbinfo_seconds_sum 0.0
couchdb_dbinfo_seconds_count 0
# TYPE couchdb_document_inserts_total counter
couchdb_document_inserts_total 0
# TYPE couchdb_document_purges_failure_total counter
couchdb_document_purges_failure_total 0
# TYPE couchdb_document_purges_success_total counter
couchdb_document_purges_success_total 0
. . .
```

## Testing recommendations

To see metrics flowing, run:

1) `brew install prometheus`

2) Create a `prometheus.yml` file in a folder with:

```
global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's the local CouchDB dev node.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 30 seconds.
    scrape_interval: 30s

    static_configs:
      - targets: ['localhost:15984']

    metrics_path: "/_node/_local/_prometheus"
    basic_auth:
      username: 'adm'
      password: 'pass'
```

3) Launch your cluster with `./dev/run -a adm:pass`

4) Launch Prometheus with `prometheus --config.file=prometheus.yml`

5) Open a browser to `http://localhost:9090/graph` to query the scraped metrics data (`http://localhost:9090/metrics` only shows Prometheus's own metrics).
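
For a quick check without a Prometheus server, the endpoint can also be fetched and parsed by hand. The following is a minimal sketch, not part of this PR: `parse_prometheus_text` and `fetch_metrics` are hypothetical helper names, the URL and credentials are the ones assumed in the steps above, and the parser covers only the simple `# TYPE` and sample lines shown in the output (no timestamps, no commas or escaped quotes inside label values).

```python
import base64
import urllib.request

# A few lines copied from the sample output above, used for an offline check.
SAMPLE = """\
# TYPE couchdb_couch_log_requests_total counter
couchdb_couch_log_requests_total{level="error"} 1
couchdb_couch_log_requests_total{level="info"} 5
# TYPE couchdb_couch_replicator_jobs_accepting gauge
couchdb_couch_replicator_jobs_accepting 2
"""


def parse_prometheus_text(text):
    """Split exposition text into ({name: type}, [(name, labels, value)])."""
    types = {}    # metric name -> declared type (counter, gauge, summary)
    samples = []  # (metric name, labels dict, float value)
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            parts = line.split()
            # Only "# TYPE <name> <type>" comments are recorded.
            if len(parts) == 4 and parts[1] == "TYPE":
                types[parts[2]] = parts[3]
            continue
        # The value is the last space-separated token on the line.
        series, _, value = line.rpartition(" ")
        name, labels = series, {}
        if "{" in series:
            name, _, rest = series.partition("{")
            for pair in rest.rstrip("}").split(","):
                if pair:
                    key, _, val = pair.partition("=")
                    labels[key.strip()] = val.strip().strip('"')
        samples.append((name, labels, float(value)))
    return types, samples


def fetch_metrics(base_url="http://localhost:15984", user="adm", password="pass"):
    """Fetch the raw metrics text from a dev node using basic auth.

    The defaults are assumptions taken from the testing steps above.
    """
    request = urllib.request.Request(base_url + "/_node/_local/_prometheus")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Offline by default; swap SAMPLE for fetch_metrics() once the cluster is up.
    types, samples = parse_prometheus_text(SAMPLE)
    for name, labels, value in samples:
        print(types.get(name, "untyped"), name, labels, value)
```

Calling `parse_prometheus_text(fetch_metrics())` against a running dev cluster should give the same structure for the live output, as long as the lines stay within the simple forms handled here.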

For additional info check out https://prometheus.io/docs/introduction/first_steps/

Additional unit tests will be added.

## Related Issues or Pull Requests

<!-- If your changes affects multiple components in different repositories please put links to those issues or pull requests here. -->

## Checklist

- [ ] Code is written and works correctly
- [ ] Changes are covered by tests
- [ ] Any new configurable parameters are documented in `rel/overlay/etc/default.ini`
- [ ] A PR for documentation changes has been made in https://github.com/apache/couchdb-documentation

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
