FullDataAlchemist opened a new issue, #18788:
URL: https://github.com/apache/druid/issues/18788
### Affected Version
Druid 33.0.0.
### Description
We have identified a potential issue with orphaned segments in our
deployment, which utilizes Ceph as deep storage and Postgres as the metadata
store.
Several anomalies have been observed:
- A significant number of segments are present in Ceph but missing from
Postgres metadata.
- Some of these segments are very old, have exceeded their retention period,
and were never cleaned up.
- A subset of segments had never been loaded by the cluster because they did
not exist in Postgres at all, implying they were unknown to the coordinator.
- After manually deleting these segments from Ceph, there were no related
errors or recovery attempts from the cluster, and Ceph disk usage dropped
noticeably, confirming they were unused and orphaned.
- The steps we took to remove them were:
  - List the segment keys in Ceph.
  - List the keys known to Postgres, read from the `payload` field of the
    `druid_segments` table (`payload.loadSpec.key`).
  - Compare the two lists and delete the keys that existed in Ceph but were
    not referenced in PostgreSQL.
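
For anyone who wants to reproduce the comparison, here is a minimal sketch of
the diff step in Python. It is not the exact script we ran. The function names
are made up for illustration, and it assumes that each `payload` value has
already been decoded to JSON text and that the Ceph key list has already been
fetched through whatever S3-compatible client you use. It also assumes the
`loadSpec` carries a `key` field, as it does for S3-style deep storage.

```python
import json
from typing import Iterable, Set


def referenced_keys(payloads: Iterable[str]) -> Set[str]:
    """Collect loadSpec.key from each druid_segments payload (JSON text).

    Assumption: payloads are already decoded to strings; rows without a
    loadSpec or without a key are skipped rather than treated as errors.
    """
    keys: Set[str] = set()
    for raw in payloads:
        load_spec = json.loads(raw).get("loadSpec") or {}
        key = load_spec.get("key")
        if key:
            keys.add(key)
    return keys


def orphaned_keys(ceph_keys: Iterable[str], payloads: Iterable[str]) -> Set[str]:
    """Return keys present in deep storage but not referenced by any payload."""
    return set(ceph_keys) - referenced_keys(payloads)


if __name__ == "__main__":
    # Hypothetical sample data, only to show the shape of the inputs.
    sample_payloads = [
        '{"dataSource": "wiki", "loadSpec": {"type": "s3_zip", '
        '"bucket": "druid", "key": "wiki/2024-01-01/0/index.zip"}}',
    ]
    sample_ceph_keys = [
        "wiki/2024-01-01/0/index.zip",
        "wiki/2023-01-01/0/index.zip",
    ]
    for key in sorted(orphaned_keys(sample_ceph_keys, sample_payloads)):
        print(key)
```

The result is only a list of candidates. Keys written by ingestion tasks that
have not yet published their metadata would also show up in it, so it is
probably safer to review the list (for example by object age) before deleting
anything.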
Additional context:
- These segments appear to be completely unmanaged by Druid since their
metadata entries never existed or were removed prematurely.
- Manual deletion did not cause any segment load/unload events, coordinator
log warnings, or missing segment alerts.