Re: [ceph-users] OSD assert hit suicide timeout
=15908 cs=1 l=0 c=0x7fe01b2e1500).connect got RESETSESSION
2017-09-19 02:06:55.979722 7fdf9500d700  0 -- 10.10.13.29:6826/3332721 >> 10.10.13.27:6830/2018590 pipe(0x7fe003a06800 sd=98 :42191 s=1 pgs=12697 cs=1 l=0 c=0x7fe01b44d780).connect got RESETSESSION
2017-09-19 02:06:56.106436 7fdfba1dc700  0 -- 10.10.13.29:6826/3332721 >> 10.10.13.27:6811/2018593 pipe(0x7fe009e79400 sd=137 :54582 s=1 pgs=11500 cs=1 l=0 c=0x7fe005820880).connect got RESETSESSION
2017-09-19 02:06:56.107146 7fdfbbaf5700  0 -- 10.10.13.29:6826/3332721 >> 10.10.13.27:6811/2018593 pipe(0x7fe009e79400 sd=137 :54582 s=2 pgs=11602 cs=1 l=0 c=0x7fe005820880).fault, initiating reconnect
---
2017-09-19 02:06:56.213980 7fdfdd58d700  0 log_channel(cluster) log [WRN] : map e48123 wrongly marked me down
---
2017-09-19 03:06:34.778837 7fdfeae86700  1 heartbeat_map is_healthy 'FileStore::op_tp thread 0x7fdfde58f700' had timed out after 60
2017-09-19 03:06:34.778840 7fdfeae86700  1 heartbeat_map is_healthy 'FileStore::op_tp thread 0x7fdfddd8e700' had timed out after 60
2017-09-19 03:06:39.778908 7fdfeae86700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fdfc422f700' had timed out after 15
2017-09-19 03:06:39.778921 7fdfeae86700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fdfc5231700' had timed out after 15
2017-09-19 03:06:39.778930 7fdfeae86700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fdfc5231700' had suicide timed out after 150
2017-09-19 03:06:39.782749 7fdfeae86700 -1 common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(const ceph::heartbeat_handle_d*, const char*, time_t)' thread 7fdfeae86700 time 2017-09-19 03:06:39.778940
common/HeartbeatMap.cc: 86: FAILED assert(0 == "hit suicide timeout")

___________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Stanley Zhang | Senior Operations Engineer
Telephone: +64 9 302 0515  Fax: +64 9 302 0518  Mobile: +64 22 318 3664
Freephone: 0800 SMX SMX (769 769)
SMX Limited: Level 15, 19 Victoria Street West, Auckland, New Zealand
Web: http://smxemail.com  SMX | Cloud Email Hosting & Security

This email has been filtered by SMX. For more info visit http://smxemail.com
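[Editor's note: the 15/60/150-second thresholds in the log above correspond to the OSD thread-timeout tunables. A sketch of the Jewel-era option names and defaults, for reference only; raising them buys time during a known-slow operation, but the underlying stalls still need fixing:]

```ini
[osd]
osd op thread timeout = 15              ; OSD::osd_op_tp "timed out" warnings
osd op thread suicide timeout = 150     ; OSD::osd_op_tp suicide -> FAILED assert
filestore op thread timeout = 60        ; FileStore::op_tp "timed out" warnings
filestore op thread suicide timeout = 180
```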
Re: [ceph-users] ceph-osd restartd via systemd in case of disk error
I like this; there are some similar ideas we can probably borrow from Cassandra's handling of data disk failures:

# policy for data disk failures:
# die: shut down gossip and Thrift and kill the JVM for any fs errors or
#      single-sstable errors, so the node can be replaced.
# stop_paranoid: shut down gossip and Thrift even for single-sstable errors.
# stop: shut down gossip and Thrift, leaving the node effectively dead, but
#       can still be inspected via JMX.
# best_effort: stop using the failed disk and respond to requests based on
#              remaining available sstables. This means you WILL see obsolete
#              data at CL.ONE!
# ignore: ignore fatal errors and let requests fail, as in pre-1.2 Cassandra
disk_failure_policy: stop_paranoid

Regards
Stanley

On 19/09/17 9:16 PM, Manuel Lausch wrote:
> Am Tue, 19 Sep 2017 08:24:48 + schrieb Adrian Saul
> <adrian.s...@tpgtelecom.com.au>:
>
>>> I understand what you mean and it's indeed dangerous, but see:
>>> https://github.com/ceph/ceph/blob/master/systemd/ceph-osd%40.service
>>>
>>> Looking at the systemd docs it's difficult though:
>>> https://www.freedesktop.org/software/systemd/man/systemd.service.html
>>>
>>> If the OSD crashes due to another bug you do want it to restart. But
>>> for systemd it's not possible to see if the crash was due to a disk
>>> I/O error or a bug in the OSD itself or maybe the OOM-killer or
>>> something.
>>
>> Perhaps using something like RestartPreventExitStatus and defining a
>> specific exit code for the OSD to exit on when it is exiting due to an
>> I/O error.
>
> Another idea: the OSD daemon keeps running in a defined error state and
> only stops the listeners with other OSDs and the clients.
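[Editor's note: the RestartPreventExitStatus suggestion could look roughly like the drop-in override below. This is a sketch only; the exit code 5 as "fatal disk I/O error, do not restart me" is hypothetical and would need corresponding support in the OSD itself:]

```ini
; /etc/systemd/system/ceph-osd@.service.d/override.conf  (hypothetical)
[Service]
Restart=on-failure
; exit status 5 is an assumed code the OSD would use for disk I/O errors;
; systemd would then leave the OSD down instead of restart-looping it
RestartPreventExitStatus=5
```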
Re: [ceph-users] [rgw][s3] Object not in objects list
Your bucket index got corrupted. I believe there is no easy way to restore the index other than downloading the existing objects and re-uploading them; correct me if anybody else knows a better way.

You can check out all the objects in that bucket with:

rados -p .rgw.buckets ls | grep default.32785769.2

Also, what is your region map, and where is your bucket index stored? From the naming of the data pool, it seems you are using the same pool for the bucket index?

Regards
Stanley

On 31/08/17 9:01 PM, Rudenko Aleksandr wrote:
> Hi,
>
> Maybe someone has thoughts?
>
> ---
> Best regards, Alexander Rudenko
>
> On 30 Aug 2017, at 12:28, Rudenko Aleksandr wrote:
>
> Hi,
>
> I use ceph 0.94.10 (hammer) with radosgw as an S3-compatible object
> store. I have a few objects in some bucket with a strange problem. I use
> awscli as the s3 client. GET/HEAD on the objects works fine but listing
> objects doesn't: in the objects list I don't see these objects.
>
> Object metadata:
>
> radosgw-admin bi list --bucket={my-bucket} --object={my-object}
>
> returns []. But:
>
> rados -p .rgw.buckets stat default.32785769.2_{my-object}
> .rgw.buckets/default.32785769.2_{my-object} mtime 2017-08-15 18:07:29.00, size 97430
>
> Bucket versioning is not enabled. The bucket has more than 13M objects.
> Where can I find the problem?
>
> ---
> Best regards, Alexander Rudenko
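[Editor's note: the grep above works because data-pool object names are the bucket marker (here default.32785769.2) joined to the S3 object key with an underscore. A small sketch of recovering the key from a rados listing; the example object name is made up, and this only covers plain object names (multipart/shadow objects use extra prefixes):]

```python
def split_rados_name(rados_name):
    """Split a .rgw.buckets object name into (bucket marker, object key).

    Bucket markers like 'default.32785769.2' contain no underscore, so
    splitting on the first '_' is safe even when the S3 key itself
    contains underscores.
    """
    marker, _, key = rados_name.partition("_")
    return marker, key

print(split_rados_name("default.32785769.2_photos/2017/img_001.jpg"))
# → ('default.32785769.2', 'photos/2017/img_001.jpg')
```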
[ceph-users] deep-scrub taking a long time (possible leveldb corruption?)
Hi

We have a 4-physical-node cluster running Jewel. Our app talks S3 to the cluster and no doubt uses the S3 index heavily. We've had several big outages in the past that seem to have been caused by a deep-scrub on one of the PGs in the S3 index pool. Generally it starts with a deep scrub on one such PG, then follows with lots of slow requests blocking and accumulating, which eventually takes the whole cluster down. In an event like this, we have to set noup/nodown/noout so the OSDs don't suicide during such a deep-scrub.

In a recent outage, the deep-scrub of one PG took 2 hours to finish. After it finished, I happened to try listing all omap keys of the objects in that PG and found that listing the keys of one particular object can cause the same outage described above. That indicates to me that the index object was corrupted, but I can't find anything in the logs. Interestingly (to me), 2 days later that index object seems to have fixed itself: listing omap keys is quick and easy, and deep-scrubbing the same PG only takes 3 seconds.

The deep-scrub that took 2 hours to finish:

.log-20170730.gz:2017-07-29 12:14:10.476325 osd.2 x.x.x.x:6800/78482 217 : cluster [INF] 11.11 deep-scrub starts
.log-20170730.gz:2017-07-29 14:05:12.108523 osd.2 x.x.x.203:6800/78482 1795 : cluster [INF] 11.11 deep-scrub ok

The command I used to list all omap keys:

rados -p .rgw.buckets.index listomapkeys .dir.c82cdc62-7926-440d-8085-4e7879ef8155.26048.647 | wc -l

The most recent deep-scrub, kicked off manually:

2017-07-31 09:54:37.997911 7f78bc333700  0 log_channel(cluster) log [INF] : 11.11 deep-scrub starts
2017-07-31 09:54:40.539494 7f78bc333700  0 log_channel(cluster) log [INF] : 11.11 deep-scrub ok

Setting debug_leveldb to 20/5 didn't log any useful information for the event, sorry, but a perf record shows most (83%) of the time was spent on LevelDB operations (a screenshot or the perf file can be supplied if anybody is interested, since it's over the 150KB size limit).
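[Editor's note: for reference, the gap between the two "deep-scrub starts"/"deep-scrub ok" log lines above works out to just under two hours:]

```python
from datetime import datetime

# Timestamps taken verbatim from the cluster log lines quoted above
FMT = "%Y-%m-%d %H:%M:%S.%f"
start = datetime.strptime("2017-07-29 12:14:10.476325", FMT)
end = datetime.strptime("2017-07-29 14:05:12.108523", FMT)

print(end - start)  # → 1:51:01.632198
```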
I wonder if anybody has come across a similar issue before, or can explain what happened to the index object to make it unusable before but usable 2 days later? One thing that might have fixed the index object is leveldb compaction, I guess.

By the way, the problematic index object above has ~30k keys; the biggest index object in our cluster holds about 300k keys.

Regards
Stanley
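[Editor's note: if leveldb compaction is indeed what healed the object, it can also be triggered deliberately rather than waited for. A sketch, assuming the Jewel-era option name: compaction on OSD start can be enabled in ceph.conf, and an online compaction can be requested with `ceph tell osd.<id> compact`.]

```ini
[osd]
; compact the omap leveldb each time the OSD starts (slows startup,
; but bounds leveldb bloat between restarts)
leveldb compact on mount = true
```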