Re: [ceph-users] OSD assert hit suicide timeout

2017-09-19 Thread Stanley Zhang
=15908 cs=1 l=0 c=0x7fe01b2e1500).connect got RESETSESSION
2017-09-19 02:06:55.979722 7fdf9500d700  0 -- 10.10.13.29:6826/3332721 >> 10.10.13.27:6830/2018590 pipe(0x7fe003a06800 sd=98 :42191 s=1 pgs=12697 cs=1 l=0 c=0x7fe01b44d780).connect got RESETSESSION
2017-09-19 02:06:56.106436 7fdfba1dc700  0 -- 10.10.13.29:6826/3332721 >> 10.10.13.27:6811/2018593 pipe(0x7fe009e79400 sd=137 :54582 s=1 pgs=11500 cs=1 l=0 c=0x7fe005820880).connect got RESETSESSION
2017-09-19 02:06:56.107146 7fdfbbaf5700  0 -- 10.10.13.29:6826/3332721 >> 10.10.13.27:6811/2018593 pipe(0x7fe009e79400 sd=137 :54582 s=2 pgs=11602 cs=1 l=0 c=0x7fe005820880).fault, initiating reconnect

---
2017-09-19 02:06:56.213980 7fdfdd58d700  0 log_channel(cluster) log [WRN] : map e48123 wrongly marked me down

---
2017-09-19 03:06:34.778837 7fdfeae86700  1 heartbeat_map is_healthy 'FileStore::op_tp thread 0x7fdfde58f700' had timed out after 60
2017-09-19 03:06:34.778840 7fdfeae86700  1 heartbeat_map is_healthy 'FileStore::op_tp thread 0x7fdfddd8e700' had timed out after 60
2017-09-19 03:06:39.778908 7fdfeae86700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fdfc422f700' had timed out after 15
2017-09-19 03:06:39.778921 7fdfeae86700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fdfc5231700' had timed out after 15
2017-09-19 03:06:39.778930 7fdfeae86700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fdfc5231700' had suicide timed out after 150
2017-09-19 03:06:39.782749 7fdfeae86700 -1 common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(const ceph::heartbeat_handle_d*, const char*, time_t)' thread 7fdfeae86700 time 2017-09-19 03:06:39.778940
common/HeartbeatMap.cc: 86: FAILED assert(0 == "hit suicide timeout")


--

Stanley Zhang | Senior Operations Engineer
Telephone: +64 9 302 0515  Fax: +64 9 302 0518
Mobile: +64 22 318 3664  Freephone: 0800 SMX SMX (769 769)
SMX Limited: Level 15, 19 Victoria Street West, Auckland, New Zealand
Web: http://smxemail.com
SMX | Cloud Email Hosting & Security



Re: [ceph-users] ceph-osd restartd via systemd in case of disk error

2017-09-19 Thread Stanley Zhang
I like this; there are some similar ideas we could probably borrow from how Cassandra handles disk failures:



# policy for data disk failures:
# die: shut down gossip and Thrift and kill the JVM for any fs errors or
#      single-sstable errors, so the node can be replaced.
# stop_paranoid: shut down gossip and Thrift even for single-sstable errors.
# stop: shut down gossip and Thrift, leaving the node effectively dead, but
#       can still be inspected via JMX.
# best_effort: stop using the failed disk and respond to requests based on
#              remaining available sstables.  This means you WILL see obsolete
#              data at CL.ONE!
# ignore: ignore fatal errors and let requests fail, as in pre-1.2 Cassandra

disk_failure_policy: stop_paranoid

Regards

Stanley


On 19/09/17 9:16 PM, Manuel Lausch wrote:

On Tue, 19 Sep 2017 08:24:48 +, Adrian Saul <adrian.s...@tpgtelecom.com.au> wrote:


I understand what you mean and it's indeed dangerous, but see:
https://github.com/ceph/ceph/blob/master/systemd/ceph-osd%40.service

Looking at the systemd docs it's difficult though:
https://www.freedesktop.org/software/systemd/man/systemd.service.html

If the OSD crashes due to another bug you do want it to restart.

But for systemd it's not possible to see whether the crash was due to a
disk I/O error, a bug in the OSD itself, or maybe the OOM-killer or
something.

Perhaps using something like RestartPreventExitStatus and defining a
specific exit code for the OSD to use when it is exiting due to an I/O
error.

Another idea: the OSD daemon keeps running in a defined error state and
only stops listening to other OSDs and the clients.
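
For what it's worth, a rough sketch of the RestartPreventExitStatus idea above, assuming a hypothetical exit code (99 here; the OSD does not define such a code today) reserved for fatal disk I/O errors, applied as a systemd drop-in:

mkdir -p /etc/systemd/system/ceph-osd@.service.d
cat > /etc/systemd/system/ceph-osd@.service.d/io-error.conf <<'EOF'
[Service]
# Do not restart an OSD that exits with the (hypothetical) I/O-error code;
# any other exit keeps the normal restart-on-failure behaviour.
RestartPreventExitStatus=99
EOF
systemctl daemon-reload

That way ordinary crashes would still be restarted as they are now, and only a deliberate "my disk is gone" exit would leave the OSD down.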






Re: [ceph-users] [rgw][s3] Object not in objects list

2017-08-31 Thread Stanley Zhang
Your bucket index got corrupted. I believe there is no easy way to 
restore the index other than downloading the existing objects and 
re-uploading them; correct me if anybody else knows a better way.


You can check out all your objects in that bucket with:

rados -p .rgw.buckets ls | grep default.32785769.2

Also, what does your region map look like, and where is your bucket index 
stored? From the naming of the data pool, it seems you are using the same 
pool for the bucket index?
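
If it helps, something along these lines should show that (hammer-era radosgw-admin; exact flags and output fields may differ on your version):

radosgw-admin region-map get                      # placement targets and the index/data pools
radosgw-admin bucket stats --bucket={my-bucket}   # bucket id and placement info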


Regards

Stanley


On 31/08/17 9:01 PM, Rudenko Aleksandr wrote:

Hi,

Maybe someone have thoughts?

---
Best regards,

Alexander Rudenko



On 30 Aug 2017, at 12:28, Rudenko Aleksandr wrote:


Hi,

I use ceph 0.94.10 (hammer) with radosgw as an S3-compatible object store.

I have few objects in some bucket with strange problem.

I use awscli as s3 client.

GET/HEAD on the objects works fine, but listing objects doesn't: these 
objects don't appear in the object listing.

Object metadata:

radosgw-admin bi list --bucket={my-bucket} --object={my-object}

Returns [].

But:

rados -p .rgw.buckets stat default.32785769.2_{my-object}

.rgw.buckets/default.32785769.2_{my-object} mtime 2017-08-15 18:07:29.00, size 97430



Bucket versioning is not enabled.
The bucket has more than 13M objects.

Where can I find the problem?

---
Best regards,

Alexander Rudenko









[ceph-users] deep-scrub taking long time (possible leveldb corruption?)

2017-08-01 Thread Stanley Zhang

Hi

We have a cluster of 4 physical nodes running Jewel. Our app talks S3 to 
the cluster and no doubt uses the S3 index heavily. We've had several big 
outages in the past that seem to have been caused by a deep-scrub on one 
of the PGs in the S3 index pool. Generally it starts with a deep scrub on 
one such PG, then lots of slow requests block and accumulate, which 
eventually takes the whole cluster down. In an event like this we have to 
set noup/nodown/noout so the OSDs don't suicide during such a deep-scrub.
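
For reference, those are just the standard cluster flags, set before the event and unset again once the deep-scrub has finished, roughly:

ceph osd set noup
ceph osd set nodown
ceph osd set noout
# ... wait for the deep-scrub to finish, then:
ceph osd unset noup
ceph osd unset nodown
ceph osd unset noout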


In a recent outage, the deep-scrub of one PG took 2 hours to finish. 
After it finished, I happened to try listing all the omap keys of the 
objects in that PG and found that listing the keys of one particular 
object caused the same outage described above. That indicates to me that 
the index object was corrupted, but I can't find anything in the logs. 
Interestingly (to me), 2 days later that index object seems to have fixed 
itself: listing its omap keys is quick and easy, and deep-scrubbing the 
same PG only takes 3 seconds.


The deep-scrub that took 2 hours to finish:
.log-20170730.gz:2017-07-29 12:14:10.476325 osd.2 x.x.x.x:6800/78482 217 : cluster [INF] 11.11 deep-scrub starts
.log-20170730.gz:2017-07-29 14:05:12.108523 osd.2 x.x.x.203:6800/78482 1795 : cluster [INF] 11.11 deep-scrub ok


The command I used to list all omap keys:
rados -p .rgw.buckets.index listomapkeys .dir.c82cdc62-7926-440d-8085-4e7879ef8155.26048.647 | wc -l


Most recent deep-scrub kicked off manually:
2017-07-31 09:54:37.997911 7f78bc333700  0 log_channel(cluster) log [INF] : 11.11 deep-scrub starts
2017-07-31 09:54:40.539494 7f78bc333700  0 log_channel(cluster) log [INF] : 11.11 deep-scrub ok
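
(For the record, it was kicked off with the usual command, something like:)

ceph pg deep-scrub 11.11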


Setting debug_leveldb to 20/5 didn't log any useful information for the 
event, sorry, but a perf record shows most (83%) of the time was spent in 
LevelDB operations (a screenshot or the perf file can be supplied if 
anybody is interested, since it's over the 150KB size limit).
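
The capture itself was nothing special, roughly this (with <osd-pid> standing in for the PID of the affected ceph-osd):

perf record -g -p <osd-pid> -- sleep 60
perf report --stdio | head -50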


I wonder if anybody has come across a similar issue before, or can 
explain what happened to the index object to make it unusable before but 
usable 2 days later? One thing that might have fixed the index object is 
a leveldb compaction, I guess. By the way, the problematic index object 
above has ~30k keys; the biggest index object in our cluster holds about 
300k keys.
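
In case it's useful, a rough (and slow on big pools, so use with care) way to find the biggest index objects, using the same commands as above:

for obj in $(rados -p .rgw.buckets.index ls); do
    echo "$(rados -p .rgw.buckets.index listomapkeys "$obj" | wc -l) $obj"
done | sort -rn | head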


Regards

Stanley
