[ceph-users] ceph mds/db transfer

2019-01-28 Thread renjianxinlover
Hi professor,
Recently I intend to make a big adaptation of our local small-scale Ceph 
cluster. The job mainly includes two parts:
(1) MDS metadata: switch the metadata storage medium to SSD.
(2) OSD BlueStore WAL: switch the WAL storage medium to SSD.
   We are doing some research and testing, but we have doubts and concerns about 
cluster stability and availability.
   So could you please guide us with an outlined solution or steps?
   Thanks very much!
Ren
Brs
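
(For reference, a hedged sketch of the kind of commands such a migration usually
involves on a Luminous-or-later cluster with device classes; the rule, pool and
device names below are made up, and the exact procedure depends on the cluster
layout:)

# create an SSD-only CRUSH rule and move the CephFS metadata pool onto it
ceph osd crush rule create-replicated ssd-rule default host ssd
ceph osd pool set cephfs_metadata crush_rule ssd-rule
ceph -s    # watch the resulting backfill finish

# for OSDs that get rebuilt, place the BlueStore WAL on an SSD/NVMe partition
ceph-volume lvm create --bluestore --data /dev/sdb --block.wal /dev/nvme0n1p1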




Re: [ceph-users] Slow requests from bluestore osds

2019-01-28 Thread Marc Schöchlin
Hello cephers,

as described - we also have the slow requests in our setup.

We recently updated from Ceph 12.2.4 to 12.2.10, updated Ubuntu 16.04 to the 
latest patch level (with kernel 4.15.0-43) and applied Dell firmware 2.8.0.

On 12.2.5 (before updating the cluster) we had slow requests every 10 to 30 
minutes during the entire deep-scrub window between 8:00 PM and 6:00 AM, 
especially between 4:00 AM and 6:00 AM when we sequentially create an rbd 
snapshot for every rbd image and delete an outdated snapshot (we keep 3 
snapshots per rbd device).

After the upgrade to 12.2.10 (and the other patches) the slow requests seem to 
be reduced, but they still occur after the snapshot creation/deletion procedure.
Today we changed the time of the creation/deletion procedure from 4:00 AM to 
7:30 PM and we experienced slow requests right in the snapshot process at 
8:00 PM.

The slow requests only happen on the OSDs of a certain storage class (30 * 8 TB 
spinners) - i.e. the SSD OSDs on the same cluster do not have this problem.
The pools which use this storage class see about 80% write requests.

Our configuration looks like this:
---
bluestore cache kv max = 2147483648
bluestore cache kv ratio = 0.9
bluestore cache meta ratio = 0.1
bluestore cache size hdd = 10737418240
osd deep scrub interval = 2592000
osd scrub begin hour = 19
osd scrub end hour = 6
osd scrub load threshold = 4
osd scrub sleep = 0.3
osd max trimming pgs = 2
---
We do not have that many devices in this storage class (an enhancement is in 
progress to get more IOPS).

What can I do to decrease the impact of snaptrims and prevent slow requests?
(e.g. reduce "osd max trimming pgs" to "1")

Regards
Marc Schöchlin
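
(For reference, a hedged sketch of how the snaptrim throttling knobs mentioned
above can be adjusted at runtime; the values are illustrative only and should be
validated on a test cluster first:)

# trim fewer PGs in parallel per OSD (can also be set in ceph.conf)
ceph tell osd.* injectargs '--osd_max_trimming_pgs 1'
# add a small pause between snap trim operations to ease pressure on the spinners
ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.5'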

On 03.09.18 at 10:13, Marc Schöchlin wrote:
> Hi,
>
> we have also been experiencing this type of behavior for some weeks on our
> not so performance-critical hdd pools.
> We haven't spent much time on this problem yet, because there are
> currently more important tasks - but here are a few details:
>
> Running the following loop results in the following output:
>
> while true; do ceph health|grep -q HEALTH_OK || (date;  ceph health
> detail); sleep 2; done
>
> Sun Sep  2 20:59:47 CEST 2018
> HEALTH_WARN 4 slow requests are blocked > 32 sec
> REQUEST_SLOW 4 slow requests are blocked > 32 sec
>     4 ops are blocked > 32.768 sec
>     osd.43 has blocked requests > 32.768 sec
> Sun Sep  2 20:59:50 CEST 2018
> HEALTH_WARN 4 slow requests are blocked > 32 sec
> REQUEST_SLOW 4 slow requests are blocked > 32 sec
>     4 ops are blocked > 32.768 sec
>     osd.43 has blocked requests > 32.768 sec
> Sun Sep  2 20:59:52 CEST 2018
> HEALTH_OK
> Sun Sep  2 21:00:28 CEST 2018
> HEALTH_WARN 1 slow requests are blocked > 32 sec
> REQUEST_SLOW 1 slow requests are blocked > 32 sec
>     1 ops are blocked > 32.768 sec
>     osd.41 has blocked requests > 32.768 sec
> Sun Sep  2 21:00:31 CEST 2018
> HEALTH_WARN 7 slow requests are blocked > 32 sec
> REQUEST_SLOW 7 slow requests are blocked > 32 sec
>     7 ops are blocked > 32.768 sec
>     osds 35,41 have blocked requests > 32.768 sec
> Sun Sep  2 21:00:33 CEST 2018
> HEALTH_WARN 7 slow requests are blocked > 32 sec
> REQUEST_SLOW 7 slow requests are blocked > 32 sec
>     7 ops are blocked > 32.768 sec
>     osds 35,51 have blocked requests > 32.768 sec
> Sun Sep  2 21:00:35 CEST 2018
> HEALTH_WARN 7 slow requests are blocked > 32 sec
> REQUEST_SLOW 7 slow requests are blocked > 32 sec
>     7 ops are blocked > 32.768 sec
>     osds 35,51 have blocked requests > 32.768 sec
>
> Our details:
>
>   * system details:
>     * Ubuntu 16.04
>     * Kernel 4.13.0-39
>     * 30 * 8 TB disks (SEAGATE/ST8000NM0075)
>     * 3 * Dell PowerEdge R730xd (Firmware 2.50.50.50)
>       * Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
>       * 2 * 10 GBit SFP+ network adapters
>       * 192 GB RAM
>     * Pools are using replication factor 3, 2 MB object size,
>       85% write load, 1700 write IOPS/sec
>       (ops mainly between 4k and 16k size), 300 read IOPS/sec
>   * we have the impression that this appears on deepscrub/scrub activity.
>   * Ceph 12.2.5; we already played with the OSD settings below
>     (our assumption was that the problem is related to rocksdb compaction)
>     bluestore cache kv max = 2147483648
>     bluestore cache kv ratio = 0.9
>     bluestore cache meta ratio = 0.1
>     bluestore cache size hdd = 10737418240
>   * this type of problem only appears on hdd/bluestore osds; ssd/bluestore
>     osds have never experienced that problem
>   * the system is healthy, no swapping, no high load, no errors in dmesg
>
> I attached a log excerpt of osd.35 - this is probably useful for
> investigating the problem if someone has deeper bluestore knowledge.
> (slow requests appeared on Sun Sep  2 21:00:35)
>
> Regards
> Marc
>
>
> On 02.09.2018 at 15:50, Brett Chancellor wrote:
>> The warnings look like this. 
>>
>> 6 ops are blocked > 32.768 sec on osd.219
>> 1 osds have slow requests
>>
>> On Sun, Sep 2, 2018, 8:45 AM Alfredo 

Re: [ceph-users] Bucket logging howto

2019-01-28 Thread Casey Bodley
On Sat, Jan 26, 2019 at 6:57 PM Marc Roos  wrote:
>
>
>
>
> From the owner account of the bucket I am trying to enable logging, but
> I don't get how this should work. I see that s3:PutBucketLogging is
> supported, so I guess this should work. How do you enable it? And how do
> you access the log?
>
>
> [@ ~]$ s3cmd -c .s3cfg accesslog s3://archive
> Access logging for: s3://archive/
>    Logging Enabled: False
>
> [@ ~]$ s3cmd -c .s3cfg.archive accesslog s3://archive
> --access-logging-target-prefix=s3://archive/xx
> ERROR: S3 error: 405 (MethodNotAllowed)

Hi Marc,

The s3:PutBucketLogging action is recognized by bucket policy, but the
feature is otherwise not supported.
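
(For illustration only: the action can be referenced from a bucket policy as in
the hedged sketch below, but per the above it will not actually produce access
logs; the bucket name, user name and config file are made up.)

cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam:::user/backup"]},
    "Action": "s3:PutBucketLogging",
    "Resource": "arn:aws:s3:::archive"
  }]
}
EOF
s3cmd -c .s3cfg setpolicy policy.json s3://archive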


Re: [ceph-users] Questions about using existing HW for PoC cluster

2019-01-28 Thread Will Dennis


The hope is to be able to provide scale-out storage that will be performant 
enough to use as a primary fs-based data store for research data (right now we 
mount via NFS on our cluster nodes; we may do that with Ceph as well, or perhaps 
do native CephFS access from the cluster nodes). Right now I'm still in the 
"I don't know what I don't know" stage :)
From: Willem Jan Withagen <w...@digiware.nl>
Date: Monday, Jan 28, 2019, 8:11 AM

I'd carefully define the term: "all seems to work well".

I'm running several ZFS instances of equal or bigger size, that are
specifically tuned (buses, ssds, memory and ARC ) to their usage. And
they usually do perform very well.

No if you define "work well" as performance close to what you get out of
your zfs store be careful not to compare pears to lemons. You might
need rather beefy HW to get to the ceph-cluster performance at the same
level as your ZFS.

So you'd better define you PoC target with real expectations.

--WjW



[ceph-users] ceph-fs crashed after upgrade to 13.2.4

2019-01-28 Thread Ansgar Jazdzewski
Hi folks, we need some help with our CephFS; all MDS daemons keep crashing.

starting mds.mds02 at -
terminate called after throwing an instance of
'ceph::buffer::bad_alloc'
 what():  buffer::bad_alloc
*** Caught signal (Aborted) **
in thread 7f542d825700 thread_name:md_log_replay
ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
1: /usr/bin/ceph-mds() [0x7cc8a0]
2: (()+0x11390) [0x7f543cf29390]
3: (gsignal()+0x38) [0x7f543c676428]
4: (abort()+0x16a) [0x7f543c67802a]
5: (__gnu_cxx::__verbose_terminate_handler()+0x135) [0x7f543dae6e65]
6: (__cxxabiv1::__terminate(void (*)())+0x6) [0x7f543dadae46]
7: (()+0x734e91) [0x7f543dadae91]
8: (()+0x7410a4) [0x7f543dae70a4]
9: (ceph::buffer::create_aligned_in_mempool(unsigned int, unsigned
int, int)+0x258) [0x7f543d63b348]
10: (ceph::buffer::list::iterator_impl::copy_shallow(unsigned
int, ceph::buffer::ptr&)+0xa2) [0x7f543d640ee2]
11: (compact_map_base,
mempool::pool_allocator<(mempool::pool_index_t)18, char> >,
ceph::buffer::ptr, std::map,
mempool::pool_allocator<(mempool::pool_index_t)18, char> >,
ceph::buffer::ptr, std::less, mempool::po
ol_allocator<(mempool::pool_index_t)18, char> > >,
mempool::pool_allocator<(mempool::pool_index_t)18,
std::pair,
mempool::pool_allocator<(mempool::pool_index_t)18, char> > const,
ceph::buffer::ptr> > > >::decode(ceph::buffer::list::iterator&)+0x122)
[0x66b202]
12: (EMetaBlob::fullbit::decode(ceph::buffer::list::iterator&)+0xe3) [0x7aa633]
13: /usr/bin/ceph-mds() [0x7aeae6]
14: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x3d36) [0x7b4fa6]
15: (EImportStart::replay(MDSRank*)+0x5b) [0x7bbb1b]
16: (MDLog::_replay_thread()+0x864) [0x760024]
17: (MDLog::ReplayThread::entry()+0xd) [0x4f487d]
18: (()+0x76ba) [0x7f543cf1f6ba]
19: (clone()+0x6d) [0x7f543c74841d]
2019-01-28 13:10:02.202 7f542d825700 -1 *** Caught signal (Aborted) **
in thread 7f542d825700 thread_name:md_log_replay

ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
1: /usr/bin/ceph-mds() [0x7cc8a0]
2: (()+0x11390) [0x7f543cf29390]
3: (gsignal()+0x38) [0x7f543c676428]
4: (abort()+0x16a) [0x7f543c67802a]
5: (__gnu_cxx::__verbose_terminate_handler()+0x135) [0x7f543dae6e65]
6: (__cxxabiv1::__terminate(void (*)())+0x6) [0x7f543dadae46]
7: (()+0x734e91) [0x7f543dadae91]
8: (()+0x7410a4) [0x7f543dae70a4]
9: (ceph::buffer::create_aligned_in_mempool(unsigned int, unsigned
int, int)+0x258) [0x7f543d63b348]
10: (ceph::buffer::list::iterator_impl::copy_shallow(unsigned
int, ceph::buffer::ptr&)+0xa2) [0x7f543d640ee2]
11: (compact_map_base,
mempool::pool_allocator<(mempool::pool_index_t)18, char> >,
ceph::buffer::ptr, std::map,
mempool::pool_allocator<(mempool::pool_index_t)18, char> >,
ceph::buffer::ptr, std::less, mempool::po
ol_allocator<(mempool::pool_index_t)18, char> > >,
mempool::pool_allocator<(mempool::pool_index_t)18,
std::pair,
mempool::pool_allocator<(mempool::pool_index_t)18, char> > const,
ceph::buffer::ptr> > > >::decode(ceph::buffer::list::iterator&)+0x122)
[0x66b202]
12: (EMetaBlob::fullbit::decode(ceph::buffer::list::iterator&)+0xe3) [0x7aa633]
13: /usr/bin/ceph-mds() [0x7aeae6]
14: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x3d36) [0x7b4fa6]
15: (EImportStart::replay(MDSRank*)+0x5b) [0x7bbb1b]
16: (MDLog::_replay_thread()+0x864) [0x760024]
17: (MDLog::ReplayThread::entry()+0xd) [0x4f487d]
18: (()+0x76ba) [0x7f543cf1f6ba]
19: (clone()+0x6d) [0x7f543c74841d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.

0> 2019-01-28 13:10:02.202 7f542d825700 -1 *** Caught signal
(Aborted) **
in thread 7f542d825700 thread_name:md_log_replay

ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
1: /usr/bin/ceph-mds() [0x7cc8a0]
2: (()+0x11390) [0x7f543cf29390]
3: (gsignal()+0x38) [0x7f543c676428]
4: (abort()+0x16a) [0x7f543c67802a]
5: (__gnu_cxx::__verbose_terminate_handler()+0x135) [0x7f543dae6e65]
6: (__cxxabiv1::__terminate(void (*)())+0x6) [0x7f543dadae46]
7: (()+0x734e91) [0x7f543dadae91]
8: (()+0x7410a4) [0x7f543dae70a4]
9: (ceph::buffer::create_aligned_in_mempool(unsigned int, unsigned
int, int)+0x258) [0x7f543d63b348]
10: (ceph::buffer::list::iterator_impl::copy_shallow(unsigned
int, ceph::buffer::ptr&)+0xa2) [0x7f543d640ee2]
11: (compact_map_base,
mempool::pool_allocator<(mempool::pool_index_t)18, char> >,
ceph::buffer::ptr, std::map,
mempool::pool_allocator<(mempool::pool_index_t)18, char> >,
ceph::buffer::ptr, std::less, mempool::po
ol_allocator<(mempool::pool_index_t)18, char> > >,
mempool::pool_allocator<(mempool::pool_index_t)18,
std::pair,
mempool::pool_allocator<(mempool::pool_index_t)18, char> > const,
ceph::buffer::ptr> > > >::decode(ceph::buffer::list::iterator&)+0x122)
[0x66b202]
12: (EMetaBlob::fullbit::decode(ceph::buffer::list::iterator&)+0xe3) [0x7aa633]
13: /usr/bin/ceph-mds() [0x7aeae6]
14: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x3d36) [0x7b4fa6]
15: (EImportStart::replay(MDSRank*)+0x5b) [0x7bbb1b]
16: 

Re: [ceph-users] Questions about using existing HW for PoC cluster

2019-01-28 Thread Willem Jan Withagen

On 28-1-2019 02:56, Will Dennis wrote:

I mean to use CephFS on this PoC; the initial use would be to back up an 
existing ZFS server with ~43TB data (I may have to limit the backed-up data 
depending on how much capacity I can get out of the OSD servers) and then share 
it out via NFS as a read-only copy. That would give me some idea of I/O speeds 
on writes and reads, and allow me to test different aspects of Ceph before I go 
pitching it as a primary data storage technology (it will be our org's first 
foray into SDS, and I want it to succeed.)

No way I'd go primary production storage with this motley collection of 
"pre-loved" equipment :) If it all seems to work well, I think I could get a 
reasonable budget for new production-grade gear.


Perhaps superfluous, but my 2ct anyway.

I'd carefully define the term "all seems to work well".

I'm running several ZFS instances of equal or bigger size that are 
specifically tuned (buses, SSDs, memory and ARC) to their usage, and 
they usually do perform very well.


Now, if you define "work well" as performance close to what you get out of 
your ZFS store, be careful not to compare apples to oranges. You might 
need rather beefy HW to get the Ceph cluster performance to the same 
level as your ZFS.


So you'd better define your PoC target with realistic expectations.

--WjW






Re: [ceph-users] krbd reboot hung

2019-01-28 Thread Jason Dillaman
On Mon, Jan 28, 2019 at 4:48 AM Gao, Wenjun  wrote:
>
> The "rbdmap" unit needs rbdmap and fstab to be configured for each volume, 
> what if the map and mount are done by applications instead of the system 
> unit? See, we don't write each volume info into /etc/ceph/rbdmap /etc/fstab, 
> and if the "rbdmap" systemd unit is stopped unexpected, not by rebooting, 
> then all rbd volumes will be umounted and unmapped, it's dangerous to 
> applications.

It will unmap all krbd volumes upon shutdown, not just the listed
ones. If you are worried about an root users running "systemctl stop
rbdmap" causing issues, there are tons of other ways a root user can
destroy the system.
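
(For context, a hedged sketch of the two plain files that drive the rbdmap unit;
the pool, image and mount point names below are made up:)

# /etc/ceph/rbdmap - one image per line, with the credentials used to map it
rbd/myimage  id=admin,keyring=/etc/ceph/ceph.client.admin.keyring

# /etc/fstab - noauto so boot does not block waiting for the device
/dev/rbd/rbd/myimage  /mnt/myimage  xfs  noauto  0 0

# enable the unit so images are mapped at boot and cleanly unmapped at shutdown
systemctl enable rbdmap.service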

>
> On 1/25/19, 9:35 PM, "Jason Dillaman"  wrote:
>
> The "rbdmap" systemd unit file should take care of it [1].
>
> [1] 
> https://github.com/ceph/ceph/blob/master/systemd/rbdmap.service.in#L4
>
> On Fri, Jan 25, 2019 at 3:00 AM Gao, Wenjun  wrote:
> >
> > Thanks, what’s the configuration you mentioned?
> >
> >
> >
> > --
> >
> > Thanks,
> >
> > Wenjun
> >
> >
> >
> > From: Gregory Farnum 
> > Date: Friday, January 25, 2019 at 3:35 PM
> > To: "Gao, Wenjun" 
> > Cc: "ceph-users@lists.ceph.com" 
> > Subject: Re: [ceph-users] krbd reboot hung
> >
> >
> >
> > Looks like your network deactivated before the rbd volume was 
> unmounted. This is a known issue without a good programmatic workaround and 
> you’ll need to adjust your configuration.
> >
> > On Tue, Jan 22, 2019 at 9:17 AM Gao, Wenjun  wrote:
> >
> > I’m using krbd to map an rbd device to a VM. It appears that when the device
> > is mounted, rebooting the OS will hang for more than 7 min; in the bare-metal
> > case, it can be more than 15 min. Even with the latest kernel 5.0.0, the
> > problem still occurs.
> >
> > Here are the console logs with the 4.15.18 kernel and mimic rbd client; the
> > reboot seems to be stuck in the rbd umount operation:
> >
> > [  OK  ] Stopped Update UTMP about System Boot/Shutdown.
> >
> > [  OK  ] Stopped Create Volatile Files and Directories.
> >
> > [  OK  ] Stopped target Local File Systems.
> >
> >  Unmounting /run/user/110281572...
> >
> >  Unmounting /var/tmp...
> >
> >  Unmounting /root/test...
> >
> >  Unmounting /run/user/78402...
> >
> >  Unmounting Configuration File System...
> >
> > [  OK  ] Stopped Configure read-only root support.
> >
> > [  OK  ] Unmounted /var/tmp.
> >
> > [  OK  ] Unmounted /run/user/78402.
> >
> > [  OK  ] Unmounted /run/user/110281572.
> >
> > [  OK  ] Stopped target Swap.
> >
> > [  OK  ] Unmounted Configuration File System.
> >
> > [  189.919062] libceph: mon4 XX.XX.XX.XX:6789 session lost, hunting for 
> new mon
> >
> > [  189.950085] libceph: connect XX.XX.XX.XX:6789 error -101
> >
> > [  189.950764] libceph: mon3 XX.XX.XX.XX:6789 connect error
> >
> > [  190.687090] libceph: connect XX.XX.XX.XX:6789 error -101
> >
> > [  190.694197] libceph: mon3 XX.XX.XX.XX:6789 connect error
> >
> > [  191.711080] libceph: connect XX.XX.XX.XX:6789 error -101
> >
> > [  191.745254] libceph: mon3 XX.XX.XX.XX:6789 connect error
> >
> > [  193.695065] libceph: connect XX.XX.XX.XX:6789 error -101
> >
> > [  193.727694] libceph: mon3 XX.XX.XX.XX:6789 connect error
> >
> > [  197.087076] libceph: connect XX.XX.XX.XX:6789 error -101
> >
> > [  197.121077] libceph: mon4 XX.XX.XX.XX:6789 connect error
> >
> > [  197.663082] libceph: connect XX.XX.XX.XX:6789 error -101
> >
> > [  197.680671] libceph: mon4 XX.XX.XX.XX:6789 connect error
> >
> > [  198.687122] libceph: connect XX.XX.XX.XX:6789 error -101
> >
> > [  198.719253] libceph: mon4 XX.XX.XX.XX:6789 connect error
> >
> > [  200.671136] libceph: connect XX.XX.XX.XX:6789 error -101
> >
> > [  200.702717] libceph: mon4 XX.XX.XX.XX:6789 connect error
> >
> > [  204.703115] libceph: connect XX.XX.XX.XX:6789 error -101
> >
> > [  204.736586] libceph: mon4 XX.XX.XX.XX:6789 connect error
> >
> > [  209.887141] libceph: connect XX.XX.XX.XX:6789 error -101
> >
> > [  209.918721] libceph: mon0 XX.XX.XX.XX:6789 connect error
> >
> > [  210.719078] libceph: connect XX.XX.XX.XX:6789 error -101
> >
> > [  210.750378] libceph: mon0 XX.XX.XX.XX:6789 connect error
> >
> > [  211.679118] libceph: connect XX.XX.XX.XX:6789 error -101
> >
> > [  211.712246] libceph: mon0 XX.XX.XX.XX:6789 connect error

Re: [ceph-users] Commercial support

2019-01-28 Thread Robert Sander
Hi,

On 23.01.19 at 23:28, Ketil Froyn wrote:

> How is the commercial support for Ceph?

At Heinlein Support we also offer independent Ceph consulting.
We are concentrating on the German-speaking regions of Europe.

Regards
-- 
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Mandatory information per §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Managing director: Peer Heinlein -- Registered office: Berlin





Re: [ceph-users] krbd reboot hung

2019-01-28 Thread Gao, Wenjun
The "rbdmap" unit needs rbdmap and fstab to be configured for each volume, what 
if the map and mount are done by applications instead of the system unit? See, 
we don't write each volume info into /etc/ceph/rbdmap /etc/fstab, and if the 
"rbdmap" systemd unit is stopped unexpected, not by rebooting, then all rbd 
volumes will be umounted and unmapped, it's dangerous to applications.

On 1/25/19, 9:35 PM, "Jason Dillaman"  wrote:

The "rbdmap" systemd unit file should take care of it [1].

[1] 
https://github.com/ceph/ceph/blob/master/systemd/rbdmap.service.in#L4

On Fri, Jan 25, 2019 at 3:00 AM Gao, Wenjun  wrote:
>
> Thanks, what’s the configuration you mentioned?
>
>
>
> --
>
> Thanks,
>
> Wenjun
>
>
>
> From: Gregory Farnum 
> Date: Friday, January 25, 2019 at 3:35 PM
> To: "Gao, Wenjun" 
> Cc: "ceph-users@lists.ceph.com" 
> Subject: Re: [ceph-users] krbd reboot hung
>
>
>
> Looks like your network deactivated before the rbd volume was unmounted. 
This is a known issue without a good programmatic workaround and you’ll need to 
adjust your configuration.
>
> On Tue, Jan 22, 2019 at 9:17 AM Gao, Wenjun  wrote:
>
> I’m using krbd to map an rbd device to a VM. It appears that when the device is 
> mounted, rebooting the OS will hang for more than 7 min; in the bare-metal case, 
> it can be more than 15 min. Even with the latest kernel 5.0.0, the problem still occurs.
>
> Here are the console logs with the 4.15.18 kernel and mimic rbd client; the 
> reboot seems to be stuck in the rbd umount operation:
>
> [  OK  ] Stopped Update UTMP about System Boot/Shutdown.
>
> [  OK  ] Stopped Create Volatile Files and Directories.
>
> [  OK  ] Stopped target Local File Systems.
>
>  Unmounting /run/user/110281572...
>
>  Unmounting /var/tmp...
>
>  Unmounting /root/test...
>
>  Unmounting /run/user/78402...
>
>  Unmounting Configuration File System...
>
> [  OK  ] Stopped Configure read-only root support.
>
> [  OK  ] Unmounted /var/tmp.
>
> [  OK  ] Unmounted /run/user/78402.
>
> [  OK  ] Unmounted /run/user/110281572.
>
> [  OK  ] Stopped target Swap.
>
> [  OK  ] Unmounted Configuration File System.
>
> [  189.919062] libceph: mon4 XX.XX.XX.XX:6789 session lost, hunting for 
new mon
>
> [  189.950085] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  189.950764] libceph: mon3 XX.XX.XX.XX:6789 connect error
>
> [  190.687090] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  190.694197] libceph: mon3 XX.XX.XX.XX:6789 connect error
>
> [  191.711080] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  191.745254] libceph: mon3 XX.XX.XX.XX:6789 connect error
>
> [  193.695065] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  193.727694] libceph: mon3 XX.XX.XX.XX:6789 connect error
>
> [  197.087076] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  197.121077] libceph: mon4 XX.XX.XX.XX:6789 connect error
>
> [  197.663082] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  197.680671] libceph: mon4 XX.XX.XX.XX:6789 connect error
>
> [  198.687122] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  198.719253] libceph: mon4 XX.XX.XX.XX:6789 connect error
>
> [  200.671136] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  200.702717] libceph: mon4 XX.XX.XX.XX:6789 connect error
>
> [  204.703115] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  204.736586] libceph: mon4 XX.XX.XX.XX:6789 connect error
>
> [  209.887141] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  209.918721] libceph: mon0 XX.XX.XX.XX:6789 connect error
>
> [  210.719078] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  210.750378] libceph: mon0 XX.XX.XX.XX:6789 connect error
>
> [  211.679118] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  211.712246] libceph: mon0 XX.XX.XX.XX:6789 connect error
>
> [  213.663116] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  213.696943] libceph: mon0 XX.XX.XX.XX:6789 connect error
>
> [  217.695062] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  217.728511] libceph: mon0 XX.XX.XX.XX:6789 connect error
>
> [  225.759109] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  225.775869] libceph: mon0 XX.XX.XX.XX:6789 connect error
>
> [  233.951062] libceph: connect XX.XX.XX.XX:6789 error -101
>
> [  

Re: [ceph-users] cephfs kernel client instability

2019-01-28 Thread Martin Palma
Upgrading to 4.15.0-43-generic fixed the problem.

Best,
Martin

On Fri, Jan 25, 2019 at 9:43 PM Ilya Dryomov  wrote:
>
> On Fri, Jan 25, 2019 at 9:40 AM Martin Palma  wrote:
> >
> > > Do you see them repeating every 30 seconds?
> >
> > yes:
> >
> > Jan 25 09:34:37 sdccgw01 kernel: [6306813.737615] libceph: mon4
> > 10.8.55.203:6789 session lost, hunting for new mon
> > Jan 25 09:34:37 sdccgw01 kernel: [6306813.737620] libceph: mon3
> > 10.8.55.202:6789 session lost, hunting for new mon
> > Jan 25 09:34:37 sdccgw01 kernel: [6306813.737728] libceph: mon2
> > 10.8.55.201:6789 session lost, hunting for new mon
> > Jan 25 09:34:37 sdccgw01 kernel: [6306813.739711] libceph: mon1
> > 10.7.55.202:6789 session established
> > Jan 25 09:34:37 sdccgw01 kernel: [6306813.739899] libceph: mon1
> > 10.7.55.202:6789 session established
> > Jan 25 09:34:37 sdccgw01 kernel: [6306813.740015] libceph: mon3
> > 10.8.55.202:6789 session established
> > Jan 25 09:34:43 sdccgw01 kernel: [6306819.881560] libceph: mon2
> > 10.8.55.201:6789 session lost, hunting for new mon
> > Jan 25 09:34:43 sdccgw01 kernel: [6306819.883730] libceph: mon4
> > 10.8.55.203:6789 session established
> > Jan 25 09:34:47 sdccgw01 kernel: [6306823.977566] libceph: mon0
> > 10.7.55.201:6789 session lost, hunting for new mon
> > Jan 25 09:34:47 sdccgw01 kernel: [6306823.980033] libceph: mon1
> > 10.7.55.202:6789 session established
> > Jan 25 09:35:07 sdccgw01 kernel: [6306844.457449] libceph: mon1
> > 10.7.55.202:6789 session lost, hunting for new mon
> > Jan 25 09:35:07 sdccgw01 kernel: [6306844.457450] libceph: mon3
> > 10.8.55.202:6789 session lost, hunting for new mon
> > Jan 25 09:35:07 sdccgw01 kernel: [6306844.457612] libceph: mon1
> > 10.7.55.202:6789 session lost, hunting for new mon
> > Jan 25 09:35:07 sdccgw01 kernel: [6306844.459168] libceph: mon3
> > 10.8.55.202:6789 session established
> > Jan 25 09:35:07 sdccgw01 kernel: [6306844.459537] libceph: mon4
> > 10.8.55.203:6789 session established
> > Jan 25 09:35:07 sdccgw01 kernel: [6306844.459792] libceph: mon4
> > 10.8.55.203:6789 session established
> >
> > > Which kernel are you running?
> >
> > Current running kernel is 4.11.0-13-generic  (Ubuntu 16.04.5 LTS), and
> > the latest that is provided is  4.15.0-43-generic
>
> Looks like https://tracker.ceph.com/issues/23537 indeed.  A kernel
> upgrade will fix it.
>
> Thanks,
>
> Ilya


Re: [ceph-users] RBD client hangs

2019-01-28 Thread Ilya Dryomov
On Mon, Jan 28, 2019 at 7:31 AM ST Wong (ITSC)  wrote:
>
> > That doesn't appear to be an error -- that's just stating that it found a 
> > dead client that was holding the exclusive-lock, so it broke the dead 
> > client's lock on the image (by blacklisting the client).
>
> As there is only 1 RBD client in this test, does it mean the RBD client 
> process keeps failing?
> On a freshly booted RBD client, doing some basic operations also gives the 
> warning:
>
>  cut here 
> # rbd -n client.acapp1 map 4copy/foo
> /dev/rbd0
> # mount /dev/rbd0 /4copy
> # cd /4copy; ls
>
>
> # tail /var/log/messages
> Jan 28 14:23:39 acapp1 kernel: Key type ceph registered
> Jan 28 14:23:39 acapp1 kernel: libceph: loaded (mon/osd proto 15/24)
> Jan 28 14:23:39 acapp1 kernel: rbd: loaded (major 252)
> Jan 28 14:23:39 acapp1 kernel: libceph: mon2 192.168.1.156:6789 session 
> established
> Jan 28 14:23:39 acapp1 kernel: libceph: client80624 fsid 
> cc795498-5d16-4b84-9584-1788d0458be9
> Jan 28 14:23:39 acapp1 kernel: rbd: rbd0: capacity 10737418240 features 0x5
> Jan 28 14:23:44 acapp1 kernel: XFS (rbd0): Mounting V5 Filesystem
> Jan 28 14:23:44 acapp1 kernel: rbd: rbd0: client80621 seems dead, breaking 
> lock <--
> Jan 28 14:23:45 acapp1 kernel: XFS (rbd0): Starting recovery (logdev: 
> internal)
> Jan 28 14:23:45 acapp1 kernel: XFS (rbd0): Ending recovery (logdev: internal)
>
>  cut here 
>
> Is this normal?

Yes -- the lock isn't released because you are hard resetting your
machine.  When it comes back up, the new client fences the old client
to avoid split brain.

>
>
>
> Besides, repeated the testing:
> * Map and mount the rbd device, read/write ok.
> * Umount all rbd, then reboot without problem
> * Reboot hangs if not umounting all rbd before reboot:
>
>  cut here 
> Jan 28 14:13:12 acapp1 kernel: rbd: rbd0: client80531 seems dead, breaking 
> lock
> Jan 28 14:13:13 acapp1 kernel: XFS (rbd0): Ending clean mount 
>   <-- Reboot hangs here
> Jan 28 14:14:06 acapp1 systemd: Stopping Session 1 of user root.  
>   <-- pressing power reset
> Jan 28 14:14:06 acapp1 systemd: Stopped target Multi-User System.
>  cut here 
>
> Is it necessary to umount all RBD devices before rebooting the client host?

Yes, it's necessary.  If you enable rbdmap.service, it should do it for
you:

https://github.com/ceph/ceph/blob/f52c22ebf5ff24107faf061a8de1f36376ed515d/systemd/rbdmap.service.in#L15

Thanks,

Ilya
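
(A hedged sketch of the clean-shutdown sequence the rbdmap unit automates; the
device and mount point below are the ones from the test above:)

# manual variant: unmount and unmap before rebooting
umount /4copy
rbd unmap /dev/rbd0

# or let the unit handle it on every shutdown (images listed in /etc/ceph/rbdmap)
systemctl enable rbdmap.service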