[ceph-users] CephFS client issue

2015-06-14 Thread David Z
 


 On Monday, June 15, 2015 3:05 AM, ceph-users-requ...@lists.ceph.com wrote:



Today's Topics:

  1. Re: Erasure coded pools and bit-rot protection (Paweł Sadowski)
  2. CephFS client issue (Matteo Dacrema)
  3. Re: Erasure coded pools and bit-rot protection (Gregory Farnum)
  4. Re: CephFS client issue (Lincoln Bryant)
  5. Re: New Ceph cluster - cannot add additional monitor
      (Mike Carlson)
  6. Re: CephFS client issue (Matteo Dacrema)


--

Message: 1
Date: Sat, 13 Jun 2015 21:08:25 +0200
From: Paweł Sadowski c...@sadziu.pl
To: Gregory Farnum g...@gregs42.com
Cc: ceph-users ceph-us...@ceph.com
Subject: Re: [ceph-users] Erasure coded pools and bit-rot protection
Message-ID: 557c7fa9.1020...@sadziu.pl
Content-Type: text/plain; charset=utf-8

Thanks for taking care of this so fast. Yes, I'm getting a broken object.
I haven't checked this on other versions, but is this bug present
only in Hammer or in all versions?


On 12.06.2015 at 21:43, Gregory Farnum wrote:
> Okay, Sam thinks he knows what's going on; here's a ticket:
> http://tracker.ceph.com/issues/12000
>
> On Fri, Jun 12, 2015 at 12:32 PM, Gregory Farnum g...@gregs42.com wrote:
>> On Fri, Jun 12, 2015 at 1:07 AM, Paweł Sadowski c...@sadziu.pl wrote:
>>> Hi All,
>>>
>>> I'm testing erasure coded pools. Is there any protection from bit-rot
>>> errors on object read? If I modify one bit in an object part (directly
>>> on the OSD) I'm getting a *broken* object:
>> Sorry, are you saying that you're getting a broken object if you flip
>> a bit in an EC pool? That should detect the chunk as invalid and
>> reconstruct on read...
>> -Greg
>>
>>>    mon-01:~ # rados --pool ecpool get `hostname -f`_16 - | md5sum
>>>    bb2d82bbb95be6b9a039d135cc7a5d0d  -
>>>
>>>    # modify one bit directly on OSD
>>>
>>>    mon-01:~ # rados --pool ecpool get `hostname -f`_16 - | md5sum
>>>    02f04f590010b4b0e6af4741c4097b4f  -
>>>
>>>    # restore bit to original value
>>>
>>>    mon-01:~ # rados --pool ecpool get `hostname -f`_16 - | md5sum
>>>    bb2d82bbb95be6b9a039d135cc7a5d0d  -
>>>
>>> If I run deep-scrub on the modified object I'm getting an inconsistent
>>> PG, which is correct in this case. After restoring the bit and running
>>> deep-scrub again all PGs are clean.
>>>
>>> [ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)]
-- 
PS
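
(For anyone reproducing the above, the scrub cycle looks roughly like this.
A sketch only: the PG id 3.1f is a placeholder; the pool and object name come
from the example.)

    # find which PG (and OSDs) hold the test object
    ceph osd map ecpool `hostname -f`_16

    # deep-scrub that PG; the flipped bit should surface as an inconsistency
    ceph pg deep-scrub 3.1f
    ceph health detail | grep inconsistent

    # after restoring the bit, a repeat deep-scrub reports the PG clean again
    ceph pg deep-scrub 3.1f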


--

Message: 2
Date: Sun, 14 Jun 2015 15:26:54 +
From: Matteo Dacrema mdacr...@enter.it
To: ceph-users ceph-us...@ceph.com
Subject: [ceph-users] CephFS client issue
Message-ID: d28e061762104ed68e06effd5199ef06@Exch2013Mb.enter.local
Content-Type: text/plain; charset=us-ascii

Hi all,


I'm using CephFS on Hammer and sometimes I need to reboot one or more clients
because, as ceph -s tells me, they are failing to respond to capability
release. After that all clients stop responding: they can't access files or
mount/umount CephFS.

I have 1.5 million files, 2 metadata servers in active/standby configuration
with 8 GB of RAM, 20 clients with 2 GB of RAM each, and 2 OSD nodes with
4 x 80 GB OSDs and 4 GB of RAM each.
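
(A note for others hitting the same "failing to respond to capability release"
state: the stuck session can usually be inspected via the admin socket on the
active MDS, which may avoid rebooting the client. A sketch, assuming the
session commands are available in your build; cephmds01 is taken from the
config below and the client id is a placeholder.)

    # on the active MDS node: list client sessions and look for the stuck client
    ceph daemon mds.cephmds01 session ls

    # evict just that session instead of rebooting the whole client
    ceph daemon mds.cephmds01 session evict 4305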



Here is my configuration:


[global]
        fsid = 2de7b17f-0a3e-4109-b878-c035dd2f7735
        mon_initial_members = cephmds01
        mon_host = 10.29.81.161
        auth_cluster_required = cephx
        auth_service_required = cephx
        auth_client_required = cephx
        public network = 10.29.81.0/24
        tcp nodelay = true
        tcp rcvbuf = 0
        ms tcp read timeout = 600

        #Capacity
        mon osd full ratio = .95
        mon osd nearfull ratio = .85


[osd]
        osd journal size = 1024
        journal dio = true
        journal aio = true

        osd op threads = 2
        osd op thread timeout = 60
        osd disk threads = 2
        osd recovery threads = 1
        osd recovery max active = 1
        osd max backfills = 2


        # Pool
        osd pool default size = 2

        #XFS
        osd mkfs type = xfs
        osd mkfs options xfs = -f -i size=2048
        osd mount options xfs = rw,noatime,inode64,logbsize=256k,delaylog

        #FileStore Settings
        filestore xattr use omap = false
        filestore max inline xattr size = 512
        filestore max sync interval = 10
        filestore merge threshold = 40
        filestore split multiple = 8
        filestore flusher = false
        filestore queue max ops = 2000
        filestore queue max bytes = 536870912
        filestore queue committing max ops = 500
        filestore queue committing max bytes = 

Re: [ceph-users] rbd format v2 support

2015-06-08 Thread David Z
Hi Ilya,
Thanks for the reply. I knew that a v2 image can be mapped when using default
striping parameters, i.e. without --stripe-unit or --stripe-count.

It is just that the rbd performance (IOPS & bandwidth) we tested hasn't met our
goal. We found that at this point the OSDs don't seem to be the bottleneck, so
we want to try fancy striping.
Do you know if there is an approximate ETA for this feature? Or it would be
great if you could share some info on tuning rbd performance. Anything will be
appreciated.
Thanks.
Zhi (David) 

 On Sunday, June 7, 2015 3:50 PM, Ilya Dryomov idryo...@gmail.com wrote:

 On Fri, Jun 5, 2015 at 6:47 AM, David Z david.z1...@yahoo.com wrote:
> Hi Ceph folks,
>
> We want to use rbd format v2, but found it is not supported on kernel 3.10.0
> of CentOS 7:
>
> [ceph@ ~]$ sudo rbd map zhi_rbd_test_1
> rbd: sysfs write failed
> rbd: map failed: (22) Invalid argument
> [ceph@ ~]$ dmesg | tail
> [662453.664746] rbd: image zhi_rbd_test_1: unsupported stripe unit (got 8192
> want 4194304)
>
> As described in the Ceph docs, it should be available from kernel 3.11. But I
> checked the code of kernels 3.12, 3.14 and even 4.1, and this piece of code
> is still there, see the links below. Am I missing some code or info?

What you are referring to is called fancy striping, and it is
unsupported (work is underway but it's been slow going).  However,
because v2 images with *default* striping parameters are, disk-format
wise, the same as v1 images, you can map a v2 image provided you didn't
specify a custom --stripe-unit or --stripe-count on rbd create.

Thanks,

                Ilya
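
(Concretely, something along these lines should behave as Ilya describes.
A sketch only; the image names are made up.)

    # a format 2 image with default striping maps fine with the kernel client
    rbd create v2test --image-format 2 --size 1024
    sudo rbd map v2test

    # custom ("fancy") striping is what the 3.10 kernel client refuses
    rbd create fancy --image-format 2 --size 1024 --stripe-unit 8192 --stripe-count 16
    sudo rbd map fancy    # expected to fail with (22) Invalid argument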




[ceph-users] rbd format v2 support

2015-06-04 Thread David Z
Hi Ceph folks,

We want to use rbd format v2, but found it is not supported on kernel 3.10.0 of
CentOS 7:

[ceph@ ~]$ sudo rbd map zhi_rbd_test_1 
rbd: sysfs write failed 
rbd: map failed: (22) Invalid argument 
[ceph@ ~]$ dmesg | tail 
[662453.664746] rbd: image zhi_rbd_test_1: unsupported stripe unit (got 8192 
want 4194304)

As described in the Ceph docs, it should be available from kernel 3.11. But I
checked the code of kernels 3.12, 3.14 and even 4.1, and this piece of code is
still there, see the links below. Am I missing some code or info?

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/block/rbd.c?id=refs/tags/v4.1-rc6#n4352
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/block/rbd.c?id=refs/tags/v4.1-rc6#n4359
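
(Side note: if an image was created with custom striping, rbd info should
report its stripe unit and count, which makes it easy to spot images the 3.10
kernel client will reject. Exact output varies by release.)

    # inspect the image's striping parameters
    rbd info zhi_rbd_test_1 | grep stripe
    # krbd on 3.10 only maps images whose stripe unit equals the object
    # size and whose stripe count is 1 (i.e. the defaults)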

I also found an email where Sage mentioned that a commit
(764684ef34af685cd8d46830a73826443f9129df) should resolve such a problem, but I
couldn't find that commit's details. Does anyone know more about it?

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-June/031897.html

Thanks a lot!

Regards,
Zhi (David)  


[ceph-users] The strategy of auto-restarting crashed OSD

2014-11-12 Thread David Z
Hi Guys,

We are experiencing some OSD crashing issues recently, like messenger crashes
and some strange crashes (still being investigated), etc. Those crashes seem
not to reproduce after restarting the OSD.

So we are thinking about a strategy of auto-restarting a crashed OSD 1 or 2
times, then leaving it down if restarting doesn't work (see the sketch below).
This strategy might help us limit the impact of PG peering and recovery on
online traffic to some extent, since we won't mark an OSD out automatically
even if it is down, unless we are sure it is a disk failure.
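
(A sketch of that watchdog idea, assuming sysvinit-style init scripts; the osd
id is a placeholder, and note that ceph osd set noout applies cluster-wide.)

    # keep the cluster from marking down OSDs out automatically
    ceph osd set noout

    # restart a crashed OSD up to 2 times, then leave it down
    OSD=osd.12    # placeholder id
    for attempt in 1 2; do
        service ceph start $OSD
        sleep 30
        service ceph status $OSD && break    # it stayed up, stop retrying
    done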

However, we are also aware that this strategy may bring us some problems.
Since you guys have more experience with Ceph, we would like to hear some
suggestions from you.

Thanks.

David Zhang  