Re: [ceph-users] Parallel reads with CephFS

2016-12-07 Thread Andreas Gerstmayr
Hi, thanks for your response. Yes, I ran my benchmark tests with a striping configuration of 1 MB stripe unit, 10 stripe count and 10 MB object size. Therefore when reading with a blocksize of 10 MB, 10 stripe units in 10 different objects could be read in parallel [1]. With an (artificially large
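For reference, the layout described above maps onto the virtual extended attributes documented at http://docs.ceph.com/docs/jewel/cephfs/file-layouts/. A minimal sketch of setting such a layout, assuming a hypothetical kernel mount at /mnt/cephfs and a test directory named bench (newly created files underneath inherit it):

  setfattr -n ceph.dir.layout.stripe_unit -v 1048576  /mnt/cephfs/bench   # 1 MB stripe unit
  setfattr -n ceph.dir.layout.stripe_count -v 10      /mnt/cephfs/bench   # 10 stripes
  setfattr -n ceph.dir.layout.object_size -v 10485760 /mnt/cephfs/bench   # 10 MB objects
  getfattr -n ceph.dir.layout /mnt/cephfs/bench                           # verify the layout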

Re: [ceph-users] Parallel reads with CephFS

2016-12-07 Thread Goncalo Borges
Hi... Are you actually playing with file layouts? http://docs.ceph.com/docs/jewel/cephfs/file-layouts/ By increasing the stripe count and tuning the stripe unit to your application block size, you may see an increase in performance. Cheers Goncalo On 12/08/2016 12:45 PM, Andreas Ger

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-07 Thread Aravind Ramesh
Yes, I just verified it again. I can create the rbd image with a data-pool (EC pool), but the "rbd feature disable" command fails against it. ems@rack9-ems-5:~/ec-rbd/master/ceph/build$ ceph -s cluster f791d144-c685-4e5c-bbff-d569930befc4 health HEALTH_OK monmap e2: 1 mons at {a=127.0.0.1:4/0}
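For context, the sequence being attempted (as far as it can be reconstructed from this thread) would look roughly like the following; the pool and image names are made up for illustration:

  rbd create --size 10G --data-pool ec_pool rbd/ec_image   # image metadata in 'rbd', data objects in the EC pool
  rbd feature disable rbd/ec_image exclusive-lock object-map fast-diff deep-flatten
  rbd map rbd/ec_image                                      # krbd map, which is what fails here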

[ceph-users] Parallel reads with CephFS

2016-12-07 Thread Andreas Gerstmayr
Hi, does the CephFS kernel module (as of kernel version 4.8.8) support parallel reads of file stripes? When an application requests a 500MB block from a file (which is split into multiple objects and stripes on different OSDs) at once, does the CephFS kernel client request these blocks in paral

Re: [ceph-users] node and its OSDs down...

2016-12-07 Thread Brad Hubbard
On Wed, Dec 7, 2016 at 9:11 PM, M Ranga Swami Reddy wrote: > That's right... > But my question was: when an OSD is down, will all data be moved from the > downed OSD to other OSDs - is this correct? > No, only after it is marked out. "If an OSD is down and the degraded condition persists, Ceph may ma
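To make the down/out distinction concrete, a small sketch (the OSD id and the interval value are only examples): by default the monitors mark a down OSD out automatically after mon osd down out interval, and only then does backfill move its data.

  # ceph.conf, [mon] section - how long a down OSD may stay down before being marked out
  mon osd down out interval = 600

  # mark an OSD out (or back in) by hand
  ceph osd out 12
  ceph osd in 12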

Re: [ceph-users] 10.2.4 Jewel released -- IMPORTANT

2016-12-07 Thread Francois Lafont
On 12/08/2016 12:38 AM, Gregory Farnum wrote: > Yep! Ok, thanks for the confirmations Greg. Bye.

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Gregory Farnum
On Wed, Dec 7, 2016 at 3:11 PM, Sage Weil wrote: > On Thu, 8 Dec 2016, Ruben Kerkhof wrote: >> On Wed, Dec 7, 2016 at 11:58 PM, Samuel Just wrote: >> > Actually, Greg and Sage are working up other branches, nvm. >> > -Sam >> >> Ok, I'll hold. If the issue is in the SimpleMessenger, would it be >>

Re: [ceph-users] 10.2.4 Jewel released -- IMPORTANT

2016-12-07 Thread Gregory Farnum
On Wed, Dec 7, 2016 at 3:18 PM, Francois Lafont wrote: > On 12/08/2016 12:06 AM, Sage Weil wrote: > >> Please hold off on upgrading to this release. It triggers a bug in >> SimpleMessenger that causes threads for broken connections to spin, eating >> CPU. >> >> We're making sure we understand the

Re: [ceph-users] 10.2.4 Jewel released -- IMPORTANT

2016-12-07 Thread Francois Lafont
On 12/08/2016 12:06 AM, Sage Weil wrote: > Please hold off on upgrading to this release. It triggers a bug in > SimpleMessenger that causes threads for broken connections to spin, eating > CPU. > > We're making sure we understand the root cause and preparing a fix. Waiting for the fix and its

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Ruben Kerkhof
Hi Gregory, On Thu, Dec 8, 2016 at 12:10 AM, Gregory Farnum wrote: > In slightly more detail: you are clearly seeing a problem with the > messenger, as indicated by the sock_recvmsg at the top of the CPU > usage list. We've seen this elsewhere very rarely, which is why > there's already a backpor

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Ruben Kerkhof
Hi Sage, On Thu, Dec 8, 2016 at 12:11 AM, Sage Weil wrote: > We haven't verified that all the fixes are backported to jewel or tested > it on jewel, so I wouldn't recommend it. Ok, thanks for letting me know. > > I have a branch that cherry-picks the fix on top of 10.2.4. It is > building now

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Sage Weil
On Thu, 8 Dec 2016, Ruben Kerkhof wrote: > On Wed, Dec 7, 2016 at 11:58 PM, Samuel Just wrote: > > Actually, Greg and Sage are working up other branches, nvm. > > -Sam > > Ok, I'll hold. If the issue is in the SimpleMessenger, would it be > safe to switch to ms type = async as a workaround? > I h

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Gregory Farnum
On Wed, Dec 7, 2016 at 2:58 PM, Samuel Just wrote: > Actually, Greg and Sage are working up other branches, nvm. > -Sam > > On Wed, Dec 7, 2016 at 2:52 PM, Samuel Just wrote: >> I just pushed a branch wip-14120-10.2.4 with a possible fix. >> >> https://github.com/ceph/ceph/pull/12349/ is a fix fo

Re: [ceph-users] 10.2.4 Jewel released -- IMPORTANT

2016-12-07 Thread Sage Weil
Hi everyone, Please hold off on upgrading to this release. It triggers a bug in SimpleMessenger that causes threads for broken connections to spin, eating CPU. We're making sure we understand the root cause and preparing a fix. Thanks! sage On Wed, 7 Dec 2016, Abhishek L wrote: > This po

[ceph-users] Change ownership of objects

2016-12-07 Thread Erik McCormick
Hello everyone, I am running Ceph (firefly) Radosgw integrated with Openstack Keystone. Recently we built a whole new Openstack cloud and created users in that cluster. The names were the same, but the UUIDs are not. Both clouds are using the same Ceph cluster with their own RGW. I have managed
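One approach that is often suggested for this situation is to relink the buckets to the newly created users rather than touching individual objects; a rough sketch, with bucket and uid placeholders invented for the example (object ACLs may still reference the old owner, depending on the radosgw version):

  radosgw-admin bucket link --bucket=mybucket --uid=new-keystone-uuid   # reassign bucket ownership
  radosgw-admin bucket stats --bucket=mybucket                          # confirm the new owner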

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Ruben Kerkhof
On Wed, Dec 7, 2016 at 11:58 PM, Samuel Just wrote: > Actually, Greg and Sage are working up other branches, nvm. > -Sam Ok, I'll hold. If the issue is in the SimpleMessenger, would it be safe to switch to ms type = async as a workaround? I heard that it will become the default in Kraken, but how
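For reference, the workaround being asked about would amount to something like the following in ceph.conf, followed by a daemon restart; whether it is actually safe on 10.2.4 is exactly the open question here:

  [global]
      ms type = async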

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Graham Allan
On 12/07/2016 04:20 PM, Francois Lafont wrote: On 12/07/2016 11:16 PM, Steve Taylor wrote: I'm seeing the same behavior with very similar perf top output. One server with 32 OSDs has a load average approaching 800. No excessive memory usage and no iowait at all. Exactly! And another inter

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Ruben Kerkhof
Hi Samuel, On Wed, Dec 7, 2016 at 11:52 PM, Samuel Just wrote: > I just pushed a branch wip-14120-10.2.4 with a possible fix. > > https://github.com/ceph/ceph/pull/12349/ is a fix for a known bug > which didn't quite make it into 10.2.4, it's possible that > 165e5abdbf6311974d4001e43982b83d06f9e0

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Samuel Just
Actually, Greg and Sage are working up other branches, nvm. -Sam On Wed, Dec 7, 2016 at 2:52 PM, Samuel Just wrote: > I just pushed a branch wip-14120-10.2.4 with a possible fix. > > https://github.com/ceph/ceph/pull/12349/ is a fix for a known bug > which didn't quite make it into 10.2.4, it's p

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Samuel Just
I just pushed a branch wip-14120-10.2.4 with a possible fix. https://github.com/ceph/ceph/pull/12349/ is a fix for a known bug which didn't quite make it into 10.2.4; it's possible that 165e5abdbf6311974d4001e43982b83d06f9e0cc, which did, made the bug much more likely to happen. wip-14120-10.2.4 ha

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Ruben Kerkhof
On Wed, Dec 7, 2016 at 11:33 PM, Ruben Kerkhof wrote: >> And another possibly interesting piece of information: I have a ceph-osd process with >> big CPU load (as Steve said, no iowait and no excessive memory usage). If I >> restart the ceph-osd daemon, the CPU load becomes OK for exactly 15 minutes >> for me

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Ruben Kerkhof
On Wed, Dec 7, 2016 at 11:37 PM, Francois Lafont wrote: > On 12/07/2016 11:33 PM, Ruben Kerkhof wrote: > >> Thanks, I'll check how long it takes for this to happen on my cluster. >> >> I did just pause scrub and deep-scrub. Are there scrubs running on >> your cluster now by any chance? > > Yes but

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Steve Taylor
Not a single scrub in my case. Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation 380 Data Drive Suite 300 | Draper | Utah | 84020 Offic

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Francois Lafont
On 12/07/2016 11:33 PM, Ruben Kerkhof wrote: > Thanks, I'll check how long it takes for this to happen on my cluster. > > I did just pause scrub and deep-scrub. Are there scrubs running on > your cluster now by any chance? Yes, but normally not currently, because I have: osd scrub begin hour =
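For anyone following along, the knobs being referred to look roughly like this (the hours and the use of the cluster-wide flags are only examples, not the poster's actual settings):

  # ceph.conf, [osd] section - confine scrubbing to a time window
  osd scrub begin hour = 0
  osd scrub end hour = 6

  # or pause scrubbing cluster-wide while debugging
  ceph osd set noscrub
  ceph osd set nodeep-scrub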

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Ruben Kerkhof
On Wed, Dec 7, 2016 at 11:20 PM, Francois Lafont wrote: > On 12/07/2016 11:16 PM, Steve Taylor wrote: >> I'm seeing the same behavior with very similar perf top output. One server >> with 32 OSDs has a load average approaching 800. No excessive memory usage >> and no iowait at all. > > Exactly!

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Ruben Kerkhof
On Wed, Dec 7, 2016 at 11:16 PM, Steve Taylor wrote: > > I'm seeing the same behavior with very similar perf top output. One server > with 32 OSDs has a load average approaching 800. No excessive memory usage > and no iowait at all. Also seeing AVCs, so I guess I'll switch to permissive mode f

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Francois Lafont
On 12/07/2016 11:16 PM, Steve Taylor wrote: > I'm seeing the same behavior with very similar perf top output. One server > with 32 OSDs has a load average approaching 800. No excessive memory usage > and no iowait at all. Exactly! And another possibly interesting piece of information: I have a ceph-osd pro

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Steve Taylor
I'm seeing the same behavior with very similar perf top output. One server with 32 OSDs has a load average approaching 800. No excessive memory usage and no iowait at all. Steve Taylor | S

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Ruben Kerkhof
On Wed, Dec 7, 2016 at 8:46 PM, Francois Lafont wrote: > Hi, > > On 12/07/2016 01:21 PM, Abhishek L wrote: > >> This point release fixes several important bugs in RBD mirroring, RGW >> multi-site, CephFS, and RADOS. >> >> We recommend that all v10.2.x users upgrade. Also note the following when >

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
> On 7 December 2016 at 21:43, "Will.Boege" wrote: > > > Thanks for the explanation. I guess this case you outlined explains why the > Ceph developers chose to make this a ‘safe’ default. > > 2 osds are transiently down and the third fails hard. The PGs on the 3rd osd > with no more replic

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
> On 7 December 2016 at 21:18, Kevin Olbrich wrote: > > > Is Ceph accepting this OSD if the other (newer) replica is down? > In this case I would assume that my cluster is instantly broken when rack > _after_ rack fails (power outage) and I just start in random order. > We have at least one MO

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-07 Thread Christian Theune
Hi, > On 7 Dec 2016, at 14:39, Peter Maloney > wrote: > > On 12/07/16 13:52, Christian Balzer wrote: >> On Wed, 7 Dec 2016 12:39:11 +0100 Christian Theune wrote: >> >> | cartman06 ~ # fio --filename=/dev/sdl --direct=1 --sync=1 --rw=write >> --bs=128k --numjobs=1 --iodepth=1 --runtime=60 --ti

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Will . Boege
Thanks for the explanation. I guess this case you outlined explains why the Ceph developers chose to make this a ‘safe’ default. 2 osds are transiently down and the third fails hard. The PGs on the 3rd osd with no more replicas are marked unfound. You bring up 1 and 2 and these PGs will remai

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-07 Thread Jason Dillaman
I cannot recreate that "rbd feature disable" error using a master branch build from yesterday. Can you still reproduce this where your rbd CLI can create a data pool image but cannot access the image afterwards? As for how to run against a librbd-backed client, it depends on what your end goal is.

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Kevin Olbrich
Is Ceph accepting this OSD if the other (newer) replica is down? In this case I would assume that my cluster is instantly broken when rack _after_ rack fails (power outage) and I just start in random order. We have at least one MON on a stand-alone UPS to resolve such an issue - I just assumed this is

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
> On 7 December 2016 at 21:04, "Will.Boege" wrote: > > > Hi Wido, > > Just curious how blocking IO to the final replica provides protection from > data loss? I’ve never really understood why this is a Ceph best practice. > In my head all 3 replicas would be on devices that have roughly th

Re: [ceph-users] CephFS recovery from missing metadata objects questions

2016-12-07 Thread Wido den Hollander
> On 7 December 2016 at 20:54, John Spray wrote: > > > On Wed, Dec 7, 2016 at 7:47 PM, Wido den Hollander wrote: > > > >> On 7 December 2016 at 16:53, John Spray wrote: > >> > >> > >> On Wed, Dec 7, 2016 at 3:46 PM, Wido den Hollander wrote: > >> > > >> >> On 7 December 2016 at 16:38,

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Will . Boege
Hi Wido, Just curious how blocking IO to the final replica provides protection from data loss? I’ve never really understood why this is a Ceph best practice. In my head all 3 replicas would be on devices that have roughly the same odds of physically failing or getting logically corrupted in a

Re: [ceph-users] CephFS recovery from missing metadata objects questions

2016-12-07 Thread John Spray
On Wed, Dec 7, 2016 at 7:47 PM, Wido den Hollander wrote: > >> On 7 December 2016 at 16:53, John Spray wrote: >> >> >> On Wed, Dec 7, 2016 at 3:46 PM, Wido den Hollander wrote: >> > >> >> On 7 December 2016 at 16:38, John Spray wrote: >> >> >> >> >> >> On Wed, Dec 7, 2016 at 3:28 PM, Wido den

Re: [ceph-users] CephFS recovery from missing metadata objects questions

2016-12-07 Thread Wido den Hollander
> On 7 December 2016 at 16:53, John Spray wrote: > > > On Wed, Dec 7, 2016 at 3:46 PM, Wido den Hollander wrote: > > > >> On 7 December 2016 at 16:38, John Spray wrote: > >> > >> > >> On Wed, Dec 7, 2016 at 3:28 PM, Wido den Hollander wrote: > >> > (I think John knows the answer, but sendi

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-07 Thread Aravind Ramesh
Thanks Nick, I have not tried using rbd-nbd, I will give it a try. The rbd mapping is failing for the image which was created with the --data-pool option, so I can't run fio or any IO on it. Aravind From: Nick Fisk [mailto:n...@fisk.me.uk] Sent: Wednesday, December 07, 2016 6:23 PM To: Aravind Ramesh

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-07 Thread Aravind Ramesh
Thanks Jason. I am running a development cluster (using vstart), so all the running OSDs are running the code from Dec 6th(master branch). Could you please point me in the direction of how to run it on librbd-backed clients ? Aravind -Original Message- From: Jason Dillaman [mailto:jdi

Re: [ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Francois Lafont
Hi, On 12/07/2016 01:21 PM, Abhishek L wrote: > This point release fixes several important bugs in RBD mirroring, RGW > multi-site, CephFS, and RADOS. > > We recommend that all v10.2.x users upgrade. Also note the following when > upgrading from hammer Well... little warning: after upgrade fro

Re: [ceph-users] rgw civetweb ssl official documentation?

2016-12-07 Thread Chris Jones
We terminate all of our TLS at the load-balancer. To make it simple, use HAProxy in front of your single instance. BTW, the latest versions of HAProxy can outperform expensive hardware LBs. We use both at Bloomberg. -CJ On Wed, Dec 7, 2016 at 1:44 PM, Puff, Jonathon wrote: > There are a few docu
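A minimal sketch of what such a TLS-terminating HAProxy frontend in front of a single civetweb radosgw could look like; the certificate path, port and backend address are assumptions for the example:

  frontend rgw_https
      bind *:443 ssl crt /etc/haproxy/certs/rgw.pem
      mode http
      default_backend rgw_civetweb

  backend rgw_civetweb
      mode http
      server rgw1 127.0.0.1:7480 check   # civetweb listening on its default plain-HTTP port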

[ceph-users] News on RDMA on future releases

2016-12-07 Thread German Anders
Hi all, I want to know if there's any news for future releases regarding RDMA, whether it's going to be integrated or not, since RDMA should increase IOPS performance a lot, especially at small block sizes. Thanks in advance, Best, *German*

[ceph-users] rgw civetweb ssl official documentation?

2016-12-07 Thread Puff, Jonathon
There are a few documents out there on this subject, but I can’t find anything official. Can someone point me to any official documentation for deploying this? The other alternative appears to be an HAProxy frontend. Currently running 10.2.3 with a single radosgw. -JP

[ceph-users] ceph.com Website problems

2016-12-07 Thread Sean Redmond
Looks like the ceph.com, tracker.ceph.com and download.ceph.com websites / repos are having an issue at the moment. I guess it may be related to the below: DreamCompute US-East 2 Cluster - Network connectivity issues

[ceph-users] Remove ghost "default" zone group in period map

2016-12-07 Thread piglei
Hi, I was configuring two realms in one cluster. After setting up the second realm, I found a problem: *the master zonegroup is set to a zonegroup "default", which is not what I want*. Here is the current period of the second realm rb: # radosgw-admin period get --rgw-realm=rb { "id": "18a3c0f8-c852-4
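One way this kind of stray "default" zonegroup is usually cleaned up is to delete it and commit a new period for the realm; a sketch only, to be run after double-checking that nothing actually lives in that zonegroup:

  radosgw-admin zonegroup delete --rgw-zonegroup=default
  radosgw-admin period update --commit --rgw-realm=rb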

Re: [ceph-users] CephFS recovery from missing metadata objects questions

2016-12-07 Thread John Spray
On Wed, Dec 7, 2016 at 3:46 PM, Wido den Hollander wrote: > >> On 7 December 2016 at 16:38, John Spray wrote: >> >> >> On Wed, Dec 7, 2016 at 3:28 PM, Wido den Hollander wrote: >> > (I think John knows the answer, but sending to ceph-users for archival >> > purposes) >> > >> > Hi John, >> > >>

Re: [ceph-users] CephFS recovery from missing metadata objects questions

2016-12-07 Thread Wido den Hollander
> On 7 December 2016 at 16:38, John Spray wrote: > > > On Wed, Dec 7, 2016 at 3:28 PM, Wido den Hollander wrote: > > (I think John knows the answer, but sending to ceph-users for archival > > purposes) > > > > Hi John, > > > > A Ceph cluster lost a PG with CephFS metadata in there and it is

Re: [ceph-users] CephFS recovery from missing metadata objects questions

2016-12-07 Thread John Spray
On Wed, Dec 7, 2016 at 3:28 PM, Wido den Hollander wrote: > (I think John knows the answer, but sending to ceph-users for archival > purposes) > > Hi John, > > A Ceph cluster lost a PG with CephFS metadata in there and it is currently > doing a CephFS disaster recovery as described here: > http

[ceph-users] CephFS recovery from missing metadata objects questions

2016-12-07 Thread Wido den Hollander
(I think John knows the answer, but sending to ceph-users for archival purposes) Hi John, A Ceph cluster lost a PG with CephFS metadata in there and it is currently doing a CephFS disaster recovery as described here: http://docs.ceph.com/docs/master/cephfs/disaster-recovery/ This data pool has
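For readers who land here later, the recovery procedure referenced above centres on the offline tools; a heavily abridged sketch, with the data pool name assumed to be cephfs_data (the real procedure in the linked document has more steps and caveats):

  cephfs-journal-tool journal export backup.bin      # keep a copy before touching anything
  cephfs-data-scan scan_extents cephfs_data          # rebuild file sizes/layouts from the data objects
  cephfs-data-scan scan_inodes cephfs_data           # re-link recovered inodes into the metadata pool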

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-07 Thread Jason Dillaman
The kernel krbd driver doesn't support the new RBD data pool feature. This will only function using librbd-backed clients. The error that you are seeing with "rbd feature disable" is unexpected -- any chance your OSDs are old and don't support the feature? On Wed, Dec 7, 2016 at 6:07 AM, Aravind

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
> On 7 December 2016 at 15:54, LOIC DEVULDER wrote: > > > Hi Wido, > > > As a Ceph consultant I get numerous calls throughout the year to help people > > with getting their broken Ceph clusters back online. > > > > The causes of downtime vary vastly, but one of the biggest causes is that > >

[ceph-users] CDM in ~2.5 hours

2016-12-07 Thread Patrick McGarry
Hey cephers, Just a reminder that this month's CDM is today at 12:30p EST. Sage will be hosting today while I'm traveling. If you have any questions or difficulties joining, drop them in irc/slack and we'll try to help out. Thanks.

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread LOIC DEVULDER
> -Original Message- > From: Wido den Hollander [mailto:w...@42on.com] > Sent: Wednesday, 7 December 2016 16:01 > To: ceph-us...@ceph.com; LOIC DEVULDER - U329683 > Subject: RE: [ceph-users] 2x replication: A BIG warning > > > > On 7 December 2016 at 15:54, LOIC DEVULDER > wrote: > > >

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread LOIC DEVULDER
Hi Wido, > As a Ceph consultant I get numerous calls throughout the year to help people > with getting their broken Ceph clusters back online. > > The causes of downtime vary vastly, but one of the biggest causes is that > people use replication 2x. size = 2, min_size = 1. We are building a Ceph

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Peter Maloney
On 12/07/16 14:58, Wido den Hollander wrote: >> On 7 December 2016 at 11:29, Kees Meijs wrote: >> >> >> Hi Wido, >> >> Valid point. At this moment, we're using a cache pool with size = 2 and >> would like to "upgrade" to size = 3. >> >> Again, you're absolutely right... ;-) >> >> Anyway, any thin

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
> On 7 December 2016 at 11:29, Kees Meijs wrote: > > > Hi Wido, > > Valid point. At this moment, we're using a cache pool with size = 2 and > would like to "upgrade" to size = 3. > > Again, you're absolutely right... ;-) > > Anyway, any things to consider or could we just: > > 1. Run "cep

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
> On 7 December 2016 at 10:06, Dan van der Ster wrote: > > > Hi Wido, > > Thanks for the warning. We have one pool as you described (size 2, > min_size 1), simply because 3 replicas would be too expensive and > erasure coding didn't meet our performance requirements. We are well > aware of th

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-07 Thread Peter Maloney
On 12/07/16 13:52, Christian Balzer wrote: > On Wed, 7 Dec 2016 12:39:11 +0100 Christian Theune wrote: > > | cartman06 ~ # fio --filename=/dev/sdl --direct=1 --sync=1 --rw=write > --bs=128k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting > --name=journal-test > | journal-test:
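For comparison, the same kind of journal test is commonly also run at 4k, which tends to expose sync-write weaknesses more than 128k does; a sketch reusing the parameters shown above (the device name is whatever journal SSD is under test):

  fio --filename=/dev/sdl --direct=1 --sync=1 --rw=write --bs=4k \
      --numjobs=1 --iodepth=1 --runtime=60 --time_based \
      --group_reporting --name=journal-test-4k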

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Дмитрий Глушенок
Hi, The assumptions are: - OSD nearly full - the HDD vendor is not hiding a real LSE (latent sector error) rate like 1 in 10^18 behind "not more than 1 unrecoverable error in 10^15 bits read" In case of a disk (OSD) failure, Ceph has to read a copy of the disk's data from other nodes (to restore redundancy). More yo

Re: [ceph-users] Prevent cephfs clients from mount and browsing "/"

2016-12-07 Thread Martin Palma
Thanks all for the clarification. Best, Martin On Mon, Dec 5, 2016 at 2:14 PM, John Spray wrote: > On Mon, Dec 5, 2016 at 12:35 PM, David Disseldorp wrote: >> Hi Martin, >> >> On Mon, 5 Dec 2016 13:27:01 +0100, Martin Palma wrote: >> >>> Ok, just discovered that with the fuse client, we have to

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-07 Thread Christian Balzer
Hello, On Wed, 7 Dec 2016 12:39:11 +0100 Christian Theune wrote: > Hi, > > I’m now working with the raw device and getting interesting results. > > For one, I went through all reviews about the Micron DC S610 again and as > always the devil is in the detail. I noticed that the test results a

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-07 Thread Nick Fisk
Hi Aravind, I also saw this merge on Monday and tried to create an RBD on an EC pool and also failed, although I ended up with all my OSDs crashing and refusing to restart. I'm going to rebuild the cluster and try again. Have you tried using the rbd-nbd driver or benchmarking directly
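For the record, the rbd-nbd route mentioned above would look roughly like this (pool and image names are placeholders); it keeps all the I/O inside librbd and therefore sidesteps the krbd feature gap:

  rbd-nbd map rbd/ec_image        # prints the nbd device it attached, e.g. /dev/nbd0
  fio --filename=/dev/nbd0 --direct=1 --rw=randwrite --bs=4k --iodepth=16 \
      --runtime=60 --time_based --name=nbd-test
  rbd-nbd unmap /dev/nbd0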

[ceph-users] 10.2.4 Jewel released

2016-12-07 Thread Abhishek L
This point release fixes several important bugs in RBD mirroring, RGW multi-site, CephFS, and RADOS. We recommend that all v10.2.x users upgrade. Also note the following when upgrading from hammer Upgrading from hammer - When the last hammer OSD in a cluster containing jewel

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Christian Balzer
Hello, On Wed, 7 Dec 2016 14:49:28 +0300 Дмитрий Глушенок wrote: > RAID10 will also suffer from LSE on big disks, won't it? > If LSE stands for latent sector errors, then yes, but that's not limited to large disks per se. And you counter it by having another replica and checksums like in ZFS o

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Wolfgang Link
Hi, I'm very interested in this calculation. What assumptions have you made? Network speed, OSD fill level, etc.? Thanks Wolfgang On 12/07/2016 11:16 AM, Дмитрий Глушенок wrote: > Hi, > > Let me add a little math to your warning: with an LSE rate of 1 in 10^15 on > modern 8 TB disks

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Дмитрий Глушенок
RAID10 will also suffer from LSE on big disks, won't it? > On 7 Dec 2016, at 13:35, Christian Balzer wrote: > > > > Hello, > > On Wed, 7 Dec 2016 13:16:45 +0300 Дмитрий Глушенок wrote: >> Hi, >> >> Let me add a little math to your warning: with an LSE rate of 1 in 10^15 on >> modern 8

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-07 Thread Christian Theune
Hi, I’m now working with the raw device and getting interesting results. For one, I went through all reviews about the Micron DC S610 again and as always the devil is in the detail. I noticed that the test results are quite favorable, but I didn’t previously notice the caveat (which applies to

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-07 Thread Christian Balzer
Hello, On Wed, 7 Dec 2016 09:04:37 +0100 Christian Theune wrote: > Hi, > > > On 7 Dec 2016, at 05:14, Christian Balzer wrote: > > > > Hello, > > > > On Tue, 6 Dec 2016 20:58:52 +0100 Christian Theune wrote: > > > >> Alright. We’re postponing this for now. Is that actually a more widespread

Re: [ceph-users] node and its OSDs down...

2016-12-07 Thread M Ranga Swami Reddy
That's right... But my question was: when an OSD is down, will all data be moved from the downed OSD to other OSDs - is this correct? Now, if I change the crush map to mark an OSD out, will data again be moved across the cluster? Thanks Swami On Wed, Dec 7, 2016 at 2:14 PM, 한승진 wrote: > Hi > > Because "d

[ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-07 Thread Aravind Ramesh
Hi, I am seeing this failure when I try to map an rbd device with --data-pool set to an EC pool. This is a newly merged feature; I am not sure if it is expected to work yet or if I need to do something more. The same issue is seen while mapping an rbd image from a replicated pool but after disabling the new

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Christian Balzer
Hello, On Wed, 7 Dec 2016 13:16:45 +0300 Дмитрий Глушенок wrote: > Hi, > > Let me add a little math to your warning: with an LSE rate of 1 in 10^15 on > modern 8 TB disks there is a 5.8% chance of hitting an LSE during recovery of an 8 TB > disk. So, every 18th recovery will probably fail. Similarly to RAI

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Kees Meijs
Hi Wido, Valid point. At this moment, we're using a cache pool with size = 2 and would like to "upgrade" to size = 3. Again, you're absolutely right... ;-) Anyway, any things to consider or could we just: 1. Run "ceph osd pool set cache size 3". 2. Wait for rebalancing to complete. 3. Run "c

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Дмитрий Глушенок
Hi, Let me add a little math to your warning: with an LSE rate of 1 in 10^15 on modern 8 TB disks there is a 5.8% chance of hitting an LSE during recovery of an 8 TB disk. So, every 18th recovery will probably fail. Similarly to RAID6 (two parity disks), size=3 mitigates the problem. By the way - why it is a c
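Spelled out, the arithmetic behind that figure is roughly as follows: an 8 TB disk holds about 8 x 10^12 bytes = 6.4 x 10^13 bits, so at an unrecoverable-error rate of 1 per 10^15 bits the probability of reading the whole disk cleanly is about (1 - 10^-15)^(6.4 x 10^13), which is roughly e^(-0.064), or 0.94. That means roughly a 6% chance of hitting at least one LSE per full-disk read; the exact value (such as the 5.8% quoted above) depends on how full the OSD actually is and on how the vendor states the error rate.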

Re: [ceph-users] Ceph Blog Articles

2016-12-07 Thread Nick Fisk
Does the server you are running fio from have a valid ceph auth key? Can you check if you can kernel mount an RBD on the same server? > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Sascha Vogt > Sent: 06 December 2016 14:56 > To: ceph-us

[ceph-users] where is what in use ...

2016-12-07 Thread Götz Reinicke - IT Koordinator
Hi, I started to play with our Ceph cluster, created some pools and RBDs and did some performance tests. Currently I'm trying to understand and interpret the different outputs of ceph -s, rados df etc. So far, so good. Now I was cleaning up (rbd rm ...) and still see some space used on t
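The usual places to look when the numbers don't match up after an rbd rm are the per-pool views; a short sketch (the pool name 'rbd' is just the default used as an example):

  ceph df detail          # per-pool usage and object counts
  rados df                # the same figures straight from RADOS
  rados -p rbd ls | head  # check whether any image objects are actually left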

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Dan van der Ster
Hi Wido, Thanks for the warning. We have one pool as you described (size 2, min_size 1), simply because 3 replicas would be too expensive and erasure coding didn't meet our performance requirements. We are well aware of the risks, but of course this is a balancing act between risk and cost. Anywa

Re: [ceph-users] node and its OSDs down...

2016-12-07 Thread 한승진
Hi, Because "down" and "out" are different things to the Ceph cluster. The crush map of Ceph depends on how many OSDs are in the cluster. The crush map doesn't change when OSDs are down. However, the crush map changes when the OSDs are definitively out. Data locations will also change, and therefore rebalancing starts.

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-07 Thread Christian Theune
Hi, > On 7 Dec 2016, at 09:04, Christian Theune wrote: > > I guess you’re running XFS? I’m going through code and reading up on the > specific sync behaviour of the journal. I noticed in an XFS comment that > various levels of SYNC might behave differently whether you’re going to > access a r

[ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
Hi, As a Ceph consultant I get numerous calls throughout the year to help people with getting their broken Ceph clusters back online. The causes of downtime vary vastly, but one of the biggest causes is that people use replication 2x. size = 2, min_size = 1. In 2016 the amount of cases I have
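For anyone acting on this advice, the change itself is small (the pool name below is only an example); the main cost is the rebalancing traffic while the third copies are created, so it is worth doing pool by pool:

  ceph osd pool get rbd size         # check the current replica count
  ceph osd pool set rbd size 3
  ceph osd pool set rbd min_size 2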

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-07 Thread Christian Theune
Hi, > On 7 Dec 2016, at 05:14, Christian Balzer wrote: > > Hello, > > On Tue, 6 Dec 2016 20:58:52 +0100 Christian Theune wrote: > >> Alright. We’re postponing this for now. Is that actually a more widespread >> assumption that Jewel has “prime time” issues? >> > You ask the people here runni