Re: [ceph-users] CephFS: Writes are faster than reads?

2016-09-14 Thread Gregory Farnum
Oh hrm, I missed the stripe count settings. I'm not sure if that's helping you or not; I don't have a good intuitive grasp of what readahead will do in that case. I think you may need to adjust the readahead config knob in order to make it read all those objects together instead of one or two at a
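
Which knob depends on the client; a minimal sketch with Jewel-era option names (the mount point and values are illustrative only, not recommendations):

# ceph-fuse: widen the readahead window via client config options
$ ceph-fuse /mnt/cephfs --client_readahead_max_bytes=67108864 \
    --client_readahead_max_periods=16

# kernel client: rasize sets the maximum readahead size in bytes
$ mount -t ceph mon1:6789:/ /mnt/cephfs -o name=admin,rasize=67108864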

Re: [ceph-users] CephFS: Writes are faster than reads?

2016-09-14 Thread Andreas Gerstmayr
2016-09-14 23:19 GMT+02:00 Gregory Farnum : > This is pretty standard behavior within Ceph as a whole — the journals > really help on writes; How does the journal help with large blocks? I thought the journal speedup comes from coalescing lots of small writes into bigger

Re: [ceph-users] How to associate a cephfs client id to its process

2016-09-14 Thread Heller, Chris
Ok. I’ll see about tracking down the logs (set to stderr for these tasks), and the metadata stuff looks interesting for future association. Thanks, Chris On 9/14/16, 5:04 PM, "Gregory Farnum" wrote: On Wed, Sep 14, 2016 at 7:02 AM, Heller, Chris

Re: [ceph-users] CephFS: Writes are faster than reads?

2016-09-14 Thread Gregory Farnum
This is pretty standard behavior within Ceph as a whole — the journals really help on writes; and especially with big block sizes you'll exceed the size of readahead, but writes will happily flush out in parallel. On Wed, Sep 14, 2016 at 12:51 PM, Henrik Korkuc wrote: > On

Re: [ceph-users] Cleanup old osdmaps after #13990 fix applied

2016-09-14 Thread Gregory Farnum
On Wed, Sep 14, 2016 at 7:19 AM, Dan Van Der Ster wrote: > Indeed, seems to be trimmed by osd_target_transaction_size (default 30) per > new osdmap. > Thanks a lot for your help! IIRC we had an entire separate issue before adding that field, where cleaning up from bad
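
For reference, that trim batch size can be inspected or raised at runtime; a sketch using the generic admin-socket and injectargs mechanisms (osd.0 and the value 100 are illustrative):

# check the current value on one OSD via its admin socket
$ ceph daemon osd.0 config get osd_target_transaction_size
# or raise it cluster-wide at runtime to trim more maps per update
$ ceph tell osd.* injectargs '--osd_target_transaction_size 100'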

Re: [ceph-users] How to associate a cephfs client id to its process

2016-09-14 Thread Gregory Farnum
On Wed, Sep 14, 2016 at 7:02 AM, Heller, Chris wrote: > I am making use of CephFS plus the cephfs-hadoop shim to replace HDFS in a > system I’ve been experimenting with. > > > > I’ve noticed that a large number of my HDFS clients have a ‘num_caps’ value > of 16385, as seen

Re: [ceph-users] CephFS: Writes are faster than reads?

2016-09-14 Thread Henrik Korkuc
On 16-09-14 18:21, Andreas Gerstmayr wrote: Hello, I'm currently performing some benchmark tests with our Ceph storage cluster and trying to find the bottleneck in our system. I'm writing a random 30GB file with the following command: $ time fio --name=job1 --rw=write --blocksize=1MB

Re: [ceph-users] Replacing a failed OSD

2016-09-14 Thread Jim Kilborn
Reed, Thanks for the response. Your process is the one that I ran. However, I have a crushmap with ssd and sata drives in different buckets (host made up of host types, with an ssd and a spinning hosttype for each host) because I am using ssd drives for a replicated cache in front of an

Re: [ceph-users] Designing ceph cluster

2016-09-14 Thread Gaurav Goyal
Dear Ceph Users, I need your help to sort out the following issue with my cinder volume. I had set up Ceph as the backend for Cinder. Since I was using SAN storage for Ceph and wanted to get rid of it, I completely uninstalled Ceph from my OpenStack environment. Right now I am in a situation where we

Re: [ceph-users] Replacing a failed OSD

2016-09-14 Thread Reed Dier
Hi Jim, This is pretty fresh in my mind so hopefully I can help you out here. Firstly, the crush map will backfill any existing holes in the enumeration. So assuming only one drive has been removed from the crush map, the new OSD will reuse the same OSD number. My steps for removing an
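
For completeness, the removal half of that procedure usually looks like this (stock ceph CLI commands; osd.12 and the systemd unit name are example values):

$ ceph osd out osd.12            # stop mapping PGs to the failed disk
$ systemctl stop ceph-osd@12     # stop the daemon on its host
$ ceph osd crush remove osd.12   # drop it from the crush map
$ ceph auth del osd.12           # delete its cephx key
$ ceph osd rm osd.12             # remove it from the osdmap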

[ceph-users] Replacing a failed OSD

2016-09-14 Thread Jim Kilborn
I am finishing testing our new cephfs cluster and wanted to document a failed osd procedure. I noticed that when I pulled a drive, to simulate a failure, and ran through the replacement steps, the osd has to be removed from the crushmap in order to initialize the new drive as the same osd

[ceph-users] CephFS: Writes are faster than reads?

2016-09-14 Thread Andreas Gerstmayr
Hello, I'm currently performing some benchmark tests with our Ceph storage cluster and trying to find the bottleneck in our system. I'm writing a random 30GB file with the following command: $ time fio --name=job1 --rw=write --blocksize=1MB --size=30GB --randrepeat=0 --end_fsync=1 [...] WRITE:
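
The matching read benchmark would presumably be the same job with the direction flipped (a sketch mirroring the write command above; it assumes the 30GB file already exists):

$ time fio --name=job1 --rw=read --blocksize=1MB --size=30GB --randrepeat=0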

Re: [ceph-users] Consistency problems when taking RBD snapshot

2016-09-14 Thread Ilya Dryomov
On Wed, Sep 14, 2016 at 3:30 PM, Nikolay Borisov wrote: > > > On 09/14/2016 02:55 PM, Ilya Dryomov wrote: >> On Wed, Sep 14, 2016 at 9:01 AM, Nikolay Borisov wrote: >>> >>> >>> On 09/14/2016 09:55 AM, Adrian Saul wrote: I found I could ignore the XFS

Re: [ceph-users] Cleanup old osdmaps after #13990 fix applied

2016-09-14 Thread Dan Van Der Ster
Indeed, seems to be trimmed by osd_target_transaction_size (default 30) per new osdmap. Thanks a lot for your help! -- Dan > On 14 Sep 2016, at 15:49, Steve Taylor wrote: > > I think it's a maximum of 30 maps per osdmap update. So if you've got huge > caches

Re: [ceph-users] cephfs/ceph-fuse: mds0: Client XXX:XXXfailingtorespondto capability release

2016-09-14 Thread Burkhard Linke
Hi, My cluster is back to HEALTH_OK, the involved host has been restarted by the user. But I will debug some more on the host when I see this issue again. PS: For completeness, I've stated that this issue was often seen in my current Jewel environment, I meant to say that this

[ceph-users] How to associate a cephfs client id to its process

2016-09-14 Thread Heller, Chris
I am making use of CephFS plus the cephfs-hadoop shim to replace HDFS in a system I’ve been experimenting with. I’ve noticed that a large number of my HDFS clients have a ‘num_caps’ value of 16385, as seen when running ‘session ls’ on the active mds. This appears to be one larger than the
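
The session-to-process mapping can be dug out of the MDS admin socket; a sketch, assuming an mds named 'a':

$ ceph daemon mds.a session ls
# each session entry carries "id", "num_caps" and a "client_metadata"
# block; newer clients report their hostname and mount point there,
# which you can match back against processes on that host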

Re: [ceph-users] Cleanup old osdmaps after #13990 fix applied

2016-09-14 Thread Steve Taylor
I think it's a maximum of 30 maps per osdmap update. So if you've got huge caches like we had, then you might have to generate a lot of updates to get things squared away. That's what I did, and it worked really well.

Re: [ceph-users] Consistency problems when taking RBD snapshot

2016-09-14 Thread Nikolay Borisov
On 09/14/2016 02:55 PM, Ilya Dryomov wrote: > On Wed, Sep 14, 2016 at 9:01 AM, Nikolay Borisov wrote: >> >> >> On 09/14/2016 09:55 AM, Adrian Saul wrote: >>> >>> I found I could ignore the XFS issues and just mount it with the >>> appropriate options (below from my backup

Re: [ceph-users] Cleanup old osdmaps after #13990 fix applied

2016-09-14 Thread Dan Van Der Ster
Hi Steve, Thanks, that sounds promising. Are only a limited number of maps trimmed for each new osdmap generated? If so, I'll generate a bit of churn to get these cleaned up. -- Dan > On 14 Sep 2016, at 15:08, Steve Taylor wrote: > >
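
Since any osdmap change bumps the epoch, one hedged way to generate that churn is to toggle a harmless cluster flag in a loop (a sketch of the idea, not an official procedure):

$ for i in $(seq 1 100); do ceph osd set noout; ceph osd unset noout; done
$ ceph osd stat    # watch the epoch climb while old maps get trimmed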

Re: [ceph-users] Cleanup old osdmaps after #13990 fix applied

2016-09-14 Thread Steve Taylor
http://tracker.ceph.com/issues/13990 was created by a colleague of mine from an issue that was affecting us in production. When 0.94.8 was released with the fix, I immediately deployed a test cluster on 0.94.7, reproduced this issue, upgraded to 0.94.8, and tested the fix. It worked

Re: [ceph-users] cephfs/ceph-fuse: mds0: Client XXX:XXX failingtorespond to capability release

2016-09-14 Thread Dennis Kramer (DT)
Hi Burkhard, Thank you for your reply, see inline: On Wed, 14 Sep 2016, Burkhard Linke wrote: Hi, On 09/14/2016 12:43 PM, Dennis Kramer (DT) wrote: Hi Goncalo, Thank you. Yes, I have seen that thread, but I have no near-full osds and my mds cache size is pretty high. You can use the

Re: [ceph-users] Consistency problems when taking RBD snapshot

2016-09-14 Thread Ilya Dryomov
On Wed, Sep 14, 2016 at 9:01 AM, Nikolay Borisov wrote: > > > On 09/14/2016 09:55 AM, Adrian Saul wrote: >> >> I found I could ignore the XFS issues and just mount it with the appropriate >> options (below from my backup scripts): >> >> # >> # Mount with nouuid

Re: [ceph-users] Scrub and deep-scrub repeating over and over

2016-09-14 Thread Arvydas Opulskis
Hi, in case someone hits the same problem, try to: stop scrubbing by setting the "no scrub" and "no deep-scrub" flags; wait until the running scrubs end; restart the monitors (one by one); restart the OSD servers (I restarted all three of them because it was a small cluster, but it may not be necessary to restart all
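
In command form, the scrub-flag part of that recipe would be (standard cluster flags; remember to unset them afterwards):

$ ceph osd set noscrub
$ ceph osd set nodeep-scrub
# ... wait for running scrubs to finish, restart mons and OSD hosts ...
$ ceph osd unset noscrub
$ ceph osd unset nodeep-scrub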

Re: [ceph-users] cephfs/ceph-fuse: mds0: Client XXX:XXX failing to respond to capability release

2016-09-14 Thread Dennis Kramer (DT)
Hi Goncalo, Thank you. Yes, I have seen that thread, but I have no near-full osds and my mds cache size is pretty high. On Wed, 14 Sep 2016, Goncalo Borges wrote: Hi Dennis Have you checked http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-February/007207.html ? The issue there was
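
Both suspects are quick to rule out from the CLI; a sketch (the mds name 'a' is an assumption for illustration):

$ ceph health detail | grep -i full              # any near-full OSDs?
$ ceph daemon mds.a config get mds_cache_size    # configured inode cap
$ ceph daemon mds.a perf dump mds                # compare the "inodes" counter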

[ceph-users] Seeking your feedback on the Ceph monitoring and management functionality in openATTIC

2016-09-14 Thread Lenz Grimmer
Hi, if you're running a Ceph cluster and would be interested in trying out a new tool for managing/monitoring it, we've just released version 2.0.14 of openATTIC that now provides a first implementation of a cluster monitoring dashboard. This is work in progress, but we'd like to solicit your

[ceph-users] Cleanup old osdmaps after #13990 fix applied

2016-09-14 Thread Dan Van Der Ster
Hi, We've just upgraded to 0.94.9, so I believe this issue is fixed: http://tracker.ceph.com/issues/13990 AFAICT "resolved" means the number of osdmaps saved on each OSD will not grow unboundedly anymore. However, we have many OSDs with loads of old osdmaps, e.g.: # pwd
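
To gauge how many stale maps an OSD is still carrying, counting the osdmap objects under the FileStore meta directory gives a rough figure (default path; ceph-0 is an example; incremental maps use a different object prefix):

$ find /var/lib/ceph/osd/ceph-0/current/meta -name 'osdmap*' | wc -l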

Re: [ceph-users] RadosGW index-sharding on Jewel

2016-09-14 Thread Henrik Korkuc
as far as I noticed, after doing zone/region changes you need to run "radosgw-admin period update --commit" for them to take effect On 16-09-14 11:22, Ansgar Jazdzewski wrote: Hi, I currently set up my new test cluster (Jewel) and found out the index sharding configuration had changed? What I did so

Re: [ceph-users] RadosGW index-sharding on Jewel

2016-09-14 Thread Yoann Moulin
Hello, > I currently set up my new test cluster (Jewel) and found out the index > sharding configuration had changed? > > What I did so far: > 1. radosgw-admin realm create --rgw-realm=default --default > 2. radosgw-admin zonegroup get --rgw-zonegroup=default > zonegroup.json > 3. changed value

[ceph-users] RadosGW index-sharding on Jewel

2016-09-14 Thread Ansgar Jazdzewski
Hi, I currently set up my new test cluster (Jewel) and found out the index sharding configuration had changed? What I did so far: 1. radosgw-admin realm create --rgw-realm=default --default 2. radosgw-admin zonegroup get --rgw-zonegroup=default > zonegroup.json 3. changed value "bucket_index_max_shards":
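
Combining this with Henrik's note above about committing the period, the whole sequence would presumably be (a sketch following the realm/zonegroup names used in the post):

$ radosgw-admin realm create --rgw-realm=default --default
$ radosgw-admin zonegroup get --rgw-zonegroup=default > zonegroup.json
# edit zonegroup.json and set "bucket_index_max_shards" to the desired value
$ radosgw-admin zonegroup set --rgw-zonegroup=default < zonegroup.json
$ radosgw-admin period update --commit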

Re: [ceph-users] cephfs/ceph-fuse: mds0: Client XXX:XXX failing to respond to capability release

2016-09-14 Thread Goncalo Borges
Hi Dennis Have you checked http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-February/007207.html ? The issue there was some near full osd blocking IO. Cheers G. From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Dennis Kramer

[ceph-users] cephfs/ceph-fuse: mds0: Client XXX:XXX failing to respond to capability release

2016-09-14 Thread Dennis Kramer (DBS)
Hi All, Running Ubuntu 16.04, with version JEWEL ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374) In our environment we are running cephfs and our clients are connecting through ceph-fuse. Since I have upgraded from Hammer to Jewel I was haunted by the ceph-fuse segfaults, which

Re: [ceph-users] Consistency problems when taking RBD snapshot

2016-09-14 Thread Adrian Saul
> But shouldn't freezing the fs and doing a snapshot constitute a "clean > unmount" hence no need to recover on the next mount (of the snapshot) - > Ilya? It's what I thought as well, but XFS seems to want to replay the log on mount regardless, and writes to the device to do so. This

Re: [ceph-users] Consistency problems when taking RBD snapshot

2016-09-14 Thread Nikolay Borisov
On 09/14/2016 09:55 AM, Adrian Saul wrote: > > I found I could ignore the XFS issues and just mount it with the appropriate > options (below from my backup scripts): > > # > # Mount with nouuid (conflicting XFS) and norecovery (ro snapshot) > # > if ! mount -o

Re: [ceph-users] Consistency problems when taking RBD snapshot

2016-09-14 Thread Adrian Saul
I found I could ignore the XFS issues and just mount it with the appropriate options (below from my backup scripts): # # Mount with nouuid (conflicting XFS) and norecovery (ro snapshot) # if ! mount -o ro,nouuid,norecovery $SNAPDEV /backup${FS}; then
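
For context, the surrounding snapshot-backup flow would look roughly like this (a sketch assuming the kernel RBD client; the pool, image and paths are placeholders):

$ fsfreeze --freeze /mnt/data                 # quiesce the live filesystem
$ rbd snap create rbd/myimage@backup          # take the point-in-time snapshot
$ fsfreeze --unfreeze /mnt/data
$ SNAPDEV=$(rbd map rbd/myimage@backup)       # snapshots map read-only
$ if ! mount -o ro,nouuid,norecovery "$SNAPDEV" /backup/myimage; then
      echo "mount failed" >&2
  fi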