[ceph-users] rados mkpool fails, but not ceph osd pool create

2014-11-11 Thread Gauvain Pocentek
Hi all, I'm facing a problem on a ceph deployment. rados mkpool always fails: # rados -n client.admin mkpool test error creating pool test: (2) No such file or directory rados lspool and rmpool commands work just fine, and the following also works: # ceph osd pool create test 128 128 pool 't
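
For reference, a minimal sketch of the workaround described in the post (pool name and PG counts are only examples):

    # create the pool via the mon command instead of the rados tool
    ceph osd pool create test 128 128
    # confirm it exists and is usable
    rados lspools
    rados -p test put testobj /etc/hosts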

Re: [ceph-users] Triggering shallow scrub on OSD where scrub is already in progress

2014-11-11 Thread Mallikarjun Biradar
Hi Greg, I am using 0.86, referring to osd logs to check scrub behaviour. Please have a look at the log snippet from the osd log ##Triggered scrub on osd.10---> 2014-11-12 16:24:21.393135 7f5026f31700 0 log_channel(default) log [INF] : 0.4 scrub ok 2014-11-12 16:24:24.393586 7f5026f31700 0 log_channel(de
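
For anyone reproducing this, a rough sketch of how a scrub can be triggered and watched (osd id and pg id are taken from the log snippet above; log path assumes a default install):

    # ask osd.10 to scrub (shallow) or deep-scrub
    ceph osd scrub 10
    ceph osd deep-scrub 10
    # a single PG can also be targeted
    ceph pg scrub 0.4
    # then follow the OSD log for the "scrub ok" lines
    tail -f /var/log/ceph/ceph-osd.10.log | grep scrub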

[ceph-users] Log reading/how do I tell what an OSD is trying to connect to

2014-11-11 Thread Scott Laird
I'm having a problem with my cluster. It's running 0.87 right now, but I saw the same behavior with 0.80.5 and 0.80.7. The problem is that my logs are filling up with "replacing existing (lossy) channel" log lines (see below), to the point where I'm filling drives to 100% almost daily just with l

[ceph-users] v0.88 released

2014-11-11 Thread Sage Weil
This is the first development release after Giant. The two main features merged this round are the new AsyncMessenger (an alternative implementation of the network layer) from Haomai Wang at UnitedStack, and support for POSIX file locks in ceph-fuse and libcephfs from Yan, Zheng. There is also a

Re: [ceph-users] Federated gateways

2014-11-11 Thread Craig Lewis
> > I see you're running 0.80.5. Are you using Apache 2.4? There is a known > issue with Apache 2.4 on the primary and replication. It's fixed, just > waiting for the next firefly release. Although, that causes 40x errors > with Apache 2.4, not 500 errors. > > It is apache 2.4, but I’m actually

Re: [ceph-users] Deep scrub, cache pools, replica 1

2014-11-11 Thread Christian Balzer
On Tue, 11 Nov 2014 10:21:49 -0800 Gregory Farnum wrote: > On Mon, Nov 10, 2014 at 10:58 PM, Christian Balzer wrote: > > > > Hello, > > > > One of my clusters has become busy enough (I'm looking at you, evil > > Window VMs that I shall banish elsewhere soon) to experience client > > noticeable pe

Re: [ceph-users] pg's stuck for 4-5 days after reaching backfill_toofull

2014-11-11 Thread JIten Shah
I agree. This was just our brute-force method on our test cluster. We won't do this on the production cluster. --Jiten On Nov 11, 2014, at 2:11 PM, cwseys wrote: > 0.5 might be too much. All the PGs squeezed off of one OSD will need to be > stored on another. The fewer you move the less likely

Re: [ceph-users] Typical 10GbE latency

2014-11-11 Thread Robert LeBlanc
Is this with an 8192 byte payload? Theoretical transfer time at 1 Gbps (you are only sending one packet so LACP won't help) in one direction is 0.061 ms, double that and you are at 0.122 ms of bits in flight, then there is context switching, switch latency (store and forward assumed for 1 Gbps), etc wh

Re: [ceph-users] pg's stuck for 4-5 days after reaching backfill_toofull

2014-11-11 Thread cwseys
0.5 might be too much. All the PGs squeezed off of one OSD will need to be stored on another. The fewer you move the less likely a different OSD will become toofull. Better to adjust in small increments as Craig suggested. Chad.

Re: [ceph-users] pg's stuck for 4-5 days after reaching backfill_toofull

2014-11-11 Thread JIten Shah
Actually there were 100’s that were too full. We manually set the OSD weights to 0.5 and it seems to be recovering. Thanks for the tips on crush reweight. I will look into it. —Jiten On Nov 11, 2014, at 1:37 PM, Craig Lewis wrote: > How many OSDs are nearfull? > > I've seen Ceph want two toof

Re: [ceph-users] Federated gateways

2014-11-11 Thread Aaron Bassett
> On Nov 11, 2014, at 4:21 PM, Craig Lewis wrote: > > Is that radosgw log from the primary or the secondary zone? Nothing in that > log jumps out at me. This is the log from the secondary zone. That HTTP 500 response code coming back is the only problem I can find. There are a bunch of 404s f

Re: [ceph-users] pg's stuck for 4-5 days after reaching backfill_toofull

2014-11-11 Thread Craig Lewis
How many OSDs are nearfull? I've seen Ceph want two toofull OSDs to swap PGs. In that case, I dynamically raised mon_osd_nearfull_ratio and osd_backfill_full_ratio a bit, then put it back to normal once the scheduling deadlock finished. Keep in mind that ceph osd reweight is temporary. If you m
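
As a rough sketch of the temporary bump Craig describes (the 0.90 value is an assumption; 0.85 was the default for both settings at the time, and mon.a is just an example monitor id):

    # relax the thresholds a bit while the backfill deadlock clears
    ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.90'
    ceph tell mon.a injectargs '--mon-osd-nearfull-ratio 0.90'   # repeat per monitor
    # put them back once backfill has drained
    ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.85'
    ceph tell mon.a injectargs '--mon-osd-nearfull-ratio 0.85'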

Re: [ceph-users] pg's stuck for 4-5 days after reaching backfill_toofull

2014-11-11 Thread JIten Shah
Thanks Chad. It seems to be working. —Jiten On Nov 11, 2014, at 12:47 PM, Chad Seys wrote: > Find out which OSD it is: > > ceph health detail > > Squeeze blocks off the affected OSD: > > ceph osd reweight OSDNUM 0.8 > > Repeat with any OSD which becomes toofull. > > Your cluster is only ab

Re: [ceph-users] Federated gateways

2014-11-11 Thread Craig Lewis
Is that radosgw log from the primary or the secondary zone? Nothing in that log jumps out at me. I see you're running 0.80.5. Are you using Apache 2.4? There is a known issue with Apache 2.4 on the primary and replication. It's fixed, just waiting for the next firefly release. Although, that

Re: [ceph-users] pg's stuck for 4-5 days after reaching backfill_toofull

2014-11-11 Thread Chad Seys
Find out which OSD it is: ceph health detail Squeeze blocks off the affected OSD: ceph osd reweight OSDNUM 0.8 Repeat with any OSD which becomes toofull. Your cluster is only about 50% used, so I think this will be enough. Then when it finishes, allow data back on OSD: ceph osd reweight OSDN
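
Put together, the sequence Chad describes looks roughly like this (osd id 12 is only an example):

    # identify the toofull / nearfull OSDs
    ceph health detail
    # push PGs off an affected OSD (repeat for any OSD that becomes toofull)
    ceph osd reweight 12 0.8
    # watch recovery progress
    ceph -w
    # once backfill completes, allow data back onto the OSD
    ceph osd reweight 12 1.0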

[ceph-users] Not finding systemd files in Giant CentOS7 packages

2014-11-11 Thread Robert LeBlanc
I was trying to get systemd to bring up the monitor using the new systemd files in Giant. However, I'm not finding the systemd files included in the CentOS 7 packages. Are they missing or am I confused about how it should work? ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578) Installed
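
If the unit files really are absent from the EL7 packages, one possible stopgap (an assumption about the packaging, not a confirmed answer) is the sysvinit script the RPMs ship, driven through service; the mon id being the short hostname is also an assumption about the deployment:

    sudo service ceph start mon.$(hostname -s)
    sudo service ceph status mon.$(hostname -s)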

Re: [ceph-users] PG's incomplete after OSD failure

2014-11-11 Thread Matthew Anderson
I've done a bit more work tonight and managed to get some more data back. Osd.121, which was previously completely dead, has made it through an XFS repair with a more fault tolerant HBA firmware and I was able to export both of the placement groups required using ceph_objectstore_tool. The osd woul
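
A rough sketch of the ceph_objectstore_tool export/import flow referred to here (osd ids, the pg id and file paths are illustrative, loosely taken from this thread; both OSDs must be stopped while the tool runs):

    # export a PG from the recovered osd.121 data store
    ceph_objectstore_tool --data-path /var/lib/ceph/osd/ceph-121 \
        --journal-path /var/lib/ceph/osd/ceph-121/journal \
        --op export --pgid 8.6ae --file /root/8.6ae.export
    # import it into a surviving OSD
    ceph_objectstore_tool --data-path /var/lib/ceph/osd/ceph-117 \
        --journal-path /var/lib/ceph/osd/ceph-117/journal \
        --op import --file /root/8.6ae.export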

Re: [ceph-users] Typical 10GbE latency

2014-11-11 Thread Alexandre DERUMIER
Don't have 10GbE yet, but here is my result with simple LACP on 2 gigabit links with a cisco 6500: rtt min/avg/max/mdev = 0.179/0.202/0.221/0.019 ms (Seems to be lower than your 10GbE nexus)

[ceph-users] pg's stuck for 4-5 days after reaching backfill_toofull

2014-11-11 Thread JIten Shah
Hi Guys, We ran into this issue after we nearly maxed out the osd’s. Since then, we have cleaned up a lot of data in the osd’s but pg’s seem to be stuck for the last 4 to 5 days. I have run "ceph osd reweight-by-utilization” and that did not seem to work. Any suggestions? ceph -s cluster 909c
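
For reference, a sketch of the usual rebalancing steps in this situation (the 120 threshold is the percent-of-average-utilization default and osd id 7 is only an example):

    # see which OSDs are near/too full
    ceph health detail
    # reweight OSDs that sit above 120% of average utilization
    ceph osd reweight-by-utilization 120
    # or target a single OSD explicitly
    ceph osd reweight 7 0.9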

Re: [ceph-users] mds isn't working anymore after osd's running full

2014-11-11 Thread Gregory Farnum
On Tue, Nov 11, 2014 at 5:06 AM, Jasper Siero wrote: > No problem thanks for helping. > I don't want to disable the deep scrubbing process itself because it's very > useful but one placement group (3.30) is continuously deep scrubbing and it > should finish after some time but it won't. Hmm, how

Re: [ceph-users] Deep scrub, cache pools, replica 1

2014-11-11 Thread Gregory Farnum
On Mon, Nov 10, 2014 at 10:58 PM, Christian Balzer wrote: > > Hello, > > One of my clusters has become busy enough (I'm looking at you, evil Window > VMs that I shall banish elsewhere soon) to experience client noticeable > performance impacts during deep scrub. > Before this I instructed all OSDs
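
For readers hitting the same client impact, a sketch of the knobs usually involved (the interval value is only an example; the default deep scrub interval at the time was one week):

    # pause new deep scrubs cluster-wide while investigating
    ceph osd set nodeep-scrub
    # ... later, re-enable them
    ceph osd unset nodeep-scrub
    # or stretch the deep scrub interval, here to four weeks
    ceph tell osd.* injectargs '--osd-deep-scrub-interval 2419200'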

Re: [ceph-users] long term support version?

2014-11-11 Thread Gregory Farnum
Yep! Every other stable release gets the LTS treatment. We're still fixing bugs and backporting some minor features to Dumpling, but haven't done any serious updates to Emperor since Firefly came out. Giant will be superseded by Hammer in the February timeframe, if I have my dates right. -Greg On T

[ceph-users] Installing ceph on a single machine with ceph-deploy, ubuntu 14.04 64 bit

2014-11-11 Thread tejaksjy
Hi, I am unable to figure out how to install and deploy ceph on a single machine with ceph-deploy. I have ubuntu 14.04 - 64 bit installed in a virtual machine (on windows 8.1 through VMware player) and have installed devstack on ubuntu. I am trying to install ceph on the same machine (Ubuntu)
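
A rough single-node sketch, assuming the hostname is ceph-aio and a spare disk /dev/sdb (both are assumptions), with the replica count and CRUSH leaf type relaxed so one host can satisfy placement:

    ceph-deploy new ceph-aio
    # then add to the generated ceph.conf before continuing:
    #   osd pool default size = 1
    #   osd crush chooseleaf type = 0
    ceph-deploy install ceph-aio
    ceph-deploy mon create-initial
    ceph-deploy osd create ceph-aio:sdb
    ceph-deploy admin ceph-aio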

Re: [ceph-users] emperor -> firefly 0.80.7 upgrade problem

2014-11-11 Thread Chad Seys
Thanks Craig, I'll jiggle the OSDs around to see if that helps. Otherwise, I'm almost certain removing the pool will work. :/ Have a good one, Chad. > I had the same experience with force_create_pg too. > > I ran it, and the PGs sat there in creating state. I left the cluster > overnight, and

[ceph-users] long term support version?

2014-11-11 Thread Chad Seys
Hi all, Did I notice correctly that firefly is going to be supported "long term" whereas Giant is not going to be supported as long? http://ceph.com/releases/v0-80-firefly-released/ This release will form the basis for our long-term supported release Firefly, v0.80.x. http://ceph.com/uncategor

Re: [ceph-users] Federated gateways

2014-11-11 Thread Aaron Bassett
Ok I believe I’ve made some progress here. I have everything syncing *except* data. The data is getting 500s when it tries to sync to the backup zone. I have a log from the radosgw with debug cranked up to 20: 2014-11-11 14:37:06.688331 7f54447f0700 1 == starting new request req=0x7f546800

Re: [ceph-users] Weight field in osd dump & osd tree

2014-11-11 Thread Mallikarjun Biradar
Thanks christian.. Got clear about the concept.. thanks very much :) On Tue, Nov 11, 2014 at 5:47 PM, Loic Dachary wrote: > Hi Christian, > > On 11/11/2014 13:09, Christian Balzer wrote: > > On Tue, 11 Nov 2014 17:14:49 +0530 Mallikarjun Biradar wrote: > > > >> Hi all > >> > >> When Issued ceph

Re: [ceph-users] Configuring swift user for ceph Rados Gateway - 403 Access Denied

2014-11-11 Thread Daniel Schneller
On 2014-11-11 13:12:32, ವಿನೋದ್ Vinod H I said: Hi, I am having problems accessing rados gateway using swift interface. I am using ceph firefly version and have configured a "us" region as explained in the docs. There are two zones "us-east" and "us-west". us-east gateway is running on ho

Re: [ceph-users] mds isn't working anymore after osd's running full

2014-11-11 Thread Jasper Siero
No problem, thanks for helping. I don't want to disable the deep scrubbing process itself because it's very useful, but one placement group (3.30) is continuously deep scrubbing; it should finish after some time but it won't. Jasper

[ceph-users] Configuring swift user for ceph Rados Gateway - 403 Access Denied

2014-11-11 Thread ವಿನೋದ್ Vinod H I
Hi, I am having problems accessing rados gateway using swift interface. I am using ceph firefly version and have configured a "us" region as explained in the docs. There are two zones "us-east" and "us-west". us-east gateway is running on host ceph-node-1 and us-west gateway is running on host ceph
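
For comparison, the usual swift subuser setup looks roughly like this (the uid and secret are examples, not taken from the post, and the auth URL assumes the gateway on ceph-node-1 answers on the default /auth endpoint):

    # create a swift subuser under an existing S3 user
    radosgw-admin subuser create --uid=johndoe --subuser=johndoe:swift --access=full
    # generate a swift secret key for it
    radosgw-admin key create --subuser=johndoe:swift --key-type=swift --gen-secret
    # then authenticate against the gateway with the swift client
    swift -A http://ceph-node-1/auth/v1.0 -U johndoe:swift -K <swift_secret> stat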

Re: [ceph-users] Weight field in osd dump & osd tree

2014-11-11 Thread Loic Dachary
Hi Christian, On 11/11/2014 13:09, Christian Balzer wrote: > On Tue, 11 Nov 2014 17:14:49 +0530 Mallikarjun Biradar wrote: > >> Hi all >> >> When Issued ceph osd dump it displays weight for that osd as 1 and when >> issued osd tree it displays 0.35 >> > > There are many threads about this, googl

Re: [ceph-users] Weight field in osd dump & osd tree

2014-11-11 Thread Christian Balzer
On Tue, 11 Nov 2014 17:14:49 +0530 Mallikarjun Biradar wrote: > Hi all > > When Issued ceph osd dump it displays weight for that osd as 1 and when > issued osd tree it displays 0.35 > There are many threads about this, google is your friend. For example: https://www.mail-archive.com/ceph-users@
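
The short version: osd tree shows the CRUSH weight (normally sized from the disk capacity, in TB), while the "weight" in osd dump is the reweight override between 0 and 1. A small sketch, using osd.20 and the values from the original mail as examples:

    # CRUSH weight (what "ceph osd tree" shows, here 0.35 for a ~350 GB disk)
    ceph osd crush reweight osd.20 0.35
    # reweight/override value (what "ceph osd dump" shows as "weight": "1.00")
    ceph osd reweight 20 1.0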

Re: [ceph-users] Stackforge Puppet Module

2014-11-11 Thread David Moreau Simard
Hi Nick, The great thing about puppet-ceph's implementation on Stackforge is that it is both unit and integration tested. You can see the integration tests here: https://github.com/ceph/puppet-ceph/tree/master/spec/system What I'm getting at is that the tests allow you to see how you can use t

[ceph-users] Weight field in osd dump & osd tree

2014-11-11 Thread Mallikarjun Biradar
Hi all, When issued ceph osd dump it displays the weight for that osd as 1, and when issued osd tree it displays 0.35. Output from osd dump: { "osd": 20, "uuid": "b2a97a29-1b8a-43e4-a4b0-fd9ee351086e", "up": 1, "in": 1, "weight": "1.00", "pri

[ceph-users] Stackforge Puppet Module

2014-11-11 Thread Nick Fisk
Hi, I'm just looking through the different methods of deploying Ceph and I particularly liked the idea that the stackforge puppet module advertises of using discovery to automatically add new disks. I understand the principle of how it should work; using ceph-disk list to find unknown disks, but I

Re: [ceph-users] PG's incomplete after OSD failure

2014-11-11 Thread Matthew Anderson
Thanks for your reply Sage! I've tested with 8.6ae and no luck I'm afraid. Steps taken were - Stop osd.117 Export 8.6ae from osd.117 Remove 8.6ae from osd.117 start osd.117 restart osd.190 after still showing incomplete After this the PG was still showing incomplete and ceph pg dump_stuck inactiv

Re: [ceph-users] osds fails to start with mismatch in id

2014-11-11 Thread Ramakrishna Nishtala (rnishtal)
Hi, It appears that in the case of pre-created partitions, ceph-deploy create is unable to change the partition GUIDs; the GUID set by parted remains as it is. I manually ran sgdisk on each partition as sgdisk --change-name="2:ceph data" --partition-guid="2:${osd_uuid}" --typecode="2:${ptype2}" /dev/${i}. The
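
With the variables expanded, the workaround looks roughly like this (the device, partition number and generated uuid are illustrative; the typecode shown is, to my understanding, the standard "ceph data" partition type GUID used by ceph-disk):

    # mark pre-created partition 2 on /dev/sdb as a ceph data partition
    osd_uuid=$(uuidgen)
    sgdisk --change-name="2:ceph data" \
           --partition-guid="2:${osd_uuid}" \
           --typecode="2:4fbd7e29-9d25-41b8-afd0-062c0ceff05d" /dev/sdb
    partprobe /dev/sdb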