[ceph-users] Remove HDD under OSD + rados request = not found
Hello, I have a server with hot-swappable SATA disks. When I pull an HDD from a working server, the OSD does not notice the missing disk: ceph health still reports HEALTH_OK and all OSDs stay "in" and "up". When I then run a swift client on another server to fetch an object that has a chunk on the removed disk, radosgw returns 404 Not Found. The OSD's log shows: 2013-09-05T15:32:21+02:00 stor1 ceph-osd: 2013-09-05 15:32:21.907507 7fd415d93700 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find dd997afb/default.6125.2__shadow__r2NQ0fgMPvMXi2SC8kd1E0IFrbjw-5g_2/head//12 in index: (19) No such device. I can reproduce this every time; the swift client always gets the wrong response. In this test the cluster receives no write operations at all from radosgw. Why does the OSD not notice that its HDD is missing? When I try to upload via swift, the OSD tries to write a chunk to the HDD, hits an error (missing HDD), and the ceph-osd daemon terminates; the mons notice the lost OSD pings and update the osdmap. So it seems the OSD only detects the missing HDD when it tries to write, not when it reads. I'm using Ubuntu 12.04-x64 and Dumpling from Ceph's deb repository. Thank you, Mihaly ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
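[Editor's sketch, not from the thread: since the failure only surfaces on I/O, one workaround is to force the OSD to read its data back by kicking off a scrub. The OSD id 0 below is a placeholder for whichever OSD backs the pulled disk.]

```shell
# Ask osd.0 (placeholder id) to scrub; a deep scrub reads back every
# object's payload, so the ENODEV errors should surface and the daemon
# should fail instead of silently serving 404s through radosgw.
ceph osd scrub 0
ceph osd deep-scrub 0

# Watch cluster events; the OSD should get marked down once it hits
# the dead disk.
ceph -w
```

This only shortens detection latency; the underlying question of why the filestore read path does not react to ENODEV still stands.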
[ceph-users] CephFS test-case
I appreciate CephFS is not a high priority, but this is a user-experience test case that can be a source of stability bugs for Ceph developers to investigate (and hopefully resolve):

CephFS test-case

1. Create two clusters, each 3 nodes with 4 OSDs each
2. I used Ubuntu 13.04 followed by update/upgrade
3. Install Ceph version 0.61 on Cluster A
4. Install release on Cluster B with ceph-deploy
5. Fill Cluster A (version 0.61) with about one million files (all sizes)
6. rsync ClusterA ClusterB
7. In about 12 hours one or two OSDs on Cluster B will crash; restart the OSDs, restart the rsync
8. At around 75% full, OSDs on Cluster B will become unbalanced (some fuller than others), and one or more OSDs will then crash

For (4) it is possible to use freely available .ISOs of old user-group CDROMs that are floating around the web; they are a good source of varied content size, directory size and filename lengths. My impression is that 0.61 was relatively stable, but subsequent versions such as 0.67.2 are less stable in this particular scenario with CephFS.
[ceph-users] librados vs libcephfs performance for database broker
Hi, I am setting up a cluster that uses Hypertable as one of its key components. This required some fixes to CephBroker, which I hope will be integrated into the main Hypertable branch soon. However, it seems to me that CephBroker doesn't need full-fledged filesystem support. I wonder whether raw librados could give me extra performance, or whether metadata management isn't really time-consuming. -- Kind regards, Serge Slipchenko
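[Editor's sketch: before porting the broker, the rados CLI (a thin wrapper over librados) is a cheap way to exercise the flat object-I/O path with no MDS involved. Pool and object names below are made up.]

```shell
# Round-trip one object through librados via the rados tool.
rados mkpool hypertable-test
echo "cell data" > /tmp/payload
rados -p hypertable-test put cellstore-0 /tmp/payload
rados -p hypertable-test get cellstore-0 /tmp/payload.out
cmp /tmp/payload /tmp/payload.out && echo "round-trip ok"
# Historical rmpool syntax: pool name given twice plus confirmation flag.
rados rmpool hypertable-test hypertable-test --yes-i-really-really-mean-it
```

If a workload like this beats the same workload through a CephFS mount, that is a hint the MDS hop matters for the broker's access pattern.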
Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure
> > > > And a second question regarding ceph-deploy: how do I specify a second NIC/address to be used for intercluster communication?
> > > You will not be able to do something like this with ceph-deploy. This sounds like a very specific (or a bit more advanced) configuration than what ceph-deploy offers.
> > Actually, you can: when editing the ceph.conf (before creating any daemons), simply set public addr and cluster addr in whatever section is appropriate. :)
> Oh, you are right! I was thinking about a flag in ceph-deploy for some reason :)

yepp, a flag or option would be nice, but doing it via ceph.conf would work for me too. So immediately after running

ceph-deploy new $my_initial_instances

I inject whatever I need into ceph.conf; then, with the "--overwrite-conf" option, ceph.conf is pushed to the nodes on

ceph-deploy mon create $my_initial_instances

right? ceph-deploy only creates a basic ceph.conf when called with the "new" command and doesn't edit the conf afterwards anymore? So if I add additional MONs/OSDs later on, should I adjust the "mon_host" list and the "mon_initial_members" list? Can I introduce the cluster network later on, after the cluster is deployed and has started working (by editing ceph.conf, pushing it to the cluster members and restarting the daemons)?

TIA Bernhard

-- Bernhard Glomm IT Administration Phone: +49 (30) 86880 134 Fax: +49 (30) 86880 100 Skype: bernhard.glomm.ecologic Ecologic Institut gemeinnützige GmbH | Pfalzburger Str. 43/44 | 10717 Berlin | Germany GF: R. Andreas Kraemer | AG: Charlottenburg HRB 57947 | USt/VAT-IdNr.: DE811963464 Ecologic™ is a Trade Mark (TM) of Ecologic Institut gemeinnützige GmbH

http://ceph.com/docs/master/rados/configuration/ceph-conf/
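[Editor's note: for reference, the public/cluster split discussed above looks roughly like this in ceph.conf; the addresses are made-up examples, not values from this thread.]

```ini
[global]
    # client-facing traffic
    public network = 192.168.1.0/24
    # OSD replication and heartbeat traffic
    cluster network = 10.0.0.0/24

[osd.0]
    # optional per-daemon pins, the "public addr" / "cluster addr"
    # mentioned above
    public addr = 192.168.1.11
    cluster addr = 10.0.0.11
```

After editing, push the file out (e.g. with ceph-deploy's config push, or the --overwrite-conf behaviour discussed above) and restart the daemons so they rebind.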
Re: [ceph-users] Using radosgw with s3cmd: Bucket failure
Just in case someone stumbles across the same problem: the option name in ceph.conf is rgw_dns_name, not "rgw dns name" as described at http://ceph.com/docs/next/radosgw/config-ref/ !? And the hostname needs to be set to your DNS name without any wildcard. Georg

On 06.09.2013 08:51, Georg Höllrigl wrote:

On 23.08.2013 16:24, Yehuda Sadeh wrote:

On Fri, Aug 23, 2013 at 1:47 AM, Tobias Brunner wrote: Hi, I'm trying to use radosgw with s3cmd: # s3cmd ls # s3cmd mb s3://bucket-1 ERROR: S3 error: 405 (MethodNotAllowed): So there seems to be something missing regarding buckets. How can I create buckets? What do I have to configure on the radosgw side to have buckets working?

The problem that you have here is that s3cmd uses the virtual-host bucket name mechanism, e.g. it tries to access http://bucket./ instead of the usual http:///bucket. You can configure the gateway to support that (set 'rgw dns name = ' in your ceph.conf), however, you'll also need to be able to route all these requests to your host, using some catch-all dns. The easiest way to go would be to configure your client to not use that virtual host bucket name, but I'm not completely sure s3cmd can do that. Yehuda

I'm standing directly at the same problem, but this didn't help. I've set up the DNS, can reach the subdomains and also the "rgw dns name". But still the same troubles here :( Georg

-- Dipl.-Ing. (FH) Georg Höllrigl Technik Xidras GmbH Stockern 47 3744 Stockern Austria Tel: +43 (0) 2983 201 - 30505 Fax: +43 (0) 2983 201 - 930505 Email: georg.hoellr...@xidras.com Web: http://www.xidras.com FN 317036 f | Landesgericht Krems | ATU64485024
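[Editor's note, pulling the pieces of this thread together: the gateway option plus a catch-all DNS record are both needed. A sketch with made-up names; objects.example.com stands in for your gateway's hostname.]

```ini
# ceph.conf on the radosgw host. Underscores and spaces in option
# names are interchangeable, so rgw_dns_name == "rgw dns name".
[client.radosgw.gateway]
    rgw_dns_name = objects.example.com
```

In DNS, a wildcard record such as `*.objects.example.com` pointing at the gateway lets `bucket-1.objects.example.com` resolve, while the `rgw dns name` value itself stays the bare hostname, without the wildcard, as Georg notes.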
Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure
thnx Jens

> > I have my testcluster consisting of two OSDs that also host MONs, plus one to five MONs.
> Are you saying that you have a total of 7 mons?

yepp

> > down the at last, not the other MONs though (since - surprise - they are in this test scenario just virtual instances residing on some ceph rbds)
> This seems to be your problem. When you shut down the cluster, you haven't got those extra mons.
> In order to reach a quorum after reboot, you need to have more than half of your mons running.

wait! let's say in my setup (which might be silly given the MON/OSD ratio, but still) with 7 MONs I have to have at least 5 MONs running? 4 would be more than half but insufficient to reach a quorum. But would the cluster come up at all if I could get only 3 out of the 7 initial MONs up and running?

> If you have 5 or more mons in total, this means that the two physical servers running mons cannot reach quorum by themselves.
> I.e. you have 2 mons out of 5 running for example - it will not reach a quorum because you need at least 3 mons to do that.

Well, as I said, I had 3 MONs running, just the sequence in which they came up after the reboot was odd: first the first MON/OSD combination came to life, then a second MON, then the second MON/OSD combination.

> You need to either move mons to physical machines, or virtual instances not depending on the same Ceph cluster, or reduce the number of mons in the system to 3 (or 1).

And do I need to make sure that a quorum-capable number of MONs is up BEFORE I restart the OSDs? Then the sequence would be:
- free the cluster of load/usage
- stop MDS / any given order
- stop OSD / any given order
- stop MON / one after the other
- start MON / in reverse order, last shutdown is first boot
- start OSD
- start MDS
- allow load to the cluster again

The central question remains: do I need 5 out of 7 MONs running, or would 3 out of 7 be sufficient?
TIA Bernhard
Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure
On 06/09/2013, at 7:49 PM, "Bernhard Glomm" wrote:
> Can I introduce the cluster network later on, after the cluster is deployed and started working?
> (by editing ceph.conf, push it to the cluster members and restart the daemons?)

Thanks Bernhard for asking this question, I have the same question. To rephrase: if we use ceph-deploy to set up a cluster, what is the recommended way to add the cluster/client networks later on? It seems that ceph-deploy provides a minimal ceph.conf, not explicitly defining OSDs; how is this file later re-populated with the missing detail?
Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure
Hi,

>> In order to reach a quorum after reboot, you need to have more than half of your mons running.
> with 7 MONs I have to have at least 5 MONs running?

No. 4 is more than half of 7, so 4 would be a majority and thus would be able to form a quorum.

> 4 would be more than half but insufficient to reach a quorum.

I don't see why you think 4 would be insufficient.

> But, would the cluster come up at all if I could get only 3 out of the 7 initial MONs up and running?

It wouldn't be able to form a quorum, and thus you would not be able to use the cluster.

>> I.e. you have 2 mons out of 5 running for example - it will not reach a quorum because you need at least 3 mons to do that.
> Well, as I said, I had 3 MONs running, just the sequence in which they came up after the reboot was odd,

The number 3 was specifically for the example of having 5 mons in total. In your case, where you have 7 mons in total, you need to have 4 running to do anything meaningful. The order they are started in does not matter as such.

> And do I need to make sure that a quorum-capable number of MONs is up BEFORE I restart the OSDs?

No, that is not important. The OSDs will wait until the required number of mons become available.

> - stop MON / one after the other
> - start MON / in reverse order, last shutdown is first boot

It is not required to stop or start mons in a specific order.

> central question remains, do I need 5 out of 7 MONs running or would 3 out of 7 be sufficient?

The magic number here is 4.

-- Jens Kristian Søgaard, Mermaid Consulting ApS, j...@mermaidconsulting.dk, http://.mermaidconsulting.com/
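[Editor's note: the majority rule is easy to sanity-check with a throwaway helper; this is plain shell arithmetic, not a ceph command.]

```shell
# A quorum needs a strict majority of the monitors: floor(n/2) + 1.
quorum_needed() { echo $(( $1 / 2 + 1 )); }

quorum_needed 7   # -> 4, the "magic number" in this thread
quorum_needed 5   # -> 3
quorum_needed 3   # -> 2
```

Note that 6 mons also need 4 for a majority, which is why even monitor counts buy no extra failure tolerance over the next-lower odd count.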
Re: [ceph-users] trouble with ceph-deploy
>>>Try
>>>ceph-disk -v activate /dev/sdaa1

ceph-disk -v activate /dev/sdaa1
/dev/sdaa1: ambivalent result (probably more filesystems on the device, use wipefs(8) to see more details)

>>>as there is probably a partition there. And/or tell us what
>>>/proc/partitions contains,

cat /proc/partitions
major minor  #blocks  name
  65      160  2930266584 sdaa
  65      161  2930265543 sdaa1

>>>and/or what you get from
>>>ceph-disk list

ceph-disk list
Traceback (most recent call last):
  File "/usr/sbin/ceph-disk", line 2328, in main()
  File "/usr/sbin/ceph-disk", line 2317, in main args.func(args)
  File "/usr/sbin/ceph-disk", line 2001, in main_list tpath = mount(dev=dev, fstype=fs_type, options='')
  File "/usr/sbin/ceph-disk", line 678, in mount path,
  File "/usr/lib/python2.7/subprocess.py", line 506, in check_call retcode = call(*popenargs, **kwargs)
  File "/usr/lib/python2.7/subprocess.py", line 493, in call return Popen(*popenargs, **kwargs).wait()
  File "/usr/lib/python2.7/subprocess.py", line 679, in __init__ errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child raise child_exception
TypeError: execv() arg 2 must contain only strings

==
-Original Message-
From: Sage Weil [mailto:s...@inktank.com]
Sent: Thursday, September 05, 2013 6:37 PM
To: Pavel Timoschenkov
Cc: Alfredo Deza; ceph-users@lists.ceph.com
Subject: RE: [ceph-users] trouble with ceph-deploy

On Thu, 5 Sep 2013, Pavel Timoschenkov wrote:
> >>>What happens if you do
> >>>ceph-disk -v activate /dev/sdaa1
> >>>on ceph001?
> Hi. My issue has not been solved. When i execute ceph-disk -v activate /dev/sdaa - all is ok:
> ceph-disk -v activate /dev/sdaa

Try
ceph-disk -v activate /dev/sdaa1
as there is probably a partition there. And/or tell us what /proc/partitions contains, and/or what you get from
ceph-disk list

Thanks!
sage

> DEBUG:ceph-disk:Mounting /dev/sdaa on /var/lib/ceph/tmp/mnt.yQuXIa with options noatime
> mount: Structure needs cleaning
> but OSD not created all the same:
> ceph -k ceph.client.admin.keyring -s
>     cluster 0a2e18d2-fd53-4f01-b63a-84851576c076
>     health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
>     monmap e1: 1 mons at {ceph001=172.16.4.32:6789/0}, election epoch 2, quorum 0 ceph001
>     osdmap e1: 0 osds: 0 up, 0 in
>     pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail
>     mdsmap e1: 0/0/1 up

-Original Message-
From: Sage Weil [mailto:s...@inktank.com]
Sent: Friday, August 30, 2013 6:14 PM
To: Pavel Timoschenkov
Cc: Alfredo Deza; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] trouble with ceph-deploy

On Fri, 30 Aug 2013, Pavel Timoschenkov wrote:
> > In logs everything looks good. After
> > ceph-deploy disk zap ceph001:sdaa ceph001:sda1
> > and
> > ceph-deploy osd create ceph001:sdaa:/dev/sda1
> > where:
> > HOST: ceph001
> > DISK: sdaa
> > JOURNAL: /dev/sda1
> > in log:
> > ==
> > cat ceph.log
> > 2013-08-30 13:06:42,030 [ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks ceph001:/dev/sdaa:/dev/sda1
> > 2013-08-30 13:06:42,590 [ceph_deploy.osd][DEBUG ] Deploying osd to ceph001
> > 2013-08-30 13:06:42,627 [ceph_deploy.osd][DEBUG ] Host ceph001 is now ready for osd use.
> > 2013-08-30 13:06:42,627 [ceph_deploy.osd][DEBUG ] Preparing host ceph001 disk /dev/sdaa journal /dev/sda1 activate True
> > +++
> > But:
> > +++
> > ceph -k ceph.client.admin.keyring -s
> >     cluster 0a2e18d2-fd53-4f01-b63a-84851576c076
> >     health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
> >     monmap e1: 1 mons at {ceph001=172.16.4.32:6789/0}, election epoch 2, quorum 0 ceph001
> >     osdmap e1: 0 osds: 0 up, 0 in
> >     pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail
> >     mdsmap e1: 0/0/1 up
> > +++
> > And
> > +++
> > ceph -k ceph.client.admin.keyring osd tree
> > # id    weight  type name       up/down reweight
> > -1      0       root default
> > +++
> > OSD not created (

What happens if you do

ceph-disk -v activate /dev/sdaa1

on ceph001?

sage

> From: Alfredo Deza [mailto:alfredo.d...@inktank.com]
> Sent: Thursday, August 29, 2013 5:41 PM
> To: Pavel Timoschenkov
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] trouble with ceph-deploy
Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure
Thnx a lot for making this clear! I thought 4 out of seven wouldn't be good because it's not an odd number... but I guess after I would have brought up the cluster with 4 MONs I could have removed one of the MONs to reach that (well, or add one) thnx again Bernhard Am 06.09.2013 13:26:37, schrieb Jens Kristian Søgaard: > Hi, > > > In order to reach a quorum after reboot, you need to have more than half > > of yours mons running. > > with 7 MONs I have to have at least 5 MONS running? > No. 4 is more than half of 7, so 4 would be a majority and thus would be > able to form a quorum. > > > 4 would be more than the half but insufficient to reach a quorum. > I don't see why you think 4 would be insufficient. > > > But, would the cluster come up at all if I could get only 3 out of the 7 > > initial MONs up and running? > It wouldn't be able to form a quorum, and thus you would not be able to > use the cluster. > > > I.e. you have 2 mons out of 5 running for example - it will not reach a > > quorum because you need at least 3 mons to do that. > > Well, as I said, I had 3 MONs running, just the sequence in which they > > came up after the reboot was odd, > The number 3 was specifically for the example of having 5 mons in total. > > In your case where you have 7 mons in total, you need to have 4 running > to do anything meaningful. > > The order they are started in does not matter as such. > > > And do I need to make sure that a quorum capable number of MONs is up > > BEFORE I restart the OSDs? > No, that is not important. The OSDs will wait until the required number > of mons become available. > > > - stop MON / one after the other > > - start MON / in reverse order, last shutdown is first boot > It is not required to stop or start mons in a specific order. > > > central question remains, do I need 5 out of 7 MONs running or would 3 > > out of 7 be sufficient? > The magic number here is 4. 
> --
> Jens Kristian Søgaard, Mermaid Consulting ApS,
> j...@mermaidconsulting.dk, http://.mermaidconsulting.com/
Re: [ceph-users] quick-ceph-deploy
On Thu, Sep 5, 2013 at 8:25 PM, sriram wrote: > I am trying to deploy ceph reading the instructions from this link. > > http://ceph.com/docs/master/start/quick-ceph-deploy/ > > I get the error below. Can someone let me know if this is something related > to what I am doing wrong or the script? > > [abc@abc-ld ~]$ ceph-deploy install abc-ld > [ceph_deploy.install][DEBUG ] Installing stable version dumpling on cluster > ceph hosts abc-ld > [ceph_deploy.install][DEBUG ] Detecting platform for host abc-ld ... > [sudo] password for abc: > [ceph_deploy.install][INFO ] Distro info: RedHatEnterpriseWorkstation 6.1 > Santiago > [abc-ld][INFO ] installing ceph on abc-ld > [abc-ld][INFO ] Running command: su -c 'rpm --import > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";' > [abc-ld][ERROR ] Traceback (most recent call last): > [abc-ld][ERROR ] File > "/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py", line > 21, in install > [abc-ld][ERROR ] File > "/usr/lib/python2.6/site-packages/ceph_deploy/util/decorators.py", line 10, > in inner > [abc-ld][ERROR ] def inner(*args, **kwargs): > [abc-ld][ERROR ] File > "/usr/lib/python2.6/site-packages/ceph_deploy/util/wrappers.py", line 6, in > remote_call > [abc-ld][ERROR ] This allows us to only remote-execute the actual calls, > not whole functions. > [abc-ld][ERROR ] File "/usr/lib64/python2.6/subprocess.py", line 502, in > check_call > [abc-ld][ERROR ] raise CalledProcessError(retcode, cmd) > [abc-ld][ERROR ] CalledProcessError: Command '['su -c \'rpm --import > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc"\'']' > returned non-zero exit status 1 > [abc-ld][ERROR ] error: > https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc: key 1 > import failed. 
> [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: su -c 'rpm > --import "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";' Can you try running that command on the host that it failed (I think that would be abc-ld) and paste the output? For some reason that `rpm --import` failed. Could be network related. > > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure
Hi Bernhard, I thought 4 out of seven wouldn't be good because it's not an odd number... but I guess after I would have brought up the cluster with 4 MONs I could have removed one of the MONs to reach that (well, or add one) Think of it like this: You created 7 mons in ceph. This is like having a parliament with 7 members. Whenever you want to do something, you need to convince a majority of parliament to vote yes. A majority would then be 4 members voting yes. If two members of parliament decide to stay at home instead of turning up to vote - you still need 4 members to get a majority. It is _not_ the case that everyone would suddenly agree and acknowledge that only 5 parliament members have turned up to vote, so that only 3 yes votes would be enough to form a majority. It doesn't matter at all whether the number of parliament members showing up to vote is odd or even. -- Jens Kristian Søgaard, Mermaid Consulting ApS, j...@mermaidconsulting.dk, http://.mermaidconsulting.com/ ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Using radosgw with s3cmd: Bucket failure
On Fri, Sep 6, 2013 at 3:00 AM, Georg Höllrigl wrote: > Just in case someone stumbles across the same problem: > > The option name in ceph.conf is rgw_dns_name - not as described "rgw dns > name" at http://ceph.com/docs/next/radosgw/config-ref/ !? Both should work. > > And the hostname needs to be set to your DNS name withouth any wildcard. > That's correct. Yehuda ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] quick-ceph-deploy
I am running it on the same machine. "[abc@abc-ld ~]$ ceph-deploy install abc-ld" On Fri, Sep 6, 2013 at 5:42 AM, Alfredo Deza wrote: > On Thu, Sep 5, 2013 at 8:25 PM, sriram wrote: > > I am trying to deploy ceph reading the instructions from this link. > > > > http://ceph.com/docs/master/start/quick-ceph-deploy/ > > > > I get the error below. Can someone let me know if this is something > related > > to what I am doing wrong or the script? > > > > [abc@abc-ld ~]$ ceph-deploy install abc-ld > > [ceph_deploy.install][DEBUG ] Installing stable version dumpling on > cluster > > ceph hosts abc-ld > > [ceph_deploy.install][DEBUG ] Detecting platform for host abc-ld ... > > [sudo] password for abc: > > [ceph_deploy.install][INFO ] Distro info: RedHatEnterpriseWorkstation > 6.1 > > Santiago > > [abc-ld][INFO ] installing ceph on abc-ld > > [abc-ld][INFO ] Running command: su -c 'rpm --import > > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";' > > [abc-ld][ERROR ] Traceback (most recent call last): > > [abc-ld][ERROR ] File > > "/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py", > line > > 21, in install > > [abc-ld][ERROR ] File > > "/usr/lib/python2.6/site-packages/ceph_deploy/util/decorators.py", line > 10, > > in inner > > [abc-ld][ERROR ] def inner(*args, **kwargs): > > [abc-ld][ERROR ] File > > "/usr/lib/python2.6/site-packages/ceph_deploy/util/wrappers.py", line 6, > in > > remote_call > > [abc-ld][ERROR ] This allows us to only remote-execute the actual > calls, > > not whole functions. 
> > [abc-ld][ERROR ] File "/usr/lib64/python2.6/subprocess.py", line 502, > in > > check_call > > [abc-ld][ERROR ] raise CalledProcessError(retcode, cmd) > > [abc-ld][ERROR ] CalledProcessError: Command '['su -c \'rpm --import > > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc"\'']' > > returned non-zero exit status 1 > > [abc-ld][ERROR ] error: > > https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc: key 1 > > import failed. > > [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: su -c 'rpm > > --import " > https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";' > > Can you try running that command on the host that it failed (I think > that would be abc-ld) > and paste the output? > > For some reason that `rpm --import` failed. Could be network related. > > > > > > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] CORS not working
Sorry for delay, static3 bucket was created on 0.56 afair, I've tested the same operation with fresh bucket created now on dumpling, and the problem still occurs. regards! -- pawel On 04.09.2013 20:15, Yehuda Sadeh wrote: Is static3 a bucket that you created before the upgrade? Can you test it with newly created buckets? Might be that you're hitting some other issue. Thanks, Yehuda On Tue, Sep 3, 2013 at 11:19 PM, Pawel Stefanski wrote: hello! yes, dns name is configured and working perfectly, the bucket (in this example static3) is found actually, but RGW can't read CORS configuration due some reason. 2013-09-04 08:07:46.082740 7ff4bf7ee700 2 req 10:0.000275:s3:OPTIONS /::getting op 2013-09-04 08:07:46.082745 7ff4bf7ee700 2 req 10:0.000280:s3:OPTIONS /:options_cors:authorizing 2013-09-04 08:07:46.082753 7ff4bf7ee700 2 req 10:0.000287:s3:OPTIONS /:options_cors:reading permissions 2013-09-04 08:07:46.082790 7ff4bf7ee700 20 get_obj_state: rctx=0x7ff4f8003400 obj=.rgw:static3 state=0x7ff4f8005968 s->prefetch_data=0 2013-09-04 08:07:46.082810 7ff4bf7ee700 10 moving .rgw+static3 to cache LRU end 2013-09-04 08:07:46.082819 7ff4bf7ee700 10 cache get: name=.rgw+static3 : hit 2013-09-04 08:07:46.082840 7ff4bf7ee700 20 get_obj_state: s->obj_tag was set empty 2013-09-04 08:07:46.082845 7ff4bf7ee700 20 Read xattr: user.rgw.acl 2013-09-04 08:07:46.082847 7ff4bf7ee700 20 Read xattr: user.rgw.cors 2013-09-04 08:07:46.082848 7ff4bf7ee700 20 Read xattr: user.rgw.idtag 2013-09-04 08:07:46.082849 7ff4bf7ee700 20 Read xattr: user.rgw.manifest 2013-09-04 08:07:46.082855 7ff4bf7ee700 10 moving .rgw+static3 to cache LRU end 2013-09-04 08:07:46.082857 7ff4bf7ee700 10 cache get: name=.rgw+static3 : hit 2013-09-04 08:07:46.082898 7ff4bf7ee700 20 rgw_get_bucket_info: old bucket info, bucket=static3(@.rgw.buckets2[99137.2]) owner pejotes 2013-09-04 08:07:46.082921 7ff4bf7ee700 15 Read 
AccessControlPolicyhttp://s3.amazonaws.com/doc/2006-03-01/";>pejotesofehttp://www.w3.org/2001/XMLSchema-instance"; xsi:type="Group">http://acs.amazonaws.com/groups/global/AllUsersFULL_CONTROLhttp://www.w3.org/2001/XMLSchema-instance"; xsi:type="CanonicalUser">pejotesofeFULL_CONTROL 2013-09-04 08:07:46.082943 7ff4bf7ee700 15 Read AccessControlPolicyhttp://s3.amazonaws.com/doc/2006-03-01/";>pejotesofehttp://www.w3.org/2001/XMLSchema-instance"; xsi:type="Group">http://acs.amazonaws.com/groups/global/AllUsersFULL_CONTROLhttp://www.w3.org/2001/XMLSchema-instance"; xsi:type="CanonicalUser">pejotesofeFULL_CONTROL 2013-09-04 08:07:46.082951 7ff4bf7ee700 2 req 10:0.000486:s3:OPTIONS /:options_cors:verifying op mask 2013-09-04 08:07:46.082955 7ff4bf7ee700 20 required_mask= 1 user.op_mask=7 2013-09-04 08:07:46.082957 7ff4bf7ee700 2 req 10:0.000492:s3:OPTIONS /:options_cors:verifying op permissions 2013-09-04 08:07:46.082960 7ff4bf7ee700 2 req 10:0.000495:s3:OPTIONS /:options_cors:verifying op params 2013-09-04 08:07:46.082963 7ff4bf7ee700 2 req 10:0.000498:s3:OPTIONS /:options_cors:executing 2013-09-04 08:07:46.082966 7ff4bf7ee700 2 No CORS configuration set yet for this bucket 2013-09-04 08:07:46.083105 7ff4bf7ee700 2 req 10:0.000640:s3:OPTIONS /:options_cors:http status=403 2013-09-04 08:07:46.083548 7ff4bf7ee700 1 == req done req=0xbcd910 http_status=403 == best regards! -- pawel On Tue, Sep 3, 2013 at 5:17 PM, Yehuda Sadeh wrote: On Tue, Sep 3, 2013 at 3:40 AM, Pawel Stefanski wrote: hello! I've tried with wip-6078 and git dumpling builds and got the same error during OPTIONS request. 
curl -v -X OPTIONS -H 'Access-Control-Request-Method: PUT' -H "Origin: http://X.pl"; http://static3.X.pl/ OPTIONS / HTTP/1.1 User-Agent: curl/7.31.0 Host: static3.X.pl Accept: */* Access-Control-Request-Method: PUT Origin: http://X.pl < HTTP/1.1 403 Forbidden < Date: Tue, 03 Sep 2013 09:34:39 GMT * Server Apache/2.2.22 (Ubuntu) is not blacklisted < Server: Apache/2.2.22 (Ubuntu) < Accept-Ranges: bytes < Content-Length: 78 < Content-Type: application/xml < AccessDenied Of course CORS was set, but RGW can't find it, dump from log: Did you configure the virtual bucket name through host name correctly? For that you would have needed to set 'rgw dns name' in your ceph.conf. It seems as if it's not set and you're using it. Yehuda ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] CephFS test-case
On Fri, 6 Sep 2013, Nigel Williams wrote: > I appreciate CephFS is not a high priority, but this is a > user-experience test-case that can be a source of stability bugs for > Ceph developers to investigate (and hopefully resolve): > > CephFS test-case > > 1. Create two clusters, each 3 nodes with 4 OSDs each > > 2. I used Ubuntu 13.04 followed by update/upgrade > > 3. Install Ceph version 0.61 on Cluster A > > 4. Install release on Cluster B with ceph-deploy > > 5. Fill Cluster A (version 0.61) with about one million files (all sizes) > > 6. rsync ClusterA ClusterB > > 7. In about 12-hours one or two OSDs on ClusterB will crash, restart > OSDs, restart rsync > > 8. At around 75% full OSDs on ClusterB will become out of balance > (some more full than others), one or more OSD will then crash. It sounds like the problem is cluster B's pools have too few PGs, making the data distribution get all out of whack. What does ceph osd dump | grep ^pool say, and how many OSDs do you have? sage > > For (4) it is possible to use freely available .ISOs of old user-group > CDROMs that are floating around the web, they are a good source of > varied content size, directory size and filename lengths. > > My impression is that 0.61 was relatively stable but subsequent > version such as 0.67.2 are less stable in this particular scenario > with CephFS. > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
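[Editor's note for readers hitting the same wall: the rough sizing guidance of this era is (number of OSDs × 100) / replica count, rounded up to a power of two, which for cluster B's 12 OSDs is far above the tiny per-pool default a fresh cluster gets. A hedged sketch of the arithmetic:]

```shell
# Rule-of-thumb total PG count: (osds * 100) / replicas, rounded up
# to the next power of two. A starting point, not an exact science.
recommended_pgs() {
  local osds=$1 replicas=$2
  local target=$(( osds * 100 / replicas ))
  local pg=1
  while [ "$pg" -lt "$target" ]; do pg=$(( pg * 2 )); done
  echo "$pg"
}

recommended_pgs 12 2   # 12 OSDs, 2x replication -> 1024
```

Compare that against the pg_num values reported by `ceph osd dump | grep ^pool` to see whether the pools are undersized.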
[ceph-users] ceph-deploy depends on sudo
Perhaps it's worth a bug report, or some changes in ceph-deploy: I've just deployed some test clusters with ceph-deploy on Debian Wheezy. I had errors with ceph-deploy when the destination node does not have sudo installed, even when I run it as root and therefore connect to the node as root. Either ceph-deploy should not use sudo when it's run with root privileges (preferred), or sudo must be a dependency somewhere. As the missing package is not on the node where ceph-deploy is installed, it's quite difficult to say where that dependency belongs. At the very least, ceph-deploy should print a friendlier message if it cannot find sudo... Thank you devs for your work! -- Gilles Mocellin Nuage Libre ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] quick-ceph-deploy
On Fri, Sep 6, 2013 at 11:05 AM, sriram wrote: > sudo su -c 'rpm --import > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";' > error: https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc: key > 1 import failed. > Can you actually get to that URL and see the GPG key? Via curl/wget or the browser (if you have one in that host) > > On Fri, Sep 6, 2013 at 8:01 AM, Alfredo Deza > wrote: >> >> On Fri, Sep 6, 2013 at 10:54 AM, sriram wrote: >> > I am running it on the same machine. >> > >> > "[abc@abc-ld ~]$ ceph-deploy install abc-ld" >> > >> > >> > On Fri, Sep 6, 2013 at 5:42 AM, Alfredo Deza >> > wrote: >> >> >> >> On Thu, Sep 5, 2013 at 8:25 PM, sriram wrote: >> >> > I am trying to deploy ceph reading the instructions from this link. >> >> > >> >> > http://ceph.com/docs/master/start/quick-ceph-deploy/ >> >> > >> >> > I get the error below. Can someone let me know if this is something >> >> > related >> >> > to what I am doing wrong or the script? >> >> > >> >> > [abc@abc-ld ~]$ ceph-deploy install abc-ld >> >> > [ceph_deploy.install][DEBUG ] Installing stable version dumpling on >> >> > cluster >> >> > ceph hosts abc-ld >> >> > [ceph_deploy.install][DEBUG ] Detecting platform for host abc-ld ... 
>> >> > [sudo] password for abc: >> >> > [ceph_deploy.install][INFO ] Distro info: >> >> > RedHatEnterpriseWorkstation >> >> > 6.1 >> >> > Santiago >> >> > [abc-ld][INFO ] installing ceph on abc-ld >> >> > [abc-ld][INFO ] Running command: su -c 'rpm --import >> >> > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";' >> >> > [abc-ld][ERROR ] Traceback (most recent call last): >> >> > [abc-ld][ERROR ] File >> >> > >> >> > "/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py", >> >> > line >> >> > 21, in install >> >> > [abc-ld][ERROR ] File >> >> > "/usr/lib/python2.6/site-packages/ceph_deploy/util/decorators.py", >> >> > line >> >> > 10, >> >> > in inner >> >> > [abc-ld][ERROR ] def inner(*args, **kwargs): >> >> > [abc-ld][ERROR ] File >> >> > "/usr/lib/python2.6/site-packages/ceph_deploy/util/wrappers.py", line >> >> > 6, >> >> > in >> >> > remote_call >> >> > [abc-ld][ERROR ] This allows us to only remote-execute the actual >> >> > calls, >> >> > not whole functions. >> >> > [abc-ld][ERROR ] File "/usr/lib64/python2.6/subprocess.py", line >> >> > 502, >> >> > in >> >> > check_call >> >> > [abc-ld][ERROR ] raise CalledProcessError(retcode, cmd) >> >> > [abc-ld][ERROR ] CalledProcessError: Command '['su -c \'rpm --import >> >> > >> >> > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc"\'']' >> >> > returned non-zero exit status 1 >> >> > [abc-ld][ERROR ] error: >> >> > https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc: key >> >> > 1 >> >> > import failed. >> >> > [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: su -c >> >> > 'rpm >> >> > --import >> >> > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";' >> >> >> >> Can you try running that command on the host that it failed (I think >> >> that would be abc-ld) >> >> and paste the output? >> >> I mean, to run the actual command (from the log output) that caused the >> failure. 
>> >> In your case, it would be: >> >> rpm --import >> "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc"; >> >> >> >> >> For some reason that `rpm --import` failed. Could be network related. >> >> >> >> > >> >> > >> >> > ___ >> >> > ceph-users mailing list >> >> > ceph-users@lists.ceph.com >> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> >> > >> > >> > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure
On 06/09/2013 12:12, Nigel Williams wrote: On 06/09/2013, at 7:49 PM, "Bernhard Glomm" wrote: Can I introduce the cluster network later on, after the cluster is deployed and has started working? (By editing ceph.conf, pushing it to the cluster members and restarting the daemons?) Thanks Bernhard for asking this question, I have the same question. To rephrase: if we use ceph-deploy to set up a cluster, what is the recommended way to add the cluster/client networks later on? It seems that ceph-deploy provides a minimal ceph.conf, not explicitly defining OSDs; how is this file later re-populated with the missing detail? Hello, I did exactly that last week. After having created and used my initial cluster without a cluster network, I added the public and cluster network lines to ceph.conf, pushed it to the nodes and restarted the osd daemons. Looking at the interface traffic (with bwm-ng) I can see that the cluster network is now used. (You can also look at established connections with ss.) -- Gilles Mocellin Nuage Libre ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
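For reference, the lines Gilles describes adding would look something like this in ceph.conf (the subnets below are placeholders for illustration, not values from this thread):

```ini
[global]
    ; Client-facing traffic: monitors, clients, radosgw.
    public network  = 192.168.1.0/24
    ; OSD replication and heartbeat traffic.
    cluster network = 10.0.0.0/24
```

After pushing the file to each node, the OSD daemons need a restart to pick up the new networks, as described above.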
Re: [ceph-users] CORS not working
Can you provide a log that includes the bucket creation, CORS settings and the OPTIONS call? It'd be best if you could do it with also 'debug ms = 1'. Thanks, Yehuda On Fri, Sep 6, 2013 at 7:54 AM, Paweł Stefański wrote: > Sorry for delay, > > static3 bucket was created on 0.56 afair, I've tested the same operation > with fresh bucket created now on dumpling, and the problem still occurs. > > regards! > -- > pawel > > > On 04.09.2013 20:15, Yehuda Sadeh wrote: >> >> Is static3 a bucket that you created before the upgrade? Can you test >> it with newly created buckets? Might be that you're hitting some other >> issue. >> >> Thanks, >> Yehuda >> >> On Tue, Sep 3, 2013 at 11:19 PM, Pawel Stefanski >> wrote: >>> >>> hello! >>> >>> yes, dns name is configured and working perfectly, the bucket (in this >>> example static3) is found actually, but RGW can't read CORS configuration >>> due some reason. >>> >>> 2013-09-04 08:07:46.082740 7ff4bf7ee700 2 req 10:0.000275:s3:OPTIONS >>> /::getting op >>> 2013-09-04 08:07:46.082745 7ff4bf7ee700 2 req 10:0.000280:s3:OPTIONS >>> /:options_cors:authorizing >>> 2013-09-04 08:07:46.082753 7ff4bf7ee700 2 req 10:0.000287:s3:OPTIONS >>> /:options_cors:reading permissions >>> 2013-09-04 08:07:46.082790 7ff4bf7ee700 20 get_obj_state: >>> rctx=0x7ff4f8003400 obj=.rgw:static3 state=0x7ff4f8005968 >>> s->prefetch_data=0 >>> 2013-09-04 08:07:46.082810 7ff4bf7ee700 10 moving .rgw+static3 to cache >>> LRU >>> end >>> 2013-09-04 08:07:46.082819 7ff4bf7ee700 10 cache get: name=.rgw+static3 : >>> hit >>> 2013-09-04 08:07:46.082840 7ff4bf7ee700 20 get_obj_state: s->obj_tag was >>> set >>> empty >>> 2013-09-04 08:07:46.082845 7ff4bf7ee700 20 Read xattr: user.rgw.acl >>> 2013-09-04 08:07:46.082847 7ff4bf7ee700 20 Read xattr: user.rgw.cors >>> 2013-09-04 08:07:46.082848 7ff4bf7ee700 20 Read xattr: user.rgw.idtag >>> 2013-09-04 08:07:46.082849 7ff4bf7ee700 20 Read xattr: user.rgw.manifest >>> 2013-09-04 08:07:46.082855 7ff4bf7ee700 10 moving 
.rgw+static3 to cache >>> LRU >>> end >>> 2013-09-04 08:07:46.082857 7ff4bf7ee700 10 cache get: name=.rgw+static3 : >>> hit >>> 2013-09-04 08:07:46.082898 7ff4bf7ee700 20 rgw_get_bucket_info: old >>> bucket >>> info, bucket=static3(@.rgw.buckets2[99137.2]) owner pejotes >>> 2013-09-04 08:07:46.082921 7ff4bf7ee700 15 Read >>> AccessControlPolicy>> >>> xmlns="http://s3.amazonaws.com/doc/2006-03-01/";>pejotesofe>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"; >>> >>> xsi:type="Group">http://acs.amazonaws.com/groups/global/AllUsersFULL_CONTROL>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"; >>> >>> xsi:type="CanonicalUser">pejotesofeFULL_CONTROL >>> 2013-09-04 08:07:46.082943 7ff4bf7ee700 15 Read >>> AccessControlPolicy>> >>> xmlns="http://s3.amazonaws.com/doc/2006-03-01/";>pejotesofe>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"; >>> >>> xsi:type="Group">http://acs.amazonaws.com/groups/global/AllUsersFULL_CONTROL>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"; >>> >>> xsi:type="CanonicalUser">pejotesofeFULL_CONTROL >>> 2013-09-04 08:07:46.082951 7ff4bf7ee700 2 req 10:0.000486:s3:OPTIONS >>> /:options_cors:verifying op mask >>> 2013-09-04 08:07:46.082955 7ff4bf7ee700 20 required_mask= 1 >>> user.op_mask=7 >>> 2013-09-04 08:07:46.082957 7ff4bf7ee700 2 req 10:0.000492:s3:OPTIONS >>> /:options_cors:verifying op permissions >>> 2013-09-04 08:07:46.082960 7ff4bf7ee700 2 req 10:0.000495:s3:OPTIONS >>> /:options_cors:verifying op params >>> 2013-09-04 08:07:46.082963 7ff4bf7ee700 2 req 10:0.000498:s3:OPTIONS >>> /:options_cors:executing >>> 2013-09-04 08:07:46.082966 7ff4bf7ee700 2 No CORS configuration set yet >>> for >>> this bucket >>> 2013-09-04 08:07:46.083105 7ff4bf7ee700 2 req 10:0.000640:s3:OPTIONS >>> /:options_cors:http status=403 >>> 2013-09-04 08:07:46.083548 7ff4bf7ee700 1 == req done req=0xbcd910 >>> http_status=403 == >>> >>> best regards! 
>>> -- >>> pawel >>> >>> >>> On Tue, Sep 3, 2013 at 5:17 PM, Yehuda Sadeh wrote: On Tue, Sep 3, 2013 at 3:40 AM, Pawel Stefanski wrote: > > hello! > > I've tried with wip-6078 and git dumpling builds and got the same error > during OPTIONS request. > > curl -v -X OPTIONS -H 'Access-Control-Request-Method: PUT' -H "Origin: > http://X.pl"; http://static3.X.pl/ > >> OPTIONS / HTTP/1.1 >> User-Agent: curl/7.31.0 >> Host: static3.X.pl >> Accept: */* >> Access-Control-Request-Method: PUT >> Origin: http://X.pl >> > < HTTP/1.1 403 Forbidden > < Date: Tue, 03 Sep 2013 09:34:39 GMT > * Server Apache/2.2.22 (Ubuntu) is not blacklisted > < Server: Apache/2.2.22 (Ubuntu) > < Accept-Ranges: bytes > < Content-Length: 78 > < Content-Type: application/xml > < > encoding="UTF-8"?>AccessDenied > > Of course CORS was set, but RGW can't find it, dump from log: Did you configure the virtual bucket name through host name co
Re: [ceph-users] quick-ceph-deploy
On Fri, Sep 6, 2013 at 10:54 AM, sriram wrote: > I am running it on the same machine. > > "[abc@abc-ld ~]$ ceph-deploy install abc-ld" > > > On Fri, Sep 6, 2013 at 5:42 AM, Alfredo Deza > wrote: >> >> On Thu, Sep 5, 2013 at 8:25 PM, sriram wrote: >> > I am trying to deploy ceph reading the instructions from this link. >> > >> > http://ceph.com/docs/master/start/quick-ceph-deploy/ >> > >> > I get the error below. Can someone let me know if this is something >> > related >> > to what I am doing wrong or the script? >> > >> > [abc@abc-ld ~]$ ceph-deploy install abc-ld >> > [ceph_deploy.install][DEBUG ] Installing stable version dumpling on >> > cluster >> > ceph hosts abc-ld >> > [ceph_deploy.install][DEBUG ] Detecting platform for host abc-ld ... >> > [sudo] password for abc: >> > [ceph_deploy.install][INFO ] Distro info: RedHatEnterpriseWorkstation >> > 6.1 >> > Santiago >> > [abc-ld][INFO ] installing ceph on abc-ld >> > [abc-ld][INFO ] Running command: su -c 'rpm --import >> > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";' >> > [abc-ld][ERROR ] Traceback (most recent call last): >> > [abc-ld][ERROR ] File >> > "/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py", >> > line >> > 21, in install >> > [abc-ld][ERROR ] File >> > "/usr/lib/python2.6/site-packages/ceph_deploy/util/decorators.py", line >> > 10, >> > in inner >> > [abc-ld][ERROR ] def inner(*args, **kwargs): >> > [abc-ld][ERROR ] File >> > "/usr/lib/python2.6/site-packages/ceph_deploy/util/wrappers.py", line 6, >> > in >> > remote_call >> > [abc-ld][ERROR ] This allows us to only remote-execute the actual >> > calls, >> > not whole functions. 
>> > [abc-ld][ERROR ] File "/usr/lib64/python2.6/subprocess.py", line 502, >> > in >> > check_call >> > [abc-ld][ERROR ] raise CalledProcessError(retcode, cmd) >> > [abc-ld][ERROR ] CalledProcessError: Command '['su -c \'rpm --import >> > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc"\'']' >> > returned non-zero exit status 1 >> > [abc-ld][ERROR ] error: >> > https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc: key 1 >> > import failed. >> > [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: su -c >> > 'rpm >> > --import >> > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";' >> >> Can you try running that command on the host that it failed (I think >> that would be abc-ld) >> and paste the output? I mean, to run the actual command (from the log output) that caused the failure. In your case, it would be: rpm --import "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc"; >> >> For some reason that `rpm --import` failed. Could be network related. >> >> > >> > >> > ___ >> > ceph-users mailing list >> > ceph-users@lists.ceph.com >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] trouble with ceph-deploy
On Fri, 6 Sep 2013, Pavel Timoschenkov wrote: > >>>Try > >>>ceph-disk -v activate /dev/sdaa1 > > ceph-disk -v activate /dev/sdaa1 > /dev/sdaa1: ambivalent result (probably more filesystems on the device, use > wipefs(8) to see more details) Looks like there are multiple fs signatures on that partition. See http://ozancaglayan.com/2013/01/29/multiple-filesystem-signatures-on-a-partition/ for how to clean that up. And please share the wipefs output that you see; it may be that we need to make the --zap-disk behavior also explicitly clear any signatures on the device. Thanks! sage > >>>as there is probably a partition there. And/or tell us what > >>>/proc/partitions contains, > > cat /proc/partitions > major minor #blocks name > > 65 160 2930266584 sdaa > 65 161 2930265543 sdaa1 > > >>>and/or what you get from > >>>ceph-disk list > > ceph-disk list > Traceback (most recent call last): > File "/usr/sbin/ceph-disk", line 2328, in > main() > File "/usr/sbin/ceph-disk", line 2317, in main > args.func(args) > File "/usr/sbin/ceph-disk", line 2001, in main_list > tpath = mount(dev=dev, fstype=fs_type, options='') > File "/usr/sbin/ceph-disk", line 678, in mount > path, > File "/usr/lib/python2.7/subprocess.py", line 506, in check_call > retcode = call(*popenargs, **kwargs) > File "/usr/lib/python2.7/subprocess.py", line 493, in call > return Popen(*popenargs, **kwargs).wait() > File "/usr/lib/python2.7/subprocess.py", line 679, in __init__ > errread, errwrite) > File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child > raise child_exception > TypeError: execv() arg 2 must contain only strings > > == > -Original Message- > From: Sage Weil [mailto:s...@inktank.com] > Sent: Thursday, September 05, 2013 6:37 PM > To: Pavel Timoschenkov > Cc: Alfredo Deza; ceph-users@lists.ceph.com > Subject: RE: [ceph-users] trouble with ceph-deploy > > On Thu, 5 Sep 2013, Pavel Timoschenkov wrote: > > >>>What happens if you do > > >>>ceph-disk -v activate /dev/sdaa1 > > >>>on 
ceph001? > > > > Hi. My issue has not been solved. When i execute ceph-disk -v activate > > /dev/sdaa - all is ok: > > ceph-disk -v activate /dev/sdaa > > Try > > ceph-disk -v activate /dev/sdaa1 > > as there is probably a partition there. And/or tell us what /proc/partitions > contains, and/or what you get from > > ceph-disk list > > Thanks! > sage > > > > DEBUG:ceph-disk:Mounting /dev/sdaa on /var/lib/ceph/tmp/mnt.yQuXIa > > with options noatime > > mount: Structure needs cleaning > > but OSD not created all the same: > > ceph -k ceph.client.admin.keyring -s > > cluster 0a2e18d2-fd53-4f01-b63a-84851576c076 > >health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds > >monmap e1: 1 mons at {ceph001=172.16.4.32:6789/0}, election epoch 2, > > quorum 0 ceph001 > >osdmap e1: 0 osds: 0 up, 0 in > > pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB > > avail > >mdsmap e1: 0/0/1 up > > > > -Original Message- > > From: Sage Weil [mailto:s...@inktank.com] > > Sent: Friday, August 30, 2013 6:14 PM > > To: Pavel Timoschenkov > > Cc: Alfredo Deza; ceph-users@lists.ceph.com > > Subject: Re: [ceph-users] trouble with ceph-deploy > > > > On Fri, 30 Aug 2013, Pavel Timoschenkov wrote: > > > > > > > > <<< > > How <<< > > > > > > > > > > > In logs everything looks good. After > > > > > > ceph-deploy disk zap ceph001:sdaa ceph001:sda1 > > > > > > and > > > > > > ceph-deploy osd create ceph001:sdaa:/dev/sda1 > > > > > > where: > > > > > > HOST: ceph001 > > > > > > DISK: sdaa > > > > > > JOURNAL: /dev/sda1 > > > > > > in log: > > > > > > == > > > > > > cat ceph.log > > > > > > 2013-08-30 13:06:42,030 [ceph_deploy.osd][DEBUG ] Preparing cluster > > > ceph disks ceph001:/dev/sdaa:/dev/sda1 > > > > > > 2013-08-30 13:06:42,590 [ceph_deploy.osd][DEBUG ] Deploying osd to > > > ceph001 > > > > > > 2013-08-30 13:06:42,627 [ceph_deploy.osd][DEBUG ] Host ceph001 is > > > now ready for osd use. 
> > > > > > 2013-08-30 13:06:42,627 [ceph_deploy.osd][DEBUG ] Preparing host > > > ceph001 disk /dev/sdaa journal /dev/sda1 activate True > > > > > > +++ > > > > > > But: > > > > > > +++ > > > > > > ceph -k ceph.client.admin.keyring -s > > > > > > cluster 0a2e18d2-fd53-4f01-b63a-84851576c076 > > > > > > health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; > > > no osds > > > > > > monmap e1: 1 mons at {ceph001=172.16.4.32:6789/0}, election epoch > > > 2, quorum 0 ceph001 > > > > > > osdmap e1: 0 osds: 0 up, 0 in > > > > > > pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / > > > 0 KB avail > > > > > > mdsmap e1: 0/0/1 up > > > > > >
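Sage's wipefs suggestion can be tried safely on a scratch image file first; the same two wipefs invocations then apply to the real partition (/dev/sdaa1 in this thread), where the second one is destructive, so double-check the device name:

```shell
# Safe demonstration on a scratch image file. On the real partition
# (e.g. /dev/sdaa1), `wipefs --all` destroys the stale signatures for real.
img=$(mktemp)
truncate -s 1M "$img"
mkswap "$img" >/dev/null 2>&1    # plant a swap signature as a stand-in for stale FS metadata

wipefs "$img"                    # read-only: list every signature libblkid recognizes
wipefs --all "$img" >/dev/null   # erase all signatures (what ceph-disk needs before retrying)
wipefs "$img"                    # prints nothing once the device is clean
rm -f "$img"
```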
Re: [ceph-users] quick-ceph-deploy
sudo su -c 'rpm --import " https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";' error: https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc: key 1 import failed. On Fri, Sep 6, 2013 at 8:01 AM, Alfredo Deza wrote: > On Fri, Sep 6, 2013 at 10:54 AM, sriram wrote: > > I am running it on the same machine. > > > > "[abc@abc-ld ~]$ ceph-deploy install abc-ld" > > > > > > On Fri, Sep 6, 2013 at 5:42 AM, Alfredo Deza > > wrote: > >> > >> On Thu, Sep 5, 2013 at 8:25 PM, sriram wrote: > >> > I am trying to deploy ceph reading the instructions from this link. > >> > > >> > http://ceph.com/docs/master/start/quick-ceph-deploy/ > >> > > >> > I get the error below. Can someone let me know if this is something > >> > related > >> > to what I am doing wrong or the script? > >> > > >> > [abc@abc-ld ~]$ ceph-deploy install abc-ld > >> > [ceph_deploy.install][DEBUG ] Installing stable version dumpling on > >> > cluster > >> > ceph hosts abc-ld > >> > [ceph_deploy.install][DEBUG ] Detecting platform for host abc-ld ... 
> >> > [sudo] password for abc: > >> > [ceph_deploy.install][INFO ] Distro info: RedHatEnterpriseWorkstation > >> > 6.1 > >> > Santiago > >> > [abc-ld][INFO ] installing ceph on abc-ld > >> > [abc-ld][INFO ] Running command: su -c 'rpm --import > >> > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";' > >> > [abc-ld][ERROR ] Traceback (most recent call last): > >> > [abc-ld][ERROR ] File > >> > > "/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py", > >> > line > >> > 21, in install > >> > [abc-ld][ERROR ] File > >> > "/usr/lib/python2.6/site-packages/ceph_deploy/util/decorators.py", > line > >> > 10, > >> > in inner > >> > [abc-ld][ERROR ] def inner(*args, **kwargs): > >> > [abc-ld][ERROR ] File > >> > "/usr/lib/python2.6/site-packages/ceph_deploy/util/wrappers.py", line > 6, > >> > in > >> > remote_call > >> > [abc-ld][ERROR ] This allows us to only remote-execute the actual > >> > calls, > >> > not whole functions. > >> > [abc-ld][ERROR ] File "/usr/lib64/python2.6/subprocess.py", line > 502, > >> > in > >> > check_call > >> > [abc-ld][ERROR ] raise CalledProcessError(retcode, cmd) > >> > [abc-ld][ERROR ] CalledProcessError: Command '['su -c \'rpm --import > >> > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc > "\'']' > >> > returned non-zero exit status 1 > >> > [abc-ld][ERROR ] error: > >> > https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc: > key 1 > >> > import failed. > >> > [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: su -c > >> > 'rpm > >> > --import > >> > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";' > >> > >> Can you try running that command on the host that it failed (I think > >> that would be abc-ld) > >> and paste the output? > > I mean, to run the actual command (from the log output) that caused the > failure. 
> > In your case, it would be: > > rpm --import > "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc"; > > >> > >> For some reason that `rpm --import` failed. Could be network related. > >> > >> > > >> > > >> > ___ > >> > ceph-users mailing list > >> > ceph-users@lists.ceph.com > >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > >> > > > > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph-deploy depends on sudo
On Fri, Sep 6, 2013 at 11:17 AM, Gilles Mocellin wrote: > Perhaps it's worth a bug report, or some changes in ceph-deploy : > > I've just deployed some test clusters with ceph-deploy on Debian Wheezy. > I had errors with ceph-deploy, when the destination node does not have sudo > installed. > Even if a run it as root, and so connect to the node as root. Funny you mention this, as this was just fixed today (see: http://tracker.ceph.com/issues/6104) and should get released soon. It will basically not use sudo on the remote host if you are connecting as the root user. > > Either ceph-deploy has not to use sudo if it's run with root privilege > (prefered), or sudo must be a dependancy somwhere. > As it's not on the node where ceph-deploy is installed, it's quite difficult > to say. > > At least, ceph-deploy should just print a friendier message if it does not > found sudo... Do you have the output of the error from the machine that did not have sudo? > > Thank you devs for your work ! > > -- > Gilles Mocellin > Nuage Libre > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
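The gist of the fix Alfredo mentions can be sketched as follows (hypothetical code for illustration, not ceph-deploy's actual implementation):

```python
# Hypothetical sketch of the ceph-deploy fix tracked in issue #6104:
# prefix remote commands with sudo only when not connecting as root.
# Illustrative only -- this is not ceph-deploy's actual code.

def remote_command(args, remote_user):
    if remote_user == "root":
        return list(args)         # root needs no privilege escalation
    return ["sudo"] + list(args)  # everyone else goes through sudo

print(remote_command(["ceph-disk", "prepare", "/dev/vdb"], "root"))
print(remote_command(["ceph-disk", "prepare", "/dev/vdb"], "ceph"))
```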
Re: [ceph-users] Remove HDD under OSD + rados request = not found
On Fri, Sep 6, 2013 at 12:06 AM, Mihály Árva-Tóth wrote: > Hello, > > I have a server with hot swappable SATA disks. When I remove HDD from a > working server, OSD does not noice missing of HDD. ceph healt status write > HEALTH_OK and all of OSD "in" and "up". When I run a swift client on another > server to get an object which one of chunk is available on removed disk, > radosgw returns with 404 Not Found. If I check osd's log: > > 2013-09-05T15:32:21+02:00 stor1 ceph-osd: 2013-09-05 15:32:21.907507 > 7fd415d93700 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find > dd997afb/default.6125.2__shadow__r2NQ0fgMPvMXi2SC8kd1E0IFrbjw-5g_2/head//12 > in index: (19) No such device > > And I can reproduce every time. swift client get false response. In this > test the cluster does not get write operations at all from radosgw. Why OSD > does not notice missing of it's HDD? > > When I try to upload via swift, the OSD try to write chunk to HDD, but runs > error (missing HDD), and ceph-osd daemon terminate; mon's notice OSD ping > loss and update monmap. So it seems OSD can detect missing of HDD when try > to write only and not read/write. Well that's interesting. I thought we'd set up proper filters on all the error codes we can get back from the FS but some combination of this error and the read path must have gotten missed. I've created a ticket: http://tracker.ceph.com/issues/6250 -Greg ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure
On 6 Sep 2013, at 13:46, Jens Kristian Søgaard wrote: > You created 7 mons in ceph. This is like having a parliament with 7 members. > > Whenever you want to do something, you need to convince a majority of > parliament to vote yes. A majority would then be 4 members voting yes. > > If two members of parliament decide to stay at home instead of turning up to > vote - you still need 4 members to get a majority. > > It is _not_ the case that everyone would suddenly agree and acknowledge that > only 5 parliament members have turned up to vote, so that only 3 yes votes > would be enough to form a majority. Perhaps not a great analogy. At least in the case of the UK parliament, if 2 members of a 7-member parliament stay at home and don't vote, you would only need 3 members to pass a resolution. In the UK (and I believe in most other parliaments) you need the number of 'yes' votes to exceed the number of 'no' votes; the total number of members does not matter. In Ceph, you need the number of monitors active and voting yes to exceed (i.e. be strictly greater than) half the number of monitors configured. There is no magic about anything being odd or even, save that the failure tolerance of an n-MON cluster, where n is odd, is the same as that of an (n+1)-MON cluster (as n+1 is even): in both cases, if at least k=(n+1)/2 devices fail it will take the cluster out (i.e. at most (n-1)/2 failures can be survived). This makes deploying even numbers of MON devices wasteful (it does not increase the number of tolerable failures) and arguably increases the chance of failure (as now only k devices out of n+1, rather than k out of n, need to fail). -- Alex Bligh ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
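Alex's arithmetic can be checked with a few lines of Python (illustrative only, not Ceph code):

```python
# Monitor quorum arithmetic from the discussion above: a cluster of n
# configured monitors needs strictly more than n/2 of them alive and voting.

def quorum(n):
    return n // 2 + 1          # smallest strict majority of n monitors

def failures_tolerated(n):
    return n - quorum(n)       # monitors that can fail with quorum intact

for n in range(1, 8):
    print(n, quorum(n), failures_tolerated(n))
```

The output shows that 3 and 4 monitors both tolerate one failure, and 5 and 6 both tolerate two, which is exactly the point about even monitor counts adding cost without adding resilience.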
Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure
Hi, > Perhaps not a great analogy. At least in the case of the UK Perhaps not; I don't know the UK system. I was merely trying to illustrate the difference between the number of mons that the system is configured with (the ones eligible to vote), and the number of mons actually alive and able to vote at a specific time. For the first group, it is a good idea to have an odd number, as adding just one extra mon won't buy you anything. For the second group, it doesn't matter whether the number is odd or even; the only thing that matters is that the number is greater than half of the configured mons. -- Jens Kristian Søgaard, Mermaid Consulting ApS, j...@mermaidconsulting.dk, http://www.mermaidconsulting.com/ ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph-deploy depends on sudo
On 06/09/2013 17:37, Alfredo Deza wrote: On Fri, Sep 6, 2013 at 11:17 AM, Gilles Mocellin wrote: Perhaps it's worth a bug report, or some changes in ceph-deploy: I've just deployed some test clusters with ceph-deploy on Debian Wheezy. I had errors with ceph-deploy when the destination node does not have sudo installed, even when I run it as root and so connect to the node as root. Funny you mention this, as this was just fixed today (see: http://tracker.ceph.com/issues/6104) and should get released soon. It will basically not use sudo on the remote host if you are connecting as the root user. Yes! Cool. [...] Do you have the output of the error from the machine that did not have sudo? Strange, I removed sudo and now I get a clear message, not a stack trace anymore. Did I dream it? root@vmceph1:~# ceph-deploy -n osd create vmceph2:vdb [ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks vmceph2:/dev/vdb: [ceph_deploy][ERROR ] ClientInitException: [remote] bash: sudo : commande introuvable In English: "bash: sudo: command not found" root@vmceph1:~# ceph-deploy --version 1.2.3 Anyway, thanks! ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com