[ceph-users] Error in ceph-deploy

2016-05-12 Thread Tu Holmes
So I got an odd error code 100 when upgrading my mons to Jewel on Ubuntu Trusty using ceph-deploy. After running an apt-get install ceph-deploy (which would install all other ceph requirements), I noticed that the error from the apt-get was that there was a process already running as user

Re: [ceph-users] Weighted Priority Queue testing

2016-05-12 Thread Somnath Roy
FYI in my test I used osd_max_backfills = 10 which is hammer default. Post hammer it's been changed to 1. Thanks & Regards Somnath -Original Message- From: Christian Balzer [mailto:ch...@gol.com] Sent: Thursday, May 12, 2016 10:40 PM To: Scottix Cc: Somnath Roy;
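For readers following this thread, here is a minimal sketch of how the backfill limit mentioned above might be changed at runtime. It only prints the command rather than running it. The helper name `backfill_cmd` is made up for illustration, and the sketch assumes an admin node where `ceph tell ... injectargs` is usable.

```shell
#!/bin/sh
# Sketch only: print (do not execute) the command that would set
# osd_max_backfills on all OSDs at runtime.
# backfill_cmd is a hypothetical helper, not part of Ceph.
backfill_cmd() {
    limit=$1
    echo "ceph tell osd.* injectargs '--osd-max-backfills $limit'"
}

backfill_cmd 1    # the post-hammer default mentioned in the message
backfill_cmd 10   # the hammer default used in the test
```

A value injected this way is a runtime setting; to persist it, the option would also have to go into ceph.conf.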

Re: [ceph-users] Weighted Priority Queue testing

2016-05-12 Thread Christian Balzer
Hello, On Thu, 12 May 2016 15:41:13 + Scottix wrote: > We have run into this same scenario, with the long tail taking > much longer on recovery than the initial phase. > > This happens either when we add an OSD or when an OSD gets taken down. At first we have > max-backfill set to 1 so it doesn't kill

Re: [ceph-users] about available space

2016-05-12 Thread Christian Balzer
Hello, On Thu, 12 May 2016 07:57:07 -0700 LOPEZ Jean-Charles wrote: > Hi > > you can use the pool quota feature to limit the usage of a particular > pool. > Indeed, however if the pool reaches that quota, it blocks all writes, even if the underlying storage still has plenty of space. It

[ceph-users] What's the minimal version of "ceph" client side the current "jewel" release would support?

2016-05-12 Thread Yang X
See title. We have Firefly on the client side (SLES11SP3) and it does not seem to work well with the "jewel" server nodes (CentOS 7). Can somebody please provide some guidelines? Thanks, Yang

Re: [ceph-users] CRUSH map help

2016-05-12 Thread Gregory Farnum
On Thu, May 12, 2016 at 2:54 PM, Stephen Mercier < stephen.merc...@attainia.com> wrote: > Thank you very much for the thorough explanation. What you described was > one of the ways I was interpreting this. > > Now, out of curiosity, if I did: > > rule replicated_rack { > ruleset 0 > type

Re: [ceph-users] CRUSH map help

2016-05-12 Thread Stephen Mercier
Thank you very much for the thorough explanation. What you described was one of the ways I was interpreting this. Now, out of curiosity, if I did: rule replicated_rack { ruleset 0 type replicated min_size 1 max_size 10 step take default step

Re: [ceph-users] CRUSH map help

2016-05-12 Thread Gregory Farnum
On Thu, May 12, 2016 at 2:36 PM, Stephen Mercier < stephen.merc...@attainia.com> wrote: > I'm trying to set up a crush rule, and I was hoping you guys could clarify > something for me. > > I have 4 storage nodes across 2 cabinets. (2x2) > > I have the crush hierarchy set up to reflect this layout

[ceph-users] CRUSH map help

2016-05-12 Thread Stephen Mercier
I'm trying to set up a crush rule, and I was hoping you guys could clarify something for me. I have 4 storage nodes across 2 cabinets. (2x2) I have the crush hierarchy set up to reflect this layout (as follows): rack cabinet2 { id -3 # do not change unnecessarily #

Re: [ceph-users] Ceph ANT task and file is empty

2016-05-12 Thread Gregory Farnum
Can you provide more details about exactly what you're doing, and exactly how it fails? -Greg On Thu, May 12, 2016 at 12:49 AM, gjprabu wrote: > > Hi > > Anybody facing similar issue. Please share the solution. > > Regards > Prabu GJ > > On Wed, 11 May 2016

Re: [ceph-users] How do ceph clients determine a monitor's address (and esp. port) for initial connection?

2016-05-12 Thread Christian Sarrasin
Thanks Greg! If I understood correctly, you're suggesting this: cd /etc/ceph grep -v 'mon host' testcluster.conf > testcluster_client.conf diff testcluster.conf testcluster_client.conf 4d3 < mon host = mona ceph -c ./testcluster_client.conf --cluster testcluster status no monitors specified to

Re: [ceph-users] How do ceph clients determine a monitor's address (and esp. port) for initial connection?

2016-05-12 Thread Gregory Farnum
On Thu, May 12, 2016 at 6:45 AM, Christian Sarrasin wrote: > I'm trying to run monitors on a non-standard port and having trouble > connecting to them. The below shows the ceph client attempting to connect > to default port 6789 rather than 6788: > > ceph --cluster

Re: [ceph-users] Backfilling caused RBD corruption on Hammer?

2016-05-12 Thread Florian Haas
On Sun, May 8, 2016 at 11:57 PM, Robert Sander wrote: > On 29.04.2016 at 17:11, Robert Sander wrote: > >> As the backfilling with the full weight of the new OSDs would have run >> for more than 28h and no VM was usable we re-weighted the new OSDs to >> 0.1. The

[ceph-users] PGS stuck inactive and osd down

2016-05-12 Thread Vincenzo Pii
I have installed a new ceph cluster with ceph-ansible (using the same version and playbook that had worked before, with some necessary changes to variables). The only major difference is that now an osd (osd3) has a disk twice as big as the others, thus a different weight (check the crushmap

Re: [ceph-users] How to change setting for tunables "require_feature_tunables5"

2016-05-12 Thread Andrey Shevel
Great! You are a marvelous Ceph expert! I did [ceph@ceph-client ~]$ rbd feature disable mycephrbd deep-flatten,fast-diff,object-map,exclusive-lock [ceph@ceph-client ~]$ sudo rbd map rbd/mycephrbd --id admin --keyfile /etc/ceph/admin.key /dev/rbd0 Also I added the recommended line in client

Re: [ceph-users] Weighted Priority Queue testing

2016-05-12 Thread Scottix
We have run into this same scenario, with the long tail taking much longer on recovery than the initial phase. This happens either when we add an OSD or when an OSD gets taken down. At first we have max-backfill set to 1 so it doesn't kill the cluster with io. As time passes by the single osd is performing the

Re: [ceph-users] rbd resize option

2016-05-12 Thread M Ranga Swami Reddy
Sure... checking resize2fs before using "rbd resize"... Thanks Swami On Thu, May 12, 2016 at 7:17 PM, Eneko Lacunza wrote: > You have to shrink FS before RBD block! Now your FS is corrupt! :) > > On 12/05/16 at 15:41, M Ranga Swami Reddy wrote: > >> Used

Re: [ceph-users] about available space

2016-05-12 Thread LOPEZ Jean-Charles
Hi, you can use the pool quota feature to limit the usage of a particular pool: ceph osd pool set-quota <pool-name> [max_objects <obj-count>] [max_bytes <bytes>] To remove a quota, set its value to 0. Cheers JC > On May 11, 2016, at 19:49, Geocast Networks wrote: > > Hi, > > my ceph df output as
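As an illustration of the quota command described above, here is a small sketch that converts a size in GiB to bytes and prints the resulting `ceph osd pool set-quota` invocation. Nothing is executed. The pool name `mypool`, the 100 GiB figure and the helper `quota_cmd` are assumptions made up for this example.

```shell
#!/bin/sh
# Sketch only: print (do not execute) a max_bytes quota command.
# quota_cmd, "mypool" and the sizes are hypothetical placeholders.
quota_cmd() {
    pool=$1
    gib=$2
    # max_bytes takes a byte count, so convert GiB to bytes.
    bytes=$((gib * 1024 * 1024 * 1024))
    echo "ceph osd pool set-quota $pool max_bytes $bytes"
}

quota_cmd mypool 100   # cap the pool at 100 GiB
quota_cmd mypool 0     # a value of 0 removes the quota
```

Keep in mind the caveat raised elsewhere in this thread: once a pool reaches its quota, writes to it are blocked even if the underlying storage still has free space.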

Re: [ceph-users] How to change setting for tunables "require_feature_tunables5"

2016-05-12 Thread Ilya Dryomov
On Thu, May 12, 2016 at 4:37 PM, Andrey Shevel wrote: > Thanks a lot. > > I tried > > [ceph@ceph-client ~]$ ceph osd crush tunables hammer. > > Now I have > > > [ceph@ceph-client ~]$ ceph osd crush show-tunables > { > "choose_local_tries": 0, >

Re: [ceph-users] How to change setting for tunables "require_feature_tunables5"

2016-05-12 Thread Andrey Shevel
Thanks a lot. I tried [ceph@ceph-client ~]$ ceph osd crush tunables hammer. Now I have [ceph@ceph-client ~]$ ceph osd crush show-tunables { "choose_local_tries": 0, "choose_local_fallback_tries": 0, "choose_total_tries": 50, "chooseleaf_descend_once": 1,

Re: [ceph-users] rbd resize option

2016-05-12 Thread M Ranga Swami Reddy
I used "resize2fs" and it works when resizing to a larger size (i.e. from 10G -> 20G) or so... When I tried to resize to a smaller size (i.e. from 10G -> 5G), it failed with the message below: === ubuntu@swami-resize-test-vm:/$ sudo resize2fs /dev/vdb sudo: unable to resolve host swami-resize-test-vm

Re: [ceph-users] RadosGW - Problems running the S3 and SWIFT API at the same time

2016-05-12 Thread Yehuda Sadeh-Weinraub
On Thu, May 12, 2016 at 12:29 AM, Saverio Proto wrote: >> While I'm usually not fond of blaming the client application, this is >> really the swift command line tool issue. It tries to be smart by >> comparing the md5sum of the object's content with the object's etag, >> and

Re: [ceph-users] rbd resize option

2016-05-12 Thread M Ranga Swami Reddy
I did not shrink the FS before "rbd resize". Please let me know how to shrink the FS before "rbd resize". Thanks Swami On Thu, May 12, 2016 at 4:34 PM, Eneko Lacunza wrote: > Did you shrink the FS to be smaller than the target rbd size before doing > "rbd resize"? > > El

Re: [ceph-users] rbd resize option

2016-05-12 Thread Eneko Lacunza
You have to shrink the FS before the RBD block device! Now your FS is corrupt! :) On 12/05/16 at 15:41, M Ranga Swami Reddy wrote: I used "resize2fs" and it works when resizing to a larger size (i.e. from 10G -> 20G) or so... When I tried to resize to a smaller size (i.e. from 10G -> 5G), it failed with
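The ordering advice above can be sketched as a dry-run plan: shrink the filesystem first, then shrink the image to a size that is at least as large as the filesystem. The sketch below only prints the commands. It assumes an ext filesystem (hence `e2fsck` and `resize2fs`), and the device, image name, sizes and the helper `shrink_plan` are hypothetical.

```shell
#!/bin/sh
# Sketch only: print (do not execute) a shrink sequence in a safe order.
# shrink_plan and all arguments are hypothetical; ext2/3/4 is assumed.
shrink_plan() {
    dev=$1      # block device holding the filesystem, e.g. /dev/vdb
    image=$2    # pool/image name as seen by rbd
    fs_gb=$3    # new filesystem size in GiB
    rbd_gb=$4   # new image size in GiB, must not be below fs_gb
    if [ "$rbd_gb" -lt "$fs_gb" ]; then
        echo "refusing: image (${rbd_gb}G) would be smaller than filesystem (${fs_gb}G)" >&2
        return 1
    fi
    echo "umount $dev"                  # shrinking ext filesystems is done offline
    echo "e2fsck -f $dev"               # resize2fs expects a freshly checked filesystem
    echo "resize2fs $dev ${fs_gb}G"     # 1. shrink the filesystem
    # 2. only then shrink the image; --size is given in MB here
    echo "rbd resize --size $((rbd_gb * 1024)) --allow-shrink $image"
}

shrink_plan /dev/vdb rbd/myimage 4 5
```

Leaving the image slightly larger than the filesystem, as in the 4G/5G call above, gives some margin; the filesystem can be grown again afterwards to fill the image.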

[ceph-users] How do ceph clients determine a monitor's address (and esp. port) for initial connection?

2016-05-12 Thread Christian Sarrasin
I'm trying to run monitors on a non-standard port and having trouble connecting to them. The below shows the ceph client attempting to connect to default port 6789 rather than 6788: ceph --cluster testcluster status 2016-05-12 13:31:12.246246 7f710478c700 0 -- :/2044977896 >>

Re: [ceph-users] ceph-mon not starting on boot with systemd and Ubuntu 16.04

2016-05-12 Thread Jan Schermer
Btw try replacing WantedBy=ceph-mon.target with: WantedBy=default.target, then systemctl daemon-reload. See if that does the trick. I only messed with systemctl to have my own services start, I still hope it goes away eventually... :P Jan > On 12 May 2016, at 15:01, Wido den Hollander

Re: [ceph-users] ceph-mon not starting on boot with systemd and Ubuntu 16.04

2016-05-12 Thread Jan Schermer
So systemctl is-enabled ceph-mon.target says "enabled" as well? I think it should start then, or at least try Jan > On 12 May 2016, at 15:14, Wido den Hollander wrote: > > >> On 12 May 2016 at 15:12, Jan Schermer wrote: >> >> >> What about systemctl

Re: [ceph-users] ceph-mon not starting on boot with systemd and Ubuntu 16.04

2016-05-12 Thread Wido den Hollander
> On 12 May 2016 at 15:12, Jan Schermer wrote: > > > What about systemctl is-enabled ceph-mon.target? > Just tried that, no luck either. There is simply no trace of the monitors trying to start on boot. > Jan > > > > > On 12 May 2016, at 15:01, Wido den Hollander

Re: [ceph-users] ceph-mon not starting on boot with systemd and Ubuntu 16.04

2016-05-12 Thread Jan Schermer
What about systemctl is-enabled ceph-mon.target? Jan > On 12 May 2016, at 15:01, Wido den Hollander wrote: > > > To also answer Sage's question: No, this is a fresh Jewel install in a few > test VMs. This system was not upgraded. > > It was installed 2 hours ago. > >> Op

Re: [ceph-users] Try to find the right way to enable rbd-mirror.

2016-05-12 Thread Jason Dillaman
On Thu, May 12, 2016 at 6:33 AM, Mika c wrote: > 4.) Both sites installed "rbd-mirror". > Start daemon "rbd-mirror" . > On site1:$sudo rbd-mirror -m 192.168.168.21:6789 > On site2:$sudo rbd-mirror -m 192.168.168.22:6789 Assuming you keep "ceph" as the

Re: [ceph-users] ceph-mon not starting on boot with systemd and Ubuntu 16.04

2016-05-12 Thread Wido den Hollander
To also answer Sage's question: No, this is a fresh Jewel install in a few test VMs. This system was not upgraded. It was installed 2 hours ago. > On 12 May 2016 at 14:51, Jan Schermer wrote: > > > Can you post the contents of ceph-mon@.service file? > Yes, here you go:

Re: [ceph-users] ceph-mon not starting on boot with systemd and Ubuntu 16.04

2016-05-12 Thread Jan Schermer
Can you post the contents of ceph-mon@.service file? what does systemctl is-enabled ceph-mon@charlie say? However, this looks like it was just started at a bad moment and died - nothing in logs? Jan > On 12 May 2016, at 14:44, Sage Weil wrote: > > On Thu, 12 May 2016,

Re: [ceph-users] ceph-mon not starting on boot with systemd and Ubuntu 16.04

2016-05-12 Thread Sage Weil
On Thu, 12 May 2016, Wido den Hollander wrote: > Hi, > > I am setting up a Jewel cluster in VMs with Ubuntu 16.04. > > ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9) > > After a reboot the Ceph Monitors don't start and I have to do so manually. > > Three machines, alpha, bravo

Re: [ceph-users] ACL nightmare on RadosGW for 200 TB dataset

2016-05-12 Thread Wido den Hollander
> On 12 May 2016 at 14:05, Saverio Proto wrote: > > > > Can't you set the ACL on the object when you put it? > > What do you think of this bug ? > https://github.com/s3tools/s3cmd/issues/743 > Seems like a valid issue. That could make your situation easier. Wido >

Re: [ceph-users] Kernel:BUG: Soft Lockup, H/W or S/W Issue?

2016-05-12 Thread Joe Landman
On 05/12/2016 08:00 AM, Lazuardi Nasution wrote: Hi, Suddenly some of our Infernalis OSD nodes went down with a "kernel:BUG: soft lockup" message. Nothing can be done after that until rebooting. When I recover by restarting the down OSDs one by one while also adding additional OSDs, I get the

[ceph-users] ceph-mon not starting on boot with systemd and Ubuntu 16.04

2016-05-12 Thread Wido den Hollander
Hi, I am setting up a Jewel cluster in VMs with Ubuntu 16.04. ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9) After a reboot the Ceph Monitors don't start and I have to do so manually. Three machines, alpha, bravo and charlie all have the same problem. root@charlie:~# systemctl

Re: [ceph-users] ACL nightmare on RadosGW for 200 TB dataset

2016-05-12 Thread Saverio Proto
> Can't you set the ACL on the object when you put it? What do you think of this bug? https://github.com/s3tools/s3cmd/issues/743 Saverio

[ceph-users] Kernel:BUG: Soft Lockup, H/W or S/W Issue?

2016-05-12 Thread Lazuardi Nasution
Hi, Suddenly some of our Infernalis OSD nodes went down with a "kernel:BUG: soft lockup" message. Nothing can be done after that until rebooting. When I recover by restarting the down OSDs one by one while also adding additional OSDs, I get the same error again on the same nodes. I'm not sure which

Re: [ceph-users] rbd resize option

2016-05-12 Thread Eneko Lacunza
Did you shrink the FS to be smaller than the target rbd size before doing "rbd resize"? On 12/05/16 at 12:33, M Ranga Swami Reddy wrote: When I used the "rbd resize" option to shrink, the image/volume lost its fs sectors and reports "fs" not found... I have used the "mkf" option, then

Re: [ceph-users] How to change setting for tunables "require_feature_tunables5"

2016-05-12 Thread Xusangdi
Hi Andrey, You may change your cluster to a previous version of crush profile (e.g. hammer) by command: `ceph osd crush tunables hammer` Or, if you want to only switch off the tunables5, do as the following steps (not sure if there is a simpler way :<) 1. `ceph osd getcrushmap -o crushmap` 2.

Re: [ceph-users] Incomplete PGs, how do I get them back without data loss?

2016-05-12 Thread george.vasilakakos
What exactly do you mean by log? As in a journal of the actions taken or logging done by a daemon. I'm making the same guess but I'm not sure what else I can try at this point. The PG I've been working on actively reports it needs to probe 4 OSDs (the new set and the old primary) which are all

[ceph-users] Try to find the right way to enable rbd-mirror.

2016-05-12 Thread Mika c
Hi cephers, I am wondering whether someone familiar with rbd-mirror can give me some hints. There are 2 clusters, site1 and site2. Both sites were deployed with ceph-deploy, on Ubuntu 14.04 (kernel 3.19). Here are the steps of my test. 1.) On site1: Copy ceph.conf and

Re: [ceph-users] rbd resize option

2016-05-12 Thread M Ranga Swami Reddy
When I used the "rbd resize" option to shrink, the image/volume lost its fs sectors and reports "fs" not found... I have used the "mkf" option, then all data is lost in it? This happens with the shrink option... Thanks Swami On Wed, May 11, 2016 at 5:28 PM, Christian Balzer wrote: >

Re: [ceph-users] How to change setting for tunables "require_feature_tunables5"

2016-05-12 Thread Andrey Shevel
Hello, I am still working with the issue, however no success yet. Any ideas would be helpful. The problem is: [ceph@ceph-client ~]$ ceph -s cluster 65b8080e-d813-45ca-9cc1-ecb242967694 health HEALTH_OK monmap e21: 5 mons at

Re: [ceph-users] Ceph ANT task and file is empty

2016-05-12 Thread gjprabu
Hi, is anybody facing a similar issue? Please share the solution. Regards Prabu GJ On Wed, 11 May 2016 17:38:15 +0530 gjprabu gjpr...@zohocorp.com wrote: Hi, We are using ceph rbd with a cephfs mounted file system. Here, while using the ant copy task within the ceph shared

Re: [ceph-users] RadosGW - Problems running the S3 and SWIFT API at the same time

2016-05-12 Thread Saverio Proto
> While I'm usually not fond of blaming the client application, this is > really the swift command line tool issue. It tries to be smart by > comparing the md5sum of the object's content with the object's etag, > and it breaks with multipart objects. Multipart objects is calculated > differently

Re: [ceph-users] ACL nightmare on RadosGW for 200 TB dataset

2016-05-12 Thread Saverio Proto
> Can't you set the ACL on the object when you put it? I could create two tenants. One tenant DATASETADMIN for read/write access, and a tenant DATASETUSERS for readonly access. When I load the dataset into the object store, I need a "s3cmd put" operation and a "s3cmd setacl" operation for each

Re: [ceph-users] wrong exit status if bucket already exists

2016-05-12 Thread Gregory Farnum
Yes, it's intentional. All ceph CLI operations are idempotent. On Tuesday, May 10, 2016, Swapnil Jain wrote: > Hi > > I am using infernalis 9.2.1. While creating a bucket, if the bucket already > exists, it still returns 0 as exit status. Is it intentional for some > reason

Re: [ceph-users] Mixed versions of Ceph Cluster and RadosGW

2016-05-12 Thread Gregory Farnum
Sadly not. RGW generally requires updates to the OSD-side object class code for a lot of its functionality and isn't expected to work against older clusters. :( On Wednesday, May 11, 2016, Saverio Proto wrote: > Hello, > > I have a production Ceph cluster running the latest

Re: [ceph-users] ACL nightmare on RadosGW for 200 TB dataset

2016-05-12 Thread Wido den Hollander
> On 11 May 2016 at 15:42, Saverio Proto wrote: > > > Hello there, > > Our setup is with Ceph Hammer (latest release). > > We want to publish in our Object Storage some Scientific Datasets. > These are collections of around 100K objects and total size of about > 200 TB.