Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-06-26 Thread Stefan Priebe - Profihost AG
Hi Greg,

On 26.06.2014 02:17, Gregory Farnum wrote:
> Sorry we let this drop; we've all been busy traveling and things.
> 
> There have been a lot of changes to librados between Dumpling and
> Firefly, but we have no idea what would have made it slower. Can you
> provide more details about how you were running these tests?

it's just a normal fio run:
fio --ioengine=rbd --bs=4k --name=foo --invalidate=0
--readwrite=randwrite --iodepth=32 --rbdname=fio_test2 --pool=teststor
--runtime=90 --numjobs=32 --direct=1 --group

Running one time with the Firefly libs and one time with the Dumpling libs.
The target is always the same pool on a Firefly Ceph storage cluster.
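For anyone who wants to reproduce the comparison on a single host, a rough
sketch (package file names and paths are only illustrative) is to point the
loader at an extracted copy of the dumpling libs instead of reinstalling:

mkdir /tmp/dumpling && cd /tmp/dumpling
dpkg -x librbd1_0.67.9-1_amd64.deb .        # package file names are illustrative
dpkg -x librados2_0.67.9-1_amd64.deb .
LD_LIBRARY_PATH=$PWD/usr/lib fio --ioengine=rbd --bs=4k ...   # same options as above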

Stefan

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
> 
> 
> On Fri, Jun 13, 2014 at 7:59 AM, Stefan Priebe  wrote:
>> Hi,
>>
>> while testing firefly i came into the situation where i had a client with the
>> latest dumpling packages installed (0.67.9).
>>
>> As my pool has hashpspool false and the tunables are set to default, it can
>> talk to my firefly ceph storage.
>>
>> These are random 4k writes using fio with librbd, 32 jobs and an iodepth of 32.
>>
>> I get these results:
>>
>> librbd / librados2 from dumpling:
>>   write: io=3020.9MB, bw=103083KB/s, iops=25770, runt= 30008msec
>>   WRITE: io=3020.9MB, aggrb=103082KB/s, minb=103082KB/s, maxb=103082KB/s,
>> mint=30008msec, maxt=30008msec
>>
>> librbd / librados2 from firefly:
>>   write: io=7344.3MB, bw=83537KB/s, iops=20884, runt= 90026msec
>>   WRITE: io=7344.3MB, aggrb=83537KB/s, minb=83537KB/s, maxb=83537KB/s,
>> mint=90026msec, maxt=90026msec
>>
>> Stefan
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with RadosGW and special characters

2014-06-26 Thread Brian Rak
My current workaround plan is to just upload both versions of the 
file... I think this is probably the simplest solution with the least 
possibility of breaking later on.


On 6/26/2014 6:35 PM, Craig Lewis wrote:
Note that wget did URL encode the space ("test file" became 
"test%20file"), because it knows that a space is never valid.  It 
can't know if you meant an actual plus, or an encoded space in 
"test+file", so it left it alone.


I will say that I would prefer that the + be left alone.  If I have a 
static "test+file", Apache will serve that static file correctly.




How badly do you need this to work, right now?  If you need it now, I 
can suggest a workaround.  This is a dirty hack, and I'm not saying 
it's a good idea.  It's more of a thought exercise.


A quick google indicates that mod_rewrite might help: 
http://stackoverflow.com/questions/459667/how-to-encode-special-characters-using-mod-rewrite-apache 
.


But that might make the problem worse for other characters... If it 
does, I'm sure I could get it working by installing an Apache hook. 
 Off the top of my head, I'd try a hook in 
http://perl.apache.org/docs/2.0/user/handlers/http.html#PerlFixupHandler to 
replace all + characters with the correct escape sequence, %2B. I know 
mod_python can hook into Apache too.  I don't know if nginx has 
a similar capability.



As with all dirty hacks, if you actually implement it, you'll want to 
watch the release notes.  Once you work around a bug, someone will fix 
the bug and break your hack.





On Thu, Jun 26, 2014 at 8:54 AM, Brian Rak wrote:


Going back to my first post, I linked to this
http://stackoverflow.com/questions/1005676/urls-and-plus-signs

Per the definition of application/x-www-form-urlencoded:
http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1

"Control names and values are escaped. Space characters are
replaced by `+', and then reserved characters are escaped as
described in [RFC1738]."

The whole +=space thing is only for the query portion of the URL,
not the filename.

I've done some testing with nginx, and this is how it behaves:

On the server, somewhere in the webroot:

echo space > "test file"

Then, from a client:
$ wget --spider "http://example.com/test/test file"


Spider mode enabled. Check if remote file exists.
--2014-06-26 11:46:54-- http://example.com/test/test%20file
Connecting to example.com:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6 [application/octet-stream]
Remote file exists.

$ wget --spider "http://example.com/test/test+file";


Spider mode enabled. Check if remote file exists.
--2014-06-26 11:46:57-- http://example.com/test/test+file
Connecting to example.com:80... connected.
HTTP request sent, awaiting response... 404 Not Found

Remote file does not exist -- broken link!!!

These tests were done just with the standard filesystem. I wasn't
using radosgw for this.  Feel free to repeat with the web server
of your choice, you'll find the same thing happens.

URL decoding the path is not the correct behavior.



On 6/26/2014 11:36 AM, Sylvain Munaut wrote:

Hi,


Based on the debug log, radosgw is definitely the software that's
incorrectly parsing the URL.  For example:


2014-06-25 17:30:37.383134 7f7c6cfa9700 20
REQUEST_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
2014-06-25 17:30:37.383199 7f7c6cfa9700 10
s->object=ubuntu/pool/main/a/adduser/adduser_3.113 nmu3ubuntu3_all.deb
s->bucket=ubuntu

I'll dig into this some more, but it definitely looks like radosgw is the
one that's unencoding the + character here.  How else would it be receiving
the request_uri with the + in it, but then a little bit later the request
has a space in it instead?

Note that AFAIK, in fastcgi, REQUEST_URI is _supposed_ to be an URL
encoded version and should be URL-decoded by the fastcgi handler. So
converting the + to ' ' seems valid to me.


Cheers,

Sylvain



___
ceph-users mailing list
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Continuing placement group problems

2014-06-26 Thread kevin horan





On 06/26/2014 01:08 PM, Gregory Farnum wrote:

On Thu, Jun 26, 2014 at 12:52 PM, Kevin Horan
 wrote:

I am also getting inconsistent object errors on a regular basis, about 1-2
every week or so for about 300GB of data. All OSDs are using XFS
filesystems. Some OSDs are individual 3TB internal hard drives and some are
external FC attached raid6 arrays. I am using this cluster to store kvm
images and I've noticed that the inconsistent objects always occur on my two
most recently created VM images, even though one of them is hardly ever used
(just a bare VM not put into production yet). This all started about 4
months ago on 0.72 and now is continuing to occur on version .80. I also
changed the number of replicas from 2 to 3 for the pool containing these
images and that had no effect.

Here is an example log entry:

2014-06-24 18:11:51.683310 7faf44297700  0 log [ERR] : 4.b6 shard 0: soid
c539a8b6/rbd_data.9fdea2ae8944a.04e2/head//4 digest 2541762784
!= known digest 3305022936
2014-06-24 18:11:52.107321 7faf50f60700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:52.215752 7faf5075f700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:52.365798 7faf50f60700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:52.674643 7faf5075f700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:52.749641 7faf50f60700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:55.194967 7faf5075f700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:55.259322 7faf50f60700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:55.526157 7faf5075f700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:55.547270 7faf44297700  0 log [ERR] : 4.b6 deep-scrub 0
missing, 1 inconsistent objects
2014-06-24 18:11:55.547282 7faf44297700  0 log [ERR] : 4.b6 deep-scrub 1
errors

Can you go find out what about those files is different? Are they
different sizes, with the overlapping pieces being the same? Are they
completely different?

Here are the info blocks for the three images:

root@vashti:~/t1# rbd info libvirt-pool/radosgw
rbd image 'radosgw':
size 10000 MB in 2500 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.6aad02ae8944a
format: 2
features: layering

root@vashti:~/t1# rbd info libvirt-pool/auth-data
rbd image 'auth-data':
size 10000 MB in 2500 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.9fdea2ae8944a
format: 2
features: layering

root@vashti:~/t1# rbd info libvirt-pool/auth
rbd image 'auth':
size 10240 MB in 2560 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.51a3b2ae8944a
format: 2
features: layering
root@vashti:~/t1#

The first two incur the inconsistent objects, while the third one (which 
was created a year ago) does not (nor do my other older images). All of 
them are 10G in size, including the non-problematic one. I'm not sure 
what you mean by "overlapping pieces"?



Are your systems losing power or otherwise doing
mean things to the local filesystem?
  I have not seen any kernel errors about file systems nor have I had 
any file system level problems.

  Have you noticed a pattern of
distribution in terms of the underlying storage system on the
inconsistent OSDs?
I have found the bad objects on PGs whose primary OSD was on a single 
internal drive, and in other cases the primary OSD was on an external drive.


About 3 months ago I had an event where 3 out of only 6 OSDs were 
down while noout was set (pool was set to size=2, min_size=1). About 2 
minutes after these 3 OSDs came back up, another OSD, not one of these 
three, suffered a physical error and was lost. This resulted in about 10 
or so lost objects. I soon got this all cleaned up and got the cluster back 
to the clean state (see here for the full story). But it was soon after that 
that I started getting 
these inconsistent objects. Prior to that event I had gone over a year 
without any inconsistent objects. There has also been a lot of 
re-structuring going on with new OSDs being added and/or moved (still 
getting it ready for production). But I always take one step and let it 
return to clean before taking the next step.
  When I got the first inconsistent object, a simple repair didn't work, so I 
started trying some online suggestions of truncating objects 
to the correct size and/or removing objects. Some of these things cau

Re: [ceph-users] Problem with RadosGW and special characters

2014-06-26 Thread Craig Lewis
Note that wget did URL encode the space ("test file" became "test%20file"),
because it knows that a space is never valid.  It can't know if you meant
an actual plus, or an encoded space in "test+file", so it left it alone.

I will say that I would prefer that the + be left alone.  If I have a
static "test+file", Apache will serve that static file correctly.



How badly do you need this to work, right now?  If you need it now, I can
suggest a workaround.  This is a dirty hack, and I'm not saying it's a good
idea.  It's more of a thought exercise.

A quick google indicates that mod_rewrite might help:
http://stackoverflow.com/questions/459667/how-to-encode-special-characters-using-mod-rewrite-apache
.

But that might make the problem worse for other characters... If it does,
I'm sure I could get it working by installing an Apache hook.  Off the top
of my head, I'd try a hook in
http://perl.apache.org/docs/2.0/user/handlers/http.html#PerlFixupHandler to
replace all + characters with the correct escape sequence, %2B.  I know
mod_python can hook into Apache too.  I don't know if nginx has
a similar capability.


As with all dirty hacks, if you actually implement it, you'll want to watch
the release notes.  Once you work around a bug, someone will fix the bug
and break your hack.




On Thu, Jun 26, 2014 at 8:54 AM, Brian Rak  wrote:

>  Going back to my first post, I linked to this
> http://stackoverflow.com/questions/1005676/urls-and-plus-signs
>
> Per the definition of application/x-www-form-urlencoded:
> http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1
>
> "Control names and values are escaped. Space characters are replaced by
> `+', and then reserved characters are escaped as described in [RFC1738]."
>
> The whole +=space thing is only for the query portion of the URL, not the
> filename.
>
> I've done some testing with nginx, and this is how it behaves:
>
> On the server, somewhere in the webroot:
>
> echo space > "test file"
>
> Then, from a client:
> $ wget --spider "http://example.com/test/test file"
> 
>
> Spider mode enabled. Check if remote file exists.
> --2014-06-26 11:46:54--  http://example.com/test/test%20file
> Connecting to example.com:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 6 [application/octet-stream]
> Remote file exists.
>
> $ wget --spider "http://example.com/test/test+file";
> 
>
> Spider mode enabled. Check if remote file exists.
> --2014-06-26 11:46:57--  http://example.com/test/test+file
> Connecting to example.com:80... connected.
> HTTP request sent, awaiting response... 404 Not Found
>
> Remote file does not exist -- broken link!!!
>
> These tests were done just with the standard filesystem.  I wasn't using
> radosgw for this.  Feel free to repeat with the web server of your choice,
> you'll find the same thing happens.
>
> URL decoding the path is not the correct behavior.
>
>
>
> On 6/26/2014 11:36 AM, Sylvain Munaut wrote:
>
> Hi,
>
>
>  Based on the debug log, radosgw is definitely the software that's
> incorrectly parsing the URL.  For example:
>
>
> 2014-06-25 17:30:37.383134 7f7c6cfa9700 20
> REQUEST_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
> 2014-06-25 17:30:37.383199 7f7c6cfa9700 10
> s->object=ubuntu/pool/main/a/adduser/adduser_3.113 nmu3ubuntu3_all.deb
> s->bucket=ubuntu
>
> I'll dig into this some more, but it definitely looks like radosgw is the
> one that's unencoding the + character here.  How else would it be receiving
> the request_uri with the + in it, but then a little bit later the request
> has a space in it instead?
>
>  Note that AFAIK, in fastcgi, REQUEST_URI is _supposed_ to be an URL
> encoded version and should be URL-decoded by the fastcgi handler. So
> converting the + to ' ' seems valid to me.
>
>
> Cheers,
>
>Sylvain
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Difference between "ceph osd reweight" and "ceph osd crush reweight"

2014-06-26 Thread Craig Lewis
Note that 'ceph osd reweight' is not a persistent setting.  When an OSD
gets marked out, the osd weight will be set to 0.  When it gets marked in
again, the weight will be changed to 1.

Because of this 'ceph osd reweight' is a temporary solution.  You should
only use it to keep your cluster running while you're ordering more
hardware.




On Thu, Jun 26, 2014 at 10:05 AM, Gregory Farnum  wrote:

> On Thu, Jun 26, 2014 at 7:03 AM, Micha Krause  wrote:
> > Hi,
> >
> > could someone explain to me what the difference is between
> >
> > ceph osd reweight
> >
> > and
> >
> > ceph osd crush reweight
>
> "ceph osd crush reweight" sets the CRUSH weight of the OSD. This
> weight is an arbitrary value (generally the size of the disk in TB or
> something) and controls how much data the system tries to allocate to
> the OSD.
>
> "ceph osd reweight" sets an override weight on the OSD. This value is
> in the range 0 to 1, and forces CRUSH to re-place (1-weight) of the
> data that would otherwise live on this drive. It does *not* change the
> weights assigned to the buckets above the OSD, and is a corrective
> measure in case the normal CRUSH distribution isn't working out quite
> right. (For instance, if one of your OSDs is at 90% and the others are
> at 50%, you could reduce this weight to try and compensate for it.)
>
> It looks like our docs aren't very clear on the difference, when it
> even mentions them...and admittedly it's a pretty subtle issue!
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to improve performance of ceph object storage cluster

2014-06-26 Thread Craig Lewis
Cern noted that they needed to reformat to put the journal in a partition
rather than on the OSD's filesystem like you did.  See
http://www.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern, slide 24.

When I saw that ceph-disk prepare created a journal partition, I thought it
was stupid to force a seek like that.  (This was before I saw Cern's
slides).  I really should've known better, there's a reason it's the
default behavior.  I didn't even benchmark the two. *hangs head in shame*

I really can't tell you why it's a bad idea, but I can say that my recoveries
are extremely painful.  I'm using RadosGW, and I only care about seconds of
latency.  During large recoveries (like adding new nodes), people complain
about how slow the cluster is.

I'm in the middle of rolling out SSD journals to all machines.
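For what it's worth, the difference boils down to something like this (device
names here are only placeholders, not a recommendation for your hardware):

# journal on its own (SSD) partition, which is what ceph-disk prepare sets up
# when you hand it a separate journal device:
ceph-disk prepare /dev/sdc /dev/sdb1
# versus a journal file living on the OSD's data filesystem, as in the
# ceph.conf quoted below:
#   osd journal = /data/osd$id/journal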




On Tue, Jun 24, 2014 at 11:52 PM, wsnote  wrote:

> OS: CentOS 6.5
> Version: Ceph 0.79
>
> Hi, everybody!
> I have installed a Ceph cluster with 10 servers.
> I tested the throughput of the Ceph cluster within the same datacenter.
> Uploading 1GB files from one or several servers to one or several servers,
> the total is about 30MB/s.
> That is to say, there is no difference in throughput between one server and
> the whole cluster when uploading files.
> How can I optimize the performance of Ceph object storage?
> Thanks!
>
>
> 
> Info about ceph cluster:
> 4 MONs in the first 4 nodes in the cluster.
> 11 OSDs in each server, 109 OSDs in total (one disk was bad).
> 4TB each disk, 391TB in total (109*4-391=45TB. Where did that 45TB of space go?)
> 1 RGW in each server, 10 RGWs in total. That is to say, I can use the S3 API on
> each server.
>
> ceph.conf:
> [global]
> auth supported = none
>
> ;auth_service_required = cephx
> ;auth_client_required = cephx
> ;auth_cluster_required = cephx
> filestore_xattr_use_omap = true
>
> max open files = 131072
> log file = /var/log/ceph/$name.log
> pid file = /var/run/ceph/$name.pid
> keyring = /etc/ceph/keyring.admin
>
> mon_clock_drift_allowed = 2 ;clock skew detected
>
> [mon]
> mon data = /data/mon$id
> keyring = /etc/ceph/keyring.$name
>  [osd]
> osd data = /data/osd$id
> osd journal = /data/osd$id/journal
> osd journal size = 1024;
> keyring = /etc/ceph/keyring.$name
> osd mkfs type = xfs
> osd mount options xfs = rw,noatime
> osd mkfs options xfs = -f
>
> [client.radosgw.cn-bj-1]
> rgw region = cn
> rgw region root pool = .cn.rgw.root
> rgw zone = cn-bj
> rgw zone root pool = .cn-wz.rgw.root
> host = yun168
> public_addr = 192.168.10.115
> rgw dns name = s3.domain.com
> keyring = /etc/ceph/ceph.client.radosgw.keyring
> rgw socket path = /var/run/ceph/$name.sock
> log file = /var/log/ceph/radosgw.log
> debug rgw = 20
> rgw print continue = true
> rgw should log = true
>
>
>
>
> [root@yun168 ~]# ceph -s
> cluster e48b0d5b-ff08-4a8e-88aa-4acd3f5a6204
>  health HEALTH_OK
>  monmap e7: 4 mons at {... ...  ...}, election epoch 78, quorum
> 0,1,2,3 0,1,2,3
>  mdsmap e49: 0/0/1 up
>  osdmap e3722: 109 osds: 109 up, 109 in
>   pgmap v106768: 29432 pgs, 19 pools, 12775 GB data, 12786 kobjects
> 640 GB used, 390 TB / 391 TB avail
>29432 active+clean
>   client io 1734 kB/s rd, 29755 kB/s wr, 443 op/s
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Continuing placement group problems

2014-06-26 Thread Gregory Farnum
On Thu, Jun 26, 2014 at 12:52 PM, Kevin Horan
 wrote:
> I am also getting inconsistent object errors on a regular basis, about 1-2
> every week or so for about 300GB of data. All OSDs are using XFS
> filesystems. Some OSDs are individual 3TB internal hard drives and some are
> external FC attached raid6 arrays. I am using this cluster to store kvm
> images and I've noticed that the inconsistent objects always occur on my two
> most recently created VM images, even though one of them is hardly ever used
> (just a bare VM not put into production yet). This all started about 4
> months ago on 0.72 and now is continuing to occur on version .80. I also
> changed the number of replicas from 2 to 3 for the pool containing these
> images and that had no effect.
>
> Here is an example log entry:
>
> 2014-06-24 18:11:51.683310 7faf44297700  0 log [ERR] : 4.b6 shard 0: soid
> c539a8b6/rbd_data.9fdea2ae8944a.04e2/head//4 digest 2541762784
> != known digest 3305022936
> 2014-06-24 18:11:52.107321 7faf50f60700  0
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
> Invalid argument
> 2014-06-24 18:11:52.215752 7faf5075f700  0
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
> Invalid argument
> 2014-06-24 18:11:52.365798 7faf50f60700  0
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
> Invalid argument
> 2014-06-24 18:11:52.674643 7faf5075f700  0
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
> Invalid argument
> 2014-06-24 18:11:52.749641 7faf50f60700  0
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
> Invalid argument
> 2014-06-24 18:11:55.194967 7faf5075f700  0
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
> Invalid argument
> 2014-06-24 18:11:55.259322 7faf50f60700  0
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
> Invalid argument
> 2014-06-24 18:11:55.526157 7faf5075f700  0
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
> Invalid argument
> 2014-06-24 18:11:55.547270 7faf44297700  0 log [ERR] : 4.b6 deep-scrub 0
> missing, 1 inconsistent objects
> 2014-06-24 18:11:55.547282 7faf44297700  0 log [ERR] : 4.b6 deep-scrub 1
> errors

Can you go find out what about those files is different? Are they
different sizes, with the overlapping pieces being the same? Are they
completely different? Are your systems losing power or otherwise doing
mean things to the local filesystem? Have you noticed a pattern of
distribution in terms of the underlying storage system on the
inconsistent OSDs?
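Something like this (a rough sketch; adjust the PG id, the object-name glob,
and the paths to match your cluster) would let you compare the replica files
directly:

ceph pg map 4.b6        # shows the up/acting OSDs that hold the PG
# then, on each of those OSD hosts:
find /var/lib/ceph/osd/ceph-*/current/4.b6_head/ -name '*9fdea2ae8944a*04e2*' \
    -exec ls -l {} \; -exec md5sum {} \;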
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

>
> Sometimes one of the objects has 0 size. I've also started getting the
> FSSETXATTR errors recently, though I think that started after this problem
> started. I've read elsewhere that these are harmless and will go away in a
> future version.  I also looked in the monitor logs but didn't see any
> reference to inconsistent or scrubbed objects.
>
> Kevin
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Continuing placement group problems

2014-06-26 Thread Kevin Horan
I am also getting inconsistent object errors on a regular basis, about 
1-2 every week or so for about 300GB of data. All OSDs are using XFS 
filesystems. Some OSDs are individual 3TB internal hard drives and some 
are external FC attached raid6 arrays. I am using this cluster to store 
kvm images and I've noticed that the inconsistent objects always occur 
on my two most recently created VM images, even though one of them is 
hardly ever used (just a bare VM not put into production yet). This all 
started about 4 months ago on 0.72 and now is continuing to occur on 
version .80. I also changed the number of replicas from 2 to 3 for the 
pool containing these images and that had no effect.


Here is an example log entry:

2014-06-24 18:11:51.683310 7faf44297700  0 log [ERR] : 4.b6 shard 0: 
soid c539a8b6/rbd_data.9fdea2ae8944a.04e2/head//4 digest 
2541762784 != known digest 3305022936
2014-06-24 18:11:52.107321 7faf50f60700  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: 
(22) Invalid argument
2014-06-24 18:11:52.215752 7faf5075f700  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: 
(22) Invalid argument
2014-06-24 18:11:52.365798 7faf50f60700  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: 
(22) Invalid argument
2014-06-24 18:11:52.674643 7faf5075f700  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: 
(22) Invalid argument
2014-06-24 18:11:52.749641 7faf50f60700  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: 
(22) Invalid argument
2014-06-24 18:11:55.194967 7faf5075f700  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: 
(22) Invalid argument
2014-06-24 18:11:55.259322 7faf50f60700  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: 
(22) Invalid argument
2014-06-24 18:11:55.526157 7faf5075f700  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: 
(22) Invalid argument
2014-06-24 18:11:55.547270 7faf44297700  0 log [ERR] : 4.b6 deep-scrub 0 
missing, 1 inconsistent objects
2014-06-24 18:11:55.547282 7faf44297700  0 log [ERR] : 4.b6 deep-scrub 1 
errors


Sometimes one of the objects has 0 size. I've also started getting the 
FSSETXATTR errors recently, though I think that started after this 
problem started. I've read elsewhere that these are harmless and will go 
away in a future version.  I also looked in the monitor logs but didn't 
see any reference to inconsistent or scrubbed objects.


Kevin
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Some PG remains "peering" after upgrade to Firefly 0.80.1

2014-06-26 Thread Pierre BLONDEAU

Hi,

Some PGs remain in the "peering" state after the upgrade to Firefly.
All of these PGs seem to involve the same OSD (16); see the attached file.

I tried to stop this OSD, but when I did, some PGs became inactive.

What can I do?

Regards.

--
--
Pierre BLONDEAU
Administrateur Systèmes & réseaux
Université de Caen
Laboratoire GREYC, Département d'informatique

tel : 02 31 56 75 42
bureau  : Campus 2, Science 3, 406
--
# ceph pg dump | grep peering
dumped all in format plain
0.183   11983   0   0   0   10427139093 30013001peering 
2014-06-19 20:02:14.336102  7272'38060  9443:205684 [16,21] 16  
[16,21] 16  7272'38060  2014-06-18 11:14:44.256675  7272'38060
2014-06-15 09:00:51.503603
0.17d   11781   0   0   0   10069345173 30023002peering 
2014-06-19 20:17:55.111756  9363'37428  9443:154364 [16,1]  16  
[16,1]  16  7272'37427  2014-06-18 08:09:28.789890  7272'37427
2014-06-18 08:09:28.789890
0.a411802   0   0   0   10197325776 30013001peering 
2014-06-19 18:58:30.358562  7272'37300  9443:154382 [16,7]  16  
[16,7]  16  7272'37300  2014-06-17 20:11:47.525174  7272'37300
2014-06-16 17:41:53.331833
0.4c11924   0   0   0   10143098090 30023002peering 
2014-06-19 18:54:53.964913  9403'38144  9443:145595 [16,25] 16  
[16,25] 16  7272'38143  2014-06-18 09:52:08.198845  7272'38143
2014-06-14 06:48:08.792218
0.6e8   11737   0   0   0   10003845159 30033003peering 
2014-06-19 18:10:04.675405  9070'37041  9443:139424 [16,0]  16  
[16,0]  16  7815'37040  2014-06-18 01:48:24.490826  7272'37039
2014-06-16 21:05:49.085204
0.6cb   12133   0   0   0   10401273250 30013001peering 
2014-06-19 19:45:45.549566  7272'38338  9443:157345 [16,2]  16  
[16,2]  16  7272'38338  2014-06-18 09:20:22.852169  7272'38338
2014-06-14 04:29:41.282777
0.605   11739   0   0   0   10457751036 30023002peering 
2014-06-19 18:39:24.262705  9403'36957  9443:150978 [16,23] 16  
[16,23] 16  7272'36956  2014-06-18 08:48:23.483915  7272'36956
2014-06-14 03:21:45.419081
0.5d7   11780   0   0   0   10069556805 30013001peering 
2014-06-19 19:23:04.636194  7272'37647  9443:141532 [16,7]  16  
[16,7]  16  7272'37647  2014-06-17 11:06:55.623203  7272'37647
2014-06-15 10:28:18.834868
0.52d   11933   0   0   0   9987516970  30033003peering 
2014-06-19 20:03:51.966678  8140'37995  9443:141404 [16,9]  16  
[16,9]  16  7910'37994  2014-06-17 20:00:29.459131  7272'37993
2014-06-15 16:40:55.982749
0.4b2   11983   0   0   0   10051670129 30013001peering 
2014-06-19 18:58:54.192528  7272'37626  9443:139692 [16,10] 16  
[16,10] 16  7272'37626  2014-06-18 03:30:02.698832  7272'37626
2014-06-16 23:14:07.622190
0.48a   11956   0   0   0   10128500386 30013001peering 
2014-06-19 20:19:09.503945  7272'37701  9443:175908 [16,1]  16  
[16,1]  16  7272'37701  2014-06-18 09:10:22.653136  7272'37701
2014-06-14 04:13:54.103455
0.374   24030   0   0   0   20635805936 30063006peering 
2014-06-19 19:02:57.694534  9403'75637  9443:341986 [16,18] 16  
[16,18] 16  7815'75633  2014-06-18 03:22:06.559734  7272'75632
2014-06-16 22:37:43.670052
0.373   24267   0   0   0   20721317362 30013001peering 
2014-06-19 18:31:21.671566  7272'77231  9443:298500 [16,24] 16  
[16,24] 16  7272'77231  2014-06-18 04:18:57.258497  7272'77231
2014-06-18 04:18:57.258497
0.36e   23767   0   0   0   19873454495 30013001peering 
2014-06-19 20:16:31.483342  7272'75823  9443:402552 [16,3]  16  
[16,3]  16  7272'75823  2014-06-17 20:48:31.470419  7272'75823
2014-06-16 20:06:44.687559
0.331   23751   0   0   0   19725402006 30013001peering 
2014-06-19 19:52:01.676693  7272'74947  9443:349176 [16,18] 16  
[16,18] 16  7272'74947  2014-06-18 01:58:06.619495  7272'74947
2014-06-16 21:25:09.111350
0.2eb   12029   0   0   0   10156848656 30013001peering 
2014-06-19 19:38:01.638117  7272'37893  9443:147549 [16,6]  16  
[16,6]  16  7272'37893  2014-06-18 02:06:36.900841  7272'37893
2014-06-16 21:31:37.370617
0.2a9   11882   0   0   0   10149455357 30013001peering 
2014-06-19 20:01:16.604454  7272'37768  9443:175913 [16,26] 16

Re: [ceph-users] How to improve performance of ceph object storage cluster

2014-06-26 Thread Aronesty, Erik
Well, it's the same for rbd: what's your stripe count set to?  For a small 
system, it should be at least the number of nodes in your system.  As systems get 
larger, there are diminishing returns... I would imagine there would be some OSD 
caching advantage to keeping the number limited (i.e. more requests to the same 
device = more likely the device has the next stripe unit prefetched).


-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Udo 
Lembke
Sent: Thursday, June 26, 2014 2:03 AM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] How to improve performance of ceph object storage 
cluster

Hi,

On 25.06.2014 16:48, Aronesty, Erik wrote:
> I'm assuming you're testing the speed of cephfs (the file system) and not 
> ceph "object storage".

for my part I mean object storage (VM disk via rbd).

Udo


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Difference between "ceph osd reweight" and "ceph osd crush reweight"

2014-06-26 Thread Gregory Farnum
On Thu, Jun 26, 2014 at 7:03 AM, Micha Krause  wrote:
> Hi,
>
> could someone explain to me what the difference is between
>
> ceph osd reweight
>
> and
>
> ceph osd crush reweight

"ceph osd crush reweight" sets the CRUSH weight of the OSD. This
weight is an arbitrary value (generally the size of the disk in TB or
something) and controls how much data the system tries to allocate to
the OSD.

"ceph osd reweight" sets an override weight on the OSD. This value is
in the range 0 to 1, and forces CRUSH to re-place (1-weight) of the
data that would otherwise live on this drive. It does *not* change the
weights assigned to the buckets above the OSD, and is a corrective
measure in case the normal CRUSH distribution isn't working out quite
right. (For instance, if one of your OSDs is at 90% and the others are
at 50%, you could reduce this weight to try and compensate for it.)
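For example (the OSD id and values are only illustrative):

ceph osd crush reweight osd.7 3.64   # CRUSH weight, typically the disk size in TB
ceph osd reweight 7 0.8              # override weight in [0,1]; re-places ~20% of
                                     # the data CRUSH would otherwise put on osd.7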

It looks like our docs aren't very clear on the difference, when it
even mentions them...and admittedly it's a pretty subtle issue!
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with RadosGW and special characters

2014-06-26 Thread Sylvain Munaut
> URL decoding the path is not the correct behavior.

Yes it is ... If you remove that then every other special char file
will be broken because % encoding will not be applied.

From RFC 3986, the path must be split into its components first
(split on '/'), then URL-decoded component by component.

It may just be that converting '+' to space should not be part of the URL
decoding process for the path components, because the '+' convention is an
application/x-www-form-urlencoded thing and not a URL-encoding thing,
but the whole URL decoding process can't be removed without breaking
much more stuff.
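A rough sketch of that order of operations, using a Python 2 one-liner purely
for illustration (split on '/', percent-decode each component, leave '+'
alone):

python -c 'import urllib; u="/ubuntu/pool/main/a/adduser/adduser_3.113%2Bnmu3ubuntu3_all.deb"; print "/".join(urllib.unquote(c) for c in u.split("/"))'
# -> /ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb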


Cheers,

   Sylvain
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with RadosGW and special characters

2014-06-26 Thread Brian Rak
Going back to my first post, I linked to this 
http://stackoverflow.com/questions/1005676/urls-and-plus-signs


Per the definition of application/x-www-form-urlencoded: 
http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1


"Control names and values are escaped. Space characters are replaced 
by `+', and then reserved characters are escaped as described in [RFC1738]."


The whole +=space thing is only for the query portion of the URL, not 
the filename.
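A quick way to see the two decoding behaviours side by side (Python 2
one-liners, purely illustrative):

python -c 'import urllib; print urllib.unquote("test+file%20x")'
# -> test+file x   ('+' left alone, only %XX decoded -- path-style decoding)
python -c 'import urllib; print urllib.unquote_plus("test+file%20x")'
# -> test file x   ('+' converted to a space -- form/query-style decoding)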


I've done some testing with nginx, and this is how it behaves:

On the server, somewhere in the webroot:

echo space > "test file"

Then, from a client:
$ wget --spider "http://example.com/test/test file"
Spider mode enabled. Check if remote file exists.
--2014-06-26 11:46:54--  http://example.com/test/test%20file
Connecting to example.com:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6 [application/octet-stream]
Remote file exists.

$ wget --spider "http://example.com/test/test+file";
Spider mode enabled. Check if remote file exists.
--2014-06-26 11:46:57--  http://example.com/test/test+file
Connecting to example.com:80... connected.
HTTP request sent, awaiting response... 404 Not Found
Remote file does not exist -- broken link!!!

These tests were done just with the standard filesystem.  I wasn't using 
radosgw for this.  Feel free to repeat with the web server of your 
choice, you'll find the same thing happens.


URL decoding the path is not the correct behavior.


On 6/26/2014 11:36 AM, Sylvain Munaut wrote:

Hi,


Based on the debug log, radosgw is definitely the software that's
incorrectly parsing the URL.  For example:


2014-06-25 17:30:37.383134 7f7c6cfa9700 20
REQUEST_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
2014-06-25 17:30:37.383199 7f7c6cfa9700 10
s->object=ubuntu/pool/main/a/adduser/adduser_3.113 nmu3ubuntu3_all.deb
s->bucket=ubuntu

I'll dig into this some more, but it definitely looks like radosgw is the
one that's unencoding the + character here.  How else would it be receiving
the request_uri with the + in it, but then a little bit later the request
has a space in it instead?

Note that AFAIK, in fastcgi, REQUEST_URI is _supposed_ to be an URL
encoded version and should be URL-decoded by the fastcgi handler. So
converting the + to ' ' seems valid to me.


Cheers,

Sylvain


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with RadosGW and special characters

2014-06-26 Thread Sylvain Munaut
Hi,

> Based on the debug log, radosgw is definitely the software that's
> incorrectly parsing the URL.  For example:
>
>
> 2014-06-25 17:30:37.383134 7f7c6cfa9700 20
> REQUEST_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
> 2014-06-25 17:30:37.383199 7f7c6cfa9700 10
> s->object=ubuntu/pool/main/a/adduser/adduser_3.113 nmu3ubuntu3_all.deb
> s->bucket=ubuntu
>
> I'll dig into this some more, but it definitely looks like radosgw is the
> one that's unencoding the + character here.  How else would it be receiving
> the request_uri with the + in it, but then a little bit later the request
> has a space in it instead?

Note that AFAIK, in fastcgi, REQUEST_URI is _supposed_ to be an URL
encoded version and should be URL-decoded by the fastcgi handler. So
converting the + to ' ' seems valid to me.


Cheers,

   Sylvain
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] limitations of erasure coded pools

2014-06-26 Thread Chad Seys
Thanks for the link Blairo!

I can think of a use case already!  (combo replicated pool / erasure pool for 
a virtual tape library)

! Chad.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Difference between "ceph osd reweight" and "ceph osd crush reweight"

2014-06-26 Thread Micha Krause

Hi,

could someone explain to me what the difference is between

ceph osd reweight

and

ceph osd crush reweight


Micha Krause
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with RadosGW and special characters

2014-06-26 Thread Brian Rak

I have reproduced this on a standard apache config:

2014-06-26 09:24:43.733549 7fc57effd700 20 RGWWQ:
2014-06-26 09:24:43.733551 7fc57effd700 20 req: 0x7fc56c00b180
2014-06-26 09:24:43.733554 7fc57effd700 10 allocated request 
req=0x7fc56c00c1f0
2014-06-26 09:24:43.733622 7fc5377ae700 20 dequeued request 
req=0x7fc56c00b180

2014-06-26 09:24:43.733628 7fc5377ae700 20 RGWWQ: empty
2014-06-26 09:24:43.733662 7fc5377ae700 20 DOCUMENT_ROOT=/var/www
2014-06-26 09:24:43.733668 7fc5377ae700 20 FCGI_ROLE=RESPONDER
2014-06-26 09:24:43.733669 7fc5377ae700 20 GATEWAY_INTERFACE=CGI/1.1
2014-06-26 09:24:43.733670 7fc5377ae700 20 HTTP_ACCEPT=*/*
2014-06-26 09:24:43.733670 7fc5377ae700 20 HTTP_AUTHORIZATION=
2014-06-26 09:24:43.733671 7fc5377ae700 20 HTTP_CONNECTION=Keep-Alive
2014-06-26 09:24:43.733671 7fc5377ae700 20 HTTP_HOST=ubuntu.example.com:8080
2014-06-26 09:24:43.733672 7fc5377ae700 20 HTTP_USER_AGENT=Wget/1.12 
(linux-gnu)
2014-06-26 09:24:43.733673 7fc5377ae700 20 
PATH=/sbin:/usr/sbin:/bin:/usr/bin

2014-06-26 09:24:43.733673 7fc5377ae700 20 QUERY_STRING=
2014-06-26 09:24:43.733674 7fc5377ae700 20 REMOTE_ADDR=1.1.1.1
2014-06-26 09:24:43.733674 7fc5377ae700 20 REMOTE_PORT=43402
2014-06-26 09:24:43.733675 7fc5377ae700 20 REQUEST_METHOD=HEAD
2014-06-26 09:24:43.733675 7fc5377ae700 20 
REQUEST_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
2014-06-26 09:24:43.733676 7fc5377ae700 20 
SCRIPT_FILENAME=/var/www/s3gw.fcgi
2014-06-26 09:24:43.733677 7fc5377ae700 20 
SCRIPT_NAME=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
2014-06-26 09:24:43.733677 7fc5377ae700 20 
SCRIPT_URI=http://ubuntu.example.com:8080/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
2014-06-26 09:24:43.733678 7fc5377ae700 20 
SCRIPT_URL=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb

2014-06-26 09:24:43.733679 7fc5377ae700 20 SERVER_ADDR=1.1.1.1
2014-06-26 09:24:43.733680 7fc5377ae700 20 SERVER_ADMIN=t...@example.com
2014-06-26 09:24:43.733680 7fc5377ae700 20 SERVER_NAME=ubuntu.example.com
2014-06-26 09:24:43.733681 7fc5377ae700 20 SERVER_PORT=8080
2014-06-26 09:24:43.733681 7fc5377ae700 20 SERVER_PROTOCOL=HTTP/1.0
2014-06-26 09:24:43.733682 7fc5377ae700 20 SERVER_SIGNATURE=
2014-06-26 09:24:43.733682 7fc5377ae700 20 SERVER_SOFTWARE=Apache/2.2.15 
(CentOS)
2014-06-26 09:24:43.733683 7fc5377ae700  1 == starting new request 
req=0x7fc56c00b180 =
2014-06-26 09:24:43.733696 7fc5377ae700  2 req 1:0.12::HEAD 
/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb::initializing
2014-06-26 09:24:43.733700 7fc5377ae700 10 host=ubuntu.example.com:8080 
rgw_dns_name=example.com
2014-06-26 09:24:43.733732 7fc5377ae700 10 
s->object=ubuntu/pool/main/a/adduser/adduser_3.113 nmu3ubuntu3_all.deb 
s->bucket=ubuntu
2014-06-26 09:24:43.733737 7fc5377ae700  2 req 1:0.54:s3:HEAD 
/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb::getting op
2014-06-26 09:24:43.733744 7fc5377ae700  2 req 1:0.61:s3:HEAD 
/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb:get_obj:authorizing
2014-06-26 09:24:43.733751 7fc5377ae700  2 req 1:0.68:s3:HEAD 
/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb:get_obj:reading 
permissions
2014-06-26 09:24:43.733785 7fc5377ae700 20 get_obj_state: 
rctx=0x7fc5377ad640 obj=.rgw:ubuntu state=0x7fc56c015228 s->prefetch_data=0
2014-06-26 09:24:43.733791 7fc5377ae700 10 cache get: name=.rgw+ubuntu : 
miss

2014-06-26 09:24:43.735748 7fc5377ae700 10 cache put: name=.rgw+ubuntu
2014-06-26 09:24:43.735759 7fc5377ae700 10 adding .rgw+ubuntu to cache 
LRU end
2014-06-26 09:24:43.735765 7fc5377ae700 20 get_obj_state: s->obj_tag was 
set empty

2014-06-26 09:24:43.735770 7fc5377ae700 20 Read xattr: user.rgw.idtag
2014-06-26 09:24:43.735772 7fc5377ae700 20 Read xattr: user.rgw.manifest
2014-06-26 09:24:43.735776 7fc5377ae700 10 cache get: name=.rgw+ubuntu : 
type miss (requested=17, cached=22)
2014-06-26 09:24:43.735780 7fc5377ae700 20 get_obj_state: 
rctx=0x7fc5377ad640 obj=.rgw:ubuntu state=0x7fc56c015228 s->prefetch_data=0
2014-06-26 09:24:43.735783 7fc5377ae700 20 get_obj_state: 
rctx=0x7fc5377ad640 obj=.rgw:ubuntu state=0x7fc56c015228 s->prefetch_data=0
2014-06-26 09:24:43.735785 7fc5377ae700 20 state for obj=.rgw:ubuntu is 
not atomic, not appending atomic test
2014-06-26 09:24:43.735807 7fc5377ae700 20 rados->read obj-ofs=0 
read_ofs=0 read_len=524288

2014-06-26 09:24:43.737560 7fc5377ae700 20 rados->read r=0 bl.length=129
2014-06-26 09:24:43.737604 7fc5377ae700 10 cache put: name=.rgw+ubuntu
2014-06-26 09:24:43.737610 7fc5377ae700 10 moving .rgw+ubuntu to cache 
LRU end
2014-06-26 09:24:43.737623 7fc5377ae700 20 rgw_get_bucket_info: bucket 
instance: ubuntu(@{i=.rgw.buckets.index}.rgw.buckets[default.58906.10])
2014-06-26 09:24:43.737631 7fc5377ae700 20 reading from 
.rgw:.bucket.meta.ubuntu:default.58906.10
2014-06-26 09:24:43.737666 7fc5377ae700 20 get_obj_state: 
rctx=0x7fc5377ad640 obj=.rgw:.bucket.meta.ubuntu:default.

Re: [ceph-users] ceph-rest-api (image locks)

2014-06-26 Thread Wido den Hollander

On 06/26/2014 02:49 PM, NEVEU Stephane wrote:

Hi,

I'm just discovering ceph-rest-api and I'd like to know: is there a
way to play with images?

I mean, I cannot find a way to create/destroy/resize images in a pool,
for example... and what about lock_exclusive / lock_shared / unlock on images?

Is it still to come, or am I missing something?



Nope, that's not in the API.

The API is there to manage your Ceph cluster, not to manage the data in 
the cluster.



Thank you !



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-rest-api (image locks)

2014-06-26 Thread NEVEU Stephane
Hi,

I'm just discovering ceph-rest-api and I'd like to know: is there a way to 
play with images?
I mean, I cannot find a way to create/destroy/resize images in a pool, for 
example... and what about lock_exclusive / lock_shared / unlock on images?
Is it still to come, or am I missing something?

Thank you !


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with RadosGW and special characters

2014-06-26 Thread Brian Rak
Based on the debug log, radosgw is definitely the software that's 
incorrectly parsing the URL.  For example:


2014-06-25 17:30:37.383134 7f7c6cfa9700 20 
REQUEST_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
2014-06-25 17:30:37.383199 7f7c6cfa9700 10 
s->object=ubuntu/pool/main/a/adduser/adduser_3.113 nmu3ubuntu3_all.deb 
s->bucket=ubuntu


I'll dig into this some more, but it definitely looks like radosgw is 
the one that's unencoding the + character here.  How else would it be 
receiving the request_uri with the + in it, but then a little bit later 
the request has a space in it instead?


On 6/26/2014 2:59 AM, Yehuda Sadeh wrote:

The gateway itself supports these kinds of characters. Usually we see
this issue when there's something in front of the web server (like a
load balancer) that modifies the requests. Another possibility is the
web server configuration that might be rewriting the requests. In this
case it seems that you're using nginx which is outside of our usual
test environment, so it might be related.

Yehuda

On Jun 25, 2014 5:39 PM, "Brian Rak"  wrote:

Unfortunately, both the client and actual files are outside of my control 
here. In the case that I noticed, the client is the Ubuntu installer, and 
the files are part of the Ubuntu archives content.

On 6/25/2014 8:07 PM, Gerard Toonstra wrote:

the + is a reserved character in the HTTP protocol, which means it may have 
specific meaning in a specific part of the URL, but not everywhere.

The earliest HTTP specification re-encoded spaces in the URL as + characters 
after the question mark and form fields for posts that were
sent with urlencode.

Best is to prevent these characters in filenames or percent-encode the URL 
explicitly.

Rgds,

G>



On Wed, Jun 25, 2014 at 8:41 PM, Brian Rak  wrote:

ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)

I'll try to take a look through the bug tracker, but I didn't see anything 
obvious at first glance.


On 6/25/2014 7:33 PM, Gregory Farnum wrote:

Unfortunately Yehuda's out for a while as he could best handle this,
but it sounds familiar so I think you probably want to search the list
archives and the bug tracker (http://tracker.ceph.com/projects/rgw).
What version precisely are you on?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Wed, Jun 25, 2014 at 2:58 PM, Brian Rak  wrote:

I'm trying to find an issue with RadosGW and special characters in
filenames.  Specifically, it seems that filenames with a + in them are not
being handled correctly, and that I need to explicitly escape them.

For example:

---request begin---
HEAD /ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)

Will fail with a 404 error, but

---request begin---
HEAD /ubuntu/pool/main/a/adduser/adduser_3.113%2Bnmu3ubuntu3_all.deb
HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)

will work properly.

I enabled debug mode on radosgw, and see this:

2014-06-25 17:30:37.383029 7f7ca7fff700 20 RGWWQ:
2014-06-25 17:30:37.383040 7f7ca7fff700 20 req: 0x7f7ca000b180
2014-06-25 17:30:37.383053 7f7ca7fff700 10 allocated request
req=0x7f7ca0015ef0
2014-06-25 17:30:37.383064 7f7c6cfa9700 20 dequeued request
req=0x7f7ca000b180
2014-06-25 17:30:37.383070 7f7c6cfa9700 20 RGWWQ: empty
2014-06-25 17:30:37.383121 7f7c6cfa9700 20 CONTENT_LENGTH=
2014-06-25 17:30:37.383123 7f7c6cfa9700 20 CONTENT_TYPE=
2014-06-25 17:30:37.383124 7f7c6cfa9700 20 DOCUMENT_ROOT=/etc/nginx/html
2014-06-25 17:30:37.383125 7f7c6cfa9700 20
DOCUMENT_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
2014-06-25 17:30:37.383126 7f7c6cfa9700 20 FCGI_ROLE=RESPONDER
2014-06-25 17:30:37.383127 7f7c6cfa9700 20 GATEWAY_INTERFACE=CGI/1.1
2014-06-25 17:30:37.383128 7f7c6cfa9700 20 HTTP_ACCEPT=*/*
2014-06-25 17:30:37.383129 7f7c6cfa9700 20 HTTP_CONNECTION=Keep-Alive
2014-06-25 17:30:37.383129 7f7c6cfa9700 20 HTTP_HOST=xxx
2014-06-25 17:30:37.383130 7f7c6cfa9700 20 HTTP_USER_AGENT=Wget/1.12
(linux-gnu)
2014-06-25 17:30:37.383131 7f7c6cfa9700 20 QUERY_STRING=
2014-06-25 17:30:37.383131 7f7c6cfa9700 20 REDIRECT_STATUS=200
2014-06-25 17:30:37.383132 7f7c6cfa9700 20 REMOTE_ADDR=yyy
2014-06-25 17:30:37.383133 7f7c6cfa9700 20 REMOTE_PORT=43855
2014-06-25 17:30:37.383134 7f7c6cfa9700 20 REQUEST_METHOD=HEAD
2014-06-25 17:30:37.383134 7f7c6cfa9700 20
REQUEST_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
2014-06-25 17:30:37.383135 7f7c6cfa9700 20
SCRIPT_NAME=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
2014-06-25 17:30:37.383136 7f7c6cfa9700 20 SERVER_ADDR=yyy
2014-06-25 17:30:37.383136 7f7c6cfa9700 20 SERVER_NAME=xxx
2014-06-25 17:30:37.383137 7f7c6cfa9700 20 SERVER_PORT=80
2014-06-25 17:30:37.383138 7f7c6cfa9700 20 SERVER_PROTOCOL=HTTP/1.0
2014-06-25 17:30:37.383138 7f7c6cfa9700 20 SERVER_SOFTWARE=nginx/1.4.6
2014-06-25 17:30:37.383140 7f7c6cfa9700  1 == starting new request
req=0x7f7ca000b180 =
2014-06-25 17:30:37.383152 

Re: [ceph-users] Behaviour of ceph pg repair on different replication levels

2014-06-26 Thread Christian Kauhaus
Am 26.06.2014 02:08, schrieb Gregory Farnum:
> It's a good idea, and in fact there was a discussion yesterday during
> the Ceph Developer Summit about making scrub repair significantly more
> powerful; they're keeping that use case in mind in addition to very
> fine-grained ones like specifying a particular replica for every
> object.

+1

This would be very cool.

> Yeah, it's got nothing and is relying on the local filesystem to barf
> if that happens. Unfortunately, neither xfs nor ext4 provide that
> checking functionality (which is one of the reasons we continue to
> look to btrfs as our long-term goal).

When thinking at petabyte scale, bit rot is going to happen as a matter of fact.
So I think Ceph should be prepared, at least when there are more than 2 
replicas.

Regards

Christian

-- 
Dipl.-Inf. Christian Kauhaus <>< · k...@gocept.com · systems administration
gocept gmbh & co. kg · Forsterstraße 29 · 06112 Halle (Saale) · Germany
http://gocept.com · tel +49 345 219401-11
Python, Pyramid, Plone, Zope · consulting, development, hosting, operations



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Poor performance on all SSD cluster

2014-06-26 Thread Mark Kirkwood

On 26/06/14 03:15, Josef Johansson wrote:

Hi,

On 25/06/14 00:27, Mark Kirkwood wrote:


Yes - same kind of findings, specifically:

- random read and write (e.g index access) faster than local disk
- sequential write (e.g batch inserts) similar or faster than local disk
- sequential read (e.g table scan) slower than local disk


Regarding sequential read, I think it was
https://software.intel.com/en-us/blogs/2013/11/20/measure-ceph-rbd-performance-in-a-quantitative-way-part-ii
that did some tuning with that.
Anyone tried to optimize it the way they did in the article?




In a similar vein, enabling striping in the rbd volume might be worth 
experimenting with (just thought of it after reading 'How to improve 
performance of ceph object storage cluster' thread).
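A minimal sketch of what that would look like (pool/image name and sizes are
made up; striping requires format 2 images):

rbd create testpool/striped-vol --size 10240 --image-format 2 \
    --stripe-unit 65536 --stripe-count 16
rbd info testpool/striped-vol    # should report the stripe unit/count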


Regards

Mark
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS ACL support status

2014-06-26 Thread Sean Crosby
On 26 June 2014 13:47, Yan, Zheng  wrote:

> Can you try the attached patch. It should solve this issue.
>
>
Yep, works for me! Thanks!

Sean



> Regards
> Yan, Zheng
>
> On Thu, Jun 26, 2014 at 10:45 AM, Sean Crosby
>  wrote:
> > Hi,
> >
> >
> > On 26 June 2014 12:07, Yan, Zheng  wrote:
> >>
> >> On Wed, Jun 25, 2014 at 2:56 PM, Sean Crosby
> >>  wrote:
> >> > I have recently deployed a Firefly CephFS cluster, and am trying out
> >> > the POSIX ACL feature that is supposed to have come in as of kernel
> >> > 3.14. I've mounted my CephFS volume on a machine with kernel 3.15
> >> >
> >> > The ACL support seems to work (as in I can set and retrieve ACL's),
> >> > but it seems kinda buggy, especially when it tries to change an
> >> > existing ACL
> >> >
> >> > E.g.
> >> >
> >> > # uname -r
> >> > 3.15.1-1.el6.elrepo.x86_64
> >> >
> >> > # cat /boot/config-3.15.1-1.el6.elrepo.x86_64 | grep CEPH
> >> > CONFIG_CEPH_LIB=m
> >> > # CONFIG_CEPH_LIB_PRETTYDEBUG is not set
> >> > # CONFIG_CEPH_LIB_USE_DNS_RESOLVER is not set
> >> > CONFIG_CEPH_FS=m
> >> > CONFIG_CEPH_FSCACHE=y
> >> > CONFIG_CEPH_FS_POSIX_ACL=y
> >> >
> >> > # rpm -qa | grep ceph
> >> > libcephfs1-0.80.1-0.el6.x86_64
> >> > python-ceph-0.80.1-0.el6.x86_64
> >> > ceph-0.80.1-0.el6.x86_64
> >> >
> >> > (This is the same version on the MDS and all OSD's)
> >> >
> >> > # mount | grep ceph
> >> > 192.168.1.8:/ on /ceph type ceph (acl,name=admin,key=client.admin)
> >> >
> >> > # ls -la /ceph/
> >> > total 5
> >> > drwxrwxr-x   1 rootpeople0 Jun 25 05:57 .
> >> > dr-xr-xr-x. 25 rootroot   4096 Jun 20 04:33 ..
> >> > -rw-rwx---+  1 scrosby people   31 Jun 25 05:57 sean
> >> >
> >> > # getfacl /ceph/sean
> >> > getfacl: Removing leading '/' from absolute path names
> >> > # file: ceph/sean
> >> > # owner: scrosby
> >> > # group: people
> >> > user::rw-
> >> > user:lucien:rw-
> >> > group::---
> >> > mask::rwx
> >> > other::---
> >> >
> >> > # setfacl -m "u:jkahn:rw" /ceph/sean
> >> >
> >> > # getfacl /ceph/sean
> >> > getfacl: Removing leading '/' from absolute path names
> >> > # file: ceph/sean
> >> > # owner: scrosby
> >> > # group: people
> >> > user::rw-
> >> > group::rw-
> >> > other::---
> >> >
> >> > If I umount and mount /ceph again, the ACL shows up again
> >> >
> >> > # umount /ceph
> >> >
> >> > # mount -t ceph 192.168.1.8:/ /ceph -o
> >> > acl,name=admin,secret=`ceph-authtool -p
> >> > /etc/ceph/ceph.client.admin.keyring`
> >> >
> >> > # getfacl /ceph/sean
> >> > getfacl: Removing leading '/' from absolute path names
> >> > # file: ceph/sean
> >> > # owner: scrosby
> >> > # group: people
> >> > user::rw-
> >> > user:lucien:rw-
> >> > user:jkahn:rw-
> >> > group::---
> >> > mask::rw-
> >> > other::---
> >> >
> >> > Is there some outstanding bugs regarding CephFS and POSIX ACL's?
> >> >
> >>
> >> thank you for reporting this. I run the same test locally. It seems
> >> the issue only happens on root directory of cephfs, could you test and
> >> confirm this.
> >
> >
> > Based on my testing, you are correct in your synopsis. I can successfully
> > add an ACL to a file, then add another ACL to the file without the ACL
> > breaking, as long as the file is in a directory other than the root.
> >
> > Sean
> >
> >>
> >>
> >> Regards
> >> Yan, Zheng
> >>
> >>
> >> > Cheers,
> >> > Sean
> >> > ___
> >> > ceph-users mailing list
> >> > ceph-users@lists.ceph.com
> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with RadosGW and special characters

2014-06-26 Thread Yehuda Sadeh
The gateway itself supports these kinds of characters. Usually we see
this issue when there's something in front of the web server (like a
load balancer) that modifies the requests. Another possibility is the
web server configuration that might be rewriting the requests. In this
case it seems that you're using nginx which is outside of our usual
test environment, so it might be related.

Yehuda

On Jun 25, 2014 5:39 PM, "Brian Rak"  wrote:
>
> Unfortunately, both the client and actual files are outside of my control 
> here. In the case that I noticed, the client is the Ubuntu installer, and 
> the files are part of the Ubuntu archives content.
>
> On 6/25/2014 8:07 PM, Gerard Toonstra wrote:
>
> the + is a reserved character in the HTTP protocol, which means it may have 
> specific meaning in a specific part of the URL, but not everywhere.
>
> The earliest HTTP specification re-encoded spaces in the URL as + characters 
> after the question mark and form fields for posts that were
> sent with urlencode.
>
> Best is to prevent these characters in filenames or percent-encode the URL 
> explicitly.
>
> Rgds,
>
> G>
>
>
>
> On Wed, Jun 25, 2014 at 8:41 PM, Brian Rak  wrote:
>>
>> ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
>>
>> I'll try to take a look through the bug tracker, but I didn't see anything 
>> obvious at first glance.
>>
>>
>> On 6/25/2014 7:33 PM, Gregory Farnum wrote:
>>>
>>> Unfortunately Yehuda's out for a while as he could best handle this,
>>> but it sounds familiar so I think you probably want to search the list
>>> archives and the bug tracker (http://tracker.ceph.com/projects/rgw).
>>> What version precisely are you on?
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>>
>>> On Wed, Jun 25, 2014 at 2:58 PM, Brian Rak  wrote:

 I'm trying to find an issue with RadosGW and special characters in
 filenames.  Specifically, it seems that filenames with a + in them are not
 being handled correctly, and that I need to explicitly escape them.

 For example:

 ---request begin---
 HEAD /ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb HTTP/1.0
 User-Agent: Wget/1.12 (linux-gnu)

 Will fail with a 404 error, but

 ---request begin---
 HEAD /ubuntu/pool/main/a/adduser/adduser_3.113%2Bnmu3ubuntu3_all.deb
 HTTP/1.0
 User-Agent: Wget/1.12 (linux-gnu)

 will work properly.

 I enabled debug mode on radosgw, and see this:

 2014-06-25 17:30:37.383029 7f7ca7fff700 20 RGWWQ:
 2014-06-25 17:30:37.383040 7f7ca7fff700 20 req: 0x7f7ca000b180
 2014-06-25 17:30:37.383053 7f7ca7fff700 10 allocated request
 req=0x7f7ca0015ef0
 2014-06-25 17:30:37.383064 7f7c6cfa9700 20 dequeued request
 req=0x7f7ca000b180
 2014-06-25 17:30:37.383070 7f7c6cfa9700 20 RGWWQ: empty
 2014-06-25 17:30:37.383121 7f7c6cfa9700 20 CONTENT_LENGTH=
 2014-06-25 17:30:37.383123 7f7c6cfa9700 20 CONTENT_TYPE=
 2014-06-25 17:30:37.383124 7f7c6cfa9700 20 DOCUMENT_ROOT=/etc/nginx/html
 2014-06-25 17:30:37.383125 7f7c6cfa9700 20
 DOCUMENT_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
 2014-06-25 17:30:37.383126 7f7c6cfa9700 20 FCGI_ROLE=RESPONDER
 2014-06-25 17:30:37.383127 7f7c6cfa9700 20 GATEWAY_INTERFACE=CGI/1.1
 2014-06-25 17:30:37.383128 7f7c6cfa9700 20 HTTP_ACCEPT=*/*
 2014-06-25 17:30:37.383129 7f7c6cfa9700 20 HTTP_CONNECTION=Keep-Alive
 2014-06-25 17:30:37.383129 7f7c6cfa9700 20 HTTP_HOST=xxx
 2014-06-25 17:30:37.383130 7f7c6cfa9700 20 HTTP_USER_AGENT=Wget/1.12
 (linux-gnu)
 2014-06-25 17:30:37.383131 7f7c6cfa9700 20 QUERY_STRING=
 2014-06-25 17:30:37.383131 7f7c6cfa9700 20 REDIRECT_STATUS=200
 2014-06-25 17:30:37.383132 7f7c6cfa9700 20 REMOTE_ADDR=yyy
 2014-06-25 17:30:37.383133 7f7c6cfa9700 20 REMOTE_PORT=43855
 2014-06-25 17:30:37.383134 7f7c6cfa9700 20 REQUEST_METHOD=HEAD
 2014-06-25 17:30:37.383134 7f7c6cfa9700 20
 REQUEST_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
 2014-06-25 17:30:37.383135 7f7c6cfa9700 20
 SCRIPT_NAME=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
 2014-06-25 17:30:37.383136 7f7c6cfa9700 20 SERVER_ADDR=yyy
 2014-06-25 17:30:37.383136 7f7c6cfa9700 20 SERVER_NAME=xxx
 2014-06-25 17:30:37.383137 7f7c6cfa9700 20 SERVER_PORT=80
 2014-06-25 17:30:37.383138 7f7c6cfa9700 20 SERVER_PROTOCOL=HTTP/1.0
 2014-06-25 17:30:37.383138 7f7c6cfa9700 20 SERVER_SOFTWARE=nginx/1.4.6
 2014-06-25 17:30:37.383140 7f7c6cfa9700  1 == starting new request
 req=0x7f7ca000b180 =
 2014-06-25 17:30:37.383152 7f7c6cfa9700  2 req 1:0.13::HEAD
 /ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb::initializing
 2014-06-25 17:30:37.383158 7f7c6cfa9700 10 host= rgw_dns_name=
 2014-06-25 17:30:37.383199 7f7c6cfa9700 10
 s->object=ubuntu/pool/main/a/adduser/add