Re: [ceph-users] [ceph-deploy] problem creating mds after a full cluster wipe

2013-09-05 Thread Alfredo Deza
On Wed, Sep 4, 2013 at 11:56 PM, Sage Weil  wrote:

> On Wed, 4 Sep 2013, Alphe Salas Michels wrote:
> > Hi again,
> > as I was doomed to fully wipe my cluster once again, I updated to
> > ceph-deploy 1.2.3.
> > All went smoothly along my ceph-deploy process,
> >
> > until I created the mds:
> >
> > ceph-deploy mds create myhost first provoked a
> >
> >   File
> "/usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py",
> > line 645, in __handle
> > raise e
> > pushy.protocol.proxy.ExceptionProxy: [Errno 2] No such file or directory:
> > '/var/lib/ceph/bootstrap-mds'
> >
> > doing a mkdir -p /var/lib/ceph/bootstrap-mds solved that one
> >
> > then I got a:
> >
> > pushy.protocol.proxy.ExceptionProxy: [Errno 2] No such file or directory:
> > '/var/lib/ceph/mds/ceph-mds01'
> >
> > doing a mkdir -p /var/lib/ceph/mds/ceph-mds01 solved that one too
>

This looks very unexpected. Aside from getting us your distro and Ceph
version, could you paste the exact way you got here: what commands you ran,
in order, with output if possible?



>
> What distro was this?  And what version of ceph did you install?
>
> Thanks!
> sage
>
>
> >
> >
> > After that all was running nicely ...
> >   health HEALTH_OK
> > etc ../..
> >  mdsmap e4: 1/1/1 up {0=mds01=up:active}
> >
> > Hope that can help.
> >
> > --
> >
> > *Alphé Salas*
> > Ingeniero T.I
> >
> > *Kepler Data Recovery*
> >
> > *Asturias 97, Las Condes*
> > *Santiago - Chile*
> > *(56 2) 2362 7504*
> >
> > asa...@kepler.cl
> > *www.kepler.cl*
> >
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] trouble with ceph-deploy

2013-09-05 Thread Pavel Timoschenkov
>>>What happens if you do
>>>ceph-disk -v activate /dev/sdaa1
>>>on ceph001?

Hi. My issue has not been solved. When I execute ceph-disk -v activate 
/dev/sdaa, all is ok:
ceph-disk -v activate /dev/sdaa
DEBUG:ceph-disk:Mounting /dev/sdaa on /var/lib/ceph/tmp/mnt.yQuXIa with options 
noatime
mount: Structure needs cleaning
but the OSD is still not created:
ceph -k ceph.client.admin.keyring -s
  cluster 0a2e18d2-fd53-4f01-b63a-84851576c076
   health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
   monmap e1: 1 mons at {ceph001=172.16.4.32:6789/0}, election epoch 2, quorum 
0 ceph001
   osdmap e1: 0 osds: 0 up, 0 in
pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail
   mdsmap e1: 0/0/1 up 

-Original Message-
From: Sage Weil [mailto:s...@inktank.com] 
Sent: Friday, August 30, 2013 6:14 PM
To: Pavel Timoschenkov
Cc: Alfredo Deza; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] trouble with ceph-deploy

On Fri, 30 Aug 2013, Pavel Timoschenkov wrote:

> 
> <<< How <<< 
>  
> 
> In logs everything looks good. After
> 
> ceph-deploy disk zap ceph001:sdaa ceph001:sda1
> 
> and
> 
> ceph-deploy osd create ceph001:sdaa:/dev/sda1
> 
> where:
> 
> HOST: ceph001
> 
> DISK: sdaa
> 
> JOURNAL: /dev/sda1
> 
> in log:
> 
> ==
> 
> cat ceph.log
> 
> 2013-08-30 13:06:42,030 [ceph_deploy.osd][DEBUG ] Preparing cluster 
> ceph disks ceph001:/dev/sdaa:/dev/sda1
> 
> 2013-08-30 13:06:42,590 [ceph_deploy.osd][DEBUG ] Deploying osd to 
> ceph001
> 
> 2013-08-30 13:06:42,627 [ceph_deploy.osd][DEBUG ] Host ceph001 is now 
> ready for osd use.
> 
> 2013-08-30 13:06:42,627 [ceph_deploy.osd][DEBUG ] Preparing host 
> ceph001 disk /dev/sdaa journal /dev/sda1 activate True
> 
> +++
> 
> But:
> 
> +++
> 
> ceph -k ceph.client.admin.keyring -s
> 
>   cluster 0a2e18d2-fd53-4f01-b63a-84851576c076
> 
>    health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no 
> osds
> 
>    monmap e1: 1 mons at {ceph001=172.16.4.32:6789/0}, election epoch 
> 2, quorum 0 ceph001
> 
>    osdmap e1: 0 osds: 0 up, 0 in
> 
>     pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 
> KB avail
> 
>    mdsmap e1: 0/0/1 up
> 
> +++
> 
> And
> 
> +++
> 
> ceph -k ceph.client.admin.keyring osd tree
> 
> # id    weight  type name   up/down reweight
> 
> -1  0   root default
> 
> +++
> 
> OSD not created (

What happens if you do

 ceph-disk -v activate /dev/sdaa1

on ceph001?

sage


> 
>  
> 
> From: Alfredo Deza [mailto:alfredo.d...@inktank.com]
> Sent: Thursday, August 29, 2013 5:41 PM
> To: Pavel Timoschenkov
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] trouble with ceph-deploy
> 
>  
> 
>  
> 
>  
> 
> On Thu, Aug 29, 2013 at 10:23 AM, Pavel Timoschenkov 
>  wrote:
> 
>   Hi.
> 
>   If I use the example of the doc:
>   
> http://ceph.com/docs/master/rados/deployment/ceph-deploy-osd/#create-o
> sds
> 
>   ceph-deploy osd prepare ceph001:sdaa:/dev/sda1
>   ceph-deploy osd activate ceph001:sdaa:/dev/sda1
>   or
>   ceph-deploy osd prepare ceph001:/dev/sdaa1:/dev/sda1
>   ceph-deploy osd activate ceph001:/dev/sdaa:/dev/sda1
> 
> or
> 
> ceph-deploy osd create ceph001:sdaa:/dev/sda1
> 
> OSD is not created. No errors, but when I execute
> 
> ceph -k ceph.client.admin.keyring -s
> 
> I see the following:
> 
> cluster 4b91a9e9-0e6c-4570-98c6-1398c6900a9e
>    health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no 
> osds
>    monmap e1: 1 mons at {ceph001=172.16.4.32:6789/0}, election epoch 
> 2, quorum 0 ceph001
>    osdmap e1: 0 osds: 0 up, 0 in
>     pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 
> KB avail
>    mdsmap e1: 0/0/1 up
> 
>  
> 
> 0 OSD.
> 
>  
> 
> But if I use a local folder (/var/lib/ceph/osd/osd001) as the DISK
> argument, it works, but only when using the prepare + activate 
> construction:
> 
> ceph-deploy osd prepare ceph001:/var/lib/ceph/osd/osd001:/dev/sda1
> ceph-deploy osd activate ceph001:/var/lib/ceph/osd/osd001:/dev/sda1
> 
> If I use CREATE, the OSD is not created either.
> 
>  
> 
>  
> 
> From: Alfredo Deza [mailto:alfredo.d...@inktank.com]
> Sent: Thursday, August 29, 2013 4:36 PM
> To: Pavel Timoschenkov
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] trouble with ceph-deploy
> 
>  
> 
>  
> 
>  
> 
> On Thu, Aug 29, 2013 at 8:00 AM, Pavel Timoschenkov 
>  wrote:
> 
>   Hi.
>   New trouble with ceph-deploy. When I'm executing:
> 
>   ceph-deploy osd prepare ceph001:sdaa:/dev/sda1
>   ceph-deploy osd activate ceph001:sdaa:/dev/sda1
>   or
>   ceph-deploy osd prepare ceph001:/dev/sdaa1:/dev/sda1
>   ceph-deploy osd activate ceph001:/dev/sdaa:/dev/sda1
> 
>  
> 
> Have you

[ceph-users] Impossible to Create Bucket on RadosGW?

2013-09-05 Thread Georg Höllrigl

Hello,

I'm horribly failing at creating a bucket with radosgw on Ceph 0.67.2 
running on Ubuntu 12.04.


Right now I feel frustrated with radosgw-admin for being inconsistent 
in its options: it's possible to list the buckets and also to delete 
them, but not to create them!


No matter what I tried - using telnet, curl, s3cmd - I'm getting back

S3Error: 405 (Method Not Allowed)

I don't see a way to configure this somewhere in apache!?

I've configured radosgw according to 
http://ceph.com/docs/next/radosgw/config/ - one radosgw process is 
running and I can already see the XML output when accessing the webserver.


Neither the rados log nor the Apache logs give me anything useful back. 
Here is what the rados log looks like:


2013-09-05 15:34:10.886099 7f1a89fbb700  1 == starting new request 
req=0xcd0800 =
2013-09-05 15:34:10.886140 7f1a89fbb700  2 req 38:0.41::PUT 
/::initializing

2013-09-05 15:34:10.886156 7f1a89fbb700 10 meta>> HTTP_X_AMZ_DATE
2013-09-05 15:34:10.886161 7f1a89fbb700 10 x>> x-amz-date:Thu, 05 Sep 
2013 13:34:10 +

2013-09-05 15:34:10.886172 7f1a89fbb700 10 s->object= s->bucket=
2013-09-05 15:34:10.886176 7f1a89fbb700 20 FCGI_ROLE=RESPONDER
2013-09-05 15:34:10.886177 7f1a89fbb700 20 SCRIPT_URL=/
2013-09-05 15:34:10.886178 7f1a89fbb700 20 
SCRIPT_URI=http://logix.s3.xidrasservice.com/
2013-09-05 15:34:10.886178 7f1a89fbb700 20 HTTP_AUTHORIZATION=AWS 
J63G2KDB1POTI7EQVFKY:GtAma5xSZUKcBPl4RfhqcVaYZbI=
2013-09-05 15:34:10.886179 7f1a89fbb700 20 
HTTP_HOST=logix.s3.xidrasservice.com

2013-09-05 15:34:10.886179 7f1a89fbb700 20 HTTP_ACCEPT_ENCODING=identity
2013-09-05 15:34:10.886180 7f1a89fbb700 20 CONTENT_LENGTH=0
2013-09-05 15:34:10.886180 7f1a89fbb700 20 HTTP_X_AMZ_DATE=Thu, 05 Sep 
2013 13:34:10 +

2013-09-05 15:34:10.886181 7f1a89fbb700 20 PATH=/usr/local/bin:/usr/bin:/bin
2013-09-05 15:34:10.886181 7f1a89fbb700 20 SERVER_SIGNATURE=
2013-09-05 15:34:10.886182 7f1a89fbb700 20 SERVER_SOFTWARE=Apache/2.2.22 
(Ubuntu)
2013-09-05 15:34:10.886182 7f1a89fbb700 20 
SERVER_NAME=logix.s3.xidrasservice.com

2013-09-05 15:34:10.886183 7f1a89fbb700 20 SERVER_ADDR=10.0.0.176
2013-09-05 15:34:10.886183 7f1a89fbb700 20 SERVER_PORT=80
2013-09-05 15:34:10.886184 7f1a89fbb700 20 REMOTE_ADDR=10.0.0.176
2013-09-05 15:34:10.886184 7f1a89fbb700 20 DOCUMENT_ROOT=/var/www
2013-09-05 15:34:10.886185 7f1a89fbb700 20 
SCRIPT_FILENAME=/var/www/s3gw.fcgi

2013-09-05 15:34:10.886186 7f1a89fbb700 20 REMOTE_PORT=46591
2013-09-05 15:34:10.886186 7f1a89fbb700 20 GATEWAY_INTERFACE=CGI/1.1
2013-09-05 15:34:10.886187 7f1a89fbb700 20 SERVER_PROTOCOL=HTTP/1.1
2013-09-05 15:34:10.886188 7f1a89fbb700 20 REQUEST_METHOD=PUT
2013-09-05 15:34:10.886188 7f1a89fbb700 20 QUERY_STRING=page=&params=
2013-09-05 15:34:10.886189 7f1a89fbb700 20 REQUEST_URI=/
2013-09-05 15:34:10.886189 7f1a89fbb700 20 SCRIPT_NAME=/
2013-09-05 15:34:10.886190 7f1a89fbb700  2 req 38:0.91:s3:PUT 
/::getting op
2013-09-05 15:34:10.886207 7f1a89fbb700  2 req 38:0.000108:s3:PUT 
/::http status=405
2013-09-05 15:34:10.886342 7f1a89fbb700  1 == req done req=0xcd0800 
http_status=405 ==



Any ideas what's going on here?

Georg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance issues with small files

2013-09-05 Thread Bill Omer
That's correct.  We created 65k buckets, using two hex characters as the
naming convention, then stored the files in each container based on their
first two characters in the file name.  The end result was 20-50 files per
bucket.  Once all of the buckets were created and files were being loaded,
we still observed an increase in latency over time.
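
For reference, a minimal sketch of the prefix-to-bucket mapping described
above (the file and bucket names are made-up examples):

    # derive the bucket name from the first two hex characters of the file name
    f=a3f91c7e.dat
    bucket=${f:0:2}      # -> "a3"; two hex chars alone only give 16*16 = 256 buckets
    echo "upload $f to bucket $bucket"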

Is there a way to disable indexing?  Or are there other settings you can
suggest to attempt to speed this process up?


On Wed, Sep 4, 2013 at 5:21 PM, Mark Nelson  wrote:

> Just for clarification, distributing objects over lots of buckets isn't
> helping improve small object performance?
>
> The degradation over time is similar to something I've seen in the past,
> with higher numbers of seeks on the underlying OSD device over time.  Is it
> always (temporarily) resolved writing to a new empty bucket?
>
> Mark
>
>
> On 09/04/2013 02:45 PM, Bill Omer wrote:
>
>> We've actually done the same thing, creating 65k buckets and storing
>> 20-50 objects in each.  No change really, not noticeable anyway
>>
>>
>> On Wed, Sep 4, 2013 at 2:43 PM, Bryan Stillwell
>> > >
>> wrote:
>>
>> So far I haven't seen much of a change.  It's still working through
>> removing the bucket that reached 1.5 million objects though (my
>> guess is that'll take a few more days), so I believe that might have
>> something to do with it.
>>
>> Bryan
>>
>>
>> On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson
>> > >
>> wrote:
>>
>> Bryan,
>>
>> Good explanation.  How's performance now that you've spread the
>> load over multiple buckets?
>>
>> Mark
>>
>> On 09/04/2013 12:39 PM, Bryan Stillwell wrote:
>>
>> Bill,
>>
>> I've run into a similar issue with objects averaging
>> ~100KiB.  The
>> explanation I received on IRC is that there are scaling
>> issues if you're
>> uploading them all to the same bucket because the index
>> isn't sharded.
>>The recommended solution is to spread the objects out to
>> a lot of
>> buckets.  However, that ran me into another issue once I hit
>> 1000
>> buckets which is a per user limit.  I switched the limit to
>> be unlimited
>> with this command:
>>
>> radosgw-admin user modify --uid=your_username --max-buckets=0
>>
>> Bryan
>>
>>
>> On Wed, Sep 4, 2013 at 11:27 AM, Bill Omer
>> mailto:bill.o...@gmail.com>
>> >>
>>
>> wrote:
>>
>>  I'm testing ceph for storing a very large number of
>> small files.
>>I'm seeing some performance issues and would like to
>> see if anyone
>>  could offer any insight as to what I could do to
>> correct this.
>>
>>  Some numbers:
>>
>>  Uploaded 184111 files, with an average file size of
>> 5KB, using
>>  10 separate servers to upload the request using Python
>> and the
>>  cloudfiles module.  I stopped uploading after 53
>> minutes, which
>>  seems to average 5.7 files per second per node.
>>
>>
>>  My storage cluster consists of 21 OSD's across 7
>> servers, with their
>>  journals written to SSD drives.  I've done a default
>> installation,
>>  using ceph-deploy with the dumpling release.
>>
>>  I'm using statsd to monitor the performance, and what's
>> interesting
>>  is when I start with an empty bucket, performance is
>> amazing, with
>>  average response times of 20-50ms.  However as time
>> goes on, the
>>  response times go in to the hundreds, and the average
>> number of
>>  uploads per second drops.
>>
>>  I've installed radosgw on all 7 ceph servers.  I've
>> tested using a
>>  load balancer to distribute the api calls, as well as
>> pointing the
>>  10 worker servers to a single instance.  I've not seen
>> a real
>>  different in performance with this either.
>>
>>
>>  Each of the ceph servers are 16 core Xeon 2.53GHz with
>> 72GB of ram,
>>  OCZ Vertex4 SSD drives for the journals and Seagate
>> Barracuda ES2
>>  drives for storage.
>>
>>
>>  Any help would be greatly appreciated.
>>
>>
>>  __**___
>>
>>  ceph-users mailing list
>> 

Re: [ceph-users] Performance issues with small files

2013-09-05 Thread Mark Nelson

On 09/05/2013 09:19 AM, Bill Omer wrote:

Thats correct.  We created 65k buckets, using two hex characters as the
naming convention, then stored the files in each container based on
their first two characters in the file name.  The end result was 20-50
files per bucket.  Once all of the buckets were created and files were
being loaded, we still observed an increase in latency overtime.

Is there a way to disable indexing?  Or are there other settings you can
suggest to attempt to speed this process up?


There's been some talk recently about indexless buckets, but I don't 
think it's possible right now.  Yehuda can probably talk about it.


If you remove objects from the bucket so it is empty does it speed up 
again?  Anything you can tell us about when and how it slows down would 
be very useful!


Mark




On Wed, Sep 4, 2013 at 5:21 PM, Mark Nelson mailto:mark.nel...@inktank.com>> wrote:

Just for clarification, distributing objects over lots of buckets
isn't helping improve small object performance?

The degradation over time is similar to something I've seen in the
past, with higher numbers of seeks on the underlying OSD device over
time.  Is it always (temporarily) resolved writing to a new empty
bucket?

Mark


On 09/04/2013 02:45 PM, Bill Omer wrote:

We've actually done the same thing, creating 65k buckets and storing
20-50 objects in each.  No change really, not noticeable anyway


On Wed, Sep 4, 2013 at 2:43 PM, Bryan Stillwell
mailto:bstillw...@photobucket.com>
>> wrote:

 So far I haven't seen much of a change.  It's still working
through
 removing the bucket that reached 1.5 million objects though (my
 guess is that'll take a few more days), so I believe that
might have
 something to do with it.

 Bryan


 On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson
 mailto:mark.nel...@inktank.com>
>> wrote:

 Bryan,

 Good explanation.  How's performance now that you've
spread the
 load over multiple buckets?

 Mark

 On 09/04/2013 12:39 PM, Bryan Stillwell wrote:

 Bill,

 I've run into a similar issue with objects averaging
 ~100KiB.  The
 explanation I received on IRC is that there are scaling
 issues if you're
 uploading them all to the same bucket because the index
 isn't sharded.
The recommended solution is to spread the
objects out to
 a lot of
 buckets.  However, that ran me into another issue
once I hit
 1000
 buckets which is a per user limit.  I switched the
limit to
 be unlimited
 with this command:

 radosgw-admin user modify --uid=your_username
--max-buckets=0

 Bryan


 On Wed, Sep 4, 2013 at 11:27 AM, Bill Omer
 mailto:bill.o...@gmail.com>
>
  

[ceph-users] ceph rbd format2 in kernel

2013-09-05 Thread Timofey Koolin
I have read about support for image format 2 in the 3.9 kernel.
Do the 3.9/3.10 kernels support rbd format 2 images now? (I need to connect to
images cloned from a snapshot.)

-- 
Blog: www.rekby.ru
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD map question

2013-09-05 Thread Laurent Barbe

Hello Gaylord,

I do not think there is anything implemented to do this. Perhaps it 
could be useful, for example as part of the "rbd info" command.
For now, I have not found any other way than running "rbd showmapped" on 
each host.
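
For example, a quick way to poll every host (host names below are placeholders):

    for h in host1 host2 host3; do
        echo "== $h =="
        ssh "$h" rbd showmapped   # lists the RBD images currently mapped on that host
    done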


Laurent Barbe


On 05/09/2013 01:18, Gaylord Holder wrote:

Is it possible to know if an RBD is mapped by a machine?
-Gaylord
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] trouble with ceph-deploy

2013-09-05 Thread Sage Weil
On Thu, 5 Sep 2013, Pavel Timoschenkov wrote:
> >>>What happens if you do
> >>>ceph-disk -v activate /dev/sdaa1
> >>>on ceph001?
> 
> Hi. My issue has not been solved. When i execute ceph-disk -v activate 
> /dev/sdaa - all is ok:
> ceph-disk -v activate /dev/sdaa

Try

 ceph-disk -v activate /dev/sdaa1

as there is probably a partition there.  And/or tell us what 
/proc/partitions contains, and/or what you get from

 ceph-disk list

Thanks!
sage


> DEBUG:ceph-disk:Mounting /dev/sdaa on /var/lib/ceph/tmp/mnt.yQuXIa with 
> options noatime
> mount: Structure needs cleaning
> but OSD not created all the same:
> ceph -k ceph.client.admin.keyring -s
>   cluster 0a2e18d2-fd53-4f01-b63a-84851576c076
>health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
>monmap e1: 1 mons at {ceph001=172.16.4.32:6789/0}, election epoch 2, 
> quorum 0 ceph001
>osdmap e1: 0 osds: 0 up, 0 in
> pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB 
> avail
>mdsmap e1: 0/0/1 up 
> 
> -Original Message-
> From: Sage Weil [mailto:s...@inktank.com] 
> Sent: Friday, August 30, 2013 6:14 PM
> To: Pavel Timoschenkov
> Cc: Alfredo Deza; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] trouble with ceph-deploy
> 
> On Fri, 30 Aug 2013, Pavel Timoschenkov wrote:
> 
> > 
> > <<< > How <<< > 
> >  
> > 
> > In logs everything looks good. After
> > 
> > ceph-deploy disk zap ceph001:sdaa ceph001:sda1
> > 
> > and
> > 
> > ceph-deploy osd create ceph001:sdaa:/dev/sda1
> > 
> > where:
> > 
> > HOST: ceph001
> > 
> > DISK: sdaa
> > 
> > JOURNAL: /dev/sda1
> > 
> > in log:
> > 
> > ==
> > 
> > cat ceph.log
> > 
> > 2013-08-30 13:06:42,030 [ceph_deploy.osd][DEBUG ] Preparing cluster 
> > ceph disks ceph001:/dev/sdaa:/dev/sda1
> > 
> > 2013-08-30 13:06:42,590 [ceph_deploy.osd][DEBUG ] Deploying osd to 
> > ceph001
> > 
> > 2013-08-30 13:06:42,627 [ceph_deploy.osd][DEBUG ] Host ceph001 is now 
> > ready for osd use.
> > 
> > 2013-08-30 13:06:42,627 [ceph_deploy.osd][DEBUG ] Preparing host 
> > ceph001 disk /dev/sdaa journal /dev/sda1 activate True
> > 
> > +++
> > 
> > But:
> > 
> > +++
> > 
> > ceph -k ceph.client.admin.keyring -s
> > 
> >   cluster 0a2e18d2-fd53-4f01-b63a-84851576c076
> > 
> >    health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no 
> > osds
> > 
> >    monmap e1: 1 mons at {ceph001=172.16.4.32:6789/0}, election epoch 
> > 2, quorum 0 ceph001
> > 
> >    osdmap e1: 0 osds: 0 up, 0 in
> > 
> >     pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 
> > KB avail
> > 
> >    mdsmap e1: 0/0/1 up
> > 
> > +++
> > 
> > And
> > 
> > +++
> > 
> > ceph -k ceph.client.admin.keyring osd tree
> > 
> > # id    weight  type name   up/down reweight
> > 
> > -1  0   root default
> > 
> > +++
> > 
> > OSD not created (
> 
> What happens if you do
> 
>  ceph-disk -v activate /dev/sdaa1
> 
> on ceph001?
> 
> sage
> 
> 
> > 
> >  
> > 
> > From: Alfredo Deza [mailto:alfredo.d...@inktank.com]
> > Sent: Thursday, August 29, 2013 5:41 PM
> > To: Pavel Timoschenkov
> > Cc: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] trouble with ceph-deploy
> > 
> >  
> > 
> >  
> > 
> >  
> > 
> > On Thu, Aug 29, 2013 at 10:23 AM, Pavel Timoschenkov 
> >  wrote:
> > 
> >   Hi.
> > 
> >   If I use the example of the doc:
> >   
> > http://ceph.com/docs/master/rados/deployment/ceph-deploy-osd/#create-o
> > sds
> > 
> >   ceph-deploy osd prepare ceph001:sdaa:/dev/sda1
> >   ceph-deploy osd activate ceph001:sdaa:/dev/sda1
> >   or
> >   ceph-deploy osd prepare ceph001:/dev/sdaa1:/dev/sda1
> >   ceph-deploy osd activate ceph001:/dev/sdaa:/dev/sda1
> > 
> > or
> > 
> > ceph-deploy osd create ceph001:sdaa:/dev/sda1
> > 
> > OSD is not created. No errors, but when I execute
> > 
> > ceph -k ceph.client.admin.keyring -s
> > 
> > I see the following:
> > 
> > cluster 4b91a9e9-0e6c-4570-98c6-1398c6900a9e
> >    health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no 
> > osds
> >    monmap e1: 1 mons at {ceph001=172.16.4.32:6789/0}, election epoch 
> > 2, quorum 0 ceph001
> >    osdmap e1: 0 osds: 0 up, 0 in
> >     pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 
> > KB avail
> >    mdsmap e1: 0/0/1 up
> > 
> >  
> > 
> > 0 OSD.
> > 
> >  
> > 
> > But if I use as an DISK argument to a local folder
> > (/var/lib/ceph/osd/osd001) - it works, but only if used prepare + 
> > activate construction:
> > 
> > ceph-deploy osd prepare ceph001:/var/lib/ceph/osd/osd001:/dev/sda1
> > ceph-deploy osd activate ceph001:/var/lib/ceph/osd/osd001:/dev/sda1
> > 
> > If I use CREATE, OSD is not created also.
> > 
> >  
> > 
> >  
> > 
> > From: Alfredo Deza [m

[ceph-users] newbie question: rebooting the whole cluster, powerfailure

2013-09-05 Thread Bernhard Glomm
Hi all, 

as a ceph newbie I got another question that is probably solved long ago.
I have my test cluster consisting of two OSDs that also host MONs,
plus one to five additional MONs.
Now I want to reboot all instances, simulating a power failure.
So I shut down the extra MONs,
then shut down the first OSD/MON instance (call it "ping")
and, after its shutdown is complete, the second OSD/MON
instance (call it "pong").
5 minutes later I restart "pong"; then, after I have checked that all services
are up and running, I restart "ping"; afterwards I restart the MON that I
brought down last, but not the other MONs (since - surprise - they are in
this test scenario just virtual instances residing on some ceph rbds).

I think this is the wrong way to do it, since it breaks the cluster
unrecoverably...
at least that's what it seems: ceph tries to reach one of the MONs that isn't
there yet.
How do I shut down and restart the whole cluster in a coordinated way in case
of a power failure? (I need a script for our UPS.)

And a second question regarding ceph-deploy:
How do I specify a second NIC/address to be used for the inter-cluster 
communication?

TIA

Bernhard



-- 

Bernhard Glomm
IT Administration

Phone: +49 (30) 86880 134
Fax:   +49 (30) 86880 100
Skype: bernhard.glomm.ecologic

Ecologic Institut gemeinnützige GmbH | Pfalzburger Str. 43/44 | 10717 Berlin | Germany
GF: R. Andreas Kraemer | AG: Charlottenburg HRB 57947 | USt/VAT-IdNr.: DE811963464
Ecologic™ is a Trade Mark (TM) of Ecologic Institut gemeinnützige GmbH

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure

2013-09-05 Thread Jens Kristian Søgaard

Hi Bernhard,


I have my testcluster consisting two OSDs that also host MONs plus
one to five MONs.


Are you saying that you have a total of 7 mons?


down the at last, not the other MON though (since - surprise - they
are in this test szenario just virtual instances residing on some
ceph rbds)


This seems to be your problem. When you shut down the cluster, you
haven't got those extra mons.

In order to reach a quorum after reboot, you need to have more than half
of your mons running.

If you have 5 or more mons in total, this means that the two physical
servers running mons cannot reach quorum by themselves.

I.e. you have 2 mons out of 5 running for example - it will not reach a
quorum because you need at least 3 mons to do that.

You need to either move mons to physical machines, or virtual instances
not depending on the same Ceph cluster, or reduce the number of mons in
the system to 3 (or 1).
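
For example, after everything is back up you can check whether the monitors
have formed a quorum with:

    ceph quorum_status
    ceph mon stat       # short summary: monmap epoch and which mons are in quorum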

--
Jens Kristian Søgaard, Mermaid Consulting ApS,
j...@mermaidconsulting.dk,
http://.mermaidconsulting.com/


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure

2013-09-05 Thread Alfredo Deza
On Thu, Sep 5, 2013 at 11:42 AM, Bernhard Glomm
 wrote:
>
> Hi all,
>
> as a ceph newbie I got another question that is probably solved long ago.
> I have my testcluster consisting two OSDs that also host MONs
> plus one to five MONs.
> Now I want to reboot all instance, simulating a power failure.
> So I shutdown the extra MONs,
> Than shutting down the first OSD/MON instance (call it "ping")
> and after shutdown is complete, shutting down the second OSD/MON
> instance (call it "pong")
> 5 Minutes later I restart "pong", than after I checked all services are
> up and running I restart "pong", afterwards I restart the MON that I brought
> down the at last, not the other MON though (since - surprise - they are in
> this test szenario just virtual instances residing on some ceph rbds)
>
> I think this is the wrong way to do it, since it brakes the cluster 
> unrecoverable...
> at least that's what it seems, ceph tries to call one of the MONs that isn't 
> there yet
> How to shut down and restart the whole cluster in a coordinated way in case
> of a powerfailure (need a script for our UPS)
>
> And a second question regarding ceph-deploy:
> How do I specify a second NIC/address to be used as the intercluster 
> communication?

You will not be able to do something like this with ceph-deploy. This
sounds like a very specific (or a bit more advanced)
configuration than what ceph-deploy offers.


>
> TIA
>
> Bernhard
>
>
>
> --
> 
> Bernhard Glomm
> IT Administration
>
> Phone: +49 (30) 86880 134
> Fax: +49 (30) 86880 100
> Skype: bernhard.glomm.ecologic
> Ecologic Institut gemeinnützige GmbH | Pfalzburger Str. 43/44 | 10717 Berlin 
> | Germany
> GF: R. Andreas Kraemer | AG: Charlottenburg HRB 57947 | USt/VAT-IdNr.: 
> DE811963464
> Ecologic™ is a Trade Mark (TM) of Ecologic Institut gemeinnützige GmbH
> 
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph rbd format2 in kernel

2013-09-05 Thread Laurent Barbe

Hello Timofey,

Yes, it works in kernel 3.9 and 3.10.
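
For example, a minimal sketch of the clone-and-map workflow (pool, image and
snapshot names are made up; the client needs kernel 3.9 or newer):

    rbd create --format 2 --size 10240 rbd/parent
    rbd snap create rbd/parent@snap1
    rbd snap protect rbd/parent@snap1     # a snapshot must be protected before cloning
    rbd clone rbd/parent@snap1 rbd/child
    rbd map rbd/child                     # kernel RBD client, format 2 image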


Laurent Barbe


On 05/09/2013 17:21, Timofey Koolin wrote:

I have read about support image format 2 in 3.9 kernel.
Is 3.9/3.10 kernel support rbd images format 2 now (I need connect to
images, cloned from snapshot)?

--
Blog: www.rekby.ru 


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance issues with small files

2013-09-05 Thread Bryan Stillwell
Wouldn't using only the first two characters in the file name result
in fewer than 65k buckets being used?

For example if the file names contained 0-9 and a-f, that would only
be 256 buckets (16*16).  Or if they contained 0-9, a-z, and A-Z, that
would only be 3,844 buckets (62 * 62).

Bryan


On Thu, Sep 5, 2013 at 8:19 AM, Bill Omer  wrote:
>
> Thats correct.  We created 65k buckets, using two hex characters as the 
> naming convention, then stored the files in each container based on their 
> first two characters in the file name.  The end result was 20-50 files per 
> bucket.  Once all of the buckets were created and files were being loaded, we 
> still observed an increase in latency overtime.
>
> Is there a way to disable indexing?  Or are there other settings you can 
> suggest to attempt to speed this process up?
>
>
> On Wed, Sep 4, 2013 at 5:21 PM, Mark Nelson  wrote:
>>
>> Just for clarification, distributing objects over lots of buckets isn't 
>> helping improve small object performance?
>>
>> The degradation over time is similar to something I've seen in the past, 
>> with higher numbers of seeks on the underlying OSD device over time.  Is it 
>> always (temporarily) resolved writing to a new empty bucket?
>>
>> Mark
>>
>>
>> On 09/04/2013 02:45 PM, Bill Omer wrote:
>>>
>>> We've actually done the same thing, creating 65k buckets and storing
>>> 20-50 objects in each.  No change really, not noticeable anyway
>>>
>>>
>>> On Wed, Sep 4, 2013 at 2:43 PM, Bryan Stillwell
>>> mailto:bstillw...@photobucket.com>> wrote:
>>>
>>> So far I haven't seen much of a change.  It's still working through
>>> removing the bucket that reached 1.5 million objects though (my
>>> guess is that'll take a few more days), so I believe that might have
>>> something to do with it.
>>>
>>> Bryan
>>>
>>>
>>> On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson
>>> mailto:mark.nel...@inktank.com>> wrote:
>>>
>>> Bryan,
>>>
>>> Good explanation.  How's performance now that you've spread the
>>> load over multiple buckets?
>>>
>>> Mark
>>>
>>> On 09/04/2013 12:39 PM, Bryan Stillwell wrote:
>>>
>>> Bill,
>>>
>>> I've run into a similar issue with objects averaging
>>> ~100KiB.  The
>>> explanation I received on IRC is that there are scaling
>>> issues if you're
>>> uploading them all to the same bucket because the index
>>> isn't sharded.
>>>The recommended solution is to spread the objects out to
>>> a lot of
>>> buckets.  However, that ran me into another issue once I hit
>>> 1000
>>> buckets which is a per user limit.  I switched the limit to
>>> be unlimited
>>> with this command:
>>>
>>> radosgw-admin user modify --uid=your_username --max-buckets=0
>>>
>>> Bryan
>>>
>>>
>>> On Wed, Sep 4, 2013 at 11:27 AM, Bill Omer
>>> mailto:bill.o...@gmail.com>
>>> >>
>>>
>>> wrote:
>>>
>>>  I'm testing ceph for storing a very large number of
>>> small files.
>>>I'm seeing some performance issues and would like to
>>> see if anyone
>>>  could offer any insight as to what I could do to
>>> correct this.
>>>
>>>  Some numbers:
>>>
>>>  Uploaded 184111 files, with an average file size of
>>> 5KB, using
>>>  10 separate servers to upload the request using Python
>>> and the
>>>  cloudfiles module.  I stopped uploading after 53
>>> minutes, which
>>>  seems to average 5.7 files per second per node.
>>>
>>>
>>>  My storage cluster consists of 21 OSD's across 7
>>> servers, with their
>>>  journals written to SSD drives.  I've done a default
>>> installation,
>>>  using ceph-deploy with the dumpling release.
>>>
>>>  I'm using statsd to monitor the performance, and what's
>>> interesting
>>>  is when I start with an empty bucket, performance is
>>> amazing, with
>>>  average response times of 20-50ms.  However as time
>>> goes on, the
>>>  response times go in to the hundreds, and the average
>>> number of
>>>  uploads per second drops.
>>>
>>>  I've installed radosgw on all 7 ceph servers.  I've
>>> tested using a
>>>  load balancer to distribute the api calls, as well as
>>> pointing the
>>>  10 worker servers to a single instance.  I've not seen
>>> a real
>>>  different in performance with t

Re: [ceph-users] ceph-mon runs on 6800 not 6789.

2013-09-05 Thread Gilles Mocellin

On 03/09/2013 14:56, Joao Eduardo Luis wrote:

On 09/03/2013 02:02 AM, 이주헌 wrote:

Hi all.

I have 1 MDS and 3 OSDs. I installed them via ceph-deploy. (dumpling
0.67.2 version)
At first, it worked perfectly. But after I rebooted one of the OSDs, ceph-mon
launched on port 6800, not 6789.


This has been a recurrent issue I've been completely unable to 
reproduce so far.


Are you able to reproduce this reliably?

Could you share the steps you took leading you to this state?

  -Joao


Hello !

It just happened to me now.
Ceph 0.67.2 on Debian Wheezy.

As my testing cluster is on virtual machines (KVM), I had some overload 
that knocked a monitor out (load + time skew, I think).

When I restarted the monitor, the port changed, and I often see this message:
2013-09-05 18:55:17.759159 7f2f94bbd700  0 -- :/1007156 >> 
10.0.0.53:6789/0 pipe(0x1859370 sd=3 :0 s=1 pgs=0 cs=0 l=1 
c=0x18595d0).fault


=> The mon is on 10.0.0.53 but now listens on port 6800.
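
For reference, the address and port each monitor is registered with in the
monmap can be checked with:

    ceph mon dump       # prints the current monmap, including each mon's IP:port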

Restarting does not change the port.
I had to destroy and re-create the mon (I use ceph-deploy), and now it's 
OK, listening on 6789.


Is there a command to change it in the map?



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance issues with small files

2013-09-05 Thread Bryan Stillwell
Mark,

Yesterday I blew away all the objects and restarted my test using
multiple buckets, and things are definitely better!

After ~20 hours I've already uploaded ~3.5 million objects, which is much
better than the ~1.5 million I did over ~96 hours this past
weekend.  Unfortunately it seems that things have slowed down a bit.
The average upload rate over those first 20 hours was ~48
objects/second, but now I'm only seeing ~20 objects/second.  This is
with 18,836 buckets.

Bryan

On Wed, Sep 4, 2013 at 12:43 PM, Bryan Stillwell
 wrote:
> So far I haven't seen much of a change.  It's still working through removing
> the bucket that reached 1.5 million objects though (my guess is that'll take
> a few more days), so I believe that might have something to do with it.
>
> Bryan
>
>
> On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson 
> wrote:
>>
>> Bryan,
>>
>> Good explanation.  How's performance now that you've spread the load over
>> multiple buckets?
>>
>> Mark
>>
>> On 09/04/2013 12:39 PM, Bryan Stillwell wrote:
>>>
>>> Bill,
>>>
>>> I've run into a similar issue with objects averaging ~100KiB.  The
>>> explanation I received on IRC is that there are scaling issues if you're
>>> uploading them all to the same bucket because the index isn't sharded.
>>>   The recommended solution is to spread the objects out to a lot of
>>> buckets.  However, that ran me into another issue once I hit 1000
>>> buckets which is a per user limit.  I switched the limit to be unlimited
>>> with this command:
>>>
>>> radosgw-admin user modify --uid=your_username --max-buckets=0
>>>
>>> Bryan
>>>
>>>
>>> On Wed, Sep 4, 2013 at 11:27 AM, Bill Omer >> > wrote:
>>>
>>> I'm testing ceph for storing a very large number of small files.
>>>   I'm seeing some performance issues and would like to see if anyone
>>> could offer any insight as to what I could do to correct this.
>>>
>>> Some numbers:
>>>
>>> Uploaded 184111 files, with an average file size of 5KB, using
>>> 10 separate servers to upload the request using Python and the
>>> cloudfiles module.  I stopped uploading after 53 minutes, which
>>> seems to average 5.7 files per second per node.
>>>
>>>
>>> My storage cluster consists of 21 OSD's across 7 servers, with their
>>> journals written to SSD drives.  I've done a default installation,
>>> using ceph-deploy with the dumpling release.
>>>
>>> I'm using statsd to monitor the performance, and what's interesting
>>> is when I start with an empty bucket, performance is amazing, with
>>> average response times of 20-50ms.  However as time goes on, the
>>> response times go in to the hundreds, and the average number of
>>> uploads per second drops.
>>>
>>> I've installed radosgw on all 7 ceph servers.  I've tested using a
>>> load balancer to distribute the api calls, as well as pointing the
>>> 10 worker servers to a single instance.  I've not seen a real
>>> different in performance with this either.
>>>
>>>
>>> Each of the ceph servers are 16 core Xeon 2.53GHz with 72GB of ram,
>>> OCZ Vertex4 SSD drives for the journals and Seagate Barracuda ES2
>>> drives for storage.
>>>
>>>
>>> Any help would be greatly appreciated.
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com 
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance issues with small files

2013-09-05 Thread Bill Omer
I'm using all defaults created with ceph-deploy

I will try the rgw cache setting.  Do you have any other recommendations?


On Thu, Sep 5, 2013 at 1:14 PM, Yehuda Sadeh  wrote:

> On Thu, Sep 5, 2013 at 9:49 AM, Sage Weil  wrote:
> > On Thu, 5 Sep 2013, Bill Omer wrote:
> >> Thats correct.  We created 65k buckets, using two hex characters as the
> >> naming convention, then stored the files in each container based on
> their
> >> first two characters in the file name.  The end result was 20-50 files
> per
> >> bucket.  Once all of the buckets were created and files were being
> loaded,
> >> we still observed an increase in latency overtime.
> >
> > This might be going too far in the opposite direction.  I would target
> > 1000's of objects per bucket, not 10's.  The radosgw has to validate
> > bucket ACLs on requests.  It caches them, but it probably can't cache 64K
> > of them (not by default at least!).  And even if it can, it will take a
> > long long time for the cache to warm up.  In any case, the end result is
> > that there is probably an extra rados request going on on the backend for
> > every request.
> >
> > Maybe try over ~1000 buckets and see how that goes?  And give the cache a
> > bit of time to warm up?
>
>
> There's actually a configurable that can be played with. Try setting
> something like this in your ceph.conf:
>
> rgw cache lru size = 100000
>
> That is 10 times the default 10k.
>
>
> Also, I don't remember if the obvious has been stated, but how many
> pgs do you have on your data and index pools?
>
> Yehuda
>
> >
> > sage
> >
> >
> >
> >> Is there a way to disable indexing?  Or are there other settings you can
> >> suggest to attempt to speed this process up?
> >>
> >>
> >> On Wed, Sep 4, 2013 at 5:21 PM, Mark Nelson 
> wrote:
> >>   Just for clarification, distributing objects over lots of
> >>   buckets isn't helping improve small object performance?
> >>
> >>   The degradation over time is similar to something I've seen in
> >>   the past, with higher numbers of seeks on the underlying OSD
> >>   device over time.  Is it always (temporarily) resolved writing
> >>   to a new empty bucket?
> >>
> >>   Mark
> >>
> >>   On 09/04/2013 02:45 PM, Bill Omer wrote:
> >>   We've actually done the same thing, creating 65k buckets
> >>   and storing
> >>   20-50 objects in each.  No change really, not noticeable
> >>   anyway
> >>
> >>
> >>   On Wed, Sep 4, 2013 at 2:43 PM, Bryan Stillwell
> >> mailto:bstillw...@photobucket.com>>
> >> wrote:
> >>
> >> So far I haven't seen much of a change.  It's still working
> >> through
> >> removing the bucket that reached 1.5 million objects though
> >> (my
> >> guess is that'll take a few more days), so I believe that
> >> might have
> >> something to do with it.
> >>
> >> Bryan
> >>
> >>
> >> On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson
> >> mailto:mark.nel...@inktank.com>>
> >> wrote:
> >>
> >> Bryan,
> >>
> >> Good explanation.  How's performance now that you've
> >> spread the
> >> load over multiple buckets?
> >>
> >> Mark
> >>
> >> On 09/04/2013 12:39 PM, Bryan Stillwell wrote:
> >>
> >> Bill,
> >>
> >> I've run into a similar issue with objects averaging
> >> ~100KiB.  The
> >> explanation I received on IRC is that there are
> >> scaling
> >> issues if you're
> >> uploading them all to the same bucket because the
> >> index
> >> isn't sharded.
> >>The recommended solution is to spread the objects
> >> out to
> >> a lot of
> >> buckets.  However, that ran me into another issue
> >> once I hit
> >> 1000
> >> buckets which is a per user limit.  I switched the
> >> limit to
> >> be unlimited
> >> with this command:
> >>
> >> radosgw-admin user modify --uid=your_username
> >> --max-buckets=0
> >>
> >> Bryan
> >>
> >>
> >> On Wed, Sep 4, 2013 at 11:27 AM, Bill Omer
> >> mailto:bill.o...@gmail.com>
> >>  >> >>
> >> wrote:
> >>
> >>  I'm testing ceph for storing a very large
> >> number of
> >> small files.
> >>I'm seeing some performance issues and would
> >> like to
> >> see if anyone
> >>  could offer any insight as to what I could do
> >> to
> >> correct this.
> >>
> >>  Some numbers:
> >>
> >>  Uploaded 184111 files, with an average file
> >> size of
> >> 5KB, using
> >>  10 separate servers to upload the request using
> >> Python
> >> and the
> >>  cloudfiles module.  I stopped uploading after
> >> 53
> >> minutes, which
> >>  seems

Re: [ceph-users] Performance issues with small files

2013-09-05 Thread Yehuda Sadeh
On Thu, Sep 5, 2013 at 9:49 AM, Sage Weil  wrote:
> On Thu, 5 Sep 2013, Bill Omer wrote:
>> Thats correct.  We created 65k buckets, using two hex characters as the
>> naming convention, then stored the files in each container based on their
>> first two characters in the file name.  The end result was 20-50 files per
>> bucket.  Once all of the buckets were created and files were being loaded,
>> we still observed an increase in latency overtime.
>
> This might be going too far in the opposite direction.  I would target
> 1000's of objects per bucket, not 10's.  The radosgw has to validate
> bucket ACLs on requests.  It caches them, but it probably can't cache 64K
> of them (not by default at least!).  And even if it can, it will take a
> long long time for the cache to warm up.  In any case, the end result is
> that there is probably an extra rados request going on on the backend for
> every request.
>
> Maybe try over ~1000 buckets and see how that goes?  And give the cache a
> bit of time to warm up?


There's actually a configurable that can be played with. Try setting
something like this in your ceph.conf:

rgw cache lru size = 100000

That is 10 times the default 10k.
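
For example, in the gateway's section of ceph.conf (the section name below is
just the commonly used example name; adjust it to your rgw instance):

    [client.radosgw.gateway]
        rgw cache lru size = 100000    # 10x the default of 10000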


Also, I don't remember if the obvious has been stated, but how many
pgs do you have on your data and index pools?
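
For reference, assuming the default rgw pool names, the pg counts can be
checked with:

    ceph osd pool get .rgw.buckets pg_num         # data pool
    ceph osd pool get .rgw.buckets.index pg_num   # bucket index pool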

Yehuda

>
> sage
>
>
>
>> Is there a way to disable indexing?  Or are there other settings you can
>> suggest to attempt to speed this process up?
>>
>>
>> On Wed, Sep 4, 2013 at 5:21 PM, Mark Nelson  wrote:
>>   Just for clarification, distributing objects over lots of
>>   buckets isn't helping improve small object performance?
>>
>>   The degradation over time is similar to something I've seen in
>>   the past, with higher numbers of seeks on the underlying OSD
>>   device over time.  Is it always (temporarily) resolved writing
>>   to a new empty bucket?
>>
>>   Mark
>>
>>   On 09/04/2013 02:45 PM, Bill Omer wrote:
>>   We've actually done the same thing, creating 65k buckets
>>   and storing
>>   20-50 objects in each.  No change really, not noticeable
>>   anyway
>>
>>
>>   On Wed, Sep 4, 2013 at 2:43 PM, Bryan Stillwell
>> mailto:bstillw...@photobucket.com>>
>> wrote:
>>
>> So far I haven't seen much of a change.  It's still working
>> through
>> removing the bucket that reached 1.5 million objects though
>> (my
>> guess is that'll take a few more days), so I believe that
>> might have
>> something to do with it.
>>
>> Bryan
>>
>>
>> On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson
>> mailto:mark.nel...@inktank.com>>
>> wrote:
>>
>> Bryan,
>>
>> Good explanation.  How's performance now that you've
>> spread the
>> load over multiple buckets?
>>
>> Mark
>>
>> On 09/04/2013 12:39 PM, Bryan Stillwell wrote:
>>
>> Bill,
>>
>> I've run into a similar issue with objects averaging
>> ~100KiB.  The
>> explanation I received on IRC is that there are
>> scaling
>> issues if you're
>> uploading them all to the same bucket because the
>> index
>> isn't sharded.
>>The recommended solution is to spread the objects
>> out to
>> a lot of
>> buckets.  However, that ran me into another issue
>> once I hit
>> 1000
>> buckets which is a per user limit.  I switched the
>> limit to
>> be unlimited
>> with this command:
>>
>> radosgw-admin user modify --uid=your_username
>> --max-buckets=0
>>
>> Bryan
>>
>>
>> On Wed, Sep 4, 2013 at 11:27 AM, Bill Omer
>> mailto:bill.o...@gmail.com>
>> > >>
>> wrote:
>>
>>  I'm testing ceph for storing a very large
>> number of
>> small files.
>>I'm seeing some performance issues and would
>> like to
>> see if anyone
>>  could offer any insight as to what I could do
>> to
>> correct this.
>>
>>  Some numbers:
>>
>>  Uploaded 184111 files, with an average file
>> size of
>> 5KB, using
>>  10 separate servers to upload the request using
>> Python
>> and the
>>  cloudfiles module.  I stopped uploading after
>> 53
>> minutes, which
>>  seems to average 5.7 files per second per node.
>>
>>
>>  My storage cluster consists of 21 OSD's across
>> 7
>> servers, with their
>>  journals written to SSD drives.  I've done a
>> default
>> installation,
>>  using ceph-deploy with the dumpling release.
>>
>>  I'm using statsd to monitor the performance,
>> and what's
>> interesting
>>  is when I st

Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure

2013-09-05 Thread Gregory Farnum
On Thu, Sep 5, 2013 at 9:31 AM, Alfredo Deza  wrote:
> On Thu, Sep 5, 2013 at 11:42 AM, Bernhard Glomm
>  wrote:
>>
>> Hi all,
>>
>> as a ceph newbie I got another question that is probably solved long ago.
>> I have my testcluster consisting two OSDs that also host MONs
>> plus one to five MONs.
>> Now I want to reboot all instance, simulating a power failure.
>> So I shutdown the extra MONs,
>> Than shutting down the first OSD/MON instance (call it "ping")
>> and after shutdown is complete, shutting down the second OSD/MON
>> instance (call it "pong")
>> 5 Minutes later I restart "pong", than after I checked all services are
>> up and running I restart "pong", afterwards I restart the MON that I brought
>> down the at last, not the other MON though (since - surprise - they are in
>> this test szenario just virtual instances residing on some ceph rbds)
>>
>> I think this is the wrong way to do it, since it brakes the cluster 
>> unrecoverable...
>> at least that's what it seems, ceph tries to call one of the MONs that isn't 
>> there yet
>> How to shut down and restart the whole cluster in a coordinated way in case
>> of a powerfailure (need a script for our UPS)
>>
>> And a second question regarding ceph-deploy:
>> How do I specify a second NIC/address to be used as the intercluster 
>> communication?
>
> You will not be able to do something like this with ceph-deploy. This
> sounds like a very specific (or a bit more advanced)
> configuration than what ceph-deploy offers.

Actually, you can — when editing the ceph.conf (before creating any
daemons) simply set public addr and cluster addr in whatever section
is appropriate. :)
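
For example (addresses below are placeholders; you can also use the
subnet-wide "public network" / "cluster network" options in [global]):

    [osd.0]
        public addr  = 192.168.1.10    # client-facing NIC
        cluster addr = 192.168.2.10    # NIC used for replication/heartbeat traffic
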
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance issues with small files

2013-09-05 Thread Mark Nelson
Based on your numbers, you were at something like an average of 186 
objects per bucket at the 20 hour mark?  I wonder how this trend 
compares to what you'd see with a single bucket.


With that many buckets you should have indexes well spread across all of 
the OSDs.  It'd be interesting to know what the iops/throughput is on 
all of your OSDs now (blktrace/seekwatcher can help here, but they are 
not the easiest tools to setup/use).
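
Roughly (device name and duration are placeholders, and seekwatcher needs
matplotlib installed):

    blktrace -d /dev/sdb -w 60 -o osd-sdb       # capture 60 seconds of block I/O on one OSD disk
    seekwatcher -t osd-sdb -o osd-sdb.png       # plot seeks/throughput from that trace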


Mark

On 09/05/2013 11:59 AM, Bryan Stillwell wrote:

Mark,

Yesterday I blew away all the objects and restarted my test using
multiple buckets, and things are definitely better!

After ~20 hours I've already uploaded ~3.5 million objects, which much
is better then the ~1.5 million I did over ~96 hours this past
weekend.  Unfortunately it seems that things have slowed down a bit.
The average upload rate over those first 20 hours was ~48
objects/second, but now I'm only seeing ~20 objects/second.  This is
with 18,836 buckets.

Bryan

On Wed, Sep 4, 2013 at 12:43 PM, Bryan Stillwell
 wrote:

So far I haven't seen much of a change.  It's still working through removing
the bucket that reached 1.5 million objects though (my guess is that'll take
a few more days), so I believe that might have something to do with it.

Bryan


On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson 
wrote:


Bryan,

Good explanation.  How's performance now that you've spread the load over
multiple buckets?

Mark

On 09/04/2013 12:39 PM, Bryan Stillwell wrote:


Bill,

I've run into a similar issue with objects averaging ~100KiB.  The
explanation I received on IRC is that there are scaling issues if you're
uploading them all to the same bucket because the index isn't sharded.
   The recommended solution is to spread the objects out to a lot of
buckets.  However, that ran me into another issue once I hit 1000
buckets which is a per user limit.  I switched the limit to be unlimited
with this command:

radosgw-admin user modify --uid=your_username --max-buckets=0

Bryan


On Wed, Sep 4, 2013 at 11:27 AM, Bill Omer mailto:bill.o...@gmail.com>> wrote:

 I'm testing ceph for storing a very large number of small files.
   I'm seeing some performance issues and would like to see if anyone
 could offer any insight as to what I could do to correct this.

 Some numbers:

 Uploaded 184111 files, with an average file size of 5KB, using
 10 separate servers to upload the request using Python and the
 cloudfiles module.  I stopped uploading after 53 minutes, which
 seems to average 5.7 files per second per node.


 My storage cluster consists of 21 OSD's across 7 servers, with their
 journals written to SSD drives.  I've done a default installation,
 using ceph-deploy with the dumpling release.

 I'm using statsd to monitor the performance, and what's interesting
 is when I start with an empty bucket, performance is amazing, with
 average response times of 20-50ms.  However as time goes on, the
 response times go in to the hundreds, and the average number of
 uploads per second drops.

 I've installed radosgw on all 7 ceph servers.  I've tested using a
 load balancer to distribute the api calls, as well as pointing the
 10 worker servers to a single instance.  I've not seen a real
 different in performance with this either.


 Each of the ceph servers are 16 core Xeon 2.53GHz with 72GB of ram,
 OCZ Vertex4 SSD drives for the journals and Seagate Barracuda ES2
 drives for storage.


 Any help would be greatly appreciated.


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com 
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Supported by Citrix XenServer Yet?

2013-09-05 Thread Liu, Larry
Thanks, Neil!  Does anyone have a working doc on how to generate a secret for a 
CentOS 6.4 tech preview machine to access an RBD cluster?

From: Neil Levine mailto:neil.lev...@inktank.com>>
Date: Thursday, August 29, 2013 5:01 PM
To: Larry Liu mailto:larry@disney.com>>
Cc: "ceph-users@lists.ceph.com" 
mailto:ceph-users@lists.ceph.com>>
Subject: Re: [ceph-users] Ceph Supported by Citrix XenServer Yet?

The XenServer product has a tech preview version available with Ceph RBD 
support:

http://xenserver.org/discuss-virtualization/virtualization-blog/entry/tech-preview-of-xenserver-libvirt-ceph.html

The fully-supported, commercial version from Citrix will be available sometime 
in Q4.

Neil



On Thu, Aug 29, 2013 at 4:55 PM, Liu, Larry 
mailto:larry@disney.com>> wrote:
Hi guys,

Has anyone heard anything about whether Citrix XenServer supports Ceph yet?  Provisioning 
CentOS 6.4 and then installing Xen on top of it seems a bit too much.

Thanks!

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance issues with small files

2013-09-05 Thread Sage Weil
On Thu, 5 Sep 2013, Bill Omer wrote:
> Thats correct.  We created 65k buckets, using two hex characters as the
> naming convention, then stored the files in each container based on their
> first two characters in the file name.  The end result was 20-50 files per
> bucket.  Once all of the buckets were created and files were being loaded,
> we still observed an increase in latency overtime.

This might be going too far in the opposite direction.  I would target 
1000's of objects per bucket, not 10's.  The radosgw has to validate 
bucket ACLs on requests.  It caches them, but it probably can't cache 64K 
of them (not by default at least!).  And even if it can, it will take a 
long long time for the cache to warm up.  In any case, the end result is 
that there is probably an extra rados request going on on the backend for 
every request.

Maybe try over ~1000 buckets and see how that goes?  And give the cache a 
bit of time to warm up?

sage



> Is there a way to disable indexing?  Or are there other settings you can
> suggest to attempt to speed this process up?
> 
> 
> On Wed, Sep 4, 2013 at 5:21 PM, Mark Nelson  wrote:
>   Just for clarification, distributing objects over lots of
>   buckets isn't helping improve small object performance?
> 
>   The degradation over time is similar to something I've seen in
>   the past, with higher numbers of seeks on the underlying OSD
>   device over time.  Is it always (temporarily) resolved writing
>   to a new empty bucket?
> 
>   Mark
> 
>   On 09/04/2013 02:45 PM, Bill Omer wrote:
>   We've actually done the same thing, creating 65k buckets
>   and storing
>   20-50 objects in each.  No change really, not noticeable
>   anyway
> 
> 
>   On Wed, Sep 4, 2013 at 2:43 PM, Bryan Stillwell
> mailto:bstillw...@photobucket.com>>
> wrote:
> 
>     So far I haven't seen much of a change.  It's still working
> through
>     removing the bucket that reached 1.5 million objects though
> (my
>     guess is that'll take a few more days), so I believe that
> might have
>     something to do with it.
> 
>     Bryan
> 
> 
>     On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson
>     mailto:mark.nel...@inktank.com>>
> wrote:
> 
>         Bryan,
> 
>         Good explanation.  How's performance now that you've
> spread the
>         load over multiple buckets?
> 
>         Mark
> 
>         On 09/04/2013 12:39 PM, Bryan Stillwell wrote:
> 
>             Bill,
> 
>             I've run into a similar issue with objects averaging
>             ~100KiB.  The
>             explanation I received on IRC is that there are
> scaling
>             issues if you're
>             uploading them all to the same bucket because the
> index
>             isn't sharded.
>                The recommended solution is to spread the objects
> out to
>             a lot of
>             buckets.  However, that ran me into another issue
> once I hit
>             1000
>             buckets which is a per user limit.  I switched the
> limit to
>             be unlimited
>             with this command:
> 
>             radosgw-admin user modify --uid=your_username
> --max-buckets=0
> 
>             Bryan
> 
> 
>             On Wed, Sep 4, 2013 at 11:27 AM, Bill Omer
>             mailto:bill.o...@gmail.com>
>              >>
>             wrote:
> 
>                  I'm testing ceph for storing a very large
> number of
>             small files.
>                    I'm seeing some performance issues and would
> like to
>             see if anyone
>                  could offer any insight as to what I could do
> to
>             correct this.
> 
>                  Some numbers:
> 
>                  Uploaded 184111 files, with an average file
> size of
>             5KB, using
>                  10 separate servers to upload the request using
> Python
>             and the
>                  cloudfiles module.  I stopped uploading after
> 53
>             minutes, which
>                  seems to average 5.7 files per second per node.
> 
> 
>                  My storage cluster consists of 21 OSD's across
> 7
>             servers, with their
>                  journals written to SSD drives.  I've done a
> default
>             installation,
>                  using ceph-deploy with the dumpling release.
> 
>                  I'm using statsd to monitor the performance,
> and what's
>             interesting
>                  is when I start with an empty bucket,
> performance is
>             amazing, with
>                  average response times of 20-50ms.  However as
> time
>             goes on, the
>                  response times go in to the hundreds, and the
> average
>             number of
>                  uploads per second drops.
> 
>                  I've installed radosgw on all 7 ceph servers.
>  I've
>             tested using a
>                  load balancer to distribute t

Re: [ceph-users] Performance issues with small files

2013-09-05 Thread Bill Omer
Sorry, I meant to say the first four characters, for a total of 65536
buckets


On Thu, Sep 5, 2013 at 12:30 PM, Bryan Stillwell  wrote:

> Wouldn't using only the first two characters in the file name result
> in less then 65k buckets being used?
>
> For example if the file names contained 0-9 and a-f, that would only
> be 256 buckets (16*16).  Or if they contained 0-9, a-z, and A-Z, that
> would only be 3,844 buckets (62 * 62).
>
> Bryan
>
>
> On Thu, Sep 5, 2013 at 8:19 AM, Bill Omer  wrote:
> >
> > That's correct.  We created 65k buckets, using two hex characters as the
> naming convention, then stored the files in each container based on their
> first two characters in the file name.  The end result was 20-50 files per
> bucket.  Once all of the buckets were created and files were being loaded,
> we still observed an increase in latency over time.
> >
> > Is there a way to disable indexing?  Or are there other settings you can
> suggest to attempt to speed this process up?
> >
> >
> > On Wed, Sep 4, 2013 at 5:21 PM, Mark Nelson 
> wrote:
> >>
> >> Just for clarification, distributing objects over lots of buckets isn't
> helping improve small object performance?
> >>
> >> The degradation over time is similar to something I've seen in the
> past, with higher numbers of seeks on the underlying OSD device over time.
>  Is it always (temporarily) resolved by writing to a new empty bucket?
> >>
> >> Mark
> >>
> >>
> >> On 09/04/2013 02:45 PM, Bill Omer wrote:
> >>>
> >>> We've actually done the same thing, creating 65k buckets and storing
> >>> 20-50 objects in each.  No change really, not noticeable anyway
> >>>
> >>>
> >>> On Wed, Sep 4, 2013 at 2:43 PM, Bryan Stillwell
> >>> mailto:bstillw...@photobucket.com>>
> wrote:
> >>>
> >>> So far I haven't seen much of a change.  It's still working through
> >>> removing the bucket that reached 1.5 million objects though (my
> >>> guess is that'll take a few more days), so I believe that might
> have
> >>> something to do with it.
> >>>
> >>> Bryan
> >>>
> >>>
> >>> On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson
> >>> mailto:mark.nel...@inktank.com>> wrote:
> >>>
> >>> Bryan,
> >>>
> >>> Good explanation.  How's performance now that you've spread the
> >>> load over multiple buckets?
> >>>
> >>> Mark
> >>>
> >>> On 09/04/2013 12:39 PM, Bryan Stillwell wrote:
> >>>
> >>> Bill,
> >>>
> >>> I've run into a similar issue with objects averaging
> >>> ~100KiB.  The
> >>> explanation I received on IRC is that there are scaling
> >>> issues if you're
> >>> uploading them all to the same bucket because the index
> >>> isn't sharded.
> >>>The recommended solution is to spread the objects out to
> >>> a lot of
> >>> buckets.  However, that ran me into another issue once I
> hit
> >>> 1000
> >>> buckets which is a per user limit.  I switched the limit to
> >>> be unlimited
> >>> with this command:
> >>>
> >>> radosgw-admin user modify --uid=your_username
> --max-buckets=0
> >>>
> >>> Bryan
> >>>
> >>>
> >>> On Wed, Sep 4, 2013 at 11:27 AM, Bill Omer
> >>> mailto:bill.o...@gmail.com>
> >>> >>
> >>>
> >>> wrote:
> >>>
> >>>  I'm testing ceph for storing a very large number of
> >>> small files.
> >>>I'm seeing some performance issues and would like to
> >>> see if anyone
> >>>  could offer any insight as to what I could do to
> >>> correct this.
> >>>
> >>>  Some numbers:
> >>>
> >>>  Uploaded 184111 files, with an average file size of
> >>> 5KB, using
> >>>  10 separate servers to upload the request using Python
> >>> and the
> >>>  cloudfiles module.  I stopped uploading after 53
> >>> minutes, which
> >>>  seems to average 5.7 files per second per node.
> >>>
> >>>
> >>>  My storage cluster consists of 21 OSD's across 7
> >>> servers, with their
> >>>  journals written to SSD drives.  I've done a default
> >>> installation,
> >>>  using ceph-deploy with the dumpling release.
> >>>
> >>>  I'm using statsd to monitor the performance, and
> what's
> >>> interesting
> >>>  is when I start with an empty bucket, performance is
> >>> amazing, with
> >>>  average response times of 20-50ms.  However as time
> >>> goes on, the
> >>>  response times go in to the hundreds, and the average
> >>> number of
> >>>  uploads per second drops

Re: [ceph-users] Performance issues with small files

2013-09-05 Thread Bryan Stillwell
I need to restart the upload process again because all the objects
have a content-type of 'binary/octet-stream' instead of 'image/jpeg',
'image/png', etc.  I plan on enabling monitoring this time so we can
see if there are any signs of what might be going on.  Did you want me
to increase the number of buckets to see if that changes anything?
This is pretty easy for me to do.

Bryan

On Thu, Sep 5, 2013 at 11:07 AM, Mark Nelson  wrote:
> based on your numbers, you were at something like an average of 186 objects
> per bucket at the 20 hour mark?  I wonder how this trend compares to what
> you'd see with a single bucket.
>
> With that many buckets you should have indexes well spread across all of the
> OSDs.  It'd be interesting to know what the iops/throughput is on all of
> your OSDs now (blktrace/seekwatcher can help here, but they are not the
> easiest tools to setup/use).
>
> Mark
>
> On 09/05/2013 11:59 AM, Bryan Stillwell wrote:
>>
>> Mark,
>>
>> Yesterday I blew away all the objects and restarted my test using
>> multiple buckets, and things are definitely better!
>>
>> After ~20 hours I've already uploaded ~3.5 million objects, which is much
>> better than the ~1.5 million I did over ~96 hours this past
>> weekend.  Unfortunately it seems that things have slowed down a bit.
>> The average upload rate over those first 20 hours was ~48
>> objects/second, but now I'm only seeing ~20 objects/second.  This is
>> with 18,836 buckets.
>>
>> Bryan
>>
>> On Wed, Sep 4, 2013 at 12:43 PM, Bryan Stillwell
>>  wrote:
>>>
>>> So far I haven't seen much of a change.  It's still working through
>>> removing
>>> the bucket that reached 1.5 million objects though (my guess is that'll
>>> take
>>> a few more days), so I believe that might have something to do with it.
>>>
>>> Bryan
>>>
>>>
>>> On Wed, Sep 4, 2013 at 12:14 PM, Mark Nelson 
>>> wrote:


 Bryan,

 Good explanation.  How's performance now that you've spread the load
 over
 multiple buckets?

 Mark

 On 09/04/2013 12:39 PM, Bryan Stillwell wrote:
>
>
> Bill,
>
> I've run into a similar issue with objects averaging ~100KiB.  The
> explanation I received on IRC is that there are scaling issues if
> you're
> uploading them all to the same bucket because the index isn't sharded.
>The recommended solution is to spread the objects out to a lot of
> buckets.  However, that ran me into another issue once I hit 1000
> buckets which is a per user limit.  I switched the limit to be
> unlimited
> with this command:
>
> radosgw-admin user modify --uid=your_username --max-buckets=0
>
> Bryan
>
>
> On Wed, Sep 4, 2013 at 11:27 AM, Bill Omer  > wrote:
>
>  I'm testing ceph for storing a very large number of small files.
>I'm seeing some performance issues and would like to see if
> anyone
>  could offer any insight as to what I could do to correct this.
>
>  Some numbers:
>
>  Uploaded 184111 files, with an average file size of 5KB, using
>  10 separate servers to upload the request using Python and the
>  cloudfiles module.  I stopped uploading after 53 minutes, which
>  seems to average 5.7 files per second per node.
>
>
>  My storage cluster consists of 21 OSD's across 7 servers, with
> their
>  journals written to SSD drives.  I've done a default installation,
>  using ceph-deploy with the dumpling release.
>
>  I'm using statsd to monitor the performance, and what's
> interesting
>  is when I start with an empty bucket, performance is amazing, with
>  average response times of 20-50ms.  However as time goes on, the
>  response times go in to the hundreds, and the average number of
>  uploads per second drops.
>
>  I've installed radosgw on all 7 ceph servers.  I've tested using a
>  load balancer to distribute the api calls, as well as pointing the
>  10 worker servers to a single instance.  I've not seen a real
>  different in performance with this either.
>
>
>  Each of the ceph servers are 16 core Xeon 2.53GHz with 72GB of
> ram,
>  OCZ Vertex4 SSD drives for the journals and Seagate Barracuda ES2
>  drives for storage.
>
>
>  Any help would be greatly appreciated.
>
>
>  ___
>  ceph-users mailing list
>  ceph-users@lists.ceph.com 
>  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy preflight hostname check?

2013-09-05 Thread Alfredo Deza
On Thu, Sep 5, 2013 at 12:27 AM, Nigel Williams
 wrote:
> I notice under the HOSTNAME RESOLUTION section the use of 'host -4
> {hostname}' as a required test.  However, in all my trial deployments
> so far, none would pass, as this command is a direct DNS query, and
> instead I usually just add entries to the hosts file.
>
> Two thoughts: is Ceph expecting to only do DNS queries?  Or would it be
> better for the pre-flight check to use 'getent hosts {hostname}' as the
> test?

ceph-deploy requires you to use the same hostname as the one on the remote host.

For example, if you do `ceph-deploy mon create host1`, then `host1` should be
the same answer you get from the `hostname` command on that host.

This is not strictly a ceph-deploy requirement but comes from other parts of
Ceph, like the monitors, which use this information when attempting to form
a quorum.
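
A quick sanity check on each node (just a sketch) is to compare:

hostname                    # should match the name you pass to ceph-deploy
getent hosts $(hostname)    # resolves via /etc/hosts as well as DNS, unlike 'host'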


> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Supported by Citrix XenServer Yet?

2013-09-05 Thread John Wilkins
Larry,

If you're talking about how to do that with libvirt and QEMU on
CentOS6.4, you might look at
http://openstack.redhat.com/Using_Ceph_for_Block_Storage_with_RDO. You
just don't need to install and configure OpenStack, obviously. You do
need to get the upstream version of QEMU from the Ceph repository
though.
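
For the secret itself, the rough shape is something like the sketch below
(untested; the client name "client.libvirt", the pool "rbd", and the
secret.xml contents are assumptions, so adjust to your setup):

# create a cephx user for the hypervisor
ceph auth get-or-create client.libvirt mon 'allow r' osd 'allow rwx pool=rbd'

# define a libvirt secret and attach the cephx key to it
cat > secret.xml <<EOF
<secret ephemeral='no' private='no'>
  <usage type='ceph'>
    <name>client.libvirt secret</name>
  </usage>
</secret>
EOF
virsh secret-define --file secret.xml
# then, with UUID being the value printed by secret-define:
virsh secret-set-value --secret UUID --base64 "$(ceph auth get-key client.libvirt)"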



On Thu, Sep 5, 2013 at 10:03 AM, Liu, Larry  wrote:
> Thanks, Neil!  Does anyone have a working doc on how to generate a secret for
> a CentOS 6.4 tech preview machine to access an RBD cluster?
>
> From: Neil Levine 
> Date: Thursday, August 29, 2013 5:01 PM
> To: Larry Liu 
> Cc: "ceph-users@lists.ceph.com" 
> Subject: Re: [ceph-users] Ceph Supported by Citrix XenServer Yet?
>
> The XenServer product has a tech preview version available with Ceph RBD
> support:
>
> http://xenserver.org/discuss-virtualization/virtualization-blog/entry/tech-preview-of-xenserver-libvirt-ceph.html
>
> The fully-supported, commercial version from Citrix will be available
> sometime in Q4.
>
> Neil
>
>
>
> On Thu, Aug 29, 2013 at 4:55 PM, Liu, Larry  wrote:
>>
>> Hi guys,
>>
>> Has anyone heard whether Citrix XenServer supports Ceph yet?  Provisioning
>> CentOS 6.4 and then installing Xen on top of it seems a bit too much.
>>
>> Thanks!
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
John Wilkins
Senior Technical Writer
Inktank
john.wilk...@inktank.com
(415) 425-9599
http://inktank.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy preflight hostname check?

2013-09-05 Thread John Wilkins
Let me follow up on that and get back to you. There has been a
significant amount of work on ceph-deploy since that was written.

On Wed, Sep 4, 2013 at 9:27 PM, Nigel Williams
 wrote:
> I notice under the HOSTNAME RESOLUTION section the use of 'host -4
> {hostname}' as a required test.  However, in all my trial deployments
> so far, none would pass, as this command is a direct DNS query, and
> instead I usually just add entries to the hosts file.
>
> Two thoughts: is Ceph expecting to only do DNS queries?  Or would it be
> better for the pre-flight check to use 'getent hosts {hostname}' as the
> test?
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
John Wilkins
Senior Technical Writer
Inktank
john.wilk...@inktank.com
(415) 425-9599
http://inktank.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] New to ceph, auth/permission error

2013-09-05 Thread Gary Mazzaferro
Hi

I installed the latest Ceph and am having a permission issue, and I don't
know where to start looking.

My Config:
(2) OSD data nodes
(1) monitor node
(1) mds node
(1) admin node
(1) deploy node
(1) client node (not configured)

All on vmware

I collected all keyrings
I pushed the config file to all nodes

ceph@aries-admin:~$ ceph status
2013-09-05 08:32:57.908009 7fd03e469700  0 librados: client.admin
authentication error (1) Operation not permitted
Error connecting to cluster: PermissionError

I'm not sure whether this is a local issue or a remote failure.

help appreciated

-gary
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] New to ceph, auth/permission error

2013-09-05 Thread Mark Kirkwood

On 06/09/13 11:07, Gary Mazzaferro wrote:

Hi

I installed the latest Ceph and am having a permission issue, and I don't
know where to start looking.

My Config:
(2) OSD data nodes
(1) monitor node
(1) mds node
(1) admin node
(1) deploy node
(1) client node (not configured)

All on vmware

I collected all keyrings
I pushed the config file to all nodes

ceph@aries-admin:~$ ceph status
2013-09-05 08:32:57.908009 7fd03e469700  0 librados: client.admin
authentication error (1) Operation not permitted
Error connecting to cluster: PermissionError

I'm not sure whether this is a local issue or a remote failure.



Hi Gary,

You will need a copy of the admin keyring on the admin host, and 
possibly a copy of ceph.conf too. Then try:


$ ceph -c ./ceph.conf -k ceph.client.admin.keyring status

(I'm assuming your admin keyring is called ceph.client.admin.keyring and 
config is ceph.conf)
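
If the cluster was set up with ceph-deploy, another option (a sketch, assuming
"aries-admin" is the admin host and the keyrings are still in the deploy
directory) is to push the config and admin keyring into /etc/ceph from the
deploy node:

ceph-deploy admin aries-admin
ceph status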


Cheers

Mark


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] newbie question: rebooting the whole cluster, powerfailure

2013-09-05 Thread Alfredo Deza
On Thu, Sep 5, 2013 at 12:38 PM, Gregory Farnum  wrote:
> On Thu, Sep 5, 2013 at 9:31 AM, Alfredo Deza  wrote:
>> On Thu, Sep 5, 2013 at 11:42 AM, Bernhard Glomm
>>  wrote:
>>>
>>> Hi all,
>>>
>>> as a ceph newbie I have another question that has probably been solved long ago.
>>> I have a test cluster consisting of two OSDs that also host MONs,
>>> plus one to five additional MONs.
>>> Now I want to reboot all instances, simulating a power failure.
>>> So I shut down the extra MONs,
>>> then shut down the first OSD/MON instance (call it "ping"),
>>> and after that shutdown is complete, shut down the second OSD/MON
>>> instance (call it "pong").
>>> 5 minutes later I restart "pong"; then, after I have checked that all services
>>> are up and running, I restart "ping"; afterwards I restart the MON that I
>>> brought down last, but not the other MONs (since - surprise - in this test
>>> scenario they are just virtual instances residing on some ceph rbds).
>>>
>>> I think this is the wrong way to do it, since it breaks the cluster
>>> unrecoverably...
>>> at least that's what it looks like: ceph tries to contact one of the MONs
>>> that isn't there yet.
>>> How do I shut down and restart the whole cluster in a coordinated way in case
>>> of a power failure (I need a script for our UPS)?
>>>
>>> And a second question regarding ceph-deploy:
>>> How do I specify a second NIC/address to be used for the cluster-internal
>>> communication?
>>
>> You will not be able to do something like this with ceph-deploy. This
>> sounds like a very specific (or a bit more advanced)
>> configuration than what ceph-deploy offers.
>
> Actually, you can — when editing the ceph.conf (before creating any
> daemons) simply set public addr and cluster addr in whatever section
> is appropriate. :)

Oh, you are right! I was thinking about a flag in ceph-deploy for some reason :)

Sorry for the confusion!
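
For the archives, a minimal ceph.conf sketch of what Greg describes (the
networks, addresses and section names below are just placeholders):

[global]
    public network  = 192.168.1.0/24
    cluster network = 10.0.0.0/24

# or, per daemon:
[osd.0]
    public addr  = 192.168.1.11
    cluster addr = 10.0.0.11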

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] quick-ceph-deploy

2013-09-05 Thread sriram
I am trying to deploy ceph following the instructions in this link.

http://ceph.com/docs/master/start/quick-ceph-deploy/

I get the error below. Can someone let me know whether this is something I am
doing wrong or a problem with the script?

[abc@abc-ld ~]$ ceph-deploy install abc-ld
[ceph_deploy.install][DEBUG ] Installing stable version dumpling on cluster
ceph hosts abc-ld
[ceph_deploy.install][DEBUG ] Detecting platform for host abc-ld ...
[sudo] password for abc:
[ceph_deploy.install][INFO  ] Distro info: RedHatEnterpriseWorkstation 6.1
Santiago
[abc-ld][INFO  ] installing ceph on abc-ld
[abc-ld][INFO  ] Running command: su -c 'rpm --import "
https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";'
[abc-ld][ERROR ] Traceback (most recent call last):
[abc-ld][ERROR ]   File
"/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py",
line 21, in install
[abc-ld][ERROR ]   File
"/usr/lib/python2.6/site-packages/ceph_deploy/util/decorators.py", line 10,
in inner
[abc-ld][ERROR ] def inner(*args, **kwargs):
[abc-ld][ERROR ]   File
"/usr/lib/python2.6/site-packages/ceph_deploy/util/wrappers.py", line 6, in
remote_call
[abc-ld][ERROR ] This allows us to only remote-execute the actual
calls, not whole functions.
[abc-ld][ERROR ]   File "/usr/lib64/python2.6/subprocess.py", line 502, in
check_call
[abc-ld][ERROR ] raise CalledProcessError(retcode, cmd)
[abc-ld][ERROR ] CalledProcessError: Command '['su -c \'rpm --import "
https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc"\'']'
returned non-zero exit status 1
[abc-ld][ERROR ] error:
https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc: key 1
import failed.
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: su -c 'rpm
--import "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc";'
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Using radosgw with s3cmd: Bucket failure

2013-09-05 Thread Georg Höllrigl

On 23.08.2013 16:24, Yehuda Sadeh wrote:

On Fri, Aug 23, 2013 at 1:47 AM, Tobias Brunner  wrote:

Hi,

I'm trying to use radosgw with s3cmd:

# s3cmd ls

# s3cmd mb s3://bucket-1
ERROR: S3 error: 405 (MethodNotAllowed):

So there seems to be something missing regarding buckets. How can I
create buckets? What do I have to configure on the radosgw side to have
buckets working?



The problem that you have here is that s3cmd uses the virtual host
bucket name mechanism, e.g. it tries to access http://bucket.<gateway-host>/
instead of the usual http://<gateway-host>/bucket. You can configure the
gateway to support that (set 'rgw dns name = <gateway-host>' in your
ceph.conf); however, you'll also need to be able to route all these
requests to your host, using some catch-all DNS. The easiest way to go
would be to configure your client not to use that virtual host bucket
name, but I'm not completely sure s3cmd can do that.
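
(For illustration, with "objects.example.com" standing in for the real gateway
name and "client.radosgw.gateway" for the gateway section, the pieces involved
would look roughly like this:)

# ceph.conf on the gateway host:
[client.radosgw.gateway]
    rgw dns name = objects.example.com

# DNS: a wildcard record so every bucket vhost resolves to the gateway, e.g.
#   *.objects.example.com.  IN  CNAME  objects.example.com.

# ~/.s3cfg for s3cmd:
host_base   = objects.example.com
host_bucket = %(bucket)s.objects.example.com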

Yehuda
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



I'm facing exactly the same problem - but this didn't help. I've set up
the DNS, I can reach the subdomains, and I've also set the "rgw dns name".

But I still have the same trouble here :(

Georg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com