[ceph-users] Only one OSD log available per node?
Dear all, I have multiple OSDs per node (normally 4) and I realised that on every node I have, only one OSD writes logs under /var/log/ceph; the rest of the log files are empty. root@ceph-osd-07:/var/log/ceph# ls -la *.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-client.admin.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.0.log -rw-r--r-- 1 root root 386857 Apr 28 14:02 ceph-osd.12.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.13.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.14.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.15.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd..log The ceph-osd.12.log contains the logs for osd.12 only, while the logs for osd.13, 14 and 15 are empty. Is this normal? Looking forward to your reply, thank you. Cheers. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
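For reference, one way to check what each OSD daemon thinks it is logging to is to query it over its admin socket; a minimal sketch, assuming the default admin socket paths and using osd.13 and osd.12 only as example ids:

    # ask a silent OSD which log file and debug levels it is using
    ceph --admin-daemon /var/run/ceph/ceph-osd.13.asok config show | grep -E 'log_file|debug_osd'
    # compare with the OSD that does log
    ceph --admin-daemon /var/run/ceph/ceph-osd.12.asok config show | grep -E 'log_file|debug_osd'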
Re: [ceph-users] OOM-Killer for ceph-osd
2014-04-27 23:58 GMT+02:00 Andrey Korolyov and...@xdel.ru: Nothing looks wrong, except the heartbeat interval, which probably should be smaller due to recovery considerations. Try ``ceph osd tell X heap release'' and if it does not change memory consumption, file a bug. What should I look for when running this? It seems to do nothing. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
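For reference, the heap commands mentioned above look roughly like this (osd.0 is only an example id; run them from a host with an admin keyring):

    ceph tell osd.0 heap stats      # print tcmalloc heap statistics for osd.0
    ceph tell osd.0 heap release    # ask tcmalloc to return unused memory to the OS

Comparing the resident/heap sizes reported by "heap stats" before and after the release is one way to see whether it actually freed anything.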
Re: [ceph-users] SSD journal overload?
Hi, perhaps due to IOs from the journal? You can test with iostat (like iostat -dm 5 sdg). On debian iostat is in the package sysstat. Udo On 28.04.2014 07:38, Indra Pramana wrote: Hi Craig, Good day to you, and thank you for your enquiry. As per your suggestion, I have created a 3rd partition on the SSDs and did the dd test directly on the device, and the result is very slow. root@ceph-osd-08:/mnt# dd bs=1M count=128 if=/dev/zero of=/dev/sdg3 conv=fdatasync oflag=direct 128+0 records in 128+0 records out 134217728 bytes (134 MB) copied, 19.5223 s, 6.9 MB/s root@ceph-osd-08:/mnt# dd bs=1M count=128 if=/dev/zero of=/dev/sdf3 conv=fdatasync oflag=direct 128+0 records in 128+0 records out 134217728 bytes (134 MB) copied, 5.34405 s, 25.1 MB/s I did a test on another server with exactly the same specification and a similar SSD drive (Seagate SSD 100 GB) but not added into the cluster yet (thus no load), and the result is fast: root@ceph-osd-09:/home/indra# dd bs=1M count=128 if=/dev/zero of=/dev/sdf1 conv=fdatasync oflag=direct 128+0 records in 128+0 records out 134217728 bytes (134 MB) copied, 0.742077 s, 181 MB/s Does the Ceph journal load really take up that much of the SSD's resources? I don't understand how the performance can drop so significantly, especially since the two Ceph journals are only taking the first 20 GB out of the 100 GB of the SSD's total capacity. Any advice is greatly appreciated. Looking forward to your reply, thank you. Cheers. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
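As a side note (not from the thread), a single large sequential dd is not very representative of journal traffic; small synchronous writes are closer to what the journal does. A hedged sketch, with the device name as a placeholder and destructive to whatever is on that partition:

    # approximate journal-style I/O: small, direct, synchronous writes
    # WARNING: this overwrites data on /dev/sdg3
    dd if=/dev/zero of=/dev/sdg3 bs=4k count=10000 oflag=direct,dsync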
Re: [ceph-users] SSD journal overload?
What model of SSD do you have? Which version of the kernel? -- Best regards, Irek Fasikhov Nurgayazovich. Mob.: +79229045757 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] cluster_network ignored
2014-04-26 12:06 GMT+02:00 Gandalf Corvotempesta gandalf.corvotempe...@gmail.com: I've not defined cluster IPs for each OSD server but only the whole subnet. Should I define each IP for each OSD? This is not written in the docs and could be tricky to do in big environments with hundreds of nodes. I've added cluster addr and public addr to each OSD configuration but nothing has changed. I see all OSDs down except the ones from one server, but I'm able to ping each of the other nodes on both interfaces. How can I detect what ceph is doing? I see tons of debug logs but they are not very easy to understand. With ceph health I can see that the pgs down value is slowly decreasing, so I suppose that ceph is recovering. Is that right? Isn't it possible to add a simplified output like the one coming from mdadm? (cat /proc/mdstat) ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
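For what it's worth, per-daemon addresses are normally not needed once public_network and cluster_network are set in [global], but if you do want to pin them explicitly, a ceph.conf fragment would look roughly like this (the IPs and the osd id are placeholders for one OSD on one host):

    [osd.0]
    host = osd1
    public addr = 192.168.0.1
    cluster addr = 10.0.0.1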
Re: [ceph-users] Access denied error
Hi Yehuda, I am using the same above method to call the api and used the way which described in the http://ceph.com/docs/master/radosgw/s3/authentication/#access-control-lists-aclsfor connection. The method in the http://s3.amazonaws.com/doc/s3-developer-guide/RESTAuthentication.html is for generating the hash of the header string and secret keys, since these keys are created already and i think we don't need this method, right ? I also tried one function to list out the bucket data as like curl -i 'http://gateway.3linux.com/test?format=json' -X GET -H 'Authorization: AWS KGXJJGKDM5G7G4CNKC7R:LC7S0twZdhtXA1XxthfMDsj5TgJpeKhZrloWa9WN' -H 'Host: gateway.3linux.com' -H 'Date: Mon, 28 April 2014 07:25:00 GMT ' -H 'Content-Length: 0' but its also getting the access denied error. But i can view the bucket details by directly entering http://gateway.3linux.com/test?format=json in the browser. What do you think ? what may be the reason ? I am able to connect and list buckets etc using cyberduck ftp clients these access keys but unable to do with the function calls. On Sat, Apr 26, 2014 at 12:22 AM, Yehuda Sadeh yeh...@inktank.com wrote: On Fri, Apr 25, 2014 at 1:03 AM, Punit Dambiwal hypu...@gmail.com wrote: Hi Yehuda, Thanks for your help...that missing date error gone but still i am getting the access denied error :- - 2014-04-25 15:52:56.988025 7f00d37c6700 1 == starting new request req=0x237a090 = 2014-04-25 15:52:56.988072 7f00d37c6700 2 req 24:0.46::GET /admin/usage::initializing 2014-04-25 15:52:56.988077 7f00d37c6700 10 host=gateway.3linux.com rgw_dns_name=gateway.3linux.com 2014-04-25 15:52:56.988102 7f00d37c6700 20 FCGI_ROLE=RESPONDER 2014-04-25 15:52:56.988103 7f00d37c6700 20 SCRIPT_URL=/admin/usage 2014-04-25 15:52:56.988104 7f00d37c6700 20 SCRIPT_URI=http://gateway.3linux.com/admin/usage 2014-04-25 15:52:56.988105 7f00d37c6700 20 HTTP_AUTHORIZATION=AWS KGXJJGKDM5G7G4CNKC7R:LC7S0twZdhtXA1XxthfMDsj5TgJpeKhZrloWa9WN 2014-04-25 15:52:56.988107 7f00d37c6700 20 HTTP_USER_AGENT=curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4libidn/1.23 librtmp/2.3 2014-04-25 15:52:56.988108 7f00d37c6700 20 HTTP_ACCEPT=*/* 2014-04-25 15:52:56.988109 7f00d37c6700 20 HTTP_HOST=gateway.3linux.com 2014-04-25 15:52:56.988110 7f00d37c6700 20 HTTP_DATE=Fri, 25 April 2014 07:50:00 GMT 2014-04-25 15:52:56.988111 7f00d37c6700 20 CONTENT_LENGTH=0 2014-04-25 15:52:56.988112 7f00d37c6700 20 PATH=/usr/local/bin:/usr/bin:/bin 2014-04-25 15:52:56.988113 7f00d37c6700 20 SERVER_SIGNATURE= 2014-04-25 15:52:56.988114 7f00d37c6700 20 SERVER_SOFTWARE=Apache/2.2.22 (Ubuntu) 2014-04-25 15:52:56.988115 7f00d37c6700 20 SERVER_NAME= gateway.3linux.com 2014-04-25 15:52:56.988116 7f00d37c6700 20 SERVER_ADDR=117.18.79.110 2014-04-25 15:52:56.988117 7f00d37c6700 20 SERVER_PORT=80 2014-04-25 15:52:56.988117 7f00d37c6700 20 REMOTE_ADDR=122.166.115.191 2014-04-25 15:52:56.988118 7f00d37c6700 20 DOCUMENT_ROOT=/var/www 2014-04-25 15:52:56.988119 7f00d37c6700 20 SERVER_ADMIN=c...@3linux.com 2014-04-25 15:52:56.988120 7f00d37c6700 20 SCRIPT_FILENAME=/var/www/s3gw.fcgi 2014-04-25 15:52:56.988120 7f00d37c6700 20 REMOTE_PORT=28840 2014-04-25 15:52:56.988121 7f00d37c6700 20 GATEWAY_INTERFACE=CGI/1.1 2014-04-25 15:52:56.988122 7f00d37c6700 20 SERVER_PROTOCOL=HTTP/1.1 2014-04-25 15:52:56.988123 7f00d37c6700 20 REQUEST_METHOD=GET 2014-04-25 15:52:56.988123 7f00d37c6700 20 QUERY_STRING=page=adminparams=/usageformat=json 2014-04-25 15:52:56.988124 7f00d37c6700 20 REQUEST_URI=/admin/usage?format=json 2014-04-25 15:52:56.988125 
7f00d37c6700 20 SCRIPT_NAME=/admin/usage 2014-04-25 15:52:56.988126 7f00d37c6700 2 req 24:0.000101::GET /admin/usage::getting op 2014-04-25 15:52:56.988129 7f00d37c6700 2 req 24:0.000104::GET /admin/usage:get_usage:authorizing 2014-04-25 15:52:56.988141 7f00d37c6700 20 get_obj_state: rctx=0x7effbc004aa0 obj=.users:KGXJJGKDM5G7G4CNKC7R state=0x7effbc00e718 s-prefetch_data=0 2014-04-25 15:52:56.988148 7f00d37c6700 10 moving .users+KGXJJGKDM5G7G4CNKC7R to cache LRU end 2014-04-25 15:52:56.988150 7f00d37c6700 10 cache get: name=.users+KGXJJGKDM5G7G4CNKC7R : hit 2014-04-25 15:52:56.988155 7f00d37c6700 20 get_obj_state: s-obj_tag was set empty 2014-04-25 15:52:56.988160 7f00d37c6700 10 moving .users+KGXJJGKDM5G7G4CNKC7R to cache LRU end 2014-04-25 15:52:56.988161 7f00d37c6700 10 cache get: name=.users+KGXJJGKDM5G7G4CNKC7R : hit 2014-04-25 15:52:56.988179 7f00d37c6700 20 get_obj_state: rctx=0x7effbc001ce0 obj=.users.uid:admin state=0x7effbc00ec58 s-prefetch_data=0 2014-04-25 15:52:56.988185 7f00d37c6700 10 moving .users.uid+admin to cache LRU end 2014-04-25 15:52:56.988186 7f00d37c6700 10 cache get: name=.users.uid+admin : hit 2014-04-25 15:52:56.988190 7f00d37c6700 20 get_obj_state: s-obj_tag was set empty 2014-04-25
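For reference, one hedged way to build the Authorization header by hand with openssl (the access key is the one visible in the log above; the secret key and the bucket path are placeholders, and note the abbreviated month name in the Date header):

    access_key=KGXJJGKDM5G7G4CNKC7R
    secret_key='YOUR-SECRET-KEY'
    date_hdr=$(date -u '+%a, %d %b %Y %H:%M:%S GMT')
    resource='/test'
    string_to_sign="GET\n\n\n${date_hdr}\n${resource}"
    signature=$(printf '%b' "$string_to_sign" | openssl sha1 -hmac "$secret_key" -binary | base64)
    curl -i "http://gateway.3linux.com${resource}?format=json" \
         -H "Date: ${date_hdr}" \
         -H "Authorization: AWS ${access_key}:${signature}"

The string to sign for a plain GET is the method, an empty Content-MD5, an empty Content-Type, the Date header and the canonicalized resource, each separated by a newline, as described in the Amazon REST authentication document linked above.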
[ceph-users] Please provide me rados gateway configuration (rgw.conf) for lighttpd
Hi All, I would like to use lighttpd instead of apache for the rados gateway configuration, but I am facing issues with the syntax for rgw.conf. Could you please share the details of how I can prepare rgw.conf for lighttpd? Please also suggest a version of mod_fastcgi for apache version 2.4.3. Thanks, Srininivas. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
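I don't have a canonical rgw.conf for lighttpd to point at, but a minimal sketch of what the fastcgi wiring usually looks like is below; the socket path must match the rgw socket path configured in ceph.conf, and all names here are assumptions rather than a tested configuration:

    server.modules += ( "mod_fastcgi" )
    fastcgi.server = ( "/" =>
      (( "socket"      => "/var/run/ceph/radosgw.sock",
         "check-local" => "disable"
      ))
    )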
[ceph-users] Unable to add CEPH as Primary Storage - libvirt error undefined storage pool type
Hi, I'm trying to add CEPH as Primary Storage, but my libvirt 0.10.2 (CentOS 6.5) complains: - internal error missing backend for pool type 8 Is it possible that the libvirt 0.10.2 (shipped with CentOS 6.5) was not compiled with RBD support? Can't find how to check this... I'm able to use qemu-img to create rbd images etc... Here is the cloudstack-agent DEBUG output, all seems fine...
<pool type='rbd'>
  <name>1e119e4c-20d1-3fbc-a525-a5771944046d</name>
  <uuid>1e119e4c-20d1-3fbc-a525-a5771944046d</uuid>
  <source>
    <host name='10.44.253.10' port='6789'/>
    <name>cloudstack</name>
    <auth username='cloudstack' type='ceph'>
      <secret uuid='1e119e4c-20d1-3fbc-a525-a5771944046d'/>
    </auth>
  </source>
</pool>
-- Andrija Panić ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
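One rough way to check whether the installed packages were built with RBD support (paths are typical for CentOS 6 and may differ on your system; this is a heuristic, not an authoritative test):

    # does libvirtd link against librbd (i.e. was the rbd storage backend compiled in)?
    ldd /usr/sbin/libvirtd | grep -i rbd
    # does qemu itself know about the rbd format?
    qemu-img --help | grep rbd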
Re: [ceph-users] SSD journal overload?
Hi Udo and Irek, Good day to you, and thank you for your emails. perhaps due to IOs from the journal? You can test with iostat (like iostat -dm 5 sdg). Yes, I have shared the iostat result earlier on this same thread. At times the utilisation of the 2 journal drives will hit 100%, especially when I simulate writing data using the rados bench command. Any suggestions on what could be the cause of the I/O issue?
avg-cpu: %user %nice %system %iowait %steal %idle
          1.85  0.00    1.65    3.14   0.00  93.36
Device: rrqm/s wrqm/s  r/s   w/s rkB/s    wkB/s avgrq-sz avgqu-sz  await r_await w_await svctm  %util
sdg       0.00   0.00 0.00 55.00  0.00 25365.33   922.38    34.22 568.90    0.00  568.90 17.82  98.00
sdf       0.00   0.00 0.00 55.67  0.00 25022.67   899.02    29.76 500.57    0.00  500.57 17.60  98.00
avg-cpu: %user %nice %system %iowait %steal %idle
          2.10  0.00    1.37    2.07   0.00  94.46
Device: rrqm/s wrqm/s  r/s   w/s rkB/s    wkB/s avgrq-sz avgqu-sz  await r_await w_await svctm  %util
sdg       0.00   0.00 0.00 56.67  0.00 25220.00   890.12    23.60 412.14    0.00  412.14 17.62  99.87
sdf       0.00   0.00 0.00 52.00  0.00 24637.33   947.59    33.65 587.41    0.00  587.41 19.23 100.00
avg-cpu: %user %nice %system %iowait %steal %idle
          2.21  0.00    1.77    6.75   0.00  89.27
Device: rrqm/s wrqm/s  r/s   w/s rkB/s    wkB/s avgrq-sz avgqu-sz  await r_await w_await svctm  %util
sdg       0.00   0.00 0.00 54.33  0.00 24802.67   912.98    25.75 486.36    0.00  486.36 18.40 100.00
sdf       0.00   0.00 0.00 53.00  0.00 24716.00   932.68    35.26 669.89    0.00  669.89 18.87 100.00
avg-cpu: %user %nice %system %iowait %steal %idle
          1.87  0.00    1.67    5.25   0.00  91.21
Device: rrqm/s wrqm/s  r/s   w/s rkB/s    wkB/s avgrq-sz avgqu-sz  await r_await w_await svctm  %util
sdg       0.00   0.00 0.00 94.33  0.00 26257.33   556.69    18.29 208.44    0.00  208.44 10.50  99.07
sdf       0.00   0.00 0.00 51.33  0.00 24470.67   953.40    32.75 684.62    0.00  684.62 19.51 100.13
avg-cpu: %user %nice %system %iowait %steal %idle
          1.51  0.00    1.34    7.25   0.00  89.89
Device: rrqm/s wrqm/s  r/s   w/s rkB/s    wkB/s avgrq-sz avgqu-sz  await r_await w_await svctm  %util
sdg       0.00   0.00 0.00 52.00  0.00 22565.33   867.90    24.73 446.51    0.00  446.51 19.10  99.33
sdf       0.00   0.00 0.00 64.67  0.00 24892.00   769.86    19.50 330.02    0.00  330.02 15.32  99.07
What model SSD? For this one, I am using a Seagate 100GB SSD, model: HDS-2TM-ST100FM0012 Which version of the kernel? Ubuntu 13.04, Linux kernel version: 3.8.0-19-generic #30-Ubuntu SMP Wed May 1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux Looking forward to your reply, thank you. Cheers. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] SSD journal overload?
Most likely you need to apply a patch to the kernel: http://www.theirek.com/blog/2014/02/16/patch-dlia-raboty-s-enierghoniezavisimym-keshiem-ssd-diskov ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] SSD journal overload?
Hi Irek, Thanks for the article. Do you have any other web sources pertaining to the same issue, which are in English? Looking forward to your reply, thank you. Cheers. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] SSD journal overload?
This is my article :). To patch the kernel, apply http://www.theirek.com/downloads/code/CMD_FLUSH.diff. After rebooting, run the following command: echo "temporary write through" > /sys/class/scsi_disk/<disk>/cache_type ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Access denied error
Hi Punit, On 28 Apr 2014, at 11:55, Punit Dambiwal hypu...@gmail.com wrote: Hi Yehuda, I am using the same above method to call the api and used the way described in http://ceph.com/docs/master/radosgw/s3/authentication/#access-control-lists-acls for the connection. The method in http://s3.amazonaws.com/doc/s3-developer-guide/RESTAuthentication.html is for generating the hash of the header string and secret keys; since these keys are created already, I think we don't need this method, right? No, there is a difference between the aws_access_id and aws_secret_key (static, generated by radosgw at user creation) and the AWS Authentication header, which is dynamic. To my understanding, the AWS signature header needs to be regenerated regularly because of the parts it embeds, plus the time expiration period. I think you can safely regenerate the AWS Auth signature for each request. Cheers ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Unable to add CEPH as Primary Storage - libvirt error undefined storage pool type
Thank you very much Wido, any suggestion on compiling libvirt with support (I already found a way) or perhaps use some prebuilt , that you would recommend ? Best On 28 April 2014 13:25, Wido den Hollander w...@42on.com wrote: On 04/28/2014 12:49 PM, Andrija Panic wrote: Hi, I'm trying to add CEPH as Primary Storage, but my libvirt 0.10.2 (CentOS 6.5) does some complaints: - internal error missing backend for pool type 8 Is it possible that the libvirt 0.10.2 (shipped with CentOS 6.5) was not compiled with RBD support ? Can't find how to check this... No, it's probably not compiled with RBD storage pool support. As far as I know CentOS doesn't compile libvirt with that support yet. I'm able to use qemu-img to create rbd images etc... Here is cloudstack-agent DEBUG output, all seems fine... pool type='rbd' name1e119e4c-20d1-3fbc-a525-a5771944046d/name uuid1e119e4c-20d1-3fbc-a525-a5771944046d/uuid source host name='10.44.253.10' port='6789'/ I recommend creating a Round Robin DNS record which points to all your monitors. namecloudstack/name auth username='cloudstack' type='ceph' secret uuid='1e119e4c-20d1-3fbc-a525-a5771944046d'/ /auth /source /pool -- Andrija Panić ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Wido den Hollander 42on B.V. Ceph trainer and consultant Phone: +31 (0)20 700 9902 Skype: contact42on ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Andrija Panić -- http://admintweets.com -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] What happened if rbd lose a block?
What will happen if RBD loses all copies of a data block and I read that block? Context: I want to use RBD as main storage with replication factor 1, and drbd for replication to non-rbd storage on the client side. For example: Computer1: 1. connect rbd as /dev/rbd15 2. use the rbd device as the disk for drbd Computer2: Use an HDD for the drbd replication. I want to protect against a breakage of the ceph system (for example while upgrading ceph) and have long-distance replication. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] What happened if rbd lose a block?
On 04/28/2014 02:35 PM, Timofey Koolin wrote: What will happened if RBD lose all copied of data-block and I read the block? The read to the object will block until a replica comes online to serve it. Remember this with Ceph: Consistency goes over availability Context: I want use RDB as main storage with replication factor 1 and drbd for replication on non rbd storage by client side. For example: Computer1: 1. connect rbd as /dev/rbd15 2. use rbd as disk for drbd Computer2: Use HDD for drbd-replication. I want protect from break of ceph system (for example while upgrade ceph) and long-distance replication. Ceph wants to be consistent at all times. So copying over long distances with high latency will be very slow. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Wido den Hollander 42on B.V. Ceph trainer and consultant Phone: +31 (0)20 700 9902 Skype: contact42on ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Unable to add CEPH as Primary Storage - libvirt error undefined storage pool type
On 04/28/2014 02:15 PM, Andrija Panic wrote: Thank you very much Wido, any suggestion on compiling libvirt with support (I already found a way) or perhaps some prebuilt packages that you would recommend? No special suggestions, just make sure you use at least Ceph 0.67.7. I'm not aware of any pre-built packages for CentOS. -- Wido den Hollander 42on B.V. Ceph trainer and consultant Phone: +31 (0)20 700 9902 Skype: contact42on ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Unable to add CEPH as Primary Storage - libvirt error undefined storage pool type
On 28/04/14 14:54, Wido den Hollander wrote: On 04/28/2014 02:15 PM, Andrija Panic wrote: Thank you very much Wido, any suggestion on compiling libvirt with support (I already found a way) or perhaps use some prebuilt , that you would recommend ? No special suggestions, just make sure you use at least Ceph 0.67.7 I'm not aware of any pre-build packages for CentOS. Look for qemu-kvm-rhev ... el6 ... That's the Redhat built version of kvm which supports RBD. Cheers, Dan ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] packages for Trusty
Hello all, to begin with there is no Emperor package for saucy. Emperor for saucy is only rolled through git, and based on my experience having the test builds rolling in constantly can break a ceph cluster. I don't know why there is a gap in the ceph.com/download section, but the fact that Inktank considers the stable production version of ceph to be Dumpling should explain that much (that is what they sell). Why care about today's Ubuntu and today's stable release when the real product sold is the ceph of the past year, which works great on the Ubuntu of the past year. Alphe Salas. On 04/25/2014 06:03 PM, Craig Lewis wrote: Using the Emperor builds for Precise seems to work on Trusty. I just put a hold on all of the ceph, rados, and apache packages before the release upgrade. It makes me nervous though. I haven't stressed it much, and I don't really want to roll it out to production. I would like to see Emperor builds for Trusty, so I can get started rolling out Trusty independently of Firefly. Changing one thing at a time is invaluable when bad things start happening. *Craig Lewis* Senior Systems Engineer Office +1.714.602.1309 Email cle...@centraldesktop.com *Central Desktop. Work together in ways you never thought possible.* Connect with us Website http://www.centraldesktop.com/ | Twitter http://www.twitter.com/centraldesktop | Facebook http://www.facebook.com/CentralDesktop | LinkedIn http://www.linkedin.com/groups?gid=147417 | Blog http://cdblog.centraldesktop.com/ On 4/25/14 12:10, Sebastien wrote: Well as far as I know trusty has 0.79 and will get firefly as soon as it's ready so I'm not sure if it's that urgent. Precise repo should work fine. My 2 cents Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine 75008 Paris Web : www.enovance.com - Twitter : @enovance On Fri, Apr 25, 2014 at 9:05 PM, Travis Rhoden trho...@gmail.com wrote: Are there packages for Trusty being built yet? I don't see it listed at http://ceph.com/debian-emperor/dists/ Thanks, - Travis ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
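For reference, the package hold mentioned above can be done with apt-mark; the exact package list depends on what is actually installed on the host, so this is only an illustrative example:

    apt-mark hold ceph ceph-common ceph-mds librados2 librbd1 radosgw
    apt-mark showhold    # verify which packages are pinned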
Re: [ceph-users] Unable to add CEPH as Primary Storage - libvirt error undefined storage pool type
Thanks Dan :) On 28 April 2014 15:02, Dan van der Ster daniel.vanders...@cern.ch wrote: On 28/04/14 14:54, Wido den Hollander wrote: On 04/28/2014 02:15 PM, Andrija Panic wrote: Thank you very much Wido, any suggestion on compiling libvirt with support (I already found a way) or perhaps use some prebuilt , that you would recommend ? No special suggestions, just make sure you use at least Ceph 0.67.7 I'm not aware of any pre-build packages for CentOS. Look for qemu-kvm-rhev ... el6 ... That's the Redhat built version of kvm which supports RBD. Cheers, Dan -- Andrija Panić -- http://admintweets.com -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Unable to add CEPH as Primary Storage - libvirt error undefined storage pool type
Dan, is this maybe just rbd support for kvm package (I already have rbd enabled qemu, qemu-img etc from ceph.com site) I need just libvirt with rbd support ? Thanks On 28 April 2014 15:05, Andrija Panic andrija.pa...@gmail.com wrote: Thanks Dan :) On 28 April 2014 15:02, Dan van der Ster daniel.vanders...@cern.chwrote: On 28/04/14 14:54, Wido den Hollander wrote: On 04/28/2014 02:15 PM, Andrija Panic wrote: Thank you very much Wido, any suggestion on compiling libvirt with support (I already found a way) or perhaps use some prebuilt , that you would recommend ? No special suggestions, just make sure you use at least Ceph 0.67.7 I'm not aware of any pre-build packages for CentOS. Look for qemu-kvm-rhev ... el6 ... That's the Redhat built version of kvm which supports RBD. Cheers, Dan -- Andrija Panić -- http://admintweets.com -- -- Andrija Panić -- http://admintweets.com -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Unable to add CEPH as Primary Storage - libvirt error undefined storage pool type
Indeed. Actually, we didn't need a special libvirt, only the special qemu-kvm-rhev. But we use it via openstack, so I don't know if there is other redhat magic involved. sorry, dan On 28/04/14 15:08, Andrija Panic wrote: Dan, is this maybe just rbd support for kvm package (I already have rbd enabled qemu, qemu-img etc from ceph.com http://ceph.com site) I need just libvirt with rbd support ? Thanks On 28 April 2014 15:05, Andrija Panic andrija.pa...@gmail.com mailto:andrija.pa...@gmail.com wrote: Thanks Dan :) On 28 April 2014 15:02, Dan van der Ster daniel.vanders...@cern.ch mailto:daniel.vanders...@cern.ch wrote: On 28/04/14 14:54, Wido den Hollander wrote: On 04/28/2014 02:15 PM, Andrija Panic wrote: Thank you very much Wido, any suggestion on compiling libvirt with support (I already found a way) or perhaps use some prebuilt , that you would recommend ? No special suggestions, just make sure you use at least Ceph 0.67.7 I'm not aware of any pre-build packages for CentOS. Look for qemu-kvm-rhev ... el6 ... That's the Redhat built version of kvm which supports RBD. Cheers, Dan -- Andrija Panic' -- http://admintweets.com -- -- Andrija Panic' -- http://admintweets.com -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Installing ceph without access to the internet
On Sun, Apr 27, 2014 at 7:24 AM, Cedric Lemarchand ced...@yipikai.org wrote: Hi rAn, On 27/04/2014 13:13, rAn rAnn wrote: Thanks all. I'm trying to deploy from node1 (the admin node) to the new node via the command ceph-deploy install node2. I have copied the two main repositories (noarch and x86_64) to my secure site and I have encountered the following warnings and errors: [node2][warnin] http://ceph.com/rmp-emperor/e16/x86_64/repodata/repomd.xml: Errno 14 PYCURL ERROR 6 - couldn't resolve host 'ceph.com' [node2][warnin] ERROR: Cannot retrieve repository metadata [repomd.xml] for repository: ceph. Please verify its path and try again [node2][ERROR] RuntimeError: Failed to execute command yum -y -q install ceph Anyone have an idea? For the past few versions, ceph-deploy has a few options for installation without internet access. If you have a ceph repo mirror you could tell ceph-deploy to install from that url (you will need to pass in the gpg key url too). For example: ceph-deploy install --repo-url {http mirror} --gpg-url {http gpg url} {host} The ceph-deploy tool remotely 'pilots' the ceph installation and configuration on the specified node, including package installation, thus you still need internet access for the installation parts, which is why pycurl (and then yum) complains. Possible solutions could be: - as Eric stated, create a local mirror of the remote package repository (don't know if it's an easy task ...), and configure your OS to use it. - download and install all the necessary packages and dependencies on the nodes before using ceph-deploy, thus you will benefit from the local package cache for every operation. Cheers Cédric On 27 Apr 2014 14:02, xan.peng xanp...@gmail.com wrote: On Sat, Apr 26, 2014 at 7:16 PM, rAn rAnn ran.sh...@gmail.com wrote: hi, I'm on a site with no access to the internet and I'm trying to install ceph. During the installation it tries to download files from the internet and then I get an error. I tried to download the files and make my own repository, and I have also changed the installation code to point to a different path, but it still keeps trying to access the internet. Has anyone managed to install ceph on a secure site, or maybe someone has an idea or a way to install it? Thanks in advance. You can always install ceph by compiling the source code. However this half-done project (https://github.com/xanpeng/ceph-tools/tree/master/deploy-ceph) may help. -- Cédric ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
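As a rough sketch of the local-mirror route on an RPM-based system (the repo id, paths and hostnames are placeholders; run the first two commands on a machine that does have internet access and the ceph repo configured in yum):

    reposync --repoid=ceph --download_path=/srv/mirror
    createrepo /srv/mirror/ceph
    # serve /srv/mirror over HTTP inside the secure site, then:
    ceph-deploy install --repo-url http://mirror.local/ceph \
        --gpg-url http://mirror.local/ceph/release.asc node2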
[ceph-users] rbd Problem with disk full
Hello, I need the rbd kernel module to really delete data on the OSD-backed disks. Having ever-growing hidden data is not a great solution. So first of all we should be able, at least manually, to strip out the hidden data, aka the replicas. I use an rbd image; let's say it is 10 TB on an overall available space of 25 TB. What real-world experience shows me is that if I write 8 TB of my 10 TB in a row, the overall used data is around 18 TB. Then I delete 4 TB from the rbd image and write 4 TB, and the overall data grows by 4 TB: the pgs used by the rbd image are reused and overwritten, but the corresponding replicas are not freed. In the end, after round 2 of writing, the overall used space is 22 TB, and at that moment I get stuff like this: 2034 active+clean 7 active+remapped+wait_backfill+backfill_toofull 7 active+remapped+backfilling I tried to use ceph osd reweight-by-utilization but that didn't solve the problem. And even if the problem is solved, it would only be momentarily, because after cleaning another 4 TB and writing 4 TB I will reach the full ratio and get my OSDs stuck until I spend 12 000 dollars to enhance my ceph cluster, because when you manipulate a 40 TB ceph cluster, adding 4 TB isn't much of a difference. In the end, for 40 TB of real space (20 disks of 2 TB), after first formatting I get a 37 TB cluster of available space. Then I create an 18 TB rbd image and can't use much more than 16 TB before my OSDs show stuck pgs. In the end, 37 TB for 16 TB of usable disk space is really not a great solution at all, because I lose 60% of my data storage. On how to delete the data, really I don't know; the easiest way I can see is at least to be able to manually tell the rbd kernel module to clean released data from the OSDs when we see fit, at maintenance time, if doing it automatically has too bad an impact on overall performance. I would be glad to be able to decide an appropriate moment to force a cleaning task; that would be better than nothing and an ever-growing hidden-data situation. Regards, -- Alphe Salas I.T engineer ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
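A side note, not from the thread: space is normally only returned to the cluster when the client issues discards. On a filesystem sitting on an rbd device that supports discard, something like the following hands released blocks back; whether the kernel rbd module in the 3.8-era kernels discussed here honours discard is not something I can confirm, so treat this as a hedged sketch (the mount point is an example):

    # batched trim of a filesystem mounted from /dev/rbd15
    fstrim -v /mnt/rbd
    # or mount with online discard
    mount -o discard /dev/rbd15 /mnt/rbd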
Re: [ceph-users] OpenStack Icehouse and ephemeral disks created from image
FYI It’s fixed here: https://review.openstack.org/#/c/90644/1 Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 25 Apr 2014, at 18:16, Sebastien Han sebastien@enovance.com wrote: I just tried, I have the same problem, it looks like a regression… It’s weird because the code didn’t change that much during the Icehouse cycle. I just reported the bug here: https://bugs.launchpad.net/cinder/+bug/1312819 Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 25 Apr 2014, at 16:37, Sebastien Han sebastien@enovance.com wrote: g ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] rbd always expending data space problem.
Hello all, recently I came to the conclusion that of 40 TB of physical space I could use only 16 TB before seeing pgs get stuck because an osd was too full. The data space used seems to be forever growing. Using ceph osd reweight-by-utilization 103 seems at first to rebalance the osd pg usage, and the problem is solved for a time. But then the problem appears again with more PGs stuck too full, and it keeps growing. Sure, the solution should be to add more disk space, but for that enhancement to be significant and to solve the problem it should be at least 25%, which means growing the ceph cluster by 10 TB (5 disks of 2 TB or 3 disks of 4 TB). That has a cost, and the problem will only be solved for a moment, until the replicas that are never freed fill up the added space again. In the end I can really only count on using an rbd image of 16 TB out of 37 TB of global ceph cluster disk, which means I can really use about 40%, and over time that ratio will drop constantly. So it is required that the replicas and data can be overwritten so that the hidden data does not keep growing, or that I can clean them when I need to. Alphe Salas. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] apple support automated mail ? WTF!?
Hello, each time I send a mail to the ceph-users mailing list I receive an email from Apple support?! Is that a joke? Alphe Salas http://www.kepler.cl ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] OpenStack Icehouse and ephemeral disks created from image
On 28 April 2014 15:58, Sebastien Han sebastien@enovance.com wrote: FYI It’s fixed here: https://review.openstack.org/#/c/90644/1 I already have this patch and it didn't help. Did it fix the problem in your cluster? -- Maciej Gałkiewicz Shelly Cloud Sp. z o. o., Co-founder, Sysadmin http://shellycloud.com/, mac...@shellycloud.com KRS: 440358 REGON: 101504426 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] OpenStack Icehouse and ephemeral disks created from image
Yes yes, just restart cinder-api and cinder-volume. It worked for me. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 28 Apr 2014, at 16:10, Maciej Gałkiewicz mac...@shellycloud.com wrote: On 28 April 2014 15:58, Sebastien Han sebastien@enovance.com wrote: FYI It’s fixed here: https://review.openstack.org/#/c/90644/1 I already have this patch and it didn't help. Have it fixed the problem in your cluster? -- Maciej Gałkiewicz Shelly Cloud Sp. z o. o., Co-founder, Sysadmin http://shellycloud.com/, mac...@shellycloud.com KRS: 440358 REGON: 101504426 signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Data placement algorithm
Hi all, I'm currently interested in comparing Ceph's block replica placement policy with other placement algorithms. Is it possible to access the source code of Ceph's placement policy, and where can I find it? Thanks a lot. Best regards. CS ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
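Ceph is open source, so the placement code (CRUSH) can be read directly from the git tree; a hedged pointer, with paths as I know the source layout:

    git clone https://github.com/ceph/ceph.git
    ls ceph/src/crush/                     # mapper.c and builder.c hold the core CRUSH algorithm
    grep -l crush ceph/src/osd/OSDMap.cc   # the object -> PG -> OSD mapping calls into CRUSH from here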
Re: [ceph-users] Fwd: Access denied error
Hi Yehuda, We are using the same above method to call the api and used the way which described in the http://ceph.com/docs/master/radosgw/s3/authentication/#access-control-lists-acls for connection. The method in the http://s3.amazonaws.com/doc/s3-developer-guide/RESTAuthentication.html is for generating the hash of the header string and secret keys, since these keys are created already and i think we don't need this method, right ? I also tried one function to list out the bucket data as like curl -i 'http://gateway.3linux.com/test?format=json' -X GET -H 'Authorization: AWS KGXJJGKDM5G7G4CNKC7R:LC7S0twZdhtXA1XxthfMDsj5TgJpeKhZrloWa9WN' -H 'Host: gateway.3linux.com' -H 'Date: Mon, 28 April 2014 07:25:00 GMT ' -H 'Content-Length: 0' but its also getting the access denied error. But i can view the bucket details by directly entering http://gateway.3linux.com/test?format=json in the browser. What do you think ? what may be the reason ? I am able to connect and list buckets etc using cyberduck ftp clients these access keys but unable to do with the function calls. On Saturday 26 April 2014 10:17 AM, Punit Dambiwal wrote: Hi Shanil, I got the following reply from community :- Still signing issues. If you're manually constructing the auth header you need to make it look like the above (copy pasted here): 2014-04-25 15:52:56.988239 7f00d37c6700 10 auth_hdr: GET Fri, 25 April 2014 07:50:00 GMT /admin/usage Then you need to run hmac-sha1 on it, as described here: http://s3.amazonaws.com/doc/s3-developer-guide/RESTAuthentication.html If you have any backslash in the key then you need to remove it, it's just an escape character for representing slashes in json. -- Forwarded message -- From: *Yehuda Sadeh* yeh...@inktank.com mailto:yeh...@inktank.com Date: Sat, Apr 26, 2014 at 12:22 AM Subject: Re: [ceph-users] Access denied error To: Punit Dambiwal hypu...@gmail.com mailto:hypu...@gmail.com Cc: ceph-users@lists.ceph.com mailto:ceph-users@lists.ceph.com ceph-users@lists.ceph.com mailto:ceph-users@lists.ceph.com On Fri, Apr 25, 2014 at 1:03 AM, Punit Dambiwal hypu...@gmail.com mailto:hypu...@gmail.com wrote: Hi Yehuda, Thanks for your help...that missing date error gone but still i am getting the access denied error :- - 2014-04-25 15:52:56.988025 7f00d37c6700 1 == starting new request req=0x237a090 = 2014-04-25 15:52:56.988072 7f00d37c6700 2 req 24:0.46::GET /admin/usage::initializing 2014-04-25 15:52:56.988077 7f00d37c6700 10 host=gateway.3linux.com http://gateway.3linux.com rgw_dns_name=gateway.3linux.com http://gateway.3linux.com 2014-04-25 15:52:56.988102 7f00d37c6700 20 FCGI_ROLE=RESPONDER 2014-04-25 15:52:56.988103 7f00d37c6700 20 SCRIPT_URL=/admin/usage 2014-04-25 15:52:56.988104 7f00d37c6700 20 SCRIPT_URI=http://gateway.3linux.com/admin/usage 2014-04-25 15:52:56.988105 7f00d37c6700 20 HTTP_AUTHORIZATION=AWS KGXJJGKDM5G7G4CNKC7R:LC7S0twZdhtXA1XxthfMDsj5TgJpeKhZrloWa9WN 2014-04-25 15:52:56.988107 7f00d37c6700 20 HTTP_USER_AGENT=curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 http://1.2.3.4 libidn/1.23 librtmp/2.3 2014-04-25 15:52:56.988108 7f00d37c6700 20 HTTP_ACCEPT=*/* 2014-04-25 15:52:56.988109 7f00d37c6700 20 HTTP_HOST=gateway.3linux.com http://gateway.3linux.com 2014-04-25 15:52:56.988110 7f00d37c6700 20 HTTP_DATE=Fri, 25 April 2014 07:50:00 GMT 2014-04-25 15:52:56.988111 7f00d37c6700 20 CONTENT_LENGTH=0 2014-04-25 15:52:56.988112 7f00d37c6700 20 PATH=/usr/local/bin:/usr/bin:/bin 2014-04-25 15:52:56.988113 7f00d37c6700 20 SERVER_SIGNATURE= 2014-04-25 
15:52:56.988114 7f00d37c6700 20 SERVER_SOFTWARE=Apache/2.2.22 (Ubuntu) 2014-04-25 15:52:56.988115 7f00d37c6700 20 SERVER_NAME=gateway.3linux.com http://gateway.3linux.com 2014-04-25 15:52:56.988116 7f00d37c6700 20 SERVER_ADDR=117.18.79.110 2014-04-25 15:52:56.988117 7f00d37c6700 20 SERVER_PORT=80 2014-04-25 15:52:56.988117 7f00d37c6700 20 REMOTE_ADDR=122.166.115.191 2014-04-25 15:52:56.988118 7f00d37c6700 20 DOCUMENT_ROOT=/var/www 2014-04-25 15:52:56.988119 7f00d37c6700 20 SERVER_ADMIN=c...@3linux.com mailto:c...@3linux.com 2014-04-25 15:52:56.988120 7f00d37c6700 20 SCRIPT_FILENAME=/var/www/s3gw.fcgi 2014-04-25 15:52:56.988120 7f00d37c6700 20 REMOTE_PORT=28840 2014-04-25 15:52:56.988121 7f00d37c6700 20 GATEWAY_INTERFACE=CGI/1.1 2014-04-25 15:52:56.988122 7f00d37c6700 20 SERVER_PROTOCOL=HTTP/1.1 2014-04-25 15:52:56.988123 7f00d37c6700 20 REQUEST_METHOD=GET 2014-04-25 15:52:56.988123 7f00d37c6700 20 QUERY_STRING=page=adminparams=/usageformat=json 2014-04-25 15:52:56.988124 7f00d37c6700 20 REQUEST_URI=/admin/usage?format=json 2014-04-25 15:52:56.988125 7f00d37c6700 20 SCRIPT_NAME=/admin/usage 2014-04-25 15:52:56.988126 7f00d37c6700 2 req 24:0.000101::GET /admin/usage::getting op 2014-04-25 15:52:56.988129 7f00d37c6700 2 req 24:0.000104::GET
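For reference, a minimal sketch of producing the Authorization header described above with plain shell tools. The gateway host and bucket name are taken from this thread; the keys are placeholders, and the string-to-sign assumes a path-style GET on /test with no Content-MD5, Content-Type or x-amz-* headers. Two details matter: the Date header should use the abbreviated RFC 1123 month form (e.g. "Mon, 28 Apr 2014 07:25:00 GMT"), and exactly the same Date string has to appear in both the header and the string-to-sign, so the signature must be recomputed for every request.

$ access_key='<ACCESS_KEY>'
$ secret_key='<SECRET_KEY>'
$ date_hdr=$(date -u '+%a, %d %b %Y %H:%M:%S GMT')
$ string_to_sign="GET\n\n\n${date_hdr}\n/test"
$ signature=$(printf "${string_to_sign}" | openssl sha1 -hmac "${secret_key}" -binary | base64)
$ curl -i -H "Date: ${date_hdr}" -H "Authorization: AWS ${access_key}:${signature}" 'http://gateway.3linux.com/test?format=json'

If the secret key came out of JSON with a backslash in it, the backslash has to be removed before signing, as noted above.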
Re: [ceph-users] cluster_network ignored
To see where your OSDs and Mon are listening, you have various cmds in Linux, e.g: 'lsof -ni | grep ceph' - you should see one LISTEN line for the monitor, 2 LISTEN lines for the OSDs and a lot of ESTABLISHED lines, which indicate communication between OSDs and OSDs and clients 'netstat -atn | grep LIST' - you should see a lot of lines with portnumber 6800 and upwards (OSDs) and port 6789 (MON) More comments inline. HTH, Kurt Gandalf Corvotempesta mailto:gandalf.corvotempe...@gmail.com 28. April 2014 11:05 2014-04-26 12:06 GMT+02:00 Gandalf Corvotempesta I've added cluster addr and public addr to each OSD configuration but nothing is changed. I see all OSDs down except the ones from one server but I'm able to ping each other nodes on both interfaces. What do you mean by I see all OSDs down? What does a 'ceph osd stat' say? How can I detect what ceph is doing? 'ceph -w' I see tons of debug logs but they are not very easy to understand with ceph health i can see that pgs down value is slowly decreasing so I can suppose that caph is recovering. Is that right? What's the output of 'ceph -s' Isn't possible to add a semplified output like the one coming from mdadm? (cat /proc/mdstat) ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Gandalf Corvotempesta mailto:gandalf.corvotempe...@gmail.com 26. April 2014 12:06 I've not defined cluster IPs for each OSD server but only the whole subnet. Should I define each IP for each OSD ? This is not wrote on docs and could be tricky to do this in big environments with hundreds of nodes ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com McNamara, Bradley mailto:bradley.mcnam...@seattle.gov 24. April 2014 20:04 Do you have all of the cluster IP's defined in the host file on each OSD server? As I understand it, the mon's do not use a cluster network, only the OSD servers. -Original Message- From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Gandalf Corvotempesta Sent: Thursday, April 24, 2014 8:54 AM To: ceph-users@lists.ceph.com Subject: [ceph-users] cluster_network ignored I'm trying to configure a small ceph cluster with both public and cluster networks. This is my conf: [global] public_network = 192.168.0/24 cluster_network = 10.0.0.0/24 auth cluster required = cephx auth service required = cephx auth client required = cephx fsid = 004baba0-74dc-4429-84ec-1e376fb7bcad osd pool default pg num = 8192 osd pool default pgp num = 8192 osd pool default size = 3 [mon] mon osd down out interval = 600 mon osd mon down reporters = 7 [mon.osd1] host = osd1 mon addr = 192.168.0.1 [mon.osd2] host = osd2 mon addr = 192.168.0.2 [mon.osd3] host = osd3 mon addr = 192.168.0.3 [osd] osd mkfs type = xfs osd journal size = 16384 osd mon heartbeat interval = 30 filestore merge threshold = 40 filestore split multiple = 8 osd op threads = 8 osd recovery max active = 5 osd max backfills = 2 osd recovery op priority = 2 on each node I have bond0 bound to 192.168.0.x and bond1 bound to 10.0.0.x When ceph is doing recovery, I can see replication through bond0 (public interface) and nothing via bond1 (cluster interface) Should I configure anything else ? 
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Gandalf Corvotempesta mailto:gandalf.corvotempe...@gmail.com 24. April 2014 17:53 I'm trying to configure a small ceph cluster with both public and cluster networks. This is my conf: [global] public_network = 192.168.0/24 cluster_network = 10.0.0.0/24 auth cluster required = cephx auth service required = cephx auth client required = cephx fsid = 004baba0-74dc-4429-84ec-1e376fb7bcad osd pool default pg num = 8192 osd pool default pgp num = 8192 osd pool default size = 3 [mon] mon osd down out interval = 600 mon osd mon down reporters = 7 [mon.osd1] host = osd1 mon addr = 192.168.0.1 [mon.osd2] host = osd2 mon addr = 192.168.0.2 [mon.osd3] host = osd3 mon addr = 192.168.0.3 [osd] osd mkfs type = xfs osd journal size = 16384 osd mon heartbeat interval = 30 filestore merge threshold = 40 filestore split multiple = 8 osd op threads = 8 osd recovery max active = 5 osd max backfills = 2 osd recovery op priority = 2 on each node I have bond0 bound to 192.168.0.x and bond1 bound to 10.0.0.x When ceph is doing recovery, I can see replication through bond0 (public interface) and nothing via bond1 (cluster interface) Should I configure anything else ?
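As a quick sanity check of what the daemons actually bound to, beyond lsof/netstat the OSD map itself records both addresses, and the admin socket shows what each daemon parsed from ceph.conf (default socket path assumed here):

$ ceph osd dump | grep '^osd\.'    # each up OSD line lists its public addr and its cluster addr
$ sudo ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep -E 'public|cluster'

If the subnet-style settings are not being picked up, explicit per-daemon addresses can also be forced; a sketch with purely illustrative IPs:

[osd.0]
    host = osd1
    public addr = 192.168.0.1
    cluster addr = 10.0.0.1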
Re: [ceph-users] Data placement algorithm
I think what you want is Ceph's CRUSH algorithm. Source code: https://github.com/ceph/ceph/tree/master/src/crush Paper: http://www.ssrc.ucsc.edu/Papers/weil-sc06.pdf On Mon, Apr 28, 2014 at 5:27 PM, Séguin Cyril cyril.seg...@u-picardie.fr wrote: Hi all, I'm currently interested in comparing Ceph's block replica placement policy with other placement algorithms. Is it possible to access the source code of Ceph's placement policy, and where can I find it? Thanks a lot. Best regards. CS ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
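If the goal is to study or compare the placement policy itself, the CRUSH map can also be exercised offline with crushtool, without touching a live cluster beyond extracting the map; a rough sketch (rule number and replica count are only examples):

$ ceph osd getcrushmap -o crushmap.bin
$ crushtool -d crushmap.bin -o crushmap.txt      # decompile to an editable text form
$ crushtool -c crushmap.txt -o crushmap.new      # recompile after editing buckets/rules
$ crushtool -i crushmap.new --test --rule 0 --num-rep 3 --show-mappings | head

The --test mode prints which OSDs each input value maps to, which makes it easy to compare mappings before and after a rule change, or against another placement scheme.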
Re: [ceph-users] Only one OSD log available per node?
It is not. My guess from looking at the time stamps is that maybe you have a log rotation system set up that isn't working properly? -Greg On Sunday, April 27, 2014, Indra Pramana in...@sg.or.id wrote: Dear all, I have multiple OSDs per node (normally 4) and I realised that for all the nodes that I have, only one OSD will contain logs under /var/log/ceph, the rest of the logs are empty. root@ceph-osd-07:/var/log/ceph# ls -la *.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-client.admin.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.0.log -rw-r--r-- 1 root root 386857 Apr 28 14:02 ceph-osd.12.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.13.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.14.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.15.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd..log The ceph-osd.12.log only contains the logs for osd.12 only, while the other logs for osd.13, 14 and 15 are not available and empty. Is this normal? Looking forward to your reply, thank you. Cheers. -- Software Engineer #42 @ http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Data placement algorithm
Yes it is! Thanks a lot. On 28/04/2014 17:18, xan.peng wrote: I think what you want is Ceph's CRUSH algorithm. Source code: https://github.com/ceph/ceph/tree/master/src/crush Paper: http://www.ssrc.ucsc.edu/Papers/weil-sc06.pdf On Mon, Apr 28, 2014 at 5:27 PM, Séguin Cyril cyril.seg...@u-picardie.fr wrote: Hi all, I'm currently interested in comparing Ceph's block replica placement policy with other placement algorithms. Is it possible to access the source code of Ceph's placement policy, and where can I find it? Thanks a lot. Best regards. CS ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] What happened if rbd lose a block?
Is there a setting to change this behavior so that a read error is returned instead of the read blocking? I think that would be more reasonable behavior, because it is similar to a bad block on an HDD: it simply can't be read. Or maybe a timeout of some seconds, after which a read error is returned for that block and for the other missing blocks in the same image/PG. Also, is there any method to safely upgrade the cluster without downtime? Right now, if I upgrade the monitors and the upgrade fails on the second (of three) monitors, the cluster will go down, because it will have 1 new monitor, 1 down monitor and 1 old monitor, and the old and new monitors won't form a quorum together. The same applies for 5 monitors: 2 new monitors, 1 down monitor, 2 old monitors. On 04/28/2014 02:35 PM, Timofey Koolin wrote: What will happen if RBD loses all copies of a data block and I read that block? The read to the object will block until a replica comes online to serve it. Remember this with Ceph: Consistency goes over availability Context: I want to use RBD as the main storage with replication factor 1, and drbd for replication to non-rbd storage on the client side. For example: Computer1: 1. connect rbd as /dev/rbd15 2. use rbd as disk for drbd Computer2: Use HDD for drbd-replication. I want to protect against a failure of the ceph system (for example while upgrading ceph) and to have long-distance replication. Ceph wants to be consistent at all times. So copying over long distances with high latency will be very slow. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Wido den Hollander 42on B.V. Ceph trainer and consultant Phone: +31 (0)20 700 9902 Skype: contact42on ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
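On the upgrade question, the usual approach is to upgrade and restart the monitors one at a time and to confirm that each restarted monitor has rejoined the quorum before touching the next one; a minimal check from any node with admin credentials (a sketch of the verification step only, not a full upgrade procedure):

$ ceph mon stat
$ ceph quorum_status --format json-pretty | grep -A 6 '"quorum_names"'

As long as the remaining monitors keep a majority (2 of 3, or 3 of 5), the cluster stays available while each one is restarted.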
[ceph-users] CentOS 6 Yum repository broken / tampered with?
Were there any changes to the EL6 yum packages at http://ceph.com/rpm/el6/x86_64/ ? There are a number of files showing a modification date of '25-Apr-2014', but it seems that no one regenerated the repository metadata. This breaks installations using the repository; you'll get errors like this: http://ceph.com/rpm/el6/x86_64/libcephfs1-0.72.2-0.el6.x86_64.rpm: [Errno -1] Package does not match intended download. Suggestion: run yum --enablerepo=ceph clean metadata. This is definitely an issue with the repository. The metadata shows: <package type="rpm"><name>libcephfs1</name><arch>x86_64</arch><version epoch="0" ver="0.72.2" rel="0.el6"/><checksum type="sha" pkgid="YES">4bdb7c99a120bb3e0de2b642e00c6e28fa75dbbe</checksum><time file="1387606897" build="1387579747"/><size package="17693625" installed="88652458" archive="88652888"/><location href="libcephfs1-0.72.2-0.el6.x86_64.rpm"/> . </package> But, if we download that package and check the checksum, we get: $ wget -q http://ceph.com/rpm/el6/x86_64/libcephfs1-0.72.2-0.el6.x86_64.rpm $ sha1sum libcephfs1-0.72.2-0.el6.x86_64.rpm 4d9730c9dd6dad6cc1b08abc6a4ef5ae0e497aec libcephfs1-0.72.2-0.el6.x86_64.rpm It's my understanding that you never want to make changes to an existing package, because any machine that's already installed it will not have the updates applied. You'd generally just increase the iteration number. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
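For completeness, a sketch of both sides of the fix. On an affected client, clearing the stale metadata and rebuilding the cache is usually enough once the repository is repaired; on the repository host, the metadata has to be regenerated whenever packages change (createrepo shown as the usual tool; the path is only an assumption about the mirror layout):

$ sudo yum --enablerepo=ceph clean metadata
$ sudo yum makecache
$ createrepo --update /path/to/rpm/el6/x86_64      # on the machine that hosts the repository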
Re: [ceph-users] apple support automated mail ? WTF!?
I thought that was just me.. I guess someone from apple is subscribed? On 4/28/2014 10:06 AM, Alphe Salas Michels wrote: **Hello, each time I send a mail to the ceph user mailing list I receive an email from apple support?! Is that a joke? Alphe Salas ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] apple support automated mail ? WTF!?
Yeah, someone subscribed supp...@apple.com. I just unsubscribed them from the list. Let me know if it shows back up and I'll ban it. Thanks. Best Regards, Patrick McGarry Director, Community || Inktank http://ceph.com || http://inktank.com @scuttlemonkey || @ceph || @inktank On Mon, Apr 28, 2014 at 12:37 PM, Brian Rak b...@gameservers.com wrote: I thought that was just me.. I guess someone from apple is subscribed? On 4/28/2014 10:06 AM, Alphe Salas Michels wrote: Hello, each time I send a mail to the ceph user mailing list I receive an email from apple support?! Is that a joke? Alphe Salas ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] [ANN] ceph-deploy 1.5.0 released!
Hi All, There is a new release of ceph-deploy, the easy deployment tool for Ceph. This release comes with a few bug fixes and a few features: * implement `osd list` * add a status check on OSDs when deploying * sync local mirrors to remote hosts when installing * support flags and options set in cephdeploy.conf The full list of changes and fixes is documented at: http://ceph.com/ceph-deploy/docs/changelog.html#id1 Make sure you update! ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
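A quick illustration of the new subcommand, with placeholder hostnames; it reports the OSDs that ceph-deploy can see on each listed host:

$ ceph-deploy --version
$ ceph-deploy osd list ceph-node1 ceph-node2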
Re: [ceph-users] cluster_network ignored
2014-04-28 17:17 GMT+02:00 Kurt Bauer kurt.ba...@univie.ac.at: What do you mean by I see all OSDs down? I mean that my OSDs are detected as down: $ sudo ceph osd tree # id weight type name up/down reweight -1 12.74 root default -2 3.64 host osd13 0 1.82 osd.0 down 0 2 1.82 osd.2 down 0 -3 5.46 host osd12 1 1.82 osd.1 up 1 3 1.82 osd.3 down 0 4 1.82 osd.4 down 0 -4 3.64 host osd14 5 1.82 osd.5 down 0 6 1.82 osd.6 up 1 What does a 'ceph osd stat' say? osdmap e1640: 7 osds: 2 up, 2 in How can I detect what ceph is doing? 'ceph -w' Ok, but there I can't see something like recovering, 57% complete or something similiar. What's the output of 'ceph -s' $ sudo ceph -s cluster 6b9916f9-c209-4f53-98c6-581adcdf0955 health HEALTH_WARN 3383 pgs degraded; 59223 pgs down; 12986 pgs incomplete; 81691 pgs peering; 25071 pgs stale; 95049 pgs stuck inactive; 25071 pgs stuck stale; 98432 pgs stuck unclean; 16 requests are blocked 32 sec; recovery 1/189 objects degraded (0.529%) monmap e3: 3 mons at {osd12=192.168.0.112:6789/0,osd13=192.168.0.113:6789/0,osd14=192.168.0.114:6789/0}, election epoch 326, quorum 0,1,2 osd12,osd13,osd14 osdmap e1640: 7 osds: 2 up, 2 in pgmap v1046855: 98432 pgs, 14 pools, 65979 bytes data, 63 objects 969 MB used, 3721 GB / 3722 GB avail 1/189 objects degraded (0.529%) 24 stale 12396 peering 348 remapped 44014 down+peering 3214 active+degraded 3949 stale+peering 11613 stale+down+peering 24 stale+active+degraded 145 active+replay+degraded 6123 remapped+peering 3159 down+remapped+peering 3962 incomplete 437 stale+down+remapped+peering 9024 stale+incomplete ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
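To see exactly which placement groups and OSDs are implicated, and to get the down OSDs back so peering can finish, something along these lines usually helps (the start command depends on how the cluster was deployed; a sysvinit-style invocation is shown, with osd.0 on host osd13 as in the tree above):

$ ceph health detail | head -n 20      # names the stuck PGs and the OSDs they map to
$ ceph osd tree                        # which OSDs are down, and on which host
$ sudo /etc/init.d/ceph start osd.0    # run on the host that owns osd.0
$ ceph -w                              # watch peering/recovery as the OSDs come back

With most OSDs down, the down/incomplete/peering counts in ceph -s will not shrink much; they should start dropping once the missing OSDs are up and in again.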
[ceph-users] Creating a bucket failed
Hello, I've installed Ceph Emperor on my Ubuntu 12.04 server to test many things. Everything was pretty good so far, but now I got a problem (403, AccessDenied) when I try to create a bucket through S3-compatible API. Please read the following information. *Client Information* Computer: Ubuntu 12.04 64bit Desktop S3 Client: Dragon Disk 1.05 *Server Information* Server Hardware: 2 servers, 2 storage array (12 OSDs each, total 24 OSDs) OS: Ubuntu 12.04 64bit Ceph: Emperor, Health OK, all OSDs UP *Configurations:* ceph.conf [global] fsid = 2606e43d-6ca3-4aeb-b760-507a97e06190 mon_initial_members = lab0, lab1 mon_host = 172.17.1.250,172.17.1.251 auth_cluster_required = cephx auth_service_required = cephx auth_client_required = cephx filestore_xattr_use_omap = true osd_max_attr_size = 655360 osd pool default size = 3 osd pool default min size = 1 osd pool default pg num = 800 osd pool default pgp num = 800 [client.radosgw.gateway] host = lab0 keyring = /etc/ceph/keyring.radosgw.gateway rgw socket path = /tmp/radosgw.sock log file = /var/log/ceph/radosgw.log rgw data = /var/lib/ceph/radosgw rgw dns name = lab0.coe.hawaii.edu rgw print continue = false Apache /etc/apache2/sites-enabled/rgw VirtualHost *:80 FastCgiExternalServer /var/www/s3gw.fcgi -socket /tmp/radosgw.sock ServerName lab0.coe.hawaii.edu ServerAdmin webmaster@localhost DocumentRoot /var/www RewriteEngine On RewriteRule ^/([a-zA-Z0-9-_.]*)([/]?.*) /s3gw.fcgi?page=$1params=$2%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L] IfModule mod_fastcgi.c Directory /var/www/ Options +ExecCGI AllowOverride All SetHandler fastcgi-script Order allow,deny allow from all AuthBasicAuthoritative Off /Directory /IfModule AllowEncodedSlashes On ErrorLog ${APACHE_LOG_DIR}/error.log CustomLog ${APACHE_LOG_DIR}/access.log combined ServerSignature Off /VirtualHost User Info: # radosgw-admin user info --uid=admin { user_id: admin, display_name: Admin, email: , suspended: 0, max_buckets: 1000, auid: 0, subusers: [], keys: [ { user: admin, access_key: A3R0CEF3140MLIZIXN4X, secret_key: K8TRyfK8ArRjGRnSRvd4N5gY4TdeK1wK\/1iGCcGO}], swift_keys: [], caps: [], op_mask: read, write, delete, default_placement: , placement_tags: [], bucket_quota: { enabled: false, max_size_kb: -1, max_objects: -1}} /var/log/ceph/radosgw.log: 2014-04-28 10:44:42.206681 7fc9b9feb700 15 calculated digest=6JGkEimcy2pBN3Ty6mfYh6SudcA= 2014-04-28 10:44:42.206685 7fc9b9feb700 15 auth_sign=6JGkEimcy2pBN3Ty6mfYh6SudcA= 2014-04-28 10:44:42.206686 7fc9b9feb700 15 compare=0 2014-04-28 10:44:42.206691 7fc9b9feb700 2 req 20:0.000456:s3:PUT /:create_bucket:reading permissions 2014-04-28 10:44:42.206697 7fc9b9feb700 2 req 20:0.000463:s3:PUT /:create_bucket:init op 2014-04-28 10:44:42.206701 7fc9b9feb700 2 req 20:0.000467:s3:PUT /:create_bucket:verifying op mask 2014-04-28 10:44:42.206704 7fc9b9feb700 20 required_mask= 2 user.op_mask=7 2014-04-28 10:44:42.206706 7fc9b9feb700 2 req 20:0.000472:s3:PUT /:create_bucket:verifying op permissions 2014-04-28 10:44:42.209718 7fc9b9feb700 2 req 20:0.003483:s3:PUT /:create_bucket:verifying op params 2014-04-28 10:44:42.209742 7fc9b9feb700 2 req 20:0.003508:s3:PUT /:create_bucket:executing 2014-04-28 10:44:42.209776 7fc9b9feb700 20 get_obj_state: rctx=0x7fc928009bd0 obj=.rgw:test state=0x7fc92800cfd8 s-prefetch_data=0 2014-04-28 10:44:42.209790 7fc9b9feb700 10 moving .rgw+test to cache LRU end 2014-04-28 10:44:42.209793 7fc9b9feb700 10 cache get: name=.rgw+test : type miss (requested=22, cached=0) 2014-04-28 10:44:42.211397 7fc9b9feb700 10 cache 
put: name=.rgw+test 2014-04-28 10:44:42.211417 7fc9b9feb700 10 moving .rgw+test to cache LRU end 2014-04-28 10:44:42.212563 7fc9b9feb700 20 rgw_create_bucket returned ret=-1 bucket=test(@{i=.rgw.buckets.index}.rgw.buckets[default.5154.9]) 2014-04-28 10:44:42.212629 7fc9b9feb700 2 req 20:0.006394:s3:PUT /:create_bucket:http status=403 2014-04-28 10:44:42.212749 7fc9b9feb700 1 == req done req=0x1f20f30 http_status=403 == I tried to use the secret key both K8TRyfK8ArRjGRnSRvd4N5gY4TdeK1wK\/1iGCcGO and K8TRyfK8ArRjGRnSRvd4N5gY4TdeK1wK/1iGCcGO Thank you for your help! Seowon -- Seowon Jung Systems Administrator College of Education University of Hawaii at Manoa (808) 956-7939 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Creating a bucket failed
This could happen if your client is uses the bucket through subdomain scheme, but the rgw is not resolving it correctly (either rgw_dns_name is misconfigured, or you were accessing it through different host name). Yehuda On Mon, Apr 28, 2014 at 2:02 PM, Seowon Jung seo...@hawaii.edu wrote: Hello, I've installed Ceph Emperor on my Ubuntu 12.04 server to test many things. Everything was pretty good so far, but now I got a problem (403, AccessDenied) when I try to create a bucket through S3-compatible API. Please read the following information. *Client Information* Computer: Ubuntu 12.04 64bit Desktop S3 Client: Dragon Disk 1.05 *Server Information* Server Hardware: 2 servers, 2 storage array (12 OSDs each, total 24 OSDs) OS: Ubuntu 12.04 64bit Ceph: Emperor, Health OK, all OSDs UP *Configurations:* ceph.conf [global] fsid = 2606e43d-6ca3-4aeb-b760-507a97e06190 mon_initial_members = lab0, lab1 mon_host = 172.17.1.250,172.17.1.251 auth_cluster_required = cephx auth_service_required = cephx auth_client_required = cephx filestore_xattr_use_omap = true osd_max_attr_size = 655360 osd pool default size = 3 osd pool default min size = 1 osd pool default pg num = 800 osd pool default pgp num = 800 [client.radosgw.gateway] host = lab0 keyring = /etc/ceph/keyring.radosgw.gateway rgw socket path = /tmp/radosgw.sock log file = /var/log/ceph/radosgw.log rgw data = /var/lib/ceph/radosgw rgw dns name = lab0.coe.hawaii.edu rgw print continue = false Apache /etc/apache2/sites-enabled/rgw VirtualHost *:80 FastCgiExternalServer /var/www/s3gw.fcgi -socket /tmp/radosgw.sock ServerName lab0.coe.hawaii.edu ServerAdmin webmaster@localhost DocumentRoot /var/www RewriteEngine On RewriteRule ^/([a-zA-Z0-9-_.]*)([/]?.*) /s3gw.fcgi?page=$1params=$2%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L] IfModule mod_fastcgi.c Directory /var/www/ Options +ExecCGI AllowOverride All SetHandler fastcgi-script Order allow,deny allow from all AuthBasicAuthoritative Off /Directory /IfModule AllowEncodedSlashes On ErrorLog ${APACHE_LOG_DIR}/error.log CustomLog ${APACHE_LOG_DIR}/access.log combined ServerSignature Off /VirtualHost User Info: # radosgw-admin user info --uid=admin { user_id: admin, display_name: Admin, email: , suspended: 0, max_buckets: 1000, auid: 0, subusers: [], keys: [ { user: admin, access_key: A3R0CEF3140MLIZIXN4X, secret_key: K8TRyfK8ArRjGRnSRvd4N5gY4TdeK1wK\/1iGCcGO}], swift_keys: [], caps: [], op_mask: read, write, delete, default_placement: , placement_tags: [], bucket_quota: { enabled: false, max_size_kb: -1, max_objects: -1}} /var/log/ceph/radosgw.log: 2014-04-28 10:44:42.206681 7fc9b9feb700 15 calculated digest=6JGkEimcy2pBN3Ty6mfYh6SudcA= 2014-04-28 10:44:42.206685 7fc9b9feb700 15 auth_sign=6JGkEimcy2pBN3Ty6mfYh6SudcA= 2014-04-28 10:44:42.206686 7fc9b9feb700 15 compare=0 2014-04-28 10:44:42.206691 7fc9b9feb700 2 req 20:0.000456:s3:PUT /:create_bucket:reading permissions 2014-04-28 10:44:42.206697 7fc9b9feb700 2 req 20:0.000463:s3:PUT /:create_bucket:init op 2014-04-28 10:44:42.206701 7fc9b9feb700 2 req 20:0.000467:s3:PUT /:create_bucket:verifying op mask 2014-04-28 10:44:42.206704 7fc9b9feb700 20 required_mask= 2 user.op_mask=7 2014-04-28 10:44:42.206706 7fc9b9feb700 2 req 20:0.000472:s3:PUT /:create_bucket:verifying op permissions 2014-04-28 10:44:42.209718 7fc9b9feb700 2 req 20:0.003483:s3:PUT /:create_bucket:verifying op params 2014-04-28 10:44:42.209742 7fc9b9feb700 2 req 20:0.003508:s3:PUT /:create_bucket:executing 2014-04-28 10:44:42.209776 7fc9b9feb700 20 get_obj_state: 
rctx=0x7fc928009bd0 obj=.rgw:test state=0x7fc92800cfd8 s-prefetch_data=0 2014-04-28 10:44:42.209790 7fc9b9feb700 10 moving .rgw+test to cache LRU end 2014-04-28 10:44:42.209793 7fc9b9feb700 10 cache get: name=.rgw+test : type miss (requested=22, cached=0) 2014-04-28 10:44:42.211397 7fc9b9feb700 10 cache put: name=.rgw+test 2014-04-28 10:44:42.211417 7fc9b9feb700 10 moving .rgw+test to cache LRU end 2014-04-28 10:44:42.212563 7fc9b9feb700 20 rgw_create_bucket returned ret=-1 bucket=test(@{i=.rgw.buckets.index}.rgw.buckets[default.5154.9]) 2014-04-28 10:44:42.212629 7fc9b9feb700 2 req 20:0.006394:s3:PUT /:create_bucket:http status=403 2014-04-28 10:44:42.212749 7fc9b9feb700 1 == req done req=0x1f20f30 http_status=403 == I tried to use the secret key both K8TRyfK8ArRjGRnSRvd4N5gY4TdeK1wK\/1iGCcGO and K8TRyfK8ArRjGRnSRvd4N5gY4TdeK1wK/1iGCcGO Thank you for your help! Seowon -- Seowon Jung Systems Administrator College of Education University of Hawaii at Manoa (808) 956-7939 ___ ceph-users mailing list ceph-users@lists.ceph.com
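A quick way to tell which case applies, using the hostnames from this thread: with virtual-host-style (subdomain) requests the bucket name is prepended to the hostname, so that name must resolve to the gateway (typically via a wildcard DNS record) and rgw dns name must match; with path-style requests only the plain hostname matters, and many S3 clients can be switched to path-style in their settings. A sketch of the checks:

$ dig +short lab0.coe.hawaii.edu
$ dig +short test.lab0.coe.hawaii.edu        # must also resolve if the client uses subdomain-style buckets
$ curl -sv http://lab0.coe.hawaii.edu/ -o /dev/null 2>&1 | grep '< HTTP'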
Re: [ceph-users] Creating a bucket failed
Thank you so much for your quick reply. I created a subuser for Swift, but it got the authorization error. Is it related to the same problem? $ swift --verbose -V 1.0 -A http://lab0.coe.hawaii.edu/auth -U admin:swift -K RnelTPTJGc4rt6LlRjF4AnxfJhrLvu4J6+PTUl+s post test Container PUT failed: http://lab0.coe.hawaii.edu:80/swift/v1/test 401 Authorization Required AccessDenied Thank you! -- Seowon Jung Systems Administrator College of Education University of Hawaii at Manoa (808) 956-7939 On Mon, Apr 28, 2014 at 11:10 AM, Yehuda Sadeh yeh...@inktank.com wrote: This could happen if your client is uses the bucket through subdomain scheme, but the rgw is not resolving it correctly (either rgw_dns_name is misconfigured, or you were accessing it through different host name). Yehuda On Mon, Apr 28, 2014 at 2:02 PM, Seowon Jung seo...@hawaii.edu wrote: Hello, I've installed Ceph Emperor on my Ubuntu 12.04 server to test many things. Everything was pretty good so far, but now I got a problem (403, AccessDenied) when I try to create a bucket through S3-compatible API. Please read the following information. *Client Information* Computer: Ubuntu 12.04 64bit Desktop S3 Client: Dragon Disk 1.05 *Server Information* Server Hardware: 2 servers, 2 storage array (12 OSDs each, total 24 OSDs) OS: Ubuntu 12.04 64bit Ceph: Emperor, Health OK, all OSDs UP *Configurations:* ceph.conf [global] fsid = 2606e43d-6ca3-4aeb-b760-507a97e06190 mon_initial_members = lab0, lab1 mon_host = 172.17.1.250,172.17.1.251 auth_cluster_required = cephx auth_service_required = cephx auth_client_required = cephx filestore_xattr_use_omap = true osd_max_attr_size = 655360 osd pool default size = 3 osd pool default min size = 1 osd pool default pg num = 800 osd pool default pgp num = 800 [client.radosgw.gateway] host = lab0 keyring = /etc/ceph/keyring.radosgw.gateway rgw socket path = /tmp/radosgw.sock log file = /var/log/ceph/radosgw.log rgw data = /var/lib/ceph/radosgw rgw dns name = lab0.coe.hawaii.edu rgw print continue = false Apache /etc/apache2/sites-enabled/rgw VirtualHost *:80 FastCgiExternalServer /var/www/s3gw.fcgi -socket /tmp/radosgw.sock ServerName lab0.coe.hawaii.edu ServerAdmin webmaster@localhost DocumentRoot /var/www RewriteEngine On RewriteRule ^/([a-zA-Z0-9-_.]*)([/]?.*) /s3gw.fcgi?page=$1params=$2%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L] IfModule mod_fastcgi.c Directory /var/www/ Options +ExecCGI AllowOverride All SetHandler fastcgi-script Order allow,deny allow from all AuthBasicAuthoritative Off /Directory /IfModule AllowEncodedSlashes On ErrorLog ${APACHE_LOG_DIR}/error.log CustomLog ${APACHE_LOG_DIR}/access.log combined ServerSignature Off /VirtualHost User Info: # radosgw-admin user info --uid=admin { user_id: admin, display_name: Admin, email: , suspended: 0, max_buckets: 1000, auid: 0, subusers: [], keys: [ { user: admin, access_key: A3R0CEF3140MLIZIXN4X, secret_key: K8TRyfK8ArRjGRnSRvd4N5gY4TdeK1wK\/1iGCcGO}], swift_keys: [], caps: [], op_mask: read, write, delete, default_placement: , placement_tags: [], bucket_quota: { enabled: false, max_size_kb: -1, max_objects: -1}} /var/log/ceph/radosgw.log: 2014-04-28 10:44:42.206681 7fc9b9feb700 15 calculated digest=6JGkEimcy2pBN3Ty6mfYh6SudcA= 2014-04-28 10:44:42.206685 7fc9b9feb700 15 auth_sign=6JGkEimcy2pBN3Ty6mfYh6SudcA= 2014-04-28 10:44:42.206686 7fc9b9feb700 15 compare=0 2014-04-28 10:44:42.206691 7fc9b9feb700 2 req 20:0.000456:s3:PUT /:create_bucket:reading permissions 2014-04-28 10:44:42.206697 7fc9b9feb700 2 req 20:0.000463:s3:PUT 
/:create_bucket:init op 2014-04-28 10:44:42.206701 7fc9b9feb700 2 req 20:0.000467:s3:PUT /:create_bucket:verifying op mask 2014-04-28 10:44:42.206704 7fc9b9feb700 20 required_mask= 2 user.op_mask=7 2014-04-28 10:44:42.206706 7fc9b9feb700 2 req 20:0.000472:s3:PUT /:create_bucket:verifying op permissions 2014-04-28 10:44:42.209718 7fc9b9feb700 2 req 20:0.003483:s3:PUT /:create_bucket:verifying op params 2014-04-28 10:44:42.209742 7fc9b9feb700 2 req 20:0.003508:s3:PUT /:create_bucket:executing 2014-04-28 10:44:42.209776 7fc9b9feb700 20 get_obj_state: rctx=0x7fc928009bd0 obj=.rgw:test state=0x7fc92800cfd8 s-prefetch_data=0 2014-04-28 10:44:42.209790 7fc9b9feb700 10 moving .rgw+test to cache LRU end 2014-04-28 10:44:42.209793 7fc9b9feb700 10 cache get: name=.rgw+test : type miss (requested=22, cached=0) 2014-04-28 10:44:42.211397 7fc9b9feb700 10 cache put: name=.rgw+test 2014-04-28 10:44:42.211417 7fc9b9feb700 10 moving .rgw+test to cache LRU end 2014-04-28 10:44:42.212563 7fc9b9feb700 20 rgw_create_bucket returned ret=-1
Re: [ceph-users] osd_recovery_max_single_start
On Apr 24, 2014, at 10:09 AM, Chad Seys cws...@physics.wisc.edu wrote: Hi David, Thanks for the reply. I'm a little confused by OSD versus PGs in the description of the two options osd_recovery_max_single_start and osd_recovery_max_active . An OSD manages all the PGs in its object store (a subset of all PGs in the cluster). An OSD only needs to manage recovery of the PGs for which it is primary and need recovery. The ceph webpage describes osd_recovery_max_active as The number of active recovery requests per OSD at one time. It does not mention PGs. ? Assuming you meant OSD instead of PG, is this a rephrase of your message: osd_recovery_max_active (default 15) recovery operations will run total and will be started in groups of osd_recovery_max_single_start (default 5)” Yes, but PGs are the way the newly started recovery ops group. The osd_recovery_max_active is the number of recovery operations which can be active at any given time for an OSD for all the PGs it is simultaneously recovering. The osd_recovery_max_single_start is the maximum number of recovery operations that will be newly started per PG that the OSD is recovering. So if I set osd_recovery_max_active = 1 then osd_recovery_max_single_start will effectively = 1 ? Yes, if osd_recovery_max_active = osd_recovery_max_single_start then with no ops are currently active we could only start the osd_recovery_max_active new ops anyway. Thanks! Chad. On Thursday, April 24, 2014 11:43:47 you wrote: The value of osd_recovery_max_single_start (default 5) is used in conjunction with osd_recovery_max_active (default 15). This means that a given PG will start up to 5 recovery operations at time of a total of 15 operations active at a time. This allows recovery to spread operations across more or less PGs at any given time. David Zafman Senior Developer http://www.inktank.com On Apr 24, 2014, at 8:09 AM, Chad Seys cws...@physics.wisc.edu wrote: Hi All, What does osd_recovery_max_single_start do? I could not find a description of it. Thanks! Chad. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com David Zafman Senior Developer http://www.inktank.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
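To make the throttling concrete, both options can be set persistently under [osd] in ceph.conf or injected into running OSDs; the values below are just an example of slowing recovery right down, not a recommendation:

[osd]
    osd recovery max active = 1
    osd recovery max single start = 1

$ ceph tell osd.* injectargs '--osd-recovery-max-active 1 --osd-recovery-max-single-start 1'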
Re: [ceph-users] Creating a bucket failed
Cedric, I created this user as described on the official document $ sudo radosgw-admin subuser create --uid=johndoe --subuser=johndoe:swift --access=full The subuser seems to have a full permission. $ radosgw-admin user info --uid=admin swift_keys: [ { user: admin:swift, secret_key: RnelTPTJGc4rt6LlRjF4AnxfJhrLvu4J6+PTUl+s}], ps: [], op_mask: read, write, delete, default_placement: , placement_tags: [], bucket_quota: { enabled: false, max_size_kb: -1, max_objects: -1}} Thank you for your help anyway, Seowon -- Seowon Jung Systems Administrator College of Education University of Hawaii at Manoa (808) 956-7939 On Mon, Apr 28, 2014 at 12:12 PM, Cedric Lemarchand ced...@yipikai.orgwrote: Hello, Le 28/04/2014 23:29, Seowon Jung a écrit : Thank you so much for your quick reply. I created a subuser for Swift, but it got the authorization error. Is it related to the same problem? In the way bucket access via subdomain is specific to S3 and you are now using Swift, I don't think so. $ swift --verbose -V 1.0 -A http://lab0.coe.hawaii.edu/auth -U admin:swift -K RnelTPTJGc4rt6LlRjF4AnxfJhrLvu4J6+PTUl+s post test Container PUT failed: http://lab0.coe.hawaii.edu:80/swift/v1/test 401 Authorization Required AccessDenied I would first try to check if the subuser has rights to create a bucket. (permissions field) Cheers Thank you! -- Seowon Jung Systems Administrator College of Education University of Hawaii at Manoa (808) 956-7939 On Mon, Apr 28, 2014 at 11:10 AM, Yehuda Sadeh yeh...@inktank.com wrote: This could happen if your client is uses the bucket through subdomain scheme, but the rgw is not resolving it correctly (either rgw_dns_name is misconfigured, or you were accessing it through different host name). Yehuda On Mon, Apr 28, 2014 at 2:02 PM, Seowon Jung seo...@hawaii.edu wrote: Hello, I've installed Ceph Emperor on my Ubuntu 12.04 server to test many things. Everything was pretty good so far, but now I got a problem (403, AccessDenied) when I try to create a bucket through S3-compatible API. Please read the following information. 
*Client Information* Computer: Ubuntu 12.04 64bit Desktop S3 Client: Dragon Disk 1.05 *Server Information* Server Hardware: 2 servers, 2 storage array (12 OSDs each, total 24 OSDs) OS: Ubuntu 12.04 64bit Ceph: Emperor, Health OK, all OSDs UP *Configurations:* ceph.conf [global] fsid = 2606e43d-6ca3-4aeb-b760-507a97e06190 mon_initial_members = lab0, lab1 mon_host = 172.17.1.250,172.17.1.251 auth_cluster_required = cephx auth_service_required = cephx auth_client_required = cephx filestore_xattr_use_omap = true osd_max_attr_size = 655360 osd pool default size = 3 osd pool default min size = 1 osd pool default pg num = 800 osd pool default pgp num = 800 [client.radosgw.gateway] host = lab0 keyring = /etc/ceph/keyring.radosgw.gateway rgw socket path = /tmp/radosgw.sock log file = /var/log/ceph/radosgw.log rgw data = /var/lib/ceph/radosgw rgw dns name = lab0.coe.hawaii.edu rgw print continue = false Apache /etc/apache2/sites-enabled/rgw VirtualHost *:80 FastCgiExternalServer /var/www/s3gw.fcgi -socket /tmp/radosgw.sock ServerName lab0.coe.hawaii.edu ServerAdmin webmaster@localhost DocumentRoot /var/www RewriteEngine On RewriteRule ^/([a-zA-Z0-9-_.]*)([/]?.*) /s3gw.fcgi?page=$1params=$2%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{ HTTP:Authorization},L] IfModule mod_fastcgi.c Directory /var/www/ Options +ExecCGI AllowOverride All SetHandler fastcgi-script Order allow,deny allow from all AuthBasicAuthoritative Off /Directory /IfModule AllowEncodedSlashes On ErrorLog ${APACHE_LOG_DIR}/error.log CustomLog ${APACHE_LOG_DIR}/access.log combined ServerSignature Off /VirtualHost User Info: # radosgw-admin user info --uid=admin { user_id: admin, display_name: Admin, email: , suspended: 0, max_buckets: 1000, auid: 0, subusers: [], keys: [ { user: admin, access_key: A3R0CEF3140MLIZIXN4X, secret_key: K8TRyfK8ArRjGRnSRvd4N5gY4TdeK1wK\/1iGCcGO}], swift_keys: [], caps: [], op_mask: read, write, delete, default_placement: , placement_tags: [], bucket_quota: { enabled: false, max_size_kb: -1, max_objects: -1}} /var/log/ceph/radosgw.log: 2014-04-28 10:44:42.206681 7fc9b9feb700 15 calculated digest=6JGkEimcy2pBN3Ty6mfYh6SudcA= 2014-04-28 10:44:42.206685 7fc9b9feb700 15 auth_sign=6JGkEimcy2pBN3Ty6mfYh6SudcA= 2014-04-28 10:44:42.206686 7fc9b9feb700 15 compare=0 2014-04-28 10:44:42.206691 7fc9b9feb700 2 req 20:0.000456:s3:PUT/:create_bucket:reading permissions 2014-04-28 10:44:42.206697 7fc9b9feb700 2 req 20:0.000463:s3:PUT/:create_bucket:init op 2014-04-28 10:44:42.206701
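One thing worth double-checking, using the uid from this thread: that the subuser really has a Swift key of its own (radosgw-admin can generate one), and that the secret is quoted on the shell so characters such as + and / reach the swift client intact. A sketch:

$ radosgw-admin subuser create --uid=admin --subuser=admin:swift --access=full
$ radosgw-admin key create --subuser=admin:swift --key-type=swift --gen-secret
$ radosgw-admin user info --uid=admin        # swift_keys should now list admin:swift with the new secret
$ swift -V 1.0 -A http://lab0.coe.hawaii.edu/auth -U admin:swift -K '<secret from above>' stat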
Re: [ceph-users] Slow RBD Benchmark Compared To Direct I/O Test
Dear Christian and all, Anyone can advise? Looking forward to your reply, thank you. Cheers. On Thu, Apr 24, 2014 at 1:51 PM, Indra Pramana in...@sg.or.id wrote: Hi Christian, Good day to you, and thank you for your reply. On Wed, Apr 23, 2014 at 11:41 PM, Christian Balzer ch...@gol.com wrote: Using 32 concurrent writes, result is below. The speed really fluctuates. Total time run: 64.31704964.317049 Total writes made: 1095 Write size: 4194304 Bandwidth (MB/sec): 68.100 Stddev Bandwidth: 44.6773 Max bandwidth (MB/sec): 184 Min bandwidth (MB/sec): 0 Average Latency:1.87761 Stddev Latency: 1.90906 Max latency:9.99347 Min latency:0.075849 That is really weird, it should get faster, not slower. ^o^ I assume you've run this a number of times? Also my apologies, the default is 16 threads, not 1, but that still isn't enough to get my cluster to full speed: --- Bandwidth (MB/sec): 349.044 Stddev Bandwidth: 107.582 Max bandwidth (MB/sec): 408 --- at 64 threads it will ramp up from a slow start to: --- Bandwidth (MB/sec): 406.967 Stddev Bandwidth: 114.015 Max bandwidth (MB/sec): 452 --- But what stands out is your latency. I don't have a 10GBE network to compare, but my Infiniband based cluster (going through at least one switch) gives me values like this: --- Average Latency:0.335519 Stddev Latency: 0.177663 Max latency:1.37517 Min latency:0.1017 --- Of course that latency is not just the network. What else can contribute to this latency? Storage node load, disk speed, anything else? That and the network itself are pretty much it, you should know once you've run those test with atop or iostat on the storage nodes. I would suggest running atop (gives you more information at one glance) or iostat -x 3 on all your storage nodes during these tests to identify any node or OSD that is overloaded in some way. Will try. Do that and let us know about the results. I have done some tests using iostat and noted some OSDs on a particular storage node going up to the 100% limit when I run the rados bench test. 
avg-cpu: %user %nice %system %iowait %steal %idle 1.090.000.92 21.740.00 76.25 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.00 0.004.33 42.0073.33 6980.00 304.46 0.296.220.006.86 1.50 6.93 sdb 0.00 0.000.00 17.67 0.00 6344.00 718.1959.64 854.260.00 854.26 56.60 *100.00* sdc 0.00 0.00 12.33 59.3370.67 18882.33 528.9236.54 509.80 64.76 602.31 10.51 75.33 sdd 0.00 0.003.33 54.3324.00 15249.17 529.71 1.29 22.453.20 23.63 1.64 9.47 sde 0.00 0.330.000.67 0.00 4.00 12.00 0.30 450.000.00 450.00 450.00 30.00 avg-cpu: %user %nice %system %iowait %steal %idle 1.380.001.137.750.00 89.74 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.00 0.005.00 69.0030.67 19408.50 525.38 4.29 58.020.53 62.18 2.00 14.80 sdb 0.00 0.007.00 63.3341.33 20911.50 595.8213.09 826.96 88.57 908.57 5.48 38.53 sdc 0.00 0.002.67 30.0017.33 6945.33 426.29 0.216.530.507.07 1.59 5.20 sdd 0.00 0.002.67 58.6716.00 20661.33 674.26 4.89 79.54 41.00 81.30 2.70 16.53 sde 0.00 0.000.001.67 0.00 6.67 8.00 0.013.200.003.20 1.60 0.27 avg-cpu: %user %nice %system %iowait %steal %idle 0.970.000.556.730.00 91.75 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.00 0.001.67 15.3321.33 120.00 16.63 0.021.180.001.30 0.63 1.07 sdb 0.00 0.004.33 62.3324.00 13299.17 399.69 2.68 11.181.23 11.87 1.94 12.93 sdc 0.00 0.000.67 38.3370.67 7881.33 407.7937.66 202.150.00 205.67 13.61 53.07 sdd 0.00 0.003.00 17.3312.00 166.00 17.51 0.052.893.112.85 0.98 2.00 sde 0.00 0.000.000.00 0.00 0.00 0.00 0.000.000.000.00 0.00 0.00 avg-cpu: %user %nice %system %iowait %steal %idle 1.290.00
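For a repeatable comparison, one option is to drive the load from a client with rados bench while iostat/atop run on every storage node over the same window; the pool name, duration and thread count below are only examples:

$ rados bench -p rbd 60 write -t 16 --no-cleanup     # on a client node
$ rados bench -p rbd 60 seq -t 16                    # optional read phase against the same objects
$ iostat -dxm 3                                      # on each storage node, in a second terminal
$ atop 3                                             # alternative per-node view (CPU, disks, network)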
Re: [ceph-users] RBD clone for OpenStack Nova ephemeral volumes
I have decoupled the Nova rbd-ephemeral-clone branch from the multiple-image-location patch, the result can be found at the same location on GitHub as before: https://github.com/angdraug/nova/tree/rbd-ephemeral-clone I will keep rebasing this over Nova master, I also plan to update the rbd-clone-image-handler blueprint and publish it to nova-specs so that the patch series could be proposed for Juno. Icehouse backport of this branch is here: https://github.com/angdraug/nova/tree/rbd-ephemeral-clone-stable-icehouse I am not going to track every stable/icehouse commit with this branch, instead, I will rebase it over stable release tags as they appear. Right now it's based on tag:2014.1. For posterity, I'm leaving the multiple-image-location patch rebased over current Nova master here: https://github.com/angdraug/nova/tree/multiple-image-location I don't plan on maintaining multiple-image-location, just leaving it out there to save some rebasing effort for whoever decides to pick it up. -DmitryB On Fri, Mar 21, 2014 at 1:12 PM, Josh Durgin josh.dur...@inktank.com wrote: On 03/20/2014 07:03 PM, Dmitry Borodaenko wrote: On Thu, Mar 20, 2014 at 3:43 PM, Josh Durgin josh.dur...@inktank.com wrote: On 03/20/2014 02:07 PM, Dmitry Borodaenko wrote: The patch series that implemented clone operation for RBD backed ephemeral volumes in Nova did not make it into Icehouse. We have tried our best to help it land, but it was ultimately rejected. Furthermore, an additional requirement was imposed to make this patch series dependent on full support of Glance API v2 across Nova (due to its dependency on direct_url that was introduced in v2). You can find the most recent discussion of this patch series in the FFE (feature freeze exception) thread on openstack-dev ML: http://lists.openstack.org/pipermail/openstack-dev/2014-March/029127.html As I explained in that thread, I believe this feature is essential for using Ceph as a storage backend for Nova, so I'm going to try and keep it alive outside of OpenStack mainline until it is allowed to land. I have created rbd-ephemeral-clone branch in my nova repo fork on GitHub: https://github.com/angdraug/nova/tree/rbd-ephemeral-clone I will keep it rebased over nova master, and will create an rbd-ephemeral-clone-stable-icehouse to track the same patch series over nova stable/icehouse once it's branched. I also plan to make sure that this patch series is included in Mirantis OpenStack 5.0 which will be based on Icehouse. If you're interested in this feature, please review and test. Bug reports and patches are welcome, as long as their scope is limited to this patch series and is not applicable for mainline OpenStack. Thanks for taking this on Dmitry! Having rebased those patches many times during icehouse, I can tell you it's often not trivial. Indeed, I get conflicts every day lately, even in the current bugfixing stage of the OpenStack release cycle. I have a feeling it will not get easier when Icehouse is out and Juno is in full swing. Do you think the imagehandler-based approach is best for Juno? I'm leaning towards the older way [1] for simplicity of review, and to avoid using glance's v2 api by default. [1] https://review.openstack.org/#/c/46879/ Excellent question, I have thought long and hard about this. In retrospect, requiring this change to depend on the imagehandler patch back in December 2013 proven to have been a poor decision. 
Unfortunately, now that it's done, porting your original patch from Havana to Icehouse is more work than keeping the new patch series up to date with Icehouse, at least short term. Especially if we decide to keep the rbd_utils refactoring, which I've grown to like. As far as I understand, your original code made use of the same v2 api call even before it was rebased over imagehandler patch: https://github.com/jdurgin/nova/blob/8e4594123b65ddf47e682876373bca6171f4a6f5/nova/image/glance.py#L304 If I read this right, imagehandler doesn't create the dependency on v2 api, the only reason it caused a problem was because it exposed the output of the same Glance API call to a code path that assumed a v1 data structure. If so, decoupling rbd clone patch from imagehandler will not help lift the full Glance API v2 support requirement, that v2 api call will still be there. Also, there's always a chance that imagehandler lands in Juno. If it does, we'd be forced to dust off the imagehandler based patch series again, and the effort spent on maintaining the old patch would be wasted. Given all that, and without making any assumptions about stability of the imagehandler patch in its current state, I'm leaning towards keeping it. If you think it's likely that it will cause us more problems than the Glance API v2 issue, or if you disagree with my analysis of that issue, please tell. My impression was that full glance v2 support was more of an issue with the imagehandler
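For anyone who wants to test, checking out the Icehouse-based branch is straightforward (repository and branch names as given above):

$ git clone https://github.com/angdraug/nova.git
$ cd nova
$ git checkout rbd-ephemeral-clone-stable-icehouse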
Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc
Hi, We had applied the patch and recompile ceph as well as updated the ceph.conf as per suggested, when we re-run ceph-mds we noticed the following: 2014-04-29 10:45:22.260798 7f90b971d700 0 log [WRN] : replayed op client.324186:51366457,12681393 no session for client.324186 2014-04-29 10:45:22.262419 7f90b971d700 0 log [WRN] : replayed op client.324186:51366475,12681393 no session for client.324186 2014-04-29 10:45:22.267699 7f90b971d700 0 log [WRN] : replayed op client.324186:5135,12681393 no session for client.324186 2014-04-29 10:45:22.271664 7f90b971d700 0 log [WRN] : replayed op client.324186:51366724,12681393 no session for client.324186 2014-04-29 10:45:22.281050 7f90b971d700 0 log [WRN] : replayed op client.324186:51366945,12681393 no session for client.324186 2014-04-29 10:45:22.283196 7f90b971d700 0 log [WRN] : replayed op client.324186:51366996,12681393 no session for client.324186 2014-04-29 10:45:22.287801 7f90b971d700 0 log [WRN] : replayed op client.324186:51367043,12681393 no session for client.324186 2014-04-29 10:45:22.289967 7f90b971d700 0 log [WRN] : replayed op client.324186:51367082,12681393 no session for client.324186 2014-04-29 10:45:22.291026 7f90b971d700 0 log [WRN] : replayed op client.324186:51367110,12681393 no session for client.324186 2014-04-29 10:45:22.294459 7f90b971d700 0 log [WRN] : replayed op client.324186:51367192,12681393 no session for client.324186 2014-04-29 10:45:22.297228 7f90b971d700 0 log [WRN] : replayed op client.324186:51367257,12681393 no session for client.324186 2014-04-29 10:45:22.297477 7f90b971d700 0 log [WRN] : replayed op client.324186:51367264,12681393 no session for client.324186 tcmalloc: large alloc 1136660480 bytes == 0xb2019000 @ 0x7f90c2564da7 0x5bb9cb 0x5ac8eb 0x5b32f7 0x79ecd8 0x58cbed 0x7f90c231de9a 0x7f90c0cca3fd tcmalloc: large alloc 2273316864 bytes == 0x15d73d000 @ 0x7f90c2564da7 0x5bb9cb 0x5ac8eb 0x5b32f7 0x79ecd8 0x58cbed 0x7f90c231de9a 0x7f90c0cca3fd ceph -s shows that MDS up:replay, Also the messages above seemed to be repeating again after a while but with a different session number. Is there a way for us to determine that we are on the right track? Thanks. Regards, Luke On Sun, Apr 27, 2014 at 12:04 PM, Yan, Zheng uker...@gmail.com wrote: On Sat, Apr 26, 2014 at 9:56 AM, Jingyuan Luke jyl...@gmail.com wrote: Hi Greg, Actually our cluster is pretty empty, but we suspect we had a temporary network disconnection to one of our OSD, not sure if this caused the problem. Anyway we don't mind try the method you mentioned, how can we do that? compile ceph-mds with the attached patch. add a line mds wipe_sessions = 1 to the ceph.conf, Yan, Zheng Regards, Luke On Saturday, April 26, 2014, Gregory Farnum g...@inktank.com wrote: Hmm, it looks like your on-disk SessionMap is horrendously out of date. Did your cluster get full at some point? In any case, we're working on tools to repair this now but they aren't ready for use yet. Probably the only thing you could do is create an empty sessionmap with a higher version than the ones the journal refers to, but that might have other fallout effects... -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Fri, Apr 25, 2014 at 2:57 AM, Mohd Bazli Ab Karim bazli.abka...@mimos.my wrote: More logs. I ran ceph-mds with debug-mds=20. 
-2 2014-04-25 17:47:54.839672 7f0d6f3f0700 10 mds.0.journal EMetaBlob.replay inotable tablev 4316124 = table 4317932 -1 2014-04-25 17:47:54.839674 7f0d6f3f0700 10 mds.0.journal EMetaBlob.replay sessionmap v8632368 -(1|2) == table 7239603 prealloc [141df86~1] used 141db9e 0 2014-04-25 17:47:54.840733 7f0d6f3f0700 -1 mds/journal.cc: In function 'void EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)' thread 7f0d6f3f0700 time 2014-04-25 17:47:54.839688 mds/journal.cc: 1303: FAILED assert(session) Please look at the attachment for more details. Regards, Bazli From: Mohd Bazli Ab Karim Sent: Friday, April 25, 2014 12:26 PM To: 'ceph-de...@vger.kernel.org'; ceph-users@lists.ceph.com Subject: Ceph mds laggy and failed assert in function replay mds/journal.cc Dear Ceph-devel, ceph-users, I am currently facing issue with my ceph mds server. Ceph-mds daemon does not want to bring up back. Tried running that manually with ceph-mds -i mon01 -d but it shows that it stucks at failed assert(session) line 1303 in mds/journal.cc and aborted. Can someone shed some light in this issue. ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60) Let me know if I need to send log with debug enabled. Regards, Bazli ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
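For anyone following along, the change amounts to running the patched ceph-mds with the extra option and then watching the MDS state; a sketch, using the option name exactly as given above:

[mds]
    mds wipe_sessions = 1

$ ceph mds stat    # should move from up:replay towards up:active as replay completes
$ ceph -s          # overall cluster view while replay runs

The 'no session for client' warnings would be expected if the session table has been cleared, so on their own they do not necessarily mean that replay is failing.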
Re: [ceph-users] Only one OSD log available per node?
Hi Greg, The log rotation works fine, it will rotate the logs every day at around 6:50am. However, there are no writes to the files (except for one osd log file) so it will rotate empty files for most of them. -rw-r--r-- 1 root root 313884 Apr 29 12:07 ceph-osd.12.log -rw-r--r-- 1 root root 198319 Apr 29 06:36 ceph-osd.12.log.1.gz -rw-r--r-- 1 root root 181675 Apr 28 06:50 ceph-osd.12.log.2.gz -rw-r--r-- 1 root root 44012 Apr 27 06:53 ceph-osd.12.log.3.gz -rw-r--r-- 1 root root 0 Apr 29 06:36 ceph-osd.13.log -rw-r--r-- 1 root root 20 Apr 28 06:50 ceph-osd.13.log.1.gz -rw-r--r-- 1 root root506 Apr 27 06:53 ceph-osd.13.log.2.gz -rw-r--r-- 1 root root 44605 Apr 27 06:53 ceph-osd.13.log.3.gz -rw-r--r-- 1 root root 0 Apr 29 06:36 ceph-osd.14.log -rw-r--r-- 1 root root 20 Apr 28 06:50 ceph-osd.14.log.1.gz -rw-r--r-- 1 root root502 Apr 27 06:53 ceph-osd.14.log.2.gz -rw-r--r-- 1 root root 55570 Apr 27 06:53 ceph-osd.14.log.3.gz -rw-r--r-- 1 root root 0 Apr 29 06:36 ceph-osd.15.log -rw-r--r-- 1 root root 20 Apr 28 06:50 ceph-osd.15.log.1.gz -rw-r--r-- 1 root root500 Apr 27 06:53 ceph-osd.15.log.2.gz -rw-r--r-- 1 root root 49090 Apr 27 06:53 ceph-osd.15.log.3.gz Any advice? Thank you. On Mon, Apr 28, 2014 at 11:26 PM, Gregory Farnum g...@inktank.com wrote: It is not. My guess from looking at the time stamps is that maybe you have a log rotation system set up that isn't working properly? -Greg On Sunday, April 27, 2014, Indra Pramana in...@sg.or.id wrote: Dear all, I have multiple OSDs per node (normally 4) and I realised that for all the nodes that I have, only one OSD will contain logs under /var/log/ceph, the rest of the logs are empty. root@ceph-osd-07:/var/log/ceph# ls -la *.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-client.admin.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.0.log -rw-r--r-- 1 root root 386857 Apr 28 14:02 ceph-osd.12.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.13.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.14.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.15.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd..log The ceph-osd.12.log only contains the logs for osd.12 only, while the other logs for osd.13, 14 and 15 are not available and empty. Is this normal? Looking forward to your reply, thank you. Cheers. -- Software Engineer #42 @ http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Only one OSD log available per node?
Are your OSDs actually running? I see that your older logs have more data in them; did you change log rotation from the defaults? On Monday, April 28, 2014, Indra Pramana in...@sg.or.id wrote: Hi Greg, The log rotation works fine, it will rotate the logs every day at around 6:50am. However, there are no writes to the files (except for one osd log file) so it will rotate empty files for most of them. -rw-r--r-- 1 root root 313884 Apr 29 12:07 ceph-osd.12.log -rw-r--r-- 1 root root 198319 Apr 29 06:36 ceph-osd.12.log.1.gz -rw-r--r-- 1 root root 181675 Apr 28 06:50 ceph-osd.12.log.2.gz -rw-r--r-- 1 root root 44012 Apr 27 06:53 ceph-osd.12.log.3.gz -rw-r--r-- 1 root root 0 Apr 29 06:36 ceph-osd.13.log -rw-r--r-- 1 root root 20 Apr 28 06:50 ceph-osd.13.log.1.gz -rw-r--r-- 1 root root506 Apr 27 06:53 ceph-osd.13.log.2.gz -rw-r--r-- 1 root root 44605 Apr 27 06:53 ceph-osd.13.log.3.gz -rw-r--r-- 1 root root 0 Apr 29 06:36 ceph-osd.14.log -rw-r--r-- 1 root root 20 Apr 28 06:50 ceph-osd.14.log.1.gz -rw-r--r-- 1 root root502 Apr 27 06:53 ceph-osd.14.log.2.gz -rw-r--r-- 1 root root 55570 Apr 27 06:53 ceph-osd.14.log.3.gz -rw-r--r-- 1 root root 0 Apr 29 06:36 ceph-osd.15.log -rw-r--r-- 1 root root 20 Apr 28 06:50 ceph-osd.15.log.1.gz -rw-r--r-- 1 root root500 Apr 27 06:53 ceph-osd.15.log.2.gz -rw-r--r-- 1 root root 49090 Apr 27 06:53 ceph-osd.15.log.3.gz Any advice? Thank you. On Mon, Apr 28, 2014 at 11:26 PM, Gregory Farnum g...@inktank.comjavascript:_e(%7B%7D,'cvml','g...@inktank.com'); wrote: It is not. My guess from looking at the time stamps is that maybe you have a log rotation system set up that isn't working properly? -Greg On Sunday, April 27, 2014, Indra Pramana in...@sg.or.idjavascript:_e(%7B%7D,'cvml','in...@sg.or.id'); wrote: Dear all, I have multiple OSDs per node (normally 4) and I realised that for all the nodes that I have, only one OSD will contain logs under /var/log/ceph, the rest of the logs are empty. root@ceph-osd-07:/var/log/ceph# ls -la *.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-client.admin.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.0.log -rw-r--r-- 1 root root 386857 Apr 28 14:02 ceph-osd.12.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.13.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.14.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd.15.log -rw-r--r-- 1 root root 0 Apr 28 06:50 ceph-osd..log The ceph-osd.12.log only contains the logs for osd.12 only, while the other logs for osd.13, 14 and 15 are not available and empty. Is this normal? Looking forward to your reply, thank you. Cheers. -- Software Engineer #42 @ http://inktank.com | http://ceph.com -- Software Engineer #42 @ http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] SSD journal overload?
Hi Irek, Good day to you, and thank you for your e-mail. Is there a better way other than patching the kernel? I would like to avoid having to compile a custom kernel for my OS. I read that I can disable write-caching on the drive using hdparm: hdparm -W0 /dev/sdf hdparm -W0 /dev/sdg I tested on one of my test servers and it seems I can disable it using the command. Current setup, write-caching is on: root@ceph-osd-09:/home/indra# hdparm -W /dev/sdg /dev/sdg: write-caching = 1 (on) I tried to disable write-caching and it's successful: root@ceph-osd-09:/home/indra# hdparm -W0 /dev/sdg /dev/sdg: setting drive write-caching to 0 (off) write-caching = 0 (off) I check again, and now write-caching is disabled. root@ceph-osd-09:/home/indra# hdparm -W /dev/sdg /dev/sdg: write-caching = 0 (off) Would the above give the same result? If yes, I will try to do that on our running cluster tonight. May I also know how I can confirm if my SSD comes with volatile cache as mentioned on your article? I tried to check my SSD's data sheet and there's no information on whether it comes with volatile cache or not. I also read that disabling write-caching will also increase the risk of data-loss. Can you comment on that? Looking forward to your reply, thank you. Cheers. On Mon, Apr 28, 2014 at 7:49 PM, Irek Fasikhov malm...@gmail.com wrote: This is my article :). To patch to the kernel ( http://www.theirek.com/downloads/code/CMD_FLUSH.diff). After rebooting, run the following commands: echo temporary write through /sys/class/scsi_disk/disk/cache_type 2014-04-28 15:44 GMT+04:00 Indra Pramana in...@sg.or.id: Hi Irek, Thanks for the article. Do you have any other web sources pertaining to the same issue, which is in English? Looking forward to your reply, thank you. Cheers. On Mon, Apr 28, 2014 at 7:40 PM, Irek Fasikhov malm...@gmail.com wrote: Most likely you need to apply a patch to the kernel. http://www.theirek.com/blog/2014/02/16/patch-dlia-raboty-s-enierghoniezavisimym-keshiem-ssd-diskov 2014-04-28 15:20 GMT+04:00 Indra Pramana in...@sg.or.id: Hi Udo and Irek, Good day to you, and thank you for your emails. perhaps due IOs from the journal? You can test with iostat (like iostat -dm 5 sdg). Yes, I have shared the iostat result earlier on this same thread. At times the utilisation of the 2 journal drives will hit 100%, especially when I simulate writing data using rados bench command. Any suggestions what could be the cause of the I/O issue? 
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.85    0.00    1.65    3.14    0.00   93.36

Device:  rrqm/s  wrqm/s   r/s    w/s   rkB/s     wkB/s  avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdg        0.00    0.00  0.00  55.00    0.00  25365.33    922.38    34.22  568.90    0.00  568.90  17.82  98.00
sdf        0.00    0.00  0.00  55.67    0.00  25022.67    899.02    29.76  500.57    0.00  500.57  17.60  98.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.10    0.00    1.37    2.07    0.00   94.46

Device:  rrqm/s  wrqm/s   r/s    w/s   rkB/s     wkB/s  avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdg        0.00    0.00  0.00  56.67    0.00  25220.00    890.12    23.60  412.14    0.00  412.14  17.62  99.87
sdf        0.00    0.00  0.00  52.00    0.00  24637.33    947.59    33.65  587.41    0.00  587.41  19.23 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.21    0.00    1.77    6.75    0.00   89.27

Device:  rrqm/s  wrqm/s   r/s    w/s   rkB/s     wkB/s  avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdg        0.00    0.00  0.00  54.33    0.00  24802.67    912.98    25.75  486.36    0.00  486.36  18.40 100.00
sdf        0.00    0.00  0.00  53.00    0.00  24716.00    932.68    35.26  669.89    0.00  669.89  18.87 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.87    0.00    1.67    5.25    0.00   91.21

Device:  rrqm/s  wrqm/s   r/s    w/s   rkB/s     wkB/s  avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdg        0.00    0.00  0.00  94.33    0.00  26257.33    556.69    18.29  208.44    0.00  208.44  10.50  99.07
sdf        0.00    0.00  0.00  51.33    0.00  24470.67    953.40    32.75  684.62    0.00  684.62  19.51 100.13

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.51    0.00    1.34    7.25    0.00   89.89

Device:  rrqm/s  wrqm/s   r/s    w/s   rkB/s     wkB/s  avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdg        0.00    0.00  0.00  52.00    0.00  22565.33    867.90    24.73  446.51    0.00  446.51  19.10  99.33
sdf        0.00    0.00  0.00  64.67    0.00  24892.00    769.86    19.50  330.02    0.00  330.02  15.32
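One thing worth separating out here is raw throughput versus synchronous write behaviour: the OSD journal writes with O_DIRECT/O_DSYNC, so an SSD that looks fast with a plain dd can still collapse under journal-style load. A rough sketch of testing that pattern, assuming /dev/sdg3 is still the spare test partition used for the earlier dd runs:

  # small synchronous writes, roughly the pattern the journal generates
  dd if=/dev/zero of=/dev/sdg3 bs=4k count=10000 oflag=direct,dsync

  # watch both journal SSDs while the cluster is under rados bench load
  iostat -xm 5 sdf sdg

If the dsync numbers are far below the plain direct numbers, the drive (or its cache setting) is more likely the limiting factor than Ceph itself.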
Re: [ceph-users] Only one OSD log available per node?
Hi Greg,

Yes, all my OSDs are running.

 1945 ?  Ssl  215:36 /usr/bin/ceph-osd --cluster=ceph -i 12 -f
 2090 ?  Sl   165:07 /usr/bin/ceph-osd --cluster=ceph -i 15 -f
 2100 ?  Sl   205:29 /usr/bin/ceph-osd --cluster=ceph -i 13 -f
 2102 ?  Sl   196:01 /usr/bin/ceph-osd --cluster=ceph -i 14 -f

I didn't change log rotation settings from the default. This happens to all my OSD nodes, not only this one. Is there a way I can verify whether the logs are actually being written by the ceph-osd processes?

Looking forward to your reply, thank you. Cheers.

On Tue, Apr 29, 2014 at 12:28 PM, Gregory Farnum g...@inktank.com wrote:

Are your OSDs actually running? I see that your older logs have more data in them; did you change log rotation from the defaults?

On Monday, April 28, 2014, Indra Pramana in...@sg.or.id wrote:

Hi Greg,

The log rotation works fine; it rotates the logs every day at around 6:50am. However, there are no writes to the files (except for one OSD log file), so it rotates empty files for most of them.

-rw-r--r-- 1 root root 313884 Apr 29 12:07 ceph-osd.12.log
-rw-r--r-- 1 root root 198319 Apr 29 06:36 ceph-osd.12.log.1.gz
-rw-r--r-- 1 root root 181675 Apr 28 06:50 ceph-osd.12.log.2.gz
-rw-r--r-- 1 root root  44012 Apr 27 06:53 ceph-osd.12.log.3.gz
-rw-r--r-- 1 root root      0 Apr 29 06:36 ceph-osd.13.log
-rw-r--r-- 1 root root     20 Apr 28 06:50 ceph-osd.13.log.1.gz
-rw-r--r-- 1 root root    506 Apr 27 06:53 ceph-osd.13.log.2.gz
-rw-r--r-- 1 root root  44605 Apr 27 06:53 ceph-osd.13.log.3.gz
-rw-r--r-- 1 root root      0 Apr 29 06:36 ceph-osd.14.log
-rw-r--r-- 1 root root     20 Apr 28 06:50 ceph-osd.14.log.1.gz
-rw-r--r-- 1 root root    502 Apr 27 06:53 ceph-osd.14.log.2.gz
-rw-r--r-- 1 root root  55570 Apr 27 06:53 ceph-osd.14.log.3.gz
-rw-r--r-- 1 root root      0 Apr 29 06:36 ceph-osd.15.log
-rw-r--r-- 1 root root     20 Apr 28 06:50 ceph-osd.15.log.1.gz
-rw-r--r-- 1 root root    500 Apr 27 06:53 ceph-osd.15.log.2.gz
-rw-r--r-- 1 root root  49090 Apr 27 06:53 ceph-osd.15.log.3.gz

Any advice? Thank you.

On Mon, Apr 28, 2014 at 11:26 PM, Gregory Farnum g...@inktank.com wrote:

It is not. My guess from looking at the time stamps is that maybe you have a log rotation system set up that isn't working properly?
-Greg

On Sunday, April 27, 2014, Indra Pramana in...@sg.or.id wrote:

Dear all,

I have multiple OSDs per node (normally 4) and I realised that for all the nodes that I have, only one OSD will contain logs under /var/log/ceph; the rest of the logs are empty.

root@ceph-osd-07:/var/log/ceph# ls -la *.log
-rw-r--r-- 1 root root      0 Apr 28 06:50 ceph-client.admin.log
-rw-r--r-- 1 root root      0 Apr 28 06:50 ceph-osd.0.log
-rw-r--r-- 1 root root 386857 Apr 28 14:02 ceph-osd.12.log
-rw-r--r-- 1 root root      0 Apr 28 06:50 ceph-osd.13.log
-rw-r--r-- 1 root root      0 Apr 28 06:50 ceph-osd.14.log
-rw-r--r-- 1 root root      0 Apr 28 06:50 ceph-osd.15.log
-rw-r--r-- 1 root root      0 Apr 28 06:50 ceph-osd..log

The ceph-osd.12.log contains the logs for osd.12 only, while the logs for osd.13, 14 and 15 are not available and empty. Is this normal?

Looking forward to your reply, thank you. Cheers.

--
Software Engineer #42 @ http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
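To answer the last question directly: one way to see which log file (if any) each running ceph-osd actually holds open is to look at its file descriptors. A sketch, using the pids from the ps listing above:

  for pid in 1945 2090 2100 2102; do
      echo "== pid $pid =="
      ls -l /proc/$pid/fd 2>/dev/null | grep /var/log/ceph
  done

If an OSD's log descriptor points at a path marked "(deleted)", the daemon is still writing to a rotated file and never reopened the new one, which would point back at the log rotation setup; if it shows no /var/log/ceph descriptor at all, it was most likely started without a usable log_file setting.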