Re: [ceph-users] Ceph Status - Segmentation Fault

2016-05-31 Thread Mathias Buresch
Hi,

here is the output including --debug-auth=20. Does this help?

(gdb) run /usr/bin/ceph status --debug-monc=20 --debug-ms=20 --debug-
rados=20 --debug-auth=20
Starting program: /usr/bin/python /usr/bin/ceph status --debug-monc=20
--debug-ms=20 --debug-rados=20 --debug-auth=20
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-
gnu/libthread_db.so.1".
[New Thread 0x710f5700 (LWP 2210)]
[New Thread 0x708f4700 (LWP 2211)]
[Thread 0x710f5700 (LWP 2210) exited]
[New Thread 0x710f5700 (LWP 2212)]
[Thread 0x710f5700 (LWP 2212) exited]
[New Thread 0x710f5700 (LWP 2213)]
[Thread 0x710f5700 (LWP 2213) exited]
[New Thread 0x710f5700 (LWP 2233)]
[Thread 0x710f5700 (LWP 2233) exited]
[New Thread 0x710f5700 (LWP 2236)]
[Thread 0x710f5700 (LWP 2236) exited]
[New Thread 0x710f5700 (LWP 2237)]
[Thread 0x710f5700 (LWP 2237) exited]
[New Thread 0x710f5700 (LWP 2238)]
[New Thread 0x7fffeb885700 (LWP 2240)]
2016-06-01 07:12:55.656336 710f5700 10 monclient(hunting):
build_initial_monmap
2016-06-01 07:12:55.656440 710f5700  1 librados: starting msgr at
:/0
2016-06-01 07:12:55.656446 710f5700  1 librados: starting objecter
[New Thread 0x7fffeb084700 (LWP 2241)]
2016-06-01 07:12:55.657552 710f5700 10 -- :/0 ready :/0
[New Thread 0x7fffea883700 (LWP 2242)]
[New Thread 0x7fffea082700 (LWP 2245)]
2016-06-01 07:12:55.659548 710f5700  1 -- :/0 messenger.start
[New Thread 0x7fffe9881700 (LWP 2248)]
2016-06-01 07:12:55.660530 710f5700  1 librados: setting wanted
keys
2016-06-01 07:12:55.660539 710f5700  1 librados: calling monclient
init
2016-06-01 07:12:55.660540 710f5700 10 monclient(hunting): init
2016-06-01 07:12:55.660550 710f5700  5 adding auth protocol: cephx
2016-06-01 07:12:55.660552 710f5700 10 monclient(hunting):
auth_supported 2 method cephx
2016-06-01 07:12:55.660532 7fffe9881700 10 -- :/1337675866 reaper_entry
start
2016-06-01 07:12:55.660570 7fffe9881700 10 -- :/1337675866 reaper
2016-06-01 07:12:55.660572 7fffe9881700 10 -- :/1337675866 reaper done
2016-06-01 07:12:55.660733 710f5700  2 auth: KeyRing::load: loaded
key file /etc/ceph/ceph.client.admin.keyring
[New Thread 0x7fffe9080700 (LWP 2251)]
[New Thread 0x7fffe887f700 (LWP 2252)]
2016-06-01 07:12:55.662754 710f5700 10 monclient(hunting):
_reopen_session rank -1 name 
2016-06-01 07:12:55.662764 710f5700 10 -- :/1337675866 connect_rank
to 62.176.141.181:6789/0, creating pipe and registering
[New Thread 0x7fffe3fff700 (LWP 2255)]
2016-06-01 07:12:55.663789 710f5700 10 -- :/1337675866 >>
62.176.141.181:6789/0 pipe(0x7fffec064010 sd=-1 :0 s=1 pgs=0 cs=0 l=1
c=0x7fffec05aa30).register_pipe
2016-06-01 07:12:55.663819 710f5700 10 -- :/1337675866
get_connection mon.0 62.176.141.181:6789/0 new 0x7fffec064010
2016-06-01 07:12:55.663790 7fffe3fff700 10 -- :/1337675866 >>
62.176.141.181:6789/0 pipe(0x7fffec064010 sd=-1 :0 s=1 pgs=0 cs=0 l=1
c=0x7fffec05aa30).writer: state = connecting policy.server=0
2016-06-01 07:12:55.663830 7fffe3fff700 10 -- :/1337675866 >>
62.176.141.181:6789/0 pipe(0x7fffec064010 sd=-1 :0 s=1 pgs=0 cs=0 l=1
c=0x7fffec05aa30).connect 0
2016-06-01 07:12:55.663841 710f5700 10 monclient(hunting): picked
mon.pix01 con 0x7fffec05aa30 addr 62.176.141.181:6789/0
2016-06-01 07:12:55.663847 710f5700 20 -- :/1337675866
send_keepalive con 0x7fffec05aa30, have pipe.
2016-06-01 07:12:55.663850 7fffe3fff700 10 -- :/1337675866 >>
62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fffec05aa30).connecting to 62.176.141.181:6789/0
2016-06-01 07:12:55.663863 710f5700 10 monclient(hunting):
_send_mon_message to mon.pix01 at 62.176.141.181:6789/0
2016-06-01 07:12:55.663866 710f5700  1 -- :/1337675866 -->
62.176.141.181:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- ?+0
0x7fffec060450 con 0x7fffec05aa30
2016-06-01 07:12:55.663870 710f5700 20 -- :/1337675866
submit_message auth(proto 0 30 bytes epoch 0) v1 remote,
62.176.141.181:6789/0, have pipe.
2016-06-01 07:12:55.663874 710f5700 10 monclient(hunting):
renew_subs
2016-06-01 07:12:55.663877 710f5700 10 monclient(hunting):
authenticate will time out at 2016-06-01 07:17:55.663876
2016-06-01 07:12:55.664115 7fffe3fff700 20 -- :/1337675866 >>
62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :41128 s=1 pgs=0 cs=0
l=1 c=0x7fffec05aa30).connect read peer addr 62.176.141.181:6789/0 on
socket 3
2016-06-01 07:12:55.664135 7fffe3fff700 20 -- :/1337675866 >>
62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :41128 s=1 pgs=0 cs=0
l=1 c=0x7fffec05aa30).connect peer addr for me is
62.176.141.181:41128/0
2016-06-01 07:12:55.664143 7fffe3fff700  1 --
62.176.141.181:0/1337675866 learned my addr 62.176.141.181:0/1337675866
2016-06-01 07:12:55.664177 7fffe3fff700 10 --
62.176.141.181:0/1337675866 >> 62.176.141.181:6789/0
pipe(0x7fffec064010 sd=3 :41128 s=1 pgs=0 cs=0 l=1
c=0x7fffec05aa30).connect sent my addr 62.176.141.181:0/1337675866
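
When the segfault actually fires under gdb, the stack can be captured at the
(gdb) prompt with the standard commands, e.g.:

(gdb) bt
(gdb) thread apply all bt full
(gdb) info registers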

[ceph-users] [10.2.1] cephfs, mds reliability - client isn't responding to mclientcaps(revoke)

2016-05-31 Thread James Webb
Dear ceph-users...

My team runs an internal buildfarm using ceph as a backend storage platform. 
We’ve recently upgraded to Jewel and are having reliability issues that we need 
some help with.

Our infrastructure is the following:
- We use CEPH/CEPHFS (10.2.1)
- We have 3 mons and 6 storage servers with a total of 36 OSDs (~4160 PGs). 
- We use enterprise SSDs for everything including journals 
- We have one main mds and one standby mds.
- We are using ceph kernel client to mount cephfs.
- We have upgraded to Ubuntu 16.04 (4.4.0-22-generic kernel)
- We are using a kernel NFS to serve NFS clients from a ceph mount (~ 32 nfs 
threads. 0 swappiness)
- These are physical machines with 8 cores & 32GB memory

On a regular basis, we lose all IO via CephFS. We’re still trying to isolate 
the issue, but it surfaces as a problem between the MDS and the ceph client.  
We can’t tell if our NFS server is overwhelming the MDS or if this is some 
unrelated issue. Tuning the NFS server has not solved our issues.
So far our only recovery has been to fail the MDS and then restart our NFS. 
Any help or advice on the Ceph side of things will be appreciated.
I’m pretty sure we’re running with default tuning of the Ceph MDS 
configuration parameters.


Here are the relevant log entries.

From my primary MDS server, I start seeing these entries start to pile up:

2016-05-31 14:34:07.091117 7f9f2eb87700  0 log_channel(cluster) log [WRN] : 
client.4283066 isn't responding to mclientcaps(revoke), ino 1004491 pending 
pAsLsXsFsxcrwb issued pAsxLsXsxFsxcrwb, sent 63.877480 seconds ago\
2016-05-31 14:34:07.091129 7f9f2eb87700  0 log_channel(cluster) log [WRN] : 
client.4283066 isn't responding to mclientcaps(revoke), ino 1005ddf pending 
pAsLsXsFsxcrwb issued pAsxLsXsxFsxcrwb, sent 63.877382 seconds ago\
2016-05-31 14:34:07.091133 7f9f2eb87700  0 log_channel(cluster) log [WRN] : 
client.4283066 isn't responding to mclientcaps(revoke), ino 1000a2a pending 
pAsLsXsFsxcrwb issued pAsxLsXsxFsxcrwb, sent 63.877356 seconds ago

From my NFS server, I see these entries from dmesg also start piling up:
[Tue May 31 14:33:09 2016] libceph: skipping mds0 X.X.X.195:6800 seq 0 expected 
4294967296
[Tue May 31 14:33:09 2016] libceph: skipping mds0 X.X.X.195:6800 seq 1 expected 
4294967296
[Tue May 31 14:33:09 2016] libceph: skipping mds0 X.X.X.195:6800 seq 2 expected 
4294967296

Next, we find something like this on one of the OSDs.:
2016-05-31 14:34:44.130279 mon.0 XX.XX.XX.188:6789/0 1272184 : cluster [INF] 
HEALTH_WARN; mds0: Client storage-nfs-01 failing to respond to capability 
release

Finally, I am seeing a consistent HEALTH_WARN in my status regarding 
trimming, which I am not sure is related:

cluster -bd8f-4091-bed3-8586fd0d6b46
 health HEALTH_WARN
mds0: Behind on trimming (67/30)
 monmap e3: 3 mons at 
{storage02=X.X.X.190:6789/0,storage03=X.X.X.189:6789/0,storage04=X.X.X.188:6789/0}
election epoch 206, quorum 0,1,2 storage04,storage03,storage02
  fsmap e74879: 1/1/1 up {0=cephfs-03=up:active}, 1 up:standby
 osdmap e65516: 36 osds: 36 up, 36 in
  pgmap v15435732: 4160 pgs, 3 pools, 37539 GB data, 9611 kobjects
75117 GB used, 53591 GB / 125 TB avail
4160 active+clean
  client io 334 MB/s rd, 319 MB/s wr, 5839 op/s rd, 4848 op/s wr
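
If I understand that warning, the (67/30) compares the current journal
segment count against the trim threshold (mds_log_max_segments, which I
believe defaults to 30 on Jewel). A sketch of what I'm considering for
inspection and tuning -- the value 200 is purely illustrative:

# inspect the session of the client that won't release caps (on the MDS host)
ceph daemon mds.cephfs-03 session ls
# raise the trim threshold at runtime on the active MDS
ceph tell mds.0 injectargs '--mds_log_max_segments 200'
# or persist it in ceph.conf on the MDS hosts:
# [mds]
# mds log max segments = 200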


Regards,
James Webb
DevOps Engineer, Engineering Tools
Unity Technologies


Re: [ceph-users] radosgw s3 errors after installation quickstart

2016-05-31 Thread hp cre
Thanks Austin. I do not use DNS, just local name resolution through the
hosts file. I'm not sure how I would configure resolution for
http://bucket.host.
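
(A hosts file cannot express wildcards, so a name like myb.xen1 will never
resolve from it. A minimal sketch with dnsmasq instead -- assuming xen1
answers on 10.0.0.10, which is a guess based on the mon_host list below:)

# /etc/dnsmasq.conf -- resolves xen1 and every *.xen1 name to the gateway
address=/xen1/10.0.0.10

Then clients just need to use that dnsmasq instance as their resolver.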

On 31 May 2016 at 17:00, Austin Johnson  wrote:

> Check that DNS is set up correctly to allow http://bucket.host.tld/
> instead of http://host.tld/bucket/. Or change the operation of your
> client to conform to the latter.
>
> There are also a few config parameters that will need to be added to
> ceph.conf to allow dns bucket resolution.
>
> Austin
>
> Sent from my iPhone
>
> On May 31, 2016, at 5:51 AM, hp cre  wrote:
>
> Hello,
>
> I created a test cluster of 3 OSD hosts (xen1,2,3) based on Ubuntu Xenial,
> ceph 10.2.1 using the quick start steps in the docs master branch.
>
> After jumping through a few problems, mainly from the inconsistent details
> in the docs, I got a stable cluster running with RGW.
>
> Running the s3boto test script to create a new bucket works fine. However,
> when I use any other tool to PUT files, I get a strange error stating "host
> not found". There is nothing in the gateway logs that would suggest why
> this happens; I only get the list of GET requests from the client(s) I use.
>
> sample:
> ===
> 2016-05-31 13:30:21.953366 7fb2f37be700  1 civetweb: 0x7fb32800cbc0:
> 10.0.0.1 - - [31/May/2016:13:30:21 +0200] "GET / HTTP/1.1" 200 0 -
> CrossFTP/1.97.6 (Linux/4.4.0-21-generic; amd64; en; JVM 1.8.0_91)
> 2016-05-31 13:30:22.975609 7fb2f2fbd700  1 == starting new request
> req=0x7fb2f2fb77e0 =
> 2016-05-31 13:30:22.978613 7fb2f2fbd700  1 == req done
> req=0x7fb2f2fb77e0 op status=0 http_status=200 ==
> 2016-05-31 13:30:22.978710 7fb2f2fbd700  1 civetweb: 0x7fb330016690:
> 10.0.0.1 - - [31/May/2016:13:30:22 +0200] "GET / HTTP/1.1" 200 0 -
> CrossFTP/1.97.6 (Linux/4.4.0-21-generic; amd64; en; JVM 1.8.0_91)
> 2016-05-31 13:32:04.032800 7fb2f27bc700  1 == starting new request
> req=0x7fb2f27b67e0 =
> 2016-05-31 13:32:04.034847 7fb2f27bc700  1 == req done
> req=0x7fb2f27b67e0 op status=0 http_status=200 ==
> 2016-05-31 13:32:04.034895 7fb2f27bc700  1 civetweb: 0x7fb32c005910:
> 10.0.0.1 - - [31/May/2016:13:32:04 +0200] "GET / HTTP/1.1" 200 0 -
> DragonDisk 1.05 ( http://www.dragondisk.com )
> ==
>
> example error message when I try a PUT operation on DragonDisk client to a
> bucket I created (myb), file called testfile
>
> 
>
> 2
>
> Operation: Copy
>
> /home/wes/testfile -> http://myb.xen1/testfile
>
> Host not found
>
> 
>
> example error I got from using CrossFTP client to upload a file
> ==
> L2] LIST All Buckets
> [R1] LIST /myb
> [R1] LIST All Buckets (cached)
> [R1] Succeeded
> [R1] LIST All Buckets
> [R1] Succeeded
> [L2] Succeeded
>  Secure random seed initialized.
> [L2] S3 Error: -1 (null) error: Request Error: myb.xen1; XML Error
> Message: null
> [L2] -1 (null) error: Request Error: myb.xen1; XML Error Message: null
> [R1] S3 Error: -1 (null) error: Request Error: myb.xen1: unknown error;
> XML Error Message: null
> [R1] -1 (null) error: Request Error: myb.xen1: unknown error; XML Error
> Message: null
> 
>
> example put operation for file "ceph-deploy-ceph.log" using s3cmd client
> on the gateway node (xen1)
>
> =
> root@xen1:/home/cl# s3cmd put ceph-deploy-ceph.log s3://myb
> upload: 'ceph-deploy-ceph.log' -> 's3://myb/ceph-deploy-ceph.log'  [1 of 1]
>   0 of 641045 0% in0s 0.00 B/s  failed
> WARNING: Retrying failed request: /ceph-deploy-ceph.log ([Errno -2] Name
> or service not known)
> WARNING: Waiting 3 sec...
> upload: 'ceph-deploy-ceph.log' -> 's3://myb/ceph-deploy-ceph.log'  [1 of 1]
>   0 of 641045 0% in0s 0.00 B/s  failed
> WARNING: Retrying failed request: /ceph-deploy-ceph.log ([Errno -2] Name
> or service not known)
> WARNING: Waiting 6 sec...
> ===
>
>
>
> Here is my ceph.conf
> =
> [global]
> fsid = 77dbb949-8eed-4eea-b0ff-0c612e7e2991
> mon_initial_members = xen1, xen2, xen3
> mon_host = 10.0.0.10,10.0.0.11,10.0.0.12
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> osd_pool_default_size = 2
>
>
> [client.radosgw.gateway]
> rgw_frontends = "civetweb port=80"
> host = xen1
> keyring = /etc/ceph/ceph.client.radosgw.keyring
> rgw socket path = /var/run/ceph/ceph.radosgw.gateway.sock
> rgw print continue = false
> =
>
> Any troubleshooting help will be appreciated.
>
> Thanks,
> hpcre
>

Re: [ceph-users] radosgw s3 errors after installation quickstart

2016-05-31 Thread hp cre
Thanks Jean-Charles. I tried adding this in ceph.conf but it did not make a
difference.

On 31 May 2016 at 17:11, LOPEZ Jean-Charles  wrote:

> Hi,
>
> In order to use s3cmd, just make sure you have the rgw_dns_name =
> {bucket_fqdn_suffix} in your config file in the RGW section. In your case
> I’d say rgw_dns_name = xen1
>
> And you should be good to go.
>
> JC
>
> On May 31, 2016, at 04:51, hp cre  wrote:
>
> Hello,
>
> I created a test cluster of 3 OSD hosts (xen1,2,3) based on Ubuntu Xenial,
> ceph 10.2.1 using the quick start steps in the docs master branch.
>
> After jumping through a few problems, mainly from the inconsistent details
> in the docs, I got a stable cluster running with RGW.
>
> Running the s3boto test script to create a new bucket works fine. However,
> when I use any other tool to PUT files, I get a strange error stating "host
> not found". There is nothing in the gateway logs that would suggest why
> this happens; I only get the list of GET requests from the client(s) I use.
>
> sample:
> ===
> 2016-05-31 13:30:21.953366 7fb2f37be700  1 civetweb: 0x7fb32800cbc0:
> 10.0.0.1 - - [31/May/2016:13:30:21 +0200] "GET / HTTP/1.1" 200 0 -
> CrossFTP/1.97.6 (Linux/4.4.0-21-generic; amd64; en; JVM 1.8.0_91)
> 2016-05-31 13:30:22.975609 7fb2f2fbd700  1 == starting new request
> req=0x7fb2f2fb77e0 =
> 2016-05-31 13:30:22.978613 7fb2f2fbd700  1 == req done
> req=0x7fb2f2fb77e0 op status=0 http_status=200 ==
> 2016-05-31 13:30:22.978710 7fb2f2fbd700  1 civetweb: 0x7fb330016690:
> 10.0.0.1 - - [31/May/2016:13:30:22 +0200] "GET / HTTP/1.1" 200 0 -
> CrossFTP/1.97.6 (Linux/4.4.0-21-generic; amd64; en; JVM 1.8.0_91)
> 2016-05-31 13:32:04.032800 7fb2f27bc700  1 == starting new request
> req=0x7fb2f27b67e0 =
> 2016-05-31 13:32:04.034847 7fb2f27bc700  1 == req done
> req=0x7fb2f27b67e0 op status=0 http_status=200 ==
> 2016-05-31 13:32:04.034895 7fb2f27bc700  1 civetweb: 0x7fb32c005910:
> 10.0.0.1 - - [31/May/2016:13:32:04 +0200] "GET / HTTP/1.1" 200 0 -
> DragonDisk 1.05 ( http://www.dragondisk.com )
> ==
>
> example error message when I try a PUT operation on DragonDisk client to a
> bucket I created (myb), file called testfile
>
> 
> 2
> Operation: Copy
> /home/wes/testfile -> http://myb.xen1/testfile
> Host not found
> 
>
> example error I got from using CrossFTP client to upload a file
> ==
> L2] LIST All Buckets
> [R1] LIST /myb
> [R1] LIST All Buckets (cached)
> [R1] Succeeded
> [R1] LIST All Buckets
> [R1] Succeeded
> [L2] Succeeded
>  Secure random seed initialized.
> [L2] S3 Error: -1 (null) error: Request Error: myb.xen1; XML Error
> Message: null
> [L2] -1 (null) error: Request Error: myb.xen1; XML Error Message: null
> [R1] S3 Error: -1 (null) error: Request Error: myb.xen1: unknown error;
> XML Error Message: null
> [R1] -1 (null) error: Request Error: myb.xen1: unknown error; XML Error
> Message: null
> 
>
> example put operation for file "ceph-deploy-ceph.log" using s3cmd client
> on the gateway node (xen1)
>
> =
> root@xen1:/home/cl# s3cmd put ceph-deploy-ceph.log s3://myb
> upload: 'ceph-deploy-ceph.log' -> 's3://myb/ceph-deploy-ceph.log'  [1 of
> 1]
>   0 of 641045 0% in0s 0.00 B/s  failed
> WARNING: Retrying failed request: /ceph-deploy-ceph.log ([Errno -2] Name
> or service not known)
> WARNING: Waiting 3 sec...
> upload: 'ceph-deploy-ceph.log' -> 's3://myb/ceph-deploy-ceph.log'  [1 of
> 1]
>   0 of 641045 0% in0s 0.00 B/s  failed
> WARNING: Retrying failed request: /ceph-deploy-ceph.log ([Errno -2] Name
> or service not known)
> WARNING: Waiting 6 sec...
> ===
>
>
>
> Here is my ceph.conf
> =
> [global]
> fsid = 77dbb949-8eed-4eea-b0ff-0c612e7e2991
> mon_initial_members = xen1, xen2, xen3
> mon_host = 10.0.0.10,10.0.0.11,10.0.0.12
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> osd_pool_default_size = 2
>
>
> [client.radosgw.gateway]
> rgw_frontends = "civetweb port=80"
> host = xen1
> keyring = /etc/ceph/ceph.client.radosgw.keyring
> rgw socket path = /var/run/ceph/ceph.radosgw.gateway.sock
> rgw print continue = false
> =
>
> Any troubleshooting help will be appreciated.
>
> Thanks,
> hpcre
>
>
>

Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage capacity when using Ceph based On high-end storage systems

2016-05-31 Thread Oliver Dzombic
Hi Nick,

well, for sure I have some serious design/configuration weaknesses in
this cluster ( which started with 2.3 GHz dual-core Celeron CPUs, hrhr
) -- the mons and MDSs are still on that kind of CPU, and quite
unhappy with that. And also some OSDs with 3x HDD.

Our cluster here is running ~1k IOPS under normal load, with 5 million
objects and 20 TB of raw data.

Much of it goes via RBD ( KVM ), a lot via CephFS. All very, very random
stuff. When scrubbing or backfilling kicks in, no matter what I tell Ceph,
the HDDs go straight to 100% busy.

So yes, the CPU is not the problem in my particular case. It's more my
configuration, I think.

But well, now with Jewel, I hope I will be able to create a new cluster
with new hardware and, this time, a better configuration.

---

Thank you for your link.

As it seems to me, the difference between 2.4 and 4.3 GHz is quite
small, considering that the price difference, at least with the E5s, is
several times over.

So I think, when it comes to the $$$/performance ratio, these high-end
CPUs are not a good choice.

For the same money you already have a good part of another node (
enabling you to run more OSDs ).

But well, the last required hardware for the new cluster arrived
yesterday, so I will see for myself in the very near future.



As for preventing the CPUs' sleep states, I think that can be turned
off, more or less, in the BIOS.
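
From the OS side, the same idea looks roughly like this on Linux (sysfs
paths vary by distro/kernel; runtime only, not persistent):

# show the current cpufreq governor on core 0
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# pin all cores to the performance governor
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    echo performance > "$g"
done

Deep C-states can also be capped at boot with kernel parameters such as
intel_idle.max_cstate=1 processor.max_cstate=1 on Intel boxes.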


-- 
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
IP-Interactive

mailto:i...@ip-interactive.de

Address:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402, Hanau District Court
Managing Director: Oliver Dzombic

Tax No.: 35 236 3622 1
VAT ID: DE274086107


On 31.05.2016 at 16:24, Nick Fisk wrote:
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Oliver Dzombic
>> Sent: 31 May 2016 12:51
>> To: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage
>> capacity when using Ceph based On high-end storage systems
>>
>> Hi Nick,
>>
>> well as it seems, you have a point.
>>
>> I have now started processes to make 3 of 4 cores 100% busy.
>>
>> The %wa was dropping down to 2.5-12% without scrubbing, but also ~ 0%
>> idle time.
> 
> Thanks for testing that out. I was getting worried I had ordered the wrong
> kit!!!
> 
>>
>> While it's ~ 20-70 %wa without those 3 of 4 100% busy cores, with
>> ~ 40-50% idle time.
>>
>> ---
>>
>> That means that %wa does not harm the CPU (considerably).
>>
>> I don't dare to start a new scrubbing again now at daytime.
> 
> Are you possibly getting poor performance during scrubs due to disk
> contention rather than cpu?
> 
>>
>> So these numbers are missing right now.
>>
>> In the very end it means that you should be fine with your 12x HDD.
>>
>> So in the very end: the more cores you have, the more GHz in sum you get
>>
>> ( 4x 4 GHz E3 vs. 12-24x 2.4 GHz E5 )
>>
>> and this way, the more OSDs you can run without the CPU being the
>> bottleneck.
>>
>> While higher frequency == higher per-task performance ( writes above all ).
> 
> Yes, that was my plan to get good write latency, but still be able to scale
> easily/cheaply by using the E3's. Also keep in mind with too many cores,
> they will start scaling down their frequency to save power if they are not
> kept busy. If a process gets assigned to a sleeping clocked down core, it
> takes a while for it to boost back up. I found this could cause a 10-30% hit
> in performance. 
> 
> FYI, here are the tests I ran last year
> http://www.sys-pro.co.uk/ceph-storage-fast-cpus-ssd-performance/
> 
> I actually found I could get better than 1600 IOPS if I ran a couple of CPU
> stress tests to keep the cores running at turbo speed. I also tested with
> the journal on a RAM disk to eliminate the SSD speed from the equation and
> managed to get near 2500 IOPS, which is pretty good for QD=1.
> 
>>
>>
>> --
>> Mit freundlichen Gruessen / Best regards
>>
>> Oliver Dzombic
>> IP-Interactive
>>
>> mailto:i...@ip-interactive.de
>>
>> Address:
>>
>> IP Interactive UG ( haftungsbeschraenkt ) Zum Sonnenberg 1-3
>> 63571 Gelnhausen
>>
>> HRB 93402, Hanau District Court
>> Managing Director: Oliver Dzombic
>>
>> Tax No.: 35 236 3622 1
>> VAT ID: DE274086107
>>
>>
>> On 31.05.2016 at 13:05, Nick Fisk wrote:
>>> Hi Oliver,
>>>
>>> Thanks for this, very interesting and relevant to me at the moment as
>>> your two hardware platforms mirror exactly my existing and new cluster.
>>>
>>> Just a couple of comments inline.
>>>
>>>
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
 Of Oliver Dzombic
 Sent: 31 May 2016 11:26
 To: ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage
 capacity when using Ceph based On high-end storage systems

 Hi Nick,

 we have here running on one 

Re: [ceph-users] radosgw s3 errors after installation quickstart

2016-05-31 Thread LOPEZ Jean-Charles
Hi,

In order to use s3cmd, just make sure you have the rgw_dns_name = 
{bucket_fqdn_suffix} in your config file in the RGW section. In your case I’d 
say rgw_dns_name = xen1
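
Concretely, against the [client.radosgw.gateway] section you posted, that
would be (note that clients must also be able to resolve *.xen1 -- a plain
hosts file won't do that):

[client.radosgw.gateway]
rgw_frontends = "civetweb port=80"
host = xen1
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw_dns_name = xen1

Then restart the radosgw service so it picks the option up.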

And you should be good to go.

JC

> On May 31, 2016, at 04:51, hp cre  wrote:
> 
> Hello,
> 
> I created a test cluster of 3 OSD hosts (xen1,2,3) based on Ubuntu Xenial, 
> ceph 10.2.1 using the quick start steps in the docs master branch.
> 
> After jumping through a few problems, mainly from the inconsistent details in 
> the docs, I got a stable cluster running with RGW.
> 
> Running the s3boto test script to create a new bucket works fine. However, when I 
> use any other tool to PUT files, I get a strange error stating "host not 
> found". There is nothing in the gateway logs that would suggest why this 
> happens; I only get the list of GET requests from the client(s) I use.
> 
> sample:
> ===
> 2016-05-31 13:30:21.953366 7fb2f37be700  1 civetweb: 0x7fb32800cbc0: 10.0.0.1 
> - - [31/May/2016:13:30:21 +0200] "GET / HTTP/1.1" 200 0 - CrossFTP/1.97.6 
> (Linux/4.4.0-21-generic; amd64; en; JVM 1.8.0_91)
> 2016-05-31 13:30:22.975609 7fb2f2fbd700  1 == starting new request 
> req=0x7fb2f2fb77e0 =
> 2016-05-31 13:30:22.978613 7fb2f2fbd700  1 == req done req=0x7fb2f2fb77e0 
> op status=0 http_status=200 ==
> 2016-05-31 13:30:22.978710 7fb2f2fbd700  1 civetweb: 0x7fb330016690: 10.0.0.1 
> - - [31/May/2016:13:30:22 +0200] "GET / HTTP/1.1" 200 0 - CrossFTP/1.97.6 
> (Linux/4.4.0-21-generic; amd64; en; JVM 1.8.0_91)
> 2016-05-31 13:32:04.032800 7fb2f27bc700  1 == starting new request 
> req=0x7fb2f27b67e0 =
> 2016-05-31 13:32:04.034847 7fb2f27bc700  1 == req done req=0x7fb2f27b67e0 
> op status=0 http_status=200 ==
> 2016-05-31 13:32:04.034895 7fb2f27bc700  1 civetweb: 0x7fb32c005910: 10.0.0.1 
> - - [31/May/2016:13:32:04 +0200] "GET / HTTP/1.1" 200 0 - DragonDisk 1.05 ( 
> http://www.dragondisk.com  )
> ==
> 
> example error message when I try a PUT operation on DragonDisk client to a 
> bucket I created (myb), file called testfile
> 
> 
> 2
> Operation: Copy
> /home/wes/testfile -> http://myb.xen1/testfile 
> Host not found
> 
> 
> example error I got from using CrossFTP client to upload a file
> ==
> L2] LIST All Buckets
> [R1] LIST /myb
> [R1] LIST All Buckets (cached)
> [R1] Succeeded
> [R1] LIST All Buckets
> [R1] Succeeded
> [L2] Succeeded
>  Secure random seed initialized.
> [L2] S3 Error: -1 (null) error: Request Error: myb.xen1; XML Error Message: 
> null
> [L2] -1 (null) error: Request Error: myb.xen1; XML Error Message: null
> [R1] S3 Error: -1 (null) error: Request Error: myb.xen1: unknown error; XML 
> Error Message: null
> [R1] -1 (null) error: Request Error: myb.xen1: unknown error; XML Error 
> Message: null
> 
> 
> example put operation for file "ceph-deploy-ceph.log" using s3cmd client on 
> the gateway node (xen1)
> 
> =
> root@xen1:/home/cl# s3cmd put ceph-deploy-ceph.log s3://myb
> upload: 'ceph-deploy-ceph.log' -> 's3://myb/ceph-deploy-ceph.log'  [1 of 1]
>   0 of 641045 0% in0s 0.00 B/s  failed
> WARNING: Retrying failed request: /ceph-deploy-ceph.log ([Errno -2] Name or 
> service not known)
> WARNING: Waiting 3 sec...
> upload: 'ceph-deploy-ceph.log' -> 's3://myb/ceph-deploy-ceph.log'  [1 of 1]
>   0 of 641045 0% in0s 0.00 B/s  failed
> WARNING: Retrying failed request: /ceph-deploy-ceph.log ([Errno -2] Name or 
> service not known)
> WARNING: Waiting 6 sec...
> ===
> 
> 
> 
> Here is my ceph.conf
> =
> [global]
> fsid = 77dbb949-8eed-4eea-b0ff-0c612e7e2991
> mon_initial_members = xen1, xen2, xen3
> mon_host = 10.0.0.10,10.0.0.11,10.0.0.12
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> osd_pool_default_size = 2
> 
> 
> [client.radosgw.gateway]
> rgw_frontends = "civetweb port=80"
> host = xen1
> keyring = /etc/ceph/ceph.client.radosgw.keyring
> rgw socket path = /var/run/ceph/ceph.radosgw.gateway.sock
> rgw print continue = false
> =
> 
> Any troubleshooting help will be appreciated.
> 
> Thanks,
> hpcre



Re: [ceph-users] radosgw s3 errors after installation quickstart

2016-05-31 Thread Austin Johnson
Check that DNS is set up correctly to allow http://bucket.host.tld/ instead of 
http://host.tld/bucket/. Or change the operation of your client to conform to 
the latter.

There are also a few config parameters that will need to be added to ceph.conf 
to allow dns bucket resolution.
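
On the client side, s3cmd for example derives its bucket URLs from ~/.s3cfg;
pointing it at the gateway would look roughly like this (assuming DNS for
*.xen1 is in place):

# ~/.s3cfg (excerpt)
host_base = xen1
host_bucket = %(bucket)s.xen1
use_https = False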

Austin

Sent from my iPhone

> On May 31, 2016, at 5:51 AM, hp cre  wrote:
> 
> Hello,
> 
> I created a test cluster of 3 OSD hosts (xen1,2,3) based on Ubuntu Xenial, 
> ceph 10.2.1 using the quick start steps in the docs master branch.
> 
> After jumping through a few problems, mainly from the inconsistent details in 
> the docs, I got a stable cluster running with RGW.
> 
> Running the s3boto test script to create a new bucket works fine. However, when I 
> use any other tool to PUT files, I get a strange error stating "host not 
> found". There is nothing in the gateway logs that would suggest why this 
> happens; I only get the list of GET requests from the client(s) I use.
> 
> sample:
> ===
> 2016-05-31 13:30:21.953366 7fb2f37be700  1 civetweb: 0x7fb32800cbc0: 10.0.0.1 
> - - [31/May/2016:13:30:21 +0200] "GET / HTTP/1.1" 200 0 - CrossFTP/1.97.6 
> (Linux/4.4.0-21-generic; amd64; en; JVM 1.8.0_91)
> 2016-05-31 13:30:22.975609 7fb2f2fbd700  1 == starting new request 
> req=0x7fb2f2fb77e0 =
> 2016-05-31 13:30:22.978613 7fb2f2fbd700  1 == req done req=0x7fb2f2fb77e0 
> op status=0 http_status=200 ==
> 2016-05-31 13:30:22.978710 7fb2f2fbd700  1 civetweb: 0x7fb330016690: 10.0.0.1 
> - - [31/May/2016:13:30:22 +0200] "GET / HTTP/1.1" 200 0 - CrossFTP/1.97.6 
> (Linux/4.4.0-21-generic; amd64; en; JVM 1.8.0_91)
> 2016-05-31 13:32:04.032800 7fb2f27bc700  1 == starting new request 
> req=0x7fb2f27b67e0 =
> 2016-05-31 13:32:04.034847 7fb2f27bc700  1 == req done req=0x7fb2f27b67e0 
> op status=0 http_status=200 ==
> 2016-05-31 13:32:04.034895 7fb2f27bc700  1 civetweb: 0x7fb32c005910: 10.0.0.1 
> - - [31/May/2016:13:32:04 +0200] "GET / HTTP/1.1" 200 0 - DragonDisk 1.05 ( 
> http://www.dragondisk.com )
> ==
> 
> example error message when I try a PUT operation on DragonDisk client to a 
> bucket I created (myb), file called testfile
> 
> 
> 2
> Operation: Copy
> /home/wes/testfile -> http://myb.xen1/testfile
> Host not found
> 
> 
> example error I got from using CrossFTP client to upload a file
> ==
> L2] LIST All Buckets
> [R1] LIST /myb
> [R1] LIST All Buckets (cached)
> [R1] Succeeded
> [R1] LIST All Buckets
> [R1] Succeeded
> [L2] Succeeded
>  Secure random seed initialized.
> [L2] S3 Error: -1 (null) error: Request Error: myb.xen1; XML Error Message: 
> null
> [L2] -1 (null) error: Request Error: myb.xen1; XML Error Message: null
> [R1] S3 Error: -1 (null) error: Request Error: myb.xen1: unknown error; XML 
> Error Message: null
> [R1] -1 (null) error: Request Error: myb.xen1: unknown error; XML Error 
> Message: null
> 
> 
> example put operation for file "ceph-deploy-ceph.log" using s3cmd client on 
> the gateway node (xen1)
> 
> =
> root@xen1:/home/cl# s3cmd put ceph-deploy-ceph.log s3://myb
> upload: 'ceph-deploy-ceph.log' -> 's3://myb/ceph-deploy-ceph.log'  [1 of 1]
>   0 of 641045 0% in0s 0.00 B/s  failed
> WARNING: Retrying failed request: /ceph-deploy-ceph.log ([Errno -2] Name or 
> service not known)
> WARNING: Waiting 3 sec...
> upload: 'ceph-deploy-ceph.log' -> 's3://myb/ceph-deploy-ceph.log'  [1 of 1]
>   0 of 641045 0% in0s 0.00 B/s  failed
> WARNING: Retrying failed request: /ceph-deploy-ceph.log ([Errno -2] Name or 
> service not known)
> WARNING: Waiting 6 sec...
> ===
> 
> 
> 
> Here is my ceph.conf
> =
> [global]
> fsid = 77dbb949-8eed-4eea-b0ff-0c612e7e2991
> mon_initial_members = xen1, xen2, xen3
> mon_host = 10.0.0.10,10.0.0.11,10.0.0.12
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> osd_pool_default_size = 2
> 
> 
> [client.radosgw.gateway]
> rgw_frontends = "civetweb port=80"
> host = xen1
> keyring = /etc/ceph/ceph.client.radosgw.keyring
> rgw socket path = /var/run/ceph/ceph.radosgw.gateway.sock
> rgw print continue = false
> =
> 
> Any troubleshooting help will be appreciated.
> 
> Thanks,
> hpcre

Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage capacity when using Ceph based On high-end storage systems

2016-05-31 Thread Nick Fisk
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Oliver Dzombic
> Sent: 31 May 2016 12:51
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage
> capacity when using Ceph based On high-end storage systems
> 
> Hi Nick,
> 
> well as it seems, you have a point.
> 
> I have now started processes to make 3 of 4 cores 100% busy.
> 
> The %wa was dropping down to 2.5-12% without scrubbing, but also ~ 0%
> idle time.

Thanks for testing that out. I was getting worried I had ordered the wrong
kit!!!

> 
> While it's ~ 20-70 %wa without those 3 of 4 100% busy cores, with
> ~ 40-50% idle time.
> 
> ---
> 
> That means that %wa does not harm the CPU (considerably).
> 
> I don't dare to start a new scrubbing again now at daytime.

Are you possibly getting poor performance during scrubs due to disk
contention rather than cpu?

> 
> So these numbers are missing right now.
> 
> In the very end it means that you should be fine with your 12x HDD.
> 
> So in the very end: the more cores you have, the more GHz in sum you get
> 
> ( 4x 4 GHz E3 vs. 12-24x 2.4 GHz E5 )
> 
> and this way, the more OSDs you can run without the CPU being the
> bottleneck.
> 
> While higher frequency == higher per-task performance ( writes above all ).

Yes, that was my plan: get good write latency, but still be able to scale
easily/cheaply by using the E3s. Also keep in mind that with too many cores,
they will start scaling down their frequency to save power if they are not
kept busy. If a process gets assigned to a sleeping, clocked-down core, it
takes a while for it to boost back up. I found this could cause a 10-30% hit
in performance. 

FYI, here are the tests I ran last year
http://www.sys-pro.co.uk/ceph-storage-fast-cpus-ssd-performance/

I actually found I could get better than 1600 IOPS if I ran a couple of CPU
stress tests to keep the cores running at turbo speed. I also tested with
the journal on a RAM disk to eliminate the SSD speed from the equation and
managed to get near 2500 IOPS, which is pretty good for QD=1.
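
For anyone wanting to reproduce the QD=1 test, a rough sketch using fio's
rbd engine (pool/image names are placeholders; create the image first):

# rbd create fio_test --size 1024
fio --name=qd1-write \
    --ioengine=rbd --clientname=admin --pool=rbd --rbdname=fio_test \
    --rw=randwrite --bs=4k --iodepth=1 --numjobs=1 \
    --direct=1 --runtime=60 --time_based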

> 
> 
> --
> Mit freundlichen Gruessen / Best regards
> 
> Oliver Dzombic
> IP-Interactive
> 
> mailto:i...@ip-interactive.de
> 
> Address:
> 
> IP Interactive UG ( haftungsbeschraenkt ) Zum Sonnenberg 1-3
> 63571 Gelnhausen
> 
> HRB 93402, Hanau District Court
> Managing Director: Oliver Dzombic
> 
> Tax No.: 35 236 3622 1
> VAT ID: DE274086107
> 
> 
> On 31.05.2016 at 13:05, Nick Fisk wrote:
> > Hi Oliver,
> >
> > Thanks for this, very interesting and relevant to me at the moment as
> > your two hardware platforms mirror exactly my existing and new cluster.
> >
> > Just a couple of comments inline.
> >
> >
> >> -Original Message-
> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> >> Of Oliver Dzombic
> >> Sent: 31 May 2016 11:26
> >> To: ceph-users@lists.ceph.com
> >> Subject: Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage
> >> capacity when using Ceph based On high-end storage systems
> >>
> >> Hi Nick,
> >>
> >> we have here running on one node:
> >>
> >> E3-1225v5 ( 4x 3,3 GHz ), 32 GB RAM ( newceph1 )
> >>
> >> looks like this: http://pastebin.com/btVpeJrE
> >>
> >>
> >> and we have here running on another node:
> >>
> >>
> >> 2x E5-2620v3 ( 12x 2,4 GHz + HT units ), 64 GB RAM ( new-ceph2 )
> >>
> >> looks like this: http://pastebin.com/S6XYbwzw
> >>
> >>
> >>
> >> The corresponding ceph tree
> >>
> >> looks like this: http://pastebin.com/evRqwNT2
> >>
> >> That all running a replication of 2.
> >>
> >> ---
> >>
> >> So as we can see, we run the same number ( 10 ) of HDDs.
> >> Model is 3 TB 7200 RPM's 128 MB Cache on this two nodes.
> >>
> >> An ceph osd perf looks like: http://pastebin.com/4d0Xik0m
> >>
> >>
> >> What you see right now is normal, everyday, load on a healthy cluster.
> >>
> >> So now, because we are mean, let's turn on deep scrubbing in the
> >> middle of the day ( to give the people a reason to take a coffee
> >> break now at 12:00 CET ).
> >>
> >> 
> >>
> >> So now for the E3: http://pastebin.com/ZagKnhBQ
> >>
> >> And for the E5: http://pastebin.com/2J4zqqNW
> >>
> >> And again our osd perf: http://pastebin.com/V6pKGp9u
> >>
> >>
> >> ---
> >>
> >>
> >> So my conclusion out of that is that the E3 CPU becomes overloaded
> >> faster ( 4 cores at load 12/13 ) vs. ( 24 vCores at load 18/19 ).
> >>
> >> To me, even though I cannot really measure it, and even though osd perf
> >> shows a lower latency on the E3 OSDs compared to the E5 OSDs, I can see
> >> from the E3 CPU stats that it frequently runs into 0% idle because the
> >> CPU has to wait for the HDDs ( %wa ). And because the core has to wait
> >> for the hardware, its CPU power will not be used for something else
> >> while it is in waiting state.
> >
> > Not sure if this was visible in the paste dump, but what is the run
> > queue for both systems? When I 

Re: [ceph-users] ceph pg status problem

2016-05-31 Thread Michael Hackett
Hello,

Check your CRUSH map and verify what the failure domain for your CRUSH rule
is set to (for example OSD or host). You need to verify that your failure
domain can satisfy your pool replication value. You may need to decrease
your pool replication value or modify your CRUSH map.
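
For example (I'm guessing the pool is the default "rbd" pool, since the pg
IDs are all 0.x -- adjust to your pool name):

# what replication the pool wants
ceph osd pool get rbd size
# what the rule actually distributes across, and what hosts/OSDs exist
ceph osd crush rule dump
ceph osd tree
# to modify the map itself: dump, decompile, edit the rule's
# "chooseleaf ... type host" step (or add hosts), then recompile
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt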

Thank you,

Mike Hackett
Sr Software Maintenance Engineer
Red Hat Ceph Storage


On Tue, May 31, 2016 at 7:43 AM, Patrick McGarry 
wrote:

> Moving this to ceph-user list where it belongs.
> On May 31, 2016 6:03 AM, "Liu Lan(上海_技术部_基础平台_运维部_刘览)" 
> wrote:
>
>> Hi team,
>>
>>
>>
>> I’m new to using Ceph. Now I have encountered the following ceph
>> health problem about pg status:
>>
>>
>>
>> I created 256 pgs and the status is stuck at undersized, and I don’t know
>> what that means or how to resolve it. Please help me to check. Thanks.
>>
>>
>>
>> # ceph health detail
>>
>> HEALTH_ERR 256 pgs are stuck inactive for more than 300 seconds; 255 pgs
>> degraded; 256 pgs stuck inactive; 255 pgs undersized
>>
>> pg 0.c5 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [9]
>>
>> pg 0.c4 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [13]
>>
>> pg 0.c3 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [11]
>>
>> pg 0.c2 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [11]
>>
>> pg 0.c1 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [11]
>>
>> pg 0.c0 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [13]
>>
>> pg 0.bf is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [9]
>>
>> pg 0.be is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [9]
>>
>> pg 0.bd is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [8]
>>
>> pg 0.bc is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [7]
>>
>> pg 0.bb is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [11]
>>
>> pg 0.ba is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [11]
>>
>> pg 0.b9 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [7]
>>
>> pg 0.b8 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [11]
>>
>> pg 0.b7 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [9]
>>
>> pg 0.b6 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [12]
>>
>> pg 0.b5 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [9]
>>
>> pg 0.b4 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [12]
>>
>> pg 0.b3 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [13]
>>
>> pg 0.b2 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [12]
>>
>> pg 0.b1 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [8]
>>
>> pg 0.b0 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [13]
>>
>> pg 0.af is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [12]
>>
>> pg 0.ae is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [11]
>>
>> pg 0.ad is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [13]
>>
>> pg 0.ac is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [7]
>>
>> pg 0.ab is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [13]
>>
>> pg 0.aa is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [9]
>>
>> pg 0.a9 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [12]
>>
>> pg 0.a8 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [8]
>>
>> pg 0.a7 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [12]
>>
>> pg 0.a6 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [9]
>>
>> pg 0.a5 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [8]
>>
>> pg 0.a4 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [8]
>>
>> pg 0.a3 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [8]
>>
>> pg 0.a2 is stuck inactive since forever, current state
>> undersized+degraded+peered, last acting [7]
>>
>> pg 0.a1 is stuck inactive since forever, 

Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage capacity when using Ceph based On high-end storage systems

2016-05-31 Thread Nick Fisk
Hi Oliver,

Thanks for this, very interesting and relevant to me at the moment as your
two hardware platforms mirror exactly my existing and new cluster. 

Just a couple of comments inline.


> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Oliver Dzombic
> Sent: 31 May 2016 11:26
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage
> capacity when using Ceph based On high-end storage systems
> 
> Hi Nick,
> 
> we have here running on one node:
> 
> E3-1225v5 ( 4x 3,3 GHz ), 32 GB RAM ( newceph1 )
> 
> looks like this: http://pastebin.com/btVpeJrE
> 
> 
> and we have here running on another node:
> 
> 
> 2x E5-2620v3 ( 12x 2,4 GHz + HT units ), 64 GB RAM ( new-ceph2 )
> 
> looks like this: http://pastebin.com/S6XYbwzw
> 
> 
> 
> The corresponding ceph tree
> 
> looks like this: http://pastebin.com/evRqwNT2
> 
> That all running a replication of 2.
> 
> ---
> 
> So as we can see, we run the same number ( 10 ) of HDDs.
> Model is 3 TB 7200 RPM's 128 MB Cache on this two nodes.
> 
> An ceph osd perf looks like: http://pastebin.com/4d0Xik0m
> 
> 
> What you see right now is normal, everyday, load on a healthy cluster.
> 
> So now, because we are mean, let's turn on deep scrubbing in the middle of
> the day ( to give the people a reason to take a coffee break now at
> 12:00 CET ).
> 
> 
> 
> So now for the E3: http://pastebin.com/ZagKnhBQ
> 
> And for the E5: http://pastebin.com/2J4zqqNW
> 
> And again our osd perf: http://pastebin.com/V6pKGp9u
> 
> 
> ---
> 
> 
> So my conclusion out of that is that the E3 CPU becomes overloaded faster
> ( 4 cores at load 12/13 ) vs. ( 24 vCores at load 18/19 ).
> 
> To me, even though I cannot really measure it, and even though osd perf
> shows a lower latency on the E3 OSDs compared to the E5 OSDs, I can see
> from the E3 CPU stats that it frequently runs into 0% idle because the
> CPU has to wait for the HDDs ( %wa ). And because the core has to wait for
> the hardware, its CPU power will not be used for something else while it
> is in waiting state.

Not sure if this was visible in the paste dump, but what is the run queue
for both systems? When I looked into this a while back, I thought load
included IOWait in its calculation, but IOWait itself didn't stop another
thread getting run on the CPU if needed. Effectively IOWait = IDLE (if
needed). From what I understand the run queue on the CPU dictates whether or
not there are too many threads queuing to run and thus slow performance. So
I think your example for the E3 shows that there is still around 66% of the
CPU available for processing. As a test, could you try running something
like "stress" to consume CPU cycles and see if the IOWait drops?

> 
> So even though the E3 CPU gets the job done faster, the HDDs are usually
> too slow and are the bottleneck. So the E3 cannot take real advantage of
> its higher power per core. And because it has a low number of cores, the
> number of "waiting state" cores quickly becomes as big as the total number
> of CPU cores.

I did some testing last year by scaling the CPU frequency and measuring
write latency. If you are using SSD journals, then I found the frequency
makes a massive difference to small write IO's. If you are doing mainly
reads with HDD's, then faster Cores probably won't do much.

> 
> The result is that there is an overload of the system, and we are running
> in an evil nightmare of IOPS.

Are you actually seeing problems with the cluster? I would be interested to
hear what you are encountering.

> 
> But again: I cannot really measure it. I cannot see which HDD delivers
> which data and how fast.
> 
> So maybe the E5 is slowing down the whole thing. Maybe not.
> 
> But for me, the probability that a 4-core system with 0% idle left at a
> 12/13 system load is "guilty" is higher than for a 24-vCore system with
> still ~50% idle time and an 18/19 system load.
> 
> But, of course, I have to admit that because of the 32 GB RAM vs. 64 GB
> RAM, the comparison might be more like apples and oranges. Maybe with
> similar RAM, the systems will perform similarly.

I'm sticking 64GB in my E3 servers to be on the safe side.

> 
> But you can judge the stats yourself, and maybe gain some knowledge from
> it :-)
> 
> For us, what we will do here next, now that Jewel is out, is build up a
> new cluster with:
> 
> 2x E5-2620v3, 128 GB RAM, HBA -> JBOD configuration, while we will add an
> SSD cache tier. So right now, I still believe that with the E3, because of
> the limited number of cores, you are more limited in the maximum number of
> OSDs you can run with it.

If you do need more cores, I think a better solution might be an 8 or 10 core
single CPU. There seems to be a lot of evidence that sticking with a single
socket is best for ceph if you can.

> 
> Maybe with your E3, your 12 HDDs ( depending on/especially if you have
> (SSD) cache tier in between ) will run 

[ceph-users] radosgw s3 errors after installation quickstart

2016-05-31 Thread hp cre
Hello,

I created a test cluster of 3 OSD hosts (xen1,2,3) based on Ubuntu Xenial,
ceph 10.2.1 using the quick start steps in the docs master branch.

After jumping through a few problems, mainly from the inconsistent details
in the docs, I got a stable cluster running with RGW.

Running the s3boto test script to create a new bucket works fine. However, when
I use any other tool to PUT files, I get a strange error stating "host not
found". There is nothing in the gateway logs that would suggest why this
happens; I only get the list of GET requests from the client(s) I use.
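
(If the s3boto script follows the usual Ceph docs example, it forces
path-style requests via OrdinaryCallingFormat, which sidesteps the
bucket.host lookup entirely -- a minimal sketch along those lines, keys
are placeholders:)

import boto
import boto.s3.connection

# path-style requests: GET /myb/... instead of resolving myb.xen1
conn = boto.connect_s3(
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
    host='xen1', port=80, is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)
bucket = conn.get_bucket('myb')
key = bucket.new_key('testfile')
key.set_contents_from_filename('/home/wes/testfile')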

sample:
===
2016-05-31 13:30:21.953366 7fb2f37be700  1 civetweb: 0x7fb32800cbc0:
10.0.0.1 - - [31/May/2016:13:30:21 +0200] "GET / HTTP/1.1" 200 0 -
CrossFTP/1.97.6 (Linux/4.4.0-21-generic; amd64; en; JVM 1.8.0_91)
2016-05-31 13:30:22.975609 7fb2f2fbd700  1 == starting new request
req=0x7fb2f2fb77e0 =
2016-05-31 13:30:22.978613 7fb2f2fbd700  1 == req done
req=0x7fb2f2fb77e0 op status=0 http_status=200 ==
2016-05-31 13:30:22.978710 7fb2f2fbd700  1 civetweb: 0x7fb330016690:
10.0.0.1 - - [31/May/2016:13:30:22 +0200] "GET / HTTP/1.1" 200 0 -
CrossFTP/1.97.6 (Linux/4.4.0-21-generic; amd64; en; JVM 1.8.0_91)
2016-05-31 13:32:04.032800 7fb2f27bc700  1 == starting new request
req=0x7fb2f27b67e0 =
2016-05-31 13:32:04.034847 7fb2f27bc700  1 == req done
req=0x7fb2f27b67e0 op status=0 http_status=200 ==
2016-05-31 13:32:04.034895 7fb2f27bc700  1 civetweb: 0x7fb32c005910:
10.0.0.1 - - [31/May/2016:13:32:04 +0200] "GET / HTTP/1.1" 200 0 -
DragonDisk 1.05 ( http://www.dragondisk.com )
==

example error message when I try a PUT operation on DragonDisk client to a
bucket I created (myb), file called testfile



2

Operation: Copy

/home/wes/testfile -> http://myb.xen1/testfile

Host not found



example error I got from using CrossFTP client to upload a file
==
L2] LIST All Buckets
[R1] LIST /myb
[R1] LIST All Buckets (cached)
[R1] Succeeded
[R1] LIST All Buckets
[R1] Succeeded
[L2] Succeeded
 Secure random seed initialized.
[L2] S3 Error: -1 (null) error: Request Error: myb.xen1; XML Error Message:
null
[L2] -1 (null) error: Request Error: myb.xen1; XML Error Message: null
[R1] S3 Error: -1 (null) error: Request Error: myb.xen1: unknown error; XML
Error Message: null
[R1] -1 (null) error: Request Error: myb.xen1: unknown error; XML Error
Message: null


example put operation for file "ceph-deploy-ceph.log" using s3cmd client on
the gateway node (xen1)

=
root@xen1:/home/cl# s3cmd put ceph-deploy-ceph.log s3://myb
upload: 'ceph-deploy-ceph.log' -> 's3://myb/ceph-deploy-ceph.log'  [1 of 1]
  0 of 641045 0% in0s 0.00 B/s  failed
WARNING: Retrying failed request: /ceph-deploy-ceph.log ([Errno -2] Name or
service not known)
WARNING: Waiting 3 sec...
upload: 'ceph-deploy-ceph.log' -> 's3://myb/ceph-deploy-ceph.log'  [1 of 1]
  0 of 641045 0% in0s 0.00 B/s  failed
WARNING: Retrying failed request: /ceph-deploy-ceph.log ([Errno -2] Name or
service not known)
WARNING: Waiting 6 sec...
===



Here is my ceph.conf
=
[global]
fsid = 77dbb949-8eed-4eea-b0ff-0c612e7e2991
mon_initial_members = xen1, xen2, xen3
mon_host = 10.0.0.10,10.0.0.11,10.0.0.12
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd_pool_default_size = 2


[client.radosgw.gateway]
rgw_frontends = "civetweb port=80"
host = xen1
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = /var/run/ceph/ceph.radosgw.gateway.sock
rgw print continue = false
=

Any troubleshooting help will be appreciated.

Thanks,
hpcre


Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage capacity when using Ceph based On high-end storage systems

2016-05-31 Thread Oliver Dzombic
Hi Nick,

well as it seems, you have a point.

I have now started processes to make 3 of 4 cores 100% busy.

The %wa was dropping down to 2.5-12% without scrubbing,
but also ~ 0% idle time.

While it's ~ 20-70 %wa without those 3 of 4 100% busy cores,
with ~ 40-50% idle time.

---

That means that %wa does not harm the CPU (considerably).

I don't dare to start a new scrubbing again now at daytime.

So these numbers are missing right now.

In the very end it means that you should be fine with your 12x HDD.

So in the very end: the more cores you have, the more GHz in sum you get

( 4x 4 GHz E3 vs. 12-24x 2.4 GHz E5 )

and this way, the more OSDs you can run without the CPU being the
bottleneck.

While higher frequency == higher per-task performance ( writes above all ).


-- 
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
IP-Interactive

mailto:i...@ip-interactive.de

Address:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402, Hanau District Court
Managing Director: Oliver Dzombic

Tax No.: 35 236 3622 1
VAT ID: DE274086107


On 31.05.2016 at 13:05, Nick Fisk wrote:
> Hi Oliver,
> 
> Thanks for this, very interesting and relevant to me at the moment as your
> two hardware platforms mirror exactly my existing and new cluster. 
> 
> Just a couple of comments inline.
> 
> 
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Oliver Dzombic
>> Sent: 31 May 2016 11:26
>> To: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage
>> capacity when using Ceph based On high-end storage systems
>>
>> Hi Nick,
>>
>> we have here running on one node:
>>
>> E3-1225v5 ( 4x 3,3 GHz ), 32 GB RAM ( newceph1 )
>>
>> looks like this: http://pastebin.com/btVpeJrE
>>
>>
>> and we have here running on another node:
>>
>>
>> 2x E5-2620v3 ( 12x 2,4 GHz + HT units ), 64 GB RAM ( new-ceph2 )
>>
>> looks like this: http://pastebin.com/S6XYbwzw
>>
>>
>>
>> The corresponding ceph tree
>>
>> looks like this: http://pastebin.com/evRqwNT2
>>
>> That all running a replication of 2.
>>
>> ---
>>
>> So as we can see, we run the same number ( 10 ) of HDDs.
>> Model is 3 TB 7200 RPM's 128 MB Cache on this two nodes.
>>
>> An ceph osd perf looks like: http://pastebin.com/4d0Xik0m
>>
>>
>> What you see right now is normal, everyday, load on a healthy cluster.
>>
>> So now, because we are mean, let's turn on deep scrubbing in the middle of
>> the day ( to give the people a reason to take a coffee break now at
>> 12:00 CET ).
>>
>> 
>>
>> So now for the E3: http://pastebin.com/ZagKnhBQ
>>
>> And for the E5: http://pastebin.com/2J4zqqNW
>>
>> And again our osd perf: http://pastebin.com/V6pKGp9u
>>
>>
>> ---
>>
>>
>> So my conclusion out of that is that the E3 CPU becomes overloaded faster
>> ( 4 cores at load 12/13 ) vs. ( 24 vCores at load 18/19 ).
>>
>> To me, even though I cannot really measure it, and even though osd perf
>> shows a lower latency on the E3 OSDs compared to the E5 OSDs, I can see
>> from the E3 CPU stats that it frequently runs into 0% idle because the
>> CPU has to wait for the HDDs ( %wa ). And because the core has to wait for
>> the hardware, its CPU power will not be used for something else while it
>> is in waiting state.
> 
> Not sure if this was visible in the paste dump, but what is the run queue
> for both systems? When I looked into this a while back, I thought load
> included IOWait in its calculation, but IOWait itself didn't stop another
> thread getting run on the CPU if needed. Effectively IOWait = IDLE (if
> needed). From what I understand the run queue on the CPU dictates whether or
> not there are too many threads queuing to run and thus slow performance. So
> I think your example for the E3 shows that there is still around 66% of the
> CPU available for processing. As a test, could you try running something
> like "stress" to consume CPU cycles and see if the IOWait drops?
> 
>>
>> So even though the E3 CPU gets the job done faster, the HDDs are usually
>> too slow and are the bottleneck. So the E3 cannot take real advantage of
>> its higher power per core. And because it has a low number of cores, the
>> number of "waiting state" cores quickly becomes as big as the total number
>> of CPU cores.
> 
> I did some testing last year by scaling the CPU frequency and measuring
> write latency. If you are using SSD journals, then I found the frequency
> makes a massive difference to small write IO's. If you are doing mainly
> reads with HDD's, then faster Cores probably won't do much.
> 
>>
>> The result is that there is an overload of the system, and we are running
>> in an evil nightmare of IOPS.
> 
> Are you actually seeing problems with the cluster? I would be interested to
> hear what you are encountering.
> 
>>
>> But again: I cannot really measure it. I cannot see which HDD delivers
>> which data and how fast.
>>
>> So maybe the E5 is slowing down the 

Re: [ceph-users] ceph pg status problem

2016-05-31 Thread Patrick McGarry
Moving this to the ceph-users list, where it belongs.
On May 31, 2016 6:03 AM, "Liu Lan(上海_技术部_基础平台_运维部_刘览)" 
wrote:

> Hi team,
>
>
>
> I’m new to using Ceph. Now I have encountered the following ceph
> health problem about pg status:
>
>
>
> I created 256 pgs and the status is stuck at undersized, and I don’t know
> what that means or how to resolve it. Please help me to check. Thanks.
>
>
>
> # ceph health detail
>
> HEALTH_ERR 256 pgs are stuck inactive for more than 300 seconds; 255 pgs
> degraded; 256 pgs stuck inactive; 255 pgs undersized
>
> pg 0.c5 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [9]
>
> pg 0.c4 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [13]
>
> pg 0.c3 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [11]
>
> pg 0.c2 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [11]
>
> pg 0.c1 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [11]
>
> pg 0.c0 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [13]
>
> pg 0.bf is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [9]
>
> pg 0.be is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [9]
>
> pg 0.bd is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [8]
>
> pg 0.bc is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [7]
>
> pg 0.bb is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [11]
>
> pg 0.ba is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [11]
>
> pg 0.b9 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [7]
>
> pg 0.b8 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [11]
>
> pg 0.b7 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [9]
>
> pg 0.b6 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [12]
>
> pg 0.b5 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [9]
>
> pg 0.b4 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [12]
>
> pg 0.b3 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [13]
>
> pg 0.b2 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [12]
>
> pg 0.b1 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [8]
>
> pg 0.b0 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [13]
>
> pg 0.af is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [12]
>
> pg 0.ae is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [11]
>
> pg 0.ad is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [13]
>
> pg 0.ac is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [7]
>
> pg 0.ab is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [13]
>
> pg 0.aa is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [9]
>
> pg 0.a9 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [12]
>
> pg 0.a8 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [8]
>
> pg 0.a7 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [12]
>
> pg 0.a6 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [9]
>
> pg 0.a5 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [8]
>
> pg 0.a4 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [8]
>
> pg 0.a3 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [8]
>
> pg 0.a2 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [7]
>
> pg 0.a1 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [12]
>
> pg 0.a0 is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [12]
>
> pg 0.9f is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [7]
>
> pg 0.9e is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [7]
>
> pg 0.9d is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [8]
>
> pg 0.9c is stuck inactive since forever, current state
> undersized+degraded+peered, last acting [9]

Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage capacity when using Ceph based On high-end storage systems

2016-05-31 Thread Nick Fisk
Hi Oliver,

Thanks for this, very interesting and relevant to me at the moment, as your
two hardware platforms exactly mirror my existing and new clusters.

Just a couple of comments inline.


> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Oliver Dzombic
> Sent: 31 May 2016 11:26
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage
> capacity when using Ceph based On high-end storage systems
> 
> Hi Nick,
> 
> we have here running on one node:
> 
> E3-1225v5 ( 4x 3.3 GHz ), 32 GB RAM ( newceph1 )
> 
> looks like this: http://pastebin.com/btVpeJrE
> 
> 
> and we have here running on another node:
> 
> 
> 2x E5-2620v3 ( 12x 2.4 GHz + HT units ), 64 GB RAM ( new-ceph2 )
> 
> looks like this: http://pastebin.com/S6XYbwzw
> 
> 
> 
> The corresponding ceph tree
> 
> looks like this: http://pastebin.com/evRqwNT2
> 
> That is all running with a replication of 2.
> 
> ---
> 
> So as we can see, we run the same number ( 10 ) of HDDs on these two
> nodes; the model is a 3 TB, 7200 RPM drive with 128 MB cache.
> 
> A ceph osd perf looks like: http://pastebin.com/4d0Xik0m
> 
> 
> What you see right now is normal, everyday load on a healthy cluster.
> 
> So now, because we are mean, let's turn on deep scrubbing in the middle of
> the day ( to give the people a reason to take a coffee break now at
> 12:00 CET ).
> 
> 
> 
> So now for the E3: http://pastebin.com/ZagKnhBQ
> 
> And for the E5: http://pastebin.com/2J4zqqNW
> 
> And again our osd perf: http://pastebin.com/V6pKGp9u
> 
> 
> ---
> 
> 
> So my conclusion from that is that the E3 CPU becomes overloaded faster
> ( 4 cores at load 12/13 ) vs. ( 24 vCores at load 18/19 ).
> 
> Even though I cannot really measure it, and even though osd perf shows a
> lower latency on the E3 OSDs compared to the E5 OSDs, I can see from the
> E3 CPU stats that it frequently runs into 0% idle because the CPU has to
> wait for the HDDs ( %WA ). And because the core has to wait for the
> hardware, its CPU power will not be used for anything else because it is
> in the waiting state.

Not sure if this was visible in the paste dump, but what is the run queue
for both systems? When I looked into this a while back, I thought load
included IOWait in its calculation, but IOWait itself didn't stop another
thread getting run on the CPU if needed. Effectively IOWait = IDLE (if
needed). From what I understand, the run queue on the CPU dictates whether
or not there are too many threads queuing to run and thus slow performance.
So I think your example for the E3 shows that there is still around 66% of
the CPU available for processing. As a test, could you try running
something like "stress" to consume CPU cycles and see if the IOWait drops?

> 
> So even though the E3 CPU gets the job done faster, the HDDs are usually
> too slow and become the bottleneck. So the E3 cannot take real advantage
> of its higher power per core. And because it has a low number of cores,
> the number of cores in the waiting state quickly grows as big as the
> total number of CPU cores.

I did some testing last year by scaling the CPU frequency and measuring
write latency. If you are using SSD journals, then I found the frequency
makes a massive difference to small write IOs. If you are doing mainly
reads with HDDs, then faster cores probably won't do much.
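
That test is easy to reproduce, roughly like this (a sketch; assumes the
cpupower tool, a throwaway pool named "test", and a frequency governor
that honours the cap):

    cpupower frequency-set -u 1.2GHz              # cap the clock low
    rados -p test bench 30 write -b 4096 -t 1     # single-threaded 4k writes
    cpupower frequency-set -u 3.4GHz              # lift the cap and repeat
    rados -p test bench 30 write -b 4096 -t 1     # compare the average latency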

> 
> The result is that the system is overloaded, and we are running into an
> evil nightmare of IOPS.

Are you actually seeing problems with the cluster? I would be interested
to hear what you are encountering.

> 
> But again: I cannot really measure it. I cannot see which HDD delivers
> which data and how fast.
> 
> So maybe the E5 is slowing the whole thing down. Maybe not.
> 
> But for me, the probability that a 4-core system with 0% idle left at a
> system load of 12/13 is "guilty" is higher than for a 24-vCore system
> with still ~50% idle time at a system load of 18/19.
> 
> But, of course, I have to admit that because of the 32 GB RAM vs. 64 GB
> RAM, the comparison might be more apples and oranges. Maybe with similar
> RAM, the systems would perform similarly.

I'm sticking 64GB in my E3 servers to be on the safe side.

> 
> But you can judge the stats yourself, and maybe gain some knowledge from
> it :-)
> 
> As for us, now that Jewel is out, the next thing we will do here is
> build up a new cluster with:
> 
> 2x E5-2620v3, 128 GB RAM, and HBAs in a JBOD configuration, to which we
> will add an SSD cache tier. So right now I still believe that with the
> E3, because of its limited number of cores, you are more limited in the
> maximum number of OSDs you can run.

If you do need more cores, I think a better solution might be an 8 or 10
core single CPU. There seems to be a lot of evidence that sticking with a
single socket is best for Ceph if you can.

> 
> Maybe with your E3, your 12 HDDs ( depending on / especially if you have
> an (SSD) cache tier in between ) will run 

Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage capacity when using Ceph based On high-end storage systems

2016-05-31 Thread Oliver Dzombic
Hi Nick,

we have here running on one node:

E3-1225v5 ( 4x 3.3 GHz ), 32 GB RAM ( newceph1 )

looks like this: http://pastebin.com/btVpeJrE


and we have here running on another node:


2x E5-2620v3 ( 12x 2.4 GHz + HT units ), 64 GB RAM ( new-ceph2 )

looks like this: http://pastebin.com/S6XYbwzw



The corresponding ceph tree

looks like this: http://pastebin.com/evRqwNT2

That is all running with a replication of 2.

---

So as we can see, we run the same number ( 10 ) of HDDs on these two
nodes; the model is a 3 TB, 7200 RPM drive with 128 MB cache.

A ceph osd perf looks like: http://pastebin.com/4d0Xik0m


What you see right now is normal, everyday load on a healthy cluster.

So now, because we are mean, let's turn on deep scrubbing in the middle
of the day ( to give the people a reason to take a coffee break now at
12:00 CET ).
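
( For reference, a deep scrub can also be forced by hand; a small sketch,
where the osd and pg ids are just examples:

    ceph osd deep-scrub 0    # deep-scrub every PG whose primary is osd.0
    ceph pg deep-scrub 0.1   # or just a single placement group
)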



So now for the E3: http://pastebin.com/ZagKnhBQ

And for the E5: http://pastebin.com/2J4zqqNW

And again our osd perf: http://pastebin.com/V6pKGp9u


---


So my conclusion from that is that the E3 CPU becomes overloaded faster
( 4 cores at load 12/13 ) vs. ( 24 vCores at load 18/19 ).

Even though I cannot really measure it, and even though osd perf shows a
lower latency on the E3 OSDs compared to the E5 OSDs, I can see from the
E3 CPU stats that it frequently runs into 0% idle because the CPU has to
wait for the HDDs ( %WA ). And because the core has to wait for the
hardware, its CPU power will not be used for anything else because it is
in the waiting state.

So even though the E3 CPU gets the job done faster, the HDDs are usually
too slow and become the bottleneck. So the E3 cannot take real advantage
of its higher power per core. And because it has a low number of cores,
the number of cores in the waiting state quickly grows as big as the
total number of CPU cores.

The result is that the system is overloaded, and we are running into an
evil nightmare of IOPS.

But again: I cannot really measure it. I cannot see which HDD delivers
which data and how fast.

So maybe the E5 is slowing the whole thing down. Maybe not.

But for me, the probability that a 4-core system with 0% idle left at a
system load of 12/13 is "guilty" is higher than for a 24-vCore system
with still ~50% idle time at a system load of 18/19.

But, of course, I have to admit that because of the 32 GB RAM vs. 64 GB
RAM, the comparison might be more apples and oranges. Maybe with similar
RAM, the systems would perform similarly.

But you can judge the stats yourself, and maybe gain some knowledge from
it :-)

As for us, now that Jewel is out, the next thing we will do here is build
up a new cluster with:

2x E5-2620v3, 128 GB RAM, and HBAs in a JBOD configuration, to which we
will add an SSD cache tier. So right now I still believe that with the
E3, because of its limited number of cores, you are more limited in the
maximum number of OSDs you can run.

Maybe with your E3, your 12 HDDs ( depending on / especially if you have
an (SSD) cache tier in between ) will run fine. But I think you are
getting into an area where, under special conditions ( hardware failure /
deep scrub / ... ), your storage performance with an E3 will lose so much
speed so quickly that your applications will no longer operate smoothly.

But again, many factors are involved, so form your own picture :-)


-- 
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
IP-Interactive

mailto:i...@ip-interactive.de

Anschrift:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402 beim Amtsgericht Hanau
Geschäftsführung: Oliver Dzombic

Steuer Nr.: 35 236 3622 1
UST ID: DE274086107


Am 31.05.2016 um 09:41 schrieb Nick Fisk:
> Hi Oliver,
> 
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Oliver Dzombic
>> Sent: 30 May 2016 16:32
>> To: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage
>> capacity when using Ceph based On high-end storage systems
>>
>> Hi,
>>
>> E3 CPUs have 4 cores with an HT unit, so 8 logical cores. And they are
>> not multi-CPU.
>>
>> That means you will be naturally ( quickly ) limited in the number of
>> OSDs you can run with that.
> 
> I'm hoping to be able to run 12, do you think that will be a struggle?
> 
>>
>> Because no matter how many GHz it has, the OSD processes occupy a CPU
>> core forever. 
> 
> I'm not sure I agree with this point. An OSD process is comprised of tens
> of threads which, unless you have pinned the process to a single core,
> will run randomly across all the cores on the CPU. As far as I'm aware,
> all these threads are given a 10ms time slice and then scheduled to run
> on the next available core. A 4x4GHz CPU will run all these threads
> faster than an 8x2GHz CPU; this is where the latency advantages are seen.
> 
> If you get to the point where you have hundreds of threads all demanding
> CPU time, a 4x4GHz CPU will be roughly the same speed as an 8x2GHz CPU.
> Yes there are 

Re: [ceph-users] RGW Could not create user

2016-05-31 Thread Khang Nguyễn Nhật
Thanks, Wasserman!
I followed the instructions here:
http://docs.ceph.com/docs/master/radosgw/multisite/
Step 1:  radosgw-admin realm create --rgw-realm=default --default
Step 2:  radosgw-admin zonegroup delete --rgw-zonegroup=default
Step 3:  radosgw-admin zonegroup create --rgw-zonegroup=ap --master --default
         radosgw-admin zonegroup default --rgw-zonegroup=ap
Step 4:  radosgw-admin zone create --rgw-zonegroup=ap --rgw-zone=ap-southeast --default --master
         radosgw-admin zone default --rgw-zone=ap-southeast
         radosgw-admin zonegroup add --rgw-zonegroup=ap --rgw-zone=ap-southeast

I tried creating the zonegroup, zone, and realm with other names as well
and had similar problems.


2016-05-31 13:33 GMT+07:00 Orit Wasserman :

> did you set the realm, zonegroup and zone as defaults?
>
> On Tue, May 31, 2016 at 4:45 AM, Khang Nguyễn Nhật
>  wrote:
> > Hi,
> > I'm having problems with Ceph v10.2.1 Jewel when creating a user. My
> > cluster runs Ceph Jewel and includes 3 OSDs, 2 monitors and 1 RGW.
> > - Here is the list of cluster pools:
> > .rgw.root
> > ap-southeast.rgw.control
> > ap-southeast.rgw.data.root
> > ap-southeast.rgw.gc
> > ap-southeast.rgw.users.uid
> > ap-southeast.rgw.buckets.data
> > ap-southeast.rgw.users.email
> > ap-southeast.rgw.users.keys
> > ap-southeast.rgw.buckets.index
> > ap-southeast.rgw.buckets.non-ec
> > ap-southeast.rgw.log
> > ap-southeast.rgw.meta
> > ap-southeast.rgw.intent-log
> > ap-southeast.rgw.usage
> > ap-southeast.rgw.users.swift
> > - Zonegroup info:
> > {
> > "id": "e9585cbd-df92-42a0-964b-15efb1cc0ad6",
> > "name": "ap",
> > "api_name": "ap",
> > "is_master": "true",
> > "endpoints": [
> > "http:\/\/192.168.1.1:"
> > ],
> > "hostnames": [],
> > "hostnames_s3website": [],
> > "master_zone": "e1d58724-e44f-4520-b56f-19a40b2ce8c4",
> > "zones": [
> > {
> > "id": "e1d58724-e44f-4520-b56f-19a40b2ce8c4",
> > "name": "ap-southeast",
> > "endpoints": [
> > "http:\/\/192.168.1.1:"
> > ],
> > "log_meta": "true",
> > "log_data": "false",
> > "bucket_index_max_shards": 0,
> > "read_only": "false"
> > }
> > ],
> > "placement_targets": [
> > {
> > "name": "default-placement",
> > "tags": []
> > }
> > ],
> > "default_placement": "default-placement",
> > "realm_id": "93dc1f56-6ec6-48f8-8caa-a7e864eeaeb3"
> > }
> > - Zone:
> > {
> > "id": "e1d58724-e44f-4520-b56f-19a40b2ce8c4",
> > "name": "ap-southeast",
> > "domain_root": "ap-southeast.rgw.data.root",
> > "control_pool": "ap-southeast.rgw.control",
> > "gc_pool": "ap-southeast.rgw.gc",
> > "log_pool": "ap-southeast.rgw.log",
> > "intent_log_pool": "ap-southeast.rgw.intent-log",
> > "usage_log_pool": "ap-southeast.rgw.usage",
> > "user_keys_pool": "ap-southeast.rgw.users.keys",
> > "user_email_pool": "ap-southeast.rgw.users.email",
> > "user_swift_pool": "ap-southeast.rgw.users.swift",
> > "user_uid_pool": "ap-southeast.rgw.users.uid",
> > "system_key": {
> > "access_key": "1555b35654ad1656d805",
> > "secret_key":
> > "h7GhxuBLTrlhVUyxSPUKUV8r\/2EI4ngqJxD7iBdBYLhwluN30JaT3Q12"
> > },
> > "placement_pools": [
> > {
> > "key": "default-placement",
> > "val": {
> > "index_pool": "ap-southeast.rgw.buckets.index",
> > "data_pool": "ap-southeast.rgw.buckets.data",
> > "data_extra_pool": "ap-southeast.rgw.buckets.non-ec",
> > "index_type": 0
> > }
> > }
> > ],
> > "metadata_heap": "ap-southeast.rgw.meta",
> > "realm_id": "93dc1f56-6ec6-48f8-8caa-a7e864eeaeb3"
> > }
> > - Realm:
> > {
> > "id": "93dc1f56-6ec6-48f8-8caa-a7e864eeaeb3",
> > "name": "default",
> > "current_period": "345bcfd4-c120-4862-9c13-1575d8876ce1",
> > "epoch": 1
> > }
> > - Period:
> > "period_map": {
> > "id": "5e66c0e2-a195-4ab4-914f-2b3d7977be0c",
> > "zonegroups": [
> > {
> > "id": "e9585cbd-df92-42a0-964b-15efb1cc0ad6",
> > "name": "ap",
> > "api_name": "ap",
> > "is_master": "true",
> > "endpoints": [
> > "http:\/\/192.168.1.1:"
> > ],
> > "hostnames": [],
> > "hostnames_s3website": [],
> > "master_zone": "e1d58724-e44f-4520-b56f-19a40b2ce8c4",
> > "zones": [
> > {
> > "id": "e1d58724-e44f-4520-b56f-19a40b2ce8c4",
> > "name": "ap-southeast",
> > "endpoints": [
> > "http:\/\/192.168.1.1:"
> > 

Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage capacity when using Ceph based On high-end storage systems

2016-05-31 Thread Nick Fisk
Hi Oliver,

> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Oliver Dzombic
> Sent: 30 May 2016 16:32
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage
> capacity when using Ceph based On high-end storage systems
> 
> Hi,
> 
> E3 CPUs have 4 cores with an HT unit, so 8 logical cores. And they are
> not multi-CPU.
> 
> That means you will be naturally ( quickly ) limited in the number of
> OSDs you can run with that.

I'm hoping to be able to run 12, do you think that will be a struggle?

> 
> Because no matter how many GHz it has, the OSD processes occupy a CPU
> core forever. 

I'm not sure I agree with this point. An OSD process is comprised of tens
of threads which, unless you have pinned the process to a single core,
will run randomly across all the cores on the CPU. As far as I'm aware,
all these threads are given a 10ms time slice and then scheduled to run
on the next available core. A 4x4GHz CPU will run all these threads
faster than an 8x2GHz CPU; this is where the latency advantages are seen.

If you get to the point where you have hundreds of threads all demanding
CPU time, a 4x4GHz CPU will be roughly the same speed as an 8x2GHz CPU.
Yes, there are half the cores available, but each core completes its work
in half the time. There may be some advantages with ever-increasing
thread counts, but there are also disadvantages with memory/IO access
over the inter-CPU link in the case of dual sockets.
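
(The thread count is easy to check; a small sketch, assuming a single
ceph-osd process on the box:

    ps -T -p $(pidof ceph-osd) | wc -l   # one line per thread, plus a header

which would typically come out well into the dozens per OSD.)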

> Not at 100%, but still enough to ruin your day if you have 8 logical
> cores and 12 disks ( during scrubbing/backfilling/high load ).

I did some testing with a 12-core 2GHz Xeon E5 (2x6) by disabling 8
cores, and performance was sufficient. I know E3 and E5 are different CPU
families, but hopefully this was a good enough test.
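
(For anyone wanting to repeat that test, cores can be taken offline at
runtime; a sketch, and the CPU numbering will vary by system:

    # take logical CPUs 4-11 offline, leaving 4 of the 12 online
    for c in $(seq 4 11); do echo 0 > /sys/devices/system/cpu/cpu$c/online; done

Writing 1 back to the same files re-enables them.)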

> 
> So all single-socket CPUs are just good for a very limited number of OSDs.

> --
> Mit freundlichen Gruessen / Best regards
> 
> Oliver Dzombic
> IP-Interactive
> 
> mailto:i...@ip-interactive.de
> 
> Anschrift:
> 
> IP Interactive UG ( haftungsbeschraenkt ) Zum Sonnenberg 1-3
> 63571 Gelnhausen
> 
> HRB 93402 beim Amtsgericht Hanau
> Geschäftsführung: Oliver Dzombic
> 
> Steuer Nr.: 35 236 3622 1
> UST ID: DE274086107
> 
> 
> Am 30.05.2016 um 17:13 schrieb Christian Balzer:
> >
> > Hello,
> >
> > On Mon, 30 May 2016 09:40:11 +0100 Nick Fisk wrote:
> >
> >> The other option is to scale out rather than scale up. I'm currently
> >> building nodes based on a fast Xeon E3 with 12 drives in 1U. The
> >> MB/CPU is very attractively priced and the higher clock gives you
> >> much lower write latency if that is important. The density is
> >> slightly lower, but I guess you gain an advantage in more granularity
> >> of the cluster.
> >>
> > Most definitely, granularity and number of OSDs (up to a point, mind
> > ya) is a good thing [TM].
> >
> > I was citing the designs I did to basically counter the "not dense
> > enough" argument.
> >
> > Ultimately with Ceph (unless you throw lots of money and brain cells
> > at it), the less dense, the better it will perform.
> >
> > Christian
> >>> -Original Message-
> >>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> >>> Behalf Of Jack Makenz
> >>> Sent: 30 May 2016 08:40
> >>> To: Christian Balzer 
> >>> Cc: ceph-users@lists.ceph.com
> >>> Subject: Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage
> >>> capacity when using Ceph based On high-end storage systems
> >>>
> >>> Thanks Christian, and all of ceph users
> >>>
> >>> Your guidance was very helpful, appreciate !
> >>>
> >>> Regards
> >>> Jack Makenz
> >>>
> >>> On Mon, May 30, 2016 at 11:08 AM, Christian Balzer 
> >>> wrote:
> >>>
> >>> Hello,
> >>>
> >>> you may want to read up on the various high-density node threads and
> >>> conversations here.
> >>>
> >>> You most certainly do NOT need high-end storage systems to create
> >>> multi-petabyte storage systems with Ceph.
> >>>
> >>> If you were to use these chassis as a basis:
> >>>
> >>> https://www.supermicro.com.tw/products/system/4U/6048/SSG-6048R-E1CR60N.cfm
> >>> [We (and surely others) urged Supermicro to provide a design like
> >>> this]
> >>>
> >>> And fill them with 6TB HDDs, configure them as 5x 12-HDD RAID6s, set
> >>> your replication to 2 in Ceph, and you will wind up with VERY reliable,
> >>> resilient 1.2PB per rack (32U, leaving space for other bits and not
> >>> melting the PDUs).
> >>> Add fast SSDs or NVMes to this case for journals and you have
> >>> decently performing mass storage.
> >>>
> >>> Need more IOPS for really hot data?
> >>> Add a cache tier or dedicated SSD pools for special needs/customers.
> >>>
> >>> Alternatively, do "classic" Ceph with 3x replication or EC coding,
> >>> but in either case (even more so with EC) you will need the most
> >>> firebreathing CPUs available, so compared to the above design it may
> >>> be a zero sum game cost 

Re: [ceph-users] RGW Could not create user

2016-05-31 Thread Abhishek Lekshmanan

Khang Nguyễn Nhật writes:

> Hi,
> I'm having problems with Ceph v10.2.1 Jewel when creating a user. My
> cluster runs Ceph Jewel and includes 3 OSDs, 2 monitors and 1 RGW.
> - Here is the list of cluster pools:
> .rgw.root
> ap-southeast.rgw.control
> ap-southeast.rgw.data.root
> ap-southeast.rgw.gc
> ap-southeast.rgw.users.uid
> ap-southeast.rgw.buckets.data
> ap-southeast.rgw.users.email
> ap-southeast.rgw.users.keys
> ap-southeast.rgw.buckets.index
> ap-southeast.rgw.buckets.non-ec
> ap-southeast.rgw.log
> ap-southeast.rgw.meta
> ap-southeast.rgw.intent-log
> ap-southeast.rgw.usage
> ap-southeast.rgw.users.swift
> - Zonegroup info:
> {
> "id": "e9585cbd-df92-42a0-964b-15efb1cc0ad6",
> "name": "ap",
> "api_name": "ap",
> "is_master": "true",
> "endpoints": [
> "http:\/\/192.168.1.1:"
> ],
> "hostnames": [],
> "hostnames_s3website": [],
> "master_zone": "e1d58724-e44f-4520-b56f-19a40b2ce8c4",
> "zones": [
> {
> "id": "e1d58724-e44f-4520-b56f-19a40b2ce8c4",
> "name": "ap-southeast",
> "endpoints": [
> "http:\/\/192.168.1.1:"
> ],
> "log_meta": "true",
> "log_data": "false",
> "bucket_index_max_shards": 0,
> "read_only": "false"
> }
> ],
> "placement_targets": [
> {
> "name": "default-placement",
> "tags": []
> }
> ],
> "default_placement": "default-placement",
> "realm_id": "93dc1f56-6ec6-48f8-8caa-a7e864eeaeb3"
> }
> - Zone:
> {
> "id": "e1d58724-e44f-4520-b56f-19a40b2ce8c4",
> "name": "ap-southeast",
> "domain_root": "ap-southeast.rgw.data.root",
> "control_pool": "ap-southeast.rgw.control",
> "gc_pool": "ap-southeast.rgw.gc",
> "log_pool": "ap-southeast.rgw.log",
> "intent_log_pool": "ap-southeast.rgw.intent-log",
> "usage_log_pool": "ap-southeast.rgw.usage",
> "user_keys_pool": "ap-southeast.rgw.users.keys",
> "user_email_pool": "ap-southeast.rgw.users.email",
> "user_swift_pool": "ap-southeast.rgw.users.swift",
> "user_uid_pool": "ap-southeast.rgw.users.uid",
> "system_key": {
> "access_key": "1555b35654ad1656d805",
> "secret_key":
> "h7GhxuBLTrlhVUyxSPUKUV8r\/2EI4ngqJxD7iBdBYLhwluN30JaT3Q12"
> },
> "placement_pools": [
> {
> "key": "default-placement",
> "val": {
> "index_pool": "ap-southeast.rgw.buckets.index",
> "data_pool": "ap-southeast.rgw.buckets.data",
> "data_extra_pool": "ap-southeast.rgw.buckets.non-ec",
> "index_type": 0
> }
> }
> ],
> "metadata_heap": "ap-southeast.rgw.meta",
> "realm_id": "93dc1f56-6ec6-48f8-8caa-a7e864eeaeb3"
> }
> - Realm:
> {
> "id": "93dc1f56-6ec6-48f8-8caa-a7e864eeaeb3",
> "name": "default",
> "current_period": "345bcfd4-c120-4862-9c13-1575d8876ce1",
> "epoch": 1
> }
> - Period:
> "period_map": {
> "id": "5e66c0e2-a195-4ab4-914f-2b3d7977be0c",
> "zonegroups": [
> {
> "id": "e9585cbd-df92-42a0-964b-15efb1cc0ad6",
> "name": "ap",
> "api_name": "ap",
> "is_master": "true",
> "endpoints": [
> "http:\/\/192.168.1.1:"
> ],
> "hostnames": [],
> "hostnames_s3website": [],
> "master_zone": "e1d58724-e44f-4520-b56f-19a40b2ce8c4",
> "zones": [
> {
> "id": "e1d58724-e44f-4520-b56f-19a40b2ce8c4",
> "name": "ap-southeast",
> "endpoints": [
> "http:\/\/192.168.1.1:"
> ],
> "log_meta": "true",
> "log_data": "false",
> "bucket_index_max_shards": 0,
> "read_only": "false"
> }
> ],
>  /// /// ///
> "master_zonegroup": "e9585cbd-df92-42a0-964b-15efb1cc0ad6",
> "master_zone": "e1d58724-e44f-4520-b56f-19a40b2ce8c4",
> "period_config": {
> "bucket_quota": {
> "enabled": false,
> "max_size_kb": -1,
> "max_objects": -1
> },
> "user_quota": {
> "enabled": false,
> "max_size_kb": -1,
> "max_objects": -1
> }
> },
> "realm_id": "93dc1f56-6ec6-48f8-8caa-a7e864eeaeb3",
> "realm_name": "default",
> "realm_epoch": 2
> }
>
> When I used radosgw-admin user create --uid = 1 --display-name = "user1"
> --email=us...@example.com, I get an error "could not create user: unable to
> create user, unable to store user info"

Is the ceph cluster healthy? BTW I don't think radosgw-admin accepts
spacing before & after the = in its arguments.
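
I.e. something like this should work instead (a sketch; the e-mail
address is a placeholder):

    radosgw-admin user create --uid=1 --display-name="user1" --email=user1@example.com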

[ceph-users] inkscope version 1.4

2016-05-31 Thread eric mourgaya
hi guys,

Inkscope 1.4 is released.
You can find the rpms and Debian packages at
https://github.com/inkscope/inkscope-packaging.
This release adds a monitoring panel using collectd, and also some
features around user login.

Enjoy it!


-- 
Eric Mourgaya,


Let's respect the planet!
Let's fight mediocrity!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com