Re: [ceph-users] HDFS on Ceph (RBD)

2015-05-20 Thread Wang, Warren
We've contemplated doing something like that, but we also realized that
it would result in manual work in Ceph every time we lose a drive or
server, and a pretty bad experience for the customer when we have to do
maintenance.

We also kicked around the idea of leveraging the notion of a Hadoop rack
to define a set of instances which are Cinder volume backed, and the rest
ephemeral drives (not Ceph-backed ephemeral). Using 100% ephemeral
isn't out of the question either, but we have seen a few cases where
all the instances in a region were quickly terminated.

Our customer has also tried grabbing the Sahara code (Hadoop Swift) and
running it on their own to interface with RGW backed Swift, but ran into
an issue where Sahara code sequentially stats each item within a
container. 
I think there are efforts to multithread this.

-- 
Warren Wang





On 5/20/15, 7:27 PM, "Blair Bethwaite"  wrote:

>Hi Warren,
>
>Following our brief chat after the Ceph Ops session at the Vancouver
>summit today, I added a few more notes to the etherpad
>(https://etherpad.openstack.org/p/YVR-ops-ceph).
>
>I wonder whether you'd considered setting up crush layouts so you can
>have multiple cinder AZs or volume-types that map to a subset of OSDs
>in your cluster. You'd have them in pools with rep=1 (i.e., no
>replication). Then have your Hadoop users follow a provisioning
>pattern that involves attaching volumes from each crush ruleset and
>building HDFS over them in a manner/topology so as to avoid breaking
>HDFS for any single underlying OSD failure, assuming regular HDFS
>replication is used on top. Maybe a pool per HDFS node is the
>obvious/naive starting point, clearly that implies a certain scale to
>begin with, but probably works for you...?
>
>-- 
>Cheers,
>~Blairo
>

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] QEMU Venom Vulnerability

2015-05-20 Thread Georgios Dimitrakakis

Hi Brad!

Thanks for pointing out that for CentOS 6 the fix is included! Good to 
know that!


But I think that the original package doesn't support RBD by default so 
it has to be built again, am I right?


If that's correct then starting from there and building a new RPM with 
RBD support is the proper way of updating. Correct?


Since I am very new at building RPMs, is there something else that I should be 
aware of or take care of? Any guidelines, maybe?


Best regards,

George

On Thu, 21 May 2015 09:25:32 +1000, Brad Hubbard wrote:

On 05/21/2015 08:47 AM, Brad Hubbard wrote:

On 05/20/2015 11:02 AM, Robert LeBlanc wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

I've downloaded the new tarball, placed it in rpmbuild/SOURCES then
with the extracted spec file in rpmbuild/SPEC, I update it to the new
version and then rpmbuild -ba program.spec. If you install the SRPM
then it will install the RH patches that have been applied to the
package and then you get to have the fun of figuring out which patches
are still needed and which ones need to be modified. You can probably
build the package without the patches, but some things may work a
little differently. That would get you the closest to the official
RPMs.

As to where to find the SRPMs, I'm not really sure, I come from a
Debian background where access to source packages is really easy.
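
A minimal sketch of that flow on a CentOS/RHEL build host (the paths are the
rpmbuild defaults; the tarball name is only an example, not something from this
thread):

rpm -ivh qemu-kvm-*.src.rpm                  # unpacks the sources and spec into ~/rpmbuild
cp qemu-newer.tar.bz2 ~/rpmbuild/SOURCES/    # whatever tarball you actually want to build
vi ~/rpmbuild/SPECS/qemu-kvm.spec            # bump Version/Release, drop patches that no longer apply
rpmbuild -ba ~/rpmbuild/SPECS/qemu-kvm.spec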



# yumdownloader --source qemu-kvm --source qemu-kvm-rhev

This assumes you have the correct source repos enabled. Something like;

# subscription-manager repos --enable=rhel-7-server-openstack-6.0-source-rpms \
    --enable=rhel-7-server-source-rpms


Taken from https://access.redhat.com/solutions/1381603


Of course the above is for RHEL only and is unnecessary as there are errata
packages for RHEL. I was just trying to explain how you can get access to the
source packages for RHEL.

As for CentOS 6, although the version number may be "small", it has the fix.



http://vault.centos.org/6.6/updates/Source/SPackages/qemu-kvm-0.12.1.2-2.448.el6_6.3.src.rpm

$ rpm -qp --changelog qemu-kvm-0.12.1.2-2.448.el6_6.3.src.rpm | head -5

warning: qemu-kvm-0.12.1.2-2.448.el6_6.3.src.rpm: Header V3 RSA/SHA1
Signature, key ID c105b9de: NOKEY
* Fri May 08 2015 Miroslav Rezanina  - 0.12.1.2-2.448.el6_6.3
- kvm-fdc-force-the-fifo-access-to-be-in-bounds-of-the-all.patch [bz#1219267]
- Resolves: bz#1219267
  (EMBARGOED CVE-2015-3456 qemu-kvm: qemu: floppy disk controller flaw [rhel-6.6.z])

HTH.


Cheers,
Brad



HTH.

Cheers,
Brad


- 
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Tue, May 19, 2015 at 3:47 PM, Georgios Dimitrakakis  wrote:

Erik,

are you talking about the ones here:
http://ftp.redhat.com/redhat/linux/enterprise/6Server/en/RHEV/SRPMS/ ???

 From what I see the version is rather "small": 0.12.1.2-2.448

How can one verify that it has been patched against the VENOM vulnerability?

Additionally, I only see the qemu-kvm package and not qemu-img. Is it
essential to update both in order to have a working CentOS system, or can I
just proceed with qemu-kvm?

Robert, any ideas where I can find the latest and patched SRPMs... I have
been building v2.3.0 from source but I am very reluctant to use it in my
system :-)

Best,

George


You can also just fetch the RHEV SRPMs and build those. They have
rbd enabled already.
On May 19, 2015 12:31 PM, "Robert LeBlanc"  wrote:


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

You should be able to get the SRPM, extract the SPEC file and use that
to build a new package. You should be able to tweak all the compile
options as well. I'm still really new to building/rebuilding RPMs but
I've been able to do this for a couple of packages.
- 
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 
B9F1


On Tue, May 19, 2015 at 12:33 PM, Georgios Dimitrakakis  wrote:

I am trying to build the packages manually and I was wondering:
is the flag --enable-rbd enough to have full Ceph functionality?

Does anybody know what other flags I should include in order to
have the same functionality as the original CentOS package plus
RBD support?


Regards,

George
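
As a quick sanity check for the --enable-rbd question above: that flag is
usually what turns on the rbd block driver, but the authoritative flag list is
the configure invocation in the distro spec file, so (commands are standard,
the spec path is assumed):

qemu-img -h | grep -i 'supported formats'              # 'rbd' should appear on an rbd-enabled build
grep -A20 'configure' ~/rpmbuild/SPECS/qemu-kvm.spec   # the distro's full flag list (may be spelled %configure)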


On Tue, 19 May 2015 13:45:50 +0300, Georgios Dimitrakakis wrote:


Hi!

The QEMU VENOM vulnerability (http://venom.crowdstrike.com/ [1]) got my
attention and I would like to know what you people are doing in order to
have the latest patched QEMU version working with Ceph RBD?

In my case I am using the qemu-img and qemu-kvm packages provided by
Ceph (http://ceph.com/packages/ceph-extras/rpm/centos6/x86_64/ [2]) in
order to have RBD working on CentOS 6, since the default repository
packages do not work!

If I want to update to the latest QEMU packages, which ones are known
to work with Ceph RBD?
I have seen some people mentioning that Fedora packages are working,
but I am not sure if they have the latest packages available and if
they are going to work eventually.

[ceph-users] Three tier cache setup

2015-05-20 Thread Reid Kelley
Could be a stupid/bad question; is a three-tier cache/mid/cold setup supported?
An example would be:

1. Fast NVMe drives -> (write-back) -> 2. Mid-grade MLC SSDs for the primary
working set -> (write-back) -> 3. Super-cold EC pool for the cheapest-deepest-oldest data

Theoretically, that middle tier of quality consumer or lower-end enterprise
SSDs would be cost-effective to scale to a large, replicated size, while still
maintaining fast performance. NVMe could absorb heavy writes and the fastest
interaction, with the EC pool scaling behind it all.
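
For reference, the tiering commands only wire a single cache pool in front of
one base pool, and as far as I know stacking a second writeback tier on top of
a cache pool is not supported. The supported two-level shape looks roughly like
this (pool names and PG counts below are made up):

ceph osd pool create cold-ec 1024 1024 erasure
ceph osd pool create hot-ssd 512 512 replicated
ceph osd tier add cold-ec hot-ssd
ceph osd tier cache-mode hot-ssd writeback
ceph osd tier set-overlay cold-ec hot-ssd
ceph osd pool set hot-ssd hit_set_type bloom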

Thanks



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] HDFS on Ceph (RBD)

2015-05-20 Thread Blair Bethwaite
Hi Warren,

Following our brief chat after the Ceph Ops session at the Vancouver
summit today, I added a few more notes to the etherpad
(https://etherpad.openstack.org/p/YVR-ops-ceph).

I wonder whether you'd considered setting up crush layouts so you can
have multiple cinder AZs or volume-types that map to a subset of OSDs
in your cluster. You'd have them in pools with rep=1 (i.e., no
replication). Then have your Hadoop users follow a provisioning
pattern that involves attaching volumes from each crush ruleset and
building HDFS over them in a manner/topology so as to avoid breaking
HDFS for any single underlying OSD failure, assuming regular HDFS
replication is used on top. Maybe a pool per HDFS node is the
obvious/naive starting point, clearly that implies a certain scale to
begin with, but probably works for you...?
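
For concreteness, the plumbing for that might look roughly like the following
per HDFS node (names, PG counts and the cinder backend wiring are made up, and
the per-node CRUSH root is assumed to already exist in the map):

ceph osd crush rule create-simple hdfs-node01-rule hdfs-node01-root host
ceph osd pool create hdfs-node01 128 128 replicated hdfs-node01-rule
ceph osd pool set hdfs-node01 size 1

# cinder.conf backend for that pool, exposed as its own volume type:
# [rbd-hdfs-node01]
# volume_driver = cinder.volume.drivers.rbd.RBDDriver
# rbd_pool = hdfs-node01
# volume_backend_name = rbd-hdfs-node01
cinder type-create hdfs-node01
cinder type-key hdfs-node01 set volume_backend_name=rbd-hdfs-node01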

-- 
Cheers,
~Blairo
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] QEMU Venom Vulnerability

2015-05-20 Thread Brad Hubbard

On 05/21/2015 08:47 AM, Brad Hubbard wrote:

On 05/20/2015 11:02 AM, Robert LeBlanc wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

I've downloaded the new tarball, placed it in rpmbuild/SOURCES then
with the extracted spec file in rpmbuild/SPEC, I update it to the new
version and then rpmbuild -ba program.spec. If you install the SRPM
then it will install the RH patches that have been applied to the
package and then you get to have the fun of figuring out which patches
are still needed and which ones need to be modified. You can probably
build the package without the patches, but some things may work a
little differently. That would get you the closest to the official
RPMs

As to where to find the SRPMs, I'm not really sure, I come from a
Debian background where access to source packages is really easy.



# yumdownloader --source qemu-kvm --source qemu-kvm-rhev

This assumes you have the correct source repos enabled. Something like;

# subscription-manager repos --enable=rhel-7-server-openstack-6.0-source-rpms 
--enable=rhel-7-server-source-rpms

Taken from https://access.redhat.com/solutions/1381603


Of course the above is for RHEL only and is unnecessary as there are errata
packages for rhel. I was just trying to explain how you can get access to the
source packages for rhel.

As for Centos 6, although the version number may be "small" it has the fix.

http://vault.centos.org/6.6/updates/Source/SPackages/qemu-kvm-0.12.1.2-2.448.el6_6.3.src.rpm

$ rpm -qp --changelog qemu-kvm-0.12.1.2-2.448.el6_6.3.src.rpm |head -5
warning: qemu-kvm-0.12.1.2-2.448.el6_6.3.src.rpm: Header V3 RSA/SHA1 Signature, 
key ID c105b9de: NOKEY
* Fri May 08 2015 Miroslav Rezanina  - 
0.12.1.2-2.448.el6_6.3
- kvm-fdc-force-the-fifo-access-to-be-in-bounds-of-the-all.patch [bz#1219267]
- Resolves: bz#1219267
  (EMBARGOED CVE-2015-3456 qemu-kvm: qemu: floppy disk controller flaw 
[rhel-6.6.z])
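
On a host where the update is already installed, a similar check against the
installed package might be:

rpm -q --changelog qemu-kvm | grep -i 'CVE-2015-3456'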

HTH.


Cheers,
Brad



HTH.

Cheers,
Brad


- 
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Tue, May 19, 2015 at 3:47 PM, Georgios Dimitrakakis  wrote:

Erik,

are you talking about the ones here :
http://ftp.redhat.com/redhat/linux/enterprise/6Server/en/RHEV/SRPMS/ ???

 From what I see the version is rather "small": 0.12.1.2-2.448

How can one verify that it has been patched against the VENOM vulnerability?

Additionally, I only see the qemu-kvm package and not qemu-img. Is it
essential to update both in order to have a working CentOS system, or can I
just proceed with qemu-kvm?

Robert, any ideas where I can find the latest and patched SRPMs... I have
been building v2.3.0 from source but I am very reluctant to use it in my
system :-)

Best,

George



You can also just fetch the rhev SRPMs  and build those. They have
rbd enabled already.
On May 19, 2015 12:31 PM, "Robert LeBlanc"  wrote:


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

You should be able to get the SRPM, extract the SPEC file and use that
to build a new package. You should be able to tweak all the compile
options as well. I'm still really new to building/rebuilding RPMs but
I've been able to do this for a couple of packages.
- 
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

On Tue, May 19, 2015 at 12:33 PM, Georgios Dimitrakakis  wrote:

I am trying to build the packages manually and I was wondering:
is the flag --enable-rbd enough to have full Ceph functionality?

Does anybody know what other flags I should include in order to
have the same functionality as the original CentOS package plus
RBD support?


Regards,

George


On Tue, 19 May 2015 13:45:50 +0300, Georgios Dimitrakakis wrote:


Hi!

The QEMU VENOM vulnerability (http://venom.crowdstrike.com/ [1]) got my
attention and I would like to know what you people are doing in order to
have the latest patched QEMU version working with Ceph RBD?

In my case I am using the qemu-img and qemu-kvm packages provided by
Ceph (http://ceph.com/packages/ceph-extras/rpm/centos6/x86_64/ [2]) in
order to have RBD working on CentOS 6, since the default repository
packages do not work!

If I want to update to the latest QEMU packages, which ones are known
to work with Ceph RBD?
I have seen some people mentioning that Fedora packages are working,
but I am not sure if they have the latest packages available and if
they are going to work eventually.

Is building manually the QEMU packages the only way???


Best regards,


George
___
ceph-users mailing list
ceph-users@lists.ceph.com [3]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com [4]


___
ceph-users mailing list
ceph-users@lists.ceph.com [5]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com [6]


-BEGIN PGP SIGNATURE-
Version: Mailvelope v0.13.1
Comment: https://www.mailvelope.com [7]

wsFcBAEBCAAQBQJVW4+RCRDmVDuy+mK58QAAg8AP/jqmQFYEwOeGRTJigk9M

Re: [ceph-users] QEMU Venom Vulnerability

2015-05-20 Thread Brad Hubbard

On 05/20/2015 11:02 AM, Robert LeBlanc wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

I've downloaded the new tarball, placed it in rpmbuild/SOURCES then
with the extracted spec file in rpmbuild/SPEC, I update it to the new
version and then rpmbuild -ba program.spec. If you install the SRPM
then it will install the RH patches that have been applied to the
package and then you get to have the fun of figuring out which patches
are still needed and which ones need to be modified. You can probably
build the package without the patches, but some things may work a
little differently. That would get you the closest to the official
RPMs

As to where to find the SRPMs, I'm not really sure, I come from a
Debian background where access to source packages is really easy.



# yumdownloader --source qemu-kvm --source qemu-kvm-rhev

This assumes you have the correct source repos enabled. Something like;

# subscription-manager repos --enable=rhel-7-server-openstack-6.0-source-rpms 
--enable=rhel-7-server-source-rpms

Taken from https://access.redhat.com/solutions/1381603

HTH.

Cheers,
Brad


- 
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Tue, May 19, 2015 at 3:47 PM, Georgios Dimitrakakis  wrote:

Erik,

are you talking about the ones here :
http://ftp.redhat.com/redhat/linux/enterprise/6Server/en/RHEV/SRPMS/ ???

 From what I see the version is rather "small": 0.12.1.2-2.448

How can one verify that it has been patched against the VENOM vulnerability?

Additionally, I only see the qemu-kvm package and not qemu-img. Is it
essential to update both in order to have a working CentOS system, or can I
just proceed with qemu-kvm?

Robert, any ideas where I can find the latest and patched SRPMs... I have
been building v2.3.0 from source but I am very reluctant to use it in my
system :-)

Best,

George



You can also just fetch the rhev SRPMs  and build those. They have
rbd enabled already.
On May 19, 2015 12:31 PM, "Robert LeBlanc"  wrote:


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

You should be able to get the SRPM, extract the SPEC file and use that
to build a new package. You should be able to tweak all the compile
options as well. I'm still really new to building/rebuilding RPMs but
I've been able to do this for a couple of packages.
- 
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

On Tue, May 19, 2015 at 12:33 PM, Georgios Dimitrakakis  wrote:

I am trying to build the packages manually and I was wondering:
is the flag --enable-rbd enough to have full Ceph functionality?

Does anybody know what other flags I should include in order to
have the same functionality as the original CentOS package plus
RBD support?


Regards,

George


On Tue, 19 May 2015 13:45:50 +0300, Georgios Dimitrakakis wrote:


Hi!

The QEMU VENOM vulnerability (http://venom.crowdstrike.com/ [1]) got my
attention and I would like to know what you people are doing in order to
have the latest patched QEMU version working with Ceph RBD?

In my case I am using the qemu-img and qemu-kvm packages provided by
Ceph (http://ceph.com/packages/ceph-extras/rpm/centos6/x86_64/ [2]) in
order to have RBD working on CentOS 6, since the default repository
packages do not work!

If I want to update to the latest QEMU packages, which ones are known
to work with Ceph RBD?
I have seen some people mentioning that Fedora packages are working,
but I am not sure if they have the latest packages available and if
they are going to work eventually.

Is building manually the QEMU packages the only way???


Best regards,


George
___
ceph-users mailing list
ceph-users@lists.ceph.com [3]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com [4]


___
ceph-users mailing list
ceph-users@lists.ceph.com [5]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com [6]


-BEGIN PGP SIGNATURE-
Version: Mailvelope v0.13.1
Comment: https://www.mailvelope.com [7]

wsFcBAEBCAAQBQJVW4+RCRDmVDuy+mK58QAAg8AP/jqmQFYEwOeGRTJigk9M
pBhr34vyA3mky+BjjW9pt2tydECOH0p5PlYXBfhrQeg2B/yT0uVUKYbYkdBU
fY85UhS5NFdm7VyFyMPSGQwZlXIADF8YJw+Zbj1tpfRvbCi/sntbvGQk+9X8
usVSwBTbWKhYyMW8J5edppv72fMwoVjmoNXuE7wCUoqwxpQBUt0ouap6gDNd
Cu0ZMu+RKq+gfLGcIeSIhsDfV0/LHm2QBO/XjNZtMjyomOWNk9nYHp6HGJxH
MV/EoF4dYoCqHcODPjU2NvesQfYkmqfFoq/n9q/fMEV5JQ+mDfXqc2BcQUsx
40LDWDs+4BTw0KI+dNT0XUYTw+O0WnXFzgIn1wqXEs8pyOSJy1gCcnOGEavy
4PqYasm1g+5uzggaIddFPcWHJTw5FuFfjCnHX8Jo3EeQVDM6Vg8FPkkb5JQk
sqxVRQWsF89gGRUbHIQWdkgy3PZN0oTkBvUfflmE/cUq/r40sD4c25D+9Gti
Gj0IKG5uqMaHud3Hln++0ai5roOghoK0KxcDoBTmFLaQSNo9c4CIFCDf2kJ3
idH5tVozDSgvFpgBFLFatb7isctIYf4Luh/XpLXUzdjklGGzo9mhOjXsbm56
WCJZOkQ/OY1UFysMV5+tSSEn7TsF7Np9NagZB7AHhYuTKlOnbv3QJlhATOPp
u4wP
=SsM2
-END PGP SIGNATURE-
___
ceph-users mailing list
ceph-users@lists.ceph.com [8]
http://lists.ceph.com/listinfo.cgi

Re: [ceph-users] How to improve latencies and per-VM performance and latencies

2015-05-20 Thread Josef Johansson
Hi,

Just to add, there’s also a collectd plugin at 
https://github.com/rochaporto/collectd-ceph 
.

Things to check when you have slow read performance are:

*) how much fragmentation is there on those xfs partitions? With some workloads
you get high values pretty quickly.
for osd in $(grep 'osd/ceph' /etc/mtab | cut -d ' ' -f 1); do sudo xfs_db -c frag -r $osd; done
*) 32/48GB RAM on the OSD hosts could be increased. Since XFS is used and all the
objects are files, Ceph uses the Linux page cache.
If your data set pretty much fits into that cache, you can gain _a lot_ of read
performance, since there are almost no reads from the drives. We're at 128GB
per OSD host right now. Compared with the options at hand this could be a cheap
way of increasing the performance. It won't help you out when you're doing
deep scrubs or recovery though.
*) turn off logging (a one-liner for doing this at runtime is sketched right after this list)
[global]
debug_lockdep = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_buffer = 0/0
debug_timer = 0/0
debug_filer = 0/0
debug_objecter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_journaler = 0/0
debug_objectcatcher = 0/0
debug_client = 0/0
debug_osd = 0/0
debug_optracker = 0/0
debug_objclass = 0/0
debug_filestore = 0/0
debug_journal = 0/0
debug_ms = 0/0
debug_monc = 0/0
debug_tp = 0/0
debug_auth = 0/0
debug_finisher = 0/0
debug_heartbeatmap = 0/0
debug_perfcounter = 0/0
debug_asok = 0/0
debug_throttle = 0/0
debug_mon = 0/0
debug_paxos = 0/0
debug_rgw = 0/0
[osd]
   debug lockdep = 0/0
   debug context = 0/0
   debug crush = 0/0
   debug buffer = 0/0
   debug timer = 0/0
   debug journaler = 0/0
   debug osd = 0/0
   debug optracker = 0/0
   debug objclass = 0/0
   debug filestore = 0/0
   debug journal = 0/0
   debug ms = 0/0
   debug monc = 0/0
   debug tp = 0/0
   debug auth = 0/0
   debug finisher = 0/0
   debug heartbeatmap = 0/0
   debug perfcounter = 0/0
   debug asok = 0/0
   debug throttle = 0/0

*) run htop or vmstat/iostat to determine whether it's the CPU that's getting
maxed out or not.
*) just double-check the performance and latencies on the network (do it for
low and high MTU, just to make sure; it's tough to optimise a lot and then get
bitten by it ;)
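
A couple of hedged one-liners for the logging and network items above (the
debug option names match the [osd] section, and the ping payload sizes are
1500/9000 MTU minus 28 bytes of IP+ICMP headers):

ceph tell osd.* injectargs '--debug_osd 0/0 --debug_ms 0/0 --debug_filestore 0/0 --debug_journal 0/0'
ping -M do -c 5 -s 1472 <peer-host>    # path check at standard MTU
ping -M do -c 5 -s 8972 <peer-host>    # path check at jumbo MTU
iperf -s                               # on the peer, then from this host:
iperf -c <peer-host> -P 4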

2) I don’t see anything in the help section about it
sudo ceph --admin-daemon /var/run/ceph/ceph-osd.$osd.asok help
an easy way of getting the osds if you want to change something globally
for osd in $(grep 'osd/ceph' /etc/mtab | cut -d ' ' -f 2 | cut -d '-' -f 2); do 
echo $osd;done

3) this is on one of the OSDs, about the same size as yours but with SATA drives
for backing (a bit more CPU and memory though):

sudo ceph --admin-daemon /var/run/ceph/ceph-osd.1.asok perf dump | grep -A 1 -e 
op_latency -e op_[rw]_latency -e op_[rw]_process_latency -e journal_latency
  "journal_latency": { "avgcount": 406051353,
  "sum": 230178.927806000},
--
  "op_latency": { "avgcount": 272537987,
  "sum": 4337608.21104},
--
  "op_r_latency": { "avgcount": 111672059,
  "sum": 758059.732591000},
--
  "op_w_latency": { "avgcount": 9308193,
  "sum": 174762.139637000},
--
  "subop_latency": { "avgcount": 273742609,
  "sum": 1084598.823585000},
--
  "subop_w_latency": { "avgcount": 273742609,
  "sum": 1084598.823585000},

Cheers
Josef

> On 20 May 2015, at 10:20, Межов Игорь Александрович  wrote:
> 
> Hi!
> 
> 1. Use it at your own risk. I'm not responsible for any damage you may get
> by running this script.
> 
> 2. What it is for.
> Ceph OSD daemons have a so-called 'admin socket' - a local (to the OSD host)
> unix socket that we can use to issue commands to that OSD. The script
> connects to a list of OSD hosts (currently hardcoded in the source code, but
> easily changeable) over ssh, lists all admin sockets in /var/run/ceph, greps
> the socket names for OSD numbers, and issues a 'perf dump' command to all
> OSDs. The JSON output is parsed with the standard Python libraries and some
> latency parameters are extracted from it. They are coded in the JSON as
> tuples containing the total amount of time and the count of events, so
> dividing time by count gives the average latency for one or more Ceph
> operations. The min/max/avg are computed for every host and for the whole
> cluster, and the latency of every OSD is compared to the minimal value of
> the cluster (or host) and colorized to easily spot values that are too high.
> You can check the usage example in the comments at the top of the script and
> change the hardcoded values, which are also gathered at the top.
> 
> 3. I use the script on Ceph Firefly 0.80.7, but think that it will work on
> any release that supports admin socket connections to OSDs, the 'perf dump'
> command and the same JSON output

Re: [ceph-users] radosgw performance with small files

2015-05-20 Thread Srikanth Madugundi
My hardware setup

One OSD host
   - EL6
   - 10 spinning disks with configuration
  - sda (hpsa0): 450GB (0%) RAID-0 == 1 x 450GB 15K SAS/6
  - 31GB Memory
  - 1Gb/s ethernet line

Monitor and gateway hosts have the same configuration with just one disk.

I am benchmarking newstore performance with small files using radosgw. I am
hitting a bottleneck when writing data through radosgw. I get very good
write throughput when using librados to write small files (60K).

Regards
Srikanth


On Wed, May 20, 2015 at 8:03 AM, Mark Nelson  wrote:

> On 05/19/2015 11:31 AM, Srikanth Madugundi wrote:
>
>> Hi,
>>
>> I am seeing write performance hit with small files (60K) using radosgw.
>> The radosgw is configured to run with 600 threads. Here is the write
>> speed I get with file sizes of 60K
>>
>>
>> # sudo ceph -s
>>  cluster e445e46e-4d84-4606-9923-16fff64446dc
>>   health HEALTH_OK
>>   monmap e1: 1 mons at {osd187=13.24.0.7:6789/0
>> }, election epoch 1, quorum 0 osd187
>>   osdmap e205: 28 osds: 22 up, 22 in
>>pgmap v17007: 1078 pgs, 9 pools, 154 GB data, 653 kobjects
>>  292 GB used, 8709 GB / 9002 GB avail
>>  1078 active+clean
>>client io 1117 kB/s rd, *2878 kB/s wr*, 2513 op/s
>>
>
> It appears that you have 22 OSDs and between reads and writes, there are
> ~114 ops/s per OSD.  How many ops/s per disk are you trying to achieve?
>
>  #
>>
>>
>> If I run the same script with larger file sizes(1MB-3MB), I get a better
>> write speed.
>>
>
> Generally larger files will do better for a variety of reasons, but the
> primary one is that the data will consistently be more sequentially laid
> out.  Assuming your OSDs are on spinning disks, this is a big advantage.
>
>
>>
>> # sudo ceph -s
>>  cluster e445e46e-4d84-4606-9923-16fff64446dc
>>   health HEALTH_OK
>>   monmap e1: 1 mons at {osd187=13.24.0.79:6789/0
>> }, election epoch 1, quorum 0 osd187
>>   osdmap e205: 28 osds: 22 up, 22 in
>>pgmap v16883: 1078 pgs, 9 pools, 125 GB data, 140 kobjects
>>  192 GB used, 8809 GB / 9002 GB avail
>>  1078 active+clean
>>client io *105 MB/s wr*, 1839 op/s
>> #
>>
>> My cluster has 2 OSD hosts running total of 20 osd daemons, 1 mon and 1
>> radosgw hosts. Is the bottleneck coming from the single radosgw process?
>> If so, is it possible to run radosgw in multi process mode?
>>
>
> I think before anyone can answer your question, it might help to detail
> what your hardware setup is, how you are running the tests, and what kind
> of performance you'd like to achieve.
>
>
>> Regards
>> Srikanth
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>  ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] PG object skew settings

2015-05-20 Thread Abhishek L
Hi

Is it safe to tweak the value of `mon pg warn max object skew` from the
default value of 10 to a higher value of 20-30 or so. What would be a
safe upper limit for this value?

Also, what does exceeding this ratio signify in terms of cluster
health? We are sometimes hitting this limit on the bucket index pool
(.rgw.buckets.index), which has the same number of PGs as a few
other pools that host almost no data, like the gc pool, root pool, etc.
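
As far as I know the skew setting only controls when the monitors raise the
health warning; it doesn't change placement or data safety. If you do decide
to raise it, it is just a mon option, e.g. (the value 20 is only an example):

[mon]
mon pg warn max object skew = 20

ceph tell mon.* injectargs '--mon_pg_warn_max_object_skew 20'   # at runtime, no restart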

Cheers!
Abhishek


signature.asc
Description: PGP signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw performance with small files

2015-05-20 Thread Mark Nelson

On 05/19/2015 11:31 AM, Srikanth Madugundi wrote:

Hi,

I am seeing write performance hit with small files (60K) using radosgw.
The radosgw is configured to run with 600 threads. Here is the write
speed I get with file sizes of 60K


# sudo ceph -s
 cluster e445e46e-4d84-4606-9923-16fff64446dc
  health HEALTH_OK
  monmap e1: 1 mons at {osd187=13.24.0.7:6789/0
}, election epoch 1, quorum 0 osd187
  osdmap e205: 28 osds: 22 up, 22 in
   pgmap v17007: 1078 pgs, 9 pools, 154 GB data, 653 kobjects
 292 GB used, 8709 GB / 9002 GB avail
 1078 active+clean
   client io 1117 kB/s rd, *2878 kB/s wr*, 2513 op/s


It appears that you have 22 OSDs and between reads and writes, there are 
~114 ops/s per OSD.  How many ops/s per disk are you trying to achieve?



#


If I run the same script with larger file sizes(1MB-3MB), I get a better
write speed.


Generally larger files will do better for a variety of reasons, but the 
primary one is that the data will consistently be more sequentially laid 
out.  Assuming your OSDs are on spinning disks, this is a big advantage.





# sudo ceph -s
 cluster e445e46e-4d84-4606-9923-16fff64446dc
  health HEALTH_OK
  monmap e1: 1 mons at {osd187=13.24.0.79:6789/0
}, election epoch 1, quorum 0 osd187
  osdmap e205: 28 osds: 22 up, 22 in
   pgmap v16883: 1078 pgs, 9 pools, 125 GB data, 140 kobjects
 192 GB used, 8809 GB / 9002 GB avail
 1078 active+clean
   client io *105 MB/s wr*, 1839 op/s
#

My cluster has 2 OSD hosts running total of 20 osd daemons, 1 mon and 1
radosgw hosts. Is the bottleneck coming from the single radosgw process?
If so, is it possible to run radosgw in multi process mode?


I think before anyone can answer your question, it might help to detail 
what your hardware setup is, how you are running the tests, and what 
kind of performance you'd like to achieve.
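
On the multi-process question: nothing here settles it, but a common pattern
is several radosgw instances (on one or more hosts) behind haproxy or similar,
each with its own client section, roughly like this (instance names, hosts and
the port are made up):

[client.radosgw.gw1]
host = gw-host-1
rgw frontends = civetweb port=7480
rgw thread pool size = 600

[client.radosgw.gw2]
host = gw-host-2
rgw frontends = civetweb port=7480
rgw thread pool size = 600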




Regards
Srikanth



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD unable to start (giant -> hammer)

2015-05-20 Thread Berant Lemmenes
Ok, just to update everyone, after moving out all the pg directories on the
OSD that were no longer valid PGs I was able to start it and the cluster is
back to healthy.

I'm going to trigger a deep scrub of osd.3 to be safe prior to deleting any
of those PGs though.

If I understand the gist of how 11429 is going to be addressed in 94.2, it
is going to disregard such "dead" PGs and complain in the logs. As far as
cleaning those up goes, would a procedure similar to mine be appropriate
(either before or after 94.2)?
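
For anyone landing here later, the per-PG checks described in this thread look
roughly like this with hammer's ceph-objectstore-tool (the path and PG id are
examples, and the OSD has to be stopped first):

OSD=/var/lib/ceph/osd/ceph-3
ceph-objectstore-tool --data-path $OSD --journal-path $OSD/journal --op list-pgs
ceph-objectstore-tool --data-path $OSD --journal-path $OSD/journal --pgid 2.14 --op info
ceph-objectstore-tool --data-path $OSD --journal-path $OSD/journal --pgid 2.14 --op remove
ceph osd deep-scrub 3          # afterwards, to be safe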

Thank you Sam for your help, I greatly appreciate it!
-Berant

On Tue, May 19, 2015 at 7:13 PM, Berant Lemmenes 
wrote:

> Sam,
>
> It is for a valid pool, however the up and acting sets for 2.14 both show
> OSDs 8 & 7. I'll take a look at 7 &  8 and see if they are good.
>
> If so, it seems like it being present on osd.3 could be an artifact from
> previous topologies and I could mv it off osd.3
>
> Thanks very much for the assistance!
>
> Berant
>
>
> On Tuesday, May 19, 2015, Samuel Just  wrote:
>
>> If 2.14 is part of a non-existent pool, you should be able to rename it
>> out of current/ in the osd directory to prevent the osd from seeing it on
>> startup.
>> -Sam
>>
>> - Original Message -
>> From: "Berant Lemmenes" 
>> To: "Samuel Just" 
>> Cc: ceph-users@lists.ceph.com
>> Sent: Tuesday, May 19, 2015 12:58:30 PM
>> Subject: Re: [ceph-users] OSD unable to start (giant -> hammer)
>>
>> Hello,
>>
>> So here are the steps I performed and where I sit now.
>>
>> Step 1) Using 'ceph-objectstore-tool list' to create a list of all PGs not
>> associated with the 3 pools (rbd, data, metadata) that are actually in use
>> on this cluster.
>>
>> Step 2) I then did a 'ceph-objectstore-tool remove' of those PGs
>>
>> Then when starting the OSD it would complain about PGs that were NOT in
>> the
>> list of 'ceph-objectstore-tool list' but WERE present on the filesystem of
>> the OSD in question.
>>
>> Step 3) Iterating over all of the PGs that were on disk and using
>> 'ceph-objectstore-tool info' I made a list of all PGs that returned
>> ENOENT,
>>
>> Step 4) 'ceph-objectstore-tool remove' to remove all those as well.
>>
>> Now when starting osd.3 I get an "unable to load metadata' error for a PG
>> that according to 'ceph pg 2.14 query' is not present (and shouldn't be)
>> on
>> osd.3. Shown below with OSD debugging at 20:
>>
>> 
>>
>>-23> 2015-05-19 15:15:12.712036 7fb079a20780 20 read_log 39533'174051
>> (39533'174050) modify   49277412/rb.0.100f.2ae8944a.00029945/head//2
>> by
>> client.18119.0:2811937 2015-05-18 07:18:42.859501
>>
>>-22> 2015-05-19 15:15:12.712066 7fb079a20780 20 read_log 39533'174052
>> (39533'174051) modify   49277412/rb.0.100f.2ae8944a.00029945/head//2
>> by
>> client.18119.0:2812374 2015-05-18 07:33:21.973157
>>
>>-21> 2015-05-19 15:15:12.712096 7fb079a20780 20 read_log 39533'174053
>> (39533'174052) modify   49277412/rb.0.100f.2ae8944a.00029945/head//2
>> by
>> client.18119.0:2812861 2015-05-18 07:48:23.098343
>>
>>-20> 2015-05-19 15:15:12.712127 7fb079a20780 20 read_log 39533'174054
>> (39533'174053) modify   49277412/rb.0.100f.2ae8944a.00029945/head//2
>> by
>> client.18119.0:2813371 2015-05-18 08:03:54.226512
>>
>>-19> 2015-05-19 15:15:12.712157 7fb079a20780 20 read_log 39533'174055
>> (39533'174054) modify   49277412/rb.0.100f.2ae8944a.00029945/head//2
>> by
>> client.18119.0:2813922 2015-05-18 08:18:20.351421
>>
>>-18> 2015-05-19 15:15:12.712187 7fb079a20780 20 read_log 39533'174056
>> (39533'174055) modify   49277412/rb.0.100f.2ae8944a.00029945/head//2
>> by
>> client.18119.0:2814396 2015-05-18 08:33:56.476035
>>
>>-17> 2015-05-19 15:15:12.712221 7fb079a20780 20 read_log 39533'174057
>> (39533'174056) modify   49277412/rb.0.100f.2ae8944a.00029945/head//2
>> by
>> client.18119.0:2814971 2015-05-18 08:48:22.605674
>>
>>-16> 2015-05-19 15:15:12.712252 7fb079a20780 20 read_log 39533'174058
>> (39533'174057) modify   49277412/rb.0.100f.2ae8944a.00029945/head//2
>> by
>> client.18119.0:2815407 2015-05-18 09:02:48.720181
>>
>>-15> 2015-05-19 15:15:12.712282 7fb079a20780 20 read_log 39533'174059
>> (39533'174058) modify   49277412/rb.0.100f.2ae8944a.00029945/head//2
>> by
>> client.18119.0:2815434 2015-05-18 09:03:43.727839
>>
>>-14> 2015-05-19 15:15:12.712312 7fb079a20780 20 read_log 39533'174060
>> (39533'174059) modify   49277412/rb.0.100f.2ae8944a.00029945/head//2
>> by
>> client.18119.0:2815889 2015-05-18 09:17:49.846406
>>
>>-13> 2015-05-19 15:15:12.712342 7fb079a20780 20 read_log 39533'174061
>> (39533'174060) modify   49277412/rb.0.100f.2ae8944a.00029945/head//2
>> by
>> client.18119.0:2816358 2015-05-18 09:32:50.969457
>>
>>-12> 2015-05-19 15:15:12.712372 7fb079a20780 20 read_log 39533'174062
>> (39533'174061) modify   49277412/rb.0.100f.2ae8944a.00029945/head//2
>> by
>> client.18119.0:2816840 2015-05-18 09:47:52.091524
>>
>>-11> 2015-05-19 15:15:12.712403 7fb079a20780

[ceph-users] RE: How to improve latencies and per-VM performance and latencies

2015-05-20 Thread Межов Игорь Александрович
Hi!

1. Use it at your own risk. I'm not responsible for any damage you may get by
running this script.

2. What it is for.
Ceph OSD daemons have a so-called 'admin socket' - a local (to the OSD host)
unix socket that we can use to issue commands to that OSD. The script connects
to a list of OSD hosts (currently hardcoded in the source code, but easily
changeable) over ssh, lists all admin sockets in /var/run/ceph, greps the
socket names for OSD numbers, and issues a 'perf dump' command to all OSDs.
The JSON output is parsed with the standard Python libraries and some latency
parameters are extracted from it. They are coded in the JSON as tuples
containing the total amount of time and the count of events, so dividing time
by count gives the average latency for one or more Ceph operations. The
min/max/avg are computed for every host and for the whole cluster, and the
latency of every OSD is compared to the minimal value of the cluster (or host)
and colorized to easily spot values that are too high.
You can check the usage example in the comments at the top of the script and
change the hardcoded values, which are also gathered at the top.

3. I use the script on Ceph Firefly 0.80.7, but think that it will work on any
release that supports admin socket connections to OSDs, the 'perf dump'
command and the same JSON output structure.

4. As we connect to the OSD hosts by ssh one by one, the script is slow,
especially when you have more OSD hosts. Also, all OSDs from a host are output
in one row, so if you have >12 OSDs per host, it will mess up the output
slightly.
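
A rough shell equivalent of the collection loop, just to show the shape of it
(host names are made up; the attached script parses the JSON properly and adds
the min/max/avg comparison and colouring):

for h in ceph-node01 ceph-node02 ceph-node03; do
  ssh "$h" 'for s in /var/run/ceph/ceph-osd.*.asok; do
      echo "### $(hostname -s) $s"
      sudo ceph --admin-daemon "$s" perf dump
  done'
done > perf-dumps.txt
# every "###" header is followed by one JSON blob; the average latency for any
# *_latency counter is then sum / avgcount, as described above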

PS: This is my first python script, so suggestions and improvements are welcome 
;)


Megov Igor
CIO, Yuterra


From: Michael Kuriger 
Sent: 19 May 2015 18:51
To: Межов Игорь Александрович
Subject: Re: [ceph-users] How to improve latencies and per-VM performance and latencies

Awesome!  I would be interested in doing this as well.  Care to share how
your script works?

Thanks!




Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235





On 5/19/15, 6:31 AM, "Межов Игорь Александрович"  wrote:

>Hi!
>
>Seeking performance improvement in our cluster (Firefly 0.80.7 on Wheezy,
>5 nodes, 58 osds), I wrote
>a small python script, that walks through ceph nodes and issue 'perf
>dump' command on osd admin
>sockets. It extracts *_latency tuples, calculate min/max/avg, compare osd
>perf metrics with min/avg
>of whole cluster or same host and display result in table form. The goal
>- to check where the most latency is.
>
>The hardware is not new and shiny:
> - 5 nodes * 10-12 OSDs each
> - Intel E5520@2.26/32-48Gb DDR3-1066 ECC
> - 10Gbit X520DA interconnect
> - Intel DC3700 200Gb as a system volume + journals, connected to sata2
>onboard in ahci mode
> - Intel RS2MB044 / RS2BL080 SAS RAID in RAID0 per drive mode, WT, disk
>cache disabled
> - bunch of 1Tb or 2Tb various WD Black drives, 58 disks, 76Tb total
> - replication = 3, filestore on xfs
> - shared client and cluster 10Gbit network
> - cluster used as rbd storage for VMs
> - rbd_cache is on by 'cache=writeback' in libvirt (I suppose, that it is
>true ;))
> - no special tuning in ceph.conf:
>
>>osd mount options xfs = rw,noatime,inode64
>>osd disk threads = 2
>>osd op threads = 8
>>osd max backfills = 2
>>osd recovery max active = 2
>
>I get rather slow read performance from within VM, especially with QD=1,
>so many VMs are running slowly.
>I think, that this HW config can perform better, as I got 10-12k iops
>with QD=32 from time to time.
>
>So I have some questions:
> 1. Am I right, that osd perfs are cumulative and counting up from OSD
>start?
> 2. Is any way to reset perf counters without restating OSD daemon? Maybe
>a command through admin socket?
> 3. What latencies should I expect from my config, or, what latencies you
>have on yours clusters?
>Just an example or as a reference to compare with my values. I've
>interesting mostly in
> - 'op_latency',
> - 'op_[r|w]_latency',
> - 'op_[r|w]_process_latency'
> - 'journal_latency'
>But other parameters, like 'apply_latency' or
>'queue_transaction_latency_avg' are also interesting to compare.
> 4. Where I have to look firstly, if I need to improve QD=1 (i. e.
>per-VM) performance.
>
>Thanks!
>
>Megov Igor
>CIO, Yuterra
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



getosdstat.py.gz
Description: getosdstat.py.gz
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com