RE: tool for applying 'ceph daemon ' command to all OSDs

2015-12-22 Thread igor.podo...@ts.fujitsu.com
> -Original Message-
> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
> ow...@vger.kernel.org] On Behalf Of Dan Mick
> Sent: Tuesday, December 22, 2015 7:00 AM
> To: ceph-devel
> Subject: RFC: tool for applying 'ceph daemon ' command to all OSDs
> 
> I needed something to fetch current config values from all OSDs (sorta the
> opposite of 'injectargs --key value'), so I hacked it, and then spiffed it up 
> a bit.
> Does this seem like something that would be useful in this form in the
> upstream Ceph, or does anyone have any thoughts on its design or
> structure?
>

You could do it using socat too:

Node1 has osd.0

Node1:
cd /var/run/ceph
sudo socat TCP-LISTEN:60100,fork unix-connect:ceph-osd.0.asok

Node2:
cd /var/run/ceph
sudo socat unix-listen:ceph-osd.0.asok,fork TCP:Node1:60100

Node2:
sudo ceph daemon osd.0 help | head
{
"config diff": "dump diff of current config and default config",
"config get": "config get : get the config value",

This is more for a development/test setup.
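
If you want to expose every OSD admin socket on a node in one go, something along
these lines should work (a rough, untested sketch; the 60100+id port numbering is
just my assumption):

cd /var/run/ceph
for sock in ceph-osd.*.asok; do
    id=${sock#ceph-osd.}; id=${id%.asok}   # extract the OSD id from the socket name
    sudo socat TCP-LISTEN:$((60100 + id)),fork unix-connect:$sock &
done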

Regards,
Igor.

> It requires a locally-installed ceph CLI and a ceph.conf that points to the
> cluster and any required keyrings.  You can also provide it with a YAML file
> mapping host to osds if you want to save time collecting that info for a
> statically-defined cluster, or if you want just a subset of OSDs.
> 
> https://github.com/dmick/tools/blob/master/osd_daemon_cmd.py
> 
> Excerpt from usage:
> 
> Execute a Ceph osd daemon command on every OSD in a cluster with one
> connection to each OSD host.
> 
> Usage:
> osd_daemon_cmd [-c CONF] [-u USER] [-f FILE] (COMMAND | -k KEY)
> 
> Options:
>    -c CONF   ceph.conf file to use [default: ./ceph.conf]
>    -u USER   user to connect with ssh
>    -f FILE   get names and osds from yaml
>    COMMAND   command other than "config get" to execute
>    -k KEY    config key to retrieve with config get
> 
> --
> Dan Mick
> Red Hat, Inc.
> Ceph docs: http://ceph.com/docs


[testing ceph] with gnu stow and NFS

2015-12-16 Thread igor.podo...@ts.fujitsu.com
Hi Cephers!

TL;DR
Use an NFS server to share binaries and libs (ceph and anything else) across your 
cluster, then link them with gnu stow from the mounted NFS share into the root ( / ) 
directory on every node. Switching between your custom ceph builds (or anything 
else, e.g. tcmalloc) on the whole cluster becomes very fast, easy to automate and 
consistent. Stow switches your ceph version via symbolic links with only two 
commands.

Long version:

I want to share one of the ideas that makes my life testing ceph a little bit 
easier. Some time ago I wrote a few words about this in another thread, but you 
probably missed them because of the heavy discussion going on there.

The main idea was an easy mechanism to switch between ceph versions: 
binaries/libs - everything. But the truth is I'm too lazy to reinstall it 
manually on every host and too ignorant to check whether I've installed the right 
version ;)

What I have at the moment:
- an NFS server that exports /home/ceph to all of my cluster nodes
- several subfolders with ceph builds, e.g. /home/ceph/ceph-0.94.1, 
/home/ceph/git/ceph
- and libraries, e.g. /home/ceph/tcmalloc/gperftools-2.4

In /home/ceph/ceph-0.94.1 and /home/ceph/tcmalloc/gperftools-2.4 I have an 
additional directory called BIN, and everything is installed into it. Instead of 
a normal install (or building RPMs):

$ make
$ make install

I run something like:

$ mkdir BIN
$ make
$ make DESTDIR=$PWD/BIN install
$ rm -rf $PWD/BIN/var   # in case of Ceph we don't want 
to share this directory over NFS, so we must remove it


DESTDIR makes "make install" place all package-related files into BIN, laid out 
just as if BIN were the root ( / ) directory:
$ tree BIN
BIN
├── etc
│   ├── bash_completion.d
│   │   ├── ceph
│   │   ├── rados
│   │   ├── radosgw-admin
│   │   └── rbd
│   └── ceph
├── sbin
│   ├── mount.ceph
│   └── mount.fuse.ceph
└── usr
    ├── bin
    │   ├── ceph
    │   ├── ceph-authtool
    │   ├── ceph_bench_log
    │   ├── ceph-brag
    │   ├── ceph-client-debug
    │   ├── ceph-clsinfo

And now it's time for gnu stow: https://www.gnu.org/software/stow/

On every node I run, as root:
$ stow -d /home/ceph/ceph-0.94.1  -t/ BIN; ldconfig;

Stow creates symbolic links to every file/directory from BIN in the root ( / ) 
directory of my Linux, and ceph works just as if I had installed it the normal 
way with make install, or from rpms.
$ type ceph
ceph is hashed (/usr/bin/ceph)

$ ls -al /usr/bin/ceph
lrwxrwxrwx 1 root root 50 Dec 11 14:33 /usr/bin/ceph -> 
../../home/ceph/ceph-0.94.1/BIN/usr/bin/ceph

I can do the same for other libraries as well:
$ stow -d /home/ceph/tcmalloc/gperftools-2.4   -t/ BIN; ldconfig;

If I need to check another ceph/library version, I just stop ceph on all nodes, 
then "unstow":
$ stow -D -d /home/ceph/ceph-0.94.1  -t/ BIN; ldconfig;

and "stow" the other version again:
$ stow -d /home/ceph/ceph-0.94.1_my_custom_build  -t/ BIN; ldconfig;

== Exception ==
/etc/init.d/ceph should be copied into /, because otherwise, when you "unstow" 
ceph, "service ceph start" stops working.


Then I just start ceph on all nodes and that's all.

Quite fast, isn't it?

The NFS+stow concept can be used not only for "builds" (compilation, make, make 
install) but for RPMs too (precompiled binaries). You just unpack the RPM into a 
BIN folder and run stow, and it works just as if you had installed the rpm in a 
standard way, into root ( / ).
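
Roughly like this (an untested sketch; the rpm path/name below is only an example):

$ mkdir -p /home/ceph/ceph-0.94.1-rpm/BIN
$ cd /home/ceph/ceph-0.94.1-rpm/BIN
$ rpm2cpio /tmp/ceph-0.94.1-0.el6.x86_64.rpm | cpio -idm
$ rm -rf var   # as with the build above, don't share /var over NFS
$ stow -d /home/ceph/ceph-0.94.1-rpm -t/ BIN; ldconfig;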

Placing binaries/libs on NFS does not impact ceph performance at runtime; it can 
add some delay during process start, when the binaries are loaded from the file 
system. Of course the NFS server is a SPOF, but for the tests I run this doesn't 
matter: I only test application behavior and the infrastructure stays untouched.

This idea is a time-saver during the day and makes automation easy for nightly tests.

Regards,
Igor.




RE: Scaling Ceph reviews and testing

2015-11-26 Thread igor.podo...@ts.fujitsu.com
> -Original Message-
> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
> ow...@vger.kernel.org] On Behalf Of Dalek, Piotr
> Sent: Thursday, November 26, 2015 9:56 AM
> To: Gregory Farnum; ceph-devel
> Subject: RE: Scaling Ceph reviews and testing
> 
> > -Original Message-
> > From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
> > ow...@vger.kernel.org] On Behalf Of Gregory Farnum
> > Sent: Wednesday, November 25, 2015 11:14 PM
> >
> > It has been a long-standing requirement that all code be tested by
> > teuthology before being merged to master. In the past leads have
> > shouldered a lot of this burden through integration and testing
> > branches, but it’s become unsustainable in present form: some PRs
> > which are intended as RFCs are being mistakenly identified as final;
> > some PRs are submitted which pass cursory sniff tests but fail under
> > recovery conditions that the teuthology suites cover. To prevent that,
> > please comment on exactly what testing you’ve performed when
> > submitting a PR and a justification why that is sufficient to promote
> > it to integration testing. [..]
> 
> Unless people will be convinced that performing their own testing isn't that
> complex (teuthology-openstack is a leapfrog in right direction), they won't do
> it, either because they simply don't know how to do it, or they don't have
> resources to do so (small startups may not afford them at all, and large,
> global corporations might have hardware request procedures so complex
> and with such a large time span that it scares the devs off).
> But correctness and reliability regressions are one thing, performance
> regressions are another one. I already see PRs that promise performance
> increase, when at (my) first glance it looks totally contradictory, or it's 
> just a
> big, 100+ line change which adds more complexity than performance. Not to
> mention utter nonsense like https://github.com/ceph/ceph/pull/6582
> (excuse my finger-pointing, but this case is so extreme that it needs to be
> pointed out). Or, to put it more bluntly, some folks are spamming with
> performance PRs that in their opinion improve something, while in reality
> those PRs at best increase complexity of already complex code and add
> (sometimes potential) bugs, often with added bonus of actually degraded
> performance. So, my proposition is to postpone QA'ing performance pull
> requests until someone unrelated to PR author (or even author's company)
> can confirm that claims in that particular PR are true. Providing code snippet
> that shows the perf difference (or provide a way to verify those claims in
> a reproducible manner) in the PR should be enough for it
> (https://github.com/XinzeChi/ceph/commit/2c8a17560a797b316520cb689240
> d4dcecf3e4cc for a particular example), and it should help get rid of
> performance PRs that degrade performance or improve it only on particular
> hardware/software configuration and at best don't improve anything
> otherwise.
> 
> 
> With best regards / Pozdrawiam
> Piotr Dałek

We could also add another label, like "explanation/data needed", and the guys 
triaging new PRs could use it to enforce this rule more strictly: "Performance 
enhancements must come with test data and detailed explanations." 
(https://github.com/ceph/ceph/blob/master/CONTRIBUTING.rst )

Then Piotr's idea will be easier to carry out: when the "PR validator" has test 
data and an explanation, he can decide faster and more easily whether the PR 
makes sense or not.

Regards,
Igor.



FW: ceph_monitor - monitor your cluster with parallel python

2015-11-19 Thread igor.podo...@ts.fujitsu.com
Hey, sending this one more time; I got a reject from the mail daemon.

Regards,
Igor.

From: Podoski, Igor 
Sent: Thursday, November 19, 2015 8:53 AM
To: ceph-devel; 'ceph-us...@ceph.com'
Subject: ceph_monitor - monitor your cluster with parallel python

Hi Cephers!

I've created a small tool to help track memory/cpu/io usage. It's useful for me, 
so I thought I'd share it with you: https://github.com/aiicore/ceph_monitor

In general this is a python script that uses parallel python to run a function on 
remote hosts. Data is gathered from all hosts and presented on the console or 
added to an sqlite database, which can then be plotted with e.g. gnuplot. You can 
define the ranges of osds you want to monitor, or monitor only certain processes, 
e.g. the osds from a pool that has ssds.
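
The plotting part of the workflow looks roughly like this (the database, table and 
column names below are made up for illustration; see readme.txt for the real schema):

$ sqlite3 -separator ' ' monitor.db "select ts, mem_rss from samples;" > mem.dat
$ gnuplot -e "set terminal png; set output 'mem.png'; plot 'mem.dat' using 1:2 with lines title 'rss'"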

The main concept is that the monitor doesn't know and doesn't care which hosts 
the osds are running on; it treats them as a whole set.

The script uses psutil to get data related to processes (mon/osd/rgw/whatever). 
In the near future I'd like to add modes that can modify process behavior, e.g. 
psutil has .nice, .ionice and .cpu_affinity methods that could be useful in some 
tests. Basically, with parallel python you can run any function remotely, so 
tuning the OS by changing some /proc/* files can be done too.

You can add labels to the data to see when things happened.

Sample plot: 
https://raw.githubusercontent.com/aiicore/ceph_monitor/master/examples/avg_cpu_mem.png
 
Simple test: 
https://github.com/aiicore/ceph_monitor/blob/master/examples/example_test_with_rados.sh
 

Short readme:  https://github.com/aiicore/ceph_monitor 
Full readme: https://github.com/aiicore/ceph_monitor/blob/master/readme.txt 

I encourage you to use and develop it. If not, please at least read the full 
readme text; maybe you'll come up with a better idea based on this concept and 
something interesting will happen.

P.S. This currently works with python 2.6 and psutil 0.6.1 on centos 6.6. If 
you find any bug, report it on my github as an issue.

!!! Security notice !!!
Parallel python supports SHA authentication, but my version currently runs 
WITHOUT it, so in certain environments it could be dangerous (any untrusted 
client could run any function). For now use it only in isolated test/dev 
clusters.

Regards,
Igor.
