Moving away from Yum/DNF repo priorities for Ceph and ceph-deploy

2015-07-23 Thread Travis Rhoden
Hi everyone,

I’m working on ways to improve Ceph installation with ceph-deploy, and a common 
hurdle we have hit involves dependency issues between ceph.com hosted RPM 
repos, and packages within EPEL.  For a while we were able to manage this with 
the priorities plugin, but then EPEL shipped packages that included changes 
that weren’t available on the ceph.com packages, and the EPEL packages 
“obsoleted” the ceph.com ones.  This caused EPEL packages to take priority over 
ceph.com packages even when ceph.com packages had greater version numbers.  The 
solution to this was to enable the “check_obsoletes” feature of the priorities 
plugin.  That’s where we are today.
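
For reference, the setup we rely on today looks roughly like this (the plugin 
config path, the check_obsoletes setting, and the priority value match what we 
use; the repo stanza itself is just illustrative):

# /etc/yum/pluginconf.d/priorities.conf
[main]
enabled = 1
check_obsoletes = 1

# /etc/yum.repos.d/ceph.repo (excerpt)
[Ceph]
name=Ceph packages
baseurl=http://ceph.com/rpm-firefly/el7/x86_64
enabled=1
priority=1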

Recently when working with DNF, I observed that the priorities feature got 
pulled natively into DNF, but I cannot find anything about whether 
“check_obsoletes” is still necessary or even an option. Regardless, I would 
like to move away from this workflow, as it is generally seen as poor practice. [1] [2]

What I’d like to propose instead of ceph-deploy’s current workflow is to:

(1) install epel-release on nodes that need it
(2) disable EPEL by default (using yum-config-manager)
(3) When installing Ceph, break the install into two parts
(3)(a) Explicitly install Ceph’s dependencies from EPEL by name, using yum 
--enablerepo=epel
(3)(b) Proceed normally with Ceph installation, but adding a --disablerepo=epel 
flag as well

Note: the disabling of EPEL in 3b seems redundant with 2, but it would cover 
cases when a user/admin chooses to enable EPEL by default.  We are mostly 
concerned with nodes that are dedicated to Ceph and therefore ceph-deploy is 
free to do things like disabling EPEL, but that’s certainly not always ideal.  
We could disable it by default *only* if we were the ones to install it.  If 
it’s already there, we leave it alone but then still do our two-phase install 
and explicitly disable it when doing the second phase of install.
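
Concretely, the proposed flow would look something like this (a sketch only; 
the dependency names in step (3)(a) are placeholders for whatever list 
ceph-deploy ends up resolving):

# (1) install epel-release on nodes that need it
yum install -y epel-release
# (2) disable EPEL by default
yum-config-manager --disable epel
# (3)(a) explicitly install Ceph's EPEL dependencies by name
yum install -y --enablerepo=epel <epel-dep-1> <epel-dep-2>
# (3)(b) install Ceph itself, with EPEL explicitly disabled
yum install -y --disablerepo=epel ceph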

I think this workflow would allow us to no longer need to use repo priorities, 
but I might be missing something.  A secondary motive is to end up with 
systems that have EPEL disabled by default, because EPEL has caused issues with 
Calamari: EPEL carries newer versions of certain packages than what gets 
installed initially, and pulling those in breaks Calamari.  Having EPEL disabled will 
prevent that, and will also prevent things like “yum update” from breaking 
things.

One potential downside I see: what happens when there are updates in EPEL that 
we want, say for a security fix?

 - Travis


[1] 
http://wiki.centos.org/PackageManagement/Yum/Priorities#head-38b91468cc607d0243f463489c2334bf40bfaaee
[2] 
http://wiki.centos.org/PackageManagement/Yum/Priorities#head-6601a4937d4b099e6d46eea0bdb54241d51c7277


Re: registering for tracker.ceph.com

2015-07-23 Thread Dan Mick
On 07/23/2015 09:44 AM, Sage Weil wrote:
 On Thu, 23 Jul 2015, Deneau, Tom wrote:
 I wanted to register for tracker.ceph.com to enter a few issues but never
 got the confirming email and my registration is now in some stuck state
 (not complete but name/email in use so can't re-register).  Any suggestions?
 
 It does that sometimes... not sure why.  I activated your tdeneau user and 
 deleted the later tmdeneau one.

Usually the confirm email is stuck in a spam filter.



Re: hello, I am confused about a question of rbd

2015-07-23 Thread Dan Mick
Why not zero?

If the answer is it can't be used, then, what arbitrary minimum size
is too small?

(also, given that resize exists, it can be used for storage after a resize.)
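
(For concreteness, the behavior being discussed; a sketch using the rbd CLI of 
this era, where --size is in MB and the default pool is assumed:)

$ rbd create foo2 --size 0     # accepted, per commit 08f47a4 below
$ rbd resize foo2 --size 1024  # and usable once resized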

On 07/23/2015 06:03 AM, Jason Dillaman wrote:
 According to the git history, support for zero MB images for the 
 create/resize commands was explicitly added by commit 08f47a4.  Dan Mick or 
 Josh Durgin could probably better explain the history behind the change since 
 it was before my time.
 



Re: vstart runner for cephfs tests

2015-07-23 Thread John Spray



On 23/07/15 12:56, Mark Nelson wrote:
I had similar thoughts on the benchmarking side, which is why I 
started writing cbt a couple years ago.  I needed the ability to 
quickly spin up clusters and run benchmarks on arbitrary sets of 
hardware.  The outcome isn't perfect, but it's been extremely useful 
for running benchmarks and sort of exists as a half-way point between 
vstart and teuthology.


The basic idea is that you give it a yaml file that looks a little bit 
like a teuthology yaml file and cbt will (optionally) build a cluster 
across a number of user defined nodes with pdsh, start various 
monitoring tools (this is ugly right now, I'm working on making it 
modular), and then sweep through user defined benchmarks and sets of 
parameter spaces.  I have a separate tool that will sweep through ceph 
parameters, create ceph.conf files for each space, and run cbt with 
each one, but the eventual goal is to integrate that into cbt itself.


Though I never really intended it to run functional tests, I just 
added something that looks very similar to the rados suite so I can 
benchmark ceph_test_rados for the new community lab hardware. I 
already had a mechanism to inject OSD down/out up/in events, so with a 
bit of squinting it can give you a very rough approximation of a 
workload using the osd thrasher.  If you are interested, I'd be game 
to see if we could integrate your cephfs tests as well (I eventually 
wanted to add cephfs benchmark capabilities anyway).


Cool - my focus is very much on tightening the code-build-test loop for 
developers, but I can see us needing to extend that into a 
code-build-test-bench loop as we do performance work on cephfs in the 
future.  Does cbt rely on having ceph packages built, or does it blast 
the binaries directly from src/ onto the test nodes?


John


Re: vstart runner for cephfs tests

2015-07-23 Thread Mark Nelson



On 07/23/2015 07:37 AM, John Spray wrote:



On 23/07/15 12:56, Mark Nelson wrote:

I had similar thoughts on the benchmarking side, which is why I
started writing cbt a couple years ago.  I needed the ability to
quickly spin up clusters and run benchmarks on arbitrary sets of
hardware.  The outcome isn't perfect, but it's been extremely useful
for running benchmarks and sort of exists as a half-way point between
vstart and teuthology.

The basic idea is that you give it a yaml file that looks a little bit
like a teuthology yaml file and cbt will (optionally) build a cluster
across a number of user defined nodes with pdsh, start various
monitoring tools (this is ugly right now, I'm working on making it
modular), and then sweep through user defined benchmarks and sets of
parameter spaces.  I have a separate tool that will sweep through ceph
parameters, create ceph.conf files for each space, and run cbt with
each one, but the eventual goal is to integrate that into cbt itself.

Though I never really intended it to run functional tests, I just
added something that looks very similar to the rados suite so I can
benchmark ceph_test_rados for the new community lab hardware. I
already had a mechanism to inject OSD down/out up/in events, so with a
bit of squinting it can give you a very rough approximation of a
workload using the osd thrasher.  If you are interested, I'd be game
to see if we could integrate your cephfs tests as well (I eventually
wanted to add cephfs benchmark capabilities anyway).


Cool - my focus is very much on tightening the code-build-test loop for
developers, but I can see us needing to extend that into a
code-build-test-bench loop as we do performance work on cephfs in the
future.  Does cbt rely on having ceph packages built, or does it blast
the binaries directly from src/ onto the test nodes?


cbt doesn't handle builds/installs at all, so it's probably not 
particularly helpful in this regard.  By default it assumes binaries are 
in /usr/bin, but you can optionally override that in the yaml.  My 
workflow is usually to:


1a) build ceph from src and distribute to other nodes (manually)
1b) run a shell script that installs a given release from gitbuilder on 
all nodes
2) run a cbt yaml file that targets /usr/local, the build dir, /usr/bin, 
etc.


Definitely would be useful to have something that makes 1a) better. 
Probably not cbt's job though.
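
(For anyone curious what step 2 looks like, a rough sketch of a cbt run; the 
YAML keys here are recalled from cbt's example configs and the ceph-osd_cmd 
override is an assumption, so treat this as illustrative rather than 
authoritative:)

cat > runtest.yaml <<'EOF'
cluster:
  user: ceph                 # ssh/pdsh user for the test nodes
  head: node1
  clients: [node1]
  osds: [node2, node3]
  mons:
    node1:
      a: "10.0.0.1:6789"
  osds_per_node: 1
  fs: xfs
  iterations: 1
  ceph-osd_cmd: /usr/local/bin/ceph-osd   # assumed override of the /usr/bin default
benchmarks:
  radosbench:
    op_size: [4194304]
    concurrent_ops: [64]
    time: 60
EOF
./cbt.py --archive=/tmp/results runtest.yaml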




John



Re: hello, I am confused about a question of rbd

2015-07-23 Thread Jason Dillaman
According to the git history, support for zero MB images for the create/resize 
commands was explicitly added by commit 08f47a4.  Dan Mick or Josh Durgin could 
probably better explain the history behind the change since it was before my 
time.

-- 

Jason Dillaman 
Red Hat 
dilla...@redhat.com 
http://www.redhat.com 


----- Original Message ----- 

 From: zhengbin 08747 zhengbin.08...@h3c.com
 To: dilla...@redhat.com
 Sent: Thursday, July 23, 2015 7:52:16 AM
 Subject: hello, I am confused about a question of rbd

 When I create a rbd block, its size can be zero. Why? I think it should not be
 zero. If it should not be zero, I will report a bug and fix it, thank you.

 Like this: I create a rbd block named foo2, and it cannot be used



Re: vstart runner for cephfs tests

2015-07-23 Thread Loic Dachary


On 23/07/2015 14:34, John Spray wrote: 
 
 On 23/07/15 12:23, Loic Dachary wrote:
 You may be interested by

 https://github.com/ceph/ceph/blob/master/src/test/ceph-disk-root.sh

 which is conditionally included

 https://github.com/ceph/ceph/blob/master/src/test/Makefile.am#L86

 by --enable-root-make-check

 https://github.com/ceph/ceph/blob/master/configure.ac#L414

 If you're reckless and trust the tests not to break (a crazy proposition by 
 definition IMHO ;-), you can

 make TESTS=test/ceph-disk-root.sh check

 If you want protection, you do the same in a docker container with

 test/docker-test.sh --os-type centos --os-version 7 --dev make 
 TESTS=test/ceph-disk-root.sh check

 I tried various strategies to make tests requiring root access more 
 accessible and less scary and that's the best compromise I found. 
 test/docker-test.sh is what the make check bot uses.
 
 Interesting, I didn't realise we already had root-ish tests in there.
 
 At some stage the need for root may go away in ceph-fuse, as in principle 
 fuse mount/unmounts shouldn't require root.  If not then putting an outer 
 docker wrapper around this could make sense, if we publish the built binaries 
 into the docker container via a volume or somesuch.  I am behind on 
 familiarizing myself with the dockerised tests.

The docker container runs from sources, not from packages. 

 
 When a test can be used both from sources and from teuthology, I found it 
 more convenient to have it in the qa/workunits directory which is available 
 in both environments. Who knows, maybe you will want a vstart based cephfs 
 test to run as part of make check, in the same way

 https://github.com/ceph/ceph/blob/master/src/test/cephtool-test-mds.sh

 does.
 
 Yes, this crossed my mind.  At the moment, even many of the quick 
 tests/cephfs tests take tens of seconds, so they are probably a bit too big 
 to go in a default make check, but for some of the really simple things that 
 are currently done in cephtool/test.sh, I would be tempted to move them into 
 the python world to make them a bit less fiddly.
 
 The test location is a bit challenging, because we essentially have two 
 not-completely-stable interfaces here, vstart and teuthology. Because 
 teuthology is the more complicated, for the moment it makes sense for the 
 tests to live in that git repo.  Long term it would be nice if fine-grained 
 functional tests lived in the same git repo as the code they're testing, but 
 I don't really have a plan for that right now outside of the 
 probably-too-radical step of merging ceph-qa-suite into the ceph repo.
 
 John

-- 
Loïc Dachary, Artisan Logiciel Libre





Re: vstart runner for cephfs tests

2015-07-23 Thread Loic Dachary
Hi John,

You may be interested by 

https://github.com/ceph/ceph/blob/master/src/test/ceph-disk-root.sh

which is conditionally included 

https://github.com/ceph/ceph/blob/master/src/test/Makefile.am#L86

by --enable-root-make-check

https://github.com/ceph/ceph/blob/master/configure.ac#L414

If you're reckless and trust the tests not to break (a crazy proposition by 
definition IMHO ;-), you can

make TESTS=test/ceph-disk-root.sh check

If you want protection, you do the same in a docker container with

test/docker-test.sh --os-type centos --os-version 7 --dev make 
TESTS=test/ceph-disk-root.sh check

I tried various strategies to make tests requiring root access more accessible 
and less scary and that's the best compromise I found. test/docker-test.sh is 
what the make check bot uses.

When a test can be used both from sources and from teuthology, I found it more 
convenient to have it in the qa/workunits directory which is available in both 
environments. Who knows, maybe you will want a vstart based cephfs test to run 
as part of make check, in the same way 

https://github.com/ceph/ceph/blob/master/src/test/cephtool-test-mds.sh

does.

Cheers

On 23/07/2015 12:00, John Spray wrote:
 
 Audience: anyone working on cephfs, general testing interest.
 
 The tests in ceph-qa-suite/tasks/cephfs are growing in number, but kind of 
 inconvenient to run because they require teuthology (and therefore require 
 built packages, locked nodes, etc).  Most of them don't actually require 
 anything beyond what you already have in a vstart cluster, so I've adapted 
 them to optionally run that way.
 
 The idea is that we can iterate a lot faster when writing new tests (one less 
 excuse not to write them) and get better use out of the tests when debugging 
 things and testing fixes.  teuthology is fine for mass-running the nightlies 
 etc, but it's overkill for testing individual bits of MDS/client 
 functionality.
 
 The code is currently on the wip-vstart-runner ceph-qa-suite branch, and the 
 two magic commands are:
 
 1. Start a vstart cluster with a couple of MDSs, as your normal user:
 $ make -j4 rados ceph-fuse ceph-mds ceph-mon ceph-osd cephfs-data-scan 
 cephfs-journal-tool cephfs-table-tool && ./stop.sh ; rm -rf out dev ; MDS=2 
 OSD=3 MON=1 ./vstart.sh -d -n
 
 2. Invoke the test runner, as root (replace paths, test name as appropriate.  
 Leave off the test name to run everything):
 # PYTHONPATH=/home/jspray/git/teuthology/:/home/jspray/git/ceph-qa-suite/ 
 python /home/jspray/git/ceph-qa-suite/tasks/cephfs/vstart_runner.py 
 tasks.cephfs.test_strays.TestStrays.test_migration_on_shutdown
 
 test_migration_on_shutdown (tasks.cephfs.test_strays.TestStrays) ... ok
 
 --
 Ran 1 test in 121.982s
 
 OK
 
 
 ^^^ see!  two minutes, and no waiting for gitbuilders!
 
 The main caveat here is that it needs to run as root in order to 
 mount/unmount things, which is a little scary.  My plan is to split it out 
 into a little root service for doing mount operations, and then let the main 
 test part run as a normal user and call out to the mounter service when 
 needed.
 
 Cheers,
 John

-- 
Loïc Dachary, Artisan Logiciel Libre





Re: vstart runner for cephfs tests

2015-07-23 Thread Mark Nelson

Hi John,

I had similar thoughts on the benchmarking side, which is why I started 
writing cbt a couple years ago.  I needed the ability to quickly spin up 
clusters and run benchmarks on arbitrary sets of hardware.  The outcome 
isn't perfect, but it's been extremely useful for running benchmarks and 
sort of exists as a half-way point between vstart and teuthology.


The basic idea is that you give it a yaml file that looks a little bit 
like a teuthology yaml file and cbt will (optionally) build a cluster 
across a number of user defined nodes with pdsh, start various 
monitoring tools (this is ugly right now, I'm working on making it 
modular), and then sweep through user defined benchmarks and sets of 
parameter spaces.  I have a separate tool that will sweep through ceph 
parameters, create ceph.conf files for each space, and run cbt with each 
one, but the eventual goal is to integrate that into cbt itself.


Though I never really intended it to run functional tests, I just added 
something that looks very similar to the rados suite so I can benchmark 
ceph_test_rados for the new community lab hardware. I already had a 
mechanism to inject OSD down/out up/in events, so with a bit of 
squinting it can give you a very rough approximation of a workload using 
the osd thrasher.  If you are interested, I'd be game to see if we could 
integrate your cephfs tests as well (I eventually wanted to add cephfs 
benchmark capabilities anyway).


Mark

On 07/23/2015 05:00 AM, John Spray wrote:


Audience: anyone working on cephfs, general testing interest.

The tests in ceph-qa-suite/tasks/cephfs are growing in number, but kind
of inconvenient to run because they require teuthology (and therefore
require built packages, locked nodes, etc).  Most of them don't actually
require anything beyond what you already have in a vstart cluster, so
I've adapted them to optionally run that way.

The idea is that we can iterate a lot faster when writing new tests (one
less excuse not to write them) and get better use out of the tests when
debugging things and testing fixes.  teuthology is fine for mass-running
the nightlies etc, but it's overkill for testing individual bits of
MDS/client functionality.

The code is currently on the wip-vstart-runner ceph-qa-suite branch, and
the two magic commands are:

1. Start a vstart cluster with a couple of MDSs, as your normal user:
$ make -j4 rados ceph-fuse ceph-mds ceph-mon ceph-osd cephfs-data-scan
cephfs-journal-tool cephfs-table-tool && ./stop.sh ; rm -rf out dev ;
MDS=2 OSD=3 MON=1 ./vstart.sh -d -n

2. Invoke the test runner, as root (replace paths, test name as
appropriate.  Leave off the test name to run everything):
#
PYTHONPATH=/home/jspray/git/teuthology/:/home/jspray/git/ceph-qa-suite/
python /home/jspray/git/ceph-qa-suite/tasks/cephfs/vstart_runner.py
tasks.cephfs.test_strays.TestStrays.test_migration_on_shutdown

test_migration_on_shutdown (tasks.cephfs.test_strays.TestStrays) ... ok

--
Ran 1 test in 121.982s

OK


^^^ see!  two minutes, and no waiting for gitbuilders!

The main caveat here is that it needs to run as root in order to
mount/unmount things, which is a little scary.  My plan is to split it
out into a little root service for doing mount operations, and then let
the main test part run as a normal user and call out to the mounter
service when needed.

Cheers,
John



Re: vstart runner for cephfs tests

2015-07-23 Thread John Spray



On 23/07/15 12:23, Loic Dachary wrote:

You may be interested by

https://github.com/ceph/ceph/blob/master/src/test/ceph-disk-root.sh

which is conditionally included

https://github.com/ceph/ceph/blob/master/src/test/Makefile.am#L86

by --enable-root-make-check

https://github.com/ceph/ceph/blob/master/configure.ac#L414

If you're reckless and trust the tests not to break (a crazy proposition by 
definition IMHO ;-), you can

make TESTS=test/ceph-disk-root.sh check

If you want protection, you do the same in a docker container with

test/docker-test.sh --os-type centos --os-version 7 --dev make 
TESTS=test/ceph-disk-root.sh check

I tried various strategies to make tests requiring root access more accessible 
and less scary and that's the best compromise I found. test/docker-test.sh is 
what the make check bot uses.


Interesting, I didn't realise we already had root-ish tests in there.

At some stage the need for root may go away in ceph-fuse, as in 
principle fuse mount/unmounts shouldn't require root.  If not then 
putting an outer docker wrapper around this could make sense, if we 
publish the built binaries into the docker container via a volume or 
somesuch.  I am behind on familiarizing myself with the dockerised tests.



When a test can be used both from sources and from teuthology, I found it more 
convenient to have it in the qa/workunits directory which is available in both 
environments. Who knows, maybe you will want a vstart based cephfs test to run 
as part of make check, in the same way

https://github.com/ceph/ceph/blob/master/src/test/cephtool-test-mds.sh

does.


Yes, this crossed my mind.  At the moment, even many of the quick 
tests/cephfs tests take tens of seconds, so they are probably a bit too 
big to go in a default make check, but for some of the really simple 
things that are currently done in cephtool/test.sh, I would be tempted to 
move them into the python world to make them a bit less fiddly.


The test location is a bit challenging, because we essentially have two 
not-completely-stable interfaces here, vstart and teuthology. Because 
teuthology is the more complicated, for the moment it makes sense for 
the tests to live in that git repo.  Long term it would be nice if 
fine-grained functional tests lived in the same git repo as the code 
they're testing, but I don't really have a plan for that right now 
outside of the probably-too-radical step of merging ceph-qa-suite into 
the ceph repo.


John


RE: vstart runner for cephfs tests

2015-07-23 Thread Podoski, Igor
 -----Original Message-----
 From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
 ow...@vger.kernel.org] On Behalf Of Mark Nelson
 Sent: Thursday, July 23, 2015 2:51 PM
 To: John Spray; ceph-devel@vger.kernel.org
 Subject: Re: vstart runner for cephfs tests
 
 
 
 On 07/23/2015 07:37 AM, John Spray wrote:
 
 
  On 23/07/15 12:56, Mark Nelson wrote:
  I had similar thoughts on the benchmarking side, which is why I
  started writing cbt a couple years ago.  I needed the ability to
  quickly spin up clusters and run benchmarks on arbitrary sets of
  hardware.  The outcome isn't perfect, but it's been extremely useful
  for running benchmarks and sort of exists as a half-way point between
  vstart and teuthology.
 
  The basic idea is that you give it a yaml file that looks a little
  bit like a teuthology yaml file and cbt will (optionally) build a
  cluster across a number of user defined nodes with pdsh, start
  various monitoring tools (this is ugly right now, I'm working on
  making it modular), and then sweep through user defined benchmarks
  and sets of parameter spaces.  I have a separate tool that will sweep
  through ceph parameters, create ceph.conf files for each space, and
  run cbt with each one, but the eventual goal is to integrate that into cbt
 itself.
 
  Though I never really intended it to run functional tests, I just
  added something that looks very similar to the rados suite so I can
  benchmark ceph_test_rados for the new community lab hardware. I
  already had a mechanism to inject OSD down/out up/in events, so with
  a bit of squinting it can give you a very rough approximation of a
  workload using the osd thrasher.  If you are interested, I'd be game
  to see if we could integrate your cephfs tests as well (I eventually
  wanted to add cephfs benchmark capabilities anyway).
 
  Cool - my focus is very much on tightening the code-build-test loop
  for developers, but I can see us needing to extend that into a
  code-build-test-bench loop as we do performance work on cephfs in the
  future.  Does cbt rely on having ceph packages built, or does it blast
  the binaries directly from src/ onto the test nodes?
 
 cbt doesn't handle builds/installs at all, so it's probably not particularly 
 helpful
 in this regard.  By default it assumes binaries are in /usr/bin, but you can
 optionally override that in the yaml.  My workflow is usually to:
 
 1a) build ceph from src and distribute to other nodes (manually)
 1b) run a shell script that installs a given release from gitbuilder on all 
 nodes
 2) run a cbt yaml file that targets /usr/local, the build dir, /usr/bin, etc.
 
 Definitely would be useful to have something that makes 1a) better.
 Probably not cbt's job though.

About 1a)

In my test cluster I have an NFS server (on one node) sharing /home/ceph with 
the others, with many Ceph versions in it. In every version subdirectory I run 
make install with DESTDIR pointing to a newly created BIN subdir.

So it looks like this:
/home/ceph/ceph-0.94.1/BIN 
ls BIN
etc
sbin
usr
var

Then I remove var, and on every node run stow to link binaries and libs from 
the shared /home/ceph/ceph-version/BIN into '/', with 'ldconfig' at the end. 
Basically I can make changes on only one node and very quickly switch between 
ceph versions. So there is no ceph installed on any node; only the /var 
directory is local.

Of course when the NFS node fails, everything fails ... but I'm aware of that.

Check out stow.
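
(A sketch of that flow, assuming GNU Stow's standard -d/-t/-D options and the 
layout above:)

# in each version's build tree, on the build node:
make install DESTDIR=/home/ceph/ceph-0.94.1/BIN
rm -rf /home/ceph/ceph-0.94.1/BIN/var
# on every node, to activate that version:
stow -d /home/ceph/ceph-0.94.1 -t / BIN && ldconfig
# to switch versions later:
stow -D -d /home/ceph/ceph-0.94.1 -t / BIN            # unlink the old version
stow -d /home/ceph/ceph-0.94.2 -t / BIN && ldconfig   # link the new one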

 
  John


Regards,
Igor.



Re: Moving away from Yum/DNF repo priorities for Ceph and ceph-deploy

2015-07-23 Thread Travis Rhoden
sorry for double-send — forgot to make plain text for ceph-devel

Hi Shinobu,

Thanks for the response.


 On Jul 23, 2015, at 5:05 PM, Shinobu Kinjo shinobu...@gmail.com wrote:
 
 Hi Travis,
 
 Is this that you are talking about:
 
 ``dnf [options] list obsoletes [package-name-specs...]``
 ``dnf [options] repository-packages repoid info obsoletes 
 [package-name-spec...]``
 ``dnf [options] repository-packages repoid list obsoletes 
 [package-name-spec...]``

Partially.  Those queries either list packages already installed that are 
obsoleted by packages available in any enabled repos, or list packages in the 
repos that obsolete packages that are already installed (the man page says 
repository-packages info/list are the same thing?).  Such commands would show 
the issue I am trying to solve - namely that for certain releases of Ceph, 
older packages of Ceph from EPEL take priority over newer ones, due to package 
obsoletions.

The exact scenario we hit was a problem most commonly on CentOS - I have not 
confirmed whether it is an issue in Fedora.  But here is a specific example.  
Starting with a CentOS 7 machine that has EPEL installed/enabled, and Ceph 
rpm-firefly enabled by default, and the yum priorities plugin installed (with 
check_obsoletes=0):

# yum info --disablerepo=Ceph --disablerepo=epel --disableplugin=priorities ceph
Error: No matching Packages to list
^ Expected - ceph packages aren’t reachable

# yum info --disablerepo=Ceph --disableplugin=priorities ceph
Available Packages
Name: ceph
Arch: x86_64
Epoch   : 1
Version : 0.80.7
Release : 0.4.el7
^ EPEL includes Ceph 0.80.7.

# yum info --disablerepo=epel --disableplugin=priorities ceph
Available Packages
Name: ceph
Arch: x86_64
Epoch   : 1
Version : 0.80.10
^ Ceph repos have 0.80.10.

# yum info --disableplugin=priorities ceph
Available Packages
Name: ceph
Arch: x86_64
Epoch   : 1
Version : 0.80.7
^ With priorities disabled, but both repos enabled, yum resolves the lower 
version number, 0.80.7 from EPEL.

# yum info ceph
24 packages excluded due to repository priority protections
Available Packages
Name: ceph
Arch: x86_64
Version : 0.80.10
^ With priorities enabled, we now get 0.80.10.  The ceph.repo file has 
priority=1 in it.

Great!  With priorities enabled, we now see 0.80.10.  Let’s install:

# yum install -v ceph
….
….
Error: Package: 1:python-rbd-0.80.7-2.el7.x86_64 (base)
   Requires: librbd1 = 1:0.80.7-2.el7
   Available: librbd1-0.80-0.el7.x86_64 (Ceph)
   librbd1 = 0.80-0.el7
   Available: librbd1-0.80.1-0.el7.x86_64 (Ceph)
   librbd1 = 0.80.1-0.el7
   Available: librbd1-0.80.3-0.el7.x86_64 (Ceph)
   librbd1 = 0.80.3-0.el7
   Available: librbd1-0.80.4-0.el7.x86_64 (Ceph)
   librbd1 = 0.80.4-0.el7
   Available: librbd1-0.80.5-0.el7.x86_64 (Ceph)
   librbd1 = 0.80.5-0.el7
   Available: librbd1-0.80.6-0.el7.x86_64 (Ceph)
   librbd1 = 0.80.6-0.el7
   Available: librbd1-0.80.7-0.el7.x86_64 (Ceph)
   librbd1 = 0.80.7-0.el7
   Available: librbd1-0.80.8-0.el7.x86_64 (Ceph)
   librbd1 = 0.80.8-0.el7
   Available: librbd1-0.80.9-0.el7.x86_64 (Ceph)
   librbd1 = 0.80.9-0.el7
   Installing: librbd1-0.80.10-0.el7.x86_64 (Ceph)
   librbd1 = 0.80.10-0.el7


So why did the install fail?  See [1] for full output, but the short version is 
at this step:

--> Processing Dependency: python-ceph for package: ceph-0.80.10-0.el7.x86_64
Searching pkgSack for dep: python-ceph
Not Updating Package that is obsoleted: python-ceph-0.80.10-0.el7.x86_64
TSINFO: Marking 1:python-ceph-compat-0.80.7-0.4.el7.x86_64 as install for 
ceph-0.80.10-0.el7.x86_64

When yum looks for python-ceph, it sees that it has been marked as obsoleted by 
python-ceph-compat, which is available from EPEL.  Pulling in that 
python-ceph-compat causes all kinds of problems and the install ultimately 
fails.  The solution is to set check_obsoletes = 1 in 
/etc/yum/pluginconf.d/priorities.conf, which forces Yum to override the 
obsoletion of a package from a lower priority repo.  This is what we are doing 
today.

This problem still exists for our Firefly and Giant packages, even though the 
EPEL package added versions to their obsoletes over 4 months ago: [2].  It is 
this “check_obsoletes” behavior that I am unsure of in DNF.  Granted, I haven’t 
tried it.  It may be that it will read the same config file, even 
(/etc/yum/pluginconf.d/priorities.conf).  I’d have to install DNF in a Fedora 
20 VM (the last Fedora we built production packages of Firefly on) to see how 
DNF behaves here.

Even if check_obsoletes wasn’t a consideration, I’d really like to get away 
from setting priority values in repo files and installing an additional plugin 
for Yum to make things work.  It feels like 

Re: Re: hello, I am confused about a question of rbd

2015-07-23 Thread zhengbin.08...@h3c.com
For now I don't have any further questions. I understand it, thank you.

-----Original Message-----
From: Dan Mick [mailto:dm...@redhat.com] 
Sent: July 24, 2015 9:21
To: zhengbin 08747 (RD); ceph-devel
Subject: Re: Re: hello, I am confused about a question of rbd

Adding back ceph-devel

My point was, ok, ruling out 0, then we can create a block device of size 1 
byte.  Is that useful?  No, it is not.  How about 10 bytes?
1000?  1MB?

There's no good reason to rule out zero.  Is it causing a problem somehow?

On 07/23/2015 06:06 PM, zhengbin.08...@h3c.com wrote:
 If I create a block whose size is zero, it means you absolutely can 
 not use it unless you resize it
 
 
 "what arbitrary minimum size is too small?" - I do not know. If I create 
 a block whose size is larger than zero, it may be usable.
 And why do we have to define a minimum size? We can just make sure the 
 block size is larger than zero
 
 -----Original Message-----
 From: Dan Mick [mailto:dm...@redhat.com]
 Sent: July 24, 2015 5:25
 To: Jason Dillaman; zhengbin 08747 (RD)
 Cc: ceph-devel
 Subject: Re: hello, I am confused about a question of rbd
 
 Why not zero?
 
 If the answer is it can't be used, then, what arbitrary minimum size is too 
 small?
 
 (also, given that resize exists, it can be used for storage after a 
 resize.)
 
 On 07/23/2015 06:03 AM, Jason Dillaman wrote:
 According to the git history, support for zero MB images for the 
 create/resize commands was explicitly added by commit 08f47a4.  Dan Mick or 
 Josh Durgin could probably better explain the history behind the change 
 since it was before my time.

 

--
Dan Mick
Red Hat, Inc.
Ceph docs: http://ceph.com/docs

Re: Re: hello, I am confused about a question of rbd

2015-07-23 Thread Dan Mick
Adding back ceph-devel

My point was, ok, ruling out 0, then we can create a block device of
size 1 byte.  Is that useful?  No, it is not.  How about 10 bytes?
1000?  1MB?

There's no good reason to rule out zero.  Is it causing a problem somehow?

On 07/23/2015 06:06 PM, zhengbin.08...@h3c.com wrote:
 If I create a block whose size is zero, it means you absolutely can not use 
 it unless you resize it
 
 
 "what arbitrary minimum size is too small?" - I do not know. If I create 
 a block whose size is larger than zero, it may be usable.
 And why do we have to define a minimum size? We can just make sure the block 
 size is larger than zero
 
 -----Original Message-----
 From: Dan Mick [mailto:dm...@redhat.com]
 Sent: July 24, 2015 5:25
 To: Jason Dillaman; zhengbin 08747 (RD)
 Cc: ceph-devel
 Subject: Re: hello, I am confused about a question of rbd
 
 Why not zero?
 
 If the answer is it can't be used, then, what arbitrary minimum size is too 
 small?
 
 (also, given that resize exists, it can be used for storage after a resize.)
 
 On 07/23/2015 06:03 AM, Jason Dillaman wrote:
 According to the git history, support for zero MB images for the 
 create/resize commands was explicitly added by commit 08f47a4.  Dan Mick or 
 Josh Durgin could probably better explain the history behind the change 
 since it was before my time.

 

-- 
Dan Mick
Red Hat, Inc.
Ceph docs: http://ceph.com/docs


RE: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-23 Thread Varada Kari
(Adding devel list to the CC)
Hi Eric,

To add more context to the problem:

Min_size was set to 1 and replication size is 2.

There was a flaky power connection to one of the enclosures.  With min_size 1, 
we were able to continue the IOs, and recovery was active once the power came 
back.  But if there is a power failure again while recovery is in progress, some 
of the PGs go to the down+peering state.

Extract from pg query.

$ ceph pg 1.143 query
{ "state": "down+peering",
  "snap_trimq": "[]",
  "epoch": 3918,
  "up": [
        17],
  "acting": [
        17],
  "info": { "pgid": "1.143",
      "last_update": "3166'40424",
      "last_complete": "3166'40424",
      "log_tail": "2577'36847",
      "last_user_version": 40424,
      "last_backfill": "MAX",
      "purged_snaps": "[]",

.. "recovery_state": [
        { "name": "Started\/Primary\/Peering\/GetInfo",
          "enter_time": "2015-07-15 12:48:51.372676",
          "requested_info_from": []},
        { "name": "Started\/Primary\/Peering",
          "enter_time": "2015-07-15 12:48:51.372675",
          "past_intervals": [
                { "first": 3147,
                  "last": 3166,
                  "maybe_went_rw": 1,
                  "up": [
                        17,
                        4],
                  "acting": [
                        17,
                        4],
                  "primary": 17,
                  "up_primary": 17},
                { "first": 3167,
                  "last": 3167,
                  "maybe_went_rw": 0,
                  "up": [
                        10,
                        20],
                  "acting": [
                        10,
                        20],
                  "primary": 10,
                  "up_primary": 10},
                { "first": 3168,
                  "last": 3181,
                  "maybe_went_rw": 1,
                  "up": [
                        10,
                        20],
                  "acting": [
                        10,
                        4],
                  "primary": 10,
                  "up_primary": 10},
                { "first": 3182,
                  "last": 3184,
                  "maybe_went_rw": 0,
                  "up": [
                        20],
                  "acting": [
                        4],
                  "primary": 4,
                  "up_primary": 20},
                { "first": 3185,
                  "last": 3188,
                  "maybe_went_rw": 1,
                  "up": [
                        20],
                  "acting": [
                        20],
                  "primary": 20,
                  "up_primary": 20}],
          "probing_osds": [
                17,
                20],
          "blocked": "peering is blocked due to down osds",
          "down_osds_we_would_probe": [
                4,
                10],
          "peering_blocked_by": [
                { "osd": 4,
                  "current_lost_at": 0,
                  "comment": "starting or marking this osd lost may let us proceed"},
                { "osd": 10,
                  "current_lost_at": 0,
                  "comment": "starting or marking this osd lost may let us proceed"}]},
        { "name": "Started",
          "enter_time": "2015-07-15 12:48:51.372671"}],
  "agent_state": {}}

And PGs are not coming back to active+clean till power is restored again.  During this 
period no IOs are served by the cluster.  I am not able to follow why the PGs are 
ending up in the peering state.  Each PG has two copies, one in each enclosure.  If 
one enclosure is down for some time, we should be able to serve IOs from the 
second one.  That was true when no recovery IO was involved; in case of any 
recovery, we end up with some PGs in the down+peering state.
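
(For what it's worth, the "comment" fields in the query above point at the 
escape hatch: marking the unavailable OSDs lost lets peering proceed, at the 
risk of losing their un-replicated writes. A sketch, using the OSD ids from 
the query:)

$ ceph osd lost 4 --yes-i-really-mean-it
$ ceph osd lost 10 --yes-i-really-mean-it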

Thanks,
Varada


-----Original Message-----
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Eric 
Eastman
Sent: Thursday, July 23, 2015 8:37 PM
To: Mallikarjun Biradar mallikarjuna.bira...@gmail.com
Cc: ceph-us...@lists.ceph.com
Subject: Re: [ceph-users] Enclosure power failure pausing client IO till all 
connected hosts up

You may want to check your min_size value for your pools.  If it is set to the 
pool size value, then the cluster will not do I/O if you lose a chassis.
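
(To check and adjust this, something like the following; "rbd" here is a 
placeholder pool name:)

$ ceph osd pool get rbd size
$ ceph osd pool get rbd min_size
$ ceph osd pool set rbd min_size 1   # permit I/O with a single surviving replica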

On Sun, Jul 5, 2015 at 11:04 PM, Mallikarjun Biradar 
mallikarjuna.bira...@gmail.com wrote:
 Hi all,

 Setup details:
 Two storage enclosures each connected to 4 OSD nodes (Shared storage).
 Failure domain is Chassis (enclosure) level. Replication count is 2.
 Each host is allotted 4 drives.

 I have active client IO running on the cluster (random write profile with
 4M block size and 64 queue depth).

 One of the enclosures had a power loss. So all OSDs from the hosts that are
 connected to this enclosure went down, as expected.

 But client IO got paused. After some time the enclosure and the hosts connected
 to it came up, and all OSDs on those hosts came up.

 Till this time, the cluster was not serving IO. Once all hosts and OSDs
 pertaining to that enclosure came up, client IO resumed.


 Can anybody help me understand why the cluster was not serving IO during 

Re: Ceph Tech Talk next week

2015-07-23 Thread Patrick McGarry
correct.


Best Regards,

Patrick McGarry
Director Ceph Community || Red Hat
http://ceph.com  ||  http://community.redhat.com
@scuttlemonkey || @ceph


On Tue, Jul 21, 2015 at 6:03 PM, Gregory Farnum g...@gregs42.com wrote:
 On Tue, Jul 21, 2015 at 6:09 PM, Patrick McGarry pmcga...@redhat.com wrote:
 Hey cephers,

 Just a reminder that the Ceph Tech Talk on CephFS that was scheduled
 for last month (and cancelled due to technical difficulties) has been
 rescheduled for this month's talk. It will be happening next Thurs at
 17:00 UTC (1p EST)

 So that's July 30, according to the website, right? :)


registering for tracker.ceph.com

2015-07-23 Thread Deneau, Tom
I wanted to register for tracker.ceph.com to enter a few issues but never
got the confirming email and my registration is now in some stuck state
(not complete but name/email in use so can't re-register).  Any suggestions?

-- Tom Deneau



Re: vstart runner for cephfs tests

2015-07-23 Thread Gregory Meno
On Thu, Jul 23, 2015 at 11:00:57AM +0100, John Spray wrote:
 
 Audience: anyone working on cephfs, general testing interest.
 
 The tests in ceph-qa-suite/tasks/cephfs are growing in number, but kind of
 inconvenient to run because they require teuthology (and therefore require
 built packages, locked nodes, etc).  Most of them don't actually require
 anything beyond what you already have in a vstart cluster, so I've adapted
 them to optionally run that way.
 
 The idea is that we can iterate a lot faster when writing new tests (one
 less excuse not to write them) and get better use out of the tests when
 debugging things and testing fixes.  teuthology is fine for mass-running the
 nightlies etc, but it's overkill for testing individual bits of MDS/client
 functionality.
 
 The code is currently on the wip-vstart-runner ceph-qa-suite branch, and the
 two magic commands are:
 
 1. Start a vstart cluster with a couple of MDSs, as your normal user:
 $ make -j4 rados ceph-fuse ceph-mds ceph-mon ceph-osd cephfs-data-scan
 cephfs-journal-tool cephfs-table-tool && ./stop.sh ; rm -rf out dev ; MDS=2
 OSD=3 MON=1 ./vstart.sh -d -n
 
 2. Invoke the test runner, as root (replace paths, test name as appropriate.
 Leave off the test name to run everything):
 # PYTHONPATH=/home/jspray/git/teuthology/:/home/jspray/git/ceph-qa-suite/
 python /home/jspray/git/ceph-qa-suite/tasks/cephfs/vstart_runner.py
 tasks.cephfs.test_strays.TestStrays.test_migration_on_shutdown
 
 test_migration_on_shutdown (tasks.cephfs.test_strays.TestStrays) ... ok
 
 --
 Ran 1 test in 121.982s
 
 OK
 
 
 ^^^ see!  two minutes, and no waiting for gitbuilders!

You are a testing hero John!

-G


Ceph Day Speakers (Chicago, Raleigh)

2015-07-23 Thread Patrick McGarry
Hey cephers,

Since Ceph Days for both Chicago and Raleigh are fast approaching, I
wanted to put another call out on the mailing lists for anyone who
might be interested in sharing their Ceph experiences with the
community at either location. If you have something to share
(integration, use case, performance, hardware tuning, etc) please let
me know ASAP. Thanks!

http://ceph.com/cephdays



-- 

Best Regards,

Patrick McGarry
Director Ceph Community || Red Hat
http://ceph.com  ||  http://community.redhat.com
@scuttlemonkey || @ceph


Performance degradation on big cluster

2015-07-23 Thread Ketor D
Hi all,
   We are seeing a performance degradation in one of our clusters. Our
randwrite latency degraded from 1ms to 5ms (fio -ioengine=rbd
-iodepth=1).
   The cluster has about 200 OSDs running on Intel 3500 SSDs, and we run
both qemu and ceph-osd on the hosts. The network for ceph is 10GbE.
   While the cluster was smaller and did not host so many qemu processes,
the IO latency was about 1ms, but now the latency is 5ms.
   I used strace to get the time per syscall; all syscalls (writev,
io_submit, recvfrom, sendmsg, lseek, fgetxattr, etc.) take 300us to 600us.
The syscall time on a small and idle cluster is near 0us.
   After checking several clusters, I came to a conclusion:
   num_of_osds   num_of_threads_on_host   time_of_syscall(us)
   200           1                        300-600
   100           5000                     200-500
   70            2500                     100-300
   9             750                      20-60
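
(For anyone who wants to reproduce the measurement, a sketch of the strace
usage; the pid is a placeholder:)

$ strace -f -T -e trace=writev,io_submit,recvfrom,sendmsg,lseek,fgetxattr -p <ceph-osd-pid>
# -f follows threads, -T appends the time spent in each syscall
$ strace -f -c -p <ceph-osd-pid>   # or an aggregate per-syscall summary instead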

   The threads on one of the 200-OSD cluster's hosts look like this:
   name        num_of_processes   num_of_threads   num_of_threads_per_process
   qemu-kvm    49                 9748             198
   ceph-osd    6                  5707             951

   We are running Firefly 0.80.7 and 0.80.9, and qemu version is 2.1.

   I think too many threads on the host lead to high latency in the
ceph-osd processes, and cause high I/O latency on the client side.
   Anyone's help is welcome.

Thanks!


Re: registering for tracker.ceph.com

2015-07-23 Thread Sage Weil
On Thu, 23 Jul 2015, Deneau, Tom wrote:
 I wanted to register for tracker.ceph.com to enter a few issues but never
 got the confirming email and my registration is now in some stuck state
 (not complete but name/email in use so can't re-register).  Any suggestions?

It does that sometimes... not sure why.  I activated your tdeneau user and 
deleted the later tmdeneau one.

sage



vstart runner for cephfs tests

2015-07-23 Thread John Spray


Audience: anyone working on cephfs, general testing interest.

The tests in ceph-qa-suite/tasks/cephfs are growing in number, but kind 
of inconvenient to run because they require teuthology (and therefore 
require built packages, locked nodes, etc).  Most of them don't actually 
require anything beyond what you already have in a vstart cluster, so 
I've adapted them to optionally run that way.


The idea is that we can iterate a lot faster when writing new tests (one 
less excuse not to write them) and get better use out of the tests when 
debugging things and testing fixes.  teuthology is fine for mass-running 
the nightlies etc, but it's overkill for testing individual bits of 
MDS/client functionality.


The code is currently on the wip-vstart-runner ceph-qa-suite branch, and 
the two magic commands are:


1. Start a vstart cluster with a couple of MDSs, as your normal user:
$ make -j4 rados ceph-fuse ceph-mds ceph-mon ceph-osd cephfs-data-scan 
cephfs-journal-tool cephfs-table-tool && ./stop.sh ; rm -rf out dev ; 
MDS=2 OSD=3 MON=1 ./vstart.sh -d -n


2. Invoke the test runner, as root (replace paths, test name as 
appropriate.  Leave off the test name to run everything):
# 
PYTHONPATH=/home/jspray/git/teuthology/:/home/jspray/git/ceph-qa-suite/ 
python /home/jspray/git/ceph-qa-suite/tasks/cephfs/vstart_runner.py 
tasks.cephfs.test_strays.TestStrays.test_migration_on_shutdown


test_migration_on_shutdown (tasks.cephfs.test_strays.TestStrays) ... ok

--
Ran 1 test in 121.982s

OK


^^^ see!  two minutes, and no waiting for gitbuilders!

The main caveat here is that it needs to run as root in order to 
mount/unmount things, which is a little scary.  My plan is to split it 
out into a little root service for doing mount operations, and then let 
the main test part run as a normal user and call out to the mounter 
service when needed.


Cheers,
John