Troublesome documentation link on http://ceph.com

2015-12-01 Thread Nathan Cutler
So I go to http://ceph.com and click on the "Documentation" link. In the 
HTML source code the linked URL is http://ceph.com/docs but the HTTP 
server rewrites this to:


http://docs.ceph.com/docs/v0.80.5/

Could someone arrange for this to be changed s/v0.80.5/master/ ?
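
For illustration, if the redirect is implemented with something like an
Apache mod_rewrite rule (an assumption on my part - I don't know how the
server is actually configured), the fix might be as small as:

    # hypothetical mod_rewrite rule; the real server config may differ
    RewriteRule ^/docs/?(.*)$ http://docs.ceph.com/docs/master/$1 [R=301,L]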

( Searching for Ceph documentation is already difficult as it is, 
because Google does not provide links to the master version, but only to 
old outdated versions - see for example the first two search results in 
a Google search for "troubleshoot pgs in ceph": http://tinyurl.com/nfuohss )


--
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037





Re: Firefly EOL date - still Jan 2016?

2015-11-13 Thread Nathan Cutler
> Does anyone on the stable release team have an interest in doing
> releases beyond that date, or should we announce that as a firm date?

For now my vote is to stick to the schedule and declare EOL on January
31, but I'm willing to negotiate :-)

Nathan


Would it make sense to require ntp

2015-11-06 Thread Nathan Cutler
Hi Ceph:

Recently I encountered a "clock skew" issue with 0.94.3. I have
some small demo clusters in AWS. When I boot them up, in most cases the
cluster will start in HEALTH_WARN due to clock skew on some of the MONs.

I surmise that this is due to a race condition between the ceph-mon and
ntpd systemd services. Sometimes ntpd.service starts *after* ceph-mon -
in this case the MON sees a wrong/unsynchronized time value.

Now, even though ntpd.service starts (and fixes the time value) very
soon afterwards, the cluster continues to report clock skew for a long
time - but that is a separate issue. What I would like to ask is this:

Is there any reasonable Ceph cluster node configuration that does not
include running the NTP daemon?

If the answer is "no", would it make sense to make NTP a runtime
dependency and tell the ceph-mon systemd service to wait for
ntpd.service before it starts?
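
If so, a systemd drop-in along these lines might do it (just a sketch -
the exact unit and target names would need to be verified, and
time-sync.target, systemd's generic "clock is synchronized" ordering
point, only helps if the NTP implementation actually hooks into it):

    # hypothetical /etc/systemd/system/ceph-mon@.service.d/time-sync.conf
    [Unit]
    Wants=time-sync.target
    After=time-sync.target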

Thanks and regards

-- 
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


Re: a home for backport snippets

2015-11-05 Thread Nathan Cutler
>   ceph-workbench set-release --token $github_token --key $redmine_key
> 
> What do you think ?

Would be nice to be able to put $github_token and $redmine_key into a
configuration file.
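
Something along these lines, say (file name and format are just a sketch):

    # hypothetical ~/.ceph-workbench.conf
    [credentials]
    github-token = 0123456789abcdef
    redmine-key  = fedcba9876543210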

-- 
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


Re: civetweb upstream/downstream divergence

2015-11-03 Thread Nathan Cutler
IMHO the first step should be to get rid of the evil submodule. Arguably
the most direct path leading to this goal is to simply package up the
downstream civetweb (i.e. 1.6 plus all the downstream patches) for all
the supported distros. The resulting package would be Ceph-specific,
obviously, so it could be called "civetweb-ceph".
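
The spec file for such a package might start out roughly like this
(purely illustrative - name, version and the patch list are placeholders):

    # hypothetical civetweb-ceph.spec skeleton
    Name:    civetweb-ceph
    Version: 1.6
    Release: 1%{?dist}
    Summary: Ceph-specific build of the civetweb embedded web server
    # plus the downstream patches currently carried in ceph/civetweb.git
    Patch0:  0001-example-downstream-fix.patch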

Like Ken says, the upstreaming effort can continue in parallel.

After we get Ceph/RGW working fine with civetweb-ceph 1.6, we can rebase
the package to upstream civetweb 1.7.

I am not volunteering to do all the work, but we at SUSE are certainly
prepared to shoulder our share of it.

-- 
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


Re: [ceph-users] who is using radosgw with civetweb?

2015-11-02 Thread Nathan Cutler
On 11/02/2015 06:01 PM, Martin Millnert wrote:
> Minimum the documentation at
> http://docs.ceph.com/docs/master/radosgw/config-ref/ could be blessed
> with an entry on 'rgw frontends', including notes on how to configure it
> for loopback-binding access only.

Agreed: http://tracker.ceph.com/issues/13670
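
For the loopback-only case I imagine the documented example would look
something like this (untested, and whether civetweb accepts an
address:port value here may depend on the version in use):

    [client.radosgw.gateway]
    rgw frontends = "civetweb port=127.0.0.1:7480"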

-- 
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


civetweb upstream/downstream divergence

2015-10-29 Thread Nathan Cutler
Hi Ceph:

The civetweb code in RGW is taken from https://github.com/ceph/civetweb/
which is a fork of https://github.com/civetweb/civetweb. The last commit
to our fork took place on March 18.

Upstream civetweb development has progressed ("This branch is 19 commits
ahead, 972 commits behind civetweb:master.")

Are there plans to rebase to a newer upstream version or should we think
more in terms of backporting (to ceph/civetweb.git) from upstream
(civetweb/civetweb.git) when we need to fix bugs or add features?
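
If we go the backporting route, the mechanics would presumably be the
usual cherry-picking dance in a clone of ceph/civetweb.git, e.g.:

    git remote add upstream https://github.com/civetweb/civetweb.git
    git fetch upstream
    # pull an individual fix back from upstream, recording its origin
    git cherry-pick -x <commit-sha>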

Thanks and regards

-- 
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


Distro support in infernalis and above

2015-10-11 Thread Nathan Cutler
Hi!

I recently noted that infernalis no longer supports RHEL 6. Does this
extend to the ceph-common package?

If so, we could greatly simplify the spec file . . . something along the
lines of https://github.com/ceph/ceph/pull/6225
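
To give a flavor of the simplification: RHEL-6-only conditionals like
the following (an illustrative snippet, not necessarily verbatim from
ceph.spec.in) could simply be deleted:

    %if 0%{?rhel} && 0%{?rhel} <= 6
    # RHEL 6's system python lacks the argparse module
    Requires: python-argparse
    %endif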

Regards

-- 
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


Isn't --allhosts dangerous?

2015-09-27 Thread Nathan Cutler
Taking actions cluster-wide via a shell script seems like it could have
dangerous unexpected side effects, and indeed while looking through old
tickets, I ran across http://tracker.ceph.com/issues/9407 . . .

This is "fixed" (by getting rid of the --allhosts option to init-ceph)
in https://github.com/ceph/ceph/pull/6089

The same PR also adds getopt to make processing of command-line options
more flexible.
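
For those who haven't seen the pattern, getopt-based parsing looks
roughly like this (a minimal sketch, not the actual code from the PR):

    TEMP=$(getopt -o 'c:' --long 'cluster:,verbose' -n 'init-ceph' -- "$@") || exit 1
    eval set -- "$TEMP"
    while true ; do
        case "$1" in
            -c|--cluster) cluster="$2" ; shift 2 ;;
            --verbose)    verbose=1 ; shift ;;
            --)           shift ; break ;;
        esac
    done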

Opinions? Reviews?

-- 
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


Firefly help

2015-09-26 Thread Nathan Cutler
Hi Loic:

For some reason I cannot reach teuthology.front.sepia.ceph.com - I turn
on the VPN but the machine does not respond to pings or ssh. (Nor does
the gateway machine, for that matter.)

If you feel inclined to help, could you start the standard round of
integration tests on firefly-backports?

Thanks,
Nathan


Re: Backporting from Infernalis and c++11

2015-09-15 Thread Nathan Cutler
> With Infernalis, Ceph moves to C++11 (and CMake), so we will see more
> conflicts when backporting bug fixes to Hammer.

Good point, Loic!

> Any ideas you may have to better deal with this would be most welcome.

A couple of thoughts come to mind.

When I joined the project, I was told that doing backports is a good way
to get into the codebase, and after some months I can confirm that this
is true.

Loic has bent over backwards to help me along the way, and thanks to
that I have made some progress. Still, the factor determining whether a
backport is trivial or non-trivial is often my own "cluelessness".

I would suggest that developers keep backporting in the back of their
minds as they design and implement bugfixes. Will the backport be
doable even by a relatively inexperienced backporter? Is there a way to
make it easier on the backporter?

I would suggest that it is in the developers' best interest to make a
little extra effort in this direction, as it will reduce the
probability of the backporter asking them for help later ;-)

Regards

-- 
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


try-restart on upgrade, and upgrade procedures in general

2015-09-09 Thread Nathan Cutler
Hi all:

I have been tinkering with the %preun and %postun scripts in
ceph.spec.in - in particular, the ones for the "ceph" and "ceph-radosgw"
packages.

Recently, as part of the "wip-systemd" effort, these snippets were
updated for compatibility with systemd. Since the "Upgrade procedures"
documentation[1] is going to have to be updated anyway, I hope we might
have a discussion on these upgrade procedures.

Based on my research and discussions to date, it seems that there are
two camps:

The first camp says "upgrade should not touch running daemons;
restarting them should be left to the admin." This is closely related to
the idea that daemons should be upgraded and restarted individually:
i.e., mons first, then osds, etc.

The second camp says: "since the typical workflow for upgrading a
package in Linux distributions involves having the package itself
automatically restart running daemons, the Ceph package should do
this, too".

The first camp's position appears to be motivated primarily by a desire
to keep the cluster up and running during the upgrade, and minimize
disruption by proceeding "daemon by daemon".

The second camp's position is driven by distribution packaging
conventions and the fact that all the Ceph daemons and systemd units
(except RGW) are packaged together. This lends itself to a "node by
node" approach to upgrading, rather than "daemon by daemon". (Also,
since there is always a risk that an upgrade might cause an entire node
to fail, Ceph clusters need to be able to cope with an entire node going
offline for upgrade. This might even be an argument for *recommending*
"node by node" as an upstream-sanctioned upgrade procedure!)

It was suggested to me that a nice way to reconcile these two camps
would be to introduce an /etc/sysconfig/ceph (/etc/default/ceph) option,
which I have provisionally called CEPH_AUTO_RESTART_ON_UPGRADE. If this
option is set to "yes", the packaging scriptlet that is run on upgrade
would do a "systemctl try-restart" on all the systemd units in the
respective package. If it were not set, or set to any value other than
"yes", the current behavior would be preserved.

Opinions? Ideas?

So far, I have opened https://github.com/ceph/ceph/pull/5835 with the
RPM implementation.

[1] http://ceph.com/docs/master/install/upgrading-ceph/#upgrade-procedures

-- 
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


Re: [Backport] assigning a pull request when asking permission to merge

2015-09-07 Thread Nathan Cutler
> What do you think ?

Good idea. Done for all the pending firefly PRs as well.

-- 
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


Re: Enabling contributors to run teuthology suites

2015-08-24 Thread Nathan Cutler

> We don't yet have a long term solution to fund this kind of testing, but
> that's not a blocker. It's cheap enough to set up and I'm willing to spend
> 500 euros per month of my own money for the time being. It's really
> selfish of me: enabling contributors will save me time that's worth more
> than 500 euros ;-)


I suggest that Loic provide PayPal details so folks can help him defray 
the cost of running teuthology on his cluster.


--
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


Re: hammer/firefly backports upgrade suite

2015-07-30 Thread Nathan Cutler

> It turns out the mistake was to use upgrade/firefly-x and upgrade/hammer-x
> instead of upgrade/firefly and upgrade/hammer. When -c hammer-backports or
> -c firefly-backports is given to teuthology-suite, it will upgrade to this
> version, from older versions of the same stable branch.
>
> I've updated:
>
> http://tracker.ceph.com/issues/11644#upgrade
> http://tracker.ceph.com/issues/11990#upgrade
>
> with the fix and they are running (using the OpenStack cluster though -
> there may be issues with the firefly upgrade suite, but the hammer upgrade
> suite should run fine; it has run at http://ceph.aevoo.fr:8081/ enough
> times to suggest there are no OpenStack-specific issues).


Hi Loic:

As you predicted, the firefly upgrade suite finished with quite a few 
failures. I looked at the first one and the log said:


    ERROR:teuthology.misc:. teuthology/test/integration/openrc.sh
    openstack network show -f json unknown error ERROR: openstack No
    networks with a name or ID of 'unknown' found


Nathan


Re: upstream/firefly exporting the same snap 2 times results in different exports

2015-07-22 Thread Nathan Cutler
On 2015-07-22 09:03, Stefan Priebe - Profihost AG wrote:
> That would be really important. I've seen that this one was already in
> upstream/firefly-backports. What's the purpose of that branch?

That is where the Stable Releases and Backports team stages backports
and does integration testing on them before they are merged into the
'firefly' named branch.

-- 
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


Firefly 0.80.11 integration testing summary

2015-07-16 Thread Nathan Cutler

Hi Loic:

Current status of integration testing for the upcoming firefly 0.80.11 
release is as follows:


* rados: all green
* rgw: currently re-running 11 tests (known bug that has since been fixed)
* fs: currently re-running 4 tests that failed (probably 
environment-related)
* rbd: encountered test failures that I was unable to debug -- opened 
http://tracker.ceph.com/issues/12336

* powercycle: all green
* upgrade: encountered similar-looking test failures as in rbd

For details, see http://tracker.ceph.com/issues/11644

Regards

--
Nathan Cutler
Software Engineer Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037


Duplicate issues?

2015-07-10 Thread Nathan Cutler

Hi Loic:

Please look at the following two issues that are earmarked for backport 
to firefly:


http://tracker.ceph.com/issues/8674 osd: cache tier: avoid promotion on 
first read

http://tracker.ceph.com/issues/9064 RadosModel assertion failure

They both seem to be fixed by the same commit, namely:

https://github.com/ceph/ceph/commit/0ed3adc1e0a74bf9548d1d956aece11f019afee0

. . . which would mean that the corresponding backport tracker issues 
are duplicates?


Can you confirm/deny this?

Thanks,
Nathan


Re: dealing with conflicting pull requests

2015-07-09 Thread Nathan Cutler

> For the first time we have a few conflicting pull requests that prevent
> merging them all into the integration branch.
>
> ...
>
> I think the simplest way to deal with that is to arbitrarily pick a few
> that apply cleanly and just DNM the others so they wait for the next
> round of integration testing. After the first round of integration
> testing, the chosen ones will be merged and the DNM'd ones will have to
> be rebased to resolve the conflicts, but that's what we do routinely.
>
> What do you think ?


I think your suggestion is reasonable and I can't think of any better
way to proceed, so +1!


Nathan



Re: xattrs vs. omap with radosgw

2015-06-17 Thread Nathan Cutler
> We've since merged something that stripes over several small xattrs so
> that we can keep things inline, but it hasn't been backported to hammer
> yet. See c6cdb4081e366f471b372102905a1192910ab2da.

Hi Sage:

You wrote "yet" - should we earmark it for hammer backport?

Nathan


Re: preparing v0.80.11

2015-06-04 Thread Nathan Cutler

> Could I also ask for this one to be backported?
>
> https://github.com/ceph/ceph/pull/4844
>
> It breaks a couple of setups I know of. It's not in master yet, but it's
> a very trivial fix.


Hi Wido:

firefly backport: https://github.com/ceph/ceph/pull/4861
hammer backport: https://github.com/ceph/ceph/pull/4862

Please take a look.

Nathan


Re: preparing v0.80.11

2015-05-28 Thread Nathan Cutler

> Would it be possible to backport this as well to 0.80.11:
>
> http://tracker.ceph.com/issues/9792#change-46498
>
> And I think this commit would be the easiest to backport:
> https://github.com/ceph/ceph/commit/6b982e4cc00f9f201d7fbffa0282f8f3295f2309
>
> This way we add a simple safeguard against pool removal into Firefly as
> well.


Hi Wido:

Thanks for this suggestion. I have cherry-picked the commit you quoted 
and it will soon be in firefly-backports. The backport tracker issue is:


http://tracker.ceph.com/issues/11801

Regards,
Nathan





Re: preparing v0.80.11

2015-05-26 Thread Nathan Cutler

Hi Loic:

The first round of 0.80.11 backports, including all trivial backports 
(where trivial is defined as those I was able to do by myself without 
help), is now ready for integration testing in the firefly-backports 
branch of the SUSE fork:


https://github.com/SUSE/ceph/commits/firefly-backports

The non-trivial backports (on which I hereby solicit help) are:

http://tracker.ceph.com/issues/11699 Objecter: resend linger ops on split
http://tracker.ceph.com/issues/11700 make all the osd/filestore thread
pool suicide timeouts separately configurable

http://tracker.ceph.com/issues/11704 erasure-code: misalignment
http://tracker.ceph.com/issues/11720 rgw deleting S3 objects leaves 
__shadow_ objects behind


Nathan


Re: preparing v0.80.11

2015-05-25 Thread Nathan Cutler

> Although v0.80.10 is not out yet, the odds of discovering a problem that
> would require an additional backport are low. I think you can start with
> v0.80.11 without further delay :-)


As soon as I get over the flu :-(

Nathan



Re: Proposal for a Backport tracker

2015-05-22 Thread Nathan Cutler

Hi Loic:

As I understand the new backport workflow, the process of creating 
tickets in the Backport tracker will be automated. This raises a question:


As the automation script loops over each Pending Backport ticket, it 
will first create a corresponding ticket (or tickets) in the Backport 
tracker. After that, I'm thinking it will need to change the status of 
the original ticket from Pending Backport to something else (so it 
doesn't pick up the same ticket again the next time the script is run).


So the question is, what will the original ticket's status be changed 
to? Resolved?


Nathan


Re: Proposal for a Backport tracker

2015-05-22 Thread Nathan Cutler

> We can probably keep the original ticket in the Pending Backport status
> until all backports are resolved, as we currently do.
>
>> So the question is, what will the original ticket's status be changed
>> to? Resolved?
>
> When all backports are resolved, then we can also resolve the original
> ticket. Do you foresee a problem with that ?



Yes and no. It's a scripting problem, not a workflow problem. I'll try 
to describe the scenario I'm envisioning:


We have a simple script that loops over all tickets with status Pending 
Backport. We run it once per week. The first week we run it, the script 
finds 3 such tickets. For each one it creates a ticket in the Backport 
tracker.


A week goes by, during which developers marked 4 more tickets Pending 
Backport. It's time to run the script again. Since it is looping over 
all tickets marked Pending Backport, it now finds 7 such tickets: 3 
from last week and the 4 new ones. For each one it dutifully creates a 
new ticket in the Backport tracker. Now we have duplicate Backport 
tickets for 3 bugfixes.


But I guess it will be trivial to make the script check if a Backport 
ticket is already open and refrain from opening a new one in this case.


The same script could also check further: if Backport tickets already 
exist for the bugfix and all of them are marked Resolved, then 
automatically mark the original ticket Resolved as well.
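
The check could be as simple as querying the Redmine REST API before
opening anything - a sketch only (the exact filter syntax would need
checking against the Redmine API documentation, and the tracker ID is
made up):

    # does a Backport ticket already reference original issue $ORIG_ID?
    url="http://tracker.ceph.com/issues.json?tracker_id=9&subject=~${ORIG_ID}"
    if curl -s "$url" | grep -q '"total_count":0' ; then
        echo "no backport ticket for issue $ORIG_ID yet - creating one"
    fi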


Nathan


Re: Proposal for a Backport tracker

2015-05-21 Thread Nathan Cutler
>  * a backport issue is created in the Backport tracker and is
>    Related to the original issue

Hi Loic:

I like the idea of having backport tickets in a separate Redmine
subproject/issue tracker. In fact, I would go even a step further and
have separate subprojects for each target version (hammer backports,
firefly backports).

Having all, e.g. hammer, backports in one place is advantageous in
pretty much every way I can think of: easier to see what has been
done, what needs to be done, easier to search, easier to automate,
less risk of munging something not related to backports...

It also has the effect of removing all the backport-related tickets
from the bug tracker(s).

Nathan