Troublesome documentation link on http://ceph.com
So I go to http://ceph.com and click on the "Documentation" link. In the HTML source the linked URL is http://ceph.com/docs, but the HTTP server rewrites this to http://docs.ceph.com/docs/v0.80.5/

Could someone arrange for this to be changed, s/v0.80.5/master/ ?

(Searching for Ceph documentation is already difficult as it is, because Google does not provide links to the master version, only to old, outdated versions - see for example the first two results of a Google search for "troubleshoot pgs in ceph": http://tinyurl.com/nfuohss )

Nathan Cutler
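P.S. For illustration only - I have no visibility into the actual web server configuration, so both the rewrite mechanism and the rule below are assumptions on my part - the requested change might look something like this if the redirect were an Apache mod_rewrite rule:

    # hypothetical rule; the real configuration may look quite different
    RewriteRule ^/docs/?(.*)$ http://docs.ceph.com/docs/v0.80.5/$1 [R=301,L]
    # ...would become...
    RewriteRule ^/docs/?(.*)$ http://docs.ceph.com/docs/master/$1 [R=301,L]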
Re: Firefly EOL date - still Jan 2016?
> Does anyone on the stable release team have an interest in doing
> releases beyond that date, or should we announce that as a firm date?

For now my vote is to stick to the schedule and declare EOL on January 31, but I'm willing to negotiate :-)

Nathan
Would it make sense to require ntp
Hi Ceph:

Recently I encountered a "clock skew" issue with 0.94.3. I have some small demo clusters in AWS. When I boot them up, in most cases the cluster starts in HEALTH_WARN due to clock skew on some of the MONs.

I surmise that this is due to a race condition between the ceph-mon and ntpd systemd services. Sometimes ntpd.service starts *after* ceph-mon - in that case the MON sees a wrong/unsynchronized time value. Now, even though ntpd.service starts (and fixes the time value) very soon afterwards, the cluster remains in the clock-skew warning state for a long time - but that is a separate issue.

What I would like to ask is this: is there any reasonable Ceph cluster node configuration that does not include running the NTP daemon? If the answer is "no", would it make sense to make NTP a runtime dependency and tell the ceph-mon systemd service to wait for ntpd.service before it starts?

Thanks and regards

Nathan Cutler
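P.S. A minimal sketch of the ordering dependency I have in mind, expressed as a systemd drop-in for the MON unit (the unit names are an assumption - ntpd.service vs. ntp.service vs. chronyd.service differs by distro, and the MON unit name should be checked against the packaged units):

    # /etc/systemd/system/ceph-mon@.service.d/ntp.conf (hypothetical drop-in)
    [Unit]
    # do not start the MON until the NTP daemon has been started
    After=ntpd.service
    Wants=ntpd.service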
Re: a home for backport snippets
> ceph-workbench set-release --token $github_token --key $redmine_key
>
> What do you think ?

Would be nice to be able to put $github_token and $redmine_key into a configuration file.

Nathan Cutler
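P.S. For illustration, the configuration file could be as simple as an ini-style file that ceph-workbench consults before falling back to the command-line options. The file name and key names below are just a suggestion, not an existing convention:

    # ~/.ceph-workbench/config (hypothetical)
    [ceph-workbench]
    github-token = <token>
    redmine-key = <key>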
Re: civetweb upstream/downstream divergence
IMHO the first step should be to get rid of the evil submodule. Arguably the most direct path leading to this goal is to simply package up the downstream civetweb (i.e. 1.6 plus all the downstream patches) for all the supported distros. The resulting package would be Ceph-specific, obviously, so it could be called "civetweb-ceph".

Like Ken says, the upstreaming effort can continue in parallel. After we get Ceph/RGW working fine with civetweb-ceph 1.6, we can rebase the package to upstream civetweb 1.7.

I am not volunteering to do all the work, but we at SUSE are certainly prepared to shoulder our share of it.

Nathan Cutler
Re: [ceph-users] who is using radosgw with civetweb?
On 11/02/2015 06:01 PM, Martin Millnert wrote:
> Minimum the documentation at
> http://docs.ceph.com/docs/master/radosgw/config-ref/ could be blessed
> with an entry on 'rgw frontends', including notes on how to configure it
> for loopback-binding access only.

Agreed: http://tracker.ceph.com/issues/13670

Nathan Cutler
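P.S. To make the request concrete, the documentation entry could show something along these lines for loopback-only access (a sketch - the port is arbitrary, and whether the IP:port form is accepted by the civetweb embedded in a given release is exactly the kind of detail the documentation should spell out):

    # ceph.conf (sketch)
    [client.radosgw.gateway]
    rgw frontends = civetweb port=127.0.0.1:7480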
civetweb upstream/downstream divergence
Hi Ceph:

The civetweb code in RGW is taken from https://github.com/ceph/civetweb/ which is a fork of https://github.com/civetweb/civetweb. The last commit to our fork took place on March 18. Upstream civetweb development has progressed ("This branch is 19 commits ahead, 972 commits behind civetweb:master.")

Are there plans to rebase to a newer upstream version, or should we think more in terms of backporting (to ceph/civetweb.git) from upstream (civetweb/civetweb.git) when we need to fix bugs or add features?

Thanks and regards

Nathan Cutler
Distro support in infernalis and above
Hi!

I recently noted that infernalis no longer supports RHEL 6. Does this extend to the ceph-common package? If so, we could greatly simplify the spec file . . . something along the lines of https://github.com/ceph/ceph/pull/6225

Regards

Nathan Cutler
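P.S. To illustrate the kind of simplification I mean (a made-up example, not the actual diff in the PR), conditional blocks along these lines could simply disappear once RHEL 6 is no longer a build target:

    # hypothetical RHEL-6-only conditional in ceph.spec.in
    %if 0%{?rhel} == 6
    Requires:      python-argparse
    %endif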
Isn't --allhosts dangerous?
Taking actions cluster-wide via a shell script seems like it could have dangerous unexpected side effects, and indeed, while looking through old tickets, I ran across http://tracker.ceph.com/issues/9407 . . .

This is "fixed" (by getting rid of the --allhosts option to init-ceph) in https://github.com/ceph/ceph/pull/6089

The same PR also adds getopt to make processing of command-line options more flexible.

Opinions? Reviews?

Nathan Cutler
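P.S. For anyone not familiar with getopt in this context, this is roughly the pattern it enables (an illustration only, not the code in the PR):

    # minimal getopt(1) illustration, not the actual init-ceph code
    TEMP=$(getopt -o c:v --long conf:,verbose -n 'init-ceph' -- "$@") || exit 1
    eval set -- "$TEMP"
    while true; do
        case "$1" in
            -c|--conf)    conf="$2"; shift 2 ;;
            -v|--verbose) verbose=1; shift ;;
            --)           shift; break ;;
            *)            break ;;
        esac
    done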
Firefly help
Hi Loic:

For some reason I cannot reach teuthology.front.sepia.ceph.com - I turn on the VPN but the machine does not respond to pings or ssh. (Nor does the gateway machine, for that matter.)

If you feel inclined to help, could you start the standard round of integration tests on firefly-backports?

Thanks,
Nathan
Re: Backporting from Infernalis and c++11
> With Infernalis Ceph move to c++11 (and CMake), we will see more conflicts
> when backporting bug fixes to Hammer.

Good point, Loic!

> Any ideas you may have to better deal with this would be most welcome.

A couple of thoughts pop into mind. When I joined the project, I was told that doing backports is a good way to get into the codebase, and after some months I can confirm that this is true. Loic has literally bent over backwards to help me along the way, and thanks to that I have made some progress. Still, the factor determining whether a backport is trivial or non-trivial is often my own "cluelessness".

I would suggest to developers that they keep backporting in the back of their mind as they design and implement bugfixes. Will the backport be doable even by a relatively inexperienced backporter? Is there a way to make it easier on the backporter? I would suggest that it is in the developers' best interest to make a little extra effort in this direction, as it will reduce the probability of the backporter asking them for help later ;-)

Regards

Nathan Cutler
try-restart on upgrade, and upgrade procedures in general
Hi all:

I have been tinkering with the %preun and %postun scripts in ceph.spec.in - in particular, the ones for the "ceph" and "ceph-radosgw" packages. Recently, as part of the "wip-systemd" effort, these snippets were updated for compatibility with systemd. Since the "Upgrade procedures" documentation [1] is going to have to be updated anyway, I hope we might have a discussion on these upgrade procedures.

Based on my research and discussions to date, it seems like there are two camps. The first camp says: "upgrade should not touch running daemons; restarting them should be left to the admin." This is closely related to the idea that daemons should be upgraded and restarted individually: i.e., mons first, then osds, etc. The second camp says: "since the typical workflow for upgrading a package in Linux distributions involves having the package itself automatically restart running daemons, the Ceph package should do this, too."

The first camp's position appears to be motivated primarily by a desire to keep the cluster up and running during the upgrade, and to minimize disruption by proceeding "daemon by daemon". The second camp's position is driven by distribution packaging conventions and the fact that all the Ceph daemons and systemd units (except RGW) are packaged together. This lends itself to a "node by node" approach to upgrading, rather than "daemon by daemon". (Also, since there is always a risk that an upgrade might cause an entire node to fail, Ceph clusters need to be able to cope with an entire node going offline for upgrade. This might even be an argument for *recommending* "node by node" as an upstream-sanctioned upgrade procedure!)

It was suggested to me that a nice way to reconcile these two camps would be to introduce an /etc/sysconfig/ceph (/etc/default/ceph) option, which I have provisionally called CEPH_AUTO_RESTART_ON_UPGRADE. If this option is set to "yes", the packaging scriptlet that is run on upgrade would do a "systemctl try-restart" on all the systemd units in the respective package. If it were not set, or set to any value other than "yes", the current behavior would be preserved.

Opinions? Ideas? So far, I have opened https://github.com/ceph/ceph/pull/5835 with the RPM implementation.

[1] http://ceph.com/docs/master/install/upgrading-ceph/#upgrade-procedures

Nathan Cutler
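P.S. A rough sketch of the scriptlet logic being proposed (illustrative only - the unit to restart, the upgrade check, and the exact wiring are precisely what the PR is meant to settle):

    # sketch of the RPM %postun behavior on upgrade
    %postun
    test -f /etc/sysconfig/ceph && . /etc/sysconfig/ceph
    if [ $1 -ge 1 ] && [ "$CEPH_AUTO_RESTART_ON_UPGRADE" = "yes" ] ; then
        # restart only those units that are actually running
        /usr/bin/systemctl try-restart ceph.target > /dev/null 2>&1 || :
    fi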
Re: [Backport] assigning a pull request when asking permission to merge
> What do you think ?

Good idea. Done for all the pending firefly PRs as well.

Nathan Cutler
Re: Enabling contributors to run teuthology suites
> We don't yet have a long term solution to fund this kind of testing, but
> that's not a blocker. It's cheap enough to setup and I'm willing to spend
> 500 euros per month of my own money for the time being. It's really selfish
> of me : enabling contributors will save me time that's worth more than 500
> euros ;-)

I suggest that Loic provide PayPal details so folks can help him defray the cost of running teuthology on his cluster.

Nathan Cutler
Re: hammer/firefly backports upgrade suite
> It turns out the mistake was to use upgrade/firefly-x and upgrade/hammer-x
> instead of upgrade/firefly and upgrade/hammer. When -c hammer-backports or
> -c firefly-backports is given to teuthology-suite, it will upgrade to this
> version, from older versions of the same stable branch. I've updated:
>
> http://tracker.ceph.com/issues/11644#upgrade
> http://tracker.ceph.com/issues/11990#upgrade
>
> with the fix and they are running (using the OpenStack cluster though; there
> may be issues with the firefly upgrade suite, but the hammer upgrade suite
> should run fine - it has run at http://ceph.aevoo.fr:8081/ enough times to
> suggest there are no OpenStack specific issues).

Hi Loic:

As you predicted, the firefly upgrade suite finished with quite a few failures. I looked at the first one and the log said:

    ERROR:teuthology.misc:. teuthology/test/integration/openrc.sh openstack network show -f json unknown error
    ERROR: openstack No networks with a name or ID of 'unknown' found

Nathan
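P.S. For anyone following along, the invocation under discussion is along these lines (a from-memory sketch - in practice additional options such as machine type, priority, and email are given as well):

    teuthology-suite --suite upgrade/firefly -c firefly-backports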
Re: upstream/firefly exporting the same snap 2 times results in different exports
On 2015-07-22 09:03, Stefan Priebe - Profihost AG wrote:
> That would be really important. I've seen that this one was already in
> upstream/firefly-backports. What's the purpose of that branch?

That is where the Stable Releases and Backports team stages backports and does integration testing on them before they are merged into the 'firefly' named branch.

Nathan Cutler
Firefly 0.80.11 integration testing summary
Hi Loic:

Current status of integration testing for the upcoming firefly 0.80.11 release is as follows:

* rados: all green
* rgw: currently re-running 11 tests (known bug that has since been fixed)
* fs: currently re-running 4 tests that failed (probably environment-related)
* rbd: encountered test failures that I was unable to debug -- opened http://tracker.ceph.com/issues/12336
* powercycle: all green
* upgrade: encountered similar-looking test failures as in rbd

For details, see http://tracker.ceph.com/issues/11644

Regards

Nathan Cutler
Duplicate issues?
Hi Loic:

Please look at the following two issues that are earmarked for backport to firefly:

http://tracker.ceph.com/issues/8674 osd: cache tier: avoid promotion on first read
http://tracker.ceph.com/issues/9064 RadosModel assertion failure

They both seem to be fixed by the same commit, namely:

https://github.com/ceph/ceph/commit/0ed3adc1e0a74bf9548d1d956aece11f019afee0

. . . which would mean that the corresponding backport tracker issues are duplicates? Can you confirm/deny this?

Thanks,
Nathan
Re: dealing with conflicting pull requests
> For the first time we have a few conflicting pull requests that prevent
> merging all in the integration branch. ... I think the simplest way to deal
> with that is to arbitrarily pick a few that apply cleanly and just DNM the
> others so they wait for the next round of integration testing. After the
> first round of integration test, the chosen one will be merged and the DNM
> will have to be rebased to resolve the conflict, but that's what we do
> routinely. What do you think ?

I think your suggestion is reasonable and I can't think of any better way to proceed, so +1 !

Nathan
Re: xattrs vs. omap with radosgw
> We've since merged something that stripes over several small xattrs so that
> we can keep things inline, but it hasn't been backported to hammer yet. See
> c6cdb4081e366f471b372102905a1192910ab2da.

Hi Sage:

You wrote "yet" - should we earmark it for hammer backport?

Nathan
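P.S. For anyone who wants to check the backport status themselves, one quick way to see which branches already contain that commit is:

    # lists the remote branches that contain the commit
    git branch -r --contains c6cdb4081e366f471b372102905a1192910ab2da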
Re: preparing v0.80.11
> Could I also ask for this one to be backported?
> https://github.com/ceph/ceph/pull/4844
> It breaks a couple of setups I know of. It's not in master yet, but it's a
> very trivial fix.

Hi Wido:

firefly backport: https://github.com/ceph/ceph/pull/4861
hammer backport: https://github.com/ceph/ceph/pull/4862

Please take a look.

Nathan
Re: preparing v0.80.11
> Would it be possible to backport this as well to 0.80.11:
> http://tracker.ceph.com/issues/9792#change-46498
>
> And I think this commit would be the easiest to backport:
> https://github.com/ceph/ceph/commit/6b982e4cc00f9f201d7fbffa0282f8f3295f2309
>
> This way we add a simple safeguard against pool removal into Firefly as well.

Hi Wido:

Thanks for this suggestion. I have cherry-picked the commit you quoted and it will soon be in firefly-backports. The backport tracker issue is: http://tracker.ceph.com/issues/11801

Regards,
Nathan
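P.S. For the record, backports are staged as cherry-picks with -x, so the resulting commit records where it came from. Roughly (the branch and remote names below are illustrative, not a fixed convention):

    git checkout -b firefly-11801 ceph/firefly
    git cherry-pick -x 6b982e4cc00f9f201d7fbffa0282f8f3295f2309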
Re: preparing v0.80.11
Hi Loic:

The first round of 0.80.11 backports, including all trivial backports (where trivial is defined as those I was able to do by myself without help), is now ready for integration testing in the firefly-backports branch of the SUSE fork: https://github.com/SUSE/ceph/commits/firefly-backports

The non-trivial backports (on which I hereby solicit help) are:

http://tracker.ceph.com/issues/11699 Objecter: resend linger ops on split
http://tracker.ceph.com/issues/11700 make all the osd/filestore thread pool suicide timeouts separately configurable
http://tracker.ceph.com/issues/11704 erasure-code: misalignment
http://tracker.ceph.com/issues/11720 rgw deleting S3 objects leaves __shadow_ objects behind

Nathan
Re: preparing v0.80.11
> Although v0.80.10 is not out yet, the odds of discovering a problem that
> would require an additional backport are low. I think you can start with
> v0.80.11 without further delay :-)

As soon as I get over the flu :-(

Nathan
Re: Proposal for a Backport tracker
Hi Loic:

As I understand the new backport workflow, the process of creating tickets in the Backport tracker will be automated. This raises a question: as the automation script loops over each Pending Backport ticket, it will first create a corresponding ticket (or tickets) in the Backport tracker. After that, I'm thinking it will need to change the status of the original ticket from Pending Backport to something else (so it doesn't pick up the same ticket again the next time the script is run).

So the question is, what will the original ticket's status be changed to? Resolved?

Nathan
Re: Proposal for a Backport tracker
> We can probably keep the original ticket in the Pending Backport status
> until all backports are resolved, as we currently do.
>
>> So the question is, what will the original ticket's status be changed to?
>> Resolved?
>
> When all backports are resolved, then we can also resolve the original
> ticket. Do you foresee a problem with that ?

Yes and no. It's a scripting problem, not a workflow problem. I'll try to describe the scenario I'm envisioning:

We have a simple script that loops over all tickets with status Pending Backport. We run it once per week. The first week we run it, the script finds 3 such tickets. For each one it creates a ticket in the Backport tracker. A week goes by, during which developers mark 4 more tickets Pending Backport. It's time to run the script again. Since it is looping over all tickets marked Pending Backport, it now finds 7 such tickets: 3 from last week and the 4 new ones. For each one it dutifully creates a new ticket in the Backport tracker. Now we have duplicate Backport tickets for 3 bugfixes.

But I guess it will be trivial to make the script check whether a Backport ticket is already open and refrain from opening a new one in that case. The same script could also check further: if Backport tickets already exist for the bugfix and all of them are marked Resolved, then automatically mark the original ticket Resolved as well.

Nathan
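P.S. To make the duplicate check concrete, here is a rough sketch of how the script could ask Redmine whether an issue already has a related Backport ticket, via the REST API (the tracker name "Backport" and the key handling are placeholders and would have to match the real tracker configuration):

    import requests

    REDMINE = "http://tracker.ceph.com"
    KEY = "..."  # personal Redmine API key

    def has_backport_ticket(issue_id):
        """Return True if the issue already has a related Backport ticket."""
        r = requests.get(
            "%s/issues/%d.json" % (REDMINE, issue_id),
            params={"include": "relations", "key": KEY},
        )
        r.raise_for_status()
        for rel in r.json()["issue"].get("relations", []):
            # a relation links two issues; look at the one that is not us
            other_id = rel["issue_to_id"] if rel["issue_id"] == issue_id else rel["issue_id"]
            other = requests.get(
                "%s/issues/%d.json" % (REDMINE, other_id), params={"key": KEY}
            ).json()["issue"]
            if other["tracker"]["name"] == "Backport":
                return True
        return False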
Re: Proposal for a Backport tracker
> * a backport issue is created in the Backport tracker and is Related to
>   the original issue

Hi Loic:

I like the idea of having backport tickets in a separate Redmine subproject/issue tracker. In fact, I would go even a step further and have separate subprojects for each target version (hammer backports, firefly backports). Having all the backports for one release (e.g. hammer) in one place is advantageous in pretty much every way I can think of: easier to see what has been done and what needs to be done, easier to search, easier to automate, less risk of munging something not related to backports... It also has the effect of removing all the backport-related tickets from the bug tracker(s).

Nathan