Re: [ceph-users] Re: Re: Can't start ceph-mon through systemctl start ceph-mon@.service after upgrading from Hammer to Jewel

2017-06-23 Thread Curt
Did you set "setuser match path" in your config?  If you look at the
release notes for Infernalis, they outline how to still use the ceph user.
Also worth noting, from the Infernalis release notes:

"Ceph daemons now run as user and group ceph by default. The ceph user has
a static UID assigned by Fedora and Debian (also used by derivative
distributions like RHEL/CentOS and Ubuntu). On SUSE the ceph user will
currently get a dynamically assigned UID when the user is created."
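
For reference, a minimal sketch of what that option can look like in
ceph.conf, assuming a custom mon data path like the one mentioned later in
this thread (the paths here are only illustrative; the idea is that the
daemon keeps running as whatever user and group own the matching data
directory instead of being forced to ceph:ceph):

    [mon]
    mon data = /home/ceph/software/ceph/var/lib/ceph/mon/$cluster-$id
    # run the daemon as the owner of the matching data directory
    setuser match path = /home/ceph/software/ceph/var/lib/ceph/mon/$cluster-$id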

On Thu, Jun 22, 2017 at 11:40 PM, 许雪寒  wrote:

> I set the "mon_data" configuration item and the "user" configuration item in
> my ceph.conf, and start ceph-mon as the user "ceph".
> I tested directly calling the "ceph-mon" command to start the daemon as
> "root" and as "ceph", and there was no problem. Only when starting through
> systemctl does the start fail.
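
A sketch of how that difference could be narrowed down from the systemd side
(unit and host names follow the thread; whether the shipped Jewel unit sets
sandboxing options such as ProtectHome=true is an assumption worth verifying,
since a mon data path under /home would be invisible to the service if it
does):

    # see why the unit failed and what it actually runs
    journalctl -u ceph-mon@ceph1 --no-pager | tail -n 50
    systemctl cat ceph-mon@ceph1

    # if the unit sandboxes /home, a drop-in override can relax it
    sudo mkdir -p /etc/systemd/system/ceph-mon@.service.d
    printf '[Service]\nProtectHome=false\n' | \
        sudo tee /etc/systemd/system/ceph-mon@.service.d/override.conf
    sudo systemctl daemon-reload
    sudo systemctl start ceph-mon@ceph1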
>
> From: David Turner [mailto:drakonst...@gmail.com]
> Sent: June 22, 2017 20:47
> To: 许雪寒; Linh Vu; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Re: Can't start ceph-mon through systemctl start
> ceph-mon@.service after upgrading from Hammer to Jewel
>
> Did you previously edit the init scripts to look in your custom location?
> Those could have been overwritten. As was mentioned, Jewel changed what
> user the daemon runs as, but you said that you tested running the daemon
> manually under the ceph user? Was this without sudo? It used to run as root
> under Hammer and would have needed to be chown'd recursively to allow the
> ceph user to run it.
>
> On Thu, Jun 22, 2017, 4:39 AM 许雪寒  wrote:
> I set mon_data to “/home/ceph/software/ceph/var/lib/ceph/mon”, and its
> owner has always been “ceph” since we were running Hammer.
> I also tried setting the permissions to “777”, but that didn’t work either.
>
>
> From: Linh Vu [mailto:v...@unimelb.edu.au]
> Sent: June 22, 2017 14:26
> To: 许雪寒; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Can't start ceph-mon through systemctl start
> ceph-mon@.service
> after upgrading from Hammer to Jewel
>
> Permissions of your mon data directory under /var/lib/ceph/mon/ might have
> changed as part of Hammer -> Jewel upgrade. Have you had a look there?
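
A quick way to check and, if needed, fix ownership (the upgrade notes
suggest a recursive chown when switching the daemons to the ceph user; the
custom path below follows the earlier message and is only illustrative):

    ls -ld /home/ceph/software/ceph/var/lib/ceph/mon
    ls -l  /home/ceph/software/ceph/var/lib/ceph/mon
    # if anything is still owned by root:
    sudo chown -R ceph:ceph /home/ceph/software/ceph/var/lib/ceph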
> 
> From: ceph-users  on behalf of 许雪寒 <
> xuxue...@360.cn>
> Sent: Thursday, 22 June 2017 3:32:45 PM
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] Can't start ceph-mon through systemctl start
> ceph-mon@.service after upgrading from Hammer to Jewel
>
> Hi, everyone.
>
> I upgraded one of our ceph clusters from Hammer to Jewel. After upgrading,
> I can’t start ceph-mon through “systemctl start ceph-mon@ceph1”, while,
> on the other hand, I can start ceph-mon, either as user ceph or root, if I
> directly call “/usr/bin/ceph-mon --cluster ceph --id ceph1 --setuser ceph
> --setgroup ceph”. I looked at “/var/log/messages” and found that the reason
> systemctl can’t start ceph-mon is that ceph-mon can’t access its configured
> data directory. Why can’t ceph-mon access its data directory when it’s
> called by systemctl?
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph.conf and monitors

2017-05-31 Thread Curt
Hello all,

Had a recent issue with ceph monitors and OSDs when connecting to the
second/third monitor.  I don't have any debug logs to paste at the moment, but
wanted to get feedback on my ceph.conf for the monitors.

This is the Giant release.

Here's the error from monB that stuck out: "osd_map(174373..174373 src has
173252..174373)...failed lossy con, dropping message".  If I understand
that correctly, the OSD had an older version of the map than the mon
did? The OSD logs show auth errors (decoding block, failed verifying auth
reply).  Changing the ceph.conf to point only to monA fixed the issue, but
it's only a temporary workaround.

Any suggestions on the cause or recommended fix for this?

Original conf (IPs changed):
mon_initial_members = monitorA
mon_host = 1.1.1.1 ,1.1.1.2, 1.1.1.3

Conf now:
mon_initial_members = monitorA
mon_host = 1.1.1.1

Does mon_initial_members need to list all the mons?  Any other config places I
should check?
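
On the mon_initial_members question: that setting matters mainly at initial
bootstrap (it defines which mons must form the first quorum), while clients
and OSDs use mon_host to find monitors, so mon_host is the line that needs
every reachable mon.  A few standard checks that might help narrow this down
(nothing here is specific to this cluster):

    # which monitors the cluster itself knows about, and who is in quorum
    ceph mon stat
    ceph mon dump
    ceph quorum_status --format json-pretty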

Cheers,
Curt
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Restart ceph cluster

2017-05-12 Thread Curt
As others have said, the best bet is to update the conf and then just use
injectargs, but if you need to restart a group of OSDs you could script it.
Assuming you are on Linux, you could do something like:

# If you wanted to restart OSDs 1-10
for i in {1..10}; do
    # find which host the OSD lives on via its crush location
    HOST=$(ceph osd find ${i} | jq -r .crush_location.host)
    ssh ${HOST} sudo stop ceph-osd id=${i}
    sleep 5
    ssh ${HOST} sudo start ceph-osd id=${i}
    sleep 10
done

This isn't something I've ever had to do, and others have pointed out the
issues with restarting a cluster all at once, but that's how you could do
it.   You could also tweak it to take the OSD IDs as arguments, and make
the ssh call a single restart command rather than separate stop/start.
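
For the injectargs route mentioned above, a minimal sketch (the options shown
are only illustrations; most runtime settings can be injected this way, and
any change should also go into ceph.conf so it survives restarts):

    # push a changed option to all OSDs at runtime, no restart needed
    ceph tell osd.* injectargs '--osd_max_backfills 2'
    # the same works per daemon
    ceph tell osd.12 injectargs '--osd_recovery_max_active 1'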

On Fri, May 12, 2017 at 8:49 AM, Алексей Усов 
wrote:

> Greetings,
>
> Could someone please tell me how I restart all daemons in a cluster
> after I make changes in ceph.conf, if that's even needed? Since
> enterprise-scale ceph clusters usually comprise hundreds of
> OSDs, I doubt one must restart the entire cluster by hand or use some sort
> of external orchestration tool - there must be a centralized solution in
> ceph itself. Thank you in advance.
>
> --
> Best Regards, Usov A.Y.
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Reg: Ceph-deploy install - failing

2017-05-08 Thread Curt
Hello,

I don't use CentOS, but I've seen the same thing with Ubuntu, so I'm going to
assume it's the same problem.  The repo URL should be download.ceph.com
instead of just ceph.com, which is what ceph-deploy uses when it adds the
repo.  My usual fix is to correct the repo URL to point to download.ceph.com
and pass the --no-adjust-repos flag to ceph-deploy.
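
A sketch of that workflow, assuming the Hammer repo on CentOS 7 and that the
repo file is the one ceph-deploy wrote (the filename and flag are worth
verifying against your ceph-deploy version before running anything):

    # on the target node: point the existing repo at download.ceph.com
    sudo sed -i 's#://ceph.com#://download.ceph.com#g' /etc/yum.repos.d/ceph.repo
    sudo yum clean all

    # from the admin node: install without letting ceph-deploy rewrite the repo
    ceph-deploy install --no-adjust-repos cide-lceph-mon2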

Cheers

On Sun, May 7, 2017 at 3:51 PM, psuresh  wrote:

> Hi,
>
> When I run "ceph-deploy install lceph-mon2" from the admin node, I'm getting
> the following error.   Any clue?
>
> [cide-lceph-mon2][DEBUG ] connected to host: cide-lceph-mon2
> [cide-lceph-mon2][DEBUG ] detect platform information from remote host
> [cide-lceph-mon2][DEBUG ] detect machine type
> [ceph_deploy.install][INFO  ] Distro info: CentOS Linux 7.3.1611 Core
> [cide-lceph-mon2][INFO  ] installing ceph on cide-lceph-mon2
> [cide-lceph-mon2][INFO  ] Running command: yum clean all
> [cide-lceph-mon2][DEBUG ] Loaded plugins: fastestmirror, langpacks,
> priorities
> [cide-lceph-mon2][DEBUG ] Cleaning repos: Ceph Ceph-noarch base
> ceph-source epel extras updates
> [cide-lceph-mon2][DEBUG ] Cleaning up everything
> [cide-lceph-mon2][INFO  ] adding EPEL repository
> [cide-lceph-mon2][INFO  ] Running command: yum -y install epel-release
> [cide-lceph-mon2][DEBUG ] Loaded plugins: fastestmirror, langpacks,
> priorities
> [cide-lceph-mon2][DEBUG ] Determining fastest mirrors
> [cide-lceph-mon2][DEBUG ]  * base: centos.excellmedia.net
> [cide-lceph-mon2][DEBUG ]  * epel: ftp.cuhk.edu.hk
> [cide-lceph-mon2][DEBUG ]  * extras: centos.excellmedia.net
> [cide-lceph-mon2][DEBUG ]  * updates: centos.excellmedia.net
> [cide-lceph-mon2][DEBUG ] Package epel-release-7-9.noarch already
> installed and latest version
> [cide-lceph-mon2][DEBUG ] Nothing to do
> [cide-lceph-mon2][INFO  ] Running command: yum -y install yum-priorities
> [cide-lceph-mon2][DEBUG ] Loaded plugins: fastestmirror, langpacks,
> priorities
> [cide-lceph-mon2][DEBUG ] Loading mirror speeds from cached hostfile
> [cide-lceph-mon2][DEBUG ]  * base: centos.excellmedia.net
> [cide-lceph-mon2][DEBUG ]  * epel: ftp.cuhk.edu.hk
> [cide-lceph-mon2][DEBUG ]  * extras: centos.excellmedia.net
> [cide-lceph-mon2][DEBUG ]  * updates: centos.excellmedia.net
> [cide-lceph-mon2][DEBUG ] Package yum-plugin-priorities-1.1.31-40.el7.noarch
> already installed and latest version
> [cide-lceph-mon2][DEBUG ] Nothing to do
> [cide-lceph-mon2][DEBUG ] Configure Yum priorities to include obsoletes
> [cide-lceph-mon2][WARNIN] check_obsoletes has been enabled for Yum
> priorities plugin
> [cide-lceph-mon2][INFO  ] Running command: rpm --import
> https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
> [cide-lceph-mon2][INFO  ] Running command: rpm -Uvh --replacepkgs
> http://ceph.com/rpm-hammer/el7/noarch/ceph-release-1-0.el7.noarch.rpm
> [cide-lceph-mon2][WARNIN] error: open of  failed: No such file or
> directory
> [cide-lceph-mon2][WARNIN] error: open of Index failed: No
> such file or directory
> [cide-lceph-mon2][WARNIN] error: open of of failed: No such file or
> directory
> [cide-lceph-mon2][WARNIN] error: open of /rpm-hammer/
> failed: No such file or directory
> [cide-lceph-mon2][WARNIN] error: open of  directory
> [cide-lceph-mon2][WARNIN] error: open of bgcolor=white> failed: No such
> file or directory
> [cide-lceph-mon2][WARNIN] error: open of Index failed: No such file or
> directory
> [cide-lceph-mon2][WARNIN] error: open of of failed: No such file or
> directory
> [cide-lceph-mon2][WARNIN] error: open of /rpm-hammer/ failed: No such file or directory
> [cide-lceph-mon2][WARNIN] error: open of href=../>../ failed: No such
> file or directory
> [cide-lceph-mon2][WARNIN] error: open of  directory
> [cide-lceph-mon2][WARNIN] error: open of href=el6/>el6/ failed: No
> such file or directory
> [cide-lceph-mon2][WARNIN] error: open of 24-Apr-2016 failed: No such file
> or directory
> [cide-lceph-mon2][WARNIN] error: open of 00:05 failed: No such file or
> directory
> [cide-lceph-mon2][WARNIN] error: -: not an rpm package (or package
> manifest):
> [cide-lceph-mon2][WARNIN] error: open of  directory
> [cide-lceph-mon2][WARNIN] error: open of href=el7/>el7/ failed: No
> such file or directory
> [cide-lceph-mon2][WARNIN] error: open of 29-Aug-2016 failed: No such file
> or directory
> [cide-lceph-mon2][WARNIN] error: open of 11:53 failed: No such file or
> directory
> [cide-lceph-mon2][WARNIN] error: -: not an rpm package (or package
> manifest):
> [cide-lceph-mon2][WARNIN] error: open of  directory
> [cide-lceph-mon2][WARNIN] error: open of href=fc20/>fc20/ failed: No
> such file or directory
> [cide-lceph-mon2][WARNIN] error: open of 07-Apr-2015 failed: No such file
> or directory
> [cide-lceph-mon2][WARNIN] error: open of 19:21 failed: No such file or
> directory
> [cide-lceph-mon2][WARNIN] error: -: not an rpm package (or package
> manifest):
> [cide-lceph-mon2][WARNIN] error: open of  directory
> [cide-lceph-mon2][WARNIN] error: open of href=rhel6/

[ceph-users] Monitor issues

2017-05-04 Thread Curt Beason
Hello,

So at some point during the night, our monitor 1 server rebooted for a
so-far-unknown reason.  When it came back up, the clock was skewed by 6 hours.
There were no writes happening when I got alerted to the issue.  ceph shows
all OSDs up and in, but no op/s and 600+ blocked requests.  I logged into
mon1, fixed the clock and restarted it.  Ceph status showed all mons up and
no skew, but still no op/s.

I checked the OSD logs and saw cephx auth errors, which, per the Ceph website,
can be caused by clock skew.  So I tried restarting one OSD to check, and saw
the same thing.  So I stopped mon1, figuring the cluster would roll over to
use mon2/3 and get us back up and running.

Well, the OSDs weren't showing as up, so I checked my ceph.conf file to see
why they weren't failing over to mon2/3 and noticed it only had the IP for
mon1, so I updated ceph.conf with the IPs for mon2/3 and restarted; the OSDs
came back up and started talking again.

So right now, mon1 is offline and I only have mon2/3 running.  Without
knowing why mon1 was having issues, I don't want to start it and bring it
back in, just to have the cluster freak out.  At the same time, I'd like to
get back to having all three mons in quorum.  I'm still reviewing the logs on
mon1 to try to see if there are any errors that might point me to the issue.

In the meantime, my questions are: Do you think it would be worth trying to
start mon1 again and see what happens?  If it still has issues, will my
OSDs fail over to mon2/3 now that the conf is correct?  Are there any other
issues that might arise from bringing it back in?
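
A hedged sketch of what could be checked before and after starting mon1 again
(monitor names are whatever your monmap actually uses; mon1 here just follows
the thread):

    # confirm mon2/3 currently form a healthy quorum
    ceph quorum_status --format json-pretty
    ceph mon stat

    # make sure time is actually in sync on mon1 before it rejoins
    ntpq -p        # or: chronyc sources

    # after starting mon1, watch it sync and rejoin the quorum
    ceph -w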

The other option I could think of would be to deploy a new monitor 4 and then
remove monitor 1, but I think this could lead to other issues if I am
reading the docs correctly.

All our PGs are active+clean, so the cluster is in a healthy state.  The
only warnings are from having set noscrub and nodeep-scrub and from 1 mon
being down.

Any advice would be greatly appreciated.  Sorry for the long windedness of
it and scattered thought process.

Thanks,
Curt
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Upgrade osd ceph version

2017-03-03 Thread Curt Beason
Hello,

So this is going to be a noob question, probably.  I read the documentation,
but it didn't really cover upgrading to a specific version.

We have a cluster with mixed versions.  While I don't want to upgrade to the
latest version of ceph, I would like to upgrade the OSDs so they are all
on the same version.  Most of them are on 0.87.1 or 0.87.2.  There are 2
servers with OSDs on 0.80.10.  What is the best way to go through and
upgrade them all to 0.87.2?

They are all running Ubuntu 14 with kernel 3.13 or newer.
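
A hedged sketch of what that might look like on one of the 0.80.10 (Firefly)
hosts, assuming Ubuntu 14.04 (trusty) and that the Giant repo still carries
0.87.2 (the repo path and the OSD id are illustrative and worth verifying
before running anything):

    # see what each OSD is actually running
    ceph tell osd.* version

    # on the host to be upgraded: point apt at the Giant repo and upgrade
    echo "deb https://download.ceph.com/debian-giant/ trusty main" | \
        sudo tee /etc/apt/sources.list.d/ceph.list
    sudo apt-get update
    sudo apt-get install --only-upgrade ceph ceph-common

    # then restart the OSDs on that host one at a time, waiting for HEALTH_OK
    sudo restart ceph-osd id=12
    ceph -s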

Cheers,
Curt
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com