Re: [ceph-users] Emergency! Production cluster is down

2016-07-13 Thread Wido den Hollander

> On 12 July 2016 at 23:10, Chandrasekhar Reddy wrote:
> 
> 
> Hi Wido,
> 
> Thank you for helping out. It worked like a charm. I followed these steps:
> 
> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/#removing-monitors
> 
> Can you share any good docs that deal with backups?
> 

Backups for Ceph really depend on the use case; there is no general 
recommendation.

With Jewel, for example, you can use RBD mirroring to back up RBD data, and 
with CephFS you can use old-fashioned rsync.
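
As a rough illustration of the two options, here is a small Python sketch that only builds the command lines and does not run them. The mount point, backup directory and pool name are hypothetical. Enabling mirroring on a pool is also only one step: RBD mirroring additionally needs a peer cluster, a running rbd-mirror daemon, and images with the journaling feature enabled.

```python
from typing import List


def rsync_backup_command(source_dir: str, target_dir: str) -> List[str]:
    """Build an rsync call that mirrors a mounted CephFS tree to target_dir."""
    # Trailing slashes make rsync copy the directory contents,
    # not the directory itself.
    src = source_dir.rstrip("/") + "/"
    dst = target_dir.rstrip("/") + "/"
    # -a preserves permissions, ownership and timestamps; --delete removes
    # files from the copy that no longer exist in the source.
    return ["rsync", "-a", "--delete", src, dst]


def rbd_mirror_enable_command(pool: str) -> List[str]:
    """Build the call that enables pool-mode RBD mirroring (Jewel and later)."""
    return ["rbd", "mirror", "pool", "enable", pool, "pool"]


if __name__ == "__main__":
    # Hypothetical paths and pool name, for illustration only.
    print(" ".join(rsync_backup_command("/mnt/cephfs", "/backup/cephfs")))
    print(" ".join(rbd_mirror_enable_command("volumes")))
```

Note that --delete propagates deletions into the copy, so drop it if the backup should keep files that were removed from CephFS.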

Wido

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Emergency! Production cluster is down

2016-07-12 Thread Chandrasekhar Reddy
Hi Wido,

Thank you for helping out. It worked like a charm. I followed these steps:

http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/#removing-monitors

Can you share any good docs that deal with backups?

Thanks,
Chandra.


Re: [ceph-users] Emergency! Production cluster is down

2016-07-12 Thread Chandrasekhar Reddy
Thanks, Wido. I will give it a try.

Thanks,
Chandra


Re: [ceph-users] Emergency! Production cluster is down

2016-07-12 Thread Wido den Hollander

> On 12 July 2016 at 19:00, Chandrasekhar Reddy wrote:
> 
> 
> Thanks for the quick reply.
> 
> Do I need to remove cephx on the OSD nodes as well?
> 
Disable cephx on all nodes in ceph.conf.

See: http://docs.ceph.com/docs/master/rados/configuration/auth-config-ref/

Add this to the [global] section:

auth_cluster_required = none
auth_service_required = none
auth_client_required = none
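
If you would rather script the change than edit the file by hand on every node, a minimal Python sketch could look like the one below. It uses the standard configparser module on the text of a ceph.conf and is only an illustration: configparser drops comments when writing, and it would not recognise the same options if they are spelled with spaces instead of underscores.

```python
import configparser
import io

AUTH_KEYS = (
    "auth_cluster_required",
    "auth_service_required",
    "auth_client_required",
)


def disable_cephx(conf_text: str) -> str:
    """Return conf_text with the three auth options in [global] set to none."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    if not parser.has_section("global"):
        parser.add_section("global")
    for key in AUTH_KEYS:
        parser.set("global", key, "none")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical minimal ceph.conf, for illustration only.
    sample = "[global]\nfsid = abc\nauth_cluster_required = cephx\n"
    print(disable_cephx(sample))
```

The daemons only read ceph.conf at startup, so they have to be restarted after the change.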

You still have the problem that your monitor map contains 3 monitors. You 
removed two of them from ceph.conf, but that is not sufficient. You will need 
to inject a monmap containing just one monitor into the remaining monitor.
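
The reason editing ceph.conf is not enough: monitors need a strict majority of the monitors listed in the monmap to form a quorum. A tiny Python sketch of that arithmetic (the function names are just for illustration):

```python
def quorum_size(monitors_in_monmap: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    return monitors_in_monmap // 2 + 1


def can_form_quorum(monitors_in_monmap: int, monitors_alive: int) -> bool:
    """True if the live monitors are a majority of those in the monmap."""
    return monitors_alive >= quorum_size(monitors_in_monmap)


if __name__ == "__main__":
    # The situation in this thread: the monmap lists 3 monitors, 1 is alive.
    print(can_form_quorum(3, 1))  # False: 2 of 3 are needed
    # After injecting a monmap that lists only the surviving monitor.
    print(can_form_quorum(1, 1))  # True
```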

BEFORE YOU DO, CREATE A BACKUP OF THE MON'S DATA STORE.

I don't know the commands off the top of my head, but 'monmaptool' is 
something you will need.
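
For reference, the documented procedure for removing monitors from an unhealthy cluster boils down to extract, edit, inject. The Python sketch below only assembles those commands as a checklist and executes nothing. The surviving monitor ID is taken from the log file name in this thread; the failed monitor names and the data directory passed in the example are assumptions and must be replaced with the real values.

```python
from typing import List


def monmap_recovery_plan(
    surviving_mon: str,
    failed_mons: List[str],
    mon_data_dir: str,
    monmap_path: str = "/tmp/monmap",
) -> List[List[str]]:
    """Return the commands that shrink the monmap to the surviving monitor.

    Nothing is executed here. Run the steps by hand, with the surviving
    ceph-mon daemon stopped first.
    """
    plan = [
        # 1. Back up the monitor's data store before touching it.
        ["cp", "-a", mon_data_dir, mon_data_dir + ".bak"],
        # 2. Extract the current monmap from the stopped monitor.
        ["ceph-mon", "-i", surviving_mon, "--extract-monmap", monmap_path],
    ]
    # 3. Remove every failed monitor from the extracted map.
    for name in failed_mons:
        plan.append(["monmaptool", monmap_path, "--rm", name])
    # 4. Print the map to check that only the survivor is left.
    plan.append(["monmaptool", "--print", monmap_path])
    # 5. Inject the reduced map; afterwards start the monitor again.
    plan.append(["ceph-mon", "-i", surviving_mon, "--inject-monmap", monmap_path])
    return plan


if __name__ == "__main__":
    # "mon-b" and "mon-c" and the data directory are placeholders.
    steps = monmap_recovery_plan(
        "openstack01-vm001",
        ["mon-b", "mon-c"],
        "/var/lib/ceph/mon/ceph-openstack01-vm001",
    )
    for step in steps:
        print(" ".join(step))
```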

Wido


Re: [ceph-users] Emergency! Production cluster is down

2016-07-12 Thread Chandrasekhar Reddy
Thanks for the quick reply.

Do I need to remove cephx on the OSD nodes as well?

Thanks,
Chandra


Re: [ceph-users] Emergency! Production cluster is down

2016-07-12 Thread Oliver Dzombic
Hi,

First aid: remove cephx authentication.

-- 
Best regards

Oliver Dzombic
IP-Interactive

mailto:i...@ip-interactive.de

Address:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402, Amtsgericht Hanau (district court)
Management: Oliver Dzombic

Tax no.: 35 236 3622 1
VAT ID: DE274086107


[ceph-users] Emergency! Production cluster is down

2016-07-12 Thread Chandrasekhar Reddy

Hi guys,

I need help. I had 3 monitor nodes and 2 went down (their disks got 
corrupted). After some time the 3rd monitor also became unresponsive, so I 
rebooted the 3rd node. It came up, but Ceph is not working.

So I tried removing the 2 failed monitors from the ceph.conf file and 
restarted the mon and the OSDs, but Ceph is still not up.

Please find the log files linked below.

1. Log file ceph-mon.openstack01-vm001.log (monitor node)

http://paste.openstack.org/show/530944/

2. ceph.conf

http://paste.openstack.org/show/530945/

3. ceph -w output

http://paste.openstack.org/show/530947/

4. ceph mon dump

http://paste.openstack.org/show/530950/

The errors I see are:

monclient(hunting): authenticate timed out after 300

librados: client.admin authentication error (110) Connection timed out

Any suggestions? Please help.

Thanks,
Chandra


Re: [ceph-users] Emergency! Production Cluster is down

2013-12-08 Thread Mark Nelson

Hello Howie,

Is your cluster still down?

If you have a support contract with us please make sure to submit a 
support ticket so that our professional services team sees it.


If not, I'd suggest looking through the logs on the hosts that have 
remaining monitors and seeing if they say anything.  You can also set 
debug mon = 20 in your ceph.conf file and restart the mons to get more 
debugging info.
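
With debug mon = 20 the logs get large, so a small filter can help. The Python sketch below parses lines in the "timestamp thread level message" layout seen in the output quoted in this thread and keeps those whose message contains one of a few keywords. Both the layout (which may differ between Ceph versions) and the keywords are assumptions.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional


class LogLine(NamedTuple):
    timestamp: str
    thread: str
    level: int
    message: str


# Layout assumed from the lines quoted in this thread, for example:
# 2013-12-07 22:24:57.693246 7f7ee21cc700  0 monclient(hunting): ...
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+([0-9a-f]+)\s+(-?\d+)\s+(.*)$"
)


def parse_line(line: str) -> Optional[LogLine]:
    """Split one log line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    timestamp, thread, level, message = match.groups()
    return LogLine(timestamp, thread, int(level), message)


def matching_lines(lines: Iterable[str], keywords: Iterable[str]) -> Iterator[LogLine]:
    """Yield parsed lines whose message contains any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    for line in lines:
        parsed = parse_line(line)
        if parsed and any(k in parsed.message.lower() for k in wanted):
            yield parsed


if __name__ == "__main__":
    sample = [
        "2013-12-07 22:24:57.693246 7f7ee21cc700  0 monclient(hunting): "
        "authenticate timed out after 300",
    ]
    for hit in matching_lines(sample, ["timed out", "quorum"]):
        print(hit.timestamp, hit.message)
```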


Mark


Re: [ceph-users] Emergency! Production Cluster is down

2013-12-08 Thread Wolfgang Schulze
Hi Howie,

Our support team is available 24/7. You might have called the wrong number.
We conduct onboarding sessions with our subscription customers where we
brief new customers on how to get assistance even at 1am on a Saturday
night. 

I will send you a pm with more information.

Regards,
Wolfgang
VP Services, Inktank


[ceph-users] Emergency! Production Cluster is down

2013-12-07 Thread Howie C.
Hello guys,

Tonight, when I was trying to remove 2 monitors from the production cluster, 
everything seemed fine, but all of a sudden I could no longer connect to the 
cluster. It shows:
root@mon01:~# ceph mon dump
2013-12-07 22:24:57.693246 7f7ee21cc700  0 monclient(hunting): authenticate timed out after 300
2013-12-07 22:24:57.693291 7f7ee21cc700  0 librados: client.admin authentication error (110) Connection timed out
Error connecting to cluster: TimedOut


I tried to call Inktank, but no one is there.

Any suggestions? Please help!

-- 
Howie C.


Re: [ceph-users] Emergency! Production Cluster is down

2013-12-07 Thread Mark Kirkwood

How many monitors did you have (before removing the 2)? Check that the 
ceph.conf on the host where you are running the mon dump has them all 
listed (otherwise use the -m switch to specify one you know is still 
there)!
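
To make the -m suggestion concrete, here is a minimal Python sketch that builds such a call without running it. The address in the example is a placeholder from the documentation range, and 6789 is the default monitor port; an IPv6 address would need brackets around it.

```python
from typing import List

DEFAULT_MON_PORT = 6789


def mon_dump_command(mon_host: str, port: int = DEFAULT_MON_PORT) -> List[str]:
    """Build a 'ceph mon dump' call that targets one specific monitor."""
    return ["ceph", "-m", f"{mon_host}:{port}", "mon", "dump"]


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder address, for illustration only.
    print(" ".join(mon_dump_command("192.0.2.10")))
```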


It might be that the remaining ones are just taking a few moments to 
decide on a quorum.


Regards

Mark