Re: [atlas] Probe #1000165 showed as down in tab#status

2021-03-02 Thread Colin Johnston
So whilst volunteering as NHS vaccination stewards, we are still awaiting 
the probe-up notifications by email, which we have asked for before, so that 
we can see them on our mobile devices.

Col


> On 2 Mar 2021, at 18:06, Robert Kisteleki wrote:
> 
> Hi,
> 
> On 2021-03-02 17:59, Jacques Lavignotte wrote:
>> On 02/03/2021 at 12:46, Viktor Naumov wrote:
>>> the controller's IPv6 connectivity was not restored.
>> For post-mortem analysis:
>> Probe #1000165 is hosted by an IPv4-only machine (no IPv6)
> 
> Well, in this curious case the machine ("controller") itself thought it 
> *should* have IPv6, which indeed was the expected situation. Yet it didn't. 
> Therefore it was very confused about its own state and as a safety measure it 
> told its probes "hang on a minute while I figure out my own situation" :-)
> 
> We designed the controller-probe protocol so that it can handle cases such as 
> this; that is, the probes will try reconnecting, and in really bad cases the 
> system drives them to a different controller. In the meantime they execute 
> what they were asked and store results.
> 
>> J.
>> Anyway: the probe has been OK for 5 hours, 42 minutes
> 
> Good, good!
> 
> Cheers,
> Robert
> 
> 




Re: [atlas] Probe #1000165 showed as down in tab#status

2021-03-02 Thread Robert Kisteleki

Hi,

On 2021-03-02 17:59, Jacques Lavignotte wrote:
> On 02/03/2021 at 12:46, Viktor Naumov wrote:
>> the controller's IPv6 connectivity was not restored.
> For post-mortem analysis:
> Probe #1000165 is hosted by an IPv4-only machine (no IPv6)

Well, in this curious case the machine ("controller") itself thought it 
*should* have IPv6, which indeed was the expected situation. Yet it 
didn't. Therefore it was very confused about its own state and as a 
safety measure it told its probes "hang on a minute while I figure out 
my own situation" :-)

We designed the controller-probe protocol so that it can handle cases 
such as this; that is, the probes will try reconnecting, and in really 
bad cases the system drives them to a different controller. In the 
meantime they execute what they were asked and store results.
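
The behaviour described above (keep measuring, store results, retry, and 
eventually move to another controller) can be illustrated with a small 
Python model. This is only a sketch under stated assumptions, not the actual 
RIPE Atlas protocol: the class `ProbeSession`, the `connect` callback, the 
retry count and the controller names are all hypothetical, and in the real 
system it is the infrastructure that steers probes to another controller, 
whereas here the probe simply walks a list.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class ProbeSession:
    """Toy model of a probe that buffers results while no controller is reachable."""

    def __init__(self, controllers: List[str], connect: Callable[[str], bool],
                 max_retries: int = 3) -> None:
        self.controllers = controllers      # hypothetical names, in preference order
        self.connect = connect              # stand-in for a real connection attempt
        self.max_retries = max_retries      # attempts per controller before moving on
        self.buffer: Deque[str] = deque()   # results stored while disconnected
        self.delivered: List[str] = []      # results handed over to a controller
        self.current: Optional[str] = None  # controller we are attached to, if any

    def record(self, result: str) -> None:
        """Measurements keep running; each result is stored locally first."""
        self.buffer.append(result)
        if self.current is not None:
            self.flush()

    def flush(self) -> None:
        """Deliver stored results in the order they were recorded."""
        while self.buffer:
            self.delivered.append(self.buffer.popleft())

    def reconnect(self) -> bool:
        """Retry each controller in turn; move to the next after repeated failures."""
        self.current = None
        for name in self.controllers:
            for _ in range(self.max_retries):
                if self.connect(name):
                    self.current = name
                    self.flush()
                    return True
        return False


# Example: the first controller never answers, the second one does.
probe = ProbeSession(["ctr-a", "ctr-b"], connect=lambda name: name == "ctr-b")
probe.record("result 1")   # stored locally, no controller attached yet
probe.reconnect()          # attaches to "ctr-b" and delivers the stored result
print(probe.current, probe.delivered)
```

The point of the design is that a controller outage delays delivery of 
results rather than losing them.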



> J.
> Anyway: the probe has been OK for 5 hours, 42 minutes

Good, good!

Cheers,
Robert




Re: [atlas] Probe #1000165 showed as down in tab#status

2021-03-02 Thread Jacques Lavignotte




On 02/03/2021 at 12:46, Viktor Naumov wrote:
> the controller's IPv6 connectivity was not restored.

For post-mortem analysis:

Probe #1000165 is hosted by an IPv4-only machine (no IPv6)

J.

Anyway: the probe has been OK for 5 hours, 42 minutes



--
GnuPg : 156520BBC8F5B1E3 Because privacy matters.
« When do we eat? » AD (c) (tm)



Re: [atlas] Probe #1000165 showed as down in tab#status

2021-03-02 Thread Gert Doering
Hi,

On Tue, Mar 02, 2021 at 12:46:05PM +0100, Viktor Naumov wrote:
> There was network maintenance at Hetzner last night. A few days earlier, 
> ctr-fsn01 was reinstalled. Due to a minor network misconfiguration, the 
> controller's IPv6 connectivity was not restored.

I know people that have preached "if you do IPv6, *always* remember to
monitor IPv4 *and* IPv6, all the time, for all services" for like 15+
years now...

And I've been told that you have the largest monitoring system ever
at your disposal... :-)
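
A per-address-family check of the kind described above could look like the 
following Python sketch. It is a minimal illustration, not how RIPE NCC 
monitors its controllers: the function names `reachable` and 
`dual_stack_status` are made up here, and a plain TCP connect is assumed to 
be a good enough health signal.

```python
import socket
from typing import Dict


def reachable(host: str, port: int, family: socket.AddressFamily,
              timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds over `family`."""
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except OSError:
        return False  # the host has no usable address in this family
    for fam, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(fam, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            continue  # try the next address of the same family, if any
    return False


def dual_stack_status(host: str, port: int, timeout: float = 3.0) -> Dict[str, bool]:
    """Check IPv4 and IPv6 separately so that one cannot mask the other."""
    return {
        "ipv4": reachable(host, port, socket.AF_INET, timeout),
        "ipv6": reachable(host, port, socket.AF_INET6, timeout),
    }


# Usage (needs network access, so it is left commented out):
# print(dual_stack_status("ctr-fsn01.atlas.ripe.net", 443))
```

Checking each family on its own matters because a dual-stack client will 
usually fall back to IPv4 without complaint, which hides a broken IPv6 path.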

Gert Doering
-- NetMaster
-- 
have you enabled IPv6 on something today...?

SpaceNet AG  Vorstand: Sebastian v. Bomhard, Michael Emmer
Joseph-Dollinger-Bogen 14  Aufsichtsratsvors.: A. Grundner-Culemann
D-80807 Muenchen HRB: 136055 (AG Muenchen)
Tel: +49 (0)89/32356-444 USt-IdNr.: DE813185279




Re: [atlas] Probe #1000165 showed as down in tab#status

2021-03-02 Thread Viktor Naumov

Hi Jacques,

Thank you for reporting!

There was network maintenance at Hetzner last night. A few days earlier, 
ctr-fsn01 was reinstalled. Due to a minor network misconfiguration, the 
controller's IPv6 connectivity was not restored.


Now the problem is solved.

Best regards
/vty

On 3/2/21 11:17 AM, Jacques Lavignotte wrote:
> Dear RIPE NCC crew,
>
> Some diagnostics below:
>
> # systemctl status atlas
>
> mars 02 11:12:43 melusine.eu.org ATLAS[32011]: Do a controller INIT
> mars 02 11:12:43 melusine.eu.org ATLAS[32011]: Controller init -p 443 at...@ctr-fsn01.atlas.ripe.net  INIT
> mars 02 11:12:43 melusine.eu.org ATLAS[32011]: 255 controller INIT exit with error
>
> The probe seems to have been down since last night's scheduled machine reboot.
>
> Any help fixing this is welcome.
>
> Thanks, Jacques






Re: [atlas] Probe #1000165 showed as down in tab#status

2021-03-02 Thread Jacques Lavignotte

Probe is back.

Sorry for the noise.

Have a nice day,

Jacques




[atlas] Probe #1000165 showed as down in tab#status

2021-03-02 Thread Jacques Lavignotte

Dear RIPE NCC crew,

Some diagnostics below:

# systemctl status atlas

mars 02 11:12:43 melusine.eu.org ATLAS[32011]: Do a controller INIT
mars 02 11:12:43 melusine.eu.org ATLAS[32011]: Controller init -p  443 at...@ctr-fsn01.atlas.ripe.net  INIT
mars 02 11:12:43 melusine.eu.org ATLAS[32011]: 255 controller INIT exit with error


The probe seems to have been down since last night's scheduled machine reboot.

Any help fixing this is welcome.

Thanks, Jacques
