Re: [atlas] One-off measurements stopped after 5-6 minutes without results

2023-09-19 Thread Robert Kisteleki

Hello,

As you've probably seen in the other thread, we have a problem in 
storing/processing new results. We're working on fixing this.


I wrote "all else is working fine" there - which I have to correct now: 
the above error affects the probe tagging, where because of the lack of 
"good data" the tagging process removes some tags (as you note below). 
Even though the lack of new data is not the expected scenario, this is 
something we will fix.


Regards,
Robert


On 2023-09-19 20:03, Gerdriaan Mulder wrote:

Hi list,

A few hours ago (around 15:00Z), I launched some one-off 
measurements[1][2][3] on the platform for a select number of probes 
(based on ASN and region). The target in all three cases was 
2a02:1807:1020:700::1 (no hostname, purely the address). Although the 
number of requested and participating probes varied a bit, it seems none 
of the measurements yielded any results.


At the time I didn't look at recent posts on this mailing list, nor was 
I aware of this ongoing incident[4] on the Atlas backend. It seems like 
that incident might be the cause of the (current) unavailability of 
these results.


It would be great if anyone can confirm that (viewing results of) 
user-defined measurements are affected as well.


Aside, I suppose the following messages on "My RIPE Atlas Dashboard" are 
also caused by the mentioned incident[4]?


--->8---
2023-09-18 16:40 UTC Your probe #11nnn was automatically untagged as 
"system: Resolves  Correctly"
2023-09-18 16:40 UTC Your probe #11nnn was automatically untagged as 
"system: Resolves A Correctly"
2023-09-18 16:20 UTC Your probe #11nnn was automatically untagged as 
"system: IPv6 Works"
2023-09-18 16:20 UTC Your probe #11nnn was automatically untagged as 
"system: IPv4 Works"

---8<---

Perhaps the incident page[4] can include a few lines of the current 
(estimated) impact, as well.


Best regards,
Gerdriaan Mulder

[1] https://atlas.ripe.net/frames/measurements/60161706/ (default IPv6 
traceroute)
[2] https://atlas.ripe.net/frames/measurements/60162230/ (IPv6 
traceroute, paris=0)
[3] https://atlas.ripe.net/frames/measurements/60162753/ (default IPv6 
ping)

[4] https://status.ripe.net/incidents/wfd7qywz3v68



--
ripe-atlas mailing list
ripe-atlas@ripe.net
https://lists.ripe.net/mailman/listinfo/ripe-atlas


Re: [atlas] Atlas fully down..

2023-09-19 Thread Robert Kisteleki




On 2023-09-19 23:04, Ernst J. Oud wrote:

> Considering the fact that all of Atlas is completely down for more than 24 
> hrs., I find the silence a bit deafening. No status updates, nothing. Weird.
>
> Not up to RIPE standards.
>
> Regards,
>
> Ernst J. Oud


Good morning,

I'm sad to report that indeed there's still an issue with result 
processing - which is still reflected on the status page.


Specifically, the HBase backend that is responsible for storing and 
retrieving the new (and historic) results is struggling to store the 
data from the last ~24 hours. The teams have been working on solving 
this basically 24/7 since the issue occurred but haven't been successful 
yet.


Everything else (continuing to run existing measurements, creating new ones, 
real-time streaming of the results, APIs, UI, ...) is running undisturbed.


I hope this helps in understanding the extent of the problem, and we'll of 
course let you know when there's progress.


Regards,
Robert



Re: [atlas] Atlas fully down..

2023-09-19 Thread Fearghas Mckay


> On 19 Sep 2023, at 17:04, Ernst J. Oud  wrote:
> 
> Considering the fact that all of Atlas is completely down for more than 24 
> hrs., I find the silence a bit deafening. No status updates, nothing. Weird.

The status update is Degraded Performance, for a non-critical service. 
https://atlas.ripe.net/ acknowledges there is a consumption delay, hardly 
silence.

f


Re: [atlas] Atlas fully down..

2023-09-19 Thread Randy Bush
> Consumption delay according to the main page is up to 16+ hours so
> something is indeed very wrong.

aha!  a symptom.  thanks.

indeed, an issue

randy



Re: [atlas] Atlas fully down..

2023-09-19 Thread Ernst J. Oud
I don’t think I fully understand what you are saying. Do you imply that Atlas 
works for you?

I doubt it since all of it is down, see the status page at ripe.net, I guess 
worldwide. No results are processed, no tests are running, even Magellan is 
down.

Or were you joking? My Dutch sense of humor might be different :-)

Ernst

> On 20 Sep 2023, at 00:06, Randy Bush  wrote:
> 
> 
>> 
>> Considering the fact that all of Atlas is completely down for more
>> than 24 hrs., I find the silence a bit deafening. No status updates,
>> nothing. Weird.
> 
> been down so long it looks like up to me [0]
> 
> as you are probably too young for that reference, how about it looks
> pretty up from here.  perhaps a more specific symptom might help with
> diagnosis.
> 
> randy
> 
> [0] https://en.wikipedia.org/wiki/Been_Down_So_Long_It_Looks_Like_Up_to_Me



Re: [atlas] Atlas fully down..

2023-09-19 Thread Peter Potvin via ripe-atlas
Consumption delay according to the main page is up to 16+ hours so
something is indeed very wrong.

Regards,
Peter Potvin | Executive Director
--
*Accuris Technologies Ltd.*


On Tue, Sep 19, 2023 at 6:06 PM Randy Bush  wrote:

> > Considering the fact that all of Atlas is completely down for more
> > than 24 hrs., I find the silence a bit deafening. No status updates,
> > nothing. Weird.
>
> been down so long it looks like up to me [0]
>
> as you are probably too young for that reference, how about it looks
> pretty up from here.  perhaps a more specific symptom might help with
> diagnosis.
>
> randy
>
> [0] https://en.wikipedia.org/wiki/Been_Down_So_Long_It_Looks_Like_Up_to_Me
>


Re: [atlas] Atlas fully down..

2023-09-19 Thread Randy Bush
> Considering the fact that all of Atlas is completely down for more
> than 24 hrs., I find the silence a bit deafening. No status updates,
> nothing. Weird.

been down so long it looks like up to me [0]

as you are probably too young for that reference, how about it looks
pretty up from here.  perhaps a more specific symptom might help with
diagnosis.

randy

[0] https://en.wikipedia.org/wiki/Been_Down_So_Long_It_Looks_Like_Up_to_Me



[atlas] Atlas fully down..

2023-09-19 Thread Ernst J. Oud
Considering the fact that all of Atlas is completely down for more than 24 
hrs., I find the silence a bit deafening. No status updates, nothing. Weird.

Not up to RIPE standards.

Regards,

Ernst J. Oud




Re: [atlas] One-off measurements stopped after 5-6 minutes without results

2023-09-19 Thread Gerdriaan Mulder

On 19/09/2023 20:13, Ernst J. Oud wrote:

> Hi,
>
> If you look here, it shows that the back-end is dead:
>
> https://atlas.ripe.net/
>
> > 20 hours backlog, 0 results processed.


Ah, another blip on the radar that I never noticed before. Apparently 
the statistics box isn't something that caught my attention despite the 
(somewhat too dark?) red dot near "Consumption delay".


Best regards,
Gerdriaan Mulder



Re: [atlas] One-off measurements stopped after 5-6 minutes without results

2023-09-19 Thread Ernst J. Oud
I stand corrected. Magellan is also dead.

Regards,

Ernst J. Oud

> On 19 Sep 2023, at 20:14, Ernst J. Oud  wrote:
> 
> Hi,
> 
> If you look here, it shows that the back-end is dead:
> 
> https://atlas.ripe.net/
> 
> > 20 hours backlog, 0 results processed.
> 
> Magellan works fine, so streaming is ok, only storage of results seems to be 
> severely impacted.
> 
> Regards,
> 
> Ernst J. Oud
> 
>>> On 19 Sep 2023, at 20:04, Gerdriaan Mulder  wrote:
>>> 
>> Hi list,
>> 
>> A few hours ago (around 15:00Z), I launched some one-off 
>> measurements[1][2][3] on the platform for a select number of probes (based 
>> on ASN and region). The target in all three cases was 2a02:1807:1020:700::1 
>> (no hostname, purely the address). Although the number of requested and 
>> participating probes varied a bit, it seems none of the measurements yielded 
>> any results.
>> 
>> At the time I didn't look at recent posts on this mailing list, nor was I 
>> aware of this ongoing incident[4] on the Atlas backend. It seems like that 
>> incident might be the cause of the (current) unavailability of these results.
>> 
>> It would be great if anyone can confirm that (viewing results of) 
>> user-defined measurements are affected as well.
>> 
>> Aside, I suppose the following messages on "My RIPE Atlas Dashboard" are 
>> also caused by the mentioned incident[4]?
>> 
>> --->8---
>> 2023-09-18 16:40 UTC Your probe #11nnn was automatically untagged as 
>> "system: Resolves  Correctly"
>> 2023-09-18 16:40 UTC Your probe #11nnn was automatically untagged as 
>> "system: Resolves A Correctly"
>> 2023-09-18 16:20 UTC Your probe #11nnn was automatically untagged as 
>> "system: IPv6 Works"
>> 2023-09-18 16:20 UTC Your probe #11nnn was automatically untagged as 
>> "system: IPv4 Works"
>> ---8<---
>> 
>> Perhaps the incident page[4] can include a few lines of the current 
>> (estimated) impact, as well.
>> 
>> Best regards,
>> Gerdriaan Mulder
>> 
>> [1] https://atlas.ripe.net/frames/measurements/60161706/ (default IPv6 
>> traceroute)
>> [2] https://atlas.ripe.net/frames/measurements/60162230/ (IPv6 traceroute, 
>> paris=0)
>> [3] https://atlas.ripe.net/frames/measurements/60162753/ (default IPv6 ping)
>> [4] https://status.ripe.net/incidents/wfd7qywz3v68
>> 


Re: [atlas] One-off measurements stopped after 5-6 minutes without results

2023-09-19 Thread Ernst J. Oud
Hi,

If you look here, it shows that the back-end is dead:

https://atlas.ripe.net/

> 20 hours backlog, 0 results processed.

Magellan works fine, so streaming is ok, only storage of results seems to be 
severely impacted.

Regards,

Ernst J. Oud

> On 19 Sep 2023, at 20:04, Gerdriaan Mulder  wrote:
> 
> Hi list,
> 
> A few hours ago (around 15:00Z), I launched some one-off 
> measurements[1][2][3] on the platform for a select number of probes (based on 
> ASN and region). The target in all three cases was 2a02:1807:1020:700::1 (no 
> hostname, purely the address). Although the number of requested and 
> participating probes varied a bit, it seems none of the measurements yielded 
> any results.
> 
> At the time I didn't look at recent posts on this mailing list, nor was I 
> aware of this ongoing incident[4] on the Atlas backend. It seems like that 
> incident might be the cause of the (current) unavailability of these results.
> 
> It would be great if anyone can confirm that (viewing results of) 
> user-defined measurements are affected as well.
> 
> Aside, I suppose the following messages on "My RIPE Atlas Dashboard" are also 
> caused by the mentioned incident[4]?
> 
> --->8---
> 2023-09-18 16:40 UTC Your probe #11nnn was automatically untagged as 
> "system: Resolves  Correctly"
> 2023-09-18 16:40 UTC Your probe #11nnn was automatically untagged as 
> "system: Resolves A Correctly"
> 2023-09-18 16:20 UTC Your probe #11nnn was automatically untagged as 
> "system: IPv6 Works"
> 2023-09-18 16:20 UTC Your probe #11nnn was automatically untagged as 
> "system: IPv4 Works"
> ---8<---
> 
> Perhaps the incident page[4] can include a few lines of the current 
> (estimated) impact, as well.
> 
> Best regards,
> Gerdriaan Mulder
> 
> [1] https://atlas.ripe.net/frames/measurements/60161706/ (default IPv6 
> traceroute)
> [2] https://atlas.ripe.net/frames/measurements/60162230/ (IPv6 traceroute, 
> paris=0)
> [3] https://atlas.ripe.net/frames/measurements/60162753/ (default IPv6 ping)
> [4] https://status.ripe.net/incidents/wfd7qywz3v68
> 


[atlas] One-off measurements stopped after 5-6 minutes without results

2023-09-19 Thread Gerdriaan Mulder

Hi list,

A few hours ago (around 15:00Z), I launched some one-off 
measurements[1][2][3] on the platform for a select number of probes 
(based on ASN and region). The target in all three cases was 
2a02:1807:1020:700::1 (no hostname, purely the address). Although the 
number of requested and participating probes varied a bit, it seems none 
of the measurements yielded any results.
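For readers who want to reproduce a request like the ones described above, here is a minimal sketch of a one-off IPv6 traceroute specification built as a plain Python dict. The field names (`definitions`, `probes`, `is_oneoff`, `af`, `paris`, `requested`) follow my understanding of the v2 measurement-creation API and should be checked against the current API reference. The ASN (64496, a documentation-only number) and the probe count are placeholders, not the values used in the original measurements. Nothing is sent over the network.

```python
import json


def build_oneoff_spec(target, asn, requested, paris=None):
    """Build a one-off IPv6 traceroute spec as a plain dict (nothing is sent)."""
    definition = {
        "type": "traceroute",
        "af": 6,  # address family: IPv6
        "target": target,
        "protocol": "ICMP",
        "description": f"One-off IPv6 traceroute to {target}",
    }
    # Only include "paris" when explicitly requested, so the platform
    # default applies otherwise.
    if paris is not None:
        definition["paris"] = paris
    return {
        "definitions": [definition],
        # Probe selection by ASN; the ASN here is a placeholder.
        "probes": [{"type": "asn", "value": str(asn), "requested": requested}],
        "is_oneoff": True,
    }


spec = build_oneoff_spec("2a02:1807:1020:700::1", asn=64496, requested=5, paris=0)
print(json.dumps(spec, indent=2))
```

Creating the measurement would mean POSTing this JSON to the measurements endpoint with a suitable API key. That step is left out here on purpose.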


At the time I didn't look at recent posts on this mailing list, nor was 
I aware of this ongoing incident[4] on the Atlas backend. It seems like 
that incident might be the cause of the (current) unavailability of 
these results.


It would be great if anyone can confirm that (viewing results of) 
user-defined measurements are affected as well.


Aside, I suppose the following messages on "My RIPE Atlas Dashboard" are 
also caused by the mentioned incident[4]?


--->8---
2023-09-18 16:40 UTC 	Your probe #11nnn was automatically untagged as 
"system: Resolves  Correctly"
2023-09-18 16:40 UTC 	Your probe #11nnn was automatically untagged as 
"system: Resolves A Correctly"
2023-09-18 16:20 UTC 	Your probe #11nnn was automatically untagged as 
"system: IPv6 Works"
2023-09-18 16:20 UTC 	Your probe #11nnn was automatically untagged as 
"system: IPv4 Works"

---8<---

Perhaps the incident page[4] can include a few lines of the current 
(estimated) impact, as well.


Best regards,
Gerdriaan Mulder

[1] https://atlas.ripe.net/frames/measurements/60161706/ (default IPv6 
traceroute)
[2] https://atlas.ripe.net/frames/measurements/60162230/ (IPv6 
traceroute, paris=0)

[3] https://atlas.ripe.net/frames/measurements/60162753/ (default IPv6 ping)
[4] https://status.ripe.net/incidents/wfd7qywz3v68



[atlas] Changes to RIPE Atlas API keys

2023-09-19 Thread Robert Kisteleki



Dear RIPE Atlas users,

We'd like to update you on some upcoming changes regarding API keys in 
RIPE Atlas.


TL;DR nothing changes regarding how you can use your API keys in the 
short term - as long as you're actually using them. However, we'll 
change how unused or forgotten keys are handled as well as remove the 
less secure in-URL use of them.



At the moment RIPE Atlas users can query their existing API keys via the 
UI and API, including the possibility to retrieve old keys. In order to 
improve the security of how we handle these, we'll introduce the 
following changes in October 2023:


* The listing (retrieval) of keys will only reveal parts of the keys 
(enough to identify them) in the API as well as in the UI.


* We'll add the ability to "regenerate" an API key, which will replace 
the secret UUID of the key while keeping exactly the same permissions.


* Unused API keys will automatically be frozen after 1 year of not being 
used. Active keys (i.e. the ones that have been used at least once) will 
not be frozen.


You still have the ability to save your keys until these changes are 
done and, as written above, you will be able to regenerate them later. 
We'll notify this list when the changes are about to be done.



In addition, in order to further increase the security of our system, in 
the long run we'll make changes about how these API keys are 
communicated to the API:


* At the moment the API accepts these either in HTTP headers 
("Authorization" header) or in the URL (?key=xyz), although the 
Authorization header version has been documented as the preferred 
version for some time.


* We'll deprecate and remove the ability to use the URL form in about a 
year (around October 2024).


* We plan to send further reminders about this change over time, as well 
as reaching out to heavy users of the to-be-removed format.
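As an illustration of the header form mentioned above, the sketch below builds (but does not send) a request that carries the key in the "Authorization" header instead of the URL. The `Key <uuid>` value format reflects my understanding of the documented convention and should be verified against the API documentation. The all-zero key and the endpoint path are placeholders.

```python
from urllib.request import Request

API_BASE = "https://atlas.ripe.net/api/v2/"


def build_request(path, api_key):
    """Return a Request with the API key in the Authorization header.

    The key never appears in the URL, so it does not end up in
    proxy logs or browser history.
    """
    req = Request(API_BASE + path)
    req.add_header("Authorization", f"Key {api_key}")
    return req


# Placeholder key and illustrative path; replace both before real use.
req = build_request("measurements/my/", "00000000-0000-0000-0000-000000000000")
print(req.full_url)
print(req.get_header("Authorization"))
```

Code that currently appends `?key=...` to the URL could switch to a helper like this well before the URL form is removed.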


Regards,
Robert Kisteleki
RIPE Atlas team






Re: [atlas] Changes to RIPE Atlas API auth status codes on 2 Oct

2023-09-19 Thread Chris Amin

On 19/09/2023 11:11, Gert Doering wrote:


>> This temporary measure is guaranteed to work for the rest of 2022,
>
> So which iteration of 2022 would that be?


Thanks Gert. This was of course deliberate, to see whether people 
were paying attention.


The temporary measure is *also* guaranteed to work for the rest of 2023, 
and the X-Compat header may contain either "auth-2022" or "auth-2023" to 
maintain the old behaviour until the end of this year.
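A minimal client-side sketch of opting in to the old behaviour, assuming the header name and the two values are exactly as given above. The helper name `compat_headers` is hypothetical, and it only prepares a header dict that any HTTP client could send.

```python
ALLOWED_COMPAT = ("auth-2022", "auth-2023")


def compat_headers(base_headers, token="auth-2023"):
    """Return a copy of base_headers with the X-Compat opt-in added."""
    if token not in ALLOWED_COMPAT:
        raise ValueError(f"unexpected X-Compat value: {token!r}")
    headers = dict(base_headers)  # copy, so the caller's dict is untouched
    headers["X-Compat"] = token
    return headers


headers = compat_headers({"Accept": "application/json"})
print(headers)
```

Since the measure is temporary, it is probably wise to keep the opt-in in one place like this so it can be removed easily.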



Chris



Re: [atlas] Changes to RIPE Atlas API auth status codes on 2 Oct

2023-09-19 Thread Gert Doering
Hi,

On Tue, Sep 19, 2023 at 10:38:29AM +0200, Chris Amin wrote:
> This temporary measure is guaranteed to work for the rest of 2022, 

So which iteration of 2022 would that be?

Gert Doering
-- NetMaster
-- 
have you enabled IPv6 on something today...?

SpaceNet AG  Vorstand: Sebastian v. Bomhard, Michael Emmer
Joseph-Dollinger-Bogen 14  Aufsichtsratsvors.: A. Grundner-Culemann
D-80807 Muenchen HRB: 136055 (AG Muenchen)
Tel: +49 (0)89/32356-444 USt-IdNr.: DE813185279




Re: [atlas] Changes to RIPE Atlas API auth status codes on 2 Oct

2023-09-19 Thread Chris Amin

On 19/09/2023 10:38, Chris Amin wrote:

> As a temporary migration measure, the API will keep the same behaviour 
> (always return 403) if either:
>
> * The "Referer" header contains "RIPE Atlas Tools" and a version string 
> <= 3.1.1, or

Apologies, this should refer to the "User-Agent" header and not the 
"Referer" header.

> * An "X-Compat" header is set and contains the string "auth-2022"
>
> This temporary measure is guaranteed to work for the rest of 2022, after 
> which it will be removed and the API will always make the 401/403 
> distinction.




[atlas] Changes to RIPE Atlas API auth status codes on 2 Oct

2023-09-19 Thread Chris Amin

Dear colleagues,

Currently the RIPE Atlas REST API (https://atlas.ripe.net/api/v2/) 
returns a 403 Forbidden status code in two cases:


* When a request requires authentication but the user has not provided 
any credentials, or has provided incorrect credentials
* When a user has authenticated correctly, but they or their API key 
lacks the permissions needed for a particular request


Distinguishing between these two cases is important because in the first 
case the client can potentially get access by authenticating, and in the 
second case there is no point in retrying authentication with the same 
credentials.


In order to enable this distinction, and to generally conform to web 
standards and best practices, on Monday, 2nd October we will change the 
REST API so that a completely unauthenticated request will receive a 
response with a 401 Unauthorized status code. The 403 Forbidden status 
code will still be returned for users and API keys that are 
authenticated but lack the necessary permissions for the request.
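To illustrate why the distinction matters to clients, here is a small sketch of how a client might branch on the two status codes after the change. The function name and the returned labels are hypothetical and not part of any RIPE Atlas tooling.

```python
def next_step(status):
    """Map an HTTP status code to what the client should do next."""
    if status == 401:
        # No (or wrong) credentials: authenticating may grant access.
        return "authenticate"
    if status == 403:
        # Authenticated but not permitted: retrying with the same
        # credentials is pointless.
        return "check-permissions"
    if 200 <= status < 300:
        return "ok"
    return "other"


for code in (200, 401, 403, 500):
    print(code, next_step(code))
```

Before the change both failure cases arrive as 403, so a client cannot tell them apart from the status code alone.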


As a temporary migration measure, the API will keep the same behaviour 
(always return 403) if either:


* The "Referer" header contains "RIPE Atlas Tools" and a version string 
<= 3.1.1, or

* An "X-Compat" header is set and contains the string "auth-2022"

This temporary measure is guaranteed to work for the rest of 2022, after 
which it will be removed and the API will always make the 401/403 
distinction.


Kind regards,
Chris Amin
RIPE Atlas team



Re: [atlas] Issues with Atlas back-end

2023-09-19 Thread Robert Kisteleki

Hi,

The issues with RIPEstat have been resolved in about an hour, the 
problem with the Atlas data backend is unfortunately still ongoing.


Regards,
Robert


On 2023-09-19 00:03, Ernst J. Oud wrote:


Hi,

Since September 18th there has been an issue with the Atlas back-end, already 
9 hours of delay. Built-ins show data until around 14:00 UTC. It is now 
midnight.


I am confused, the status page (https://status.ripe.net/) at the bottom says:


  “Resolved - The backend team could restore the service level and with 
that issues in RIPEstat stopped as well.

Sep 18, 15:11 UTC”

It clearly is not resolved. On the top of that status page it says:

“Investigating - We are experiencing issues with the RIPE Atlas backend 
and currently are investigating this issue.

Sep 18, 2023 - 14:07 UTC”

So what is the status?

Regards,

Ernst J. Oud


