An update: I was able to ‘delete’ my stuck measurements via the API, so they’re stopped now and I’m back up and running for the moment.
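
For anyone hitting the same thing, here is a minimal sketch of what such a ‘delete’ call could look like. It assumes the v2 REST endpoint at atlas.ripe.net accepts an HTTP DELETE on a measurement's URL to stop it, and that the API key can be sent in an `Authorization: Key ...` header. The helper names `build_stop_request` and `stop_measurement` are made up for illustration, so please check the API reference before relying on it.

```python
import urllib.request

# Assumed base URL for the RIPE Atlas v2 measurements API.
ATLAS_API = "https://atlas.ripe.net/api/v2/measurements"


def build_stop_request(msm_id: int, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE request for one measurement."""
    url = f"{ATLAS_API}/{msm_id}/"
    return urllib.request.Request(
        url,
        method="DELETE",
        # Assumption: key-based auth via the Authorization header.
        headers={"Authorization": f"Key {api_key}"},
    )


def stop_measurement(msm_id: int, api_key: str, timeout: float = 30.0) -> int:
    """Send the DELETE request and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = build_stop_request(msm_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `stop_measurement(23722197, "YOUR_API_KEY")` would then send the request for one of the stuck measurement IDs, using a key that has permission to stop measurements.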
I also added an API command to my code to ‘delete’ measurements as soon as the results have been picked up, which I hoped would make this fix sustainable, but so far that doesn’t seem to be doing anything. Perhaps a longer delay is required between creating the measurement and sending the ‘delete’ command?

Thanks,
Steve

> On Dec 28, 2019, at 3:20 PM, Steve Gibbard <s...@gibbard.org> wrote:
>
> Hi Atlas folks,
>
> I hope you’re having a good holiday season. Sorry to interrupt it by complaining about issues.
>
> On Christmas Eve my time (early Christmas morning your time) there was an Atlas issue where any attempt at reading measurements failed with an HTTP 500 status error. That appears to have gotten fixed on Christmas (a really big thank you to whoever worked on that) but since then it appears that while most of the one-off measurements we’ve created have delivered results very quickly, none of the measurements created since 17:00 UTC on 2019-12-25 have stopped running. As shown in the Atlas portal:
>
>
> 23722197  Traceroute  www.globaltraceroute.com (AS13335)  Test Traceroute  1  one-off  2019-12-25 22:24  Never
> 23722089  Traceroute  archive.ubuntu.com (AS41231)  Test Traceroute  1  one-off  2019-12-25 19:16  Never
> 23722088  Traceroute  sps.prima.com.ar (AS10318)  Test Traceroute  1  one-off  2019-12-25 19:14  Never
> 23721915  Traceroute  www.globaltraceroute.com (AS13335)  Test Traceroute  1  one-off  2019-12-25 17:00  Never
>
> And on for every measurement between then and now.
>
> Previously, the typical one-off measurement was listed with start and stop times less than 10 minutes apart.
>
> When a user has 100 measurements running concurrently, creation of new measurements fails, which is happening for me now.
>
> If somebody could take a look at this, I’d really appreciate it.
>
> Thanks,
> Steve