[google-appengine] Re: Error handling during downtime

dir Ls Wed, 15 May 2019 23:39:55 -0700

Thank you, Tiago, for the response.

> if a write fails, you can catch the Datastore exception and enqueue a 
> task to retry later, etc.


What I would like to know is which errors signal that I should retry 
later. My app is written in Go, and the Go datastore client exports only a 
few errors, none of which relate to infrastructure-level read/write 
failures. They all seem to be related to application logic.

https://godoc.org/cloud.google.com/go/datastore#pkg-variables
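
To make the question concrete, here is a minimal sketch of the retry pattern 
from the quote above, using only the standard library. The sentinel errors 
(errUnavailable, errBadEntity) and the helpers (isTransient, withRetry) are 
hypothetical names of my own, not part of the datastore package. With the 
real client, I assume the classifier would instead inspect the returned 
error, for example via its gRPC status code (status.Code(err) from 
google.golang.org/grpc/status), but which codes actually come back during 
downtime is exactly what I am unsure about.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical stand-ins for the two broad error classes. They are NOT
// exported by the datastore client; they only simulate its behavior here.
var (
	errUnavailable = errors.New("datastore temporarily unavailable") // infrastructure level
	errBadEntity   = errors.New("invalid entity")                    // app-logic level
)

// isTransient reports whether err looks like an infrastructure-level
// failure that may succeed if the call is repeated later.
func isTransient(err error) bool {
	return errors.Is(err, errUnavailable)
}

// withRetry calls op up to maxAttempts times, doubling the wait after each
// retryable failure. It returns nil on success, the original error at once
// if it is not retryable, or the last error (wrapped) when attempts run out.
func withRetry(maxAttempts int, base time.Duration, retryable func(error) bool, op func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	wait := base
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = op(); err == nil {
			return nil
		}
		if !retryable(err) {
			return err
		}
		if attempt < maxAttempts {
			time.Sleep(wait)
			wait *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	err := withRetry(4, 50*time.Millisecond, isTransient, func() error {
		calls++
		if calls < 3 {
			return errUnavailable // simulated outage for the first two calls
		}
		return nil
	})
	fmt.Println(calls, err) // prints "3 <nil>"
}
```

When withRetry gives up, the caller could enqueue a task to retry later or 
show a "temporarily unavailable" message, while a non-retryable error is 
returned immediately so that app-logic bugs are not masked by retries.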

On Wednesday, May 15, 2019 at 8:10:35 PM UTC-7, Tiago (Google Cloud 
Platform Support) wrote:
>
> Hello,
>
> The Cloud Datastore SLA <https://cloud.google.com/datastore/sla> 
> doesn't specify answers to many of the questions posed here on purpose: 
> it's extremely hard to predict if downtime will happen all at once or 
> intermittently, as those events are most often unplanned by their own 
> nature. Indeed, a quick glance at previous incidents 
> <https://status.cloud.google.com/summary#cloud-datastore> reveals that 
> both have occurred in the past year. When designing your application, 
> it's probably better to abstract such unknowns and implement general 
> fail-safe mechanisms - for instance, if a write fails, you can catch the 
> Datastore exception and enqueue a task to retry later, etc.
>
> That being said, given the small downtime budget allocated to Cloud 
> Datastore (and considering its generally reliable track record), issues 
> are more commonly caused by an implementation that does not follow the 
> general best practices 
> <https://cloud.google.com/datastore/docs/best-practices> or by sub-optimal 
> design <https://cloud.google.com/appengine/articles/scalability>. You will 
> gain more in overall reliability by giving those topics proper attention 
> during development.
>
> On Friday, April 26, 2019 at 12:21:50 PM UTC-4, dir Ls wrote:
>>
>> Cloud Datastore has a 99.95% monthly uptime SLA for multi-region, which 
>> translates to slightly more than 20 minutes of downtime per month. Is this 
>> downtime likely to happen all at once or intermittently? What kind of 
>> errors should be expected during the downtime? I am trying to work out 
>> how the app should respond to end users during the downtime. Could it 
>> work for data related to some users but not others at a given time? I am 
>> looking for best-practice guidance for an app that is expected to be 
>> usable 24/7, with graceful degradation based on the underlying services. 
>> For example, if the downtime is intermittent, users might just reload the 
>> page and never know that something went wrong. But if the downtime is 
>> prolonged, explicitly displaying that the system is currently 
>> inaccessible and asking them to come back after some time might be better.
>>
>
