*Short Answer:* We should have. 

*Longer(ish) Answer:* That will be a Postmortem Action Item for my team. 
The dashboard doesn't allow me to create retroactive impact for a service 
marked 'unaffected' (because of log integrity rules), so I can't mark GKE 
as 'orange' for that time period. (Allowing exceptions for cases like these 
will also be an Action Item.) In the heat of responding, we just plain missed a 
couple of playbook steps when reporting to the dashboard. That's the 
unvarnished truth of it. 

I hope you'll accept my apologies for it and know that we'll make it harder 
to make that same mistake again.

-dave




On Wednesday, June 29, 2016 at 9:43:30 AM UTC-7, Kevin Griffin wrote:
>
> Hi,
>
> I am trying to find the best place to get support for an issue we have had 
> with two of our Google Container Engine clusters (both running Node version 
> 1.1.6) which last night failed at 10:56 pm (EST) / 11:07 pm (EST) and 
> failed to restart cleanly.
>
> The only issue Google reported last night was 'investigating latency 
> on ssd', which seems unrelated. We'd obviously like to get to the root cause 
> of why both clusters went down. The logs are of no real use: they show the 
> Health Checks, then they just stop. The next logs are from us restarting the 
> cluster and upgrading to Node Version 1.2.4.
>
> Is there another source of logs that we are perhaps not looking at?
>
> Is this the only source of 'bronze' support?
>
> Any help/clarification is appreciated.
>
> (Apologies if you're re-reading this; I originally posted it in 
> gc-discussion.)
>
> Regards,
>
