Thanks a lot for the support, j and HG. It's great to hear I'm not the only one.
Jason, it makes sense to let some queues retry indefinitely. I initially had only 1 retry, as I thought those retries were for user failures. Then I moved to my current config, as it seemed extremely unlikely that 6 retries with backoffs could be exhausted; however, it seems they can be. Similarly, the issue might also occur with indefinite retries. It's still not 100% certain that internal retry failures are the cause, although several engineers claimed they were. (I generally try to avoid indefinite retries, as they might produce serious side effects.)

The conversation in which I learned about the retry behaviour was with hanssens[t]google.com, titled "TaskQueue Followup", on 3/22/14. (He was very helpful, thanks.)

You can't see these issues in the logs, yet you can see them as 5xx spikes in the error logs (most of the time).

(By the way, this specific instance of the issue cost me almost $200. I'm constantly running and inspecting routines that trigger many entities, which costs a lot. I initially decided to use basic_scaling to limit instances and prevent instance/cost spikes, yet basic_scaling somehow made the issue worse.)
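On the point that 6 retries with backoffs can still be exhausted, here is a rough back-of-the-envelope sketch. It assumes a plain doubling backoff capped at a maximum, which is a simplification and not necessarily how the task queue schedules retries. The function names and the backoff values are hypothetical, chosen only for illustration, and are not the actual queue settings.

```python
def retry_delays(retry_limit, min_backoff=1.0, max_backoff=3600.0):
    """Return the wait (in seconds) before each retry.

    Assumption: the delay starts at min_backoff and doubles after every
    failed attempt, capped at max_backoff. This is a simplified model.
    """
    delays = []
    delay = min_backoff
    for _ in range(retry_limit):
        delays.append(min(delay, max_backoff))
        delay *= 2
    return delays


def retry_window(retry_limit, min_backoff=1.0, max_backoff=3600.0):
    """Total time covered by all retries; a failure lasting longer than
    this exhausts every retry under the simplified model above."""
    return sum(retry_delays(retry_limit, min_backoff, max_backoff))


if __name__ == "__main__":
    for limit in (1, 6):
        print(limit, "retries cover about", retry_window(limit), "seconds")
```

Under these assumed values, 6 retries span only about a minute, so any internal failure that lasts longer than that would use up all of them. Raising the minimum backoff or the retry limit widens the window without going all the way to indefinite retries.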