[google-appengine] Re: Is it impossible to guarantee a put()

stevep Sun, 13 Feb 2011 10:32:54 -0800

Hi Robert,

Thanks for your input.

Here's a summary, and wrap-up for me as I've decided on this path for
now.

First, I'll note that all the data coming in is compressed. No
opportunity there to squeeze below the current TQ limit.
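
A minimal sketch of the size check under discussion, for readers following
along. The 10K figure is the limit mentioned further down in this thread;
`TASK_PAYLOAD_LIMIT` and both function names are hypothetical, and zlib
stands in for whatever compression the client already applies. Data that
is already compressed will usually not shrink further, so the helper
falls back to the raw bytes.

```python
import zlib

# Assumption: the 10K task payload limit mentioned in this thread.
TASK_PAYLOAD_LIMIT = 10 * 1024


def smallest_encoding(payload: bytes) -> bytes:
    """Return the zlib-compressed payload only if it is actually smaller."""
    compressed = zlib.compress(payload, 9)
    return compressed if len(compressed) < len(payload) else payload


def fits_in_task(payload: bytes, limit: int = TASK_PAYLOAD_LIMIT) -> bool:
    """True if the payload, in its smallest form, is within the limit."""
    return len(smallest_encoding(payload)) <= limit
```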

My current strategy for architecture design should hopefully minimize
put() failures in the on-line handler, and provide good, solid, backup
(solid with respect to business not just technical risks).

1) Use a high replication handler.
2) Strip all but the most essential indexes from the main record which
will be put() in the on-line handler. Handle indexes in a separate,
related db class via the task queue.
3) Maintain an Amazon Web Services (AWS) server for handling put() failures at GAE.
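
The fallback in step 3 could look roughly like the sketch below. It is
not tied to any real GAE or AWS API: `PutFailed`, `save_with_fallback`,
and the two callables are hypothetical stand-ins for the datastore write
and the backup write, and the retry count is an arbitrary assumption.

```python
from typing import Callable, Dict


class PutFailed(Exception):
    """Hypothetical stand-in for a datastore write error (e.g. a timeout)."""


def save_with_fallback(
    record: Dict[str, object],
    primary_put: Callable[[Dict[str, object]], None],
    backup_put: Callable[[Dict[str, object]], None],
    attempts: int = 2,
) -> str:
    """Try the primary store a few times, then hand off to the backup.

    Returns "primary" or "backup" so the caller can tell the client which
    path was taken. If the backup also fails, its exception propagates.
    """
    for _ in range(attempts):
        try:
            primary_put(record)
            return "primary"
        except PutFailed:
            continue  # transient failure: retry within the same request
    backup_put(record)
    return "backup"


# Usage with in-memory stand-ins for the two stores.
def failing_put(record):
    raise PutFailed("simulated 503")


backup_log = []
outcome = save_with_fallback({"id": 2}, failing_put, backup_log.append)
```

Returning the path taken lets the handler decide whether to show the
customer a "saved" message or a "you will be notified" message.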

Hopefully Google's SLA-like inferences about high replication bear
fruit. Perhaps at some point we'll get past the beta label and get
something from them more specific. For now I'll trust the "Don't be evil"
slogan, and will also trust what appears to be a well-grounded and
talented group of engineers (based solely on my readings of their
forum posts and seeing GAE develop).

My conclusion about custom indexes is based on latencies for some
early development work that had some activity as a master/slave app.
Clearly (and this has been alluded to in these forums by Googlers)
custom indexes need careful consideration because of the put()
overhead. Fortunately for what we are doing we can separate the index
from the main file using the allocated-id to link back. Pushing the
indexed update into the task queue is probably the most important
change as we provide a good number of user slice/dice filters to
access the site content. If Google could do one case study, I'd vote
for this area so that we could have a better understanding of how
custom indexes affect put() performance.
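
The split described above can be modelled in plain Python as follows.
This is only an illustration of the data flow, not datastore code: the
two dicts, the deque, and every function name are assumptions standing in
for the main db class, the related index class, and the task queue, and
the counter plays the role of the allocated id.

```python
import itertools
from collections import deque

# In-memory stand-ins (assumptions): two "kinds" and a task queue.
main_records = {}      # id -> payload, written in the on-line handler
index_records = {}     # id -> filterable fields, written later by a task
index_tasks = deque()  # pending index updates
_next_id = itertools.count(1)


def online_put(payload, filter_fields):
    """Write the lean main record now; defer the indexed record."""
    record_id = next(_next_id)  # plays the role of the allocated id
    main_records[record_id] = payload
    index_tasks.append((record_id, filter_fields))
    return record_id


def run_index_tasks():
    """Worker: drain the queue and write index records under the same id."""
    while index_tasks:
        record_id, filter_fields = index_tasks.popleft()
        # Keyed by the same id, so rerunning a task just overwrites.
        index_records[record_id] = filter_fields


def find_ids(**criteria):
    """Filter on the index records; the ids link back to main_records."""
    return [
        rid
        for rid, fields in index_records.items()
        if all(fields.get(k) == v for k, v in criteria.items())
    ]
```

Until the worker runs, a new record exists but does not show up in
filtered views, which is the trade-off this approach accepts in exchange
for a cheaper on-line write.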

Of all your work-arounds, my belief is that running an AWS server for
saving failed writes is going to be the best. There are important
business-risk considerations at play here, not just technical. I won't
go into these in detail, but suffice it to say that I'll be happier
when Google removes the beta label – and I say this not just out of
selfish interests, but because I believe that GAE is really
disruptive, leverages a very key Google core company competency, and
deserves to gain strategic commitment from Larry (or whoever) as a
full enterprise-class product. I think shareholders, developers, and
the GAE team all deserve this.

Summary: Overall I am not unhappy having to rework how my put()s are
done. I think the new approach will be much better suited to an app
that is optimized not just for lower failure rates, but for optimized
operation within the GAE infrastructure (but I'm clearly guessing
about the latter point).

Having said that, I am still hopeful that Google's talented GAE group
will consider providing a task queue that can accept the same data
payload as the on-line handler. It's obvious why they will not do this
for a TQ that can accept large numbers of tasks. Likely it would be
simple for them to provide a limit on this queue related to the number
of available tasks so that the memory/storage requirements for this TQ
are not abused by us developers who are always looking for that easy
workaround. TQs are great IMHO, and I'd like to utilize them as much
as possible.

We’re going live soon, so I think this will be my last long
architecture thread. Thanks to all who have helped, most notably
Robert whose great inputs for nearly every forum thread (not just
mine) have been tremendously helpful. (Note to Google: if you’re
helping Robert financially, you’re getting a great return. If not, do
so.)

cheers,
stevep
(who had a good night's sleep, hence the verbosity of this reply)

On Feb 12, 9:11 pm, Robert Kluin <robert.kl...@gmail.com> wrote:
> Hey Steve,
>   First, "[t]hink of your client's put() as your customer's head, and
> the 503 as the brick wall" is great; it is a humorous and accurate
> analogy.  I can relate.  And, I also deal with low-write-rate but
> extremely high-value writes in some apps.  It is tricky and makes
> small failures / blips a much more significant issue.
>
>   In any client-server app, there will occasionally be communication /
> server-side issues.  Have you thought about storing data locally on
> submit (or only if there is a server-side error) then implementing
> some type of start-up recovery logic that will retry the save?  Could
> be particularly helpful if you have some way of making the write
> idempotent (i.e., using a key_name). I know it is not perfect but it
> could help out a bit.  There are some little tools that might help
> with this, like lawnchair (http://westcoastlogic.com/lawnchair/).  If
> you couple that with saving via an AJAX call, even in an error
> situation the user experience would be at least a little better.
>
>   Another question, how far over the 10K limit are you?  Have you
> tried using bzip to compress your data?  Perhaps if you compress the
> payload it would be possible to enqueue it in a task?
>
> Robert
>
>
>
> On Sat, Feb 12, 2011 at 10:40, stevep <prosse...@gmail.com> wrote:
> > Thanks Calvin and Robert (as always).
>
> > Calvin: I initially did my puts() in the handler. Easy and clean as
> > you suggest until your put that normally takes 200ms is hit by a
> > series of 503s because of infrastructure issues Google controls (not
> > you).
>
> > Think of your client's put() as your customer's head, and the 503 as
> > the brick wall. You can beat the head against the wall handling all
> > the various scenarios via dialogs. That's an option yes, but my
> > customers are not technically inclined, and will be unhappy. Also, you
> > **cannot** assume this write will happen. Very, very likely the
> > frustrated customer closes her browser window. Then you're SOL.
>
> > My option using the task queue was to simply send the put there, and
> > give the client the means to verify its occurrence. If the
> > infrastructure is thrashing away with 503s, then the client can easily
> > determine that the customer needs a nice dialog saying she will be
> > apprised via email when the record is written.
>
> > I'm happy: record is not lost. Customer is happy: she has an
> > assurance, and is free to get on with her life.
>
> > Conclusion: it is becoming apparent to me that GAE cannot provide the
> > means for a developer to ensure a record is written. To minimize risk,
> > I am now setting up a high replication handler writing to db class
> > that has the least possible number of indices. AND, I get to set up
> > all those client dialogs.
>
> > So, so much easier if there would be one task queue where I could send
> > the transaction data from my client's post, and take advantage of all
> > the benefits of the TQ.
>
> > Sorry for the rant-like tone -- too little sleep last few days.
> > stevep
>
> > Robert:
> >>   Instead you could catch timeout, contention, or similar exceptions and
> >> retry the put.  You might also want to set a limit on the RPC timeout
> >> so that you'll have time to retry the write at least once within the
> >> 30 second request limit.   If you still get an error after some number
> >> of retries, notify the client.
>
> > Calvin:
> >> What I'm not understanding from the original question is why can't you do
> >> the put in the request handler?  That way if there is a failure (and there
> >> will always be occasional failures) you can catch the exception and return
> >> an error to the client.
>
> > --
> > You received this message because you are subscribed to the Google Groups 
> > "Google App Engine" group.
> > To post to this group, send email to google-appengine@googlegroups.com.
> > To unsubscribe from this group, send email to 
> > google-appengine+unsubscr...@googlegroups.com.
> > For more options, visit this group at
> > http://groups.google.com/group/google-appengine?hl=en.

