Sasha, you are absolutely right.

Based on the original post I assumed that the number of Authority
records belonging to a given twitterUser is 0 or 1, because the code
uses the get method, which returns only the first result (according
to the docs), and not the fetch method, which can return multiple
results.
If the code snippet is wrong and large amounts of data are actually
being fetched (and this is a real requirement), it becomes a whole
different ballgame.
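
For the record, the difference looks like this (a rough sketch; the
Authority model and its twitterUser property are guessed from the
original post):

    from google.appengine.ext import db

    class Authority(db.Model):
        twitterUser = db.StringProperty()

    q = Authority.all().filter('twitterUser =', 'someuser')
    one = q.get()        # only the first matching entity, or None
    many = q.fetch(100)  # up to 100 matching entities, as a list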

Linh Truong, can you clarify what exactly you are trying to do, how
much data/entities you have in the datastore, and how many entities
you expect to retrieve with the query?

It is important to realize that the datastore is not a regular
(relational) database. A large amount of data in the datastore is
not/should not be a problem, because GAE doesn't allow you to perform
a full table scan, only index-based retrieval.
Returning a large amount of data in a single operation is a problem,
though, and should be tackled with mechanisms such as paging or
sharding (MapReduce).
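
For example, a keys-only query touches just the index, which is about
the cheapest way to see how many entities actually match (sketch,
same assumed model as above):

    from google.appengine.ext import db

    keys = db.Query(Authority, keys_only=True) \
             .filter('twitterUser =', 'someuser') \
             .fetch(1000)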

On 29 apr, 07:05, Sasha <sasha.h...@gmail.com> wrote:
> DeadlineExceeded usually just means that you tried to do too much work
> in one request, and you need to choose a more efficient design.
> Usually this also means the user is waiting many seconds for a
> response, which will frustrate them, so it's good to avoid it. You
> need to push work out of the request into tasks, split up the work
> across multiple requests, or simply do less work in each request.
> Almost certainly, you don't want a huge SELECT statement.
>
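
A sketch of the first option, pushing the work into a task; the
handler URL '/tasks/process' here is made up:

    from google.appengine.api import taskqueue

    # the task runs in its own request, with its own deadline,
    # while the user-facing request can return immediately
    taskqueue.add(url='/tasks/process',
                  params={'twitterUser': 'someuser'})
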
> How many Authority records should a given twitterUser normally have?
> Do you really need to retrieve all of them at once? If it isn't really
> so many you are fetching, then it may just be that you are querying
> inefficiently, and this is why you should profile (for a start, see
> appstats: http://code.google.com/appengine/docs/python/tools/appstats.html).
> If the bottleneck is in how you are querying, you can make it fast in
> some simple ways, like reducing the number of round trips to the
> datastore, getting things by key name or key id instead of querying,
> storing frequently-read information in memcache, using __key__ queries
> where possible... not to mention having your application do some kinds
> of work incrementally and in advance rather than all at once. If you
> do many fetches instead of just one at a time, it will be slower.
>
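
Two of those tricks sketched together (this assumes the Authority
entities were stored with the twitter user name as their key_name;
the cache key and timeout are arbitrary):

    from google.appengine.api import memcache
    from google.appengine.ext import db

    def get_authority(user):
        entity = memcache.get('authority:' + user)    # fast path
        if entity is None:
            # direct lookup by key name, no query at all
            entity = Authority.get_by_key_name(user)
            if entity is not None:
                memcache.set('authority:' + user, entity, time=300)
        return entity

    # and batching: one db.get() with a list of keys is one round trip
    entities = db.get(keys)  # keys: a list of db.Key objects
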
> But... if the real cause of the problem is that the query returns too
> many records, and it won't work to use key names or __key__ queries
> to fetch records, then the solution strongly depends on WHY you are
> trying to get so many records at once. Probably, you won't really
> want to get all those records at once.
>
> If you want to show the results to the user, it's customary to show
> only a few at a time, in 'pages.' You should consider this
> carefully.
> A pretty simple and efficient way is to use query cursors, for this
> see http://code.google.com/appengine/docs/python/datastore/queries.html#Q...
> If the user can't stand to click 'Next,' then you could always write
> some Javascript to request each chunk after the last one completed, or
> retrieve it as the user scrolls down, so the user sees an 'infinite
> page' and the server can still be happy because each request does
> not take so long. If you don't like query cursors, you can use
> other tricks
> like inequality filters (rank >= 10 and rank < 20 limits you to only
> data from ranks 10-19) but this adds to the complexity.
>
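
Cursors in the Python db API look roughly like this; 'start_cursor'
would be echoed back by the client with each 'Next' click:

    q = Authority.all().filter('twitterUser =', 'someuser')
    if start_cursor:
        q.with_cursor(start_cursor)  # resume where the last page ended
    page = q.fetch(20)               # one small page per request
    next_cursor = q.cursor()         # send back for the next page
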
> If you are NOT displaying data to users on a web page - if you need to
> do a mass export of some data to another computer, or to carry out an
> operation on every user, or generate a big report based on lots of
> data, then you were on the right track to look at the bulk loader
> (http://code.google.com/appengine/docs/python/tools/uploadingdata.html)
> and mapreduce. These tools really just split up
> big jobs into multiple requests themselves in a way that only makes it
> look simple. Regardless of the approach you choose, you will have to
> somehow ensure that each request takes only a short time to carry
> out.
>
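
With the mapreduce library the mapper is just a function called once
per entity, each call inside its own short request (sketch; the
'processed' flag is invented):

    from mapreduce import operation as op

    def process(entity):
        entity.processed = True   # whatever per-entity work you need
        yield op.db.Put(entity)   # the framework batches the writes
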
> If you try to get all matching keys, that will take you a little
> farther but you will still hit a limit of how many you can grab. It's
> better to find a way to only ask the datastore for fewer records at
> each request. If something like mapreduce or pipeline or the bulk
> loader will work well for you, then by all means use those. It
> strongly depends on whether and why you really need to fetch a large
> number of records at once.
>
> If you really need a LOT of work to be done by the task queue, and
> cannot avoid it, you can always tell the user that it will take a
> while. Then when the job is done, you can refresh the page (in
> javascript) or send a notification. That is not unreasonable at all
> for tasks which are very computationally expensive.
