Hi Eric,

This kind of performance is certainly not normal.
A simple query like this typically takes less than 50 ms,
so either there is a problem in your code or you have hit a bug in the
Java SDK.
I suggest you post a more complete code sample in the App Engine Java
group.

And there is no need to be condescending here. You'll see that the
datastore is a great piece of engineering :)
Query performance is completely independent of the number of entities
you have stored.
It depends on the size of the result set, but results are fetched in
parallel, so the overhead is mainly the time needed to deserialize
entities. For the same reason, locality is not a factor.
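
To rule out the measurement itself, here is a minimal sketch of a timing helper. QueryTimer and timeMillis are hypothetical names of mine, and Thread.sleep only stands in for the real query call. It reads the clock again after the call with System.nanoTime(), because a java.util.Calendar instance keeps the instant at which it was created and returns the same getTimeInMillis() value on every read.

```java
import java.util.concurrent.Callable;

public class QueryTimer {
    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Callable<?> task) throws Exception {
        long start = System.nanoTime();   // monotonic clock, read before the call
        task.call();
        return (System.nanoTime() - start) / 1_000_000L;   // read again after the call
    }

    public static void main(String[] args) throws Exception {
        // Placeholder workload; in the real code this would wrap the query execution.
        long elapsed = timeMillis(() -> {
            Thread.sleep(50);
            return null;
        });
        System.err.println("elapsed ms: " + elapsed);
    }
}
```

If the result list is loaded lazily, the timed task should also iterate over it, otherwise only part of the work is measured. Whether that applies here is an assumption I have not checked.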



On Dec 2, 6:12 am, Eric Rannaud <eric.rann...@gmail.com> wrote:
> On Tue, Dec 1, 2009 at 11:02 AM, Stephen <sdea...@gmail.com> wrote:
> > On Dec 1, 9:55 am, Eric Rannaud <eric.rann...@gmail.com> wrote:
> >>     // Read the clock again after the call; a Calendar keeps its creation time.
> >>     long t0 = System.currentTimeMillis();
> >>     qmsgr = (List<MessageS>) qmsg.execute(lo, hi);
> >>     System.err.println("getCMIdRange:qmsg: " + (System.currentTimeMillis() - t0));
>
> > Are you fetching all 128 entities in one batch? If you don't, the
> > result is fetched in batches of 20, incurring extra disk reads and rpc
> > overhead.
>
> > Not sure how you do that with the Java API, but with python you pass
> > '128' to the .fetch() method of a query object.
>
> As far as I can tell, there is no such equivalent in the Java API. The
> query.execute() statement returns a collection that is meant to
> contain all the results. I don't know how they implement the
> Collection object returned by query.execute(). Google may well manage
> that in batches internally, inside the object with interface
> List<MessageS>, but that would be nasty for performance.
>
> I should say that a query with 1 result takes about 30ms. 128*30 =
> 3840 ms. That's pretty close to what I'm seeing for 128, indicating a
> linear scaling in the number of entities. Which would be really bad,
> and unexpected.
>
> It's really hard to guess what's going on internally, without any
> visibility of the architecture.
>
> To see the impact of number of entities on response time, I did some
> systematic testing:
>
> Querying elements [0,10), [0,10), [0,10), [0,20), [0,20), [0,20),
> [0,30), [0,30), [0,30), ... [0, 260), [0, 260), [0, 260) by increments
> of 10, in quick succession, three times each, actually shows
> pretty good performance, the largest query with 260 entities
> returned taking 300ms. So there is some kind of caching happening,
> maybe. I didn't see that caching behavior earlier, but I wasn't doing
> queries in such quick succession.
>
> But if I hit randomly in the datastore, i.e., [X+0,X+10), [X+0,X+20),
> [X+0,X+30), ...  [X+0, X+260), where X is random and different for
> each request, 0 <= X < 500000, then pretty much all the queries take
> between 1s and 4s, and we're back to more or less linear scaling in
> the number of entities fetched. (With a query returning a single
> entity taking 3s every so often.)
>
> It does make some sense for random queries to take longer than a bunch
> of queries in the same area of the datastore (except that there are no
> guarantees that the locality in the datastore is related to the
> ordering with respect to the field 'id'). But with the near linear
> scaling in response time with the number of entities, say 30 ms per
> entity, of average size 463 B, that's an implied bandwidth in the
> backend of about 120 kbit/s (roughly 15 KB/s), which is not very good.
>
> A last point, the field 'id' and the PrimaryKey of the entity MessageS
> are effectively uncorrelated (with respect to their ordering). The
> PrimaryKey is a String containing an MD5 hash of the content, the 'id'
> is a long set incrementally.
>
> Has anybody looked (publicly) at datastore performance depending on
> query size, locality, etc? If not, I might try to gather some
> extensive data, and write it up.
>
> Thanks,
> Eric.
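
The implied-bandwidth estimate in the quoted message can be checked with a few lines of arithmetic. This is only a sketch: ImpliedBandwidth and bytesPerSecond are hypothetical names, and the inputs are the figures quoted above (463 B per entity, 30 ms per entity, 128 entities).

```java
public class ImpliedBandwidth {
    /** Bytes per second implied by fetching one entity of the given size in the given time. */
    static double bytesPerSecond(double entityBytes, double millisPerEntity) {
        return entityBytes / (millisPerEntity / 1000.0);
    }

    public static void main(String[] args) {
        double bps = bytesPerSecond(463, 30);   // figures from the quoted message
        System.out.println("KB/s:   " + bps / 1000.0);
        System.out.println("kbit/s: " + bps * 8 / 1000.0);
        // Total time if 128 entities were fetched strictly one after another.
        System.out.println("serial ms for 128 entities: " + 128 * 30);
    }
}
```

That works out to roughly 15.4 KB/s, or about 123 kbit/s, so the figure in the quoted message is in kilobits per second.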
