On 12 Mar 2010, at 16:28, Jeff Schnitzer wrote:
Look at these graphs:
http://code.google.com/status/appengine/detail/datastore/2010/03/12#ae-trust-detail-datastore-get-latency
http://code.google.com/status/appengine/detail/datastore/2010/03/12#ae-trust-detail-datastore-query-latency
Notice that a get()'s average latency is 50ms and a query()'s average
latency is 500ms. Last week the typical query was averaging
800-1000ms with frequent spikes into 1200ms or so.
"You are increasing my suspicion that you have never worked" with an
application that queries large amounts of data. If your queries are
taking anywhere near 1000 ms then you must be doing something
seriously wrong.
Query times for one of my apps are generally in the 200 ms range over
2 million records. A keys-only query can return in 50 ms.
This is the time required to execute 9 parallel queries on geospatial
data and OR merge them together. Keep in mind that with Twig I could
execute 90 parallel queries and expect the time to be about the same.
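
To illustrate the idea of running several queries in parallel and OR
merging them, here is a minimal sketch in plain JDK Java. It is not
Twig's implementation and does not use the datastore API: Place,
ParallelOrQuery and runOr are hypothetical names, each sub-query is
just a Callable returning a list, and App Engine's own runtime
restrictions are ignored. It only shows why the total wait tracks the
slowest sub-query rather than the number of sub-queries.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a stored entity; not a Twig or datastore type.
final class Place {
    final long id;
    final String name;

    Place(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

final class ParallelOrQuery {
    // Runs every sub-query concurrently, then unions the results by id so
    // an entity matched by several sub-queries appears only once.
    static List<Place> runOr(List<Callable<List<Place>>> subQueries) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, subQueries.size()));
        try {
            // invokeAll blocks until all sub-queries have finished, so the
            // wait is roughly that of the slowest one, not the sum of all.
            List<Future<List<Place>>> futures = pool.invokeAll(subQueries);
            Map<Long, Place> merged = new LinkedHashMap<>();
            for (Future<List<Place>> future : futures) {
                for (Place place : future.get()) {
                    merged.putIfAbsent(place.id, place); // first match wins
                }
            }
            return new ArrayList<>(merged.values());
        } finally {
            pool.shutdown();
        }
    }
}
```

This version collects every result before returning, which is fine for
small result sets but is not the streaming behaviour described for
Twig.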
Deep down in the fiber of its being, BigTable is a key-value store.
It is very very efficient at doing batch gets. It wants to do batch
gets all day long. Queries require touching indexes maintained in
alternative tablets and comparatively, the performance sucks.
You are ignoring the fact that for many (most?) applications queries
are essential. I completely understand that your Facebook app doesn't
depend on them, but assuming that other people's apps also do not is
just not helpful.
Why am I obsessed with batch gets? Because they're essential for
making an application perform. They're why there is such a thing as a
NoSQL movement in the first place.
Again, essential for your app. Not for mine, and probably not for many
other apps in which querying their own data is more important. Batch
gets are really only useful in apps that need to take a load of ids
from an external source and do something with them. Social network
"extension" apps, for example.
Just to reiterate - batch get of external ids is a trivial feature
that has always been planned to be a part of the new "load command"
that will follow the pattern of the find and store commands.
* Fire off a batch job at your leisure to finish it off.
This "partial update" approach only works in cases where you are not
adding a field that you will query on. That needs to be an
all-or-nothing batch job.
What is with your obsession with batch gets? I understand they are
central in Objectify because you are always loading keys. As I said
already - even though this is not as essential in Twig it will be
added to a new load command.
Batch gets are *the* core feature of NoSQL databases, including the
GAE datastore.
Querying is important. You are ignoring a whole class of applications
if you think that querying is not important. I understand that your
application works with Facebook and does a lot of "lookups" by
external ids in a large dataset, so to your mind batch get is the most
important operation. This is really not such a common scenario as you
social network developers might think.
One of the applications I work on has about 2 million records on
which it needs to do geospatial queries, sorted and filtered. I
guarantee you that there are many other applications that have
different query needs, so to focus only on batch gets is myopic.
It probably explains why you don't think that OR queries are so
important. They were one of the first things I tried on App Engine
and one of the reasons Twig was written. I would bet that most
developers could not imagine working with an RDBMS that did not
support OR and AND queries (on more than one property). Twig's support
for these saves time and reduces the complexity of the developer's
app. With Objectify they are left on their own to re-invent the wheel
every time.
The high-level design of Twig's commands means that ORs are supported
now in the query API. Objectify's low-level design could only help
out by providing helper classes - hardly user friendly or intuitive.
The goal of Twig's design is to put these common solutions at the
developer's fingertips. Yes, there are more methods in the API, but
they are well organised using the fluent style commands.
The command pattern used by Twig has the potential to add new "high
level" functionality for which Objectify's low-level query interface
would need to rely on helper functions. For example, supporting AND queries
with more than one inequality filter is in development. Just like the
OR queries it will "stream" results, never keeping more than a small
number in memory.
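
As a rough illustration of what "streaming" an OR merge can look like,
here is a sketch in plain JDK Java. It is not Twig's code:
StreamingOrMerge is a hypothetical name, and it assumes every
sub-query already returns its results in ascending order with no null
values. It lazily merges the sorted sources and drops duplicates while
buffering at most one element per source.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Lazily merges several ascending iterators into one ascending stream,
// dropping duplicates. Buffers at most one element per source.
final class StreamingOrMerge<T extends Comparable<T>> implements Iterator<T> {

    // A source together with the element most recently pulled from it.
    private static final class Head<E extends Comparable<E>> implements Comparable<Head<E>> {
        final E value;
        final Iterator<E> source;

        Head(E value, Iterator<E> source) {
            this.value = value;
            this.source = source;
        }

        @Override
        public int compareTo(Head<E> other) {
            return value.compareTo(other.value);
        }
    }

    private final PriorityQueue<Head<T>> heads = new PriorityQueue<>();
    private T last; // last value handed out, used to skip duplicates

    StreamingOrMerge(List<Iterator<T>> sources) {
        for (Iterator<T> source : sources) {
            if (source.hasNext()) {
                heads.add(new Head<>(source.next(), source));
            }
        }
    }

    @Override
    public boolean hasNext() {
        // Discard any head equal to the value that was already returned.
        while (!heads.isEmpty() && last != null
                && heads.peek().value.compareTo(last) == 0) {
            advance(heads.poll());
        }
        return !heads.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        Head<T> smallest = heads.poll();
        last = smallest.value;
        advance(smallest);
        return last;
    }

    // Pulls the next element (if any) from the source behind this head.
    private void advance(Head<T> head) {
        if (head.source.hasNext()) {
            heads.add(new Head<>(head.source.next(), head.source));
        }
    }
}
```

Merging the sorted sources 1, 3, 5 and 2, 3, 6 this way yields
1, 2, 3, 5, 6, and memory use grows with the number of sources rather
than with the number of results.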
These are the types of common problems that take a lot of time to
code. I'm not saying that this is impossible to code with Objectify -
just that it is up to the developer to code these patterns again and
again. Re-inventing the wheel is one of the biggest wasters of a
developer's time. That and long discussions on mailing lists :)
I do appreciate Objectify's simplicity - but it is built on a system
that is already too simple to be very usable for a lot of apps. In my
mind the goal of a good GAE framework should be to make the difficult
easy - not just to make the easy typesafe and pretty. I think that
really sums up the different goals of Twig and Objectify.
John