> having said that, i'm still a little skeptical that you truly need to
> optimize this. relative to your overall traffic pattern, i'm guessing
> this operation is fairly rare, so i wonder if avoiding a single extra
> put() is really that important. still, let us know what you think of
> the "get next key" operation.

It's not the cost of the extra put; it's that the extra put can't be
in the transaction that creates the rest of the entity group.  Because
of the "only one put per entity in a transaction" rule, that extra put
comes either before or after the transaction that creates all of the
children.  Either way, the sequence can leave an incomplete,
inconsistent, or incorrect group that I have to write code to handle.
(If it's before, I can have a root that has no children.  If it's
after, I can have a root that doesn't refer to children that exist.)
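
To make the "before" case concrete, here is a minimal sketch that uses a plain dict as a stand-in for the datastore.  Nothing in it is the App Engine API; `Abort`, `create_group`, and the key strings are hypothetical names chosen for illustration.

```python
class Abort(Exception):
    """Stands in for the request being killed between two datastore calls."""


def create_group(store, root_key, child_keys, abort_between=False):
    # Step 1: the extra put. The root is written on its own so that it
    # has a key the children can use as their parent.
    store[root_key] = {"kind": "Root"}
    if abort_between:
        # Simulate an abort after the first put and before the second.
        raise Abort()
    # Step 2: a separate write creates all of the children.
    for key in child_keys:
        store[key] = {"kind": "Child", "parent": root_key}


store = {}
try:
    create_group(store, "root", ["c1", "c2"], abort_between=True)
except Abort:
    pass

# The root exists but none of its children do.
children = [k for k, v in store.items() if v.get("parent") == "root"]
```

After the simulated abort, `store` holds the root and `children` is empty, which is exactly the state I'd have to detect and repair.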

Also, there's a certain cleanliness in creating a bunch of db.Model
instances and storing them with a single db.put, and now there's the
"it's run in the datastore" pragmatic argument.

Of course, it isn't just how common an operation is; its position in
the execution path matters too.  For example, if it's part of startup and
it fails too often, that's a big deal even if the whole application
spends very little time in startup.  In my case, it's part of a loop
preamble and my application spends almost all of its time in the
loop.  In some cases, being able to start the loop quickly is a big
deal.

As a practical matter, the less often my application creates such
groups, the less effort that I want to expend on ensuring their
consistency/correctness/completeness during creation.  (I don't even
want to spend time ensuring consistency/correctness/completeness of
common operations but ....)  I'd much rather spend time speeding up
common operations or adding features.  Plus, the less common something
is, the less often it will run into trouble, which can make it harder
to detect such trouble and to write the code that handles it
correctly.  (In my case, it's on a critical code path, so it has to
work reliably.)

If db.get_next_key comes with the relevant changes to db.Model, I'd
probably be happy.  (Your use case must be somewhat different if
db.get_next_key satisfies it but
get_application_process_instance_unique_string doesn't because "get
next key" changes faster.)  The "probably" is because I don't know if
I'm going to take a hit because it's a datastore operation that can
increase the number of application failures.
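
Here's a rough sketch of how I imagine using it, with a dict standing in for the datastore and an in-process counter standing in for id allocation.  `get_next_key`, `build_group`, and `put_all` are hypothetical here; the real operation's name and signature haven't been decided, and the tuple keys are just an illustration.

```python
import itertools

_next_id = itertools.count(1)


def get_next_key(kind, parent=None):
    """Hypothetical: reserve an id for (kind, parent) without writing anything."""
    return (parent, kind, next(_next_id))


def build_group(n_children):
    # Keys are known up front, so the root can refer to its children and
    # the children can name their parent before anything is stored.
    root_key = get_next_key("Root")
    child_keys = [get_next_key("Child", parent=root_key)
                  for _ in range(n_children)]
    root = {"key": root_key, "children": child_keys}
    children = [{"key": key} for key in child_keys]
    return [root] + children


def put_all(store, entities):
    # All-or-nothing stand-in for a single batch put on one entity group:
    # stage every write first, then swap the staged copy in.
    staged = dict(store)
    for entity in entities:
        staged[entity["key"]] = entity
    store.clear()
    store.update(staged)


store = {}
group = build_group(2)
put_all(store, group)
```

With ids allocated ahead of time, the root and its children are written in one step, so there's no window in which the root exists without its children or without its references to them.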

> as for quotas, those are only checked at the beginning of each call,
> not while the call is running in the datastore itself.

I think of every datastore transaction, URL fetch, and the "return the
result to the user" step as an opportunity for GAE to abort my
application.  Yes, I know that GAE can abort my application between
such operations, but those are the only things that create "state" that
I have to manage if an abort happens.

While I can believe that there are cases where minimizing the number
of abort opportunities isn't the right approach, I suspect that
they're fairly rare so minimizing them is one of my rules of thumb.

Thanks for your help and attention,
-andy

On Dec 12, 5:11 pm, ryan <ryanb+appeng...@google.com> wrote:
> On Dec 12, 10:06 am, Andy Freeman <ana...@earthlink.net> wrote:
>
>
>
> > You're missing the ability to read my mind.  I haven't mentioned that
> > (my) parent has references to (some of) its children.
>
> > The "one put per entity in a transaction" rule means that I can't
> > update parent in that transaction after the child put.  I can't use
> > parent as a parent until it has a valid key.  I can't generate a
>
> ahhh...got it. ok. in that case, you're right. using id-based keys
> would make your life easier, since we do the heavy lifting of
> generating a unique identifier (the parent key), but it would require
> an extra put().
>
> we've actually recently come across the same use case internally, a
> couple of times. to address it, we've considered adding a "get next
> key" operation. you'd provide the kind and parent key, if any, and it
> would allocate an id you can attach to that key for use in a later
> put(). you'd use it to decouple id allocation and entity insertion, which
> is what you want to do here.
>
> the datastore reserves ids in batches, so in the common case this
> operation would take well under 1ms, similar to memcache. if the
> datastore has used up its last batch and needs to allocate a new one,
> that will take around 10-20ms, but that's very rare.
>
> having said that, i'm still a little skeptical that you truly need to
> optimize this. relative to your overall traffic pattern, i'm guessing
> this operation is fairly rare, so i wonder if avoiding a single extra
> put() is really that important. still, let us know what you think of
> the "get next key" operation.
>
> > Which reminds me - is the transaction for db.put([a, b,]) (for a and b
> > in the same entity group) run in the datastore or is it run using an
> > application-side transaction like
>
> good question! unlike transactions, batch puts, gets, and deletes are
> run in the datastore. if they hit contention, they're retried similar
> to transactions, except that the retries happen in the datastore, not
> in the application side python API. also, the work for each entity
> group is done in parallel, up to a point, as opposed to transactions.
> given that, you should generally see both asymptotic and constant
> factor speedups with batch writes and reads, relative to doing them
> individually in application code.
>
> as for quotas, those are only checked at the beginning of each call,
> not while the call is running in the datastore itself.