Thank you very much, Andy. I was never totally certain I understood
exactly what Nick had said.

In short, to remove old properties, you have to instantiate a fresh
entity yourself the normal Python way, copy over the data you want, and
put() it back with the identical key_name or ID, parent, etc. (i.e. the
same key).
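
To make the mechanism concrete, here is a toy model in plain Python of
the behaviour Andy describes in the quoted text below. It is only a
sketch and does not use the App Engine SDK: ToyDatastore, ToyModel,
reput() and replace_with_fresh() are hypothetical names made up for
illustration, and _loaded_record merely stands in for the protocol
buffer that a fetched instance holds on to.

```python
# Toy model only: NOT the App Engine SDK. All names here are hypothetical.

class ToyDatastore(object):
    """Maps a key to the raw dict of stored property values."""
    def __init__(self):
        self.rows = {}


class ToyModel(object):
    """Mimics a model instance that remembers the record it was loaded from."""

    # Current model definition; 'old_property' has been removed from it.
    properties = ('name',)

    def __init__(self, key, **values):
        self.key = key
        self._loaded_record = None  # stand-in for the cached protocol buffer
        for prop in self.properties:
            setattr(self, prop, values.get(prop))

    @classmethod
    def get(cls, store, key):
        record = store.rows[key]
        obj = cls(key, **record)
        obj._loaded_record = dict(record)  # keeps stale properties around
        return obj

    def put(self, store):
        # Start from the loaded record (if any) and overwrite only the
        # properties that the current definition knows about.
        record = dict(self._loaded_record or {})
        for prop in self.properties:
            record[prop] = getattr(self, prop)
        store.rows[self.key] = record


def reput(store, key):
    """get() followed by put() on the same instance."""
    ToyModel.get(store, key).put(store)


def replace_with_fresh(store, key):
    """Build a new instance with the same key and copy the wanted data."""
    old = ToyModel.get(store, key)
    wanted = dict((p, getattr(old, p)) for p in ToyModel.properties)
    ToyModel(key, **wanted).put(store)


store = ToyDatastore()
store.rows['k1'] = {'name': 'a', 'old_property': 'stale'}

reput(store, 'k1')
after_reput = dict(store.rows['k1'])   # old_property is still there

replace_with_fresh(store, 'k1')
after_fresh = dict(store.rows['k1'])   # only 'name' remains
```

In this toy, the plain re-put() writes old_property straight back,
while the fresh instance has no loaded record to reuse and so stores
only the properties in the current definition.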

I starred your bug. I won't go into my disillusionment with the issue
tracker here; however, in this case I think the fix might be better
done in a third-party library or middleware. That is arguably better
architecture, and at any rate it would have a better chance of being
implemented.

On Oct 13, 1:23 am, Andy Freeman <ana...@earthlink.net> wrote:
> > There's no need to use a new model name: You can simply create new entities
> > to replace the old ones, under the current model name. If you're using key
> > names, you can construct a new entity with the same values as the old ones,
> > and store that.
>
> Note the precise wording.  You can't just put() the instance that you
> read from the datastore, the instance that doesn't have the properties
> that you've deleted, you have to get(), make a new db.Model instance
> with the same key, populate its properties from the instance that you
> got, and put the new instance.  If you're not using key names, you
> can't create that new db.Model instance (as of 1.2.5) because you
> can't create an instance with a specified id.
>
> The problem is in db.Model._to_entity() (and maybe
> db.Expando._to_entity()).  If the instance was created from a protocol
> buffer, put() tries to reuse said protocol buffer, and it still
> contains values for properties that you've deleted.  These values are
> not deleted by _to_entity() so they end up being sent back to the
> datastore.
>
> I've filed http://code.google.com/p/googleappengine/issues/detail?id=2251 .
>
> On Oct 10, 1:29 pm, "Nick Johnson (Google)" <nick.john...@google.com>
> wrote:
>
> > On Sat, Oct 10, 2009 at 6:27 PM, Jason Smith
> > <j...@proven-corporation.com> wrote:
>
> > > Thanks for the help guys. I think this is an important matter to have
> > > cleared up.
>
> > > It's bedtime here (GMT+7) however tomorrow I think I will do some
> > > benchmarks along the lines of the example I wrote up in the SO
> > > question.
>
> > > At this point I would think the safest thing would be to completely
> > > change the model name, thereby guaranteeing that you will be writing
> > > entities with fresh keys. However I suspect it's not necessary to go
> > > that far. I'm thinking that on the production datastore, changing the
> > > model definition and then re-put()ing the entity will be what's
> > > required to realize a speed benefit when reducing the number of
> > > properties on a model. But the facts will speak for themselves.
>
> > There's no need to use a new model name: You can simply create new entities
> > to replace the old ones, under the current model name. If you're using key
> > names, you can construct a new entity with the same values as the old ones,
> > and store that.
>
> > You can also use the low-level API in google.appengine.api.datastore; this
> > provides a dict-like interface from which you can delete unwanted fields.
>
> > -Nick Johnson
>
> > > On Oct 11, 12:17 am, Andy Freeman <ana...@earthlink.net> wrote:
> > > > > In other words: if I want to reduce the size of my entities, is
> > > > > it necessary to migrate the old entities to ones with the new
> > > > > definition?
>
> > > > I'm pretty sure that the answer to that is yes.
>
> > > > >              If so, is it sufficient to re-put() the entity, or must I
> > > > > save under a wholly new key?
>
> > > > > I think that it should be sufficient to re-put() but decided to test
> > > > > that hypothesis.
>
> > > > It isn't sufficient in the SDK - the SDK admin console continues to
> > > > show values for properties that you've deleted from the model
> > > > definition after the re-put().  Yes, I checked to make sure that those
> > > > properties didn't have values before the re-put().
>
> > > > I did the get and re-put() in a transaction, namely:
>
> > > > def txn(key):
> > > >     obj = Model.get(key)
> > > >     obj.put()
> > > > assert db.run_in_transaction(txn, key)
>
> > > > I tried two things to get around this problem.  The first was to add
> > > > db.delete(obj.key()) right before obj.put().  (You can't do obj.delete
> > > > because that trashes the obj.)
>
> > > > The second was to add "obj.old_property = None" right before the
> > > > obj.put() (old_property is the name of the property that I deleted
> > > > from Model's definition.)
>
> > > > Neither one worked.  According to the SDK's datastore viewer, existing
> > > > instances of Model continued to have values for old_property after I
> > > > updated them with that transaction even with the two changes, together
> > > > or separately.
>
> > > > If this is also true of the production datastore, this is a big deal.
>
> > > > On Oct 10, 4:44 am, Jason Smith <j...@proven-corporation.com> wrote:
>
> > > > > Hi, group. My app's main cost (in dollars and response time) is in the
> > > > > db.get([list, of, keys, here]) call in some very high-trafficked code.
> > > > > I want to pare down the size of that model to the bare minimum with
> > > > > the hope of reducing the time and CPU fee for this very common
> > > > > activity. Many users who are experiencing growth in the app popularity
> > > > > probably have this objective as well.
>
> > > > > I have two questions that hopefully others are thinking about too.
>
> > > > > 1. Can I expect the API time of a db.get() with several hundred keys
> > > > > to reduce roughly linearly as I reduce the size of the entity?
> > > > > Currently the entity has the following data attached: 9 String, 9
> > > > > Boolean, 8 Integer, 1 GeoPt, 2 DateTime, 1 Text (avg size ~100 bytes
> > > > > FWIW), 1 Reference, 1 StringList (avg size 500 bytes). The goal is to
> > > > > move the vast majority of this data to related classes so that the
> > > > > core fetch of the main model will be quick.
>
> > > > > 2. If I do not change the name of the entity (i.e. just delete all the
> > > > > db.*Property definitions in the model), will I still incur the same
> > > > > high cost fetching existing entities? The documentation says that all
> > > > > properties of a model are fetched simultaneously. Will the old
> > > > > unneeded properties still transfer over RPC on my dime and while users
> > > > > wait? In other words: if I want to reduce the size of my entities, is
> > > > > it necessary to migrate the old entities to ones with the new
> > > > > definition? If so, is it sufficient to re-put() the entity, or must I
> > > > > save under a wholly new key?
>
> > > > > Thanks very much to anyone who knows about this matter!
>
> > --
> > Nick Johnson, Developer Programs Engineer, App Engine
> > Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number:
> > 368047
>
>
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com
To unsubscribe from this group, send email to 
google-appengine+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~----------~----~----~----~------~----~------~--~---
