Thanks, Jeff. I understand that an entity is a blob from a serialization
point of view. I thought that there was a notion of a sub-blob
corresponding to each property and that when a property is updated,
only that sub-blob is written out to disk. You clarified that this is not
the case. Thanks.
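
To make the whole-blob model concrete, here is a minimal sketch in plain
Java. It is a toy, not the App Engine API: the class BlobStoreModel, its
serialize() format, and the byte counter are all made up for illustration.
The only point is that put() serializes every property, so the second put()
writes as many bytes as the first even though only string1 changed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only (hypothetical names); this is NOT the App Engine datastore API. */
class BlobStoreModel {
    // key -> serialized entity blob, standing in for the "big persistent HashMap"
    private final Map<String, byte[]> disk = new HashMap<>();
    private long bytesWritten = 0;

    /** Serializes ALL properties into one blob; there is no per-property sub-blob. */
    static byte[] serialize(Map<String, String> properties) {
        StringBuilder sb = new StringBuilder();
        // TreeMap gives a stable property order so equal maps yield equal blobs
        for (Map.Entry<String, String> e : new TreeMap<>(properties).entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Writes the entity whole and returns the number of bytes written. */
    int put(String key, Map<String, String> properties) {
        byte[] blob = serialize(properties);
        disk.put(key, blob);
        bytesWritten += blob.length;
        return blob.length;
    }

    long totalBytesWritten() {
        return bytesWritten;
    }

    public static void main(String[] args) {
        BlobStoreModel store = new BlobStoreModel();
        Map<String, String> entity = new HashMap<>();

        char[] filler = new char[20_000];
        Arrays.fill(filler, 'x');
        entity.put("string1", "small");
        entity.put("string2", new String(filler)); // large and never changed again

        int first = store.put("k", entity);   // "transaction 1"
        entity.put("string1", "SMALL");       // update only string1
        int second = store.put("k", entity);  // "transaction 2" rewrites everything

        System.out.println("first put:  " + first + " bytes");
        System.out.println("second put: " + second + " bytes");
    }
}
```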

But this seems to run counter to Ikai's suggestion. In the interest
of reducing the number of entities, one has to pack more data into an
entity. In practice, this means that one has to set more properties
on a given entity. Updating a single property then entails writing out the
whole blob encompassing the entity. On the other hand, proliferating
entities entails more put() calls (let us assume that there is no
indexing). There seems to be a tension here.
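
The two sides of this can be put into numbers with a back-of-the-envelope
sketch. Everything in it is an assumption made for illustration: the class
WriteCostSketch, the rule "one put() per entity written", and the idea that
each object serializes to a fixed number of bytes. It is not App Engine's
billing formula and it ignores indexes entirely.

```java
/** Back-of-the-envelope model; the cost rules are assumptions, not App Engine's billing. */
class WriteCostSketch {
    /** Cost of a workload: number of put() calls and total bytes serialized. */
    static final class Cost {
        final long puts;
        final long bytes;

        Cost(long puts, long bytes) {
            this.puts = puts;
            this.bytes = bytes;
        }

        @Override
        public String toString() {
            return puts + " puts, " + bytes + " bytes";
        }
    }

    /** One entity holding all n objects: every put rewrites the whole blob. */
    static Cost packed(int n, int objectBytes, int updates) {
        long blob = (long) n * objectBytes;
        long puts = 1L + updates; // one create plus one put per update
        return new Cost(puts, blob * puts);
    }

    /** One entity per object: n puts to create, then one small put per update. */
    static Cost split(int n, int objectBytes, int updates) {
        long puts = (long) n + updates;
        return new Cost(puts, (long) objectBytes * puts);
    }

    public static void main(String[] args) {
        int n = 4000, objectBytes = 100, updates = 10;
        System.out.println("packed: " + packed(n, objectBytes, updates));
        System.out.println("split:  " + split(n, objectBytes, updates));
    }
}
```

With these made-up numbers the packed design issues 11 puts and serializes
4,400,000 bytes, while the split design issues 4010 puts and serializes
401,000 bytes. Which side matters more depends on whether the cost is
dominated by the number of operations or by the bytes written.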


J.Ganesan

On Nov 5, 6:47 pm, Jeff Schnitzer <j...@infohazard.org> wrote:
> You're still thinking of an RDBMS.  Here is a more accurate mental
> model of the datastore:
>
> It's a big persistent HashMap.  The key is a Key (tuple of ancestor
> keys, kind, and id), the value is a serialized (protobuf) blob of your
> properties.  There is no update except loading an entity *whole* and
> writing an entity *whole* in a transaction.  Indexes are separate -
> they live in a different BigTable.  Querying for entities walks the
> index, then requires separate fetches for each entity blob.  This is
> why keys-only queries are cheaper.
>
> This is also why unindexed properties are "free" - it really doesn't
> matter whether you write 20 bytes or 20 kilobytes into the serialized
> blob.  Index updates are expensive because they require separate
> writes to other tables.
>
> Jeff
>
> On Sat, Nov 5, 2011 at 9:15 AM, J.Ganesan <j.gane...@datastoregwt.com> wrote:
> > Thank you, Ikai. Your comments were useful. I was fixated on "one
> > object one entity". Following your advice, I collapsed the entities.
> > Now, there are only a handful.
>
> > One last question: how does writing to disk work in the following
> > scenario?
>
> > // transaction 1
> > // set two objects as properties
> > // no doubt here
> > entity.setUnindexedProperty(string1, object1);
> > entity.setUnindexedProperty(string2, object2);
> > put();
>
> > // transaction 2
> > // update object1
> > entity.setUnindexedProperty(string1, object1);
> > put();
>
> > App Engine stores the bytes corresponding to object1 somewhere. I guess
> > that string1 points to the disk location, enabling overwriting with
> > the updated bytes. My question is whether the second transaction also
> > forces the bytes corresponding to object2 to be written to disk, which
> > would be needless.
>
> > J.Ganesan
>
> > On Nov 5, 3:56 am, "Ikai Lan (Google)" <ika...@google.com> wrote:
> >> If all 4000 entities are in a single entity group, in theory you can do this
> >> because it counts as a single transactional write. There's a maximum RPC
> >> size of 11 MB (implementation detail), so if you trip this, you're in some
> >> trouble - the RPC size includes not only the size of the entity but also the
> >> size of all the indexes.
>
> >> The problem is that this is a bad design. App Engine charges for datastore
> >> ops, so you're already using 4000 datastore write ops per request +
> >> multiples for indexes.
>
> >> Instead, try to figure out how you can write the data in as few entities as
> >> possible. I suspect you're still thinking relationally. You didn't answer
> >> my question about what problem you're trying to solve. What are you
> >> building? Why would 4000 writes be needed? Why can't all the data fit into
> >> a single entity?
>
> >> --
> >> Ikai Lan
> >> Developer Programs Engineer, Google App Engine
> >> plus.ikailan.com | twitter.com/ikai
>
> >> On Fri, Nov 4, 2011 at 8:31 AM, J.Ganesan <j.gane...@datastoregwt.com> wrote:
>
> >> > Thank you, Gerald. I will look for alternative implementations if I
> >> > cannot call put() many times within a transaction. I am waiting for
> >> > Ikai's comments.
>
> >> > J.Ganesan
>
> >> > On Nov 4, 11:02 am, Gerald Tan <woefulwab...@gmail.com> wrote:
> >> > > If there is no need to reference the objects from outside the group, you
> >> > > would probably find it a lot more efficient to store the whole array
> >> > > serialized as a byte array.
>
> >> > --
> >> > You received this message because you are subscribed to the Google Groups
> >> > "Google App Engine for Java" group.
> >> > To post to this group, send email to
> >> > google-appengine-java@googlegroups.com.
> >> > To unsubscribe from this group, send email to
> >> > google-appengine-java+unsubscr...@googlegroups.com.
> >> > For more options, visit this group at
> >> > http://groups.google.com/group/google-appengine-java?hl=en.
