[google-appengine] Re: Entity Sizing / Grouping wrt New Pricing

2011-09-16 Thread Joops

Going through a similar process myself.
(combining multiple entities into single entities as bundles of JSON)

I think it's a good idea, and you'll want to run experiments to see
what size works best for you.
I've made it so I can tweak the point at which my entities are forked
into separate entities.
I found that deserialization was slow when my single entities were too
large.
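
Roughly what that kind of setup can look like (a sketch only, assuming
the Python runtime with ext.db; DayBundle and the key scheme are
illustrative names, not anything from this thread):

    import json
    from google.appengine.ext import db

    # Tweakable fork point: how many day records one entity holds.
    DAYS_PER_BUNDLE = 30

    class DayBundle(db.Model):
        # All of the bundle's days, serialized as one JSON list.
        days_json = db.TextProperty()

    def bundle_key_name(day_ordinal):
        # Days 0-29 share bundle 0, days 30-59 share bundle 1, etc.
        return 'bundle-%d' % (day_ordinal // DAYS_PER_BUNDLE)

    def load_days(day_ordinal):
        bundle = DayBundle.get_by_key_name(bundle_key_name(day_ordinal))
        return json.loads(bundle.days_json) if bundle else []

Changing DAYS_PER_BUNDLE (and rewriting the stored bundles) is the
knob for experimenting with different sizes.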

Don't forget memcache!  (So if you need the data for today, you can
just grab this month's data from memcache.)
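
A read-through pattern along these lines works (again just a sketch,
reusing the hypothetical DayBundle from above):

    from google.appengine.api import memcache

    def get_bundle(key_name):
        # Try memcache first; fall back to the datastore and cache
        # the result.  The current month's bundle stays hot.
        bundle = memcache.get(key_name)
        if bundle is None:
            bundle = DayBundle.get_by_key_name(key_name)
            if bundle is not None:
                memcache.set(key_name, bundle)
        return bundle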

J
On Sep 15, 11:18 pm, Steve  wrote:
> I'm looking for some opinions on the degree to which I should aggregate my
> currently small entities into larger ones.  Presently I have 50,000 "Day"
> entities, where each entity represents a day.
>
> They are relatively small, with 6 float, 2 bool, 1 int, and 1 string
> property.  No property indexes.  Datastore statistics say they average 161
> bytes of data and 80 bytes of metadata (241b total).
>
> 80% of my user requests are GETs in which I read:
> 70% of the time, 10 day entities
> 25% of the time, 30 day entities
> 05% of the time, 365 day entities
>
> 20% of my user requests are POSTs in which I read & write:
> 75% of the time, 1 day entity
> 15% of the time, 7 day entities
> 10% of the time, ~15 day entities
>
> Since the new pricing is going to charge me per entity read and per entity
> write (and thankfully no property indexes here), I think I should look at
> reducing how many reads and writes are involved.  I could very easily chunk
> these individual day entities into groups of 10, or groups of 30.  That
> would (by my rough guess on metadata savings) put the entity size around 2k
> or 6k respectively.
>
> I am wondering where the line is between retrieving fewer entities and each
> entity becoming too big because of the overhead of unwanted days.  With a 10
> day chunk, my most frequent GET request would usually need 2 entities (4k)
> where only half the data was in the needed range.  At a 30 day chunk,
> usually 1 entity would suffice (6k), but 4k of that would be unwanted
> overhead.
>
> I'm having a hard time forming an internal model of what the impact
> of serializing & deserializing the overhead days would be.  I wish appstats
> wasn't just for RPCs.  I'm guessing the extra time to transfer the larger
> entities to/from the datastore is relatively minimal with Google's network
> infrastructure.  But now that CPU is throttled down to 600MHz, I don't know
> what kind of latencies I'd be adding with serialization.
>
> Right now my most common POST operation is to put 1 entity of 241b.  With a
> 30 day chunk, that would still be a single entity put, but 6k in size.
>
> Any opinions, ideas, gut feelings, etc?
>
> Cheers,
> Steve
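
A quick back-of-the-envelope on that trade-off, using the read mix and
the 241-byte average quoted above (a sketch: "worst case" assumes a
requested span can start at any offset within a chunk, and the size
estimate ignores the per-entity metadata savings, so it skews a little
high):

    import math

    ENTITY_BYTES = 241  # avg per day, per datastore statistics
    GET_MIX = [(0.70, 10), (0.25, 30), (0.05, 365)]  # (probability, days)

    def worst_case_reads(span_days, chunk_days):
        # A span of n consecutive days straddles at most
        # ceil((n - 1) / chunk) + 1 chunk entities.
        return int(math.ceil((span_days - 1.0) / chunk_days)) + 1

    for chunk in (1, 10, 30):
        reads = sum(p * worst_case_reads(d, chunk) for p, d in GET_MIX)
        kbytes = reads * chunk * ENTITY_BYTES / 1024.0
        print('chunk=%2d days: ~%.1f reads/GET, ~%.1fKB fetched'
              % (chunk, reads, kbytes))

That works out to roughly 33 entity reads per GET unchunked, ~4.3 with
10-day chunks, and ~2.6 with 30-day chunks, so most of the billing win
comes from the first step up in chunk size.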



[google-appengine] Re: Entity Sizing / Grouping wrt New Pricing

2011-09-16 Thread Steve
Thanks for the input.  If you don't mind me asking, how large were 
your entities when you noticed deserialization taking a long time?

Cheers,
Steve
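
(A rough way to time just the deserialization step, separate from the
RPC, using the ext.db protobuf helpers; a sketch, not production code:)

    import time
    from google.appengine.ext import db

    def avg_decode_seconds(entity, n=100):
        # Serialize once, then time repeated decoding of the protobuf.
        pb = db.model_to_protobuf(entity)
        start = time.time()
        for _ in range(n):
            db.model_from_protobuf(pb)
        return (time.time() - start) / n

    # e.g. compare a small 241b entity against a 6k bundled one:
    # print('%.3f ms' % (1000 * avg_decode_seconds(some_entity)))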
