The first question I would ask is:  Is this really a problem?  Have
you looked at your bill and decided for certain that the savings would
be worth changing your code & possibly cluttering your business logic?

That said, I would be tempted to see what optimization you can make
with memcache.  Seems like your write load is light and your read load
is heavy, so there's probably a lot of opportunity here without
fundamentally changing your data architecture.
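
To make that concrete, here's a rough, untested sketch of the
read-through / write-through pattern I have in mind.  The model shape,
the date-string key names, and the helper names are all guesses at your
setup, so treat it as an illustration rather than a drop-in:

from google.appengine.api import memcache
from google.appengine.ext import db

CACHE_SECONDS = 60 * 60  # how stale a GET you can tolerate; tune freely

class Day(db.Model):
    # Stand-in for your Day entity; your actual 6 float / 2 bool /
    # 1 int / 1 string properties would go here (unindexed, as you said).
    value = db.FloatProperty(indexed=False)
    flag = db.BooleanProperty(indexed=False)
    count = db.IntegerProperty(indexed=False)
    note = db.StringProperty(indexed=False)

def get_days(date_strings):
    """Return Day entities for a list of date strings, memcache first."""
    cached = memcache.get_multi(date_strings, key_prefix='day:')
    missing = [d for d in date_strings if d not in cached]
    if missing:
        # Only the cache misses hit the datastore (and get billed as reads).
        fetched = db.get([db.Key.from_path('Day', d) for d in missing])
        found = dict((d, e) for d, e in zip(missing, fetched) if e is not None)
        memcache.set_multi(found, key_prefix='day:', time=CACHE_SECONDS)
        cached.update(found)
    # Preserve request order; days with no stored entity come back as None.
    return [cached.get(d) for d in date_strings]

def put_day(date_string, day):
    """Write one Day, then refresh its cache entry so reads stay warm."""
    day.put()
    memcache.set('day:' + date_string, day, time=CACHE_SECONDS)

Since 80% of your requests are GETs and the typical POST touches a
single day, most of those per-entity reads should turn into memcache
hits, and the write path keeps the cache warm -- so you'd cut the
datastore read charges without redesigning around bigger chunk
entities.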

Jeff

On Thu, Sep 15, 2011 at 11:18 PM, Steve <unetright.thebas...@xoxy.net> wrote:
> I'm looking for some opinions on how far I should aggregate my currently
> small entities into larger entities.  Presently I have 50,000 "Day"
> entities, where each entity represents a day.
> They are relatively small, with 6 float, 2 bool, 1 int, and 1 string
> properties.  No property indexes.  Datastore statistics say they average 161
> bytes of data and 80 bytes of metadata (241 bytes total).
> 80% of my user requests are GETs in which I read:
> 70% of the time, 10 day entities
> 25% of the time, 30 day entities
> 05% of the time, 365 day entities
> 20% of my user requests are POSTs in which I read & write:
> 75% of the time, 1 day entity
> 15% of the time, 7 day entities
> 10% of the time, ~15 day entities
> Since the new pricing is going to charge me per entity read and per entity
> write (and thankfully no property indexes here), I think I should look at
> reducing how many reads and writes are involved.  I could very easily chunk
> these individual day entities into groups of 10, or groups of 30.  That
> would (by my rough guess on metadata savings) put the entity size around 2k
> or 6k respectively.
> I am wondering where the line is between retrieving fewer entities and each
> entity becoming too big because of the overhead of unwanted days.  With a
> 10-day chunk, my most frequent GET request would usually need 2 entities (4k)
> where only half the data was in the needed range.  With a 30-day chunk,
> usually 1 entity would suffice (6k), but 4k of that would be unwanted
> overhead.
> I'm having a hard time forming an internal model of what the impact
> of serializing & deserializing the overhead days would be.  I wish appstats
> weren't just for RPCs.  I'm guessing the extra time to transfer the larger
> entities to/from the datastore is relatively minimal with Google's network
> infrastructure.  But now that CPU is throttled down to 600 MHz, I don't know
> what kind of latency I'd be adding with serialization.
> Right now my most common POST operation is to put 1 entity of 241 bytes.
> With a 30-day chunk, that would still be a single entity put, but 6k in size.
> Any opinions, ideas, gut feelings, etc?
> Cheers,
> Steve
>
