[appengine-java] Displaying 1000 tweets eats up datastore quotas in minutes

2012-01-08 Thread Serdar
My app archives and displays tweets in a simple layout, which lets
people easily browse older tweets of Twitter users.

This is what happens in a typical user page:

- Get 100 more tweets via Twitter API and save to the datastore. Each
tweet is stored in a single Entity.

- Get 1000 (will be 200 in the real case) tweets from the datastore
and display.

These datastore reads and writes use up the quotas very quickly.
Even a single user (that's me, testing) exhausts them in minutes by
checking the tweets of one or two Twitter users.

I'll use memcache for the reads, and that will help, but I don't see
how my app could serve more than 10 users a day.

One idea is to save, say, 100 tweets in a single Entity, but that just
doesn't sound right in terms of data structure.

How would you store and display tweets (more than 100 a page) in your
application? (A typical visitor would like to browse some thousands of
tweets.)

-- 
You received this message because you are subscribed to the Google Groups 
Google App Engine for Java group.
To post to this group, send email to google-appengine-java@googlegroups.com.
To unsubscribe from this group, send email to 
google-appengine-java+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/google-appengine-java?hl=en.



Re: [appengine-java] Displaying 1000 tweets eats up datastore quotas in minutes

2012-01-08 Thread Amy Unruh
Serdar,

If you are frequently pulling in and storing many users' Twitter streams,
this might well require you to enable billing for your app eventually.
However, during initial development and testing, you can probably reduce
costs enough to avoid that (e.g., try turning down cron frequency). See
also the documentation regarding billable resources (e.g.
http://code.google.com/appengine/docs/billing.html#Billable_Resource_Unit_Cost).
For example, if not all properties in your entities need to be indexed,
you can reduce your write costs by setting some to unindexed.
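
To make the effect of unindexed properties concrete, here is a rough back-of-the-envelope sketch in plain Java. The cost formula (2 write operations per new entity plus 2 per indexed property value) and the free quota of 50,000 write operations per day are assumptions based on the billing documentation as it read around the time of this thread, so check them against the current page. The 5-property tweet entity is hypothetical.

```java
/**
 * Rough estimate of how many one-entity-per-tweet writes fit in a day.
 * All constants are assumptions, not values read from the App Engine SDK.
 */
final class WriteCostEstimate {
    // Assumed: a new entity put costs 2 write ops regardless of size.
    static final int BASE_WRITES_PER_PUT = 2;
    // Assumed: each indexed property value adds 2 write ops.
    static final int WRITES_PER_INDEXED_VALUE = 2;
    // Assumed free daily quota of datastore write operations.
    static final int FREE_WRITE_OPS_PER_DAY = 50_000;

    /** Write ops charged for one new entity with the given number of indexed properties. */
    static int writeOpsPerPut(int indexedProperties) {
        return BASE_WRITES_PER_PUT + WRITES_PER_INDEXED_VALUE * indexedProperties;
    }

    /** Number of such entities that fit into the assumed free daily quota. */
    static int tweetsPerDay(int indexedProperties) {
        return FREE_WRITE_OPS_PER_DAY / writeOpsPerPut(indexedProperties);
    }

    public static void main(String[] args) {
        // Hypothetical tweet entity with 5 properties, all indexed.
        System.out.println(tweetsPerDay(5)); // prints 4166
        // Same entity with only one property (say, the timestamp) left indexed.
        System.out.println(tweetsPerDay(1)); // prints 12500
    }
}
```

Under these assumptions, leaving four of five properties unindexed roughly triples the number of tweets that can be stored per day.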

As you note, you ought to be able to greatly reduce the number of reads
by using memcache.  You can use appstats (
http://code.google.com/appengine/docs/python/tools/appstats.html) to get
more detail on where your reads and writes are occurring.
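
The read-through caching idea can be sketched as follows. This is only an illustration in plain Java: an in-process HashMap stands in for memcache, and loadFromDatastore is a hypothetical placeholder that counts one read per tweet instead of calling the real datastore API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Cache-aside sketch: look in the cache first, fall back to the datastore. */
final class CachedTimeline {
    // Stand-in for memcache, keyed by "user:count".
    private final Map<String, List<String>> cache = new HashMap<>();
    // Counts simulated datastore entity reads.
    private int datastoreReads = 0;

    // Hypothetical datastore query; each returned tweet counts as one read.
    private List<String> loadFromDatastore(String user, int count) {
        List<String> tweets = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            tweets.add(user + " tweet #" + i);
            datastoreReads++;
        }
        return tweets;
    }

    /** Returns the page from the cache if present, otherwise loads and caches it. */
    List<String> timeline(String user, int count) {
        String key = user + ":" + count;
        List<String> cached = cache.get(key);
        if (cached != null) {
            return cached; // cache hit: no datastore reads
        }
        List<String> loaded = loadFromDatastore(user, count);
        cache.put(key, loaded);
        return loaded;
    }

    int datastoreReads() {
        return datastoreReads;
    }

    public static void main(String[] args) {
        CachedTimeline t = new CachedTimeline();
        t.timeline("serdar", 200);
        t.timeline("serdar", 200); // served from the cache
        System.out.println(t.datastoreReads()); // prints 200
    }
}
```

In a real app the cached page would also need to be invalidated or refreshed whenever new tweets for that user are saved.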




Re: [appengine-java] Displaying 1000 tweets eats up datastore quotas in minutes

2012-01-08 Thread Matthew Jaggard
You're thinking far too much like an RDBMS user. Imagine for a moment that
querying a database is like picking up polystyrene balls. SQL Server (or
whatever) is like a vacuum cleaner; the datastore is like a pair of hands.
Picking up a million polystyrene balls with your hands is perfectly
possible but hopelessly inefficient. Picking up one really big piece of
polystyrene, however, is trivial, size being almost irrelevant. Equally, SQL
Server/vacuum cleaner will collect tiny parts from various places with ease.

So, in your case, I'd definitely combine multiple tweets. I might even put
all of the tweets for a user in the same entity as his other details
(login, etc.). See Objectify and its load groups for how to do this while
keeping logical separation in Java objects.
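
As a minimal sketch of combining tweets, the plain-Java helper below groups a tweet list into chunks that could each be stored as one entity. It does not use Objectify or the datastore API; the class name, the chunk size of 100, and the use of plain strings for tweets are all illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Groups tweets into fixed-size chunks, one chunk per would-be entity. */
final class TweetChunker {
    /** Splits tweets into consecutive chunks of at most chunkSize items. */
    static List<List<String>> chunk(List<String> tweets, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int start = 0; start < tweets.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, tweets.size());
            // Copy the slice so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(tweets.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> tweets = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            tweets.add("tweet #" + i);
        }
        List<List<String>> chunks = chunk(tweets, 100);
        // A 1000-tweet page would then touch 10 entities instead of 1000.
        System.out.println(chunks.size()); // prints 10
    }
}
```

Each chunk would have to stay under the datastore's per-entity size limit (1 MB), which 100 short tweets should fit comfortably.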

Mat.
