I have users with more than half a million followers using my app. My own
app account has 140k followers. I personally know a few of my app users who
have more than 200,000 followers :)



On 28 April 2011 08:44, Brandon Wirtz <drak...@digerat.com> wrote:

> Why not?  My two biggest projects have 180k and 90k friends.
>
>
>
> *From:* google-appengine@googlegroups.com [mailto:
> google-appengine@googlegroups.com] *On Behalf Of *Nick Johnson (Google)
> *Sent:* Wednesday, April 27, 2011 7:40 PM
>
> *To:* google-appengine@googlegroups.com
> *Subject:* Re: [google-appengine] Appropriate way to save hundreds of
> thousands of ids per user
>
>
>
> Hi David,
>
>
>
> Can you elaborate on your exact use-case? You mentioned twitter friends,
> but I'm fairly sure no users have 200,000 friends on Twitter.
>
>
>
> -Nick Johnson
>
> On Mon, Apr 25, 2011 at 2:54 PM, David Parks <davidpark...@yahoo.com>
> wrote:
>
> I did indeed mean pulling back a result set of, say, 200,000 rows. If I’m
> following the conversation correctly, then what you described was storing all
> IDs in one field, querying that field, and de-serializing all IDs into an
> array that you can then search for the IDs you need.
>
>
>
> I like that idea. But I certainly can’t tell you whether the overhead of
> reading all values and deserializing them will be better or worse than the
> overhead of scrolling through a large result set and loading the database
> with hundreds of millions of rows. Of all the databases you could be using,
> Google’s Bigtable is certainly well designed for large data sets.
>
>
>
> It seems that your proposed method makes great sense when you need the
> entire result set (or close to it) for one or more users. But when you only
> need 100 results of 150,000, the deserialization process is going to
> constitute a measurable overhead. Also, I can’t say for sure how the Google
> datastore will perform when you commit hundreds of millions of rows to it.
> Of course, if small queries like that are rare, then maybe it’s not so
> important to consider them.
>
>
>
> Anyway, I guess you could write, in perhaps a day or less, a very simple
> test case that populates the datastore with both scenarios and profiles them.
>
>
>
> Doing the profiling work will probably give you some very useful insight
> and experience on how things will actually perform.
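
A rough sketch of such a profiling test, written in Python. It assumes nothing about the App Engine API: a comma-separated string stands in for the "all IDs in one field" approach and a dict stands in for "one record per ID", so the function names and sizes here are illustrative only. Real numbers would have to come from running the same comparison against the datastore itself.

```python
import random
import time

def build_serialized(ids):
    """Strategy A stand-in: one comma-separated blob holding every ID."""
    return ",".join(str(i) for i in ids)

def lookup_serialized(blob, wanted):
    """Deserialize the whole blob, then test membership for each wanted ID."""
    stored = set(int(tok) for tok in blob.split(","))
    return [i for i in wanted if i in stored]

def build_rows(ids):
    """Strategy B stand-in: one 'row' per ID, keyed by the ID itself."""
    return {i: True for i in ids}

def lookup_rows(rows, wanted):
    """Look up only the wanted keys, as a batch get by key would."""
    return [i for i in wanted if i in rows]

def time_call(fn, *args, repeat=5):
    """Return the best wall-clock time over `repeat` runs of fn(*args)."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

if __name__ == "__main__":
    rng = random.Random(42)
    ids = rng.sample(range(10**9), 200000)   # one user's stored IDs
    wanted = rng.sample(ids, 100)            # the 100 IDs to check
    blob = build_serialized(ids)
    rows = build_rows(ids)
    print("serialized blob lookup:", time_call(lookup_serialized, blob, wanted))
    print("row-per-ID lookup:     ", time_call(lookup_rows, rows, wanted))
```

The in-memory stand-ins ignore network and storage costs, so they only show the shape of the test, not the answer.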
>
>
>
> I’d also suggest that you encapsulate this functionality so that you can
> easily replace one strategy with another without changing code unrelated to
> the data store (e.g. design your code using proper data access objects to
> keep this code separate from the rest of your code, and code to interfaces
> up front).
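
One way to read the suggestion above, sketched in Python. The interface name `FollowerIdStore` and both implementations are hypothetical, and they keep data in memory rather than calling the datastore; the point is only that calling code depends on the interface, so one strategy can be swapped for the other.

```python
from abc import ABC, abstractmethod

class FollowerIdStore(ABC):
    """Hypothetical data access interface for one user's stored IDs."""

    @abstractmethod
    def save_ids(self, user_id, ids):
        """Persist all IDs belonging to user_id."""

    @abstractmethod
    def contains(self, user_id, candidate_ids):
        """Return the subset of candidate_ids stored for user_id."""

class SerializedBlobStore(FollowerIdStore):
    """Keeps every ID for a user in one serialized field (in-memory stand-in)."""

    def __init__(self):
        self._blobs = {}

    def save_ids(self, user_id, ids):
        self._blobs[user_id] = ",".join(str(i) for i in ids)

    def contains(self, user_id, candidate_ids):
        blob = self._blobs.get(user_id, "")
        stored = {int(tok) for tok in blob.split(",") if tok}
        return {i for i in candidate_ids if i in stored}

class RowPerIdStore(FollowerIdStore):
    """Keeps one record per (user, ID) pair (in-memory stand-in)."""

    def __init__(self):
        self._rows = set()

    def save_ids(self, user_id, ids):
        self._rows.update((user_id, i) for i in ids)

    def contains(self, user_id, candidate_ids):
        return {i for i in candidate_ids if (user_id, i) in self._rows}

def ids_already_stored(store, user_id, candidate_ids):
    """Application code sees only the interface, never the strategy."""
    return store.contains(user_id, candidate_ids)
```

Passing either `SerializedBlobStore()` or `RowPerIdStore()` to `ids_already_stored` gives the same answer, which is what makes the two strategies interchangeable for profiling.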
>
>
>
>
>
>
>
> *From:* google-appengine@googlegroups.com [mailto:
> google-appengine@googlegroups.com] *On Behalf Of *Nischal Shetty
> *Sent:* Monday, April 25, 2011 10:34 AM
>
>
> *To:* google-appengine@googlegroups.com
> *Subject:* Re: [google-appengine] Appropriate way to save hundreds of
> thousands of ids per user
>
>
>
> @David
>
>
>
> Querying the whole group would mean having 200,000 results for a few of my
> users. Pulling all that and then searching, wouldn't that be inefficient? Or
> are you talking about a sharded ListProperty here?
>
>
>
>
>
>
>
> On 25 April 2011 05:41, David Parks <davidpark...@yahoo.com> wrote:
>
> That seems like a reasonable approach. But I think you should do both
> tests: 1) let Google do the work and store a lot of records, 2) query the
> whole group, parse it into an array, and search the array. It wouldn’t be
> too hard to create a simple test case that populates the data for whatever
> number of users you need to plan for and profiles the lookup and storage
> speeds of both.
>
>
>
> I’d love to know your results if you do test both approaches.
>
>
>
>
>
> *From:* google-appengine@googlegroups.com [mailto:
> google-appengine@googlegroups.com] *On Behalf Of *Nischal Shetty
> *Sent:* Friday, April 22, 2011 3:10 PM
>
>
> *To:* google-appengine@googlegroups.com
>
> *Subject:* Re: [google-appengine] Appropriate way to save hundreds of
> thousands of ids per user
>
>
>
> @David
>
>
>
> Thanks for the input. Every reply gives me some more insight into how I
> can achieve this. My use case is as follows:
>
>
>
> 1. At times I would need all the IDs in memory at the same time
>
> 2. Most of the time I would need to check whether a set of IDs input by the
> user (say 100 IDs) is present in the datastore
>
>
>
> I've been thinking of doing the following :
>
>
>
> 1. Persisting all the IDs by putting them into an array (I will probably
> have shards where each array would hold 50k IDs)
>
> 2. Implementing a bloom filter to check whether a set of IDs exists
> in the datastore.
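
A minimal sketch of the two ideas above, in Python. The class `BloomFilter`, the helper `shard_ids`, and the choice of SHA-256 with double hashing are assumptions made for illustration, not anything prescribed in this thread. A Bloom filter never reports a stored ID as missing, but it can report an absent ID as present, so a positive answer would still need confirming against the stored shards.

```python
import hashlib

class BloomFilter:
    """Fixed-size Bloom filter over a bytearray of bits."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from one digest (double hashing).
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd
        for k in range(self.num_hashes):
            yield (h1 + k * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means definitely absent; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

def shard_ids(ids, shard_size=50000):
    """Split a list of IDs into consecutive shards of at most shard_size."""
    return [ids[i:i + shard_size] for i in range(0, len(ids), shard_size)]
```

In use, the filter would be populated once per user from all shards, and only IDs for which `might_contain` returns True would trigger a read of the actual shard data.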
>
>
>
>
>
> On 22 April 2011 09:34, David Parks <davidpark...@yahoo.com> wrote:
>
> I don’t know your intended use of these IDs, so my thoughts here are limited
> to assumed use. Feel free to ignore thoughts that are off base for your use
> case.
>
>
>
> If, when you query for the IDs, you are looking for **all** the IDs, then
> just serialize them into one field, retrieve them as one record, and
> de-serialize them in a way that doesn’t require they all fit into memory at
> the same time (a tokenized CSV list is the most straightforward example, but
> you can do more compact serializations).
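
The tokenized CSV idea can be sketched as a Python generator. The function name `iter_ids` and the chunk size are illustrative assumptions; the serialized field is read in small pieces and each ID is yielded as soon as its token is complete, so the full parsed list never has to exist in memory.

```python
import io

def iter_ids(stream, chunk_size=8192):
    """Yield integer IDs from a comma-separated text stream, one at a time."""
    pending = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        parts = pending.split(",")
        pending = parts.pop()  # the last piece may be an incomplete token
        for tok in parts:
            if tok:
                yield int(tok)
    if pending:
        yield int(pending)

def find_ids(stream, wanted):
    """Return the wanted IDs that occur in the stream, scanning it once."""
    wanted = set(wanted)
    return {i for i in iter_ids(stream) if i in wanted}
```

For example, `find_ids(io.StringIO(blob), some_ids)` scans a serialized blob held in the string `blob` while keeping only the small `wanted` set and the current token in memory.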
>
>
>
> If you need to query for some subset of these IDs, then storing them in the
> datastore is indeed the way to go, I suspect. You can batch many
> inserts/updates. You’ll have a large table, which isn’t likely to be a
> problem with this datastore, but do test it. If lookup times degrade with
> size, you could consider partitioning your users into different groups
> (simple example: one group of user IDs that end in even numbers, another
> that ends in odd numbers). This can reduce the size of indexes and improve
> performance on some systems (I don’t have personal experience to tell you
> whether this is necessary in this system, but it’s a thought to consider).
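
The even/odd example generalizes to taking the user ID modulo the number of groups. A small Python sketch, with `partition_for` and `group_by_partition` as hypothetical helper names and numeric user IDs assumed:

```python
def partition_for(user_id, num_partitions=2):
    """Map a numeric user ID to a partition; 2 partitions means even/odd."""
    return user_id % num_partitions

def group_by_partition(user_ids, num_partitions=2):
    """Group user IDs by the partition they fall into."""
    groups = {p: [] for p in range(num_partitions)}
    for uid in user_ids:
        groups[partition_for(uid, num_partitions)].append(uid)
    return groups
```

Each partition would then map to its own table or entity kind, which is the part that would need testing to see whether it helps at all on this datastore.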
>
>
>
> Again, I just offer this as food for thought. If you describe your intended
> access patterns it will probably help guide the discussion. Good luck.
>
>
>
>
>
> *From:* google-appengine@googlegroups.com [mailto:
> google-appengine@googlegroups.com] *On Behalf Of *nischalshetty
> *Sent:* Tuesday, April 19, 2011 1:15 PM
> *To:* google-appengine@googlegroups.com
> *Subject:* [google-appengine] Appropriate way to save hundreds of
> thousands of ids per user
>
>
>
> Every user in my app would have thousands of ids corresponding to them. I
> would need to look up these ids often.
>
> Two things I could think of:
>
> 1. Put them into Lists (drawback is that lists have a maximum capacity of
> 5000 (hope I'm right here) and I have users who would need to save more than
> 150,000 ids)
> 2. Insert each id as a unique record in the datastore (too much data? as
> it would be users * ids of all users). Can I batch put 5000 records at a
> time? Can I batch get at least 100 - 500 records at a time?
>
> Is there any other way to do this? I hope my question's clear. Your
> suggestions are greatly appreciated.
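
Whatever the actual per-call limits turn out to be, batching usually comes down to slicing the records into fixed-size chunks. A Python sketch under stated assumptions: `put_batch` is a hypothetical callable standing in for the datastore's batch put, and the default batch size of 500 is a placeholder to be replaced with the documented limit.

```python
def batches(items, batch_size):
    """Yield consecutive slices of items, each at most batch_size long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def put_in_batches(records, put_batch, batch_size=500):
    """Send records through put_batch one chunk at a time.

    Returns the number of batch calls made.
    """
    calls = 0
    for batch in batches(records, batch_size):
        put_batch(batch)
        calls += 1
    return calls
```

The same `batches` helper works for reads: a set of 100 to 500 keys to look up can be chunked and passed to a batch get in the same way.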
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to google-appengine@googlegroups.com.
> To unsubscribe from this group, send email to
> google-appengine+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
>
>
>
>
> --
> -Nischal
>
> +91-9920240474
>
> twitter: NischalShetty <http://twitter.com/nischalshetty>
>
> facebook: Nischal <http://facebook.com/nischal>
>
>
>
> <http://www.justunfollow.com>
>
>
>
> --
> Nick Johnson, Developer Programs Engineer, App Engine



-- 
-Nischal
+91-9920240474
twitter: NischalShetty <http://twitter.com/nischalshetty>
facebook: Nischal <http://facebook.com/nischal>

<http://www.justunfollow.com>

