Why not? My two biggest projects have 180k and 90k friends.
From: google-appengine@googlegroups.com [mailto:google-appengine@googlegroups.com] On Behalf Of Nick Johnson (Google)
Sent: Wednesday, April 27, 2011 7:40 PM
To: google-appengine@googlegroups.com
Subject: Re: [google-appengine] Appropriate way to save hundreds of thousands of ids per user

Hi David,

Can you elaborate on your exact use-case? You mentioned Twitter friends, but I'm fairly sure no users have 200,000 friends on Twitter.

-Nick Johnson

On Mon, Apr 25, 2011 at 2:54 PM, David Parks <davidpark...@yahoo.com> wrote:

I did indeed mean pulling back a result set of, say, 200,000 rows.

If I'm following the conversation correctly, then what you described was storing all IDs, querying that one field, and de-serializing all IDs into an array that you can then search for the IDs you need. I like that idea. But I certainly can't tell you whether the overhead of reading all values and deserializing them will be better or worse than the overhead of scrolling through a large result set and loading the database with hundreds of millions of rows. Of all the databases you could be using, Google's Bigtable is certainly well designed for large data sets.

It seems that your proposed method makes great sense when you need the entire result set (or close to it) for one or more users. But when you only need 100 results out of 150,000, the deserialization process is going to constitute a measurable overhead. Also, I can't say for sure how the Google datastore will perform when you commit hundreds of millions of rows to it. Of course, if small queries like that are rare, then maybe it's not so important to consider them.

Anyway, I guess you could write, in perhaps a day or less, a very simple test case that populates the datastore with both scenarios and profiles them. Doing the profiling work will probably give you some very useful insight and experience on how things will really perform.
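A test case along those lines might start from something like the following Python sketch. It is not App Engine code: plain in-memory structures stand in for the datastore, and every function name here is made up for illustration. It only shows the shape of the comparison (one record per ID versus one serialized list per user); meaningful numbers would have to come from profiling against the real datastore.

```python
import time


def build_row_store(user_ids):
    """Stand-in for strategy 1: one 'record' per (user, id) pair."""
    return {(user, i) for user, ids in user_ids.items() for i in ids}


def build_blob_store(user_ids):
    """Stand-in for strategy 2: one serialized CSV string of IDs per user."""
    return {user: ",".join(str(i) for i in ids) for user, ids in user_ids.items()}


def lookup_rows(store, user, wanted):
    """Check each wanted ID individually, as a keyed batch get would."""
    return {i for i in wanted if (user, i) in store}


def lookup_blob(store, user, wanted):
    """Deserialize the user's whole ID list, then intersect with the wanted IDs."""
    all_ids = {int(tok) for tok in store[user].split(",") if tok}
    return all_ids & set(wanted)


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical data: one user with 150,000 IDs, and a 100-ID lookup.
data = {"user_a": range(0, 300_000, 2)}
wanted = list(range(100))

rows = build_row_store(data)
blobs = build_blob_store(data)

found_rows, t_rows = timed(lookup_rows, rows, "user_a", wanted)
found_blob, t_blob = timed(lookup_blob, blobs, "user_a", wanted)

print("per-record lookup: %d hits in %.6f s" % (len(found_rows), t_rows))
print("serialized lookup: %d hits in %.6f s" % (len(found_blob), t_blob))
```

Both lookups return the same set of IDs; the difference is that the serialized version pays the cost of parsing all 150,000 IDs to answer a 100-ID question.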
I'd also suggest that you encapsulate this functionality so that you can easily replace one strategy with another without changing code unrelated to the datastore (e.g. design your code using proper data access objects to keep this code separate from the rest of your code, and code to interfaces up front).

From: google-appengine@googlegroups.com [mailto:google-appengine@googlegroups.com] On Behalf Of Nischal Shetty
Sent: Monday, April 25, 2011 10:34 AM
To: google-appengine@googlegroups.com
Subject: Re: [google-appengine] Appropriate way to save hundreds of thousands of ids per user

@David Querying the whole group would mean having 200,000 results for a few of my users. Pulling all that and then searching, wouldn't that be inefficient? Or are you talking about a sharded ListProperty here?

On 25 April 2011 05:41, David Parks <davidpark...@yahoo.com> wrote:

That seems like a reasonable approach. But I think you should do both tests: 1) let Google do the work and store a lot of records, 2) query the whole group, parse it into an array, and search the array.

It wouldn't be too hard to create a simple test case that populates the data for whatever number of users you need to plan for and profiles the lookup and storage speeds of both. I'd love to know your results if you do test both approaches.

From: google-appengine@googlegroups.com [mailto:google-appengine@googlegroups.com] On Behalf Of Nischal Shetty
Sent: Friday, April 22, 2011 3:10 PM
To: google-appengine@googlegroups.com
Subject: Re: [google-appengine] Appropriate way to save hundreds of thousands of ids per user

@David Thanks for the input. Every reply gives me some more insight into how I can achieve this. My use case is as below:

1. At times I would need all the IDs in memory at the same time
2. Most of the time I would need to check whether a set of IDs input by the user (say 100 IDs) is present in the datastore

I've been thinking of doing the following:

1. Persisting all the IDs by putting them into an array (I will probably have shards where each array would hold 50k IDs)
2. Implementing a Bloom filter to check whether the set of IDs exists in the datastore.

On 22 April 2011 09:34, David Parks <davidpark...@yahoo.com> wrote:

I don't know your intended use of these IDs. My thoughts here are limited to assumed use, so feel free to ignore thoughts that are off base for your use case.

If, when you query for the IDs, you are looking for *all* the IDs, then just serialize them into one field, retrieve them as one record, and de-serialize them in a way that doesn't require that they all fit into memory at the same time (a tokenized CSV list is the most straightforward example, but you can do more compact serializations).

If you need to query for some subset of these IDs, then storing them in the datastore is indeed the way to go, I suspect. You can batch many inserts/updates. You'll have a large table, which isn't likely to be a problem with this datastore, but do test it. If lookup times degrade with size, you could consider partitioning your users into different groups (simple example: one group of user IDs that end in even numbers, another that end in odd numbers). This can reduce the size of indexes and improve performance on some systems (I don't have personal experience to tell you whether this is necessary in this system, but it's a thought to consider).

Again, I just offer this as food for thought. If you describe your intended access patterns, it will probably help guide the discussion. Good luck.

From: google-appengine@googlegroups.com [mailto:google-appengine@googlegroups.com] On Behalf Of nischalshetty
Sent: Tuesday, April 19, 2011 1:15 PM
To: google-appengine@googlegroups.com
Subject: [google-appengine] Appropriate way to save hundreds of thousands of ids per user

Every user in my app would have thousands of ids corresponding to them. I would need to look up these ids often. Two things I could think of:

1. Put them into Lists (drawback is that lists have a maximum capacity of 5000 (hope I'm right here) and I have users who would need to save more than 150,000 ids)
2. Insert each id as a unique record in the datastore (too much data? as it would be user * ids of all users).

Can I batch put 5000 records at a time? Can I batch get at least 100 - 500 records at a time? Is there any other way to do this? I hope my question's clear. Your suggestions are greatly appreciated.

--
-Nischal
+91-9920240474
twitter: NischalShetty <http://twitter.com/nischalshetty>
facebook: Nischal <http://facebook.com/nischal>
<http://www.justunfollow.com>

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group. To post to this group, send email to google-appengine@googlegroups.com. To unsubscribe from this group, send email to google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.

--
Nick Johnson, Developer Programs Engineer, App Engine
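
As a footnote to the thread, the combination Nischal describes (ID arrays sharded at about 50k entries, plus a Bloom filter in front of them) could be sketched in Python roughly as below. This is an illustration under assumptions, not anyone's actual implementation: the class and function names, the filter sizing (about 10 bits per ID with 7 hash positions), and the use of SHA-256 for hashing are all choices made here, and plain Python lists stand in for whatever entities would hold the shards in the datastore.

```python
import hashlib
from bisect import bisect_left

SHARD_SIZE = 50_000  # shard capacity mentioned in the thread


def shard_ids(ids, shard_size=SHARD_SIZE):
    """Sort the IDs and split them into consecutive shards of at most shard_size."""
    ordered = sorted(ids)
    return [ordered[i:i + shard_size] for i in range(0, len(ordered), shard_size)]


class BloomFilter:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from one digest (double hashing).
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + k * h2) % self.num_bits for k in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def confirmed_present(shards, bloom, candidate):
    """Use the filter to skip definite misses, then verify in the one shard
    whose value range could hold the candidate."""
    if not bloom.might_contain(candidate):
        return False  # definitely absent, no shard needs to be read
    for shard in shards:
        if shard and shard[0] <= candidate <= shard[-1]:
            idx = bisect_left(shard, candidate)
            return idx < len(shard) and shard[idx] == candidate
    return False


# Hypothetical data: 150,000 IDs for one user, giving three shards.
ids = range(0, 300_000, 2)
shards = shard_ids(ids)
bloom = BloomFilter(num_bits=10 * len(ids), num_hashes=7)
for i in ids:
    bloom.add(i)

queried = [0, 1, 99_998, 99_999, 299_998, 500_000]
print([i for i in queried if confirmed_present(shards, bloom, i)])
```

The point of the filter is that most absent IDs are rejected without touching any shard. A positive answer from the filter still has to be confirmed against the shard, since a Bloom filter can return false positives.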