That seems like a reasonable approach, but I think you should test both
options: 1) let Google do the work and store a lot of records, and 2) query
the whole group, parse it into an array, and search the array. It wouldn't be
too hard to create a simple test case that populates the data for whatever
number of users you need to plan for and profiles the lookup and storage
speeds of both.
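
A rough harness for that comparison might look like the sketch below. It is
only a stand-in: the function names are hypothetical, and a plain dict and a
CSV string take the place of the real datastore, so the timings mean nothing
until the store and lookup functions are swapped for actual datastore calls.

```python
import time


def build_ids(count):
    """Generate `count` fake numeric IDs for one user (test data only)."""
    return list(range(1000, 1000 + count))


def store_as_records(ids):
    """Approach 1 stand-in: one record per ID (a dict plays the datastore)."""
    return {i: True for i in ids}


def store_as_blob(ids):
    """Approach 2 stand-in: every ID serialized into one CSV field."""
    return ",".join(str(i) for i in ids)


def lookup_records(records, wanted):
    """Check each wanted ID against the per-ID records."""
    return [i for i in wanted if i in records]


def lookup_blob(blob, wanted):
    """Parse the whole field (into a set here), then search it."""
    parsed = set(int(tok) for tok in blob.split(","))
    return [i for i in wanted if i in parsed]


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    ids = build_ids(150000)          # largest per-user size mentioned in the thread
    wanted = ids[:50] + [1, 2, 3]    # 50 present IDs plus 3 absent ones
    records, t_store_rec = timed(store_as_records, ids)
    blob, t_store_blob = timed(store_as_blob, ids)
    _, t_look_rec = timed(lookup_records, records, wanted)
    _, t_look_blob = timed(lookup_blob, blob, wanted)
    print("records: store %.4fs, lookup %.4fs" % (t_store_rec, t_look_rec))
    print("blob:    store %.4fs, lookup %.4fs" % (t_store_blob, t_look_blob))
```

Running each measurement several times and at several user counts would give
a fairer picture than a single call.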

 

I'd love to know your results if you do test both approaches.

 

 

From: google-appengine@googlegroups.com
[mailto:google-appengine@googlegroups.com] On Behalf Of Nischal Shetty
Sent: Friday, April 22, 2011 3:10 PM
To: google-appengine@googlegroups.com
Subject: Re: [google-appengine] Appropriate way to save hundreds of
thousands of ids per user

 

@David

 

Thanks for the input. Every reply gives me more insight into how I can
achieve this. My use case is as follows:

 

1. At times I would need all the IDs in memory at the same time

2. Most of the time I would need to check whether a set of IDs input by the
user (say 100 IDs) is present in the datastore

 

I've been thinking of doing the following :

 

1. Persisting all the IDs by putting them into an array (I will probably
have shards where each array would hold 50k IDs)

2. Implementing a Bloom filter to check whether a set of IDs exists in the
datastore.
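
As a rough illustration of the two steps above, here is a minimal in-memory
sketch. The class and function names, the SHA-256 double-hashing scheme, and
the filter sizes are all assumptions made for the example; how the shards and
the filter bits would be persisted in the datastore is left out.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def shard(ids, shard_size=50000):
    """Split the ID list into arrays of at most `shard_size` entries."""
    return [ids[i:i + shard_size] for i in range(0, len(ids), shard_size)]


def check_ids(candidates, bloom, shards):
    """Return the candidates that are really stored.

    The filter rules out most absent IDs cheaply. Anything it lets through
    is confirmed against the shards, because a Bloom filter can report
    false positives.
    """
    maybe = [c for c in candidates if bloom.might_contain(c)]
    if not maybe:
        return []                    # nothing to confirm, no shard is read
    stored = set()
    for s in shards:                 # a real app would fetch shards lazily
        stored.update(s)
    return [c for c in maybe if c in stored]
```

With 150,000 IDs this gives three shards of 50k, and only the candidates that
pass the filter ever trigger a shard read.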

 

 

On 22 April 2011 09:34, David Parks <davidpark...@yahoo.com> wrote:

I don't know your intended use of these IDs, so my thoughts here are limited
to assumed use; feel free to ignore any that are off base for your use
case.

 

If, when you query for the IDs, you are looking for *all* the IDs, then just
serialize them into one field, retrieve them as one record, and
de-serialize them in a way that doesn't require them all to fit into memory
at the same time (a tokenized CSV list is the most straightforward example,
but you can do more compact serializations).
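
A minimal sketch of the tokenized CSV idea, assuming numeric IDs and a
file-like text stream over the stored field (the function names are made up
for the example). The parser holds only one chunk and one partial token at a
time instead of building the full list of IDs.

```python
import io


def serialize_ids(ids):
    """Join the IDs into a single comma-separated string."""
    return ",".join(str(i) for i in ids)


def iter_ids(stream, chunk_size=4096):
    """Yield IDs one at a time from a comma-separated text stream."""
    leftover = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        tokens = (leftover + chunk).split(",")
        leftover = tokens.pop()      # last piece may be cut mid-number
        for tok in tokens:
            if tok:
                yield int(tok)
    if leftover:
        yield int(leftover)


if __name__ == "__main__":
    field = serialize_ids(range(100000, 100010))
    for user_id in iter_ids(io.StringIO(field), chunk_size=16):
        print(user_id)
```

A more compact serialization (fixed-width binary, for instance) could be
streamed the same way by reading a fixed number of bytes per ID.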

 

If you need to query for some subset of these IDs, then storing them in the
datastore is indeed the way to go, I suspect. You can batch many
inserts/updates. You'll have a large table, which isn't likely to be a
problem with this datastore, but do test it. If lookup times degrade with
size, you could consider partitioning your users into different groups
(simple example: one group of users whose IDs end in even numbers, another
whose IDs end in odd numbers). This can reduce the size of indexes and
improve performance on some systems (I don't have personal experience to tell
you whether this is necessary in this system, but it's a thought to consider).
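
The even/odd split can be sketched as below, assuming numeric user IDs. The
function names are hypothetical, and how each partition would map onto
separate tables or groups in the datastore is not shown.

```python
def partition_for(user_id, num_partitions=2):
    """Pick a partition from the user ID; with 2 partitions this is even/odd."""
    return user_id % num_partitions


def group_by_partition(user_ids, num_partitions=2):
    """Bucket user IDs by the partition they fall into."""
    groups = {p: [] for p in range(num_partitions)}
    for uid in user_ids:
        groups[partition_for(uid, num_partitions)].append(uid)
    return groups
```

Raising `num_partitions` gives smaller groups without changing the lookup
rule, since a user's partition is always derived from the ID itself.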

 

Again, I just offer this as food for thought. If you describe your intended
access patterns it will probably help guide the discussion. Good luck.

 

 

From: google-appengine@googlegroups.com
[mailto:google-appengine@googlegroups.com] On Behalf Of nischalshetty
Sent: Tuesday, April 19, 2011 1:15 PM
To: google-appengine@googlegroups.com
Subject: [google-appengine] Appropriate way to save hundreds of thousands of
ids per user

 

Every user in my app would have thousands of ids corresponding to them. I
would need to look up these ids often.

Two things I could think of:

1. Put them into Lists (the drawback is that lists have a maximum capacity of
5000, if I'm right here, and I have users who would need to save more than
150,000 IDs)
2. Insert each ID as a unique record in the datastore (too much data? It
would be the number of users times the IDs of each user). Can I batch put
5000 records at a time? Can I batch get at least 100 - 500 records at a time?

Is there any other way to do this? I hope my question's clear. Your
suggestions are greatly appreciated.

-- 
You received this message because you are subscribed to the Google Groups
"Google App Engine" group.
To post to this group, send email to google-appengine@googlegroups.com.
To unsubscribe from this group, send email to
google-appengine+unsubscr...@googlegroups.com
<mailto:google-appengine%2bunsubscr...@googlegroups.com> .
For more options, visit this group at
http://groups.google.com/group/google-appengine?hl=en.






-- 
-Nischal

+91-9920240474

twitter: NischalShetty <http://twitter.com/nischalshetty> 

facebook: Nischal <http://facebook.com/nischal> 

 

 <http://www.justunfollow.com> 

 

 



