@Nick

1) the 1000 entities (rows) limit was lifted a long time ago.


I thought lifting the limit meant I could go ahead and fetch rows
1001-2000 using a cursor. So I guess it actually means pulling more than
1000 rows at a time, stupid me :)
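
For anyone else following along, this is roughly the cursor-based paging I
meant, versus a single big fetch (a sketch in Python; the kind and filter
are made up):

    from google.appengine.ext import db

    class FollowerIds(db.Model):    # hypothetical kind, for illustration
        owner = db.StringProperty()

    query = FollowerIds.all().filter('owner =', 'nischal')

    # Old workaround: page through results 1000 at a time using a cursor.
    batch = query.fetch(1000)
    while batch:
        for entity in batch:
            pass                    # ... process each row here ...
        query.with_cursor(query.cursor())  # resume where the fetch stopped
        batch = query.fetch(1000)

    # With the 1000-result cap lifted, a single fetch can ask for more:
    rows = FollowerIds.all().filter('owner =', 'nischal').fetch(200000)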

2) tasks are not limited by the 30s limit - they can run for 10 minutes.


We provide URLs that are called when the task executes. Wouldn't those
requests still stop at 30s? So what exactly does the 10 minute limit apply
to? I haven't been able to wrap my head around it.
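
For context, here's roughly how we enqueue those tasks today (a sketch;
the handler path is invented):

    from google.appengine.api import taskqueue

    # Enqueue a task; App Engine later makes an HTTP POST to this URL.
    taskqueue.add(url='/tasks/process-ids',   # hypothetical handler path
                  params={'user': 'nischal'})

    # The handler mapped to /tasks/process-ids is an ordinary request
    # handler, but because the request comes from the task queue it gets
    # the longer 10 minute deadline instead of the 30s user-facing one.

Is that the right way to read it - the same handler code gets 10 minutes
simply because the request originates from the task queue?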


On 26 April 2011 00:58, nickmilon <nickmi...@gmail.com> wrote:

> 1) the 1000 entities (rows) limit was lifted a long time ago.
> 2) tasks are not limited by the 30s limit - they can run for 10 minutes.
>
> Happy coding ;-)
> Nick
> On Apr 25, 9:01 am, Nischal Shetty <nischalshett...@gmail.com> wrote:
> > I will indeed try a few ways to do this. But pulling all rows
> > individually would be overkill, because every query gives us 1000 rows
> > at a time, which means I would hit the 30s limit while I'm at it :(
> >
> > For searching the IDs that I have at hand, I would not need to
> > deserialize the array of IDs. I would be making use of a Bloom filter,
> > which I think would speed things up. I would only need to deserialize
> > all the IDs occasionally, for some rare computational purposes.
> >
> > So my use case would be about 80% searching for a bunch of IDs and 20%
> > deserializing all the IDs.
> >
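To be concrete about the Bloom filter idea above, here's the kind of
pure-Python sketch I have in mind (bit count and hashing are placeholders,
not tuned for real ID volumes):

    import hashlib

    class BloomFilter(object):
        # Minimal sketch; m (bits) and k (hash count) would need proper
        # sizing based on ID volume and acceptable false-positive rate.
        def __init__(self, m=8 * 1024 * 1024, k=7):
            self.m, self.k = m, k
            self.bits = [0] * (m // 8)

        def _positions(self, item):
            for i in range(self.k):
                digest = hashlib.md5('%d:%s' % (i, item)).hexdigest()
                yield int(digest, 16) % self.m

        def add(self, item):
            for p in self._positions(item):
                self.bits[p // 8] |= 1 << (p % 8)

        def __contains__(self, item):
            # May return a false positive, never a false negative.
            return all(self.bits[p // 8] & (1 << (p % 8))
                       for p in self._positions(item))

    bf = BloomFilter()
    bf.add('12345')
    print '12345' in bf    # True
    print '99999' in bf    # False (with high probability)

The plan would be to check the ~100 input IDs against the filter first and
only hit the stored arrays for the ones the filter says might be present.
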
> > On 25 April 2011 10:24, David Parks <davidpark...@yahoo.com> wrote:
> >
> > > I did indeed mean pulling back a result set of, say, 200,000 rows. If
> > > I'm following the conversation correctly, what you described was
> > > storing all IDs in one field, querying that field, and deserializing
> > > all IDs into an array that you can then search for the IDs you need.
> >
> > > I like that idea, but I certainly can't tell you whether the overhead
> > > of reading all values and deserializing them will be better or worse
> > > than the overhead of scrolling through a large result set and loading
> > > the database with hundreds of millions of rows. Of all the databases
> > > you could be using, Google's Bigtable is certainly well designed for
> > > large data sets.
> >
> > > Your proposed method seems to make great sense when you need the
> > > entire result set (or close to it) for one or more users. But when
> > > you only need 100 results out of 150,000, the deserialization process
> > > is going to be a measurable overhead. Also, I can't say for sure how
> > > the Google datastore will perform when you commit hundreds of
> > > millions of rows to it. Of course, if small queries like that are
> > > rare, then maybe it's not so important to consider them.
> >
> > > Anyway, I guess you could write, in perhaps a day or less, a very
> > > simple test case that populates the datastore with both scenarios and
> > > profiles them.
> >
> > > Doing the profiling work will probably give you some very useful
> > > insight into how things will actually perform.
> >
> > > I'd also suggest that you encapsulate this functionality so that you
> > > can easily replace one strategy with another without changing code
> > > unrelated to the datastore (e.g. design your code using proper data
> > > access objects to keep this code separate from the rest of your code,
> > > and code to interfaces up front).
> >
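For what it's worth, the encapsulation David suggests could be as simple
as the following, so either storage strategy can sit behind one interface
(a sketch; all names are made up):

    class IdStore(object):
        """Interface for ID storage, so strategies can be swapped."""
        def save_ids(self, user, ids):
            raise NotImplementedError
        def contains_all(self, user, ids):
            raise NotImplementedError

    class BlobIdStore(IdStore):
        """Strategy 1: IDs serialized into one record per user."""
        # ... implementation elided ...

    class RowIdStore(IdStore):
        """Strategy 2: one datastore row per ID."""
        # ... implementation elided ...

    def make_store():
        return BlobIdStore()   # change one line to switch strategies
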
> > > *From:* google-appengine@googlegroups.com [mailto:google-appengine@googlegroups.com] *On Behalf Of* Nischal Shetty
> > > *Sent:* Monday, April 25, 2011 10:34 AM
> > > *To:* google-appengine@googlegroups.com
> > > *Subject:* Re: [google-appengine] Appropriate way to save hundreds of thousands of ids per user
> >
> > > @David
> >
> > > Querying the whole group would mean having 200,000 results for a few
> > > of my users. Pulling all of that and then searching, wouldn't that be
> > > inefficient? Or are you talking about a sharded ListProperty here?
> >
> > > On 25 April 2011 05:41, David Parks <davidpark...@yahoo.com> wrote:
> >
> > > That seems like a reasonable approach. But I think you should run
> > > both tests: 1) let Google do the work and store a lot of records;
> > > 2) query the whole group, parse it into an array, and search the
> > > array. It wouldn't be too hard to create a simple test case that
> > > populates the data for whatever number of users you need to plan for
> > > and profiles the lookup and storage speeds of both.
> >
> > > I’d love to know your results if you do test both approaches.
> >
> > > *From:* google-appengine@googlegroups.com [mailto:google-appengine@googlegroups.com] *On Behalf Of* Nischal Shetty
> > > *Sent:* Friday, April 22, 2011 3:10 PM
> > > *To:* google-appengine@googlegroups.com
> > > *Subject:* Re: [google-appengine] Appropriate way to save hundreds of thousands of ids per user
> >
> > > @David
> >
> > > Thanks for the input. Every reply gives me some more insight into how
> > > I can achieve this. My use case is as follows:
> >
> > > 1. At times I would need all the IDs at the same time in memory
> >
> > > 2. Most of the time I would need to check whether a set of IDs input
> > > by the user (say 100 IDs) is present in the datastore
> >
> > > I've been thinking of doing the following:
> >
> > > 1. Persisting all the IDs by putting them into an array (I will
> > > probably have shards, where each array would hold 50k IDs; rough
> > > sketch below)
> >
> > > 2. Implementing a Bloom filter to check whether a given set of IDs
> > > exists in the datastore.
> >
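To make point 1 above concrete, this is roughly what I'm planning (a
sketch; the kind and names are made up, and I'm assuming the list can be
stored unindexed so the per-entity index limit doesn't apply):

    from google.appengine.ext import db

    SHARD_SIZE = 50000    # IDs per shard, per point 1 above

    class IdShard(db.Model):                 # hypothetical kind
        owner = db.StringProperty()
        shard_index = db.IntegerProperty()
        ids = db.ListProperty(long, indexed=False)  # unindexed: entity
                                                    # size is the only cap

    def save_ids(owner, all_ids):
        # Split one user's IDs across shard entities and batch-put them.
        shards = [IdShard(owner=owner,
                          shard_index=i // SHARD_SIZE,
                          ids=all_ids[i:i + SHARD_SIZE])
                  for i in range(0, len(all_ids), SHARD_SIZE)]
        db.put(shards)    # e.g. 150k IDs -> 3 shard entities
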
> > > On 22 April 2011 09:34, David Parks <davidpark...@yahoo.com> wrote:
> >
> > > I don't know your intended use of these IDs; my thoughts here are
> > > based on assumed use, so feel free to ignore anything that is off
> > > base for your use case.
> >
> > > If, when you query for the IDs, you are looking for *all* the IDs,
> > > then just serialize them into one field, retrieve them as one record,
> > > and deserialize them in a way that doesn't require them all to fit in
> > > memory at the same time (a tokenized CSV list is the most
> > > straightforward example, but you can do more compact serializations).
> >
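If I follow, a minimal sketch of that single-record approach would be
something like this (kind and names invented; note the 1MB entity size
limit caps how many IDs fit in one record):

    from google.appengine.ext import db

    class IdBlob(db.Model):           # hypothetical: one record per user
        ids_csv = db.TextProperty()   # e.g. "101,102,103"

    def save_ids(user, ids):
        csv = ','.join(str(i) for i in ids)
        IdBlob(key_name=user, ids_csv=db.Text(csv)).put()

    def iter_ids(user):
        # Tokenize the CSV lazily instead of materializing one big list.
        text = IdBlob.get_by_key_name(user).ids_csv
        start = 0
        while start < len(text):
            end = text.find(',', start)
            if end == -1:
                end = len(text)
            yield long(text[start:end])
            start = end + 1
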
> > > If you need to query for some subset of these IDs, then storing them
> > > individually in the datastore is, I suspect, the way to go. You can
> > > batch many inserts/updates. You'll have a large table, but that isn't
> > > likely to be a problem with this datastore, though do test it. If
> > > lookup times degrade with size, you could consider partitioning your
> > > users into different groups (simple example: one group for user IDs
> > > that end in even numbers, another for IDs that end in odd numbers);
> > > this can reduce the size of indexes and improve performance on some
> > > systems (I don't have personal experience to tell you whether that's
> > > necessary in this system, but it's a thought to consider).
> >
> > > Again, I just offer this as food for thought. If you describe your
> > > intended access patterns, it will probably help guide the discussion.
> > > Good luck.
> >
> > > *From:* google-appengine@googlegroups.com [mailto:google-appengine@googlegroups.com] *On Behalf Of* nischalshetty
> > > *Sent:* Tuesday, April 19, 2011 1:15 PM
> > > *To:* google-appengine@googlegroups.com
> > > *Subject:* [google-appengine] Appropriate way to save hundreds of thousands of ids per user
> >
> > > Every user in my app would have thousands of IDs corresponding to
> > > them. I would need to look up these IDs often.
> >
> > > Two things I could think of:
> >
> > > 1. Put them into Lists (drawback: lists have a maximum capacity of
> > > 5000 (hope I'm right here), and I have users who would need to save
> > > more than 150,000 IDs)
> > > 2. Insert each ID as a unique record in the datastore (too much data?
> > > as it would be users * IDs of all users). Can I batch put 5000
> > > records at a time? Can I batch get at least 100-500 records at a
> > > time? (see the batch sketch below)
> >
> > > Is there any other way to do this? I hope my question's clear. Your
> > > suggestions are greatly appreciated.
> >
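On the batch question in point 2, something like this is possible as far
as I know (names made up; batches of up to 500 entities per call seem to
be the safe chunk size):

    from google.appengine.ext import db

    class UserId(db.Model):        # hypothetical: one entity per (user, id)
        owner = db.StringProperty()

    # Batch put: db.put() accepts a list of entities.
    entities = [UserId(key_name='u1:%d' % i, owner='u1')
                for i in range(500)]
    db.put(entities)

    # Batch get: fetch several entities by key in one round trip;
    # the result list has None in place of any missing key.
    keys = [db.Key.from_path('UserId', 'u1:%d' % i) for i in range(100)]
    found = db.get(keys)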
> > > --
> > > -Nischal
> > > +91-9920240474
> > > twitter: NischalShetty <http://twitter.com/nischalshetty>
> > > facebook: Nischal <http://facebook.com/nischal>
> > > <http://www.justunfollow.com>


-- 
-Nischal
twitter: NischalShetty <http://twitter.com/nischalshetty>
facebook: Nischal <http://facebook.com/nischal>
