[google-appengine] Re: What is the most efficient way to do a large IN query in GQL?
Hi Nick,

I can't use key_names, since the Twitter ids are not the key_names for my user models. The users don't have to OAuth to Twitter or other sites.

Robert/Niklas, I'll try getting all the users and doing a reference query.

Thanks guys
John

On Sep 7, 3:39 am, "Nick Johnson (Google)" wrote:
> This doesn't prevent you using the IDs as key_names. Attempting to fetch an
> entity that doesn't exist will simply return None for that entity.
>
> -Nick Johnson

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-appeng...@googlegroups.com.
To unsubscribe from this group, send email to google-appengine+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
Re: [google-appengine] Re: What is the most efficient way to do a large IN query in GQL?
Hi John,

On Tue, Sep 7, 2010 at 1:01 AM, johnterran wrote:
> I can't use the key_name. The ids are not from my site
> i.e.
> Lets say the ids are from twitter. I want to know how many of the
> twitter users
> are registered on my site. So the ids can exists in the datastore,
> but it doesn't have to.

This doesn't prevent you using the IDs as key_names. Attempting to fetch an entity that doesn't exist will simply return None for that entity.

-Nick Johnson

--
Nick Johnson, Developer Programs Engineer, App Engine
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number: 368047
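A minimal sketch of the keyed lookup Nick describes, assuming the user entities were stored with the Twitter id as their key_name. The names find_registered, chunks, and fake_batch_get are made up for illustration, and the batch size of 500 is an arbitrary choice. The batch_get argument stands in for a keyed batch fetch such as User.get_by_key_name(list_of_names), which is expected to return one result per requested name, with None where no entity exists. An in-memory dictionary replaces the datastore here so the sketch runs on its own.

```python
def chunks(seq, size):
    """Yield successive slices of seq, each at most `size` items long."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def find_registered(twitter_ids, batch_get, batch_size=500):
    """Return {twitter_id: entity} for the ids that have an entity.

    `batch_get` is a stand-in (assumption) for a keyed batch fetch: it
    takes a list of key names and returns one result per name, in the
    same order, with None for names that have no entity.
    """
    found = {}
    for batch in chunks(list(twitter_ids), batch_size):
        for key_name, entity in zip(batch, batch_get(batch)):
            if entity is not None:  # missing ids come back as None
                found[key_name] = entity
    return found


# In-memory stand-in for the datastore, for illustration only.
_FAKE_STORE = {"111": "alice", "333": "carol"}


def fake_batch_get(key_names):
    return [_FAKE_STORE.get(name) for name in key_names]


registered = find_registered(["111", "222", "333"], fake_batch_get, batch_size=2)
```

The number of Twitter users registered on the site is then len(registered), and ids that are not registered cost nothing beyond the None they return.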
Re: [google-appengine] Re: What is the most efficient way to do a large IN query in GQL?
Hey John,

Niklas also has another follow-up post. For the exact situation you are asking about, I think he has the right idea. You could simply use a reference property to reference another model, making it very easy to query. To build those models you will need to iterate over all users and generate the appropriate cross-reference models. For that I suggest looking into cursors and the task queue.

If you are trying to keep statistics of some sort, I would pre-compute whenever possible. In this case you could keep counts of how many users each source has. For querying, you could add a "source" field to your user model indicating the source of the user; then fetching Twitter, Facebook, or Google users would be much easier.

If your data set is large, you may want to look into the mapper API.

Robert

On Mon, Sep 6, 2010 at 20:01, johnterran wrote:
> Is the best way to get all the users and filter them manually similar
> to what Niklas wrote?
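A rough sketch of the "iterate over all users with cursors and pre-compute counts per source" idea from Robert's message. Everything named here is hypothetical: count_by_source, fake_fetch_page, the page size, and the dictionary-shaped users are illustration only. The fetch_page callable stands in for whatever paged query the application uses (on App Engine that would presumably be built on the db query cursor support); it takes a cursor and a page size and returns a page of users plus the next cursor, or None when there are no more pages.

```python
from collections import Counter


def count_by_source(fetch_page, page_size=100):
    """Walk all users one page at a time and count them per source.

    `fetch_page(cursor, page_size)` is an assumed interface: it returns
    (users, next_cursor), with next_cursor set to None on the last page.
    """
    counts = Counter()
    cursor = None
    while True:
        users, cursor = fetch_page(cursor, page_size)
        for user in users:
            counts[user["source"]] += 1
        if cursor is None:
            break
    return counts


# In-memory stand-in for the user table, for illustration only.
_USERS = [
    {"name": "a", "source": "twitter"},
    {"name": "b", "source": "facebook"},
    {"name": "c", "source": "twitter"},
    {"name": "d", "source": "google"},
    {"name": "e", "source": "twitter"},
]


def fake_fetch_page(cursor, page_size):
    """Use an integer offset as the cursor."""
    start = cursor or 0
    page = _USERS[start:start + page_size]
    next_start = start + len(page)
    next_cursor = next_start if next_start < len(_USERS) else None
    return page, next_cursor


source_counts = count_by_source(fake_fetch_page, page_size=2)
```

In a real application each page would more likely be handled in its own task queue task, passing the cursor along, so that no single request has to walk the whole data set.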
[google-appengine] Re: What is the most efficient way to do a large IN query in GQL?
On Sep 7, 12:01 am, johnterran wrote:
> Is the best way to get all the users and filter them manually similar
> to what Niklas wrote?

from google.appengine.ext import db

class Match(db.Model):  # match other with own
    user = db.UserProperty(verbose_name="myuser")
    # OtherModel is a placeholder for the referenced model
    reference = db.ReferenceProperty(OtherModel, collection_name='matched_users', verbose_name="Title")

Here is a better(?) structure: a referenced model, i.e. twitter.matched_users, could do it, or even memcache for something like a dictionary or a table. Again, I didn't program it; I'm only proposing a similar structure, which solved my logic of matching entities according to nearly arbitrary matches. So making the query like a dictionary worked for blobs also, "automatically" parametrized via get_serving_url, i.e. finding the parametrization got querying 2 () only querying 1 just observing how query parametrizes.

Regards
Niklas (proposing a Match class that can also work self-referencing)
[google-appengine] Re: What is the most efficient way to do a large IN query in GQL?
Hi Robert,

I can't use the key_name. The ids are not from my site, i.e. let's say the ids are from Twitter. I want to know how many of the Twitter users are registered on my site. So the ids can exist in the datastore, but they don't have to.

Is the best way to get all the users and filter them manually, similar to what Niklas wrote?

Thanks
John

On Sep 6, 8:22 am, Robert Kluin wrote:
> It will not be possible to use IN for something like that. IN will
> execute a series of queries, and it is capped at 30.
>
> If possible, I would suggest you make the entity key_name the user's
> id. Then you can just build a list of keys and fetch those -- but I
> really doubt you'll get anything close to 10K on a single fetch.
>
> Robert
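To make the cost of the IN cap concrete, here is a small sketch that splits a large id list into sub-lists of at most 30 and runs one IN query per sub-list. It is an illustration of the limit Robert mentions, not a recommendation. The names run_in_batches and fake_in_query are invented for this example; run_in_query stands in for something like running the thread's GQL query ("SELECT * FROM User where id IN :1") against one sub-list, and an in-memory dictionary replaces the datastore so the sketch is self-contained.

```python
def run_in_batches(ids, run_in_query, max_in=30):
    """Split `ids` into sub-lists of at most `max_in` and merge the results.

    `run_in_query` is an assumed callable that takes one sub-list and
    returns the matching entities for it.
    """
    results = []
    for start in range(0, len(ids), max_in):
        batch = ids[start:start + max_in]
        results.extend(run_in_query(batch))
    return results


# In-memory stand-in for the datastore, for illustration only.
_REGISTERED = {"7": "alice", "42": "bob", "9001": "carol"}
batch_sizes = []  # records how many ids each simulated query received


def fake_in_query(batch):
    batch_sizes.append(len(batch))
    return [_REGISTERED[i] for i in batch if i in _REGISTERED]


ids = [str(n) for n in range(10000)]
names = run_in_batches(ids, fake_in_query)
```

For a 10k list this comes to 334 separate IN queries, and per Robert's description each of those is itself executed as a series of queries, which is why the keyed fetch or a precomputed cross-reference model is suggested instead.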
[google-appengine] Re: What is the most efficient way to do a large IN query in GQL?
On Sep 6, 8:51 am, johnterran wrote:
> Hi
>
> In BigTable, what is the most efficient way to do a large IN query?
> My IN parameter list is typically 500 but can be 10k+
> i.e.
> class User(db.Model):
>     name = db.StringProperty(required = True)
>     id = db.StringProperty(required = True)
>
> given a list of ids that can consist of 10k list, i need to retrieve
> all the names
> users = db.GqlQuery("SELECT * FROM User where id IN :1",
>                     ids)
>
> what is the best way to do this?
>
> Thanks
> John

I propose parametrizing the pairs as a dictionary, using Query rather than GQL, i.e. User.All( ...+ logic, i.e. filter("url id", ['www.domain010703.com.','domain010703']), with the IN pairs listed as a dictionary. I didn't program it, but the data structure seems adequate.

Thank you
Niklas R