[web2py] Re: efficient DB queries
This is not the best way. It will run a separate query for each board id. On GAE you want to do as much as you can in a single query. So the best way is to construct a list of GAE keys and get all entities together. Unfortunately, AFAIK, there is no support for GAE batch queries in DAL. You can, however, try to refer to the GAE datastore directly. from google.appengine.ext import db as gdb articles = db().select(db.articles.ALL) keys = [gdb.Key.from_path(boards, a.board_id) for a in articles] boards = gdb.get(keys) But I'm afraid this is not an elegant solution as it probably won't work without a gdb.Model defined for the board entity, and that would go against DRY [1]. Anyway, if you want to do it, see the GAE docs [2] for details on model definition. Depending on how often are you going to perform this query a better solution might be redesigning your database. Forget about 3NF and add the board attributes to each article entity so that you can fetch everything together in a single query. [1] https://en.wikipedia.org/wiki/Don%27t_repeat_yourself [2] http://code.google.com/appengine/docs/python/datastore/entities.html
[web2py] Re: efficient DB queries
i use: ids = [...generate your list of ids...] rows = db(db.board.id.belongs(ids[0:30]).select() for i in range(30,len(ids), 30): rows db(db.board.id.belongs(ids[i:i+30]).select() so that gets me my in statement working with GAE's 30 items per in limit. On Wednesday, March 21, 2012 10:05:46 PM UTC-7, Udi Milo wrote: I went with: query = '' for id in board_ids: query += 'db(db.board.id==%d).select()|' %id query = query[:-1] boards = eval(query) I have no idea how to evaluate performance of that query though if you have any other ideas (I cannot limit 30 right now without rethinking other things), please share them. On Wednesday, March 21, 2012 11:52:33 PM UTC-4, Anthony wrote: I think that is outdated -- according to this ( https://groups.google.com/d/msg/web2py/vWqOET74qg4/92DLUFTUsN0Jhttps://groups.google.com/d/msg/web2py/vWqOET74qg4/92DLUFTUsN0J), GAE does now support belongs but is limited to 30 items per query (so you have to break it up). Perhaps there is a better way, though. Anthony On Wednesday, March 21, 2012 9:26:44 PM UTC-4, Udi Milo wrote: It doesn't. I found this: GAE does not support belongs and does not support OR. You have to do: rows = db(db.media_type.name=='paper').select()db (db.media_type.name=='cd').select() The is done at the web2py level but since records are exclusive and you are not sorting them, there is no major slowdown. Since I'm a python newbie, how would you go about building the loop that creates a very long query? you need to do some kind of python eval tricks here that I don't know. I'm guessing its going to look like for id in board_ids: query = query + new query(id) but how do you do it in python? On Wednesday, March 21, 2012 8:30:54 PM UTC-4, Anthony wrote: I didn't realize you were on GAE. I'm not quite sure how GAE handles this, so perhaps someone with more GAE experience can chime in. In an RDBMS, you should be able to do: board_ids = set([a.board for a in articles]) boards = db(db.boards.id.belongs(board_ids)).select() But not sure if that works on GAE. Anthony On Wednesday, March 21, 2012 7:30:58 PM UTC-4, Udi Milo wrote: Anthony, Thanks for answering so quickly. I did mean board and not board_id My question is very basic. I know that the article table has an Id column that is really the board reference, but when I look at the result of the query and unification of: board_ids = set(map(lambda a: a.board, articles)) I get a set of references, how do I use that set in another query to get all the boards? also, just to be sure, running this lambda expression does not hit the db to fetch the boards, right? I can't use join b/c I'm running on GAE. On Wednesday, March 21, 2012 5:22:50 PM UTC-4, Anthony wrote: I have two db tables: board (name, created_on) article(board, name, title) currently, in my html I do a naive loop {{for article in articles}} {{=article.board.name}} {{pass}} I would like to change it and do something like: articles = db.select.all... board_ids = set(map(lambda a: a.board_id, articles)) Instead of a.board_id, do you mean a.board (I don't see a board_id field listed in your article table definition)? Also, is db.article.board a reference field to the db.board table? In that case, that means it is already storing the id of the referenced record in the db.board table (that's what reference fields store), so you don't need to retrieve it separately. Anyway, if you need the db.board.name value for all the records you are selecting from the db.article table, you should probably just do a join so you can get everything in a single query -- see http://web2py.com/books/default/chapter/29/6#Inner-joinshttp://web2py.com/books/default/chapter/29/6#Inner-joins . Anthony
[web2py] Re: efficient DB queries
i use: ids = [...generate your list of ids...] rows = db(db.board.id.belongs(ids[0:30]).select() for i in range(30,len(ids), 30): rows db(db.board.id.belongs(ids[i:i+30]).select() so that gets me my in statement working with GAE's 30 items per in limit. According to the GAE docs, under the hood that still results in a separate datastore query for every id: http://code.google.com/appengine/docs/python/datastore/gqlreference.html. Anthony
[web2py] Re: efficient DB queries
I have two db tables: board (name, created_on) article(board, name, title) currently, in my html I do a naive loop {{for article in articles}} {{= article.board.name}} {{pass}} I would like to change it and do something like: articles = db.select.all... board_ids = set(map(lambda a: a.board_id, articles)) Instead of a.board_id, do you mean a.board (I don't see a board_id field listed in your article table definition)? Also, is db.article.board a reference field to the db.board table? In that case, that means it is already storing the id of the referenced record in the db.board table (that's what reference fields store), so you don't need to retrieve it separately. Anyway, if you need the db.board.name value for all the records you are selecting from the db.article table, you should probably just do a join so you can get everything in a single query -- see http://web2py.com/books/default/chapter/29/6#Inner-joins. Anthony
[web2py] Re: efficient DB queries
Anthony, Thanks for answering so quickly. I did mean board and not board_id My question is very basic. I know that the article table has an Id column that is really the board reference, but when I look at the result of the query and unification of: board_ids = set(map(lambda a: a.board, articles)) I get a set of references, how do I use that set in another query to get all the boards? also, just to be sure, running this lambda expression does not hit the db to fetch the boards, right? I can't use join b/c I'm running on GAE. On Wednesday, March 21, 2012 5:22:50 PM UTC-4, Anthony wrote: I have two db tables: board (name, created_on) article(board, name, title) currently, in my html I do a naive loop {{for article in articles}} {{= article.board.name}} {{pass}} I would like to change it and do something like: articles = db.select.all... board_ids = set(map(lambda a: a.board_id, articles)) Instead of a.board_id, do you mean a.board (I don't see a board_id field listed in your article table definition)? Also, is db.article.board a reference field to the db.board table? In that case, that means it is already storing the id of the referenced record in the db.board table (that's what reference fields store), so you don't need to retrieve it separately. Anyway, if you need the db.board.name value for all the records you are selecting from the db.article table, you should probably just do a join so you can get everything in a single query -- see http://web2py.com/books/default/chapter/29/6#Inner-joins. Anthony
[web2py] Re: efficient DB queries
I didn't realize you were on GAE. I'm not quite sure how GAE handles this, so perhaps someone with more GAE experience can chime in. In an RDBMS, you should be able to do: board_ids = set([a.board for a in articles]) boards = db(db.boards.id.belongs(board_ids)).select() But not sure if that works on GAE. Anthony On Wednesday, March 21, 2012 7:30:58 PM UTC-4, Udi Milo wrote: Anthony, Thanks for answering so quickly. I did mean board and not board_id My question is very basic. I know that the article table has an Id column that is really the board reference, but when I look at the result of the query and unification of: board_ids = set(map(lambda a: a.board, articles)) I get a set of references, how do I use that set in another query to get all the boards? also, just to be sure, running this lambda expression does not hit the db to fetch the boards, right? I can't use join b/c I'm running on GAE. On Wednesday, March 21, 2012 5:22:50 PM UTC-4, Anthony wrote: I have two db tables: board (name, created_on) article(board, name, title) currently, in my html I do a naive loop {{for article in articles}} {{= article.board.name}} {{pass}} I would like to change it and do something like: articles = db.select.all... board_ids = set(map(lambda a: a.board_id, articles)) Instead of a.board_id, do you mean a.board (I don't see a board_id field listed in your article table definition)? Also, is db.article.board a reference field to the db.board table? In that case, that means it is already storing the id of the referenced record in the db.board table (that's what reference fields store), so you don't need to retrieve it separately. Anyway, if you need the db.board.name value for all the records you are selecting from the db.article table, you should probably just do a join so you can get everything in a single query -- see http://web2py.com/books/default/chapter/29/6#Inner-joins. Anthony
[web2py] Re: efficient DB queries
It doesn't. I found this: GAE does not support belongs and does not support OR. You have to do: rows = db(db.media_type.name=='paper').select()db (db.media_type.name=='cd').select() The is done at the web2py level but since records are exclusive and you are not sorting them, there is no major slowdown. Since I'm a python newbie, how would you go about building the loop that creates a very long query? you need to do some kind of python eval tricks here that I don't know. I'm guessing its going to look like for id in board_ids: query = query + new query(id) but how do you do it in python? On Wednesday, March 21, 2012 8:30:54 PM UTC-4, Anthony wrote: I didn't realize you were on GAE. I'm not quite sure how GAE handles this, so perhaps someone with more GAE experience can chime in. In an RDBMS, you should be able to do: board_ids = set([a.board for a in articles]) boards = db(db.boards.id.belongs(board_ids)).select() But not sure if that works on GAE. Anthony On Wednesday, March 21, 2012 7:30:58 PM UTC-4, Udi Milo wrote: Anthony, Thanks for answering so quickly. I did mean board and not board_id My question is very basic. I know that the article table has an Id column that is really the board reference, but when I look at the result of the query and unification of: board_ids = set(map(lambda a: a.board, articles)) I get a set of references, how do I use that set in another query to get all the boards? also, just to be sure, running this lambda expression does not hit the db to fetch the boards, right? I can't use join b/c I'm running on GAE. On Wednesday, March 21, 2012 5:22:50 PM UTC-4, Anthony wrote: I have two db tables: board (name, created_on) article(board, name, title) currently, in my html I do a naive loop {{for article in articles}} {{= article.board.name}} {{pass}} I would like to change it and do something like: articles = db.select.all... board_ids = set(map(lambda a: a.board_id, articles)) Instead of a.board_id, do you mean a.board (I don't see a board_id field listed in your article table definition)? Also, is db.article.board a reference field to the db.board table? In that case, that means it is already storing the id of the referenced record in the db.board table (that's what reference fields store), so you don't need to retrieve it separately. Anyway, if you need the db.board.name value for all the records you are selecting from the db.article table, you should probably just do a join so you can get everything in a single query -- see http://web2py.com/books/default/chapter/29/6#Inner-joins. Anthony
[web2py] Re: efficient DB queries
I think that is outdated -- according to this (https://groups.google.com/d/msg/web2py/vWqOET74qg4/92DLUFTUsN0J), GAE does now support belongs but is limited to 30 items per query (so you have to break it up). Perhaps there is a better way, though. Anthony On Wednesday, March 21, 2012 9:26:44 PM UTC-4, Udi Milo wrote: It doesn't. I found this: GAE does not support belongs and does not support OR. You have to do: rows = db(db.media_type.name=='paper').select()db (db.media_type.name=='cd').select() The is done at the web2py level but since records are exclusive and you are not sorting them, there is no major slowdown. Since I'm a python newbie, how would you go about building the loop that creates a very long query? you need to do some kind of python eval tricks here that I don't know. I'm guessing its going to look like for id in board_ids: query = query + new query(id) but how do you do it in python? On Wednesday, March 21, 2012 8:30:54 PM UTC-4, Anthony wrote: I didn't realize you were on GAE. I'm not quite sure how GAE handles this, so perhaps someone with more GAE experience can chime in. In an RDBMS, you should be able to do: board_ids = set([a.board for a in articles]) boards = db(db.boards.id.belongs(board_ids)).select() But not sure if that works on GAE. Anthony On Wednesday, March 21, 2012 7:30:58 PM UTC-4, Udi Milo wrote: Anthony, Thanks for answering so quickly. I did mean board and not board_id My question is very basic. I know that the article table has an Id column that is really the board reference, but when I look at the result of the query and unification of: board_ids = set(map(lambda a: a.board, articles)) I get a set of references, how do I use that set in another query to get all the boards? also, just to be sure, running this lambda expression does not hit the db to fetch the boards, right? I can't use join b/c I'm running on GAE. On Wednesday, March 21, 2012 5:22:50 PM UTC-4, Anthony wrote: I have two db tables: board (name, created_on) article(board, name, title) currently, in my html I do a naive loop {{for article in articles}} {{= article.board.name}} {{pass}} I would like to change it and do something like: articles = db.select.all... board_ids = set(map(lambda a: a.board_id, articles)) Instead of a.board_id, do you mean a.board (I don't see a board_id field listed in your article table definition)? Also, is db.article.board a reference field to the db.board table? In that case, that means it is already storing the id of the referenced record in the db.board table (that's what reference fields store), so you don't need to retrieve it separately. Anyway, if you need the db.board.name value for all the records you are selecting from the db.article table, you should probably just do a join so you can get everything in a single query -- see http://web2py.com/books/default/chapter/29/6#Inner-joins. Anthony
[web2py] Re: efficient DB queries
I went with: query = '' for id in board_ids: query += 'db(db.board.id==%d).select()|' %id query = query[:-1] boards = eval(query) I have no idea how to evaluate performance of that query though if you have any other ideas (I cannot limit 30 right now without rethinking other things), please share them. On Wednesday, March 21, 2012 11:52:33 PM UTC-4, Anthony wrote: I think that is outdated -- according to this ( https://groups.google.com/d/msg/web2py/vWqOET74qg4/92DLUFTUsN0J), GAE does now support belongs but is limited to 30 items per query (so you have to break it up). Perhaps there is a better way, though. Anthony On Wednesday, March 21, 2012 9:26:44 PM UTC-4, Udi Milo wrote: It doesn't. I found this: GAE does not support belongs and does not support OR. You have to do: rows = db(db.media_type.name=='paper').select()db (db.media_type.name=='cd').select() The is done at the web2py level but since records are exclusive and you are not sorting them, there is no major slowdown. Since I'm a python newbie, how would you go about building the loop that creates a very long query? you need to do some kind of python eval tricks here that I don't know. I'm guessing its going to look like for id in board_ids: query = query + new query(id) but how do you do it in python? On Wednesday, March 21, 2012 8:30:54 PM UTC-4, Anthony wrote: I didn't realize you were on GAE. I'm not quite sure how GAE handles this, so perhaps someone with more GAE experience can chime in. In an RDBMS, you should be able to do: board_ids = set([a.board for a in articles]) boards = db(db.boards.id.belongs(board_ids)).select() But not sure if that works on GAE. Anthony On Wednesday, March 21, 2012 7:30:58 PM UTC-4, Udi Milo wrote: Anthony, Thanks for answering so quickly. I did mean board and not board_id My question is very basic. I know that the article table has an Id column that is really the board reference, but when I look at the result of the query and unification of: board_ids = set(map(lambda a: a.board, articles)) I get a set of references, how do I use that set in another query to get all the boards? also, just to be sure, running this lambda expression does not hit the db to fetch the boards, right? I can't use join b/c I'm running on GAE. On Wednesday, March 21, 2012 5:22:50 PM UTC-4, Anthony wrote: I have two db tables: board (name, created_on) article(board, name, title) currently, in my html I do a naive loop {{for article in articles}} {{=article.board.name}} {{pass}} I would like to change it and do something like: articles = db.select.all... board_ids = set(map(lambda a: a.board_id, articles)) Instead of a.board_id, do you mean a.board (I don't see a board_id field listed in your article table definition)? Also, is db.article.board a reference field to the db.board table? In that case, that means it is already storing the id of the referenced record in the db.board table (that's what reference fields store), so you don't need to retrieve it separately. Anyway, if you need the db.board.name value for all the records you are selecting from the db.article table, you should probably just do a join so you can get everything in a single query -- see http://web2py.com/books/default/chapter/29/6#Inner-joins. Anthony