[web2py] Re: efficient DB queries

2012-03-22 Thread Wikus van de Merwe
This is not the best way. It will run a separate query for each board id. 
On GAE you want to do
as much as you can in a single query. So the best way is to construct a 
list of GAE keys and
get all entities together. Unfortunately, AFAIK, there is no support for 
GAE batch queries in DAL.
You can, however, try to refer to the GAE datastore directly.

from google.appengine.ext import db as gdb

articles = db().select(db.articles.ALL)
keys = [gdb.Key.from_path(boards, a.board_id) for a in articles]
boards = gdb.get(keys)

But I'm afraid this is not an elegant solution as it probably won't work 
without a gdb.Model defined
for the board entity, and that would go against DRY [1]. Anyway, if you 
want to do it, see the GAE
docs [2] for details on model definition.

Depending on how often are you going to perform this query a better 
solution might be redesigning
your database. Forget about 3NF and add the board attributes to each 
article entity so that you
can fetch everything together in a single query.

[1] https://en.wikipedia.org/wiki/Don%27t_repeat_yourself
[2] http://code.google.com/appengine/docs/python/datastore/entities.html



[web2py] Re: efficient DB queries

2012-03-22 Thread howesc
i use:

ids = [...generate your list of ids...]

rows = db(db.board.id.belongs(ids[0:30]).select()
for i in range(30,len(ids), 30):
  rows  db(db.board.id.belongs(ids[i:i+30]).select()

so that gets me my in statement working with GAE's 30 items per in limit.

On Wednesday, March 21, 2012 10:05:46 PM UTC-7, Udi Milo wrote:

 I went with:

 query = ''
 for id in board_ids:
 query += 'db(db.board.id==%d).select()|​' %id
 query = query[:-1]
 boards = eval(query)

 I have no idea how to evaluate performance of that query though
 if you have any other ideas (I cannot limit 30 right now without 
 rethinking other things), please share them.

 On Wednesday, March 21, 2012 11:52:33 PM UTC-4, Anthony wrote:

 I think that is outdated -- according to this (
 https://groups.google.com/d/​msg/web2py/vWqOET74qg4/​92DLUFTUsN0Jhttps://groups.google.com/d/msg/web2py/vWqOET74qg4/92DLUFTUsN0J),
  
 GAE does now support belongs but is limited to 30 items per query (so you 
 have to break it up). Perhaps there is a better way, though.

 Anthony

 On Wednesday, March 21, 2012 9:26:44 PM UTC-4, Udi Milo wrote:

 It doesn't.
 I found this: 
 
 GAE does not support belongs and does not support OR. You have to do: 

 rows = db(db.media_type.name=='paper'​).select()db 
 (db.media_type.name=='cd').​select() 

 The  is done at the web2py level but since records are exclusive and 
 you are not sorting them, there is no major slowdown. 
 

 Since I'm a python newbie, how would you go about building the loop that 
 creates a very long query? you need to do some kind of python eval tricks 
 here that I don't know.
 I'm guessing its going to look like

 for id in board_ids:
query = query + new query(id)

 but how do you do it in python?

 On Wednesday, March 21, 2012 8:30:54 PM UTC-4, Anthony wrote:

 I didn't realize you were on GAE. I'm not quite sure how GAE handles 
 this, so perhaps someone with more GAE experience can chime in. In an 
 RDBMS, you should be able to do:

 board_ids = set([a.board for a in articles])
 boards = db(db.boards.id.belongs(board_​ids)).select()

 But not sure if that works on GAE.

 Anthony

 On Wednesday, March 21, 2012 7:30:58 PM UTC-4, Udi Milo wrote:

 Anthony,

 Thanks for answering so quickly.
 I did mean board and not board_id

 My question is very basic.
 I know that the article table has an Id column that is really the 
 board reference, but when I look at the result of the query and 
 unification 
 of:
 board_ids = set(map(lambda a: a.board, articles))

 I get a set of references,
 how do I use that set in another query to get all the boards?
 also, just to be sure, running this lambda expression does not hit the 
 db to fetch the boards, right?

 I can't use join b/c I'm running on GAE.



 On Wednesday, March 21, 2012 5:22:50 PM UTC-4, Anthony wrote:

 I have two db tables:

 board (name, created_on)
 article(board, name, title)

 currently, in my html I do a naive loop {{for article in articles}} 
 {{=article.board.name}} {{pass}}

 I would like to change it and do something like:
 articles = db.select.all...
 board_ids = set(map(lambda a: a.board_id, articles))


 Instead of a.board_id, do you mean a.board (I don't see a board_id 
 field listed in your article table definition)? Also, is 
 db.article.board a 
 reference field to the db.board table? In that case, that means it is 
 already storing the id of the referenced record in the db.board table 
 (that's what reference fields store), so you don't need to retrieve it 
 separately.

 Anyway, if you need the db.board.name value for all the records you 
 are selecting from the db.article table, you should probably just do a 
 join 
 so you can get everything in a single query -- see 
 http://web2py.com/books/​default/chapter/29/6#Inner-​joinshttp://web2py.com/books/default/chapter/29/6#Inner-joins
 .
  
 Anthony



[web2py] Re: efficient DB queries

2012-03-22 Thread Anthony


 i use:

 ids = [...generate your list of ids...]

 rows = db(db.board.id.belongs(ids[0:​30]).select()
 for i in range(30,len(ids), 30):
   rows  db(db.board.id.belongs(ids[i:​i+30]).select()

 so that gets me my in statement working with GAE's 30 items per in limit.


According to the GAE docs, under the hood that still results in a separate 
datastore query for every 
id: http://code.google.com/appengine/docs/python/datastore/gqlreference.html.

Anthony


[web2py] Re: efficient DB queries

2012-03-21 Thread Anthony


 I have two db tables:

 board (name, created_on)
 article(board, name, title)

 currently, in my html I do a naive loop {{for article in articles}} {{=
 article.board.name}} {{pass}}

 I would like to change it and do something like:
 articles = db.select.all...
 board_ids = set(map(lambda a: a.board_id, articles))


Instead of a.board_id, do you mean a.board (I don't see a board_id field 
listed in your article table definition)? Also, is db.article.board a 
reference field to the db.board table? In that case, that means it is 
already storing the id of the referenced record in the db.board table 
(that's what reference fields store), so you don't need to retrieve it 
separately.

Anyway, if you need the db.board.name value for all the records you are 
selecting from the db.article table, you should probably just do a join so 
you can get everything in a single query -- 
see http://web2py.com/books/default/chapter/29/6#Inner-joins.
 
Anthony



[web2py] Re: efficient DB queries

2012-03-21 Thread Udi Milo
Anthony,

Thanks for answering so quickly.
I did mean board and not board_id

My question is very basic.
I know that the article table has an Id column that is really the board 
reference, but when I look at the result of the query and unification of:
board_ids = set(map(lambda a: a.board, articles))

I get a set of references,
how do I use that set in another query to get all the boards?
also, just to be sure, running this lambda expression does not hit the db 
to fetch the boards, right?

I can't use join b/c I'm running on GAE.



On Wednesday, March 21, 2012 5:22:50 PM UTC-4, Anthony wrote:

 I have two db tables:

 board (name, created_on)
 article(board, name, title)

 currently, in my html I do a naive loop {{for article in articles}} {{=
 article.board.name}} {{pass}}

 I would like to change it and do something like:
 articles = db.select.all...
 board_ids = set(map(lambda a: a.board_id, articles))


 Instead of a.board_id, do you mean a.board (I don't see a board_id field 
 listed in your article table definition)? Also, is db.article.board a 
 reference field to the db.board table? In that case, that means it is 
 already storing the id of the referenced record in the db.board table 
 (that's what reference fields store), so you don't need to retrieve it 
 separately.

 Anyway, if you need the db.board.name value for all the records you are 
 selecting from the db.article table, you should probably just do a join so 
 you can get everything in a single query -- see 
 http://web2py.com/books/default/chapter/29/6#Inner-joins.
  
 Anthony



[web2py] Re: efficient DB queries

2012-03-21 Thread Anthony
I didn't realize you were on GAE. I'm not quite sure how GAE handles this, 
so perhaps someone with more GAE experience can chime in. In an RDBMS, you 
should be able to do:

board_ids = set([a.board for a in articles])
boards = db(db.boards.id.belongs(board_ids)).select()

But not sure if that works on GAE.

Anthony

On Wednesday, March 21, 2012 7:30:58 PM UTC-4, Udi Milo wrote:

 Anthony,

 Thanks for answering so quickly.
 I did mean board and not board_id

 My question is very basic.
 I know that the article table has an Id column that is really the board 
 reference, but when I look at the result of the query and unification of:
 board_ids = set(map(lambda a: a.board, articles))

 I get a set of references,
 how do I use that set in another query to get all the boards?
 also, just to be sure, running this lambda expression does not hit the db 
 to fetch the boards, right?

 I can't use join b/c I'm running on GAE.



 On Wednesday, March 21, 2012 5:22:50 PM UTC-4, Anthony wrote:

 I have two db tables:

 board (name, created_on)
 article(board, name, title)

 currently, in my html I do a naive loop {{for article in articles}} {{=
 article.board.name}} {{pass}}

 I would like to change it and do something like:
 articles = db.select.all...
 board_ids = set(map(lambda a: a.board_id, articles))


 Instead of a.board_id, do you mean a.board (I don't see a board_id 
 field listed in your article table definition)? Also, is db.article.board a 
 reference field to the db.board table? In that case, that means it is 
 already storing the id of the referenced record in the db.board table 
 (that's what reference fields store), so you don't need to retrieve it 
 separately.

 Anyway, if you need the db.board.name value for all the records you are 
 selecting from the db.article table, you should probably just do a join so 
 you can get everything in a single query -- see 
 http://web2py.com/books/default/chapter/29/6#Inner-joins.
  
 Anthony



[web2py] Re: efficient DB queries

2012-03-21 Thread Udi Milo
It doesn't.
I found this: 

GAE does not support belongs and does not support OR. You have to do: 

rows = db(db.media_type.name=='paper').select()db 
(db.media_type.name=='cd').select() 

The  is done at the web2py level but since records are exclusive and 
you are not sorting them, there is no major slowdown. 


Since I'm a python newbie, how would you go about building the loop that 
creates a very long query? you need to do some kind of python eval tricks 
here that I don't know.
I'm guessing its going to look like

for id in board_ids:
   query = query + new query(id)

but how do you do it in python?

On Wednesday, March 21, 2012 8:30:54 PM UTC-4, Anthony wrote:

 I didn't realize you were on GAE. I'm not quite sure how GAE handles this, 
 so perhaps someone with more GAE experience can chime in. In an RDBMS, you 
 should be able to do:

 board_ids = set([a.board for a in articles])
 boards = db(db.boards.id.belongs(board_ids)).select()

 But not sure if that works on GAE.

 Anthony

 On Wednesday, March 21, 2012 7:30:58 PM UTC-4, Udi Milo wrote:

 Anthony,

 Thanks for answering so quickly.
 I did mean board and not board_id

 My question is very basic.
 I know that the article table has an Id column that is really the board 
 reference, but when I look at the result of the query and unification of:
 board_ids = set(map(lambda a: a.board, articles))

 I get a set of references,
 how do I use that set in another query to get all the boards?
 also, just to be sure, running this lambda expression does not hit the db 
 to fetch the boards, right?

 I can't use join b/c I'm running on GAE.



 On Wednesday, March 21, 2012 5:22:50 PM UTC-4, Anthony wrote:

 I have two db tables:

 board (name, created_on)
 article(board, name, title)

 currently, in my html I do a naive loop {{for article in articles}} {{=
 article.board.name}} {{pass}}

 I would like to change it and do something like:
 articles = db.select.all...
 board_ids = set(map(lambda a: a.board_id, articles))


 Instead of a.board_id, do you mean a.board (I don't see a board_id 
 field listed in your article table definition)? Also, is db.article.board a 
 reference field to the db.board table? In that case, that means it is 
 already storing the id of the referenced record in the db.board table 
 (that's what reference fields store), so you don't need to retrieve it 
 separately.

 Anyway, if you need the db.board.name value for all the records you are 
 selecting from the db.article table, you should probably just do a join so 
 you can get everything in a single query -- see 
 http://web2py.com/books/default/chapter/29/6#Inner-joins.
  
 Anthony



[web2py] Re: efficient DB queries

2012-03-21 Thread Anthony
I think that is outdated -- according to this 
(https://groups.google.com/d/msg/web2py/vWqOET74qg4/92DLUFTUsN0J), GAE does 
now support belongs but is limited to 30 items per query (so you have to 
break it up). Perhaps there is a better way, though.

Anthony

On Wednesday, March 21, 2012 9:26:44 PM UTC-4, Udi Milo wrote:

 It doesn't.
 I found this: 
 
 GAE does not support belongs and does not support OR. You have to do: 

 rows = db(db.media_type.name=='paper').select()db 
 (db.media_type.name=='cd').select() 

 The  is done at the web2py level but since records are exclusive and 
 you are not sorting them, there is no major slowdown. 
 

 Since I'm a python newbie, how would you go about building the loop that 
 creates a very long query? you need to do some kind of python eval tricks 
 here that I don't know.
 I'm guessing its going to look like

 for id in board_ids:
query = query + new query(id)

 but how do you do it in python?

 On Wednesday, March 21, 2012 8:30:54 PM UTC-4, Anthony wrote:

 I didn't realize you were on GAE. I'm not quite sure how GAE handles 
 this, so perhaps someone with more GAE experience can chime in. In an 
 RDBMS, you should be able to do:

 board_ids = set([a.board for a in articles])
 boards = db(db.boards.id.belongs(board_ids)).select()

 But not sure if that works on GAE.

 Anthony

 On Wednesday, March 21, 2012 7:30:58 PM UTC-4, Udi Milo wrote:

 Anthony,

 Thanks for answering so quickly.
 I did mean board and not board_id

 My question is very basic.
 I know that the article table has an Id column that is really the board 
 reference, but when I look at the result of the query and unification of:
 board_ids = set(map(lambda a: a.board, articles))

 I get a set of references,
 how do I use that set in another query to get all the boards?
 also, just to be sure, running this lambda expression does not hit the 
 db to fetch the boards, right?

 I can't use join b/c I'm running on GAE.



 On Wednesday, March 21, 2012 5:22:50 PM UTC-4, Anthony wrote:

 I have two db tables:

 board (name, created_on)
 article(board, name, title)

 currently, in my html I do a naive loop {{for article in articles}} {{=
 article.board.name}} {{pass}}

 I would like to change it and do something like:
 articles = db.select.all...
 board_ids = set(map(lambda a: a.board_id, articles))


 Instead of a.board_id, do you mean a.board (I don't see a board_id 
 field listed in your article table definition)? Also, is db.article.board 
 a 
 reference field to the db.board table? In that case, that means it is 
 already storing the id of the referenced record in the db.board table 
 (that's what reference fields store), so you don't need to retrieve it 
 separately.

 Anyway, if you need the db.board.name value for all the records you 
 are selecting from the db.article table, you should probably just do a 
 join 
 so you can get everything in a single query -- see 
 http://web2py.com/books/default/chapter/29/6#Inner-joins.
  
 Anthony



[web2py] Re: efficient DB queries

2012-03-21 Thread Udi Milo
I went with:

query = ''
for id in board_ids:
query += 'db(db.board.id==%d).select()|' %id
query = query[:-1]
boards = eval(query)

I have no idea how to evaluate performance of that query though
if you have any other ideas (I cannot limit 30 right now without rethinking 
other things), please share them.

On Wednesday, March 21, 2012 11:52:33 PM UTC-4, Anthony wrote:

 I think that is outdated -- according to this (
 https://groups.google.com/d/msg/web2py/vWqOET74qg4/92DLUFTUsN0J), GAE 
 does now support belongs but is limited to 30 items per query (so you have 
 to break it up). Perhaps there is a better way, though.

 Anthony

 On Wednesday, March 21, 2012 9:26:44 PM UTC-4, Udi Milo wrote:

 It doesn't.
 I found this: 
 
 GAE does not support belongs and does not support OR. You have to do: 

 rows = db(db.media_type.name=='paper').select()db 
 (db.media_type.name=='cd').select() 

 The  is done at the web2py level but since records are exclusive and 
 you are not sorting them, there is no major slowdown. 
 

 Since I'm a python newbie, how would you go about building the loop that 
 creates a very long query? you need to do some kind of python eval tricks 
 here that I don't know.
 I'm guessing its going to look like

 for id in board_ids:
query = query + new query(id)

 but how do you do it in python?

 On Wednesday, March 21, 2012 8:30:54 PM UTC-4, Anthony wrote:

 I didn't realize you were on GAE. I'm not quite sure how GAE handles 
 this, so perhaps someone with more GAE experience can chime in. In an 
 RDBMS, you should be able to do:

 board_ids = set([a.board for a in articles])
 boards = db(db.boards.id.belongs(board_ids)).select()

 But not sure if that works on GAE.

 Anthony

 On Wednesday, March 21, 2012 7:30:58 PM UTC-4, Udi Milo wrote:

 Anthony,

 Thanks for answering so quickly.
 I did mean board and not board_id

 My question is very basic.
 I know that the article table has an Id column that is really the board 
 reference, but when I look at the result of the query and unification of:
 board_ids = set(map(lambda a: a.board, articles))

 I get a set of references,
 how do I use that set in another query to get all the boards?
 also, just to be sure, running this lambda expression does not hit the 
 db to fetch the boards, right?

 I can't use join b/c I'm running on GAE.



 On Wednesday, March 21, 2012 5:22:50 PM UTC-4, Anthony wrote:

 I have two db tables:

 board (name, created_on)
 article(board, name, title)

 currently, in my html I do a naive loop {{for article in articles}} 
 {{=article.board.name}} {{pass}}

 I would like to change it and do something like:
 articles = db.select.all...
 board_ids = set(map(lambda a: a.board_id, articles))


 Instead of a.board_id, do you mean a.board (I don't see a board_id 
 field listed in your article table definition)? Also, is db.article.board 
 a 
 reference field to the db.board table? In that case, that means it is 
 already storing the id of the referenced record in the db.board table 
 (that's what reference fields store), so you don't need to retrieve it 
 separately.

 Anyway, if you need the db.board.name value for all the records you 
 are selecting from the db.article table, you should probably just do a 
 join 
 so you can get everything in a single query -- see 
 http://web2py.com/books/default/chapter/29/6#Inner-joins.
  
 Anthony