Thank you very much Justin, now I understand. It is indeed a good solution to
the problem.

However, I think most websites do "paged" database queries anyway, because too
much database thrashing also leads to too much browser thrashing :)
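
For example, a paged query in webpy might look roughly like this (just a
sketch -- the table name 'items', get_page() and the sqlite connection are
made up for illustration):

import web

db = web.database(dbn='sqlite', db='example.db')  # assumed connection

def get_page(page, page_size=50):
    # LIMIT/OFFSET means only one page's worth of rows is ever loaded
    return db.select('items', limit=page_size, offset=page * page_size).list()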

You are right, this default behaviour caused some confusion for me, especially
when I needed to use a record twice inside one page. And I think that case is
more likely than somebody querying for a million records and expecting it to
work.
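
For anybody who hits the same thing: calling .list() on the query result
materializes it, so the same records can be reused within one page. A tiny
sketch (the table and column names are made up):

rows = db.select('items').list()    # plain list now, not a one-shot iterator
count = len(rows)                   # first use of the records
names = [row.name for row in rows]  # second use works fine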

Bests,
Dragan

On 10.11.2011 06:39, Justin Davis wrote:
> The rationale is to use the results from the database cursor directly
> and incrementally so you don't need to load all the results into
> memory before acting on them.  For instance, let's say you have a
> table that has 100,000 rows in it, and you need to return all of the
> results in JSON in a single call (a contrived example, but bear with
> me).  Let's say your handler did this:
> 
> def GET(self):
>   web.header('Content-Type', 'application/json')
>   rows = db.select('foo').list()
>   return json.dumps(rows)
> 
> What would happen is this: webpy's database wrapper would query your
> database for all the rows and load them into memory, very likely
> causing your program to run out of memory. On Linux it would probably
> be killed by the kernel's OOM killer after thrashing around in your
> swap space for a while (I don't really know how other systems would
> behave). And even if a table of 100,000 rows doesn't do that, there is
> some table size at which this will run out of memory.
> 
> If instead we use the database cursor to read rows one at a time
> and feed them back to the client as soon as we read them, we only
> load one row into memory at a time. Consider this handler:
> 
> def GET(self):
>   web.header('Content-Type', 'application/json')
>   rows = db.select('foo')
>   yield '['
>   for row in rows:
>     yield json.dumps(row) + ', '
>   yield ']'
> 
> (Yes, it's not technically correct JSON due to the trailing comma in
> the output, but let's pretend I handle that...)  Now for each row
> in the table 'foo', we read in a row, dump it to JSON and flush it. We
> never load the entire database table, so it uses substantially less
> memory in the Python process.
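> 
> As for that trailing comma, one way to handle it is to emit the
> separator before every row after the first, e.g. (a sketch):
> 
> def GET(self):
>   web.header('Content-Type', 'application/json')
>   rows = db.select('foo')
>   yield '['
>   first = True
>   for row in rows:
>     if not first:
>       yield ', '
>     first = False
>     yield json.dumps(row)
>   yield ']'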
> 
> And that's pretty much the main reason the database returns an
> iterator. There are other, less obvious reasons -- for instance,
> by reading in the results all at once you put a brief but substantial
> load on the database server, which probably isn't necessary since it
> can likely serve query results faster than Python can process them.
> 
> Is it worth it? Frankly I don't think it should be the default setting
> in webpy because of the confusion that springs up as a result. It's
> confusing to new users immediately and only really makes sense for
> advanced users familiar with databases and dealing with heavy load.
> 
> Hope that helps!
> Justin
> 
> On Nov 6, 12:47 pm, Dragan Espenschied <d...@a-blast.org> wrote:
>>> db.select returns webpy's flavored iterator,
>>
>> By the way, what is the background of this feature?
>> I do not understand what the advantages of these iterators over lists are.
>> Could somebody knowledgeable explain?
>>
>> I think I am missing out on something. :)
>>
>> Thanks,
>> Dragan
> 

-- 
http://noobz.cc/
http://digitalfolklore.org/
http://contemporary-home-computing.org/1tb/
