Hi,
Aaron Held wrote:

> I built an extranet application that pulled up to 5000 records (up to 30 
> * 50char text fields and a few dates)
> This was generated by a long SQL query that could take up to 12 seconds 
> and here was my plan:
> [...]
Wow, this looks like heavy work. I'm currently working on something
similar, a record shop, where there are about 40,000 records in the
database (MySQL, with a flat, non-normalized schema at the moment;
I'll need to fix that, too).

I went another route in my prototype and I'd like to hear some
comments: 

I have a class "Product" that represents a Product:

> start py code
class Product:
        def __init__(self, id):
                db = datapool.getConnection()
                try:
                        c = db.cursor()
                        # parameterized query instead of string
                        # interpolation, so the id can't inject SQL
                        c.execute('select * from NORMAL where id = %s', (id,))
                        res = c.fetchone()
                        description = c.description
                        self.data = {}
                        if res:
                                for i in range(len(description)):
                                        self.data[description[i][0]] = res[i]
                finally:
                        db.close()
> end py code
                
The Product has methods for pretty-printing and such. Then there is a class 
for a ProductSelection, which basically is a list of IDs, *but* it
also limits the results and does the stepping:

> start py code
class ProductSelection:
        def __init__(self, clause='', start=0, count=10):
                self.productIds = self.selectByClause(clause, start, count)

        def selectByClause(self, clause='', start=0, count=10):
                if not clause:
                        return []
                db = datapool.getConnection()
                try:
                        c = db.cursor()
                        # NB: clause must come from trusted code,
                        # never directly from user input
                        q = 'select id from NORMAL where (%s) order by artist' % clause
                        c.execute(q)
                        # where do we start?
                        startcount = start * count
                        # where's the end? Add 1, to see if there are
                        # more results after this set:
                        endcount = startcount + count + 1
                        rows = c.fetchall()[startcount:endcount]
                        # each row is a one-element tuple; keep just the id
                        return [row[0] for row in rows]
                finally:
                        db.close()
> end py code

Now in the Page code, this is easy to use:

> start py code
        stepsize = 10   # global stepsize variable
        startWhere = int(self.request().value("startProducts", 0))
        ps = ProductSelection(someClause, start=startWhere, count=stepsize)
        # the extra id (if any) only signals that more results exist
        for id in ps.productIds[:stepsize]:
                p = Product(id)
                p.show()

        if len(ps.productIds) > stepsize:
                self.writeln('''
                <a href="?startProducts=%s"><b>To the next %s results</b></a>
                ''' % (startWhere + 1, stepsize))
> end py code


Now my hope is (I didn't profile it yet) that this is fast enough,
because the IDs are int fields, and in the worst case ("select id
from NORMAL") I have a c.fetchall() that returns 40,000 ints. But this
is immediately sliced down to the range [startcount:endcount], so the
full list is only held for a very short time, if at all, isn't it?
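If the full fetchall() ever turns out to be too heavy, the slicing can
be pushed into MySQL itself with LIMIT offset, count, so only one page
of ids ever leaves the database. A minimal sketch of that idea; it uses
sqlite3 so it runs standalone (with MySQLdb the placeholder is %s
instead of ?, the LIMIT syntax is the same), and the table, column and
function names are placeholders following the prototype:

```python
import sqlite3

# Sketch: let the database do the slicing with LIMIT offset, count.
# "clause" must come from trusted code, never from user input.
def select_ids(db, clause, start=0, count=10):
    c = db.cursor()
    startcount = start * count
    # fetch count + 1 ids so the caller can tell whether a next page exists
    q = ('select id from NORMAL where (%s) order by artist limit ?, ?'
         % clause)
    c.execute(q, (startcount, count + 1))
    return [row[0] for row in c.fetchall()]

# tiny demo database with 25 records
db = sqlite3.connect(':memory:')
db.execute('create table NORMAL (id integer primary key, artist text)')
db.executemany('insert into NORMAL (id, artist) values (?, ?)',
               [(i, 'artist%02d' % i) for i in range(1, 26)])

page0 = select_ids(db, '1 = 1', start=0, count=10)
# 11 ids come back: 10 to show, plus one extra signalling a next page
```

This way the worst case transfers count + 1 rows instead of 40,000.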

OTOH I have another select for every Product, and there might be a lot
of Products in memory at once. At least, we hope that people look at
the Products a lot... :)
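One way to avoid a separate select per Product would be to fetch a
whole page of rows in a single query with an IN (...) list. A sketch of
that, again with sqlite3 so it runs standalone, and with placeholder
names following the prototype (it builds the same column-name-to-value
mapping that Product.data holds):

```python
import sqlite3

# Sketch: fetch one page of products with a single query instead of
# one select per Product.
def fetch_products(db, ids):
    if not ids:
        return {}
    # one placeholder per id; the ids themselves are passed as parameters
    marks = ', '.join('?' * len(ids))
    c = db.cursor()
    c.execute('select * from NORMAL where id in (%s)' % marks, tuple(ids))
    names = [d[0] for d in c.description]
    # map id -> {column name: value}, like Product.data
    return dict((row[0], dict(zip(names, row))) for row in c.fetchall())

db = sqlite3.connect(':memory:')
db.execute('create table NORMAL (id integer primary key, artist text)')
db.executemany('insert into NORMAL values (?, ?)',
               [(1, 'Aphex Twin'), (2, 'Autechre'), (3, 'Boards of Canada')])

products = fetch_products(db, [1, 3])
# products[1]['artist'] == 'Aphex Twin'; id 2 was not requested
```

That turns "one select per Product shown" into one select per page.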

ciao
-- 
 Frank Barknecht                               _ ______footils.org__


_______________________________________________
Webware-discuss mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/webware-discuss
