On Sun, 2006-04-23 at 16:41 +0200, Dirk Meyer wrote:
> > When you're talking about a common code path (like _scan), it's worth it
> > to maintain as many dicts as are necessary to avoid even O(n) where you
> > can.  But O(n^2) is just wayyyy too much.  On /usr/bin with 2192 files,
> > it must make 4804864 iterations to do a scan.
> 
> Any ideas to fix this? 

_beacon_listdir() will set up all the interesting dicts needed.  Let's
look at the inner loop of that O(n^2), in File._beacon_mtime:

    def _beacon_mtime(self):
        search = self._beacon_data['name']
        if search.rfind('.') > 0:
            search = search[:search.rfind('.')]

        mtime = 0
        for basename, filename, overlay, stat_res in \
                self._beacon_parent._beacon_listdir(cache=True):
            if basename.startswith(search):
                mtime += stat_res[stat.ST_MTIME]
        return mtime

Hrmm, first of all, and unrelated: mtime += stat_res[stat.ST_MTIME]
looks a bit like a kludge.  Are you just later testing that the mtime
returned here is greater than the last known mtime?  In that case, if
you delete one of these "overlay" files (or whatever you're calling
them), the mtime returned here will be less than before, even though the
file will need to be rescanned.  I guess that means testing
mtime != last_mtime rather than mtime > last_mtime.  Maybe you're
already doing that, but I just thought I'd point it out.
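To make that concrete, here is a tiny sketch (the numbers and the
aggregate_mtime helper are illustrative only, standing in for the
summed stat_res[stat.ST_MTIME] values in _beacon_mtime):

```python
def aggregate_mtime(mtimes):
    # same idea as summing stat_res[stat.ST_MTIME] over matching files
    return sum(mtimes)

before = aggregate_mtime([1000, 2000])   # foo.avi + foo.jpg present
after = aggregate_mtime([1000])          # foo.jpg deleted

# after < before, so "after > before" misses the deletion...
assert not after > before
# ...but "after != before" still triggers the rescan.
assert after != before
```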

Now, how do we fix this function?  Well, we could do it a couple of
ways.  First of all, the logic looks questionable to me.  You said this
means if I have foo.avi in some directory and then add foo.jpg, foo.avi
will get rescanned.  I don't know that I like this idea; it seems
kinda hackish.  What makes more sense to me is to have a list of
special extensions (like .srt, etc.) and a list of special files (like
cover.jpg) that beacon could ask kaa.metadata for, and instead do
specific lookups on these files.

Also, if you ask me, foo.avi.srt makes more sense than foo.srt.  But
maybe people don't do this in practice, I dunno.

But the main benefit here is that with a list of extensions we know are
special, beacon can do O(1) lookups:

   name = self._beacon_data['name']
   basename, ext = os.path.splitext(name)
   special_exts = kaa.metadata.get_special_exts_for_type(ext)
   listdir_file_map = self._beacon_parent._beacon_listdir(cache=True)[1]
   mtime = 0
   for special_ext in special_exts:
       if basename + special_ext in listdir_file_map:
           mtime += listdir_file_map[basename + special_ext][3][stat.ST_MTIME]

Maybe you want to avoid splitext() or whatever, but you get the idea.
This is just rough code.  But the above logic reduces the per-scan
complexity from O(n^2) to O(n), and when n=2500, that's a difference of
roughly 6.25 million iterations.
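To put numbers on that, a quick back-of-the-envelope check using the
n=2500 figure above:

```python
n = 2500

# inner listdir scan per file: the whole scan does n * n iterations
quadratic = n * n   # 6,250,000

# one dict lookup per file (per special extension): ~n iterations total
linear = n          # 2,500

print(quadratic - linear)  # prints 6247500, roughly 6.25 million saved
```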

> Done, it should be ok. For other searches we still do a sort based on
> the url. But it doesn't have to be url, maybe a name, parent
> combination and maybe sqlite can help us return sorted results.

Great.

Jason.



_______________________________________________
Freevo-devel mailing list
Freevo-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freevo-devel
