On Sun, 2006-04-23 at 16:41 +0200, Dirk Meyer wrote:
> > When you're talking about a common code path (like _scan), it's worth
> > it to maintain as many dicts as are necessary to avoid even O(n)
> > where you can. But O(n^2) is just wayyyy too much. On /usr/bin with
> > 2192 files, it must make 4804864 iterations to do a scan.
>
> Any ideas to fix this?
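(To make the 4804864 figure concrete, here's a toy standalone sketch, not Beacon code: for each of n files we walk the full cached listing looking for matching prefixes, so a scan costs n * n inner iterations, and 2192^2 = 4804864.)

```python
# Toy illustration of the O(n^2) scan cost; file names are made up.
def count_scan_iterations(n):
    listing = ["file%d.avi" % i for i in range(n)]
    total = 0
    for name in listing:                  # one mtime check per file
        prefix = name.rsplit('.', 1)[0]
        for other in listing:             # full walk of the listing
            total += 1                    # every entry is visited,
            if other.startswith(prefix):  # prefix matches or not
                pass
    return total

print(count_scan_iterations(2192))  # 4804864
```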
_beacon_listdir() will set up all the interesting dicts needed. Let's
look at the inner loop of that O(n^2), in File._beacon_mtime:

    def _beacon_mtime(self):
        mtime = 0
        search = self._beacon_data['name']
        if search.rfind('.') > 0:
            search = search[:search.rfind('.')]
        for basename, filename, overlay, stat_res in \
                self._beacon_parent._beacon_listdir(cache=True):
            if basename.startswith(search):
                mtime += stat_res[stat.ST_MTIME]
        return mtime

Hrmm, first of all, unrelated: mtime += stat_res[stat.ST_MTIME] looks a
bit like a kludge. Are you just testing later that the mtime returned
here is greater than the last known mtime? In that case, if you delete
one of these "overlay" files (or whatever you're calling them), the
mtime returned here will be less than before, even though the file
needs to be rescanned. I guess that means testing mtime != last_mtime
rather than mtime > last_mtime. Maybe you're already doing that, but I
thought I'd point it out.

Now, how do we fix this function? Well, we could do it a couple of
ways. First of all, the logic looks questionable to me. You said this
means that if I have foo.avi in some directory and then add foo.jpg,
foo.avi will get rescanned. I don't know that I like this idea; it
seems kind of hackish. What makes more sense to me is to have a list of
special extensions (like .srt, etc.) and a list of special files (like
cover.jpg) that beacon could ask kaa.metadata for, and then do specific
lookups on those files. Also, if you ask me, foo.avi.srt makes more
sense than foo.srt. But maybe people don't do this in practice, I
dunno.
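(Going back to the mtime comparison point above, here's a toy sketch with made-up mtime numbers, not Beacon code, of why != catches a deleted overlay file while > does not: the summed mtime drops when a file goes away.)

```python
# Toy illustration: the scan returns the sum of mtimes of all
# matching overlay files; numbers below are invented.
def needs_rescan_gt(new_mtime, last_mtime):
    return new_mtime > last_mtime

def needs_rescan_ne(new_mtime, last_mtime):
    return new_mtime != last_mtime

last_mtime = 1000 + 2000   # foo.srt + foo.jpg at the last scan
new_mtime = 1000           # foo.jpg has since been deleted

print(needs_rescan_gt(new_mtime, last_mtime))  # False: change missed
print(needs_rescan_ne(new_mtime, last_mtime))  # True: rescan happens
```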
But the main benefit here is that with a list of extensions we know are
special, beacon can do O(1) lookups:

    name = self._beacon_data['name']
    basename, ext = os.path.splitext(name)
    special_exts = kaa.metadata.get_special_exts_for_type(ext)
    listdir_file_map = self._beacon_parent._beacon_listdir(cache=True)[1]
    for special_ext in special_exts:
        if basename + special_ext in listdir_file_map:
            mtime += listdir_file_map[basename + special_ext][3][stat.ST_MTIME]

Maybe you want to avoid splitext() or whatever, but you get the idea.
This is just rough code. But the above logic reduces the complexity to
O(n), and when n=2500, that's a difference of 6.25 million iterations.

> Done, it should be ok. For other searches we still do a sort based on
> the url. But it doesn't have to be url, maybe a name, parent
> combination, and maybe sqlite can help us return sorted results.

Great.

Jason.

_______________________________________________
Freevo-devel mailing list
Freevo-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freevo-devel