Since you're saying that each of these files may be as large as the virtual memory of the process will allow, it follows that you cannot have them all memory-mapped at the same time. Thus, putting the processing of each file into a separate thread will not help, and you're stuck iterating through them sequentially.
If the files are very large (larger than the physical memory on the machine), it will be much more beneficial to look instead at the code that processes each file. If you access the data in a random order, you'll cause massive page faulting, whereas if you access it sequentially, it's likely that the OS will help and prefetch some of the pages, avoiding the waits for page faults. So my advice is to look carefully at the per-file processing code, and not try to optimize the way the files are opened (i.e. in what order, or with what concurrency).

--JYL

> Currently, I have some code which accesses multiple Metakit storage files
> in succession. This is something I would like to speed up a bit.
> Suppose I have a loop which is something like this (simplified for
> clarity):
>
>     for mkFile in files:
>         db = metakit.storage(mkFile, 1)
>         # now do something
>
> Maybe this is just a stupid question, but is there a better, as in
> faster, way of accessing multiple mk storage files? The answer cannot be
> having all the data in a single mk storage file, however. Potentially,
> each one of these may reach a maximum size limit for memory-mapped
> files. Just a thought, but would accessing each of the files in a
> separate thread help much? Would that create other issues that I may not
> anticipate at first glance?
>
> _____________________________________________
> Metakit mailing list - [EMAIL PROTECTED]
> http://www.equi4.com/mailman/listinfo/metakit
