[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> On 3/18/07, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> > In <[EMAIL PROTECTED]>, Daniel Nogradi wrote:
> > >
> > > >> f = open('file.txt','r')
> > > >> for line in f:
> > > >>     db[line.split(' ')[0]] = line.split(' ')[-1]
> > > >>     db.sync()
> > >
> > > What is db here? Looks like a dictionary but that doesn't have a
> > > sync method.
> >
> > Shelves (`shelve` module) have this API. And syncing forces the
> > changes to be written to disk, so all caching and buffering of the
> > operating system is prevented. So this may slow down the program
> > considerably.
>
> It is a handle for bsddb:
>
>     import bsddb
>     db = bsddb.hashopen('db_filename')
>
> Syncing will definitely slow things down. But is there any improvement
> I can make to the other part, the splitting and setting of the
> key/value pair?
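For what it's worth, one common way to soften the cost of sync() is to batch it: sync every N records rather than after every line. Here is a self-contained sketch of that idea using the shelve module (bsddb has since been removed from the standard library, so this is an analogous API, not the OP's exact code); the file names, the BATCH size, and the sample data are all made up for illustration:

```python
import os
import shelve
import tempfile

BATCH = 1000  # how many records to write between sync() calls (tunable)

tmpdir = tempfile.mkdtemp()
txt_path = os.path.join(tmpdir, 'file.txt')

# Fabricated sample input: lines of blank-separated words.
with open(txt_path, 'w') as f:
    for i in range(5000):
        f.write('key%d middle value%d\n' % (i, i))

with shelve.open(os.path.join(tmpdir, 'db_filename')) as db:
    with open(txt_path) as f:
        for i, line in enumerate(f, 1):
            words = line.split(' ')
            db[words[0]] = words[-1].rstrip('\n')
            if i % BATCH == 0:
                # Amortize the sync cost over BATCH records instead of
                # paying it on every single line.
                db.sync()
    val = db['key42']
    print(val)
```

The trade-off is durability: a crash can lose up to BATCH unsynced records, so pick a batch size that matches how much rework you can tolerate.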
Unless each line is huge, exactly how you split it to get the first and
last blank-separated word is not going to matter much.  Still, you should
at least avoid repeating the split twice; that's pretty obviously sheer
waste.  So, change that loop body to:

    words = line.split(' ')
    db[words[0]] = words[-1]

If some lines are huge, splitting them entirely may be far more work than
you need.  In that case, you can do two partial splits instead, one from
the left and one from the right:

    first_word = line.split(' ', 1)[0]
    last_word = line.rsplit(' ', 1)[-1]
    db[first_word] = last_word

You could also try to extract the first and last words with re or direct
string manipulation, but I doubt that would buy you much, if any,
performance improvement compared to the partial splits.  In the end, only
by "benchmarking" (measuring performance on sample data of direct
relevance to your application) can you find out.


Alex
--
http://mail.python.org/mailman/listinfo/python-list
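To make the benchmarking advice concrete, here is a small sketch using the stdlib timeit module to compare the full-split and partial-split approaches; the 1000-word sample line is fabricated, and real timings will of course depend on your actual data:

```python
import timeit

# Fabricated "huge" line: 1000 blank-separated words.
line = ' '.join('word%d' % i for i in range(1000))

def full_split():
    # Splits the entire line, then takes the ends.
    words = line.split(' ')
    return words[0], words[-1]

def partial_splits():
    # Splits at most once from each end (maxsplit=1).
    return line.split(' ', 1)[0], line.rsplit(' ', 1)[-1]

# Both approaches must agree before timing them.
assert full_split() == partial_splits()

for fn in (full_split, partial_splits):
    t = timeit.timeit(fn, number=10000)
    print('%s: %.3f s' % (fn.__name__, t))
```

On long lines the partial splits should win, since they do O(1) work per line instead of touching every word; on short lines the difference is usually in the noise.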