efficient text file search - solution
OK, I am not sure why, but

    fList = file('somefile').read()
    if fList.find('string') != -1:
        print 'FOUND'

works much, much faster. It is strange, since I thought 'for line in file('somefile')' is optimized and reads pages into memory. I guess not.

George Sakkis wrote:
> noro wrote:
>
> > Is there a more efficient method to find a string in a text file than:
> >
> > f=file('somefile')
> > for line in f:
> >     if 'string' in line:
> >         print 'FOUND'
> >
> > ?
>
> Is this something you want to do only once for a given file? The
> replies so far seem to imply so, and in this case I doubt that you can
> do anything more efficient. OTOH, if the same file is to be searched
> repeatedly for different strings, an appropriate indexing scheme can
> speed things up considerably on average.
>
> George
--
http://mail.python.org/mailman/listinfo/python-list
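The two approaches compared in this message can be put side by side. What follows is a minimal sketch in Python 3 syntax (the thread itself uses Python 2's `file()` and `print` statement). The helper names `found_by_lines` and `found_by_read` and the temporary demo file are assumptions made for illustration, and the sketch makes no claim about which variant is faster on a given system.

```python
import os
import tempfile

def found_by_lines(path, needle):
    """Scan the file line by line and stop at the first matching line."""
    with open(path, "r") as f:
        for line in f:
            if needle in line:
                return True
    return False

def found_by_read(path, needle):
    """Read the whole file into one string and search it in a single call."""
    with open(path, "r") as f:
        return f.read().find(needle) != -1

# Small demo file (hypothetical content, only for illustration).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta string gamma\ndelta\n")
    demo_path = tmp.name

lines_hit = found_by_lines(demo_path, "string")
read_hit = found_by_read(demo_path, "string")
lines_miss = found_by_lines(demo_path, "missing")
read_miss = found_by_read(demo_path, "missing")
os.remove(demo_path)
```

The whole-file variant holds the entire file in memory at once, so it trades memory for a single search call. It can also match a needle that contains a newline, which the line-by-line loop cannot.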
Re: efficient text file search.
I'm not sure. Each line in the text file is an index string. I can sort the file and use some binary tree search on it (I need to do a number of searches). There are 1219137 indexes in the file, so maybe a memory-efficient sort algorithm is in order.

How can mmap help me? Is there any binary search algorithm for text files out there, or do I need to write one?

Steve Holden wrote:
> noro wrote:
> > Bill Scherer wrote:
> >
> >>noro wrote:
> >>
> >>>Is there a more efficient method to find a string in a text file than:
> >>>
> >>>f=file('somefile')
> >>>for line in f:
> >>>    if 'string' in line:
> >>>        print 'FOUND'
> >>>
> >>>?
> >>>
> >>>BTW:
> >>>does "for line in f: " read a block of lines into memory, or does it
> >>>simply call f.readline() many times?
> >>>
> >>>thanks
> >>>amit
> >>
> >>If your file is sorted by some key in the data, you can build a very
> >>fast binary search with mmap in Python.
> >
> > can you add some more info, or point me to a link, i haven't found
> > anything about binary search in mmap() in python documents.
> >
> > the files are very big...
>
> [please don't "top-post": add your latest comments at the end so the
> story reads from the beginning].
>
> I think this is probably not going to help you. A binary search is only
> useful if you want to locate a value in an ordered list. Since your
> original posting made it seem like the text you are looking for could
> appear in any position in any line of the file a binary search doesn't
> do you any good at all (in fact it complicates things and slows them
> down unnecessarily) because you'd still need to look at all lines.
>
> Plus, if the lines are of variable length then you'd need to start by
> creating an index of them, meaning you'd have to go right through the
> file anyway.
>
> regards
>  Steve
> --
> Steve Holden       +44 150 684 7255  +1 800 494 3119
> Holden Web LLC/Ltd          http://www.holdenweb.com
> Skype: holdenweb       http://holdenweb.blogspot.com
> Recent Ramblings     http://del.icio.us/steve.holden
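For the case described in this message (a sorted file with one index string per line), a binary search over an `mmap` can be sketched as follows. This is a hedged illustration in Python 3, not code from the thread. It assumes the file is non-empty, is sorted in byte order, uses `\n` line endings, and that the search is for a whole line. It does not help with the substring-anywhere case Steve describes. The function name `sorted_file_contains` and the demo keys are made up for the example.

```python
import mmap
import os
import tempfile

def sorted_file_contains(path, key):
    """Binary search for a whole line equal to `key` (bytes).

    Assumes a non-empty file whose lines are sorted in byte order.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            lo, hi = 0, len(mm)          # byte offsets, both at line starts
            while lo < hi:
                mid = (lo + hi) // 2
                # Widen the probe position to the full line that contains it.
                start = mm.rfind(b"\n", 0, mid) + 1
                end = mm.find(b"\n", start)
                if end == -1:            # last line without trailing newline
                    end = len(mm)
                line = mm[start:end]
                if line == key:
                    return True
                if line < key:
                    lo = end + 1         # continue after this line
                else:
                    hi = start           # continue before this line
            return False

# Demo with a tiny sorted file (hypothetical keys).
keys = [b"apple", b"banana", b"cherry", b"kiwi", b"mango"]
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"\n".join(keys) + b"\n")
    demo_path = tmp.name

found_all = all(sorted_file_contains(demo_path, k) for k in keys)
found_absent = any(
    sorted_file_contains(demo_path, k) for k in (b"aaa", b"app", b"grape", b"zzz")
)
os.remove(demo_path)
```

Because each probe is widened to a full line, variable-length lines need no separate index. The operating system pages in only the parts of the file that the search touches.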
Re: efficient text file search.
Can you add some more info, or point me to a link? I haven't found anything about binary search with mmap() in the Python documentation.

The files are very big...

thanks
amit

Bill Scherer wrote:
> noro wrote:
>
> >Is there a more efficient method to find a string in a text file than:
> >
> >f=file('somefile')
> >for line in f:
> >    if 'string' in line:
> >        print 'FOUND'
> >
> >?
> >
> >BTW:
> >does "for line in f: " read a block of lines into memory, or does it
> >simply call f.readline() many times?
> >
> >thanks
> >amit
>
> If your file is sorted by some key in the data, you can build a very
> fast binary search with mmap in Python.
Re: efficient text file search.
:) via python...

Luuk wrote:
> "noro" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
> > Is there a more efficient method to find a string in a text file than:
> >
> > f=file('somefile')
> > for line in f:
> >     if 'string' in line:
> >         print 'FOUND'
> >
>
> yes, more efficient would be:
> grep (http://www.gnu.org/software/grep/)
efficient text file search.
Is there a more efficient method to find a string in a text file than:

    f=file('somefile')
    for line in f:
        if 'string' in line:
            print 'FOUND'

?

BTW: does "for line in f:" read a block of lines into memory, or does it simply call f.readline() many times?

thanks
amit
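On the block-reading question, one way to make the block size explicit, rather than depend on how the file iterator buffers, is to read fixed-size chunks yourself. The sketch below (Python 3) is one possible approach and not something proposed in the thread. The name `file_contains`, the default chunk size, and the demo data are assumptions. It carries the last `len(needle) - 1` bytes over between chunks so that a match straddling a chunk boundary is not missed.

```python
import os
import tempfile

def file_contains(path, needle, chunk_size=1 << 20):
    """Return True if `needle` occurs anywhere in the file.

    The file is read in binary chunks of `chunk_size` bytes, so memory use
    stays bounded regardless of file size.
    """
    if isinstance(needle, str):
        needle = needle.encode()
    overlap = len(needle) - 1
    tail = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return False
            window = tail + chunk
            if needle in window:
                return True
            # Keep just enough bytes to catch a match split across chunks.
            tail = window[-overlap:] if overlap > 0 else b""

# Demo: a deliberately tiny chunk size forces the needle across a boundary.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"aaaaXYZbbbb")
    demo_path = tmp.name

hit_across_boundary = file_contains(demo_path, "XYZ", chunk_size=5)
miss = file_contains(demo_path, "QQQ", chunk_size=5)
os.remove(demo_path)
```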
data structure
Hello again.

I have a task I need to do and I can't seem to find an elegant solution. I need to make a tree-like data structure (not necessarily a binary tree). I would like each node to access its children in a dictionary kind of way. For example: if the ROOT node has the name 'A', 'AA' and 'AB' are its children, and 'ABA' is a child of 'AB', etc. (in this naming scheme the letters from left to right show the route from the root to the node), then ROOT['AB'] will point to the 'AB' node and ROOT['AB']['ABA'] will point to the 'ABA' node.

The tree does not have to be symmetrical, and every node can link to a different number of nodes. Two nodes can have the same name if they are in different locations in the tree, so ROOT['factory1']['manager'] and ROOT['factory2']['manager'] can be in the same tree and point to different objects.

All of this I can manage so far. My problem is this: I would like to find a way to easily construct the tree by giving some simple template, something similar to:

    ROOT={'factory1':FACTORY,'factory2':FACTORY,'linux':OS,'windows':OS}
    FACTORY={'manager':EMPLOEY,'worker':EMPLOEY,'office':BUILDING,'..}
    OS={'ver':VERSION,'problems':LIST,}

I started building the class NODE as an extension of "dict", where the keys are the children's names and the items are references to the next nodes. The problem was that, as you can see from the above example, 'factory1' and 'factory2' point to the same object. I would like to have 2 different objects of FACTORY, making FACTORY a template and not an object.

I'll appreciate any comment.
amit
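One way to get "FACTORY as a template, not an object" is to keep the templates as plain nested dicts and have the node class build a fresh child for every entry when it is constructed. The following is a minimal sketch in Python 3 under that assumption. The `Node` class, the empty leaf templates, and the cut-down `ROOT` are illustrative and not taken from the original post.

```python
class Node(dict):
    """A tree node whose children are reachable by name: root['a']['b']."""

    def __init__(self, template=None):
        super().__init__()
        # Build a new Node for every entry, so a template that is reused
        # in several places yields several independent subtrees.
        for name, child_template in (template or {}).items():
            self[name] = Node(child_template)

# Templates are plain dicts and must be defined bottom-up, because a dict
# literal looks up the names it refers to at the moment it is evaluated.
EMPLOYEE = {}
BUILDING = {}
FACTORY = {"manager": EMPLOYEE, "worker": EMPLOYEE, "office": BUILDING}
ROOT = {"factory1": FACTORY, "factory2": FACTORY}

root = Node(ROOT)
root["factory1"]["manager"]["phone"] = Node()  # change only one subtree
```

After this runs, `root['factory1']` and `root['factory2']` have the same shape but are distinct objects, so the extra `'phone'` child appears only under `factory1`. The sketch assumes the templates contain no cycles.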
Re: dictionary with object's method as their items
Great, that is what I was looking for.

> >>> class C:
> ...     def function(self, arg):
> ...         print arg
> ...
> >>> obj = C()
> >>> d = C.__dict__
> >>> d['function'](obj, 42)
> 42

This allows me to access the same method in a range of objects. I can put all the functions I need in a dictionary as items, with the variable names as keys, and then call them for all objects that belong to a class. Something like this:

    class C:
        def __init__(self):
            # object vars (placeholder values)
            self.my_voice = ''
            self.my_size = 0
            self.my_feel = ''

        # methods that do something, and that might give a different
        # result for different objects
        def getVoice(self):
            return self.my_voice + 'WOW'
        def getSize(self):
            return self.my_size * 100
        def getFeel(self):
            return self.my_feel

    # create the dictionary with references to the class methods
    dic = {'voice': C.getVoice, 'size': C.getSize, 'feel': C.getFeel}

    # create an array of 10 different objects
    cArray = []
    for i in range(10):
        cArray.append(C())
        cArray[i].my_size = i

    # choose the function you need, and get the result
    choice = 'size'   # or whatever key you need
    for i in range(10):
        print dic[choice](cArray[i])

    # or even print all the values of all objects. if I ever want to print
    # different values I only need to change the dictionary, nothing else...
    for choice in dic:
        for i in range(10):
            print dic[choice](cArray[i])

---
I totally forgot about the "self" argument in every method...

a. Is this the main reason "self" is there, or is it only a side effect?
b. What do you think about this code style? It is not very OOP, but I can't see how one can do it otherwise and still be able to control which function is printed out with something as easy as a dictionary.

Georg Brandl wrote:
> noro wrote:
> > Is it possible to do the following:
> >
> > for a certain class:
> >
> > class C:
> >
> >     def func1(self):
> >         pass
> >     def func2(self):
> >         pass
> >     def func4(self):
> >         pass
> >
> > obj=C()
> >
> > by some way create a dictionary that look somthing like that:
> >
> > d= {'function one': , \
> >     'function two': , \
> >     'function three': }
>
> Perhaps this:
>
> >>> class C:
> ...     def function(self, arg):
> ...         print arg
> ...
> >>> obj = C()
> >>> d = C.__dict__
> >>> d['function'](obj, 42)
> 42
> >>>
>
>
> Georg
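The dispatch-table idea from this message can be shown as a compact, runnable sketch. It is written in Python 3, where a function looked up on the class (`C.get_size`) is a plain function that takes the instance as its first argument, which is the role `self` plays. The constructor arguments, the snake_case method names, and the sample values are assumptions for the example and do not come from the post.

```python
class C:
    def __init__(self, voice, size):
        self.my_voice = voice
        self.my_size = size

    def get_voice(self):
        return self.my_voice + "WOW"

    def get_size(self):
        return self.my_size * 100

# Dispatch table: key -> function taken from the class (not bound to any
# particular instance), so it can be applied to every object in turn.
dispatch = {"voice": C.get_voice, "size": C.get_size}

objects = [C("v%d" % i, i) for i in range(3)]

sizes = [dispatch["size"](obj) for obj in objects]
voices = [dispatch["voice"](obj) for obj in objects]

# Calling through the class with an explicit instance gives the same
# result as the usual method call on the instance.
same_result = C.get_size(objects[2]) == objects[2].get_size()
```

Changing which values get reported then only requires editing `dispatch`, which is the property the message is after.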
dictionary with object's method as their items
Is it possible to do the following:

for a certain class:

    class C:

        def func1(self):
            pass
        def func2(self):
            pass
        def func4(self):
            pass

    obj=C()

by some way create a dictionary that looks something like this:

    d= {'function one': , \
        'function two': , \
        'function three': }

so that I could access every method of instances of C, such as obj, with something like this (I know that this syntax won't work):

    obj.(d['function one'])
    obj.(d['function two'])

etc.

thanks
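The `obj.(d[...])` syntax asked about here does not exist, but the built-in `getattr` performs the same lookup by name. The sketch below (Python 3) stores method names as strings in the dictionary. The return values and the pairing of `'function three'` with `func4` are assumptions, since the original leaves the dictionary values blank.

```python
class C:
    def func1(self):
        return "one"

    def func2(self):
        return "two"

    def func4(self):
        return "four"

obj = C()

# Label -> method name. The mapping itself is a guess at the intent.
d = {
    "function one": "func1",
    "function two": "func2",
    "function three": "func4",
}

# Look the method up on the instance by name, then call it.
first = getattr(obj, d["function one"])()

# Alternatively, bind everything to one instance up front.
bound = {label: getattr(obj, name) for label, name in d.items()}
third = bound["function three"]()
```

Storing names rather than function objects keeps the dictionary independent of any one class, at the cost of an `AttributeError` at call time if a name is misspelled.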
Re: unit test for a printing method
Fredrik Lundh wrote:
> Scott David Daniels wrote:
>
> > For silly module myprog.py:
> >     def A(s):
> >         print '---'+s+'---'
> > in test_myprog.py:
> >     import unittest
> >     from cStringIO import StringIO  # or from StringIO ...
>
> why are you trying to reinvent doctest ?

It was my understanding that "doctest" is intended to test the little examples in a function's or class's documentation. Do people use it for more than that, i.e. to do extensive output testing for their apps?

amit
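For reference, this is roughly what the doctest suggestion looks like for the function under discussion. doctest compares whatever the example prints with the text written below it, so printed output is checked directly. The sketch uses Python 3 syntax. It drives the parser and runner explicitly so that it is self-contained, which is a choice made for this example. In an ordinary module one would typically just call `doctest.testmod()`.

```python
import doctest

def A(s):
    """Print s framed by dashes.

    >>> A("bla")
    ---bla---
    """
    print('---' + s + '---')

# Extract the example from the docstring and run it. `results` is a
# (failed, attempted) named tuple.
parser = doctest.DocTestParser()
test = parser.get_doctest(A.__doc__, {"A": A}, "A", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
```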
unit test for a printing method
What is the proper way to test (using unit tests) a method that prints information?
For example:

    def A(s):
        print '---'+s+'---'

and I want to check that A("bla") really prints out "---bla---".

thanks
amit
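A common way to unit-test printed output is to capture `sys.stdout` while the function runs and compare the captured text. The sketch below uses Python 3's `contextlib.redirect_stdout` and `io.StringIO`. The Python 2 of the thread would swap `sys.stdout` by hand and use `StringIO`/`cStringIO` instead. The test class name is made up for the example.

```python
import io
import unittest
from contextlib import redirect_stdout

def A(s):
    print('---' + s + '---')

class TestA(unittest.TestCase):
    def test_prints_framed_string(self):
        buf = io.StringIO()
        with redirect_stdout(buf):      # everything printed goes into buf
            A("bla")
        # print() appends a newline, so it is part of the expected text.
        self.assertEqual(buf.getvalue(), "---bla---\n")

# Run the test case programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestA)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```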
Re: creating multiple arguments for a method.
John Roth wrote:
> noro wrote:
> > Hi all,
> >
> > I use a method that accepts multiple arguments ("plot(*args)"),
> > so plot([1,2,3,4]) is accepted and plot([1,2,3,4],[5,6,7,8]) is also
> > accepted.
> >
> > The problem is that I know the number of arguments only at runtime.
> > Let's say that during runtime I need to pass 4 arguments, each a list.
> > Creating a set of lists and passing it to the method won't work, since
> > the interpreter thinks it is only 1 argument that contains a reference
> > to a "list of lists", instead of a number of arguments, each a list.
> >
> > Any suggestions?
> > thanks
> > amit
>
> Why do you want to do this? You'll have to do some
> logic in your method body to determine how many
> operands you have whether you explicitly pass a list or
> whether you have the system break it apart into
> separate parameters.
>
> Fredrik Lundh's solution, using an * parameter in the
> method definition, will produce a list that you have to
> pull apart in the method. Doing the same in the method
> call takes a single list of all of your parameters and then
> distributes it among the parameters in the definition.
>
> I wouldn't bother with either one. Passing a list of my
> real parameters as a single parameter is, in most
> circumstances, easier and IMO clearer.
>
> John Roth

I do not have much choice. I did not write the module, I just use it.

Thank you all for the help,
amit
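The call-side `*` that John Roth describes can be illustrated with a stand-in function. In this sketch (Python 3) `plot` is a hypothetical placeholder that only returns what it received. It is not the plotting library's function, and the data values are invented.

```python
def plot(*args):
    """Stand-in for a real plot(*args); returns the arguments it was given."""
    return args

# The number of series is only known at runtime, so they sit in one list.
series = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]

as_one = plot(series)     # one argument: the list of lists
as_many = plot(*series)   # four arguments: one per inner list
```

With the star, each inner list arrives as its own positional argument, which is what a third-party `plot(*args)` signature expects.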
creating multiple arguments for a method.
Hi all,

I use a method that accepts multiple arguments ("plot(*args)"), so plot([1,2,3,4]) is accepted and plot([1,2,3,4],[5,6,7,8]) is also accepted.

The problem is that I know the number of arguments only at runtime. Let's say that during runtime I need to pass 4 arguments, each a list. Creating a set of lists and passing it to the method won't work, since the interpreter thinks it is only 1 argument that contains a reference to a "list of lists", instead of a number of arguments, each a list.

Any suggestions?
thanks
amit
Re: local module access from cgi-bin
thanks bruno

Bruno Desthuilliers wrote:
> noro wrote:
> > hello all.
> >
> > I do some coding in python but this is my first attempt to write
> > something for the web.
> >
> > I need to write a cgi-bin script for a web-server, and i've got the
> > access for it from our "SYSTEM". the problem is that this script uses
> > some modules (pg, pyLab) that i've installed locally in my home dir.
> > Python knows how to find them due to an environment variable in the
> > shell (please correct me if i'm wrong).
> >
> > now, i am by no means big expert about web servers, but if i got it
> > right, the web server run under some user ("www-data" or such).
> > so how can i make this user (and the web) be able to run my python
> > code without having to install the modules as shared. (which i dont
> > think they will allow me).
>
> import sys
> sys.path.append('/home/amir/')
>
> should do the trick.
>
> HTH
> --
> bruno desthuilliers
> python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
> p in '[EMAIL PROTECTED]'.split('@')])"
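Bruno's two-line fix can be wrapped so that it is safe to run more than once and sits at the very top of the CGI script, before the locally installed modules are imported. This is a small sketch in Python 3 syntax. The helper name `add_module_dir` is an assumption, and the path is the one from the reply and should be replaced with the real directory.

```python
import sys

# Path taken from the reply above; substitute the directory that actually
# contains the locally installed modules.
LOCAL_MODULE_DIR = "/home/amir/"

def add_module_dir(path):
    """Append `path` to the module search path unless it is already there."""
    if path not in sys.path:
        sys.path.append(path)

add_module_dir(LOCAL_MODULE_DIR)
add_module_dir(LOCAL_MODULE_DIR)   # second call changes nothing

# Imports of the local modules would follow here, e.g. `import pg`.
```

Extending `sys.path` only tells Python where to look. The web server's user still needs read permission on that directory and on the module files for the imports to succeed.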
local module access from cgi-bin
Hello all.

I do some coding in Python, but this is my first attempt to write something for the web.

I need to write a cgi-bin script for a web server, and I've got access for it from our "SYSTEM". The problem is that this script uses some modules (pg, pyLab) that I've installed locally in my home dir. Python knows how to find them due to an environment variable in the shell (please correct me if I'm wrong).

Now, I am by no means a big expert on web servers, but if I got it right, the web server runs under some user ("www-data" or such). So how can I make this user (and the web) able to run my Python code without having to install the modules as shared (which I don't think they will allow me to do)?

thanks very much
amit