[Tutor] Finding the End of a Def?
I'm using PythonWin. Is there some way to skip from the start of a def to the end? How about to any line with similar indentation?

--
Wayne Watson (Watson Adventures, Prop., Nevada City, CA)
(121.01 Deg. W, 39.26 Deg. N) GMT-8 hr std. time

Copper and its alloys have been found effective in hospital sinks, hand rails, beds, ... in significantly reducing bacteria. Estimates are 1/20 people admitted to a hospital become infected, and 1/20 die from the infection. -- NPR Science Friday, 01/16/2009

___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] finding words that contain some letters in their respective order
2009/1/23 Emad Nawfal (عماد نوفل):

> The letter t does very often occur between the root consonants as well.
> For example, we have akttb, katatib, and for the root fsr you can have
> astfsr.

Sorry, the last example was incorrect. A correct example would be fqr and aftqr, slf and astlf.

--
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة. محمد الغزالي
"No victim has ever been more repressed and alienated than the truth"

Emad Soliman Nawfal
Indiana University, Bloomington
http://emnawfal.googlepages.com
Re: [Tutor] finding words that contain some letters in their respective order
On Fri, Jan 23, 2009 at 8:04 PM, Andre Engels wrote:

> Yes, that looks correct - and a regular expression solution also is
> easier to adapt - for example, the little that I know of Arabic makes me
> believe that _between_ the letters of a root there may only be vowels.
> If that's correct, the RE can be changed to
>
> "[a-z]*k[aeiou]*t[aeiou]*b[a-z]*"

The letter t does very often occur between the root consonants as well. For example, we have akttb, katatib, and for the root fsr you can have astfsr.

Thank you Andre for your helpfulness, and thank you Eugene for suggesting the use of regular expressions.

--
Emad Soliman Nawfal
Re: [Tutor] finding words that contain some letters in their respective order
2009/1/24 Emad Nawfal (عماد نوفل):

> Hi again,
> If I want to use a regular expression to find the root ktb in all its
> derivations, would this be a good way around it:
>
> >>> x = re.compile("[a-z]*k[a-z]*t[a-z]*b[a-z]*")
> >>> text = "hw syktbha ghda wlktab ktb"
> >>> re.findall(x, text)
> ['syktbha', 'wlktab', 'ktb']

Yes, that looks correct, and a regular expression solution is also easier to adapt. For example, the little that I know of Arabic makes me believe that _between_ the letters of a root there may only be vowels. If that's correct, the RE can be changed to

    "[a-z]*k[aeiou]*t[aeiou]*b[a-z]*"

--
André Engels, andreeng...@gmail.com
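The two patterns discussed here differ only in what they allow between the root letters, so they can be generated from the root itself. The following is a minimal sketch in Python 3, not code from the thread: `root_pattern` and its `between` parameter are made-up names, and the sample text adds two of the words mentioned elsewhere in the discussion (katatib, akttb) to show where the vowels-only variant falls short.

```python
import re


def root_pattern(root, between="[a-z]*"):
    """Compile a regex that matches a whole word containing the root letters in order.

    `between` is the sub-pattern allowed between two root letters
    (hypothetical parameter, used here to compare the two variants).
    """
    # re.escape protects against root letters that are regex metacharacters.
    middle = between.join(re.escape(letter) for letter in root)
    return re.compile(r"\b[a-z]*" + middle + r"[a-z]*\b")


text = "hw syktbha ghda wlktab katatib akttb ktb"

anything = root_pattern("ktb")                         # any letters in between
vowels_only = root_pattern("ktb", between="[aeiou]*")  # only vowels in between

print(anything.findall(text))     # ['syktbha', 'wlktab', 'katatib', 'akttb', 'ktb']
print(vowels_only.findall(text))  # ['syktbha', 'wlktab', 'ktb']
```

The vowels-only variant misses katatib and akttb because a t sits between the root consonants, which is why the looser pattern may be the safer default for this corpus.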
Re: [Tutor] finding words that contain some letters in their respective order
2009/1/23 Emad Nawfal (عماد نوفل):

> Thank you so much. bktab is a legal Arabic word. I also found the word
> bmktbha in the corpus. I would have missed that.
> Thank you again.

Hi again,
If I want to use a regular expression to find the root ktb in all its derivations, would this be a good way around it:

    >>> import re
    >>> x = re.compile("[a-z]*k[a-z]*t[a-z]*b[a-z]*")
    >>> text = "hw syktbha ghda wlktab ktb"
    >>> re.findall(x, text)
    ['syktbha', 'wlktab', 'ktb']

--
Emad Soliman Nawfal
Re: [Tutor] finding words that contain some letters in their respective order
On Fri, Jan 23, 2009 at 6:57 PM, Andre Engels wrote:

> I made an error in my program... Sorry, it should be:
>
> def hasRoot(word, root): # This order I find more logical
>     loc = 0
>     for letter in root:
>         loc = word.find(letter,loc) # I missed the ,loc here...
>         if loc == -1:
>             return False
>     return True

Thank you so much. bktab is a legal Arabic word. I also found the word bmktbha in the corpus. I would have missed that.
Thank you again.

--
Emad Soliman Nawfal
Re: [Tutor] finding words that contain some letters in their respective order
I made an error in my program... Sorry, it should be:

    def hasRoot(word, root):  # This order I find more logical
        loc = 0
        for letter in root:
            loc = word.find(letter, loc)  # I missed the ,loc here...
            if loc == -1:
                return False
        return True

    # main

    infile = open("myCorpus.txt").read().split()
    query = "ktb"
    outcome = [word for word in infile if hasRoot(word, query)]

--
André Engels, andreeng...@gmail.com
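For readers who want to try the corrected idea on its own, here is a small self-contained sketch in Python 3. It is a variation rather than the code from this message: the name `has_root` and the inline sample corpus are assumptions, and the search position is moved one past each match so that a root with a doubled letter cannot match the same character twice.

```python
def has_root(word, root):
    """Return True if the letters of `root` occur in `word` in the same order."""
    pos = 0
    for letter in root:
        pos = word.find(letter, pos)  # search from the current position onward
        if pos == -1:
            return False
        pos += 1  # continue after the matched letter (assumed refinement)
    return True


# Hypothetical sample corpus; a real run would read the words from a file.
corpus = "ktab mktob bktab btk yktb".split()
outcome = [word for word in corpus if has_root(word, "ktb")]
print(outcome)  # ['ktab', 'mktob', 'bktab', 'yktb']
```

The word btk is rejected because its t comes before its k, while bktab is accepted even though it starts with a stray b.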
Re: [Tutor] finding words that contain some letters in their respective order
On Sat, Jan 24, 2009 at 12:02 AM, Emad Nawfal (عماد نوفل) wrote:

> I need to find all the word forms made up of a certain root in a corpus. My
> idea, which is not completely right, but nonetheless works most of the
> time, is to find words that have the letters of the root in their
> respective order. For example, the words that contain k followed by t
> then followed by b, no matter whether there is something in between. I came
> up with following which works fine. For learning purposes, please let me
> know whether this is a good way, and how else I can achieve that.
>
> def getRoot(root, word):
>     result = ""
>     for letter in word:
>         if letter not in root:
>             continue
>         result += letter
>     return result
>
> # main
>
> infile = open("myCorpus.txt").read().split()
> query = "ktb"
> outcome = set([word for word in infile if query == getRoot(query, word)])
> for word in outcome:
>     print(word)

This gets into problems if the letters of the root occur somewhere else in the word as well. For example, if there were a word bktab, then getRoot("ktb","bktab") would be "bktb", not "ktb". I would use the find method of the string class here: if A and B are strings, and n is a number, then A.find(B,n) is the first location, starting at n, where B is a substring of A, or -1 if there isn't any.

Using this, I get:

    def hasRoot(word, root):  # This order I find more logical
        loc = 0
        for letter in root:
            loc = word.find(letter)
            if loc == -1:
                return False
        return True

    # main

    infile = open("myCorpus.txt").read().split()
    query = "ktb"
    outcome = [word for word in infile if hasRoot(word, query)]
    for word in outcome:
        print(word)

--
André Engels, andreeng...@gmail.com
[Tutor] finding words that contain some letters in their respective order
Hello Tutors,
Arabic words are built around a root of 3 or 4 consonants with lots of letters in between, and also prefixes and suffixes. The root ktb (write), for example, can be found in words like:

ktab: book
mktob: letter, written
wktabhm: and their book
yktb: to write
lyktbha: in order for him to write it

I need to find all the word forms made up of a certain root in a corpus. My idea, which is not completely right, but nonetheless works most of the time, is to find words that have the letters of the root in their respective order. For example, the words that contain k followed by t then followed by b, no matter whether there is something in between. I came up with the following, which works fine. For learning purposes, please let me know whether this is a good way, and how else I can achieve that.
I appreciate your help, as I always did.

    def getRoot(root, word):
        result = ""
        for letter in word:
            if letter not in root:
                continue
            result += letter
        return result

    # main

    infile = open("myCorpus.txt").read().split()
    query = "ktb"
    outcome = set([word for word in infile if query == getRoot(query, word)])
    for word in outcome:
        print(word)

--
Emad Soliman Nawfal
Re: [Tutor] Possible to search text file for multiple string values at once?
On Fri, Jan 23, 2009 at 1:11 PM, Scott Stueben wrote:

> Thanks for the help so far - it seems easy enough. To clarify on the
> points you have asked me about:
>
> A sqlite3 database on my machine would be an excellent idea for
> personal use. I would like to be able to get a functional script for
> others on my team to use, so maybe a script or compiled program
> (Win32) eventually.

As long as everyone on your team has python installed (or as long as python is installed on the machines they'll be using), a functional script would be fairly easy to get rolling. Sqlite is (AFAIK) included with the newer versions of python by default. Heck, it's on the version I have installed on my phone! (Cingular 8525). Simply zipping up the directory should provide an easy enough distribution method. Although, you *could* even write a python script that does the "install" for them.

> As for output, I would probably like to return the entire lines that
> contain any search results of those strings. Maybe just output to a
> results.txt that would have the entire line of each line that contains
> 'Bob', 'John', 'Joe', 'Jim', and or 'Fred'.

The simplest method:

    In [5]: f = open('interculturalinterview2.txt', 'r')

    In [6]: searchstrings = ('holy', 'hand', 'grenade', 'potato')

    In [7]: for line in f.readlines():
       ...:     for word in searchstrings:
       ...:         if word in line:
       ...:             print line
       ...:
       ...:
    Hana: have a bonfire n candy apples n make potatoes on a car lol!
    Wayne: potatoes on a car?
    Hana .: yer lol its fun and they taste nicer lol, you wrap a potato in tinfoil and put in on the engine of a car and close the bonnet and have the engine run and it cooks it in about 30 mins

> Speed isn't as important as ease of use, I suppose, since
> non-technical people should be able to use it, ideally.

Although that wouldn't be quite so easy to use ;) Of course simple modifications would provide a little more user friendliness.

> Maybe, since I am on Win32, I could have a popup window that asks for
> input filename and path, and then asks for string(s) to search for,
> and then it would process the search and output all lines to a file.
> Something like that is what I am imagining, but I am open to
> suggestions on items that may be easier to use or code.

Using Tkinter (again, which AFAIK comes with all versions of python) is pretty simple.

    import Tkinter, tkFileDialog, tkMessageBox

    root = Tkinter.Tk()
    root.withdraw()
    if tkMessageBox.askyesno("Choose a file?", "Would you like to choose a file to search?"):
        f = tkFileDialog.askopenfile()

You could also use this method to select a file that contains search strings, or allow the user to input them in some other way.

> Is that reasonably simple to code for a beginner?

Yes, it's fairly simple. If you've had minimal programming experience you might encounter some bigger problems, but if you've programmed in PHP or some other language you should find it fairly easy to pick up python, and use all the commands you'll need.

HTH,
Wayne
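The "write every matching line to results.txt" part of the request could be sketched as follows. This is only an illustration in Python 3 under stated assumptions: the function names `matching_lines` and `search_file` are made up, the files are assumed to be UTF-8 text, and each line is written once even when it contains several of the search strings (the interactive loop above would print such a line once per matching word).

```python
def matching_lines(lines, terms):
    """Yield each line that contains at least one of the search terms."""
    for line in lines:
        if any(term in line for term in terms):
            yield line


def search_file(in_path, out_path, terms):
    """Copy every matching line of in_path to out_path and return the count."""
    count = 0
    with open(in_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for line in matching_lines(src, terms):
            dst.write(line)  # lines keep their own trailing newline
            count += 1
    return count
```

A call might look like `search_file("input.txt", "results.txt", ("Bob", "John", "Joe", "Jim", "Fred"))`, where both file names are placeholders. The paths could just as well come from a file dialog.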
Re: [Tutor] Possible to search text file for multiple string values at once?
Thanks for the help so far - it seems easy enough. To clarify on the points you have asked me about:

A sqlite3 database on my machine would be an excellent idea for personal use. I would like to be able to get a functional script for others on my team to use, so maybe a script or compiled program (Win32) eventually.

As for output, I would probably like to return the entire lines that contain any search results of those strings. Maybe just output to a results.txt that would have the entire line of each line that contains 'Bob', 'John', 'Joe', 'Jim', and/or 'Fred'.

Speed isn't as important as ease of use, I suppose, since non-technical people should be able to use it, ideally.

Maybe, since I am on Win32, I could have a popup window that asks for input filename and path, and then asks for string(s) to search for, and then it would process the search and output all lines to a file. Something like that is what I am imagining, but I am open to suggestions on items that may be easier to use or code.

Is that reasonably simple to code for a beginner?

Thanks again,
Scott

On Fri, Jan 23, 2009 at 11:52 AM, Kent Johnson wrote:
> What do you want to do if you find one? Do you want to get every line
> that contains any of the strings, or a list of which strings are
> found, or just find out if any of the strings are there?

--
"Shine on me baby, cause it's rainin' in my heart" --Elliott Smith
Re: [Tutor] class arguments?
Forwarding to the list with my reply...

On Fri, Jan 23, 2009 at 1:35 PM, spir wrote:
> On Fri, 23 Jan 2009 06:45:04 -0500,
> Kent Johnson wrote:
>
>> On Fri, Jan 23, 2009 at 6:04 AM, spir wrote:
>>
>> > Thank you Alan and sorry for not having been clear enough. The point
>> > actually was class (definition) attributes. I was thinking of e.g. Guido's
>> > view that lists are for homogeneous sequences, as opposed to tuples,
>> > which are rather like records, and of a way to ensure such homogeneity, in
>> > the sense of items being of the same type or supertype.
>> > The straightforward path to ensure that, as I see it, is to add a proper
>> > argument to a class definition.
>>
>> A simple way to do this is with a class factory function, for example:
>>
>> def makeMonoList(typ, number):
>>     class MonoListSubtype(MonoList):
>>         item_type = typ
>>         item_number = number
>>     return MonoListSubtype
>
> That's it! Stupid me!! [I just realized I have a kind of mental block that
> prevents me from *imagining* a class being defined inside a func. For me a
> class is a higher-level kind of thing. Actually I also have problems with
> defs inside defs. Maybe there should be more introduction to that in Python
> literature. It would probably help and simplify a whole lot of models.]
> Thank you again.
>
>> then e.g.
>> IntegerList = makeMonoList(int, 5)
>> myIntegerList = IntegerList()
>>
>> This is similar in spirit to collections.namedtuple() in Python 2.6
>> though the implementation is different; namedtuple() actually creates
>> and evaluates the text of the new class definition:
>> http://docs.python.org/library/collections.html#namedtuple-factory-function-for-tuples-with-named-fields
>> http://svn.python.org/view/python/trunk/Lib/collections.py?rev=68853&view=auto
>
> I looked at that some time ago, as you had pointed to it already. I take the
> opportunity to ask why this construct is so complicated. There is an
> alternative in the cookbook (also pointed out by Kent, if I remember well) that
> is only a few lines long. Something like:
>
> class Record(dict):
>     def __init__(self, **kwargs):
>         dict.__init__(self, kwargs)
>         # and/or
>         self.__dict__ = kwargs
>
> [There are several versions around, +/- based on the same principle]

namedtuple() creates a new class that has exactly the desired attributes, so it is a bit more specific and typesafe - you have to have the correct number of points. The generated class subclasses tuple so it can be used as a dict key (if the items themselves can be). Class instances are lightweight because they don't have a __dict__ member.

> Actually, the thing I like the least in the namedtuple recipe is that it
> writes the class def as a string to be executed:

Yeah, I know...with all the time we spend on the list telling people not to use eval()...

> I know there are several advantages:
> * a docstring
> * For large collections of records of the same (sub)type, the list of
> fields is held by the class (instances record the actual data only), which
> spares memory. But doesn't this lead to lower performance, as attribute
> access by name requires addressing a class-level attribute?

The attributes are properties, so attribute access is like a method call. I suppose this is slower than direct field access but it is a common Python technique.

> * attributes can be accessed by index, too
>
> Also, as this factory creates kinds of records, meaning data constructs with
> an identical structure that would perfectly hold table records, why isn't it
> simply called "record"?
> To sum up in a word: why so much *complication*?

I guess you would have to search comp.lang.python or python-dev to find the reasons, I don't think there is a PEP for this (at least not referenced in the What's New).

Kent

> Denis
> --
> la vida e estranya
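To make the class factory idea concrete, here is a runnable sketch in Python 3. The thread never shows the `MonoList` base class, so the one below is purely hypothetical (it only checks `append`), and the factory is simplified to take just the item type. The `Point` example at the end uses the real `collections.namedtuple` to show the tuple-like behaviour described above.

```python
from collections import namedtuple


class MonoList(list):
    """A list that only accepts items of one type (hypothetical base class)."""
    item_type = object

    def append(self, item):
        if not isinstance(item, self.item_type):
            raise TypeError("expected %s, got %r" % (self.item_type.__name__, item))
        super().append(item)


def make_mono_list(typ):
    """Class factory: build and return a MonoList subclass bound to `typ`."""
    class MonoListSubtype(MonoList):
        item_type = typ
    return MonoListSubtype


IntegerList = make_mono_list(int)
numbers = IntegerList()
numbers.append(3)        # accepted
# numbers.append("3")    # would raise TypeError

# namedtuple() is itself a class factory: it returns a new tuple subclass.
Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)
print(p.x, p[1])         # access by name or by index
print({p: "origin-ish"}[Point(1, 2)])  # usable as a dict key
```

A fuller version would also guard `extend`, `insert`, and item assignment; the sketch checks only `append` to keep the factory itself in focus.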
Re: [Tutor] fetching wikipedia articles
On Fri, Jan 23, 2009 at 11:25 AM, Andre Engels wrote:
> On Fri, Jan 23, 2009 at 10:37 AM, amit sethi wrote:
>> so is there a way around that problem ??
>
> Ok, I have done some checking around, and it seems that the Wikipedia
> server is giving a return code of 403 (forbidden), but still giving
> the page - which I think is weird behaviour. I will check with the
> developers of Wikimedia why this is done,

It appears that this is done on purpose, not just for Python but also for the 'standard' user agent in other languages. The idea is that it forces programmers to add their own user agent, so that if the program trying to contact Wikipedia misbehaves, it can be blocked or otherwise dealt with. As a bonus, it also gives programmers a small extra hurdle, so that the most amateurish attempts are stopped but more thought-out programs are not.

--
André Engels, andreeng...@gmail.com
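Sending your own user agent could look roughly like the sketch below. It uses Python 3's `urllib.request` rather than the Python 2 `urllib` of the time; the agent string, the contact address, and the helper names `build_request` and `fetch` are all placeholders, and whether a given agent string is accepted is up to the server.

```python
import urllib.request

# Hypothetical identifier: name your program and give a way to contact you.
USER_AGENT = "TutorExampleBot/0.1 (contact: you@example.org)"


def build_request(url, user_agent=USER_AGENT):
    """Create a request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url):
    """Download the page body as bytes (performs a network request)."""
    with urllib.request.urlopen(build_request(url)) as response:
        return response.read()


request = build_request("https://en.wikipedia.org/wiki/Python_(programming_language)")
print(request.get_header("User-agent"))  # header names are stored capitalized
```

Nothing is downloaded until `fetch()` is called, so the request can be inspected first.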
Re: [Tutor] Possible to search text file for multiple string values at once?
On Fri, Jan 23, 2009 at 1:25 PM, Scott Stueben wrote:

> I would like to search a text file for a list of strings, like a sql query.

What do you want to do if you find one? Do you want to get every line that contains any of the strings, or a list of which strings are found, or just find out if any of the strings are there?

> For instance: To search a text file for the values 'Bob', 'John',
> 'Joe', 'Jim', and 'Fred', you would have to open the dialog and do
> five separate searches. Lots of copying and pasting, lots of room for
> typos.

You can do this with a regular expression. For example,

    import re
    findAny = re.compile('Bob|John|Joe|Jim|Fred')

    for found in findAny.findall(s):
        print found

will print all occurrences of any of the target names.

You can build the regex string dynamically from user input; if 'toFind' is a list of target words, use

    findAny = re.compile('|'.join(re.escape(target) for target in toFind))

re.escape() cleans up targets that have special characters in them.
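Putting the two snippets together, a Python 3 sketch might look like this. The helper name `build_finder` and the sample text are assumptions made for illustration; the regular expression itself is built exactly as described, with `re.escape()` applied to each target.

```python
import re


def build_finder(targets):
    """Compile one regex that matches any of the literal target strings."""
    return re.compile("|".join(re.escape(target) for target in targets))


# Hypothetical sample text standing in for the contents of a file.
text = "Bob met Joe.\nNobody else came.\nFred and Jim left early.\n"
finder = build_finder(["Bob", "John", "Joe", "Jim", "Fred"])

found = finder.findall(text)
print(found)  # ['Bob', 'Joe', 'Fred', 'Jim']

# Keep whole lines that mention at least one target.
lines = [line for line in text.splitlines() if finder.search(line)]
print(lines)  # ['Bob met Joe.', 'Fred and Jim left early.']
```

The match is case-sensitive, so "Nobody" does not count as "Bob". Passing `re.IGNORECASE` to `re.compile` would change that, at the price of also matching inside words such as "Nobody".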
Re: [Tutor] Possible to search text file for multiple string values at once?
On Fri, Jan 23, 2009 at 12:38 PM, bob gailer wrote:

> Scott Stueben wrote:
>
>> I would love to set up a script to parse a file and show results from
>> a list of strings. Is this possible with python?
>
> Yes - and also possible with almost all other programming languages.
> Here is one of several ways to do it in Python:
>
> for first_name in ('Bob', 'John', 'Joe', 'Jim', 'Fred'):
>     if first_name in text:
>         print first_name, 'found'

Another option, if you really like sql, is to import sqlite3 and then parse your text file into a sqlite database. That's probably overkill, of course. But it's a possibility.

It really depends on what matters most. Speed? Comfort (with syntax)? Ease of use? That will give you an idea of which tools you should use.

HTH,
Wayne

--
To be considered stupid and to be told so is more painful than being called gluttonous, mendacious, violent, lascivious, lazy, cowardly: every weakness, every vice, has found its defenders, its rhetoric, its ennoblement and exaltation, but stupidity hasn't. - Primo Levi
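For anyone curious what the sqlite3 route could look like, here is a minimal Python 3 sketch using an in-memory database. The table and column names follow the SQL quoted in the original question; everything else is an assumption, in particular that the first whitespace-separated field of each line is the first name, and the helper names `load_lines` and `find_by_first_name`.

```python
import sqlite3


def load_lines(conn, lines):
    """Store each non-empty line together with its first word."""
    conn.execute("CREATE TABLE IF NOT EXISTS my_table (first_name TEXT, line TEXT)")
    rows = [(line.split()[0], line) for line in lines if line.strip()]
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)


def find_by_first_name(conn, names):
    """Return the stored lines whose first_name is in `names`."""
    placeholders = ", ".join("?" for _ in names)  # one ? per name
    query = ("SELECT line FROM my_table WHERE first_name IN (%s) "
             "ORDER BY rowid" % placeholders)
    return [row[0] for row in conn.execute(query, list(names))]


conn = sqlite3.connect(":memory:")
load_lines(conn, ["Bob Smith 1001", "Alice Jones 1002", "Fred Brown 1003"])
print(find_by_first_name(conn, ["Bob", "John", "Joe", "Jim", "Fred"]))
# ['Bob Smith 1001', 'Fred Brown 1003']
```

Only the `?` placeholders are formatted into the query string; the names themselves are passed as parameters, which avoids quoting problems.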
Re: [Tutor] Possible to search text file for multiple string values at once?
Scott Stueben wrote:

> I would like to search a text file for a list of strings, like a sql
> query.
>
> I would love to set up a script to parse a file and show results from
> a list of strings. Is this possible with python?

Yes - and also possible with almost all other programming languages. Here is one of several ways to do it in Python:

    for first_name in ('Bob', 'John', 'Joe', 'Jim', 'Fred'):
        if first_name in text:
            print first_name, 'found'

--
Bob Gailer
Chapel Hill NC
919-636-4239
[Tutor] Possible to search text file for multiple string values at once?
Hi all,

I understand that python excels at text processing, and wondered if there is a way to use python to accomplish a certain task. I am trying to search large text files for multiple strings (like employee ID number, or name). Any text editor (I use Windows mostly) will certainly have a "find", "replace", or even "find in files" (to search multiple files for a value) function, but this is searching for one string at a time.

I would like to search a text file for a list of strings, like a sql query.

For instance: To search a text file for the values 'Bob', 'John', 'Joe', 'Jim', and 'Fred', you would have to open the dialog and do five separate searches. Lots of copying and pasting, lots of room for typos.

But if you were in a SQL database, you could do something like:

    "SELECT * FROM my_table WHERE first_name IN ('Bob', 'John', 'Joe', 'Jim', 'Fred')"

and you would get results for all five values.

I would love to set up a script to parse a file and show results from a list of strings. Is this possible with python?

Thanks for the input and help,
Scott
Re: [Tutor] Why dictionaries?
On Fri, Jan 23, 2009 at 8:57 AM, Vicent wrote:
> A simple but maybe too wide question:
>
> When is it / isn't it useful to use dictionaries, in a Python program?
> I mean, what kind of tasks are they interesting for?

Lists are ordered, dicts are not.

List indices are consecutive integers; dict keys can be any hashable value (most often a number or string) and do not have to be consecutive.

Dict lookup is fast, even for large dictionaries. List search is sequential and gets slow for long lists. This can be a performance killer; for example, a common performance problem is code like this:

    for item1 in very_long_list_1:
        if item1 in very_long_list_2:
            pass  # do something with item1 that matches

This particular example is best solved with a set, not a dict, but the performance of both is similar and sometimes a dict is the correct solution. This will be significantly faster than the previous code:

    list_2_items = set(very_long_list_2)
    for item1 in very_long_list_1:
        if item1 in list_2_items:
            pass  # do something with item1

Kent
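A small self-contained sketch of both cases in Python 3 follows. The function names `common_items` and `count_words` and the sample data are made up for illustration: the first shows the set-based membership test, the second a task where a dict is the natural fit because each key needs an associated value.

```python
def common_items(first, second):
    """Return the items of `first` that also occur in `second`, in order."""
    lookup = set(second)  # built once; membership tests are then fast on average
    return [item for item in first if item in lookup]


def count_words(words):
    """Count how often each word occurs (a dict maps word -> count)."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


print(common_items([1, 2, 3, 4], [4, 2, 9]))     # [2, 4]
print(count_words("the cat and the hat".split()))
```

Note that the set version requires the items to be hashable, just as dict keys must be.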
Re: [Tutor] Why dictionaries?
"Vicent" wrote When is it / isn't it useful to use dictionaries, in a Python program? I mean, what kind of tasks are they interesting for? They are interesting for any place where you need to store associations of objects and data. Think of a glossary or an index in a book. Both are a type of dictionary. A Glossary takes a single word and returns a paragraph of definition. An index takes a word and retirns a list of page numbers where that word appears. A dictionary can also model a simple database where a single key can retrieve an entire record. It goes on and on, they are one of the most powerful data structures available to us. The biggest snags are that they are not ordered so if you need a sorted data store dictionaries may not be the best choice. Maybe you can give me some references where they explain it. Try Wikipedia under Associative Array (a fancy term for a dictionary) or Hash Table for a description of the inner workings... HTH, -- Alan Gauld Author of the Learn to Program web site http://www.freenetpages.co.uk/hp/alan.gauld ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Customizing Eclipse text editor
"spir" wrote There should be some research done on this topic. After all, it's a relevant aspect of all programmers' everyday life, no? I suspect there has been - just not in our field. The reason I changed the colours on my web tutor about a year ago was because one of my users who was a graphics designer chided me for the use of contrasting colours. It is apparently easier to read something if all the colours on a page are from the same family - different shades of red say. Contrasting colours clash and so should only be used where you want to draw attention away from the main text. I'm no expert but he sounded like he knew what he was talking about and the end result pleases me so I stuck with it! :-) Alan G. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
[Tutor] Why dictionaries?
A simple but maybe too wide question: When is it / isn't it useful to use dictionaries, in a Python program? I mean, what kind of tasks are they interesting for? Maybe you can give me some references where they explain it. Thank you!! -- Vicent ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] fetching wikipedia articles
"Kent Johnson" wrote Rather than editing the existing code and making it non standard why not subclass robotparser: That won't work, it is urllib.URLOpener() that he is patching and Sorry, yes I misread that post as modifying robotparser, it should have been URLOpener. But... robotparser does not supply a way to change the URLOpener subclass that it uses. I didn't realize that. So you would need to cut n' paste the robotparser read() method into a subclass of robotparser too. Which is almost as messy as editing the original source. Such a shame that the author didn't either put the opener as an attribute or as a defaulted parameter of read! Pity, I hate to see editing of existing classes in an OO system. Alan G. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] fetching wikipedia articles
On Fri, Jan 23, 2009 at 5:37 AM, Andre Engels wrote:
> Looking further I found that a 'cleaner' way to make the same change
> is to add to the code of URLopener (outside any method):
>
> version = ''

You can do this without modifying the standard library source, by

    import urllib
    urllib.URLopener.version = ''

The version string is used as the User-Agent, that is why this works at all.

Kent
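Why assigning to a class attribute from outside has this effect can be shown with a toy class. This is only a sketch: `Opener` below is a hypothetical stand-in, not the real urllib class, and it assumes the header list is built from the class attribute when an instance is created.

```python
class Opener:
    """Hypothetical stand-in for a URL opener; not the real urllib class."""

    version = "Python-demo/1.0"  # class attribute shared by all instances

    def __init__(self):
        # The header list is built from the class attribute at construction time.
        self.addheaders = [("User-Agent", self.version)]


before = Opener()

# Patch the class attribute from outside, without editing the class source.
Opener.version = ""

after = Opener()
print(before.addheaders)
print(after.addheaders)
```

Under that assumption, only instances created after the assignment pick up the new value, so the patch has to run before the opener is instantiated.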
Re: [Tutor] fetching wikipedia articles
On Fri, Jan 23, 2009 at 6:23 AM, Alan Gauld wrote:
> Rather than editing the existing code and making it non standard
> why not subclass robotparser:
>
> class WP_RobotParser(robotparser):
>     def __init__(self, *args, **kwargs):
>         robotparser.__init__(self, *args, **kwargs)
>         self.addheaders = ...blah
>
> That's one of the advantages of OOP, you can change the way
> classes work without modifying the original code. And thus not
> breaking any code that relies on the original behaviour.

That won't work, it is urllib.URLopener() that he is patching and robotparser does not supply a way to change the URLopener subclass that it uses.

Kent
Re: [Tutor] class arguments?
On Fri, Jan 23, 2009 at 6:04 AM, spir wrote:
> Thank you Alan and sorry for not having been clear enough. The point actually
> was class (definition) attributes. I was thinking of e.g. Guido's view that
> lists are for homogeneous sequences, as opposed to tuples, which are rather
> like records. And of a way to ensure such homogeneity, in the sense of items
> being of the same type or super type.
> The straightforward path to ensure that, as I see it, is to add a proper
> argument to a class definition.

A simple way to do this is with a class factory function, for example:

    def makeMonoList(typ, number):
        class MonoListSubtype(MonoList):
            item_type = typ
            item_number = number
        return MonoListSubtype

then e.g.

    IntegerList = makeMonoList(int, 5)
    myIntegerList = IntegerList()

This is similar in spirit to collections.namedtuple() in Python 2.6 though the implementation is different; namedtuple() actually creates and evaluates the text of the new class definition:
http://docs.python.org/library/collections.html#namedtuple-factory-function-for-tuples-with-named-fields
http://svn.python.org/view/python/trunk/Lib/collections.py?rev=68853&view=auto

Kent
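A self-contained version of the class factory idea might look like the sketch below. The `MonoList` base here is a minimal stand-in written for this example (it only checks items at construction), the `number` argument from the post is left out for brevity, and the generated class name is just a convenience.

```python
class MonoList(list):
    """Minimal stand-in base class: items must be instances of item_type."""

    item_type = object  # permissive default; generated subclasses narrow it

    def __init__(self, items=()):
        items = list(items)
        for item in items:
            if not isinstance(item, self.item_type):
                raise TypeError(
                    "expected %s, got %r" % (self.item_type.__name__, item)
                )
        list.__init__(self, items)


def make_mono_list(typ):
    """Class factory: return a new MonoList subclass restricted to typ."""

    class MonoListSubtype(MonoList):
        item_type = typ

    # Give the generated class a readable name, e.g. IntList for int.
    MonoListSubtype.__name__ = "%sList" % typ.__name__.capitalize()
    return MonoListSubtype


IntegerList = make_mono_list(int)
StringList = make_mono_list(str)

numbers = IntegerList([1, 2, 3])
words = StringList(["a", "b"])
print(IntegerList.__name__, numbers)
print(StringList.__name__, words)
```

Each call to the factory produces a separate class with its own `item_type`, so the integer and string variants do not interfere with each other.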
Re: [Tutor] Customizing Eclipse text editor
On Fri, 23 Jan 2009 11:17:15 +0100, Vicent wrote:

> Hello everyone.
[...]
> As you can see there, I prefer a black background, so I've changed colors a
> little. My second question is: do you know any kind of recommended (I mean,
> optimized for a good working experience, good for eyes health, etc.) color
> palette or color combination with black background?
>
> Thank you in advance.

I have talked at length about that with a web designer who is particularly aware of user comfort in relation to colors, brightness, etc. What I remember is the following rules about what factors make reading/watching harmful, difficult and/or tiring:
-1- overall brightness --> white background forbidden
-2- difficulty to read or distinguish shapes (e.g. too small or badly designed fonts)
-3- physical effects of low quality monitors
-4- too low (rule -2-) or too high (rule -1-) contrast

In addition to that, a basic rule of graphical design in any domain is that too many colors or styles -- as usually done in standard syntax highlighting -- hinder readability rather than help it.

There should be some research done on this topic. After all, it's a relevant aspect of all programmers' everyday life, no?

I'm rather sure that for most PLs, a combination of 3 to 5 colors, and bold/italics properties given to only 1 or 2 token types is optimal. I haven't yet found a style sheet to be really happy with, nevertheless. Now, for a language that becomes more and more complex like python... maybe it will be easier with py3 ;-)

denis
--
la vida e estranya
Re: [Tutor] fetching wikipedia articles
"Andre Engels" wrote developers of Wikimedia why this is done, but for now you can resolve this by editing robotparser.py in the following way: In the __init__ of the class URLopener, add the following at the end: self.addheaders = [header for header in self.addheaders if header[0] != "User-Agent"] + [('User-Agent', '')] Rather than editing the existing code and making it non standard why not subclass robotparser: class WP_RobotParser(robotparser): def __init__(self, *args, *kwargs): robotparser.__init__(self, *args, *kwargs) self.addheaders = ...blah Thats one of the advantages of OOP, you can change the way classes work without modifying the original code. And thus not breaking any code that relies on the original behaviour. HTH, -- Alan Gauld Author of the Learn to Program web site http://www.freenetpages.co.uk/hp/alan.gauld ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] fetching wikipedia articles
On Fri, Jan 23, 2009 at 12:07 PM, amit sethi wrote: > well thanks ... it worked well ... but robotparser is in urllib isn't there > a module like robotparser in > urllib2 You'll have to ask someone else about that part... -- André Engels, andreeng...@gmail.com ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] fetching wikipedia articles
well thanks ... it worked well ... but robotparser is in urllib isn't there a module like robotparser in urllib2 On Fri, Jan 23, 2009 at 3:55 PM, Andre Engels wrote: > On Fri, Jan 23, 2009 at 10:37 AM, amit sethi > wrote: > > so is there a way around that problem ?? > > Ok, I have done some checking around, and it seems that the Wikipedia > server is giving a return code of 403 (forbidden), but still giving > the page - which I think is weird behaviour. I will check with the > developers of Wikimedia why this is done, but for now you can resolve > this by editing robotparser.py in the following way: > > In the __init__ of the class URLopener, add the following at the end: > > self.addheaders = [header for header in self.addheaders if header[0] > != "User-Agent"] + [('User-Agent', '')] > > (probably > > self.addheaders = [('User-Agent', '')] > > does the same, but my version is more secure) > > -- > André Engels, andreeng...@gmail.com > -- A-M-I-T S|S ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] class arguments?
On Thu, 22 Jan 2009 23:29:59 -, "Alan Gauld" wrote:

>
> "Alan Gauld" wrote
>
> >> is there a way to give arguments to a class definition?
>
> I see that Kent interpreted your question differently to me.
> If you do mean that you want to dynamically define class
> attributes rather than instance attributes then __init__()
> won't work. But I'd be interested to understand why and
> how you would want to do that? And in particular how
> you would use them after creating them?

Thank you Alan and sorry for not having been clear enough. The point actually was class (definition) attributes. I was thinking of e.g. Guido's view that lists are for homogeneous sequences, as opposed to tuples, which are rather like records. And of a way to ensure such homogeneity, in the sense of items being of the same type or super type.
The straightforward path to ensure that, as I see it, is to add a proper argument to a class definition. But I couldn't find a way to do this. In pseudo-code, it would look like that:

    class MonoList(list, item_type):
        typ = item_type
        def __init__(self,items):
            self._check_types(items)
            list.__init__(self,items)
        def _check_types(self,items):
            for item in items:
                if not isinstance(item,MonoList.typ):
                    message = "blah!"
                    raise TypeError(message)
        def __setitem__(self,index,item):
            if not isinstance(item,MonoList.typ):
                message = "blah!"
                raise TypeError(message)
            list.__setitem__(self,index,item)
        def __add__(self,other):
            self._check_types(other)
            return list.__add__(self,other)
        ...

Well, I realize now that it is a bit more complicated. MonoList itself should be an intermediate base class between list and subclasses that each allow only a single item type. Otherwise all monolist-s have the same item type ;-)

Just exploring around...

denis
--
la vida e estranya
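The "intermediate base class" idea from the last paragraph could be sketched as below. This is one possible reading, not the poster's final design: the subclass names are hypothetical, the checks go through `self.item_type` so that each subclass uses its own type, and methods such as append(), extend() and insert() are deliberately left unchecked to keep the sketch short.

```python
class MonoList(list):
    """Intermediate base class; concrete subclasses set item_type."""

    item_type = object  # permissive default, narrowed by subclasses

    def __init__(self, items=()):
        items = list(items)
        self._check_types(items)
        list.__init__(self, items)

    def _check_types(self, items):
        for item in items:
            # self.item_type resolves to the subclass attribute, so every
            # subclass enforces its own type rather than a shared one.
            if not isinstance(item, self.item_type):
                raise TypeError(
                    "expected %s, got %r" % (self.item_type.__name__, item)
                )

    def __setitem__(self, index, item):
        if isinstance(index, slice):
            item = list(item)
            self._check_types(item)
        else:
            self._check_types([item])
        list.__setitem__(self, index, item)

    def __add__(self, other):
        other = list(other)
        self._check_types(other)
        # Return the same subclass rather than a plain list.
        return type(self)(list.__add__(self, other))


class IntList(MonoList):
    item_type = int


class StrList(MonoList):
    item_type = str


ints = IntList([1, 2])
ints[0] = 5
more_ints = ints + [3]
strs = StrList(["a"]) + ["b"]
print(ints, more_ints, strs)
```

Writing the subclasses by hand like this is the explicit counterpart of generating them from a function; the type check itself is the same in both cases.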
Re: [Tutor] fetching wikipedia articles
On Fri, Jan 23, 2009 at 11:25 AM, Andre Engels wrote: > In the __init__ of the class URLopener, add the following at the end: > > self.addheaders = [header for header in self.addheaders if header[0] > != "User-Agent"] + [('User-Agent', '')] > > (probably > > self.addheaders = [('User-Agent', '')] > > does the same, but my version is more secure) Looking further I found that a 'cleaner' way to make the same change is to add to the code of URLopener (outside any method): version = '' -- André Engels, andreeng...@gmail.com ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] fetching wikipedia articles
On Fri, Jan 23, 2009 at 10:37 AM, amit sethi wrote:
> so is there a way around that problem ??

Ok, I have done some checking around, and it seems that the Wikipedia server is giving a return code of 403 (forbidden), but still giving the page - which I think is weird behaviour. I will check with the developers of Wikimedia why this is done, but for now you can resolve this by editing robotparser.py in the following way:

In the __init__ of the class URLopener, add the following at the end:

    self.addheaders = [header for header in self.addheaders if header[0] != "User-Agent"] + [('User-Agent', '')]

(probably

    self.addheaders = [('User-Agent', '')]

does the same, but my version is more secure)

--
André Engels, andreeng...@gmail.com
[Tutor] Customizing Eclipse text editor
Hello everyone.

I work with Eclipse+PyDev. I've managed to customize colors and fonts of the text editor, but I don't know how to change the appearance of the vertical zone that contains folding controls which is next to the line number zone, at the left of the editor window.

Here you can see a screenshot for what I mean (it is the white vertical zone close to the line number zone):
http://dl.getdropbox.com/u/155485/screenshot01.png

Can you help me?

As you can see there, I prefer a black background, so I've changed colors a little. My second question is: do you know any kind of recommended (I mean, optimized for a good working experience, good for eyes health, etc.) color palette or color combination with black background?

Thank you in advance.

--
Vicent
Re: [Tutor] fetching wikipedia articles
so is there a way around that problem ??

On Fri, Jan 23, 2009 at 2:25 PM, Andre Engels wrote:

> On Fri, Jan 23, 2009 at 9:09 AM, amit sethi wrote:
> > Well that is interesting but why should that happen in case I am using a
> > different User Agent because I tried doing
> > status=rp.can_fetch('Mozilla/5.0',
> > "http://en.wikipedia.org/wiki/Sachin_Tendulkar")
> > but even that returns false
> > Is there something wrong with the syntax , Is there a catch that i don't
> > understand.
>
> The problem is that you are using the standard Python user agent when
> getting the robots.txt. Because the user agent is refused, it cannot
> get the robots.txt file itself to look at.
>
> --
> André Engels, andreeng...@gmail.com

--
A-M-I-T S|S
Re: [Tutor] fetching wikipedia articles
On Fri, Jan 23, 2009 at 9:09 AM, amit sethi wrote:
> Well that is interesting but why should that happen in case I am using a
> different User Agent because I tried doing
> status=rp.can_fetch('Mozilla/5.0',
> "http://en.wikipedia.org/wiki/Sachin_Tendulkar")
> but even that returns false
> Is there something wrong with the syntax , Is there a catch that i don't
> understand.

The problem is that you are using the standard Python user agent when getting the robots.txt. Because the user agent is refused, it cannot get the robots.txt file itself to look at.

--
André Engels, andreeng...@gmail.com
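The distinction above, between the user agent used to download robots.txt and the one passed to can_fetch(), can be made explicit by fetching the file yourself and handing its text to the parser. The sketch below uses the Python 3 module names (urllib.request, urllib.robotparser) rather than the Python 2 ones discussed in the thread; the user agent string, the example.org URLs and the sample rules are all hypothetical.

```python
import urllib.request
import urllib.robotparser


def parser_from_text(robots_txt):
    """Build a RobotFileParser from robots.txt content already in memory."""
    rp = urllib.robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp


def fetch_robots(robots_url, user_agent):
    """Download robots.txt with an explicit User-Agent header, then parse it.

    Not called below, because it needs network access.
    """
    request = urllib.request.Request(robots_url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response:
        text = response.read().decode("utf-8", errors="replace")
    return parser_from_text(text)


# Hypothetical robots.txt content, for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /private/
"""

rp = parser_from_text(SAMPLE)
allowed = rp.can_fetch("my-crawler/0.1", "http://example.org/wiki/Some_Page")
blocked = rp.can_fetch("my-crawler/0.1", "http://example.org/private/notes")
print(allowed, blocked)
```

Separating the download from the parsing means the parser never has to fetch anything with the default user agent, which is the step that was being refused.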
Re: [Tutor] fetching wikipedia articles
Well that is interesting but why should that happen in case I am using a different User Agent because I tried doing
status=rp.can_fetch('Mozilla/5.0', "http://en.wikipedia.org/wiki/Sachin_Tendulkar")
but even that returns false
Is there something wrong with the syntax , Is there a catch that i don't understand.

On Thu, Jan 22, 2009 at 10:45 PM, Andre Engels wrote:

> On Thu, Jan 22, 2009 at 6:08 PM, amit sethi wrote:
> > hi , I need help as to how i can fetch a wikipedia article i tried changing
> > my user agent but it did not work . Although as far as my knowledge of
> > robots.txt goes , looking at en.wikipedia.org/robots.txt it does not seem it
> > should block a useragent (*, which is what i would normally use) from
> > accessing a simple article like say
> > "http://en.wikipedia.org/wiki/Sachin_Tendulkar" but still robotparser
> > returns false
> > status=rp.can_fetch("*", "http://en.wikipedia.org/wiki/Sachin_Tendulkar")
> > where rp is a robot parser object . why is that?
>
> Yes, Wikipedia is blocking the Python default user agent. This was
> done to block the main internal bot in its early days (it was
> misbehaving by getting each page twice); when it got to allowing the
> bot again, it had already changed to having its own user agent string,
> and apparently it was not deemed necessary to unblock the user
> string...
>
> --
> André Engels, andreeng...@gmail.com

--
A-M-I-T S|S