Shiva <shivaji...@yahoo.com.dmarc.invalid> Wrote in message:
> Hi,
>
> I am doing a regular expression search for a year through a file.
>

I think you are being confused in part by your choice of names.
Let's go through and describe the variable contents.

> fileextract = open(fullfilename,'r')
> line = fileextract.read()

'line' is a single string containing all the lines in the file.

> texts = re.search(r'1\d\d\d', line)
> print(texts.group())
>
> The above works.
>
> However if I do:
> fileextract = open(fullfilename,'r')
> line = fileextract.readlines()

Now, 'line' is a list, with each item of the list being a string.
The name is very misleading, and should be something like 'lines'.

>
> for l in line:
>     texts = re.search(r'1\d\d\d', line)

The second argument here is a list, not a string. You probably
meant to search the variable named 'l'. Of course, if you renamed
things, then you might have a loop of
    for line in lines:
and after that change, your search call would be correct again.

> print(texts.group())

Here you look only at the last result. You probably want this line
indented, so it's part of the loop.

>
>
> None is returned. Why is it not iterating through each line of the file and
> doing a search? - It seems to return none.
>

re.search only searches strings.

Other comments:

You neglected to close the files. Doesn't hurt here, but it's best
to get into good habits. Look up the with statement.

The readlines call was unnecessary, as you could have iterated
over the file object.
    for line in fileextract:

Your regexp won't match recent years. And it will match numbers
like 51420, 1994333, and so on. Perhaps you want to restrict
where in the line those four digits may be.

When asking questions, it is frequently useful to specify Python
version.

--
DaveA

--
https://mail.python.org/mailman/listinfo/python-list
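
For illustration, here is a rough sketch of how the comments above might fit together. It is not necessarily what the original poster needs. The pattern is an assumption: it uses \b word boundaries so the four digits must stand alone, and it accepts 1000 through 2099. The names find_years and years_in_file are made up for this example, and the code assumes Python 3.

```python
import re

# Assumed pattern: a standalone four-digit number from 1000 to 2099.
# The \b anchors keep it from matching inside longer numbers such as 51420.
YEAR = re.compile(r'\b(?:1\d|20)\d\d\b')


def find_years(lines):
    """Return the first year-like match from each line that has one."""
    found = []
    for line in lines:
        match = YEAR.search(line)   # search one string at a time
        if match:                   # search returns None when nothing matches
            found.append(match.group())
    return found


def years_in_file(path):
    """Open the file, scan it line by line, and close it afterwards."""
    with open(path, 'r') as fileextract:   # closed automatically on exit
        # A file object can be iterated directly; readlines() is not needed.
        return find_years(fileextract)
```

Calling years_in_file(fullfilename) would return a list of strings, one for each line that contains a match. Lines without a match are skipped, which avoids calling .group() on None.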