Greetings Ben,

: I'm searching line by line for certain tags and then printing the
: tag followed by the word immediately following the tag.

What you are describing is an awful lot like 'grep'.  But, of
course, many different sorts of file searching resemble grep.

: So for example, suppose I had the following line of text in a file:
:
: "this is a key test123 noise noise noise noise noise"
:
: In this example, I would want to print "key test123" to a new
: file. The rest of the words I would not want.
:
: Here is my code so far:
:
: def test(infile, outfile):
:     for line in infile:
:         tagIndex = line.find("key")
:         start = tagIndex + 4
:         stop = line[start:].find("\t") -1
:         if tagIndex != -1:
:             print("start is: ", start)
:             print("stop is: ", stop)
:             print("spliced word is ", line[start: stop])

Your problem is that you are calculating the value for 'stop' from a
subset of the 'line' string (and then subtracting 1), though you want
to be adding the value of 'start'.  Replace the line above that
assigns to 'stop' with the following:

    stop = line[start:].find("\t") + start

: My question is the following: What is wrong w/ the variable
: 'stop'? The index it gives me when I print out 'stop' is not even
: close to the right number. Furthermore, when I try to print out
: just the word following the tag w/ the form: line[start: stop],
: it prints nothing (it seems b/c my stop variable is incorrect).

Now, think about why this is happening....

You are calculating 'stop' based on a substring of 'line'.  You use
the 'start' offset to create a substring, in which you then search
for a tab.  Then, you subtract 1 and try to use that to mean
something in the original string 'line'.  The index that find()
returns is relative to the substring, so it only means something in
'line' once you add 'start' back to it.

Also, you are slicing incorrectly (well, that's just the issue of
subtracting 1 when you shouldn't be), a not uncommon slicing problem
(see this post for more detail [0]).  The end index of a slice is
already excluded from the result, so there is nothing to subtract.

Finally, I have to wonder why you are doing so much of the work
yourself, when ....

: I would greatly appreciate any help you have.
: This is a much simplified example from the script I'm actually
: writing, but I need to figure out a way to eliminate the noise
: after the key and the word immediately following it are found.

I realize that your question was not like the above, but in your
example, it seems that you don't know about the 'csv' module.  It's
convenient, simple, easy to use and quite robust.  This should help
you.  I don't know much about your data format, nor why you are
searching, but let's assume that you are searching where you wish to
match 'key' as the contents of an entire field.  If that's the case,
then:

    import sys
    import csv

    def test(infile, outfile, sought):
        tsv = csv.reader(infile, delimiter='\t')
        for row in tsv:
            if sought in row:
                outfile.write('\t'.join(row) + '\n')

Now, how would you call this function?

    if __name__ == '__main__':
        test(sys.stdin, sys.stdout, sys.argv[1])

And, suppose you were at a command line, how would you call that?

    python tabbed-reader.py < "$MYFILE" 'key'

OK, so the above function called 'test' is probably not quite what
you had wanted, but you should be able to adapt it pretty readily.

Good luck,

-Martin

 [0] http://mail.python.org/pipermail/tutor/2010-December/080592.html

-- 
Martin A. Brown
http://linux-ip.net/
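
P.S.  In case it helps to see the offset fix in one place, here is a
minimal sketch.  It assumes the fields really are tab-separated, as
the find("\t") call in your code suggests.  The function name
'word_after_tag' and its parameters are made up for illustration and
are not from your script.  It passes 'start' as the second argument to
find(), which searches the original string from that position and so
returns an index that needs no adjusting.

```python
def word_after_tag(line, tag="key", sep="\t"):
    """Return (tag, following_field), or None if the tag is not found.

    Hypothetical helper: assumes fields are separated by `sep`.
    """
    tag_index = line.find(tag + sep)
    if tag_index == -1:
        return None

    # First character after the tag and its separator.
    start = tag_index + len(tag) + len(sep)

    # Searching from `start` in the ORIGINAL string gives an absolute
    # index.  This equals line[start:].find(sep) + start when a
    # separator is found.
    stop = line.find(sep, start)
    if stop == -1:
        # No further separator: the field runs to the end of the line.
        stop = len(line.rstrip("\n"))

    # The end index of a slice is exclusive, so no "- 1" is needed.
    return tag, line[start:stop]
```

Note that this is a plain substring search, so a field such as
"monkey" would also match the tag "key".  The csv approach above
compares whole fields and does not have that problem.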