Hi Uma,
In your case, I'd look at the file as a sequence of "tokens" and treat this as a tokenization problem. I think we'll see some kind of _identifier_, followed by _whitespace_, followed by a _string_. These three tokens will repeat until we hit the end of the file.

More formally, I'd try to describe the file's structure in a grammar:

    ## This is not Python, but just a way for me to formally express
    ## what I think your file format is:

    file := (IDENTIFIER WHITESPACE STRING)* END_OF_FILE

The star there is meant to symbolize the "repeatedly" part. Note that we haven't yet said what IDENTIFIER, WHITESPACE, or STRING means: I'm just making sure we've got the top-level understanding of the file.

If this is true, then we might imagine a function tokenize() that takes the file and breaks it down into a sequence of these tokens, and then the job of pulling out the content you care about should be easier. We can loop over the token sequence, watch for an identifier "x-1", skip over the whitespace token, and then take the string we care about, till we hit the "x-2" identifier and stop.

tokenize() should not be too bad to write: we walk the file, and recognize certain patterns as either IDENTIFIER, WHITESPACE, or STRING.

    IDENTIFIER looks like a bunch of non-whitespace characters.
    WHITESPACE looks like a bunch of whitespace characters.
    STRING looks like a quote, followed by a bunch of non-quote characters, followed by a quote.

The description above is very handwavy. You can write those descriptions out more formally with regular expressions. Regular expressions are a mini-language for writing out string patterns and extracting content from strings. See:

    https://docs.python.org/2/howto/regex.html

Once we can formally describe the patterns above, we can walk the characters in the file. At each position, we pick out which of the three patterns matches what we're currently seeing, and then add the matched text to the list of tokens.
Eventually, we hit end of file, and tokenize() can return all the tokens that it has accumulated.
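
To make that a bit more concrete, here is a rough sketch of what tokenize() and the loop over the tokens might look like. I haven't seen your actual file, so the three regular expressions, the helper name strings_between(), and the sample text are all guesses of mine that you'd need to adjust. Treat it as one possible starting point, not the one right way to do it.

```python
import re

# Guessed patterns for the three kinds of tokens. Order matters: STRING is
# tried before IDENTIFIER, because a quote is also a non-whitespace character.
TOKEN_PATTERNS = [
    ("WHITESPACE", re.compile(r"\s+")),
    ("STRING", re.compile(r'"[^"]*"')),
    ("IDENTIFIER", re.compile(r"\S+")),
]


def tokenize(text):
    """Break text into a list of (kind, value) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        for kind, pattern in TOKEN_PATTERNS:
            # match() only succeeds if the pattern starts exactly at pos.
            match = pattern.match(text, pos)
            if match:
                tokens.append((kind, match.group()))
                pos = match.end()
                break
        else:
            # Not reachable with the three patterns above, but it guards
            # against looping forever if the patterns are changed.
            raise ValueError("can't tokenize at position %d" % pos)
    return tokens


def strings_between(tokens, start_id, stop_id):
    """Collect the STRING values seen after start_id and before stop_id."""
    collecting = False
    found = []
    for kind, value in tokens:
        if kind == "IDENTIFIER" and value == start_id:
            collecting = True
        elif collecting and kind == "IDENTIFIER" and value == stop_id:
            break
        elif collecting and kind == "STRING":
            found.append(value[1:-1])  # drop the surrounding quotes
    return found


# Made-up sample text standing in for the real file contents.
sample = 'x-1 "hello world"\nx-2 "goodbye"\n'
print(strings_between(tokenize(sample), "x-1", "x-2"))
```

With the made-up sample above, this should print ['hello world']. For a real file you would pass in the result of reading the file instead of the sample string. Note that this STRING pattern does not understand escaped quotes inside a string; if your file has those, the pattern needs to be more careful.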