On Thursday, 12 December 2019 04:55:46 UTC+8, Joel Goldstick wrote:
> On Wed, Dec 11, 2019 at 1:31 PM Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
> >
> > A S <aishan0...@gmail.com> writes:
> >
> > > I would like to extract all words within specific keywords in a .txt
> > > file. For the keywords, there is a starting keyword of "PROC SQL;" (I
> > > need this to be case insensitive) and the ending keyword could be
> > > either "RUN;", "quit;" or "QUIT;". This is my sample .txt file.
> > >
> > > Thus far, this is my code:
> > >
> > > with open('lan sample text file1.txt') as file:
> > >     text = file.read()
> > > regex = re.compile(r'(PROC SQL;|proc sql;(.*?)RUN;|quit;|QUIT;)')
> > > k = regex.findall(text)
> > > print(k)
> >
> > Try
> >
> > re.compile(r'(?si)(PROC SQL;.*(?:QUIT|RUN);)')
> >
> > Read up on what (?si) means and what (?:...) means. You can do the
> > same by passing flags to the compile method.
> >
> > > Output:
> > >
> > > [('quit;', ''), ('quit;', ''), ('PROC SQL;', '')]
> >
> > Your main issue is that | binds weakly. Your whole pattern tries to
> > match any one of just four short sub-patterns:
> >
> > PROC SQL;
> > proc sql;(.*?)RUN;
> > quit;
> > QUIT;
> >
> > --
> > Ben.
> > --
> > https://mail.python.org/mailman/listinfo/python-list
>
> Consider using Python string functions.
>
> 1. Read your string; let's call it s.
> 2. start = s.find("PROC SQL;")
>    This will find the starting index point. It returns an index.
> 3. Do the same for each of the three possible ending strings. Use if/else.
> 4. This will give you your ending index.
> 5. Slice the included string, taking into account that the start is
>    start + len("PROC SQL;") and the end is the ending index, since find
>    returns the position where the ending string begins.
>
> Regular expressions are powerful, but not so easy to read unless you
> are really into them.
> --
> Joel Goldstick
> http://joelgoldstick.com/blog
> http://cc-baseballstats.info/stats/birthdays

Hey Joel, I'm not too sure I get the idea of your code implementation.
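
For what it's worth, the string-function steps Joel lists could be sketched roughly like this. This is only one reading of his suggestion, not his actual code: the function name extract_blocks and the sample string are made up for illustration, and it assumes the text is plain ASCII so that upper-casing a copy for the case-insensitive search does not shift any indexes.

```python
def extract_blocks(s):
    """Return the text between each "PROC SQL;" and the nearest "RUN;" or "QUIT;".

    Hypothetical helper: the name and the return format are assumptions.
    """
    start_kw = "PROC SQL;"
    end_kws = ("RUN;", "QUIT;")
    # Search an upper-cased copy so "proc sql;" and "quit;" also match.
    # Assumes ASCII text, where upper() never changes the string length.
    upper = s.upper()
    blocks = []
    pos = 0
    while True:
        start = upper.find(start_kw, pos)  # step 2: find the starting index
        if start == -1:
            break
        body_start = start + len(start_kw)
        # Steps 3 and 4: look for every possible ending string, keep the earliest.
        ends = []
        for kw in end_kws:
            i = upper.find(kw, body_start)
            if i != -1:
                ends.append((i, len(kw)))
        if not ends:
            break  # unterminated block: stop rather than guess
        end, kw_len = min(ends)
        # Step 5: slice from just after the start keyword up to the end keyword.
        blocks.append(s[body_start:end].strip())
        pos = end + kw_len
    return blocks


sample = "proc sql;\nselect * from a;\nquit;\nother\nPROC SQL; select 1; RUN;"
print(extract_blocks(sample))
```

Because str.find returns the index where the match begins, the slice can end at that index directly; the length of the ending keyword is only needed to know where to resume searching for the next block.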