Edward Kanja wrote:

> Hi there,
> Earlier i had sent an email on how to use re.sub function to eliminate
> square brackets. I have simplified the statements. Attached txt file named
> unon.Txt has the data im extracting from. The file named code.txt has the
> codes I'm using to extract the data. The regular expression works fine but
> my output has too many square brackets. How do i do away with them thanks.
The square brackets appear because re.findall() returns a list, and printing
a list shows its brackets. If you know that there is only one match, or if
you are only interested in the first match, you can extract it with

    first = re.findall(...)[0]

This will of course fail with an IndexError if there is no match at all, so
you have to check the length first. You can also use the length check to
skip the lines with no match at all, i.e. the lines appearing as

    [] [] []

in your script's output.

Now looking at your data: at least from the sample it seems to be rather
uniform. There are records separated by "---..." and fields separated by
"|". I'd forego regular expressions for that:

$ cat code.py
from itertools import groupby


def is_record_sep(line):
    # True for lines that are empty or consist only of dashes
    return not line.rstrip().strip("-")


with open("unon.txt") as instream:
    # groupby() yields alternating runs of separator and data lines
    for sep, group in groupby(instream, key=is_record_sep):
        if not sep:
            record = [
                [field.strip() for field in line.split("|")]
                for line in group
                if line.strip().strip("|-")  # drop lines without content
            ]
            # select fields by their position in the record
            names = record[0][1]
            station = record[0][2]
            index = record[1][1]
            print(index, names, station, sep=", ")

$ python3 code.py
11113648, Rawzeea NLKPP, VE11-Nairobi
10000007, Pattly MUNIIZ, TX00-Nairobi
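If you do want to stay with re.findall(), the length check mentioned above
could look roughly like the sketch below. The helper name first_match(), the
pattern r"\d{8}" and the sample lines are made up for illustration; they are
not taken from your code.txt or unon.txt, so adapt them to your data.

```python
import re


def first_match(pattern, line):
    """Return the first match of pattern in line, or None if there is none."""
    matches = re.findall(pattern, line)
    if len(matches) == 0:
        # no match: report that instead of indexing into an empty list
        return None
    return matches[0]


# Hypothetical sample lines, only loosely modelled on the "|"-separated data
lines = ["| 11113648 | some text |", "----------", "| no digits here |"]

for line in lines:
    index = first_match(r"\d{8}", line)
    if index is None:
        continue  # skip lines that would otherwise print as []
    print(index)
```

With these sample lines only 11113648 is printed, as a plain string without
brackets. Note that if your pattern contains more than one group,
re.findall() returns tuples rather than strings, so matches[0] would then
be a tuple.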