Re: [Tutor] My problem in simple terms

Peter Otten Mon, 04 Mar 2019 06:09:50 -0800

Edward Kanja wrote:

> Hi there ,
> Earlier i had sent an email on how to use re.sub function to eliminate
> square brackets. I have simplified the statements. Attached txt file named
> unon.Txt has the data im extracting from. The file named code.txt has the
> codes I'm using to extract the data.The regular expression works fine but
> my output has too many square brackets. How do i do away with them thanks.


The square brackets appear because re.findall() returns a list. If you know 
that there is only one match or if you are only interested in the first 
match you can extract it with

first = re.findall(...)[0]

This will of course fail with an IndexError if there is no match at all, so 
you have to check the length first. You can also use the length check to 
skip the lines with no match at all, i.e. the lines appearing as

[] [] []

in your script's output.
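
A minimal sketch of that length check might look like the following. The 
helper name first_match() and the sample lines are made up for illustration; 
they are not taken from your code.txt or unon.txt.

```python
import re

def first_match(pattern, text, default=""):
    """Return the first match of pattern in text, or default if there is none."""
    matches = re.findall(pattern, text)
    if not matches:  # empty list: nothing matched in this text
        return default
    return matches[0]

# Hypothetical sample lines, only for demonstration
lines = ["| Index: 12345 |", "-----------", "| Index: 67890 |"]
for line in lines:
    number = first_match(r"\d+", line)
    if number:  # skip lines without a match instead of printing []
        print(number)
```

Because the helper hands back a plain string instead of a list, print() 
no longer shows any square brackets.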

Now looking at your data -- at least from the sample it seems to be rather 
uniform. There are records separated by "---..." and fields separated by 
"|". I'd forego regular expressions for that:

$ cat code.py
from itertools import groupby

def is_record_sep(line):
    return not line.rstrip().strip("-")

with open("unon.txt") as instream:
    for sep, group in groupby(instream, key=is_record_sep):
        if not sep:
            record = [
                [field.strip() for field in line.split("|")]
                for line in group if line.strip().strip("|-")
            ]
            # select fields by their position in the record
            names = record[0][1]
            station = record[0][2]
            index = record[1][1]
            print(index, names, station, sep=", ")
$ python3 code.py 
11113648, Rawzeea NLKPP, VE11-Nairobi
10000007, Pattly MUNIIZ, TX00-Nairobi
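
To see why groupby() splits the file into records, here is a small sketch 
that runs on an in-memory list instead of a file. The sample lines are 
invented and only imitate the layout I am assuming (dashed separator lines, 
"|"-separated fields); parse_records() is a hypothetical helper name. 
groupby() bundles consecutive lines that share the same key, so every run of 
non-separator lines comes out as one record.

```python
from itertools import groupby

# Made-up lines imitating the assumed layout; not the real contents of unon.txt
sample = [
    "-----------------\n",
    "| Jane DOE | AB00-Somewhere |\n",
    "| 12345678 | P-3 |\n",
    "-----------------\n",
    "| John ROE | CD11-Elsewhere |\n",
    "| 87654321 | G-5 |\n",
    "-----------------\n",
]

def is_record_sep(line):
    # True for lines that are empty or consist only of dashes
    return not line.rstrip().strip("-")

def parse_records(lines):
    """Yield one list of rows per record; each row is a list of stripped fields."""
    for sep, group in groupby(lines, key=is_record_sep):
        if not sep:
            yield [[field.strip() for field in line.split("|")] for line in group]

records = list(parse_records(sample))
for record in records:
    # Index 0 of every row is the empty string before the leading "|"
    print(record[1][1], record[0][1], record[0][2], sep=", ")
```

Splitting on "|" leaves an empty string before the first and after the last 
bar, which is why the interesting fields start at position 1.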


_______________________________________________
Tutor maillist  -  [email protected]
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
