Hi,

I have a long text file that records which files from a database were downloaded and which ones failed. The pattern is shown at the end of this post. I wrote a tiny program, but it is still rough. I want to find the term "ERROR", go 5 lines above it, and get the file name with the suffix XPT. In the first case that is DRXIFF_F.XPT, but in other cases it is a different name with the same suffix.

Thanks,
Aldi
# reading errors from a file txt
import re

with open('nohup.out', 'r') as fh:
    lines = fh.readlines()
    for line in lines:
        m1 = re.search("XPT", line)
        m2 = re.search('ERROR', line)
        if m1:
            print(line)
        if m2:
            print(line)

--2018-07-14 21:26:45-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXIFF_F.XPT
Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-07-14 21:26:46 ERROR 404: Not Found.
--2018-07-14 21:26:46-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXTOT_F.XPT
Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-07-14 21:26:46 ERROR 404: Not Found.
--2018-07-14 21:26:46-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DRXFMT_F.XPT
Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-07-14 21:26:46 ERROR 404: Not Found.
--2018-07-14 21:26:46-- https://wwwn.cdc.gov/Nchs/Nhanes/2009-2010/DSQ1_F.XPT
Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-07-14 21:26:47 ERROR 404: Not Found.
--2018-07-14 21:26:47-- https://wwwn.cdc.gov/Nchs/Nhanes/1999-2000/DSII.XPT
Resolving wwwn.cdc.gov (wwwn.cdc.gov)... 198.246.102.39
Connecting to wwwn.cdc.gov (wwwn.cdc.gov)|198.246.102.39|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 56060880 (53M) [application/octet-stream]
Saving to: ‘DSII.XPT’
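For what it's worth, here is a rough sketch of the direction I think this could go, not a finished solution. Instead of counting a fixed number of lines back from "ERROR", it remembers the most recent request line (the one starting with "--" and ending in a .XPT URL) and reports that name when an ERROR line follows. The function name failed_downloads and the regular expression are my own guesses based only on the sample log above, so they may need adjusting for other log layouts.

```python
import re

# Assumed shape of a request line, taken from the sample log:
# "--2018-07-14 21:26:45-- https://.../DRXIFF_F.XPT"
# The group captures the file name after the last "/".
URL_RE = re.compile(r"^--.*--\s+\S*/(\S+\.XPT)\s*$")


def failed_downloads(lines):
    """Return the .XPT names whose request was followed by an ERROR line."""
    failed = []
    current = None  # most recent .XPT name seen on a request line
    for line in lines:
        m = URL_RE.match(line)
        if m:
            current = m.group(1)
        elif "ERROR" in line and current is not None:
            failed.append(current)
            current = None  # do not report the same request twice
    return failed
```

It could be called as failed_downloads(open('nohup.out')), since it only needs an iterable of lines. Tracking the last request line rather than a line offset means it should still work if the number of lines between the URL and the ERROR varies.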