On Sep 8, 2:21 pm, "Mark Tolonen" <metolone+gm...@gmail.com> wrote:
> "Martin" <mdeka...@gmail.com> wrote in message
> news:5941d8f1-27c0-47d9-8221-d21f07200...@j39g2000yqh.googlegroups.com...
>
> > Hi,
> >
> > I need to extract a string after a matching a regular expression. For
> > example I have the string...
> >
> > s = "FTPHOST: e4ftl01u.ecs.nasa.gov"
> >
> > and once I match "FTPHOST" I would like to extract
> > "e4ftl01u.ecs.nasa.gov". I am not sure as to the best approach to the
> > problem, I had been trying to match the string using something like
> > this:
> >
> > m = re.findall(r"FTPHOST", s)
> >
> > But I couldn't then work out how to return the "e4ftl01u.ecs.nasa.gov"
> > part. Perhaps I need to find the string and then split it? I had some
> > help with a similar problem, but now I don't seem to be able to
> > transfer that to this problem!
>
> In regular expressions, you match the entire string you are interested in,
> and parenthesize the parts that you want to parse out of that string. The
> group() method is used to get the whole string with group(0), and each of
> the parenthesized parts with group(n). An example:
>
> >>> s = "FTPHOST: e4ftl01u.ecs.nasa.gov"
> >>> import re
> >>> re.search(r'FTPHOST: (.*)',s).group(0)
> 'FTPHOST: e4ftl01u.ecs.nasa.gov'
> >>> re.search(r'FTPHOST: (.*)',s).group(1)
> 'e4ftl01u.ecs.nasa.gov'
>
> -Mark

I see what you mean regarding the groups. However, my string is nested
among others, e.g.

'MEDIATYPE: FtpPull\r\n', 'MEDIAFORMAT: FILEFORMAT\r\n',
'FTPHOST: e4ftl01u.ecs.nasa.gov\r\n', 'FTPDIR: /PullDir/0301872638CySfQB\r\n',
'Ftp Pull Download Links: \r\n',

so I get the information that follows as well. Is the only option then to
parse the new string? I am trying to build something fairly robust, so I am
not sure that simply taking everything before the \r is the best solution.

Thanks
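
For illustration, here is a minimal sketch of one way to avoid picking up the trailing text. It assumes the data is a list of separate lines like the ones shown above (if it is really one long string, the approach would need adjusting). The names FTPHOST_RE and find_ftp_host are made up for this example. The idea is to test each line on its own and capture only the run of non-whitespace characters after "FTPHOST:", so the \r\n is never part of the group.

```python
import re

# Assumed input: the lines quoted in the message, as a list of strings.
lines = [
    "MEDIATYPE: FtpPull\r\n",
    "MEDIAFORMAT: FILEFORMAT\r\n",
    "FTPHOST: e4ftl01u.ecs.nasa.gov\r\n",
    "FTPDIR: /PullDir/0301872638CySfQB\r\n",
    "Ftp Pull Download Links: \r\n",
]

# \S+ matches non-whitespace only, so it stops before "\r\n".
FTPHOST_RE = re.compile(r"FTPHOST:\s*(\S+)")


def find_ftp_host(lines):
    """Return the host from the first FTPHOST line, or None if absent."""
    for line in lines:
        m = FTPHOST_RE.match(line)  # match() anchors at the start of the line
        if m:
            return m.group(1)
    return None


host = find_ftp_host(lines)
print(host)
```

Returning None when no line matches leaves it to the caller to decide how to handle a message without an FTPHOST entry.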