On 2015-04-29 14:22, Emile van Sebille wrote:
> On 4/29/2015 1:49 PM, Kashif Rana wrote:
> > pol_elements =
> > re.compile('id\s(?P<p_id>.+?)(?:\sname\s(?P<p_name>.+?))?\sfrom\s(?P<p_from>.+?)\sto\s(?P<p_to>.+?)\s{2}(?P<p_src>[^\s]+?)\s(?P<p_dst>[^\s]+?)\s(?P<p_port>[^\s]+?)(?:(?P<p_nat_status>\snat)\s(?P<p_nat_type>[^\s]+?)(?P<p_nat_ip>\sdip-id\s[^\s]+?)?)?\s(?P<p_action>[^\s]+?)(?:\sschedule\s(?P<p_schedule>[^\s]+?))?(?P<p_log_status>\slog)?$'
> > )
>
> ... and that's why we avoid regular expressions... it makes my head
> hurt just looking at that line noise.

First, it appears the OP isn't using raw strings, so those backslashes
are just asking for trouble. That said, it would be a lot better if
the OP made use of re.VERBOSE to put each component on its own line:

    pol_elements = re.compile(r"""
        id \s (?P<p_id>.+?)
        (?: \s name \s (?P<p_name>.+?) )?
        \s from \s (?P<p_from>.+?)
        \s to \s (?P<p_to>.+?)
        \s{2} (?P<p_src>[^\s]+?)
        \s (?P<p_dst>[^\s]+?)
        \s (?P<p_port>[^\s]+?)
        (?:
            \s (?P<p_nat_status>nat)
            \s (?P<p_nat_type>\w+)
            ( \s (?P<p_nat_src_ip>dip-id \s \d+) )?
            ( \s ip \s (?P<p_nat_dst_ip>[\d\.]+) \s port \s (?P<dst_nat_port>\d+) )?
        )?
        \s (?P<p_action>[^\s]+?)
        (?: \s schedule \s (?P<p_schedule>[^\s]+?) )?
        (?P<p_log_status>\slog)?
        $
        """, re.VERBOSE)

which, with some copious comments in the expression, would make it
almost readable. Alternatively, switch to an actual parser like
pyparsing.

-tkc

--
https://mail.python.org/mailman/listinfo/python-list
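
P.S. For what "copious comments in the expression" might look like, here is a cut-down sketch. It covers only a subset of the fields (no NAT or schedule parts), tightens a few groups to \S+ and \d+ for brevity, and the sample lines and the parse_policy helper are made up for illustration, not taken from the OP's data. Treat it as a starting point rather than a drop-in replacement.

```python
import re

# Hypothetical, simplified subset of the policy pattern.
# With re.VERBOSE, unescaped whitespace is ignored and "#" starts a comment.
POLICY = re.compile(r"""
    id \s (?P<p_id>\d+)                   # numeric policy id
    (?: \s name \s (?P<p_name>\S+) )?     # optional policy name
    \s from \s (?P<p_from>\S+)            # source zone
    \s to \s (?P<p_to>\S+)                # destination zone
    \s{2}                                 # two spaces precede the addresses
    (?P<p_src>\S+) \s (?P<p_dst>\S+)      # source and destination address
    \s (?P<p_port>\S+)                    # service
    \s (?P<p_action>\S+)                  # action, e.g. permit or deny
    (?P<p_log_status> \s log )?           # optional trailing log keyword
    $
    """, re.VERBOSE)


def parse_policy(line):
    """Return a dict of named groups, or None if the line does not match."""
    match = POLICY.search(line)
    return match.groupdict() if match else None


# Invented sample lines, only meant to exercise the pattern.
sample = "set policy id 1 from Trust to Untrust  Any Any HTTP permit log"
named = "set policy id 2 name Web from Trust to DMZ  Any Server1 HTTP deny"

fields = parse_policy(sample)
named_fields = parse_policy(named)
```

Groups that do not take part in the match (p_name in the first sample, p_log_status in the second) come back as None from groupdict(), so the caller can tell "absent" from "empty".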