2017-03-12 17:22 GMT+01:00 <rahulra...@gmail.com>:
> Hi All,
>
> I have a string which looks like
>
> aaaaa,bbbbb,ccccc "4873898374", ddddd, eeeeee "3343,23,23,5,,5,45", fffff "5546,3434,345,34,34,5,34,543,7"
>
> It is a comma-separated string, but some of the fields have a double-quoted
> string as part of them (and that double-quoted string can contain commas).
> The above string has only 6 fields. The first is aaaaa, the second is bbbbb
> and the last is fffff "5546,3434,345,34,34,5,34,543,7".
> How can I split this string into its fields using a regular expression? Or
> if there is any other way to do this, please speak out.
>
> Thanks in advance
> --
> https://mail.python.org/mailman/listinfo/python-list

Hi,
would something like the following pattern fulfill the requirements?
(It doesn't handle possible empty fields, as mentioned in other posts; the
surrounding whitespace can be removed separately.)

>>> import re
>>> re.findall(r'(?:(?:"[^"]*"|[^,]))+(?=,|$)', 'aaaaa,bbbbb,ccccc "4873898374", ddddd, eeeeee "3343,23,23,5,,5,45", fffff "5546,3434,345,34,34,5,34,543,7"')
['aaaaa', 'bbbbb', 'ccccc "4873898374"', ' ddddd', ' eeeeee "3343,23,23,5,,5,45"', ' fffff "5546,3434,345,34,34,5,34,543,7"']
>>>
>>> for field in re.findall(r'(?:(?:"[^"]*"|[^,]))+(?=,|$)', 'aaaaa,bbbbb,ccccc "4873898374", ddddd, eeeeee "3343,23,23,5,,5,45", fffff "5546,3434,345,34,34,5,34,543,7"'): print(field.strip())
...
aaaaa
bbbbb
ccccc "4873898374"
ddddd
eeeeee "3343,23,23,5,,5,45"
fffff "5546,3434,345,34,34,5,34,543,7"
>>>

hth,
vbr
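
P.S. If empty fields do matter, a small character-by-character splitter might be easier to reason about than a regex. The following is only a minimal sketch, not a tested solution for the real data: the name split_fields is made up for illustration, and it assumes the double quotes are always balanced and never escaped. It splits on commas that lie outside double quotes, strips surrounding whitespace, and keeps empty fields.

```python
def split_fields(line):
    """Split on commas outside double quotes; keep empty fields.

    Hypothetical helper: assumes balanced, unescaped double quotes.
    """
    fields = []
    current = []
    in_quotes = False
    for ch in line:
        if ch == '"':
            # Toggle the quoted state; the quote stays part of the field.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == ',' and not in_quotes:
            # A comma outside quotes ends the current field.
            fields.append(''.join(current).strip())
            current = []
        else:
            current.append(ch)
    # The last field has no trailing comma.
    fields.append(''.join(current).strip())
    return fields


print(split_fields('ccccc "4873898374", ddddd, eeeeee "3343,23,23,5,,5,45"'))
# ['ccccc "4873898374"', 'ddddd', 'eeeeee "3343,23,23,5,,5,45"']
print(split_fields('a,,b "1,2",'))
# ['a', '', 'b "1,2"', '']
```

Unlike the findall pattern above, this returns an empty string for each empty field, so the number of fields stays predictable.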