Hi Malte,

Typically, double quotes are treated as string enclosures, so they are not interpreted literally. Any data in a field after a double-quoted string is regarded as invalid trailing data.
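To make that rule concrete, here is a minimal sketch of the behaviour described above. It is not Flink's actual parser code: the class `QuotedFieldSketch` and the helper `parseField` are hypothetical names made up for illustration, and the sketch only assumes that a field starting with a double quote must end at the matching closing quote.

```java
public class QuotedFieldSketch {

    /**
     * Hypothetical helper (not part of Flink): parses a single field
     * following the rule described in this thread.
     */
    public static String parseField(String field) {
        // A field that does not start with a double quote is taken literally.
        if (field.isEmpty() || field.charAt(0) != '"') {
            return field;
        }
        // Otherwise the value runs up to the next double quote.
        int close = field.indexOf('"', 1);
        if (close < 0) {
            throw new IllegalArgumentException("Unterminated quoted field: " + field);
        }
        // Anything after the closing quote counts as invalid trailing data.
        if (close != field.length() - 1) {
            throw new IllegalArgumentException("Trailing data after quoted field: " + field);
        }
        return field.substring(1, close);
    }

    public static void main(String[] args) {
        System.out.println(parseField("ggg"));
        System.out.println(parseField("'hhh' xx"));
        try {
            parseField("\"hhh\" xx");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under these assumptions only a leading double quote triggers the enclosure handling, so a field that begins with any other character passes through unchanged.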
You could replace the double quotes with single quotes:

A|ggg
B|'hhh' xx
C|xxx

This results in the expected >'hhh' xx< for the second line.

Best regards,
Max

On Fri, Dec 5, 2014 at 4:44 PM, Malte Schwarzer <[email protected]> wrote:

> Hi Stephan,
>
> The result should be >"hhh" xx< as field value. Enclosures should be
> disabled but there seems to be no method to do that.
>
> Malte
>
> From: Stephan Ewen <[email protected]>
> Reply-To: <[email protected]>
> Date: Friday, December 5, 2014 16:28
> To: <[email protected]>
> Subject: Re: Quotes in fields of CsvInputFormat
>
> Hi!
>
> The parser interprets the quotes as quotes for the field. That means the
> second field (the string) stops after the "hhh" and the xx is considered
> invalid trailing data.
>
> What do you expect as the result of parsing that line?
>
> Stephan
>
> On Fri, Dec 5, 2014 at 4:16 PM, Malte Schwarzer <[email protected]> wrote:
>
>> Hi,
>>
>> I’m trying to import a CSV file but the parser seems to have problems with
>> quotes at the beginning of a field. Is there a way to set or disable
>> enclosures for the CSV input?
>>
>> This is my code:
>>
>> DataSet<Tuple2<String, String>> res = env.readCsvFile(inputCsvFilename)
>>     .fieldDelimiter('|')
>>     .types(String.class, String.class);
>>
>> CSV:
>>
>> A|ggg
>> B|"hhh" xx
>> C|xxx
>>
>> As a result I’m receiving a ParseException for line B:
>>
>> org.apache.flink.api.common.io.ParseException: Line could not be parsed:
>> 'B|"hhh" xx'
>>
>> Thanks,
>> Malte
