Hello, I'm currently encountering the following problem.
I have an XML file that gets loaded using a custom LoadFunc. Boiled down, my XML file could look like:

    <files>
      <file>
        <id> 1 </id>
        <text> This is a sample text that contains newlines,
    which should be preserved when parsing. </text>
      </file>
      <file> ... </file>
      <file> ... </file>
      ...
    </files>

So the text does contain a newline (whether it is \r\n or \n does not matter). When parsing the XML, I read the contents of <text/> into a string and add it to the list that the LoadFunc returns.

The problem is that the newlines are lost whenever I dump, store, or use the intermediate result in another UDF, e.g. with

    raw = LOAD 'data/files.xml' using org.my.MyCustomXMLLoader() AS (id:int, text:chararray);
    dump raw;

or

    raw = LOAD 'data/files.xml' using org.my.MyCustomXMLLoader() AS (id:int, text:chararray);
    clean = FOREACH raw GENERATE id, org.my.MyCleaner(text) as clean_text;

The newlines are completely stripped away:

    1 This is a sample text that contains newlines,which should be preserved when parsing.

In the latter example, this causes MyCleaner() to fail.

How can I preserve the newline in Pig?

Best,
Will