[ https://issues.apache.org/jira/browse/PIG-4689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975131#comment-14975131 ]
Daniel Dai commented on PIG-4689:
---------------------------------

Can you add your test case to TestCSVExcelStorage?

> CSV Writes incorrect header if two CSV files are created in one script
> ----------------------------------------------------------------------
>
>                 Key: PIG-4689
>                 URL: https://issues.apache.org/jira/browse/PIG-4689
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.14.0, 0.15.0
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>         Attachments: PIG-4689-2015-10-06.patch, PIG-4689-20151016.patch
>
>
> From a single Pig script I write two completely different and unrelated CSV
> files, both with the flag 'WRITE_OUTPUT_HEADER'.
> The bug is that both files get the SAME header at the top of the output file
> even though the data is different.
> *Reproduction:*
> {code:title=foo.txt}
> 1
> {code}
> {code:title=bar.txt (Tab separated)}
> 1 a
> {code}
> {code:title=WriteTwoCSV.pig}
> FOO =
>     LOAD 'foo.txt'
>     USING PigStorage('\t')
>     AS (a:chararray);
> BAR =
>     LOAD 'bar.txt'
>     USING PigStorage('\t')
>     AS (b:chararray, c:chararray);
> STORE FOO into 'Foo'
>     USING org.apache.pig.piggybank.storage.CSVExcelStorage('\t','NO_MULTILINE', 'UNIX', 'WRITE_OUTPUT_HEADER');
> STORE BAR into 'Bar'
>     USING org.apache.pig.piggybank.storage.CSVExcelStorage('\t','NO_MULTILINE', 'UNIX', 'WRITE_OUTPUT_HEADER');
> {code}
> *Command:*
> {quote}pig -x local WriteTwoCSV.pig{quote}
> *Result:*
> {quote}cat Bar/part-*{quote}
> {code}
> b c
> 1 a
> {code}
> {quote}cat Foo/part-*{quote}
> {code}
> b c
> 1
> {code}
> *The error is that the {{Foo}} output has the two-column header from the
> {{Bar}} output.*
> *One of the effects is that parsing the {{Foo}} data will probably fail due
> to the varying number of columns.*

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
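
The symptom in the report (a header whose column count differs from the data rows) can be checked mechanically. The following is a minimal sketch and is not part of the issue or the attached patches. The names `header_matches_rows`, `foo_output` and `bar_output` are hypothetical, and the sample strings assume the tab delimiter passed to CSVExcelStorage in the script. It is written in Python purely for illustration; the regression test requested in the comment would belong in TestCSVExcelStorage.

```python
import csv
import io


def header_matches_rows(text, delimiter="\t"):
    """Return True when every data row has as many fields as the header row."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return True  # empty output: nothing to compare
    header_width = len(rows[0])
    return all(len(row) == header_width for row in rows[1:])


# Contents of the part files as shown in the report (assumed tab-delimited).
bar_output = "b\tc\n1\ta\n"
foo_output = "b\tc\n1\n"

print(header_matches_rows(bar_output))  # True: two header fields, two data fields
print(header_matches_rows(foo_output))  # False: two header fields, one data field
```

A width check like this only catches the case described here. Two outputs with the same number of columns but different field names would still pass, so a real test would presumably compare the header text against each relation's schema.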