[ https://issues.apache.org/jira/browse/FLINK-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097480#comment-14097480 ]
ASF GitHub Bot commented on FLINK-1002:
---------------------------------------
Github user StephanEwen commented on a diff in the pull request:
https://github.com/apache/incubator-flink/pull/65#discussion_r16260290
--- Diff: stratosphere-java/src/test/java/eu/stratosphere/api/java/io/CsvInputFormatTest.java ---
@@ -423,5 +423,42 @@ private void testRemovingTrailingCR(String lineBreakerInFile, String lineBreaker
             fail("Test erroneous");
         }
     }
+
+    @Test
+    public void testSkipHeader() throws IOException {
+        try {
+            final String fileContent = "HEAD\nHEAD\n111|222|333|444|555|\n666|777|888|999|000|\n";
+            final FileInputSplit split = createTempFile(fileContent);
--- End diff --
If you use a test that creates 8 splits of size 6, then it will fail,
because some header lines are in the second split.
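
The concern above can be illustrated with a small, self-contained model. This is only a sketch and not the actual CsvInputFormat code: the class SplitHeaderSketch and its methods readSplit and readAll are hypothetical names, and the model assumes that a line belongs to the split in which it begins and that only the split at offset 0 skips header lines. The 4-byte split size is picked for this simplified model so that the second header line begins in the second split; it is not the configuration mentioned in the comment.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of split-based line reading; not Flink's implementation.
public class SplitHeaderSketch {

    /**
     * Returns the lines that begin inside [start, start + length).
     * Assumption: only the split at offset 0 drops its first `skip` lines.
     */
    static List<String> readSplit(byte[] data, int start, int length, int skip) {
        int end = Math.min(start + length, data.length);
        int pos = start;
        if (start > 0) {
            // Skip the remainder of a line that began in an earlier split.
            pos = start - 1;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++;
        }
        int toSkip = (start == 0) ? skip : 0;
        List<String> lines = new ArrayList<>();
        while (pos < end) {
            // Read to the end of the line, even if it extends past this split.
            int lineEnd = pos;
            while (lineEnd < data.length && data[lineEnd] != '\n') {
                lineEnd++;
            }
            String line = new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8);
            if (toSkip > 0) {
                toSkip--;
            } else {
                lines.add(line);
            }
            pos = lineEnd + 1;
        }
        return lines;
    }

    /** Reads the whole input as consecutive splits of splitSize bytes. */
    static List<String> readAll(String content, int splitSize, int skip) {
        byte[] data = content.getBytes(StandardCharsets.UTF_8);
        List<String> all = new ArrayList<>();
        for (int start = 0; start < data.length; start += splitSize) {
            all.addAll(readSplit(data, start, splitSize, skip));
        }
        return all;
    }

    public static void main(String[] args) {
        String content = "HEAD\nHEAD\n111|222|333|444|555|\n666|777|888|999|000|\n";
        // One split covering the whole file: both header lines are skipped.
        System.out.println(readAll(content, content.length(), 2));
        // 4-byte splits: the second header begins in the second split and leaks through.
        System.out.println(readAll(content, 4, 2));
    }
}
```

In this model the single-split read returns only the two data lines, while the 4-byte-split read also returns one "HEAD" line, which is the kind of failure a multi-split test would expose.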
> Add CSVReader support to ignore multiple lines at the beginning of a document
> -----------------------------------------------------------------------------
>
> Key: FLINK-1002
> URL: https://issues.apache.org/jira/browse/FLINK-1002
> Project: Flink
> Issue Type: Improvement
> Reporter: Bastian Köcher
> Assignee: Markus Holzemer
>
> At the moment the CSVReader only supports skipping the first line, but I
> have, for example, a format in which multiple lines at the beginning need to
> be skipped. A function to skip multiple lines at the beginning would be very
> useful.