[
https://issues.apache.org/jira/browse/FLINK-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222149#comment-14222149
]
ASF GitHub Bot commented on FLINK-1208:
---------------------------------------
Github user StephanEwen commented on a diff in the pull request:
https://github.com/apache/incubator-flink/pull/201#discussion_r20759430
--- Diff: flink-java/src/main/java/org/apache/flink/api/java/io/CsvInputFormat.java ---
@@ -130,6 +216,21 @@ public OUT readRecord(OUT reuse, byte[] bytes, int offset, int numBytes) {
             numBytes--;
         }
+        if (commentPrefix != null && commentPrefix.length <= numBytes) {
+            // check record for comments
+            Boolean isComment = true;
+            for (int i = 0; i < commentPrefix.length; i++) {
+                if (commentPrefix[i] != bytes[offset + i]) {
+                    isComment = false;
+                    break;
+                }
+            }
+            if (isComment) {
+                this.commentCount++;
+                return nextRecord(reuse);
--- End diff --
This call is recursive: every skipped comment line adds another stack frame.
For files with many consecutive comment lines, this will result in a stack
overflow.
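One way to avoid the recursion is to skip comment lines in a loop, so the stack depth stays constant no matter how many comments follow each other. The following is only a minimal, self-contained sketch of that idea. CommentSkippingReader, startsWithPrefix and the list-backed input are hypothetical stand-ins for illustration and are not the actual CsvInputFormat code. Only commentPrefix, commentCount and the prefix comparison mirror the diff above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a line-based record reader; NOT Flink's CsvInputFormat.
class CommentSkippingReader {
    private final Iterator<byte[]> lines;
    private final byte[] commentPrefix; // null disables comment handling
    private long commentCount = 0;

    CommentSkippingReader(List<byte[]> lines, byte[] commentPrefix) {
        this.lines = lines.iterator();
        this.commentPrefix = commentPrefix;
    }

    // Same byte-wise prefix comparison as in the diff, extracted into a helper.
    static boolean startsWithPrefix(byte[] bytes, int offset, int numBytes, byte[] prefix) {
        if (prefix == null || prefix.length > numBytes) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (prefix[i] != bytes[offset + i]) {
                return false;
            }
        }
        return true;
    }

    // Returns the next non-comment line, or null at end of input.
    // Comment lines are skipped iteratively, so no stack frames accumulate.
    String nextRecord() {
        while (lines.hasNext()) {
            byte[] line = lines.next();
            if (startsWithPrefix(line, 0, line.length, commentPrefix)) {
                commentCount++;
                continue; // skip this line and look at the next one
            }
            return new String(line, StandardCharsets.UTF_8);
        }
        return null;
    }

    long getCommentCount() {
        return commentCount;
    }
}
```

In the real input format the equivalent change would presumably have readRecord signal "skipped" to its caller (for example by returning null) and let the caller loop. That part is an assumption about how the surrounding code could be restructured, not something shown in the diff.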
> Skip comment lines in CSV input format. Allow user to specify comment
> character.
> --------------------------------------------------------------------------------
>
> Key: FLINK-1208
> URL: https://issues.apache.org/jira/browse/FLINK-1208
> Project: Flink
> Issue Type: Improvement
> Components: Java API, Scala API
> Affects Versions: 0.8-incubating
> Reporter: Aljoscha Krettek
> Assignee: Felix Neutatz
> Priority: Minor
> Labels: starter
>
> The current skipFirstLine is limited. Skipping arbitrary lines that start
> with a certain character would be much more flexible while still easy to
> implement.