Re: [PR] NIFI-8932: Add capability to skip first N rows in CSVReader [nifi]

via GitHub Tue, 05 Dec 2023 11:47:34 -0800


dan-s1 commented on code in PR #7952:
URL: https://github.com/apache/nifi/pull/7952#discussion_r1416200289



##########
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/csv/AbstractCSVRecordReader.java:
##########
@@ -158,4 +180,46 @@ protected String trim(String value) {
     public RecordSchema getSchema() {
         return schema;
     }
+
+    /**
+     * This method searches using the specified Reader character-by-character until the
+     * record separator is found.
+     * @param reader the Reader providing the input
+     * @param recordSeparator the String specifying the end of a record in the input
+     * @throws IOException if an error occurs during reading, including not finding the record separator in the input
+     */
+    protected void readNextRecord(Reader reader, String recordSeparator) throws IOException {
+        int indexIntoSeparator = 0;
+        int recordSeparatorLength = recordSeparator.length();
+        int code = reader.read();
+        while (code != -1) {
+            char nextChar = (char)code;
+            if (recordSeparator.charAt(indexIntoSeparator) == nextChar) {
+                if (++indexIntoSeparator == recordSeparatorLength) {
+                    // We have matched the separator, return the string built so far
+                    return;
+                }

Review Comment:
   @mattyb149 @exceptionfactory One concern I have with this logic: what
happens when the record separator is escaped in the data? How will you tell
an end of record from an escaped record separator? A quick search online
shows that a newline character, which is usually the end of a CSV record,
can be embedded in the data, for example inside a quoted field.
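
To make the concern concrete, here is a minimal sketch of a skip routine that treats separators inside quoted fields as data. It is not the PR's code: the class `QuoteAwareSkipSketch`, the method `skipRecord`, and the `quoteChar` parameter are hypothetical names used only for illustration. It assumes RFC 4180 style quoting (a doubled quote inside a quoted field) and deliberately ignores backslash-style escape characters and a configurable escape character, which a real reader would also have to handle.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class QuoteAwareSkipSketch {

    /**
     * Hypothetical helper: consumes characters up to and including the next
     * record separator that is NOT inside a quoted field.
     * Returns true if a separator was found, false if the input ended first.
     */
    static boolean skipRecord(Reader reader, String recordSeparator, char quoteChar) throws IOException {
        boolean inQuotes = false;
        int matched = 0; // number of separator characters matched so far
        int code;
        while ((code = reader.read()) != -1) {
            char c = (char) code;
            if (c == quoteChar) {
                // A doubled quote ("") toggles twice, so the state is unchanged.
                inQuotes = !inQuotes;
                matched = 0;
                continue;
            }
            if (inQuotes) {
                // Anything inside quotes, including the separator, is data.
                continue;
            }
            if (c == recordSeparator.charAt(matched)) {
                if (++matched == recordSeparator.length()) {
                    return true;
                }
            } else {
                // Simple restart; sufficient for one- or two-character
                // separators such as "\n" or "\r\n", not a general matcher.
                matched = (c == recordSeparator.charAt(0)) ? 1 : 0;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Reader reader = new StringReader("a,\"x\ny\",b\nsecond\n");
        skipRecord(reader, "\n", '"');
        // The reader is now positioned at the start of "second".
        System.out.println((char) reader.read());
    }
}
```

With the character-by-character match in the PR, the same input would stop after `a,"x`, in the middle of the first record, which is the case I am worried about.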



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
