[ 
https://issues.apache.org/jira/browse/CAMEL-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640430#comment-16640430
 ] 

ASF GitHub Bot commented on CAMEL-12698:
----------------------------------------

onderson closed pull request #2454: CAMEL-12698: Use the Stream API to read 
files instead of Scanner
URL: https://github.com/apache/camel/pull/2454
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/components/camel-bindy/src/main/java/org/apache/camel/dataformat/bindy/WrappedException.java b/components/camel-bindy/src/main/java/org/apache/camel/dataformat/bindy/WrappedException.java
new file mode 100644
index 00000000000..0caa627ae81
--- /dev/null
+++ b/components/camel-bindy/src/main/java/org/apache/camel/dataformat/bindy/WrappedException.java
@@ -0,0 +1,38 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.camel.dataformat.bindy;
+
+/**
+ * A {@link RuntimeException} which wraps a checked exception. This is necessary when dealing with streams,
+ * since the API does not allow catching or propagating a checked exception.
+ */
+public class WrappedException extends RuntimeException {
+    private final Exception exception;
+
+    /**
+     * Mandatory constructor.
+     * @param wrappedException the checked exception being passed in
+     */
+    public WrappedException(Exception wrappedException) {
+        this.exception = wrappedException;
+    }
+
+    public Exception getWrappedException() {
+        return exception;
+    }
+}
diff --git a/components/camel-bindy/src/main/java/org/apache/camel/dataformat/bindy/csv/BindyCsvDataFormat.java b/components/camel-bindy/src/main/java/org/apache/camel/dataformat/bindy/csv/BindyCsvDataFormat.java
index ed5845cd9b6..59069820b97 100644
--- a/components/camel-bindy/src/main/java/org/apache/camel/dataformat/bindy/csv/BindyCsvDataFormat.java
+++ b/components/camel-bindy/src/main/java/org/apache/camel/dataformat/bindy/csv/BindyCsvDataFormat.java
@@ -16,6 +16,8 @@
  */
 package org.apache.camel.dataformat.bindy.csv;
 
+
+import java.io.BufferedReader;
 import java.io.IOException;
 import java.io.InputStream;
 import java.io.InputStreamReader;
@@ -26,15 +28,18 @@
 import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
-import java.util.Scanner;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.function.Consumer;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;
+import java.util.stream.Stream;
 
 import org.apache.camel.Exchange;
 import org.apache.camel.dataformat.bindy.BindyAbstractDataFormat;
 import org.apache.camel.dataformat.bindy.BindyAbstractFactory;
 import org.apache.camel.dataformat.bindy.BindyCsvFactory;
 import org.apache.camel.dataformat.bindy.FormatFactory;
+import org.apache.camel.dataformat.bindy.WrappedException;
 import org.apache.camel.dataformat.bindy.util.ConverterUtils;
 import org.apache.camel.spi.DataFormat;
 import org.apache.camel.util.IOHelper;
@@ -138,58 +143,74 @@ public Object unmarshal(Exchange exchange, InputStream inputStream) throws Excep
         // List of Pojos
         List<Map<String, Object>> models = new ArrayList<>();
 
-        // Pojos of the model
-        Map<String, Object> model;
         InputStreamReader in = null;
-        Scanner scanner = null;
         try {
             if (checkEmptyStream(factory, inputStream)) {
                 return models;
             }
     
             in = new InputStreamReader(inputStream, IOHelper.getCharsetName(exchange));
-    
-            // Scanner is used to read big file
-            scanner = new Scanner(in);
-    
+
             // Retrieve the separator defined to split the record
             String separator = factory.getSeparator();
             String quote = factory.getQuote();
             ObjectHelper.notNull(separator, "The separator has not been defined in the annotation @CsvRecord or not instantiated during initModel.");
-    
-            int count = 0;
-            
-            // If the first line of the CSV file contains columns name, then we
-            // skip this line
-            if (factory.getSkipFirstLine()) {
-                // Check if scanner is empty
-                if (scanner.hasNextLine()) {
-                    scanner.nextLine();
+            AtomicInteger count = new AtomicInteger(0);
+
+            // Use a Stream to stream a file across.
+            try (Stream<String> lines = new BufferedReader(in).lines()) {
+                int linesToSkip = 0;
+
+                // If the first line of the CSV file contains columns name, then we
+                // skip this line
+                if (factory.getSkipFirstLine()) {
+                    linesToSkip = 1;
                 }
-            }
-    
-            while (scanner.hasNextLine()) {
-    
-                // Read the line
-                String line = scanner.nextLine().trim();
-    
-                if (ObjectHelper.isEmpty(line)) {
-                    // skip if line is empty
-                    continue;
+
+                // Consume the lines in the file via a consumer method, passing in state as necessary.
+                // If the internals of the consumer fail, we unwrap the checked exception upstream.
+                try {
+                    lines.skip(linesToSkip)
+                            .forEachOrdered(consumeFile(factory, models, separator, quote, count));
+                } catch (WrappedException e) {
+                    throw e.getWrappedException();
                 }
-    
+
+                // BigIntegerFormatFactory if models list is empty or not
+                // If this is the case (correspond to an empty stream, ...)
+                if (models.size() == 0) {
+                    throw new java.lang.IllegalArgumentException("No records have been defined in the CSV");
+                } else {
+                    return extractUnmarshalResult(models);
+                }
+            }
+        } finally {
+            if (in != null) {
+                IOHelper.close(in, "in", LOG);
+            }
+        }
+
+    }
+
+    private Consumer<String> consumeFile(BindyCsvFactory factory, List<Map<String, Object>> models,
+                                         String separator, String quote, AtomicInteger count) {
+        return line -> {
+            try {
+                // Trim the line coming in to remove any trailing whitespace
+                String trimmedLine = line.trim();
                 // Increment counter
-                count++;
-    
+                count.incrementAndGet();
+                Map<String, Object> model;
+
                 // Create POJO where CSV data will be stored
                 model = factory.factory();
-    
+
                 // Split the CSV record according to the separator defined in
                 // annotated class @CSVRecord
                 Pattern pattern = Pattern.compile(separator);
-                Matcher matcher = pattern.matcher(line);
+                Matcher matcher = pattern.matcher(trimmedLine);
                 List<String> separators = new ArrayList<>();
-                
+
                 // Retrieve separators for each match
                 while (matcher.find()) {
                     separators.add(matcher.group());
@@ -198,49 +219,35 @@ public Object unmarshal(Exchange exchange, InputStream inputStream) throws Excep
                 if (separators.size() > 0) {
                     separators.add(separators.get(separators.size() - 1));
                 }
-                
-                String[] tokens = pattern.split(line, factory.getAutospanLine() ? factory.getMaxpos() : -1);
+
+                String[] tokens = pattern.split(trimmedLine, factory.getAutospanLine() ? factory.getMaxpos() : -1);
                 List<String> result = Arrays.asList(tokens);
+
                 // must unquote tokens before use
                 result = unquoteTokens(result, separators, quote);
-    
-                if (result.size() == 0 || result.isEmpty()) {
-                    throw new java.lang.IllegalArgumentException("No records have been defined in the CSV");
+
+                if (result.isEmpty()) {
+                    throw new IllegalArgumentException("No records have been defined in the CSV");
                 } else {
                     if (LOG.isDebugEnabled()) {
                         LOG.debug("Size of the record splitted : {}", result.size());
                     }
-    
+
                     // Bind data from CSV record with model classes
-                    factory.bind(getCamelContext(), result, model, count);
-    
+                    factory.bind(getCamelContext(), result, model, count.get());
+
                     // Link objects together
                     factory.link(model);
-    
+
                     // Add objects graph to the list
                     models.add(model);
-    
+
                     LOG.debug("Graph of objects created: {}", model);
                 }
+            } catch (Exception e) {
+                throw new WrappedException(e);
             }
-    
-            // BigIntegerFormatFactory if models list is empty or not
-            // If this is the case (correspond to an empty stream, ...)
-            if (models.size() == 0) {
-                throw new java.lang.IllegalArgumentException("No records have been defined in the CSV");
-            } else {
-                return extractUnmarshalResult(models);
-            }
-
-        } finally {
-            if (scanner != null) {
-                scanner.close();
-            }
-            if (in != null) {
-                IOHelper.close(in, "in", LOG);
-            }
-        }
-
+        };
     }
 
     /**
diff --git a/components/camel-bindy/src/main/java/org/apache/camel/dataformat/bindy/kvp/BindyKeyValuePairDataFormat.java b/components/camel-bindy/src/main/java/org/apache/camel/dataformat/bindy/kvp/BindyKeyValuePairDataFormat.java
index f12ad983ee5..62602b8b3db 100644
--- a/components/camel-bindy/src/main/java/org/apache/camel/dataformat/bindy/kvp/BindyKeyValuePairDataFormat.java
+++ b/components/camel-bindy/src/main/java/org/apache/camel/dataformat/bindy/kvp/BindyKeyValuePairDataFormat.java
@@ -16,6 +16,7 @@
  */
 package org.apache.camel.dataformat.bindy.kvp;
 
+import java.io.BufferedReader;
 import java.io.InputStream;
 import java.io.InputStreamReader;
 import java.io.OutputStream;
@@ -26,7 +27,9 @@
 import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
-import java.util.Scanner;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
 
 import org.apache.camel.Exchange;
 import org.apache.camel.TypeConverter;
@@ -34,6 +37,7 @@
 import org.apache.camel.dataformat.bindy.BindyAbstractFactory;
 import org.apache.camel.dataformat.bindy.BindyKeyValuePairFactory;
 import org.apache.camel.dataformat.bindy.FormatFactory;
+import org.apache.camel.dataformat.bindy.WrappedException;
 import org.apache.camel.dataformat.bindy.util.ConverterUtils;
 import org.apache.camel.spi.DataFormat;
 import org.apache.camel.util.IOHelper;
@@ -88,55 +92,76 @@ public void marshal(Exchange exchange, Object body, OutputStream outputStream) t
     }
 
     public Object unmarshal(Exchange exchange, InputStream inputStream) throws Exception {
-        BindyKeyValuePairFactory factory = (BindyKeyValuePairFactory)getFactory();
+        BindyKeyValuePairFactory factory = (BindyKeyValuePairFactory) getFactory();
 
         // List of Pojos
         List<Map<String, Object>> models = new ArrayList<>();
 
-        // Pojos of the model
-        Map<String, Object> model;
-        
         // Map to hold the model @OneToMany classes while binding
         Map<String, List<Object>> lists = new HashMap<>();
 
         InputStreamReader in = new InputStreamReader(inputStream, IOHelper.getCharsetName(exchange));
 
-        // Scanner is used to read big file
-        Scanner scanner = new Scanner(in);
+        // Use a Stream to stream a file across
+        try (Stream<String> lines = new BufferedReader(in).lines()) {
+            // Retrieve the pair separator defined to split the record
+            ObjectHelper.notNull(factory.getPairSeparator(), "The pair separator property of the annotation @Message");
+            String separator = factory.getPairSeparator();
+            AtomicInteger count = new AtomicInteger(0);
+
+            try {
+                lines.forEachOrdered(line -> {
+                    consumeFile(factory, models, lists, separator, count, line);
+                });
+            } catch (WrappedException e) {
+                throw e.getWrappedException();
+            }
 
-        // Retrieve the pair separator defined to split the record
-        ObjectHelper.notNull(factory.getPairSeparator(), "The pair separator property of the annotation @Message");
-        String separator = factory.getPairSeparator();
+            // BigIntegerFormatFactory if models list is empty or not
+            // If this is the case (correspond to an empty stream, ...)
+            if (models.size() == 0) {
+                throw new java.lang.IllegalArgumentException("No records have been defined in the CSV");
+            } else {
+                return extractUnmarshalResult(models);
+            }
 
-        int count = 0;
-        try {
-            while (scanner.hasNextLine()) {
-                // Read the line
-                String line = scanner.nextLine().trim();
+        } finally {
+            IOHelper.close(in, "in", LOG);
+        }
+    }
 
-                if (ObjectHelper.isEmpty(line)) {
-                    // skip if line is empty
-                    continue;
-                }
+    private void consumeFile(BindyKeyValuePairFactory factory, List<Map<String, Object>> models, Map<String, List<Object>> lists, String separator, AtomicInteger count, String line) {
+        try {
+            // Trim the line coming in to remove any trailing whitespace
+            String trimmedLine = line.trim();
 
+            if (!ObjectHelper.isEmpty(trimmedLine)) {
                 // Increment counter
-                count++;
+                count.incrementAndGet();
+                // Pojos of the model
+                Map<String, Object> model;
 
                 // Create POJO
                 model = factory.factory();
 
                 // Split the message according to the pair separator defined in
                 // annotated class @Message
-                List<String> result = Arrays.asList(line.split(separator));
+                // Explicitly replace any occurrence of the Unicode new line character.
+                // Simply reading the line in with the File stream doesn't get us around the fact
+                // that this character is still present in the data set, and we don't wish for it
+                // to be present when storing the actual data in the model.
+                List<String> result = Arrays.stream(line.split(separator))
+                        .map(x -> x.replace("\u0085", ""))
+                        .collect(Collectors.toList());
 
                 if (result.size() == 0 || result.isEmpty()) {
-                    throw new java.lang.IllegalArgumentException("No records have been defined in the KVP");
+                    throw new IllegalArgumentException("No records have been defined in the KVP");
                 }
 
                 if (result.size() > 0) {
                     // Bind data from message with model classes
                     // Counter is used to detect line where error occurs
-                    factory.bind(getCamelContext(), result, model, count, lists);
+                    factory.bind(getCamelContext(), result, model, count.get(), lists);
 
                     // Link objects together
                     factory.link(model);
@@ -147,18 +172,8 @@ public Object unmarshal(Exchange exchange, InputStream inputStream) throws Excep
                     LOG.debug("Graph of objects created: {}", model);
                 }
             }
-
-            // BigIntegerFormatFactory if models list is empty or not
-            // If this is the case (correspond to an empty stream, ...)
-            if (models.size() == 0) {
-                throw new java.lang.IllegalArgumentException("No records have been defined in the CSV");
-            } else {
-                return extractUnmarshalResult(models);
-            }
-
-        } finally {
-            scanner.close();
-            IOHelper.close(in, "in", LOG);
+        } catch (Exception e) {
+            throw new WrappedException(e);
         }
     }
 
diff --git a/components/camel-bindy/src/test/java/org/apache/camel/dataformat/bindy/csv/BindySimpleCsvUnmarshallUnicodeNextLineTest.java b/components/camel-bindy/src/test/java/org/apache/camel/dataformat/bindy/csv/BindySimpleCsvUnmarshallUnicodeNextLineTest.java
new file mode 100644
index 00000000000..af2505529a0
--- /dev/null
+++ b/components/camel-bindy/src/test/java/org/apache/camel/dataformat/bindy/csv/BindySimpleCsvUnmarshallUnicodeNextLineTest.java
@@ -0,0 +1,72 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.camel.dataformat.bindy.csv;
+
+import org.apache.camel.EndpointInject;
+import org.apache.camel.Produce;
+import org.apache.camel.ProducerTemplate;
+import org.apache.camel.builder.RouteBuilder;
+import org.apache.camel.component.mock.MockEndpoint;
+import org.apache.camel.dataformat.bindy.model.unicode.LocationRecord;
+import org.junit.Test;
+import org.springframework.test.annotation.DirtiesContext;
+import org.springframework.test.context.ContextConfiguration;
+import org.springframework.test.context.junit4.AbstractJUnit4SpringContextTests;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotNull;
+
+@ContextConfiguration
+public class BindySimpleCsvUnmarshallUnicodeNextLineTest extends AbstractJUnit4SpringContextTests {
+    private static final String URI_MOCK_RESULT = "mock:result";
+    private static final String URI_DIRECT_START = "direct:start";
+
+    @Produce(uri = URI_DIRECT_START)
+    protected ProducerTemplate template;
+
+    @EndpointInject(uri = URI_MOCK_RESULT)
+    private MockEndpoint result;
+
+    private String record;
+
+    @Test
+    @DirtiesContext
+    public void testUnicodeNextLineCharacterParsing() throws Exception {
+        record = "123\u0085 Anywhere Lane,United States";
+
+        template.sendBody(record);
+
+        result.expectedMessageCount(1);
+        result.assertIsSatisfied();
+        LocationRecord data = result.getExchanges().get(0).getIn().getBody(LocationRecord.class);
+        assertNotNull(data);
+        assertEquals("Parsing error with unicode next line", "123\u0085 Anywhere Lane", data.getAddress());
+        assertEquals("Parsing error with unicode next line", "United States", data.getNation());
+    }
+
+    public static class ContextConfig extends RouteBuilder {
+        BindyCsvDataFormat locationRecordBindyDataFormat = new BindyCsvDataFormat(LocationRecord.class);
+
+        public void configure() {
+            from(URI_DIRECT_START)
+                    .unmarshal(locationRecordBindyDataFormat)
+                    .to(URI_MOCK_RESULT);
+        }
+    }
+
+}
diff --git a/components/camel-bindy/src/test/java/org/apache/camel/dataformat/bindy/fix/BindySimpleKeyValuePairUnicodeNextLineTest.java b/components/camel-bindy/src/test/java/org/apache/camel/dataformat/bindy/fix/BindySimpleKeyValuePairUnicodeNextLineTest.java
new file mode 100644
index 00000000000..a33b4f0ebf0
--- /dev/null
+++ b/components/camel-bindy/src/test/java/org/apache/camel/dataformat/bindy/fix/BindySimpleKeyValuePairUnicodeNextLineTest.java
@@ -0,0 +1,100 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.camel.dataformat.bindy.fix;
+
+import org.apache.camel.EndpointInject;
+import org.apache.camel.Produce;
+import org.apache.camel.ProducerTemplate;
+import org.apache.camel.builder.RouteBuilder;
+import org.apache.camel.component.mock.MockEndpoint;
+import org.apache.camel.dataformat.bindy.annotation.KeyValuePairField;
+import org.apache.camel.dataformat.bindy.annotation.Message;
+import org.apache.camel.dataformat.bindy.kvp.BindyKeyValuePairDataFormat;
+import org.junit.Test;
+import org.springframework.test.context.ContextConfiguration;
+import org.springframework.test.context.junit4.AbstractJUnit4SpringContextTests;
+
+import static org.junit.Assert.assertTrue;
+
+@ContextConfiguration
+public class BindySimpleKeyValuePairUnicodeNextLineTest extends AbstractJUnit4SpringContextTests {
+
+    private static final String URI_MOCK_RESULT = "mock:result";
+    private static final String URI_DIRECT_START = "direct:start";
+
+    @Produce(uri = URI_DIRECT_START)
+    private ProducerTemplate template;
+
+    @EndpointInject(uri = URI_MOCK_RESULT)
+    private MockEndpoint result;
+
+
+    @Test
+    public void testUnmarshallMessage() throws Exception {
+        String sent = "8=FIX.4.1 37=1 38=1 40=\u0085butter";
+
+        result.expectedMessageCount(1);
+
+        template.sendBody(sent);
+
+        result.assertIsSatisfied();
+
+        UnicodeFixOrder unicodeFixOrder = result.getReceivedExchanges().get(0).getIn().getBody(UnicodeFixOrder.class);
+
+        assertTrue(unicodeFixOrder.getId().equals("1"));
+        assertTrue(unicodeFixOrder.getProduct().equals("butter"));
+        assertTrue(unicodeFixOrder.getQuantity().equals("1"));
+    }
+
+
+    public static class ContextConfig extends RouteBuilder {
+
+
+        BindyKeyValuePairDataFormat kvpBindyDataFormat = new BindyKeyValuePairDataFormat(UnicodeFixOrder.class);
+
+        public void configure() {
+            from(URI_DIRECT_START).unmarshal(kvpBindyDataFormat).to(URI_MOCK_RESULT);
+        }
+    }
+
+    @Message(keyValuePairSeparator = "=", pairSeparator = " ", type = "FIX", version = "4.1")
+    public static class UnicodeFixOrder {
+        @KeyValuePairField(tag = 37)
+        private String id;
+        @KeyValuePairField(tag = 40)
+        private String product;
+        @KeyValuePairField(tag = 38)
+        private String quantity;
+
+        public String getId() {
+            return id;
+        }
+
+        public String getProduct() {
+            return product;
+        }
+
+        public String getQuantity() {
+            return quantity;
+        }
+
+        public void setQuantity(String quantity) {
+            this.quantity = quantity;
+        }
+    }
+}
diff --git a/components/camel-bindy/src/test/java/org/apache/camel/dataformat/bindy/model/unicode/LocationRecord.java b/components/camel-bindy/src/test/java/org/apache/camel/dataformat/bindy/model/unicode/LocationRecord.java
new file mode 100644
index 00000000000..f889a953d51
--- /dev/null
+++ b/components/camel-bindy/src/test/java/org/apache/camel/dataformat/bindy/model/unicode/LocationRecord.java
@@ -0,0 +1,51 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.camel.dataformat.bindy.model.unicode;
+
+import org.apache.camel.dataformat.bindy.annotation.CsvRecord;
+import org.apache.camel.dataformat.bindy.annotation.DataField;
+
+@CsvRecord(separator = ",")
+public class LocationRecord {
+    @DataField(pos = 1)
+    public String address;
+
+    @DataField(pos = 2)
+    public String nation;
+
+    public String getAddress() {
+        return address;
+    }
+
+    public void setAddress(String address) {
+        this.address = address;
+    }
+
+    public String getNation() {
+        return nation;
+    }
+
+    public void setNation(String nation) {
+        this.nation = nation;
+    }
+
+    @Override
+    public String toString() {
+        return "LocationRecord[address=" + address + ", nation=" + nation + "]";
+    }
+}
diff --git a/components/camel-bindy/src/test/resources/org/apache/camel/dataformat/bindy/csv/BindySimpleCsvUnmarshallUnicodeNextLineTest-context.xml b/components/camel-bindy/src/test/resources/org/apache/camel/dataformat/bindy/csv/BindySimpleCsvUnmarshallUnicodeNextLineTest-context.xml
new file mode 100644
index 00000000000..9971f5e4370
--- /dev/null
+++ b/components/camel-bindy/src/test/resources/org/apache/camel/dataformat/bindy/csv/BindySimpleCsvUnmarshallUnicodeNextLineTest-context.xml
@@ -0,0 +1,34 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+         http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+
+-->
+<beans xmlns="http://www.springframework.org/schema/beans"
+    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+    xsi:schemaLocation="
+     http://www.springframework.org/schema/beans
+     http://www.springframework.org/schema/beans/spring-beans.xsd
+     http://camel.apache.org/schema/spring
+     http://camel.apache.org/schema/spring/camel-spring.xsd">
+     
+       <camelContext xmlns="http://camel.apache.org/schema/spring">
+               <routeBuilder ref="myBuilder" /> 
+       </camelContext>
+       
+       <bean id="myBuilder" class="org.apache.camel.dataformat.bindy.csv.BindySimpleCsvUnmarshallUnicodeNextLineTest$ContextConfig"/>
+       
+</beans>
\ No newline at end of file
diff --git a/components/camel-bindy/src/test/resources/org/apache/camel/dataformat/bindy/fix/BindySimpleKeyValuePairUnicodeNextLineTest-context.xml b/components/camel-bindy/src/test/resources/org/apache/camel/dataformat/bindy/fix/BindySimpleKeyValuePairUnicodeNextLineTest-context.xml
new file mode 100644
index 00000000000..7678495c751
--- /dev/null
+++ b/components/camel-bindy/src/test/resources/org/apache/camel/dataformat/bindy/fix/BindySimpleKeyValuePairUnicodeNextLineTest-context.xml
@@ -0,0 +1,34 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+         http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+
+-->
+<beans xmlns="http://www.springframework.org/schema/beans"
+       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+       xsi:schemaLocation="
+     http://www.springframework.org/schema/beans
+     http://www.springframework.org/schema/beans/spring-beans.xsd
+     http://camel.apache.org/schema/spring
+     http://camel.apache.org/schema/spring/camel-spring.xsd">
+
+    <camelContext xmlns="http://camel.apache.org/schema/spring">
+        <routeBuilder ref="myBuilder" />
+    </camelContext>
+
+    <bean id="myBuilder" class="org.apache.camel.dataformat.bindy.fix.BindySimpleKeyValuePairUnicodeNextLineTest$ContextConfig"/>
+
+</beans>
\ No newline at end of file
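
The WrappedException class in the diff above exists because a lambda passed to Stream.forEachOrdered cannot throw a checked exception. The following is a minimal, standalone sketch of that tunnelling pattern, not the Camel code itself. The names CheckedExceptionTunnelDemo, bindLine and process are hypothetical, and the nested WrappedException is a stand-in that, unlike the class in the diff, also passes the cause to the RuntimeException constructor so it shows up in stack traces.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class CheckedExceptionTunnelDemo {

    /** Hypothetical stand-in for the WrappedException added by the pull request. */
    static class WrappedException extends RuntimeException {
        private final Exception wrapped;

        WrappedException(Exception wrapped) {
            super(wrapped); // assumption: keep the cause, which the class in the diff does not do
            this.wrapped = wrapped;
        }

        Exception getWrappedException() {
            return wrapped;
        }
    }

    /** Hypothetical per-line work that may fail with a checked exception. */
    static void bindLine(String line, List<String> out) throws IOException {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            throw new IOException("blank record");
        }
        out.add(trimmed);
    }

    /** Builds a Consumer that tunnels checked exceptions out of the lambda. */
    static Consumer<String> consumer(List<String> out) {
        return line -> {
            try {
                bindLine(line, out);
            } catch (Exception e) {
                throw new WrappedException(e);
            }
        };
    }

    /** Runs the stream and rethrows the original checked exception to the caller. */
    static List<String> process(List<String> lines) throws Exception {
        List<String> out = new ArrayList<>();
        try {
            lines.stream().forEachOrdered(consumer(out));
        } catch (WrappedException e) {
            throw e.getWrappedException();
        }
        return out;
    }

    public static void main(String[] args) {
        try {
            System.out.println(process(Arrays.asList(" a ", "b")));
            process(Arrays.asList("a", ""));
        } catch (Exception e) {
            System.out.println("caught " + e.getClass().getSimpleName() + ": " + e.getMessage());
        }
    }
}
```

The caller of process sees the original checked exception type, so existing error handling that expects it should keep working; whether that matters for a given route is an assumption to verify.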


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unmarshaling a CSV file with the NEL (next line) character will cause Bindy 
> to misread the entire file
> ------------------------------------------------------------------------------------------------------
>
>                 Key: CAMEL-12698
>                 URL: https://issues.apache.org/jira/browse/CAMEL-12698
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-bindy
>    Affects Versions: 2.22.0
>            Reporter: Jason Black
>            Priority: Major
>
> I am using Apache Camel to process a lot of large CSV files, and relying on 
> Bindy to assist with unmarshalling them into POJOs.
> We have an upstream data bug which causes a record of ours to contain the 
> Unicode character 
> [NEL|http://www.fileformat.info/info/unicode/char/85/index.htm], but while 
> we're working through the cause of that, I found it curious as to what Bindy 
> is actually doing with it.  We rely on the unmarshal process to perform a 
> batch insert, and because our POJO is missing certain fields, we started 
> observing that the 
> Bindy is relying on Scanner to read lines in a large file; however, Scanner 
> itself also does some parsing of the line with the assumption that, if it 
> sees the NEL character, it will regard it as a newline character.  The modern 
> Files API does not make this distinction and reads to a newline designation 
> only (e.g. \n, \r, or \r\n).
> There are two ways to fix this from what I've been able to smoke test:
>  * Change the Scanner implementation to use a delimiter of the more 
> traditional newline characters
>  * Use Java 8's Files API and stream the file in
> I would personally want to use the Files API to handle this since it's more 
> robust and capable of higher performance, but I'll explore both approaches 
> and see where I end up.
>  
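
The behaviour described in the issue can be reproduced outside Camel. The sketch below is only an illustration under stated assumptions: the class and method names (NelLineReadingDemo, readWithScanner, readWithDelimitedScanner, readWithBufferedReader) are hypothetical, and the sample record is the one used in the CSV test of the pull request. It compares a default Scanner, a Scanner restricted to the traditional newline delimiters (the first option listed in the issue), and BufferedReader.lines() (the approach the merged diff takes).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.stream.Collectors;

class NelLineReadingDemo {

    /** Default Scanner: nextLine() also treats NEL (U+0085) as a line separator. */
    static List<String> readWithScanner(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    /** Scanner with an explicit delimiter limited to CR LF, LF and CR. */
    static List<String> readWithDelimitedScanner(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useDelimiter("\\r\\n|[\\n\\r]");
            while (scanner.hasNext()) {
                lines.add(scanner.next());
            }
        }
        return lines;
    }

    /** BufferedReader.lines(): only CR, LF and CR LF terminate a line. */
    static List<String> readWithBufferedReader(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.lines().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A single CSV record that contains the NEL character inside a field.
        String record = "123\u0085 Anywhere Lane,United States";

        System.out.println("Scanner (default):   " + readWithScanner(record).size() + " line(s)");
        System.out.println("Scanner (delimiter): " + readWithDelimitedScanner(record).size() + " line(s)");
        System.out.println("BufferedReader:      " + readWithBufferedReader(record).size() + " line(s)");
    }
}
```

With this input the default Scanner splits the record in two, while the other two readers return it as a single line that still contains the NEL character. Note that the delimiter-based Scanner handles consecutive newlines differently from nextLine(), so it is not a drop-in replacement.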



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
