[ 
https://issues.apache.org/jira/browse/IO-670?focusedWorklogId=464221&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-464221
 ]

ASF GitHub Bot logged work on IO-670:
-------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Jul/20 17:50
            Start Date: 29/Jul/20 17:50
    Worklog Time Spent: 10m 
      Work Description: XenoAmess commented on a change in pull request #118:
URL: https://github.com/apache/commons-io/pull/118#discussion_r462150050



##########
File path: src/test/java/org/apache/commons/io/performance/IOUtilsContentEqualsPerformanceTest.java
##########
@@ -0,0 +1,178 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.io.performance;
+
+import org.apache.commons.io.IOUtils;
+import org.openjdk.jmh.annotations.*;
+
+import java.io.*;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.commons.io.IOUtils.EOF;
+import static org.apache.commons.io.IOUtils.toBufferedReader;
+
+/**
+ * Test to show whether using BitSet for removeAll() methods is faster than using HashSet.
+ */
+@BenchmarkMode(Mode.AverageTime)
+@OutputTimeUnit(TimeUnit.NANOSECONDS)
+@State(Scope.Thread)
+public class IOUtilsContentEqualsPerformanceTest {
+
+    private static final String[] STRINGS = new String[3];
+
+    static {
+        STRINGS[0] = getString0();
+        STRINGS[1] = STRINGS[0] + 'c';
+        STRINGS[2] = STRINGS[0] + 'd';
+    }
+
+    private static final int LOOP = 10;
+
+    @Benchmark
+    public boolean[] testContentEqualsForFileNew() throws IOException {
+        boolean[] res = new boolean[2];
+        for (int i = 0; i < LOOP; i++) {
+            try (InputStream inputStream1 =
+                         this.getClass().getResourceAsStream("/org/apache/commons/io/testfileBOM.xml");
+                 InputStream inputStream2 =
+                         this.getClass().getResourceAsStream("/org/apache/commons/io/testfileNoBOM.xml");
+                 Reader inputReader1 = new InputStreamReader(inputStream1);
+                 Reader inputReader2 = new InputStreamReader(inputStream2);
+            ) {
+                res[0] = IOUtils.contentEquals(inputReader1, inputReader2);
+            }
+            try (InputStream inputStream1 =
+                         this.getClass().getResourceAsStream("/org/apache/commons/io/testfileBOM.xml");
+                 InputStream inputStream2 =
+                         this.getClass().getResourceAsStream("/org/apache/commons/io/testfileBOM.xml");
+                 Reader inputReader1 = new InputStreamReader(inputStream1);
+                 Reader inputReader2 = new InputStreamReader(inputStream2);
+            ) {
+                res[1] = IOUtils.contentEquals(inputReader1, inputReader2);
+            }
+        }
+        return res;
+    }
+
+    @Benchmark
+    public boolean[] testContentEqualsOld() throws IOException {
+        boolean[] res = new boolean[2];
+        for (int i = 0; i < LOOP; i++) {
+            try (InputStream inputStream1 =
+                         this.getClass().getResourceAsStream("/org/apache/commons/io/testfileBOM.xml");
+                 InputStream inputStream2 =
+                         this.getClass().getResourceAsStream("/org/apache/commons/io/testfileNoBOM.xml");
+                 Reader inputReader1 = new InputStreamReader(inputStream1);
+                 Reader inputReader2 = new InputStreamReader(inputStream2);
+            ) {
+                res[0] = contentEqualsOld(inputReader1, inputReader2);
+            }
+            try (InputStream inputStream1 =
+                         this.getClass().getResourceAsStream("/org/apache/commons/io/testfileBOM.xml");
+                 InputStream inputStream2 =
+                         this.getClass().getResourceAsStream("/org/apache/commons/io/testfileBOM.xml");
+                 Reader inputReader1 = new InputStreamReader(inputStream1);
+                 Reader inputReader2 = new InputStreamReader(inputStream2);
+            ) {
+                res[1] = contentEqualsOld(inputReader1, inputReader2);
+            }
+        }
+        return res;
+    }
+
+    @Benchmark
+    public boolean[] testContentEqualsNew2() throws IOException {
+        boolean[] res = new boolean[9];
+        for (int i = 0; i < 3; i++) {
+            for (int j = 0; j < 3; j++) {
+                try (Reader inputReader1 = new StringReader(STRINGS[i]);
+                     Reader inputReader2 = new StringReader(STRINGS[j]);
+                ) {
+                    res[i * 3 + j] = IOUtils.contentEquals(inputReader1, inputReader2);
+                }
+            }
+        }
+        return res;
+    }
+
+    @Benchmark
+    public boolean[] testContentEqualsOld2() throws IOException {
+        boolean[] res = new boolean[9];
+        for (int i = 0; i < 3; i++) {
+            for (int j = 0; j < 3; j++) {
+                try (Reader inputReader1 = new StringReader(STRINGS[i]);
+                     Reader inputReader2 = new StringReader(STRINGS[j]);
+                ) {
+                    res[i * 3 + j] = contentEqualsOld(inputReader1, inputReader2);
+                }
+            }
+        }
+        return res;
+    }
+
+    /**
+     * Old version of IOUtils.contentEquals(Reader, Reader)
+     *
+     * Compares the contents of two Readers to determine if they are equal or
+     * not.
+     * <p>
+     * This method buffers the input internally using
+     * <code>BufferedReader</code> if they are not already buffered.
+     * </p>
+     *
+     * @param input1 the first reader
+     * @param input2 the second reader
+     * @return true if the content of the readers are equal or they both don't
+     * exist, false otherwise
+     * @throws NullPointerException if either input is null
+     * @throws IOException          if an I/O error occurs
+     * @since 1.1
+     */
+    @SuppressWarnings("resource")
+    public static boolean contentEqualsOld(final Reader input1, final Reader input2)
+            throws IOException {
+        if (input1 == input2) {
+            return true;
+        }
+        if (input1 == null ^ input2 == null) {
+            return false;
+        }
+        final BufferedReader bufferedInput1 = toBufferedReader(input1);
+        final BufferedReader bufferedInput2 = toBufferedReader(input2);
+
+        int ch = bufferedInput1.read();
+        while (EOF != ch) {
+            final int ch2 = bufferedInput2.read();
+            if (ch != ch2) {
+                return false;
+            }
+            ch = bufferedInput1.read();
+        }
+
+        return bufferedInput2.read() == EOF;
+    }
+
+    public static String getString0() {
+        StringBuilder stringBuilder = new StringBuilder("ab");
+        for (int i = 0; i < 24; i++) {

Review comment:
       @eolivelli
   The loop body is stringBuilder.append(stringBuilder); so the string doubles on
   each of the 24 iterations: "ab" is repeated 2^24 times, not 2*24 times.
   I think that is long enough.
   Using lang3.StringUtils.repeat might make this clearer, but I'm not sure
   whether it is appropriate to use it in commons-io.
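
The doubling argument above can be checked with a small standalone sketch. The class name `DoublingSketch` and the `build` helper are hypothetical and not part of the pull request; only the `"ab"` seed and the self-append come from the diff. Under that assumption, 24 doublings of a 2-character seed give 2 * 2^24 = 33,554,432 characters.

```java
// Hypothetical, self-contained illustration; not code from the pull request.
final class DoublingSketch {

    /** Starts from "ab" and appends the builder to itself the given number of times. */
    static String build(final int doublings) {
        final StringBuilder sb = new StringBuilder("ab");
        for (int i = 0; i < doublings; i++) {
            sb.append(sb); // the length doubles on every pass
        }
        return sb.toString();
    }

    public static void main(final String[] args) {
        // Small counts keep the demo cheap; the growth pattern is the same for 24.
        for (final int n : new int[] {0, 1, 2, 3, 10}) {
            System.out.println(n + " doublings -> " + build(n).length() + " chars");
        }
    }
}
```

The growth is exponential in the loop bound, which is why a bound of 24 already produces a string tens of millions of characters long.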

##########
File path: src/main/java/org/apache/commons/io/IOUtils.java
##########
@@ -752,19 +777,46 @@ public static boolean contentEquals(final Reader input1, final Reader input2)
         if (input1 == null ^ input2 == null) {
             return false;
         }
-        final BufferedReader bufferedInput1 = toBufferedReader(input1);
-        final BufferedReader bufferedInput2 = toBufferedReader(input2);
 
-        int ch = bufferedInput1.read();
-        while (EOF != ch) {
-            final int ch2 = bufferedInput2.read();
-            if (ch != ch2) {
-                return false;
+        char[] charArray1 = new char[CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE];
+        char[] charArray2 = new char[CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE];
+        int nowPos1;
+        int nowPos2;
+        int nowRead1;
+        int nowRead2;
+        while (true) {
+            nowPos1 = 0;
+            nowPos2 = 0;
+            for (int nowCheck = 0; nowCheck < CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE; nowCheck++) {
+                if (nowPos1 == nowCheck) {
+                    do {
+                        nowRead1 = input1.read(charArray1, nowPos1, CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos1);
+                    } while (nowRead1 == 0);
+                    if (nowRead1 == -1) {
+                        return nowPos2 == nowCheck && input2.read() == -1;
+                    }
+                    nowPos1 += nowRead1;
+                }
+                if (nowPos2 == nowCheck) {
+                    do {
+                        nowRead2 = input2.read(charArray2, nowPos2, CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos2);
+                    } while (nowRead2 == 0);
+                    if (nowRead2 == -1) {
+                        return nowPos1 == nowCheck && input1.read() == -1;
+                    }
+                    nowPos2 += nowRead2;
+                }
+                if (charArray1[nowCheck] != charArray2[nowCheck]) {
+                    return false;
+                }
             }
-            ch = bufferedInput1.read();
         }
+    }
 
-        return bufferedInput2.read() == EOF;
+    private enum LastState {
+        r,

Review comment:
       > Can you please use UPPERCASE identifiers?
   > Also, IMHO it is better to give meaningful names or add a minimal explanation

   @eolivelli
   OK, will do.
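
As an illustration of the request above, the enum could look roughly like the sketch below. The constant names `CR`, `NEW_LINE` and `NORMAL` and the Javadoc wording are assumptions; the meanings are inferred from how `r`, `newLine` and `normal` are used in the diff, not taken from the final commit.

```java
/**
 * Hypothetical renaming of the LastState enum with uppercase, documented constants.
 * The names and descriptions are illustrative guesses, not the committed code.
 */
enum LastState {
    /** The last character consumed was a carriage return ('\r'). */
    CR,
    /** The last character consumed ended a line. */
    NEW_LINE,
    /** The last character consumed was an ordinary, non-EOL character. */
    NORMAL
}
```

Call sites such as `case r:` would then read `case CR:`, which needs no extra comment at the point of use.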

##########
File path: src/test/java/org/apache/commons/io/performance/IOUtilsContentEqualsPerformanceTest.java
##########
@@ -0,0 +1,178 @@
+    public static String getString0() {
+        StringBuilder stringBuilder = new StringBuilder("ab");
+        for (int i = 0; i < 24; i++) {

Review comment:
       > Did you test with larger strings?
   > 24 looks like a small string

   @eolivelli
   The loop body is stringBuilder.append(stringBuilder); so the string doubles on
   each of the 24 iterations: "ab" is repeated 2^24 times, not 2*24 times.
   I think that is long enough.
   Using lang3.StringUtils.repeat might make this clearer, but I'm not sure
   whether it is appropriate to use it in commons-io.
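
One way to make the intended size explicit without pulling in commons-lang3 is a small local repeat helper. The sketch below is only an illustration: `RepeatSketch`, `repeat` and `doubled` are hypothetical names, and nothing here is taken from the pull request beyond the `"ab"` seed and the self-append.

```java
// Hypothetical, dependency-free illustration; not code from the pull request.
final class RepeatSketch {

    /** Returns unit repeated count times (a stand-in for a library repeat method). */
    static String repeat(final String unit, final int count) {
        final StringBuilder sb = new StringBuilder(unit.length() * count);
        for (int i = 0; i < count; i++) {
            sb.append(unit);
        }
        return sb.toString();
    }

    /** The self-append loop from the test, with the loop bound as a parameter. */
    static String doubled(final int doublings) {
        final StringBuilder sb = new StringBuilder("ab");
        for (int i = 0; i < doublings; i++) {
            sb.append(sb);
        }
        return sb.toString();
    }

    public static void main(final String[] args) {
        // 1 << n is 2^n, so both expressions should build the same string.
        final int n = 10;
        System.out.println(repeat("ab", 1 << n).equals(doubled(n)));
    }
}
```

Writing the count as `1 << 24` at the call site states the size directly, at the cost of a linear loop instead of 24 doublings; for a one-time static initializer in a benchmark that trade-off is probably acceptable.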

##########
File path: src/main/java/org/apache/commons/io/IOUtils.java
##########
@@ -790,16 +842,342 @@ public static boolean contentEqualsIgnoreEOL(final Reader input1, final Reader i
         if (input1 == null ^ input2 == null) {
             return false;
         }
-        final BufferedReader br1 = toBufferedReader(input1);
-        final BufferedReader br2 = toBufferedReader(input2);
 
-        String line1 = br1.readLine();
-        String line2 = br2.readLine();
-        while (line1 != null && line1.equals(line2)) {
-            line1 = br1.readLine();
-            line2 = br2.readLine();
+        char[] charArray1 = new char[CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE];
+        char[] charArray2 = new char[CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE];
+        int nowPos1 = 0;
+        int nowPos2 = 0;
+        int nowRead1;
+        int nowRead2;
+        int nowCheck1 = 0;
+        int nowCheck2 = 0;
+        boolean readEnd1 = false;
+        boolean readEnd2 = false;
+        LastState lastState1 = LastState.newLine;
+        LastState lastState2 = LastState.newLine;
+        while (true) {
+            if (nowPos1 == nowCheck1) {
+                if (nowCheck1 == CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE) {
+                    nowPos1 = nowCheck1 = 0;
+                }
+                do {
+                    nowRead1 = input1.read(charArray1, nowPos1, CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos1);
+                } while (nowRead1 == 0);
+                if (nowRead1 == -1) {
+                    readEnd1 = true;
+                } else {
+                    nowPos1 += nowRead1;
+                }
+            }
+            if (nowPos2 == nowCheck2) {
+                if (nowCheck2 == CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE) {
+                    nowPos2 = nowCheck2 = 0;
+                }
+                do {
+                    nowRead2 = input2.read(charArray2, nowPos2, CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos2);
+                } while (nowRead2 == 0);
+                if (nowRead2 == -1) {
+                    readEnd2 = true;
+                } else {
+                    nowPos2 += nowRead2;
+                }
+            }
+            if (readEnd1) {
+                if (readEnd2) {
+                    return true;
+                } else {
+                    switch (lastState1) {
+                        case r:
+                        case newLine:
+                            switch (lastState2) {
+                                case r:
+                                    if (charArray2[nowCheck2] == '\n') {
+                                        nowCheck2++;
+                                        if (nowPos2 == nowCheck2) {
+                                            if (nowCheck2 == CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE) {
+                                                nowPos2 = nowCheck2 = 0;
+                                            }
+                                            do {
+                                                nowRead2 = input2.read(charArray2, nowPos2,
+                                                        CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos2);
+                                            } while (nowRead2 == 0);
+                                            if (nowRead2 == -1) {
+                                                readEnd2 = true;
+                                            } else {
+                                                nowPos2 += nowRead2;
+                                            }
+                                        }
+                                        return readEnd2;
+                                    }
+                                    return false;
+                                default:
+                                    return false;
+                            }
+                        case normal:
+                            switch (lastState2) {
+                                case normal:
+                                    switch (charArray2[nowCheck2]) {
+                                        case '\r':
+                                            nowCheck2++;
+                                            if (nowPos2 == nowCheck2) {
+                                                if (nowCheck2 == CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE) {
+                                                    nowPos2 = nowCheck2 = 0;
+                                                }
+                                                do {
+                                                    nowRead2 = input2.read(charArray2, nowPos2,
+                                                     CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos2);
+                                                } while (nowRead2 == 0);
+                                                if (nowRead2 == -1) {
+                                                    readEnd2 = true;
+                                                } else {
+                                                    nowPos2 += nowRead2;
+                                                }
+                                            }
+                                            if (readEnd2) {
+                                                return true;
+                                            } else if (charArray2[nowCheck2] == '\n') {
+                                                nowCheck2++;
+                                                if (nowPos2 == nowCheck2) {
+                                                    if (nowCheck2 == CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE) {
+                                                        nowPos2 = nowCheck2 = 0;
+                                                    }
+                                                    do {
+                                                        nowRead2 = input2.read(charArray2, nowPos2,
+                                                         CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos2);
+                                                    } while (nowRead2 == 0);
+                                                    if (nowRead2 == -1) {
+                                                        readEnd2 = true;
+                                                    } else {
+                                                        nowPos2 += nowRead2;
+                                                    }
+                                                }
+                                                return readEnd2;
+                                            }
+                                            return false;
+                                        case '\n':
+                                            nowCheck2++;
+                                            if (nowPos2 == nowCheck2) {
+                                                if (nowCheck2 == CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE) {
+                                                    nowPos2 = nowCheck2 = 0;
+                                                }
+                                                do {
+                                                    nowRead2 = input2.read(charArray2, nowPos2,
+                                                     CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos2);
+                                                } while (nowRead2 == 0);
+                                                if (nowRead2 == -1) {
+                                                    readEnd2 = true;
+                                                } else {
+                                                    nowPos2 += nowRead2;
+                                                }
+                                            }
+                                            return readEnd2;
+                                        default:
+                                            return false;
+                                    }
+                                default:
+                                    return false;
+                            }
+                        default:
+                            //shall never enter
+                    }
+                }
+            } else if (readEnd2) {
+                switch (lastState2) {
+                    case r:
+                    case newLine:
+                        switch (lastState1) {
+                            case r:
+                                if (charArray1[nowCheck1] == '\n') {
+                                    nowCheck1++;
+                                    if (nowPos1 == nowCheck1) {
+                                        if (nowCheck1 == CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE) {
+                                            nowPos1 = nowCheck1 = 0;
+                                        }
+                                        do {
+                                            nowRead1 = input1.read(charArray1, nowPos1,
+                                             CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos1);
+                                        } while (nowRead1 == 0);
+                                        if (nowRead1 == -1) {
+                                            readEnd1 = true;
+                                        } else {
+                                            nowPos1 += nowRead1;
+                                        }
+                                    }
+                                    return readEnd1;
+                                }
+                                return false;
+                            default:
+                                return false;
+                        }
+                    case normal:
+                        switch (lastState1) {
+                            case normal:
+                                switch (charArray1[nowCheck1]) {
+                                    case '\r':
+                                        nowCheck1++;
+                                        if (nowPos1 == nowCheck1) {
+                                            if (nowCheck1 == CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE) {
+                                                nowPos1 = nowCheck1 = 0;
+                                            }
+                                            do {
+                                                nowRead1 = input1.read(charArray1, nowPos1,
+                                                        CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos1);
+                                            } while (nowRead1 == 0);
+                                            if (nowRead1 == -1) {
+                                                readEnd1 = true;
+                                            } else {
+                                                nowPos1 += nowRead1;
+                                            }
+                                        }
+                                        if (readEnd1) {
+                                            return true;
+                                        } else if (charArray1[nowCheck1] == '\n') {
+                                            nowCheck1++;
+                                            if (nowPos1 == nowCheck1) {
+                                                if (nowCheck1 == CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE) {
+                                                    nowPos1 = nowCheck1 = 0;
+                                                }
+                                                do {
+                                                    nowRead1 = input1.read(charArray1, nowPos1,
+                                                     CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos1);
+                                                } while (nowRead1 == 0);
+                                                if (nowRead1 == -1) {
+                                                    readEnd1 = true;
+                                                } else {
+                                                    nowPos1 += nowRead1;
+                                                }
+                                            }
+                                            return readEnd1;
+                                        }
+                                        return false;
+                                    case '\n':
+                                        nowCheck1++;
+                                        if (nowPos1 == nowCheck1) {
+                                            if (nowCheck1 == CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE) {
+                                                nowPos1 = nowCheck1 = 0;
+                                            }
+                                            do {
+                                                nowRead1 = input1.read(charArray1, nowPos1,
+                                                        CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos1);
+                                            } while (nowRead1 == 0);
+                                            if (nowRead1 == -1) {
+                                                readEnd1 = true;
+                                            } else {
+                                                nowPos1 += nowRead1;
+                                            }
+                                        }
+                                        return readEnd1;
+                                    default:
+                                        return false;
+                                }
+                            default:
+                                return false;
+                        }
+                    default:
+                        //shall never enter
+                }
+            }
+
+            switch (charArray1[nowCheck1]) {
+                case '\r':
+                    switch (charArray2[nowCheck2]) {
+                        case '\r':
+                            lastState1 = lastState2 = LastState.r;
+                            nowCheck1++;
+                            nowCheck2++;
+                            continue;
+                        case '\n':
+                            nowCheck1++;
+                            if (nowPos1 == nowCheck1) {
+                                if (nowCheck1 == CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE) {
+                                    nowPos1 = nowCheck1 = 0;
+                                }
+                                do {
+                                    nowRead1 = input1.read(charArray1, nowPos1,
+CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos1);
+                                } while (nowRead1 == 0);
+                                if (nowRead1 == -1) {
+                                    readEnd1 = true;
+                                } else {
+                                    nowPos1 += nowRead1;
+                                }
+                            }
+                            lastState1 = lastState2 = LastState.newLine;
+                            nowCheck2++;
+                            if (readEnd1) {
+                                continue;
+                            }
+                            if (charArray1[nowCheck1] == '\n') {
+                                nowCheck1++;
+                            }
+                            continue;
+                        default:
+                            return false;
+                    }
+                case '\n':
+                    switch (charArray2[nowCheck2]) {
+                        case '\n':
+                            lastState1 = lastState2 = LastState.newLine;
+                            nowCheck1++;
+                            nowCheck2++;
+                            continue;
+                        case '\r':
+                            nowCheck2++;
+                            if (nowPos2 == nowCheck2) {
+                                if (nowCheck2 == CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE) {
+                                    nowPos2 = nowCheck2 = 0;
+                                }
+                                do {
+                                    nowRead2 = input2.read(charArray2, nowPos2,
+                                            CONTENT_EQUALS_CHAR_ARRAY_BUFFER_SIZE - nowPos2);
+                                } while (nowRead2 == 0);
+                                if (nowRead2 == -1) {
+                                    readEnd2 = true;
+                                } else {
+                                    nowPos2 += nowRead2;
+                                }
+                            }
+                            lastState1 = lastState2 = LastState.newLine;
+                            nowCheck1++;
+                            if (readEnd2) {
+                                continue;
+                            }
+                            if (charArray2[nowCheck2] == '\n') {
+                                nowCheck2++;
+                            }
+                            continue;
+                        default:
+                            if (lastState1 == LastState.r) {
+                                lastState1 = LastState.newLine;
+                                nowCheck1++;
+                                continue;
+                            } else {
+                                return false;
+                            }
+                    }
+                default:
+                    switch (charArray2[nowCheck2]) {
+                        case '\n':
+                            if (lastState2 == LastState.r) {
+                                lastState2 = LastState.newLine;
+                                nowCheck2++;
+                                continue;
+                            } else {
+                                return false;
+                            }
+                        case '\r':
+                            return false;
+                        default:
+                            if (charArray1[nowCheck1] != charArray2[nowCheck2]) {
+                                return false;
+                            }
+                            lastState1 = lastState2 = LastState.normal;
+                            nowCheck1++;
+                            nowCheck2++;
+                            continue;
+                    }
+            }
         }

Review comment:
       @garydgregory @melloware 
   > there are zero comments in hundreds of lines of code
   
   I have just added comments :)
   Please see the latest PR.
   
   > In some cases performance gains do not outweigh complexity of the code.
   
   Yes.
   This result was also unexpected for me.
   I thought it SHOULD also be 500%+ faster, like the contentEquals function, but it is actually only 92% faster in JMH.
   So it might not be worth inlining that much.
   I will try to split some sub-functions out later today, and hope that reduces the size of this function.
   
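   One possible shape for such a sub-function is sketched below. The diff repeats the same refill block (check `nowPos == nowCheck`, wrap at the end of the array, retry while `read` returns 0, flag end of input on -1) for both inputs in many branches. The class `RefillingCharBuffer` and the demo class `RefillDemo` are hypothetical names used only for illustration and are not part of the PR or of commons-io. The buffer size is assumed to be greater than zero.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper, not part of commons-io: wraps the refill logic repeated in the diff. */
final class RefillingCharBuffer {
    private final Reader in;
    private final char[] chars;
    private int filled;   // plays the role of nowPos: number of valid chars in the array
    private int next;     // plays the role of nowCheck: index of the next unread char
    private boolean eof;  // plays the role of readEnd

    RefillingCharBuffer(final Reader in, final int size) {
        this.in = in;
        this.chars = new char[size]; // assumption: size > 0
    }

    /** Makes sure an unread char is available; returns false once the input is exhausted. */
    boolean ensureAvailable() throws IOException {
        if (eof) {
            return false;
        }
        if (next < filled) {
            return true;
        }
        if (next == chars.length) {
            filled = next = 0; // array fully consumed, start again from the front
        }
        int n;
        do {
            n = in.read(chars, filled, chars.length - filled);
        } while (n == 0);
        if (n == -1) {
            eof = true;
            return false;
        }
        filled += n;
        return true;
    }

    /** Returns the next unread char; call only after ensureAvailable() returned true. */
    char peek() {
        return chars[next];
    }

    void advance() {
        next++;
    }
}

/** Hypothetical demo: drains a string through the helper with a given buffer size. */
final class RefillDemo {
    static String drain(final String text, final int bufferSize) {
        final RefillingCharBuffer buf = new RefillingCharBuffer(new StringReader(text), bufferSize);
        final StringBuilder sb = new StringBuilder();
        try {
            while (buf.ensureAvailable()) {
                sb.append(buf.peek());
                buf.advance();
            }
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(final String[] args) {
        System.out.println(drain("ab\r\ncd", 2).equals("ab\r\ncd"));
    }
}
```

   With a helper like this, each of the inlined refill blocks would shrink to one call per input. Whether the extra method calls cost any of the measured speed-up would have to be checked with JMH.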

##########
File path: src/main/java/org/apache/commons/io/IOUtils.java
##########
@@ -1147,31 +1201,43 @@ public static boolean contentEqualsIgnoreEOL(final Reader input1, final Reader i
                             }
                             continue;
                         default:
-                            if (lastState1 == LastState.r) {
-                                lastState1 = LastState.newLine;
+                            // if input2's next is normal.
+                            //  if input1's last is '\r', then it can become "\r\n", then legal.
+                            //  otherwise illegal.
+                            if (lastState1 == LastState.R) {
+                                lastState1 = LastState.NEW_LINE;
                                 nowCheck1++;
                                 continue;
                             } else {
                                 return false;
                             }
                     }
                 default:
+                    // if input1's next is normal.
                     switch (charArray2[nowCheck2]) {
                         case '\n':
-                            if (lastState2 == LastState.r) {
-                                lastState2 = LastState.newLine;
+                            // if input2's next is '\n'.
+                            //  if input2's last is '\r', then it can become "\r\n", then legal.
+                            //  otherwise illegal.
+                            if (lastState2 == LastState.R) {
+                                lastState2 = LastState.NEW_LINE;
                                 nowCheck2++;
                                 continue;
                             } else {
                                 return false;
                             }
                         case '\r':
+                            // if input2's next is '\r'.
+                            // illegal.
                             return false;
                         default:
+                            // if input2's next is normal.
+                            //  if equal then legal.
+                            //  otherwise illegal.
                             if (charArray1[nowCheck1] != charArray2[nowCheck2]) {
                                 return false;
                             }
-                            lastState1 = lastState2 = LastState.normal;
+                            lastState1 = lastState2 = LastState.NORMAL;
                             nowCheck1++;
                             nowCheck2++;
                             continue;

Review comment:
       > Most of the comments above are unnecessary if the variables are well named and commented.
   > What is missing is how the algorithm works, and what the various combinations of state actually mean in terms of the algorithm.
   > 
   > I'm not clear why the code sometimes checks for '\r' and sometimes uses the enum.
   > Why not use a variable containing the last character?
   > 
   > ==
   > 
   > I suspect the code could be much simplified by using a suitable filter on the inputs.
   > There are several examples in IO, and NET has FromNetASCIIInputStream which deals with CRLF conversions
   
   
   
   > It occurs to me that the code is effectively buffering the output from a BufferedReader (or BufferedInputStream).
   > One would expect these classes to be reasonably fast, however the JVM has to do locking and other checks in order to support multi-threading. It has to do this for each read() call.
   > 
   > Rather than implement the buffering directly in the compare methods, it might be better to implement a generic, non-threadsafe filter that can be used in situations such as these.
   > 
   > The original code should then work without any change other than to add the filter.
   
   
   
   Hi.
   I used the filter idea you mentioned to write another implementation, named contentEqualsIgnoreEOLNew2.
   You can see it in the latest commit.
   A quick performance test shows that it is SLOWER than my giant function (called contentEqualsIgnoreEOLNew1), but still much FASTER than the original implementation in commons-io.
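
   For readers without the commit at hand, the filter idea from the quoted comment might look roughly like the sketch below. It is not the code of contentEqualsIgnoreEOLNew2. `EolNormalizingReader` and `EolCompareSketch` are hypothetical names, and the buffer size of 8192 is an arbitrary assumption. The filter is deliberately not thread-safe and maps "\r\n" and a lone '\r' to '\n'. Unlike a readLine-based comparison, this sketch does not treat a final line terminator as optional.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical non-thread-safe filter: maps "\r\n" and a lone '\r' to '\n'. */
final class EolNormalizingReader {
    private final Reader in;
    private final char[] buf = new char[8192]; // assumed size, chosen arbitrarily
    private int pos;
    private int limit;
    private boolean skipLf; // true right after a '\r' was returned as '\n'

    EolNormalizingReader(final Reader in) {
        this.in = in;
    }

    /** Returns the next normalized char, or -1 at end of input. */
    int read() throws IOException {
        while (true) {
            if (pos == limit) {
                int n;
                do {
                    n = in.read(buf, 0, buf.length);
                } while (n == 0);
                if (n == -1) {
                    return -1;
                }
                pos = 0;
                limit = n;
            }
            final char c = buf[pos++];
            if (skipLf) {
                skipLf = false;
                if (c == '\n') {
                    continue; // second half of "\r\n", already reported as '\n'
                }
            }
            if (c == '\r') {
                skipLf = true;
                return '\n';
            }
            return c;
        }
    }
}

/** Hypothetical comparison built on the filter above. */
final class EolCompareSketch {
    static boolean equalIgnoringEol(final Reader r1, final Reader r2) throws IOException {
        final EolNormalizingReader a = new EolNormalizingReader(r1);
        final EolNormalizingReader b = new EolNormalizingReader(r2);
        while (true) {
            final int c1 = a.read();
            final int c2 = b.read();
            if (c1 != c2) {
                return false;
            }
            if (c1 == -1) {
                return true; // both inputs ended at the same normalized position
            }
        }
    }

    static boolean equalIgnoringEol(final String s1, final String s2) {
        try {
            return equalIgnoringEol(new StringReader(s1), new StringReader(s2));
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args) {
        System.out.println(equalIgnoringEol("a\r\nb\rc\n", "a\nb\nc\n"));
    }
}
```

   The comparison loop stays a plain character-by-character check because all line-ending state lives in the filter, which is the simplification the quoted comment was aiming at.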

##########
File path: src/main/java/org/apache/commons/io/IOUtils.java
##########
@@ -1147,31 +1201,43 @@ public static boolean contentEqualsIgnoreEOL(final Reader input1, final Reader i
                             }
                             continue;
                         default:
-                            if (lastState1 == LastState.r) {
-                                lastState1 = LastState.newLine;
+                            // if input2's next is normal.
+                            //  if input1's last is '\r', then it can become "\r\n", then legal.
+                            //  otherwise illegal.
+                            if (lastState1 == LastState.R) {
+                                lastState1 = LastState.NEW_LINE;
                                 nowCheck1++;
                                 continue;
                             } else {
                                 return false;
                             }
                     }
                 default:
+                    // if input1's next is normal.
                     switch (charArray2[nowCheck2]) {
                         case '\n':
-                            if (lastState2 == LastState.r) {
-                                lastState2 = LastState.newLine;
+                            // if input2's next is '\n'.
+                            //  if input2's last is '\r', then it can become "\r\n", then legal.
+                            //  otherwise illegal.
+                            if (lastState2 == LastState.R) {
+                                lastState2 = LastState.NEW_LINE;
                                 nowCheck2++;
                                 continue;
                             } else {
                                 return false;
                             }
                         case '\r':
+                            // if input2's next is '\r'.
+                            // illegal.
                             return false;
                         default:
+                            // if input2's next is normal.
+                            //  if equal then legal.
+                            //  otherwise illegal.
                             if (charArray1[nowCheck1] != charArray2[nowCheck2]) {
                                 return false;
                             }
-                            lastState1 = lastState2 = LastState.normal;
+                            lastState1 = lastState2 = LastState.NORMAL;
                             nowCheck1++;
                             nowCheck2++;
                             continue;

Review comment:
       > It occurs to me that the code is effectively buffering the output from a BufferedReader (or BufferedInputStream).
   
   Yes, this is the main trick.
   For the two shorter functions, I used another trick to make them faster: a single shared check index for both buffers.
   
   > One would expect these classes to be reasonably fast, however the JVM has to do locking and other checks in order to support multi-threading. It has to do this for each read() call.
   > 
   > Rather than implement the buffering directly in the compare methods, it might be better to implement a generic, non-threadsafe filter that can be used in situations such as these.
   > 
   > The original code should then work without any change other than to add the filter.
   
   I used the filter idea you mentioned and wrote another implementation, named contentEqualsIgnoreEOLNew2.
   You can see it in the latest commit.
   A quick performance test shows that it is SLOWER than my giant function (called contentEqualsIgnoreEOLNew1), but still much FASTER than the original implementation in commons-io.
   
   I also worry that if we really split it out into a class, it may become even slower.
   (Still faster than the original, of course.)
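
For readers following the filter discussion, here is a minimal sketch of what a generic, non-threadsafe buffering filter could look like. It is not the code from the pull request: the class names `UnsyncBufferedReader` and `UnsyncDemo`, the helper `sameText`, and the 8192 buffer size are all assumptions made for illustration, and the sketch makes no claim about how it would perform against the variants measured above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/**
 * Hypothetical sketch: a buffering Reader that takes no locks.
 * Not part of Commons IO; safe only when used from a single thread.
 */
final class UnsyncBufferedReader extends Reader {
    private final Reader in;
    private final char[] buf;
    private int pos;   // index of the next char to hand out
    private int limit; // number of valid chars in buf, -1 once EOF is seen

    UnsyncBufferedReader(final Reader in, final int size) {
        this.in = in;
        this.buf = new char[size];
    }

    /** Serves one char from the local array and refills it on demand. */
    @Override
    public int read() throws IOException {
        if (pos >= limit) {
            if (limit < 0) {
                return -1; // EOF was already reached
            }
            do {
                limit = in.read(buf, 0, buf.length);
            } while (limit == 0);
            pos = 0;
            if (limit < 0) {
                return -1;
            }
        }
        return buf[pos++];
    }

    /** Bulk read built on read(); returns -1 only if nothing could be read. */
    @Override
    public int read(final char[] dst, final int off, final int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            final int c = read();
            if (c < 0) {
                break;
            }
            dst[off + n++] = (char) c;
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}

final class UnsyncDemo {
    /** Plain char-by-char comparison; the only change is the added filter. */
    static boolean sameText(final String a, final String b) {
        try (Reader x = new UnsyncBufferedReader(new StringReader(a), 8192);
             Reader y = new UnsyncBufferedReader(new StringReader(b), 8192)) {
            int c;
            while ((c = x.read()) != -1) {
                if (c != y.read()) {
                    return false;
                }
            }
            return y.read() == -1;
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the design is that the comparison loop stays a simple read-and-compare, while the buffering lives in one reusable class.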

##########
File path: src/main/java/org/apache/commons/io/IOUtils.java
##########
@@ -814,9 +814,21 @@ public static boolean contentEquals(final Reader input1, final Reader input2)
     }
 
     private enum LastState {
-        r,
-        normal,
-        newLine;
+        /**
+         * If last char is '\r'.
+         */
+        R,
+
+        /**
+         * If last char is not '\r' nor '\n'.
+         */
+        NORMAL,
+
+        /**
+         * If we just moved to a new line.
+         * It might sounds weird but after you see the codes you can know it.
+         */
+        NEW_LINE;

Review comment:
       @sebbASF I changed the enum to an int and got even better performance results, so I deleted the enum.
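
To show the general shape of the enum-to-int change, here is a small sketch of a three-state tracker written with int constants. It is not the code from the pull request: the class `LineBreakCounter`, the constant names, and the line-break-counting task are assumptions chosen only to keep the example short and checkable.

```java
/**
 * Hypothetical sketch: tracking "what was the last char" with int
 * constants instead of an enum. Names and values are illustrative.
 */
final class LineBreakCounter {
    static final int LAST_CR = 0;       // last char was '\r'
    static final int LAST_NORMAL = 1;   // last char was neither '\r' nor '\n'
    static final int LAST_NEW_LINE = 2; // last char was '\n'

    /** Counts line breaks, treating "\r\n" as a single break. */
    static int countLineBreaks(final String s) {
        int state = LAST_NORMAL;
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            final char c = s.charAt(i);
            switch (c) {
                case '\r':
                    count++;
                    state = LAST_CR;
                    break;
                case '\n':
                    // A '\n' right after '\r' completes "\r\n": no new break.
                    if (state != LAST_CR) {
                        count++;
                    }
                    state = LAST_NEW_LINE;
                    break;
                default:
                    state = LAST_NORMAL;
                    break;
            }
        }
        return count;
    }
}
```

The state comparisons become plain int comparisons; whether that is measurably faster than an enum depends on the JVM and the benchmark, so the sketch shows only the structure.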

##########
File path: src/main/java/org/apache/commons/io/IOUtils.java
##########
@@ -882,15 +894,26 @@ public static boolean contentEqualsIgnoreEOL(final Reader input1, final Reader i
                     nowPos2 += nowRead2;
                 }
             }
+

Review comment:
       @sebbASF 
   Well, it shouldn't happen in any case...
   If it really did happen, I think there must be a bug in the JVM, and I don't think using >= would make things correct again...

##########
File path: src/main/java/org/apache/commons/io/IOUtils.java
##########
@@ -1147,31 +1201,43 @@ public static boolean contentEqualsIgnoreEOL(final Reader input1, final Reader i
                             }
                             continue;
                         default:
-                            if (lastState1 == LastState.r) {
-                                lastState1 = LastState.newLine;
+                            // if input2's next is normal.
+                            //  if input1's last is '\r', then it can become "\r\n", then legal.
+                            //  otherwise illegal.
+                            if (lastState1 == LastState.R) {
+                                lastState1 = LastState.NEW_LINE;
                                 nowCheck1++;
                                 continue;
                             } else {
                                 return false;
                             }
                     }
                 default:
+                    // if input1's next is normal.
                     switch (charArray2[nowCheck2]) {
                         case '\n':
-                            if (lastState2 == LastState.r) {
-                                lastState2 = LastState.newLine;
+                            // if input2's next is '\n'.
+                            //  if input2's last is '\r', then it can become "\r\n", then legal.
+                            //  otherwise illegal.
+                            if (lastState2 == LastState.R) {
+                                lastState2 = LastState.NEW_LINE;
                                 nowCheck2++;
                                 continue;
                             } else {
                                 return false;
                             }
                         case '\r':
+                            // if input2's next is '\r'.
+                            // illegal.
                             return false;
                         default:
+                            // if input2's next is normal.
+                            //  if equal then legal.
+                            //  otherwise illegal.
                             if (charArray1[nowCheck1] != charArray2[nowCheck2]) {
                                 return false;
                             }
-                            lastState1 = lastState2 = LastState.normal;
+                            lastState1 = lastState2 = LastState.NORMAL;
                             nowCheck1++;
                             nowCheck2++;
                             continue;

Review comment:
       @sebbASF 
   > Most of the comments above are unnecessary if the variables are well named and commented.
   > What is missing is how the algorithm works, and what the various combinations of state actually mean in terms of the algorithm.
   
   I added more comments, please have a look.
   
   > I'm not clear why the code sometimes checks for '\r' and sometimes uses the enum.
   > Why not use a variable containing the last character?
   
   Because in some cases, even if the last character is '\r', we should treat it as the start of a new line.
   For example, in
   "aaaaa\r\naaaaa" and "aaaaa\raaaaa",
   although the second string has only \r, it still needs to be treated as a new line, so "new line" here does not mean LF specifically, but a line break in general.
   
   That said, we can use an int instead of the enum.
   I tried it, and it works, and it is even faster than the enum version....
   
   > I suspect the code could be much simplified by using a suitable filter on the inputs.
   
   Yes, I think so. But doing so would come at a performance cost.
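
The "\r\n" versus lone "\r" example can be made concrete with a small sketch. This is a simplified, String-based illustration of the intended semantics, not the buffer-based algorithm in the pull request. The class `EolCompare` and its method names are hypothetical, and it assumes that "\r\n", "\r" and "\n" each count as exactly one line break and that a trailing line break is significant, which may differ from the library's behavior.

```java
/**
 * Hypothetical sketch of EOL-insensitive comparison on Strings.
 * Assumes "\r\n", "\r" and "\n" each count as one line break.
 */
final class EolCompare {
    static boolean isEol(final char c) {
        return c == '\r' || c == '\n';
    }

    /** Returns the index just past the line break that starts at i. */
    static int skipEol(final String s, final int i) {
        if (s.charAt(i) == '\r' && i + 1 < s.length() && s.charAt(i + 1) == '\n') {
            return i + 2; // "\r\n" is consumed as one break
        }
        return i + 1;     // lone '\r' or lone '\n'
    }

    static boolean equalsIgnoreEol(final String s1, final String s2) {
        int i = 0;
        int j = 0;
        while (i < s1.length() && j < s2.length()) {
            final char a = s1.charAt(i);
            final char b = s2.charAt(j);
            if (isEol(a) != isEol(b)) {
                return false; // a line break on one side only
            }
            if (isEol(a)) {
                i = skipEol(s1, i);
                j = skipEol(s2, j);
            } else {
                if (a != b) {
                    return false;
                }
                i++;
                j++;
            }
        }
        // Both inputs must be exhausted at the same time.
        return i == s1.length() && j == s2.length();
    }
}
```

With these assumptions, `equalsIgnoreEol("aaaaa\r\naaaaa", "aaaaa\raaaaa")` is true, because the lone '\r' in the second string is still one line break, which is why remembering only the last character is not enough.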
   
   





Issue Time Tracking
-------------------

    Worklog Id:     (was: 464221)
    Time Spent: 5h 20m  (was: 5h 10m)

> IOUtils.contentEquals is of low performance. I will refine it.
> --------------------------------------------------------------
>
>                 Key: IO-670
>                 URL: https://issues.apache.org/jira/browse/IO-670
>             Project: Commons IO
>          Issue Type: Improvement
>            Reporter: Jin Xu
>            Priority: Critical
>         Attachments: jmh-result.org.apache.json
>
>          Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/commons-io/pull/118]



