Hi all, During investigation of an expression language issue posted to the list, I discovered that replace explicitly delegates to a String#replace invocation that only accepts literal expressions, not regular expressions, while replaceAll accepts regular expressions. I thought this was an oversight and filed NIFI-1919 [1] to document and fix this, by changing the ReplaceEvaluator [2] to use String#replaceFirst, which accepts regular expressions. I wrote failing unit tests [3] to capture the fix. After implementing the change, two existing unit tests [4] broke, which indicated a regression. At first, I believed these two tests to be incorrect, but further investigation showed they were merely surprising.
TestQuery#testQuotingQuotes (below) fails on the second verifyEquals call, but the test is asserting that replace should replace multiple instances of the single quote. While this is similar to String#replace, because the expression language exposes only two methods — replace vs. replaceAll — one could easily assume the difference between the two was the number of attempted replacements, rather than the actual difference, which is literal expression vs. pattern. @Test public void testQuotingQuotes() { final Map<String, String> attributes = new HashMap<>(); attributes.put("xx", "say 'hi'"); String query = "${xx:replaceAll( \"'.*'\", '\\\"hello\\\"' )}"; verifyEquals(query, attributes, "say \"hello\""); query = "${xx:replace( \"'\", '\"')}"; verifyEquals(query, attributes, "say \"hi\""); query = "${xx:replace( '\\'', '\"')}"; System.out.println(query); verifyEquals(query, attributes, "say \"hi\""); } TestQuery#testReplaceAllWithOddNumberOfBackslashPairs (below) also fails on the first verifyEquals call with a PatternSyntaxException. I am investigating that further. @Test public void testReplaceAllWithOddNumberOfBackslashPairs() { final Map<String, String> attributes = new HashMap<>(); attributes.put("filename", "C:\\temp\\.txt"); verifyEquals("${filename:replace('\\\\', '/')}", attributes, "C:/temp/.txt"); verifyEquals("${filename:replaceAll('\\\\\\\\', '/')}", attributes, "C:/temp/.txt"); verifyEquals("${filename:replaceAll('\\\\\\.txt$', '')}", attributes, "C:\\temp"); } While I originally had just modified replace, after looking at the EL documentation [5], replace is explicitly documented to only replace literal expressions, and does so multiple times, as does Java’s String#replace [6]. I now propose to add another method replaceFirst, which accepts a pattern and replaces only the first occurrence. I will update the unit tests to properly capture this, and will update the documentation to reflect the new method. Thoughts from the community? [1] https://issues.apache.org/jira/browse/NIFI-1919 <https://issues.apache.org/jira/browse/NIFI-1919> [2] https://github.com/alopresto/nifi/blob/NIFI-1919/nifi-commons/nifi-expression-language/src/main/java/org/apache/nifi/attribute/expression/language/evaluation/functions/ReplaceEvaluator.java <https://github.com/alopresto/nifi/blob/NIFI-1919/nifi-commons/nifi-expression-language/src/main/java/org/apache/nifi/attribute/expression/language/evaluation/functions/ReplaceEvaluator.java> [3] https://github.com/alopresto/nifi/blob/NIFI-1919/nifi-commons/nifi-expression-language/src/test/groovy/org/apache/nifi/attribute/expression/language/QueryGroovyTest.groovy <https://github.com/alopresto/nifi/blob/NIFI-1919/nifi-commons/nifi-expression-language/src/test/groovy/org/apache/nifi/attribute/expression/language/QueryGroovyTest.groovy> [4] https://github.com/alopresto/nifi/blob/NIFI-1919/nifi-commons/nifi-expression-language/src/test/java/org/apache/nifi/attribute/expression/language/TestQuery.java <https://github.com/alopresto/nifi/blob/NIFI-1919/nifi-commons/nifi-expression-language/src/test/java/org/apache/nifi/attribute/expression/language/TestQuery.java> [5] https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#replace [6] https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#replace(java.lang.CharSequence,%20java.lang.CharSequence) <https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#replace(java.lang.CharSequence, java.lang.CharSequence)> Andy LoPresto alopre...@apache.org alopresto.apa...@gmail.com PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
signature.asc
Description: Message signed with OpenPGP using GPGMail