dtspence commented on code in PR #3549:
URL: https://github.com/apache/accumulo/pull/3549#discussion_r1245727893


##########
core/src/main/java/org/apache/accumulo/core/metadata/ValidationUtil.java:
##########
@@ -62,9 +62,12 @@ public static void validateRFileName(String fileName) {
 
   public static void validateFileName(String fileName) {
     Objects.requireNonNull(fileName);
-    if (!fileName.matches("[\\dA-Za-z._-]+")) {
-      throw new IllegalArgumentException(
-          "Provided filename (" + fileName + ") contains invalid characters.");
+    for (int i = 0; i < fileName.length(); i++) {
+      final char ch = fileName.charAt(i);
+      if (!(Character.isLetterOrDigit(ch) || ch == '-' || ch == '.' || ch == 
'_')) {
+        throw new IllegalArgumentException(
+            "Provided filename (" + fileName + ") contains invalid 
characters.");
+      }

Review Comment:
   I unsure what the `validateFileName` change is doing to the overall GC 
references benchmark. The last number has variance and looking at it further 
now. The average time observed across multiple benchmark executions based on 
A/B for `validateFileName`, seemed to show some difference (i.e. 233ms vs 
245ms). The HEAD difference from 521ms to 233ms/245ms is accurate and had a 
larger impact from the `URI` and `validateFileName` regex removal. In context 
of the `validateFileName` I had used a separate benchmark for the method itself.
   
   Just as minor note the `splitRfile` parameter is saying (for the larger 
number) 1K splits w/100 r-files per split. I am investigating the iterations 
question further.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to