dtspence commented on code in PR #3549:
URL: https://github.com/apache/accumulo/pull/3549#discussion_r1245727893
##########
core/src/main/java/org/apache/accumulo/core/metadata/ValidationUtil.java:
##########
@@ -62,9 +62,12 @@ public static void validateRFileName(String fileName) {
public static void validateFileName(String fileName) {
Objects.requireNonNull(fileName);
- if (!fileName.matches("[\\dA-Za-z._-]+")) {
- throw new IllegalArgumentException(
- "Provided filename (" + fileName + ") contains invalid characters.");
+ for (int i = 0; i < fileName.length(); i++) {
+ final char ch = fileName.charAt(i);
+ if (!(Character.isLetterOrDigit(ch) || ch == '-' || ch == '.' || ch ==
'_')) {
+ throw new IllegalArgumentException(
+ "Provided filename (" + fileName + ") contains invalid
characters.");
+ }
Review Comment:
I unsure what the `validateFileName` change is doing to the overall GC
references benchmark. The last number has variance and looking at it further
now. The average time observed across multiple benchmark executions based on
A/B for `validateFileName`, seemed to show some difference (i.e. 233ms vs
245ms). The HEAD difference from 521ms to 233ms/245ms is accurate and had a
larger impact from the `URI` and `validateFileName` regex removal. In context
of the `validateFileName` I had used a separate benchmark for the method itself.
Just as minor note the `splitRfile` parameter is saying (for the larger
number) 1K splits w/100 r-files per split. I am investigating the iterations
question further.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]