featzhang commented on code in PR #28140:
URL: https://github.com/apache/flink/pull/28140#discussion_r3222898528


##########
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/SqlFunctionUtils.java:
##########
@@ -444,26 +443,23 @@ public static String regexpReplace(String str, String 
regex, String replacement)
 
     /**
      * Returns a string extracted with a specified regular expression and a 
regex match group index.
+     * Literal regexes are validated at planning time by the input type 
strategy.
      */
     public static String regexpExtract(String str, String regex, int 
extractIndex) {
-        if (str == null || regex == null) {
+        if (str == null || regex == null || extractIndex < 0) {
             return null;
         }
-
         try {
-            Matcher m = Pattern.compile(regex).matcher(str);
+            final Matcher m = REGEXP_PATTERN_CACHE.get(regex).matcher(str);
+            if (m.groupCount() < extractIndex) {
+                return null;
+            }
             if (m.find()) {
-                MatchResult mr = m.toMatchResult();
-                return mr.group(extractIndex);
+                return m.group(extractIndex);
             }
-        } catch (Exception e) {
-            LOG.error(
-                    String.format(
-                            "Exception in regexpExtract('%s', '%s', '%d')",
-                            str, regex, extractIndex),
-                    e);
+        } catch (PatternSyntaxException e) {

Review Comment:
   ### Potential Uncaught Exception: `ExecutionException`
   
   The current code only catches `PatternSyntaxException`, but Guava's 
`LoadingCache.get()` declares `throws ExecutionException` in its API contract:
   
   ```java
   V get(K key) throws ExecutionException;
   ```
   
   **Analysis:**
   
   While the current `CacheLoader` implementation (which calls 
`Pattern.compile(regex)`) is unlikely to throw `ExecutionException` in 
practice, not handling it introduces:
   
   1. **API Contract Violation** — Calling a method that declares a checked 
exception without handling it
   2. **Future Maintenance Risk** — If the cache loader is modified (e.g., for 
validation, metrics, or external config), checked exceptions will be wrapped in 
`ExecutionException`
   3. **Edge Case Robustness** — Thread interruptions or cache eviction errors 
could theoretically trigger it
   
   **Suggested Fix:**
   
   ```java
   } catch (PatternSyntaxException | ExecutionException e) {
       // Invalid regex pattern or cache loading failure - return null silently
   }
   ```
   
   **Rationale:**
   
   This is a low-risk change (no performance impact, no behavior change) that 
ensures:
   - ✅ Compliance with Guava's API contract
   - ✅ Future-proof implementation
   - ✅ Consistency with Flink's defensive programming practices
   
   This builds on @dylanhz's [earlier 
comment](https://github.com/apache/flink/pull/28140#discussion_r1635967892) 
about exception propagation, ensuring complete coverage of all declared 
exceptions.
   
   ---
   
   **Alternative:** If catching `ExecutionException` is deemed unnecessary, I 
suggest adding a comment documenting the intentional deviation from the API 
contract for future maintainers.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to