Hello,

The Java regular expression engine is based on backtracking, so a
malicious regular expression can cause a ReDoS (regular expression denial
of service).

When a ReDoS occurs in HBase, the region server's handler stays occupied
and, as a result, cannot process users' requests.

To avoid this, we need to use a regular expression engine that does not
use backtracking, such as RE2.

However, there are cases where the Java regular expression engine has to
be used. In those cases, it would be nice to add a timeout so that pattern
matching does not run for longer than a certain period.

The Java regular expression engine has no built-in timeout. However, a
timeout can be implemented by wrapping the input in a CharSequence that
overrides the charAt method, as InterruptibleCharSequence[1] does.
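
For illustration, here is a minimal sketch of that idea. It is a variant
that checks a deadline inside charAt rather than copying the linked class,
and every name in it (TimeLimitedCharSequence, RegexTimeoutException,
TimedRegex) is hypothetical and not part of HBase or the JDK. It assumes
the engine reads its input through charAt.

```java
import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;

/** Hypothetical unchecked exception thrown when the deadline has passed. */
final class RegexTimeoutException extends RuntimeException {
    RegexTimeoutException(String message) {
        super(message);
    }
}

/** Hypothetical wrapper: charAt fails once the deadline has been reached. */
final class TimeLimitedCharSequence implements CharSequence {
    private final CharSequence inner;
    private final long deadlineNanos;

    TimeLimitedCharSequence(CharSequence inner, long deadlineNanos) {
        this.inner = inner;
        this.deadlineNanos = deadlineNanos;
    }

    @Override
    public char charAt(int index) {
        // Assumption: the regex engine reads every input character through
        // charAt, so a runaway match reaches this check repeatedly.
        if (System.nanoTime() - deadlineNanos >= 0) {
            throw new RegexTimeoutException("regex matching timed out");
        }
        return inner.charAt(index);
    }

    @Override
    public int length() {
        return inner.length();
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        // Keep the same deadline for any sub-sequence handed out.
        return new TimeLimitedCharSequence(
            inner.subSequence(start, end), deadlineNanos);
    }

    @Override
    public String toString() {
        return inner.toString();
    }
}

/** Hypothetical entry point that applies a time budget to Matcher.find(). */
final class TimedRegex {
    static boolean find(Pattern pattern, CharSequence input, long timeoutMillis) {
        long deadline =
            System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        // RegexTimeoutException propagates out of find() if the budget runs out.
        return pattern.matcher(
            new TimeLimitedCharSequence(input, deadline)).find();
    }
}
```

A caller would catch RegexTimeoutException around TimedRegex.find(...) and
treat it as a failed or rejected match. Reading the clock on every charAt
call has a cost, so a real implementation might check only every N calls.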

This approach depends on the engine reading its input through charAt, so
it may stop working if the Java regular expression engine implementation
changes. Still, it seems to be the best way to add a timeout.

Would it be appropriate to add such a timeout to JavaRegexEngine[2]? Or is
there a better way?

Thanks.

[1]:
https://github.com/internetarchive/archive-commons/blob/b5f0b8f549fcd108c4b0b6f2182603bb9d037e9e/archive-commons/src/main/java/org/archive/util/InterruptibleCharSequence.java
[2]:
https://github.com/apache/hbase/blob/74fd5b2e68f0cdc8ae35dd7ba7963e0d0e6fc161/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/RegexStringComparator.java#L257
