I made an issue[1] and a PR[2].

[1]: https://issues.apache.org/jira/browse/HBASE-27219
[2]: https://github.com/apache/hbase/pull/4632
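
For context on the "invalid UTF8" input mentioned in the quoted thread: HBase row keys and values are arbitrary byte arrays, so a regex filter can be handed bytes that are not well-formed UTF-8. The sketch below shows one way to tell such input apart using only the JDK's strict decoder. It is an illustration only and is not code from the PR; the class name Utf8Check and the method isValidUtf8 are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of HBase, joni, or jcodings.
final class Utf8Check {
  private Utf8Check() {}

  /** Returns true if the bytes decode as well-formed UTF-8, false otherwise. */
  static boolean isValidUtf8(byte[] bytes) {
    // REPORT makes the decoder throw instead of silently substituting U+FFFD.
    CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT);
    try {
      decoder.decode(ByteBuffer.wrap(bytes));
      return true;
    } catch (CharacterCodingException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    byte[] wellFormed = "row-1".getBytes(StandardCharsets.UTF_8);
    // 0xC3 starts a two-byte sequence, but 0x28 is not a continuation byte.
    byte[] malformed = {(byte) 0xC3, (byte) 0x28};
    System.out.println(isValidUtf8(wellFormed));
    System.out.println(isValidUtf8(malformed));
  }
}
```

This only identifies such input. The proposal in the thread is to change the encoding used by JoniRegexEngine, not to pre-validate the data.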

On 2022/07/16 18:05:32 Andrew Purtell wrote:
> Please do file an issue on our issue tracker: https://issues.apache.org/jira . The project name is HBASE, of course.
>
> I think we may have bigger issues here, because joni was recently flagged by static analysis tools we use at my employer to determine compliance with various government requirements. I would assume a CVE has been filed regarding joni. I plan to dig in here soon. A required upgrade of joni could by extension provoke an upgrade of JRuby. Sean, I recall you recently landed some changes in that regard, but only back to branch-2. If so, this encoding issue would by comparison be a smaller detail to address concurrently. In any case, let’s track the problem.
>
> > On Jul 16, 2022, at 10:43 AM, Sean Busbey <bu...@apache.org> wrote:
> >
> > That sounds reasonable. Could you file an issue in our issue tracker? Are
> > you up for working on a PR?
> >
> >
> >> On Wed, Jul 13, 2022 at 2:27 AM Minwoo Kang <it...@gmail.com>
> >> wrote:
> >>
> >> Hello,
> >>
> >> I checked whether JONI can be used in RegexStringComparator.
> >> After I changed the engine of RegexStringComparator to JONI and sent a
> >> regex filter request, heap memory usage spiked and the RegionServer
> >> stopped working due to GC.
> >>
> >> When I looked into the cause, I found that with UTF8Encoding an
> >> infinite loop can occur if invalid UTF-8 is given as input.[1]
> >> Trino uses NonStrictUTF8Encoding instead of UTF8Encoding.
> >>
> >> After I changed the encoding of JoniRegexEngine to NonStrictUTF8Encoding in
> >> RegexStringComparator, I confirmed that the heap memory usage spike
> >> was gone.[2]
> >>
> >> It seems that HBase, like Trino, needs to use NonStrictUTF8Encoding
> >> instead of UTF8Encoding as JoniRegexEngine's encoding.
> >> What do you think about changing JoniRegexEngine's encoding to
> >> NonStrictUTF8Encoding?
> >>
> >> Best Regards,
> >> Minwoo
> >>
> >>> On 2022/06/27 04:41:41 Minwoo Kang wrote:
> >>> (The first time, I sent this mail with the title in Korean. I'm so sorry.)
> >>>
> >>> Hello,
> >>>
> >>> Recently, java.util.regex in the regex filter (RegexStringComparator)
> >>> ran forever.
> >>> It is said that java.util.regex can run forever or overflow the stack
> >>> in the worst case.
> >>>
> >>> Looking at RegexStringComparator, I saw that two regex implementations
> >>> (java, joni) are provided.
> >>> I was wondering whether anyone has experience changing the regex engine
> >>> in RegexStringComparator to joni and operating it.
> >>>
> >>> Best Regards,
> >>> Minwoo
> >>>
> >>> On 2022/06/27 04:37:11 Minwoo Kang wrote:
> >>>> Hello,
> >>>>
> >>>> Recently, java.util.regex in the regex filter (RegexStringComparator)
> >>>> ran forever.
> >>>> It is said that java.util.regex can run forever or overflow the stack
> >>>> in the worst case.
> >>>>
> >>>> Looking at RegexStringComparator, I saw that two regex implementations
> >>>> (java, joni) are provided.
> >>>> I was wondering whether anyone has experience changing the regex engine
> >>>> in RegexStringComparator to joni and operating it.
> >>>>
> >>>> Best Regards,
> >>>> Minwoo
> >>>>
> >>>
> >>
>