Hi, Can we have a parameter to control the recursion depth in these cases to avoid crashes? Just a thought.
Thanks, Mahendrakar. On Wed, 24 Aug, 2022, 3:21 pm Егор Чиндяскин, <kyzeva...@mail.ru> wrote: > Hello, I recently got a server crash (bug #17583 [1]) caused by a stack > overflow. > > Tom Lane and Richard Guo, in a discussion of this bug, suggested that > there could be more such places. > Therefore, Alexander Lakhin and I decided to deal with this issue and > Alexander developed a methodology. We processed src/backend/*/*.c with > "clang -emit-llvm ... | opt -analyze -print-calgraph" to find all the > functions that call themselves directly. I checked each of them for > features that protect against stack overflows. > We analyzed 4 catalogs: regex, tsearch, snowball and adt. > Firstly, we decided to test the regex catalog functions and found 6 of > them that lack the check_stach_depth() call. > > zaptreesubs > markst > next > nfatree > numst > repeat > > We have tried to exploit the recursion in the function zaptreesubs(): > select regexp_matches('a' || repeat(' a', 11000), '(.)(' || repeat(' \1', > 11000) || ')?'); > > ERROR: invalid regular expression: regular expression is too complex > > repeat(): > select regexp_match('abc01234xyz',repeat('a{0,2}',100001)); > > ERROR: invalid regular expression: regular expression is too complex > > numst(): > select regexp_match('abc01234xyz',repeat('(.)\1e',100001)); > > ERROR: invalid regular expression: regular expression is too complex > > markst(): > markst is called in the code after v->tree = parse(...); > it is necessary that the tree be successfully parsed, but with a nesting > level of about 100,000 this will not work - stack protection will work > during parsing and v->ntree = numst(...); is also there. > > next(): > we were able to crash the server with the following query: > (printf "SELECT regexp_match('abc', 'a"; for ((i=1;i<1000000;i++)); do > printf "(?#)"; done; printf "b')" ) | psql > > Secondly, we have tried to exploit the recursion in the adt catalog > functions and Alexander was able to crash the server with the following > query: > > regex_selectivity_sub(): > SELECT * FROM pg_proc WHERE proname ~ ('(a' || repeat('|', 200000) || > 'b)'); > > And this query: > > (n=100000; > printf "SELECT polygon '((0,0),(0,1000000))' <@ polygon > '((-200000,1000000),"; > for ((i=1;i<$n;i++)); do printf "(100000,$(( 300000 + > $i))),(-100000,$((800000 + $i))),"; done; > printf "(200000,900000),(200000,0))';" > ) | psql > > Thirdly, the snowball catalog, Alexander has tried to exploit the > recursion in the r_stem_suffix_chain_before_ki function and crashed a > server using this query: > > r_stem_suffix_chain_before_ki(): > SELECT ts_lexize('turkish_stem', repeat('lerdeki', 1000000)); > > The last one is the tsearch catalog. We have found 4 functions that didn't > have check_stach_depth() function: > > SplitToVariants > mkANode > mkSPNode > LexizeExec > > We have tried to exploit the recursion in the SplitToVariants function and > Alexander crashed a server using this: > > SplitToVariants(): > CREATE TEXT SEARCH DICTIONARY ispell (Template=ispell, > DictFile=ispell_sample,AffFile=ispell_sample); > SELECT ts_lexize('ispell', repeat('bally', 10000)); > > After trying to exploit the recursion in the LexizeExec function Alexander > made this conlusion: > > LexizeExec has two branches "ld->curDictId == InvalidOid" (usual mode) and > "ld->curDictId != InvalidOid" (multiword mode) - we start with the first > one, then make recursive call to switch to the multiword mode, but then we > return to the usual mode again. > > mkANode and mkSPNode deal with the dictionary structs, not with > user-supplied data, so we believe these functions are not vulnerable. > > [1] > https://www.postgresql.org/message-id/flat/CAMbWs499ytQiH4mLMhRxRWP-iEUz3-DSinpAD-cUCtVo_23Wtg%40mail.gmail.com#03ad703cf4bc8d28ccba69913e1e8106 >