Hi, > Hi Ishii-san, > > On 7/19/23 22:15, Tatsuo Ishii wrote: >> Currently my patch has a limitation for the sake of simple >> implementation: a pattern like "A+" is parsed and analyzed in the raw >> parser. This makes subsequent process much easier because the pattern >> element variable (in this case "A") and the quantifier (in this case >> "+") is already identified by the raw parser. However there are much >> more cases are allowed in the standard as you already pointed out. For >> those cases probably we should give up to parse PATTERN items in the >> raw parser, and instead the raw parser just accepts the elements as >> Sconst? > > Is there a concern that the PATTERN grammar can't be represented in > Bison? I thought it was all context-free...
I don't know at this point. I think context-free is not enough to be repsented in Bison. The grammer also needs to be LALR(1). Moreover, adding the grammer to existing parser may generate shift/reduce errors. > Or is the concern that the > parse tree of the pattern is hard to feed into a regex engine? One small concern is how to convert pattern variables to regex expression which our regex enginer understands. Suppose, PATTERN UP+ Currently I convert "UP+" to "U+" so that it can be fed to the regexp engine. In order to do that, we need to know which part of the pattern (UP+) is the pattern variable ("UP"). For "UP+" it's quite easy. But for more complex regular expressions it would be not, unless PATTERN grammer can be analyzed by our parser to know which part is the pattern variable. >> Any comments, especially on the PREV/NEXT implementation part is >> welcome. Currently the DEFINE expression like "price > PREV(price)" is >> prepared in ExecInitWindowAgg using ExecInitExpr,tweaking var->varno >> in Var node so that PREV uses OUTER_VAR, NEXT uses INNER_VAR. Then >> evaluate the expression in ExecWindowAgg using ExecEvalExpr, setting >> previous row TupleSlot to ExprContext->ecxt_outertuple, and next row >> TupleSlot to ExprContext->ecxt_innertuple. I think this is temporary >> hack and should be gotten ride of before v1 is committed. Better idea? > > I'm not familiar enough with this code yet to offer very concrete > suggestions, sorry... But at some point in the future, we need to be > able to skip forward and backward from arbitrary points, like > > DEFINE B AS B.price > PREV(FIRST(A.price), 3) > > so there won't be just one pair of "previous and next" tuples. Yes, I know. > Maybe > that can help clarify the design? It feels like it'll need to eventually > be a "real" function that operates on the window state, even if it > doesn't support all of the possible complexities in v1. Unfortunately an window function can not call other window functions. > Taking a closer look at the regex engine: > > It looks like the + qualifier has trouble when it meets the end of the > frame. For instance, try removing the last row of the 'stock' table in > rpr.sql; some of the final matches will disappear unexpectedly. Or try a > pattern like > > PATTERN ( a+ ) > DEFINE a AS TRUE > > which doesn't seem to match anything in my testing. > > There's also the issue of backtracking in the face of reclassification, > as I think Vik was alluding to upthread. The pattern > > PATTERN ( a+ b+ ) > DEFINE a AS col = 2, > b AS col = 2 > > doesn't match a sequence of values (2 2 ...) with the current > implementation, even with a dummy row at the end to avoid the > end-of-frame bug. > > (I've attached two failing tests against v2, to hopefully better > illustrate, along with what I _think_ should be the correct results.) Thanks. I will look into this. > I'm not quite understanding the match loop in evaluate_pattern(). It > looks like we're building up a string to pass to the regex engine, but > by the we call regexp_instr, don't we already know whether or not the > pattern will match based on the expression evaluation we've done? For "+" yes. But for more complex regular expression like '{n}', we need to call our regexp engine to check if the pattern matches. Best reagards, -- Tatsuo Ishii SRA OSS LLC English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp