On Wed, Jan 23, 2013 at 7:29 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Heikki Linnakangas <hlinnakan...@vmware.com> writes:
> > On 23.01.2013 09:36, Alexander Korotkov wrote:
> >> On Wed, Jan 23, 2013 at 6:08 AM, Tom Lane<t...@sss.pgh.pa.us>  wrote:
> >>> The biggest problem is that I really don't care for the idea of
> >>> contrib/pg_trgm being this cozy with the innards of regex_t.
>
> >> The only option I see now is to provide a method like "export_cnfa" which
> >> would export the corresponding CNFA in a fixed format.
>
> > Yeah, I think that makes sense. The transformation code in trgm_regexp.c
> > would probably be more readable too, if it didn't have to deal with the
> > regex guts representation of the CNFA. Also, once you have an intermediate
> > representation of the original CNFA, you could do some of the
> > transformation work on that representation, before building the
> > "transformed graph" containing trigrams. You could eliminate any
> > non-alphanumeric characters, joining states connected by arcs with
> > non-alphanumeric characters, for example.
>
> It's not just the CNFA though; the other big API problem is with mapping
> colors back to characters.  Right now, that not only knows way too much
> about a part of the regex internals we have ambitions to change soon,
> but it also requires pg_wchar2mb_with_len() and lowerstr(), neither of
> which should be known to the regex library IMO.  So I'm not sure how we
> divvy that up sanely.  To be clear: I'm not going to insist that we have
> to have a clean API factorization before we commit this at all.  But it
> worries me if we don't even know how we could get to that, because we
> are going to need it eventually.
>

We probably don't have enough time before 9.3 to solve the API
problem :(. It's likely we have to choose: either commit to 9.3 without a
clean API factorization, or postpone it to 9.4.
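
For the sake of discussion, here is a rough sketch of what a fixed-format
export could look like. Everything in it is hypothetical: the struct names,
the field layout and both helper functions are invented for illustration and
do not exist in the regex library or in pg_trgm. The idea is that the regex
side would hand out only states, arcs, and the raw wide characters belonging
to each color, while the caller would do the multibyte conversion and
lowercasing itself (the pg_wchar2mb_with_len() and lowerstr() part).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-format export; none of these names are real APIs. */
typedef struct { int color; int to; } ExportedArc;            /* arc labelled by color */
typedef struct { int first_arc; int narcs; } ExportedState;   /* slice of arcs[] */
typedef struct { int first_char; int nchars; } ExportedColor; /* slice of chars[] */

typedef struct
{
    int nstates, ncolors;
    int initial, final;           /* indexes into states[] */
    const ExportedState *states;
    const ExportedArc *arcs;
    const ExportedColor *colors;
    const unsigned int *chars;    /* raw wide characters, no case folding */
} ExportedNfa;

/*
 * Collect the states reachable from 'state' over arcs of 'color'.
 * Stores at most max_out targets; returns the total number found,
 * so the caller can detect truncation.
 */
int
exported_nfa_targets(const ExportedNfa *nfa, int state, int color,
                     int *out, int max_out)
{
    assert(state >= 0 && state < nfa->nstates);
    const ExportedState *s = &nfa->states[state];
    int found = 0;

    for (int i = 0; i < s->narcs; i++)
    {
        const ExportedArc *a = &nfa->arcs[s->first_arc + i];

        if (a->color != color)
            continue;
        if (found < max_out)
            out[found] = a->to;
        found++;
    }
    return found;
}

/*
 * Copy the wide characters of 'color' into out[]. Stores at most
 * max_out characters; returns the total number the color has.
 * Conversion to multibyte and lowercasing stay on the caller's side.
 */
int
exported_color_chars(const ExportedNfa *nfa, int color,
                     unsigned int *out, int max_out)
{
    assert(color >= 0 && color < nfa->ncolors);
    const ExportedColor *c = &nfa->colors[color];
    int n = (c->nchars < max_out) ? c->nchars : max_out;

    for (int i = 0; i < n; i++)
        out[i] = nfa->chars[c->first_char + i];
    return c->nchars;
}
```

With something along these lines, trgm_regexp.c would only walk flat arrays
and would not need to know how colors are stored internally. Whether a flat
per-color character list is workable for large colors is an open question
that this sketch does not address.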

------
With best regards,
Alexander Korotkov.
