Re: Permute underscore separated components of columns before fuzzy matching
> On 23 Jan 2024, at 09:42, Arne Roland wrote: > > <0001-fuzzy_underscore_permutation_v5.patch> Mikhail, there’s a new patch version. May I ask you to review it? Best regards, Andrey Borodin.
Re: Permute underscore separated components of columns before fuzzy matching
Thank you! I wasn't aware of the filter per person. It was quite simple integrate a web scraper into my custom push system. Regarding the patch: I ran the 2.1.1 version of pg_bsd_indent now. I hope that suffices. I removed the matrix declaration to make it C90 complaint. I attached the result. Regards Arne On 2024-01-22 19:22, Tom Lane wrote: Arne Roland writes: Thank you for bringing that to my attention. Is there a way to subscribe to cf-bot failures? I don't know of any push notification support in cfbot, but you can bookmark the page with your own active patches, and check it periodically: http://commitfest.cputube.org/arne-roland.html (For others, click on your own name in the main cfbot page's entry for one of your patches to find out how it spelled your name for this purpose.) regards, tom lanediff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c index 34a0ec5901..69abac7c92 100644 --- a/src/backend/parser/parse_relation.c +++ b/src/backend/parser/parse_relation.c @@ -579,37 +579,18 @@ GetCTEForRTE(ParseState *pstate, RangeTblEntry *rte, int rtelevelsup) return NULL;/* keep compiler quiet */ } -/* - * updateFuzzyAttrMatchState - * Using Levenshtein distance, consider if column is best fuzzy match. - */ static void -updateFuzzyAttrMatchState(int fuzzy_rte_penalty, - FuzzyAttrMatchState *fuzzystate, RangeTblEntry *rte, - const char *actual, const char *match, int attnum) +updateFuzzyAttrMatchStateSingleString(int fuzzy_rte_penalty, + FuzzyAttrMatchState *fuzzystate, RangeTblEntry *rte, + const char *actual, const char *match, int attnum, int matchlen) { - int columndistance; - int matchlen; - - /* Bail before computing the Levenshtein distance if there's no hope. */ - if (fuzzy_rte_penalty > fuzzystate->distance) - return; - - /* - * Outright reject dropped columns, which can appear here with apparent - * empty actual names, per remarks within scanRTEForColumn(). - */ - if (actual[0] == '\0') - return; - /* Use Levenshtein to compute match distance. */ - matchlen = strlen(match); - columndistance = - varstr_levenshtein_less_equal(actual, strlen(actual), match, matchlen, - 1, 1, 1, - fuzzystate->distance + 1 - - fuzzy_rte_penalty, - true); + int columndistance = + varstr_levenshtein_less_equal(actual, strlen(actual), match, matchlen, + 1, 1, 1, + fuzzystate->distance + 1 + - fuzzy_rte_penalty, + true); /* * If more than half the characters are different, don't treat it as a @@ -667,6 +648,207 @@ updateFuzzyAttrMatchState(int fuzzy_rte_penalty, } } +static void +putUnderscores(char *string, int start, int underscore_amount, int len) +{ + for (int i = 0; start + i < len && i < underscore_amount; i++) + string[start + i] = '_'; +} + +/* + * updateFuzzyAttrMatchState + * Using Levenshtein distance, consider if column is best fuzzy match. + */ +static void +updateFuzzyAttrMatchState(int fuzzy_rte_penalty, + FuzzyAttrMatchState *fuzzystate, RangeTblEntry *rte, + const char *actual, const char *match, int attnum) +{ + /* Memory segment to store the current permutation of the match string. */ + char *tmp_match; + int matchlen = strlen(match); + + /* + * We keep track how many permutations we have already processed, to avoid + * long runtimes. + */ + int underscore_permutations_count = 0; + + /* + * The location the underscore we currently process within the match + * string. + */ + int underscore_current = 1; + + /* Variables to track the amount of underscores delimiting sections */ + int underscore_amount = 1; + int underscore_second_amount = 1; + + /* + * The entries of the permutation matrix tell us, where we should copy the + * tree segments to. The zeroth dimension iterates over the permutations, + * while the first dimension iterates over the three segments are permuted + * to. Considering the string A_B_C the three segments are: + * - A: before the initial underscore sections + * - B: between the underscore sections + * - C: after the later underscore sections + * + * Please note that the _ in above example are placeholders for + * underscore_amount underscores, which might be more than one. + */ + int permutation_matrix[3][3]; + + /* Bail before computing the Levenshtein distance if there's no hope. */ + if (fuzzy_rte_penalty > fuzzystate->distance) + return; + + /* + * Outright reject dropped columns, which can appear here with apparent + * empty actual names, per remarks within scanRTEForColumn(). + */ + if (actual[0] == '\0') + return; + + updateFuzzyAttrMatchStateSingleString(fuzzy_rte_penalty, fuzzystate, rte, actual, match, attnum, matchlen); + + /* + * We don't want to permute zero length strings, so check whether the + * string starts with an underscore. + */ + if (match[0] == '_') + { + while (underscore_current < match
Re: Permute underscore separated components of columns before fuzzy matching
Arne Roland writes: > Thank you for bringing that to my attention. Is there a way to subscribe > to cf-bot failures? I don't know of any push notification support in cfbot, but you can bookmark the page with your own active patches, and check it periodically: http://commitfest.cputube.org/arne-roland.html (For others, click on your own name in the main cfbot page's entry for one of your patches to find out how it spelled your name for this purpose.) regards, tom lane
Re: Permute underscore separated components of columns before fuzzy matching
Thank you for bringing that to my attention. Is there a way to subscribe to cf-bot failures? Apparently I confused myself with my naming. I attached a patch that fixes the bug (at least at my cassert test-world run). Regards Arne On 2024-01-22 06:38, Peter Smith wrote: 2024-01 Commitfest. Hi, This patch has a CF status of "Needs Review" [1], but it seems like there were CFbot test failures last time it was run [2]. Please have a look and post an updated version if necessary. == [1] https://commitfest.postgresql.org/46/4282/ [2] https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/46/4282 Kind Regards, Peter Smith.diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c index 34a0ec5901..c416be2ea0 100644 --- a/src/backend/parser/parse_relation.c +++ b/src/backend/parser/parse_relation.c @@ -579,32 +579,13 @@ GetCTEForRTE(ParseState *pstate, RangeTblEntry *rte, int rtelevelsup) return NULL;/* keep compiler quiet */ } -/* - * updateFuzzyAttrMatchState - * Using Levenshtein distance, consider if column is best fuzzy match. - */ static void -updateFuzzyAttrMatchState(int fuzzy_rte_penalty, - FuzzyAttrMatchState *fuzzystate, RangeTblEntry *rte, - const char *actual, const char *match, int attnum) +updateFuzzyAttrMatchStateSingleString(int fuzzy_rte_penalty, + FuzzyAttrMatchState *fuzzystate, RangeTblEntry *rte, + const char *actual, const char *match, int attnum, int matchlen) { - int columndistance; - int matchlen; - - /* Bail before computing the Levenshtein distance if there's no hope. */ - if (fuzzy_rte_penalty > fuzzystate->distance) - return; - - /* - * Outright reject dropped columns, which can appear here with apparent - * empty actual names, per remarks within scanRTEForColumn(). - */ - if (actual[0] == '\0') - return; - /* Use Levenshtein to compute match distance. */ - matchlen = strlen(match); - columndistance = + int columndistance = varstr_levenshtein_less_equal(actual, strlen(actual), match, matchlen, 1, 1, 1, fuzzystate->distance + 1 @@ -667,6 +648,142 @@ updateFuzzyAttrMatchState(int fuzzy_rte_penalty, } } +static void putUnderscores(char* string, int start, int underscore_amount, int len) { + for (int i = 0; start + i < len && i < underscore_amount; i++) + string[start + i] = '_'; +} + +/* + * updateFuzzyAttrMatchState + * Using Levenshtein distance, consider if column is best fuzzy match. + */ +static void +updateFuzzyAttrMatchState(int fuzzy_rte_penalty, + FuzzyAttrMatchState *fuzzystate, RangeTblEntry *rte, + const char *actual, const char *match, int attnum) +{ + /* Memory segment to store the current permutation of the match string. */ + char* tmp_match; + int matchlen = strlen(match); + /* We keep track how many permutations we have already processed, to avoid long runtimes. */ + int underscore_permutations_count = 0; + /* The location the underscore we currently process within the match string. */ + int underscore_current = 1; + /* Variables to track the amount of underscores delimiting sections */ + int underscore_amount = 1; + int underscore_second_amount = 1; + + /* Bail before computing the Levenshtein distance if there's no hope. */ + if (fuzzy_rte_penalty > fuzzystate->distance) + return; + + /* + * Outright reject dropped columns, which can appear here with apparent + * empty actual names, per remarks within scanRTEForColumn(). + */ + if (actual[0] == '\0') + return; + + updateFuzzyAttrMatchStateSingleString(fuzzy_rte_penalty, fuzzystate, rte, actual, match, attnum, matchlen); + /* We don't want to permute zero length strings, so check whether the string starts with an underscore. */ + if (match[0] == '_') { + while (underscore_current < matchlen - 1 && match[underscore_current] == '_') { + underscore_current++; + } + } + /* Advance to the next underscore. We do this once here to avoid pallocing, if the string does't contain an underscore at all. */ + while (underscore_current < matchlen - 1 && match[underscore_current] != '_') { + underscore_current++; + } + /* + * Check for permuting up to three sections separated by underscores. + * + * We count the number of underscores here, because we want to know whether we should consider + * permuting underscore separated sections. + */ + if (underscore_current < matchlen - 1) { + tmp_match = palloc(matchlen + 1); + tmp_match[matchlen] = '\0'; + while (underscore_permutations_count < 300 && underscore_current < matchlen - 1) { + /* + * If sections contain more than one underscore, we want to swap the sections separated by more than one instead. + * There would be no point in swapping zero length strings around. + * So we check how many consecutive underscores we have here. + */ + underscore_amount = 1; + while (underscore_current + underscore_amount < matchlen && match[underscore_current + underscore_amount] == '_') { +underscore_amount++; +
Re: Permute underscore separated components of columns before fuzzy matching
2024-01 Commitfest. Hi, This patch has a CF status of "Needs Review" [1], but it seems like there were CFbot test failures last time it was run [2]. Please have a look and post an updated version if necessary. == [1] https://commitfest.postgresql.org/46/4282/ [2] https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/46/4282 Kind Regards, Peter Smith.
Re: Permute underscore separated components of columns before fuzzy matching
Hi! Mikhail Gribkov writes: > > Honestly I'm not entirely sure fixing only two switched words is worth the > > effort, but the declared goal is clearly achieved. > > > > I think the patch is good to go, although you need to fix code formatting. > > > I took a brief look at this. I concur that we shouldn't need to be > hugely concerned about the speed of this code path. However, we *do* > need to be concerned about its maintainability, and I think the patch > falls down badly there: it adds a chunk of very opaque and essentially > undocumented code, that people will need to reverse-engineer anytime > they are studying this function. That could be alleviated perhaps > with more work on comments, but I have to wonder whether it's worth > carrying this logic at all. It's a rather strange behavior to add, > and I wonder if many users will want it. I encounter this problem all the time. I don't know, whether my clients are representative. But I see the problem, when the developers show me their code base all the time. It's an issue for column names and table names alike. I personally spent hours watching developers trying various permutations. They rarely request this feature. Usually they are to embarrassed for not knowing their object names to request anything in that state. But I want the database, which I support, to be gentle and helpful to the user under these circumstances. Regarding complexity: I think the permutation matrix is the thing to easily get wrong. I had a one off bug writing it down initially. I tried to explain the conceptual approach better with a longer comment than before. /* * Only consider mirroring permutations, since the three simple rotations are already * (or will be for a later underscore_current) covered above. * * The entries of the permutation matrix tell us, where we should copy the tree segments to. * The zeroth dimension iterates over the permutations, while the first dimension iterates * over the three segments are permuted to. * Considering the string A_B_C the three segments are: * - before the initial underscore sections (A) * - between the underscore sections (B) * - after the later underscore sections (C) */ If anything is still unclear, I'd appreciate feedback about what might be still unclear/confusing about this. I can't promise to be helpful, if something breaks. But I have practically forgotten how I did it, and I found it easy to extend it like described below. It would have been embarrassing otherwise. Yet this gives me hope, it should be possible to enable others the same way. I certainly want the code simple without need to reverse-engineer anything. Please let me know, if there are difficult to understand bits left around. > One thing that struck me is that no care is being taken for adjacent > underscores (that is, "foo__bar" and similar cases). It seems > unlikely that treating the zero-length substring between the > underscores as a word to permute is helpful; moreover, it adds > an edge case that the string-moving logic could easily get wrong. > I wonder if the code should treat any number of consecutive > underscores as a single separator. (Somewhat related: I think it > will behave oddly when the first or last character is '_', since the > outer loop ignores those positions.) I wasn't sure how there could be any potential future bug with copying zero-length strings, i.e. doing nothing. And I still don't see that. There is one point I agree with: Doing this seems rarely helpful. I changed the code, so it treats sections delimited by an arbitrary amount of underscores. So it never permutes with zero length strings within. I also added functionality to skip the zero length cases if we should encounter them at the end of the string. So afaict there should be no zero length swaps left. Please let me know whether this is more to your liking. I also replaced the hard limit of underscores with more nuanced limits of permutations to try before giving up. > > And it would be much more convenient to work with your patch if every next > > version file will have a unique name (maybe something like "_v2", "_v3" > > etc. suffixes) > > > Please. It's very confusing when there are multiple identically-named > patches in a thread. Sorry, I started with this, because I confused cf bot in the past about whether the patches should be applied on top of each other or not. For me the cf-bot logic is a bit opaque there. But you are right, confusing patch readers is definitely worse. I'll try to do that. I hope the attached format is better. One question about pgindent: I struggled a bit with getting the right version of bsd_indent. I found versions labeled 2.2.1 and 2.1.1, but apparently we work with 2.1.2. Where
Re: Permute underscore separated components of columns before fuzzy matching
Mikhail Gribkov writes: > Honestly I'm not entirely sure fixing only two switched words is worth the > effort, but the declared goal is clearly achieved. > I think the patch is good to go, although you need to fix code formatting. I took a brief look at this. I concur that we shouldn't need to be hugely concerned about the speed of this code path. However, we *do* need to be concerned about its maintainability, and I think the patch falls down badly there: it adds a chunk of very opaque and essentially undocumented code, that people will need to reverse-engineer anytime they are studying this function. That could be alleviated perhaps with more work on comments, but I have to wonder whether it's worth carrying this logic at all. It's a rather strange behavior to add, and I wonder if many users will want it. One thing that struck me is that no care is being taken for adjacent underscores (that is, "foo__bar" and similar cases). It seems unlikely that treating the zero-length substring between the underscores as a word to permute is helpful; moreover, it adds an edge case that the string-moving logic could easily get wrong. I wonder if the code should treat any number of consecutive underscores as a single separator. (Somewhat related: I think it will behave oddly when the first or last character is '_', since the outer loop ignores those positions.) > And it would be much more convenient to work with your patch if every next > version file will have a unique name (maybe something like "_v2", "_v3" > etc. suffixes) Please. It's very confusing when there are multiple identically-named patches in a thread. regards, tom lane
Re: Permute underscore separated components of columns before fuzzy matching
Hello Arne, yep, now the warnings have gone. And I must thank you for quite a fun time I had here testing your patch :) I tried even some weird combinations like this: postgres=# create table t("_ __ ___" int); CREATE TABLE postgres=# select "__ _ ___" from t; ERROR: column "__ _ ___" does not exist LINE 1: select "__ _ ___" from t; ^ HINT: Perhaps you meant to reference the column "t._ __ ___". postgres=# select "___ __ _" from t; ERROR: column "___ __ _" does not exist LINE 1: select "___ __ _" from t; ^ HINT: Perhaps you meant to reference the column "t._ __ ___". postgres=# ... and it still worked fine. Honestly I'm not entirely sure fixing only two switched words is worth the effort, but the declared goal is clearly achieved. I think the patch is good to go, although you need to fix code formatting. At least the char*-definition and opening "{" brackets are conspicuous. Maybe there are more: it is worth running pgindend tool. And it would be much more convenient to work with your patch if every next version file will have a unique name (maybe something like "_v2", "_v3" etc. suffixes) -- best regards, Mikhail A. Gribkov e-mail: youzh...@gmail.com *http://www.flickr.com/photos/youzhick/albums <http://www.flickr.com/photos/youzhick/albums>* http://www.strava.com/athletes/5085772 phone: +7(916)604-71-12 Telegram: @youzhick On Mon, Jul 17, 2023 at 1:42 AM Arne Roland wrote: > Hello Mikhail, > > I'm sorry. Please try attached patch instead. > > Thank you for having a look! > > Regards > Arne > > ---------- > *From:* Mikhail Gribkov > *Sent:* Thursday, July 6, 2023 13:31 > *To:* Arne Roland > *Cc:* Pg Hackers > *Subject:* Re: Permute underscore separated components of columns before > fuzzy matching > > Hello Arne, > > The goal of supporting words-switching hints sounds interesting and I've > tried to apply your patch. > The patch was applied smoothly to the latest master and check-world > reported no problems. Although I had problems after trying to test the new > functionality. > > I tried to simply mix words in pg_stat_activity.wait_event_type: > > postgres=# select wait_type_event from pg_stat_activity ; > 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end > in MessageContext 0x559d668aaf30 > 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end > in MessageContext 0x559d668aaf30 > 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end > in MessageContext 0x559d668aaf30 > 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end > in MessageContext 0x559d668aaf30 > WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 > 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end > in MessageContext 0x559d668aaf30 > WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 > 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end > in MessageContext 0x559d668aaf30 > WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 > WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 > 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end > in MessageContext 0x559d668aaf30 > WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 > WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 > 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end > in MessageContext 0x559d668aaf30 > WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 > WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 > 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end > in MessageContext 0x559d668aaf30 > 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end > in MessageContext 0x559d668aaf30 > WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 > 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end > in MessageContext 0x559d668aaf30 > WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 > 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end > in MessageContext 0x559d668aaf30 > WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 > 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end > in MessageContext 0x559d668aaf30 > WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 > WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 &
Re: Permute underscore separated components of columns before fuzzy matching
Hello Mikhail, I'm sorry. Please try attached patch instead. Thank you for having a look! Regards Arne From: Mikhail Gribkov Sent: Thursday, July 6, 2023 13:31 To: Arne Roland Cc: Pg Hackers Subject: Re: Permute underscore separated components of columns before fuzzy matching Hello Arne, The goal of supporting words-switching hints sounds interesting and I've tried to apply your patch. The patch was applied smoothly to the latest master and check-world reported no problems. Although I had problems after trying to test the new functionality. I tried to simply mix words in pg_stat_activity.wait_event_type: postgres=# select wait_type_event from pg_stat_activity ; 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] ERROR: column "wait_type_event" does not exist at character 8 2023-07-06 14:12:35.968 MSK [1480] HINT: Perhaps you meant to reference the column "pg_stat_activity.wait_event_type". 2023-07-06 14:12:35.968 MSK [1480] STATEMENT: select wait_type_event from pg_stat_activity ; WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNI
Re: Permute underscore separated components of columns before fuzzy matching
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, failed Documentation:tested, failed Hello Arne, The goal of supporting words-switching hints sounds interesting and I've tried to apply your patch. The patch was applied smoothly to the latest master and check-world reported no problems. Although I had problems after trying to test the new functionality. I tried to simply mix words in pg_stat_activity.wait_event_type: postgres=# select wait_type_event from pg_stat_activity ; 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] ERROR: column "wait_type_event" does not exist at character 8 2023-07-06 14:12:35.968 MSK [1480] HINT: Perhaps you meant to reference the column "pg_stat_activity.wait_event_type". 2023-07-06 14:12:35.968 MSK [1480] STATEMENT: select wait_type_event from pg_stat_activity ; WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 ERROR: column "wait_type_event" does not ex
Re: Permute underscore separated components of columns before fuzzy matching
Hello Arne, The goal of supporting words-switching hints sounds interesting and I've tried to apply your patch. The patch was applied smoothly to the latest master and check-world reported no problems. Although I had problems after trying to test the new functionality. I tried to simply mix words in pg_stat_activity.wait_event_type: postgres=# select wait_type_event from pg_stat_activity ; 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 2023-07-06 14:12:35.968 MSK [1480] ERROR: column "wait_type_event" does not exist at character 8 2023-07-06 14:12:35.968 MSK [1480] HINT: Perhaps you meant to reference the column "pg_stat_activity.wait_event_type". 2023-07-06 14:12:35.968 MSK [1480] STATEMENT: select wait_type_event from pg_stat_activity ; WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 WARNING: detected write past chunk end in MessageContext 0x559d668aaf30 ERROR: column "wait_type_event" does not exist LINE 1: select wait_type_event from pg_stat_activity ; ^ HINT: Perhaps you meant to reference the column "pg_stat_activity.wait_event_type". postgres=# So the desired hint is really there, but thgether with looots of warnings. For sure these sh