On 2017-05-21 03:00, Tom Lane wrote:
> I wrote:
>> Also, I found two places where an overlength comment line is simply busted
>> altogether --- notice that a character is missing at the split point:
> 
> I found the cause of that: you need to apply this patch:
> 
> --- freebsd_indent/pr_comment.c~      2017-05-17 14:59:31.548442801 -0400
> +++ freebsd_indent/pr_comment.c       2017-05-20 20:51:16.447332977 -0400
> @@ -344,8 +353,8 @@ pr_comment(void)
>               {
>                   int len = strlen(t_ptr);
>  
> -                 CHECK_SIZE_COM(len);
> -                 memmove(e_com, t_ptr, len);
> +                 CHECK_SIZE_COM(len + 1);
> +                 memmove(e_com, t_ptr, len + 1);
>                   last_bl = strpbrk(e_com, " \t");
>                   e_com += len;
>               }
> 
> As the code stands, the strpbrk call is being applied to a
> not-null-terminated string and therefore is sometimes producing an
> insane value of last_bl, messing up decisions later in the comment.
> Having the memmove include the trailing \0 resolves that.

I have been analyzing this and came to different conclusions. Foremost,
a strpbrk() call like that finds the first occurrence of either a space or
a tab, but last_bl means "last blank" - it's used for marking where to
wrap a comment line if it turns out to be too long. The previous coding
moved the character sequence byte by byte, updating last_bl every
time it copied one of those two characters. I've rewritten that part as:
                    CHECK_SIZE_COM(len);
                    memmove(e_com, t_ptr, len);
-                   last_bl = strpbrk(e_com, " \t");
                    e_com += len;
+                   last_bl = NULL;
+                   for (t_ptr = e_com - 1; t_ptr > e_com - len; t_ptr--)
+                       if (*t_ptr == ' ' || *t_ptr == '\t') {
+                           last_bl = t_ptr;
+                           break;
+                       }
                }
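
To illustrate the difference between "first blank" and "last blank", here
is a minimal standalone sketch. It is not code from indent: first_blank()
and last_blank() are hypothetical helpers written only for this example.
Note that last_blank() works on a counted range and so needs no
terminating \0, and that, unlike the loop in the patch above, it also
examines the first byte of the range.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of indent: the first space or tab,
 * which is what strpbrk() reports. Requires a NUL-terminated string. */
static const char *
first_blank(const char *s)
{
    return strpbrk(s, " \t");
}

/* Hypothetical helper: the last space or tab in buf[0..len-1], or NULL
 * if there is none. Scans backward over a counted range, so the buffer
 * does not have to be NUL-terminated. */
static const char *
last_blank(const char *buf, size_t len)
{
    const char *p;

    for (p = buf + len; p > buf; p--)
        if (p[-1] == ' ' || p[-1] == '\t')
            return p - 1;
    return NULL;
}
```

For the string "one two\tthree", first_blank() points at offset 3 (the
space) while last_blank() points at offset 7 (the tab), which is the
position a line wrapper would presumably want.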


But then I also started to wonder whether there is any case where there's
more than one character to copy, and I haven't found one yet. It looks like
            } while (!memchr("*\n\r\b\t", *buf_ptr, 6) &&
                (now_col <= adj_max_col || !last_bl));
guarantees that if we're past adj_max_col, it'll only be one non-space
character. But I'm not sure yet.
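
As a side note on that loop condition, here is a small sketch of which
characters the memchr() test matches. The wrapper name stops_copy_loop()
is hypothetical and exists only for this example. The string literal holds
five characters plus its terminating \0, so a length of 6 makes the test
match '\0' as well. A space is not in the set.

```c
#include <string.h>

/* Hypothetical wrapper around the test from the loop condition:
 * nonzero when c is one of '*', '\n', '\r', '\b', '\t' or '\0'. */
static int
stops_copy_loop(char c)
{
    /* The literal is 5 characters plus the terminating NUL, so passing
     * a length of 6 includes the NUL byte in the searched set. */
    return memchr("*\n\r\b\t", c, 6) != NULL;
}
```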



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
