Hello! Inspired by Simon Riggs' keynote talk at PGCounf.eu 2023 sharing list of ideas for PostgreSQL (https://riggs.business/blog/f/postgresql-todo-2023), I have crafted a quick patch to do SQL syntax validation.
It is also heavily inspired by the "ruby -c" command, useful to check syntax of Ruby programs without executing them. For now, to keep it simple and to open discussion, I have added new "--syntax" option into "postgres" command, since it is currently the only one using needed parser dependency (at least per my understanding). I tried to add this into psql or separate pg_syntax commands, but parser is not exposed in "postgres_fe.h" and including backend into those tools would not make most likely sense. Also syntax could vary per backend, it makes sense to keep it in there. It expects input on STDIN, prints out error if any and prints out summary message (Valid SQL/Invalid SQL). On valid input it exits with 0 (success), otherwise it exits with 1 (error). Example usage: $ echo "SELECT 1" | src/backend/postgres --syntax Valid SQL $ echo "SELECT 1abc" | src/backend/postgres --syntax ERROR: trailing junk after numeric literal at or near "1a" at character 8 Invalid SQL $ cat ../src/test/regress/sql/alter_operator.sql | src/backend/postgres --syntax Valid SQL $ cat ../src/test/regress/sql/advisory_lock.sql | src/backend/postgres --syntax ERROR: syntax error at or near "\" at character 99 Invalid SQL This could be useful for manual script checks, automated script checks and code editor integrations. Notice it just parses the SQL, it doesn't detect any "runtime" problems like unexisting table names, etc. I have various ideas around this (like exposing similar functionality directly in SQL using custom function like pg_check_syntax), but I would like to get some feedback first. What do you think? enhnace PS: I wasn't able to find any automated tests for "postgres" command to enhance with, are there any? PS2: Patch could be found at https://github.com/simi/postgres/pull/8 as well.
From 6ec6cc599678c0ac76f4559039ff3399f843b9b1 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Josef=20=C5=A0im=C3=A1nek?= <josef.sima...@gmail.com> Date: Fri, 15 Dec 2023 13:00:23 +0100 Subject: [PATCH] Add --syntax option to postgres. - it validates SQL on STDIN and reports back if valid/invalid --- src/backend/main/main.c | 45 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/src/backend/main/main.c b/src/backend/main/main.c index ed11e8be7fab..eb7266f9d5d3 100644 --- a/src/backend/main/main.c +++ b/src/backend/main/main.c @@ -40,6 +40,7 @@ #include "utils/memutils.h" #include "utils/pg_locale.h" #include "utils/ps_status.h" +#include "parser/parser.h" const char *progname; @@ -50,6 +51,7 @@ static void startup_hacks(const char *progname); static void init_locale(const char *categoryname, int category, const char *locale); static void help(const char *progname); static void check_root(const char *progname); +static void check_syntax(void); /* @@ -153,6 +155,10 @@ main(int argc, char *argv[]) fputs(PG_BACKEND_VERSIONSTR, stdout); exit(0); } + if (strcmp(argv[1], "--syntax") == 0) + { + check_syntax(); + } /* * In addition to the above, we allow "--describe-config" and "-C var" @@ -347,6 +353,7 @@ help(const char *progname) printf(_(" -s show statistics after each query\n")); printf(_(" -S WORK-MEM set amount of memory for sorts (in kB)\n")); printf(_(" -V, --version output version information, then exit\n")); + printf(_(" --syntax checks SQL on STDIN is valid, then exit\n")); printf(_(" --NAME=VALUE set run-time parameter\n")); printf(_(" --describe-config describe configuration parameters, then exit\n")); printf(_(" -?, --help show this help, then exit\n")); @@ -422,6 +429,44 @@ check_root(const char *progname) #endif /* WIN32 */ } +static void +check_syntax() { + List *parsetree_list; + bool valid = false; + char buffer[1024 * 1024]; + char ch; + int i = 0; + + // TODO: check for buffer overflow + while ((ch = getchar()) != EOF) + { + buffer[i] = ch; + i++; + } + + PG_TRY(); + { + parsetree_list = raw_parser(buffer, RAW_PARSE_DEFAULT); + valid = true; + } + PG_CATCH(); + { + EmitErrorReport(); + FlushErrorState(); + valid = false; + } + PG_END_TRY(); + + // TODO: translate output + if (valid && parsetree_list) { + printf("Valid SQL\n"); + exit(0); + } else { + printf("Invalid SQL\n"); + exit(1); + } +} + /* * At least on linux, set_ps_display() breaks /proc/$pid/environ. The * sanitizer library uses /proc/$pid/environ to implement getenv() as it wants