On Mon, Jul 21, 2025 at 2:31 PM jian he <[email protected]> wrote: > > actually, > section (9.7.3.) > https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-POSIX-REGEXP > already have example, like: > substring('foobar' from 'o(.)b') > > new patch attached, split substr, substring to make review more easier. > v6-0001: add function argument name to function substr > v6-0002: add function argument name to function substring > > v6-0002 incorporated some of the changes in v5-0002-v3-delta.patch. > some of the changes in v5-0002-v3-delta.patch are not related to this thread, > so I didn't incorporate them, right now.
hi. rebased.
From 5afc6f3ce1ab154576002c1dfe82a2dd22d80964 Mon Sep 17 00:00:00 2001 From: jian he <[email protected]> Date: Mon, 13 Oct 2025 21:04:40 +0800 Subject: [PATCH v7 2/2] add function argument name to function substring HEAD \df substr List of functions Schema | Name | Result data type | Argument data types | Type ------------+-----------+------------------+-------------------------+------ pg_catalog | substring | bit | bit, integer | func pg_catalog | substring | bit | bit, integer, integer | func pg_catalog | substring | bytea | bytea, integer | func pg_catalog | substring | bytea | bytea, integer, integer | func pg_catalog | substring | text | text, integer | func pg_catalog | substring | text | text, integer, integer | func pg_catalog | substring | text | text, text | func pg_catalog | substring | text | text, text, text | func with patch applied \df substring Schema | Name | Result data type | Argument data types | Type ------------+-----------+------------------+--------------------------------------------------+------ pg_catalog | substring | bit | bits bit, start integer, count integer | func pg_catalog | substring | bytea | bytes bytea, start integer | func pg_catalog | substring | bytea | bytes bytea, start integer, count integer | func pg_catalog | substring | bit | string bit, start integer | func pg_catalog | substring | text | string text, pattern text | func pg_catalog | substring | text | string text, pattern text, escape_character text | func pg_catalog | substring | text | string text, start integer | func pg_catalog | substring | text | string text, start integer, count integer | func (8 rows) discussion: https://postgr.es/m/CACJufxHTBkymh06D4mGKNe1YfRNFN+gFBybmygWk=ptmqu0...@mail.gmail.com --- doc/src/sgml/func/func-binarystring.sgml | 20 ++++++++++ doc/src/sgml/func/func-bitstring.sgml | 20 ++++++++++ doc/src/sgml/func/func-matching.sgml | 18 ++++++--- doc/src/sgml/func/func-string.sgml | 51 ++++++++++++++++++++++++ src/backend/catalog/system_functions.sql | 2 +- src/include/catalog/pg_proc.dat | 8 ++++ 6 files changed, 112 insertions(+), 7 deletions(-) diff --git a/doc/src/sgml/func/func-binarystring.sgml b/doc/src/sgml/func/func-binarystring.sgml index dd7037811af..4786a096dd5 100644 --- a/doc/src/sgml/func/func-binarystring.sgml +++ b/doc/src/sgml/func/func-binarystring.sgml @@ -216,6 +216,26 @@ </para></entry> </row> + <row> + <entry role="func_table_entry"><para role="func_signature"> + <indexterm> + <primary>substring</primary> + </indexterm> + <function>substring</function> ( <parameter>bytes</parameter> <type>bytea</type>, <parameter>start</parameter> <type>integer</type> <optional>, <parameter>count</parameter> <type>integer</type> </optional> ) + <returnvalue>bytea</returnvalue> + </para> + <para> + Extracts the substring of <parameter>bytes</parameter> starting at + the <parameter>start</parameter>'th byte, + and stopping after <parameter>count</parameter> bytes if that is + specified. + </para> + <para> + <literal>substring('\x1234567890'::bytea, 3, 2)</literal> + <returnvalue>\x5678</returnvalue> + </para></entry> + </row> + <row> <entry role="func_table_entry"><para role="func_signature"> <indexterm> diff --git a/doc/src/sgml/func/func-bitstring.sgml b/doc/src/sgml/func/func-bitstring.sgml index f03dd63afcc..c1de0963ec0 100644 --- a/doc/src/sgml/func/func-bitstring.sgml +++ b/doc/src/sgml/func/func-bitstring.sgml @@ -274,6 +274,26 @@ </para></entry> </row> + <row> + <entry role="func_table_entry"><para role="func_signature"> + <indexterm> + <primary>substring</primary> + </indexterm> + <function>substring</function> ( <parameter>bits</parameter> <type>bit</type>, <parameter>start</parameter> <type>integer</type> <optional>, <parameter>count</parameter> <type>integer</type> </optional> ) + <returnvalue>bit</returnvalue> + </para> + <para> + Extracts the substring of <parameter>bits</parameter> starting at + the <parameter>start</parameter>'th bit, + and stopping after <parameter>count</parameter> bits if that is + specified. + </para> + <para> + <literal>substring(B'110010111111', 3, 2)</literal> + <returnvalue>00</returnvalue> + </para></entry> + </row> + <row> <entry role="func_table_entry"><para role="func_signature"> <indexterm> diff --git a/doc/src/sgml/func/func-matching.sgml b/doc/src/sgml/func/func-matching.sgml index ebe0b22c8f6..1454db0378d 100644 --- a/doc/src/sgml/func/func-matching.sgml +++ b/doc/src/sgml/func/func-matching.sgml @@ -377,7 +377,7 @@ substring(<replaceable>string</replaceable> from <replaceable>pattern</replaceab </synopsis> or as a plain three-argument function: <synopsis> -substring(<replaceable>string</replaceable>, <replaceable>pattern</replaceable>, <replaceable>escape-character</replaceable>) +substring(<replaceable>string</replaceable>, <replaceable>pattern</replaceable>, <replaceable>escape</replaceable>) </synopsis> As with <literal>SIMILAR TO</literal>, the specified pattern must match the entire data string, or else the @@ -581,11 +581,17 @@ substring('foobar' similar '#"o_b#"%' escape '#') <lineannotation>NULL</linea </para> <para> - The <function>substring</function> function with two parameters, - <function>substring(<replaceable>string</replaceable> from - <replaceable>pattern</replaceable>)</function>, provides extraction of a - substring - that matches a POSIX regular expression pattern. It returns null if + The <function>substring</function> function with two parameters provides extraction of a + substring that matches a <acronym>POSIX</acronym> regular expression pattern. + The function can be written according to standard <acronym>SQL</acronym> syntax: +<synopsis> +substring(<replaceable>string</replaceable> FROM <replaceable>pattern</replaceable>) +</synopsis> + It can also written as a plain two-argument function: +<synopsis> +substring(<replaceable>string</replaceable>, <replaceable>pattern</replaceable>) +</synopsis> + It returns null if there is no match, otherwise the first portion of the text that matched the pattern. But if the pattern contains any parentheses, the portion of the text that matched the first parenthesized subexpression (the diff --git a/doc/src/sgml/func/func-string.sgml b/doc/src/sgml/func/func-string.sgml index 01cc94c234e..35d657eb418 100644 --- a/doc/src/sgml/func/func-string.sgml +++ b/doc/src/sgml/func/func-string.sgml @@ -1412,6 +1412,57 @@ </para></entry> </row> + <row> + <entry role="func_table_entry"><para role="func_signature"> + <indexterm> + <primary>substring</primary> + </indexterm> + <function>substring</function> ( <parameter>string</parameter> <type>text</type>, <parameter>start</parameter> <type>integer</type> <optional>, <parameter>count</parameter> <type>integer</type> </optional> ) + <returnvalue>text</returnvalue> + </para> + <para> + Extracts the substring of <parameter>string</parameter> starting at + the <parameter>start</parameter>'th character, + and stopping after <parameter>count</parameter> characters if that is + specified. + </para> + <para> + <literal>substring('Thomas', 2, 3)</literal> + <returnvalue>hom</returnvalue> + </para></entry> + </row> + + <row> + <entry role="func_table_entry"><para role="func_signature"> + <function>substring</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>, <parameter>escape</parameter> <type>text</type>) + <returnvalue>text</returnvalue> + </para> + <para> + Extracts the first substring matching <acronym>SQL</acronym> regular expression; + see <xref linkend="functions-similarto-regexp"/>. + </para> + <para> + <literal>substring('Thomas', '%#"o_a#"_', '#')</literal> + <returnvalue>oma</returnvalue> + </para></entry> + </row> + + <row> + <entry role="func_table_entry"><para role="func_signature"> + <function>substring</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> ) + <returnvalue>text</returnvalue> + </para> + <para> + Extracts the first substring matching <acronym>POSIX</acronym> regular expression; see + <xref linkend="functions-posix-regexp"/>. + </para> + <para> + <literal>substring('Thomas', '...$')</literal> + <returnvalue>mas</returnvalue> + </para></entry> + </row> + + <row> <entry role="func_table_entry"><para role="func_signature"> <indexterm> diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql index 2d946d6d9e9..d5f8bbd953f 100644 --- a/src/backend/catalog/system_functions.sql +++ b/src/backend/catalog/system_functions.sql @@ -42,7 +42,7 @@ CREATE OR REPLACE FUNCTION rpad(text, integer) IMMUTABLE PARALLEL SAFE STRICT COST 1 RETURN rpad($1, $2, ' '); -CREATE OR REPLACE FUNCTION "substring"(text, text, text) +CREATE OR REPLACE FUNCTION "substring"(string text, pattern text, escape text) RETURNS text LANGUAGE sql IMMUTABLE PARALLEL SAFE STRICT COST 1 diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index b6d23315e46..432a75b41b4 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -3748,9 +3748,11 @@ { oid => '936', descr => 'extract portion of string', proname => 'substring', prorettype => 'text', proargtypes => 'text int4 int4', + proargnames => '{string, start, count}', prosrc => 'text_substr' }, { oid => '937', descr => 'extract portion of string', proname => 'substring', prorettype => 'text', proargtypes => 'text int4', + proargnames => '{string, start}', prosrc => 'text_substr_no_len' }, { oid => '2087', descr => 'replace all occurrences in string of old_substr with new_substr', @@ -4168,6 +4170,7 @@ prosrc => 'bitcat' }, { oid => '1680', descr => 'extract portion of bitstring', proname => 'substring', prorettype => 'bit', proargtypes => 'bit int4 int4', + proargnames => '{bits, start, count}', prosrc => 'bitsubstr' }, { oid => '1681', descr => 'bitstring length', proname => 'length', prorettype => 'int4', proargtypes => 'bit', @@ -4197,6 +4200,7 @@ prosrc => 'bitposition' }, { oid => '1699', descr => 'extract portion of bitstring', proname => 'substring', prorettype => 'bit', proargtypes => 'bit int4', + proargnames => '{string, start}', prosrc => 'bitsubstr_no_len' }, { oid => '3030', descr => 'substitute portion of bitstring', @@ -6302,9 +6306,11 @@ prosrc => 'byteacat' }, { oid => '2012', descr => 'extract portion of string', proname => 'substring', prorettype => 'bytea', + proargnames => '{bytes, start, count}', proargtypes => 'bytea int4 int4', prosrc => 'bytea_substr' }, { oid => '2013', descr => 'extract portion of string', proname => 'substring', prorettype => 'bytea', proargtypes => 'bytea int4', + proargnames => '{bytes, start}', prosrc => 'bytea_substr_no_len' }, { oid => '2085', descr => 'extract portion of string', proname => 'substr', prorettype => 'bytea', proargtypes => 'bytea int4 int4', @@ -6504,9 +6510,11 @@ { oid => '2073', descr => 'extract text matching regular expression', proname => 'substring', prorettype => 'text', proargtypes => 'text text', + proargnames => '{string, pattern}', prosrc => 'textregexsubstr' }, { oid => '2074', descr => 'extract text matching SQL regular expression', proname => 'substring', prolang => 'sql', prorettype => 'text', + proargnames => '{string, pattern, escape}', proargtypes => 'text text text', prosrc => 'see system_functions.sql' }, { oid => '2075', descr => 'convert int8 to bitstring', -- 2.34.1
From e18bc9123a65bd9c30fb8541ca464a407351fcce Mon Sep 17 00:00:00 2001 From: jian he <[email protected]> Date: Mon, 21 Jul 2025 11:58:37 +0800 Subject: [PATCH v7 1/2] add function argument name to substr. HEAD \df substr List of functions Schema | Name | Result data type | Argument data types | Type ------------+--------+------------------+-------------------------+------ pg_catalog | substr | bytea | bytea, integer | func pg_catalog | substr | bytea | bytea, integer, integer | func pg_catalog | substr | text | text, integer | func pg_catalog | substr | text | text, integer, integer | func with patch \df substr List of functions Schema | Name | Result data type | Argument data types | Type ------------+--------+------------------+-------------------------------------------+------ pg_catalog | substr | bytea | bytes bytea, start integer | func pg_catalog | substr | bytea | bytes bytea, start integer, count integer | func pg_catalog | substr | text | string text, start integer | func pg_catalog | substr | text | string text, start integer, count integer | func discussion: https://postgr.es/m/CACJufxHTBkymh06D4mGKNe1YfRNFN+gFBybmygWk=ptmqu0...@mail.gmail.com --- src/include/catalog/pg_proc.dat | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index b51d2b17379..b6d23315e46 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -3718,6 +3718,7 @@ prosrc => 'rtrim' }, { oid => '877', descr => 'extract portion of string', proname => 'substr', prorettype => 'text', proargtypes => 'text int4 int4', + proargnames => '{string, start, count}', prosrc => 'text_substr' }, { oid => '878', descr => 'map a set of characters appearing in string', proname => 'translate', prorettype => 'text', proargtypes => 'text text text', @@ -3736,6 +3737,7 @@ prosrc => 'rtrim1' }, { oid => '883', descr => 'extract portion of string', proname => 'substr', prorettype => 'text', proargtypes => 'text int4', + proargnames => '{string, start}', prosrc => 'text_substr_no_len' }, { oid => '884', descr => 'trim selected characters from both ends of string', proname => 'btrim', prorettype => 'text', proargtypes => 'text text', @@ -6306,9 +6308,11 @@ prosrc => 'bytea_substr_no_len' }, { oid => '2085', descr => 'extract portion of string', proname => 'substr', prorettype => 'bytea', proargtypes => 'bytea int4 int4', + proargnames => '{bytes, start, count}', prosrc => 'bytea_substr' }, { oid => '2086', descr => 'extract portion of string', proname => 'substr', prorettype => 'bytea', proargtypes => 'bytea int4', + proargnames => '{bytes, start}', prosrc => 'bytea_substr_no_len' }, { oid => '2014', descr => 'position of substring', proname => 'position', prorettype => 'int4', proargtypes => 'bytea bytea', -- 2.34.1
