Re: Control your disk usage in PG: Introduction to Disk Quota Extension
Thanks David! We rebased this patch with the newest master. Thank you very much! Regards, Haozhou On Wed, Mar 25, 2020 at 12:00 AM David Steele wrote: > On 12/2/19 1:39 AM, Haozhou Wang wrote: > > > > Thank you very much for your email. I rebased the code with the newest > > master and attached it in the attachment. > > This patch applies but no longer builds on Linux or Windows: > > https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.org_github_postgresql-2Dcfbot_postgresql_builds_666113036&d=DwICaQ&c=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw&r=nGUCIcT5CVp6-pQcplYyagWnpAqoYm8YxWruds9UGI0&m=UfgeYnaIkXdE7XmY-eCIHV_ZCTEZqCAjXkPausd5qZI&s=zHJ5pgH_P4mMhZAuXtfTCGUB7lyo7dJ4VeoapRhIN4g&e= > > https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.appveyor.com_project_postgresql-2Dcfbot_postgresql_build_1.0.85284&d=DwICaQ&c=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw&r=nGUCIcT5CVp6-pQcplYyagWnpAqoYm8YxWruds9UGI0&m=UfgeYnaIkXdE7XmY-eCIHV_ZCTEZqCAjXkPausd5qZI&s=WmyVNF-rItz10LoeLtY-Xu6l-iV3uwI_U-R9lR7cKjc&e= > > The CF entry has been updated to Waiting on Author. > > Regards, > -- > -David > da...@pgmasters.net > -- Regards, Haozhou disk_quota_hooks_v7.patch Description: Binary data
Re: Control your disk usage in PG: Introduction to Disk Quota Extension
Hi Michael, Thank you very much for your email. I rebased the code with the newest master and attached it in the attachment. Thank you very much! Regards, Haozhou On Sun, Dec 1, 2019 at 11:20 AM Michael Paquier wrote: > On Fri, Sep 27, 2019 at 11:30:08AM +0800, Haozhou Wang wrote: > > I rebased this patch with the newest master branch. Attached the new > > patch disk_quota_hooks_v5.patch in the attachment. > > This again needs a rebase, so I have switched it as waiting on > author. > -- > Michael > -- Regards, Haozhou disk_quota_hooks_v6.patch Description: Binary data
Re: Control your disk usage in PG: Introduction to Disk Quota Extension
Thanks Alvaro! I rebased this patch with the newest master branch. Attached the new patch disk_quota_hooks_v5.patch in the attachment. Regards, Haozhou On Thu, Sep 26, 2019 at 3:54 AM Alvaro Herrera wrote: > This patch no longer applies. Can you please rebase? > > -- > Álvaro Herrera > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.2ndQuadrant.com_&d=DwIDAw&c=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw&r=nGUCIcT5CVp6-pQcplYyagWnpAqoYm8YxWruds9UGI0&m=aNzGoEI15bAE-vAivY34BtG2WdgrVojH-B-kjvuXsYA&s=sT6zyiq4s8meelNuFw-lGD_mdvmUzv9zpVYWbFWusI0&e= > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > > > -- Regards, Haozhou disk_quota_hooks_v5.patch Description: Binary data
Re: Control your disk usage in PG: Introduction to Disk Quota Extension
Thanks very much for your comments. To the best of my knowledge, smgr is a layer that abstract the storage operations. Therefore, it is a good place to control or collect information the storage operations without touching the physical storage layer. Moreover, smgr is coming with actual disk IO operation (not consider the OS cache) for postgres. So we do not need to worry about the buffer management in postgres. It will make the purpose of hook is pure: a hook for actual disk IO. Regards, Haozhou On Wed, Dec 26, 2018 at 1:56 PM Michael Paquier wrote: > On Wed, Nov 21, 2018 at 09:47:44AM -0500, Robert Haas wrote: > > +1 for adding some hooks to support this kind of thing, but I think > > the names you've chosen are not very good. The hook name should > > describe the place from which it is called, not the purpose for which > > one imagines that it will be used, because somebody else might imagine > > another use. Both BufferExtendCheckPerms_hook_type and > > SmgrStat_hook_type are imagining that they know what the hook does - > > CheckPerms in the first case and Stat in the second case. > > I personally don't mind making Postgres more pluggable, but I don't > think that we actually need the extra ones proposed here at the layer > of smgr, as smgr is already a layer designed to call an underlying set > of APIs able to extend, unlink, etc. depending on the storage type. > > > For this particular purpose, I don't immediately see why you need a > > hook in both places. If ReadBuffer is called with P_NEW, aren't we > > guaranteed to end up in smgrextend()? > > Yes, that's a bit awkward. > -- > Michael
Re: Control your disk usage in PG: Introduction to Disk Quota Extension
Thank you very much for your review. We refactored our patch with new names and comments. For ReadBufferExtended hook, yes, Readbuffer with P_NEW will then call smgrextend. But in smgrextend, we cannot get the oid of a relation, and it will take some time to get the oid via smgrrelation. We would like to add a hook just before the smgrextend to get the oid and avoid use RelidByRelfilenode(). New patch is attached in the attachment. Thank a lot! Regards, Haozhou On Wed, Nov 21, 2018 at 10:48 PM Robert Haas wrote: > On Tue, Nov 20, 2018 at 2:20 AM Haozhou Wang wrote: > > We prepared a patch that includes the hook points. And such hook points > are needed for disk quota extension. > > There are two hooks. > > One is SmgrStat_hook. It's used to perform ad-hoc logic in storage when > doing smgr create/extend/truncate in general. As for disk quota extension, > this hook is used to detect active tables(new created tables, tables > extending new blocks, or tables being truncated) > > The other is BufferExtendCheckPerms_hook. It's used to perform ad-hoc > logic when buffer extend a new block. Since ReadBufferExtended is a hot > function, we call this hook only when blockNum == P_NEW. As for disk quota > extension, this hook is used to do query enforcement during the query is > loading data. > > > > Any comments are appreciated. > > +1 for adding some hooks to support this kind of thing, but I think > the names you've chosen are not very good. The hook name should > describe the place from which it is called, not the purpose for which > one imagines that it will be used, because somebody else might imagine > another use. Both BufferExtendCheckPerms_hook_type and > SmgrStat_hook_type are imagining that they know what the hook does - > CheckPerms in the first case and Stat in the second case. > > For this particular purpose, I don't immediately see why you need a > hook in both places. If ReadBuffer is called with P_NEW, aren't we > guaranteed to end up in smgrextend()? > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > disk_quota_hooks_v2.patch Description: Binary data
Re: Control your disk usage in PG: Introduction to Disk Quota Extension
Hi all, We prepared a patch that includes the hook points. And such hook points are needed for disk quota extension. There are two hooks. One is SmgrStat_hook. It's used to perform ad-hoc logic in storage when doing smgr create/extend/truncate in general. As for disk quota extension, this hook is used to detect active tables(new created tables, tables extending new blocks, or tables being truncated) The other is BufferExtendCheckPerms_hook. It's used to perform ad-hoc logic when buffer extend a new block. Since ReadBufferExtended is a hot function, we call this hook only when blockNum == P_NEW. As for disk quota extension, this hook is used to do query enforcement during the query is loading data. Any comments are appreciated. Regards, Haozhou On Wed, Nov 14, 2018 at 6:07 PM Hubert Zhang wrote: > Thanks, Tomas, > > Yes, we want to add the hooks into postgres repo. > But before that, we plan to firstly get some feedbacks from community > about the diskquota extension implementation and usage? > Later, we'll modify our license and submit the hooks into CF. > > Thanks > Hubert > > On Wed, Nov 14, 2018 at 3:54 AM Tomas Vondra > wrote: > >> On Tue, 2018-11-13 at 16:47 +0800, Hubert Zhang wrote: >> > Hi all, >> > >> > We implement disk quota feature on Postgresql as an extension(link: >> > https://github.com/greenplum-db/diskquota), >> > If you are interested, try and use it to limit the amount of disk >> > space that >> > a schema or a role can use in Postgresql. >> > Any feedback or question are appreciated. >> > >> >> It's not clear to me what's the goal of this thread? I understand what >> quotas are about, but are you sharing it because (a) it's a useful >> extension, (b) you propose adding a couple of new hooks (and keep the >> extension separate) or (c) you propose adding both the hooks and the >> extension (into contrib)? >> >> I assume it's either (b) and (c), in which case you should add it to >> 2019-01 CF: https://commitfest.postgresql.org/21/ >> >> In either case, it might be useful to add a LICENSE to the github >> repository, it's not clear what's the situation in this respect. That >> probably matters especially for (b), because for (c) it'd end up with >> PostgreSQL license automatically. >> >> regards >> >> -- >> Tomas Vondra http://www.2ndQuadrant.com >> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >> >> > > -- > Thanks > > Hubert Zhang > disk_quota_hooks_v1.patch Description: Binary data
Re: Vacuum Full does not release the disk size space after delete from table
Thank Tom! We will check it. On Fri, Nov 2, 2018 at 10:35 PM Tom Lane wrote: > Haozhou Wang writes: > > We meet a corner case that related to the behavior of Vacuum Full. > > ... > > If we run both sql scripts on same database in parallel, the "VACUUM FULL > > a;" will not release the disk space. > > I think what's happening is that the delete in script 1 happens after the > "pg_sleep" in script 2 starts. Then the pg_sleep has an open snapshot > that could potentially see the deleted rows, so they can't be removed yet. > > You could check this theory by changing the vacuum to use VERBOSE, and > seeing what it says about rows that can't be removed yet. > > regards, tom lane > -- Regards, Haozhou
Vacuum Full does not release the disk size space after delete from table
Hi hackers, We meet a corner case that related to the behavior of Vacuum Full. There are two SQL scripts SQL1: - -- Test vacuum full create schema s2; set search_path to s2; create table a (i int); create table b (i int); insert into a select generate_series(1,5); delete from a where i > 10; -- Disk space should release after vacuum full vacuum full a; drop table a, b; reset search_path; drop schema s2; - SQL2: -- create schema s1; set search_path to s1; create table c(i int); insert into c select generate_series(1,5); select pg_sleep(5); reset search_path; drop schema s1; --- If we run both sql scripts on same database in parallel, the "VACUUM FULL a;" will not release the disk space. But if we run then on same database in sequence, "VACUUM FULL a" can release the disk space. We use the PG master branch and the autovacuum has set to off. Could someone please help us to figure out the root cause of this issue? Thanks a lot! -- Regards, Haozhou
Re: [PATCH] Add missing type conversion functions for PL/Python
+1, I also think that we may not change the previous behavior of plpython. @Nikita Glukhov maybe we just check the whether pyobject is int or long only in related conversion functions, and fallback otherwise? On Fri, Jul 13, 2018 at 12:09 AM Heikki Linnakangas wrote: > On 12/07/18 18:06, Nikita Glukhov wrote: > > I have added some cross-type test cases and now almost all new code is > covered > > (excluding several error cases which can be triggered only by custom > numeric > > type implementations). > > Thanks! Some of those new tests actually fail, if you run them against > unpatched master. For example: > > SELECT * FROM test_type_conversion_float8_int2(100::float8); > INFO: (100.0, ) > - test_type_conversion_float8_int2 > --- > - 100 > -(1 row) > - > +ERROR: invalid input syntax for integer: "100.0" > +CONTEXT: while creating return value > +PL/Python function "test_type_conversion_float8_int2" > > So this patch is making some subtle changes to behavior. I don't think > we want that. > > - Heikki > -- Regards, Haozhou
Re: [PATCH] Add missing type conversion functions for PL/Python
Hi Heikki, Thank you very much for your review! I will prepare a new patch and make it ready soon. Regards, Haozhou On Thu, Jul 12, 2018 at 2:03 AM Heikki Linnakangas wrote: > On 26/03/18 19:07, Nikita Glukhov wrote: > > Attached fixed 3th version of the patch: > > Thanks, I'm reviewing this now. Nice speedup! > > There is no test coverage for some of the added code. You can get a code > coverage report with: > > ./configure --enable-coverage ... > make > make -C src/pl/plpython check > make coverage-html > > That produces a code coverage report in coverage/index.html. Please look > at the coverage of the new functions, and add tests to > src/pl/plpython/sql/plpython_types.sql so that all the new code is covered. > > In some places, where you've already checked the object type e.g. with > PyFloat_Check(), I think you could use the less safe variants, like > PyFloat_AS_DOUBLE() instead of PyFloat_AsDouble(). Since this patch is > all about performance, seems worth doing. > > Some of the conversions seem a bit questionable. For example: > > > /* > > * Convert a Python object to a PostgreSQL float8 datum directly. > > * If can not convert it directly, fallback to PLyObject_ToScalar > > * to convert it. > > */ > > static Datum > > PLyObject_ToFloat8(PLyObToDatum *arg, PyObject *plrv, > > bool *isnull, bool inarray) > > { > > if (plrv == Py_None) > > { > > *isnull = true; > > return (Datum) 0; > > } > > > > if (PyFloat_Check(plrv) || PyInt_Check(plrv) || PyLong_Check(plrv)) > > { > > *isnull = false; > > return Float8GetDatum((double)PyFloat_AsDouble(plrv)); > > } > > > > return PLyObject_ToScalar(arg, plrv, isnull, inarray); > > } > > The conversion from Python int to C double is performed by > PyFloat_AsDouble(). But there's no error checking. And wouldn't > PyLong_AsDouble() be more appropriate in that case, since we already > checked the python type? > > - Heikki > -- Regards, Haozhou
Re: [PATCH] Add missing type conversion functions for PL/Python
Hi David, This new patch is look good for me. So I change the status to need review. Thanks! Regards, Haozhou On Wed, Apr 11, 2018 at 7:52 PM, David Steele wrote: > On 4/11/18 2:36 AM, Haozhou Wang wrote: > > > > Thanks for your email. > > Just wondering will I need to re-submit this patch? > > No need to resubmit, the CF entry has been moved here: > > https://commitfest.postgresql.org/18/1499/ > > You should have a look at Nikita's patch, though. > > Regards, > -- > -David > da...@pgmasters.net >
Re: [PATCH] Add missing type conversion functions for PL/Python
Hi David, Thanks for your email. Just wondering will I need to re-submit this patch? Thanks a lot! Regards, Haozhou On Tue, Apr 10, 2018 at 9:35 PM, David Steele wrote: > On 3/26/18 12:07 PM, Nikita Glukhov wrote: > > On 26.03.2018 17:19, David Steele wrote: > > > >> On 2/20/18 10:14 AM, Haozhou Wang wrote: > >>> Thank you very much for your review! > >>> > >>> I attached a new patch with typo fixed. > >> I think it's a bit premature to mark this Ready for Committer after a > >> review consisting of a few typos. Anthony only said that he started > >> looking at it so I've marked it Needs Review. > > > > Hi. > > > > I also have looked at this patch and found some problems. > > Attached fixed 3th version of the patch: > > * initialization of arg->u.scalar was moved into PLy_output_setup_func() > > * added range checks for int16 and int32 types > > * added subroutine PLyInt_AsLong() for correct handling OverflowError > > that can > >be thrown from PyInt_AsLong() > > * casting from Python float to PostgreSQL numeric using > > PyFloat_AsDouble() was > >removed because it can return incorrect result for Python long and > >float8_numeric() uses float8 and numeric I/O functions > > * fixed whitespace > > There's a new patch on this thread but since it was not submitted by the > author I've moved this entry to the next CF in Waiting for Author state. > > Regards, > -- > -David > da...@pgmasters.net >
Re: [PATCH] Add missing type conversion functions for PL/Python
Thanks Nikita! On Tue, Mar 27, 2018 at 12:07 AM, Nikita Glukhov wrote: > On 26.03.2018 17:19, David Steele wrote: > > On 2/20/18 10:14 AM, Haozhou Wang wrote: >> >>> Thank you very much for your review! >>> >>> I attached a new patch with typo fixed. >>> >> I think it's a bit premature to mark this Ready for Committer after a >> review consisting of a few typos. Anthony only said that he started >> looking at it so I've marked it Needs Review. >> > > Hi. > > I also have looked at this patch and found some problems. > Attached fixed 3th version of the patch: > * initialization of arg->u.scalar was moved into PLy_output_setup_func() > * added range checks for int16 and int32 types > * added subroutine PLyInt_AsLong() for correct handling OverflowError > that can >be thrown from PyInt_AsLong() > * casting from Python float to PostgreSQL numeric using > PyFloat_AsDouble() was >removed because it can return incorrect result for Python long and >float8_numeric() uses float8 and numeric I/O functions > * fixed whitespace > > > -- > Nikita Glukhov > > Postgres Professional: http://www.postgrespro.com > The Russian Postgres Company > > -- Regards, Haozhou
Re: [PATCH] Add missing type conversion functions for PL/Python
Thank you very much for your review! I attached a new patch with typo fixed. Regards, Haozhou On Mon, Feb 19, 2018 at 2:37 PM, Anthony Bykov wrote: > On Wed, 31 Jan 2018 11:57:12 +0800 > Haozhou Wang wrote: > > > Hi All, > > > > PL/Python already has different type conversion functions to > > convert PostgreSQL datum to Python object. However, the conversion > > functions from Python object to PostgreSQL datum only has Boolean, > > Bytea and String functions. > > > > In this patch, we add rest of Integer and Float related datatype > > conversion functions > > and can increase the performance of data conversion greatly especially > > when returning a large array. > > > > We did a quick test about the performance of returning array in > > PL/Python: > > > > The following UDF is used for test: > > > > ``` > > CREATE OR REPLACE FUNCTION pyoutfloat8(num int) RETURNS float8[] AS $$ > > return [x/3.0 for x in range(num)] > > $$ LANGUAGE plpythonu; > > ``` > > > > The test command is > > > > ``` > > select count(pyoutfloat8(n)); > > ``` > > > > The count is used for avoid large output, where n is the number of > > element in returned array, and the size is from 1.5k to15m. > > > > Size of Array 1.5k |15k | > > 150k| 1.5m| 15m | > > > > Origin 2.324ms | 19.834ms | 194.991ms > > | 1927.28ms| 19982.1ms | > > > > With this patch 1.168ms | 3.052ms | > > 21.888ms | 213.39ms |2138.5ms | > > > > All test for both PG and PL/Python are passed. > > > > Thanks very much. > > > > > > Hello, > sounds like a really nice patch. I've started looking > through the code and noticed a sort of a typos (or I just couldn't > understand what did you mean) in comments. > > file "src/pl/plpython/plpy_typeio.c" > the comment is > * If can not convert if directly, fallback to PLyObject_ToDatum > * to convert it > > Maybe it should be something like ("it" instead of second "if") > * If can not convert it directly, fallback to PLyObject_ToDatum > * to convert it > > And the same typo is repeated several times in comments. > > -- > Anthony Bykov > Postgres Professional: http://www.postgrespro.com > The Russian Postgres Company > 0001-Add-missing-type-conversion-functions-for-PL-Python-v2.patch Description: Binary data
[PATCH] Add missing type conversion functions for PL/Python
Hi All, PL/Python already has different type conversion functions to convert PostgreSQL datum to Python object. However, the conversion functions from Python object to PostgreSQL datum only has Boolean, Bytea and String functions. In this patch, we add rest of Integer and Float related datatype conversion functions and can increase the performance of data conversion greatly especially when returning a large array. We did a quick test about the performance of returning array in PL/Python: The following UDF is used for test: ``` CREATE OR REPLACE FUNCTION pyoutfloat8(num int) RETURNS float8[] AS $$ return [x/3.0 for x in range(num)] $$ LANGUAGE plpythonu; ``` The test command is ``` select count(pyoutfloat8(n)); ``` The count is used for avoid large output, where n is the number of element in returned array, and the size is from 1.5k to15m. Size of Array 1.5k |15k | 150k| 1.5m| 15m | Origin 2.324ms | 19.834ms | 194.991ms | 1927.28ms| 19982.1ms | With this patch 1.168ms | 3.052ms | 21.888ms | 213.39ms |2138.5ms | All test for both PG and PL/Python are passed. Thanks very much. -- Regards, Haozhou 0001-Add-missing-type-conversion-functions-for-PL-Python.patch Description: Binary data