> > just fast reaction - why a QUOTA object?
> > Isn't ALTER SET enough?
> > Something like
> > ALTER TABLE a1 SET quota = 1MB;
> > ALTER USER ...
> > ALTER SCHEMA ...
> > A new DDL command looks like too hard a hammer.
It's an option. I prefer to consider where the quota settings are stored: the CREATE DISK QUOTA way more naturally stores quota settings in a separate pg_diskquota catalog, while the ALTER SET way is closer to storing quota settings in pg_class, pg_authid, pg_namespace etc. in an integrated way. (Note that "natural/closer" here is not a must; ALTER SET could also store into pg_diskquota, and vice versa.) Here are some differences I can think of:

1. pg_authid is a shared (global) catalog, not per-database. It is harder to track a user's disk usage across the whole cluster (consider 1000+ databases), so the semantics of CREATE DISK QUOTA ON USER are limited: it only tracks the user's disk usage inside the current database.

2. A separate pg_diskquota could hold more fields besides the quota limit without adding too many fields to pg_class, e.g. a red zone to give the user a warning, or the current disk usage of the db objects.

On Fri, Sep 21, 2018 at 8:01 PM Pavel Stehule <pavel.steh...@gmail.com> wrote:

> On Fri, Sep 21, 2018 at 13:32, Hubert Zhang <hzh...@pivotal.io> wrote:
>
>> Hi all,
>>
>> We redesigned the disk quota feature based on the comments from Pavel Stehule and Chapman Flack. Here is the new design.
>>
>> Overview
>>
>> Basically, the disk quota feature is meant to support multi-tenant environments: different levels of database objects can be given a quota limit to avoid overuse of disk space. A common case could be as follows: the DBA enables disk quota on a specified list of databases and sets disk quota limits for tables/schemas/roles in those databases. A separate disk quota worker process monitors the disk usage of these objects and detects the objects which exceed their quota limit. Queries loading data into these "out of disk quota" tables/schemas/roles will be cancelled.
>>
>> We are currently at the initial implementation stage.
>> We would like to propose our idea first and get feedback from the community for quick iteration.
>>
>> SQL Syntax (how to use disk quota)
>>
>> 1. Specify the databases with disk quota enabled in the GUC "diskquota_databases" in postgresql.conf and restart the database.
>>
>> 2. The DBA can set a disk quota limit for a table/schema/role:
>>
>> CREATE DISK QUOTA tablea1 ON TABLE a1 WITH (quota = '1MB');
>> CREATE DISK QUOTA roleu1 ON USER u1 WITH (quota = '1GB');
>> CREATE DISK QUOTA schemas1 ON SCHEMA s1 WITH (quota = '3MB');
>
> just fast reaction - why a QUOTA object?
>
> Isn't ALTER SET enough?
>
> Something like
>
> ALTER TABLE a1 SET quota = 1MB;
> ALTER USER ...
> ALTER SCHEMA ...
>
> A new DDL command looks like too hard a hammer.
>
>> 3. Simulate a schema-out-of-quota case: suppose table a1 and table a2 are both under schema s1.
>>
>> INSERT INTO a1 SELECT generate_series(1,1000);
>> INSERT INTO a2 SELECT generate_series(1,3000000);
>> SELECT pg_sleep(5);
>> INSERT INTO a1 SELECT generate_series(1,1000);
>> ERROR: schema's disk space quota exceeded
>> DROP TABLE a2;
>> SELECT pg_sleep(5);
>> INSERT INTO a1 SELECT generate_series(1,1000);
>> INSERT 0 1000
>>
>> Architecture
>>
>> Disk quota has the following components.
>>
>> 1. Quota Setting Store is where the disk quota settings are stored and accessed. We plan to use a catalog table pg_diskquota to store this information. pg_diskquota looks like:
>>
>> CATALOG(pg_diskquota,6122,DiskQuotaRelationId)
>> {
>>     NameData quotaname;      /* diskquota name */
>>     int16    quotatype;      /* diskquota type name */
>>     Oid      quotatargetoid; /* diskquota target db object oid */
>>     int32    quotalimit;     /* diskquota size limit in MB */
>>     int32    quotaredzone;   /* diskquota redzone in MB */
>> } FormData_pg_diskquota;
>>
>> 2. Quota Change Detector is the monitor of size changes of database objects. We plan to use the stat collector to detect the 'active' table list at the initial stage.
>> But the stat collector has some limitations in finding an active table that is inside a running transaction. See the TODO section for details.
>>
>> 3. Quota Size Checker is where sizes are calculated and compared with the quota limits of database objects. Following Pavel's comment, the autovacuum launcher and worker processes could be a good reference for disk quota, so we plan to use a disk quota launcher daemon process and several disk quota worker processes to do this work. The launcher process is responsible for starting worker processes based on a user-defined database list from a GUC. A worker process connects to its target database and monitors the disk usage of objects in that database. In its init stage, the worker process calls calculate_total_relation_size() to calculate the size of each user table. After the init stage, the worker process refreshes the disk model every N seconds. Refreshing only recalculates the sizes of tables in the 'active' table list, which is generated by the Quota Change Detector to minimize the cost.
>>
>> 4. Quota Enforcement Operator is where the quota limit is checked on the postgres backend side. We will first implement it in ExecCheckRTPerms() as pre-running enforcement. It checks the disk quota of tables being inserted into or updated, and reports an error if the quota limit of the table, the table's schema, or the table's owner is exceeded. As a native feature, we plan to add more checkpoints to do running-query enforcement. For example, if a disk quota has only 10MB left, a query could still try to insert 1GB of data. Such a query would be allowed by the pre-running enforcement check, but will be cancelled by the running-query enforcement check, which improves the accuracy of disk quota enforcement. To achieve this, we plan to add a checkpoint in a lower-level API such as smgrextend(), so the Quota Enforcement Operator checks disk quota usage whenever smgrextend() is called.
>> If the quota is over the limit, the current query will be cancelled.
>>
>> Highlights
>>
>> 1. Native feature. Support native CREATE/DROP DISK QUOTA SQL statements. A new catalog table pg_diskquota stores disk quota settings.
>>
>> 2. Auto DML/DDL detection. Table create/update/insert/delete/vacuum/truncate/drop/schema_change/owner_change, schema create/drop and role create/drop will be detected by disk quota automatically.
>>
>> 3. Low-cost disk quota checker. The disk quota worker process needs to refresh the disk usage model every N seconds. Since recalculating file sizes with the stat() system call is expensive for a large number of files, we use an 'active' table list to reduce the real work in each iteration. A basic experiment with our initial implementation on a database with 20K tables shows that the refresh costs 1% CPU usage and finishes within 50ms.
>>
>> Todo/Limitations
>>
>> Before we propose our patch, we plan to enhance it with the following ideas:
>>
>> 1. Setting the list of databases with disk quota enabled dynamically, without restarting the database. Since we have the disk quota launcher process, it could detect changes to the 'diskquota_databases' list and start/stop the corresponding disk quota worker processes.
>>
>> 2. Enforcement while a query is running. Consider the case where there is 10MB of quota left, but the next query will insert 10GB of data. The current enforcement design allows this query to be executed. This is limited by the fact that 'active' table detection is driven by the stat collector: a postgres backend only sends table stat information to the collector when the transaction ends. We need a new way to detect an 'active' table even while it is being modified inside a running transaction.
>>
>> 3. Monitoring an unlimited number of databases. Currently we cap the number of disk quota worker processes at 10 to reduce the impact on normal workloads. But what if we want to monitor the disk quota of more than 10 databases?
>> Our solution is to let the disk quota launcher manage a queue of databases that need to be monitored. Disk quota worker processes consume the queue and refresh the disk usage/quota for their database. After some period, a worker returns its database to the queue and fetches the top database from the queue to process. The period determines the delay in detecting disk quota changes. To implement this, we need to support rebinding a postmaster subprocess to another database instead of the database bound in InitPostgres().
>>
>> 4. Supporting active table detection for VACUUM FULL and VACUUM ANALYZE. Currently VACUUM FULL and VACUUM ANALYZE are not tracked by the stat collector.
>>
>> Thanks to Heikki, Pavel Stehule and Chapman Flack for the earlier comments on the disk quota feature. Any comments on how to improve it are appreciated.
>>
>> On Mon, Sep 3, 2018 at 12:05 PM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
>>
>>> 2018-09-03 3:49 GMT+02:00 Hubert Zhang <hzh...@pivotal.io>:
>>>
>>>> Thanks Pavel.
>>>> Your patch did enforcement at the storage level (md.c, or we could also use smgrextend()). It's straightforward.
>>>> But I prefer to implement disk quota as a feature with the following objectives:
>>>> 1. Set/alter disk quota settings on different database objects, e.g. user, database, schema etc.: not only a general GUC, but separate quota limits for specific objects.
>>>> 2. The enforcement operator should work at two positions: before a query runs and while it is running. The latter's implementation may be similar to your patch.
>>>
>>> The patch was just an example. The resource quotas should be more complex (per partition, table, schema, database, user), so GUCs are possible, but not very user friendly.
>>>
>>> Our case is specific, but not too much. The servers are used for multidimensional analyses, and some tables can grow too fast (COPY, INSERT SELECT).
>>> We need to enforce the limits immediately. The implementation is simple, so I did it. The same implementation at the database level or schema level needs some more locks, so it will not be as effective. Resource management can be very complex, and I expect it will be hard work.
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>>> On Sun, Sep 2, 2018 at 8:44 PM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> 2018-09-02 14:18 GMT+02:00 Hubert Zhang <hzh...@pivotal.io>:
>>>>>
>>>>>> Thanks Chapman.
>>>>>> @Pavel, could you please explain more about your second suggestion, "implement some quotas on storage level"?
>>>>>
>>>>> See the attached patch - it is very simple - and good enough for our purposes.
>>>>>
>>>>> Regards
>>>>>
>>>>> Pavel
>>>>>
>>>>>> We will not keep long-lived processes attached to all databases (like the servers with thousands of databases you mentioned).
>>>>>> And you are right, we could share ideas with the autovacuum process and fork worker processes as needed.
>>>>>> "autovacuum checks for tables that have had a large number of inserted, updated or deleted tuples. These checks use the statistics collection facility"
>>>>>> The diskquota process is similar to autovacuum in caring about inserts, but the difference is that it also cares about VACUUM FULL, TRUNCATE and DROP, while UPDATE and DELETE may not be interesting since no file change happens. So a separate diskquota process is preferred.
>>>>>>
>>>>>> So if we implement disk quota as a full native feature, in the first version I prefer to implement the following:
>>>>>> 1. Fork a diskquota launcher process under the postmaster ServerLoop, which is long-lived.
>>>>>> 2. The diskquota launcher process is responsible for creating a diskquota worker process for every database.
>>>>>> 3. Diskquota settings are stored in a separate catalog table for each database.
>>>>>> 4. Initialization stage: the diskquota launcher process creates diskquota worker processes for all the databases (traversing them like autovacuum does). A worker process calculates the disk usage of db objects and reads their diskquota settings. If any db object exceeds its quota limit, it is put into the blacklist in shared memory, which is later used by the enforcement operator. The worker process exits when its work is done.
>>>>>> 5. Running stage: the diskquota launcher process creates a diskquota worker process for any database with a large number of INSERT, COPY, TRUNCATE, DROP etc. or CREATE DISK QUOTA statements. The worker process updates the file sizes of the db objects containing the result relation and compares them with the diskquota settings. Again, if a quota limit is exceeded, the object is put into the blacklist, and removed from the blacklist vice versa. The worker process exits when its work is done, and a GUC controls the frequency of worker process restarts for a specific database. As you know, this GUC also controls the enforcement delay.
>>>>>> 6. Enforcement: when a postgres backend executes a query, it checks the blacklist in shared memory to determine whether the query is allowed (before execution) or needs to be rolled back (while executing).
>>>>>>
>>>>>> If we implemented disk quota as an extension, we could just use a background worker to start the diskquota launcher process, and the launcher would use RegisterDynamicBackgroundWorker() to fork child diskquota worker processes, as suggested by @Chapman.
>>>>>> Diskquota settings could be stored in a user table in a separate schema for each database (schema and table created by the CREATE EXTENSION statement), just like what Heikki has done in the pg_quota project.
>>>>>> But in this case, we need to create the extension in each database before a diskquota worker process can be set up for that database.
>>>>>>
>>>>>> Any comments on the above design, and which is preferred as the POC, native feature or extension?
>>>>>>
>>>>>> -- Hubert
>>>>>>
>>>>>> On Fri, Aug 31, 2018 at 3:32 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
>>>>>>
>>>>>>> 2018-08-30 16:22 GMT+02:00 Chapman Flack <c...@anastigmatix.net>:
>>>>>>>
>>>>>>>> On 08/30/2018 09:57 AM, Hubert Zhang wrote:
>>>>>>>>
>>>>>>>> > 2 Keep one worker process for each database, but use a parent/global quota worker process to manage the lifecycle of the database-level worker processes. It could handle newly created databases (avoiding a database restart) and save resources when a database is not used. But this requires the worker processes to be hierarchical; the postmaster becomes the grandfather of the database-level worker processes in this case.
>>>>>>>>
>>>>>>>> I am using background workers this way in 9.5 at $work.
>>>>>>>>
>>>>>>>> In my case, one worker lives forever, wakes up on a set period, and starts a short-lived worker for every database, waiting for each one before starting the next.
>>>>>>>>
>>>>>>>> It was straightforward to implement. Looking back over the code, I see the global worker assigns its own PID to worker.bgw_notify_pid of each of its children, and also obtains a handle for each child from RegisterDynamicBackgroundWorker().
>>>>>>>>
>>>>>>>> I imagine the global quota worker would prefer to start workers for every database and then just wait for notifications from any of them, but that seems equally straightforward at first glance.
>>>>>>>
>>>>>>> There are servers with thousands of databases.
>>>>>>> A worker per database is not a good idea.
>>>>>>>
>>>>>>> It should share ideas and code with the autovacuum process.
>>>>>>>
>>>>>>> I am not sure how effective an implementation based on bgworkers can be. On servers with a large set of databases and a large set of tables, it can identify a too-big table too late.
>>>>>>>
>>>>>>> Isn't it better to implement some quotas at the storage level?
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> Pavel
>>>>>>>
>>>>>>>> -Chap
>>>>>>
>>>>>> --
>>>>>> Thanks
>>>>>>
>>>>>> Hubert Zhang
>>>>
>>>> --
>>>> Thanks
>>>>
>>>> Hubert Zhang
>>
>> --
>> Thanks
>>
>> Hubert Zhang

--
Thanks

Hubert Zhang