Re: [HACKERS] 7.3 schedule

Karel Zak Fri, 12 Apr 2002 00:54:11 -0700

On Fri, Apr 12, 2002 at 12:41:34AM -0400, Neil Conway wrote:
> On Fri, 12 Apr 2002 12:58:01 +0900
> "Hiroshi Inoue" <[EMAIL PROTECTED]> wrote:
> > 
> > Just a confirmation.
> > Someone is working on PREPARE/EXECUTE ?
> > What about Karel's work ?


 Right question :-)
 
> I am. My work is based on Karel's stuff -- at the moment I'm still
> basically working on getting Karel's patch to play nicely with
> current sources; once that's done I'll be addressing whatever
> issues are stopping the code from getting into CVS.

 My patch (qcache) for PostgreSQL 7.0 is available at 
 ftp://ftp2.zf.jcu.cz/users/zakkr/pg/.
 
 I very look forward to Neil's work on this. 

 Notes:

    * It's experimental patch, but usable. All features below mentioned 
      works.

    * PREPARE/EXECUTE is not only SQL statements, I think good idea is
      create something common and robus for query-plan caching,
      beacuse there is for example SPI too. The RI triggers are based 
      on SPI_saveplan(). 
 
    * My patch knows EXECUTE INTO feature:

 PREPARE foo AS SELECT * FROM pg_class WHERE relname ~~ $1 USING text;

 EXECUTE foo USING 'pg%';        <-- standard select

 EXECUTE foo INTO TEMP newtab USING 'pg%';  <-- select into
 
 
    * The patch allows store query-planns to shared memory and is
      possible EXECUTE it at more backends (over same DB) and planns
      are persistent across connetions. For this feature I create special 
      memory context subsystem (like current aset.c, but it works with 
      IPC shared memory).
      
      This is maybe too complex solution and (maybe) sufficient is cache 
      query in one backend only. I know unbelief about this shared
      memory solution (Tom?). 
      
    
    Karel
      
      
 My experimental patch README (excuse my English):

 Implementation
 ~~~~~~~~~~~~~~

   The qCache allows save queryTree and queryPlan. There is available are 
   two space for data caching. 
  
   LOCAL  - data are cached in backend non-shared memory and data aren't
            available in other backends.                  
  
   SHARE  - data are cached in backend shared memory and data are 
            visible in all backends.
  
   Because size of share memory pool is limited and it is set during
   postmaster start up, the qCache must remove all old planns if pool is 
   full. You can mark each entry as "REMOVEABLE" or "NOTREMOVEABLE". 
  
   A removeable entry is removed if pool is full.
  
   A not-removeable entry must be removed via qCache_Remove() or 
   the other routines. The qCache not remove this entry itself.
  
   All records in qCache are cached (in the hash table) under some key.
   The qCache knows two alternate of key --- "KEY_STRING" and "KEY_BINARY". 
  
   The qCache API not allows access to shared memory, all cached planns that 
   API returns are copy to CurrentMemoryContext. All (qCache_ ) routines lock 
   shmem itself (exception is qCache_RemoveOldest_ShareRemoveAble()).

        - for locking is used spin lock.

   Memory management
   ~~~~~~~~~~~~~~~~~
   The qCache use for qCache's shared pool its memory context independent on
   standard aset/mcxt, but use compatible API --- it allows to use standard
   palloc() (it is very needful for basic plan-tree operations, an example 
   for copyObject()). The qCache memory management is very simular to current
   aset.c code. It is chunk-ed blocks too, but the block is smaller - 1024b.

   The number of blocks is available set in postmaster 'argv' via option
   '-Z'.

   For plan storing is used separate MemoryContext for each plan, it 
   is good idea (Hiroshi's ?), bucause create new context is simple and 
   inexpensive and allows easy destroy (free) cached plan. This method is 
   used in my SPI overhaul instead TopMemoryContext feeding.

   Postmaster
   ~~~~~~~~~~
   The query cache memory is init during potmaster startup. The size of
   query cache pool is set via '-Z <number-of-blocks>' switch --- default 
   is 100 blocks where 1 block = 1024b, it is sufficient for 20-30 cached
   planns. One query needs somewhere 3-10 blocks, for example query like

        PREPARE sel AS SELECT * FROM pg_class;

   needs 10Kb, because table pg_class has very much columns. 
 
   Note: for development I add SQL function: "SELECT qcache_state();",
         this routine show usage of qCache.

 SPI
 ~~~
    I a little overwrite SPI save plan method and remove TopMemoryContext
    "feeding".

    Standard SPI:

        SPI_saveplan() - save each plan to separate standard memory context.

        SPI_freeplan() - free plan.

    By key SPI:

        It is SPI interface for query cache and allows save planns to SHARED
        or LOCAL cache 'by' arbitrary key (string or binary). Routines:

        SPI_saveplan_bykey()    - save plan to query cache

        SPI_freeplan_bykey()    - remove plan from query cache

        SPI_fetchplan_bykey()   - fetch plan saved in query cache

        SPI_execp_bykey()       - execute (via SPI) plan saved in query
                                  cache 

        - now, users can write functions that save planns to shared memory 
        and planns are visible in all backend and are persistent arcoss 
        connection. 

        Example:
        ~~~~~~~
        /* ----------
         * Save/exec query from shared cache via string key
         * ----------
         */
        int     keySize = 0;        
                flag = SPI_BYKEY_SHARE | SPI_BYKEY_STRING;
        char    *key = "my unique key";
        
        res = SPI_execp_bykey(values, nulls, tcount, key, flag, keySize);
        
        if (res == SPI_ERROR_PLANNOTFOUND) 
        {
                /* --- not plan in cache - must create it --- */
                
                void *plan;

                plan = SPI_prepare(querystr, valnum, valtypes);
                SPI_saveplan_bykey(plan, key, keySize, flag);
                
                res = SPI_execute(plan, values, Nulls, tcount);
        }
        
        elog(NOTICE, "Processed: %d", SPI_processed);


 PREPARE/EXECUTE
 ~~~~~~~~~~~~~~~
   * Syntax:
        
        PREPARE <name> AS <query> 
                [ USING type, ... typeN ] 
                [ NOSHARE | SHARE | GLOBAL ]
        
        EXECUTE <name> 
                [ INTO [ TEMPORARY | TEMP ] [ TABLE ] new_table ]
                [ USING val, ... valN ]
                       [ NOSHARE | SHARE | GLOBAL ]

        DEALLOCATE PREPARE 
                [ <name> [ NOSHARE | SHARE | GLOBAL ]]
                [ ALL | ALL INTERNAL ]


     I know that it is a little out of SQL92... (use CREATE/DROP PLAN instead
     this?) --- what mean SQL standard guru?

   * Where:
 
        NOSHARE --- cached in local backend query cache - not accessable
                    from the others backends and not is persisten a across
                    conection.

        SHARE   --- cached in shared query cache and accessable from
                    all backends which work over same database.

        GLOBAL  --- cached in shared query cache and accessable from
                    all backends and all databases. 

        - default is 'SHARE'
   
        Deallocate:
                
                ALL          --- deallocate all users's plans

                ALL INTERNAL --- deallocate all internal plans, like planns
                                 cached via SPI. It is needful if user
                                 alter/drop table ...etc.

   * Parameters:
        
        "USING" part in the prepare statement is for datetype setting for
        paremeters in the query. For example:

        PREPARE sel AS SELECT * FROM pg_class WHERE relname ~~ $1 USING text;

        EXECUTE sel USING 'pg%';
        

   * Limitation:
        
        - prepare/execute allow use full statement of SELECT/INSERT/DELETE/
          UPDATE. 
        - possible is use union, subselects, limit, ofset, select-into


 Performance:
 ~~~~~~~~~~~
 * the SPI

        - I for my tests a little change RI triggers to use SPI by_key API
        and save planns to shared qCache instead to internal RI hash table.

        The RI use very simple (for parsing) queries and qCache interest is 
        not visible. It's better if backend very often startup and RI check 
        always same tables. In this situation speed go up --- 10-12%. 
        (This snapshot not include this RI change.)

        But all depend on how much complicate for parser is query in 
        trigger.

 * PREPARE/EXECUTE
        
        - For tests I use query that not use some table (the executor is 
        in boredom state), but is difficult for the parser. An example:

        SELECT 'a text ' || (10*10+(100^2))::text || ' next text ' || cast 
        (date_part('year', timestamp 'now') AS text );
                  
        - (10000 * this query):

        standard select:        54 sec
        via prepare/execute:     4 sec   (93% better)

        IMHO it is nod bad.
 
        - For standard query like:

        SELECT u.usename, r.relname FROM pg_class r, pg_user u WHERE 
        r.relowner = u.usesysid;

        it is with PREPARE/EXECUTE 10-20% faster.

-- 
 Karel Zak  <[EMAIL PROTECTED]>
 http://home.zf.jcu.cz/~zakkr/
 
 C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] 7.3 schedule

Reply via email to