> Hackers > > Hello! I am a GSoC 2011 student. > My GSoC project's name is "Caching query results in pgpool-II", > and I am sending a specification about it. > If you give me your opinion, I appreciate it so much. > > --- o external specification(for user) [remove non ASCII characters]
> [synopsis] > > On memory query cache allows you to speed-up database access by reusing > SELECT results. > > - The cache can be stored in either shared memory or memcached (not both). > > - The result of SELECT using none IMMUTABLE functions is not cached. > > - If the size of SELECT result is larger than the value specified in > pgpool.conf, it is not cached. > > - Cache will be removed if one of following conditions are met: > > - Cache expiration time exceeds > - If cache is full, the least recently used cache is removed > - If a table is dropped or modified, related cache data is deleted. > - If a table definition is modified(ALTER TABLE), related cache data is > deleted. > - If a schema or database is dropped, related cache data is deleted. Other things we have to care: - We should not cache data until current transaction committed. SELECT results generated in an aborted transaction apparently should not be cached. - SELECT results from cursor operations shoud not be cached. > [pgpool.conf Setting] > > - To enable the memory cache functionality, set this to on (default is off) > > memory_cache_enabled = on/off > > - Memory cache behavior can be specified by cache_param directive. > > cache_param = 'method=shmem, *1 > > memcached_host=, *2 > memcached_port=, *3 > total_size=, *4 > expire=60, *5 > maxcache=, *6 > cache_block_size=8192' *7 > > *1 Cache store method. Either "shmem"(shared memory) or "memcached". Default > is shmem. > > *2 Memcached host name. Mandatory if method=memcached. If this parameter is > starting with "/", UNIX domain socket is assumed. > > *3 Memcached port number. Mondatory if method=memcached. > > *4 Total memory size in bytes for storing memory cache. Mandatory if > method=shmem. > > *5 Memory cache entry life time specified in seconds. Default is 60. > > *6 Maximum SELECT result size in bytes. If exceeds this value, the result is > not cached. > > *7 Cache block size in bytes > > Only meaning full when method=shmem. Memory cache is devided into fixed size > of block. > If the block size is small, memory usage is efficient, but processing might > be slow. On the other hand, if the block size is large, memory usage is not > efficient but processing might be fast. Default is 8192. > > > â– Internal specification > > 1) Fetch query from frontend. > > (Extended Query's case) > get a message in "Execute" process after executing "Parse" or "Bind" > > 2) Check the SQL from Frontend > > 2.1) Parse the query. If the query is SELECT > > 2.1.1) Get md5 hash value of the query string. > search for the corresponding data between this MD5 string and > that in cache > > 2.1.1.1) If corresponding data is found > > 2.1.1.1.1) Send cached data to frontend. > > 2.1.1.2) If correnponding data is not found > > 2.1.1.2.1) Send the query to backend and get result. > > (Extended Query's case) > A portal is executed using an "Execute message" and get the results > > 2.1.1.2.2) Check the result's size > > 2.1.1.2.2.1) If the result's size from Backend is smaller than value of > "maxcache" set in pgpool.conf > > 2.1.1.2.2.1.1) Call PostgreSQL system catalog and check Function Volatility > Categories in SQL > > 2.1.1.2.2.1.1.1) If there are no function, or only an IMMUTABLE function in > SQL > > 2.1.1.2.2.1.1.1.1) Store the query result in shared memory cache or > memcached. > > > > ---------- > best regards > Masanori Yamazaki _______________________________________________ Pgpool-hackers mailing list [email protected] http://pgfoundry.org/mailman/listinfo/pgpool-hackers
