Hi, Please not break mail thread. You make tracking discussions difficult.
Also do not use non ASCII characters in your mail. This is an English mailing list. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp >> > Hello! I am a GSoC 2011 student. >> > My GSoC project's name is "Caching query results in pgpool-II", >> > and I am sending a specification about it. >> > If you give me your opinion, I appreciate it so much. >> > >> > --- >> > â– external specification(for user) >> > >> > [synopsis] >> > >> > On memory query cache allows you to speed-up database access by reusing >> > SELECT results. >> > >> > - The cache can be stored in either shared memory or memcached (not >> both). >> > >> > - The result of SELECT using none IMMUTABLE functions is not cached. >> > >> > - If the size of SELECT result is larger than the value specified in >> > pgpool.conf, it is not cached. >> > >> > - Cache will be removed if one of following conditions are met: >> > >> > - Cache expiration time exceeds >> > - If cache is full, the least recently used cache is removed >> > - If a table is dropped or modified, related cache data is deleted. >> > - If a table definition is modified(ALTER TABLE), related cache data is >> > deleted. >> > - If a schema or database is dropped, related cache data is deleted. >> >> How do you track DDL changes ? table drop/altered or database dropped as >> instance ? >> > > I am going to prepare for a structure for tables and a structure for cache > data. > This structure for tables has table OID in RowDescription (table OID is > unique in DB clustor), > and cache_id. On the other hand, cache is stored in memory block and each > cache data > relate to cache_id of the structure for tables. > > >> > [pgpool.conf Setting] >> > >> > - To enable the memory cache functionality, set this to on (default is >> off) >> > >> > memory_cache_enabled = on/off >> > >> > - Memory cache behavior can be specified by cache_param directive. >> > >> > cache_param = 'method=shmem, *1 >> > >> > memcached_host=, *2 >> > memcached_port=, *3 >> > total_size=, *4 >> > expire=60, *5 >> > maxcache=, *6 >> > cache_block_size=8192' *7 >> > >> > *1 Cache store method. Either "shmem"(shared memory) or "memcached". >> Default >> > is shmem. >> > >> > *2 Memcached host name. Mandatory if method=memcached. If this parameter >> is >> > starting with "/", UNIX domain socket is assumed. >> > >> > *3 Memcached port number. Mondatory if method=memcached. >> > >> > *4 Total memory size in bytes for storing memory cache. Mandatory if >> > method=shmem. >> > >> > *5 Memory cache entry life time specified in seconds. Default is 60. >> > >> > *6 Maximum SELECT result size in bytes. If exceeds this value, the result >> is >> > not cached. >> > >> > *7 Cache block size in bytes >> > >> > Only meaning full when method=shmem. Memory cache is devided into fixed >> size >> > of block. >> > If the block size is small, memory usage is efficient, but processing >> might >> > be slow. On the other hand, if the block size is large, memory usage is >> not >> > efficient but processing might be fast. Default is 8192. >> > >> > >> > â– Internal specification >> > >> > 1) Fetch query from frontend. >> > >> > (Extended Query's case) >> > get a message in "Execute" process after executing "Parse" or "Bind" >> >> I don't understand what you meant here. I guess you want to save the >> query from the Parse message, then parameters from the Bind one. right ? >> >> yes > > > >> > 2) Check the SQL from Frontend >> > >> > 2.1) Parse the query. If the query is SELECT >> > >> > 2.1.1) Get md5 hash value of the query string. >> > search for the corresponding data between this MD5 string and >> > that in cache >> >> What about swapping 2.1.1 and 2.1 ? we would avoid parsing a query >> multiple times if it is already in the cache. >> > > parse query and fetch the string "SELECT" or others ("INSERT", "UPDATE" and > so on..). > The string is used for the IF Condition. if the string is "SELECT", key of > md5 hash value of > the string "SELECT" search for caching data, so I thought it necessary to > parse the > query before IF Condition. > > >> > 2.1.1.1) If corresponding data is found >> > >> > 2.1.1.1.1) Send cached data to frontend. >> > >> > 2.1.1.2) If correnponding data is not found >> > >> > 2.1.1.2.1) Send the query to backend and get result. >> > >> > (Extended Query's case) >> > A portal is executed using an "Execute message" and get the results >> > >> > 2.1.1.2.2) Check the result's size >> > >> > 2.1.1.2.2.1) If the result's size from Backend is smaller than value of >> > "maxcache" set in pgpool.conf >> > >> > 2.1.1.2.2.1.1) Call PostgreSQL system catalog and check Function >> Volatility >> > Categories in SQL >> >> Couldn't we move this right after 2.1.1.2 ? >> > > This can be shifted to right after 2.1.1.2. > > >> > >> > 2.1.1.2.2.1.1.1) If there are no function, or only an IMMUTABLE function >> in >> > SQL >> > >> > 2.1.1.2.2.1.1.1.1) Store the query result in shared memory cache or >> > memcached. >> >> >> When reading this I thought about something else: what about tracking >> queries to exclude as well ? having a list of md5 to blacklist because >> of non-SELECT queries or using non IMMUTABLE functions ? >> If possible, I guess it would allow to shortcut some steps here in a lot >> of cases. >> > > It is good idea. I am thinking about specification again. > thanks! > > >> >> > ---------- >> > best regards >> > Masanori Yamazaki >> >> - -- >> Jehan-Guillaume (ioguix) de Rorthais >> DBA >> http://www.dalibo.com >> -----BEGIN PGP SIGNATURE----- >> Version: GnuPG v1.4.11 (GNU/Linux) >> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ >> >> iEYEARECAAYFAk3ji1EACgkQXu9L1HbaT6I4gwCgjT6ROYPZR0mOo5YSM7Bz9+7Y >> WxMAnjHhBaGbIckglgu1K8pyChiYh6jo >> =J+rY >> -----END PGP SIGNATURE----- >> _______________________________________________ Pgpool-hackers mailing list [email protected] http://pgfoundry.org/mailman/listinfo/pgpool-hackers
