On Mon, Nov 14, 2005 at 08:31:50PM -0500, Robert Treat wrote:
> The basic scenario is one of a function that, given input, looks up
> corresponding information in a cache table. If it can't find the
> information, it goes through a more complicated (and slower) search
> to obtain the information, inserts that information into the cache,
> and returns the result. Note it always returns the same result
> whether the cache contains the information or not, which means you
> really do only need to evaluate it once per scan. The problem is
> that when you mark such functions as volatile the performance you
> get is horrendous, so you're forced to mark them as stable so the
> planner will make use of index scans and such and give decent
> performance. Now maybe that's not a convincing use-case, but it is a
> common one.
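(For concreteness, the pattern being described might look roughly like the
sketch below. All names here - info_cache, compute_info, and so on - are
hypothetical, and the INSERT is delegated to a VOLATILE helper, since a
non-volatile function is not allowed to modify the database directly:)

-- Hypothetical schema: info_cache(cache_key, info), with compute_info()
-- standing in for the slow search.

CREATE FUNCTION cache_insert(k integer, v text) RETURNS void AS $$
BEGIN
    INSERT INTO info_cache (cache_key, info) VALUES (k, v);
EXCEPTION WHEN unique_violation THEN
    NULL;  -- another backend cached the same key concurrently; ignore
END;
$$ LANGUAGE plpgsql VOLATILE;

CREATE FUNCTION cached_info(k integer) RETURNS text AS $$
DECLARE
    result text;
BEGIN
    SELECT info INTO result FROM info_cache WHERE cache_key = k;
    IF NOT FOUND THEN
        result := compute_info(k);        -- the slow path
        PERFORM cache_insert(k, result);  -- populate the cache
    END IF;
    RETURN result;
END;
$$ LANGUAGE plpgsql STABLE;  -- mislabelled on purpose, to get index scans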
Isn't this the sort of case that Tom just explained as not working in
8.1? A STABLE function, and every function it calls, uses the snapshot
that was current when the calling query started. That is, you do the
INSERT, but within the same SELECT statement that invokes this 'STABLE'
function, the inserted cache value is never seen.

Also - what happens with parallel inserts of the same cache values? Say
three or four clients all require the data at the same time: each
executes the cache table lookup, fails to find a row, resolves the
query the slow way, and then tries to insert a cache row.

The case seems problematic to me. Isn't it better served by a caching
daemon, such as memcached? It has similar problems - not transaction
safe, and so on - but I suspect that the caching table you describe
above cannot ever be truly transaction safe, unless you store full row
dependencies for each of the cache records, and validate against the
dependencies before returning any data. Who is to say the cache data is
up-to-date? Invalidation of the cache data rows may not solve this
either. I'd say why bother?

Personally, I'm more in favour of PostgreSQL doing cheap caching of
queries to results, making those very common slow queries you mention
faster where possible. For example, keeping the query results in an LRU
cache, with an identifier that would allow it to quickly determine
whether all dependent tables are unchanged since the last execution,
and returning the results as-is if so. (A rough sketch of what I mean
follows below, after my sig.)

To make it faster, and to minimize caching of less frequent queries,
perhaps the first few times a query is executed it should only remember
the number of times it has been executed, and only after some threshold
has passed start to cache the results and the dependency information.
If a query rarely keeps the same dependency information, keep it in a
list of queries never to cache results or dependency information for?

I'm sure this has been talked about at length before I joined this
mailing list.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED]
Neighbourhood Coder / Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring
them all and in the darkness bind them...

http://mark.mielke.cc/
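P.S. A rough SQL-level illustration of the dependency-checking idea,
with all names hypothetical (the real thing would live inside the
backend, not in SQL):

-- A per-table change counter, bumped by statement-level triggers, lets
-- a cached result be validated cheaply before it is reused.
CREATE TABLE table_version (tab name PRIMARY KEY, version bigint NOT NULL);

CREATE FUNCTION bump_version() RETURNS trigger AS $$
BEGIN
    UPDATE table_version SET version = version + 1 WHERE tab = TG_RELNAME;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

-- For each table whose queries are cached:
--   CREATE TRIGGER bump AFTER INSERT OR UPDATE OR DELETE ON some_table
--       FOR EACH STATEMENT EXECUTE PROCEDURE bump_version();

CREATE TABLE query_cache (
    query_key   text PRIMARY KEY,  -- the query string, or a hash of it
    dep_tab     name NOT NULL,     -- one dependency, for simplicity
    dep_version bigint NOT NULL,   -- dep_tab's version when cached
    result      text NOT NULL
);

-- A lookup returns the cached result only if dep_version still equals
-- the current table_version row for dep_tab; otherwise the query is
-- re-run and the cache row replaced. The LRU eviction and the
-- execution-count threshold are omitted here.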