Hi, what follows is a review of the FDW API patch from http://archives.postgresql.org/message-id/20110114212358.82c7.69899...@metrosystems.co.jp
All three patches apply cleanly and compile without warnings. Regression tests pass. Let me go patch by patch, starting with the first one that adds the HANDLER option. It adds one useless hunk in src/backend/commands/foreigncmds.c (changes order if #includes). There's a typo in a C commnent ("determin which validator to be used"). Other than that, it looks OK. The third patch just adds a GetForeignTable helper function and it looks OK. The second patch actually adds the API. First of all, I'd like say that it's a really cool piece of code, allowing all kinds of awesome funcionality to be added. I'm already excited by the things that this will make possible. Congratulations! To get a feel of the API I wrote a simple FDW wrapper that presents data from the commitfest RSS feed, based heavily on twitter_fdw by Hitoshi Harada. Please treat my thoughts as someone's who doesn't really know *why* the API looks like it does, but has some observations about what was missing or what felt strange when using it. I guess that's the position a typical FDW wrapper writer will be in. First of all, the C comments mention that PlanRelScan should put a tuple descriptor in FdwPlan, but there's no such field in it. Also, comments for PlanRelScan talk about the 'attnos' argument, which is not in the function's signature. I guess the comments are just obsolete and should be updated. I think it's actually a good thing you don't have to put a TupleDesc in FdwPlan. There are two API methods, PlanNative and PlanQuery that are ifdef'd out using IN_THE_FUTURE. Isn't the standard symbol we use NOT_USED? Also, the comments say you can implement either PlanRelScan or PlanNative, and only the former is available for now. If we keep these methods commented, they should be moved to the end of the struct, so that uncommenting them will not break compatibility with existing FDWs. The only documentation a FDW writer has is fdwapi.h, so comments there need to be top-notch. We might contemplate writing a documentation chapter explaining how FDW handlers should be written, like we do for C SRFs and libpq programs, but I guess for now just the headers file would be enough. FdwExecutionState is just a struct around a void pointer, can we imagine adding more fields there? If not, maybe we could just remove the structure and pass void pointers around? OTOH that gives us some compiler checking and possibility of extending the struct, so I guess we could also just leave it like that. The Iterate method gets passed a TupleTableSlot. Do we really need such a low-level structure? It makes returning the result easy, because you just form your tuple and call ExecStoreTuple, but perhaps we could abstract that away a bit and add a helper method that will take a tuple and call ExecStoreTuple for us, passing InvalidBuffer and false as the remaining arguments. Or maybe make Iterate return the tuple and call ExecStoreTuple internally? I don't have strong opinions, but TupleTableSlot feels a bit too gutty - I'm having a hard time imagining what fields from it would be useful for a FDW writer, and so perhaps you don't need to expose it. Why does BeginScan accept a ParamListInfo argument? First of all, it feels like a parsing thing, not a relation scan thing, so perhaps it should be available at some other, earlier stage. Second of all, what would it be useful for anyway? Neither file_fdw nor my commitfest_fdw does anything with it. We could use comments about how to return tuples from Iterate and how to finish returning them. I had to look at the example to figure out that you need ExecClearTuple(slot) in your last Iterate. If Iterate's interface were to change, we could just return NULL instead of a tuple to say that we're done. We could be a bit more explicit about how to allocate objects, for instance if I'm allocating a FdwPlan in my PlanRelScan with a palloc, will it not go away too soon, or too late (IOW leak)? I ran into a problem when doing: select i from generate_series(1, 100000) as g(i), pgcommitfest; when I was trying to check for leaks. It returned four rows (pgcommitfest is the foreign table that returns four rows). Explain analyze shows a nested loop with a foreign scan as inner loop. Maybe it's because I didn't implement ReScan, but the API says I don't have to. If you don't implement Iterate you get a elog(ERROR). But if you don't implement one of the other required methods, you segfault. Feels inconsistent. PlanRelScan looks like something that could use all kinds of information to come up with a good plan. Maybe we could change its input argument to a single struct that would contain all the current arguments, so it'll be easier to extend when people will start writing FDWs and will find out that they'd like more information available. Doing that would mean that adding a field in a release would mean FDWs just need to be recompiled and they keep working. Or do we opt for the "better an explicitly compile error than a silent change" approach? If we'd be just passing additional info to PlanRelScan I'd say keeping old source compilable would not be a problem. Storing private info in FdwPlan's private list is really awkward. I know it's because you need copyObject support, but it's just a big pain if you want to store several different things. Passing them around as a big list and doing list_nth(private, MY_WIDGET_OFFSET) feels like writing Lisp before you learn it has structures :) Is there really no other way? Maybe PlanNative should get the foreign table OID, not the server OID, to resemble PlanRelScan more. The server OID is just a syscache lookup away, anyway. If you do EXPLAIN SELECT * FROM foregintable, BeginScan is not called, but EndScan is. That's weird, and I noticed because I got a segfault when EndScan tried to free things that BeginScan allocated. Maybe just don't call EndScan for EXPLAIN? That's all as far as the API goes. Feel free to ignore most of these remarks if you see a reason why your choices are better (or necessary). I just thought I'd try to take a look at it as a user would (which is what I am, as I don't fully understand the internals) and offer my impressions. In general, the feature looks great and I hope it'll make it into 9.1. And it we'd get the possibility to write FDW handlers in other PLs than C, it would rock so hard... I'm going to mark this a Waiting for Author because of the typos, the BeginScan/EndScan issue, and the nested loop stopping halfway issue. The rest are suggestions or just thoughts, and if you don't see them as justified, I'll mark the next patch Ready for Committer. Cheers, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers