Hi Charles,
Makes sense. Perhaps there is a middle ground: the user creates a different
plugin config for each API "group". That is, one for Google, another for
Facebook, another for CatVideoTube. The config would include things like
security tokens, REST format and so on.
Then, within that, use your idea to have a map of endpoints to table names.
That is, "/people/who/live/in/washington/json" for Facebook gets mapped to a
table called "DCUsers".
Point is, these endpoint mappings all live within the config for a single
service, with separate plugin configs for different services.
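To make that concrete, here's a rough Python sketch of a per-service config with an endpoint-to-table map (all names, URLs and tokens here are hypothetical, not the plugin's actual format):

```python
# Hypothetical per-service plugin config: shared settings for one API
# "group", plus a map from REST endpoints to SQL table names.
facebook_config = {
    "baseUrl": "https://api.example.com",  # placeholder service URL
    "authToken": "SECRET_TOKEN",           # security token for this service
    "format": "json",                      # REST response format
    "endpoints": {
        # endpoint path -> table name exposed to SQL
        "/people/who/live/in/washington/json": "DCUsers",
    },
}

def table_for_endpoint(config, endpoint):
    """Look up the SQL table name mapped to a REST endpoint, if any."""
    return config["endpoints"].get(endpoint)
```

A second config for another service (Google, CatVideoTube, ...) would follow the same shape with its own tokens and endpoint map.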
We will still need to map SQL predicates to JSON query strings. This is a big
mess at present (each plugin copies the same wad of code from one to the next,
sometimes with the same bugs), so it makes sense to tackle that part step by
step.
We can note that each endpoint will need different parameters, so they may want
to be part of your endpoint config. That is, the "users" table might take a
"city" parameter, while a "posts" endpoint might take "topic", "start date" and
"end date" parameters.
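One way this could be sketched (Python, names illustrative only): each endpoint declares the parameters it accepts, and simple equality predicates from the WHERE clause are mapped onto the ones it knows about:

```python
from urllib.parse import urlencode

# Hypothetical endpoint declarations: each endpoint lists the
# parameters it accepts.
ENDPOINTS = {
    "users": {"params": ["city"]},
    "posts": {"params": ["topic", "start_date", "end_date"]},
}

def predicates_to_query(endpoint, predicates):
    """Translate SQL equality predicates (column -> value) into a REST
    query string, keeping only parameters this endpoint declares."""
    allowed = ENDPOINTS[endpoint]["params"]
    args = {col: val for col, val in predicates.items() if col in allowed}
    return urlencode(args)
```

Predicates that don't match a declared parameter would be left for Drill to evaluate after the fetch.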
The great thing about the REST "standard" is that everyone has their own.
You'll probably get lots of feature requests as folks encounter all the odd
things that have been done with REST. No need to boil the ocean; these can be
tackled as they crop up.
I like your idea of using Swagger definitions. At least that imposes some
sanity on REST. (But, of course, someone will REALLY need to get at a
non-Swagger API.)
Another thing that would be handy would be sharding: the ability to take a big
query (all those Washington DC users) and split it into smaller queries (DC
users A-G, H-M, N-T, U-Z). Then, the big REST query could be done in parallel
as a series of smaller queries. This works especially well for things like
time series data (which is the case I had to handle recently).
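A minimal sketch of the time-series case, assuming the big query carries a start/end range that can be cut into contiguous windows (function and names are illustrative, not part of the plugin):

```python
from datetime import datetime

# Hypothetical sharding helper: split one large time range into
# num_shards contiguous sub-ranges that can be fetched in parallel.
def shard_time_range(start, end, num_shards):
    step = (end - start) / num_shards
    return [(start + i * step, start + (i + 1) * step)
            for i in range(num_shards)]
```

Each sub-range would then become one small REST query, with the results stitched back together.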
Thanks,
- Paul
On Wednesday, January 8, 2020, 01:32:20 PM PST, Charles Givre
<[email protected]> wrote:
Hey Paul, Thanks for the review on the plugin. I figured I'd send a response
here on the dev alias so as not to spam everyone with GitHub responses.
Design: I did a lot of experimenting with this. The original implementation I
found set things up such that a user could specify an endpoint in the config
and append args in the query. For instance, you might have an endpoint called
facebook which pinged some API at facebook. The query args were the table, and
thus the queries looked like:
SELECT * FROM facebook.`?field_1=foo&field_2=bar`
You could append as much as you wanted so you could theoretically have
something like:
SELECT * FROM
facebook.`/people/who/live/in/washington/json?field_1=foo&field_2=bar`
That "table" name is appended to the URL specified in the plugin config to make
the actual request.
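In other words (a hedged Python sketch, not the plugin's actual code), the request URL would be formed roughly like:

```python
# Hypothetical sketch: append the query's "table" name (path plus query
# string) to the base URL from the plugin config.
def build_request_url(base_url, table):
    return base_url.rstrip("/") + "/" + table.lstrip("/")
```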
However, after using it for a while, I wasn't loving the implementation and
thought it would be better to have sub configs, similar to the idea of
workspaces in the dfs plugin. Thus, you can now create a plugin called
'googleapis' and within that, have different APIs. For example:
SELECT * FROM googleapis.googlesheets.`?param1=foo&param2=bar`
and
SELECT * FROM googleapis.googledoc.`?param1=foo`
This seemed a lot more usable than the original implementation and would
prevent a proliferation of copies of the plugin if you were querying a bunch of
different APIs.
I should add that as currently implemented, the API plugin can do POST queries
and the user can add params to the POST body.
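A rough illustration of that idea using the Python stdlib (the plugin's real encoding and API may differ):

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical sketch: build a POST request whose body carries the
# user-supplied params, form-encoded.
def build_post(url, params):
    body = urlencode(params).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/x-www-form-urlencoded"})
```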
Future Functionality: I intended this plugin really to be an MVP and perhaps
something which could be extended for other systems that use HTTP/REST as an
interface. If this gets used, and I do think it will, I plan on adding:
- Support for OAuth2
- Filter pushdown (once the Base framework is committed)
- Schemas from Swagger and/or OpenAPI
Does this make sense?