Hi Charles,

Makes sense. Perhaps there is a middle ground: the user creates a different 
plugin config for each API "group". That is, one for Google, another for 
Facebook, another for CatVideoTube. The config would include things like 
security tokens, REST format and so on.

Then, within that, use your idea to have a map of endpoints to table names. 
That is, "/people/who/live/in/washington/json" for Facebook gets mapped to a 
table called "DCUsers".

The point is that the endpoints in one config all belong to the same service; 
different services get separate plugin configs.
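
As a rough sketch of that layout (every field name and URL below is 
hypothetical, not the plugin's actual config format), one config object per 
service could hold the shared settings plus the map of table names to 
endpoints:

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ServiceConfig:
    """One plugin config per API "group". All names here are hypothetical."""
    base_url: str
    auth_token: str
    response_format: str = "json"
    # Maps a table name to the endpoint path that serves it.
    tables: Dict[str, str] = field(default_factory=dict)

    def url_for(self, table: str) -> str:
        # Join the service-level base URL with the table's endpoint path.
        return self.base_url.rstrip("/") + self.tables[table]


# Example: a Facebook config exposing one endpoint as the "DCUsers" table.
facebook = ServiceConfig(
    base_url="https://api.facebook.example",  # placeholder URL
    auth_token="<token>",
    tables={"DCUsers": "/people/who/live/in/washington/json"},
)
```

A second service (Google, CatVideoTube) would simply be another 
ServiceConfig instance with its own token and table map.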

We will still need to map SQL predicates to JSON query strings. This is a big 
mess at present: each plugin copies the same wad of code from the previous one, 
sometimes with the same bugs. So it makes sense to tackle that part step by 
step.

Note that each endpoint will need different parameters, so these probably 
belong in your endpoint config. That is, the "users" table might take a "city" 
parameter, while a "posts" endpoint might take "topic", "start date" and 
"end date" parameters.
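
To make the predicate mapping concrete, here is a minimal sketch, assuming 
only equality predicates (WHERE col = value) and a hypothetical per-endpoint 
table of allowed parameters. It is not the existing plugin code; it only 
shows how predicates could be split into those sent in the query string and 
those left for the engine to evaluate:

```python
from urllib.parse import urlencode

# Hypothetical per-endpoint config: SQL column name -> REST parameter name.
ENDPOINT_PARAMS = {
    "users": {"city": "city"},
    "posts": {"topic": "topic", "start_date": "startDate", "end_date": "endDate"},
}


def build_query(table, predicates):
    """Split equality predicates into pushed-down params and leftovers.

    predicates: list of (column, value) pairs taken from WHERE col = value.
    Returns (query_string, remaining); `remaining` must still be applied
    by the query engine after the REST call returns.
    """
    allowed = ENDPOINT_PARAMS[table]
    pushed, remaining = {}, []
    for column, value in predicates:
        if column in allowed and allowed[column] not in pushed:
            pushed[allowed[column]] = value
        else:
            # Unknown column, or a second predicate on the same parameter.
            remaining.append((column, value))
    return urlencode(pushed), remaining
```

For example, build_query("posts", [("topic", "cats"), ("author", "bob")]) 
pushes topic into the query string and leaves the author predicate for the 
engine.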

The great thing about the REST "standard" is that everyone has their own. 
You'll probably get lots of feature requests as folks encounter all the odd 
things that have been done with REST. No need to boil the ocean; these can be 
tackled as they crop up.

I like your idea of using Swagger definitions. At least that imposes some 
sanity on REST. (But, of course, someone will REALLY need to get at a 
non-Swagger API.)

Another thing that would be handy would be sharding: the ability to take a big 
query (all those Washington DC users) and split it into smaller queries (DC 
users A-G, H-M, N-T, U-Z). Then, the big REST query could be done in parallel 
as a series of smaller queries. This works especially well for things like time 
series data (which is the case I had to handle recently).
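
A minimal sketch of the sharding idea for the time-series case, assuming the 
API accepts start and end bounds. The fetch_shard function is a hypothetical 
stand-in for the real REST call, and the thread pool is just one possible way 
to run the shards in parallel:

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime


def split_range(start, end, shards):
    """Split [start, end) into `shards` contiguous, non-overlapping ranges."""
    step = (end - start) / shards
    bounds = [start + step * i for i in range(shards)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))


def fetch_shard(time_range):
    # Hypothetical placeholder: a real plugin would issue one HTTP request
    # per shard, passing these bounds as query parameters.
    lo, hi = time_range
    return [f"rows for {lo.isoformat()} .. {hi.isoformat()}"]


def sharded_query(start, end, shards=4):
    ranges = split_range(start, end, shards)
    with ThreadPoolExecutor(max_workers=shards) as pool:
        # map() preserves shard order, so results concatenate in time order.
        chunks = pool.map(fetch_shard, ranges)
        return [row for chunk in chunks for row in chunk]
```

The same split-then-merge shape would apply to the A-G, H-M, N-T, U-Z 
example, with name prefixes in place of time bounds.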

Thanks,
- Paul

 

On Wednesday, January 8, 2020, 01:32:20 PM PST, Charles Givre <cgi...@gmail.com> wrote:
 
Hey Paul, thanks for the review on the plugin. I figured I'd send a response 
here on the dev alias so as not to spam everyone with GitHub responses.

Design: I did a lot of experimenting with this. The original implementation I 
found set things up such that a user could specify an endpoint in the config 
and append args in the query. For instance, you might have an endpoint called 
facebook which pinged some API at Facebook. The query args were the table, and 
thus the queries looked like:

SELECT * FROM facebook.`?field_1=foo&field_2=bar`

You could append as much as you wanted, so you could theoretically have 
something like:

SELECT * FROM facebook.`/people/who/live/in/washington/json?field_1=foo&field_2=bar`

That "table" name is appended to the URL specified in the plugin config to make 
the actual request.

However, after using it for a while, I wasn't loving the implementation and 
thought it would be better to have sub configs, similar to the idea of 
workspaces in the dfs plugin. Thus, you can now create a plugin called 
'googleapis' and within that, have different APIs. For example:

SELECT * FROM googleapis.googlesheets.`?param1=foo&param2=bar`

and

SELECT * FROM googleapis.googledoc.`?param1=foo`

This seemed a lot more usable than the original implementation and would 
prevent a proliferation of copies of the plugin if you were querying a bunch of 
different APIs.

I should add that, as currently implemented, the API plugin can do POST queries 
and the user can add params to the POST body.

Future Functionality: I really intended this plugin to be an MVP and perhaps 
something which could be extended for other systems that use HTTP/REST as an 
interface. If this gets used, and I do think it will, I plan on adding:
   
   - Support for OAuth2
   - Filter pushdown (once the Base framework is committed)
   - Schemas from Swagger and/or OpenAPI

Does this make sense?

  
