paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390533239
##########
File path: contrib/storage-http/README.md
##########
@@ -0,0 +1,218 @@
+
+# Generic REST API Storage Plugin
+This plugin is intended to enable you to query APIs over HTTP/REST. At this
point, the API reader accepts only JSON as input. In the future, it may be
possible to add additional format readers to allow for APIs which return XML,
CSV or other formats.
+
+Note: This plugin should **NOT** be used for interacting with tools which
have REST APIs such as Splunk or Solr. It will not be performant for those use
cases.
+
+## Configuration
+To configure the plugin, create a new storage plugin, and add the following
configuration options which apply to ALL connections defined in this plugin:
+
+```json
+{
+ "type": "http",
+ "connection": "https://<your url here>/",
+ "cacheResults": true,
+ "timeout": 0,
+ "enabled": true
+}
+```
+The options are:
+* `type`: This should be `http`
+* `cacheResults`: Enable caching of the HTTP responses
+* `timeout`: Sets the response timeout in seconds. Defaults to `0` which is
no timeout.
+
+### Configuring the API Connections
+The HTTP Storage plugin allows you to configure multiple APIs which you can
query directly from this plugin. To do so, first add a `connections` parameter
to the configuration. Next, give the connection a name, which will be used in
queries, for instance `stockAPI` or `jira`.
+
+The `connection` can accept the following options:
+* `url`: The base URL which Drill will query. You should include the ending
slash if there are additional arguments which you are passing.
+* `method`: The request method. Must be `get` or `post`. Other methods are not
allowed and will default to `GET`.
+* `headers`: Often APIs will require custom headers as part of the
authentication. This field allows you to define key/value pairs which are
submitted with the HTTP request. The format is:
+```json
+"headers": {
+ "key1": "Value1",
+ "key2": "Value2"
+}
+```
+* `authType`: If your API requires authentication, specify the authentication
type. At the time of implementation, the plugin only supports basic
authentication; the plugin will likely support OAuth2 in the future. Defaults
to `none`. If the `authType` is set to `basic`, `username` and `password` must
be set in the configuration as well.
+ * `username`: The username for basic authentication.
+ * `password`: The password for basic authentication.
+ * `postBody`: Contains data, in the form of key value pairs, which are sent
during a `POST` request. Post body should be in the form:
+ ```
+key1=value1
+key2=value2
+```
+
+## Usage:
+This plugin differs from other plugins in how the table component of
the `FROM` clause is used. In normal Drill queries, the `FROM` clause is
constructed as follows:
+```sql
+FROM <storage plugin>.<schema>.<table>
+```
+For example, you might have:
+```sql
+FROM dfs.test.`somefile.csv`
+
+-- or
+
+FROM mongo.stats.sales_data
+```
+
+In the HTTP/REST plugin, the `FROM` clause enables you to pass arguments to
your REST call. The structure is:
+```sql
+FROM <plugin>.<connection>.<arguments>
+--Actual example:
+ FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today`
+```
+
+
+## Examples:
+### Example 1: Reference Data, A Sunrise/Sunset API
+The API sunrise-sunset.org returns data in the following format:
+
+ ```json
+ {
+ "results":
+ {
+ "sunrise":"7:27:02 AM",
+ "sunset":"5:05:55 PM",
+ "solar_noon":"12:16:28 PM",
+ "day_length":"9:38:53",
+ "civil_twilight_begin":"6:58:14 AM",
+ "civil_twilight_end":"5:34:43 PM",
+ "nautical_twilight_begin":"6:25:47 AM",
+ "nautical_twilight_end":"6:07:10 PM",
+ "astronomical_twilight_begin":"5:54:14 AM",
+ "astronomical_twilight_end":"6:38:43 PM"
+ },
+ "status":"OK"
+ }
+```
+To query this API, set the configuration as follows:
+
+```json
+{
+ "type": "http",
+ "cacheResults": false,
+ "enabled": true,
+ "timeout": 5,
+ "connections": {
+ "sunrise": {
+ "url": "https://api.sunrise-sunset.org/",
+ "method": "get",
+ "headers": null,
+ "authType": "none",
+ "userName": null,
+ "password": null,
+ "postBody": null
+ }
+ }
+}
+```
+Then, to execute a query:
+```sql
+ SELECT api_results.results.sunrise AS sunrise,
+ api_results.results.sunset AS sunset
+ FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today` AS
api_results;
Review comment:
This is screaming out for filter push-down. Let's do this: get the basic
version in so we have a firm foundation. Then, add filter push-down, using this
plugin as the primary use case.
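
To make the idea concrete, here is a minimal sketch of what push-down could do for this plugin. It is written in Python for brevity and does not use Drill's planner API. It assumes only equality predicates are pushed, and that each filtered column name maps directly to a URL query parameter. The helper `push_down_filters` and its arguments are hypothetical.

```python
from urllib.parse import urlencode


def push_down_filters(base_url, path, filters):
    """Build a request URL from equality filters (hypothetical helper).

    base_url: the connection's `url` option from the plugin config.
    path:     the API endpoint path, e.g. "/json".
    filters:  dict of column -> value, taken from `WHERE col = value` predicates.
    """
    # Join base URL and path with exactly one slash between them.
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    if filters:
        # urlencode escapes the values and joins the pairs with "&",
        # preserving the dict's insertion order.
        url += "?" + urlencode(filters)
    return url


# Corresponds to: WHERE lat = 36.7201600 AND lng = -4.4203400 AND `date` = 'today'
url = push_down_filters(
    "https://api.sunrise-sunset.org/",
    "/json",
    {"lat": "36.7201600", "lng": "-4.4203400", "date": "today"},
)
print(url)
```

With the filters above, this produces the same URL the README example builds by hand inside the table name, so the user could write an ordinary `WHERE` clause instead.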