[ https://issues.apache.org/jira/browse/METRON-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670567#comment-16670567 ]
ASF GitHub Bot commented on METRON-1850:
----------------------------------------

GitHub user merrimanr opened a pull request:

https://github.com/apache/metron/pull/1250

METRON-1850: Stellar REST function

## Contributor Comments
This PR adds a Stellar REST function that can be used to enrich messages with data from 3rd party REST services. This function leverages the Apache HttpComponents library for HTTP requests. The function call follows this format: `REST_GET(rest uri, optional rest settings)`

There are a handful of settings including basic authentication credentials, proxy support, and timeouts. These settings are included in the `RestConfig` class. Any of these settings can be defined in the global config under the `stellar.rest.settings` property and will override any default values. The global config settings can also be overridden by passing in a config object to the expression. This will allow support for multiple REST services that may have different requirements.

Responses are expected to be in JSON format. Successful requests will return a MAP object in Stellar. Errors will be logged and null will be returned by default. There are ways to override this behavior by configuring a list of acceptable status codes and/or values to be returned on error or empty responses.

Considering this will be used on streaming data, how we handle timeouts is important. This function exposes HttpClient timeout settings but those are not enough to keep unwanted latency from being introduced. A hard timeout is implemented in the function to abort a request if the timeout is exceeded. The idea is to guarantee the total request time will not exceed a configured value.
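
To illustrate the hard timeout idea described above, here is a minimal Python sketch. It is not the Java code in this PR: `call_with_hard_timeout` and `slow_request` are hypothetical names, and the cooperative cancel event only stands in for aborting the underlying HTTP request. The point is that the caller waits a bounded total time for the result, independent of any connect or socket timeouts on the client.

```python
import concurrent.futures
import threading


def call_with_hard_timeout(request_fn, timeout_ms, error_value=None):
    """Run request_fn(cancel) but wait at most timeout_ms for its result."""
    cancel = threading.Event()
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(request_fn, cancel)
    try:
        # Bounds the total wait, regardless of connect/socket timeouts.
        return future.result(timeout=timeout_ms / 1000.0)
    except concurrent.futures.TimeoutError:
        cancel.set()  # signal the worker to abandon the request
        return error_value
    finally:
        pool.shutdown(wait=False)  # do not block the caller on the worker


def slow_request(cancel):
    """Hypothetical stand-in for an HTTP GET that takes about 200 ms."""
    if cancel.wait(0.2):  # True means the caller gave up on us
        return None
    return {"url": "http://httpbin.org/get"}
```

With this sketch, `call_with_hard_timeout(slow_request, 1000)` returns the simulated response, while a 1 ms timeout returns the error value (null by default), which mirrors the behavior exercised in the timeout test step of this PR.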
### Changes Included
- HttpClient capability added to Stellar
- HttpClient setup added to the various bolts and Stellar REPL
- Utility added for setting up a pooling HttpClient (and possibly other types of clients in the future)
- Configuration mechanism added for Stellar REST function including settings for authentication, proxy support, timeouts and other settings
- Function implementation and appropriate unit and integration tests

### Testing
There are several different ways to test this feature and I would encourage reviewers to get creative and look for cases I may not have thought of. For my testing, I used an online HTTP service that provides simple endpoints for simulating different use cases: `http://httpbin.org/#/`. Feel free to try your own or use this one.

I tested this in full dev using the Stellar REPL and the parser and enrichment topologies. First you need to perform a couple of setup steps:

1. Spin up full dev and ensure everything comes up and data is being indexed
2. SSH into full dev and install the Squid proxy server:
```
yum -y install squid
```
3. Create a password file that Squid can use for basic authentication:
```
yum -y install httpd-tools
touch /etc/squid/passwd && chown squid /etc/squid/passwd
htpasswd /etc/squid/passwd user # (Will prompt for a password)
```
4. Configure Squid for basic authentication by adding these lines to `/etc/squid/squid.conf`, under the lines with `acl Safe_ports*`:
```
auth_param basic program /usr/lib64/squid/ncsa_auth /etc/squid/passwd
auth_param basic children 5
auth_param basic realm Squid Basic Authentication
auth_param basic credentialsttl 2 hours
acl auth_users proxy_auth REQUIRED
http_access allow auth_users
```
5. Start Squid and verify it is working correctly:
```
service squid restart
curl --proxy-user user:password -x node1:3128 http://www.google.com/
```
6. Next create password files in HDFS:
```
su hdfs
cd ~
echo passwd > basicPassword.txt
hdfs dfs -put basicPassword.txt /apps/metron
echo password > proxyPassword.txt
hdfs dfs -put proxyPassword.txt /apps/metron
exit
```

To test with the Stellar REPL, follow these steps:

1. Start the Stellar REPL and verify the `REST_GET` function is available:
```
/usr/metron/0.6.1/bin/stellar --zookeeper node1:2181
[Stellar]>>> %functions REST
REST_GET
```
2. Test a simple get request:
```
[Stellar]>>> REST_GET('http://httpbin.org/get')
{args={}, headers={Accept=application/json, Accept-Encoding=gzip,deflate, Cache-Control=max-age=259200, Connection=close, Host=httpbin.org, User-Agent=Apache-HttpClient/4.3.2 (java 1.5)}, origin=127.0.0.1, 136.62.241.236, url=http://httpbin.org/get}
```
3. Test a get request with basic authentication:
```
[Stellar]>>> config := {'basic.auth.user':'user','basic.auth.password.path':'/apps/metron/basicPassword.txt'}
{basic.auth.user=user, basic.auth.password.path=/apps/metron/basicPassword.txt}
[Stellar]>>> REST_GET('http://httpbin.org/basic-auth/user/passwd', config)
{authenticated=true, user=user}
```
4. Try the same request without passing in the config.
You should get an authentication error:
```
[Stellar]>>> REST_GET('http://httpbin.org/basic-auth/user/passwd')
2018-10-28 00:32:20 ERROR RestFunctions:161 - Stellar REST request to http://httpbin.org/basic-auth/user/passwd expected status code to be one of [200] but failed with http status code 401:
java.io.IOException: Stellar REST request to http://httpbin.org/basic-auth/user/passwd expected status code to be one of [200] but failed with http status code 401:
    at org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.doGet(RestFunctions.java:209)
    at org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.apply(RestFunctions.java:157)
    at org.apache.metron.stellar.common.StellarCompiler.lambda$exitTransformationFunc$13(StellarCompiler.java:652)
    at org.apache.metron.stellar.common.StellarCompiler$Expression.apply(StellarCompiler.java:250)
    at org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:151)
    at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.executeStellar(DefaultStellarShellExecutor.java:409)
    at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:260)
    at org.apache.metron.stellar.common.shell.cli.StellarShell.execute(StellarShell.java:357)
    at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```
5. You should also be able to set the basic authentication settings through the global config:
```
[Stellar]>>> %define stellar.rest.settings := {'basic.auth.user':'user','basic.auth.password.path':'/apps/metron/basicPassword.txt'}
{basic.auth.user=user, basic.auth.password.path=/apps/metron/basicPassword.txt}
[Stellar]>>> REST_GET('http://httpbin.org/basic-auth/user/passwd')
{authenticated=true, user=user}
```
6. 
Now verify you can send a request through the proxy:
```
[Stellar]>>> config := {'proxy.host':'node1','proxy.port':3128,'proxy.basic.auth.user':'user','proxy.basic.auth.password.path':'/apps/metron/proxyPassword.txt'}
{proxy.basic.auth.password.path=/apps/metron/proxyPassword.txt, proxy.port=3128, proxy.host=node1, proxy.basic.auth.user=user}
[Stellar]>>> REST_GET('http://httpbin.org/get', config)
{args={}, headers={Accept=application/json, Accept-Encoding=gzip,deflate, Cache-Control=max-age=259200, Connection=close, Host=httpbin.org, User-Agent=Apache-HttpClient/4.3.2 (java 1.5)}, origin=127.0.0.1, 136.62.241.236, url=http://httpbin.org/get}
```
7. Leave out the proxy credentials and you should get a proxy error:
```
[Stellar]>>> config := {'proxy.host':'node1','proxy.port':3128}
{proxy.port=3128, proxy.host=node1}
[Stellar]>>> REST_GET('http://httpbin.org/get', config)
2018-10-28 00:43:48 ERROR RestFunctions:161 - Stellar REST request to http://httpbin.org/get expected status code to be one of [200] but failed with http status code 407: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>ERROR: Cache Access Denied</title>
```
8. The timeout is 1000 milliseconds by default. Test the timeout by setting it to 1 or to a value where a request won't finish in time. You should get an error:
```
[Stellar]>>> REST_GET('http://httpbin.org/get', config)
2018-10-28 00:53:07 ERROR RestFunctions:161 - Total Stellar REST request time to http://httpbin.org/get exceeded the configured timeout of 1 ms.
java.io.IOException: Total Stellar REST request time to http://httpbin.org/get exceeded the configured timeout of 1 ms.
    at org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.doGet(RestFunctions.java:188)
    at org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.apply(RestFunctions.java:157)
    at org.apache.metron.stellar.common.StellarCompiler.lambda$exitTransformationFunc$13(StellarCompiler.java:652)
    at org.apache.metron.stellar.common.StellarCompiler$Expression.apply(StellarCompiler.java:250)
    at org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:151)
    at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.executeStellar(DefaultStellarShellExecutor.java:409)
    at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:260)
    at org.apache.metron.stellar.common.shell.cli.StellarShell.execute(StellarShell.java:357)
    at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```
9. You can also configure which status codes should be handled as errors.
A 404 is considered an error by default:
```
[Stellar]>>> REST_GET('http://httpbin.org/status/404', config)
2018-10-28 00:56:08 ERROR RestFunctions:161 - Stellar REST request to http://httpbin.org/status/404 expected status code to be one of [200] but failed with http status code 404:
java.io.IOException: Stellar REST request to http://httpbin.org/status/404 expected status code to be one of [200] but failed with http status code 404:
    at org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.doGet(RestFunctions.java:209)
    at org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.apply(RestFunctions.java:157)
    at org.apache.metron.stellar.common.StellarCompiler.lambda$exitTransformationFunc$13(StellarCompiler.java:652)
    at org.apache.metron.stellar.common.StellarCompiler$Expression.apply(StellarCompiler.java:250)
    at org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:151)
    at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.executeStellar(DefaultStellarShellExecutor.java:409)
    at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:260)
    at org.apache.metron.stellar.common.shell.cli.StellarShell.execute(StellarShell.java:357)
    at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```
This behavior can be changed by configuring a 404 to be an acceptable status code and returning an empty object instead of null:
```
{response.codes.allowed=[200, 404], empty.content.override={}}
[Stellar]>>> REST_GET('http://httpbin.org/status/404', config)
{}
```
10. 
The value returned on an error can also be changed from null:
```
config := {'error.value.override':'got an error'}
[Stellar]>>> result := REST_GET('http://httpbin.org/status/500', config)
2018-10-28 00:59:41 ERROR RestFunctions:161 - Stellar REST request to http://httpbin.org/status/500 expected status code to be one of [200] but failed with http status code 500:
java.io.IOException: Stellar REST request to http://httpbin.org/status/500 expected status code to be one of [200] but failed with http status code 500:
    at org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.doGet(RestFunctions.java:209)
    at org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.apply(RestFunctions.java:157)
    at org.apache.metron.stellar.common.StellarCompiler.lambda$exitTransformationFunc$13(StellarCompiler.java:652)
    at org.apache.metron.stellar.common.StellarCompiler$Expression.apply(StellarCompiler.java:250)
    at org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:151)
    at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.executeStellar(DefaultStellarShellExecutor.java:409)
    at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:260)
    at org.apache.metron.stellar.common.shell.specials.AssignmentCommand.execute(AssignmentCommand.java:66)
    at org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:255)
    at org.apache.metron.stellar.common.shell.cli.StellarShell.execute(StellarShell.java:357)
    at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
got an error
```

To test with the parser and enrichment topologies, follow these steps:

1. Make sure the topologies are running and data is flowing through.
It's easier to test if you restart the parser topology with only a single sensor running.
2. Add a Stellar field transformation to the parser that is still running:
```
"fieldTransformations": [
  {
    "input": [],
    "output": [ "parser_rest_result" ],
    "transformation": "STELLAR",
    "config": {
      "parser_rest_result": "REST_GET('http://httpbin.org/get?type=parser')"
    }
  }
],
```
3. Listen on the enrichments Kafka topic. The `parser_rest_result` field should now be present.
4. Add a Stellar enrichment to the sensor:
```
"fieldMap": {
  "geo": [ "ip_dst_addr", "ip_src_addr" ],
  "host": [ "host" ],
  "stellar": {
    "config": {
      "enrichment_rest_result": "REST_GET('http://httpbin.org/get?type=enrichment')"
    }
  }
}
```
5. Listen on the indexing Kafka topic. The `enrichment_rest_result` field should now be present.

### Outstanding Issues
- Currently the Stellar REPL does not quit cleanly. I suspect it's because the client is not closed but I'm still investigating.
- I would like to add a section with a more detailed description of how to use this, including an explanation of what the various settings do. Where should this go? I don't see anything like this in the stellar-common README so I wanted to get some guidance from the community.
- Caching was briefly discussed in the discuss thread for this feature. Stellar provides a caching mechanism but we may need to be more selective about what is cached right now. I believe this should be a follow-on.
- There was a comment in the Jira related to adding a circuit breaker. Does that need to be done in this PR or can it be a follow-on? Should we also explore/discuss a retry strategy?
- It was also suggested in the discuss thread that we create an abstraction for higher latency enrichments such as this. I would prefer we create a few of these higher latency functions first so that we have a better understanding of how this abstraction would look. Do we want to take that on here?
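
As a reference while working through the REPL steps in the Testing section, the settings precedence and error handling described in this PR can be modeled roughly as below. This is an illustrative Python sketch under stated assumptions, not the Java `RestConfig` code: `resolve_settings` and `resolve_result` are hypothetical helpers, the `timeout` key name is an assumption (the PR only states the 1000 ms default), and returning null for an empty body without an override is also assumed.

```python
import json

# Defaults stated in the PR: 1000 ms timeout, only 200 is an accepted status.
DEFAULTS = {
    "timeout": 1000,  # key name is an assumption
    "response.codes.allowed": [200],
}


def resolve_settings(global_config, expression_config=None):
    """Layer settings: defaults < stellar.rest.settings < expression config."""
    settings = dict(DEFAULTS)
    settings.update(global_config.get("stellar.rest.settings", {}))
    settings.update(expression_config or {})
    return settings


def resolve_result(status_code, body, settings):
    """Map a response to the value the function would hand back."""
    if status_code not in settings["response.codes.allowed"]:
        # Treated as an error: null unless error.value.override is set.
        return settings.get("error.value.override")
    if not body:
        # Accepted status with no content: use the override if present.
        return settings.get("empty.content.override")
    return json.loads(body)  # responses are expected to be JSON
```

For example, an expression config of `{'response.codes.allowed': [200, 404], 'empty.content.override': {}}` makes an empty 404 response resolve to `{}` in this model, matching the REPL output shown in the status code step.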
## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron.
Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions.
Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.

In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [x] Has your PR been rebased against the latest commit within the target branch (typically master)?

### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
```
mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh
```
- [x] Have you written or updated unit tests and or integration tests to verify your changes?
- [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?
### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and then verify the changes via `site-book/target/site/index.html`:
```
cd site-book
mvn site
```

#### Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/merrimanr/incubator-metron METRON-1850

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/1250.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1250

----
commit acbfdda2e4dfd5034bea19e6c7d18acc2c6a1e17
Author: merrimanr <merrimanr@...>
Date: 2018-10-31T13:19:34Z

    initial commit

commit 1287d7f674c4197167f9237cf3d6749e77936230
Author: merrimanr <merrimanr@...>
Date: 2018-10-31T18:56:04Z

    expression config should be a map and not a string

----

> Stellar REST function
> ---------------------
>
> Key: METRON-1850
> URL: https://issues.apache.org/jira/browse/METRON-1850
> Project: Metron
> Issue Type: New Feature
> Reporter: Ryan Merriman
> Priority: Major
>
> It would be useful to be able to enrich messages with Stellar using 3rd party (or internal) REST services. At a minimum this function would:
> * Stellar function available to GET from an HTTP API
> * Optional parameters for basic auth (user/password) which generate correct Authorization header
> * Function returns null value for errors, connection failures etc and logs error
> * Function must provide and use pooled connection objects at the process level
> * Function must send Accept: application/json header
> * A global setting must be available to set a proxy for all API calls, and if present the proxy must be used.
> * Proxy authentication must also be supported using basic auth.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)