[ 
https://issues.apache.org/jira/browse/METRON-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670567#comment-16670567
 ] 

ASF GitHub Bot commented on METRON-1850:
----------------------------------------

GitHub user merrimanr opened a pull request:

    https://github.com/apache/metron/pull/1250

    METRON-1850: Stellar REST function

    ## Contributor Comments
    This PR adds a Stellar REST function that can be used to enrich messages 
with data from 3rd party REST services.  This function leverages the Apache 
HttpComponents library for HTTP requests.
    
    The function call follows this format:  `REST_GET(rest uri, optional rest 
settings)`
    
    There are a handful of settings including basic authentication credentials, 
proxy support, and timeouts.  These settings are included in the `RestConfig` 
class.  Any of these settings can be defined in the global config under the 
`stellar.rest.settings` property and will override any default values.  The 
global config settings can also be overridden by passing in a config object to 
the expression.  This will allow support for multiple REST services that may 
have different requirements.
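    
    The precedence described above (defaults, then `stellar.rest.settings` from 
the global config, then the map passed to the expression) can be sketched 
roughly as follows.  This is a Python illustration, not the Java code in this 
PR.  The `timeout` key name and the `resolve_settings` helper are assumptions 
made for the example; only the 1000 ms default and the `[200]` default status 
list come from the testing notes in this description.

```python
# Sketch of the three-level settings precedence: defaults < global < per-call.
# The "timeout" key name is an assumption; the values mirror the defaults
# mentioned elsewhere in this description (1000 ms, status codes [200]).
DEFAULTS = {"timeout": 1000, "response.codes.allowed": [200]}


def resolve_settings(global_config, expression_config=None):
    """Hypothetical helper: merge defaults, global settings, per-call map."""
    settings = dict(DEFAULTS)
    # Values under stellar.rest.settings override the defaults.
    settings.update(global_config.get("stellar.rest.settings", {}))
    # The optional map passed to REST_GET overrides everything else.
    settings.update(expression_config or {})
    return settings


global_config = {"stellar.rest.settings": {"basic.auth.user": "user", "timeout": 500}}
print(resolve_settings(global_config, {"timeout": 2000}))
```

    Keeping the per-call map as the last layer is what lets two REST services 
with different requirements be called from the same topology.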
    
    Responses are expected to be in JSON format.  Successful requests will 
return a MAP object in Stellar.  Errors will be logged and null will be 
returned by default.  There are ways to override this behavior by configuring a 
list of acceptable status codes and/or values to be returned on error or empty 
responses.
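    
    A minimal Python sketch of these return rules, assuming the setting names 
used in the testing steps below (`response.codes.allowed`, 
`error.value.override`, `empty.content.override`).  The `interpret_response` 
helper is hypothetical, logging is left out, and returning null for an empty 
body without an override is an assumption.  It is not the Java code in this PR.

```python
import json


def interpret_response(status, body, settings):
    """Hypothetical helper: map a status code and body to the returned value."""
    allowed = settings.get("response.codes.allowed", [200])
    if status not in allowed:
        # Errors yield null (None) unless an override value is configured.
        return settings.get("error.value.override")
    if not body:
        # Allowed status but no content: None unless an override is set.
        return settings.get("empty.content.override")
    # Successful responses are expected to be JSON and become a map.
    return json.loads(body)


print(interpret_response(200, '{"authenticated": true}', {}))
print(interpret_response(500, "", {"error.value.override": "got an error"}))
```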
     
    Considering this will be used on streaming data, how we handle timeouts is 
important.  This function exposes HttpClient timeout settings but those are not 
enough to keep unwanted latency from being introduced.  A hard timeout is 
implemented in the function to abort a request if the timeout is exceeded.  The 
idea is to guarantee the total request time will not exceed a configured value.
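    
    The hard timeout idea can be illustrated with a small Python sketch.  It 
is not the Java code in this PR: the `call_with_hard_timeout` and 
`slow_request` names are made up for the example, and unlike the function 
described above it only bounds how long the caller waits rather than aborting 
the in-flight request.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

_POOL = ThreadPoolExecutor(max_workers=4)


def call_with_hard_timeout(request, timeout_ms):
    """Run request() but stop waiting once timeout_ms has elapsed."""
    future = _POOL.submit(request)
    try:
        return future.result(timeout=timeout_ms / 1000.0)
    except FutureTimeout:
        # The caller gives up here. A real client would also abort the
        # in-flight request; a Python thread cannot be interrupted, so this
        # sketch only caps the time the caller spends waiting.
        future.cancel()
        return None


def slow_request():
    time.sleep(0.5)  # stands in for a slow REST service
    return {"ok": True}


print(call_with_hard_timeout(slow_request, 50))    # caller gives up early
print(call_with_hard_timeout(slow_request, 2000))  # request completes
```

    Socket and connect timeouts apply to individual phases of a request, so a 
single overall deadline like this is what caps the total latency added per 
message.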
    
    ### Changes Included
    
    - HttpClient capability added to Stellar
    - HttpClient setup added to the various bolts and Stellar REPL
    - Utility added for setting up a pooling HttpClient (and possibly other 
types of clients in the future)
    - Configuration mechanism added for the Stellar REST function, including 
settings for authentication, proxy support, timeouts, and more
    - Function implementation with unit and integration tests
    
    ### Testing
    
    There are several different ways to test this feature and I would encourage 
reviewers to get creative and look for cases I may not have thought of.  For my 
testing, I used an online HTTP service that provides simple endpoints for 
simulating different use cases:  `http://httpbin.org/#/`.  Feel free to try 
your own or use this one.
    
    I tested this in full dev using the Stellar REPL and the parser and 
enrichment topologies.  First you need to perform a couple of setup steps:
    
    1. Spin up full dev and ensure everything comes up and data is being indexed
    2. SSH into full dev and install the Squid proxy server:
    ```
    yum -y install squid
    ```
    3. Create a password file that Squid can use for basic authentication
    ```
    yum -y install httpd-tools
    touch /etc/squid/passwd && chown squid /etc/squid/passwd
    htpasswd /etc/squid/passwd user # (Will prompt for a password)
    ```
    4. Configure Squid for basic authentication by adding these lines to 
`/etc/squid/squid.conf`, under the lines with `acl Safe_ports*`:
    ```
    auth_param basic program /usr/lib64/squid/ncsa_auth /etc/squid/passwd
    auth_param basic children 5
    auth_param basic realm Squid Basic Authentication
    auth_param basic credentialsttl 2 hours
    acl auth_users proxy_auth REQUIRED
    http_access allow auth_users
    ```
    5. Start Squid and verify it is working correctly:
    ```
    service squid restart
    curl --proxy-user user:password -x node1:3128 http://www.google.com/
    ```
    6. Next create password files in HDFS:
    ```
    su hdfs
    cd ~
    echo passwd > basicPassword.txt
    hdfs dfs -put basicPassword.txt /apps/metron
    echo password > proxyPassword.txt
    hdfs dfs -put proxyPassword.txt /apps/metron
    exit
    ```
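    
    For reference, the user name and the password stored in these files end 
up in an HTTP Basic `Authorization` header.  The Python sketch below shows that 
encoding (RFC 7617).  The `basic_auth_header` helper is hypothetical, and 
trimming the trailing newline that `echo` writes is an assumption, not a 
statement about how this PR reads the file.

```python
import base64


def basic_auth_header(user, password_file_contents):
    """Build an HTTP Basic Authorization header value (RFC 7617)."""
    # `echo passwd > file` appends a newline; this sketch assumes it is
    # trimmed before the password is used.
    password = password_file_contents.strip()
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


print(basic_auth_header("user", "passwd\n"))
```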
    
    To test with the Stellar REPL, follow these steps:
    
    1. Start the Stellar REPL and verify the `REST_GET` function is available:
    ```
    /usr/metron/0.6.1/bin/stellar --zookeeper node1:2181
    [Stellar]>>> %functions REST
    REST_GET
    ```
    2. Test a simple get request:
    ```
    [Stellar]>>> REST_GET('http://httpbin.org/get')
    {args={}, headers={Accept=application/json, Accept-Encoding=gzip,deflate, 
Cache-Control=max-age=259200, Connection=close, Host=httpbin.org, 
User-Agent=Apache-HttpClient/4.3.2 (java 1.5)}, origin=127.0.0.1, 
136.62.241.236, url=http://httpbin.org/get}
    ```
    3. Test a get request with basic authentication:
    ```
    [Stellar]>>> config := 
{'basic.auth.user':'user','basic.auth.password.path':'/apps/metron/basicPassword.txt'}
    {basic.auth.user=user, 
basic.auth.password.path=/apps/metron/basicPassword.txt}
    [Stellar]>>> REST_GET('http://httpbin.org/basic-auth/user/passwd', config)
    {authenticated=true, user=user}
    ```
    4. Try the same request without passing in the config.  You should get an 
authentication error:
    ```
    [Stellar]>>> REST_GET('http://httpbin.org/basic-auth/user/passwd')
    2018-10-28 00:32:20 ERROR RestFunctions:161 - Stellar REST request to 
http://httpbin.org/basic-auth/user/passwd expected status code to be one of 
[200] but failed with http status code 401: 
    java.io.IOException: Stellar REST request to 
http://httpbin.org/basic-auth/user/passwd expected status code to be one of 
[200] but failed with http status code 401: 
        at 
org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.doGet(RestFunctions.java:209)
        at 
org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.apply(RestFunctions.java:157)
        at 
org.apache.metron.stellar.common.StellarCompiler.lambda$exitTransformationFunc$13(StellarCompiler.java:652)
        at 
org.apache.metron.stellar.common.StellarCompiler$Expression.apply(StellarCompiler.java:250)
        at 
org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:151)
        at 
org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.executeStellar(DefaultStellarShellExecutor.java:409)
        at 
org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:260)
        at 
org.apache.metron.stellar.common.shell.cli.StellarShell.execute(StellarShell.java:357)
        at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    ```
    5. You should also be able to set the basic authentication settings through 
the global config:
    ```
    [Stellar]>>> %define stellar.rest.settings := 
{'basic.auth.user':'user','basic.auth.password.path':'/apps/metron/basicPassword.txt'}
    {basic.auth.user=user, 
basic.auth.password.path=/apps/metron/basicPassword.txt}
    [Stellar]>>> REST_GET('http://httpbin.org/basic-auth/user/passwd')
    {authenticated=true, user=user}
    ```
    6. Now verify you can send a request through the proxy:
    ```
    [Stellar]>>> config := 
{'proxy.host':'node1','proxy.port':3128,'proxy.basic.auth.user':'user','proxy.basic.auth.password.path':'/apps/metron/proxyPassword.txt'}
    {proxy.basic.auth.password.path=/apps/metron/proxyPassword.txt, 
proxy.port=3128, proxy.host=node1, proxy.basic.auth.user=user}
    [Stellar]>>> REST_GET('http://httpbin.org/get', config)
    {args={}, headers={Accept=application/json, Accept-Encoding=gzip,deflate, 
Cache-Control=max-age=259200, Connection=close, Host=httpbin.org, 
User-Agent=Apache-HttpClient/4.3.2 (java 1.5)}, origin=127.0.0.1, 
136.62.241.236, url=http://httpbin.org/get}
    ``` 
    7. Leave out the proxy credentials.  You should get a proxy error:
    ```
    [Stellar]>>> config := {'proxy.host':'node1','proxy.port':3128}
    {proxy.port=3128, proxy.host=node1}
    [Stellar]>>> REST_GET('http://httpbin.org/get', config)
    2018-10-28 00:43:48 ERROR RestFunctions:161 - Stellar REST request to 
http://httpbin.org/get expected status code to be one of [200] but failed with 
http status code 407: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" 
"http://www.w3.org/TR/html4/strict.dtd";>
    <html><head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>ERROR: Cache Access Denied</title>
    ```
    8. Timeout is 1000 milliseconds by default.  Test the timeout by setting it 
to 1 or a value where a request won't finish in time.  You should get an error:
    ```
    [Stellar]>>> REST_GET('http://httpbin.org/get', config)
    2018-10-28 00:53:07 ERROR RestFunctions:161 - Total Stellar REST request 
time to http://httpbin.org/get exceeded the configured timeout of 1 ms.
    java.io.IOException: Total Stellar REST request time to 
http://httpbin.org/get exceeded the configured timeout of 1 ms.
        at 
org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.doGet(RestFunctions.java:188)
        at 
org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.apply(RestFunctions.java:157)
        at 
org.apache.metron.stellar.common.StellarCompiler.lambda$exitTransformationFunc$13(StellarCompiler.java:652)
        at 
org.apache.metron.stellar.common.StellarCompiler$Expression.apply(StellarCompiler.java:250)
        at 
org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:151)
        at 
org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.executeStellar(DefaultStellarShellExecutor.java:409)
        at 
org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:260)
        at 
org.apache.metron.stellar.common.shell.cli.StellarShell.execute(StellarShell.java:357)
        at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    ```
    9. You can also configure which status codes should be handled as errors.  
A 404 is considered an error by default:
    ```
    [Stellar]>>> REST_GET('http://httpbin.org/status/404', config)
    2018-10-28 00:56:08 ERROR RestFunctions:161 - Stellar REST request to 
http://httpbin.org/status/404 expected status code to be one of [200] but 
failed with http status code 404: 
    java.io.IOException: Stellar REST request to http://httpbin.org/status/404 
expected status code to be one of [200] but failed with http status code 404: 
        at 
org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.doGet(RestFunctions.java:209)
        at 
org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.apply(RestFunctions.java:157)
        at 
org.apache.metron.stellar.common.StellarCompiler.lambda$exitTransformationFunc$13(StellarCompiler.java:652)
        at 
org.apache.metron.stellar.common.StellarCompiler$Expression.apply(StellarCompiler.java:250)
        at 
org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:151)
        at 
org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.executeStellar(DefaultStellarShellExecutor.java:409)
        at 
org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:260)
        at 
org.apache.metron.stellar.common.shell.cli.StellarShell.execute(StellarShell.java:357)
        at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    ```
    This behavior can be changed by configuring a 404 to be an acceptable 
status code and returning an empty object instead of null:
    ```
    [Stellar]>>> config := {'response.codes.allowed':[200,404],'empty.content.override':{}}
    {response.codes.allowed=[200, 404], empty.content.override={}}
    [Stellar]>>> REST_GET('http://httpbin.org/status/404', config)
    {}
    ```
    10. The value returned on an error can also be changed from null:
    ```
    [Stellar]>>> config := {'error.value.override':'got an error'}
    [Stellar]>>> result := REST_GET('http://httpbin.org/status/500', config)
    2018-10-28 00:59:41 ERROR RestFunctions:161 - Stellar REST request to 
http://httpbin.org/status/500 expected status code to be one of [200] but 
failed with http status code 500: 
    java.io.IOException: Stellar REST request to http://httpbin.org/status/500 
expected status code to be one of [200] but failed with http status code 500: 
        at 
org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.doGet(RestFunctions.java:209)
        at 
org.apache.metron.stellar.dsl.functions.RestFunctions$RestGet.apply(RestFunctions.java:157)
        at 
org.apache.metron.stellar.common.StellarCompiler.lambda$exitTransformationFunc$13(StellarCompiler.java:652)
        at 
org.apache.metron.stellar.common.StellarCompiler$Expression.apply(StellarCompiler.java:250)
        at 
org.apache.metron.stellar.common.BaseStellarProcessor.parse(BaseStellarProcessor.java:151)
        at 
org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.executeStellar(DefaultStellarShellExecutor.java:409)
        at 
org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:260)
        at 
org.apache.metron.stellar.common.shell.specials.AssignmentCommand.execute(AssignmentCommand.java:66)
        at 
org.apache.metron.stellar.common.shell.DefaultStellarShellExecutor.execute(DefaultStellarShellExecutor.java:255)
        at 
org.apache.metron.stellar.common.shell.cli.StellarShell.execute(StellarShell.java:357)
        at org.jboss.aesh.console.AeshProcess.run(AeshProcess.java:53)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    got an error
    ```
    
    To test with the parser and enrichment topologies, follow these steps:
    
    1. Make sure the topologies are running and data is flowing through.  It's 
easier to test if you restart the parser topology with only a single sensor 
running.
    2. Add a Stellar field transformation to the parser that is still running:
    ```
    "fieldTransformations": [
        {
          "input": [],
          "output": [
            "parser_rest_result"
          ],
          "transformation": "STELLAR",
          "config": {
            "parser_rest_result": 
"REST_GET('http://httpbin.org/get?type=parser')"
          }
        }
      ],
    ```
    3. Listen on the enrichments Kafka topic.  The `parser_rest_result` field 
should now be present.
    4. Add a Stellar enrichment to the sensor:
    ```
    "fieldMap": {
          "geo": [
            "ip_dst_addr",
            "ip_src_addr"
          ],
          "host": [
            "host"
          ],
          "stellar": {
            "config": {
              "enrichment_rest_result": 
"REST_GET('http://httpbin.org/get?type=enrichment')"
            }
          }
        }
    ```
    5. Listen on the indexing Kafka topic.  The `enrichment_rest_result` field 
should now be present.
    
    ### Outstanding Issues
    
    - Currently the Stellar REPL does not quit cleanly.  I suspect it's because 
the client is not closed but I'm still investigating.
    - I would like to add a section with a more detailed description of how to 
use this, including an explanation of what the various settings do.  Where 
should this go?  I don't see anything like this in the stellar-common README, 
so I wanted to get some guidance from the community.
    - Caching was briefly discussed in the discuss thread for this feature.  
Stellar provides a caching mechanism but we may need to be more selective about 
what is cached right now.  I believe this should be a follow-on.
    - There was a comment in the Jira related to adding a circuit breaker.  
Does that need to be done in this PR or can it be a follow-on?  Should we also 
explore/discuss a retry strategy?
    - It was also suggested in the discuss thread that we create an abstraction 
for higher-latency enrichments such as this.  I would prefer we create a few 
of these higher-latency functions first so that we have a better understanding 
of how this abstraction would look.  Do we want to take that on here?
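    
    To make the circuit breaker question concrete, here is a minimal Python 
sketch of the pattern.  It is not part of this PR; the `CircuitBreaker` class, 
its parameter names, and its thresholds are all hypothetical and only meant to 
frame the discussion.

```python
import time


class CircuitBreaker:
    """Skip calls for cooldown_s seconds after max_failures consecutive errors."""

    def __init__(self, max_failures=3, cooldown_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, request, fallback=None):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                return fallback  # circuit open: do not touch the service
            # Cooldown over: allow one trial call. A single further failure
            # reopens the circuit immediately.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = request()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback
        self.failures = 0  # a success closes the circuit again
        return result


breaker = CircuitBreaker(max_failures=2, cooldown_s=10.0)
print(breaker.call(lambda: {"ok": True}))
```

    A retry strategy would sit inside `request`, while the breaker wraps the 
whole call, so the two could be added independently in a follow-on.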
    
    ## Pull Request Checklist
    
    Thank you for submitting a contribution to Apache Metron.  
    Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
    Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  
    
    
    In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:
    
    ### For all changes:
    - [x] Is there a JIRA ticket associated with this PR? If not one needs to 
be created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
    - [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
    - [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?
    
    
    ### For code changes:
    - [x] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
    - [x] Have you included steps or a guide to how the change may be verified 
and tested manually?
    - [x] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
      ```
      mvn -q clean integration-test install && 
dev-utilities/build-utils/verify_licenses.sh 
      ```
    
    - [x] Have you written or updated unit tests and/or integration tests to 
verify your changes?
    - [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
    - [x] Have you verified the basic functionality of the build by building 
and running locally with Vagrant full-dev environment or the equivalent?
    
    ### For documentation related changes:
    - [x] Have you ensured that the format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not, then run 
the following commands and then verify the changes via 
`site-book/target/site/index.html`:
    
      ```
      cd site-book
      mvn site
      ```
    
    #### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
    It is also recommended that [travis-ci](https://travis-ci.org) is set up 
for your personal repository such that your branches are built there before 
submitting a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/merrimanr/incubator-metron METRON-1850

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/1250.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1250
    
----
commit acbfdda2e4dfd5034bea19e6c7d18acc2c6a1e17
Author: merrimanr <merrimanr@...>
Date:   2018-10-31T13:19:34Z

    initial commit

commit 1287d7f674c4197167f9237cf3d6749e77936230
Author: merrimanr <merrimanr@...>
Date:   2018-10-31T18:56:04Z

    expression config should be a map and not a string

----


> Stellar REST function
> ---------------------
>
>                 Key: METRON-1850
>                 URL: https://issues.apache.org/jira/browse/METRON-1850
>             Project: Metron
>          Issue Type: New Feature
>            Reporter: Ryan Merriman
>            Priority: Major
>
> It would be useful to be able to enrich messages with Stellar using 3rd party 
> (or internal) REST services.  At a minimum this function would:
>  * Stellar function available to GET from an HTTP API
>  * Optional parameters for basic auth (user/password) which generate correct 
> Authorization header
>  * Function returns null value for errors, connection failures etc and logs 
> error
>  * Function must provide and use pooled connection objects at the process 
> level
>  * Function must send Accept: application/json header
>  * A global setting must be available to set a proxy for all API calls, and 
> if present the proxy must be used.
>  * Proxy authentication must also be supported using basic auth.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
