We've been able to set up Knox to route Livy requests, and it's working
mostly as expected; when creating a new Spark session via a POST request
with a JSON body, Knox has a rewrite rule that modifies the "proxyUser" in
the JSON body, making sure you can only act as the user you authenticated
to Knox with:
From service.xml:
<route path="/livy/v1/sessions">
    <rewrite apply="LIVYSERVER/livy/addusername/inbound" to="request.body"/>
</route>
From rewrite.xml:
<filter name="LIVYSERVER/livy/addusername/inbound">
    <content type="*/json">
        <apply path="$.proxyUser" rule="LIVYSERVER/livy/user-name"/>
    </content>
</filter>
Example of a request:
curl -u johwar -v -s --data '{"proxyUser":"foobar","kind": "pyspark"}' \
    -H "Content-Type: application/json" \
    https://myknoxserver/gateway/default/livy/v1/sessions
This works fine, and "foobar" above gets replaced with "johwar" before the
request reaches Livy.
However, if you *don't* pass "proxyUser" at all in the request, this rule
doesn't seem to *add* the element, so it ends up as "knox" on the Livy end;
it's probably defaulting to the Kerberos-authenticated user, which is of
course "knox".
Is there a way to make sure that "proxyUser" is modified if it exists (as
above) AND added if it's missing?
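To make the question concrete, here is a small Python sketch of the two behaviours. It is only an illustration of the semantics, not Knox code: the function names are made up, and the first function reflects what we observe from the outside rather than anything we have confirmed in the Knox source.

```python
import json


def rewrite_existing_only(body: str, auth_user: str) -> str:
    """What we seem to get today: proxyUser is overwritten only if present."""
    doc = json.loads(body)
    if "proxyUser" in doc:
        doc["proxyUser"] = auth_user
    return json.dumps(doc)


def rewrite_or_add(body: str, auth_user: str) -> str:
    """What we want: proxyUser is always set to the authenticated user."""
    doc = json.loads(body)
    doc["proxyUser"] = auth_user  # overwrites if present, adds if missing
    return json.dumps(doc)


if __name__ == "__main__":
    # Request body with no proxyUser, authenticated to Knox as "johwar".
    body = '{"kind": "pyspark"}'
    print(rewrite_existing_only(body, "johwar"))  # body passes through unchanged
    print(rewrite_or_add(body, "johwar"))         # proxyUser is added
```

With the first behaviour, a body without "proxyUser" reaches Livy untouched, which matches the "knox" session owner we see. The second behaviour is what we are trying to get out of the rewrite rule.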
NOTE: For our full config, we followed the example below:
https://community.hortonworks.com/articles/70499/adding-livy-server-as-service-to-apache-knox.html