Cloudera specifics aside, some of these things have been on my personal
"back burner" todo list; I just haven't yet been able to bring them
to the front burner ;-)

Please feel free to create issues for these, and provide patches as you're
able.

Thanks for this useful feedback,
   Phil


On Thu, Oct 11, 2018 at 8:43 AM Lars Francke <lars.fran...@gmail.com> wrote:

> Hi again,
>
> (long intro, jump to the line with the XXXXX if you're not interested in
> the background)
>
> I'm sorry for all the mails, they all come from my quest to get Knox
> packaged up as a Cloudera Parcel plus a CSD (Custom Service Descriptor) to
> run it from Cloudera Manager (CM) similar to what Ambari currently allows.
> I did the same for NiFi and was facing similar issues there (e.g.
> <https://issues.apache.org/jira/browse/NIFI-5350>,
> <https://issues.apache.org/jira/browse/NIFI-5573> and others).
>
> For people unfamiliar with Cloudera Manager I'll explain how it works. That
> should make it clearer why I have the issues I describe below.
>
> Cloudera Manager extracts the "things" it manages into a directory (e.g.
> /opt/cloudera/parcels/KNOX) and they are owned by root:root. This is not to
> be changed by any process (e.g. no configuration file changes, no changing
> of symlinks, no storing of PIDs, logs etc.).
>
> Every time a process (e.g. Knox Gateway) is started CM creates a new
> directory (/var/run/cloudera-scm-agent/process/XXX) where it copies/creates
> the necessary config files + keytabs for _this run_ of the tool.
>
> It then starts the processes by pointing them at this directory, so they
> can pick up their config there and it also captures stdout & stderr in this
> folder.
>
> This is different from Ambari. Ambari extracts Knox and creates symlinks
> from its conf, logs, pids and data directories to /etc/XXX. This is
> possible there because those directories (/etc/...) don't change.
>
>
> XXXXXXXXX
>
>
> The problems I have with Knox so far (I'm sure I'll find more the further I
> get) are:
>
> * gateway.sh has no way to take in options from the "outside". With Hadoop,
> HBase, (now) NiFi you can pass in arbitrary Java options using variables
> like HADOOP_JAVA_OPTS and similar.
>
> In theory all the "setup" is already there for Knox as well using variables
> like APP_CONF_DIR but unfortunately, they get set to hardcoded values at
> the beginning of the script.
>
> Proposal: Add at least an APP_JAVA_OPTS variable so I can pass in arbitrary
> options to be added to the Java command line. But really, I'd love to just
> drop the defaults for APP_LOG_DIR etc. whenever they are already set
> externally.
>
> * gateway.sh checks whether various directories exist. These are hardcoded
> (e.g. APP_HOME_DIR/conf). But those directories are configurable using
> GATEWAY_HOME etc. so those checks should either be removed or fixed, so
> they take those variables into account.
>
> * knoxcli create-master takes a --master argument which I only found out
> about by looking at Ambari. The source says it's for testing only. It seems
> as if that should be documented though. I think it's pretty useful to allow
> the master to be created non-interactively.
>
> * gateway.sh does allow one thing to be overridden externally and that is
> the pid dir using ENV_PID_DIR. Unfortunately, knox-env.sh (which is being
> sourced unconditionally) overrides this variable with an empty value. I
> think this line should just be removed from knox-env.sh.
>
> * Launcher looks for a file called gateway.cfg but it always and
> unconditionally looks in its "own" directory (launcherDir). I need a way to
> point this to a different location. It allows me to define GATEWAY_HOME as
> a system property. While I can also define that as an environment variable,
> the system property is checked first, and if it finds a gateway-site.xml
> there it uses that. I need it to use the one from the environment variable.
>
> * gateway.sh allows the process to run in the foreground but still captures
> stdout & stderr to files. I would argue that it makes more sense to leave
> them as is and print them to the console instead.
>
> I'm happy to create issues for all of these and also provide patches for
> some/all of them depending on my available time. I just wanted to bring
> this up before I started to see if anyone has any better ideas and/or
> things that I might have missed.
>
> Thanks for reading!
>
> Cheers,
> Lars
>
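
To make the APP_JAVA_OPTS / "only default if not set externally" proposal
concrete, here is a rough sketch of what the top of such a script could look
like. This is not the actual gateway.sh: the function names, the /opt/knox
default and the jar location are assumptions for illustration only. The
variable names are the ones mentioned in the mail above.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real gateway.sh. The /opt/knox default,
# the jar location and both function names are assumptions.

resolve_settings() {
    # Keep any value supplied by the caller; fall back to a default otherwise.
    APP_HOME_DIR="${APP_HOME_DIR:-/opt/knox}"
    APP_CONF_DIR="${APP_CONF_DIR:-$APP_HOME_DIR/conf}"
    APP_LOG_DIR="${APP_LOG_DIR:-$APP_HOME_DIR/logs}"

    # The PID dir is taken from ENV_PID_DIR when the caller provides one.
    APP_PID_DIR="${ENV_PID_DIR:-$APP_HOME_DIR/pids}"

    # Extra JVM options from the outside; empty when nothing was passed in.
    APP_JAVA_OPTS="${APP_JAVA_OPTS:-}"
}

build_java_cmd() {
    # APP_JAVA_OPTS is deliberately unquoted so that it splits into
    # separate arguments on the command line.
    echo java $APP_JAVA_OPTS -jar "$APP_HOME_DIR/bin/gateway.jar"
}

resolve_settings
build_java_cmd
```

With something along these lines, a manager such as CM could export
APP_CONF_DIR, APP_LOG_DIR and ENV_PID_DIR pointing at its per-run process
directory before calling the script, and the read-only install directory
would never be written to.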
