Larry,

thank you so much for spending time on this. I didn't actually expect an
answer as I saw the ball in my court. Thank you! Especially as this must be
one of the most boring parts of writing any piece of software, sorry :(

I agree that lots of things are "close" but it's not all consistent. I have
to "relearn" this every time I look at it as well as it's not super trivial
to keep it all in memory ;-)

I have created some documentation for myself to remember these things. I've
just pushed it to a temporary repository. In case we'll decide to change
something and create an issue from it I'll attach it there as well: <
https://github.com/lfrancke/knox-config-docs>
Far from beautiful but it collects my findings.

As you can see you are correct that the order is correct for some variables
but different for others and on top of that they are inconsistent. All of
them can be configured slightly differently. And the big problem on top of
that is that the most important setting (the one taking top precedence) is
set via Java System properties.
In general I think that's a good decision because a user will actively have
to add a System Property while an Environment Variable and other default
settings  could be coming from an "unknown" environment.

The problem is that the launcher first reads a properties file (gateway.cfg
or knoxcli.cfg) from the bin directory and sets everything it finds in
there as System Properties. If it can't find this file it uses a default
one with hardcoded things.

Because the current shell scripts don't allow me to pass in any Java System
Properties that means that I can't effectively override these defaults
without changing this cfg file. I think that's counterintuitive. Especially
as the launcher writes its default cfg file to the bin directory on first
launch. This means that it effectively shifts the precedence around. The
least important ones (hardcoded defaults) now move to the top spot.

Yes it _can_ all be changed by changing this file but at least in my
opinion that is very counter-intuitive and different from the way all other
Hadoop ecosystem tools work that I know.

Unfortunately I totally agree with you on the backwards compatibility
issue...and I don't have a good idea here either.

For the shell script at least I propose to do the following:
1) Run shellcheck/shellharden over them (same thing happened with Hadoop in
the CLI rewrite) so they are up to current bash standards.

2) Currently the script uses APP_ environment variables. I propose adding a
second set called GATEWAY_ and in version X check if any of them are set
and if they are emit a warning (or even abort startup?) that they are going
to be used in version X+1 to modify startup behavior

3) At the same time check if anyone set APP_ externally and warn them that
they are currently not being honored and that they'll go away entirely in
X+1 as well

That way I could at least pass in Java Properties to the startup script and
it would finally allow me to override the defaults.

I will try to spend a bit of time to work out a more detailed plan for the
rest of it but if we agree that it's worthwhile to make all of this
consistent then I believe we won't be able to do it without breaking
changes in current behavior. I also believe however that almost no one will
rely on the current behavior. I mean...it took me at least a day to
understand how it currently works and I still only get half of it. I doubt
many people have made the same effort (there's always the good chance that
others are smarter than me and didn't take that long ;-) ) though.

In general I would propose the following:

* Roll out any changes over multiple versions of Knox in stages
* Stage 1: Improvements that can be made without any breaking changes
* Stage 2: WARN users if they rey on behavior that is going to change in
the future
* Stage 3: Make the change

Sorry for the rambling... I'll try to keep the Github Repository more
structured.

But I'd also be happy if you told me that you think this is not worth the
effort. Unfortunately our sponsor only paid me for a few days of work on
this so I'll be a bit slow to work on this anyway but I would be happy to
continue/start the work.

Cheers,
Lars


On Wed, Nov 7, 2018 at 7:59 PM larry mccay <lmc...@apache.org> wrote:

> Hi Lars -
>
> Sorry for the delay in getting back to this thread.
> I have to admit, I have yet to get my head around the risk associated with
> the proposed changes.
> Somewhat due to the way that some of the layers in the ordering you propose
> sort of overlap today.
> For instance, the ConfigConfigImpl class takes environment variables and
> system properties into account within its accessor methods for a number of
> these directories.
>
> Example:
>   @Override
>   public String getGatewayDataDir() {
>     String systemValue =
>         System.getProperty(GATEWAY_DATA_HOME_VAR,
> System.getenv(GATEWAY_DATA_HOME_VAR));
>     String dataDir = null;
>     if (systemValue != null) {
>       dataDir = systemValue;
>     } else {
>       dataDir = get(DATA_DIR, getGatewayHomeDir() + File.separator +
> DEFAULT_DATA_DIR);
>     }
>     return FilenameUtils.normalize(dataDir);
>   }
>
> The above does take sys property above environment variable.
> If neither is set it checks the xml config files and ultimately fallsback
> to some default.
> I believe this is exactly what your proposed order is.
>
> Directories that are to be relative to the DataDir determined above should
> be fine since the are considered relative:
>
>   @Override
>   public String getGatewayServicesDir() {
>     return get(STACKS_SERVICES_DIR, getGatewayDataDir() + File.separator +
> DEFAULT_STACKS_SERVICES_DIR);
>   }
>
>   @Override
>   public String getGatewayApplicationsDir() {
>     return get(APPLICATIONS_DIR, getGatewayDataDir() + File.separator +
> DEFAULT_APPLICATIONS_DIR);
>   }
>
> as examples.
>
> Unless we want to be able to use different directory names for those as
> well.
> That would seem more risky.
>
> Now, the ConfDir seems to actually be the same as the DataDir discovery:
>
>   @Override
>   public String getGatewayConfDir() {
>     String value = getVar( GATEWAY_CONF_HOME_VAR, getGatewayHomeDir() +
> File.separator + "conf"  );
>     return FilenameUtils.normalize(value);
>   }
>
> Where getVar does the interrogation of Sys Prop and Env vars:
>
>   private String getVar( String variableName, String defaultValue ) {
>     String value = get( variableName );
>     if( value == null ) {
>       value = System.getProperty( variableName );
>     }
>     if( value == null ) {
>       value = System.getenv( variableName );
>     }
>     if( value == null ) {
>       value = defaultValue;
>     }
>     return value;
>   }
>
> So, it actually seems like the code is processing things in the order that
> you propose already.
>
> You are saying that the order is being manipulated by the use of the cfg
> files, Launcher, etc by setting system properties then?
>
> Again, I still just trying to assess the risk on existing deployments and
> any provisioning tools out there that may be making assumptions.
>
> thanks,
>
> --larry
>
> On Fri, Oct 12, 2018 at 4:37 PM larry mccay <lmc...@apache.org> wrote:
>
> > Hi Lars -
> >
> > Thanks for raising the backward compatibility concern, it was exactly
> what
> > I was thinking as I was catching backing up the thread.
> > We need to give serious thought to upgrade scenarios.
> >
> > I will need to spend some time with the proposal but the bottom line is
> > this area clearly needs some love and we just have to find the safest way
> > to improve it.
> >
> > Specific concerns are obviously around existing deployments and any
> > assumptions that are made by devops type installs, Ambari, etc.
> >
> > thanks,
> >
> > --larry
> >
> > On Fri, Oct 12, 2018 at 1:39 PM Lars Francke <lars.fran...@gmail.com>
> > wrote:
> >
> >> So I did spend a few hours looking through the shell scripts, Launcher,
> >> KnoxCLI, GatewayServer and GatewayConfigImpl.
> >>
> >> Before I start work I'd like to get opinions to avoid unnecessary work.
> >>
> >> There are lots of inconsistencies in the current code for handling
> >> directories and properties. I'd love to make all of that as consistent
> as
> >> I
> >> can.
> >>
> >> The big question for me however is: What about backwards compatibility?
> If
> >> I change the way properties are handled it might mean different
> behaviour
> >> if people keep on using the same config as they are now.
> >>
> >> I say _might_ because I assume that most people stick to the standards
> >> (especially via Ambari) and are not using some of the ways to change
> >> configs. Most of them aren't really documented anyway.
> >>
> >> My current plan is as such:
> >>
> >> - Make all paths configurable using a consistent way in which these
> paths
> >> can be specified
> >>
> >> - The new order should be:
> >>   1) Java System Property
> >>   2) Environment variable
> >>   3) Config from gateway-default.xml & gateway-site.xm
> >>   4) Some computed default
> >>
> >> - This is not possible for all settings. Especially the location of the
> >> gateway-site.xml can obviously not be configured in gateway-site.xml
> >> itself
> >> :)
> >>
> >> - Knox currently (using Invoker + Launcher) sets Java System Properties
> >> for
> >> us coming from the gateway.cfg file. I think this is bad form because
> they
> >> take precedent over environment variables and are set automatically on
> >> first run and one needs to change them.
> >>
> >> So I'd really like to do away with the concept of those cfg files in its
> >> entirety and pass parameters explicitly. I looked at how the new Shell
> >> scripts from Hadoop 3 work and would like to implement something very
> >> similar here.
> >>
> >> If we want to keep the cfg file then I would propose either not using it
> >> to
> >> set Java System properties or at least to change the default gateway.cfg
> >> to
> >> not set GATEWAY_HOME and log4j.configuration
> >>
> >> And if we keep the file I'd also like to make its location configurable.
> >>
> >> But really I'd like to do away with it. Happy to be convinced otherwise
> >> though.
> >>
> >> - gateway.sh will not hardcode any values anymore but use the same logic
> >> as
> >> the code to determine final locations
> >>
> >> - gateway.sh will allow customization via environment variables
> >>
> >> - knox-env.sh will be renamed to knox-function.sh and a new empty
> >> knox-env.sh will be created that basically serves as documentation for
> all
> >> the environment variables that can be used to customize gateway.sh
> >>
> >> - Document knoxcli.sh create-master --master
> >>
> >> - Processes running in the foreground will not redirect stdout & stderr
> to
> >> a file
> >>
> >> Any concerns or comments?
> >>
> >> On Thu, Oct 11, 2018 at 5:36 PM Kevin Risden <kris...@apache.org>
> wrote:
> >>
> >> > Agree with the changes proposed. It is rather annoying to have to edit
> >> > gateway.sh to apply changes when external environment variables are
> >> useful.
> >> > This work would also help support a Docker image (KNOX-1285) since we
> >> would
> >> > run into the same customization issues.
> >> >
> >> > Kevin Risden
> >> >
> >> >
> >> > On Thu, Oct 11, 2018 at 11:12 AM Lars Francke <lars.fran...@gmail.com
> >
> >> > wrote:
> >> >
> >> > > Thanks both of you.
> >> > >
> >> > > No, these changes would benefit me and this is my use case but we
> >> would
> >> > not
> >> > > need to add anything parcel or cloudera specific. And I think that
> >> while
> >> > I
> >> > > have a specific need they are useful in general as well.
> >> > >
> >> > > It would probably be an overhaul of gateway.sh, knox-env.sh and
> >> > knoxcli.sh.
> >> > > Maybe a few changes to Launcher.
> >> > >
> >> > >
> >> > >
> >> > > On Thu, Oct 11, 2018, 16:50 larry mccay <larry.mc...@gmail.com>
> >> wrote:
> >> > >
> >> > > > This sounds like a worthwhile piece of work, Lars.
> >> > > > Would the "parcel" need to be added to the Knox project?
> >> > > >
> >> > > > +1 to Phil's response.
> >> > > >
> >> > > >
> >> > > > On Thu, Oct 11, 2018 at 9:33 AM Phil Zampino <pzamp...@apache.org
> >
> >> > > wrote:
> >> > > >
> >> > > > > Cloudera specifics aside, some of these things have been on my
> >> > personal
> >> > > > > "back burner" todo list; I just haven't as of yet been able to
> >> bring
> >> > > them
> >> > > > > to the front burner ;-)
> >> > > > >
> >> > > > > Please feel free to create issues for these, and provide patches
> >> as
> >> > > > you're
> >> > > > > able.
> >> > > > >
> >> > > > > Thanks for this useful feedback,
> >> > > > >    Phil
> >> > > > >
> >> > > > >
> >> > > > > On Thu, Oct 11, 2018 at 8:43 AM Lars Francke <
> >> lars.fran...@gmail.com
> >> > >
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Hi again,
> >> > > > > >
> >> > > > > > (long intro, jump to the line with the XXXXX if you're not
> >> > interested
> >> > > > in
> >> > > > > > the background)
> >> > > > > >
> >> > > > > > I'm sorry for all the mails, they all come from my quest to
> get
> >> > Knox
> >> > > > > > packaged up as a Cloudera Parcel plus a CSD (Custom Service
> >> > > Descriptor)
> >> > > > > to
> >> > > > > > run it from Cloudera Manager (CM) similar to what Ambari
> >> currently
> >> > > > > allows.
> >> > > > > > I did the same for NiFi and was facing similar issues there
> >> (e.g. <
> >> > > > > > https://issues.apache.org/jira/browse/NIFI-5350>, <
> >> > > > > > https://issues.apache.org/jira/browse/NIFI-5573> and others).
> >> > > > > >
> >> > > > > > For people unfamiliar with Cloudera Manager I'll explain how
> it
> >> > > works.
> >> > > > > That
> >> > > > > > should make it clearer why I have the issues I describe below.
> >> > > > > >
> >> > > > > > Cloudera Manager extracts the "things" it manages into a
> >> directory
> >> > > > (e.g.
> >> > > > > > /opt/cloudera/parcels/KNOX) and they are owned by root:root.
> >> This
> >> > is
> >> > > > not
> >> > > > > to
> >> > > > > > be changed by any process (e.g. no configuration file changes,
> >> no
> >> > > > > changing
> >> > > > > > of symlinks, no storing of PIDs, logs etc.).
> >> > > > > >
> >> > > > > > Every time a process (e.g. Knox Gateway) is started CM
> creates a
> >> > new
> >> > > > > > directory (/var/run/cloudera-scm-agent/process/XXX) where it
> >> > > > > copies/creates
> >> > > > > > the necessary config files + keytabs for _this run_ of the
> tool.
> >> > > > > >
> >> > > > > > It then starts the processes by pointing them at this
> >> directory, so
> >> > > > they
> >> > > > > > can pick up their config there and it also captures stdout &
> >> stderr
> >> > > in
> >> > > > > this
> >> > > > > > folder.
> >> > > > > >
> >> > > > > > This is different from Ambari. Ambari extracts Knox and
> creates
> >> > > > symlinks
> >> > > > > > from its conf, logs, pids and data directory to /etc/XXX. This
> >> is
> >> > > > > possible
> >> > > > > > here because those directories (/etc/...) don't change.
> >> > > > > >
> >> > > > > >
> >> > > > > > XXXXXXXXX
> >> > > > > >
> >> > > > > >
> >> > > > > > The problems I have with Knox so far (I'm sure I'll find more
> >> the
> >> > > > > further I
> >> > > > > > get) are:
> >> > > > > >
> >> > > > > > * gateway.sh has no way to take in options from the "outside".
> >> With
> >> > > > > Hadoop,
> >> > > > > > HBase, (now) NiFi you can pass in arbitrary Java options using
> >> > > > variables
> >> > > > > > like HADOOP_JAVA_OPTS and similar.
> >> > > > > >
> >> > > > > > In theory all the "setup" is already there for Knox as well
> >> using
> >> > > > > variables
> >> > > > > > like APP_CONF_DIR but unfortunately, they get set to hardcoded
> >> > values
> >> > > > at
> >> > > > > > the beginning of the script.
> >> > > > > >
> >> > > > > > Proposal: Add at least a APP_JAVA_OPTS variable so I can pass
> in
> >> > > > > arbitrary
> >> > > > > > stuff to be added to the Java command line. But really, I'd
> >> love to
> >> > > > just
> >> > > > > > remove the defaults for APP_LOG_DIR etc. IFF they are already
> >> set
> >> > > > > > externally
> >> > > > > >
> >> > > > > > * gateway.sh checks whether various directories exist. These
> are
> >> > > > > hardcoded
> >> > > > > > (e.g. APP_HOME_DIR/conf). But those directories are
> configurable
> >> > > using
> >> > > > > > GATEWAY_HOME etc. so those checks should either be removed or
> >> > fixed,
> >> > > so
> >> > > > > > they take those variables into account
> >> > > > > >
> >> > > > > > * knoxcli create-master takes a --master argument which I only
> >> > found
> >> > > > out
> >> > > > > by
> >> > > > > > looking at Ambari. The source says it's for testing only. It
> >> seems
> >> > as
> >> > > > if
> >> > > > > > that should be documented though. I think it's pretty useful
> to
> >> > allow
> >> > > > the
> >> > > > > > master being created non-interactively
> >> > > > > >
> >> > > > > > * gateway.sh does allow one thing to be overridden externally
> >> and
> >> > > that
> >> > > > is
> >> > > > > > the pid dir using ENV_PID_DIR. Unfortunately, knox-env.sh
> >> (which is
> >> > > > being
> >> > > > > > sourced unconditionally) overrides this variable with an empty
> >> > > value. I
> >> > > > > > think this line should just be removed from knox-env.sh
> >> > > > > >
> >> > > > > > * Launcher looks for a file called gateway.cfg but it always
> and
> >> > > > > > unconditionally looks in its "own" directory (launcherDir). I
> >> need
> >> > a
> >> > > > way
> >> > > > > to
> >> > > > > > point this to a different location. It allows me to define
> >> > > GATEWAY_HOME
> >> > > > > as
> >> > > > > > a system property. While I can also define that as an
> >> environment
> >> > > > > variable
> >> > > > > > the System property is checked first. And if it finds a
> >> > > > gateway-site.xml
> >> > > > > > there it uses that. I need it to use the one from the
> >> environment
> >> > > > > variable.
> >> > > > > >
> >> > > > > > * gateway.sh allows the process to run in the foreground but
> >> still
> >> > > > > captures
> >> > > > > > stdout & stderr to files. I would argue that it makes more
> >> sense to
> >> > > > leave
> >> > > > > > them as is and print them to the console instead.
> >> > > > > >
> >> > > > > > I'm happy to create issues for all of these and also provide
> >> > patches
> >> > > > for
> >> > > > > > some/all of them depending on my available time. I just wanted
> >> to
> >> > > bring
> >> > > > > > this up before I started to see if anyone has any better ideas
> >> > and/or
> >> > > > > > things that I might have missed.
> >> > > > > >
> >> > > > > > Thanks for reading!
> >> > > > > >
> >> > > > > > Cheers,
> >> > > > > > Lars
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>

Reply via email to