I reread the github reference in more detail and like what was proposed.

>From https://github.com/lfrancke/knox-config-docs#proposed-changes:
1. Should be doable without any backwards compat issues
2. I think this is possible without backwards compat issues. Could have a
flag for capturing logs
3. This would be very helpful and as log as it is prefixed there shouldn't
be issues with arbitrary settings getting picked up.

4-8 would have to see what changes this means and if it causes issues.

For the first 3 would be good to create Jiras :)

Kevin Risden


On Thu, Nov 8, 2018 at 7:29 PM Kevin Risden <kris...@apache.org> wrote:

> I think some of the configuration is around log4j configurations which is
> probably going to have to change related to log4j2 (see KNOX-1462). For
> Docker support some of the same things (like running in the foreground and
> configs from environment variables) would be beneficial (see KNOX-1285). I
> read through the github repo and things looked pretty reasonable. Would be
> interesting in concrete changes and see how they play out.
>
> Kevin Risden
>
>
> On Thu, Nov 8, 2018 at 4:44 PM Lars Francke <lars.fran...@gmail.com>
> wrote:
>
>> Larry,
>>
>> thank you so much for spending time on this. I didn't actually expect an
>> answer as I saw the ball in my court. Thank you! Especially as this must
>> be
>> one of the most boring parts of writing any piece of software, sorry :(
>>
>> I agree that lots of things are "close" but it's not all consistent. I
>> have
>> to "relearn" this every time I look at it as well as it's not super
>> trivial
>> to keep it all in memory ;-)
>>
>> I have created some documentation for myself to remember these things.
>> I've
>> just pushed it to a temporary repository. In case we'll decide to change
>> something and create an issue from it I'll attach it there as well: <
>> https://github.com/lfrancke/knox-config-docs>
>> Far from beautiful but it collects my findings.
>>
>> As you can see you are correct that the order is correct for some
>> variables
>> but different for others and on top of that they are inconsistent. All of
>> them can be configured slightly differently. And the big problem on top of
>> that is that the most important setting (the one taking top precedence) is
>> set via Java System properties.
>> In general I think that's a good decision because a user will actively
>> have
>> to add a System Property while an Environment Variable and other default
>> settings  could be coming from an "unknown" environment.
>>
>> The problem is that the launcher first reads a properties file
>> (gateway.cfg
>> or knoxcli.cfg) from the bin directory and sets everything it finds in
>> there as System Properties. If it can't find this file it uses a default
>> one with hardcoded things.
>>
>> Because the current shell scripts don't allow me to pass in any Java
>> System
>> Properties that means that I can't effectively override these defaults
>> without changing this cfg file. I think that's counterintuitive.
>> Especially
>> as the launcher writes its default cfg file to the bin directory on first
>> launch. This means that it effectively shifts the precedence around. The
>> least important ones (hardcoded defaults) now move to the top spot.
>>
>> Yes it _can_ all be changed by changing this file but at least in my
>> opinion that is very counter-intuitive and different from the way all
>> other
>> Hadoop ecosystem tools work that I know.
>>
>> Unfortunately I totally agree with you on the backwards compatibility
>> issue...and I don't have a good idea here either.
>>
>> For the shell script at least I propose to do the following:
>> 1) Run shellcheck/shellharden over them (same thing happened with Hadoop
>> in
>> the CLI rewrite) so they are up to current bash standards.
>>
>> 2) Currently the script uses APP_ environment variables. I propose adding
>> a
>> second set called GATEWAY_ and in version X check if any of them are set
>> and if they are emit a warning (or even abort startup?) that they are
>> going
>> to be used in version X+1 to modify startup behavior
>>
>> 3) At the same time check if anyone set APP_ externally and warn them that
>> they are currently not being honored and that they'll go away entirely in
>> X+1 as well
>>
>> That way I could at least pass in Java Properties to the startup script
>> and
>> it would finally allow me to override the defaults.
>>
>> I will try to spend a bit of time to work out a more detailed plan for the
>> rest of it but if we agree that it's worthwhile to make all of this
>> consistent then I believe we won't be able to do it without breaking
>> changes in current behavior. I also believe however that almost no one
>> will
>> rely on the current behavior. I mean...it took me at least a day to
>> understand how it currently works and I still only get half of it. I doubt
>> many people have made the same effort (there's always the good chance that
>> others are smarter than me and didn't take that long ;-) ) though.
>>
>> In general I would propose the following:
>>
>> * Roll out any changes over multiple versions of Knox in stages
>> * Stage 1: Improvements that can be made without any breaking changes
>> * Stage 2: WARN users if they rey on behavior that is going to change in
>> the future
>> * Stage 3: Make the change
>>
>> Sorry for the rambling... I'll try to keep the Github Repository more
>> structured.
>>
>> But I'd also be happy if you told me that you think this is not worth the
>> effort. Unfortunately our sponsor only paid me for a few days of work on
>> this so I'll be a bit slow to work on this anyway but I would be happy to
>> continue/start the work.
>>
>> Cheers,
>> Lars
>>
>>
>> On Wed, Nov 7, 2018 at 7:59 PM larry mccay <lmc...@apache.org> wrote:
>>
>> > Hi Lars -
>> >
>> > Sorry for the delay in getting back to this thread.
>> > I have to admit, I have yet to get my head around the risk associated
>> with
>> > the proposed changes.
>> > Somewhat due to the way that some of the layers in the ordering you
>> propose
>> > sort of overlap today.
>> > For instance, the ConfigConfigImpl class takes environment variables and
>> > system properties into account within its accessor methods for a number
>> of
>> > these directories.
>> >
>> > Example:
>> >   @Override
>> >   public String getGatewayDataDir() {
>> >     String systemValue =
>> >         System.getProperty(GATEWAY_DATA_HOME_VAR,
>> > System.getenv(GATEWAY_DATA_HOME_VAR));
>> >     String dataDir = null;
>> >     if (systemValue != null) {
>> >       dataDir = systemValue;
>> >     } else {
>> >       dataDir = get(DATA_DIR, getGatewayHomeDir() + File.separator +
>> > DEFAULT_DATA_DIR);
>> >     }
>> >     return FilenameUtils.normalize(dataDir);
>> >   }
>> >
>> > The above does take sys property above environment variable.
>> > If neither is set it checks the xml config files and ultimately
>> fallsback
>> > to some default.
>> > I believe this is exactly what your proposed order is.
>> >
>> > Directories that are to be relative to the DataDir determined above
>> should
>> > be fine since the are considered relative:
>> >
>> >   @Override
>> >   public String getGatewayServicesDir() {
>> >     return get(STACKS_SERVICES_DIR, getGatewayDataDir() +
>> File.separator +
>> > DEFAULT_STACKS_SERVICES_DIR);
>> >   }
>> >
>> >   @Override
>> >   public String getGatewayApplicationsDir() {
>> >     return get(APPLICATIONS_DIR, getGatewayDataDir() + File.separator +
>> > DEFAULT_APPLICATIONS_DIR);
>> >   }
>> >
>> > as examples.
>> >
>> > Unless we want to be able to use different directory names for those as
>> > well.
>> > That would seem more risky.
>> >
>> > Now, the ConfDir seems to actually be the same as the DataDir discovery:
>> >
>> >   @Override
>> >   public String getGatewayConfDir() {
>> >     String value = getVar( GATEWAY_CONF_HOME_VAR, getGatewayHomeDir() +
>> > File.separator + "conf"  );
>> >     return FilenameUtils.normalize(value);
>> >   }
>> >
>> > Where getVar does the interrogation of Sys Prop and Env vars:
>> >
>> >   private String getVar( String variableName, String defaultValue ) {
>> >     String value = get( variableName );
>> >     if( value == null ) {
>> >       value = System.getProperty( variableName );
>> >     }
>> >     if( value == null ) {
>> >       value = System.getenv( variableName );
>> >     }
>> >     if( value == null ) {
>> >       value = defaultValue;
>> >     }
>> >     return value;
>> >   }
>> >
>> > So, it actually seems like the code is processing things in the order
>> that
>> > you propose already.
>> >
>> > You are saying that the order is being manipulated by the use of the cfg
>> > files, Launcher, etc by setting system properties then?
>> >
>> > Again, I still just trying to assess the risk on existing deployments
>> and
>> > any provisioning tools out there that may be making assumptions.
>> >
>> > thanks,
>> >
>> > --larry
>> >
>> > On Fri, Oct 12, 2018 at 4:37 PM larry mccay <lmc...@apache.org> wrote:
>> >
>> > > Hi Lars -
>> > >
>> > > Thanks for raising the backward compatibility concern, it was exactly
>> > what
>> > > I was thinking as I was catching backing up the thread.
>> > > We need to give serious thought to upgrade scenarios.
>> > >
>> > > I will need to spend some time with the proposal but the bottom line
>> is
>> > > this area clearly needs some love and we just have to find the safest
>> way
>> > > to improve it.
>> > >
>> > > Specific concerns are obviously around existing deployments and any
>> > > assumptions that are made by devops type installs, Ambari, etc.
>> > >
>> > > thanks,
>> > >
>> > > --larry
>> > >
>> > > On Fri, Oct 12, 2018 at 1:39 PM Lars Francke <lars.fran...@gmail.com>
>> > > wrote:
>> > >
>> > >> So I did spend a few hours looking through the shell scripts,
>> Launcher,
>> > >> KnoxCLI, GatewayServer and GatewayConfigImpl.
>> > >>
>> > >> Before I start work I'd like to get opinions to avoid unnecessary
>> work.
>> > >>
>> > >> There are lots of inconsistencies in the current code for handling
>> > >> directories and properties. I'd love to make all of that as
>> consistent
>> > as
>> > >> I
>> > >> can.
>> > >>
>> > >> The big question for me however is: What about backwards
>> compatibility?
>> > If
>> > >> I change the way properties are handled it might mean different
>> > behaviour
>> > >> if people keep on using the same config as they are now.
>> > >>
>> > >> I say _might_ because I assume that most people stick to the
>> standards
>> > >> (especially via Ambari) and are not using some of the ways to change
>> > >> configs. Most of them aren't really documented anyway.
>> > >>
>> > >> My current plan is as such:
>> > >>
>> > >> - Make all paths configurable using a consistent way in which these
>> > paths
>> > >> can be specified
>> > >>
>> > >> - The new order should be:
>> > >>   1) Java System Property
>> > >>   2) Environment variable
>> > >>   3) Config from gateway-default.xml & gateway-site.xm
>> > >>   4) Some computed default
>> > >>
>> > >> - This is not possible for all settings. Especially the location of
>> the
>> > >> gateway-site.xml can obviously not be configured in gateway-site.xml
>> > >> itself
>> > >> :)
>> > >>
>> > >> - Knox currently (using Invoker + Launcher) sets Java System
>> Properties
>> > >> for
>> > >> us coming from the gateway.cfg file. I think this is bad form because
>> > they
>> > >> take precedent over environment variables and are set automatically
>> on
>> > >> first run and one needs to change them.
>> > >>
>> > >> So I'd really like to do away with the concept of those cfg files in
>> its
>> > >> entirety and pass parameters explicitly. I looked at how the new
>> Shell
>> > >> scripts from Hadoop 3 work and would like to implement something very
>> > >> similar here.
>> > >>
>> > >> If we want to keep the cfg file then I would propose either not
>> using it
>> > >> to
>> > >> set Java System properties or at least to change the default
>> gateway.cfg
>> > >> to
>> > >> not set GATEWAY_HOME and log4j.configuration
>> > >>
>> > >> And if we keep the file I'd also like to make its location
>> configurable.
>> > >>
>> > >> But really I'd like to do away with it. Happy to be convinced
>> otherwise
>> > >> though.
>> > >>
>> > >> - gateway.sh will not hardcode any values anymore but use the same
>> logic
>> > >> as
>> > >> the code to determine final locations
>> > >>
>> > >> - gateway.sh will allow customization via environment variables
>> > >>
>> > >> - knox-env.sh will be renamed to knox-function.sh and a new empty
>> > >> knox-env.sh will be created that basically serves as documentation
>> for
>> > all
>> > >> the environment variables that can be used to customize gateway.sh
>> > >>
>> > >> - Document knoxcli.sh create-master --master
>> > >>
>> > >> - Processes running in the foreground will not redirect stdout &
>> stderr
>> > to
>> > >> a file
>> > >>
>> > >> Any concerns or comments?
>> > >>
>> > >> On Thu, Oct 11, 2018 at 5:36 PM
>> Kevin Risden
>> <kris...@apache.org>
>> > wrote:
>> > >>
>> > >> > Agree with the changes proposed. It is rather annoying to have to
>> edit
>> > >> > gateway.sh to apply changes when external environment variables are
>> > >> useful.
>> > >> > This work would also help support a Docker image (KNOX-1285) since
>> we
>> > >> would
>> > >> > run into the same customization issues.
>> > >> >
>> > >> > Kevin Risden
>> > >> >
>> > >> >
>> > >> > On Thu, Oct 11, 2018 at 11:12 AM Lars Francke <
>> lars.fran...@gmail.com
>> > >
>> > >> > wrote:
>> > >> >
>> > >> > > Thanks both of you.
>> > >> > >
>> > >> > > No, these changes would benefit me and this is my use case but we
>> > >> would
>> > >> > not
>> > >> > > need to add anything parcel or cloudera specific. And I think
>> that
>> > >> while
>> > >> > I
>> > >> > > have a specific need they are useful in general as well.
>> > >> > >
>> > >> > > It would probably be an overhaul of gateway.sh, knox-env.sh and
>> > >> > knoxcli.sh.
>> > >> > > Maybe a few changes to Launcher.
>> > >> > >
>> > >> > >
>> > >> > >
>> > >> > > On Thu, Oct 11, 2018, 16:50 larry mccay <larry.mc...@gmail.com>
>> > >> wrote:
>> > >> > >
>> > >> > > > This sounds like a worthwhile piece of work, Lars.
>> > >> > > > Would the "parcel" need to be added to the Knox project?
>> > >> > > >
>> > >> > > > +1 to Phil's response.
>> > >> > > >
>> > >> > > >
>> > >> > > > On Thu, Oct 11, 2018 at 9:33 AM Phil Zampino <
>> pzamp...@apache.org
>> > >
>> > >> > > wrote:
>> > >> > > >
>> > >> > > > > Cloudera specifics aside, some of these things have been on
>> my
>> > >> > personal
>> > >> > > > > "back burner" todo list; I just haven't as of yet been able
>> to
>> > >> bring
>> > >> > > them
>> > >> > > > > to the front burner ;-)
>> > >> > > > >
>> > >> > > > > Please feel free to create issues for these, and provide
>> patches
>> > >> as
>> > >> > > > you're
>> > >> > > > > able.
>> > >> > > > >
>> > >> > > > > Thanks for this useful feedback,
>> > >> > > > >    Phil
>> > >> > > > >
>> > >> > > > >
>> > >> > > > > On Thu, Oct 11, 2018 at 8:43 AM Lars Francke <
>> > >> lars.fran...@gmail.com
>> > >> > >
>> > >> > > > > wrote:
>> > >> > > > >
>> > >> > > > > > Hi again,
>> > >> > > > > >
>> > >> > > > > > (long intro, jump to the line with the XXXXX if you're not
>> > >> > interested
>> > >> > > > in
>> > >> > > > > > the background)
>> > >> > > > > >
>> > >> > > > > > I'm sorry for all the mails, they all come from my quest to
>> > get
>> > >> > Knox
>> > >> > > > > > packaged up as a Cloudera Parcel plus a CSD (Custom Service
>> > >> > > Descriptor)
>> > >> > > > > to
>> > >> > > > > > run it from Cloudera Manager (CM) similar to what Ambari
>> > >> currently
>> > >> > > > > allows.
>> > >> > > > > > I did the same for NiFi and was facing similar issues there
>> > >> (e.g. <
>> > >> > > > > > https://issues.apache.org/jira/browse/NIFI-5350>, <
>> > >> > > > > > https://issues.apache.org/jira/browse/NIFI-5573> and
>> others).
>> > >> > > > > >
>> > >> > > > > > For people unfamiliar with Cloudera Manager I'll explain
>> how
>> > it
>> > >> > > works.
>> > >> > > > > That
>> > >> > > > > > should make it clearer why I have the issues I describe
>> below.
>> > >> > > > > >
>> > >> > > > > > Cloudera Manager extracts the "things" it manages into a
>> > >> directory
>> > >> > > > (e.g.
>> > >> > > > > > /opt/cloudera/parcels/KNOX) and they are owned by
>> root:root.
>> > >> This
>> > >> > is
>> > >> > > > not
>> > >> > > > > to
>> > >> > > > > > be changed by any process (e.g. no configuration file
>> changes,
>> > >> no
>> > >> > > > > changing
>> > >> > > > > > of symlinks, no storing of PIDs, logs etc.).
>> > >> > > > > >
>> > >> > > > > > Every time a process (e.g. Knox Gateway) is started CM
>> > creates a
>> > >> > new
>> > >> > > > > > directory (/var/run/cloudera-scm-agent/process/XXX) where
>> it
>> > >> > > > > copies/creates
>> > >> > > > > > the necessary config files + keytabs for _this run_ of the
>> > tool.
>> > >> > > > > >
>> > >> > > > > > It then starts the processes by pointing them at this
>> > >> directory, so
>> > >> > > > they
>> > >> > > > > > can pick up their config there and it also captures stdout
>> &
>> > >> stderr
>> > >> > > in
>> > >> > > > > this
>> > >> > > > > > folder.
>> > >> > > > > >
>> > >> > > > > > This is different from Ambari. Ambari extracts Knox and
>> > creates
>> > >> > > > symlinks
>> > >> > > > > > from its conf, logs, pids and data directory to /etc/XXX.
>> This
>> > >> is
>> > >> > > > > possible
>> > >> > > > > > here because those directories (/etc/...) don't change.
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > > > XXXXXXXXX
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > > > The problems I have with Knox so far (I'm sure I'll find
>> more
>> > >> the
>> > >> > > > > further I
>> > >> > > > > > get) are:
>> > >> > > > > >
>> > >> > > > > > * gateway.sh has no way to take in options from the
>> "outside".
>> > >> With
>> > >> > > > > Hadoop,
>> > >> > > > > > HBase, (now) NiFi you can pass in arbitrary Java options
>> using
>> > >> > > > variables
>> > >> > > > > > like HADOOP_JAVA_OPTS and similar.
>> > >> > > > > >
>> > >> > > > > > In theory all the "setup" is already there for Knox as well
>> > >> using
>> > >> > > > > variables
>> > >> > > > > > like APP_CONF_DIR but unfortunately, they get set to
>> hardcoded
>> > >> > values
>> > >> > > > at
>> > >> > > > > > the beginning of the script.
>> > >> > > > > >
>> > >> > > > > > Proposal: Add at least a APP_JAVA_OPTS variable so I can
>> pass
>> > in
>> > >> > > > > arbitrary
>> > >> > > > > > stuff to be added to the Java command line. But really, I'd
>> > >> love to
>> > >> > > > just
>> > >> > > > > > remove the defaults for APP_LOG_DIR etc. IFF they are
>> already
>> > >> set
>> > >> > > > > > externally
>> > >> > > > > >
>> > >> > > > > > * gateway.sh checks whether various directories exist.
>> These
>> > are
>> > >> > > > > hardcoded
>> > >> > > > > > (e.g. APP_HOME_DIR/conf). But those directories are
>> > configurable
>> > >> > > using
>> > >> > > > > > GATEWAY_HOME etc. so those checks should either be removed
>> or
>> > >> > fixed,
>> > >> > > so
>> > >> > > > > > they take those variables into account
>> > >> > > > > >
>> > >> > > > > > * knoxcli create-master takes a --master argument which I
>> only
>> > >> > found
>> > >> > > > out
>> > >> > > > > by
>> > >> > > > > > looking at Ambari. The source says it's for testing only.
>> It
>> > >> seems
>> > >> > as
>> > >> > > > if
>> > >> > > > > > that should be documented though. I think it's pretty
>> useful
>> > to
>> > >> > allow
>> > >> > > > the
>> > >> > > > > > master being created non-interactively
>> > >> > > > > >
>> > >> > > > > > * gateway.sh does allow one thing to be overridden
>> externally
>> > >> and
>> > >> > > that
>> > >> > > > is
>> > >> > > > > > the pid dir using ENV_PID_DIR. Unfortunately, knox-env.sh
>> > >> (which is
>> > >> > > > being
>> > >> > > > > > sourced unconditionally) overrides this variable with an
>> empty
>> > >> > > value. I
>> > >> > > > > > think this line should just be removed from knox-env.sh
>> > >> > > > > >
>> > >> > > > > > * Launcher looks for a file called gateway.cfg but it
>> always
>> > and
>> > >> > > > > > unconditionally looks in its "own" directory
>> (launcherDir). I
>> > >> need
>> > >> > a
>> > >> > > > way
>> > >> > > > > to
>> > >> > > > > > point this to a different location. It allows me to define
>> > >> > > GATEWAY_HOME
>> > >> > > > > as
>> > >> > > > > > a system property. While I can also define that as an
>> > >> environment
>> > >> > > > > variable
>> > >> > > > > > the System property is checked first. And if it finds a
>> > >> > > > gateway-site.xml
>> > >> > > > > > there it uses that. I need it to use the one from the
>> > >> environment
>> > >> > > > > variable.
>> > >> > > > > >
>> > >> > > > > > * gateway.sh allows the process to run in the foreground
>> but
>> > >> still
>> > >> > > > > captures
>> > >> > > > > > stdout & stderr to files. I would argue that it makes more
>> > >> sense to
>> > >> > > > leave
>> > >> > > > > > them as is and print them to the console instead.
>> > >> > > > > >
>> > >> > > > > > I'm happy to create issues for all of these and also
>> provide
>> > >> > patches
>> > >> > > > for
>> > >> > > > > > some/all of them depending on my available time. I just
>> wanted
>> > >> to
>> > >> > > bring
>> > >> > > > > > this up before I started to see if anyone has any better
>> ideas
>> > >> > and/or
>> > >> > > > > > things that I might have missed.
>> > >> > > > > >
>> > >> > > > > > Thanks for reading!
>> > >> > > > > >
>> > >> > > > > > Cheers,
>> > >> > > > > > Lars
>> > >> > > > > >
>> > >> > > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> > >
>> >
>>
>

Reply via email to