I reread the github reference in more detail and like what was proposed. >From https://github.com/lfrancke/knox-config-docs#proposed-changes: 1. Should be doable without any backwards compat issues 2. I think this is possible without backwards compat issues. Could have a flag for capturing logs 3. This would be very helpful and as log as it is prefixed there shouldn't be issues with arbitrary settings getting picked up.
4-8 would have to see what changes this means and if it causes issues. For the first 3 would be good to create Jiras :) Kevin Risden On Thu, Nov 8, 2018 at 7:29 PM Kevin Risden <kris...@apache.org> wrote: > I think some of the configuration is around log4j configurations which is > probably going to have to change related to log4j2 (see KNOX-1462). For > Docker support some of the same things (like running in the foreground and > configs from environment variables) would be beneficial (see KNOX-1285). I > read through the github repo and things looked pretty reasonable. Would be > interesting in concrete changes and see how they play out. > > Kevin Risden > > > On Thu, Nov 8, 2018 at 4:44 PM Lars Francke <lars.fran...@gmail.com> > wrote: > >> Larry, >> >> thank you so much for spending time on this. I didn't actually expect an >> answer as I saw the ball in my court. Thank you! Especially as this must >> be >> one of the most boring parts of writing any piece of software, sorry :( >> >> I agree that lots of things are "close" but it's not all consistent. I >> have >> to "relearn" this every time I look at it as well as it's not super >> trivial >> to keep it all in memory ;-) >> >> I have created some documentation for myself to remember these things. >> I've >> just pushed it to a temporary repository. In case we'll decide to change >> something and create an issue from it I'll attach it there as well: < >> https://github.com/lfrancke/knox-config-docs> >> Far from beautiful but it collects my findings. >> >> As you can see you are correct that the order is correct for some >> variables >> but different for others and on top of that they are inconsistent. All of >> them can be configured slightly differently. And the big problem on top of >> that is that the most important setting (the one taking top precedence) is >> set via Java System properties. >> In general I think that's a good decision because a user will actively >> have >> to add a System Property while an Environment Variable and other default >> settings could be coming from an "unknown" environment. >> >> The problem is that the launcher first reads a properties file >> (gateway.cfg >> or knoxcli.cfg) from the bin directory and sets everything it finds in >> there as System Properties. If it can't find this file it uses a default >> one with hardcoded things. >> >> Because the current shell scripts don't allow me to pass in any Java >> System >> Properties that means that I can't effectively override these defaults >> without changing this cfg file. I think that's counterintuitive. >> Especially >> as the launcher writes its default cfg file to the bin directory on first >> launch. This means that it effectively shifts the precedence around. The >> least important ones (hardcoded defaults) now move to the top spot. >> >> Yes it _can_ all be changed by changing this file but at least in my >> opinion that is very counter-intuitive and different from the way all >> other >> Hadoop ecosystem tools work that I know. >> >> Unfortunately I totally agree with you on the backwards compatibility >> issue...and I don't have a good idea here either. >> >> For the shell script at least I propose to do the following: >> 1) Run shellcheck/shellharden over them (same thing happened with Hadoop >> in >> the CLI rewrite) so they are up to current bash standards. >> >> 2) Currently the script uses APP_ environment variables. I propose adding >> a >> second set called GATEWAY_ and in version X check if any of them are set >> and if they are emit a warning (or even abort startup?) that they are >> going >> to be used in version X+1 to modify startup behavior >> >> 3) At the same time check if anyone set APP_ externally and warn them that >> they are currently not being honored and that they'll go away entirely in >> X+1 as well >> >> That way I could at least pass in Java Properties to the startup script >> and >> it would finally allow me to override the defaults. >> >> I will try to spend a bit of time to work out a more detailed plan for the >> rest of it but if we agree that it's worthwhile to make all of this >> consistent then I believe we won't be able to do it without breaking >> changes in current behavior. I also believe however that almost no one >> will >> rely on the current behavior. I mean...it took me at least a day to >> understand how it currently works and I still only get half of it. I doubt >> many people have made the same effort (there's always the good chance that >> others are smarter than me and didn't take that long ;-) ) though. >> >> In general I would propose the following: >> >> * Roll out any changes over multiple versions of Knox in stages >> * Stage 1: Improvements that can be made without any breaking changes >> * Stage 2: WARN users if they rey on behavior that is going to change in >> the future >> * Stage 3: Make the change >> >> Sorry for the rambling... I'll try to keep the Github Repository more >> structured. >> >> But I'd also be happy if you told me that you think this is not worth the >> effort. Unfortunately our sponsor only paid me for a few days of work on >> this so I'll be a bit slow to work on this anyway but I would be happy to >> continue/start the work. >> >> Cheers, >> Lars >> >> >> On Wed, Nov 7, 2018 at 7:59 PM larry mccay <lmc...@apache.org> wrote: >> >> > Hi Lars - >> > >> > Sorry for the delay in getting back to this thread. >> > I have to admit, I have yet to get my head around the risk associated >> with >> > the proposed changes. >> > Somewhat due to the way that some of the layers in the ordering you >> propose >> > sort of overlap today. >> > For instance, the ConfigConfigImpl class takes environment variables and >> > system properties into account within its accessor methods for a number >> of >> > these directories. >> > >> > Example: >> > @Override >> > public String getGatewayDataDir() { >> > String systemValue = >> > System.getProperty(GATEWAY_DATA_HOME_VAR, >> > System.getenv(GATEWAY_DATA_HOME_VAR)); >> > String dataDir = null; >> > if (systemValue != null) { >> > dataDir = systemValue; >> > } else { >> > dataDir = get(DATA_DIR, getGatewayHomeDir() + File.separator + >> > DEFAULT_DATA_DIR); >> > } >> > return FilenameUtils.normalize(dataDir); >> > } >> > >> > The above does take sys property above environment variable. >> > If neither is set it checks the xml config files and ultimately >> fallsback >> > to some default. >> > I believe this is exactly what your proposed order is. >> > >> > Directories that are to be relative to the DataDir determined above >> should >> > be fine since the are considered relative: >> > >> > @Override >> > public String getGatewayServicesDir() { >> > return get(STACKS_SERVICES_DIR, getGatewayDataDir() + >> File.separator + >> > DEFAULT_STACKS_SERVICES_DIR); >> > } >> > >> > @Override >> > public String getGatewayApplicationsDir() { >> > return get(APPLICATIONS_DIR, getGatewayDataDir() + File.separator + >> > DEFAULT_APPLICATIONS_DIR); >> > } >> > >> > as examples. >> > >> > Unless we want to be able to use different directory names for those as >> > well. >> > That would seem more risky. >> > >> > Now, the ConfDir seems to actually be the same as the DataDir discovery: >> > >> > @Override >> > public String getGatewayConfDir() { >> > String value = getVar( GATEWAY_CONF_HOME_VAR, getGatewayHomeDir() + >> > File.separator + "conf" ); >> > return FilenameUtils.normalize(value); >> > } >> > >> > Where getVar does the interrogation of Sys Prop and Env vars: >> > >> > private String getVar( String variableName, String defaultValue ) { >> > String value = get( variableName ); >> > if( value == null ) { >> > value = System.getProperty( variableName ); >> > } >> > if( value == null ) { >> > value = System.getenv( variableName ); >> > } >> > if( value == null ) { >> > value = defaultValue; >> > } >> > return value; >> > } >> > >> > So, it actually seems like the code is processing things in the order >> that >> > you propose already. >> > >> > You are saying that the order is being manipulated by the use of the cfg >> > files, Launcher, etc by setting system properties then? >> > >> > Again, I still just trying to assess the risk on existing deployments >> and >> > any provisioning tools out there that may be making assumptions. >> > >> > thanks, >> > >> > --larry >> > >> > On Fri, Oct 12, 2018 at 4:37 PM larry mccay <lmc...@apache.org> wrote: >> > >> > > Hi Lars - >> > > >> > > Thanks for raising the backward compatibility concern, it was exactly >> > what >> > > I was thinking as I was catching backing up the thread. >> > > We need to give serious thought to upgrade scenarios. >> > > >> > > I will need to spend some time with the proposal but the bottom line >> is >> > > this area clearly needs some love and we just have to find the safest >> way >> > > to improve it. >> > > >> > > Specific concerns are obviously around existing deployments and any >> > > assumptions that are made by devops type installs, Ambari, etc. >> > > >> > > thanks, >> > > >> > > --larry >> > > >> > > On Fri, Oct 12, 2018 at 1:39 PM Lars Francke <lars.fran...@gmail.com> >> > > wrote: >> > > >> > >> So I did spend a few hours looking through the shell scripts, >> Launcher, >> > >> KnoxCLI, GatewayServer and GatewayConfigImpl. >> > >> >> > >> Before I start work I'd like to get opinions to avoid unnecessary >> work. >> > >> >> > >> There are lots of inconsistencies in the current code for handling >> > >> directories and properties. I'd love to make all of that as >> consistent >> > as >> > >> I >> > >> can. >> > >> >> > >> The big question for me however is: What about backwards >> compatibility? >> > If >> > >> I change the way properties are handled it might mean different >> > behaviour >> > >> if people keep on using the same config as they are now. >> > >> >> > >> I say _might_ because I assume that most people stick to the >> standards >> > >> (especially via Ambari) and are not using some of the ways to change >> > >> configs. Most of them aren't really documented anyway. >> > >> >> > >> My current plan is as such: >> > >> >> > >> - Make all paths configurable using a consistent way in which these >> > paths >> > >> can be specified >> > >> >> > >> - The new order should be: >> > >> 1) Java System Property >> > >> 2) Environment variable >> > >> 3) Config from gateway-default.xml & gateway-site.xm >> > >> 4) Some computed default >> > >> >> > >> - This is not possible for all settings. Especially the location of >> the >> > >> gateway-site.xml can obviously not be configured in gateway-site.xml >> > >> itself >> > >> :) >> > >> >> > >> - Knox currently (using Invoker + Launcher) sets Java System >> Properties >> > >> for >> > >> us coming from the gateway.cfg file. I think this is bad form because >> > they >> > >> take precedent over environment variables and are set automatically >> on >> > >> first run and one needs to change them. >> > >> >> > >> So I'd really like to do away with the concept of those cfg files in >> its >> > >> entirety and pass parameters explicitly. I looked at how the new >> Shell >> > >> scripts from Hadoop 3 work and would like to implement something very >> > >> similar here. >> > >> >> > >> If we want to keep the cfg file then I would propose either not >> using it >> > >> to >> > >> set Java System properties or at least to change the default >> gateway.cfg >> > >> to >> > >> not set GATEWAY_HOME and log4j.configuration >> > >> >> > >> And if we keep the file I'd also like to make its location >> configurable. >> > >> >> > >> But really I'd like to do away with it. Happy to be convinced >> otherwise >> > >> though. >> > >> >> > >> - gateway.sh will not hardcode any values anymore but use the same >> logic >> > >> as >> > >> the code to determine final locations >> > >> >> > >> - gateway.sh will allow customization via environment variables >> > >> >> > >> - knox-env.sh will be renamed to knox-function.sh and a new empty >> > >> knox-env.sh will be created that basically serves as documentation >> for >> > all >> > >> the environment variables that can be used to customize gateway.sh >> > >> >> > >> - Document knoxcli.sh create-master --master >> > >> >> > >> - Processes running in the foreground will not redirect stdout & >> stderr >> > to >> > >> a file >> > >> >> > >> Any concerns or comments? >> > >> >> > >> On Thu, Oct 11, 2018 at 5:36 PM >> Kevin Risden >> <kris...@apache.org> >> > wrote: >> > >> >> > >> > Agree with the changes proposed. It is rather annoying to have to >> edit >> > >> > gateway.sh to apply changes when external environment variables are >> > >> useful. >> > >> > This work would also help support a Docker image (KNOX-1285) since >> we >> > >> would >> > >> > run into the same customization issues. >> > >> > >> > >> > Kevin Risden >> > >> > >> > >> > >> > >> > On Thu, Oct 11, 2018 at 11:12 AM Lars Francke < >> lars.fran...@gmail.com >> > > >> > >> > wrote: >> > >> > >> > >> > > Thanks both of you. >> > >> > > >> > >> > > No, these changes would benefit me and this is my use case but we >> > >> would >> > >> > not >> > >> > > need to add anything parcel or cloudera specific. And I think >> that >> > >> while >> > >> > I >> > >> > > have a specific need they are useful in general as well. >> > >> > > >> > >> > > It would probably be an overhaul of gateway.sh, knox-env.sh and >> > >> > knoxcli.sh. >> > >> > > Maybe a few changes to Launcher. >> > >> > > >> > >> > > >> > >> > > >> > >> > > On Thu, Oct 11, 2018, 16:50 larry mccay <larry.mc...@gmail.com> >> > >> wrote: >> > >> > > >> > >> > > > This sounds like a worthwhile piece of work, Lars. >> > >> > > > Would the "parcel" need to be added to the Knox project? >> > >> > > > >> > >> > > > +1 to Phil's response. >> > >> > > > >> > >> > > > >> > >> > > > On Thu, Oct 11, 2018 at 9:33 AM Phil Zampino < >> pzamp...@apache.org >> > > >> > >> > > wrote: >> > >> > > > >> > >> > > > > Cloudera specifics aside, some of these things have been on >> my >> > >> > personal >> > >> > > > > "back burner" todo list; I just haven't as of yet been able >> to >> > >> bring >> > >> > > them >> > >> > > > > to the front burner ;-) >> > >> > > > > >> > >> > > > > Please feel free to create issues for these, and provide >> patches >> > >> as >> > >> > > > you're >> > >> > > > > able. >> > >> > > > > >> > >> > > > > Thanks for this useful feedback, >> > >> > > > > Phil >> > >> > > > > >> > >> > > > > >> > >> > > > > On Thu, Oct 11, 2018 at 8:43 AM Lars Francke < >> > >> lars.fran...@gmail.com >> > >> > > >> > >> > > > > wrote: >> > >> > > > > >> > >> > > > > > Hi again, >> > >> > > > > > >> > >> > > > > > (long intro, jump to the line with the XXXXX if you're not >> > >> > interested >> > >> > > > in >> > >> > > > > > the background) >> > >> > > > > > >> > >> > > > > > I'm sorry for all the mails, they all come from my quest to >> > get >> > >> > Knox >> > >> > > > > > packaged up as a Cloudera Parcel plus a CSD (Custom Service >> > >> > > Descriptor) >> > >> > > > > to >> > >> > > > > > run it from Cloudera Manager (CM) similar to what Ambari >> > >> currently >> > >> > > > > allows. >> > >> > > > > > I did the same for NiFi and was facing similar issues there >> > >> (e.g. < >> > >> > > > > > https://issues.apache.org/jira/browse/NIFI-5350>, < >> > >> > > > > > https://issues.apache.org/jira/browse/NIFI-5573> and >> others). >> > >> > > > > > >> > >> > > > > > For people unfamiliar with Cloudera Manager I'll explain >> how >> > it >> > >> > > works. >> > >> > > > > That >> > >> > > > > > should make it clearer why I have the issues I describe >> below. >> > >> > > > > > >> > >> > > > > > Cloudera Manager extracts the "things" it manages into a >> > >> directory >> > >> > > > (e.g. >> > >> > > > > > /opt/cloudera/parcels/KNOX) and they are owned by >> root:root. >> > >> This >> > >> > is >> > >> > > > not >> > >> > > > > to >> > >> > > > > > be changed by any process (e.g. no configuration file >> changes, >> > >> no >> > >> > > > > changing >> > >> > > > > > of symlinks, no storing of PIDs, logs etc.). >> > >> > > > > > >> > >> > > > > > Every time a process (e.g. Knox Gateway) is started CM >> > creates a >> > >> > new >> > >> > > > > > directory (/var/run/cloudera-scm-agent/process/XXX) where >> it >> > >> > > > > copies/creates >> > >> > > > > > the necessary config files + keytabs for _this run_ of the >> > tool. >> > >> > > > > > >> > >> > > > > > It then starts the processes by pointing them at this >> > >> directory, so >> > >> > > > they >> > >> > > > > > can pick up their config there and it also captures stdout >> & >> > >> stderr >> > >> > > in >> > >> > > > > this >> > >> > > > > > folder. >> > >> > > > > > >> > >> > > > > > This is different from Ambari. Ambari extracts Knox and >> > creates >> > >> > > > symlinks >> > >> > > > > > from its conf, logs, pids and data directory to /etc/XXX. >> This >> > >> is >> > >> > > > > possible >> > >> > > > > > here because those directories (/etc/...) don't change. >> > >> > > > > > >> > >> > > > > > >> > >> > > > > > XXXXXXXXX >> > >> > > > > > >> > >> > > > > > >> > >> > > > > > The problems I have with Knox so far (I'm sure I'll find >> more >> > >> the >> > >> > > > > further I >> > >> > > > > > get) are: >> > >> > > > > > >> > >> > > > > > * gateway.sh has no way to take in options from the >> "outside". >> > >> With >> > >> > > > > Hadoop, >> > >> > > > > > HBase, (now) NiFi you can pass in arbitrary Java options >> using >> > >> > > > variables >> > >> > > > > > like HADOOP_JAVA_OPTS and similar. >> > >> > > > > > >> > >> > > > > > In theory all the "setup" is already there for Knox as well >> > >> using >> > >> > > > > variables >> > >> > > > > > like APP_CONF_DIR but unfortunately, they get set to >> hardcoded >> > >> > values >> > >> > > > at >> > >> > > > > > the beginning of the script. >> > >> > > > > > >> > >> > > > > > Proposal: Add at least a APP_JAVA_OPTS variable so I can >> pass >> > in >> > >> > > > > arbitrary >> > >> > > > > > stuff to be added to the Java command line. But really, I'd >> > >> love to >> > >> > > > just >> > >> > > > > > remove the defaults for APP_LOG_DIR etc. IFF they are >> already >> > >> set >> > >> > > > > > externally >> > >> > > > > > >> > >> > > > > > * gateway.sh checks whether various directories exist. >> These >> > are >> > >> > > > > hardcoded >> > >> > > > > > (e.g. APP_HOME_DIR/conf). But those directories are >> > configurable >> > >> > > using >> > >> > > > > > GATEWAY_HOME etc. so those checks should either be removed >> or >> > >> > fixed, >> > >> > > so >> > >> > > > > > they take those variables into account >> > >> > > > > > >> > >> > > > > > * knoxcli create-master takes a --master argument which I >> only >> > >> > found >> > >> > > > out >> > >> > > > > by >> > >> > > > > > looking at Ambari. The source says it's for testing only. >> It >> > >> seems >> > >> > as >> > >> > > > if >> > >> > > > > > that should be documented though. I think it's pretty >> useful >> > to >> > >> > allow >> > >> > > > the >> > >> > > > > > master being created non-interactively >> > >> > > > > > >> > >> > > > > > * gateway.sh does allow one thing to be overridden >> externally >> > >> and >> > >> > > that >> > >> > > > is >> > >> > > > > > the pid dir using ENV_PID_DIR. Unfortunately, knox-env.sh >> > >> (which is >> > >> > > > being >> > >> > > > > > sourced unconditionally) overrides this variable with an >> empty >> > >> > > value. I >> > >> > > > > > think this line should just be removed from knox-env.sh >> > >> > > > > > >> > >> > > > > > * Launcher looks for a file called gateway.cfg but it >> always >> > and >> > >> > > > > > unconditionally looks in its "own" directory >> (launcherDir). I >> > >> need >> > >> > a >> > >> > > > way >> > >> > > > > to >> > >> > > > > > point this to a different location. It allows me to define >> > >> > > GATEWAY_HOME >> > >> > > > > as >> > >> > > > > > a system property. While I can also define that as an >> > >> environment >> > >> > > > > variable >> > >> > > > > > the System property is checked first. And if it finds a >> > >> > > > gateway-site.xml >> > >> > > > > > there it uses that. I need it to use the one from the >> > >> environment >> > >> > > > > variable. >> > >> > > > > > >> > >> > > > > > * gateway.sh allows the process to run in the foreground >> but >> > >> still >> > >> > > > > captures >> > >> > > > > > stdout & stderr to files. I would argue that it makes more >> > >> sense to >> > >> > > > leave >> > >> > > > > > them as is and print them to the console instead. >> > >> > > > > > >> > >> > > > > > I'm happy to create issues for all of these and also >> provide >> > >> > patches >> > >> > > > for >> > >> > > > > > some/all of them depending on my available time. I just >> wanted >> > >> to >> > >> > > bring >> > >> > > > > > this up before I started to see if anyone has any better >> ideas >> > >> > and/or >> > >> > > > > > things that I might have missed. >> > >> > > > > > >> > >> > > > > > Thanks for reading! >> > >> > > > > > >> > >> > > > > > Cheers, >> > >> > > > > > Lars >> > >> > > > > > >> > >> > > > > >> > >> > > > >> > >> > > >> > >> > >> > >> >> > > >> > >> >