From a behavioral point of view, I mostly agree that there is no difference between a service that is missing and a service that is incorrect. Do you get a service connectivity error if a service isn’t declared in a topology? I suspect it’s a 401 from Knox in that case, which is different from what you get for a service with an incorrect URL.
In some cases, we can default to what we know should work, especially when there is no valid alternative. For instance, Hive support requires the HTTP transport mode, so we can always discover the HTTP URL whether or not the component is correctly configured for HTTP transport; then, as you’ve said, the component config can be corrected, and the Knox proxy will just work.

Even for the Hive service, though, some properties are likely to be incorrect until the component configuration is modified (e.g., hive.server2.thrift.http.path has no value by default). So even the default in this case won’t just start working once the component configuration is corrected; the topology will have to be regenerated. In the instructions for configuring Hive for the HTTP transport mode (http://knox.apache.org/books/knox-0-13-0/user-guide.html#Hive), the specified HTTP path is “cliservice”, but could that not be any arbitrary value? By default this property has no value, so we would generate http://HIVESERVER2_HOST:HTTP_PORT/. When the aforementioned instructions are followed, the actual URL will be http://HIVESERVER2_HOST:HTTP_PORT/cliservice, and the topology will still be incorrect. If we default the path to “cliservice” and the user specifies “donttellmewhattodo”, the result is the same.

This is all to say that defaulting the Hive service URL will still be troublesome, but there are certainly services for which a reasonable default is plausible. In other cases, the URL could be entirely invalid (e.g., missing config properties), but a configuration change noticed by a configuration monitor (i.e., KNOX-1013) could eventually resolve that. For these cases, I think we’re in agreement that the services can be omitted from the generated topology, since the source descriptor will still have the declarations.

Thanks for the insight. I think we’re close to a good compromise.
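
To make the Hive path problem concrete, here is a minimal sketch of how the URL changes with the configured path. It is not Knox's discovery code: the function name, the example host, and the choice to return None when the port is missing are assumptions for illustration. hive.server2.thrift.http.port is the standard Hive property for the HTTP port and is not mentioned above.

```python
from typing import Mapping, Optional

# Hive property names; only the path property is named in this thread.
HTTP_PORT = "hive.server2.thrift.http.port"
HTTP_PATH = "hive.server2.thrift.http.path"


def hive_url(host: str, config: Mapping[str, str]) -> Optional[str]:
    """Build the HiveServer2 HTTP URL from component config (hypothetical helper).

    Returns None when the port is missing, i.e. the URL cannot be determined.
    """
    port = config.get(HTTP_PORT)
    if not port:
        return None
    # An unset path yields a bare "/" URL, as described above.
    path = (config.get(HTTP_PATH) or "").strip("/")
    return f"http://{host}:{port}/{path}"


# Path unset at discovery time, then set later by the admin.
before = hive_url("hs2.example.com", {HTTP_PORT: "10001"})
after = hive_url("hs2.example.com", {HTTP_PORT: "10001", HTTP_PATH: "cliservice"})
print(before)
print(after)
```

The two results differ, so a topology generated before the path was set stays wrong until it is regenerated, whatever default we pick.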
--
Phil

On 9/26/17, 11:04 AM, "larry mccay" <[email protected]> wrote:

Hi Phil -

Thanks for bringing this up for discussion.

I do agree with the descriptor author's intent but at the same time, they also intend for the others to be available.

There isn't much difference between a topology with service elements that can't be reached and one without the service elements in it.
More than likely, when you deploy a topology and can't access a service - like HIVE - you will go to Ambari to check the status of the service. In this case you will notice that it isn't deployed or configured correctly - like in http mode. You take the actions in Ambari and the service is now accessible. Having to go and add it to the topology after that shouldn't be necessary.

I think that we could consider how the monitoring of the discovery service is going to be driven. If it is driven by the simple descriptor - which makes sense - then I think that it could result in a topology with only those services that can be discovered. As long as the others are still in the descriptor they can be discovered later and the topology automagically gets updated with the additions.

This gives us a situation where only "correct" topologies are deployed and they will be autocorrecting as others come online. Even the HIVE situation would fix itself just by putting it in the right mode.

My suggestion would be to skip those that can't be fully discovered and log each one as WARNING.
Monitoring of discovery service based on descriptors rather than topologies would be able to correct as appropriate.

What do you think?

thanks,

--larry

On Tue, Sep 26, 2017 at 10:42 AM, Philip Zampino <[email protected]> wrote:

> I’ve been thinking about the behavior wrt topology generation when the
> URL(s) for a service declared in a simple descriptor cannot be
> correctly/completely determined.
>
> The options available include:
>
>
> 1. Abort the topology generation because we can’t produce what has been
> requested.
>
> 2. Complete the topology generation without those services whose URL(s)
> could not be determined.
> The unresolved services could be omitted or commented-out in the resulting
> topology file.
>
> 3. Complete the topology generation, allowing the descriptor deployer
> the opportunity to “fill in the blanks”
> This will result in Knox deploying a topology it knows to be incorrect.
> API deployments may not afford the deployer the opportunity to “fill in
> the blanks” (e.g., Ambari-driven deployments).
>
>
> My initial feeling on this is that we should not produce anything less
> than what the descriptor declares (i.e., #1). After all, the declared
> services are in the descriptor precisely because someone wants to access
> them through Knox.
>
> I could possibly be persuaded that producing a partial topology (i.e., #2)
> may be acceptable, but it’s still not what the descriptor author
> intends/requires.
>
> I don’t believe Knox should ever produce or deploy a topology it knows to
> be incorrect (i.e., #3).
>
> One example, which came up during the review of KNOX-1014, is HIVE. If the
> hiveserver2 component is not configured for HTTP transport, then there is
> no valid URL for that service, as far as Knox is concerned. In this case, I
> think we must abort the topology generation or omit the HIVE service from
> the generated topology.
>
> Interested in your thoughts…
>
> --
> Phil
>
>
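
For illustration, option 2 combined with the suggestion to log each skipped service as a WARNING could look roughly like the sketch below. It is not taken from the Knox code base: resolve_services, the discover callback, the logger name, and the example URL are all hypothetical.

```python
import logging
from typing import Callable, Dict, Iterable, List, Optional

log = logging.getLogger("topology-generation")  # hypothetical logger name


def resolve_services(
    declared: Iterable[str],
    discover: Callable[[str], Optional[List[str]]],
) -> Dict[str, List[str]]:
    """Keep only the declared services whose URLs could be discovered."""
    resolved: Dict[str, List[str]] = {}
    for name in declared:
        urls = discover(name)
        if not urls:
            # The service stays in the descriptor, so a later discovery
            # pass can add it once the component is configured correctly.
            log.warning("Skipping service %s: URL(s) could not be determined", name)
            continue
        resolved[name] = urls
    return resolved


# Stand-in for cluster discovery: HIVE has no valid URL yet.
known = {"WEBHDFS": ["http://nn.example.com:50070/webhdfs"], "HIVE": []}
topology_services = resolve_services(["WEBHDFS", "HIVE"], known.get)
print(topology_services)
```

Because the descriptor rather than the generated topology remains the source of truth, rerunning the same function after the configuration is fixed picks up the previously skipped service.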
