On 08/08/2017 09:21 AM, Ademar Reis wrote:
> On Tue, Aug 08, 2017 at 01:01:26PM +0200, Lukáš Doktor wrote:
>> Hello guys,
>>
>> I'm sorry for such a late response, I totally forgot about this email
>> (thanks, Ademar, your response reminded me of it).
>>
>> On 7.8.2017 at 23:49, Ademar Reis wrote:
>>> On Tue, Aug 01, 2017 at 03:37:34PM -0400, Cleber Rosa wrote:
>>>> Even though Avocado has had a parameter passing system for
>>>> instrumented tests almost from day one, it has been intertwined with
>>>> the varianter (then multiplexer), and this is fundamentally wrong.
>>>> The most obvious example of this broken design is the `mux-inject`
>>>> command line option::
>>>>
>>>>   --mux-inject [MUX_INJECT [MUX_INJECT ...]]
>>>>                         Inject [path:]key:node values into the final
>>>>                         multiplex tree.
>>>>
>>>> The design is broken not because such varianter implementations can
>>>> be tweaked over the command line; that's fine.  It's broken because
>>>> this is the recommended way of passing parameters on the command
>>>> line.
>>>>
>>>> The varianter (or any other subsystem) should be able to act as a
>>>> parameter provider, but it cannot dictate that parameters must first
>>>> be nodes/keys/values of its own internal structure.
>>>
>>> Correct.  It's broken because it violates several layers.  There
>>> would be nothing wrong with something like "--param
>>> [prefix:]<key:value>", for example (more below).
>>>
>> Well, I wouldn't call it broken.  The implementation is fine; we only
>> lack other providers that would allow injecting just params, so people
>> are abusing `mux-inject` for that.
>>
>>>>
>>>> The proposed design
>>>> ===================
>>>>
>>>> A diagram has been used on a few different occasions to describe how
>>>> the parameters and variants generation mechanism should be connected
>>>> to a test and to the overall Avocado architecture.
>>>> Here it is, in its original form::
>>>>
>>>>            +------------------------------+
>>>>            |             Test             |
>>>>            +------------------------------+
>>>>                           |
>>>>                           |
>>>>                 +---------v---------+      +--------------------------------+
>>>>                 | Parameters System |----->| Variants Generation Plugin API |
>>>>                 +-------------------+      +--------------------------------+
>>>>                           ^                                ^
>>>>                           |                                |
>>>>     +--------------------------------------+               |
>>>>     | +--------------+ +-----------------+ |               |
>>>>     | | avocado-virt | | other providers | |               |
>>>>     | +--------------+ +-----------------+ |               |
>>>>     +--------------------------------------+               |
>>>>                                                            |
>>>>                                    +-----------------------+-----+
>>>>                                    |                             |
>>>>                        +--------------------+   +-------------------------+
>>>>                        | Multiplexer Plugin |   | Other variant plugin(s) |
>>>>                        +--------------------+   +-------------------------+
>>>>                                    |
>>>>                                    |
>>>>                        +-----------v---------------------+
>>>>                        | +------------+ +--------------+ |
>>>>                        | | --mux-yaml | | --mux-inject | |
>>>>                        | +------------+ +--------------+ |
>>>>                        +---------------------------------+
>>>>
>>>> Given that the "Parameter System" is the entry point into the
>>>> parameter providers, it should provide two different interfaces:
>>>>
>>>> 1) An interface for its users, that is, developers writing
>>>>    `avocado.Test` based tests
>>>>
>>>> 2) An interface for developers of additional providers, such as the
>>>>    "avocado-virt" and "other providers" boxes on the diagram.
>>>>
>>>> The current state of the first interface is the ``self.params``
>>>> attribute.  Hopefully, it will be possible to keep its current
>>>> interface, so that tests won't need any kind of compatibility
>>>> adjustments.
>>>
>>> Right.  The way I envision the parameters system includes a
>>> resolution mechanism, the "path" currently used in params.get().
>>> This adds extra specificity for the user who requests a parameter.
>>>
>>> But these parameters can be provided by any entity.
>>> In the diagram above, they're part of the "Parameter System" box.
>>> Examples of "other providers" could be support for a configuration
>>> file or a "--param=[prefix:]<key:value>" command line option.
>>>
>>> The Test API to request parameters should be shared.  To a test it
>>> doesn't matter where a parameter is coming from.  Parameters are
>>> always accessed through an API like:
>>>
>>>     params.get(<key>, [prefix], [fallback value])
>>>
>>> Where:
>>>
>>> * key (mandatory) is the configuration variable the user wants.
>>>
>>> * prefix (optional) is used to find the right key and can be
>>>   partially resolved, from right to left.  As an example, a
>>>   `params.get("bar", "name")` would be fulfilled by the parameter
>>>   system if there are entries like "bar:name", "foobar:name" or
>>>   "/foo/bar:name".  We're using '/' in the multiplexer to separate
>>>   branch levels, thus instead of "prefix" we're calling it "path",
>>>   but the mechanism is the same and IMO should be generic to avoid
>>>   confusion (a path at the parameter level is different from a path
>>>   at the multiplexer level).
>>>
>>>   params.get() can be even more flexible, supporting regexps and
>>>   globs...  The objective of [prefix] is to serve as a filter and to
>>>   guarantee that the caller gets the variable they want.  It's
>>>   similar to a namespace.
>>>
>>> * fallback (optional) is the value returned if the parameter system
>>>   can't resolve prefix+key.
>>>
>>> Users who don't want any specificity and/or have a small test base
>>> with a low chance of clashes could simply ignore the prefix, both
>>> when creating parameters and when making calls to params.get().
>>>
>>>>
>>>> The second item will probably mean the definition of a new class in
>>>> the ``avocado.core.plugin_interfaces`` module, together with a new
>>>> dispatcher(-like) implementation in ``avocado.core.dispatcher``.
>>>>
>>>> Parameters availability: local vs. external
>>>> ===========================================
>>>>
>>>> Right now, all parameters are given to the test at instantiation
>>>> time.  Let's say that in this scenario, all parameters are *local*
>>>> to the test.  Local parameters have the benefit that the test is
>>>> self-contained and doesn't need any external communication.
>>>>
>>>> In theory, this is also a limitation, because all parameters must be
>>>> available before the test is started.  Even if other parameter
>>>> system implementations are possible, with a local approach there
>>>> would be a number of limitations.  For long-running tests that may
>>>> depend on parameters generated during the test, this would be a
>>>> blocker.  Also, if a huge number of parameters were available (think
>>>> of a huge cloud or database of parameters), they would all have to
>>>> be copied to the test at instantiation time.  Finally, it also means
>>>> that the source of parameters would need to be able to iterate over
>>>> all the available parameters so that they can be copied, which can
>>>> be a further limitation for cascading implementations.
>>>>
>>>> An external approach to parameters would be one where the test holds
>>>> a handle to a broker of parameter providers.  The parameter
>>>> resolution would be done at run time.  This avoids the copying of
>>>> parameters, but requires runtime communication with the parameter
>>>> providers.  This can make the test execution much more fragile and
>>>> dependent on the external communication.  Even by minimizing the
>>>> number of communication endpoints by communicating with the test
>>>> runner only, it can still add significant overhead, latency and
>>>> points of failure to the test execution.
>>>>
>>>> I believe that, at this time, the limitations imposed by local
>>>> parameter availability do *not* outweigh the problems that an
>>>> external approach can bring.  In the future, if advanced use cases
>>>> require an external approach to parameters availability, this can be
>>>> reevaluated.
>>>
>>> If I understand your point correctly, this is an implementation
>>> detail.  It depends on what the "contract" is between the test runner
>>> (parameter provider) and the test being run (the user of
>>> params.get()).
>>
>> The only difference for the scope of this RFC I see is that the
>> "lazy" params can't be merged with "static" params before test
>> execution (otherwise it'd effectively make them static as well).
>> This is quite an important difference to consider further in the
>> design...
>>
>>>
>>> For example, should users assume parameters are dynamic and can
>>> change during the lifetime of a test, and therefore two identical
>>> calls to params.get() might return different values?  Should it be
>>> possible to change params (something like params.put()) at runtime?
>>
>> Definitely no params changing.
>>
>>>
>>> (IMO the answer should be no to both questions).
>
> The above should be properly specified.
>
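The semantics discussed above (static, read-only parameters, resolved via a right-to-left partial prefix/path match with an optional fallback) can be made concrete with a small sketch. This is a hypothetical toy, not Avocado's actual implementation: the `ParamsSketch` class name and the `{path: {key: value}}` entry layout are invented for illustration.

```python
# Hypothetical sketch of right-to-left partial prefix resolution.
# NOT actual Avocado code; class and data layout are invented.

class ParamsSketch:
    def __init__(self, entries):
        # entries: {"path": {key: value}}, e.g. {"/foo/bar": {"name": "vm1"}}
        # Plain data handed over once, matching the "local" model: there is
        # deliberately no put()/set() method, so params cannot change mid-test.
        self._entries = dict(entries)

    def get(self, key, prefix="", default=None):
        matches = []
        for path, params in self._entries.items():
            # Right-to-left resolution: the requested prefix only has to
            # match the *end* of the stored path ("bar" matches "/foo/bar").
            if path.endswith(prefix) and key in params:
                matches.append(params[key])
        if not matches:
            return default          # fallback when prefix+key can't resolve
        if len(matches) > 1:
            # Ambiguity is an error rather than a silent pick.
            raise ValueError("multiple entries provide key %r" % key)
        return matches[0]

params = ParamsSketch({"/foo/bar": {"name": "vm1"},
                       "/baz": {"name": "vm2"}})
print(params.get("name", "bar"))       # resolved via "/foo/bar" -> vm1
print(params.get("speed", "bar", 60))  # unresolvable -> fallback 60
```

Because every entry is plain data provided at instantiation time, no provider needs to be contacted while the test runs, which is the "local" behavior argued for above.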
Agreed, let's specify it as a negative: users won't have an interface
to add/change/delete parameters during the execution of a test.

>>>
>>> If you have something different in mind, then it would be
>>> interesting to see some real use-cases.
>>>
>>>>
>>>> Namespaces (AKA how/if we should merge parameters)
>>>> ==================================================
>>>>
>>>> Currently, the parameter fetching interface already contains at its
>>>> core the concept of paths[1].  In theory, this is sufficient to
>>>> prevent clashes between keys that have the same name but are
>>>> intended to configure different aspects of a test.
>>>>
>>>> Now, with the proposed implementation of multiple providers for the
>>>> parameter system, the question of how they will be combined comes
>>>> up.
>>>>
>>>> One way is for each implementation to claim, based on some unique
>>>> attribute such as its own name, a part of a tree path.  For
>>>> instance, for two implementations:
>>>>
>>>> 1) variants
>>>> 2) plain_ini
>>>>
>>>> Users could access parameters explicitly defined on those by
>>>> referring to paths such as:
>>>>
>>>>     self.params.get('speed', path='/plain_ini', default=60)
>>>>
>>>> or
>>>>
>>>>     self.params.get('speed', path='/variants/path/inside/varianter',
>>>>                     default=60)
>>>>
>>>> This clearly solves the clashes, but binds the implementation to the
>>>> tests, which should be avoided at all costs.
>>>
>>> So you're providing this as an example of why it's a bad idea...
>>> OK. :)
>>>
>> Yep, I don't see this as the way to go.  One should be able to execute
>> the same test with different providers (e.g. a JSON file with params
>> generated by Jenkins).
>>
>>>>
>>>> One other approach would be to merge everything into the tree root
>>>> node.  By doing this, one parameter provider could effectively
>>>> override the values in another parameter provider, given that both
>>>> used the same paths for a number of parameters.
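The "merge everything into the tree root" approach mentioned above amounts to an ordered merge of per-provider mappings, where a later provider silently wins whenever two providers define the same key under the same path. A hypothetical sketch (the function and provider names are invented for illustration, not Avocado code):

```python
# Sketch of the "merge into the tree root" approach: each provider
# contributes a flat {path: {key: value}} mapping; merging in order
# means later providers override earlier ones on identical paths/keys.

def merge_providers(*providers):
    merged = {}
    for provider in providers:  # later providers win on clashes
        for path, params in provider.items():
            merged.setdefault(path, {}).update(params)
    return merged

variants = {"/run": {"speed": 60, "arch": "x86_64"}}
plain_ini = {"/run": {"speed": 90}}   # same path and key as "variants"

merged = merge_providers(variants, plain_ini)
print(merged["/run"]["speed"])   # 90: plain_ini silently overrode variants
print(merged["/run"]["arch"])    # x86_64: non-clashing value survives
```

Note that this approach requires every provider to enumerate all of its parameters up front so they can be copied into the merged tree, which is exactly why it cannot accommodate "lazy" providers.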
>>
>> This wouldn't be possible with "lazy" providers, as it'd require
>> iterating through all params.
>>
>>>>
>>>> Yet another approach would be to *not* use paths, and resort to
>>>> completely separate namespaces.  A parameter namespace would be an
>>>> additional level of isolation, which can quickly become exceedingly
>>>> complex.
>>
>> If I understood it correctly, this is what I'd go with.  The way I
>> envision this is:
>>
>> 1. The varianter yields a variant, which also contains params in the
>>    form of a list of namespaces associated with params:
>>    `variant = [{"/run/variant": {"foo": "bar"}},]`
>> 2. The runner attaches params from the cmdline:
>>    `cmdline = [{"/1": {"foo": "bar"}}, {"/2": {"foo": "bar"}}]`
>> 3. The runner attaches params from the avocado-virt plugin:
>>    `virt = [{"/virt": {"image_name": "foo"}}]`
>> 4. The runner executes the test and passes the params definition
>>    there:
>>    {"params": [("varianter", variant), ("cmdline", cmdline),
>>                ("virt", virt)],
>>     "order": ["varianter", "__ALL__", "cmdline"]}
>>
>> Now the NewAvocadoParams object should create an AvocadoParams per
>> provider (varianter, cmdline, virt), being able to query params from
>> any of them.  The process of getting params would get a bit more
>> complicated for Avocado, but for the user nothing changes.  Let's say
>> the user issues params.get("foo", "/*"):
>>
>> 1. NewAvocadoParams looks at the "order"
>> 2. The varianter is asked for a value and reports "bar"; as this is a
>>    valid response, no further provider is queried and "bar" is
>>    returned
>>
>> Now for params.get("foo", "/1/*"):
>>
>> 1. NewAvocadoParams looks at the "order"
>> 2. The varianter is asked and reports no match
>> 3. __ALL__ means: ask the providers without an assigned priority,
>>    which applies to "virt" in this example; it reports no match
>> 4. cmdline is asked and returns "bar"
>>
>> And params.get("missing"):
>>
>> 1. NewAvocadoParams looks at the "order"
>> 2. varianter => no match
>> 3. virt => no match
>> 4. cmdline => no match
>> 5. The default is reported (None)
>>
>> The "order" is important; by default I'd suggest
>> varianter => plugins => params => default_params (basically what
>> Ademar suggested and we all agreed on many times), but it should be
>> possible to set it if necessary (e.g. when the user has multiple
>> plugins and knows the order).
>>
>> Now in case of a clash, I believe it should report the clash even
>> though further providers might be able to respond.  Let's change the
>> order to ["cmdline", "__ALL__"] and query params.get("foo", "/*"):
>>
>> 1. cmdline is asked; it raises ValueError('Leaves [/1, /2] contain
>>    key "foo"\n /1:foo:bar\n /2:foo:bar'), which is forwarded to the
>>    test.
>>
>> This scheme would work for "lazy" providers, as a provider would only
>> be asked for a specific key, and only when needed.
>

Your example indeed fits what would be necessary to have "lazy"
providers (what I call parameters that are not "local" to the test).
But please take a look at the other points raised in other responses in
this thread.  I think it's a good idea to stick with data-only
parameters to tests, which means no "lazy" providers.

> So it looks like we agree here, but I'm not familiar enough with the
> implementation details to follow everything.  Cleber, can you confirm
> and clarify what you had in mind?
>

We're in sync about what users should expect and the general
functionality.  As I just wrote, I'm confident that the test can be
kept isolated from almost all of the parameter code, receiving only the
parameter data.  It'd only need a simple wrapper to access the
parameter data it already possesses: that would be the role of
self.params.get().

That means a different *implementation* than the one Lukáš exemplified
above.  It also means no "lazy" providers.

>>>
>>> I think using paths is confusing because it mixes concepts which are
>>> exclusive to the multiplexer (a particular implementation of the
>>> varianter) with an API that is shared by all other parameter
>>> providers.
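The ordered, per-provider lookup walked through above can be sketched in a few lines. This is a toy model of the proposal, not Avocado code: the `OrderedParams` class and its helpers are invented, and path matching is simplified to "exact namespace or anything beneath it" rather than real glob/regex support.

```python
# Toy model of the proposed ordered lookup: each provider is a list of
# {namespace_path: {key: value}} entries; "order" decides who answers
# first, with "__ALL__" standing for providers without an explicit slot.

class OrderedParams:
    def __init__(self, providers, order):
        self.providers = providers   # {"varianter": [...], "cmdline": [...]}
        self.order = order           # e.g. ["varianter", "__ALL__", "cmdline"]

    def _expanded_order(self):
        listed = [n for n in self.order if n != "__ALL__"]
        rest = [n for n in self.providers if n not in listed]
        for name in self.order:
            if name == "__ALL__":
                yield from rest      # providers without an assigned priority
            else:
                yield name

    @staticmethod
    def _path_matches(namespace, pattern):
        # Simplified matching: "/1/*" matches "/1" and anything under it.
        prefix = pattern[:-2] if pattern.endswith("/*") else pattern
        return (prefix == "" or namespace == prefix
                or namespace.startswith(prefix + "/"))

    def get(self, key, path="/*", default=None):
        for name in self._expanded_order():
            hits = [(ns, params[key])
                    for entry in self.providers[name]
                    for ns, params in entry.items()
                    if self._path_matches(ns, path) and key in params]
            if len(hits) > 1:
                # A clash is reported immediately, even if a later
                # provider could have answered unambiguously.
                raise ValueError("Leaves %s contain key %r"
                                 % ([ns for ns, _ in hits], key))
            if hits:
                return hits[0][1]    # first provider with a match wins
        return default
```

A usage run matching the walkthrough, with `variant`, `cmdline` and `virt` as in the example:

```python
providers = {"varianter": [{"/run/variant": {"foo": "bar"}}],
             "cmdline": [{"/1": {"foo": "bar"}}, {"/2": {"foo": "bar"}}],
             "virt": [{"/virt": {"image_name": "foo"}}]}
p = OrderedParams(providers, ["varianter", "__ALL__", "cmdline"])
print(p.get("foo", "/*"))     # varianter answers first: bar
print(p.get("foo", "/1/*"))   # varianter/virt miss, cmdline answers: bar
print(p.get("missing"))       # nobody matches: None
```

Since each provider is only asked for one specific key when it is needed, this lookup order would indeed accommodate "lazy" providers, which is the property Lukáš highlights.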
>>>
>>> For example, when you say "merge everything into the tree root
>>> node", are you talking about namespace paths, or paths as used by
>>> the multiplexer when the "!mux" keyword is present?
>>
>> There is no difference between a namespace path and a path in the
>> multiplexer.  The only difference is that the multiplexer can provide
>> a different slice, but the namespace is always the same, only not
>> existing in some slices.
>
> The multiplexer is just one of many possible "varianters".  In the
> multiplexer the path (tree) has well-defined semantics, which are
> non-trivial: paths can be namespaces, or used for "multiplexing" when
> the "!mux" keyword is used.
>
> I've been thinking of variants as having a prefix, but not a path (a
> dictionary, not a tree).  But maybe a tree is better, assuming it
> works as a good abstraction for other "varianters".  We need an
> authoritative specification somewhere.
>

Right.  Lukáš, can you compose a simple section on this topic?  Do you
think variants (thinking abstractly about them) would match nicely with
the use of a tree-based path?  All of the previous responses about the
parameter system have mentioned a regex-based "prefix", so this is
quite an important point.

- Cleber.

-- 
Cleber Rosa
[ Sr Software Engineer - Virtualization Team - Red Hat ]
[ Avocado Test Framework - avocado-framework.github.io ]
[ 7ABB 96EB 8B46 B94D 5E0F E9BB 657E 8D33 A5F2 09F3 ]