On 2.5.2016 at 21:06, Ademar Reis wrote:
> On Fri, Apr 29, 2016 at 09:31:56AM +0200, Lukáš Doktor wrote:
>> On 28.4.2016 at 22:28, Cleber Rosa wrote:
>>>
>
> Hi.
>
> I'll respond on this thread, bringing some of my comments from
> the other reply. I would like Cleber to do the same there, so
> hopefully we can converge on a few ideas before a v5.
>
Hi, thanks. I was about to write v5, so early feedback will help shape it.
>>> On 04/28/2016 12:10 PM, Lukáš Doktor wrote:
>>>> Hello again,
>>>>
>>>> This version removes the rejected variants and hopefully clarifies all
>>>> the goals needed to make multi-stream (and also multi-host) tests
>>>> available.
>>>>
>>>> Changes:
>>>>
>>>>     v2: Rewritten from scratch
>>>>     v2: Added examples for the demonstration to avoid confusion
>>>>     v2: Removed the mht format (which was there to demonstrate manual
>>>>         execution)
>>>>     v2: Added 2 solutions for multi-tests
>>>>     v2: Described ways to support synchronization
>>>>     v3: Renamed to multi-stream as it befits the purpose
>>>>     v3: Improved introduction
>>>>     v3: Workers are renamed to streams
>>>>     v3: Added example which uses library, instead of new test
>>>>     v3: Multi-test renamed to nested tests
>>>>     v3: Added section regarding Job API RFC
>>>>     v3: Better description of the Synchronization section
>>>>     v3: Improved conclusion
>>>>     v3: Removed the "Internal API" section (it was a transition between
>>>>         no support and "nested test API", not a "real" solution)
>>>>     v3: Using per-test granularity in nested tests (requires plugins
>>>>         refactor from Job API, but allows greater flexibility)
>>>>     v4: Removed "Standard python libraries" section (rejected)
>>>>     v4: Removed "API backed by cmdline" (rejected)
>>>>     v4: Simplified "Synchronization" section (only describes the
>>>>         purpose)
>>>>     v4: Refined all sections
>>>>     v4: Improved the complex example and added comments
>>>>     v4: Formulated the problem of multiple tasks in one stream
>>>>     v4: Rejected the idea of bounding it inside MultiTest class
>>>>         inherited from avocado.Test, using a library-only approach
>>>>
>>>>
>>>> The problem
>>>> ===========
>>>>
>>>> Allow tests to have some of their blocks of code run in separate
>>>> stream(s). We'll discuss the range of a "block of code" further in the
>>>> text, as well as what a stream stands for.
>>>>
>>>> One example could be a user who wants to run netperf on 2 machines,
>>>> which requires the following manual steps:
>>>>
>>>>     stream1: netserver -D
>>>>     stream1: # Wait till netserver is initialized
>>>>     stream2: netperf -H $machine1 -l 60
>>>>     stream2: # Wait till it finishes and report the results
>>>>     stream1: # stop the netserver and report possible failures
>>>>
>>>> The test would have to contain the code for both stream1 and stream2,
>>>> and it executes them in two separate streams, which might or might not
>>>> be executed on the same machine.
>>>>
>>>
>>> Right, this clearly shows that the use case is "user wants to write/run
>>> a test", which is right on Avocado's business. Then, "by the way", he
>>> wants to leverage netperf for that, which is fine (but not directly
>>> related to this proposal). Oh, and BTW (again), the test requires a part
>>> of it to be run in a different place. Checks all necessary boxes IMO.
>
> Agree. It's a very good example.
>
> I think NetPerf and a QEMU Migration Test would be two reference
> implementations for multi-stream tests.
>
>>>
>>>> Some other examples might be:
>>>>
>>>> 1. A simple stress routine being executed in parallel (the same or
>>>>    different hosts)
>>>>    * utilize a service under testing from multiple hosts (stress test)
>>>
>>> The "test requires a part of it to be run in a different place"
>>> requirement again. Fine.
>>>
>>>> 2. Several code blocks being combined into complex scenario(s)
>>>>    * netperf + other test
>>>>    * multi-host QEMU migration
>>>>    * migrate while changing interfaces and running cpu stress
>>>
>>> Yep, sounds like the very same requirement, just more imaginative
>>> combinations.
>>>
>>>> 3. Running the same test along with stress test in background
>>>>    * cpu stress test + cpu hotplug test
>>>>    * memory stress test + migration
>>>>
>>>
>>> Here, "a different place" is "the same place", but it's still seen as a
>>> separate execution stream.
>>> The big difference is that you mention "test x" + "test y". I know
>>> what's coming, but it gives it away that we're possibly talking about
>>> having "blocks of code" made out of other tests. IMO, it sounds good.
>
> I also think these examples deserve some extra words. Please
> explain the use-cases and the motivation for them. Saying they're
> real world examples from avocado-vt will also help.
>
OK, I'll try to add a few bits.
>>>
>>>>
>>>> Solution
>>>> ========
>>>>
>>>>
>>>> Stream
>>>> ------
>>>>
>>>> From the introduction you can see that "Stream" stands for a "Worker"
>>>> which allows executing code in parallel to the main test routine,
>>>> and the main test routine can offload tasks to it. The primary
>>>> requirement is to allow this execution on the same as well as on a
>>>> different machine.
>>>>
>>>>
>>>
>>> Is a "Worker" a proper entity? In status parity with a "Stream"? Or is
>>> it a synonym for "Stream"?
>>>
>>> Or maybe you meant that "Stream stands for a worker" (lowercase)?
>> Yep, it should be lowercase.
>>
>>>
>>>> Block of code
>>>> -------------
>>>>
>>>> Throughout the first 3 versions we discussed what the "block of code"
>>>> should be. The result is an avocado.Test compatible class, which follows
>>>> the same workflow as a normal test and reports the results back to the
>>>> stream. It is not the smallest piece of code that could theoretically be
>>>> executed (think of functions), but it has many benefits:
>>>>
>>>> 1. Well known structure including information in case of failure
>>>> 2. Allows simple development of components (in form of tests)
>>>> 3. Allows re-using existing tests and combining them into complex
>>>>    scenarios
>>>>
>>>> Note: Smaller pieces of code can still be executed in parallel without
>>>> the framework support using standard python libraries (multiprocessing,
>>>> threading). This RFC focuses on simplifying the development of
>>>> complex cases, where a test as the minimal block of code fits quite well.
>>>>
>>>>
>>>
>>> Sounds good.
>>>
>
> Like I said in my other reply, even though here we have a more
> abstract definition of what the "block of code" would be,
> everywhere else in the RFC we still see "tests" as actual
> references to it: "resolving the tests", "tests inside a stream",
> "combine tests".
>
> Cleber, are we in sync here? (just to make sure your "sounds
> good" doesn't get misinterpreted)
>
I can try modifying all the wording, but in the end they need to be
test-like classes and they must follow the same runner, otherwise I
would not be able to use the API. Without it, it's yet another
`multiprocessing` library which won't help with debugging of possible
failures. So I hope this is clear. Cleber, Ademar, (others), is this
acceptable? If not, then I'd like to ask for a counter-proposal, because
I don't see a suitable alternative.
>>>> Resolving the tests
>>>> -------------------
>>>>
>>>> String
>>>> ~~~~~~
>>>>
>>>> As mentioned earlier, the `stream` should be able to handle
>>>> avocado.Test-like classes, which means the test needs to find one.
>>>> Luckily, avocado already has such a feature as part of its internal API.
>>>> I'd like to use it by passing a string `test reference` to the stream,
>>>> which should resolve it and execute it.
>>>>
>>>
>>> Just to make it even more clear, this could also be (re-)written as:
>>>
>>> "... which means the test needs to unambiguously identify his block of
>>> code, which also happens to be a valid avocado.Test."
>>>
>>> Right?
>>>
>>> By setting the reference to be evaluated by the stream, IMHO, you add
>>> responsibilities to the stream. How will it behave in the various error
>>> scenarios? Internally, the stream will most likely use the
>>> loader/resolver, but then it would need to communicate the
>>> loader/resolver status/exceptions back to the test. Looks like this
>>> could be better layered.
>>>
>> I think it'd be convenient.
>> For me the `stream.run_bg` means: Hey, stream, please add this task and
>> report when it's scheduled, and it should report the id (index) to
>> identify its results later.
>>
>> The `stream.run_fg` means: Hey, stream, please run this and report when
>> it's done. So it simply reports an invalid test when it fails to
>> resolve it.
>
> Like I've also said in the other e-mail, I believe using tests as
> the abstraction for what is run in a stream/worker is
> fundamentally wrong, because it breaks the abstractions of Job
> and Test. You're basically introducing sub-tests, or even
> "sub-jobs" here (more about it later).
>
This was rejected and will not be part of v5 (I'll use only the Resolver
and Local references).
>>
>>>> Resolver
>>>> ~~~~~~~~
>>>>
>>>> Some users might prefer tweaking the resolver. This is currently not
>>>> supported, but is part of the "JobAPI RFC". Once it's developed, we
>>>> should be able to benefit from it and use it to resolve the `test
>>>> references` to test-like definitions and pass them over to the stream.
>>>>
>>>
>>> What do you mean by "tweaking"? I don't think developers of a
>>> multi-stream test would tweak a resolver.
>
> +1.
>
>>>
>> E.g. to pick just a specific loader plugin, or to change the order...
>
> That's Job API.
>
>>
>>> Now, let's look at: "... use it to resolve the `test references` to
>>> test-like definitions ...". There's something slipping there, and
>>> lacking a clearer definition.
>>>
>>> You probably mean: "... use it to resolve the `test references` to "code
>>> blocks" that would be passed to the stream.". Although I don't support
>>> defining the result of the resolver as a "code block", the important
>>> thing here is to define that a resolver API can either return the
>>> `<class user_module.UserTest>` "Python reference/pointer" or some other
>>> opaque structure that is well understood as being a valid and
>>> unambiguous reference to a "code block".
>>>
>>> I see the result of a "resolve()" call returning something that packs
>>> more information besides the `<class user_module.UserTest>` "pointer".
>>> Right now, the closest we have to this are the "test factories".
>
> +1
>
>>
>> Yes, it's not returning the code directly, but a test factory which can
>> be executed by the stream's runner. But IMO this is too detailed for an
>> RFC, so I used the "test-like definitions" (not tests, nor instances). I
>> can use the "opaque structure that is well understood as being a valid
>> and unambiguous reference" to make it clearer.
>>
>>>
>>>> Local reference
>>>> ~~~~~~~~~~~~~~~
>>>>
>>>> Last but not least, some users might prefer keeping the code in one
>>>> file. This is currently also not possible, as the in-stream test class
>>>> would either also be resolved as a main test or it would not be
>>>> resolved by the stream.
>>>>
>>>> We faced a similar problem with deep inheritance and we solved it with
>>>> a docstring tag:
>>>>
>>>>     class MyTest(Test):
>>>>         '''
>>>>         Some description
>>>>         :avocado: disable
>>>>         '''
>>>>         def test(self):
>>>>             pass
>>>>
>>>> which tells the resolver to avoid this class. We can expand it and use,
>>>> for example, "strict" to only execute the class when the full path
>>>> ($FILE:$TEST.$METHOD) is used. This way we could put all the parts in a
>>>> single file and reference the tasks by a full path.
>>>>
>>>> Alternatively we could introduce another class
>>>>
>>>>     class Worker(avocado.Test):
>>>>         pass
>>>>
>>>> and the file loader would detect it and only yield it when the full path
>>>> is provided (similarly to the SimpleTest class).
>>>>
>>>>
>>>
>>> If the loader acknowledges those nested classes as valid `avocado.Test`,
>>> then the resolver can certainly return information about them in the
>>> analog to our current "test factories". This way, the internal (same
>>> file) referencing could indeed be cleanly implemented.
>>>
>> Yes, that's my plan.
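To make the docstring-tag idea above a bit more concrete, here is a rough sketch of how a file loader could decide whether to yield a class. It is only an illustration: `get_docstring_tag` and `should_yield` are hypothetical helpers, the "strict" tag is the proposed (not existing) value, and none of this is Avocado's real loader code.

```python
import re

# Hypothetical tag syntax, modelled on the ":avocado: disable" docstring above.
TAG_RE = re.compile(r"^\s*:avocado:\s+(\w+)\s*$", re.MULTILINE)


def get_docstring_tag(cls):
    """Return the ':avocado: <tag>' value from a class docstring, or None."""
    match = TAG_RE.search(cls.__doc__ or "")
    return match.group(1) if match else None


def should_yield(cls, full_path_used):
    """Decide whether a (hypothetical) file loader should yield this class."""
    tag = get_docstring_tag(cls)
    if tag == "disable":
        return False           # never resolved as a test
    if tag == "strict":
        return full_path_used  # only when $FILE:$TEST.$METHOD was given
    return True                # ordinary test, always yielded


class Disabled:
    """
    Some description
    :avocado: disable
    """


class StrictOnly:
    """
    Meant to run only inside a stream
    :avocado: strict
    """


class Plain:
    """An ordinary test."""
```

With this, a plain directory scan would skip `StrictOnly`, while a stream passing the full path would still get it.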
>>
>>>> Synchronization
>>>> ---------------
>>>>
>>>> Some tests do not need any synchronization, users just need to run them.
>>>> But some multi-stream tests need to be precisely synchronized or they
>>>> need to exchange data.
>>>>
>>>> For synchronization purposes "barriers" are usually used, where a
>>>> barrier guards the entry into a section identified by "name" and "number
>>>> of clients". All parties asking to enter the section are delayed until
>>>> the "number of clients" reach the section (or a timeout occurs). Then
>>>> they are resumed and can enter the section. Any failure while waiting
>>>> for a barrier propagates to the other waiting parties.
>>>>
>>>> One way is to use existing python libraries, but they usually require
>>>> some boilerplate code around them. One of the tasks of the multi-stream
>>>> tests should be to implement a basic barrier interface, which would be
>>>> initialized in `avocado.Streams`, and the details should be propagated
>>>> to the parts executed inside streams.
>>>>
>>>> The way I see this is to implement a simple tcp-based protocol (to allow
>>>> manual debugging) and pass the details to tests inside streams via
>>>> params. So the `avocado.Streams` init would start the daemon and one
>>>> would connect to it from the test by:
>>>>
>>>>     from avocado.plugins.sync import Sync
>>>>     # Connect the sync server on address stored in params
>>>>     # which could be injected by the multi-stream test
>>>>     # or set manually.
>>>>     sync = Sync(self, params.get("sync_server", "/plugins/sync_server"))
>>>>     # wait until 2 tests ask to enter "setup" barrier (60s timeout)
>>>>     sync.barrier("setup", 2, 60)
>>>>
>>>
>>> OK, so the execution streams can react to "test wide" synchronization
>>> parameters. I don't see anything wrong with that at this point.
>>>
>> I thought about a way to solve this and I don't want to add yet another
>> argument. As we have a complex and theoretically abstract params system
>> (tags, not paths), we might use it to pass this information.
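For illustration only, the barrier semantics described above (a name, a number of clients, a timeout, and failures propagating to every waiting party) can be tried out in a single process with the standard `threading.Barrier`. `LocalSync` is a made-up stand-in; the proposed tcp-based sync server and its `Sync` client do not exist yet and would of course work across machines.

```python
import threading


class LocalSync:
    """In-process stand-in for the proposed sync server (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._barriers = {}

    def barrier(self, name, no_clients, timeout=None):
        """Block until `no_clients` callers asked to enter barrier `name`."""
        with self._lock:
            bar = self._barriers.get(name)
            if bar is None:
                bar = self._barriers[name] = threading.Barrier(no_clients)
        # If one party times out, the barrier breaks and every waiting party
        # gets threading.BrokenBarrierError (the failure propagates).
        bar.wait(timeout)


def run_demo():
    """The client may only start once the server has reached "setup"."""
    sync = LocalSync()
    order = []

    def server():
        order.append("server ready")
        sync.barrier("setup", 2, 60)

    def client():
        sync.barrier("setup", 2, 60)
        order.append("client started")

    threads = [threading.Thread(target=client),
               threading.Thread(target=server)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return order
```

Calling `LocalSync().barrier("alone", 2, 0.05)` with nobody else joining raises `threading.BrokenBarrierError`, which is the timeout case.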
>>
>>>> The new protocol is quite necessary as we need support for re-connection
>>>> and other tweaks which are not supported by the multiprocessing library.
>>>>
>>>>
>>>> Very simple example
>>>> -------------------
>>>>
>>>> This example demonstrates a test which tries to access "example.org"
>>>> concurrently from N machines without any synchronization.
>>>>
>>>>     import avocado
>>>>
>>>>     class WgetExample(avocado.Test):
>>>>         def setUp(self):
>>>>             # Initialize streams
>>>>             self.streams = avocado.Streams(self)
>>>>             for machine in machines:
>>>>                 # Add one stream per machine, create the connection
>>>>                 # and prepare for execution.
>>>>                 self.streams.add_stream(machine)
>>>>         def test(self):
>>>>             for stream in self.streams:
>>>>                 # Resolve the "/usr..." into
>>>>                 # SimpleTest("/usr/bin/wget example.org") and
>>>>                 # schedule the execution inside the current stream
>>>>                 stream.run_bg("/usr/bin/wget example.org")
>>>>             # Wait till all streams finish all tasks and fail the test
>>>>             # in case any of them fails.
>>>>             self.streams.wait(ignore_errors=False)
>>>>
>>>> where the `avocado.Stream` represents a worker (local or remote) which
>>>> allows running avocado tests in it (foreground or background). This
>>>> should provide enough flexibility to combine existing tests into complex
>>>> tests.
>>>>
>>>>
>>>
>>> Of course questions such as "where do machines come from?" would arise,
>>> but I understand the possibilities. My only very strong opinion here is
>>> to not link the resolution and execution in the primary APIs. Maybe a
>>> `resolve_and_run()` utility could exist, but I'm not entirely convinced.
>>> I really see the two things (resolution and execution) as two different
>>> layers.
>
> +1.
>
>>>
>> As mentioned earlier, I'd like to support both. If you pass a string, it
>> should resolve it. If you pass an output of the resolver, which IIRC is
>> part of the Job API RFC, then it should use it.
>> Eventually you could say it's this file's class (Ademar's proposal) and
>> the stream should be able to identify it and produce the necessary
>> template.
>
> -1 to the idea of supporting both.
>
Yep, rejected, won't be there.
>>
>>>> Advanced example
>>>> ----------------
>>>>
>>>> MultiNetperf.py:
>>>>
>>>>     class MultiNetperf(avocado.NestedTest):
>>>>         def setUp(self):
>>>>             # Initialize streams (start sync server, ...)
>>>>             self.streams = avocado.Streams(self)
>>>>             # Kept on self so that test() can reach the server address
>>>>             self.machines = ["localhost", "192.168.122.2"]
>>>>             for machine in self.machines:
>>>>                 # Add one stream per machine
>>>>                 self.streams.add_stream(machine)
>>>>         def test(self):
>>>>             # Ask the first stream to resolve "NetServer", pass the {}
>>>>             # params to it (together with sync-server url),
>>>>             # schedule the job in stream and return to main thread
>>>>             # while the stream executes the code.
>>>>             self.streams[0].run_bg("NetServer",
>>>>                                    {"no_clients": len(self.streams)})
>>>>             for stream in self.streams[1:]:
>>>>                 # Resolve "NetPerf", pass the {} params to it,
>>>>                 # schedule the job in stream and return to main
>>>>                 # thread while the stream executes the code
>>>>                 stream.run_bg("NetPerf",
>>>>                               {"no_clients": len(self.streams),
>>>>                                "server_ip": self.machines[0]})
>>>>             # Wait for all streams to finish all scheduled tasks
>>>>             self.streams.wait(ignore_failures=False)
>>>>
>>>
>>> You lost me here with `avocado.NestedTest`...
>>>
>> Copy&paste, I'm sorry (I changed it back to a library, but forgot to
>> update the class).
>>
>>>> NetServer.py:
>>>>
>>>>     class NetServer(avocado.NestedTest):
>>>>         def setUp(self):
>>>>             # Initialize sync client
>>>>             self.sync = avocado.Sync(self)
>>>>             process.run("netserver")
>>>>             # Contact sync server (url was passed in `stream.run_bg`)
>>>>             # and ask to enter "setup" barrier with "no_clients"
>>>>             # clients
>>>>             self.sync.barrier("setup", self.params.get("no_clients"))
>>>>         def test(self):
>>>>             pass
>>>>         def tearDown(self):
>>>>             self.sync.barrier("finished", self.params.get("no_clients"))
>>>>             process.run("killall netserver")
>>>>
>>>> NetPerf:
>>>>
>>>>     class NetPerf(avocado.NestedTest):
>>>>         def setUp(self):
>>>>             # Initialize sync client
>>>>             self.sync = avocado.Sync(self)
>>>>             process.run("netserver")
>>>>             # Contact sync server (url was passed in `stream.run_bg`)
>>>>             # and ask to enter "setup" barrier with "no_clients"
>>>>             # clients
>>>>             self.sync.barrier("setup", self.params.get("no_clients"))
>>>>         def test(self):
>>>>             process.run("netperf -H %s -l 60"
>>>>                         % self.params.get("server_ip"))
>>>>             self.sync.barrier("finished", self.params.get("no_clients"))
>>>>
>>>>
>>>> Possible implementation
>>>> -----------------------
>>>>
>>>> _Previously: API backed by internal API_
>>>>
>>>> One way to drive this is to use the existing internal API and create a
>>>> layer in between, which invokes the runner (local/remote based on the
>>>> stream machine) to execute the code on `stream.run_bg` calls.
>>>>
>>>> This means the internal API would stay internal and (roughly) the same,
>>>> but we'd develop a class to invoke the internal API. This class would
>>>> have to be public and supported.
>>>>
>>>> + runs native python
>>>> + easy interaction and development
>>>> + easily extensible by either using internal API (and risk changes) or
>>>>   by inheriting and extending the features.
>>>> - lots of internal API will be involved, thus with almost every change
>>>>   of the internal API we'd have to adjust this code to keep the
>>>>   NestedTest working
>>>> - fabric/paramiko is not thread/parallel process safe and fails badly,
>>>>   so first we'd have to rewrite our remote execution code (use
>>>>   autotest's worker, or aexpect+ssh)
>>>>
>>>>
>>>> Queue vs. single task
>>>> ---------------------
>>>>
>>>> Up to this point I always talked about a stream as an entity which
>>>> drives the execution of "a code block". A big question is whether it
>>>> should behave like a queue, or only a single task:
>>>>
>>>> queue - allows scheduling several tasks and reports a list of results
>>>> single task - stream would only accept one task and produce one result
>>>>
>>>> I'd prefer the queue-like approach as it's more natural to me to first
>>>> prepare streams and then keep adding tasks until all my work is done,
>>>> and I'd expect per-stream results to be bound together, so I can know
>>>> what happened. This means I could run `stream.run_bg(first);
>>>> stream.run_bg(second); stream.run_fg(third); stream.run_bg(fourth)` and
>>>> the stream should start task "first", queue task "second", queue task
>>>> "third", wait for it to finish and report "third" results. Then it
>>>> should resume the main thread and queue the "fourth" task (FIFO queue).
>>>> Each stream should then allow querying for all results (list of
>>>> json-results), and it should create a directory inside results with a
>>>> per-task sub-directory containing the task results.
>>>>
>>>
>>> I do see that the "queue" approach is more powerful, and I would love
>>> having something like that for my own use.
>>> But (there's always a but), to decide on that approach we also have to
>>> consider:
>>>
>>> * Increased complexity
>>> * Increased development cost
>>> * Passing the wrong message to users, who could look at this as a way
>>>   to, say, build conditional executions on the same stream and now have
>>>   a bunch of "micro" code blocks
>>>
>>> These are the questions that come to my mind, and they may all be
>>> dismissed as discussion progresses. I'm just playing devil's advocate at
>>> this point.
>>>
>> Yes, I know. On the other hand it's convenient to bundle tasks executed
>> on one machine/stream together.
>>
>> An idea to allow this in the "single task" scenario came to my mind: we
>> might allow stream prefixes to identify the code intended to be bound
>> together, so the results would be (again, I'm not talking about the
>> queue-like approach, but only the single-task-per-stream scenario):
>>
>>     01-$PREFIX-$TEST
>>     02-server-NetServer
>>     03-client1-NetPerf.big
>>     04-client2-NetPerf.big
>>     05-client1-NetPerf.small
>>     ...
>>
>> This would help me debug the results, and as it'd be optional it should
>> not confuse people at first.
>>
>> Also I think the order the tasks were executed in is more important, so
>> that should be the first argument.
>>
>>>> On the other hand the "single task" should always establish a new
>>>> connection and create separate results for each task added. This means
>>>> preparing the streams is not needed, as each added task is executed
>>>> inside a different stream. So the interface could be
>>>> `self.streams.run_bg(where, what, details)` and it should report the
>>>> task id, or the task results in case of `run_fg`. The big question is
>>>> what should happen when a task resolves into multiple tasks (eg:
>>>> `gdbtest`).
>>>
>>> That's why the "block of code" reference should be unambiguous. No
>>> special situation to deal with. It'd be a major confusion to have more
>>> than one "block of code" executed unintentionally.
>
> +1.
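As a small illustration of the `$INDEX-$PREFIX-$TEST` naming discussed above, the following sketch hands out result directory names in submission order with an optional stream prefix. `ResultNamer` is a hypothetical helper made up for this example, not part of Avocado.

```python
import itertools


class ResultNamer:
    """Hand out result directory names in task-submission order."""

    def __init__(self):
        self._counter = itertools.count(1)

    def name(self, test, stream_tag=None):
        """Return '<index>-<stream_tag>-<test>'; the tag is optional."""
        parts = ["%02d" % next(self._counter)]
        if stream_tag:
            parts.append(stream_tag)
        parts.append(test)
        return "-".join(parts)
```

Because the index comes first, a plain directory listing shows the tasks in the order they were scheduled, and the prefix still tells which stream each task belonged to.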
>
>>>
>>>> Should it fail or create streams per each task? What should it report,
>>>> then? I can imagine a function `run_all_{fg,bg}` which would create a
>>>> stream for each worker and return a list of ids/results in case the
>>>> writer is not sure whether (or knows that) the test reference resolves
>>>> into several tasks.
>>>>
>>>
>>> Let's try to favor simpler interfaces, which would not introduce this
>>> number of special scenarios.
>
> +1.
>
>>>
>>>> See more details in the next chapter
>>>>
>>>>
>>>> Results directory
>>>> -----------------
>>>>
>>>> This demonstrates the results for a modified "MultiNetperf" test. The
>>>> difference is that it runs 2 variants of netperf:
>>>>
>>>> * Netperf.bigbuf    # netperf using big buffers
>>>> * Netperf.smallbuf  # netperf using small buffers
>>>>
>>>> Queue-like approach:
>>>>
>>>>     job-2016-04-15T.../
>>>>     ├── id
>>>>     ├── job.log
>>>>     └── test-results
>>>>         └── 1-MultiNetperf
>>>>             ├── debug.log
>>>>             ├── stream1       # one could provide custom name/host
>>>>             │   ├── 1-Netperf.bigbuf
>>>>             │   │   ├── debug.log
>>>>             │   │   └── whiteboard
>>>>             │   └── 2-Netperf.smallbuf
>>>>             │       ├── debug.log
>>>>             │       └── whiteboard
>>>>             ├── stream2
>>>>             │   └── 1-NetServer
>>>>             │       ├── debug.log
>>>>             │       └── whiteboard
>>>>             └── whiteboard
>>>>
>>>> Single task approach:
>>>>
>>>>     job-2016-04-16T.../
>>>>     ├── id
>>>>     ├── job.log
>>>>     └── test-results
>>>>         └── 1-MultiNetperf
>>>>             ├── debug.log
>>>>             ├── whiteboard
>>>>             ├── 1-Netperf.bigbuf
>>>>             │   ├── debug.log
>>>>             │   └── whiteboard
>>>>             ├── 2-Netperf.smallbuf
>>>>             │   ├── debug.log
>>>>             │   └── whiteboard
>>>>             └── 3-Netperf.smallbuf
>>>>                 ├── debug.log
>>>>                 └── whiteboard
>>>>
>>>> The difference is that the queue-like approach bundles the results
>>>> per-worker, which could be useful when using multiple machines.
>>>>
>>>> The single-task approach makes it easier to follow how the execution
>>>> went, but one needs to check the log to see on which machine the task
>>>> was executed.
>>>>
>>>>
>>>
>>> The logs can indeed be useful.
>>> And the choice between single vs. queue wouldn't really depend on
>>> this... this is, quite obviously, the *result* of that choice.
>
> Agree.
>
>>>
>>>> Job API RFC
>>>> ===========
>>>>
>>>> The recently introduced Job API RFC covers a very similar topic to
>>>> "nested test", but it's not the same. The Job API enables users to
>>>> modify the job execution, eventually even to write a runner which would
>>>> suit them to run groups of tests. In contrast, this RFC covers a way to
>>>> combine code-blocks/tests in order to reuse them in a single test. In a
>>>> hackish way they can supplement each other, but the purpose is
>>>> different.
>>>>
>>>
>>> "nested", without a previous definition, really confuses me. Other than
>>> that, ACK.
>>>
>> Copy&paste, thanks.
>>
>>>> One of the most obvious differences is that a failed "nested" test can
>>>> be intentional (eg. reusing the NetPerf test to check if unreachable
>>>> machines can talk to each other), while in Job API it's always a
>>>> failure.
>>>>
>>>
>>> It may just be me, but I fail to see how this is one obvious difference.
>> Because Job API is here to allow one to create jobs, not to modify the
>> results. If the test fails, the job should fail. At least that's my
>> understanding.
>
> That's basically the only difference between the Job API and this
> proposal. And I don't think that's good (more below).
>
>>
>>>
>>>> I hope you see the pattern. They are similar, but on a different layer.
>>>> Internally, though, they can share some pieces, like executing the
>>>> individual tests concurrently with different params/plugins
>>>> (locally/remotely). All the needed plugin modifications would also be
>>>> useful for both of these RFCs.
>>>>
>>>
>>> The layers involved, and the proposed usage, should be the obvious
>>> differences. If they're not cleanly seen, we're doing something wrong.
>>>
>
> +1.
>
>> I'm not sure what you're proposing here.
>> I put the section here to clarify that the Job API is a different story,
>> while they share some bits (internally, and could be abused to do the
>> same).
>>
>
> I think the point is that you're actually proposing nested-tests,
> or sub-tests and those concepts break the abstraction and do not
> belong here. Make the definitions and proposals abstract enough
> and with clear and limited APIs, and there's no need for a
> section to explain that this is different from the Job API.
>
I wanted to clarify the difference as both are being discussed at the
same time. Anyway, it's clear now, so I can remove this section and the
examples.
>>>> Some examples:
>>>>
>>>> User1 wants to run the "compile_kernel" test on a machine followed by
>>>> "install_compiled_kernel passtest failtest warntest" on "machine1
>>>> machine2". They depend on the status of the previous test, but they
>>>> don't create a scenario. So the user should use the Job API (or execute
>>>> 3 jobs manually).
>>>>
>>>> User2 wants to create a migration test, which starts migration from
>>>> machine1 and receives the migration on machine2. It requires cooperation
>>>> and together it creates one complex use case, so the user should use a
>>>> multi-stream test.
>>>>
>>>>
>>>
>>> OK.
>>>
>> So I should probably skip the introduction and use only the
>> examples :-)
>>
>>>> Conclusion
>>>> ==========
>>>>
>>>> This RFC proposes to add a simple API to allow triggering
>>>> avocado.Test-like instances on a local or remote machine. The main point
>>>> is that it should allow very simple code reuse and modular test
>>>> development. I believe it'll be easier than having users handle the
>>>> multiprocessing library, which might allow similar features, but with a
>>>> lot of boilerplate code and even more code to handle possible exceptions.
>>>>
>>>> This concept also plays nicely with the Job API RFC; it could utilize
>>>> most of the tasks needed for it, and together they should allow amazing
>>>> flexibility with a known and similar structure (therefore easy to learn).
>>>>
>>>
>>> Thanks for the much cleaner v4! I see that consensus and a common view
>>> is now approaching.
>>>
>>
>> Now the big question is, do we want a queue-like or a single-task
>> interface? They are quite different. The single-task interface actually
>> does not require any streams generation. It could just be the stream
>> object and you could say hey, stream, run this for me on this guest and
>> return ID so I can query for status later. Hey stream, please run also
>> this and report when it's finished. Oh stream, did the first task already
>> finish?
>
> The queue-like interface is probably the concept I'm more
> strongly against in your RFC, so I would love to see it removed
> from the proposal.
I wasn't happy about it either; my motivation was strictly to bind all
tasks offloaded to one machine together, which is convenient. Anyway I'd
like to pursue the version I introduced in my response to Cleber: one
task per stream, results are indexed and contain the stream name + task
name:

    01-localhost-NetServer
    02-192.168.122.5-NetClient
    ...

>
> I wrote a lot more in my other reply. I hope Cleber can respond
> there and we can converge on a few topics before v5.
>
> Thanks.
>    - Ademar
>
Thanks to you,
Lukáš
>>
>> So the interface would be actually simpler and if we add the optional
>> "stream tag" (see my response in the Queue vs. single task section), I'd
>> be perfectly fine with it. Note that we could also just use the
>> hostname/ip as the stream tag, but sometimes it might be better to allow
>> overriding it (eg. when running everything on localhost, one might use a
>> "stress" stream and a "test" stream).
>>
>> After thinking about it a bit more I'm probably more inclined to the
>> single-task execution with optional tag.
>> The interface would be:
>>
>>     streams = avocado.Streams(self)
>>     tid = streams.run_bg(task, **kwargs)
>>     results = streams.run_fg(task, **kwargs)
>>     results = streams.wait(tid)
>>     streams.wait()
>>
>> where the **kwargs might contain:
>>
>>     host -> to run the task remotely
>>     stream_tag -> prefix for logs and results dir
>>
>> the remaining arguments would be combined with test-class arguments, so
>> one could add `params={"foo": "bar"}`. This would not be needed in case
>> the user first resolves the test, but it'd be super-convenient for
>> simpler use cases. The alternative to params parsing could be:
>>
>>     task = resolver.resolve(task)
>>     task[1]["params"].update(my_params)
>>     tid = streams.run_bg(task)
>>
>> Anyway if we implement the resolver quickly, we might just skip the
>> implicit resolver (so require the additional 1-2 steps and avoid the
>> **kwargs).
>
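To get a feeling for how the single-task interface above would behave, here is a toy, local-only sketch built on `concurrent.futures`. It is not the proposed implementation: tasks are plain Python callables instead of resolved tests, `host` and `stream_tag` are left out, and everything except the method names `run_bg`, `run_fg` and `wait` is an assumption made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


class Streams:
    """Toy single-task interface; every task runs in a local thread pool."""

    def __init__(self):
        self._pool = ThreadPoolExecutor()
        self._futures = []  # the list index doubles as the task id

    def run_bg(self, task, **params):
        """Schedule `task(**params)` and return its task id immediately."""
        self._futures.append(self._pool.submit(task, **params))
        return len(self._futures) - 1

    def run_fg(self, task, **params):
        """Run `task(**params)` and block until its result is available."""
        return self.wait(self.run_bg(task, **params))

    def wait(self, tid=None):
        """Return one result, or all results in task order if tid is None.

        An exception raised inside a task is re-raised here.
        """
        if tid is None:
            return [future.result() for future in self._futures]
        return self._futures[tid].result()


def net_server(no_clients):
    """Stand-in for a resolved NetServer task."""
    return "served %d client(s)" % no_clients


def net_perf(server_ip):
    """Stand-in for a resolved NetPerf task."""
    return "measured against %s" % server_ip
```

A caller would do `tid = streams.run_bg(net_server, no_clients=1)`, run the client with `streams.run_fg(net_perf, server_ip=...)` and finally collect everything with `streams.wait()`.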
_______________________________________________ Avocado-devel mailing list Avocado-devel@redhat.com https://www.redhat.com/mailman/listinfo/avocado-devel