Re: [Avocado-devel] RFC: Multi-host tests

2016-03-30 Thread Cleber Rosa

Lukáš,

This RFC has already had a lot of strong points raised, and it's now a 
bit hard to follow the proposals and general direction.


I believe it's time for a v2. What do you think?

Thanks,
- Cleber.


Re: [Avocado-devel] RFC: Multi-host tests

2016-03-30 Thread Lukáš Doktor

On 03/30/2016 3:52 PM, Cleber Rosa wrote:



On 03/30/2016 09:31 AM, Lukáš Doktor wrote:

On 03/29/2016 8:25 PM, Cleber Rosa wrote:



On 03/29/2016 04:11 AM, Lukáš Doktor wrote:

On 03/28/2016 9:49 PM, Cleber Rosa wrote:



- Original Message -

From: "Cleber Rosa" 
To: "Lukáš Doktor" 
Cc: "Amador Pahim" , "avocado-devel"
, "Ademar Reis" 
Sent: Monday, March 28, 2016 4:44:15 PM
Subject: Re: [Avocado-devel] RFC: Multi-host tests



- Original Message -

From: "Lukáš Doktor" 
To: "Ademar Reis" , "Cleber Rosa"
,
"Amador Pahim" , "Lucas
Meneghel Rodrigues" , "avocado-devel"

Sent: Saturday, March 26, 2016 4:01:15 PM
Subject: RFC: Multi-host tests

Hello guys,

Let's open a discussion regarding the multi-host tests for avocado.

The problem
===========

A user wants to run netperf on 2 machines. To do it manually he
does:

  machine1: netserver -D
  machine1: # Wait till netserver is initialized
  machine2: netperf -H $machine1 -l 60
  machine2: # Wait till it finishes, then report/store the results
  machine1: # stop the netserver and report possible failures

Now, how do we support this in Avocado, ideally as custom tests, and
ideally even across broken connections/reboots?


Super tests
===========

We don't need to do anything and can leave everything to the user, who is
free to write code like:

  ...
  machine1 = aexpect.ShellSession("ssh %s" % machine1_ip)
  machine2 = aexpect.ShellSession("ssh %s" % machine2_ip)
  machine1.sendline("netserver -D")
  # wait till the netserver starts
  machine1.read_until_any_line_matches(["Starting netserver"], timeout=60)
  output = machine2.cmd_output("netperf -H %s -l %s"
                               % (machine1_ip, duration))
  # interrupt the netserver (Ctrl+C)
  machine1.sendline("\03")
  # verify the netserver finished
  machine1.cmd("true")
  ...

The problem is that this requires an active connection, and the user needs
to handle the results manually.


And of course the biggest problem here is that it doesn't solve the
Avocado problem: providing a framework and tools for tests that span
multiple (Avocado) execution threads, possibly on multiple hosts.


Well, it does: each "ShellSession" is a new parallel process. The only
problem I have with this design is that it does not allow easy code
reuse, and the results depend strictly on the test writer.



Yes, *aexpect* allows parallel execution in an asynchronous fashion, but
it is not targeted at tests *at all*. Avocado, as a test framework, should
deliver more. Repeating the previous wording, it should be "providing a
framework and tools for tests that span multiple (Avocado) execution
threads, possibly on multiple hosts."


That was actually my point. You can implement multi-host tests that way,
but you can't share the tests (you can only include some shared pieces
from libraries).



Right, then it's not related to Avocado; it's just an example of how a
test writer could do it (painfully) today.




Triggered simple tests
======================

Alternatively, we can say each machine/worker is nothing but yet another
test, which occasionally needs synchronization or data exchange. The
same example would look like this:

machine1.py:

    process.run("netserver")
    barrier("server-started", 2)
    barrier("test-finished", 2)
    process.run("killall netserver")

machine2.py:

    barrier("server-started", 2)
    self.log.debug(process.run("netperf -H %s -l 60"
                               % params.get("server_ip")))
    barrier("test-finished", 2)

where "barrier(name, no_clients)" is a framework function which
makes
the process wait till the specified number of processes are waiting
for
the same barrier.
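
For illustration, here is a minimal sketch of how such a "barrier()" could
be implemented, assuming a trivial TCP "sync server" that every worker can
reach; the port, the server address and the one-line wire format below are
hypothetical, not an existing Avocado API:

    import socket

    def sync_server(port=6001):
        # hold each client until no_clients entered the same barrier
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("", port))
        srv.listen(16)
        waiting = {}    # barrier name -> clients already waiting on it
        while True:
            client, _ = srv.accept()
            # each client announces "<name> <no_clients>" on one line
            name, no_clients = client.makefile().readline().split()
            members = waiting.setdefault(name, [])
            members.append(client)
            if len(members) == int(no_clients):
                for peer in waiting.pop(name):  # release all at once
                    peer.sendall(b"go\n")
                    peer.close()

    def barrier(name, no_clients, server_ip="127.0.0.1", port=6001,
                timeout=60):
        # block until no_clients processes entered the barrier 'name'
        sock = socket.create_connection((server_ip, port), timeout=timeout)
        try:
            sock.sendall(("%s %d\n" % (name, no_clients)).encode())
            # the server only answers once all the clients have arrived
            if not sock.recv(3):
                raise RuntimeError("sync server closed before barrier %r"
                                   % name)
        finally:
            sock.close()

In this model each script stays an ordinary test, launched by a regular
Avocado run on its own host, with every worker pointed at the same sync
server.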


The barrier mechanism looks like an appropriate and useful utility for
the example given.  Even though your use case example explicitly requires
it, it's worth pointing out and keeping in mind that there may be valid
use cases which won't require any kind of synchronization.  This may even
be true for the executions of tests that spawn multiple *local* "Avocado
runs".


Absolutely, this would actually allow Julio to run his "Parallel
(clustered) testing".


So, let's try to identify what we're really looking for. For both the
use case I mentioned and Julio's "Parallel (clustered) testing", we need
a (the same) test run by multiple *runners*. A runner in this context is
something that implements the `TestRunner` interface, such as the
`RemoteTestRunner`:

https://github.com/avocado-framework/avocado/blob/master/avocado/core/remote/runner.py#L37
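
To make the runner idea concrete, here is a rough, hedged sketch; the
class, its constructor and its run_suite() method are illustrative
stand-ins, not the actual interface from the file linked above. A
multi-host job would instantiate one runner per host and hand each of
them the same suite:

    # illustrative stand-in for something implementing TestRunner
    class FakeRemoteRunner(object):

        def __init__(self, host):
            self.host = host

        def run_suite(self, test_suite):
            # connect to self.host, push the tests there, execute them
            # and collect the results (roughly what RemoteTestRunner
            # does through its remote execution machinery)
            raise NotImplementedError

    # one runner per host, all executing the *same* test suite
    runners = [FakeRemoteRunner(host) for host in ("machine1", "machine2")]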




The following (pseudo) Avocado Test could be written:

from avocado import Test

# These are currently private APIs that could/should be
# exposed under another level. Also, the current API is
# very different from what is used here, please take it as
# pseudo