[OE-core] [PATCH 1/2] commands.py: live output logging + result.error encoding fix

2017-06-27 Thread Patrick Ohly
Tests that use bitbake("my-test-image") can run for a long time
without any indication to the user of oe-selftest about what's going
on. The test author has to log the bitbake output explicitly,
otherwise it is lost in case of test failures.

Now it is possible to use bitbake("my-test-image",
output_log=self.logger) to get more output both on the console and in
the XML output (when xmlrunner is installed). Example output:

2017-06-23 12:23:14,144 - oe-selftest - INFO - Running tests...
2017-06-23 12:23:14,145 - oe-selftest - INFO - 
--
2017-06-23 12:23:14,151 - oe-selftest - INFO - Running: bitbake my-test-image
2017-06-23 12:23:16,363 - oe-selftest - INFO - Loading cache...done.
2017-06-23 12:23:17,575 - oe-selftest - INFO - Loaded 3529 entries from dependency cache.
2017-06-23 12:23:18,811 - oe-selftest - INFO - Parsing recipes...done.
2017-06-23 12:23:19,659 - oe-selftest - INFO - Parsing of 2617 .bb files complete (2612 cached, 5 parsed). 3533 targets, 460 skipped, 0 masked, 0 errors.
2017-06-23 12:23:19,659 - oe-selftest - INFO - NOTE: Resolving any missing task queue dependencies

Because the implementation already used threading, threads are also
used to decouple reading and writing the different pipes, instead of
trying to multiplex IO in a single thread. Previously the helper
thread waited for command completion; now that is done in the main
thread.
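
The threading scheme described above can be sketched outside of
oeqa roughly as follows. This is only an illustration, not the code
from the patch: run_with_live_log() and its parameters are
hypothetical names, and it catches BrokenPipeError where the patch
compares ex.errno against EPIPE.

```python
import logging
import subprocess
import sys
import threading


def run_with_live_log(cmd, data=None, logger=None):
    """Run cmd, optionally feed it data, and log its output line by line.

    Hypothetical helper for illustration; returns (status, stdout, stderr)
    with both outputs as raw bytes.
    """
    process = subprocess.Popen(cmd,
                               stdin=subprocess.PIPE if data else None,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    out_chunks, err_chunks = [], []

    def read_stream(chunks, stream, logfunc):
        # Collect the raw bytes; forward a decoded copy to the logger, if any.
        for line in stream:
            chunks.append(line)
            if logfunc:
                logfunc(line.decode("utf-8", errors="replace").rstrip())

    def write_stdin():
        try:
            process.stdin.write(data)
            process.stdin.close()
        except BrokenPipeError:
            # The command exited without consuming all of our data.
            pass

    threads = [
        threading.Thread(target=read_stream,
                         args=(out_chunks, process.stdout,
                               logger.info if logger else None)),
        threading.Thread(target=read_stream,
                         args=(err_chunks, process.stderr,
                               logger.error if logger else None)),
    ]
    if data:
        threads.append(threading.Thread(target=write_stdin, daemon=True))
    for thread in threads:
        thread.start()

    # The main thread waits for the command; the helpers only move data.
    status = process.wait()
    for thread in threads:
        thread.join()
    return status, b"".join(out_chunks), b"".join(err_chunks)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO,
                        format="%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    run_with_live_log([sys.executable, "-c", "print('hello')"],
                      logger=logging.getLogger("demo"))
```

With one thread per pipe, a command that fills its stderr pipe while
we are still writing to its stdin cannot block us, which is the
deadlock that single-threaded code would have to avoid by
multiplexing.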

The most common case (no input data, joined stdout/stderr) still uses
one extra thread and a single read(), so performance should be roughly
the same as before.
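
For comparison, the common case mentioned above might look like the
following sketch: stderr is redirected into stdout, no input is
written, and one helper thread performs a single read(). The name
run_joined() is hypothetical and not part of oeqa.

```python
import subprocess
import sys
import threading


def run_joined(cmd):
    """Run cmd with stderr merged into stdout and read everything at once."""
    process = subprocess.Popen(cmd,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT)
    chunks = []

    def read_all():
        # A single read() returns once the command closes its output.
        chunks.append(process.stdout.read())

    reader = threading.Thread(target=read_all)
    reader.start()
    status = process.wait()
    reader.join()
    return status, b"".join(chunks)
```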

Probably unintentionally, result.error was left as a byte string when
migrating to Python3. OE-core doesn't seem to use runCmd() with split
output at the moment, so changing result.error to be treated the same
as result.output (i.e. decoded to a normal string) seems like a
relatively safe API change (or rather, implementation fix).
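
To illustrate the intended behaviour, here is a small sketch that
decodes both streams the same way. run_decoded() is a hypothetical
stand-in for runCmd(), and applying the decoding used for log lines
in the patch (UTF-8 with errors='replace', trailing whitespace
stripped) to the final result is an assumption.

```python
import subprocess
import sys


def run_decoded(cmd):
    """Run cmd and return (status, output, error) with both as str."""
    completed = subprocess.run(cmd,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    # Decode stdout and stderr identically, so callers never see bytes.
    output = completed.stdout.decode("utf-8", errors="replace").rstrip()
    error = completed.stderr.decode("utf-8", errors="replace").rstrip()
    return completed.returncode, output, error
```

A caller can then write checks such as "warning" in error without
having to care whether it received bytes or str.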

Signed-off-by: Patrick Ohly 

merge: wait()
---
 meta/lib/oeqa/utils/commands.py | 107 ++---
 1 file changed, 85 insertions(+), 22 deletions(-)

diff --git a/meta/lib/oeqa/utils/commands.py b/meta/lib/oeqa/utils/commands.py
index 57286fc..5e53454 100644
--- a/meta/lib/oeqa/utils/commands.py
+++ b/meta/lib/oeqa/utils/commands.py
@@ -13,6 +13,7 @@ import sys
 import signal
 import subprocess
 import threading
+import time
 import logging
 from oeqa.utils import CommandError
 from oeqa.utils import ftools
@@ -25,7 +26,7 @@ except ImportError:
     pass
 
 class Command(object):
-    def __init__(self, command, bg=False, timeout=None, data=None, **options):
+    def __init__(self, command, bg=False, timeout=None, data=None, output_log=None, **options):
 
         self.defaultopts = {
             "stdout": subprocess.PIPE,
@@ -48,41 +49,103 @@ class Command(object):
         self.options.update(options)
 
         self.status = None
+        # We collect chunks of output before joining them at the end.
+        self._output_chunks = []
+        self._error_chunks = []
         self.output = None
         self.error = None
-        self.thread = None
+        self.threads = []
 
+        self.output_log = output_log
         self.log = logging.getLogger("utils.commands")
 
     def run(self):
         self.process = subprocess.Popen(self.cmd, **self.options)
 
-        def commThread():
-            self.output, self.error = self.process.communicate(self.data)
-
-        self.thread = threading.Thread(target=commThread)
-        self.thread.start()
+        def readThread(output, stream, logfunc):
+            if logfunc:
+                for line in stream:
+                    output.append(line)
+                    logfunc(line.decode("utf-8", errors='replace').rstrip())
+            else:
+                output.append(stream.read())
+
+        def readStderrThread():
+            readThread(self._error_chunks, self.process.stderr, self.output_log.error if self.output_log else None)
+
+        def readStdoutThread():
+            readThread(self._output_chunks, self.process.stdout, self.output_log.info if self.output_log else None)
+
+        def writeThread():
+            try:
+                self.process.stdin.write(self.data)
+                self.process.stdin.close()
+            except OSError as ex:
+                # It's not an error when the command does not consume all
+                # of our data. subprocess.communicate() also ignores that.
+                if ex.errno != EPIPE:
+                    raise
+
+        # We write in a separate thread because then we can read
+        # without worrying about deadlocks. The additional thread is
+        # expected to terminate by itself and we mark it as a daemon,
+        # so even it should happen to not terminate for whatever
+        # reason, the main process will still exit, which will then
+        # kill the write thread.
+        if self.data:
+

[OE-core] [PATCH 1/2] commands.py: live output logging + result.error encoding fix

2017-06-23 Thread Patrick Ohly
Tests that use bitbake("my-test-image") can run for a long time
without any indication to the user of oe-selftest about what's going
on. The test author has to log the bitbake output explicitly,
otherwise it is lost in case of test failures.

Now it is possible to use bitbake("my-test-image",
output_log=self.logger) to get more output both on the console and in
the XML output (when xmlrunner is installed). Example output:

2017-06-23 12:23:14,144 - oe-selftest - INFO - Running tests...
2017-06-23 12:23:14,145 - oe-selftest - INFO - 
--
2017-06-23 12:23:14,151 - oe-selftest - INFO - Running: bitbake my-test-image
2017-06-23 12:23:16,363 - oe-selftest - INFO - Loading cache...done.
2017-06-23 12:23:17,575 - oe-selftest - INFO - Loaded 3529 entries from dependency cache.
2017-06-23 12:23:18,811 - oe-selftest - INFO - Parsing recipes...done.
2017-06-23 12:23:19,659 - oe-selftest - INFO - Parsing of 2617 .bb files complete (2612 cached, 5 parsed). 3533 targets, 460 skipped, 0 masked, 0 errors.
2017-06-23 12:23:19,659 - oe-selftest - INFO - NOTE: Resolving any missing task queue dependencies

Because the implementation already used threading, threads are also
used to decouple reading and writing the different pipes, instead of
trying to multiplex IO in a single thread.

The most common case (no input data, joined stdout/stderr) still uses
one extra thread and a single read(), so performance should be roughly
the same as before.

Probably unintentionally, result.error was left as a byte string when
migrating to Python3. OE-core doesn't seem to use runCmd() with split
output at the moment, so changing result.error to be treated the same
as result.output (i.e. decoded to a normal string) seems like a
relatively safe API change (or rather, implementation fix).

Signed-off-by: Patrick Ohly 
---
 meta/lib/oeqa/utils/commands.py | 103 ++---
 1 file changed, 82 insertions(+), 21 deletions(-)

diff --git a/meta/lib/oeqa/utils/commands.py b/meta/lib/oeqa/utils/commands.py
index 57286fc..cc93448 100644
--- a/meta/lib/oeqa/utils/commands.py
+++ b/meta/lib/oeqa/utils/commands.py
@@ -13,6 +13,7 @@ import sys
 import signal
 import subprocess
 import threading
+import time
 import logging
 from oeqa.utils import CommandError
 from oeqa.utils import ftools
@@ -25,7 +26,7 @@ except ImportError:
     pass
 
 class Command(object):
-    def __init__(self, command, bg=False, timeout=None, data=None, **options):
+    def __init__(self, command, bg=False, timeout=None, data=None, output_log=None, **options):
 
         self.defaultopts = {
             "stdout": subprocess.PIPE,
@@ -48,40 +49,100 @@ class Command(object):
         self.options.update(options)
 
         self.status = None
+        # We collect chunks of output before joining them at the end.
+        self._output_chunks = []
+        self._error_chunks = []
         self.output = None
         self.error = None
-        self.thread = None
+        self.threads = []
 
+        self.output_log = output_log
         self.log = logging.getLogger("utils.commands")
 
     def run(self):
         self.process = subprocess.Popen(self.cmd, **self.options)
 
-        def commThread():
-            self.output, self.error = self.process.communicate(self.data)
-
-        self.thread = threading.Thread(target=commThread)
-        self.thread.start()
+        def readThread(output, stream, logfunc):
+            if logfunc:
+                for line in stream:
+                    output.append(line)
+                    logfunc(line.decode("utf-8", errors='replace').rstrip())
+            else:
+                output.append(stream.read())
+
+        def readStderrThread():
+            readThread(self._error_chunks, self.process.stderr, self.output_log.error if self.output_log else None)
+
+        def readStdoutThread():
+            readThread(self._output_chunks, self.process.stdout, self.output_log.info if self.output_log else None)
+
+        def writeThread():
+            try:
+                self.process.stdin.write(self.data)
+                self.process.stdin.close()
+            except OSError as ex:
+                # It's not an error when the command does not consume all
+                # of our data. subprocess.communicate() also ignores that.
+                if ex.errno != EPIPE:
+                    raise
+
+        # We write in a separate thread because then we can read
+        # without worrying about deadlocks. The additional thread is
+        # expected to terminate by itself and we mark it as a daemon,
+        # so even it should happen to not terminate for whatever
+        # reason, the main process will still exit, which will then
+        # kill the write thread.
+        if self.data:
+            threading.Thread(target=writeThread, daemon=True).start()
+        if self.process.stderr:
+            thread =