Markus Trippelsdorf writes:
 > Since v3.8.0 several people reported intermittent IO errors that happen
 > during high system load while using "emerge" under Gentoo:
 > ...
 >   File "/usr/lib64/portage/pym/portage/util/_eventloop/EventLoop.py", line 260, in iteration
 >     if not x.callback(f, event, *x.args):
 >   File "/usr/lib64/portage/pym/portage/util/_async/PipeLogger.py", line 99, in _output_handler
 >     stdout_buf[os.write(stdout_fd, stdout_buf):]
 >   File "/usr/lib64/portage/pym/portage/__init__.py", line 246, in __call__
 >     rval = self._func(*wrapped_args, **wrapped_kwargs)
 > OSError: [Errno 5] Input/output error
 > 
 > Basically 'emerge' just writes the build output to stdout in a loop:
 > ...
 > def _output_handler(self, fd, event):
 > 
 >        background = self.background
 >        stdout_fd = self.stdout_fd
 >        log_file = self._log_file 
 > 
 >        while True:
 >                buf = self._read_buf(fd, event)
 > 
 >                if buf is None:
 >                        # not a POLLIN event, EAGAIN, etc...
 >                        break
 > 
 >                if not buf:
 >                        # EOF
 >                        self._unregister()
 >                        self.wait()
 >                        break
 > 
 >                else:
 >                        if not background and stdout_fd is not None:
 >                                failures = 0
 >                                stdout_buf = buf
 >                                while stdout_buf:
 >                                        try:
 >                                                stdout_buf = \
 >                                                        stdout_buf[os.write(stdout_fd, stdout_buf):]
 >                                        except OSError as e:
 >                                                if e.errno != errno.EAGAIN:
 >                                                        raise
 >                              ...
 > 
 > see: https://bugs.gentoo.org/show_bug.cgi?id=459674
 > 
 > (A similar issue also happens when building Firefox since v3.8.0. But
 > because Firefox's build process doesn't raise an exception it just dies
 > at random points without giving a clue.)
 > 
 > Now the question is: Could this be a kernel bug? Maybe in the TTY layer?
 > 
 > Unfortunately the issue is not easily reproducible and a git-bisect is
 > out of the question.

I'm seeing a similar regression.  I do a lot of gcc bootstraps and regression
test suite runs, and for the bootstraps I do

        make -jN bootstrap |& tee build-log

(tcsh syntax, adjust as appropriate for your preferred shell) to get a complete
log for later inspection in case of error.  N is typically the number of cores
or threads on the machine, e.g. -j8 on my Core-i7 IVB.

Up to the 3.7 kernel this never caused any problems.  Starting with the 3.8
kernel, or possibly 3.9-rc1, the build usually dies at some random point with
an EIO.
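
For anyone trying to pin down which write fails, the merge-and-tee pipeline
above can be roughly sketched in Python.  This is only an illustration under
stated assumptions, not the setup used for the bootstraps described here:
tee_command and its parameters are hypothetical names.  It copies the child's
merged stdout/stderr to a log file and to an output fd, and records the first
OSError (for example EIO) hit while writing to that fd instead of dying.

```python
import errno
import os
import subprocess
import sys


def tee_command(argv, log_path, out_fd=None):
    """Run argv; copy its merged stdout/stderr to out_fd and to log_path.

    Returns (exit_status, write_error).  write_error is the first OSError
    raised while writing to out_fd (e.g. EIO), or None if every write
    succeeded.  tee_command is a hypothetical helper for illustration.
    """
    if out_fd is None:
        out_fd = sys.stdout.fileno()
    write_error = None
    # stderr=STDOUT merges both streams, like "|&" in tcsh.
    with subprocess.Popen(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT) as proc:
        in_fd = proc.stdout.fileno()
        with open(log_path, "wb") as log:
            while True:
                buf = os.read(in_fd, 65536)
                if not buf:
                    # EOF: the child closed its end of the pipe.
                    break
                # The log file always receives the complete output.
                log.write(buf)
                # Handle short writes; stop echoing after the first error.
                while buf and write_error is None:
                    try:
                        buf = buf[os.write(out_fd, buf):]
                    except OSError as e:
                        if e.errno == errno.EAGAIN:
                            # Non-blocking fd not ready yet: retry.
                            continue
                        # Remember the error, keep draining into the log.
                        write_error = e
    return proc.returncode, write_error
```

Something like tee_command(["make", "-j8", "bootstrap"], "build-log") would then
return the make exit status together with the OSError, if any, so the errno
can be inspected after the build log is complete.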

I haven't had time to bisect it.

/Mikael
