Re: DSA concerns for jessie architectures - mips/mipsel

2013-09-29 Thread Tollef Fog Heen

Hi Graham,

]] Graham Whaley 

Sorry if you get an unwanted Cc on this; I'm not sure which, if any, of
the lists you're reading.

  I'd like to respond to your call for help regards the release
 qualification matrix, in particular for hardware (buildd and porter
 machines), and in particular for mips and mipsel arch.
 
  I wish to work with you to remedy some of the listed issues. I've started
 working with MIPS hardware vendors on availability and pricing of hardware.

That's good news.  Once you have solid numbers, I'd be most interested in
seeing them.  Feel free to just mail d...@debian.org if the numbers are
confidential.

  Having researched your current mips/mipsel setup and the requirements for
 jessie, the issues as I see them, and hopefully solutions, are:
 
 1) reliability. Corelli and Gabrielli are unstable. I saw the thread way
 back where they were investigated, but it seems un-fixable (and the
 machines are now rather old). Let's work on replacing both of those, and
 maybe Lucatelli as well, as it appears to be the same hardware (but
 possibly stable?).

I think this makes sense.

 2) supportability. We'll work on this to see what the options are. I'm sure
 we all want boxes that can be maintained/replaced easily.
 
 3) speed. I see 'mips' (but not mipsel in particular) listed as 'too slow'.
 Sure, can somebody point me at some indication of the minimum requirement
 here (not that I'm particularly aiming at the minimum, I just wish to
 ensure we reach it :-). And, is this just pure
 single-multi-core/thread-machine speed, or is it a solvable problem by
 using multiple machines if necessary?

I think others have covered this: the buildds need to be able to keep
up, which can be done with multiple machines.

In addition, the current MIPS machines are significantly slower than
even armel (so that upgrading packages and running samhain take
unreasonably long).  These are single-core performance tasks and don't
scale with the number of machines.

 4) I see there is a note about an 'opcode implementation error' for a
 mipsel porter box. Sounds like a new machine(s) is needed there as well.
 Could somebody point me at some data on the opcode issue (more out of
 interest really...).

The mono JIT doesn't work on our MIPS machines due to the machines not
implementing the full architecture spec, AIUI.  Porter and buildd boxes
should not have hardware bugs like that.

 From the three types of machines I see you currently have I believe
 there are more modern versions of all of those, and possibly some
 others. I believe we will be able to locate hardware to solve the
 issues.

That would be great.  Ideally, we'd want fast, server-class machines
with working OOB (both power and console), that use standard hardware
(SATA/SAS drives, etc.) and that we have some kind of warranty for, so we
can get them replaced when they fail.  Ideally world-wide, so we can
have them hosted where we want.

-- 
Tollef Fog Heen, DSA
UNIX is user friendly, it's just picky about who its friends are


-- 
To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/8761tjz890@qurzaw.varnish-software.com



Re: DSA concerns for jessie architectures - mips/mipsel

2013-09-27 Thread Graham Whaley
Hi DSA, all.

 I'd like to respond to your call for help regards the release
qualification matrix, in particular for hardware (buildd and porter
machines), and in particular for mips and mipsel arch.

 I wish to work with you to remedy some of the listed issues. I've started
working with MIPS hardware vendors on availability and pricing of hardware.

 Having researched your current mips/mipsel setup and the requirements for
jessie, the issues as I see them, and hopefully solutions, are:

1) reliability. Corelli and Gabrielli are unstable. I saw the thread way
back where they were investigated, but it seems un-fixable (and the
machines are now rather old). Let's work on replacing both of those, and
maybe Lucatelli as well, as it appears to be the same hardware (but
possibly stable?).

2) supportability. We'll work on this to see what the options are. I'm sure
we all want boxes that can be maintained/replaced easily.

3) speed. I see 'mips' (but not mipsel in particular) listed as 'too slow'.
Sure, can somebody point me at some indication of the minimum requirement
here (not that I'm particularly aiming at the minimum, I just wish to
ensure we reach it :-). And, is this just pure
single-multi-core/thread-machine speed, or is it a solvable problem by
using multiple machines if necessary?

4) I see there is a note about an 'opcode implementation error' for a
mipsel porter box. Sounds like a new machine(s) is needed there as well.
Could somebody point me at some data on the opcode issue (more out of
interest really...).

From the three types of machines I see you currently have I believe there
are more modern versions of all of those, and possibly some others. I
believe we will be able to locate hardware to solve the issues.

Thanks,
  Graham

-- 
Software Design Manager, MIPS platforms
Imagination Technologies


Re: DSA concerns for jessie architectures - mips/mipsel

2013-09-27 Thread Steven Chamberlain
Hi,

On 27/09/13 16:23, Graham Whaley wrote:
 I wish to work with you to remedy some of the listed issues. I've
 started working with MIPS hardware vendors on availability and pricing
 of hardware.

I've wondered if SMP Loongson systems are anywhere to be found:
http://bbs.lemote.com/viewthread.php?tid=43118

or if even the Lemote Hongri would be available someday:
http://www.lemote.com/products/computer/hongri/

But I don't see Loongson 3A being an option until at the very least
jessie kernels support it and are stable with all cores in use.  This is
just my opinion though and I can't speak for DSA.

 3) speed. I see 'mips' (but not mipsel in particular) listed as 'too
 slow'. Sure, can somebody point me at some indication of the minimum
 requirement here (not that I'm particularly aiming at the minimum, I
 just wish to ensure we reach it :-). And, is this just pure
 single-multi-core/thread-machine speed, or is it a solvable problem by
 using multiple machines if necessary?

On mipsel at least, I recall that libreoffice, openjdk-7, webkit seemed
to have some difficulty building.  Each source package is built on a
single machine only, and the current machines are limited to <= 1 GiB
RAM I think so I expect heavy swapping takes place.

I speculated some time ago (in a mail to the DSA list) that
network-attached storage might help for low-powered buildds, but I
haven't done any follow-up viability testing of this yet.  I thought that a
separate NAS (of any architecture) would have no particular limit on
number of disks or RAM, should provide fault tolerance, and maybe help
with provisioning too.  Current mipsel hardware, by contrast, may be
limited to a single disk, perhaps not adequately cooled or designed for
continuous running, without RAID or hot-swap, and either lower in capacity
or much slower (due to I/O latency) than what could be achieved even over
a 100Mbps link to dedicated storage hardware.

During the wheezy freeze period, mipsel and others did develop large
queues of (IIRC ~150) packages in Needs-Build state.  That's something
that having more (and more reliable) buildds could help with, even if
they were the same spec as the existing ones.

 4) I see there is a note about an 'opcode implementation error' for a
 mipsel porter box. Sounds like a new machine(s) is needed there as well.
 Could somebody point me at some data on the opcode issue (more out of
 interest really...).

I suspect that might refer to this, quoting from OpenBSD[0] :

 Unfortunately, most of the Loongson 2F-based hardware available at that
 time suffers from serious problems in the processor's branch prediction
 logic, causing the system to freeze, for which errata information only
 exists in the Chinese documentation (chapter 15, missing from the
 English translation), the only English language information being an
 e-mail[1] on a toolchain mailing-list.

[0]: http://www.openbsd.org/loongson.html
[1]: https://sourceware.org/ml/binutils/2009-11/msg00387.html

 From the three types of machines I see you currently have I believe
 there are more modern versions of all of those, and possibly some
 others. I believe we will be able to locate hardware to solve the issues.

I think Linux 3.2 detects and works around those bugs at least in kernel
code, but looking at the output in dmesg may help to identify which
boxes (if any) are affected.  (I imagine it's a problem for userland
binaries that were built before workarounds were added in binutils).

Maybe the existing boxes were not affected, and the concern was about
acquiring newer Loongson 2F hardware?

Regards,
-- 
Steven Chamberlain
ste...@pyro.eu.org


Archive: http://lists.debian.org/5245fcfa.7040...@pyro.eu.org