Re: [fedora-arm] builder io issue

2011-12-25 Thread Gordan Bobic

On 12/25/2011 06:16 AM, Brendan Conoboy wrote:

On 12/24/2011 09:25 PM, Dennis Gilmore wrote:

Just throwing out what I'm seeing; let's see what ideas we can come up
with. Performance is better than before, and Seneca has done a great
job and put a lot of hard work into the reorg, and I'm thankful for that.
We just have another bottleneck to address.


Allocating builders to individual disks rather than a single RAID volume will
help dramatically.


Care to explain why?


Since hfp/sfp F15 builds happen at the same time,
having 2 hfp disks and 2 sfp disks is a good start. Split it up again by
some other criteria- you want to try to ensure that any one builder only
causes one spindle to be used and any 2 builders each only cause 2
spindles to be used (1-to-1), and so forth. As long as any single
spindle isn't doing multiple simultaneous mock inits it should scale
pretty well.


You can largely avoid this by making sure your RAID and FS are aligned. 
Most mock IOPS are small, and provided the layers are aligned right, they 
should scale very well on RAID 0/10 (RAID 5/6 will inevitably cripple just 
about anything when it comes to writes regardless of what you do, unless 
the vast majority of your writes are a multiple of your stripe-width, which 
is unlikely).
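
For illustration, on a 4-disk RAID0 with 4KB ext4 blocks the sums work out
roughly like this (device names and the 64KB chunk are just examples; the
right values depend on the actual workload):

    # stride = chunk size / FS block size = 64KB / 4KB = 16
    # stripe-width = stride * number of data disks = 16 * 4 = 64
    mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=64 /dev/sd[bcde]1
    mkfs.ext4 -b 4096 -E stride=16,stripe-width=64 /dev/md0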



Allocate the large files the builders are using all at once
(no sparse files) to ensure large read/writes don't require seeks.


Is this a "proper" SAN or just another Linux box with some disks in it? 
Is NFS backed by a SAN "volume"?



Use the async nfs export option if it isn't already in use.


+1
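
For reference, that's just the async flag on the export - something along
these lines in /etc/exports (path and subnet made up for illustration):

    # /etc/exports - illustrative; per-builder entries work the same way
    /export/builders  10.0.0.0/24(rw,async,no_subtree_check,no_root_squash)
    # then apply with: exportfs -ra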


Disable fs
journaling (normally dangerous, but this is throw-away space).


Not really dangerous - the only danger is that you might have to wait 
for fsck to do its thing on an unclean shutdown (which can take hours 
on a full TB-scale disk, granted).
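
If it comes to it, dropping the journal is a one-liner either way
(device name illustrative):

    # on an existing, unmounted ext4 volume
    tune2fs -O ^has_journal /dev/md0
    # or create it journal-less from the start
    mkfs.ext4 -O ^has_journal /dev/md0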


Speaking of "dangerous" tweaks, you could LD_PRELOAD libeatmydata (add 
to a profile.d file in the mock config, and add the package to the 
buildsys-build group). That will eat fsync() calls which will smooth out 
commits and make a substantial difference to performance. Since it's 
scratch space anyway it doesn't matter WRT safety.
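
A minimal sketch of that, assuming the stock libeatmydata package layout
(the exact library path varies by distro and arch):

    # /etc/profile.d/eatmydata.sh inside the build chroot
    export LD_PRELOAD=/usr/lib/libeatmydata.so${LD_PRELOAD:+:$LD_PRELOAD}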



Use
noatime mounts if that won't break package builds checking filesystem
conformance.


The zsh build will break on NFS whatever you do. It will also break on a 
local FS with noatime. There may be other packages that suffer from this 
issue but I don't recall them off the top of my head. Anyway, that is an 
issue for a build policy - have one builder using block-level storage 
with atime and the rest on NFS.



Once all that is done, tweak the number of nfsds such that
there are as many as possible without most of them going into deep
sleep. Perhaps somebody else can suggest some optimal sysctl and ext4fs
settings?


As mentioned in a previous post, have a look here:
http://www.altechnative.net/?p=96

The deadline scheduler might also help on the NAS/SAN end, plus all the 
usual tweaks (e.g. make sure the write caches on the disks are enabled, 
and if the disks support write-read-verify, disable it, etc.)
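
By way of example, the server-side knobs amount to something like this
(thread count and device names are guesses to be tuned, not a recipe):

    # more nfsd threads (or set RPCNFSDCOUNT in /etc/sysconfig/nfs)
    rpc.nfsd 32
    # deadline elevator and write cache on each member disk
    for d in sdb sdc sdd sde; do
        echo deadline > /sys/block/$d/queue/scheduler
        hdparm -W1 /dev/$d
    done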


Gordan

Re: [fedora-arm] builder io issue

2011-12-25 Thread Gordan Bobic

On 12/25/2011 05:25 AM, Dennis Gilmore wrote:

When we were testing, builds were happening really fast; once we loaded
up the build jobs, things have become really slow.

http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=238672 started
at 1:19 UTC and at 5:18 UTC, four hours later, the BuildRequires are
still being installed. australia is completely IO bound. I think we
need to see how we can spread the IO load. ~30 hosts reading and
writing to the 4 spindles just saturates the disk IO.


Is your RAID aligned properly? I have found that making sure that your 
RAID and FS are aligned properly makes a big difference to performance, 
at least on my workloads. Have a look here:


http://www.altechnative.net/?p=96

Just taking the default options when setting up the RAID and FS 
typically hammers the IOPS performance down to that of a single disk.
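
A quick way to see what the current setup looks like (device name
illustrative; no output from the second command means no stride/stripe
width was ever set on the filesystem):

    mdadm --detail /dev/md0 | grep -i chunk
    tune2fs -l /dev/md0 | grep -iE 'stride|stripe'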



I guess the options are: find a way to add more spindles, move
/var/lib/mock to SD card, or see if we can get some kind of SAN that has
a lot of smaller fast disks.


That will make it an order of magnitude worse. A typical class 10 SD 
card only manages about 20 random write IOPS.



Get some 2.5" USB drives, one for each builder. Some other idea? Is there
any way we could add 4-8 more disks to australia? The size of the disks
matters little; gaining more IOPS by adding more spindles would help.


You could just replace the disks with some decent SSDs and be done with 
it (aligning the RAID and FS as explained on the above-linked page still 
applies just the same) - especially since 15K RPM disks are now priced 
quite uncompetitively against SSDs.


Gordan

[fedora-arm] Daily F15 ARM-Koji Build Status

2011-12-25 Thread jon . chiappetta
Period: Sat Dec 24 13:05:02 UTC 2011 -- Sun Dec 25 13:05:03 UTC 2011

Errors      Unbuilt      Different   Working    Built
------------------------------------------------------
1118 [-8]   2494 [-30]   57 [0]      12 [-2]    6734 [+40]


* Different contains packages that are built but have a varying ENVR to that of 
PA-Koji
* The error count is likely to drop as the packages are probably just waiting 
for a rebuild attempt
* The numbers in the brackets show the difference recorded in the counts since 
the script was last run
* https://fedoraproject.org/wiki/Architectures/ARM/F15_Koji_build
* http://arm.koji.fedoraproject.org/~jchiappetta/stats.html

[fedora-arm] Fedora ARM F-15 Branched report: 20111125 changes

2011-12-25 Thread Fedora ARM Branched Report
Compose started at Sun Dec 25 19:54:45 UTC 2011

Loaded plugins: langpacks, presto, refresh-packagekit
New package: artha-1.0.1-3.fc15
 A handy thesaurus based on WordNet

New package: aspell-es-1.11-2.fc15
 Spanish dictionaries for Aspell

New package: aspell-ru-0.99f7-7.fc15
 Russian dictionaries for Aspell

New package: bidiv-1.5-10.fc15
 Display logical Hebrew on unidirectional terminals

New package: bind-dyndb-ldap-0.2.0-3.fc15
 LDAP back-end plug-in for BIND

New package: bltk-1.0.9-11.fc15
 The BLTK measures notebook battery life under any workload

New package: cd-discid-1.1-2.fc15
 Utility to get CDDB discid information

New package: couchdb-glib-0.7.2-1.fc15
 A glib api to access CouchDB servers

New package: dotconf-1.3-2.fc15
 Libraries to parse configuration files

New package: dvgrab-3.5-2.fc15
 Utility to capture video from a DV camera

New package: dwdiff-1.7-4.fc15
 Front end to diff for comparing on a per word basis

New package: ez-ipupdate-3.0.11-0.23.b8.fc15
 Client for Dynamic DNS Services

New package: fbdesk-1.4.1-7.fc15
 Icon Manager for Fluxbox

New package: ftnchek-3.3.1-12.fc15
 Static analyzer for Fortran 77 programs

New package: gdb-heap-0.5-4.fc15
 Extensions to gdb for debugging dynamic memory allocation

New package: ghasher-1.2.1-10.fc15
 GUI hasher for GTK+ 2

New package: globalplatform-5.0.0-10.fc15
 Library for access to OP 2.0.1 and GP 2.1.1 conforming smart cards

New package: gnome-device-manager-0.2-6.fc15
 Graphical Device Manager Application

New package: gnome-libs-1.4.2-18.fc15
 The main GNOME1 libraries

New package: gtk-vnc-0.4.3-1.fc15
 A GTK2 widget for VNC clients

New package: gtkmm24-2.24.0-3.fc15
 C++ interface for GTK2 (a GUI library for X)

New package: hostapd-0.7.3-2.fc15
 IEEE 802.11 AP, IEEE 802.1X/WPA/WPA2/EAP/RADIUS Authenticator

New package: http-parser-0.3-6.20100911git.fc15
 HTTP request/response parser for C

New package: httrack-3.43.9-3.fc15
 Website copier and offline browser

New package: ifstatus-1.1.0-7.fc15
 Command line real time interface graphs using ncurses

New package: isomaster-1.3.7-2.fc15
 An easy to use GUI CD image editor

New package: ladspa-vco-plugins-0.3.0-9.fc15
 Anti-aliased pulse and sawtooth oscillators

New package: libUnihan-0.5.3-6.fc15
 C library for Unihan character database in fifth normal form

New package: libXrandr-1.3.1-2.fc15
 X.Org X11 libXrandr runtime library

New package: libeio-3.65-4.fc15
 Event-based fully asynchronous I/O library

New package: libgnome-2.32.1-2.fc15
 GNOME base library

New package: libgnomeui-2.24.5-2.fc15
 GNOME base GUI library

New package: libibmad-1.3.4-2.fc15
 OpenFabrics Alliance InfiniBand MAD library

New package: libibumad-1.3.4-2.fc15
 OpenFabrics Alliance InfiniBand umad (user MAD) library

New package: liblo-0.26-2.fc15
 Open Sound Control library

New package: libwpd-0.9.1-2.fc15
 Library for reading and converting WordPerfect documents

New package: libxnm-0.1.3-4.fc15
 A library for parsing the XNM format

New package: lv2-swh-plugins-1.0.15-5.20091118git.fc15
 LV2 ports of LADSPA swh plugins

New package: lxdm-0.3.0-4.fc15
 Lightweight X11 Display Manager

New package: maloc-1.5-1.fc15
 Minimal Abstraction Layer for Object-oriented C

New package: mutt-1.5.21-3.fc15
 A text mode mail user agent

New package: nxtvepg-2.8.1-6.fc15
 A nexTView EPG decoder and browser

New package: ocaml-json-static-0.9.8-4.fc15
 OCaml JSON validator and converter (syntax extension)

New package: olpc-contents-2.6-3.fc15
 OLPC contents manifest tools

New package: openbox-3.4.11.2-8.fc15
 A highly configurable and standards-compliant X11 window manager

New package: perl-BerkeleyDB-0.43-5.fc15
 Interface to Berkeley DB

New package: perl-Math-Random-ISAAC-XS-1.004-2.fc15
 C implementation of the ISAAC PRNG algorithm

New package: perl-Mozilla-LDAP-1.5.3-5.fc15
 LDAP Perl module that wraps the OpenLDAP C SDK

New package: perl-Net-DBus-GLib-0.33.0-7.fc15
 Perl extension for the DBus GLib bindings

New package: pngcrush-1.6.10-7.fc15
 Optimizer for PNG (Portable Network Graphics) files

New package: pngnq-0.5-9.fc15
 Pngnq is a tool for quantizing PNG images in RGBA format

New package: poster-20060221-8.fc15
 Scales PostScript images to span multiple pages

New package: pyhunspell-0.1-4.fc15
 Python bindings for hunspell

New package: python-isprelink-0.1.2-

[fedora-arm] Fedora ARM F-15 Branched report: 201111251 changes

2011-12-25 Thread Fedora ARM Branched Report
Compose started at Sun Dec 25 22:46:21 UTC 2011

Loaded plugins: langpacks, presto, refresh-packagekit
New package: acpi-1.5-2.fc15
 Command-line ACPI client

New package: aspell-0.60.6-14.fc15
 Spell checker

New package: bindfs-1.8.3-4.fc15
 Fuse filesystem to mirror a directory

New package: dvipdfm-0.13.2d-42.fc15
 A DVI to PDF translator

New package: ftplib-3.1-7.fc15
 Library of FTP routines

New package: indent-2.2.11-3.fc15
 A GNU program for formatting C code

New package: lensfun-0.2.5-5.fc15
 Library to rectify defects introduced by photographic lenses

New package: mod_security-2.5.12-4.fc15
 Security module for the Apache HTTP Server

New package: mon-1.2.0-7.fc15
 General-purpose resource monitoring system

New package: ocaml-csv-1.1.7-9.fc15
 OCaml library for reading and writing CSV files

New package: perl-B-Compiling-0.02-4.fc15
 Expose PL_compiling to perl

New package: perl-PerlIO-gzip-0.18-10.fc15
 Perl extension to provide a PerlIO layer to gzip/gunzip

New package: perl-Unicode-Map8-0.13-4.fc15
 Mapping table between 8-bit chars and Unicode for Perl

New package: psad-2.1.7-2.fc15
 Port Scan Attack Detector (psad) watches for suspect traffic


Updated Packages:

kernel-2.6.41.5-5.fc15
--
* Tue Dec 20 2011 Josh Boyer 
- Fix config options in arm configs after latest commits
- Backport upstream fix for b44_poll oops (rhbz #741117)

* Mon Dec 19 2011 Dave Jones 
- x86, dumpstack: Fix code bytes breakage due to missing KERN_CONT

* Thu Dec 15 2011 Dennis Gilmore 
- build imx highbank and kirkwood kernel variants on arm
- add patch for tegra usb storage resets
- omap config cleanups from dmarlin

* Thu Dec 15 2011 Josh Boyer  - 2.6.41.5-4
- Add patch to fix Intel wifi regression in 3.1.5 (rhbz 767173)
- Add patch from Jeff Layton to fix suspend with NFS (rhbz #717735)
- Backport ALPS touchpad patches from input/next branch (rhbz #590880)

* Thu Dec 15 2011 Dave Jones  - 2.6.41.5-3
- Disable Intel IOMMU by default.

* Fri Dec 09 2011 Josh Boyer  2.6.41.5-1
- Linux 3.1.5 (Fedora 2.6.41.5)

* Thu Dec 08 2011 Chuck Ebbert  2.6.41.5-0.rc2.1
- Linux 3.1.5-rc2 (Fedora 2.6.41.5-rc2)
- Fix wrong link speed on some sky2 network adapters (rhbz #757839)

* Wed Dec 07 2011 Chuck Ebbert 
- Linux 3.1.5-rc1 (Fedora 2.6.41.5-rc1)
- Comment out merged patches:
  xfs-Fix-possible-memory-corruption-in-xfs_readlink.patch
  rtlwifi-fix-lps_lock-deadlock.patch

* Tue Dec 06 2011 Chuck Ebbert 
- Disable uas until someone can fix it (rhbz #717633)

* Tue Dec 06 2011 Josh Boyer 
- Add reworked pci ASPM patch from Matthew Garrett



Summary:
Added Packages: 14
Removed Packages: 0
Modified Packages: 1
Compose finished at Sun Dec 25 23:36:49 UTC 2011

Re: [fedora-arm] builder io issue

2011-12-25 Thread Brendan Conoboy

On 12/25/2011 03:47 AM, Gordan Bobic wrote:

On 12/25/2011 06:16 AM, Brendan Conoboy wrote:

Allocating builders to individual disks rather than a single RAID volume will
help dramatically.

Care to explain why?


Sure, see below.


Is this a "proper" SAN or just another Linux box with some disks in it?
Is NFS backed by a SAN "volume"?


As I understand it, the server is a Linux host using RAID0 with 512k 
chunks across 4 SATA drives.  This md device is then formatted with some 
filesystem (ext4?).  Directories on this filesystem are then exported to 
individual builders such that each builder has its own private space. 
These private directories contain a large file that is used as a 
loopback ext4 fs (i.e., the builder mounts the NFS share, then loopback 
mounts the file on that NFS share as an ext4 fs).  This is where 
/var/lib/mock comes from.  Just to be clear, if you looked at the NFS-mounted 
directory on a build host you would see a single large file that 
represented a filesystem, making traditional ext?fs tuning a bit more 
complicated.
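
Roughly, the moving parts per builder look like this (paths and the 20G
size are invented for illustration):

    # server: preallocated (non-sparse) image file in the builder's directory
    fallocate -l 20G /export/builder01/mock-root.img
    mkfs.ext4 -F /export/builder01/mock-root.img
    # builder: mount the NFS share, then loop-mount the image
    mount -t nfs australia:/export/builder01 /mnt/buildstore
    mount -o loop /mnt/buildstore/mock-root.img /var/lib/mock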


The structural complication is that we have something like 30-40 systems 
all vying for the attention of those 4 spindles.  It's really important 
that each builder not cause more than one disk to perform an operation 
because seeks are costly, and if just 2 disks get called up by a single 
builder, 50% of the storage resources will be taken up by a single host 
until the operation completes.  With 40 hosts, you'll just end up 
thrashing (and with considerably fewer hosts, too).  RAID0 gives great 
throughput, but it's at the cost of latency.  With so many 100Mbit 
builders, throughput is less important and latency is key.


Roughly put, the two goals for good performance in this scenario are:

1. Make sure each builder only activates one disk per operation.

2. Make sure each io operation causes the minimum amount of seeking.

You're right that good alignment and block sizes and whatnot will help 
this cause, but there is still a greater likelihood of IO operations 
traversing spindle boundaries periodically even in the best situation.  You'd 
need a chunk size about equal to the fs image file size to pull that 
off.  Perhaps an LVM setup with strictly defined layouts for each 
lvcreate would make it a bit more manageable, but for simplicity's sake 
I advocate simply treating the 4 disks like 4 disks, exported according 
to expected usage patterns.
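
Concretely, that could be as simple as four separate filesystems and an
exports layout along these lines (all names illustrative):

    # /etc/exports - one spindle per group of builders
    /export/disk0  builder01(rw,async,no_subtree_check) builder02(rw,async,no_subtree_check)
    /export/disk1  builder03(rw,async,no_subtree_check) builder04(rw,async,no_subtree_check)
    # and likewise for disk2 and disk3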


In the end, if all this is done and the builders are delayed by deep 
sleeping nfsds, the only options are to move /var/lib/mock to local 
storage or increase the number of spindles on the server.



Disable fs
journaling (normally dangerous, but this is throw-away space).


Not really dangerous - the only danger is that you might have to wait
for fsck to do its thing on an unclean shutdown (which can take hours
on a full TB-scale disk, granted).


I mean dangerous in the sense that if the server goes down, there might 
be data loss, but the builders using the space won't know that.  This is 
particularly true if nfs exports are async.



Speaking of "dangerous" tweaks, you could LD_PRELOAD libeatmydata (add
to a profile.d file in the mock config, and add the package to the
buildsys-build group). That will eat fsync() calls which will smooth out
commits and make a substantial difference to performance. Since it's
scratch space anyway it doesn't matter WRT safety.


Sounds good to me :-)


The zsh build will break on NFS whatever you do. It will also break on a
local FS with noatime. There may be other packages that suffer from this
issue but I don't recall them off the top of my head. Anyway, that is an
issue for a build policy - have one builder using block level storage
with atime and the rest on NFS.


Since loopback files representing filesystems are being used with nfs as 
the storage mechanism, this would probably be a non-issue.  You just 
can't have the builder mount its loopback fs noatime (hadn't thought of 
that previously).



Once all that is done, tweak the number of nfsds such that
there are as many as possible without most of them going into deep
sleep. Perhaps somebody else can suggest some optimal sysctl and ext4fs
settings?


As mentioned in a previous post, have a look here:
http://www.altechnative.net/?p=96

Deadline scheduler might also help on the NAS/SAN end, plus all the
usual tweaks (e.g. make sure write caches on the disks are enabled, if
the disks support write-read-verify disable it, etc.)


Definitely worth testing.  Well ordered IO is critical here.

--
Brendan Conoboy / Red Hat, Inc. / b...@redhat.com

Re: [fedora-arm] builder io issue

2011-12-25 Thread Gordan Bobic

On 12/26/2011 03:57 AM, Brendan Conoboy wrote:

On 12/25/2011 03:47 AM, Gordan Bobic wrote:

On 12/25/2011 06:16 AM, Brendan Conoboy wrote:

Allocating builders to individual disks rather than a single RAID volume will
help dramatically.

Care to explain why?


Sure, see below.


Is this a "proper" SAN or just another Linux box with some disks in it?
Is NFS backed by a SAN "volume"?


As I understand it, the server is a Linux host using raid0 with 512k
chunks across 4 sata drives. This md device is then formatted with some
filesystem (ext4?). Directories on this filesystem are then exported to
individual builders such that each builder has its own private space.
These private directories contain a large file that is used as a
loopback ext4fs (IE, the builder mounts the nfs share, then loopback
mounts the file on that nfs share as an ext4fs). This is where
/var/lib/mock comes from. Just to be clear, if you looked at nfs mounted
directory on a build host you would see a single large file that
represented a filesystem, making traditional ext?fs tuning a bit more
complicated.


Why not just mount directly via NFS? It'd be a lot quicker, not to mention 
easier to tune. It'd work for building all but a handful of packages 
(e.g. zsh), but you could handle those by having a single builder that 
uses a normal FS, with a policy pointing the packages that fail their 
self-tests on NFS at it.



The structural complication is that we have something like 30-40 systems
all vying for the attention of those 4 spindles. It's really important
that each builder not cause more than one disk to perform an operation
because seeks are costly, and if just 2 disks get called up by a single
builder, 50% of the storage resources will be taken up by a single host
until the operation completes. With 40 hosts, you'll just end up
thrashing (and with considerably fewer hosts, too). RAID0 gives great
throughput, but it's at the cost of latency. With so many 100mbit
builders, throughput is less important and latency is key.


512KB chunks sound vastly oversized for this sort of a workload. But if 
you are running ext4 on top of a loopback file on top of NFS, no wonder 
the performance sucks.



Roughly put, the two goals for good performance in this scenario are:

1. Make sure each builder only activates one disk per operation.


It sounds like a better way to ensure that would be to re-architect the 
storage solution more sensibly. If you really want to use block-level 
storage, use iSCSI on top of raw partitions. Providing those partitions 
are suitably aligned (e.g. for 4KB physical-sector disks, erase block 
sizes, underlying RAID, etc.), your FS on top of those iSCSI exports 
will also end up being properly aligned, and the stride, stripe-width 
and block group size will all still line up properly.
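
With the stock scsi-target-utils, for instance, a raw-partition export per
builder is only a few lines of /etc/tgt/targets.conf (target name, partition
and initiator address all invented for illustration):

    <target iqn.2011-12.org.example:builder01>
        backing-store /dev/sdb1
        initiator-address 10.0.0.101
    </target>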


But with 40 builders, each builder only hammering one disk, you'll still 
get 10 builders hammering each spindle and causing a purely random seek 
pattern. I'd be shocked if you see any measurable improvement from just 
splitting up the RAID.



2. Make sure each io operation causes the minimum amount of seeking.

You're right that good alignment and block sizes and whatnot will help
this cause, but there is still greater likelihood of io operations
traversing spindle boundaries periodically in the best situation. You'd
need a chunk size about equal to the fs image file size to pull that
off.


Using the fs image over loopback over NFS sounds so eyewateringly wrong 
that I'm just going to give up on this thread if that part is immutable. 
I don't think the problem is significantly fixable if that approach remains.



Perhaps an lvm setup with strictly defined layouts with each
lvcreate would make it a bit more manageable, but for simplicity's sake
I advocate simply treating the 4 disks like 4 disks, exported according
to expected usage patterns.


I don't see why you think that seeking within a single disk is any less 
problematic than seeking across multiple disks. That will only happen 
when the file exceeds the chunk size, and that will typically happen 
only at the end when linking - there aren't many cases where a single 
code file is bigger than a sensible chunk size (and in a 4-disk RAID0 
case, you're pretty much forced to use a 32KB chunk size if you intend 
for the block group beginnings to be distributed across spindles).



In the end, if all this is done and the builders are delayed by deep
sleeping nfsds, the only options are to move /var/lib/mock to local
storage or increase the number of spindles on the server.


And local storage will be what? SD cards? There's only one model line of 
SD cards I have seen to date that actually produce random-write results 
that begin to approach a ~5000 rpm disk (up to 100 IOPS), and those are 
SLC and quite expensive. Having spent the last few months patching, 
fixing up and rebuilding RHEL6 packages for ARM, I have a pretty good 
understanding of what works for backing storage and what doesn't - and 
SD 

Re: [fedora-arm] builder io issue

2011-12-25 Thread Gordan Bobic

On 12/26/2011 05:06 AM, Gordan Bobic wrote:


Disable fs
journaling (normally dangerous, but this is throw-away space).


Not really dangerous - the only danger is that you might have to wait
for fsck to do it's thing on an unclean shutdown (which can take hours
on a full TB scale disk, granted).


I mean dangerous in the sense that if the server goes down, there might
be data loss, but the builders using the space won't know that. This is
particularly true if nfs exports are async.


Strictly speaking, journal is about preventing the integrity of the FS
so you don't have to fsck it after an unclean shutdown, not about
preventing data loss as such. But I guess you could argue the two are
related.


Obviously, I meant protecting rather than preventing the integrity. :^)
*facepalm*

Gordan

Re: [fedora-arm] builder io issue

2011-12-25 Thread Brendan Conoboy

On 12/25/2011 09:06 PM, Gordan Bobic wrote:

Why not just mount direct via NFS? It'd be a lot quicker, not to mention
easier to tune. It'd work for building all but a handful of packages
(e.g. zsh), but you could handle that by having a single builder that
uses a normal fs that has a policy pointing the packages that fail
self-tests on NFS at it.


I'm not acquainted with the rationale for the decision so perhaps 
somebody else can comment.  Beyond the packages that demand a local 
filesystem, perhaps there were issues with .nfsXXX files, or some 
stability problem not seen when working with a single open file? Not sure.



512KB chunks sound vastly oversized for this sort of a workload. But if
you are running ext4 on top of loopback file on top of NFS, no wonder
the performance sucks.


Well, 512KB chunks is oversize for traditional NFS use, but perhaps 
undersized for this unusual use case.



Sounds like a better way to ensure that would be to re-architect the
storage solution more sensibly. If you really want to use block level
storage, use iSCSI on top of raw partitions. Providing those partitions
are suitably aligned (e.g. for 4KB physical sector disks, erase block
sizes, underlying RAID, etc.), your FS on top of those iSCSI exports
will also end up being properly aligned, and the stride, stripe-width
and block group size will all still line up properly.


I understand there was an issue with iSCSI stability about a year ago. 
One of our engineers tried it on his trimslice recently and had no 
problems so it may be time to reevaluate its use.



But with 40 builders, each builder only hammering one disk, you'll still
get 10 builders hammering each spindle and causing a purely random seek
pattern. I'd be shocked if you see any measurable improvement from just
splitting up the RAID.


Let's say 10 (40/4) builders are using one disk at the same time - that's 
not necessarily a doomsday scenario since their network speed is only 
100Mbps.  The one situation you want to avoid is having numerous mock 
setups going at one time; that will amount to a hammering.  How much time on 
average is spent composing the chroot vs. building?  Sure, at some point 
builders will simply overwhelm any given disk, but what is that point? 
My guess is that 10 is really pushing it.  5 would be better.



Using the fs image over loopback over NFS sounds so eyewateringly wrong
that I'm just going to give up on this thread if that part is immutable.
I don't think the problem is significantly fixable if that approach
remains.


Why is that?


I don't see why you think that seeking within a single disk is any less
problematic than seeking across multiple disks. That will only happen
when the file exceeds the chunk size, and that will typically happen
only at the end when linking - there aren't many cases where a single
code file is bigger than a sensible chunk size (and in a 4-disk RAID0
case, you're pretty much forced to use a 32KB chunk size if you intend
for the block group beginnings to be distributed across spindles).


It's the chroot composition that makes me think seeking across multiple 
disks is an issue.



And local storage will be what? SD cards? There's only one model line of
SD cards I have seen to date that actually produce random-write results
that begin to approach a ~5000 rpm disk (up to 100 IOPS), and those are
SLC and quite expensive. Having spent the last few months patching,
fixing up and rebuilding RHEL6 packages for ARM, I have a pretty good
understanding of what works for backing storage and what doesn't - and
SD cards are not an approach to take if performance is an issue. Even
expensive, highly branded Class 10 SD cards only manage ~ 20 IOPS
(80KB/s) on random writes.


80KB/s? Really?  That sounds like bad alignment.


Strictly speaking, journal is about preventing the integrity of the FS
so you don't have to fsck it after an unclean shutdown, not about
preventing data loss as such. But I guess you could argue the two are
related.


Right, sorry, was still thinking of async.


I'm still not sure what the point is of using a loopback-mounted file for
storage instead of raw NFS. NFS mounted with nolock,noatime,proto=udp
works exceedingly well for me with NFSv3.


I didn't think udp was a good idea any longer.


Well, deadline is about favouring reads over writes. Writes you can
buffer as long as you have RAM to spare (especially with libeatmydata
LD_PRELOAD-ed). Reads, however, block everything until they complete. So
favouring reads over writes may well get you ahead in terms of keeping
the builders busy.


It really begs the question: what are builders blocking on right now? 
I'd assumed chroot composition, which is rather write-heavy.


--
Brendan Conoboy / Red Hat, Inc. / b...@redhat.com