Re: high load, no bottleneck

2013-10-01 Thread Michael
Hello, on Saturday 28 September 2013 04:54:19 you wrote: > Date: Sat, 28 Sep 2013 09:09:02 +0100 > From: "Roland C. Dowdeswell" > Message-ID: <20130928080902.gg4...@roofdrak.imrryr.org> > > | I thought quite some time ago that it probably makes sense for us > | to m

Re: high load, no bottleneck

2013-09-30 Thread Mouse
>> Basically, if we have N pending VOP_FSYNC for a given filesystem, >> all these requests will be honoured on first flush, but they are >> serialized and will be acknowledged one by one, with the cost of a >> useless flush each time. Am I right? > Do you mean "all these requests *could* be honor
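
The batching asked for here is essentially group commit: one cache flush acknowledges every fsync that was pending before it started, and requests arriving mid-flush simply wait for the next one. A minimal userland sketch of the idea (hypothetical, with invented names; not the NetBSD kernel code):

    #include <pthread.h>
    #include <stdint.h>
    #include <unistd.h>

    /* Flush coalescing: a caller that arrives while a flush is already in
     * flight is conservatively not covered by it, so it waits for the
     * generation after the current one instead of issuing its own flush. */
    struct flushgate {
        pthread_mutex_t mtx;
        pthread_cond_t  cv;
        uint64_t        gen;       /* completed-flush generation */
        int             flushing;  /* nonzero while a flush is in flight */
    };

    void
    fsync_coalesced(struct flushgate *g, int fd)
    {
        pthread_mutex_lock(&g->mtx);
        uint64_t target = g->gen + (g->flushing ? 2 : 1);
        while (g->gen < target) {
            if (!g->flushing) {
                g->flushing = 1;
                pthread_mutex_unlock(&g->mtx);
                fsync(fd);         /* one flush serves all current waiters */
                pthread_mutex_lock(&g->mtx);
                g->flushing = 0;
                g->gen++;
                pthread_cond_broadcast(&g->cv);
            } else {
                pthread_cond_wait(&g->cv, &g->mtx);
            }
        }
        pthread_mutex_unlock(&g->mtx);
    }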

Re: high load, no bottleneck

2013-09-28 Thread Emmanuel Dreyfus
Thor Lancelot Simon wrote: > Do you mean "all these requests *could* be honored on first flush"? If > so, then yes, I agree. It could be done in a bold way by turning any VOP_FSYNC into a VFS_SYNC. Obviously it would be inappropriate in most situations, but when facing a huge amount of concuren
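
The "bold way" above would amount to something like the following kernel fragment (a hypothetical sketch, not actual ffs code; VFS_SYNC, MNT_WAIT and curlwp are real NetBSD interfaces, the wrapper is invented):

    #include <sys/param.h>
    #include <sys/mount.h>
    #include <sys/vnode.h>
    #include <sys/proc.h>

    /* Satisfy a per-file fsync with a whole-filesystem sync, so one
     * pass can retire many pending VOP_FSYNC requests at once. */
    static int
    fsync_as_vfs_sync(struct vnode *vp)
    {
        /* MNT_WAIT keeps the synchronous semantics fsync promises */
        return VFS_SYNC(vp->v_mount, MNT_WAIT, curlwp->l_cred);
    }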

Re: high load, no bottleneck

2013-09-28 Thread Thor Lancelot Simon
On Sat, Sep 28, 2013 at 06:25:22AM +0200, Emmanuel Dreyfus wrote: > > Basically, if we have N pending VOP_FSYNC for a given filesystem, all > these requests will be honoured on first flush, but they are serialized > and will be acknowledged one by one, with the cost of a useless flush > each time

Re: high load, no bottleneck

2013-09-28 Thread Emmanuel Dreyfus
Michael van Elst wrote: > >Basically, if we have N pending VOP_FSYNC for a given filesystem, all > >these requests will be honoured on first flush, but they are serialized > >and will be acknowledged one by one, with the cost of a useless flush > >each time. Am I right? > > That should be trivi

Re: high load, no bottleneck

2013-09-28 Thread Michael van Elst
m...@netbsd.org (Emmanuel Dreyfus) writes: >Basically, if we have N pending VOP_FSYNC for a given filesystem, all >these requests will be honoured on first flush, but they are serialized >and will be acknowledged one by one, with the cost of a useless flush >each time. Am I right? That should be

Re: high load, no bottleneck

2013-09-28 Thread Emmanuel Dreyfus
Robert Elz wrote: > incidentally, while the man page says that -o log and -o async > can't be used together, if they are, the result is a panic, rather > than a more graceful error message ... This could be a real problem on a system that allows unprivileged users to mount thumb drives... --
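
A graceful failure would only take a flag check in the mount path, along these lines (hypothetical sketch; MNT_ASYNC and MNT_LOG are the real flags from <sys/mount.h>, the helper is invented):

    #include <sys/mount.h>
    #include <sys/errno.h>

    /* Refuse -o log together with -o async up front, instead of
     * letting WAPBL panic later. */
    static int
    check_log_async(const struct mount *mp)
    {
        if ((mp->mnt_flag & MNT_LOG) && (mp->mnt_flag & MNT_ASYNC))
            return EINVAL;
        return 0;
    }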

Re: high load, no bottleneck

2013-09-28 Thread Robert Elz
Date: Sat, 28 Sep 2013 09:09:02 +0100 From: "Roland C. Dowdeswell" Message-ID: <20130928080902.gg4...@roofdrak.imrryr.org> | I thought quite some time ago that it probably makes sense for us | to make the installer mount everything async to extract the sets | beca

Re: high load, no bottleneck

2013-09-28 Thread Roland C. Dowdeswell
On Sat, Sep 28, 2013 at 05:56:50PM +1000, matthew green wrote: > > > ps: I had been meaning to rant like this for some time, your message just > > provided the incentive today! > > :-) > > i will note that i'm also a fan of using -o async FFS mounts in > the right place. i just wouldn't do it f

re: high load, no bottleneck

2013-09-28 Thread matthew green
> ps: I had been meaning to rant like this for some time, your message just > provided the incentive today! :-) i will note that i'm also a fan of using -o async FFS mounts in the right place. i just wouldn't do it for a file server :-)

Re: high load, no bottleneck

2013-09-28 Thread Robert Elz
Date: Sat, 28 Sep 2013 14:24:32 +1000 From: matthew green Message-ID: <11701.1380342...@splode.eterna.com.au> | -o async is very dangerous. there's not even the vaguest | guarantee that even fsck can help you after a crash in | that case... All true, still it is

Re: high load, no bottleneck

2013-09-27 Thread Mouse
>>> I tried moving a client NFS mount to async. [...] >> Further testing shows that server with -o log / client with -o async >> has no performance problem. OTOH, the client sometimes complains >> about write errors. -o async seems dangerous. > -o async is very dangerous. there's not even the va

re: high load, no bottleneck

2013-09-27 Thread matthew green
> > I tried moving a client NFS mount to async. The result is that the > > server never sees a filesync again from that client. > > Further testing shows that server with -o log / client with -o async has > no performance problem. OTOH, the client sometimes complains about write > errors. -o asy

Re: high load, no bottleneck

2013-09-27 Thread Emmanuel Dreyfus
Thor Lancelot Simon wrote: > It should be possible to gather those requests and commit many of them > at once to disk with a single cache flush operation, rather than issuing > a cache flush for each one. This is not unlike the problem with nfs3 in > general, that many clients at once may issue

Re: high load, no bottleneck

2013-09-27 Thread Emmanuel Dreyfus
Emmanuel Dreyfus wrote: > I tried moving a client NFS mount to async. The result is that the > server never sees a filesync again from that client. Further testing shows that server with -o log / client with -o async has no performance problem. OTOH, the client sometimes complains about write

Re: high load, no bottleneck

2013-09-26 Thread Thor Lancelot Simon
On Tue, Sep 24, 2013 at 04:01:36PM +0100, David Brownlee wrote: > crap, apologies for the non checked return address. > > In the interest of trying to make a relevant reply - doesn't nfs3 > support differing COMMIT sync levels which could be leveraged for > this? (assuming your server is stable :)
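
For reference, the COMMIT sync levels in question are NFSv3's stable_how values from RFC 1813; they are also where the filesync/unstable labels in the write statistics quoted elsewhere in this thread come from:

    /* NFSv3 WRITE stability levels, per RFC 1813 */
    enum stable_how {
        UNSTABLE  = 0,  /* server may reply before data reaches stable
                         * storage; the client must COMMIT (or resend) later */
        DATA_SYNC = 1,  /* file data committed, metadata may lag */
        FILE_SYNC = 2   /* data and metadata committed before the reply */
    };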

Re: high load, no bottleneck

2013-09-25 Thread Paul Goyette
On Thu, 26 Sep 2013, Emmanuel Dreyfus wrote: Emmanuel Dreyfus wrote: async Assume that unstable write requests have actually been committed to stable storage on the server, and thus will not require resending in the event that the server crashes. Use of this

Re: high load, no bottleneck

2013-09-25 Thread Emmanuel Dreyfus
Emmanuel Dreyfus wrote: > async Assume that unstable write requests have actually been committed > to stable storage on the server, and thus will not require > resending in the event that the server crashes. Use of this > option may improve performan

Re: high load, no bottleneck

2013-09-24 Thread Emmanuel Dreyfus
David Brownlee wrote: > In the interest of trying to make a relevant reply - doesn't nfs3 > support differing COMMIT sync levels which could be leveraged for > this? (assuming your server is stable :) Would it be that mount_nfs option? (from MacOS X man page) async Assume that unstable w

Re: high load, no bottleneck

2013-09-24 Thread David Brownlee
crap, apologies for the non checked return address. In the interest of trying to make a relevant reply - doesn't nfs3 support differing COMMIT sync levels which could be leveraged for this? (assuming your server is stable :) I recall using NFS for file storage at Dreamworks in the late '90s and

Re: high load, no bottleneck

2013-09-24 Thread David Brownlee
http://www.math.uni-bonn.de/people/ef/dotcache/ has a typo in the first subheading "Dotache" :) On 24 September 2013 13:38, Edgar Fuß wrote: >> We want fsync to do a disk sync, and client are unlikely to be fixable. > In my case, the culprit was SQLite used by browsers and dropbox. > As these wer

Re: high load, no bottleneck

2013-09-24 Thread Edgar Fuß
> We want fsync to do a disk sync, and clients are unlikely to be fixable. In my case, the culprit was SQLite used by browsers and dropbox. As these were not fixable, I ended up writing a system that re-directs these SQLite files to local storage (http://www.math.uni-bonn.de/people/ef/dotcache). >

Re: high load, no bottleneck

2013-09-24 Thread Emmanuel Dreyfus
Edgar Fuß wrote: > > I have no idea whether [several journal flushes per second] is high or low. > It should be killing you. > So the main question is who is issuing these small sync writes. > As already mentioned, per filesync write you get a WAPBL journal flush > which ends up in two disc flushe

Re: high load, no bottleneck

2013-09-24 Thread Edgar Fuß
> I have no idea whether [several journal flushes per second] is high or low. It should be killing you. So the main question is who is issuing these small sync writes. As already mentioned, per filesync write you get a WAPBL journal flush which ends up in two disc flushes (one before and one after)
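
To put rough numbers on that: at the 1 to 6 journal flushes per second reported elsewhere in this thread, two disc flushes each means 2 to 12 cache flushes per second. Assuming (an estimate, not a measured figure) that a cache flush costs on the order of 20-50 ms on rotating media, that alone can consume anywhere from a few percent up to roughly half of the discs' time, without the discs ever appearing busy in iostat.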

Re: high load, no bottleneck

2013-09-23 Thread Emmanuel Dreyfus
Edgar Fuß wrote: > EF> However, the amount of filesync writes may still be the problem. > EF> The missing data (for me) is how often your WAPBL journal gets flushed > ED> How can that be retrieved? > Look at the WAPBL debug output in syslog (which has time stamps). min: 1 flush/s max: 6 flush/s

Re: high load, no bottleneck

2013-09-23 Thread Edgar Fuß
EF> However, the amount of filesync writes may still be the problem. EF> The missing data (for me) is how often your WAPBL journal gets flushed ED> How can that be retrieved? Look at the WAPBL debug output in syslog (which has time stamps). EF> How large are your stripes, btw.? ED> It is the sect

Re: high load, no bottleneck

2013-09-22 Thread Emmanuel Dreyfus
Edgar Fuß wrote: > I don't think it would. However, the amount of filesync writes may still > be the problem. The missing data (for me) is how often your WAPBL journal > gets flushed (because of these filesync writes). I How can that be retrieved? > Additionally, every single small sync write

Re: high load, no bottleneck

2013-09-22 Thread Greg Troxel
Edgar Fuß writes: > I myself can't make sense out of the combination of > -- vfs.wapbl.flush_disk_cache=0 mitigating the problem > -- neither the RAID set nor its components showing busy in iostat > Maybe during a flush, the discs are not regarded busy? > Do you have physical access to the serve

Re: high load, no bottleneck

2013-09-22 Thread Edgar Fuß
> I confirm. Indeed this is weird. Yes. I would try finding out who causes this. > But how could small writes kill WAPBL performance? I don't think it would. However, the amount of filesync writes may still be the problem. The missing data (for me) is how often your WAPBL journal gets flushed (be

Re: high load, no bottleneck

2013-09-21 Thread Emmanuel Dreyfus
Edgar Fuß wrote: > Maybe also you get some hint by trying to find out whether the problem is > NFS or client related. Are you able to reproduce it locally? Can you > make it happen (to a lesser extent, of course) with a single process? I tried various ways but I could not obtain the same phenome

Re: high load, no bottleneck

2013-09-21 Thread Manuel Bouyer
On Sat, Sep 21, 2013 at 09:58:30AM +, Michael van Elst wrote: > e...@math.uni-bonn.de (Edgar Fuß) writes: > > >I myself can't make sense out of the combination of > >-- vfs.wapbl.flush_disk_cache=0 mitigating the problem > >-- neither the RAID set nor its components showing b

Re: high load, no bottleneck

2013-09-21 Thread Manuel Bouyer
On Sat, Sep 21, 2013 at 12:01:45PM +0200, Edgar Fuß wrote: > > I suspect that indeed, while a flush cache command is running, the > > disk is not considered busy. Only read and write commands are tracked. > Would it a) make sense b) be possible to implement that flushes are counted > as busy? It

Re: high load, no bottleneck

2013-09-21 Thread Edgar Fuß
> I suspect that indeed, while a flush cache command is running, the > disk is not considered busy. Only read and write commands are tracked. Would it a) make sense b) be possible to implement that flushes are counted as busy?
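
It looks implementable: the iostat figures come from NetBSD's disk_busy()/disk_unbusy() accounting hooks, so bracketing the flush with them would make flush time show up. A hypothetical sketch (disk_busy() and disk_unbusy() are the real hooks; the flush helper is invented):

    #include <sys/disk.h>

    /* Charge a cache flush against the drive's busy time so that
     * iostat's time column reflects it. */
    static int
    flush_cache_accounted(struct disk *dk)
    {
        int error;

        disk_busy(dk);                  /* flush begins: drive is busy */
        error = issue_flush_cache(dk);  /* invented stand-in for the
                                         * driver's cache-flush command */
        disk_unbusy(dk, 0, 0);          /* done: zero bytes moved, not a read */
        return error;
    }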

Re: high load, no bottleneck

2013-09-21 Thread Michael van Elst
e...@math.uni-bonn.de (Edgar Fuß) writes: >I myself can't make sense out of the combination of >-- vfs.wapbl.flush_disk_cache=0 mitigating the problem >-- neither the RAID set nor its components showing busy in iostat >Maybe during a flush, the discs are not regarded busy? busy

Re: high load, no bottleneck

2013-09-21 Thread Manuel Bouyer
On Sat, Sep 21, 2013 at 11:44:26AM +0200, Edgar Fuß wrote: > ED> 2908 getattr > EF> During which timeframe? > ED> 22.9 seconds. > So that's >100 getattrs per second. > > > Indeed [lots of 549-byte write requests] is weird. > > But how could small writes kill WAPBL performance? > No idea. I think I

Re: high load, no bottleneck

2013-09-21 Thread Edgar Fuß
ED> 2908 getattr EF> During which timeframe? ED> 22.9 seconds. So that's >100 getattrs per second. > Indeed [lots of 549-byte write requests] is weird. > But how could small writes kill WAPBL performance? No idea. I think I'm out of luck now, but maybe it rings a bell with someone else. It would

Re: high load, no bottleneck

2013-09-20 Thread Emmanuel Dreyfus
Edgar Fuß wrote: > > Output of your first script: > > 2908 getattr > During which timeframe? 22.9 seconds. > > Output of your second script: > > 167 549 unstable > > 28 549 filesync > So you mostly get 549-byte write requests? > Could you manually double-check this? I confirm. Indeed this

Re: high load, no bottleneck

2013-09-20 Thread Edgar Fuß
> Here is an excerpt: > device read KB/t r/s time MB/s write KB/t w/s time MB/s > wd0 0.00 0 0.00 0.00 13.74 27 0.00 0.36 > wd1 0.00 0 0.00 0.00 13.74 27 0.00 0.36 > raid1 0.00 0 0.00 0.00

Re: high load, no bottleneck

2013-09-19 Thread Greg Oster
On Fri, 20 Sep 2013 01:37:20 +0200 m...@netbsd.org (Emmanuel Dreyfus) wrote: > Greg Oster wrote: > > > Any additional load you have on the RAID set while rebuilding > > parity is just going to make things worse... What you really want > > to do is turn on the parity logging stuff, and reduce th

Re: high load, no bottleneck

2013-09-19 Thread Emmanuel Dreyfus
Edgar Fuß wrote: > How often do these log flushes occur? On a 6.1 kernel with RAIDOUTSTANDING=800 and -o log. Stress test raises load to around 10. > During the stress phase, what does > iostat -D -x -w 1 > show for the raid and for the components, especially in the time column? Here is

Re: high load, no bottleneck

2013-09-19 Thread Brian Buhrow
On Sep 19, 8:53pm, Emmanuel Dreyfus wrote: } Subject: Re: high load, no bottleneck } Greg Oster wrote: } } IMO raidctl makes more sense here, as it is the place where one is } looking for RAID stuff. } } While I am there: fsck takes an infinite time while RAIDframe is } rebuilding parity. I

Re: high load, no bottleneck

2013-09-19 Thread Emmanuel Dreyfus
Brian Buhrow wrote: > options RAIDOUTSTANDING=40 #try and enhance raid performance. I gave it a try, and even with RAIDOUTSTANDING set to 800 on a NetBSD-6.1 kernel, my stress test raises load over 10 with -o log, whereas it remains below 1 without -o log. Therefore it must be something else.

Re: high load, no bottleneck

2013-09-19 Thread Emmanuel Dreyfus
Greg Oster wrote: > Any additional load you have on the RAID set while rebuilding parity is > just going to make things worse... What you really want to do is turn > on the parity logging stuff, and reduce the amount of effort spent > checking parity by orders of magnitude... You mean raidctl -

Re: high load, no bottleneck

2013-09-19 Thread Christos Zoulas
On Sep 19, 11:35am, buh...@nfbcal.org (Brian Buhrow) wrote: -- Subject: Re: high load, no bottleneck | Hello. The worst case scenario is when a raid set is running in | degraded mode. Greg sent me some notes on how to calculate the memory | utilization in this instance. I'll go dig

Re: high load, no bottleneck

2013-09-19 Thread Greg Oster
On Thu, 19 Sep 2013 11:26:21 -0700 (PDT) Paul Goyette wrote: > On Thu, 19 Sep 2013, Brian Buhrow wrote: > > > The line I include in my config files is: > > > > options RAIDOUTSTANDING=40 #try and enhance raid performance. > > Is this likely to have any impact on a system with multiple raid-1

Re: high load, no bottleneck

2013-09-19 Thread Christos Zoulas
On Sep 19, 6:41pm, m...@netbsd.org (Emmanuel Dreyfus) wrote: -- Subject: Re: high load, no bottleneck | Greg Oster wrote: | | > > sysctl to the rescue. | > | > The appropriate 'bit to twiddle' is likely raidPtr->openings. | > Increasing the value can be done

Re: high load, no bottleneck

2013-09-19 Thread Greg Oster
On Thu, 19 Sep 2013 20:53:30 +0200 m...@netbsd.org (Emmanuel Dreyfus) wrote: > Greg Oster wrote: > > > It's probably easier to do by raidctl right now. I'm not opposed to > > having RAIDframe grow a sysctl interface as well if folks think that > > makes sense. The 'openings' value is currently

Re: high load, no bottleneck

2013-09-19 Thread Emmanuel Dreyfus
Greg Oster wrote: > It's probably easier to do by raidctl right now. I'm not opposed to > having RAIDframe grow a sysctl interface as well if folks think that > makes sense. The 'openings' value is currently set on a per-RAID basis, > so a sysctl would need to be able to handle individual RAID s

RAIDOUTSTANDING (was: high load, no bottleneck)

2013-09-19 Thread Edgar Fuß
> options RAIDOUTSTANDING=40 #try and enhance raid performance. Is there any downside to this other than memory usage? How much does one unit cost?

Re: high load, no bottleneck

2013-09-19 Thread Paul Goyette
On Thu, 19 Sep 2013, Brian Buhrow wrote: The line I include in my config files is: options RAIDOUTSTANDING=40 #try and enhance raid performance. Is this likely to have any impact on a system with multiple raid-1 mirrors? ---

Re: high load, no bottleneck

2013-09-19 Thread Brian Buhrow
Hello. The worst case scenario is when a raid set is running in degraded mode. Greg sent me some notes on how to calculate the memory utilization in this instance. I'll go dig them out and send them along in a bit. In theory, if all your raid sets are in degraded mode at once, and i/o i

Re: RAIDOUTSTANDING (was: high load, no bottleneck)

2013-09-19 Thread Greg Oster
On Thu, 19 Sep 2013 20:14:33 +0200 Edgar Fuß wrote: > > options RAIDOUTSTANDING=40 #try and enhance raid performance. > Is there any downside to this other than memory usage? > How much does one unit cost? This is from the comment in src/sys/dev/raidframe/rf_netbsdkintf.c : /* * Allow RAIDOUT

Re: high load, no bottleneck

2013-09-19 Thread Brian Buhrow
Hello. Thor's right. The raidframe driver defaults to a ridiculously low number of maximum outstanding transactions for today's environment. This is not a criticism of how the number was chosen initially, but things have changed. In my production kernels around here, I include the followi

Re: high load, no bottleneck

2013-09-19 Thread Greg Oster
On Thu, 19 Sep 2013 18:41:45 +0200 m...@netbsd.org (Emmanuel Dreyfus) wrote: > Greg Oster wrote: > > > > sysctl to the rescue. > > > > The appropriate 'bit to twiddle' is likely raidPtr->openings. > > Increasing the value can be done while holding raidPtr->mutex. > > Decreasing the value can al

Re: high load, no bottleneck

2013-09-19 Thread Emmanuel Dreyfus
Greg Oster wrote: > > sysctl to the rescue. > > The appropriate 'bit to twiddle' is likely raidPtr->openings. > Increasing the value can be done while holding raidPtr->mutex. > Decreasing the value can also be done while holding raidPtr->mutex, but > will need some care if attempting to decrease
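
In code, the easy direction described here could look like the following (a hypothetical sketch built on the raidPtr->openings and raidPtr->mutex fields named above; the locking wrappers are assumptions modeled on RAIDframe's own):

    /* Raise the number of simultaneously pending I/Os. Growing is safe
     * under the mutex; shrinking would additionally have to wait for
     * in-flight I/O to drain below the new limit (omitted here). */
    static void
    raid_grow_openings(RF_Raid_t *raidPtr, int new_openings)
    {
        rf_lock_mutex2(raidPtr->mutex);
        if (new_openings > raidPtr->openings)
            raidPtr->openings = new_openings;
        rf_unlock_mutex2(raidPtr->mutex);
    }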

Re: high load, no bottleneck

2013-09-19 Thread Greg Oster
On Thu, 19 Sep 2013 10:29:55 -0400 chris...@zoulas.com (Christos Zoulas) wrote: > On Sep 19, 8:13am, t...@panix.com (Thor Lancelot Simon) wrote: > -- Subject: Re: high load, no bottleneck > > | On Wed, Sep 18, 2013 at 06:03:11PM +0200, Emmanuel Dreyfus wrote: > | > Emman

Re: high load, no bottleneck

2013-09-19 Thread Christos Zoulas
On Sep 19, 8:13am, t...@panix.com (Thor Lancelot Simon) wrote: -- Subject: Re: high load, no bottleneck | On Wed, Sep 18, 2013 at 06:03:11PM +0200, Emmanuel Dreyfus wrote: | > Emmanuel Dreyfus wrote: | > | > > Thank you for saving my day. But now what happens? | > > I note

Re: high load, no bottleneck

2013-09-19 Thread Emmanuel Dreyfus
On Thu, Sep 19, 2013 at 08:13:42AM -0400, Thor Lancelot Simon wrote: > There is at least one thing: RAIDframe doesn't allow enough simultaneously > pending transactions, so everything *really* backs up behind the cache flush. > > Fixing that would require allowing RAIDframe to eat more RAM. Last

Re: high load, no bottleneck

2013-09-19 Thread Thor Lancelot Simon
On Wed, Sep 18, 2013 at 06:03:11PM +0200, Emmanuel Dreyfus wrote: > Emmanuel Dreyfus wrote: > > > Thank you for saving my day. But now what happens? > > I note the SATA disks are in IDE emulation mode, and not AHCI. This is > > something I need to try changing: > > Switched to AHCI. Here is belo

Re: high load, no bottleneck

2013-09-19 Thread Edgar Fuß
> I re-enabled -o log and did the dd test again on NetBSD 6.0 with the > patch you posted and vfs.wapbl.verbose_commit=2 I wouldn't expect anything interesting from this, but maybe hannken@ does. > Running my stress test, which drives load to insane values: How often do these log flushes occur? D

Re: high load, no bottleneck

2013-09-18 Thread Emmanuel Dreyfus
Edgar Fuß wrote: > > 35685270 bytes/sec > That's OK. > > > Note I removed -o log > Shouldn't make a difference, I think. I re-enabled -o log and did the dd test again on NetBSD 6.0 with the patch you posted and vfs.wapbl.verbose_commit=2 # dd if=/dev/zero bs=64k of=out count=10000 10000+0 r

Re: high load, no bottleneck

2013-09-18 Thread Emmanuel Dreyfus
Mouse wrote: > Depends. Is the filesystem mounted noatime (or read-only)? If not, > there are going to be atime updates, and don't all inode updates get > done synchronously? Or am I misunderstanding something? That is not the case, anyway: /dev/raid1e on /home type ffs (nodev, noexec, nosuid

Re: high load, no bottleneck

2013-09-18 Thread Edgar Fuß
> 35685270 bytes/sec That's OK. > Note I removed -o log Shouldn't make a difference, I think.

Re: high load, no bottleneck

2013-09-18 Thread Joerg Sonnenberger
On Wed, Sep 18, 2013 at 03:51:03PM -0400, Mouse wrote: > >> Yes, I run 24 concurent tar -czf as a test. > > But those shouldn't do small synchronous writes, should they? > > Depends. Is the filesystem mounted noatime (or read-only)? If not, > there are going to be atime updates, and don't all in

Re: high load, no bottleneck

2013-09-18 Thread Mouse
>> Yes, I run 24 concurent tar -czf as a test. > But those shouldn't do small synchronous writes, should they? Depends. Is the filesystem mounted noatime (or read-only)? If not, there are going to be atime updates, and don't all inode updates get done synchronously? Or am I misunderstanding som

Re: high load, no bottleneck

2013-09-18 Thread Emmanuel Dreyfus
Edgar Fuß wrote: > EF> How fast can you write to the file system in question? > ED> What test do you want me to perform? > dd if=/dev/zero bs=64k helvede# dd if=/dev/zero bs=64k of=out count=10000 10000+0 records in 10000+0 records out 655360000 bytes transferred in 18.365 secs (35685270 bytes

Re: high load, no bottleneck

2013-09-18 Thread Edgar Fuß
EF> How fast can you write to the file system in question? ED> What test do you want me to perform? dd if=/dev/zero bs=64k EF> Does your NFS load include a large amount of small synchronous (filesync) EF> write operations? ED> Yes, I run 24 concurrent tar -czf as a test. But those shouldn't do smal

Re: high load, no bottleneck

2013-09-18 Thread Emmanuel Dreyfus
Edgar Fuß wrote: > How fast can you write to the file system in question? What test do you want me to perform? > Does your NFS load include a large amount of small synchronous (filesync) > write operations? Yes, I run 24 concurrent tar -czf as a test. -- Emmanuel Dreyfus http://hcpnet.free.fr

Re: high load, no bottleneck

2013-09-18 Thread Edgar Fuß
> In this setup, vfs.wapbl.flush_disk_cache=1 still gets high loads, on both 6.0 > and -current. > I assume there must be something bad with WAPBL/RAIDframe Everything up to and including 6.0 is broken in this respect. Thanks to hannken@, 6.1 does align journal flushes. How fast can you write to th

Re: high load, no bottleneck

2013-09-18 Thread Emmanuel Dreyfus
Emmanuel Dreyfus wrote: > Thank you for saving my day. But now what happens? > I note the SATA disks are in IDE emulation mode, and not AHCI. This is > something I need to try changing: Switched to AHCI. Below is how the hard disks are discovered (the relevant raid is RAID1 on wd0 and wd1) In

Re: high load, no bottleneck

2013-09-18 Thread Emmanuel Dreyfus
Christos Zoulas wrote: > You *might* need an fsck after power loss. If you crash and the disk syncs > then you should be ok if the disk flushed (which it probably did if you > saw "syncing disks" after the panic). I am not sure I ever encountered a crash where syncing disks after panic did not lo

Re: high load, no bottleneck

2013-09-18 Thread Christos Zoulas
On Sep 18, 3:34am, m...@netbsd.org (Emmanuel Dreyfus) wrote: -- Subject: Re: high load, no bottleneck | Christos Zoulas wrote: | | > On large filesystems with many files fsck can take a really long time after | > a crash. In my personal experience power outages are much less frequen

Re: high load, no bottleneck

2013-09-18 Thread Christos Zoulas
On Sep 17, 5:38pm, buh...@nfbcal.org (Brian Buhrow) wrote: -- Subject: Re: high load, no bottleneck | Hello. How do you move the wapbl log to a drive other than the one | on which the filesystem that's being logged is running? In other words, I | thought the log existed on the

Re: high load, no bottleneck

2013-09-18 Thread Emmanuel Dreyfus
Thor Lancelot Simon wrote: > In AHCI mode, you might be able to use ordered tags or "force unit access" > (does SATA have this concept per command?) to force individual transactions > or series of transactions out, rather than flushing out all the data every > time just to get the metadata into t

Re: high load, no bottleneck

2013-09-18 Thread Thor Lancelot Simon
On Tue, Sep 17, 2013 at 09:48:49PM +0200, Emmanuel Dreyfus wrote: > > Thank you for saving my day. But now what happens? > I note the SATA disks are in IDE emulation mode, and not AHCI. This is > something I need to try changing: In AHCI mode, you might be able to use ordered tags or "force unit

Re: high load, no bottleneck

2013-09-18 Thread Manuel Bouyer
On Wed, Sep 18, 2013 at 03:34:19AM +0200, Emmanuel Dreyfus wrote: > Christos Zoulas wrote: > > > On large filesystems with many files fsck can take a really long time after > > a crash. In my personal experience power outages are much less frequent than > > crashes (I crash quite a lot since I al

Re: high load, no bottleneck

2013-09-17 Thread Emmanuel Dreyfus
David Holland wrote: > The downside is that without the cache flushing there's some chance > that fsck won't be able to repair things afterwards. This is scary. If this is a WAPBL/NFS/RAIDframe interaction, I think I'd rather dump the RAID than the insurance of having a working fsck after a crash.

Re: high load, no bottleneck

2013-09-17 Thread David Holland
On Wed, Sep 18, 2013 at 03:34:19AM +0200, Emmanuel Dreyfus wrote: > Christos Zoulas wrote: > > > On large filesystems with many files fsck can take a really long time after > > a crash. In my personal experience power outages are much less frequent > > than > > crashes (I crash quite a lot

Re: high load, no bottleneck

2013-09-17 Thread Emmanuel Dreyfus
Christos Zoulas wrote: > On large filesystems with many files fsck can take a really long time after > a crash. In my personal experience power outages are much less frequent than > crashes (I crash quite a lot since I always fiddle with things). If you > don't care about fsck time, you don't nee

Re: high load, no bottleneck

2013-09-17 Thread Brian Buhrow
high load, no bottleneck } On Sep 18, 2:22am, m...@netbsd.org (Emmanuel Dreyfus) wrote: } -- Subject: Re: high load, no bottleneck } } | > The case to worry about is the scenario where the machine } | > suddenly loses power, the data never makes it to the physical media, } | > and gets

Re: high load, no bottleneck

2013-09-17 Thread Christos Zoulas
On Sep 18, 2:22am, m...@netbsd.org (Emmanuel Dreyfus) wrote: -- Subject: Re: high load, no bottleneck | > The case to worry about is the scenario where the machine | > suddenly loses power, the data never makes it to the physical media, | > and gets lost from the cache. In this case

Re: high load, no bottleneck

2013-09-17 Thread Emmanuel Dreyfus
Christos Zoulas wrote: > The case to worry about is the scenario where the machine > suddenly loses power, the data never makes it to the physical media, > and gets lost from the cache. In this case you might end up with a > filesystem that has inconsistent metadata, so the next reboot might > e

Re: high load, no bottleneck

2013-09-17 Thread Emmanuel Dreyfus
Christos Zoulas wrote: > My suggestion is to try: > > sysctl -w vfs.wapbl.flush_disk_cache=0 > > for now... Excellent: the load does not go over 2 now (compared to 50). Thank you for saving my day. But now what happens? I note the SATA disks are in IDE emulation mode, and not AHCI. This i

Re: high load, no bottleneck

2013-09-17 Thread Christos Zoulas
On Sep 17, 9:48pm, m...@netbsd.org (Emmanuel Dreyfus) wrote: -- Subject: Re: high load, no bottleneck | Excellent: the load does not go over 2 now (compared to 50). | | Thank you for saving my day. But now what happens? | I note the SATA disks are in IDE emulation mode, and not AHCI. This is

Re: high load, no bottleneck

2013-09-17 Thread Emmanuel Dreyfus
It was suggested this would be better posted on tech-kern. It happened on NetBSD-6.0, and I tried to upgrade the kernel to -current, with the same result. On Tue, Sep 17, 2013 at 12:54:59PM +, Emmanuel Dreyfus wrote: > I have an NFS server that exhibits a high load (20-30) when supporting > abou

Re: high load, no bottleneck

2013-09-17 Thread Christos Zoulas
In article <1l9czcn.y6kr35aruvzvm%m...@netbsd.org>, Emmanuel Dreyfus wrote: >Emmanuel Dreyfus wrote: > >> db{0}> show vnode c5a24b08 >> OBJECT 0xc5a24b08: locked=0, pgops=0xc0b185a8, npages=1720, refs=16 >> >> VNODE flags 0x4030 >> mp 0xc4a14000 numoutput 0 size 0x6f writesize 0x6f >> da

Re: high load, no bottleneck

2013-09-17 Thread Emmanuel Dreyfus
Emmanuel Dreyfus wrote: > db{0}> show vnode c5a24b08 > OBJECT 0xc5a24b08: locked=0, pgops=0xc0b185a8, npages=1720, refs=16 > > VNODE flags 0x4030 > mp 0xc4a14000 numoutput 0 size 0x6f writesize 0x6f > data 0xc5a25d74 writecount 0 holdcnt 2 > tag VT_UFS(1) type VREG(1) mount 0xc4a14000 ty

Re: high load, no bottleneck

2013-09-17 Thread Emmanuel Dreyfus
J. Hannken-Illjes wrote: > What are your clients doing? MacOS X machines opening sessions kill the server. I can reproduce the problem with just concurrent tar -czf on the NFS volume > Which vnode(s) are your nfsd threads waiting on (first arg to vn_lock)? Here is an example: vn_lock(c5a24b08,

Re: high load, no bottleneck

2013-09-17 Thread J. Hannken-Illjes
On Sep 17, 2013, at 5:39 PM, Emmanuel Dreyfus wrote: > It was suggested this would be better posted on tech-kern. It happened > on NetBSD-6.0, and I tried to upgrade the kernel to -current, with the > same result. > > On Tue, Sep 17, 2013 at 12:54:59PM +, Emmanuel Dreyfus wrote: >> I have a