Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Manuel Bouyer
On Thu, Sep 22, 2016 at 09:33:18PM -0400, Thor Lancelot Simon wrote:
> > AFAIK ordered tags only guarantees that the write will happen in order,
> > but not that the writes are actually done to stable storage.
> 
> The target's not allowed to report the command complete unless the data
> are on stable storage, except if you have write cache enable set in the
> relevant mode page.
> 
> If you run SCSI drives like that, you're playing with fire.  Expect to get
> burned.  The whole point of tagged queueing is to let you *not* set that
> bit in the mode pages and still get good performance.

Now I remember that I did indeed disable disk write cache when I had
scsi disks in production. It's been a while though.

But anyway, from what I remember you still need the disk cache flush
operation for SATA, even with NCQ. It's not equivalent to the SCSI tags.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


FUA and TCQ (was: Plan: journalling fixes for WAPBL)

2016-09-23 Thread Edgar Fuß
> The whole point of tagged queueing is to let you *not* set [the write 
> cache] bit in the mode pages and still get good performance.
I don't get that. My understanding was that TCQ allowed the drive to re-order 
commands within the bounds described by the tags. With the write cache 
disabled, all write commands must hit stable storage before being reported 
completed. So what's the point of tagging with cacheing disabled?


Re: FUA and TCQ (was: Plan: journalling fixes for WAPBL)

2016-09-23 Thread David Holland
On Fri, Sep 23, 2016 at 11:49:50AM +0200, Edgar Fuß wrote:
 > > The whole point of tagged queueing is to let you *not* set [the write 
 > > cache] bit in the mode pages and still get good performance.
 >
 > I don't get that. My understanding was that TCQ allowed the drive
 > to re-order commands within the bounds described by the tags. With
 > the write cache disabled, all write commands must hit stable
 > storage before being reported completed. So what's the point of
 > tagging with cacheing disabled?

You can have more than one in flight at a time. Typically the more you
can manage to have pending at once, the better the performance,
especially with SSDs.

-- 
David A. Holland
dholl...@netbsd.org


Re: FUA and TCQ

2016-09-23 Thread Johnny Billquist

On 2016-09-23 11:49, Edgar Fuß wrote:

The whole point of tagged queueing is to let you *not* set [the write
cache] bit in the mode pages and still get good performance.

I don't get that. My understanding was that TCQ allowed the drive to re-order
commands within the bounds described by the tags. With the write cache
disabled, all write commands must hit stable storage before being reported
completed. So what's the point of tagging with cacheing disabled?


Totally independent of any caching - disk I/O performance can be greatly 
improved by reordering operations to minimize disk head movement. Head 
movement accounts for most of disk I/O time; I'd guess about 90% of it.
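To illustrate the head-movement point, here is a toy model (with made-up LBA numbers, not anything measured in this thread): serving a queue of requests in sorted order turns a series of back-and-forth seeks into a single sweep.

```python
# Toy model: total head travel for a queue of requests, served in
# arrival order vs. sorted (elevator-style) order. LBAs are made up.
def head_travel(start, lbas):
    total, pos = 0, start
    for lba in lbas:
        total += abs(lba - pos)  # distance the head moves for this request
        pos = lba
    return total

queue = [7000, 100, 6500, 200, 6800, 50]
fifo_travel = head_travel(0, queue)            # thrashes back and forth: 39950
sorted_travel = head_travel(0, sorted(queue))  # one sweep across the disk: 7000
```

The absolute numbers are meaningless; the point is the ratio, which is roughly the "90% of the time is seeks" observation above.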


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: FUA and TCQ

2016-09-23 Thread Edgar Fuß
> You can have more than one in flight at a time.
My SCSI knowledge is probably out-dated. How can I have several commands 
in flight concurrently?


Re: FUA and TCQ

2016-09-23 Thread Manuel Bouyer
On Fri, Sep 23, 2016 at 01:13:09PM +0200, Edgar Fuß wrote:
> > You can have more than one in flight at a time.
> My SCSI knowledge is probably out-dated. How can I have several commands 
> in flight concurrently?

This is what tagged queueing is for.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: FUA and TCQ

2016-09-23 Thread Johnny Billquist

On 2016-09-23 13:05, David Holland wrote:

On Fri, Sep 23, 2016 at 11:49:50AM +0200, Edgar Fuß wrote:
 > > The whole point of tagged queueing is to let you *not* set [the write
 > > cache] bit in the mode pages and still get good performance.
 >
 > I don't get that. My understanding was that TCQ allowed the drive
 > to re-order commands within the bounds described by the tags. With
 > the write cache disabled, all write commands must hit stable
 > storage before being reported completed. So what's the point of
 > tagging with cacheing disabled?

You can have more than one in flight at a time. Typically the more you
can manage to have pending at once, the better the performance,
especially with SSDs.


I'd say especially with rotating rust, but either way... :-)
Yes, that's the whole point of tagged queuing: issue many operations and 
let the disk and controller sort out the most efficient order in which 
to do them.


With rotating rust, the order of operations can make a huge difference 
in speed. With SSDs you don't have those seek times to begin with, so I 
would expect the gains to be marginal.


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: FUA and TCQ

2016-09-23 Thread Greg Troxel

Johnny Billquist  writes:

> With rotating rust, the order of operations can make a huge difference
> in speed. With SSDs you don't have those seek times to begin with, so
> I would expect the gains to be marginal.

For reordering, I agree with you, but the SSD speeds are so high that
pipelining is probably necessary to keep the SSD from stalling due to not
having enough data to write.  So this could help move from 300 MB/s
(that I am seeing) to 550 MB/s.


signature.asc
Description: PGP signature


Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Thor Lancelot Simon
On Fri, Sep 23, 2016 at 11:47:24AM +0200, Manuel Bouyer wrote:
> On Thu, Sep 22, 2016 at 09:33:18PM -0400, Thor Lancelot Simon wrote:
> > > AFAIK ordered tags only guarantees that the write will happen in order,
> > > but not that the writes are actually done to stable storage.
> > 
> > The target's not allowed to report the command complete unless the data
> > are on stable storage, except if you have write cache enable set in the
> > relevant mode page.
> > 
> > If you run SCSI drives like that, you're playing with fire.  Expect to get
> > burned.  The whole point of tagged queueing is to let you *not* set that
> > bit in the mode pages and still get good performance.
> 
> Now I remember that I did indeed disable disk write cache when I had
> scsi disks in production. It's been a while though.
> 
> But anyway, from what I remember you still need the disk cache flush
> operation for SATA, even with NCQ. It's not equivalent to the SCSI tags.

I think that's true only if you're running with write cache enabled; but
the difference is that most ATA disks ship with it turned on by default.

With an aggressive implementation of tag management on the host side,
there should be no performance benefit from unconditionally enabling
the write cache -- all the available cache should be used to stage
writes for pending tags.  Sometimes it works.

-- 
  Thor Lancelot Simon  t...@panix.com

"The dirtiest word in art is the C-word.  I can't even say 'craft'
 without feeling dirty."-Chuck Close


Re: FUA and TCQ

2016-09-23 Thread Johnny Billquist

On 2016-09-23 15:38, Greg Troxel wrote:


Johnny Billquist  writes:


With rotating rust, the order of operations can make a huge difference
in speed. With SSDs you don't have those seek times to begin with, so
I would expect the gains to be marginal.


For reordering, I agree with you, but the SSD speeds are so high that
pipelining is probably necessary to keep the SSD from stalling due to not
having enough data to write.  So this could help move from 300 MB/s
(that I am seeing) to 550 MB/s.


Good point. In that case (if I read you right), it's not the reordering 
that matters, but simply being able to queue up several operations to 
keep the disk busy, and potentially running several disks in parallel, 
keeping them all busy. We of course also have the pre-processing work 
before a command is queued, which can be done while the controller is 
busy. There are many potential gains here.


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: FUA and TCQ

2016-09-23 Thread Thor Lancelot Simon
On Fri, Sep 23, 2016 at 09:38:08AM -0400, Greg Troxel wrote:
> 
> Johnny Billquist  writes:
> 
> > With rotating rust, the order of operations can make a huge difference
> > in speed. With SSDs you don't have those seek times to begin with, so
> > I would expect the gains to be marginal.
> 
> For reordering, I agree with you, but the SSD speeds are so high that
> pipeling is probably necessary to keep the SSD from stalling due to not
> having enough data to write.  So this could help move from 300 MB/s
> (that I am seeing) to 550 MB/s.

The iSCSI case is illustrative, too.  Now you can have a "SCSI bus" with
a huge bandwidth-delay product.  It doesn't matter how quickly the target
says it finished one command (which is all enabling the write-cache can get
you) if you are working in lockstep such that the initiator cannot send
more commands until it receives the target's ack.

This is why on iSCSI you really do see hundreds of tags in flight at
once.  You can pump up the request size, but that causes fairness
problems.  Keeping many commands active at the same time helps much more.

Now think about that SSD again.  The SSD's write latency is so low that
_relative to the time it takes the host to issue a new command_ you
have the same problem.  It's clear that enabling the write cache can't
really help, or at least can't help much: you need to have many commands
pending at the same time.
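The queue-depth arithmetic here can be sketched with Little's law. The request size and round-trip latency below are illustrative assumptions, not measurements from this thread:

```python
# Sustained throughput ~= queue_depth * request_size / round_trip_latency.
# With a single command in flight, the link idles while waiting for the ack.
def throughput_mib_s(queue_depth, request_kib, latency_ms):
    return queue_depth * (request_kib / 1024.0) / (latency_ms / 1000.0)

# Assume 64 KiB requests and a 1 ms command round trip (e.g. an iSCSI hop):
single = throughput_mib_s(1, 64, 1.0)   # 62.5 MiB/s with one tag in flight
deep = throughput_mib_s(32, 64, 1.0)    # 2000.0 MiB/s with 32 tags in flight
```

The same arithmetic applies to a fast SSD behind a comparatively slow host: once per-command latency dominates, only concurrency recovers the bandwidth.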

Our storage stack's inability to use tags with SATA targets is a huge
gating factor for performance with real workloads (the residual use of
the kernel lock at and below the bufq layer is another).  Starting de
novo with NVMe, where it's perverse and structurally difficult to not
support multiple commands in flight simultaneously, will help some, but
SATA SSDs are going to be around for a long time still and it'd be
great if this limitation went away.

That said, I am not going to fix it myself so all I can do is sit here
and pontificate -- which is worth about what you paid for it, and no
more.

Thor


Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Warner Losh
On Fri, Sep 23, 2016 at 7:38 AM, Thor Lancelot Simon  wrote:
> On Fri, Sep 23, 2016 at 11:47:24AM +0200, Manuel Bouyer wrote:
>> On Thu, Sep 22, 2016 at 09:33:18PM -0400, Thor Lancelot Simon wrote:
>> > > AFAIK ordered tags only guarantees that the write will happen in order,
>> > > but not that the writes are actually done to stable storage.
>> >
>> > The target's not allowed to report the command complete unless the data
>> > are on stable storage, except if you have write cache enable set in the
>> > relevant mode page.
>> >
>> > If you run SCSI drives like that, you're playing with fire.  Expect to get
>> > burned.  The whole point of tagged queueing is to let you *not* set that
>> > bit in the mode pages and still get good performance.
>>
>> Now I remember that I did indeed disable disk write cache when I had
>> scsi disks in production. It's been a while though.
>>
>> But anyway, from what I remember you still need the disk cache flush
>> operation for SATA, even with NCQ. It's not equivalent to the SCSI tags.

All NCQ gives you is the ability to schedule multiple requests and
to get notification of their completion (perhaps out of order). There's
no coherency features at all in NCQ.

> I think that's true only if you're running with write cache enabled; but
> the difference is that most ATA disks ship with it turned on by default.
>
> With an aggressive implementation of tag management on the host side,
> there should be no performance benefit from unconditionally enabling
> the write cache -- all the available cache should be used to stage
> writes for pending tags.  Sometimes it works.

You don't need to flush all the writes, but you do need to take special care
if you need more coherent semantics, which is often a small minority
of the writes, so I would agree the effect can be mostly mitigated. Not
completely, since any coherency point has to drain the queue completely.
The cache drain ops are non-NCQ, and to send non-NCQ requests
no NCQ requests can be pending. TRIM[*] commands are the same way.

Warner

[*] There is an NCQ version of TRIM, but it requires the AUX register
to be sent and very few SATA host controllers support that (AHCI does,
but many of the LSI controllers don't in any performant way).


Re: FUA and TCQ

2016-09-23 Thread Warner Losh
On Fri, Sep 23, 2016 at 8:05 AM, Thor Lancelot Simon  wrote:
> Our storage stack's inability to use tags with SATA targets is a huge
> gating factor for performance with real workloads (the residual use of
> the kernel lock at and below the bufq layer is another).

FreeBSD's storage stack does support NCQ. When that's artificially
turned off, performance drops on a certain brand of SSDs from about
500-550MB/s for large reads down to 200-300MB/s depending on
too many factors to go into here. It helps a lot for real workloads and is
critical for Netflix to get a 36-38Gbps rate from our 40Gbps systems.

> Starting de
> novo with NVMe, where it's perverse and structurally difficult to not
> support multiple commands in flight simultaneously, will help some, but
> SATA SSDs are going to be around for a long time still and it'd be
> great if this limitation went away.

NVMe is even worse. There's one drive that w/o queueing I can barely
get 1GB/s out of. With queueing and multiple requests I can get the
spec-sheet-rated 3.6GB/s. Here queueing is critical for Netflix to get
the 90-93Gbps that our 100Gbps boxes can do (though it is but one of
many things).

> That said, I am not going to fix it myself so all I can do is sit here
> and pontificate -- which is worth about what you paid for it, and no
> more.

Yea, I'm just a FreeBSD guy lurking here.

Warner


Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Manuel Bouyer
On Fri, Sep 23, 2016 at 09:38:44AM -0400, Thor Lancelot Simon wrote:
> > But anyway, from what I remember you still need the disk cache flush
> > operation for SATA, even with NCQ. It's not equivalent to the SCSI tags.
> 
> I think that's true only if you're running with write cache enabled; but
> the difference is that most ATA disks ship with it turned on by default.

All of them have it turned on by default, and you can't permanently
disable it (you have to turn it off again after each reset).

> 
> With an aggressive implementation of tag management on the host side,
> there should be no performance benefit from unconditionally enabling
> the write cache -- all the available cache should be used to stage
> writes for pending tags.  Sometimes it works.

With ATA you have only 32 tags ...

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Eric Haszlakiewicz
On September 23, 2016 10:51:30 AM EDT, Warner Losh  wrote:
>All NCQ gives you is the ability to schedule multiple requests and
>to get notification of their completion (perhaps out of order). There's
>no coherency features at all in NCQ.

This seems like the key thing needed to avoid FUA: to implement fsync() you 
just wait for notifications of completion to be received, and once you have 
those for all requests pending when fsync was called, or started as part of the 
fsync, then you're done.
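A minimal sketch of the scheme described above (the names are hypothetical, not a real kernel API): fsync() snapshots the IDs of the writes pending at the time of the call and is done once all of those have completed, regardless of writes submitted later.

```python
# Hypothetical model of fsync-by-completion-tracking. submit_write()
# queues a write; complete() is the device's completion notification.
class WriteTracker:
    def __init__(self):
        self.next_id = 0
        self.pending = set()

    def submit_write(self):
        self.next_id += 1
        self.pending.add(self.next_id)
        return self.next_id

    def complete(self, wid):
        # Completion notification received from the device.
        self.pending.discard(wid)

    def fsync_begin(self):
        # Only writes pending *now* must finish for this fsync.
        return set(self.pending)

    def fsync_done(self, snapshot):
        return not (snapshot & self.pending)
```

A write submitted after fsync_begin() does not delay that fsync, which is what would make this cheaper than flushing the whole cache.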

Eric



Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Thor Lancelot Simon
On Fri, Sep 23, 2016 at 05:15:16PM +, Eric Haszlakiewicz wrote:
> On September 23, 2016 10:51:30 AM EDT, Warner Losh  wrote:
> >All NCQ gives you is the ability to schedule multiple requests and
> >to get notification of their completion (perhaps out of order). There's
> >no coherency features at all in NCQ.
> 
> This seems like the key thing needed to avoid FUA: to implement fsync() you 
> just wait for notifications of completion to be received, and once you have 
> those for all requests pending when fsync was called, or started as part of 
> the fsync, then you're done.

The other key point is that -- unless SATA NCQ is radically different from
SCSI tagged queuing in a particularly stupid way -- the rules require all
"simple" tags to be completed before any "ordered" tag is completed.  That is,
ordered tags are barriers against all simple tags.

So, with the write cache disabled, you can use a single command with an
ordered tag to force all preceding commands to complete, but continue
issuing commands while that happens.

To me, this is considerably more elegant than assuming all commands will
"complete" only to the cache by default and then setting FUA for commands
where you can't tolerate that misbehavior -- and certainly better than
flushing the whole cache, which is roughly like blowing off your own head
because you have a pimple on your nose.  But, clearly, others disagree.
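The barrier rule described above can be stated as a small check. This is a sketch of the SCSI ordering rule as summarized in this thread, not of any driver's implementation, and it checks only the barrier's requirement on commands issued before the ordered tag:

```python
SIMPLE, ORDERED = "simple", "ordered"

def completion_order_legal(issued, completed):
    # issued: [(name, tag_type), ...] in issue order.
    # completed: names in the order the target reported completion.
    done = set()
    for name in completed:
        idx = next(i for i, (n, _) in enumerate(issued) if n == name)
        if issued[idx][1] == ORDERED:
            # An ordered tag is a barrier: everything issued before it
            # must already have completed.
            if any(n not in done for n, _ in issued[:idx]):
                return False
        done.add(name)
    return True

cmds = [("w1", SIMPLE), ("w2", SIMPLE), ("barrier", ORDERED), ("w3", SIMPLE)]
completion_order_legal(cmds, ["w2", "w1", "barrier", "w3"])  # True
completion_order_legal(cmds, ["barrier", "w1", "w2", "w3"])  # False
```

The simple tags may still complete in any order among themselves, which is what lets the host keep issuing commands while the barrier is outstanding.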

Thor


Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Manuel Bouyer
On Fri, Sep 23, 2016 at 05:15:16PM +, Eric Haszlakiewicz wrote:
> On September 23, 2016 10:51:30 AM EDT, Warner Losh  wrote:
> >All NCQ gives you is the ability to schedule multiple requests and
> >to get notification of their completion (perhaps out of order). There's
> >no coherency features at all in NCQ.
> 
> This seems like the key thing needed to avoid FUA: to implement fsync() you 
> just wait for notifications of completion to be received, and once you have 
> those for all requests pending when fsync was called, or started as part of 
> the fsync, then you're done.

*if you have the write cache disabled*

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Manuel Bouyer
On Fri, Sep 23, 2016 at 01:20:09PM -0400, Thor Lancelot Simon wrote:
> On Fri, Sep 23, 2016 at 05:15:16PM +, Eric Haszlakiewicz wrote:
> > On September 23, 2016 10:51:30 AM EDT, Warner Losh  wrote:
> > >All NCQ gives you is the ability to schedule multiple requests and
> > >to get notification of their completion (perhaps out of order). There's
> > >no coherency features at all in NCQ.
> > 
> > This seems like the key thing needed to avoid FUA: to implement fsync() you 
> > just wait for notifications of completion to be received, and once you have 
> > those for all requests pending when fsync was called, or started as part of 
> > the fsync, then you're done.
> 
> The other key point is that -- unless SATA NCQ is radically different from
> SCSI tagged queuing in a particularly stupid way -- the rules require all
> "simple" tags to be completed before any "ordered" tag is completed.  That is,
> ordered tags are barriers against all simple tags.

If I remember properly, there are only simple tags in ATA.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Thor Lancelot Simon
On Fri, Sep 23, 2016 at 07:45:00PM +0200, Manuel Bouyer wrote:
> On Fri, Sep 23, 2016 at 05:15:16PM +, Eric Haszlakiewicz wrote:
> > On September 23, 2016 10:51:30 AM EDT, Warner Losh  wrote:
> > >All NCQ gives you is the ability to schedule multiple requests and
> > >to get notification of their completion (perhaps out of order). There's
> > >no coherency features at all in NCQ.
> > 
> > This seems like the key thing needed to avoid FUA: to implement fsync() you 
> > just wait for notifications of completion to be received, and once you have 
> > those for all requests pending when fsync was called, or started as part of 
> > the fsync, then you're done.
> 
> *if you have the write cache disabled*

*Running with the write cache enabled is a bad idea*



Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Manuel Bouyer
On Fri, Sep 23, 2016 at 01:46:09PM -0400, Thor Lancelot Simon wrote:
> > > This seems like the key thing needed to avoid FUA: to implement fsync() 
> > > you just wait for notifications of completion to be received, and once 
> > > you have those for all requests pending when fsync was called, or started 
> > > as part of the fsync, then you're done.
> > 
> > *if you have the write cache disabled*
> 
> *Running with the write cache enabled is a bad idea*

On ATA devices, you can't permanently disable the write cache. You have
to do it on every power cycle.

Well, this really needs to be carefully evaluated. With only 32 tags I'm not
sure you can efficiently use recent devices with the write cache
disabled (most enterprise disks have a 64M cache these days).

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Warner Losh
On Fri, Sep 23, 2016 at 11:20 AM, Thor Lancelot Simon  wrote:
> On Fri, Sep 23, 2016 at 05:15:16PM +, Eric Haszlakiewicz wrote:
>> On September 23, 2016 10:51:30 AM EDT, Warner Losh  wrote:
>> >All NCQ gives you is the ability to schedule multiple requests and
>> >to get notification of their completion (perhaps out of order). There's
>> >no coherency features at all in NCQ.
>>
>> This seems like the key thing needed to avoid FUA: to implement fsync() you 
>> just wait for notifications of completion to be received, and once you have 
>> those for all requests pending when fsync was called, or started as part of 
>> the fsync, then you're done.
>
> The other key point is that -- unless SATA NCQ is radically different from
> SCSI tagged queuing in a particularly stupid way -- the rules require all
> "simple" tags to be completed before any "ordered" tag is completed.  That is,
> ordered tags are barriers against all simple tags.

SATA NCQ doesn't have ordered tags. There's just 32 slots to send
requests into. Don't allow the word 'tag' to confuse you into thinking
it is anything at all like SCSI tags. You get ordering by not
scheduling anything until after the queue has drained when you send
your "ordered" command. It is that stupid.

Warner


Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Warner Losh
On Fri, Sep 23, 2016 at 11:54 AM, Warner Losh  wrote:
> On Fri, Sep 23, 2016 at 11:20 AM, Thor Lancelot Simon  wrote:
>> On Fri, Sep 23, 2016 at 05:15:16PM +, Eric Haszlakiewicz wrote:
>>> On September 23, 2016 10:51:30 AM EDT, Warner Losh  wrote:
>>> >All NCQ gives you is the ability to schedule multiple requests and
>>> >to get notification of their completion (perhaps out of order). There's
>>> >no coherency features at all in NCQ.
>>>
>>> This seems like the key thing needed to avoid FUA: to implement fsync() you 
>>> just wait for notifications of completion to be received, and once you have 
>>> those for all requests pending when fsync was called, or started as part of 
>>> the fsync, then you're done.
>>
>> The other key point is that -- unless SATA NCQ is radically different from
>> SCSI tagged queuing in a particularly stupid way -- the rules require all
>> "simple" tags to be completed before any "ordered" tag is completed.  That 
>> is,
>> ordered tags are barriers against all simple tags.
>
> SATA NCQ doesn't have ordered tags. There's just 32 slots to send
> requests into. Don't allow the word 'tag' to confuse you into thinking
> it is anything at all like SCSI tags. You get ordering by not
> scheduling anything until after the queue has drained when you send
> your "ordered" command. It is that stupid.

And it can be even worse: if the 'ordered' item must complete after
everything before it, you have to drain the queue before you can even
send it to the drive. It depends on what ordering guarantees you want...
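The cost being described can be made concrete with a toy count, under the stated rule that non-NCQ commands (cache flush, classic TRIM) cannot be mixed with queued ones. This is a sketch, not modeled on any particular driver:

```python
def drains_needed(commands):
    # commands: "ncq" or "non-ncq", in issue order. A non-NCQ command
    # forces a drain of all queued commands before it can be sent.
    drains = 0
    in_flight = 0
    for kind in commands:
        if kind == "non-ncq" and in_flight:
            drains += 1          # must wait for every queued command first
            in_flight = 0
        elif kind == "ncq":
            in_flight += 1
    return drains

drains_needed(["ncq", "ncq", "non-ncq", "ncq"])  # 1: the flush stalls the queue
drains_needed(["ncq"] * 32)                      # 0: pure queued writes
```

Every coherency point in the write stream pays one full drain, which is why frequent barriers hurt so much more on NCQ than on SCSI ordered tags.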

Warner


Re: CVS commit: src/sys/arch

2016-09-23 Thread Jaromír Doleček
Hey Maxime,

Seems the KASSERTs() are too aggressive, or there is some other bug.

I can trigger the kassert by simply attaching to rump_ffs, setting a
breakpoint and continuing, i.e:

> rump_ffs -o log ./ffs ./mnt
> gdb rump_ffs
...
(gdb) attach RUMP_PID
(gdb) break ffs_truncate
Breakpoint 1 at 0xad0b951f: file
/usr/home/dolecek/netbsd/sys/rump/fs/lib/libffs/../../../../ufs/ffs/ffs_inode.c,
line 210.
(gdb) cont
panic: kernel diagnostic assertion "onfault == kcopy_fault || rcr2() <
VM_MAXUSER_ADDRESS" failed: file "../../../../arch/i386/i386/trap.c",
line 358

Could you please look at it?

I'll disable the KASSERT() in my local tree, so that I'll be able to
develop. But it would be a good idea to check what gdb is doing that is
so special it trips the assertion.

Thank you.

Jaromir

2016-09-16 13:48 GMT+02:00 Maxime Villard :
> Module Name:src
> Committed By:   maxv
> Date:   Fri Sep 16 11:48:10 UTC 2016
>
> Modified Files:
> src/sys/arch/amd64/amd64: trap.c
> src/sys/arch/i386/i386: trap.c
>
> Log Message:
> Put two KASSERTs, to make sure the fault is happening in the correct
> half of the vm space when using special copy functions. It can detect
> bugs where the kernel would fault when copying a kernel buffer which
> it wrongly believes comes from userland.
>
>
> To generate a diff of this commit:
> cvs rdiff -u -r1.84 -r1.85 src/sys/arch/amd64/amd64/trap.c
> cvs rdiff -u -r1.278 -r1.279 src/sys/arch/i386/i386/trap.c
>
> Please note that diffs are not public domain; they are subject to the
> copyright notices on the relevant files.
>


Re: CVS commit: src/sys/arch

2016-09-23 Thread Manuel Bouyer
On Fri, Sep 23, 2016 at 09:33:36PM +0200, Jaromír Doleček wrote:
> Hey Maxime,
> 
> Seems the KASSERTs() are too aggressive, or there is some other bug.
> 
> I can trigger the kassert by simply attaching to rump_ffs, setting a
> breakpoint and continuing, i.e:
> 
> > rump_ffs -o log ./ffs ./mnt
> > gdb rump_ffs
> ...
> (gdb) attach RUMP_PID
> (gdb) break ffs_truncate
> Breakpoint 1 at 0xad0b951f: file
> /usr/home/dolecek/netbsd/sys/rump/fs/lib/libffs/../../../../ufs/ffs/ffs_inode.c,
> line 210.
> (gdb) cont
> panic: kernel diagnostic assertion "onfault == kcopy_fault || rcr2() <
> VM_MAXUSER_ADDRESS" failed: file "../../../../arch/i386/i386/trap.c",
> line 358
> 
> Could you please look at it?
> 
> I'll disable the KASSERT() in my local tree, so that I'll be able to
> develop. But it would be a good idea to check what gdb is doing that is
> so special it trips the assertion.

This anita test run:
http://www-soc.lip6.fr/~bouyer/NetBSD-tests/xen/HEAD/i386/201609171110Z_anita.txt

also triggered the KASSERT(). As it didn't happen with newer builds I assumed
it had been fixed, but maybe it's just not 100% reproducible in this
context.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: FUA and TCQ (was: Plan: journalling fixes for WAPBL)

2016-09-23 Thread Paul.Koning

> On Sep 23, 2016, at 5:49 AM, Edgar Fuß  wrote:
> 
>> The whole point of tagged queueing is to let you *not* set [the write 
>> cache] bit in the mode pages and still get good performance.
> I don't get that. My understanding was that TCQ allowed the drive to re-order 
> commands within the bounds described by the tags. With the write cache 
> disabled, all write commands must hit stable storage before being reported 
> completed. So what's the point of tagging with cacheing disabled?

I'm not sure.  But I have the impression that in the real world tagging is 
rarely, if ever, used.

paul



Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Paul.Koning

> On Sep 23, 2016, at 10:51 AM, Warner Losh  wrote:
> 
> On Fri, Sep 23, 2016 at 7:38 AM, Thor Lancelot Simon  wrote:
>> On Fri, Sep 23, 2016 at 11:47:24AM +0200, Manuel Bouyer wrote:
>>> On Thu, Sep 22, 2016 at 09:33:18PM -0400, Thor Lancelot Simon wrote:
> AFAIK ordered tags only guarantees that the write will happen in order,
> but not that the writes are actually done to stable storage.
 
 The target's not allowed to report the command complete unless the data
 are on stable storage, except if you have write cache enable set in the
 relevant mode page.
 
 If you run SCSI drives like that, you're playing with fire.  Expect to get
 burned.  The whole point of tagged queueing is to let you *not* set that
 bit in the mode pages and still get good performance.
>>> 
>>> Now I remember that I did indeed disable disk write cache when I had
>>> scsi disks in production. It's been a while though.
>>> 
>>> But anyway, from what I remember you still need the disk cache flush
>>> operation for SATA, even with NCQ. It's not equivalent to the SCSI tags

vioif vs if_vio

2016-09-23 Thread Paul Goyette
Shouldn't the vioif(4) device be more properly named if_vio(4), to be 
consistent with other network interfaces?


With its current name, it could never successfully exist as an 
auto-loaded kernel module, since the auto-load code assumes the if_ 
prefix!



+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--+--++