Re: Any objections/comments on axing out old ATA stack?

2013-04-22 Thread Matthias Andree
On 20.04.2013 23:29, Jeremy Chadwick wrote:

 My feeling is that the stalls are mostly from the error handler and the
 overall time the drive is frozen gets shorter. If it had not _felt_
 faster, I'd not have left that in sysctl.conf in the first place.
 
 Your understanding of what that sysctl does is wrong, or I'm
 misunderstanding what you're saying (very possible!).

What I am describing is a high-level view of the situation.

If I leave the default slot timeout in place, then whenever the computer
gets into an episode of stalls, it becomes unusable (all I/O is stalled,
so anything that needs disk I/O hangs) for so long that it is much faster
to press the reset button, reboot, force an fsck, and retry.

This usually entails hand-holding and manually cleaning up debris, such
as b0rked .o files from a buildworld, or similar.

These stalls happen in the middle of a buildworld, under heavy
I/O, so I'd dispute that excessive head unloading or drive spindown is
the issue.  The computer (and its fans in particular) is generally very
quiet, and there is no VGA board (just a fanless onboard Radeon HD 3300),
so I would hear re-spinups or parking heads.  I don't hear anything like
that.

I don't know how commands that timed out get rescheduled overall.

 How I interpret what you're saying: that the sysctl somehow decreases
 stall times during I/O operations that fail.  This is incorrect.

That may not be the intention of the sysctl, but it is the high-level
outcome.

 What that sysctl does is define the number of seconds that transpire
 ***before*** the CAM layer says Okay, I didn't get a response to the
 ATA CDB I sent the disk, and then re-submits the same CDB to the disk.

The other question (to Alexander Motin) then is why I see the
timeouts for the related slots roughly $timeout seconds apart.

Alexander, is there any way we can make the kernel dump the entire set
of pending NCQ queue entries including submitted timestamp, or timeout
values, so that we can see how much workload is queued?

Note also that the CRC count has not increased since I've put the
smartctl output online, it's still at 14 -- I would have to see CRC
errors and their consequences in Linux or Windows, too.

Linux's smartd 5.41 never mailed about an increase of the CRC value, and
I told it not to mail temperature changes.

 Rephrased: in the case of a disk stalling on an I/O request, you will
 experience the effects of that stall no matter what that sysctl is set
 to.  A lower value in that sysctl will result in CAM spitting out
 nasties on the console + hitting the CDB retry submission scenario
 sooner, which if the drive is awake/responsive by that time will go
 smoothly.
 
 That's all it does.

That's how you have explained it, and how I have understood it, on the
queue-slot level (microscopic).  At a larger scale, however, I do not
observe that the shorter timeout value makes these stall episodes happen
more often (as should be the consequence if spindown were the cause of
the stalls); only recovery is faster.
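
The two views above can be reconciled with a toy model. This is an editorial sketch and is not taken from the CAM code: it assumes that a command issued while the drive is stalled is simply lost, and that the host resubmits it each time the timeout expires. The function name and the 7-second stall are hypothetical, chosen only for illustration; real error handling also involves retry limits and resets, which are ignored here.

```python
import math

def time_to_recover(stall_seconds, timeout_seconds):
    """Seconds from first submission until a resubmitted command
    reaches a drive that is responsive again.

    Assumption (not from the thread): the command sent during the
    stall never completes, and it is resubmitted at every multiple
    of the timeout until the stall is over.
    """
    if stall_seconds <= 0:
        return 0
    # Number of timeout periods that must elapse before a retry
    # lands at or after the end of the stall.
    retries = math.ceil(stall_seconds / timeout_seconds)
    return retries * timeout_seconds

if __name__ == "__main__":
    for timeout in (30, 5):
        print(timeout, time_to_recover(7, timeout))
```

Under these assumptions a 7-second stall costs 30 seconds with a 30-second timeout but only 10 seconds with a 5-second timeout. The sysctl does not shorten the stall itself, yet the observed recovery is faster, which is consistent with both descriptions.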

 Thus a value of 5 indicates a device/drive did not respond to a CDB
 within 5 seconds, and a value of 30 indicates a device/drive did not
 respond to a CDB within 30 seconds.  Regardless, those lengths of time
 are VERY long for an I/O operation on a mechanical HDD.

Indeed they are, and because /usr is on the offending drive, I lowered
the value to 5 s, which I still deem conservative.  I know that an older
ATA standard edition permitted longer completion times for flushing HDD
internal write caches to platters (15 s IIRC).

 Oh look, it's the Samsung SpinPoint series, especially the EcoGreen
 (EG) series.  No joke: ~60% of the problem reports I deal with when
 it comes to weird wonky problems stem from this drive series.  I have
 no idea why, but they're a common pain point for me.

I know they are, especially the larger siblings 1.5 G up.

 Politely, your analysis of the drive (looks sane to me) is an
 indicator of why SMART output needs to be interpreted by a person who is
 familiar with the information.  That drive *does not* look sane to me.
 :-)

14 CRC errors with a drive that moved through computers that got
modified over time, that does not run the whole day, and that was first
attached to a computer whose controller (VIA garbage) could only talk to
1.5 Gb/s ATA drives but not 3 Gb/s is not something I care about.

 Key points about these errors:
 
[...]

 - These are conditions that short, long, select (LBA range scan), and
   conveyance SMART tests would probably not detect.  Like I said: it
   seems to be all over the board.

I agree that it is more likely to be a communications issue between
FreeBSD and the drive's logic, with all components, hard- and software
involved.

 Bernd Walter responded indicating that his experience indicated that the
 issue related to NCQ compatibility.  This would not surprise me.

Neither would it surprise me, but Linux should suffer, too, then.  It
does use NCQ, too.  FreeBSD can be booted 

Re: Any objections/comments on axing out old ATA stack?

2013-04-21 Thread Alexander Motin

On 21.04.2013 00:29, Jeremy Chadwick wrote:

- The ATA commands which lead up to the error also vary.  Many are for
   write requests, and from some entries I can see that the OS was doing
   NCQ writes (WRITE FPDMA QUEUED) and then suddenly decided to do a
   classic 28-bit LBA write (WRITE DMA).  I'm not sure why an OS would do
   this (there's nothing optimal about it) unless there were conditions
   occurring where the OS/ATA driver said this NCQ write isn't working
   (timeout, etc.), let me retry with a classic 28-bit LBA write.


The ATA disk driver in CAM inserts a non-queued command every several
seconds of continuous load to limit possible command starvation inside
the disk.  The SCSI driver does a similar thing, but instead of issuing a
different command it sets the ordered command flag, which does not exist
in SATA.
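
The behaviour described above can be sketched as follows. This is a simplified illustration, not the actual driver logic: the function name is hypothetical, and the 5-second interval is an assumption standing in for "every several seconds".

```python
def classify_writes(submit_times, interval=5.0):
    """Label each write as queued (NCQ) or non-queued.

    submit_times: monotonically increasing submission times in seconds.
    interval: assumed gap after which one non-queued command is
              inserted (hypothetical value, not from the driver).
    """
    labels = []
    if not submit_times:
        return labels
    last_non_queued = submit_times[0]
    for t in submit_times:
        if t - last_non_queued >= interval:
            # Periodic non-queued command: the disk has to drain its
            # queue first, so no queued command can starve forever.
            labels.append("WRITE DMA")
            last_non_queued = t
        else:
            labels.append("WRITE FPDMA QUEUED")
    return labels

if __name__ == "__main__":
    for t, kind in zip([0, 1, 2, 5, 6, 10.5],
                       classify_writes([0, 1, 2, 5, 6, 10.5])):
        print(t, kind)
```

Seen this way, an occasional WRITE DMA in the middle of a run of WRITE FPDMA QUEUED entries in the drive's error log is expected under continuous load and does not by itself indicate a fallback after a failed NCQ write.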


--
Alexander Motin
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Any objections/comments on axing out old ATA stack?

2013-04-21 Thread Jeremy Chadwick
On Sun, Apr 21, 2013 at 02:11:04PM +0300, Alexander Motin wrote:
 On 21.04.2013 00:29, Jeremy Chadwick wrote:
 - The ATA commands which lead up to the error also vary.  Many are for
write requests, and from some entries I can see that the OS was doing
NCQ writes (WRITE FPDMA QUEUED) and then suddenly decided to do a
classic 28-bit LBA write (WRITE DMA).  I'm not sure why an OS would do
this (there's nothing optimal about it) unless there were conditions
occurring where the OS/ATA driver said this NCQ write isn't working
(timeout, etc.), let me retry with a classic 28-bit LBA write.
 
 ATA disk driver in CAM inserts non-queued command every several
 seconds of continuous load to limit possible command starvation
 inside the disk. SCSI driver does alike things, but inserts ordered
 command flag, that does not exist in SATA, instead of different
 command.

Thanks for the insights Alexander, greatly appreciated.

I'm a little confused by your description, because if I'm reading it
right, it sounds like it conflicts with what the ACS-2 spec states.
Quoting T13/2015-D rev 3 (I'm aware it's a working draft), section
4.16.1:

If the device receives a command that is not an NCQ command while NCQ
commands are in the queue, then the device shall return command aborted
for the new command and for all of the NCQ commands that are in the
queue.

I assume this means ABRT status is returned to the host controller; if
so (and by design of course), how do we differentiate between that
condition and any other I/O condition that induces ABRT?

Possibly the answer is in this admission: I should probably get
around to reading ATA8-AST sometime.  :-)

-- 
| Jeremy Chadwick                                  j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


Re: Any objections/comments on axing out old ATA stack?

2013-04-21 Thread Alexander Motin
The ATA controller drivers delay conflicting commands, which avoids such
conflicts in the device.
On 21.04.2013 14:32, Jeremy Chadwick j...@koitsu.org wrote:

 On Sun, Apr 21, 2013 at 02:11:04PM +0300, Alexander Motin wrote:
  On 21.04.2013 00:29, Jeremy Chadwick wrote:
  - The ATA commands which lead up to the error also vary.  Many are for
 write requests, and from some entries I can see that the OS was doing
 NCQ writes (WRITE FPDMA QUEUED) and then suddenly decided to do a
 classic 28-bit LBA write (WRITE DMA).  I'm not sure why an OS would
 do
 this (there's nothing optimal about it) unless there were conditions
 occurring where the OS/ATA driver said this NCQ write isn't working
 (timeout, etc.), let me retry with a classic 28-bit LBA write.
 
  ATA disk driver in CAM inserts non-queued command every several
  seconds of continuous load to limit possible command starvation
  inside the disk. SCSI driver does alike things, but inserts ordered
  command flag, that does not exist in SATA, instead of different
  command.

 Thanks for the insights Alexander, greatly appreciated.

 I'm a little confused by your description, because if I'm reading it
 right, it sounds like it conflicts with what the ACS-2 spec states.
 Quoting T13/2015-D rev 3 (I'm aware it's a working draft), section
 4.16.1:

 If the device receives a command that is not an NCQ command while NCQ
 commands are in the queue, then the device shall return command aborted
 for the new command and for all of the NCQ commands that are in the
 queue.

 I assume this means ABRT status is returned to the host controller; if
 so (and by design of course), how do we differentiate between that
 condition and any other I/O condition that induces ABRT?

 Possibly in the answer is in this admission: I should probably get
 around to reading ATA8-AST sometime.  :-)

 --
 | Jeremy Chadwick   j...@koitsu.org |
 | UNIX Systems Administratorhttp://jdc.koitsu.org/ |
 | Mountain View, CA, US|
 | Making life hard for others since 1977. PGP 4BD6C0CB |


Re: Any objections/comments on axing out old ATA stack?

2013-04-20 Thread Bernd Walter
On Thu, Apr 04, 2013 at 12:15:32AM +0200, Matthias Andree wrote:
 I have just sent more information to the PR at
 http://www.freebsd.org/cgi/query-pr.cgi?pr=157397
 
 The short summary (more info in the PR) is:
 
 - limiting tags to 31 does not help
 
 - disabling NCQ appears to help in initial testing, but warrants more
 testing
 
 - error happens during WRITE_FPDMA_QUEUED,
 
 - File system in question is SU+J UFS2 mounted on /usr, and I can for
 instance rm -rf /usr/obj or just log into GNOME and try to open a
 gnome-terminal to trigger stalls;
 
 - Linux uses 31 tags (for different reason) and has no drive quirks, but
 a controller quirk;
 
 for Jeremy's topic #6, regarding the ATI/AMD SB7x0 that I am using, it
 might be worthwhile investigating the AHCI_HFLAG_IGN_SERR_INTERNAL flag
 - it gets set by Linux on the SB700 that my computer is using, see
 ahci_error_intr() in libahci.h - I am not going to interpret that for
 lack of expertise, but it does affect error handling and appears to
 ignore a certain condition.
 
 Why only my Samsung HDD drive triggers this but not the WD drive, I do
 not know yet.

I have had data corruption with a Samsung drive and CAM, connected to
an onboard Intel AHCI controller.
The system was known to run well with an older FreeBSD version and was
brought back into service for another use case with a fresh installation.
Regularly, on major filesystem write activity, we got random FS
corruption and panics.
My assumption was broken NCQ firmware on the drive, but I have nothing to
prove this assumption.
We switched to the old ata driver and lived with this until we replaced
the whole machine.
I don't know if the machine still exists somewhere.

-- 
B.Walter be...@bwct.de http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers, and more.


Re: Any objections/comments on axing out old ATA stack?

2013-04-20 Thread Jeremy Chadwick
On Thu, Apr 04, 2013 at 10:00:18AM +0200, Matthias Andree wrote:
 On 04.04.2013 03:05, Jeremy Chadwick wrote:
 
 { snipping stuff I have no comment on.  reference thread: }
 {  http://lists.freebsd.org/pipermail/freebsd-stable/2013-April/073036.html }
 
  One piece of evidence that refutes my theory is that if Windows and/or
  Linux partition are something you boot into and use often, I would
  imagine NCQ would be used in both of those environments and would suffer
  from the same issue.  Although Windows tends to hide all sorts of
  transient errors from the user (sigh), Linux tends to be like FreeBSD
  with regards to such issues (on the console anyway; you wouldn't see
  such messages normally inside of X).
 
 Now, the FreeBSD slice is the only partition on that disk that would
 likely see concurrent write accesses (think make -j8 on a quadcore
 computer) which is more prone to ferret out such alignment contention.
 
 The NTFS partition is aligned on a multi-MB boundary, so wouldn't hit
 the problem anyways.
 
 The Linux partition is in ext4 format for mostly sequential access to
 files usually in excess of 10 MB each.
 
 Linux's ext4 jumps through several hoops to end up with bulk writes,
 like extents, delayed allocations (to avoid fragmentation), reordering
 of data and metadata writes, serialized log writes and all that stuff,
 and it would appear I am permitting it to cache writes -- Linux uses
 write barriers to enforce proper ordering of journal/meta-data writes.
 
 It would be rather hard to hit ATA taskfile timeouts, the expected rate
 with which the drive needs to do a partial write is orders of magnitude
 lower.
 
 Any good concurrent write exercise tools for Unix that I could run on
 the Linux ext4 partition that you would propose?

The only tool I'm familiar with is bonnie++.

But I don't think this (partition alignment) is what matters now.  Your
smartctl output has shed some light on your situation.

  - I am running with kern.cam.ada.default_timeout=5 which makes the
  computer recover faster
  
  I can definitely imagine cases where a drive using NCQ but doing writes
  to a non-aligned partition could take longer than 5 seconds to respond
  to an ATA CDB (this is different than a SATA or AHCI layer timeout).  I am
  not telling you change this back to 30, but it might not be helping
  your situation at all given my above theory.
 
 My feeling is that the stalls are mostly from the error handler and the
 overall time the drive is frozen gets shorter. If it had not _felt_
 faster, I'd not have left that in sysctl.conf in the first place.

Your understanding of what that sysctl does is wrong, or I'm
misunderstanding what you're saying (very possible!).

How I interpret what you're saying: that the sysctl somehow decreases
stall times during I/O operations that fail.  This is incorrect.

What that sysctl does is define the number of seconds that transpire
***before*** the CAM layer says Okay, I didn't get a response to the
ATA CDB I sent the disk, and then re-submits the same CDB to the disk.

Rephrased: in the case of a disk stalling on an I/O request, you will
experience the effects of that stall no matter what that sysctl is set
to.  A lower value in that sysctl will result in CAM spitting out
nasties on the console + hitting the CDB retry submission scenario
sooner, which if the drive is awake/responsive by that time will go
smoothly.

That's all it does.

Thus a value of 5 indicates a device/drive did not respond to a CDB
within 5 seconds, and a value of 30 indicates a device/drive did not
respond to a CDB within 30 seconds.  Regardless, those lengths of time
are VERY long for an I/O operation on a mechanical HDD.

When you get to the bottom of my Email, you'll understand why I screamed
at you about adjusting that sysctl.

  Finally: could you please provide output from smartctl -x /dev/ada1?
  I would like to rule out any possibility of your drive having some other
  kind of issue that might cause it to go catatonic.  Thanks.
 
 I have fetched the data with Linux this time (should not make a
 difference as it's all drive internal data, not host OS stuff).
 
 Looks sane to me, http://people.freebsd.org/~mandree/smartctl.log.
 I'll be happy to refetch this data with a more current smartctl version
 under FreeBSD if required.

Oh look, it's the Samsung SpinPoint series, especially the EcoGreen
(EG) series.  No joke: ~60% of the problem reports I deal with when
it comes to weird wonky problems stem from this drive series.  I have
no idea why, but they're a common pain point for me.

First, about the shown sector size: smartmontools 5.41 was the first
release to show the sector sizes per ATA IDENTIFY.  I assume they got
this right from the get-go.  So as of this moment I'm going to assume
that this drive really is a 512-byte sector drive.

Politely, your analysis of the drive (looks sane to me) is an
indicator of why SMART output needs to be interpreted by a person who is
familiar with 

Re: Any objections/comments on axing out old ATA stack?

2013-04-20 Thread Bruce Cran

On 04/04/2013 09:00, Matthias Andree wrote:
Any good concurrent write exercise tools for Unix that I could run 
on the Linux ext4 partition that you would propose?


benchmarks/fio is good for that.

--
Bruce Cran



Re: Any objections/comments on axing out old ATA stack?

2013-04-04 Thread Matthias Andree
On 04.04.2013 03:05, Jeremy Chadwick wrote:

 Please provide gpart show -p ada1 output, both here and in the PR,
 if you could.

 =>        63  1953525105  ada1  MBR  (931G)
           63   209714337  ada1s1  freebsd  [active]  (100G)
    209714400         800  - free -  (400k)
    209715200    71680000  ada1s2  ntfs  (34G)
    281395200       15405  - free -  (7.5M)
    281410605   488263545  ada1s3  linux-data  (232G)
    769674150  1183851018  - free -  (564G)

Thanks for all the useful information provided so far (including further
down).  I know some of that already, but am not going to complain
because it is very useful in the logs.

 The problem here is that I cannot guarantee you that alignment is
 the problem.  The performance impact of writes to partitions which are
 non-aligned is quite high, and NCQ just exacerbates this problem.  I
 would love to tell you switch to GPT and follow Warren Block's
 document*** but if your NTFS partition is Windows and is a Windows version
 older than Windows 7 GPT is not supported.

I am happy to make that realign-and-use-GPT experiment.
My Windows is 7 Professional 64-bit.

It will take me a few days because this is spare-time stuff.

 One piece of evidence that refutes my theory is that if Windows and/or
 Linux partition are something you boot into and use often, I would
 imagine NCQ would be used in both of those environments and would suffer
 from the same issue.  Although Windows tends to hide all sorts of
 transient errors from the user (sigh), Linux tends to be like FreeBSD
 with regards to such issues (on the console anyway; you wouldn't see
 such messages normally inside of X).

Now, the FreeBSD slice is the only partition on that disk that would
likely see concurrent write accesses (think make -j8 on a quadcore
computer) which is more prone to ferret out such alignment contention.

The NTFS partition is aligned on a multi-MB boundary, so wouldn't hit
the problem anyways.

The Linux partition is in ext4 format for mostly sequential access to
files usually in excess of 10 MB each.

Linux's ext4 jumps through several hoops to end up with bulk writes,
like extents, delayed allocations (to avoid fragmentation), reordering
of data and metadata writes, serialized log writes and all that stuff,
and it would appear I am permitting it to cache writes -- Linux uses
write barriers to enforce proper ordering of journal/meta-data writes.

It would be rather hard to hit ATA taskfile timeouts, the expected rate
with which the drive needs to do a partial write is orders of magnitude
lower.

Any good concurrent write exercise tools for Unix that I could run on
the Linux ext4 partition that you would propose?

 If you have the time and want to put forth the effort, I would recommend
 backing up all your data on ada1, zero the first and last 1MByte of the
 drive, and then try following Warren Block's guide.  I'd just recommend
 doing this:
 
 gpart create -s gpt ada1
 gpart add -t freebsd-ufs -b 2m ada1
 newfs -U -j /dev/ada1p1   (or remove -j if you don't want to use SUJ)

Will do.

 - I am running with kern.cam.ada.default_timeout=5 which makes the
 computer recover faster
 
 I can definitely imagine cases where a drive using NCQ but doing writes
 to a non-aligned partition could take longer than 5 seconds to respond
 to an ATA CDB (this is different than a SATA or AHCI layer timeout).  I am
 not telling you change this back to 30, but it might not be helping
 your situation at all given my above theory.

My feeling is that the stalls are mostly from the error handler and the
overall time the drive is frozen gets shorter. If it had not _felt_
faster, I'd not have left that in sysctl.conf in the first place.

 Finally: could you please provide output from smartctl -x /dev/ada1?
 I would like to rule out any possibility of your drive having some other
 kind of issue that might cause it to go catatonic.  Thanks.

I have fetched the data with Linux this time (should not make a
difference as it's all drive internal data, not host OS stuff).

Looks sane to me, http://people.freebsd.org/~mandree/smartctl.log.
I'll be happy to refetch this data with a more current smartctl version
under FreeBSD if required.

 
 
 ** -- 
 http://www.seagate.com/files/www-content/support-content/documentation/samsung/tech-specs/eco_greenf2.pdf
 
 *** -- http://www.wonkity.com/~wblock/docs/html/ssd.html
 






Re: Any objections/comments on axing out old ATA stack?

2013-04-03 Thread Alexander Motin

On 02.04.2013 21:39, Matthias Andree wrote:

On 31.03.2013 23:02, Scott Long wrote:


So what I hear you and Matthias saying, I believe, is that it should be
easier to force disks to fall back to non-NCQ mode, and/or have a more
responsive black-list for problematic controllers.  Would this help the
situation?  It's hard to justify holding back overall forward progress
because of some bad controllers; we do several Tbps off of AHCI
controllers with NCQ enabled on FreeBSD 9.x, enough to make up a sizable
percentage of the internet's traffic, and we see no problems.  How can
we move forward but also take care of you guys with problematic
hardware?


Well, I am running the driver fine off of my WD Caviar RE3 disk, and the
problematic drive also works just fine with Windows and Linux, so it
must be something between the problematic drive and the FreeBSD driver.

I would like to see any of this, in decreasing order of precedence:

- debugged driver

- assistance/instructions on helping how to debug the driver/trace NCQ
stuff/...  (as in Jeremy Chadwick's followup in this same thread - this
helps, I will attempt to procure the required information; back then,
reducing the number of tags to 31 was ineffective, including an error
message and getting a value of 32 when reading the setting back)


Unfortunately, I don't know how to debug that. The command timeouts
reported on the lists before are the kind of error that is most difficult
to diagnose, since the controller gives no information to go on. We just
see that commands we sent are no longer completing. Maybe it is some
incompatibility between specific drive and HBA firmware, triggered by
some innocent specifics of our ATA stack, GEOM or filesystem
implementation. All I can propose is to try to identify such cases and
add quirks to work around them, like disabling NCQ or limiting the number
of tags. I am not sure what else we can do about it without a controlled
lab environment with the affected hardware and a SATA analyzer.



- user-space contingency features, such as letting camcontrol limit
the number of open NCQ tags, or disable NCQ, either on a per-drive basis


I merged support for that into 8/9-STABLE about 9 months ago:
`camcontrol tags ada0 -v -N X` should change the number of
simultaneously used tags,
`camcontrol negotiate ada0 -T (en|dis)able` should enable/disable the
use of NCQ.
I just did some tests on HEAD and these commands seem to work. If
you can reproduce the problem, it would be nice to collect information
on how these changes affect it.
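
For collecting that information systematically, a small helper can list the command lines to try one after another. This is only a sketch: the function name and the chosen tag counts are hypothetical, and the helper only builds the argument lists for the two camcontrol invocations quoted above; it does not run them.

```python
def ncq_experiment_commands(device, tag_counts):
    """Build camcontrol command lines for a series of NCQ experiments.

    device: CAM device name, e.g. "ada0".
    tag_counts: numbers of simultaneously used tags to try
                (illustrative values, pick your own).
    Returns a list of argv lists; nothing is executed here.
    """
    commands = [
        ["camcontrol", "tags", device, "-v", "-N", str(n)]
        for n in tag_counts
    ]
    # Last step: turn NCQ off entirely for comparison.
    commands.append(["camcontrol", "negotiate", device, "-T", "disable"])
    return commands

if __name__ == "__main__":
    for argv in ncq_experiment_commands("ada0", [31, 16, 1]):
        print(" ".join(argv))
```

Running each printed command as root, then repeating the workload that triggers the stalls, gives one data point per setting.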



I am capable of debugging C - mostly with gdb command-line, and
graphical Windows IDEs - but am unfamiliar with FreeBSD kernel
debugging. If necessary, I can pull up a second console, but the PC that
is affected is legacy-free, so serial port only works through a
serial/USB converter.



--
Alexander Motin


Re: Any objections/comments on axing out old ATA stack?

2013-04-03 Thread Matthias Andree
I have just sent more information to the PR at
http://www.freebsd.org/cgi/query-pr.cgi?pr=157397

The short summary (more info in the PR) is:

- limiting tags to 31 does not help

- disabling NCQ appears to help in initial testing, but warrants more
testing

- error happens during WRITE_FPDMA_QUEUED,

- File system in question is SU+J UFS2 mounted on /usr, and I can for
instance rm -rf /usr/obj or just log into GNOME and try to open a
gnome-terminal to trigger stalls;

- Linux uses 31 tags (for different reason) and has no drive quirks, but
a controller quirk;

for Jeremy's topic #6, regarding the ATI/AMD SB7x0 that I am using, it
might be worthwhile investigating the AHCI_HFLAG_IGN_SERR_INTERNAL flag
- it gets set by Linux on the SB700 that my computer is using, see
ahci_error_intr() in libahci.c - I am not going to interpret that for
lack of expertise, but it does affect error handling and appears to
ignore a certain condition.

Why only my Samsung HDD drive triggers this but not the WD drive, I do
not know yet.

Hope that helps a bit.






Re: Any objections/comments on axing out old ATA stack?

2013-04-03 Thread Jeremy Chadwick
On Thu, Apr 04, 2013 at 12:15:32AM +0200, Matthias Andree wrote:
 I have just sent more information to the PR at
 http://www.freebsd.org/cgi/query-pr.cgi?pr=157397
 
 The short summary (more info in the PR) is:
 
 - limiting tags to 31 does not help
 
 - disabling NCQ appears to help in initial testing, but warrants more
 testing
 
 - error happens during WRITE_FPDMA_QUEUED,

This is an NCQ-based write LBA request.  There are many non-NCQ
equivalents of this, ATA-protocol-wise (too many to list here), but the
most likely non-NCQ ATA command you'd see is WRITE_DMA48.

 - File system in question is SU+J UFS2 mounted on /usr, and I can for
 instance rm -rf /usr/obj or just log into GNOME and try to open a
 gnome-terminal to trigger stalls;
 
 - Linux uses 31 tags (for different reason) and has no drive quirks, but
 a controller quirk;
 
 for Jeremy's topic #6, regarding the ATI/AMD SB7x0 that I am using, it
 might be worthwhile investigating the AHCI_HFLAG_IGN_SERR_INTERNAL flag
 - it gets set by Linux on the SB700 that my computer is using, see
 ahci_error_intr() in libahci.h - I am not going to interpret that for
 lack of expertise, but it does affect error handling and appears to
 ignore a certain condition.

Alexander could expand on this, but the name of the flag implies that
there are certain conditions where the SATA-level SERR condition gets
ignored (IGN).

While skimming Linux libata code and commits in the past, the only
glaringly obvious bug/issue I see is with SB600/SB700 chipsets (the
hardware revision apparently matters) and port multiplier (PMP) support
and soft resets.

Are you using a port multiplier?  I doubt it, but I have to ask.

 Why only my Samsung HDD drive triggers this but not the WD drive, I do
 not know yet.

Please provide gpart show -p ada1 output, both here and in the PR,
if you could.

I have a gut feeling I know what the issue is (and if it is what I think
it is, it's actually happening all the time, just that NCQ exacerbates
it given how command queueing works), but I won't know for sure until I
see the output.

Thanks.

-- 
| Jeremy Chadwick                                  j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


Re: Any objections/comments on axing out old ATA stack?

2013-04-03 Thread Matthias Andree
On 04.04.2013 01:38, Jeremy Chadwick wrote:

...

 While skimming Linux libata code and commits in the past, the only
 glaringly obvious bug/issue I see is with SB600/SB700 chipsets (the
 hardware revision apparently matters) and port multiplier (PMP) support
 and soft resets.
 
 Are you using a port multiplier?  I doubt it, but I have to ask.

I am not using a PMP as far as I know (unless one is buried on my Asus
M4A78T-E main board). It would seem the drives are directly attached to
the south bridge's SATA ports.

 Why only my Samsung HDD drive triggers this but not the WD drive, I do
 not know yet.
 
 Please provide gpart show -p ada1 output, both here and in the PR,
 if you could.

=>        63  1953525105  ada1  MBR  (931G)
          63   209714337  ada1s1  freebsd  [active]  (100G)
   209714400         800  - free -  (400k)
   209715200    71680000  ada1s2  ntfs  (34G)
   281395200       15405  - free -  (7.5M)
   281410605   488263545  ada1s3  linux-data  (232G)
   769674150  1183851018  - free -  (564G)

HTH

Best regards
Matthias





Re: Any objections/comments on axing out old ATA stack?

2013-04-03 Thread Jeremy Chadwick
On Thu, Apr 04, 2013 at 02:19:16AM +0200, Matthias Andree wrote:
 On 04.04.2013 01:38, Jeremy Chadwick wrote:
 
 ...
 
  While skimming Linux libata code and commits in the past, the only
  glaringly obvious bug/issue I see is with SB600/SB700 chipsets (the
  hardware revision apparently matters) and port multiplier (PMP) support
  and soft resets.
  
  Are you using a port multiplier?  I doubt it, but I have to ask.
 
 I am not using a PMP as far as I know (unless one is buried on my Asus
 M4A78T-E main board). It would seem the drives are directly attached to
 the south bridge's SATA ports.

Then the answer is nope, you're not using a PM.  Details:

http://www.serialata.org/technology/port_multipliers.asp
http://en.wikipedia.org/wiki/Port_multiplier

  Why only my Samsung HDD drive triggers this but not the WD drive, I do
  not know yet.
  
  Please provide gpart show -p ada1 output, both here and in the PR,
  if you could.
 
 =>        63  1953525105  ada1  MBR  (931G)
           63   209714337  ada1s1  freebsd  [active]  (100G)
    209714400         800  - free -  (400k)
    209715200    71680000  ada1s2  ntfs  (34G)
    281395200       15405  - free -  (7.5M)
    281410605   488263545  ada1s3  linux-data  (232G)
    769674150  1183851018  - free -  (564G)

This is what I was worried about.  Referring to your camcontrol
identify output:

 device model SAMSUNG HD103SI
 sector size logical 512, physical 512, offset 0

Hear me out entirely on this one.

My theory is that your hard disk actually uses 4096-byte sectors but is
too old to provide ATA IDENTIFY semantics to delineate between logical
vs. physical sector size.  In other words, only logical is provided,
thus logical=physical in the eyes of all software; smartctl will show
you the exact same thing too.

There are drives like this in the wild, both SSDs as well as MHDDs.
For example, the Intel 320-series SSD behaves this way too (providing
only logical size).

Do not let the capacity/size of the drive be the deciding factor; your
drive is 1TB, but I also have many 1TB MHDDs that use 4096-byte sectors.

Seagate/Samsung's specification** for the HD103SI states, and I quote:
"Byte per Sensor: 512 bytes".  Yes, it says "Sensor".  Whether or not
this documentation is correct/accurate is unknown, and when vendors have
typos in their own specification docs, I cannot help but honour the
possibility of the information being wrong.  So I'm unsure if this drive
uses 512-byte sectors or 4096-byte sectors.

That said: in your gpart show ada1 output, your FreeBSD and Linux
slices (starting at sectors 63 and 281410605) are not aligned to
4096-byte boundaries; only the NTFS slice starts on a multiple of 8
sectors.  Ideally you'd want to have these aligned to 1MByte or 2MByte
boundaries in case you ever move to an SSD.  You're also using the MBR
scheme, which does not tend to play well with alignment.
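
To make the alignment concern concrete, here is a small sketch in Python
(not something posted in the thread) that checks each slice's start
sector against a 4096-byte boundary.  The start sectors are read from
the gpart output above, taking the ntfs slice to start where the
preceding free space ends (209714400 + 800 = 209715200); the 512-byte
logical sector size is what camcontrol identify reported.  The function
name is made up for illustration.

```python
SECTOR_BYTES = 512  # logical sector size reported by camcontrol identify

# (slice name, start sector) as read from the gpart show output
slices = [
    ("ada1s1", 63),
    ("ada1s2", 209715200),
    ("ada1s3", 281410605),
]


def is_aligned(start_sector, boundary_bytes, sector_bytes=SECTOR_BYTES):
    """Return True if the slice's byte offset is a multiple of boundary_bytes."""
    return (start_sector * sector_bytes) % boundary_bytes == 0


for name, start in slices:
    verdict = "aligned" if is_aligned(start, 4096) else "NOT aligned"
    print(f"{name}: start sector {start} is {verdict} to 4096 bytes")
```

By this arithmetic ada1s1 and ada1s3 start off a 4096-byte boundary,
while ada1s2 happens to start on one.  This only matters if the drive
really has 4096-byte physical sectors, which is the open question.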

Comparatively, your WD5002ABYS drive **does** use 512-byte sectors (I
know this for a fact).

The problem here is that I cannot guarantee you that alignment is
the problem.  The performance impact of writes to partitions which are
non-aligned is quite high, and NCQ just exacerbates this problem.  I
would love to tell you to switch to GPT and follow Warren Block's
document***, but if your NTFS partition holds Windows, and it is a version
older than Windows 7, GPT is not supported.

One piece of evidence that refutes my theory is that if Windows and/or
Linux partition are something you boot into and use often, I would
imagine NCQ would be used in both of those environments and would suffer
from the same issue.  Although Windows tends to hide all sorts of
transient errors from the user (sigh), Linux tends to be like FreeBSD
with regards to such issues (on the console anyway; you wouldn't see
such messages normally inside of X).

If you have the time and want to put forth the effort, I would recommend
backing up all your data on ada1, zeroing the first and last 1MByte of the
drive, and then following Warren Block's guide.  I'd just recommend
doing this:

gpart create -s gpt ada1
gpart add -t freebsd-ufs -b 2m ada1
newfs -U -j /dev/ada1p1   (or remove -j if you don't want to use SUJ)

I picked an alignment value of 2MBytes since it's both 4K-aligned and is
generally safe for things like newer SSDs that have larger NAND erase
block size (I am not going to get into a discussion about that here, so
please stay focused.  :-) )
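
As a quick sanity check of that choice, the sketch below (Python, an
illustration rather than anything from the thread) converts a 2MByte
offset into a start sector and confirms it is a multiple of the smaller
boundaries mentioned.  It assumes gpart reads "2m" as 2 * 1024 * 1024
bytes and that the logical sector size is 512 bytes; the helper name is
hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB


def start_sector(offset_bytes, sector_bytes=512):
    """Convert a byte offset into a sector number; reject partial sectors."""
    if offset_bytes % sector_bytes != 0:
        raise ValueError("offset is not a whole number of sectors")
    return offset_bytes // sector_bytes


offset = 2 * MIB  # assumed meaning of "-b 2m"
boundaries = {"4 KiB": 4 * KIB, "1 MiB": MIB, "2 MiB": 2 * MIB}

print("start sector:", start_sector(offset))
for label, size in boundaries.items():
    print(f"multiple of {label}: {offset % size == 0}")
```

Any offset that is a multiple of 2MBytes is automatically a multiple of
4096 bytes and of 1MByte, which is why the larger value is the
conservative pick.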

If the problem is gone after that (it should be easy to induce by
writing tons and tons of data to the drive), then we can safely say that
the drive uses 4096-byte sectors and that we need to add it to the
quirks list in ata_da.c.

If the problem remains after that, then further investigation is needed,
and we can safely rule out alignment.  Welcome to all the pain/effort
one has to go through when troubleshooting things like this.  :-)

Another thing: in your PR you state:

 - I am running with kern.cam.ada.default_timeout=5 which makes the
 computer recover faster

I can definitely imagine cases where 

Re: Any objections/comments on axing out old ATA stack?

2013-04-02 Thread Matthias Andree
On 31.03.2013 23:02, Scott Long wrote:

 So what I hear you and Matthias saying, I believe, is that it should be
 easier to force disks to fall back to non-NCQ mode, and/or have a more
 responsive black-list for problematic controllers.  Would this help the
 situation?  It's hard to justify holding back overall forward progress
 because of some bad controllers; we do several Tbps off of AHCI
 controllers with NCQ enabled on FreeBSD 9.x, enough to make up a sizable
 percentage of the internet's traffic, and we see no problems.  How can we
 move forward but also take care of you guys with problematic hardware?

Well, I am running the driver fine off of my WD Caviar RE3 disk, and the
problematic drive also works just fine with Windows and Linux, so it
must be something between the problematic drive and the FreeBSD driver.

I would like to see any of this, in decreasing order of precedence:

- debugged driver

- assistance/instructions on how to debug the driver/trace NCQ
stuff/...  (as in Jeremy Chadwick's followup in this same thread - this
helps, I will attempt to procure the required information; back then,
reducing the number of tags to 31 was ineffective: it produced an error
message, and reading the setting back still returned 32)

- user-space contingency features, such as letting camcontrol limit
the number of open NCQ tags or disable NCQ, on a per-drive basis
I am capable of debugging C - mostly with gdb command-line, and
graphical Windows IDEs - but am unfamiliar with FreeBSD kernel
debugging. If necessary, I can pull up a second console, but the PC that
is affected is legacy-free, so serial port only works through a
serial/USB converter.





Re: Any objections/comments on axing out old ATA stack?

2013-04-02 Thread Matthias Andree
On 01.04.2013 17:07, Stefan Esser wrote:
 On 01.04.2013 15:14, Victor Balada Diaz wrote:
 Being able to configure quirks from loader.conf for disks AND controllers
 would be great and is not hard to do. If you want i can do a patch in two
 weeks and send it to you. That way it's easy to test disabling NCQ and/or
 other things in case of hitting a bug. Also being able to modify the
 configuration without a kernel recompile would be a big improvement
 because we could still use freebsd-update to keep systems updated.
 
 Something like:
 
 kern.cam.ada.0.quirks=1
 
 to force 4KB sectors?
 
 No need to implement that, it is in -CURRENT (did not check -STABLE).
 But there is no quirk, that disables NCQ, currently, although it is
 easy to implement. See the places where ADA_FLAG_CAN_NCQ is set and
 make that value depend on a new quirk flag being unset ...
 
 But instead of setting that flag in the loader, it would be good to
 collect drive signatures that need it and to add quirk entries for
 them in ata_da.c ...

Before we can do that, we need to know if it's really the drive's fault
or if the driver is wrong.  We need to debug that.

If we have relevant parameters exposed through the CAM interface (rather
than loader variables), that would also help expedite the debugging.





Re: Any objections/comments on axing out old ATA stack?

2013-04-01 Thread Victor Balada Diaz
On Sun, Mar 31, 2013 at 03:02:09PM -0600, Scott Long wrote:
 
 On Mar 31, 2013, at 7:04 AM, Victor Balada Diaz vic...@bsdes.net wrote:
 
  On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote:
  Hi.
  
  Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA 
  stack, using only some controller drivers of old ata(4) by having 
  `options ATA_CAM` enabled in all kernels by default. I have a wish to 
  drop non-ATA_CAM ata(4) code, unused since that time from the head 
  branch to allow further ATA code cleanup.
  
  Does any one here still uses legacy ATA stack (kernel explicitly built 
  without `options ATA_CAM`) for some reason, for example as workaround 
  for some regression? Does anybody have good ideas why we should not drop 
  it now?
  
  Hello,
  
  At my previous job we had troubles with NCQ on some controllers. It caused
  failures and silent data corruption. As old ata code didn't use NCQ we just
  used it.
  
  I reported some of the problems on 8.2[1] but the problem existed with 8.3.
  
  I no longer have access to those systems, so i don't know if the problem
  still exists or have been fixed on newer versions.
  
  Regards.
  Victor.
 
 
 So what I hear you and Matthias saying, I believe, is that it should be
 easier to force disks to fall back to non-NCQ mode, and/or have a more
 responsive black-list for problematic controllers.  Would this help the
 situation?  It's hard to justify holding back overall forward progress
 because of some bad controllers; we do several Tbps off of AHCI
 controllers with NCQ enabled on FreeBSD 9.x, enough to make up a sizable
 percentage of the internet's traffic, and we see no problems.  How can we
 move forward but also take care of you guys with problematic hardware?
 
 Scott

Being able to configure quirks from loader.conf for disks AND controllers would
be great and is not hard to do. If you want, I can do a patch in two weeks and
send it to you. That way it's easy to test disabling NCQ and/or other things in
case of hitting a bug. Also, being able to modify the configuration without a
kernel recompile would be a big improvement because we could still use
freebsd-update to keep systems updated.

Anyway, my comment was not against dropping the old ata code, but more about
the comments on regressions in the new one.

Regards.
Victor.
-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Any objections/comments on axing out old ATA stack?

2013-04-01 Thread Stefan Esser
On 01.04.2013 15:14, Victor Balada Diaz wrote:
 Being able to configure quirks from loader.conf for disks AND controllers
 would be great and is not hard to do. If you want i can do a patch in two
 weeks and send it to you. That way it's easy to test disabling NCQ and/or
 other things in case of hitting a bug. Also being able to modify the
 configuration without a kernel recompile would be a big improvement
 because we could still use freebsd-update to keep systems updated.

Something like:

kern.cam.ada.0.quirks=1

to force 4KB sectors?

No need to implement that, it is in -CURRENT (did not check -STABLE).
But there is no quirk, that disables NCQ, currently, although it is
easy to implement. See the places where ADA_FLAG_CAN_NCQ is set and
make that value depend on a new quirk flag being unset ...

But instead of setting that flag in the loader, it would be good to
collect drive signatures that need it and to add quirk entries for
them in ata_da.c ...
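
The idea of making the NCQ capability flag depend on a quirk being
unset can be modelled as below.  This is a plain Python model of the
logic only, not kernel code: the constant values are illustrative, the
NO_NCQ quirk is hypothetical (the message above says no such quirk
exists yet), and FLAG_CAN_NCQ merely stands in for ADA_FLAG_CAN_NCQ.

```python
QUIRK_4K = 0x01      # matches the "quirks=1 forces 4KB sectors" example above
QUIRK_NO_NCQ = 0x02  # hypothetical quirk bit, assumed for this sketch

FLAG_CAN_NCQ = 0x01  # stand-in for ADA_FLAG_CAN_NCQ; the value is made up


def probe_flags(drive_reports_ncq, quirks):
    """Set the NCQ flag only if the drive offers NCQ and no quirk forbids it."""
    flags = 0
    if drive_reports_ncq and not (quirks & QUIRK_NO_NCQ):
        flags |= FLAG_CAN_NCQ
    return flags


for quirks in (0, QUIRK_4K, QUIRK_NO_NCQ, QUIRK_4K | QUIRK_NO_NCQ):
    ncq = bool(probe_flags(True, quirks) & FLAG_CAN_NCQ)
    print(f"quirks=0x{quirks:02x} -> NCQ enabled: {ncq}")
```

The same check would serve both sources of quirks: a built-in table
entry matched by drive signature and a value supplied from the loader.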

Regards, STefan


Re: Any objections/comments on axing out old ATA stack?

2013-04-01 Thread Victor Balada Diaz
On Mon, Apr 01, 2013 at 05:07:20PM +0200, Stefan Esser wrote:
 On 01.04.2013 15:14, Victor Balada Diaz wrote:
  Being able to configure quirks from loader.conf for disks AND controllers
  would be great and is not hard to do. If you want i can do a patch in two
  weeks and send it to you. That way it's easy to test disabling NCQ and/or
  other things in case of hitting a bug. Also being able to modify the
  configuration without a kernel recompile would be a big improvement
  because we could still use freebsd-update to keep systems updated.
 
 Something like:
 
 kern.cam.ada.0.quirks=1
 
 to force 4KB sectors?
 
 No need to implement that, it is in -CURRENT (did not check -STABLE).
 But there is no quirk, that disables NCQ, currently, although it is
 easy to implement. See the places where ADA_FLAG_CAN_NCQ is set and
 make that value depend on a new quirk flag being unset ...
 
 But instead of setting that flag in the loader, it would be good to
 collect drive signatures that need it and to add quirk entries for
 them in ata_da.c ...
 
 Regards, STefan

Yep, something like that, but also for controllers. Looking here[1], I don't
see it implemented for controllers on -CURRENT.

I agree that we should collect drive and controller signatures and add those
quirks to the OS, but being able to play with quirks from the loader is still
useful.

If your FreeBSD version doesn't yet have the quirks needed for the
disk/controller that you're using, you'd need to patch and rebuild a custom
kernel. Having a loader tunable makes maintaining old FreeBSD versions easier.

Regards.
Victor.

[1]: http://fxr.watson.org/fxr/source/dev/ahci/ahci.c
-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: Any objections/comments on axing out old ATA stack?

2013-03-31 Thread Alexander Motin

On 31.03.2013 08:13, Ian Smith wrote:

On Sat, 30 Mar 2013 21:00:24 -0700, Peter Wemm wrote:
   On Sat, Mar 30, 2013 at 4:29 PM, Matthias Andree mand...@freebsd.org wrote:
On 27.03.2013 22:22, Alexander Motin wrote:
Hi.
   
Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA
stack, using only some controller drivers of old ata(4) by having
`options ATA_CAM` enabled in all kernels by default. I have a wish to
drop non-ATA_CAM ata(4) code, unused since that time from the head
branch to allow further ATA code cleanup.
   
Does any one here still uses legacy ATA stack (kernel explicitly built
without `options ATA_CAM`) for some reason, for example as workaround
for some regression? Does anybody have good ideas why we should not drop
it now?
   
Alexander,
   
The regression in http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/157397
where the SATA NCQ slots stall for some Samsung drives in the new stack,
and consequently hang the computer for prolonged episodes where it is in
the NCQ error handling, disallows removal of the old driver. (Last
checked with 9.1-RELEASE at current patchlevel.)
  
   We're talking about 10.x, so if you want it fixed, you need update
   with 10.x information.
  
   Please put 10.x diagnostics in the PR.

Given Alexander also posted this to -stable, just for clarity, are we
_only_ talking about 10.x here, or might this change get MFC'd to 9?


Yes, I am only going to drop it from 10.x, but bug reports from 9-STABLE 
users are welcome, as at some point they will become 10.x users.


--
Alexander Motin


Re: Any objections/comments on axing out old ATA stack?

2013-03-31 Thread Matthias Andree
On 31.03.2013 06:00, Peter Wemm wrote:

 We're talking about 10.x, so if you want it fixed, you need update
 with 10.x information.
 
 Please put 10.x diagnostics in the PR.

I will not.  The PR was filed four months before 10-CURRENT branched;
I have no reason to assume it is no longer pertinent (no MFCs,
no PR followups).

(according to
http://www.freebsd.org/doc/en/books/porters-handbook/freebsd-versions.html,
10-CURRENT appeared on 2011-09-26)



Re: Any objections/comments on axing out old ATA stack?

2013-03-31 Thread Victor Balada Diaz
On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote:
 Hi.
 
 Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA 
 stack, using only some controller drivers of old ata(4) by having 
 `options ATA_CAM` enabled in all kernels by default. I have a wish to 
 drop non-ATA_CAM ata(4) code, unused since that time from the head 
 branch to allow further ATA code cleanup.
 
 Does any one here still uses legacy ATA stack (kernel explicitly built 
 without `options ATA_CAM`) for some reason, for example as workaround 
 for some regression? Does anybody have good ideas why we should not drop 
 it now?

Hello,

At my previous job we had trouble with NCQ on some controllers. It caused
failures and silent data corruption. As the old ata code didn't use NCQ, we
just used it.

I reported some of the problems on 8.2[1], but the problem also existed with 8.3.

I no longer have access to those systems, so I don't know if the problem
still exists or has been fixed in newer versions.

Regards.
Victor.

[1]: https://groups.google.com/forum/#!topic/muc.lists.freebsd.stable/dAMf028CtXM
-- 
The most convincing proof that intelligent life exists on other
planets is that they have not tried to contact us.


Re: Any objections/comments on axing out old ATA stack?

2013-03-31 Thread Scott Long

On Mar 31, 2013, at 7:04 AM, Victor Balada Diaz vic...@bsdes.net wrote:

 On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote:
 Hi.
 
 Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA 
 stack, using only some controller drivers of old ata(4) by having 
 `options ATA_CAM` enabled in all kernels by default. I have a wish to 
 drop non-ATA_CAM ata(4) code, unused since that time from the head 
 branch to allow further ATA code cleanup.
 
 Does any one here still uses legacy ATA stack (kernel explicitly built 
 without `options ATA_CAM`) for some reason, for example as workaround 
 for some regression? Does anybody have good ideas why we should not drop 
 it now?
 
 Hello,
 
 At my previous job we had troubles with NCQ on some controllers. It caused
 failures and silent data corruption. As old ata code didn't use NCQ we just
 used it.
 
 I reported some of the problems on 8.2[1] but the problem existed with 8.3.
 
 I no longer have access to those systems, so i don't know if the problem
 still exists or have been fixed on newer versions.
 
 Regards.
 Victor.


So what I hear you and Matthias saying, I believe, is that it should be easier
to force disks to fall back to non-NCQ mode, and/or have a more responsive
black-list for problematic controllers.  Would this help the situation?  It's
hard to justify holding back overall forward progress because of some bad
controllers; we do several Tbps off of AHCI controllers with NCQ enabled on
FreeBSD 9.x, enough to make up a sizable percentage of the internet's traffic,
and we see no problems.  How can we move forward but also take care of you guys
with problematic hardware?

Scott



Re: Any objections/comments on axing out old ATA stack?

2013-03-31 Thread Jeremy Chadwick
On Sun, Mar 31, 2013 at 03:02:09PM -0600, Scott Long wrote:
 On Mar 31, 2013, at 7:04 AM, Victor Balada Diaz vic...@bsdes.net wrote:
  On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote:
  Hi.
  
  Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA 
  stack, using only some controller drivers of old ata(4) by having 
  `options ATA_CAM` enabled in all kernels by default. I have a wish to 
  drop non-ATA_CAM ata(4) code, unused since that time from the head 
  branch to allow further ATA code cleanup.
  
  Does any one here still uses legacy ATA stack (kernel explicitly built 
  without `options ATA_CAM`) for some reason, for example as workaround 
  for some regression? Does anybody have good ideas why we should not drop 
  it now?
  
  Hello,
  
  At my previous job we had troubles with NCQ on some controllers. It caused
  failures and silent data corruption. As old ata code didn't use NCQ we just
  used it.
  
  I reported some of the problems on 8.2[1] but the problem existed with 8.3.
  
  I no longer have access to those systems, so i don't know if the problem
  still exists or have been fixed on newer versions.
 
 So what I hear you and Matthias saying, I believe, is that it should be
 easier to force disks to fall back to non-NCQ mode, and/or have a more
 responsive black-list for problematic controllers.  Would this help the
 situation?  It's hard to justify holding back overall forward progress
 because of some bad controllers; we do several Tbps off of AHCI
 controllers with NCQ enabled on FreeBSD 9.x, enough to make up a sizable
 percentage of the internet's traffic, and we see no problems.  How can we
 move forward but also take care of you guys with problematic hardware?

I've read the referenced PR (157397), but there really isn't enough
technical troubleshooting/detail to determine what the root cause is.

That isn't the fault of the reporter either -- the reporter needs to be
told what information they need to provide / how to troubleshoot it.
Meaning: kernel folks who are in-the-know need to step up and help.

That PR is soon-to-be 2 years old and is missing tons of information
that even *I*, as a non-kernel guy, would find useful:

1. Output from:
   - camcontrol tags ada1 -v
   - camcontrol identify ada1
   - What sorts of filesystems are on ada1; if UFS, tunefs -p output
     would be greatly appreciated
   - If the timeouts happen during heavy I/O load, and if so, during
     what kinds of I/O load (reads or writes).

2. Does "camcontrol tags ada1 -N 31" help?  I mention this because in
the message here:

http://lists.freebsd.org/pipermail/freebsd-stable/2013-March/072985.html

...there are statements which imply decreasing queue length may solve
the issue.  What confuses me, however, is that the queue length on my
own systems (with different models of disks, as well as an SSD) all have
a limit of 32.  I dug through the kernel source for a while but could
not easily find where this number comes from.  (I have very little
familiarity with command queuing at the protocol level)

3. Why not find out why Linux (probably libata) has a 32 (or 31?) queue
limit?  They have commit logs, and there is the LKML where you could
ask.  While I understand reluctance to add something just because Linux
does it, it doesn't appear anyone's stepped up to the plate to ask them
why; I pray this is not caused by anti-Linux sentiment.

4. The ada1 device in the PR is a Samsung Spinpoint EcoGreen F2 hard
drive (1TB, 5400rpm, 32MB cache).  Possibly the drive has firmware bugs
relating to its NCQ implementation, or possibly it's going into some
power-saving mode (it is an EcoGreen model).  I've always been wary of
the EcoGreen disks since reading about the F4 EcoGreen firmware fiasco
(even though the same page says the F1 and F3 EcoGreen had no issue):

http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks

5. We really need to have some way to print active quirks for devices,
even if it's only at boot-up, e.g.:

ada3: quirks=0x0003<4K,NO_NCQ>

I'd be happy to write the code for this (basing it on how we do CPU
flags), but as I've said in the past, kernel-land is scary to me.

6. The controller referenced is an ATI IXP700.  I cannot tell you how
many times on the mailing lists I've seen weird issues reported by
people using that controller.  I am in no way/shape/form saying the
issue is with the controller or with AHCI compatibility (FreeBSD vs.
ATI), because I have no proof.  I just find it very unnerving that so
many issues have been reported where that controller is involved, and
often across all sorts of different device/disk models.
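
To illustrate the boot-time quirk line proposed in item 5, here is a
rough Python sketch of the formatting, in the style of how CPU flags
are printed (hex value followed by the names of the set bits; the
example line presumably had angle brackets around the names before the
archive stripped them).  The bit-to-name table is an assumption for the
example; in particular NO_NCQ is hypothetical, and this is not how the
kernel would implement it.

```python
# Hypothetical quirk bits and their display names, assumed for illustration.
QUIRK_NAMES = {
    0x0001: "4K",
    0x0002: "NO_NCQ",
}


def format_quirks(device, quirks):
    """Render e.g. 'ada3: quirks=0x0003<4K,NO_NCQ>' from a quirk bitmask."""
    names = [name for bit, name in sorted(QUIRK_NAMES.items()) if quirks & bit]
    return f"{device}: quirks=0x{quirks:04x}<{','.join(names)}>"


print(format_quirks("ada3", 0x0003))
print(format_quirks("ada1", 0x0001))
```

Printing the names next to the raw value means a bug reporter does not
need to look up the bit definitions in the source.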

All that said:

I agree a loader tunable to inhibit command queueing would be nice.
sysctl would be even more convenient (easier for real-time testing) but
I don't know the implications of turning CQ off in the middle of any
pending I/O requests.

-- 
| Jeremy Chadwick   j...@koitsu.org |
| 

Re: Any objections/comments on axing out old ATA stack?

2013-03-30 Thread Matthias Andree
On 27.03.2013 22:22, Alexander Motin wrote:
 Hi.
 
 Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA
 stack, using only some controller drivers of old ata(4) by having
 `options ATA_CAM` enabled in all kernels by default. I have a wish to
 drop non-ATA_CAM ata(4) code, unused since that time from the head
 branch to allow further ATA code cleanup.
 
 Does any one here still uses legacy ATA stack (kernel explicitly built
 without `options ATA_CAM`) for some reason, for example as workaround
 for some regression? Does anybody have good ideas why we should not drop
 it now?

Alexander,

The regression in http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/157397
where the SATA NCQ slots stall for some Samsung drives in the new stack,
and consequently hang the computer for prolonged episodes where it is in
the NCQ error handling, disallows removal of the old driver. (Last
checked with 9.1-RELEASE at current patchlevel.)

Chances are that limiting the open queue slots to 31 might help, but
that is hearsay from what Linux would be doing.

Unless we get a fix, if you want to drop the old driver, you'll need to
add features so that

1. the new driver lets users (down-)configure the max. number of
tagged openings

2. the new driver allows disabling NCQ altogether for individual drives

3. the relevant Samsung drives are listed in some quirks database so that
we avoid the stalls while permitting users to open it up to 32 NCQ slots.

So unless these are all addressed, I'd veto removal of the old ATA
driver - sorry!

Best regards
Matthias






Re: Any objections/comments on axing out old ATA stack?

2013-03-30 Thread Matthias Andree
On 28.03.2013 16:31, Scott Long wrote:
 
 On Mar 28, 2013, at 8:00 AM, Ian Lepore i...@freebsd.org wrote:
 
 On Thu, 2013-03-28 at 09:17 +0200, Alexander Motin wrote:
 On 28.03.2013 02:43, Adrian Chadd wrote:
 My main concern with the new stuff is that it requires CAM and that's
 reasonably big compared to the standalone ATA code.

 It'd be nice if we could slim down the CAM stack a bit first; it makes
 embedding it on the smaller devices really freaking painful.

 Are there many boards now with ATA, but without USB? But I agree, it 
 should be checked.


 It's not necessarily what the boards have but how they're used.  We use
 industrial SBCs at work that have ata compact flash sockets on the board
 which we do use, and usb interfaces which we don't use.

 I've never tested the new ata+cam stuff on some of these boards, most
 based on Cyrix, Via, Geode, and VortexD86 chipsets.  The older ata code
 works, but not always very well -- for example, we usually have to set
 hw.ata.ata_dma=0 for absolutely no reason we've ever been able to figure
 out except that if we leave it enabled we get DMA errors and panics on
 some CF cards and not on others.  I have no idea whether to expect such
 things to be better, worse, or no different by changing to the ata+cam
 way of doing things (but I don't really have time to do extensive
 testing right now either).

 
 
 The legacy ATA code was hard to maintain, very buggy (as you point out), and
 is essentially unmaintained.  Also, IIRC, the legacy stack simply cannot 
 support
 NCQ tagged queueing.

...which is exactly why it currently is the only way to get certain
Samsung drives to cooperate reliably, without stalling the kernel for
prolonged times (minutes) making the computer essentially unusable once
it gets under I/O load (such as make -C /usr/src -j4 buildworld) - as
the new ahci+ata+cam+... would.

Details including PR reference in my other message in this thread.





Re: Any objections/comments on axing out old ATA stack?

2013-03-30 Thread Peter Wemm
On Sat, Mar 30, 2013 at 4:29 PM, Matthias Andree mand...@freebsd.org wrote:
 On 27.03.2013 22:22, Alexander Motin wrote:
 Hi.

 Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA
 stack, using only some controller drivers of old ata(4) by having
 `options ATA_CAM` enabled in all kernels by default. I have a wish to
 drop non-ATA_CAM ata(4) code, unused since that time from the head
 branch to allow further ATA code cleanup.

 Does any one here still uses legacy ATA stack (kernel explicitly built
 without `options ATA_CAM`) for some reason, for example as workaround
 for some regression? Does anybody have good ideas why we should not drop
 it now?

 Alexander,

 The regression in http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/157397
 where the SATA NCQ slots stall for some Samsung drives in the new stack,
 and consequently hang the computer for prolonged episodes where it is in
 the NCQ error handling, disallows removal of the old driver. (Last
 checked with 9.1-RELEASE at current patchlevel.)

We're talking about 10.x, so if you want it fixed, you need to update
the PR with 10.x information.

Please put 10.x diagnostics in the PR.

-- 
Peter Wemm - pe...@wemm.org; pe...@freebsd.org; pe...@yahoo-inc.com; KI6FJV
bitcoin:188ZjyYLFJiEheQZw4UtU27e2FMLmuRBUE


Re: Any objections/comments on axing out old ATA stack?

2013-03-30 Thread Ian Smith
On Sat, 30 Mar 2013 21:00:24 -0700, Peter Wemm wrote:
  On Sat, Mar 30, 2013 at 4:29 PM, Matthias Andree mand...@freebsd.org wrote:
   On 27.03.2013 22:22, Alexander Motin wrote:
   Hi.
  
   Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA
   stack, using only some controller drivers of old ata(4) by having
   `options ATA_CAM` enabled in all kernels by default. I have a wish to
   drop non-ATA_CAM ata(4) code, unused since that time from the head
   branch to allow further ATA code cleanup.
  
   Does any one here still uses legacy ATA stack (kernel explicitly built
   without `options ATA_CAM`) for some reason, for example as workaround
   for some regression? Does anybody have good ideas why we should not drop
   it now?
  
   Alexander,
  
   The regression in http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/157397
   where the SATA NCQ slots stall for some Samsung drives in the new stack,
   and consequently hang the computer for prolonged episodes where it is in
   the NCQ error handling, disallows removal of the old driver. (Last
   checked with 9.1-RELEASE at current patchlevel.)
  
  We're talking about 10.x, so if you want it fixed, you need update
  with 10.x information.
  
  Please put 10.x diagnostics in the PR.

Given Alexander also posted this to -stable, just for clarity, are we 
_only_ talking about 10.x here, or might this change get MFC'd to 9?

cheers, Ian

(dropping -current as I'm not subscribed so would only get bounced)


Re: Any objections/comments on axing out old ATA stack?

2013-03-28 Thread Alexander Motin

On 28.03.2013 02:43, Adrian Chadd wrote:

My main concern with the new stuff is that it requires CAM and that's
reasonably big compared to the standalone ATA code.

It'd be nice if we could slim down the CAM stack a bit first; it makes
embedding it on the smaller devices really freaking painful.


Are there many boards now with ATA, but without USB? But I agree, it 
should be checked.


--
Alexander Motin


Re: Any objections/comments on axing out old ATA stack?

2013-03-28 Thread John-Mark Gurney
Alexander Motin wrote this message on Thu, Mar 28, 2013 at 09:17 +0200:
 On 28.03.2013 02:43, Adrian Chadd wrote:
 My main concern with the new stuff is that it requires CAM and that's
 reasonably big compared to the standalone ATA code.
 
 It'd be nice if we could slim down the CAM stack a bit first; it makes
 embedding it on the smaller devices really freaking painful.
 
 Are there many boards now with ATA, but without USB? But I agree, it 
 should be checked.

The net4501 board has ATA but no USB.. Also, depending upon use, you
might choose to not include USB, but use ATA, or not use umass, but
the rest of USB...

Someone on a list was talking about trying to get FreeBSD down on a
really small system, 16MB ram...

/me thinks of the old wd driver.

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 All that I will do, has been done, All that I have, has not.


Re: Any objections/comments on axing out old ATA stack?

2013-03-28 Thread Ian Lepore
On Thu, 2013-03-28 at 09:17 +0200, Alexander Motin wrote:
 On 28.03.2013 02:43, Adrian Chadd wrote:
  My main concern with the new stuff is that it requires CAM and that's
  reasonably big compared to the standalone ATA code.
 
  It'd be nice if we could slim down the CAM stack a bit first; it makes
  embedding it on the smaller devices really freaking painful.
 
 Are there many boards now with ATA, but without USB? But I agree, it 
 should be checked.
 

It's not necessarily what the boards have but how they're used.  We use
industrial SBCs at work that have ata compact flash sockets on the board
which we do use, and usb interfaces which we don't use.

I've never tested the new ata+cam stuff on some of these boards, most
based on Cyrix, Via, Geode, and VortexD86 chipsets.  The older ata code
works, but not always very well -- for example, we usually have to set
hw.ata.ata_dma=0 for absolutely no reason we've ever been able to figure
out except that if we leave it enabled we get DMA errors and panics on
some CF cards and not on others.  I have no idea whether to expect such
things to be better, worse, or no different by changing to the ata+cam
way of doing things (but I don't really have time to do extensive
testing right now either).

-- Ian




Re: Any objections/comments on axing out old ATA stack?

2013-03-28 Thread Aleksandr Rybalko
On Wed, 27 Mar 2013 17:43:07 -0700
Adrian Chadd adr...@freebsd.org wrote:

 My main concern with the new stuff is that it requires CAM and that's
 reasonably big compared to the standalone ATA code.
 
 It'd be nice if we could slim down the CAM stack a bit first; it makes
 embedding it on the smaller devices really freaking painful.

/me has never seen embedded devices with ATA/SATA and less than 64MB of RAM.
(old i386/i486 machines do not count :) )
Am I missing something?

 
 Thanks,
 
 
 
 adrian
 ___
 freebsd-curr...@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


-- 
Aleksandr Rybalko r...@ddteam.net


Re: Any objections/comments on axing out old ATA stack?

2013-03-28 Thread Scott Long

On Mar 27, 2013, at 6:43 PM, Adrian Chadd adr...@freebsd.org wrote:

 My main concern with the new stuff is that it requires CAM and that's
 reasonably big compared to the standalone ATA code.
 

From a code execution standpoint?  No, it's not.

 It'd be nice if we could slim down the CAM stack a bit first; it makes
 embedding it on the smaller devices really freaking painful.
 

From a code segment size standpoint, there's definitely some stuff that
should be made modular and optional.

Scott



Re: Any objections/comments on axing out old ATA stack?

2013-03-28 Thread Scott Long

On Mar 28, 2013, at 8:00 AM, Ian Lepore i...@freebsd.org wrote:

 On Thu, 2013-03-28 at 09:17 +0200, Alexander Motin wrote:
 On 28.03.2013 02:43, Adrian Chadd wrote:
 My main concern with the new stuff is that it requires CAM and that's
 reasonably big compared to the standalone ATA code.
 
 It'd be nice if we could slim down the CAM stack a bit first; it makes
 embedding it on the smaller devices really freaking painful.
 
 Are there many boards now with ATA, but without USB? But I agree, it 
 should be checked.
 
 
 It's not necessarily what the boards have but how they're used.  We use
 industrial SBCs at work that have ata compact flash sockets on the board
 which we do use, and usb interfaces which we don't use.
 
 I've never tested the new ata+cam stuff on some of these boards, most
 based on Cyrix, Via, Geode, and VortexD86 chipsets.  The older ata code
 works, but not always very well -- for example, we usually have to set
 hw.ata.ata_dma=0 for absolutely no reason we've ever been able to figure
 out except that if we leave it enabled we get DMA errors and panics on
 some CF cards and not on others.  I have no idea whether to expect such
 things to be better, worse, or no different by changing to the ata+cam
 way of doing things (but I don't really have time to do extensive
 testing right now either).
 


The legacy ATA code was hard to maintain, very buggy (as you point out), and
is essentially unmaintained.  Also, IIRC, the legacy stack simply cannot support
NCQ tagged queueing.

I think that Alexander has done a superb job with both developing and supporting
the ATA_CAM stack.

Scott




Re: Any objections/comments on axing out old ATA stack?

2013-03-28 Thread Lev Serebryakov
Hello, Aleksandr.
You wrote on 28 March 2013, 18:09:53:


 It'd be nice if we could slim down the CAM stack a bit first; it makes
 embedding it on the smaller devices really freaking painful.
AR /me never seen embedded devices with ATA/SATA and less than 64MB of RAM.
AR (i386/i486 old machines does not count :) )
AR I'm missing something?
 Yes: USB UMASS. It uses CAM too, and is useful for very small systems,
 like 4MiB flash and 16MiB RAM (yes, the whole system image, kernel and
 all, has to be packed into 4MiB).

 Please note, Adrian speaks about CAM, not only CAM + ATA.

-- 
// Black Lion AKA Lev Serebryakov l...@freebsd.org


Re: Any objections/comments on axing out old ATA stack?

2013-03-28 Thread Adrian Chadd
On 28 March 2013 09:05, Lev Serebryakov l...@freebsd.org wrote:

  Yes:  USB  UMASS. It uses CAM too, and useful for very small systems,
  like  4MiB  FLASH  and 16MiB RAM (yes, whole system image, kernel and
  all, should be packed to 4MiB).

  Please note, Adrian speaks about CAM, not only CAM + ATA.

And I'm not at all saying we should keep the old ATA driver around.
I'm just pointing out a set of use cases that most FreeBSD developers
aren't involved with and I'd like to find a way to squeeze it more
efficiently into embedded platforms.

I've never had any noticeable performance issues with CAM on my
embedded MIPS boards because it's typically pushing packets. It's just
the resultant binary size of the whole stack that's a problem.

adrian@freefall:~/public_html/ath$ cat AP121-nodebug.txt | grep scsi
   text    data     bss     dec     hex filename
  49372   10672      80   60124    eadc scsi_all.o
  21200    2576      16   23792    5cf0 scsi_da.o
  23288    1488      16   24792    60d8 scsi_xpt.o
adrian@freefall:~/public_html/ath$ cat AP121-nodebug.txt | grep cam
   text    data     bss     dec     hex filename
   3824      96      16    3936     f60 cam.o
  13552     144      16   13712    3590 cam_periph.o
   2344     144       0    2488     9b8 cam_queue.o
    640      48       0     688     2b0 cam_sim.o
  40684     752     192   41628    a29c cam_xpt.o
adrian@freefall:~/public_html/ath$ cat AP121-nodebug.txt | grep umass
   text    data     bss     dec     hex filename
  22592    1072      16   23680    5c80 umass.o

adrian@freefall:~/public_html/ath$ cat AP121-nodebug.txt | egrep '(cam_|umass|scsi_)'
  13552     144      16   13712    3590 cam_periph.o
   2344     144       0    2488     9b8 cam_queue.o
    640      48       0     688     2b0 cam_sim.o
  40684     752     192   41628    a29c cam_xpt.o
  49372   10672      80   60124    eadc scsi_all.o
  21200    2576      16   23792    5cf0 scsi_da.o
  23288    1488      16   24792    60d8 scsi_xpt.o
  22592    1072      16   23680    5c80 umass.o
adrian@freefall:~/public_html/ath$ cat AP121-nodebug.txt | egrep '(cam_|umass|scsi_)' | awk '{a+=$4} END {print a}'
190904

It doesn't seem like a lot, but it does add up..
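
For anyone who wants to repeat that tally without awk, here is a minimal
sketch in Python. It assumes the usual six-column size(1) layout (text,
data, bss, dec, hex, filename) and sums the "dec" column for the object
files whose names match the same pattern as the egrep above. The function
name total_dec and the embedded sample are illustrative only; the rows
are the per-object figures quoted in this message.

```python
import re

# Sample size(1) rows, copied from the per-object figures in this message.
SIZE_OUTPUT = """\
   text    data     bss     dec     hex filename
   3824      96      16    3936     f60 cam.o
  13552     144      16   13712    3590 cam_periph.o
   2344     144       0    2488     9b8 cam_queue.o
    640      48       0     688     2b0 cam_sim.o
  40684     752     192   41628    a29c cam_xpt.o
  49372   10672      80   60124    eadc scsi_all.o
  21200    2576      16   23792    5cf0 scsi_da.o
  23288    1488      16   24792    60d8 scsi_xpt.o
  22592    1072      16   23680    5c80 umass.o
"""


def total_dec(size_output, pattern=r"(cam_|umass|scsi_)"):
    """Sum the 'dec' column for rows whose filename matches pattern."""
    matcher = re.compile(pattern)
    total = 0
    for line in size_output.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a six-column data row.
        if len(fields) != 6 or not fields[3].isdigit():
            continue
        if matcher.search(fields[5]):
            total += int(fields[3])
    return total


if __name__ == "__main__":
    print(total_dec(SIZE_OUTPUT))  # prints 190904
```

Note that cam.o is left out of the total, exactly as in the egrep, because
its name does not contain "cam_".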



Adrian


Re: Any objections/comments on axing out old ATA stack?

2013-03-28 Thread Adrian Chadd
.. and before you ask - yes, there are embedded boards with limited
RAM that also have ATA ports. :-)




Adrian


Re: Any objections/comments on axing out old ATA stack?

2013-03-28 Thread Poul-Henning Kamp
In message CAJ-Vmo=qATZHubkKZ2heiJ3528e__JG4RLru7LU9rwP5_EwT=g...@mail.gmail.com,
Adrian Chadd writes:
On 28 March 2013 09:05, Lev Serebryakov l...@freebsd.org wrote:

adrian@freefall:~/public_html/ath$ cat AP121-nodebug.txt | egrep
'(cam_|umass|scsi_)' | awk '{a+=$4} END {print a}'
190904

It doesn't seem like a lot, but it does add up..

Isn't there some kernel compile-time option to eliminate the huge
tables used for error messages etc.?

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: Any objections/comments on axing out old ATA stack?

2013-03-28 Thread Daniel Eischen

On Thu, 28 Mar 2013, Ian Lepore wrote:


On Thu, 2013-03-28 at 09:17 +0200, Alexander Motin wrote:

On 28.03.2013 02:43, Adrian Chadd wrote:

My main concern with the new stuff is that it requires CAM and that's
reasonably big compared to the standalone ATA code.

It'd be nice if we could slim down the CAM stack a bit first; it makes
embedding it on the smaller devices really freaking painful.


Are there many boards now with ATA, but without USB? But I agree, it
should be checked.



It's not necessarily what the boards have but how they're used.  We use
industrial SBCs at work that have ata compact flash sockets on the board
which we do use, and usb interfaces which we don't use.

I've never tested the new ata+cam stuff on some of these boards, most
based on Cyrix, Via, Geode, and VortexD86 chipsets.  The older ata code
works, but not always very well -- for example, we usually have to set
hw.ata.ata_dma=0 for absolutely no reason we've ever been able to figure
out except that if we leave it enabled we get DMA errors and panics on
some CF cards and not on others.  I have no idea whether to expect such
things to be better, worse, or no different by changing to the ata+cam
way of doing things (but I don't really have time to do extensive
testing right now either).


Whoa, I have to set hw.ata.ata_dma=0 also in order to get
FreeBSD to boot on a PC104 board.  I think ours is a Cyrix
or Via also.

--
DE


Re: Any objections/comments on axing out old ATA stack?

2013-03-28 Thread Adrian Chadd
On 28 March 2013 10:26, Poul-Henning Kamp p...@phk.freebsd.dk wrote:

 Isn't there some kernel compile-time option to eliminate the huge
 tables used for errormessages etc ?

Yup. It doesn't save all that much in the grand scheme of things.
Doubly so since my secondary size constraint is an 896k partition that
I lzma compress the kernel to fit into.

Those strings don't add much to the final lzma image size.
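
A rough way to check that kind of claim is to compress an image with and
without the extra data and compare the results against the partition
limit. The sketch below uses Python's standard lzma module. It is not the
actual build tooling, the helper names are made up for illustration, and
it assumes "896k" means 896 KiB and that the default .xz container is
close enough to whatever format the boot loader expects.

```python
import lzma

# Assumption: "896k" is read as 896 KiB; the thread does not say which.
PARTITION_LIMIT = 896 * 1024


def compressed_size(data: bytes) -> int:
    """Length of data after LZMA compression at the highest preset."""
    return len(lzma.compress(data, preset=9))


def fits_partition(data: bytes, limit: int = PARTITION_LIMIT) -> bool:
    """True if the compressed image is no larger than the partition."""
    return compressed_size(data) <= limit


def marginal_cost(base: bytes, extra: bytes) -> int:
    """Compressed bytes gained by appending extra to base."""
    return compressed_size(base + extra) - compressed_size(base)


if __name__ == "__main__":
    # Stand-in data, not a real kernel: a binary-looking base plus a
    # repetitive, text-heavy table of made-up messages.
    base = bytes(range(256)) * 64
    strings = b"".join(
        b"hypothetical error message %d\n" % i for i in range(500)
    )
    print(len(strings), marginal_cost(base, strings),
          fits_partition(base + strings))
```

Running it against a real kernel built with and without the message
tables would show how much those tables cost after compression.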



Adrian


Re: Any objections/comments on axing out old ATA stack?

2013-03-27 Thread Steve Kargl
On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote:
 Hi.
 
 Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA 
 stack, using only some controller drivers of old ata(4) by having 
 `options ATA_CAM` enabled in all kernels by default. I have a wish to 
 drop non-ATA_CAM ata(4) code, unused since that time from the head 
 branch to allow further ATA code cleanup.
 
 Does any one here still uses legacy ATA stack (kernel explicitly built 
 without `options ATA_CAM`) for some reason, for example as workaround 
 for some regression?

Yes, I use the legacy ATA stack.

 Does anybody have good ideas why we should not drop 
 it now?

Because it works?

-- 
Steve


Re: Any objections/comments on axing out old ATA stack?

2013-03-27 Thread Alexander Motin

On 27.03.2013 23:32, Steve Kargl wrote:

On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote:

Hi.

Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA
stack, using only some controller drivers of old ata(4) by having
`options ATA_CAM` enabled in all kernels by default. I have a wish to
drop non-ATA_CAM ata(4) code, unused since that time from the head
branch to allow further ATA code cleanup.

Does any one here still uses legacy ATA stack (kernel explicitly built
without `options ATA_CAM`) for some reason, for example as workaround
for some regression?


Yes, I use the legacy ATA stack.


On 9.x or HEAD where new one is default?


Does anybody have good ideas why we should not drop
it now?


Because it works?


Any problems with new one?

--
Alexander Motin


Re: Any objections/comments on axing out old ATA stack?

2013-03-27 Thread Freddie Cash
On Wed, Mar 27, 2013 at 2:32 PM, Steve Kargl 
s...@troutmask.apl.washington.edu wrote:

 On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote:
  Hi.
 
  Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA
  stack, using only some controller drivers of old ata(4) by having
  `options ATA_CAM` enabled in all kernels by default. I have a wish to
  drop non-ATA_CAM ata(4) code, unused since that time from the head
  branch to allow further ATA code cleanup.
 
  Does any one here still uses legacy ATA stack (kernel explicitly built
  without `options ATA_CAM`) for some reason, for example as workaround
  for some regression?

 Yes, I use the legacy ATA stack.

You're missing the reason why you're running the old ATA stack.

Do you have hardware that doesn't work with ATA_CAM?  Have you not tried
ATA_CAM on that box?  Some other reason?

-- 
Freddie Cash
fjwc...@gmail.com


Re: Any objections/comments on axing out old ATA stack?

2013-03-27 Thread Steve Kargl
On Wed, Mar 27, 2013 at 11:35:35PM +0200, Alexander Motin wrote:
 On 27.03.2013 23:32, Steve Kargl wrote:
  On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote:
  Hi.
 
  Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA
  stack, using only some controller drivers of old ata(4) by having
  `options ATA_CAM` enabled in all kernels by default. I have a wish to
  drop non-ATA_CAM ata(4) code, unused since that time from the head
  branch to allow further ATA code cleanup.
 
  Does any one here still uses legacy ATA stack (kernel explicitly built
  without `options ATA_CAM`) for some reason, for example as workaround
  for some regression?
 
  Yes, I use the legacy ATA stack.
 
 On 9.x or HEAD where new one is default?

Head.

  Does anybody have good ideas why we should not drop
  it now?
 
  Because it works?
 
 Any problems with new one?
 

Last time I tested the new one, and this was several months
ago, the system (a Dell Latitude D530 laptop) would not boot.

-- 
Steve


Re: Any objections/comments on axing out old ATA stack?

2013-03-27 Thread Alexander Motin

On 28.03.2013 00:05, Steve Kargl wrote:

On Wed, Mar 27, 2013 at 11:35:35PM +0200, Alexander Motin wrote:

On 27.03.2013 23:32, Steve Kargl wrote:

On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote:

Hi.

Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA
stack, using only some controller drivers of old ata(4) by having
`options ATA_CAM` enabled in all kernels by default. I have a wish to
drop non-ATA_CAM ata(4) code, unused since that time from the head
branch to allow further ATA code cleanup.

Does any one here still uses legacy ATA stack (kernel explicitly built
without `options ATA_CAM`) for some reason, for example as workaround
for some regression?


Yes, I use the legacy ATA stack.


On 9.x or HEAD where new one is default?


Head.


Does anybody have good ideas why we should not drop
it now?


Because it works?


Any problems with new one?



Last time I tested the new one, and this was several months
ago, the system (a Dell Latitude D530 laptop) would not boot.


Probably we should just fix that. Any more info?

--
Alexander Motin


Re: Any objections/comments on axing out old ATA stack?

2013-03-27 Thread Steve Kargl
On Thu, Mar 28, 2013 at 12:22:11AM +0200, Alexander Motin wrote:
 On 28.03.2013 00:05, Steve Kargl wrote:
 
  Last time I tested the new one, and this was several months
  ago, the system (a Dell Latitude D530 laptop) would not boot.
 
 Probably we should just fix that. Any more info?
 

I can't remember all the details.  I intended to try again
as work was being done on the new code at the time.  I 
never got around to it as my laptop worked fine with the
old code and unfortunately I got busy with work and family.
Reading the freebsd-current mailing lists suggests that 
now is not the time to be a hero.

-- 
Steve


Re: Any objections/comments on axing out old ATA stack?

2013-03-27 Thread Adrian Chadd
My main concern with the new stuff is that it requires CAM and that's
reasonably big compared to the standalone ATA code.

It'd be nice if we could slim down the CAM stack a bit first; it makes
embedding it on the smaller devices really freaking painful.

Thanks,



adrian