Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-11-04 Thread grant beattie
Ed Saipetch wrote:
> To answer a number of questions:
> 
> Regarding different controllers, I've tried 2 Syba Sil 3114 controllers 
> purchased about 4 months apart.  I've tried 5.4.3 firmware with one and 
> 5.4.13 with another.  Maybe Syba makes crappy Sil 3114 cards but it's the 
> same one that someone on blogs.sun.com used with success.  I had weird 
> problems flashing the first card I got, hence the order of another one.  I'm 
> not sure how I could get 2 different controllers 4 months apart and then use 
> them in 2 completely different computers and both controllers be bad.

another data point..

I run two SiI 3114 based cards in my home fileserver running s10u3. I 
was having ZFS data corruption issues and I suspected the SiI cards - 
that was until I replaced the motherboard/CPU/memory. I didn't have the 
time or patience to try to determine which component was at fault, but I 
swapped the motherboard/CPU/memory and stressed it for a few hours and 
the data corruption problem was gone.

before that, I was seeing data corruption issues within minutes. maybe 
it was just memory, but I'll never know. I junked the old kit after I 
confirmed I had eliminated the problem.

grant.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-31 Thread Edward Saipetch
Mario,

I don't have any issues getting a new card.  The root of the discussion 
started because people did indeed post that they had good luck with 
them.  In fact, when I went out there and googled to find which cards 
would work well, it seemed to be at the top of the list.  I'm 
interested to know if it's something I can help resolve so other people 
don't have this problem or make sure people don't run into the same 
issue I do.

Mario Goebbels wrote:
> I haven't seen the beginning of this discussion, but seeing SiI sets the
> fire alarm off here.
>
> The Silicon Image chipsets are notorious for being crap and for causing
> data corruption. At least the variants that usually go onto mainboards. Based
> on this, I suggest that you get a different card.
>
> -mg
>   


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-31 Thread Mario Goebbels
I haven't seen the beginning of this discussion, but seeing SiI sets the
fire alarm off here.

The Silicon Image chipsets are notorious for being crap and for causing
data corruption. At least the variants that usually go onto mainboards. Based
on this, I suggest that you get a different card.

-mg





Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-31 Thread Edward Saipetch
Nigel,

Thanks for the response!  Basically my last method of testing was to 
sftp a few 50-100MB files to /tank over a couple of minutes and force a 
scrub after.  The very first time this happened, I was using it as a NAS 
device dumping data to it for over a week.  I went to a customer's site 
to show him how cool zfs was and, upon running zpool status, I saw the 
data corruption status telling me to restore from a backup.  Running 
zpool status without a scrub shows no errors.
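
A copy-and-verify pass like the one described above can be scripted so that a mismatch is reported per file. This is only a sketch, not the procedure actually used in this thread: the function names, the default file count and size, and the scratch directory are assumptions made for illustration. A read-back straight after the copy may be served from cache rather than from disk, so a scrub afterwards is still worthwhile.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(target_dir, file_count=3, size_bytes=50 * 1024 * 1024):
    """Write random files, copy each into target_dir, and compare checksums.

    Returns the destination paths whose checksum differs from the source.
    The defaults (3 files of 50 MB) are illustrative, not taken from the thread.
    """
    mismatches = []
    with tempfile.TemporaryDirectory() as scratch:
        for index in range(file_count):
            source = os.path.join(scratch, "sample%d.bin" % index)
            with open(source, "wb") as handle:
                handle.write(os.urandom(size_bytes))
            destination = os.path.join(target_dir, os.path.basename(source))
            shutil.copyfile(source, destination)
            if sha256_of(source) != sha256_of(destination):
                mismatches.append(destination)
    return mismatches
```

Calling copy_and_verify("/tank") would exercise the pool mount point mentioned above; an empty list means every copy read back identical to its source.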

I tried mirrored devices, no raid whatsoever and raidz, all with the 
same results.  All the motherboards I've been using only have PCI since 
I was hoping I could create a low cost solution as a POC.  I'll test 
changing the transfer mode a bit later.  Other people have had better 
luck, what other debugging can be done?  I'm willing to even let someone 
have remote access to the box if they want.

Nigel Smith wrote:
> Ok, this is a strange problem!
> You seem to have tried & eliminated all the possible issues
> that the community has suggested!
>
> I was hoping you would see some errors logged in
> '/var/adm/messages' that would give a clue.
>
> Your original 'zpool status' said 140 errors.
> Over what time period are these occurring?
> I'm wondering if the errors are occurring at a
> constant steady rate or if there are bursts of errors?
> Maybe you could monitor zpool status while generating
> activity with "dd" or similar.
> You could use "zpool iostat " to monitor
> bandwidth and see if it is reasonably steady or erratic.
>
> From your "prtconf -D" we see the 3114 card is using
> the "ata" driver, as expected.
> I believe the driver can talk to the disk drive
> in either PIO or DMA mode, so you could try 
> changing that in the "ata.conf" file. See here for details:
> http://docs.sun.com/app/docs/doc/819-2254/ata-7d?a=view
>
> I've just had a quick look at the source code for
> the ata driver, and there does seem to be specific support
> for the Silicon Image chips in the drivers:
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/io/dktp/controller/ata/sil3xxx.c
> and
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/io/dktp/controller/ata/sil3xxx.h
> The file "sil3xxx.h" does mention:
>   "Errata Sil-AN-0109-B2 (Sil3114 Rev 0.3)
>   To prevent erroneous ERR set for queued DMA transfers
>   greater then 8k, FIS reception for FIS0cfg needs to be set
>   to Accept FIS without Interlock"
> ..which I read as meaning there have been some 'issues'
> with this chip. And it sounds similar to the issue mentioned on
> the link that Tomasz supplied:
> http://home-tj.org/wiki/index.php/Sil_m15w
>
> If you decide to try a different SATA controller card, possible options are:
>
> 1. The si3124 driver, which supports SiI-3132 (PCI-E)
>    and SiI-3124 (PCI-X) devices.
>
> 2. The AHCI driver, which supports the Intel ICH6 and later devices, often
>    found on motherboards.
>
> 3. The NV_SATA driver, which supports Nvidia ck804/mcp55 devices.
>
> Regards
> Nigel Smith
>  
>  
> This message posted from opensolaris.org


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-31 Thread Nigel Smith
Ok, this is a strange problem!
You seem to have tried & eliminated all the possible issues
that the community has suggested!

I was hoping you would see some errors logged in
'/var/adm/messages' that would give a clue.

Your original 'zpool status' said 140 errors.
Over what time period are these occurring?
I'm wondering if the errors are occurring at a
constant steady rate or if there are bursts of errors?
Maybe you could monitor zpool status while generating
activity with "dd" or similar.
You could use "zpool iostat " to monitor
bandwidth and see if it is reasonably steady or erratic.
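
One way to see whether the errors arrive steadily or in bursts is to sample the CKSUM column of 'zpool status' at intervals while a copy is running. The sketch below assumes the table layout shown in the 'zpool status' output posted earlier in this thread (plain integer counters); the function names, the polling interval and the sample count are made up for illustration.

```python
import subprocess
import time


def parse_error_counts(status_text):
    """Extract per-device (READ, WRITE, CKSUM) counters from `zpool status` text."""
    counts = {}
    seen_header = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            seen_header = True
        elif seen_header and len(fields) == 5 and all(f.isdigit() for f in fields[2:]):
            # Row layout: NAME STATE READ WRITE CKSUM
            counts[fields[0]] = tuple(int(f) for f in fields[2:])
    return counts


def watch_pool(pool, interval_seconds=10, samples=6):
    """Print each device's CKSUM counter and its growth since the last sample."""
    previous = {}
    for _ in range(samples):
        output = subprocess.run(
            ["zpool", "status", pool], capture_output=True, text=True, check=True
        ).stdout
        current = parse_error_counts(output)
        for device, (_read, _write, cksum) in current.items():
            # The first sample has no predecessor, so its delta is reported as 0.
            delta = cksum - previous.get(device, (0, 0, cksum))[2]
            stamp = time.strftime("%H:%M:%S")
            print("%s %s cksum=%d (+%d)" % (stamp, device, cksum, delta))
        previous = current
        time.sleep(interval_seconds)
```

Running watch_pool("tank") alongside a "dd" or sftp transfer would show whether the counter climbs evenly or jumps in steps.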

From your "prtconf -D" we see the 3114 card is using
the "ata" driver, as expected.
I believe the driver can talk to the disk drive
in either PIO or DMA mode, so you could try 
changing that in the "ata.conf" file. See here for details:
http://docs.sun.com/app/docs/doc/819-2254/ata-7d?a=view

I've just had a quick look at the source code for
the ata driver, and there does seem to be specific support
for the Silicon Image chips in the drivers:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/io/dktp/controller/ata/sil3xxx.c
and
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/io/dktp/controller/ata/sil3xxx.h
The file "sil3xxx.h" does mention:
  "Errata Sil-AN-0109-B2 (Sil3114 Rev 0.3)
  To prevent erroneous ERR set for queued DMA transfers
  greater then 8k, FIS reception for FIS0cfg needs to be set
  to Accept FIS without Interlock"
..which I read as meaning there have been some 'issues'
with this chip. And it sounds similar to the issue mentioned on
the link that Tomasz supplied:
http://home-tj.org/wiki/index.php/Sil_m15w

If you decide to try a different SATA controller card, possible options are:

1. The si3124 driver, which supports SiI-3132 (PCI-E)
   and SiI-3124 (PCI-X) devices.

2. The AHCI driver, which supports the Intel ICH6 and later devices, often
   found on motherboards.

3. The NV_SATA driver, which supports Nvidia ck804/mcp55 devices.

Regards
Nigel Smith
 
 
This message posted from opensolaris.org


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Mauro Mozzarelli
Hi,

I have the same sil3114 based controller, installed in a dual Opteron box. I 
have installed Solaris x86 and have had no problem with it, however I hardly 
used that box with Solaris as my installation was only to try out Solaris on my 
Opteron workstation. Instead, on that workstation I constantly run Linux, and 
twice in a few months I came across (while running Fedora Linux) several I/O 
errors on the SATA disk attached to that controller. I thought at first that the 
hard drive was gone, but then I swapped that controller with a sil3112 and the 
I/O errors stopped. I swapped back the sil3114 and have had no errors since. I 
reckon that it might have been due to one of the SATA cables (power or data?) 
not making a perfect contact. SATA connectors are of extremely poor quality and 
they fail to hold in place as well as the older IDE or SCSI or molex power 
connector. I noticed as well that they crack easily if inadvertently pulled or 
pushed while working inside the computer case.
 
 
This message posted from opensolaris.org


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Ed Saipetch
Tried that... completely different cases with different power supplies.

On Oct 30, 2007, at 10:28 AM, Al Hopper wrote:

> On Mon, 29 Oct 2007, MC wrote:
>
>>> Here's what I've done so far:
>>
>> The obvious thing to test is the drive controller, so maybe you  
>> should do that :)
>>
>
> Also - while you're doing swapTronics - don't forget the Power Supply
> (PSU).  Ensure that your PSU has sufficient capacity on its 12Volt
> rails (older PSUs didn't even tell you how much current they can push
> out on the 12V outputs).
>
> See also: http://blogs.sun.com/elowe/entry/zfs_saves_the_day_ta
>
> Regards,
>
> Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
>Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
> OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
> http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
> Graduate from "sugar-coating school"?  Sorry - I never attended! :)


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Al Hopper
On Mon, 29 Oct 2007, MC wrote:

>> Here's what I've done so far:
>
> The obvious thing to test is the drive controller, so maybe you should do 
> that :)
>

Also - while you're doing swapTronics - don't forget the Power Supply 
(PSU).  Ensure that your PSU has sufficient capacity on its 12Volt 
rails (older PSUs didn't even tell you how much current they can push 
out on the 12V outputs).

See also: http://blogs.sun.com/elowe/entry/zfs_saves_the_day_ta

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
Graduate from "sugar-coating school"?  Sorry - I never attended! :)


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Ed Saipetch
To answer a number of questions:

Regarding different controllers, I've tried 2 Syba Sil 3114 controllers 
purchased about 4 months apart.  I've tried 5.4.3 firmware with one and 5.4.13 
with another.  Maybe Syba makes crappy Sil 3114 cards but it's the same one 
that someone on blogs.sun.com used with success.  I had weird problems flashing 
the first card I got, hence the order of another one.  I'm not sure how I could 
get 2 different controllers 4 months apart and then use them in 2 completely 
different computers and both controllers be bad.

Regarding cables, they aren't densely packed.  I've just got 1 drive attached 
in this new instance.  In the old, I just had 4 cables unbundled (not bound 
together) attached between the card and the drives.

Here's an error on startup in /var/adm/messages, note however that this error 
didn't come up on the old mb/cpu combo with the older 3114 hba.  These errors 
happen only during boot and don't happen during file transfers:

Sep 14 23:51:49 eknas genunix: [ID 936769 kern.info] sd0 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
Sep 14 23:52:11 eknas scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED] (ata0):
Sep 14 23:52:11 eknas   timeout: abort request, target=1 lun=0

Here's the scanpci output:
pci bus 0x cardnum 0x08 function 0x00: vendor 0x1095 device 0x3114
 Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller

and prtconf -pv:
subsystem-vendor-id:  1095
subsystem-id:  3114
unit-address:  '8'
class-code:  00018000
revision-id:  0002
vendor-id:  1095
device-id:  3114

and prtconf -D:
pci-ide, instance #0 (driver name: pci-ide)
ide, instance #0 (driver name: ata)

and pertinent modinfo:
 40 fbbf1250   1050 224   1  pci-ide (pciide nexus driver for 'PCI-ID)
 41 f783c000  10230 112   1  ata (ATA AT-bus attachment disk cont)
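
The identifiers in the 'prtconf -pv' excerpt above can be checked mechanically when comparing several machines or cards. This is a rough sketch that assumes the "name:  value" layout shown in that excerpt, with hexadecimal values; the function names are invented here. The vendor/device pair 0x1095/0x3114 is the one reported by the scanpci output above.

```python
def parse_pci_ids(prtconf_text):
    """Collect 'name:  value' properties from a `prtconf -pv` node into a dict."""
    properties = {}
    for line in prtconf_text.splitlines():
        name, separator, value = line.partition(":")
        if separator:
            # Values such as '8' are quoted in the output; drop the quotes.
            properties[name.strip()] = value.strip().strip("'")
    return properties


def is_sii3114(properties):
    """True when the node reports vendor 0x1095 (Silicon Image), device 0x3114."""
    vendor = int(properties.get("vendor-id", "0"), 16)
    device = int(properties.get("device-id", "0"), 16)
    return (vendor, device) == (0x1095, 0x3114)
```

Because the values are parsed as hexadecimal integers, zero-padded forms such as 00001095 compare equal to 1095.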
 
 
This message posted from opensolaris.org


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Frank . Hofmann
On Tue, 30 Oct 2007, Tomasz Torcz wrote:

> On 10/30/07, Neal Pollack <[EMAIL PROTECTED]> wrote:
>>> I'm experiencing major checksum errors when using a syba silicon image 3114 
>>> based pci sata controller w/ nonraid firmware.  I've tested by copying data 
>>> via sftp and smb.  With everything I've swapped out, I can't fathom this 
>>> being a hardware problem.
>> Even before ZFS, I've had numerous situations where various si3112 and
>> 3114 chips
>> would corrupt data on UFS and PCFS, with very simple  copy and checksum
>> test scripts, doing large bulk transfers.
>
>  Those SIL chips are really broken when used with certain Seagate drives.
> But I have had data corrupted by them with a WD drive also.
> Linux can work around this bug by reducing transfer sizes (and thus
> dramatically impacting speed). Solaris probably doesn't have a workaround.

Might be slightly off-topic for the whole, but _this_ specific thing 
(reducing transfer sizes) is possible on Solaris as well. As documented 
here:

http://docs.sun.com/app/docs/doc/819-2724/chapter2-29?a=view

You can also read a bit more on the following thread:

http://www.opensolaris.org/jive/thread.jspa?threadID=6866

It's possible to limit this system-wide or per-LUN.

Best regards,
FrankH.

> With this quirk enabled (on Linux), I get at most 20 MB/s from drives,
> but ZFS does not report any corruption. Before I had corruptions hourly.
>
> More info about SIL issue: http://home-tj.org/wiki/index.php/Sil_m15w
> I have Si 3112, but despite SIL claims other chips seem to be affected also.
>
>
> -- 
> Tomasz Torcz
> [EMAIL PROTECTED]

--
No good can come from selling your freedom, not for all the gold in the world,
for the value of this heavenly gift far exceeds that of any fortune on earth.
--


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Stephen Usher
One thing to check before you blame your controller:

Are the SATA cables close together for an extended length?

Basically, most SATA cables will generate massive levels of cross-talk between 
them if they're tied together or run parallel in close proximity for part of 
their run-length.

A friend found this sort of problem a couple of months ago and it was cured by 
separating the cables.

Steve
-- 
---
Computer Systems Administrator,E-Mail:[EMAIL PROTECTED]
Department of Earth Sciences, Tel:-  +44 (0)1865 282110
University of Oxford, Parks Road, Oxford, UK. Fax:-  +44 (0)1865 272072


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Tomasz Torcz
On 10/30/07, Neal Pollack <[EMAIL PROTECTED]> wrote:
> > I'm experiencing major checksum errors when using a syba silicon image 3114 
> > based pci sata controller w/ nonraid firmware.  I've tested by copying data 
> > via sftp and smb.  With everything I've swapped out, I can't fathom this 
> > being a hardware problem.
> Even before ZFS, I've had numerous situations where various si3112 and
> 3114 chips
> would corrupt data on UFS and PCFS, with very simple  copy and checksum
> test scripts, doing large bulk transfers.

  Those SIL chips are really broken when used with certain Seagate drives.
But I have had data corrupted by them with a WD drive also.
Linux can work around this bug by reducing transfer sizes (and thus
dramatically impacting speed). Solaris probably doesn't have a workaround.
With this quirk enabled (on Linux), I get at most 20 MB/s from drives,
but ZFS does not report any corruption. Before I had corruptions hourly.

More info about SIL issue: http://home-tj.org/wiki/index.php/Sil_m15w
I have Si 3112, but despite SIL claims other chips seem to be affected also.


-- 
Tomasz Torcz
[EMAIL PROTECTED]


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Nigel Smith
And are you seeing any error messages in '/var/adm/messages'
indicating any failure on the disk controller card?
If so, please post a sample back here to the forum.
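
Picking the relevant lines out of a long '/var/adm/messages' can be done with a small filter. The sketch below is an assumption about what counts as "relevant" (warnings, timeouts, or a mention of an ata instance such as "(ata0)"), not an official list of controller error signatures; the pattern and function name are illustrative only.

```python
import re

# Heuristic pattern: warnings, timeouts, or an ata driver instance like "(ata0)".
SUSPECT = re.compile(r"WARNING|timeout|\(ata\d+\)", re.IGNORECASE)


def suspicious_lines(lines):
    """Return the log lines that match the heuristic pattern above."""
    return [line.rstrip("\n") for line in lines if SUSPECT.search(line)]
```

Passing an open file object for '/var/adm/messages' to suspicious_lines() yields a short list that is easier to paste into a reply.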
 
 
This message posted from opensolaris.org


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Nigel Smith
First off, can we just confirm the exact version of the Silicon Image Card
and which driver Solaris is using.

Use 'prtconf -pv' and '/usr/X11/bin/scanpci'
to get the PCI vendor & device ID information.

Use 'prtconf -D' to confirm which drivers are being used by which devices.

And 'modinfo' will tell you the version of the drivers.

The above commands will give details for all the devices
in the PC.  You may want to edit down the output before
posting it back here, or alternatively put the output into an
attached file.
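
Gathering the output of the commands listed above into a single report makes it easier to trim and attach. The following is a minimal sketch: the command list mirrors the commands named above, while the function name, the "###" section headers and the injectable "run" parameter are assumptions added for illustration.

```python
import subprocess

# The four commands named above, each as an argument list.
DIAGNOSTIC_COMMANDS = [
    ["prtconf", "-pv"],
    ["/usr/X11/bin/scanpci"],
    ["prtconf", "-D"],
    ["modinfo"],
]


def collect_diagnostics(commands=DIAGNOSTIC_COMMANDS, run=subprocess.run):
    """Run each command and return one text report with a header per command."""
    sections = []
    for command in commands:
        try:
            result = run(command, capture_output=True, text=True)
            body = result.stdout + result.stderr
        except OSError as error:
            # Raised, for example, when the command is not installed here.
            body = "could not run: %s\n" % error
        sections.append("### %s\n%s" % (" ".join(command), body))
    return "\n".join(sections)
```

Writing the returned string to a file gives one attachment that can be edited down before posting.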

See this link for an example of this sort of information
for a different hard disk controller card:
http://mail.opensolaris.org/pipermail/storage-discuss/2007-September/003399.html

Regards
Nigel Smith
 
 
This message posted from opensolaris.org


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-29 Thread Neal Pollack
Edward Saipetch wrote:
> Neal Pollack wrote:
>> Ed Saipetch wrote:
>>> Hello,
>>>
>>> I'm experiencing major checksum errors when using a syba silicon 
>>> image 3114 based pci sata controller w/ nonraid firmware.  I've 
>>> tested by copying data via sftp and smb.  With everything I've 
>>> swapped out, I can't fathom this being a hardware problem.  
>>
>> I can.  But I suppose it could also be in some unknown way a driver 
>> issue.
>> Even before ZFS, I've had numerous situations where various si3112 
>> and 3114 chips
>> would corrupt data on UFS and PCFS, with very simple  copy and checksum
>> test scripts, doing large bulk transfers.
>>
>> Si chips are best used to clean coffee grinders.  Go buy a real SATA 
>> controller.
>>
>> Neal
> I have no problem ponying up money for a better SATA controller.  I 
> saw a bunch of blog posts that people were successful using the card 
> so I thought maybe I had a bad card with corrupt firmware nvram.  Is 
> it worth trying to trace down the bug?

Of course it is.  File a bug so someone on the SATA team can study it.

> If this type of corruption exists, nobody should be using this card.  
> As a side note, what SATA cards are people having luck with?

A lot of people are happy with the 8 port PCI SATA card made by 
SuperMicro that has the Marvell chip on it.
Don't buy other marvell cards on ebay, because Marvell dumped a ton of 
cards that ended up with an earlier
rev of the silicon that can corrupt data.  But all the cards made by 
SuperMicro and sold by them have the c rev
or later silicon and work great.

That said, I wish someone would investigate the Silicon Image issues, 
but there are only so many engineers,
with so little time.
>>
>>> There have been quite a few blog posts out there with people having 
>>> a similar config and not having any problems.
>>>
>>> Here's what I've done so far:
>>> 1. Changed solaris releases from S10 U3 to NV 75a
>>> 2. Switched out motherboards and cpus from AMD sempron to a Celeron D
>>> 3. Switched out memory to use completely different dimms
>>> 4. Switched out sata drives (2-3 250gb hitachi's and seagates in 
>>> RAIDZ, 3x400GB seagates RAIDZ and 1x250GB hitachi with no raid)
>>>
>>> Here's output of a scrub and the status (ignore the date and time, I 
>>> haven't reset it on this new motherboard) and please point me in the 
>>> right direction if I'm barking up the wrong tree.
>>>
>>> # zpool scrub tank
>>> # zpool status
>>>   pool: tank
>>>  state: ONLINE
>>> status: One or more devices has experienced an error resulting in data
>>>         corruption.  Applications may be affected.
>>> action: Restore the file in question if possible.  Otherwise restore the
>>>         entire pool from backup.
>>>    see: http://www.sun.com/msg/ZFS-8000-8A
>>>  scrub: scrub completed with 140 errors on Sat Sep 15 02:07:35 2007
>>> config:
>>>
>>>         NAME        STATE     READ WRITE CKSUM
>>>         tank        ONLINE       0     0   293
>>>           c0d1      ONLINE       0     0   293
>>>
>>> errors: 140 data errors, use '-v' for a list
>>>  
>>>  
>>> This message posted from opensolaris.org
>>
>



Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-29 Thread James C. McPherson
Will Murnane wrote:
> On 10/30/07, Edward Saipetch <[EMAIL PROTECTED]> wrote:
>> As a side note, what SATA cards are people having luck with?
> Running b74, I'm very happy with the Marvell mv88sx6081-based Supermicro card:
> http://www.supermicro.com/products/accessories/addon/AoC-SAT2-MV8.cfm
> http://www.newegg.com/Product/Product.aspx?Item=N82E16815121009&Tpk=aoc-sat2
> http://www.wiredzone.com/xq/asp/ic.10016527/qx/itemdesc.htm
> It hypothetically supports port multipliers, but I haven't tested this myself.
> 
> On earlier releases (b69, specifically) I had problems with disks
> occasionally disappearing.  Those appear to have been completely
> resolved; the box has most recently been up for 16 days with no
> errors.

We don't currently have support for SATA port multipliers in
Solaris or OpenSolaris. I know this because people in my team
are working on it (no ETA as yet) and we discussed it last week.



James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-29 Thread Will Murnane
On 10/30/07, Edward Saipetch <[EMAIL PROTECTED]> wrote:
> As a side note, what SATA cards are people having luck with?
Running b74, I'm very happy with the Marvell mv88sx6081-based Supermicro card:
http://www.supermicro.com/products/accessories/addon/AoC-SAT2-MV8.cfm
http://www.newegg.com/Product/Product.aspx?Item=N82E16815121009&Tpk=aoc-sat2
http://www.wiredzone.com/xq/asp/ic.10016527/qx/itemdesc.htm
It hypothetically supports port multipliers, but I haven't tested this myself.

On earlier releases (b69, specifically) I had problems with disks
occasionally disappearing.  Those appear to have been completely
resolved; the box has most recently been up for 16 days with no
errors.

Will


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-29 Thread MC
> Here's what I've done so far:

The obvious thing to test is the drive controller, so maybe you should do that 
:)
 
 
This message posted from opensolaris.org


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-29 Thread Edward Saipetch
Neal Pollack wrote:
> Ed Saipetch wrote:
>> Hello,
>>
>> I'm experiencing major checksum errors when using a syba silicon  
>> image 3114 based pci sata controller w/ nonraid firmware.  I've  
>> tested by copying data via sftp and smb.  With everything I've  
>> swapped out, I can't fathom this being a hardware problem.
>
> I can.  But I suppose it could also be in some unknown way a driver  
> issue.
> Even before ZFS, I've had numerous situations where various si3112  
> and 3114 chips
> would corrupt data on UFS and PCFS, with very simple  copy and  
> checksum
> test scripts, doing large bulk transfers.
>
> Si chips are best used to clean coffee grinders.  Go buy a real SATA  
> controller.
>
> Neal

I have no problem ponying up money for a better SATA controller.  I saw
a bunch of blog posts that people were successful using the card so I
thought maybe I had a bad card with corrupt firmware nvram.  Is it worth
trying to trace down the bug?  If this type of corruption exists, nobody
should be using this card.  As a side note, what SATA cards are people
having luck with?

>
>> There have been quite a few blog posts out there with people having  
>> a similar config and not having any problems.
>>
>> Here's what I've done so far:
>> 1. Changed solaris releases from S10 U3 to NV 75a
>> 2. Switched out motherboards and cpus from AMD sempron to a Celeron D
>> 3. Switched out memory to use completely different dimms
>> 4. Switched out sata drives (2-3 250gb hitachi's and seagates in  
>> RAIDZ, 3x400GB seagates RAIDZ and 1x250GB hitachi with no raid)
>>
>> Here's output of a scrub and the status (ignore the date and time,  
>> I haven't reset it on this new motherboard) and please point me in  
>> the right direction if I'm barking up the wrong tree.
>>
>> # zpool scrub tank
>> # zpool status
>>   pool: tank
>>  state: ONLINE
>> status: One or more devices has experienced an error resulting in data
>>         corruption.  Applications may be affected.
>> action: Restore the file in question if possible.  Otherwise restore the
>>         entire pool from backup.
>>    see: http://www.sun.com/msg/ZFS-8000-8A
>>  scrub: scrub completed with 140 errors on Sat Sep 15 02:07:35 2007
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         tank        ONLINE       0     0   293
>>           c0d1      ONLINE       0     0   293
>>
>> errors: 140 data errors, use '-v' for a list
>>  This message posted from opensolaris.org
>





Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-29 Thread Neal Pollack
Ed Saipetch wrote:
> Hello,
>
> I'm experiencing major checksum errors when using a syba silicon image 3114 
> based pci sata controller w/ nonraid firmware.  I've tested by copying data 
> via sftp and smb.  With everything I've swapped out, I can't fathom this 
> being a hardware problem.  

I can.  But I suppose it could also be in some unknown way a driver issue.
Even before ZFS, I've had numerous situations where various si3112 and 
3114 chips
would corrupt data on UFS and PCFS, with very simple  copy and checksum
test scripts, doing large bulk transfers.

Si chips are best used to clean coffee grinders.  Go buy a real SATA 
controller.

Neal

> There have been quite a few blog posts out there with people having a similar 
> config and not having any problems.
>
> Here's what I've done so far:
> 1. Changed solaris releases from S10 U3 to NV 75a
> 2. Switched out motherboards and cpus from AMD sempron to a Celeron D
> 3. Switched out memory to use completely different dimms
> 4. Switched out sata drives (2-3 250gb hitachi's and seagates in RAIDZ, 
> 3x400GB seagates RAIDZ and 1x250GB hitachi with no raid)
>
> Here's output of a scrub and the status (ignore the date and time, I haven't 
> reset it on this new motherboard) and please point me in the right direction 
> if I'm barking up the wrong tree.
>
> # zpool scrub tank
> # zpool status
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub completed with 140 errors on Sat Sep 15 02:07:35 2007
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0   293
>           c0d1      ONLINE       0     0   293
>
> errors: 140 data errors, use '-v' for a list
>  
>  
> This message posted from opensolaris.org



Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-29 Thread Nathan Kroenert
You have not mentioned if you have swapped the 3114 based HBA itself...?

Have you tried a different HBA? :)

Nathan.

Ed Saipetch wrote:
> Hello,
> 
> I'm experiencing major checksum errors when using a syba silicon image 3114 
> based pci sata controller w/ nonraid firmware.  I've tested by copying data 
> via sftp and smb.  With everything I've swapped out, I can't fathom this 
> being a hardware problem.  There have been quite a few blog posts out there 
> with people having a similar config and not having any problems.
> 
> Here's what I've done so far:
> 1. Changed solaris releases from S10 U3 to NV 75a
> 2. Switched out motherboards and cpus from AMD sempron to a Celeron D
> 3. Switched out memory to use completely different dimms
> 4. Switched out sata drives (2-3 250gb hitachi's and seagates in RAIDZ, 
> 3x400GB seagates RAIDZ and 1x250GB hitachi with no raid)
> 
> Here's output of a scrub and the status (ignore the date and time, I haven't 
> reset it on this new motherboard) and please point me in the right direction 
> if I'm barking up the wrong tree.
> 
> # zpool scrub tank
> # zpool status
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub completed with 140 errors on Sat Sep 15 02:07:35 2007
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0   293
>           c0d1      ONLINE       0     0   293
> 
> errors: 140 data errors, use '-v' for a list
>  
>  
> This message posted from opensolaris.org