Re: Will Softraid RAID1 read from the fastest mirror/-s / supports user-specified device read priority order, nowadays? Takes broken disk out of use?

2016-02-16 Thread Nick Holland
On 02/15/16 16:02, Karel Gardas wrote:
>> ..And therefore you need enterprise disks because they behave "cleanly", as
>> when using those only, essentially full softraid QoS is maintained at all
>> times.
>
> Interesting! I understood Nick's excellent email in the completely
> opposite sense. I understood it as "use consumer drives, which fail
> really slowly and with degraded performance, which will give you a
> chance to notice it at all. With enterprise, your drives may fail too
> quickly, so there is a danger of a drive failing in an array which is
> just rebuilding after another drive failure a few hours ago".
>

And that's the way I meant it...

I've had maybe five drives do the "slow-fail" thing.  Maybe.  In 34
years, including selling and supporting thousands of computers at a very
successful store, working for a few very large companies, and working
with a lot of tiny companies.  I'd file that under "it happens, don't
wait up, and certainly don't design around it".

In contrast, the number of "fast failures" I've seen on "Enterprise
grade" stuff is ... stunning.  And, I think I've seen evidence of one
"event" taking multiple drives off-line at once, with predictable
results to the array.  Fix?  Remove and re-insert drive, and rebuild,
since there is really nothing wrong with the disk 80-90% of the time.
Oh, guess you need a hot-swap enclosure, then.

My experience can be summed up as: Simple systems have simple problems.
 "Enterprise Grade" stuff that is never supposed to break or go
down...will (due to complexity) and will stay that way for amazing
periods of time (due to your lack of preparation, because you don't
believe it will happen).

And when it comes to disk systems, IF "enterprise grade" *disks* are any
better (and I don't believe it), when combined with enterprise grade
enclosures and enterprise grade disk controllers and firmware and fancy
drivers...no question in my mind, consumer grade SATA disks on dull
interfaces win, hands down.  Remember, it isn't WHY you lost data that
matters (be it hardware, software or human error), just that you did.
(A common failure part in "enterprise grade" servers is the disk
backplane board.  There's almost no active electronics on it, but they
fail often.  They don't exist on a desktop PC.  I suspect the vibration
of drives cracks the solder joints).

My recommendation:
1) Plan for things to break.
2) Plan for ANYTHING to break.
3) Have an in-house way of dealing with whatever breaks.
4) Don't rely on others.  It's not their business that is down.
5) The people you paid to bail you out of 1 & 2 so you don't have to
worry about 3 and 4 WILL let you down and will not live up to their
promises, and when you read the fine print, you will realize there isn't
a damn thing you can do about it, 'cept pay them again when the contract
comes up.

And after you do that, you will realize that obsessing over "enterprise
grade" parts is not part of the design.


NOTE WELL: That's my opinion based on *my* experience (including what
was almost a "controlled experiment" along those lines).  Every
manufacturer out there says I'm wrong.  Most of my coworkers say I'm
wrong.  Every new technology (like SSDs) gives another opportunity to
"change everything" (and the results always seem to be the same, but
maybe THIS time will be different).  If you follow my advice and things
blow up, you will look like an idiot, and I really don't want to hear
about it.  If you follow the mainstream mindset, you can always say,
"That's what (almost) everyone said is the right way, not my fault!".
Blindly following the opinions of some crackpot on the internet may be
foolish.  Blindly following the opinions of people who profit from what
they advise you will be expensive.

Nick.



Re: Will Softraid RAID1 read from the fastest mirror/-s / supports user-specified device read priority order, nowadays? Takes broken disk out of use?

2016-02-16 Thread lists
Tue, 16 Feb 2016 10:57:38 -0800 Chris Cappuccio 
> li...@wrant.com [li...@wrant.com] wrote:
> > 
> > Plan for your use case, and consult the man page and respective source
> > code on implementation details.  And flash storage disks are still
> > unreliable compared to spinning hard drives.  
> 
> Although I was a long-time proponent of read-only flash use, I've found the
> Samsung 845DC Pro and Samsung SM863 to be very durable in heavy write
> environments (heavily written-to monitoring database, mail server).

Thank you for the tip, I'll consider these in the future too.  I've
found Intel 35xx/37xx series to be the other option of better flash
drives currently on the market.

Yet, it's still not the same class of reliability.  This is not related
to OpenBSD, but my hard disks from the past 20+ years are still able to
store and retrieve data after their long and useful production life.  I
cannot validate this for any flash or other memory-based storage device.

As currently understood, data retention decay is still present in flash
devices, which cannot match spinning hard disks in this respect, and we
all know that's not going to change without improvement in battery
ageing and the type of cells used in the flash drives.

I insist on recommending pairing storage devices of the same type in
soft-RAID and not mixing device types in the same array, and on advising
the reliable parts, despite hating the enterprise server tax for
personal use.

This, plus advanced engineering knowledge based on technical
specifications and hardware documentation, to complement the incredibly
useful OpenBSD software man pages and source code.  For kids: don't
forget to make a copy of your important files.



Re: Will Softraid RAID1 read from the fastest mirror/-s / supports user-specified device read priority order, nowadays? Takes broken disk out of use?

2016-02-16 Thread Chris Cappuccio
li...@wrant.com [li...@wrant.com] wrote:
> 
> Plan for your use case, and consult the man page and respective source
> code on implementation details.  And flash storage disks are still
> unreliable compared to spinning hard drives.

Although I was a long-time proponent of read-only flash use, I've found the
Samsung 845DC Pro and Samsung SM863 to be very durable in heavy write
environments (heavily written-to monitoring database, mail server).



Re: Will Softraid RAID1 read from the fastest mirror/-s / supports user-specified device read priority order, nowadays? Takes broken disk out of use?

2016-02-16 Thread lists
Mon, 15 Feb 2016 22:03:13 +0100 Karel Gardas 
> > ..And therefore you need enterprise disks because they behave "cleanly", as
> > when using those only, essentially full softraid QoS is maintained at all
> > times.  
> 
> Interesting! I understood Nick's excellent email in the completely
> opposite sense.

That does not reverse the advice, however.  Read it again, slowly and
carefully ;-)

> I understood it as "use consumer drives, which fail
> really slowly and with degraded performance, which will give you a
> chance to notice it at all.

This is not the concept.  It is more of an important technological
prerequisite that many people don't know exists in the hardware RAID
world.

> With enterprise, your drives may fail too
> quickly, so there is a danger of a drive failing in an array which is
> just rebuilding after another drive failure a few hours ago".

That's not the takeaway advice.  That would be: keep in mind that some
controllers reject a drive which is still operational but does not
respond within the controller timeout.  More like: hardware RAID
controllers twist your arm into buying enterprise-class disks and
replacing them more diligently, before they actually reach the fail
state, based on continuous usage timing parameters.

Plan for your use case, and consult the man page and respective source
code on implementation details.  And flash storage disks are still
unreliable compared to spinning hard drives.



Re: Will Softraid RAID1 read from the fastest mirror/-s / supports user-specified device read priority order, nowadays? Takes broken disk out of use?

2016-02-15 Thread Karel Gardas
> ..And therefore you need enterprise disks because they behave "cleanly", as
> when using those only, essentially full softraid QoS is maintained at all
> times.

Interesting! I understood Nick's excellent email in the completely
opposite sense. I understood it as "use consumer drives, which fail
really slowly and with degraded performance, which will give you a
chance to notice it at all. With enterprise, your drives may fail too
quickly, so there is a danger of a drive failing in an array which is
just rebuilding after another drive failure a few hours ago".



Re: Will Softraid RAID1 read from the fastest mirror/-s / supports user-specified device read priority order, nowadays? Takes broken disk out of use?

2016-02-15 Thread Tinker

Constantine,

Just a quick follow-up to say that I agree with you.

On 2016-02-15 17:41, Constantine A. Murenin wrote:
> On 13 February 2016 at 08:50, Tinker <ti...@openmailbox.org> wrote:
>> Hi,
>>
>> 1)
>> http://www.openbsd.org/papers/asiabsdcon2010_softraid/softraid.pdf page 3
>> "2.2 RAID 1" says that it reads "on a round-robin basis from all active
>> chunks", i.e. read operations are spread evenly across disks.
>
> Yes, that's still the case today:
>
> ..
>
> There are presently no optimisations in-tree, but the softraid
> policies are so simple that it's really easy to hack it up to do
> something else that you may want.

That is awesome.

>> Since then did anyone implement selective reading based on experienced read
>> operation time, or a user-specified device read priority order?
>
> That would make the code less readable!  :-)

That is indeed an excellent reason for not adding an additional feature
- couldn't agree with you more.

Added complexity is (the root of all) 'evil'.

>> That would allow Softraid RAID1 based on 1 SSD mirror + 1 SSD mirror + 1 HDD
>> mirror, which would give the best combination of IO performance and data
>> security OpenBSD would offer today.
>
> Not sure what'd be the practical point of such a setup.  Your writes
> will still be limited by the slowest component, and IOPS specs are
> vastly different between SSDs and HDDs.  (And modern SSDs are no
> longer considered nearly as unreliable as they once were.)

Yeah. I'm half-unwillingly starting to agree with that (discussed in
depth with Nick in the previous email).

>> 2)
>> Also if there's a read/write failure (or excessive time consumption for a
>> single operation, say 15 seconds), will Softraid RAID1 learn to take the
>> broken disk out of use?
>
> A failure in a softraid1 chunk will result in the chunk being taken
> offline.  (What constitutes a failure is most likely outside of
> softraid's control.)


My best understanding today is that Nick clarified this in the previous
post: softraid doesn't actually have any IO operation timeouts, and IO
lag will not lead to softraid taking a disk offline - only a disconnect
or specific disk failure SMART command from the underlying disk will
have that effect on softraid (of causing that respective physical disk
to be automatically disconnected).


..And therefore you need enterprise disks because they behave "cleanly", 
as when using those only, essentially full softraid QoS is maintained at 
all times.


Best regards,
Tinker



Re: Will Softraid RAID1 read from the fastest mirror/-s / supports user-specified device read priority order, nowadays? Takes broken disk out of use?

2016-02-15 Thread Tinker

Dear Nick,

On 2016-02-15 05:29, Nick Holland wrote:
> On 02/13/16 11:49, Tinker wrote:
>> Hi,
>>
>> 1)
>> http://www.openbsd.org/papers/asiabsdcon2010_softraid/softraid.pdf page
>> 3 "2.2 RAID 1" says that it reads "on a round-robin basis from all
>> active chunks", i.e. read operations are spread evenly across disks.
>>
>> Since then did anyone implement selective reading based on experienced
>> read operation time, or a user-specified device read priority order?
>>
>> That would allow Softraid RAID1 based on 1 SSD mirror + 1 SSD mirror + 1
>> HDD mirror, which would give the best combination of IO performance and
>> data security OpenBSD would offer today.
>
> I keep flip-flopping on the merits of this.
> At one point, I was with you, thinking, "great idea!  Back an expensive,
> fast disk with a cheap disk".
>
> Currently, I'm thinking, "REALLY BAD IDEA".  Here's my logic:
>
> There's no such thing as an "expensive disk" anymore.  A quick look
>
> ..
>
> of "fast" storage to make their very few business apps run better.  No
> question in their mind, it was worth it.  Now we do much more with our
> computers and it costs much less.  The business value of our investment
> should be much greater than it was in 1982.
>
> And ignoring hardware, it is.  Companies drop thousands of dollars on
> consulting and assistance and think nothing of it.  And in a major
> computer project, a couple $1000 disks barely show as a blip on the
> budget.  Hey, I'm all about being a cheap bastard whenever possible, but
> this just isn't a reasonable place to be cheap, so not somewhere I'd
> suggest spending developer resources.
>
> Also ... it's probably a bad idea for functional reasons.  You can't
> just assume that "slower" is better than "nothing" -- very often, it's
> indistinguishable from "nothing".  In many cases, computer systems that
> perform below a certain speed are basically non-functional, as tasks can
> pile up on them faster than they can produce results.  Anyone who has
> dealt with an overloaded database server, mail server or firewall will
> know what I'm saying here -- at a certain load, they go from "running
> ok" to "death spiral", and they do it very quickly.
>
> If you /need/ the speed of an SSD, you can justify the cost of a pair of
> 'em.  If you can't justify the cost, you are really working with a
> really unimportant environment, and you can either wait for two cheap
> slow disks or skip the RAID entirely.
>
> How fast do you need to get to your porn, anyway?


I technically agree with you -

What led me to think about SSD+HDD was the idea of having, on the same
mountpoint, a hybrid SSD-HDD storage where the "important stuff" would
automatically be on the SSD and the "less important" on the HDD.

This symmetry would mean that those two data sets could be stored within
one and the same directory structure, which would be really handy, and
archiving of unused files would be implicit.

I understand that ZFS is quite good at delivering this. LSI MegaRAID
cards are good at that as long as the "important stuff" stays forever
<512GB, which is not the case, duh.

This whole idea has a really exotic, unpredictable, ""stinking"" edge to
it though. Your argument that "slower" is generally as bad as "nothing",
combined with the market price situation, makes perfect sense -

So, even if kind of unwillingly, I must agree with your reasoning.



> (now ... that being said, part of me would love a tmpfs / disk RAID1,
> one that would come up degraded, and the disk would populate the RAM
> disk, writes would go to both subsystems, reads would come from the RAM
> disk once populated.  I could see this for some applications like CVS
> repositories or source directories where things are "read mostly", and
> typically smaller than a practical RAM size these days, and as there are
> still a few orders of magnitude greater performance in a RAM disk than
> an SSD and this will likely remain true for a while, there are SOME
> applications where this could be nice)


Wait.. you mean you would like OpenBSD to implement a read cache that is
"100% caching aggressive" rather than the current "buffer cache", which
has "dynamic caching aggressiveness"? I don't understand how this could
make sense, can you please clarify?



>> 2)
>> Also if there's a read/write failure (or excessive time consumption for
>> a single operation, say 15 seconds), will Softraid RAID1 learn to take
>> the broken disk out of use?
>
> As far as I am aware, Softraid (like most RAID systems, hw or sw) will
> deactivate a drive which reports a failure.  Drives which go super slow
> (i.e., always manage to get the data BEFORE the X'th retry at which they
> would toss an error) never report an error back, so never deactivate the
> drive.
>
> Sound implausible?  Nope.  It Happens.  Frustrating as heck when you
> have this happen to you until you figure it out.  I

Re: Will Softraid RAID1 read from the fastest mirror/-s / supports user-specified device read priority order, nowadays? Takes broken disk out of use?

2016-02-15 Thread Constantine A. Murenin
On 13 February 2016 at 08:50, Tinker <ti...@openmailbox.org> wrote:
> Hi,
>
> 1)
> http://www.openbsd.org/papers/asiabsdcon2010_softraid/softraid.pdf page 3
> "2.2 RAID 1" says that it reads "on a round-robin basis from all active
> chunks", i.e. read operations are spread evenly across disks.

Yes, that's still the case today:

http://bxr.su/o/sys/dev/softraid_raid1.c#sr_raid1_rw

	rt = 0;
ragain:
	/* interleave reads */
	chunk = sd->mds.mdd_raid1.sr1_counter++ %
	    sd->sd_meta->ssdi.ssd_chunk_no;
	scp = sd->sd_vol.sv_chunks[chunk];
	switch (scp->src_meta.scm_status) {

	case BIOC_SDOFFLINE:

		if (rt++ < sd->sd_meta->ssdi.ssd_chunk_no)
			goto ragain;

There are presently no optimisations in-tree, but the softraid
policies are so simple that it's really easy to hack it up to do
something else that you may want.
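
As a rough illustration of the policy in the excerpt above (this is not the kernel code), the round-robin choice with skipping of offline chunks can be sketched as a standalone C function.  The names struct mirror, enum chunk_state and pick_read_chunk are made up for this sketch and do not exist in softraid.

```c
#include <stddef.h>

/* Hypothetical, simplified chunk states; not the kernel's definitions. */
enum chunk_state { CHUNK_ONLINE, CHUNK_OFFLINE };

/* Hypothetical stand-in for the per-volume RAID 1 bookkeeping. */
struct mirror {
	const enum chunk_state *state;	/* one entry per chunk */
	unsigned int nchunks;		/* number of chunks in the mirror */
	unsigned int counter;		/* running read counter */
};

/*
 * Pick the chunk for the next read: advance the counter, take it modulo
 * the chunk count, and skip chunks that are offline.  Tries each chunk
 * at most once.  Returns the chunk index, or -1 if none is online.
 */
int
pick_read_chunk(struct mirror *m)
{
	unsigned int tries, chunk;

	for (tries = 0; tries < m->nchunks; tries++) {
		chunk = m->counter++ % m->nchunks;
		if (m->state[chunk] == CHUNK_ONLINE)
			return (int)chunk;
	}
	return -1;
}
```

With three chunks where the middle one is offline, successive calls return chunks 0, 2, 0, and so on.  A different read policy (a fixed preference order, say) would only need a different body for this one function, which is presumably what "easy to hack up" refers to.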

>
> Since then did anyone implement selective reading based on experienced read
> operation time, or a user-specified device read priority order?

That would make the code less readable!  :-)

>
>
> That would allow Softraid RAID1 based on 1 SSD mirror + 1 SSD mirror + 1 HDD
> mirror, which would give the best combination of IO performance and data
> security OpenBSD would offer today.

Not sure what'd be the practical point of such a setup.  Your writes
will still be limited by the slowest component, and IOPS specs are
vastly different between SSDs and HDDs.  (And modern SSDs are no
longer considered nearly as unreliable as they once were.)
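
To put rough numbers on that, here is a small sketch.  It assumes a mirrored write is only done once every chunk has completed it, and that reads rotate evenly over all chunks; the function names and any latency figures fed to them are illustrative assumptions, not measurements.

```c
#include <stddef.h>

/*
 * Latency of one mirrored write, assuming it completes only when the
 * slowest chunk has finished: the maximum of the per-chunk latencies.
 */
double
mirror_write_latency(const double *lat, size_t n)
{
	double worst = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		if (lat[i] > worst)
			worst = lat[i];
	return worst;
}

/*
 * Mean read latency when reads rotate evenly over all chunks
 * (round-robin): the plain average of the per-chunk latencies.
 */
double
mirror_rr_read_latency(const double *lat, size_t n)
{
	double sum = 0.0;
	size_t i;

	if (n == 0)
		return 0.0;
	for (i = 0; i < n; i++)
		sum += lat[i];
	return sum / (double)n;
}
```

For a hypothetical SSD + SSD + HDD mirror with latencies of 0.1, 0.1 and 10 ms, writes come out at 10 ms and round-robin reads average about 3.4 ms, so under these assumptions the HDD dominates both figures.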

>
> 2)
> Also if there's a read/write failure (or excessive time consumption for a
> single operation, say 15 seconds), will Softraid RAID1 learn to take the
> broken disk out of use?

A failure in a softraid1 chunk will result in the chunk being taken
offline.  (What constitutes a failure is most likely outside of
softraid's control.)

C.



Re: Will Softraid RAID1 read from the fastest mirror/-s / supports user-specified device read priority order, nowadays? Takes broken disk out of use?

2016-02-14 Thread Nick Holland
On 02/13/16 11:49, Tinker wrote:
> Hi,
> 
> 1)
> http://www.openbsd.org/papers/asiabsdcon2010_softraid/softraid.pdf page 
> 3 "2.2 RAID 1" says that it reads "on a round-robin basis from all 
> active chunks", i.e. read operations are spread evenly across disks.
> 
> Since then did anyone implement selective reading based on experienced 
> read operation time, or a user-specified device read priority order?
> 
> 
> That would allow Softraid RAID1 based on 1 SSD mirror + 1 SSD mirror + 1 
> HDD mirror, which would give the best combination of IO performance and 
> data security OpenBSD would offer today.

I keep flip-flopping on the merits of this.
At one point, I was with you, thinking, "great idea!  Back an expensive,
fast disk with a cheap disk".

Currently, I'm thinking, "REALLY BAD IDEA".  Here's my logic:

There's no such thing as an "expensive disk" anymore.  A quick look
shows me that I can WALK INTO my local computer store and pick up a 2TB
SSD for under $1000US.  Now, that looks like a lot of money, and as a
life-long cheapskate, when I get to four digits, I'm expecting at least
two wheels and an engine.  But in the Big Picture?  No.  That's one heck
of a lot of stunningly fast storage for a reasonable chunk of change.

Thirty-four years ago when I started in this business, I was installing
10MB disks for $2000/ea as fast as we could get the parts (and at that
time, you could get a darned nice car for five of those drives, and a
new Corvette cost less than ten of them).  Now sure, the price has
dropped a whole lot since then, and my first reaction would be "What
does that have to do with anything?  I can buy 2TB disks for under $100,
that's a huge savings over the SSD!"  In raw dollars, sure.  Percentage?
Sure.  In "value to business"?  I don't think so.  In 1982, people felt
the computers of the day were worth adding $2000 to, to get a tiny amount
of "fast" storage to make their very few business apps run better.  No
question in their mind, it was worth it.  Now we do much more with our
computers and it costs much less.  The business value of our investment
should be much greater than it was in 1982.

And ignoring hardware, it is.  Companies drop thousands of dollars on
consulting and assistance and think nothing of it.  And in a major
computer project, a couple $1000 disks barely show as a blip on the
budget.  Hey, I'm all about being a cheap bastard whenever possible, but
this just isn't a reasonable place to be cheap, so not somewhere I'd
suggest spending developer resources.


Also ... it's probably a bad idea for functional reasons.  You can't
just assume that "slower" is better than "nothing" -- very often, it's
indistinguishable from "nothing".  In many cases, computer systems that
perform below a certain speed are basically non-functional, as tasks can
pile up on them faster than they can produce results.  Anyone who has
dealt with an overloaded database server, mail server or firewall will
know what I'm saying here -- at a certain load, they go from "running
ok" to "death spiral", and they do it very quickly.
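
The "tasks pile up faster than they can be finished" point can be sketched with a deliberately crude model: a fixed number of requests arrives each second and the system completes at most a fixed number per second.  The function name and the rates are assumptions for illustration only.

```c
/*
 * Backlog left after `seconds` when `arrivals` requests arrive every
 * second and at most `capacity` requests are completed every second.
 */
unsigned long
backlog_after(unsigned long arrivals, unsigned long capacity,
    unsigned long seconds)
{
	unsigned long queued = 0, t;

	for (t = 0; t < seconds; t++) {
		queued += arrivals;
		/* Complete as much as capacity allows, never below zero. */
		queued -= (queued < capacity) ? queued : capacity;
	}
	return queued;
}
```

At 100 requests/s against a capacity of 120/s the backlog stays at zero; drop the capacity to 80/s and one minute later 1200 requests are waiting, with the queue still growing.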

If you /need/ the speed of an SSD, you can justify the cost of a pair of
'em.  If you can't justify the cost, you are working with a
really unimportant environment, and you can either wait for two cheap
slow disks or skip the RAID entirely.

How fast do you need to get to your porn, anyway?

(now ... that being said, part of me would love a tmpfs / disk RAID1,
one that would come up degraded, and the disk would populate the RAM
disk, writes would go to both subsystems, reads would come from the RAM
disk once populated.  I could see this for some applications like CVS
repositories or source directories where things are "read mostly", and
typically smaller than a practical RAM size these days, and as there are
still a few orders of magnitude greater performance in a RAM disk than
an SSD and this will likely remain true for a while, there are SOME
applications where this could be nice)


> 2)
> Also if there's a read/write failure (or excessive time consumption for 
> a single operation, say 15 seconds), will Softraid RAID1 learn to take 
> the broken disk out of use?

As far as I am aware, Softraid (like most RAID systems, hw or sw) will
deactivate a drive which reports a failure.  Drives which go super slow
(i.e., always manage to get the data BEFORE the X'th retry at which they
would toss an error) never report an error back, so never deactivate the
drive.
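
A minimal sketch of that distinction, using made-up types: the first policy only reacts to a reported error, as described above; the second, latency-aware one is purely hypothetical and is shown only for contrast, not as something softraid does.

```c
#include <stdbool.h>

/* Hypothetical outcome of one I/O as seen by the RAID layer. */
struct io_result {
	bool error;		/* the drive reported a failure */
	unsigned int ms;	/* how long the request took, in ms */
};

/* Error-only policy: a slow but "successful" I/O never counts. */
bool
offline_on_error(struct io_result r)
{
	return r.error;
}

/* Hypothetical alternative: also treat an over-long I/O as a failure. */
bool
offline_on_error_or_timeout(struct io_result r, unsigned int limit_ms)
{
	return r.error || r.ms > limit_ms;
}
```

A request that takes 15 seconds but eventually succeeds is invisible to the first policy and only trips the second one if the limit is set below 15000 ms.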

Sound implausible?  Nope.  It Happens.  Frustrating as heck when you
have this happen to you until you figure it out.  In fact, one key
feature of "enterprise" and "RAID" grade disks is that they hop
off-line and throw an error fast and early, to prevent this problem
(some "NAS" grade disks may do 

Will Softraid RAID1 read from the fastest mirror/-s / supports user-specified device read priority order, nowadays? Takes broken disk out of use?

2016-02-13 Thread Tinker

Hi,

1)
http://www.openbsd.org/papers/asiabsdcon2010_softraid/softraid.pdf page 
3 "2.2 RAID 1" says that it reads "on a round-robin basis from all 
active chunks", i.e. read operations are spread evenly across disks.


Since then did anyone implement selective reading based on experienced 
read operation time, or a user-specified device read priority order?



That would allow Softraid RAID1 based on 1 SSD mirror + 1 SSD mirror + 1 
HDD mirror, which would give the best combination of IO performance and 
data security OpenBSD would offer today.


2)
Also if there's a read/write failure (or excessive time consumption for 
a single operation, say 15 seconds), will Softraid RAID1 learn to take 
the broken disk out of use?


Thanks,
Tinker



Re: broken disk?

2008-09-21 Thread Jordi Espasa Clofent

> It seems that it runs fine but I don't get output from the long
> test... Any hint?

Why? It's very easy:

$ smartctl -t long /dev/wd0c

... wait the needed time, and next

$ smartctl -l selftest /dev/wd0c

P.S. Andromina is a funny name (amusing, I mean)
:P

--
Thanks,
Jordi Espasa Clofent



Re: broken disk?

2008-09-21 Thread Rajneesh N. Shetty
i have a fracture in my spine, does that answer your question?

tel :  +61431 823 603



'Worry looks around, sorry looks back, faith looks up'.

--- On Sun, 21/9/08, Pau [EMAIL PROTECTED] wrote:
From: Pau [EMAIL PROTECTED]
Subject: broken disk?
To: misc misc@openbsd.org
Received: Sunday, 21 September, 2008, 1:04 AM

Hi,

I recently posted in ports some problems I am having with an i386 laptop

http://marc.info/?l=openbsd-ports&m=122191620826430&w=2

and especially

http://marc.info/?l=openbsd-ports&m=122189105726930&w=2

Nikolay suggested it could be a hardware problem. To be sure, I made a
clean install of the system and the problems (not same, but similar)
were still there. Therefore I booted into memtest86 from a linux live
CD. The test went fine; then I tried smartctl (messages attached to
this message below).

It seems that it runs fine but I don't get output from the long
test... Any hint?

I have tried /dev/rwd0c too... but same result.

How can I check my problem??

Thanks,

Pau

andromina# smartctl -i /dev/wd0c
zsh: command not found: smartctl
andromina# /usr/local/sbin/smartctl  -i /dev/wd0c
smartctl version 5.37 [i386-unknown-openbsd4.3] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Fujitsu MHT series
Device Model:     FUJITSU MHT2080AT
Serial Number:    NN7CT4A15HPM
Firmware Version: 0022
User Capacity:    80,026,361,856 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 3a
Local Time is:    Sat Sep 20 15:18:00 2008 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

andromina# /usr/local/sbin/smartctl  -s on -d ata /dev/wd0c
smartctl version 5.37 [i386-unknown-openbsd4.3] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.

andromina# /usr/local/sbin/smartctl -d ata -a /dev/wd0c
smartctl version 5.37 [i386-unknown-openbsd4.3] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Fujitsu MHT series
Device Model:     FUJITSU MHT2080AT
Serial Number:    NN7CT4A15HPM
Firmware Version: 0022
User Capacity:    80,026,361,856 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 3a
Local Time is:    Sat Sep 20 15:18:41 2008 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 587) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  80) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   046    Pre-fail  Always       -       41054
  2 Throughput_Performance  0x0005   100   100   030    Pre-fail  Offline      -       31064064
  3 Spin_Up_Time            0x0003   100   100   025    Pre-fail  Always       -       1
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       7953
  5

Re: broken disk?

2008-09-21 Thread Pau
Hi Jordi,

thanks. I have also looked in the BIOS. SMART is enabled by default.
It seems that the disk is fine.

Could it be the RAM? How to test?

Pau

# /usr/local/sbin/smartctl -d ata -t long /dev/wd0c
smartctl version 5.37 [i386-unknown-openbsd4.3] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: Execute SMART Extended self-test routine immediately
in off-line mode.
Drive command Execute SMART Extended self-test routine immediately in
off-line mode successful.
Testing has begun.
Please wait 80 minutes for test to complete.
Test will complete after Sun Sep 21 13:30:47 2008

Use smartctl -X to abort test.
# /usr/local/sbin/smartctl -l selftest /dev/wd0c
smartctl version 5.37 [i386-unknown-openbsd4.3] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      7704         -
# 2  Extended offline    Completed without error       00%      7698         -
# 3  Short offline       Completed without error       00%      7694         -
# 4  Short offline       Completed without error       00%      7694         -
# 5  Extended offline    Completed without error       00%      7693         -


2008/9/21 Jordi Espasa Clofent [EMAIL PROTECTED]:
>> It seems that it runs fine but I don't get output from the long
>> test... Any hint?
>
> Why? It's very easy:
>
> $ smartctl -t long /dev/wd0c
>
> ... wait the needed time, and next
>
> $ smartctl -l selftest /dev/wd0c
>
> P.S. Andromina is a funny name (amusing, I mean)
> :P
>
> --
> Thanks,
> Jordi Espasa Clofent



Re: broken disk?

2008-09-21 Thread Jordi Espasa Clofent

> thanks. I have also looked in the BIOS. SMART is enabled by default.
> It seems that the disk is fine.
>
> Could it be the RAM? How to test?


Could be.
A deep memtest test should be enough.

--
Thanks,
Jordi Espasa Clofent



Re: broken disk?

2008-09-21 Thread ropers
2008/9/21 Jordi Espasa Clofent [EMAIL PROTECTED]:
>> thanks. I have also looked in the BIOS. SMART is enabled by default.
>> It seems that the disk is fine.
>>
>> Could it be the RAM? How to test?
>
> Could be.
> A deep memtest test should be enough.

Apologies if you already know this and/or did this, but since you (Pau) asked:

Deep memtest = burn-in test.
memtest86 has an option for this. Launch it, and leave it running for
24hrs. If memtest86 hasn't found any errors after that many passes,
then you can be virtually certain that it's not the RAM that is
faulty. (I've never encountered faulty RAM that a 24h burn-in
memtest86 check didn't detect as such, but I have more than once seen
memtest86 fail to detect faulty RAM during a single-pass test.)

regards,
--ropers



broken disk?

2008-09-20 Thread Pau
Hi,

I recently posted in ports some problems I am having with an i386 laptop

http://marc.info/?l=openbsd-ports&m=122191620826430&w=2

and especially

http://marc.info/?l=openbsd-ports&m=122189105726930&w=2

Nikolay suggested it could be a hardware problem. To be sure, I made a
clean install of the system and the problems (not the same, but similar)
were still there. Therefore I booted into memtest86 from a linux live
CD. The test went fine; then I tried smartctl (output attached at the
end of this message).

It seems that it runs fine but I don't get output from the long
test... Any hint?

I have tried /dev/rwd0c too... but got the same result.

How can I diagnose my problem?

Thanks,

Pau

andromina# smartctl -i /dev/wd0c
zsh: command not found: smartctl
andromina# /usr/local/sbin/smartctl  -i /dev/wd0c
smartctl version 5.37 [i386-unknown-openbsd4.3] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Fujitsu MHT series
Device Model:     FUJITSU MHT2080AT
Serial Number:    NN7CT4A15HPM
Firmware Version: 0022
User Capacity:    80,026,361,856 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 3a
Local Time is:    Sat Sep 20 15:18:00 2008 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

andromina# /usr/local/sbin/smartctl  -s on -d ata /dev/wd0c
smartctl version 5.37 [i386-unknown-openbsd4.3] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.

andromina# /usr/local/sbin/smartctl -d ata -a /dev/wd0c
smartctl version 5.37 [i386-unknown-openbsd4.3] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Fujitsu MHT series
Device Model:     FUJITSU MHT2080AT
Serial Number:    NN7CT4A15HPM
Firmware Version: 0022
User Capacity:    80,026,361,856 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 3a
Local Time is:    Sat Sep 20 15:18:41 2008 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 587) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  80) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   046    Pre-fail  Always       -       41054
  2 Throughput_Performance  0x0005   100   100   030    Pre-fail  Offline      -       31064064
  3 Spin_Up_Time            0x0003   100   100   025    Pre-fail  Always       -       1
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       7953
  5 Reallocated_Sector_Ct   0x0033   100   100   024    Pre-fail  Always       -       8589934592000
  7 Seek_Error_Rate         0x000f   100   100   047    Pre-fail  Always       -       3090
  8 Seek_Time_Performance   0x0005   100   100   019    Pre-fail  Offline      -       0
  9 Power_On_Seconds        0x0032   085   085   000

Re: broken disk?

2008-09-20 Thread Lars Kotthoff
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]

You need to explicitly run the self-test, e.g.
smartctl -t long /dev/wd0c
and wait until it has finished -- the above section of
smartctl -a /dev/wd0c will tell you. Also see smartctl(8).
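
A rough sketch of that sequence as a script, in case it helps. The
variable names, the helper names and the 5-minute poll interval are my
own assumptions, not anything smartctl prescribes; the path and device
are the ones from your output, where the drive itself reports 80 minutes
as the recommended polling time for the extended test.

```shell
#!/bin/sh
# Sketch only: start an extended SMART self-test and wait for it to finish.
# Assumptions (not from smartctl itself): the SMARTCTL/DEV/POLL variable
# names, the 300-second poll interval, and the helper function names.

SMARTCTL=${SMARTCTL:-/usr/local/sbin/smartctl}  # path used in the output above
DEV=${DEV:-/dev/wd0c}
POLL=${POLL:-300}                               # seconds between status checks

# Reads "smartctl -a" output on stdin and succeeds while a self-test is
# still running; smartctl reports that state in the "Self-test execution
# status" field as "Self-test routine in progress...".
selftest_running() {
    grep -q 'Self-test routine in progress'
}

# Starts the long test, polls until it is no longer in progress, then
# prints the self-test log.
run_long_test() {
    "$SMARTCTL" -t long "$DEV" || return 1
    while "$SMARTCTL" -a "$DEV" | selftest_running; do
        sleep "$POLL"
    done
    "$SMARTCTL" -l selftest "$DEV"
}

# Run only when asked, so the file can also be sourced for its functions.
if [ "${1:-}" = "run" ]; then
    run_long_test
fi
```

Saved under some name of your choosing (say smart-long.sh, a made-up
name), it would be started as root with "sh smart-long.sh run".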

Lars



Re: broken disk?

2008-09-20 Thread Stuart Henderson
On 2008-09-20, Pau [EMAIL PROTECTED] wrote:
> Therefore I booted into memtest86 from a linux live
> CD. The test went fine

this does not necessarily mean the RAM is good; just that
memtest didn't find a problem.