Re: SSD and RAID question

2012-09-05 Thread Sean Murray

On 09/04/2012 09:21 PM, Konstantin Olchanski wrote:

On Sun, Sep 02, 2012 at 05:33:24PM -0700, Todd And Margo Chester wrote:


Cherryville drives have a 1.2 million hour MTBF (mean time
between failure) and a 5 year warranty.



Note that MTBF of 1.2 Mhrs (137 years?!?) is the *vendor's estimate*.

It's worse than that: if you read their docs, that number is based on an
average write/read rate of (Intel) 20 GB per day, which is painfully little
for a server.

I looked at these for caching data, but for our use case, and of course
assuming the MTBF figure is accurate, the MTBF would be 6 months.

Sean
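
A rough sketch of the scaling behind a statement like this. It assumes that drive life shrinks in direct proportion to the daily data volume, which is a simplification and not something the vendor documentation quoted here confirms. The post does not give the actual workload, so the sketch only works out what workload a 6-month figure would imply under that assumption. The function names are made up for illustration.

```python
RATED_MTBF_HOURS = 1.2e6   # vendor figure quoted in the thread
RATED_GB_PER_DAY = 20      # workload the figure is said to be based on
HOURS_PER_MONTH = 730      # 8760 hours per year / 12

def scaled_mtbf_hours(actual_gb_per_day):
    """MTBF rescaled to a heavier workload.
    Assumption: life is inversely proportional to daily data volume."""
    return RATED_MTBF_HOURS * RATED_GB_PER_DAY / actual_gb_per_day

def workload_for_mtbf(target_hours):
    """Daily volume (GB) that would give the target MTBF under the same assumption."""
    return RATED_MTBF_HOURS * RATED_GB_PER_DAY / target_hours

# Workload implied by a 6-month MTBF, in GB per day (roughly 5.5 TB/day)
print(round(workload_for_mtbf(6 * HOURS_PER_MONTH)))
```

Under a different wear model the numbers would change, so this is only useful as an order-of-magnitude check.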



Actual failure rates observed in production are unknown; the devices have
not been around long enough.

However, if you read product feedback on newegg, you may note that many SSDs
seem to suffer from the "sudden death" syndrome - a problem we happily
no longer see on spinning disks.

I guess the "5 year warranty" is real enough, but it does not cover your
costs in labour for replacing dead disks, costs of down time and costs of
lost data.



... risk of dropping RAID in favor of just one of these drives?



To help you make an informed decision, here is my data.

I have about 9 SSDs in production use (most are in RAID1 pairs); the oldest
has been running since last October:
- 1 has 3 bad blocks (no RAID1),
- 1 has SATA comm problem (vanishes from the system - system survives because
   it's a RAID1 pair).
- 0 dead so far

I have about 20 USB and CF flash drives in production, used as SL4/5/6 system
disks, some in RAID1, some as singles; the oldest has been in use for 3 (or
more?) years. There are zero failures, apart from 1 USB flash drive that has
a few bad blocks and apart from infant mortality (every USB3 flash drive, and
every USB2 flash drive except for 1 brand, fails within a few weeks).

All drives used as singles are backed up nightly (rsync).

All spinning disks are installed in RAID1 pairs.

Would *I* use single drives (any technology - SSD, USB flash, spinning)?

Only for a system that does not require 100% uptime (is not used
by any users) and when I can do daily backups (it cannot be in a room
without a GigE network).









Posted for testing: security packages for Java (SL5)

2012-09-05 Thread Pat Riehecky

Security packages for Java posted for testing at

ftp://ftp.scientificlinux.org/linux/scientific/5rolling/testing/i386/
ftp://ftp.scientificlinux.org/linux/scientific/5rolling/testing/x86_64/

Next week these packages will be officially released.  This delay is to
allow you time to test and verify your production applications will run
as expected once this security update is applied.

If you do not want this security update please consult your site's
local security policy to determine how you should proceed.  Scientific
Linux will automatically feature this update next week.

As a reminder, the openjdk Java environment is available in Scientific
Linux 5.  Updates for openjdk are released in a similar manner to other
security updates.  Additionally, Scientific Linux 6 does not bundle the
closed source Java environment.  So if you are planning to move to
Scientific Linux 6 in the future, you may wish to begin the java
migration to openjdk at this time.




The update advisory is posted below:

Synopsis: Important: java-1.6.0-sun
Issue Date: 2012-09-04
CVE Numbers: CVE-2012-4681


These vulnerabilities may be remotely exploitable without 
authentication, i.e., they may be exploited over a network without the 
need for a username and password. To be successfully exploited, an 
unsuspecting user running an affected release in a browser will need to 
visit a malicious web page that leverages this vulnerability. Successful 
exploits can impact the availability, integrity, and confidentiality of 
the user's system.


In addition, this Security Alert includes a security-in-depth fix in the 
AWT subcomponent of the Java Runtime Environment.


Due to the severity of these vulnerabilities, the public disclosure of
technical details and the reported exploitation of CVE-2012-4681 "in the
wild," we strongly recommend that you apply the updates as soon as
possible.


Re: SSD and RAID question

2012-09-05 Thread Konstantin Olchanski
On Wed, Sep 05, 2012 at 10:34:24AM +0200, Sean Murray wrote:
> On 09/04/2012 09:21 PM, Konstantin Olchanski wrote:
> >On Sun, Sep 02, 2012 at 05:33:24PM -0700, Todd And Margo Chester wrote:
> >>
> >>Cherryville drives have a 1.2 million hour MTBF (mean time
> >>between failure) and a 5 year warranty.
> >>
> >
> >Note that MTBF of 1.2 Mhrs (137 years?!?) is the *vendor's estimate*.
>
> It's worse than that, if you read their docs, that number is based on an
> average write/read rate of (intel) 20G per day, which is painfully little
> for a server.
> 

I am not sure if I believe any of these numbers. Here is why.

I just checked my running systems and the oldest "/" filesystem on a USB Flash
drive is "Nov 2010", not quite 2 years old. (Other USB Flash drives are
probably older than that, but they got "new" filesystems when upgraded from
SL4 to SL5 to SL6.)

USB Flash drives were never intended for heavy duty use as "/" disks,
Linux ext3/ext4 is not even supposed to be easy on flash media,
and I remember the dire predictions of flash media "wear-out".

I read all this to mean that nobody knows how long flash media really lasts
in production use.

I think "they" have to fix the sudden-death problem first, *then* we will start
seeing how the media life-time and wear-out come into play.

-- 
Konstantin Olchanski
Data Acquisition Systems: The Bytes Must Flow!
Email: olchansk-at-triumf-dot-ca
Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada


Re: SSD and RAID question

2012-09-05 Thread Todd And Margo Chester

On 09/04/2012 12:21 PM, Konstantin Olchanski wrote:

Cherryville drives have a 1.2 million hour MTBF (mean time
between failure) and a 5 year warranty.


Note that MTBF of 1.2 Mhrs (137 years?!?) is the *vendor's estimate*.


Baloney check.  1.2 Mhrs does not mean that the device is expected
to last 137 years.  It means that if you have 1.2 million devices
in front of you on a test bench, you would expect one device to
fail in one hour.

-T
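
For reference, the two readings of the 1.2 Mhrs figure can be put side by side. This is a minimal sketch assuming a constant failure rate of 1/MTBF per device-hour, which is the usual simplification behind an MTBF number and not something the vendor data in this thread confirms. The function names are made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years

def mtbf_in_years(mtbf_hours):
    """Convert an MTBF figure from hours to years."""
    return mtbf_hours / HOURS_PER_YEAR

def expected_failures(fleet_size, hours, mtbf_hours):
    """Expected number of failures in a fleet over an interval,
    assuming a constant failure rate of 1/MTBF per device-hour.
    Only a reasonable approximation while few devices have failed."""
    return fleet_size * hours / mtbf_hours

mtbf = 1.2e6
print(round(mtbf_in_years(mtbf)))           # 137 (the "137 years" reading)
print(expected_failures(1.2e6, 1, mtbf))    # 1.0 (one failure per hour across 1.2 million drives)
print(expected_failures(100, 8760, mtbf))   # about 0.73 failures per year in a 100-drive fleet
```

The fleet reading is the one that matters for planning spares: the figure describes a population failure rate, not the service life of one drive.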


Re: SSD and RAID question

2012-09-05 Thread Todd And Margo Chester

On 09/04/2012 12:21 PM, Konstantin Olchanski wrote:

every USB3 and all except 1 brand USB2 flash drives
fail within a few weeks


I have been selling a "few" Kanguru USB 3 flash drives
as backup sticks.  So far so good.  Any idea what is
happening on your end?   (USB3 sticks are so fast.)


ksoftirqd cpu usage

2012-09-05 Thread Orion Poplawski
Anyone else here noticed (dramatically) increased cpu usage for ksoftirqd on 
their SL6 machines with recent kernels?  Anyone know what's going on?


--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA, Boulder Office  FAX: 303-415-9702
3380 Mitchell Lane   or...@nwra.com
Boulder, CO 80301   http://www.nwra.com


Re: ksoftirqd cpu usage

2012-09-05 Thread Orion Poplawski

On 09/05/2012 02:52 PM, Orion Poplawski wrote:

Anyone else here noticed (dramatically) increased cpu usage for ksoftirqd on
their SL6 machines with recent kernels?  Anyone know what's going on?



Sometimes it seems to be kondemand that is using lots of cpu instead.

--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA, Boulder Office  FAX: 303-415-9702
3380 Mitchell Lane   or...@nwra.com
Boulder, CO 80301   http://www.nwra.com


Re: SL-compatible webcam recommendations?

2012-09-05 Thread Chris Schanzle

On 09/04/2012 04:01 PM, Steve Gaarder wrote:

I'm looking for a webcam that I can use reliably with SL6 and Ekiga, one
that has support in the SL or EPEL repos.  Any suggestions?


Logitech 9000 is excellent.


Re: SSD and RAID question

2012-09-05 Thread jdow

On 2012/09/05 11:38, Todd And Margo Chester wrote:

On 09/04/2012 12:21 PM, Konstantin Olchanski wrote:

Cherryville drives have a 1.2 million hour MTBF (mean time
between failure) and a 5 year warranty.


Note that MTBF of 1.2 Mhrs (137 years?!?) is the *vendor's estimate*.


Baloney check.  1.2 Mhrs does not mean that the device is expected
to last 137 years.  It means that if you have 1.2 million devices
in front of you on a test bench, you would expect one device to
fail in one hour.

-T


Baloney check back at you. If you have 1.2 million devices in front
of you, all operating under the same conditions as specified for the 1.2
million hour MTBF, half of them would have failed by the end of
the 1.2 million hours. Commentary here indicates those conditions are
a severe derating of the drive's transaction capacity.
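
How large the failed share is at the MTBF mark depends on the lifetime model, which the vendor figure does not state. As a hedged illustration only: under a constant failure rate (exponential lifetimes), the expected share failed by t = MTBF comes out near 63% rather than exactly half, and the half-way point falls at MTBF * ln 2. The function names below are made up for the sketch.

```python
import math

def fraction_failed(hours, mtbf_hours):
    """Expected fraction of a large fleet that has failed after `hours`,
    assuming exponential lifetimes: 1 - exp(-t / MTBF)."""
    return 1.0 - math.exp(-hours / mtbf_hours)

def median_life(mtbf_hours):
    """Time by which half the fleet is expected to have failed
    under the same exponential assumption."""
    return mtbf_hours * math.log(2)

mtbf = 1.2e6
print(fraction_failed(mtbf, mtbf))       # about 0.632
print(median_life(mtbf))                 # about 831,777 hours
print(fraction_failed(5 * 8760, mtbf))   # about 0.036 over a 5-year warranty period
```

A wear-out mechanism such as limited write cycles would not follow this curve, so the model says nothing about drives run well outside the rated workload.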

It does not say much of anything about the drive's life under other
conditions, because no failure mechanism is cited. For example, if the
drive is well cooled, meaning the components inside are well cooled
rather than left in the usual mountings, the life might be far greater
simply because of the drop in component temperature. 10C can make
a large difference in lifetime. But if the real limit is related to
read/write cycles on the memory locations, you may find that temperature
has little real effect on the system lifetime.

If I could design a system that worked off a fast normal RAID and could
buffer in the SSD RAID, with a safe automatic failover when the SSD RAID
failed, regardless of failure mode, and I needed the speed, you can bet
I'd be in there collecting statistics for the company for whom I did the
work. There is a financial incentive here, and a potential competitive
advantage, to KNOW how these drives fail. With 100 drives, after the first
5 or 10 had died some knowledge might be gained. And, of course, if the
drives did NOT die, even more important knowledge would be gained.

Simple MTBF under gamer-type use is pretty useless for a massive database
application. And if manufacturers are not collecting the data, there is a
really good potential for competitive advantage if you collect your own
data and hold it proprietary. I betcha somebody out there is doing this
right now for Wall Street automated trading uses, if nothing else.

{^_^}


Re: SL-compatible webcam recommendations?

2012-09-05 Thread Chris Schanzle

On 09/05/2012 06:17 PM, Chris Schanzle wrote:

On 09/04/2012 04:01 PM, Steve Gaarder wrote:

I'm looking for a webcam that I can use reliably with SL6 and Ekiga, one
that has support in the SL or EPEL repos.  Any suggestions?

Logitech 9000 is excellent.


Let me rephrase that.  We've had good luck with the Logitech 9000.  [I need
to be careful not to endorse specific products.]