Re: [atlas] USB drive more harmful than helpful?

2016-05-24 Thread Daniel Suchy
It doesn't help when we're talking about Atlas probes. I have one probe where 
the external flash died twice, even though it's placed in a datacenter with 
UPS-protected power.

 

On 23.05.16 15:35, "ripe-atlas on behalf of James R Cutler" wrote:

 

Stable power, as from a UPS, also isolates the probe from power glitches which 
may cause rebooting of the probe, thus adding to the write count. Not to 
mention corruption from power failure during writes.

 

Opinion:  If the device/system/operation is at all important, use of a UPS is 
effectively mandatory.



Re: [atlas] USB drive more harmful than helpful?

2016-05-23 Thread James R Cutler
> On May 23, 2016, at 8:41 AM, Wilfried Woeber  wrote:
> 
> [...]
>> Has anyone tested how many writes are going on to the ATLAS thumb
>> drive?  Perhaps with all the failures within a year of start, perhaps
>> too many writes are taking place?
> 
> I know that a very small number of probes is not a valid basis for statistics,
> but there wasn't a USB drive failure yet for the long-term, always-on probe.
> 
> But they are powered with dedicated, stable power sources.
> Thus I tend to lean more towards the explanation involving level or stability
> of power, rather than # of writes.
> 
> FWIW,
> Wilfried
> 
>> Regards,
>> Hank
> 


Stable power, as from a UPS, also isolates the probe from power glitches which 
may cause rebooting of the probe, thus adding to the write count. Not to 
mention corruption from power failure during writes.

Opinion:  If the device/system/operation is at all important, use of a UPS is 
effectively mandatory.

James R. Cutler
james.cut...@consultant.com
PGP keys at http://pgp.mit.edu





Re: [atlas] USB drive more harmful than helpful?

2016-05-23 Thread Bruno Pagani
On 23/05/2016 at 14:41, Wilfried Woeber wrote:

> [...]
>> Has anyone tested how many writes are going on to the ATLAS thumb
>> drive?  Perhaps with all the failures within a year of start, perhaps
>> too many writes are taking place?
> I know that a very small number of probes is not a valid basis for statistics,
> but there wasn't a USB drive failure yet for the long-term, always-on probe.
>
> But they are powered with dedicated, stable power sources.
> Thus I tend to lean more towards the explanation involving level or stability
> of power, rather than # of writes.
>
> FWIW,
> Wilfried
>
>> Regards,
>> Hank

FWIW, my failed probe #12033 was powered from a single USB port on my
ISP-provided router. I’ve plugged the replacement into a dedicated 2+1 A
power supply. So while the second one hasn’t been around long
enough to be relevant, the first one falls in the low-power issue
range.

Bruno





Re: [atlas] USB drive more harmful than helpful?

2016-05-23 Thread Wilfried Woeber
[...]
> Has anyone tested how many writes are going on to the ATLAS thumb
> drive?  Perhaps with all the failures within a year of start, perhaps
> too many writes are taking place?

I know that a very small number of probes is not a valid basis for statistics,
but there wasn't a USB drive failure yet for the long-term, always-on probe.

But they are powered with dedicated, stable power sources.
Thus I tend to lean more towards the explanation involving level or stability
of power, rather than # of writes.

FWIW,
Wilfried

> Regards,
> Hank



Re: [atlas] USB drive more harmful than helpful?

2016-05-23 Thread Joe Provo
Fwiw, I always power directly from an outlet, never tributary on the USB.
I've yet to have such failures, so my anecdata aligns with the underpower
theory.

On May 20, 2016 15:08, "Phillip Remaker"  wrote:

So I have a few theories. I have now had 3 different USB sticks fail on me:
Two Sandisk 4GB SDCZ33 and one cheap generic 8GB replacement.

The power draw of the TP-Link system + USB is probably more than the
opportunistic USB ports they get plugged in to. An underpowered probe runs
great MOST of the time, but a flash bit write is probably the highest power
strain and Flash can get really unhappy with power interrupts, based on
this SSD research:
https://www.usenix.org/system/files/conference/fast13/fast13-final80.pdf

I usually use a 500mA or 800mA supply, or tap a nearby router USB port in
that range.
I suspect the system may demand 1200mA or more.

When most flash sticks get errored out enough, they permanently fail into a
read only mode, or become fully unreadable. Read-only mode can be reset on
some models, but it is not recommended by the vendor.  At least one of the
failed SANdisk units I had was stuck in a read-only mode.

Also, probes may be subjected to ungraceful power down situations,
depending on where they are stationed. That can also be a flash drive
killer.

I don't think we are hitting the write limits of the sticks. I suspect the
units are often in underpowered or ungraceful power-down situations, or the
USB flash itself is not responding gracefully to poweroff situations.

I don't suppose RIPE buys enough USB sticks to get to talk to engineers at
SanDISK?

I know the newer Raspberry Pi will report when it is in an underpowered
situation. Can the TP-Link detect and warn when underpowered?

What is the minimum power recommended for TP-Link + USB?  Also, are there
any USB sticks that have lower power needs and are more robust in low power
IoT situations?

Is anyone trying to post-mortem the failed sticks?

On Fri, May 20, 2016 at 10:03 AM, Gert Doering  wrote:

> Hi,
>
> On Fri, May 20, 2016 at 04:10:47PM +0200, Philip Homburg wrote:
> > We have no clear idea why they fail. It seems that time to failure is
> > highly variable.
>
> Can you correlate tests-until-failure or data-written-until-failure?
>
> One of mine has failed at least two times now, and it could be that
> people just *love* to run tests from 3320...
>
> My gen 1 probe in 5539 has never had *any* issues.
>
> gert
> --
> have you enabled IPv6 on something today...?
>
> SpaceNet AG                Vorstand: Sebastian v. Bomhard
> Joseph-Dollinger-Bogen 14  Aufsichtsratsvors.: A. Grundner-Culemann
> D-80807 Muenchen   HRB: 136055 (AG Muenchen)
> Tel: +49 (0)89/32356-444   USt-IdNr.: DE813185279
>


Re: [atlas] USB drive more harmful than helpful?

2016-05-23 Thread Philip Homburg
On 2016/05/21 21:32 , Hank Nussbacher wrote:
> On 20/05/2016 22:08, Phillip Remaker wrote:
>> I don't suppose RIPE buys enough USB sticks to get to talk to
>> engineers at SanDISK?
>>
> Sandisk R&D is located in Israel:
> http://www.globes.co.il/en/article-sandisk-acquisition-affects-650-israeli-employees-1001075338
> I could probably arrange a meeting with the technical staff there
> provided there is a clear document detailing the issue.
> Maybe RIPE ATLAS technical staff would like to come to a meeting?

We switched from SanDisk to Verbatim because the failure rate of the
SanDisk was too high.

Unfortunately, the log files that contain details about the USB sticks
are archived in a way that makes them hard to access. But we will try to
analyze those logs soon to see what we can get out of them.

Independent of that, it would be nice if we could figure out a way to
induce failures (instead of having to wait for probes in the field to end
up with a corrupt filesystem) and to figure out what causes those failures.

One thing we currently wonder is if a marginal power supply (as seen
from the USB stick) would cause corruption or not.

Philip





Re: [atlas] USB drive more harmful than helpful?

2016-05-21 Thread Michael Ionescu
It was powered through the dedicated wall plug included with the probe. 

On May 22, 2016 12:03:39 AM GMT+02:00, Phillip Remaker  
wrote:
>How was the drive powered? Dedicated supply, or a port on a router?
>
>On Sat, May 21, 2016 at 1:32 PM, Michael Ionescu 
>wrote:
>
>> On May 20, 2016 9:08:08 PM GMT+02:00, Phillip Remaker
>
>> wrote:
>> >I don't suppose RIPE buys enough USB sticks to get to talk to
>engineers
>> >at SanDISK?
>>
>> I just had a Verbatim drive originally supplied with the probe go
>> read-only, so I would say RIPE is not procuring only SanDISK.




Re: [atlas] USB drive more harmful than helpful?

2016-05-21 Thread Michael Ionescu


On May 20, 2016 3:58:06 PM GMT+02:00, Philip Homburg  
wrote:
>No, the probe actually runs from the USB stick. The internal 4MB flash
>is just enough to initialize the USB stick in a secure way. And even
>that is already tricky.

Could you perhaps write some statistical data regarding drive usage to a 
cleartext partition that could be evaluated by the hoster once the probe is in 
limbo? 
-- 




Re: [atlas] USB drive more harmful than helpful?

2016-05-21 Thread Michael Ionescu


On May 20, 2016 9:08:08 PM GMT+02:00, Phillip Remaker  wrote:
>I don't suppose RIPE buys enough USB sticks to get to talk to engineers
>at SanDISK?

I just had a Verbatim drive originally supplied with the probe go read-only, so 
I would say RIPE is not procuring only SanDISK. 

--



Re: [atlas] USB drive more harmful than helpful?

2016-05-21 Thread Hank Nussbacher
On 20/05/2016 22:08, Phillip Remaker wrote:
>
> When most flash sticks get errored out enough, they permanently fail
> into a read only mode, or become fully unreadable. Read-only mode can
> be reset on some models, but it is not recommended by the vendor.  At
> least one of the failed SANdisk units I had was stuck in a read-only mode.
>
> Also, probes may be subjected to ungraceful power down situations,
> depending on where they are stationed. That can also be a flash drive
> killer.
>
> I don't think we are hitting the write limits of the sticks. I suspect
> the units are often in underpowered or ungraceful power-down
> situations, or the USB flash itself is not responding gracefully to
> poweroff situations.
>
> I don't suppose RIPE buys enough USB sticks to get to talk to
> engineers at SanDISK?
>
Sandisk R&D is located in Israel:
http://www.globes.co.il/en/article-sandisk-acquisition-affects-650-israeli-employees-1001075338
I could probably arrange a meeting with the technical staff there
provided there is a clear document detailing the issue.
Maybe RIPE ATLAS technical staff would like to come to a meeting?

Regards,
Hank




Re: [atlas] USB drive more harmful than helpful?

2016-05-20 Thread Phillip Remaker
So I have a few theories. I have now had 3 different USB sticks fail on me:
Two Sandisk 4GB SDCZ33 and one cheap generic 8GB replacement.

The power draw of the TP-Link system + USB is probably more than the
opportunistic USB ports they get plugged in to. An underpowered probe runs
great MOST of the time, but a flash bit write is probably the highest power
strain and Flash can get really unhappy with power interrupts, based on
this SSD research:
https://www.usenix.org/system/files/conference/fast13/fast13-final80.pdf

I usually use a 500mA or 800mA supply, or tap a nearby router USB port in
that range.
I suspect the system may demand 1200mA or more.
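
As a rough illustration of the supply figures above, the following sketch
compares a few supply ratings against the suspected demand. The 1200 mA value
is only the guess stated in this message, not a measured or documented figure;
`headroom_ma` is a hypothetical helper name and the 2000 mA supply is a
made-up comparison point.

```python
# Hypothetical power-budget check. SUSPECTED_DEMAND_MA is the guess from the
# message above (probe plus USB stick), not a measured value.
SUSPECTED_DEMAND_MA = 1200


def headroom_ma(supply_ma: int, demand_ma: int) -> int:
    """Return spare current in mA; a negative result means an undersized supply."""
    return supply_ma - demand_ma


# 500 mA and 800 mA are the supplies mentioned above; 2000 mA is illustrative.
for supply in (500, 800, 2000):
    spare = headroom_ma(supply, SUSPECTED_DEMAND_MA)
    status = "ok" if spare >= 0 else "underpowered"
    print(f"{supply} mA supply: {spare:+d} mA headroom ({status})")
```

If the guess is anywhere near right, both of the supplies mentioned above
would fall short during peak draw, which is the moment a flash write is most
likely to be in progress.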

When most flash sticks get errored out enough, they permanently fail into a
read only mode, or become fully unreadable. Read-only mode can be reset on
some models, but it is not recommended by the vendor.  At least one of the
failed SANdisk units I had was stuck in a read-only mode.

Also, probes may be subjected to ungraceful power down situations,
depending on where they are stationed. That can also be a flash drive
killer.

I don't think we are hitting the write limits of the sticks. I suspect the
units are often in underpowered or ungraceful power-down situations, or the
USB flash itself is not responding gracefully to poweroff situations.

I don't suppose RIPE buys enough USB sticks to get to talk to engineers at
SanDISK?

I know the newer Raspberry Pi will report when it is in an underpowered
situation. Can the TP-Link detect and warn when underpowered?

What is the minimum power recommended for TP-Link + USB?  Also, are there
any USB sticks that have lower power needs and are more robust in low power
IoT situations?

Is anyone trying to post-mortem the failed sticks?

On Fri, May 20, 2016 at 10:03 AM, Gert Doering  wrote:

> Hi,
>
> On Fri, May 20, 2016 at 04:10:47PM +0200, Philip Homburg wrote:
> > We have no clear idea why they fail. It seems that time to failure is
> > highly variable.
>
> Can you correlate tests-until-failure or data-written-until-failure?
>
> One of mine has failed at least two times now, and it could be that
> people just *love* to run tests from 3320...
>
> My gen 1 probe in 5539 has never had *any* issues.
>
> gert
> --
> have you enabled IPv6 on something today...?
>
> SpaceNet AG                Vorstand: Sebastian v. Bomhard
> Joseph-Dollinger-Bogen 14  Aufsichtsratsvors.: A. Grundner-Culemann
> D-80807 Muenchen   HRB: 136055 (AG Muenchen)
> Tel: +49 (0)89/32356-444   USt-IdNr.: DE813185279
>


Re: [atlas] USB drive more harmful than helpful?

2016-05-20 Thread Gert Doering
Hi,

On Fri, May 20, 2016 at 04:10:47PM +0200, Philip Homburg wrote:
> We have no clear idea why they fail. It seems that time to failure is
> highly variable.

Can you correlate tests-until-failure or data-written-until-failure?
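
One way such a correlation could be checked, once per-probe numbers are
available, is a plain Pearson coefficient. This is only a sketch: `pearson` is
a hypothetical helper and no real probe data is used here. If wear were the
cause, probes with a higher daily write volume should tend to fail sooner,
giving a clearly negative coefficient between write rate and days until
failure; a coefficient near zero would point away from write count.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

Here `xs` would be something like megabytes written per day for each failed
probe and `ys` the number of days until its stick failed. With only a handful
of failures the coefficient would be very noisy, so it could at most hint at a
trend.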

One of mine has failed at least two times now, and it could be that 
people just *love* to run tests from 3320...

My gen 1 probe in 5539 has never had *any* issues.

gert
-- 
have you enabled IPv6 on something today...?

SpaceNet AG                Vorstand: Sebastian v. Bomhard
Joseph-Dollinger-Bogen 14  Aufsichtsratsvors.: A. Grundner-Culemann
D-80807 Muenchen   HRB: 136055 (AG Muenchen)
Tel: +49 (0)89/32356-444   USt-IdNr.: DE813185279




Re: [atlas] USB drive more harmful than helpful?

2016-05-20 Thread Philip Homburg
On 2016/05/20 14:57 , Hank Nussbacher wrote:
> Has anyone tested how many writes are going on to the ATLAS thumb
> drive?  Perhaps with all the failures within a year of start, perhaps
> too many writes are taking place?

We have no clear idea why they fail. It seems that time to failure is
highly variable.





Re: [atlas] USB drive more harmful than helpful?

2016-05-20 Thread Philip Homburg
On 2016/05/20 14:37 , Michael Ionescu wrote:
> If the main reason for the drive is to cache data during unavailability
> of the command and control center, this may not be worth the effort.

No, the probe actually runs from the USB stick. The internal 4MB flash
is just enough to initialize the USB stick in a secure way. And even
that is already tricky.






Re: [atlas] USB drive more harmful than helpful?

2016-05-20 Thread Gil Bahat
+1. I lost most of the probes this way and I'm not really sure how to
recover them - I need to ask for a batch of USB drives or ask all the hosts
to remove them... can't this be handled better with a firmware replacement?
I would at least then ask all the hosts to unplug the USB and leave the
hosts as is.

Gil

On Fri, May 20, 2016 at 4:06 PM, Gert Doering  wrote:

> Hi,
>
> On Fri, May 20, 2016 at 02:37:44PM +0200, Michael Ionescu wrote:
> > From both my own (short term) experience and from what's being written
> on this list, I'm getting the impression that the USB drive may be costing
> more than it's worth.
> [..]
> > Any thoughts?
>
> The USB outages and the lack of proper guidance for probe hosts about
> the problem status and how to get the probes back has been my gripe #1
> for a while now.
>
> So, yes, this needs fixing, one way or the other.
>
> gert
> --
> have you enabled IPv6 on something today...?
>
> SpaceNet AG                Vorstand: Sebastian v. Bomhard
> Joseph-Dollinger-Bogen 14  Aufsichtsratsvors.: A. Grundner-Culemann
> D-80807 Muenchen   HRB: 136055 (AG Muenchen)
> Tel: +49 (0)89/32356-444   USt-IdNr.: DE813185279
>


Re: [atlas] USB drive more harmful than helpful?

2016-05-20 Thread Gert Doering
Hi,

On Fri, May 20, 2016 at 02:37:44PM +0200, Michael Ionescu wrote:
> From both my own (short term) experience and from what's being written on
> this list, I'm getting the impression that the USB drive may be costing more
> than it's worth.
[..]
> Any thoughts?

The USB outages and the lack of proper guidance for probe hosts about
the problem status and how to get the probes back has been my gripe #1
for a while now.

So, yes, this needs fixing, one way or the other.

gert
-- 
have you enabled IPv6 on something today...?

SpaceNet AG                Vorstand: Sebastian v. Bomhard
Joseph-Dollinger-Bogen 14  Aufsichtsratsvors.: A. Grundner-Culemann
D-80807 Muenchen   HRB: 136055 (AG Muenchen)
Tel: +49 (0)89/32356-444   USt-IdNr.: DE813185279




Re: [atlas] USB drive more harmful than helpful?

2016-05-20 Thread Hank Nussbacher
On 20/05/2016 15:37, Michael Ionescu wrote:

Interesting idea to make the USB drive optional.  Based on literature:

https://en.wikipedia.org/wiki/USB_flash_drive#Failures
https://askleo.com/can_a_usb_thumbdrive_wear_out/ - 10,000-100,000
http://cfgearblog.blogspot.co.il/2011/03/how-long-does-flash-drive-last_22.html 
- 10,000-1M

Has anyone tested how many writes are going on to the ATLAS thumb
drive?  Given all the failures within a year of start, perhaps
too many writes are taking place?
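
A minimal sketch of how the write volume could be measured and set against
the endurance figures above, assuming the probe runs Linux and exposes
/proc/diskstats (whether the Atlas firmware does is not confirmed in this
thread). The device name, stick capacity and write-amplification factor are
placeholders, and the model assumes ideal wear levelling, so the result is an
optimistic upper bound at best.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def bytes_written(diskstats_text: str, device: str) -> int:
    """Bytes written to `device` since boot, given the text of /proc/diskstats."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Fields: major, minor, name, reads, reads merged, sectors read,
        # ms reading, writes, writes merged, sectors written, ...
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9]) * SECTOR_BYTES
    raise KeyError(f"device {device!r} not found")


def years_to_wear_out(capacity_bytes: int, cycles: int,
                      bytes_per_day: float,
                      write_amplification: float = 1.0) -> float:
    """Years until every cell has seen `cycles` erases, with ideal wear levelling."""
    if bytes_per_day <= 0:
        raise ValueError("bytes_per_day must be positive")
    total_writable = capacity_bytes * cycles / write_amplification
    return total_writable / bytes_per_day / 365.0
```

On a probe, `bytes_written` would be fed the contents of /proc/diskstats and
the name of the stick (for example "sda", an assumption); sampling it twice a
day apart gives `bytes_per_day`, since the counter resets at boot. As a purely
hypothetical figure, a 4 GB stick rated at 10,000 cycles and receiving 1 GB of
writes per day would take roughly 110 years to wear out under these idealised
assumptions.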

Regards,
Hank


> From both my own (short term) experience and from what's being
> written on this list, I'm getting the impression that the USB drive
> may be costing more than it's worth.
>
> I have in only about 3 months experienced multiple probe issues due to
> USB drives and there have been multiple threads on this list which
> suggest that I am far from alone.
>
> I will go so far as to suspect that a substantial number of
> disconnected and abandoned probes have similar issues, but the hosters
> may be unwilling or unable to spend the necessary time to investigate
> and resolve them.
>
> If the main reason for the drive is to cache data during
> unavailability of the command and control center, this may not be
> worth the effort.
>
> I would suggest making the drive optional. That may mean fewer data
> points from and/or fewer UDMs possible on those probes without a
> functioning drive. But it may also mean a couple thousand probes more
> connected and therefore available for measurements at all.
>
> I'm not saying that probes should not ask for their drives to be
> fixed/replaced, but it should not be a requirement for the probe to
> run. One might give an incentive to hosters to run their probes with
> functioning drives by giving fewer credits for connected probes without
> a drive.
>
> Any thoughts?
>
> Michael
> -- 
> Sent from a mobile. Please excuse my brevity. // M: +49-163-6866568 






[atlas] USB drive more harmful than helpful?

2016-05-20 Thread Michael Ionescu
From both my own (short term) experience and from what's being written on this
list, I'm getting the impression that the USB drive may be costing more than
it's worth.

I have in only about 3 months experienced multiple probe issues due to USB 
drives and there have been multiple threads on this list which suggest that I 
am far from alone. 

I will go so far as to suspect that a substantial number of disconnected and 
abandoned probes have similar issues, but the hosters may be unwilling or 
unable to spend the necessary time to investigate and resolve them. 

If the main reason for the drive is to cache data during unavailability of the 
command and control center, this may not be worth the effort. 

I would suggest making the drive optional. That may mean fewer data points from 
and/or fewer UDMs possible on those probes without a functioning drive. But it 
may also mean a couple thousand probes more connected and therefore available 
for measurements at all. 

I'm not saying that probes should not ask for their drives to be 
fixed/replaced, but it should not be a requirement for the probe to run. One 
might give an incentive to hosters to run their probes with functioning drives 
by giving fewer credits for connected probes without a drive.

Any thoughts?

Michael
-- 
Sent from a mobile. Please excuse my brevity. // M: +49-163-6866568