Yeah, I'm not sure that HIPAA is really the issue here. From a technology perspective, though, it's certainly an interesting problem. Being too cheap to solve the problem properly doesn't help, but the real question is how to properly solve the problem.
We have looked at several technologies recently that claim to be able to help. Filetek is an object storage software system that sits in front of any storage, so you are no longer tied to the particular storage technology. You can combine different block, file and tape systems behind it and even move or replicate data around nondisruptively. It looks promising.

The other system we saw recently is called Amplidata. They have a file system (if it can really be called that) that provides so much protection via erasure coding that you no longer need to back up the data because the calculated protection is three orders of magnitude better than what replicated backups can provide. So with something like Filetek managing the data, and storing the object on a backend Amplidata system, it would seem that vast amounts of data can be kept protected indefinitely. I'm curious what other similar systems exist for this.

Of course none of that actually *does* stuff with the data, but a big data analytical engine could plug into the datastore and search/link/manipulate the data into reports pretty easily...

Adam

On Saturday, September 14, 2013, Francis Liu <[email protected]> wrote:
> I thought HIPAA would only apply to user-identifiable data, not any old random big data dataset. Best keep the important legislation-constrained data as small as you can.
>
> I think many of the current uses of "big data" actually have a short half life. The bigger the feed you're getting now, the less useful it is in x months, because the data will be aged or expired.
>
> As the OP indicates, maintenance isn't free, and "future-proofing" has never ever been free.
>
> On Sun, Sep 15, 2013 at 12:05 AM, Andrew Hume <[email protected]> wrote:
>>
>> a recent meditation from a mailing list on the issue of media bandwidth for big data.
>> note especially the claim that the time needed to migrate the data to another medium exceeds
>> the lifetime of the current medium.
>>
>> Exactly what I was referring to - bandwidth needed for data integrity & migration. Oh, and read-never is not a myth at all - at least in the minds of many datacenters and the folks who run them. They are under either legal mandate and/or company policy to retain data, read regardless. They hope to never read it at all. Still, they must prove in a court of law that they have retained it.
>>
>> Which brings us back to data integrity and long-term preservation. If you think it's a problem today, just wait...this is the 8 bazillion pound gorilla that faces all institutions who plan on storing exabytes of data. FB is one of those.
>>
>> To your point about large tape farms (disclaimer: I used to work for StorageTek) I already know several HPC sites who are 'stuck' - i.e. they cannot (or will not pay for) the necessary infrastructure to correctly maintain and migrate exascale data collections. It would take them longer to migrate the collection to new tape than the useful lifetime of the media. And they are too cheap to buy and maintain the needed infrastructure to perform such a migration in parallel, to reduce the time needed.
>>
>> Just you wait. 5 years from now, the scheist will hit the (exabyte) fan. Storing data today is one thing, preserving it for decades is quite another. HIPAA, anyone?
>>
>> -----------------------
>> Andrew Hume
>> 949-707-1964 (VO and best)
>> 732-420-2275 (NJ)
>> [email protected]
>>
>> _______________________________________________
>> Tech mailing list
>> [email protected]
>> https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
>> This list provided by the League of Professional System Administrators
>> http://lopsa.org/
>>
>
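For what it's worth, the claim quoted above (that migrating a collection can take longer than the media will last, unless you pay for drives to do it in parallel) is easy to sanity-check with back-of-the-envelope arithmetic. Here is a minimal sketch in Python. The collection size, per-drive throughput and target window are made-up numbers for illustration, not figures from anyone in this thread, and the function names are my own.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # average year, leap days included

EB = 10**18  # one exabyte, decimal
MB = 10**6   # one megabyte, decimal


def migration_years(collection_bytes, drive_bytes_per_sec, drives, duty_cycle=1.0):
    """Years needed to copy the whole collection once.

    duty_cycle is the fraction of wall-clock time the drives actually spend
    streaming (mounts, seeks, verification and failures push it below 1).
    """
    if drives < 1 or not 0 < duty_cycle <= 1:
        raise ValueError("need drives >= 1 and 0 < duty_cycle <= 1")
    aggregate = drive_bytes_per_sec * drives * duty_cycle
    return collection_bytes / aggregate / SECONDS_PER_YEAR


def drives_needed(collection_bytes, drive_bytes_per_sec, target_years, duty_cycle=1.0):
    """Smallest number of parallel drives that finishes within target_years."""
    required = collection_bytes / (target_years * SECONDS_PER_YEAR)
    return math.ceil(required / (drive_bytes_per_sec * duty_cycle))


if __name__ == "__main__":
    # Hypothetical inputs: 1 EB collection, 250 MB/s sustained per drive.
    for n in (10, 100, 1000):
        years = migration_years(1 * EB, 250 * MB, n)
        print(f"{n:5d} drives: {years:.2f} years")
    print("drives for a 5-year window:", drives_needed(1 * EB, 250 * MB, 5))
```

With those assumed numbers, ten drives need over a decade for a single pass, which is the "stuck" situation described above. The point of the sketch is only that the migration window scales inversely with the drive count you are willing to fund. It ignores reading the source media, verification passes and data growth during the migration, all of which make the real picture worse.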
