On Fri, 8 Jun 2012 14:05:06 -0700
David Leimbach <leim...@gmail.com> wrote:

> On Fri, Jun 8, 2012 at 8:44 AM, erik quanstrom
> <quans...@labs.coraid.com> wrote:
> 
> > i haven't seen any evidence that strongly typed files are a good idea.
> > but maybe others have?
> >
> 
> I can tell you that the "Big Data Analytics" explosion that's been going
> on, and that is creating lots of jobs for data scientists, has an awful lot
> to do with the fact that files on a filesystem are unstructured or
> "untyped" (without a schema).

Most of this is passing me by, but I would really like a search engine
for my own hard drive, if that's part of what you're talking about. I
guess with a search engine types matter, but there are very few file
types you'd want to read, view, or watch which can't be determined by
looking at the first part of the file. Within the file each type has
its own structure. I don't really understand what's meant by
"unstructured" here, unless the idea is for programs to handle all
types without recognising each one individually. Even then I don't see
the problem, because many file types are just containers anyway. I
should probably stop there, though, as I really don't know what comes
of analytics and I don't have big data for any purpose other than to
be searched.
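
To make "looking at the first part of the file" concrete, here is a
rough sketch in Python. The signature table and the function names are
my own illustration, not anyone's actual tool, and the table is nowhere
near complete; file(1) knows far more types than this.

```python
# Hypothetical, deliberately tiny table of leading-byte signatures.
MAGIC = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"PK\x03\x04", "zip"),   # a container: many document formats are zips
    (b"\x1f\x8b", "gzip"),
]


def sniff(head: bytes) -> str:
    """Guess a type name from the first few bytes of a file."""
    for signature, name in MAGIC:
        if head.startswith(signature):
            return name
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the start of the file at `path` and guess its type."""
    with open(path, "rb") as f:
        return sniff(f.read(16))
```

Note that a container such as zip only tells you the outer type; you
would have to look inside to find out what it really holds.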

> 
> On systems like iOS, applications don't expose a file system to the end
> user. Instead, apps that work with PDFs can forward those documents to
> other applications that understand PDFs.  This corresponds more to "data
> types".  The type-less mode is more general, and the typed mode seems
> easier to reason about.

It sounds like a weaker version of the old PalmOS, which had databases
rather than files, and each db had a type associating it with an app.
I think I prefer the weaker version, although I will probably dissect
some of my numerous Palm dbs if I find the time. I'd like to see how
they're structured.
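
As a starting point for that dissection, a minimal sketch in Python
that pulls the name, type and creator out of a Palm database header.
It assumes the PDB header layout as I understand it: a 78-byte header
with a 32-byte NUL-padded name at the start, a 4-byte type at offset 60
and a 4-byte creator at offset 64. The function name is made up, and
the layout should be checked against real files before trusting it.

```python
import struct

PDB_HEADER_SIZE = 78  # assumed size of the fixed PDB header


def pdb_type_creator(header: bytes):
    """Return (name, type, creator) from the start of a Palm database.

    Assumes: name in bytes 0..31 (NUL-padded), type at offset 60,
    creator at offset 64, each a 4-character code.
    """
    if len(header) < PDB_HEADER_SIZE:
        raise ValueError("need at least 78 bytes of header")
    name = header[:32].split(b"\0", 1)[0].decode("latin-1")
    db_type, creator = struct.unpack_from(">4s4s", header, 60)
    return name, db_type.decode("latin-1"), creator.decode("latin-1")
```

The creator code is what ties a db to its app, and the type code says
what kind of db it is for that app.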

> 
> In fact, the people who will eat the lunch of those wrangling
> unstructured data are the ones who figure out how to structure the data
> in a way that it's not a problem anymore.

I will be very, VERY interested if they manage to find something that
actually works, considering the last attempt was XML. ;) Even if you
don't count XML as an attempt at a universal structure, it's certainly
used for that, a lot.
