Michael Adams wrote:
So I had a situation where I needed to massage some data that I had
procured as text files. Do some statistics and draw some graphs, that
sort of thing. It turned out to be over 62000 data sets. But DataPilot
was the perfect tool to do what I needed.
Oh no it wasn't ;) It was the tool you were familiar with. A database is
the best tool for this (snuffle - nuthins perfect).
I beg to differ. Given the structure of the data and what I wanted to
accomplish, the DataPilot was the perfect tool. You could have used my
problem as a textbook example of "How to Use a DataPilot". The *only*
aspect of this that suggested the use of a database was the quantity of
data.
Furthermore, once I had set up the Pivot table, I could just select the
appropriate columns and immediately produce a graph. Using a database I
would have had to export a results table into a graphing program like
gnuplot.
Lastly, the fact that I was under time pressure makes my familiarity
with the tool a *very* relevant factor.
I probably _could_ have worked it out by creating a database, but it
would have taken me a whole lot longer because I would have spent
considerable time just screwing around figuring out how to do it.
But the experience gained pays for itself.
How? I haven't had to do that kind of analysis since then, so I would
still be waiting for that payoff.
Now if it had been 72K instead of 62K rows I would have had no choice.
Next time it might be.
Assuming there's a next time. It was for a term paper -- one time event.
You can create dictionary definitions of "spreadsheet" and "database
app" that make them into completely different animals, but in real-life
applications there is considerable overlap.
Not when you get into relational databases - real world stuff. A
spreadsheet will handle single-table stuff, sort of, but it won't do it
as well as a database.
Think about this example - people and their email addresses.
You have a person, you have an email address, simple.
But then you have the same person that wants to contact you from work
and home. One person - Two email addresses.
He tells his wife and she signs up using the same home email address,
and her work address. Two people - three email addresses.
Their daughter signs up as well at home. Three people - three email
addresses.
With a flat database (spreadsheet) you have five entries.
dad - home
dad - dads work
mum - home
mum - mums work
grl - home
With a relational database you have two tables with three entries each,
plus the cross-reference info.
dad \|/ dads work
mum -|- home
grl /|\ mums work
Sorts and lookups are then done much faster as there is less to sort
through. Sort by person or sort by email. Look up the person, then list
their emails.
Really simple example but just enough to explain my point (I hope).
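The two-tables-plus-crossreference layout described above can be sketched
with Python's built-in sqlite3 module (table and column names here are
just illustrative, not from the original post):

```python
import sqlite3

# In-memory database; the schema mirrors the example above:
# a people table, an emails table, and a cross-reference
# ("junction") table holding the five relationships.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emails (id INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE person_email (person_id INTEGER, email_id INTEGER);
""")
con.executemany("INSERT INTO people VALUES (?, ?)",
                [(1, "dad"), (2, "mum"), (3, "grl")])
con.executemany("INSERT INTO emails VALUES (?, ?)",
                [(1, "home"), (2, "dads work"), (3, "mums work")])
# Three people, three addresses, five links between them.
con.executemany("INSERT INTO person_email VALUES (?, ?)",
                [(1, 1), (1, 2), (2, 1), (2, 3), (3, 1)])

# "Look up the person, then list their emails":
rows = con.execute("""
    SELECT e.address
    FROM people p
    JOIN person_email pe ON pe.person_id = p.id
    JOIN emails e        ON e.id = pe.email_id
    WHERE p.name = 'dad'
    ORDER BY e.address
""").fetchall()
print([r[0] for r in rows])  # → ['dads work', 'home']
```

The junction table is what the flat spreadsheet layout can't express
without repeating names and addresses on every row.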
I totally understand the concept of relational databases (and I'll see
your First Normal Form and raise you to Third). The issue in my mind
isn't so much the data itself or how much of it you have as it is what
you intend to do with it. RDBs are the thing for storing, retrieving,
and updating information. Spreadsheets are good for calculating and
analyzing information. I was doing the latter on a single, static set
of data.
But if I were trying to track customer information, or financial
transactions, or inventory, or anything else that is dynamic, or any
time you need to break out the info into sub-tables (where you get into
joins and unions) like you described, then of course you would use an
RDBMS.
The point is to use the proper tool for the job. What I didn't mention
in the first post (because I didn't think it relevant) was that I also
used a couple of simple sed expressions to filter and pre-process the
data. Why? I know how to use sed. I know how to use Calc. I know
database theory but I have almost zero experience at it. And gnuplot I
know about, but haven't messed with. Easy decision.
--
Rod
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]