[sc-issues] [Issue 80139] Avoid ERRORS in duplicate Spreadsheets and Data

discoleo Sat, 28 Jul 2007 10:24:44 -0700

To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=80139
                 Issue #|80139
                 Summary|Avoid ERRORS in duplicate Spreadsheets and Data
               Component|Spreadsheet
                 Version|OOo 2.2.1
                Platform|All
                     URL|
              OS/Version|All
                  Status|UNCONFIRMED
       Status whiteboard|
                Keywords|
              Resolution|
              Issue type|FEATURE
                Priority|P3
            Subcomponent|code
             Assigned to|spreadsheet
             Reported by|discoleo






------- Additional comments from [EMAIL PROTECTED] Sat Jul 28 17:24:38 +0000 
2007 -------
This issue makes it in my TOP 5 of major design flaws of existing spreadsheet
applications.


TOC
===

1. INTRO / REAL CASES
2. PROBLEM
3. SOLUTIONS
  3.1 COLLABORATIONS
  3.2 DUPLICATE DATA / DATA LINKS
  3.3 TRACK CHANGES
  3.4 VERSIONING
4. OTHER RELEVANT ISSUES


1. INTRO
========

*Duplicate Data* perpetuates ERRORS!

One encounters in current practice dozens of copies of the same spreadsheet
because different people need to work with it, or because one needs only a
subset of the original data (which is copied/saved into a new table/
spreadsheet). [This is NOT a duplicate of issue 8811.]

One of my first tasks - as I started work in a big department - was to oversee
the allocation of revenues to various contractors based on some specific
characteristics. The first issue I stumbled upon, was the *lack of a central
repository* where those characteristics were stored.

Instead, various departments/employees had their own copy of the spreadsheet and
everybody was updating his own copy.

BUT no one had the spreadsheet up to date. There was NO way to get a fully *up
to date* spreadsheet, and there was NO way to *track the changes*.

As a second example, I will describe now a scenario that happened while I was
doing some research. I was the only person involved in analysing the
spreadsheet. However, due to the complexity of the spreadsheet, I created
various (sub-) tables containing only relevant portions of the data.

In the case I detected one error in the data, there was NO method to correct the
error *only ONCE*, and have the correction applied in all sheets/data. The
errors were doomed to persist in the various duplicate data.


2. THE PROBLEM
==============
Existing spreadsheets offer very little mechanisms to avoid problems of
duplicate data and the errors that arise because of this.

While many will argue that in such cases a database should be used, common
practice shows that most work is still done with spreadsheets, because of the
following reasons:
 a.) easy to create (actually NO effort at all)
 b.) needs NOT a rigid structure (unlike a DB), therefore
     great flexibility in the beginning, when the data may be largely unknown
 c.) you see what you do, and you already see the results
     (unlike for DBs - you need a dedicated software to compute the results)
 d.) NO special coding skills needed


3. SOLUTIONS
============

As mentioned, a DB might be tentative, BUT NOT in real practice. Some
spreadsheets offer some (incomplete) solutions (like Excel for issue 8811), BUT
they will all fail on a global scale.

  3.1 COLLABORATIONS
  ==================
  Spreadsheets should allow collaboration between people. This is described in
greater detail in issue 8811. Features to be implemented:
 - allow simultaneous editing by different users
 - allow various locking mechanisms

  3.2 DATA LINKS
  ==============
  Often, one needs only subsets of the original data to further process. One may
filter this data out and then copy/paste it to a new spreadsheet.

However, what is lacking is to paste NOT the actual data, but a link to the
original data, like a *HARD-LINK* on Unix-like OSs. This is described in greater
detail in issue 66817 (see 
http://www.openoffice.org/issues/show_bug.cgi?id=66817).

A *HARD-LINK* would implement a mechanism through which a correction done in a
copied cell is propagated back in the original cell.

Another issue in this series is issue 34213 that involves pasting references to
the original cells (see http://www.openoffice.org/issues/show_bug.cgi?id=34213).
[Though the hard-links would be more powerful.]

  3.3 TRACK CHANGES
  =================
  A very useful feature for such collaborative efforts is to track the changes.
This is even more important, as numerous simultaneous changes may slow down the
computer due to the re-calculations.

Therefore, a mechanism should be in place to disable automatic recalculations
BUT to show which cells have potentially changed (and need be updated IF one
needs those values).

Some more brainstorming is here really indicated.

  3.4 VERSIONING
  ==============
  Another problem of these spreadsheets arises directly due to the frequent
changes. Previous spreadsheets might well be needed (because e.g. of legal
reasons), so one has to save the various versions of the spreadsheet.

I like in this respect especially the versioning scheme described in FORTRESS
(see the Fortress specification,
http://research.sun.com/projects/plrg/Publications/index.html), where new
versions of the same program are saved alongside the old version.

This Versioning-mechanism should be able to store snapshots of the spreadsheet
for future use. [This is definitely more difficult than the case in FORTRESS, so
some more brainstorming is warranted.]


4. RELEVANT ISSUES
==================
I have already mentioned the relevant issues. There are surely more relevant
issues (to come). In brief:
 - issue 8811: Allow multiple users to edit the same spreadsheet...
 - issue 66817: Paste as HARD-LINK (Unix style)
 - issue 34213: Paste as reference
 - still others to come ...

---------------------------------------------------------------------
Please do not reply to this automatically generated notification from
Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

[sc-issues] [Issue 80139] Avoid ERRORS in duplicate Spreadsheets and Data

Reply via email to