RFC: goals for rewritten output subsystem

Ben Pfaff Thu, 29 May 2008 22:53:23 -0700

Hi everyone.

One of my goals for the version of PSPP following the 0.6.0
release is to improve the output subsystem.  I have some ideas
for how to do this that I want to discuss with the list, but I
thought that the first step should be to talk about why the
existing output subsystem is inadequate and how we'd like its
successor to be an improvement.


I've discussed some of this briefly with Jason over IRC.

Some of the goals relate to what I'd call genuine problems with
the PSPP output subsystem:

1. Problem: As a developer, it is difficult to work with the
   output subsystem.  Formatting output for a procedure requires
   writing more code than it should.  Code for output requires
   careful thought about how the formatted output should look.
   It also tightly binds cosmetic details of output to the code
   that produces it.

   Goal: The new output subsystem should allow developers to
   focus on data output, not formatting.  Ideally, data output
   should be completely separated from presentation details.

2. Problem: PSPP output is not easily machine-readable in a
   semantically meaningful way.  That is, data produced as part
   of the output is difficult to extract for use by other
   software or by subsequent PSPP procedures.  Another aspect of
   the same issue is that PSPP tests that compare output end up
   comparing cosmetic details of the formatting, not just the
   data produced.

   Goal: The new output subsystem should be able to produce
   machine-readable, semantically meaningful data output in at
   least one widely understood format, such as CSV or an XML
   schema.  Then tests can compare this output format.

3. Problem: Output is fixed in form at the time of its
   generation; that is, there is no practical way for the GUI to
   allow the user to re-style or reformat tables, and certainly
   not to do anything more advanced like the "pivot table"
   features of spreadsheets.

   Goal: Provide good interactive output support, so that the
   GUI can let the user re-style or reformat tables after they
   have been generated.

4. Problem: Tables larger than memory cannot be efficiently
   formatted.  (This is why the LIST procedure more or less
   sidesteps the output subsystem, without producing real tables
   in its output.)

   Goal: Efficiently support tables larger than memory.

5. Problem: For users, the output subsystem is difficult to
   configure.

   Goal: Simplify configuration.
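
To make goals 1 and 2 concrete, here is a rough sketch of what
separating data output from presentation could look like.  It is
only an illustration under my own assumptions: it is written in
Python for brevity (PSPP itself is C), and the names Table,
render_csv, and render_text are hypothetical, not existing PSPP
interfaces.  The numbers in the sample table are made up.

```python
import csv
import io


class Table:
    """Hypothetical data-only table: a title, column headings, and rows.

    Nothing in this class says how the table should look when rendered.
    """

    def __init__(self, title, columns):
        self.title = title
        self.columns = list(columns)
        self.rows = []

    def add_row(self, *values):
        if len(values) != len(self.columns):
            raise ValueError("row width does not match column count")
        self.rows.append(list(values))


def render_csv(table):
    """Machine-readable rendering: data only, suitable for test comparison."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(table.columns)
    writer.writerows(table.rows)
    return out.getvalue()


def render_text(table):
    """Plain-text rendering: every cosmetic choice lives in this function."""
    cells = [table.columns] + [[str(v) for v in row] for row in table.rows]
    widths = [max(len(r[i]) for r in cells) for i in range(len(table.columns))]
    lines = [table.title]
    for r in cells:
        lines.append("  ".join(c.rjust(w) for c, w in zip(r, widths)))
    return "\n".join(lines) + "\n"


def descriptives_table():
    """What a procedure would write: it supplies data and nothing else."""
    t = Table("Descriptive Statistics", ["Variable", "N", "Mean"])
    t.add_row("height", 40, 170.25)  # made-up example values
    t.add_row("weight", 40, 68.5)    # made-up example values
    return t
```

In a design along these lines, the procedure only ever calls
add_row.  A test would compare the result of render_csv, so a
change to column widths or alignment in render_text could not
break it.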

Other goals are really just improvements:

6. Support the OMS (output management system) commands from the
   competing software, to the extent possible.

Finally, a few properties that I'd like to retain:

7. Keep performance reasonable.

8. Retain potential for internationalization of output.  (Right
   now it is only "potential" because no one has actually
   translated it.)

9. Make sure that reasonably formatted plain text output with
   tables and (in separate files) graphs is still possible.  It's
   still sometimes easier to view a text file than a PDF or HTML
   document.

I've been doing experiments off and on over the last few years,
trying to figure out a good way to do it.  Now I think I may have
a reasonable approach.  But I'd like to agree on what our goals
are before I go on.  So does anyone have anything to add to or
remove from the above?  This will be about the third or fourth
iteration of the PSPP output subsystem.  Perhaps we can get it
right this time...
-- 
Ben Pfaff 
http://benpfaff.org


_______________________________________________
pspp-dev mailing list
http://lists.gnu.org/mailman/listinfo/pspp-dev
