Hi everyone. One of my goals for the version of PSPP following the 0.6.0 release is to improve the output subsystem. I have some ideas for how to do this that I want to discuss with the list, but I thought that the first step should be to talk about why the existing output subsystem is inadequate and how we'd like its successor to improve on it.

I've discussed a little of this with Jason over IRC.

Some of the goals relate to what I'd call genuine problems with the PSPP output subsystem:

1. Problem: As a developer, it is difficult to work with the output subsystem. Formatting output for a procedure requires writing more code than it should. Code for output requires careful thought about how the formatted output should look. It also tightly binds cosmetic details of output to the code that produces it.

   Goal: The new output subsystem should allow developers to focus on data output, not formatting. Ideally, data output should be completely separated from presentation details.

2. Problem: PSPP output is not easily machine-readable in a semantically meaningful way. That is, data produced as part of the output is difficult to extract for use by other software or by subsequent PSPP procedures. Another aspect of the same issue is that PSPP tests that compare output end up comparing cosmetic details of the formatting, not just the data produced.

   Goal: The new output subsystem should be able to produce machine-readable, semantically meaningful data output in at least one widely understood format, such as CSV or an XML schema. Then tests can compare this output format.

3. Problem: Output is fixed in form at the time of its generation; that is, there is no practical way for the GUI to allow the user to re-style or reformat tables, and certainly not to do anything more advanced like the "pivot table" features of spreadsheets.

   Goal: Provide good interactive output support.

4. Problem: Tables larger than memory cannot be efficiently formatted. (This is why the LIST procedure more or less sidesteps the output subsystem, without producing real tables in its output.)

   Goal: Efficiently support tables larger than memory.

5. Problem: As a user, the output subsystem is difficult to configure.

   Goal: Simplify configuration.

Other goals are really just improvements:

6. Support the OMS (output management system) commands from the competing software, to the extent possible.

Finally, a few properties that I'd like to retain:

7. Keep performance reasonable.

8. Retain potential for internationalization of output. (Right now it's only "potential" because no one has actually translated it.)

9. Make sure that reasonably formatted plain text output with tables and (in separate files) graphs is still possible. It's still sometimes easier to view a text file than a PDF or HTML document.

I've been doing experiments off and on over the last few years, trying to figure out a good way to do it. Now I think I may have a reasonable approach. But I'd like to agree on what our goals are before I go on. So does anyone have anything to add to or subtract from the above?

This will be about the third or fourth iteration of the PSPP output subsystem. Perhaps we can get it right this time...

--
Ben Pfaff
http://benpfaff.org
