On Wed, May 02, 2007 at 09:39:34AM +0800, John Darrington wrote:
> Having thought about this some more and tried a few experiments, I'm
> in favour of approach 2.
>
> We need not duplicate category.[ch] rather we can generalise it, so
> that the new structure can do the work of both.

Okay. I had made some progress using approach 2, but have been
sidetracked by final exams. I'll get back to it next week.

-Jason

>
> J'
>
>
> On Sun, Apr 15, 2007 at 03:06:17PM -0400, Jason Stover wrote:
> To have a glm procedure, pspp needs a data structure to handle
> interactions. An interaction can be thought of as another variable
> which is a function of two or more variables, usually categorical,
> like this:
>
> Variable 1   Variable 2   Interaction
>     A            B            AB
>     E            B            EB
>     A            C            AC
>     E            C            EC
>
>
> ...etc. The interaction term could be created in one of two ways:
> Either 1) create a new variable in the dictionary that corresponds to
> the interaction, or 2) create a new 'interaction' data structure
> that contains all necessary mappings between existing variables and
> the value of the interaction.
>
> Approach 1 would add a variable to the dictionary, but would not
> create any more observations in the data set. It would make coding any
> procedures that use interactions easier than approach 2, because doing
> so would mean the procedure doesn't need to know about much special
> code to handle interactions. It would also prevent the need for having
> any more obscure string-values-to-binary-vector code like that in
> category.[ch]. Approach 1 would still require the creation of some
> code to create the interaction, though it may not require the creation
> of a specialized "interaction" data structure to be available for use
> by all procedures.
>
> Approach 2 doesn't require adding anything to the dictionary, but it
> does mean that any procedures that need to use interactions would have
> to create those interactions themselves. These interactions would
> therefore be lost after the procedure exits, meaning that any other
> procedure that needs interactions would have to recreate
> them. Approach 2 also means writing more code that partly duplicates
> the code already in category.[ch].
>
> I favor approach number 1, but before I fiddle with the
> dictionary, I thought I should ask.
>
> --
> PGP Public key ID: 1024D/2DE827B3
> fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
> See http://pgp.mit.edu or any PGP keyserver for public key.
>
>

_______________________________________________
pspp-dev mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/pspp-dev
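
For illustration only, approach 2 from the thread (a separate 'interaction' structure that maps values of existing variables to a value of the interaction) might be sketched roughly as below. None of this is PSPP code: `struct interaction`, `interaction_init`, `interaction_level`, `interaction_label`, and the fixed size limits are all hypothetical names and assumptions made for this sketch, and it handles only two string-valued variables.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LEVELS 64      /* Arbitrary cap chosen for this sketch. */
#define MAX_VALUE_LEN 15   /* Longest value string stored per variable. */

/* Hypothetical: one distinct combination of values, e.g. ("A", "B"). */
struct interaction_level
  {
    char v1[MAX_VALUE_LEN + 1];
    char v2[MAX_VALUE_LEN + 1];
  };

/* Hypothetical: every combination seen so far, in order of first use. */
struct interaction
  {
    size_t n_levels;
    struct interaction_level levels[MAX_LEVELS];
  };

static void
interaction_init (struct interaction *ia)
{
  assert (ia != NULL);
  ia->n_levels = 0;
}

/* Returns the index of the level for (V1, V2), adding a new level if
   this combination has not been seen before.  Returns -1 if a value is
   too long or the table is full. */
static int
interaction_level (struct interaction *ia, const char *v1, const char *v2)
{
  size_t i;

  assert (ia != NULL && v1 != NULL && v2 != NULL);
  if (strlen (v1) > MAX_VALUE_LEN || strlen (v2) > MAX_VALUE_LEN)
    return -1;

  /* Linear search: compare both values so ("A", "BC") and ("AB", "C")
     stay distinct. */
  for (i = 0; i < ia->n_levels; i++)
    if (strcmp (ia->levels[i].v1, v1) == 0
        && strcmp (ia->levels[i].v2, v2) == 0)
      return (int) i;

  if (ia->n_levels == MAX_LEVELS)
    return -1;
  strcpy (ia->levels[ia->n_levels].v1, v1);
  strcpy (ia->levels[ia->n_levels].v2, v2);
  return (int) ia->n_levels++;
}

/* Writes a display label such as "AB" for level IDX into BUF.
   Returns 0 on success, -1 if IDX is out of range or BUF is too small. */
static int
interaction_label (const struct interaction *ia, size_t idx,
                   char *buf, size_t size)
{
  int n;

  assert (ia != NULL && buf != NULL);
  if (idx >= ia->n_levels)
    return -1;
  n = snprintf (buf, size, "%s%s", ia->levels[idx].v1, ia->levels[idx].v2);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

A procedure would call interaction_level() once per case with the two variables' values and use the returned index as the value of the interaction. As the original message notes, a table built this way lives only as long as the procedure that built it.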
