I'm investigating ways of working with data stored in inverted tables:
<http://www.jsoftware.com/jwiki/Essays/Inverted_Table>
Some example data:
hdr=: ;: 'ID DOB Sex Nat Height Weight Arrests'
x0=: 'ID',"1 ];._1 ' ',": 10000+ 40 ?@$ 4
x1=: 1998+ 40 ?@$ 4
x2=: (40 ?@$ 2){ >'Female';'Male'
x3=: (40 ?@$ 3){ >'NZ';'US';'CH'
x4=: 1.2 + 1 * (40 ?@$ 0)
x5=: 60 + 12 * (40 ?@$ 0)
x6=: 40 ?@$ 6
invtble=: x0;x1;x2;x3;x4;x5;x6
I came up with the following to give the frequency of each unique row
in an inverted table:
tfreq=: #/.~@:|:@:(i.&>)~
key=: 1 3 2{invtble
,.each tfreq key
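To check tfreq by hand, here is a small sketch on a made-up table (the
names t, nat and yr are just for illustration, not part of the data above).
Each column is self-indexed, the index columns are transposed into rows,
and the rows are counted with key:

```j
NB. Same definition as above, repeated so this snippet stands alone.
tfreq=: #/.~@:|:@:(i.&>)~

NB. Hypothetical 4-row inverted table with two columns.
nat=: > 'NZ';'US';'NZ';'NZ'
yr=: 1998 1999 1998 2000
t=: nat;yr

t i.&> t    NB. per-column self-indices: 0 1 0 0 ,: 0 1 0 3
tfreq t     NB. 2 1 1 : rows 0 and 2 match, rows 1 and 3 are unique
```

The counts come out in order of first appearance of each unique row.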
The following verb makes a frequency table (tsort and tnub are the
verbs from the essay linked above):
tfreqtble=: [: tsort tnub , <@:tfreq
,.each tfreqtble key
Or a more generalised version:
tkeytble=: [: tsort ([: tnub [) , [: boxopen ]
,.each key tkeytble tfreq key
And I can calculate, say, the sums of the data columns grouped by the
set of key columns (tindexof is also from the essay), e.g.:
tkeysum=: ] +//.&.>~ <@:tindexof~@:[
dat=: 4 5 6{invtble
,.each key tkeysum dat
But what I think I really want is an adverb tkey that does what /. does,
but on inverted tables.
Then instead of defining tkeysum, tkeymin, tkeymax ..., I could do this:
key +/tkey dat
key <./tkey dat
key >./tkey dat
...
Or in a sorted table:
,.each key tkeytble key +/tkey dat
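For what it's worth, here is a rough sketch of the shape I imagine such an
adverb taking, generalising tkeysum by replacing +/ with the adverb's
argument u. The definition of tindexof is how I understand it from the
essay (an assumption on my part), and the tables k and d are made up purely
for checking by hand:

```j
NB. tindexof as I understand it from the essay (assumption): for each
NB. row of table y, the index of the first matching row of table x.
tindexof=: i.&>~@[ i.&|: i.&>

NB. Sketch only: apply u/. to each data column of y, keyed by rows of x.
tkey=: 1 : '<@tindexof~@[ u/.&.> ]'

NB. Hypothetical miniature key and data tables.
k=: (> 'NZ';'US';'NZ';'NZ') ; 1998 1999 1998 2000
d=: 10 20 30 40 ; 1.5 2.5 3.5 4.5

k +/ tkey d     NB. boxes: 40 20 40 and 5 2.5 4.5
k >./ tkey d    NB. boxes: 30 20 40 and 3.5 2.5 4.5
```

The results are in order of first appearance of each unique key row, the
same order tfreq uses, so they should line up with tnub of the key.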
Any pointers or other approaches to consider gratefully accepted!