I'm investigating working with data stored in inverted tables.
<http://www.jsoftware.com/jwiki/Essays/Inverted_Table>

Some example data:
hdr=: ;: 'ID DOB Sex Nat Height Weight Arrests'
x0=: 'ID',"1 ];._1 ' ',": 10000+ 40 [EMAIL PROTECTED] 4
x1=: 1998+ 40 [EMAIL PROTECTED] 4
x2=: (40 [EMAIL PROTECTED] 2){ >'Female';'Male'
x3=: (40 [EMAIL PROTECTED] 3){ >'NZ';'US';'CH'
x4=: 1.2 +  1 * ([EMAIL PROTECTED] 0)
x5=:  60 + 12 * ([EMAIL PROTECTED] 0)
x6=: 40 [EMAIL PROTECTED] 6
invtble=: x0;x1;x2;x3;x4;x5;x6

I came up with the following to give the frequency of each unique row
in an inverted table:
tfreq=: #/.~@:|:@:(i.&>)~

key=: 1 3 2{invtble
,.each tfreq key
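
To make the steps inside tfreq easier to follow, here is a small sketch that unpacks it on a made-up two-column table. The names k, idx and rows are only for illustration and are not part of the data above.

```j
NB. k is a made-up two-column inverted table (not part of the data above)
k=: (>'NZ';'US';'NZ';'US') ; 1 2 1 1

tfreq=: #/.~@:|:@:(i.&>)~

idx=: k i.&> k     NB. self-index of each column, one row per column: 0 1 0 1 ,: 0 1 0 0
rows=: |: idx      NB. transpose so that each item describes one table row
rows #/. rows      NB. count the rows sharing each distinct item: 2 1 1
tfreq k            NB. same result: 2 1 1
```

Rows 0 and 2 of k are identical (NZ, 1), so the first count is 2; the other two rows are unique.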

The following verb makes a frequency table:
tfreqtble=: [: tsort tnub , <@:tfreq

,.each tfreqtble key

Or a more generalised version:
tkeytble=: [: tsort ([: tnub [) , [: boxopen ]

,.each key tkeytble tfreq key

And I can calculate, say, the sums of the data columns grouped by the
key columns, e.g.:
tkeysum=: ] +//.&.>~ <@:tindexof~@:[

dat=: 4 5 6{invtble
,.each key tkeysum dat

But what I think I really want is an adverb tkey that does what /. does,
but on inverted tables.
Then instead of defining tkeysum, tkeymin, tkeymax ..., I could do this:
  key +/tkey  dat
  key <./tkey dat
  key >./tkey dat
  ...
Or in a sorted table:
,.each  key tkeytble key +/tkey dat
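
One possible starting point, offered as a sketch rather than a finished answer: take tkeysum and replace the fixed +/ with the operand u of an explicit adverb. This assumes tindexof behaves as in the Inverted Table essay (restated below as I understand it); k and d are made-up sample tables, not the data above.

```j
NB. tindexof restated from the Inverted Table essay (assumed definition)
tindexof=: i.&>~@[ i.&|: i.&>

NB. sketch of tkey: tkeysum with +/ replaced by the adverb's operand u
tkey=: 1 : '<@:tindexof~@:[ u/.&.> ]'

NB. made-up sample tables: two key columns, two data columns
k=: (>'NZ';'US';'NZ';'US') ; 1 2 1 1
d=: 10 20 30 40 ; 1.5 2.5 3.5 4.5

k +/tkey d     NB. sums per distinct key row:   40 20 40 ; 5 2.5 4.5
k >./tkey d    NB. maxima per distinct key row: 30 20 40 ; 3.5 2.5 4.5
```

The left tine boxes the self-index of the key table once, and u/. is then applied under open to each data column with that same index, so every column is grouped identically.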

Any pointers or other approaches to consider gratefully accepted!

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
