[Jprogramming] Documenting the i.-family: applications of dyad I.

Dan Bron Mon, 07 Oct 2013 17:22:25 -0700

In [1], regarding J's documentation, Joe Bogner wrote:
>  There are gaps [in the documentation] that can be 
>  frustrating. Most of the work I've done so far has 
>  been heavily depending upon i., E., e., I. ~. /. 
>  [and] the standard vocabulary pages are light on 
>  information.

The primitives Joe has been focusing on, i. E. e. I. ~. /. form a verb
family in J [3,4], so it's unsurprising that he encountered them together in
his recent work.  Because i. (the lookup function) underlies most of the
other primitives, it also characterizes their function, so we might call
this collection the i.-family.*  This i.-family has a long history and is
one of the most distinguished in the language (its members are both
high-level and useful, and therefore absolutely ubiquitous in J code).
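
To make the family resemblance concrete, here is a small sketch of how a few
members can be rephrased in terms of dyad i. on simple lists. The sample
nouns xs and ys are made up for the illustration, and the identities are only
meant to show the relationship, not how the interpreter actually implements
these primitives.

```j
NB. Hypothetical sample data.
xs =: 3 1 4 1 5
ys =: 1 5 9

NB. Each line compares a primitive with a phrase built on dyad i. ; each yields 1.
(ys e. xs) -: (xs i. ys) < # xs          NB. member: found means the index is below #xs
(~: xs) -: (i.~ xs) = i. # xs            NB. nub sieve: first occurrences index themselves
(~. xs) -: xs #~ (i.~ xs) = i. # xs      NB. nub: keep only the first occurrences
(xs -. ys) -: xs #~ (# ys) = ys i. xs    NB. less: keep items of xs not found in ys
```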

Earlier in the same thread [2], I wrote:
>  If someone has a specific article they'd like 
>  written up, or have a specific need they're 
>  currently addressing, I'd be happy to contribute 
>  an article on that topic.  Requests?

So, along the lines we've been discussing in that thread, I'd say the
i.-family members are strong candidates to target for expanded and
accessible documentation.

Joe also wrote:
>  I particularly like the uses simple & advanced.
>  It would also be great to find more longer 
>  examples of applications.

Here's a start at that: documenting more applications of dyad I. . 

In the recent Chat thread on scoring bike races, I wrote [5,6]:
>  The dyad I., interval index, is a sterling example
>  of the expressivity and economy of J's notation.
>
>  The core parts of the scoring algorithm are both
>  based on the dyad I. ... because those parts of 
>  the scoring system are defined in terms of ranges
>  (a race of 5 to 10 riders, a race of 11 to 20 riders 
>  vs. a total score from 0 to 20, a total score from
>  21 to 30, etc). 
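
As a rough illustration of classifying by ranges like these, here is a
minimal sketch. The boundaries and the name bounds are made up for the
example; they are not the actual scoring rules from [5,6]. With the left
argument sorted ascending, each result is the number of the interval the
corresponding right-argument item falls in, and each interval includes its
upper boundary.

```j
NB. Hypothetical field-size brackets:
NB.   0: up to 4 riders   1: 5 to 10   2: 11 to 20   3: more than 20
bounds =: 4 10 20

NB. One interval number per race size.
bracket =: bounds I. 3 5 10 11 20 21
bracket                          NB. 0 1 1 2 2 3
```

Note that the brackets are specified by three numbers only; no explicit
comparisons or conditionals appear anywhere.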

While participating in a recent code-golf challenge on StackExchange [7] I
was reminded of another use of I. to shorten and clarify code: character
ranges. The challenge was to implement a text case-conversion function
without using any built-ins or libraries supplied for that purpose.  Now,
there's a standard J idiom for doing this:

        UCALPHA&, {~ LCALPHA&, i. ] 

which is actually pretty concise.  That is, until you take into account
defining the constants UCALPHA and LCALPHA, which would take a minimum of 56
characters.  Given that the verb itself is only 23 characters (including
long embedded names), it seems like the biggest opportunity for reduction is
in how we identify character ranges.
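
For reference, here is one way the idiom might be set up and used. The
spelling of the two constants is my assumption (two quoted 26-letter
strings, which is where the 56 characters come from), and the name upcase is
hypothetical. As written, the idiom looks each character up among the
lowercase letters and selects the matching uppercase one, leaving every
other character alone.

```j
NB. Assumed definitions of the constants: 26 letters plus 2 quotes each.
LCALPHA =: 'abcdefghijklmnopqrstuvwxyz'
UCALPHA =: 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

NB. The idiom from above, given a (hypothetical) name.
upcase =: UCALPHA&, {~ LCALPHA&, i. ]

upcase 'Hello, World!'           NB. HELLO, WORLD!
```

Appending the argument to both alphabets is what lets non-letters pass
through: a character not found in LCALPHA is found in the argument itself,
and the same index then selects it back out of UCALPHA,y.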

If you've been following along (and you noticed the word "ranges"), you
won't be surprised to hear that I. is the method of choice.  I lifted some
old code from [8] and came up with:

        '@Z'(]+32*1=I.)&.(a.&i.)]

Not bad! The whole case folding algorithm expressed in three simple
operations: ] + 32 * 1 = I. (compare, multiply, add). Here, the biggest
savings came from expressing case conversion as arithmetic in codepoint
space (i.e. &.(a.&i.)), meaning we don't even have to mention lowercase
letters at all, but of course this was only possible because we were able to
characterize the uppercase letters as a range: that is, those characters
falling after '@' and up to 'Z' in the ASCII alphabet.
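
To see the three operations at work, here is a step-by-step sketch in
codepoint space. The sample text and the names cp, class, lower and fold are
made up for the illustration; only the final verb is the one quoted above.

```j
NB. Hypothetical sample text, converted to codepoints.
cp    =: a. i. 'Hi, J!'          NB. 72 105 44 32 74 33

NB. Compare: 0 at or below '@', 1 for 'A' through 'Z', 2 above 'Z'.
class =: (a. i. '@Z') I. cp      NB. 1 2 0 0 1 0

NB. Multiply and add: shift by 32 only where the class is 1.
lower =: cp + 32 * 1 = class     NB. 104 105 44 32 106 33

lower { a.                       NB. hi, j!

NB. The complete verb from above does the same thing in one go.
fold =: '@Z'(]+32*1=I.)&.(a.&i.)]
fold 'Hi, J!'                    NB. hi, j!
```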

Now, I don't think J is going to win this particular code-golf challenge
(against e.g. Perl which has built-in syntax for character ranges and
translation), but I still think the example of I. is pretty neat. Once
again, it has distilled what we really mean: we have expressed "case
conversion" in an algebraic, quantitative manner. 

Anyway, I think I'll take up the task of adding (or enhancing) the NuVoc
entry on dyad I. .  I think as a community we might flesh out the
documentation of the entire i.-family, both individually and in aggregate.

Feedback welcome.

-Dan

*  Another reasonable name would be the ~.-family, because in one way or
another all its members are concerned with uniqueness. But since i. is by
far the most prominent member, both in the field (being used ~10x as often
as ~. [9]) and at home (representing 1K lines of highly optimized C code
[A]), it is appropriate and customary to call it the i.-family.

[1] Gaps in J's current documentation, and potential models for improvement:
    http://www.jsoftware.com/pipermail/programming/2013-October/033632.html
[2] Improving J's documentation:
    http://www.jsoftware.com/pipermail/programming/2013-October/033630.html

[3] Performance improved for i.-family primitives:
    http://www.jsoftware.com/release/indexof.htm
[4] JfC entry on i. family:
    http://www.jsoftware.com/help/jforc/performance_measurement__tip.htm#_Toc191734576

[5] Scoring road races in J using I. : 
    http://www.jsoftware.com/pipermail/chat/2013-October/005366.html
[6] Improvement to scoring algorithm: 
    http://www.jsoftware.com/pipermail/chat/2013-October/005364.html

[7] codegolf.stackexchange challenge, "tolower":
    http://codegolf.stackexchange.com/questions/12760/converting-a-string-to-lower-case-without-built-in-to-lower-functions/12764#12764
[8] Jolf, "Fifth Hole": 
    http://www.jsoftware.com/pipermail/programming/2010-June/019665.html

[9] Relative frequency of J primitives:
    http://www.jsoftware.com/pipermail/chat/2010-November/004035.html
    According to that list, i.-family is ranked roughly as follows:

        i.      20x
        e.      12x
        -.      12x
        ~:       6x
        I.       3x
        ~.       1x
        i:       1x

    But note this doesn't distinguish monads from dyads, which will
    skew the results (for example, both monad i. and dyad ~: are very
    common, but neither one is actually in the i.-family). For a 
    fairer comparison, we'd have to account for this somehow, and 
    make other adjustments, like including dyad i.~ (self-reflection).

[A] Roger once wrote "The dyad i. can be used to justify the 
    exorbitant price for the product.":
    http://www.jsoftware.com/pipermail/programming/2011-August/023683.html

    For those interested, the source for dyad i. can be found at 
    https://github.com/openj/core/blob/master/vi.c (and visp.c for
    the sparse analog).  All told, the implementation is nearly 
    1000 lines of macro-dense code.

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
