Re: [Jchat] [Jprogramming] Report on the J wiki meeting of January 27, 2022

Ian Clark Wed, 02 Feb 2022 10:32:16 -0800

Scott wrote:
> exist in K (everything is a rank 1 list, hooked together by dictionaries).

This is starting to make a lot more sense to me. It throws me back to a
past life as an RDB (relational data base) theorist of the Codd/Date school,
circa 1973. Which was before Square/Sequel/SQL came along. I can at last
appreciate how K supports a relational data base already reduced to 3rd
Normal Form.
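
To make the quoted remark concrete, here is a minimal sketch in Python (not K) of a table held as a dictionary of equal-length rank-1 lists, one list per column. The column names, the data and the helper functions `row` and `where` are invented for illustration; they are not K primitives.

```python
# A table as a dictionary of equal-length rank-1 lists (one list per column).
# Names and data are hypothetical, chosen only for illustration.
table = {
    "name": ["Ann", "Bob", "Cy"],
    "dept": ["ops", "dev", "dev"],
}

def row(t, i):
    """Read row i by indexing every column list at position i."""
    return {col: values[i] for col, values in t.items()}

def where(t, col, value):
    """Keep the rows whose entry in `col` equals `value`, column by column."""
    keep = [i for i, v in enumerate(t[col]) if v == value]
    return {c: [vals[i] for i in keep] for c, vals in t.items()}

devs = where(table, "dept", "dev")
```

The dictionary supplies the column names and each value stays a flat list, so no array of rank higher than 1 is needed to hold the table.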

And the key to it all is Dictionaries. Thud!

I'm amazed at myself. All these years I've totally ignored K, as a
dumbed-down version of APL for minds that can only think in 2 dimensions or
less. Which goes to show how crippled you become when you hit on a
suboptimal data model as your intellectual foundations. (See
https://code.jsoftware.com/wiki/Essays/Izzy_The_Genius_Tailor ).

I shall ignore K no longer. (Thanks Rob for the session log.)

By 1973 IBM Peterlee Scientific Centre had built a working relational data
base engine, IS/1, which became PRTV:
https://en.wikipedia.org/wiki/IBM_Peterlee_Relational_Test_Vehicle_(PRTV)

The IBM salesmen were already smarting under the lash of Ted Codd, who was
going around trashing IMS (IBM Information Management System, which proclaimed
Hierarchies as the fundamental building block of All Human Thought). They
made ideological war on the Peterlee engine to make sure it *was* forgotten –
pronto. But the Peterlee database had already pioneered some cool data
mining prototypes among IBM's UK customers in areas like town planning,
plus underpinning the operating system of the (now forgotten) Future Series
of mainframes, then being developed at IBM ASDD Mohansic.

I see that "System R" (IBM San Jose Research, 1974) has entered history as
the very first working relational dbms (data base management system).
False. Peterlee's dbms predated it by at least 2 years.

Calling a data engine a "relational" database is like talking about "lactic
cheese" or "extraterrestrial planets". RDB in Ted Codd's mind (he was
trained as a mathematician) wasn't a data base engine, it was an
abstraction, a schematic for data analysis applicable to any collection of
data. Its unique selling point was how it eliminated undesirable data
dependencies via a process called normalization. Hence: 3rd normal form.

To an RDB theorist, it didn't matter how a relation was implemented, whether
the tuples (2-tuples or "duples", in 3rd normal form) were computed,
streamed or physically stored. In-flight, you could unplug one and plug in
another.
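
A minimal sketch of that point in Python, assuming a relation is modelled as nothing more than an iterable of 2-tuples. The names `stored_squares`, `computed_squares` and `lookup` are hypothetical. The consumer cannot tell whether the tuples are stored or computed on demand.

```python
from typing import Iterable, Iterator, List, Tuple

Pair = Tuple[int, int]

# Stored: the tuples physically exist in a list.
stored_squares: List[Pair] = [(0, 0), (1, 1), (2, 4), (3, 9)]

def computed_squares() -> Iterator[Pair]:
    """Computed: each tuple is produced on demand, nothing is stored."""
    for n in range(4):
        yield (n, n * n)

def lookup(relation: Iterable[Pair], key: int) -> List[int]:
    """A consumer that assumes only 'an iterable of 2-tuples'."""
    return [value for k, value in relation if k == key]

from_stored = lookup(stored_squares, 3)
from_computed = lookup(computed_squares(), 3)
```

Swapping one source for the other leaves `lookup` untouched, which is the "unplug one and plug in another" property in miniature.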

Old stagers went around groaning "so now I've got to call all my files
'relations' have I?" Yes – and what's worse, all your programs too. Plus
the things you already knew as relations, like .EQ. .GE. .GT. etc. So watch
out, banks and insurance companies, your entire programming department is
all set to disappear up your RDB.

To an IBM salesman, this was the zombie apocalypse; it stopped you getting
to sleep at night. You'd totally lose control of your account (…did you
know that IBM thought it controlled you in those days? – not the first
vendor to do so, and by no means the last!).
It Wasn't Going To Happen! – they said.
It did. In the end the unthinkable happened: IMS embraced SQL. Though by
then a new generation had come along and nobody noticed.

PRTV had a defiantly "mathematical" interface. Apart from the usual
set-operations: union, intersection, difference, there was basically only
one operation on relations, viz "join". This was symbolized by an
extravagantly decorated asterisk. Square/Sequel replaced all that
extravagant decoration with plain English.
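
For readers who have not met "join", here is a minimal Python sketch of the textbook natural join. It assumes that this is roughly what the operation meant; it is not PRTV's actual interface or notation. The function `natural_join` and the sample relations are hypothetical.

```python
def natural_join(r, s):
    """Join two relations (lists of dicts) on their shared attribute names."""
    if not r or not s:
        return []
    shared = set(r[0]) & set(s[0])
    joined = []
    for a in r:
        for b in s:
            # Pair up tuples that agree on every shared attribute.
            if all(a[k] == b[k] for k in shared):
                joined.append({**a, **b})
    return joined

emp = [{"name": "Ann", "dept": "ops"}, {"name": "Bob", "dept": "dev"}]
dept = [{"dept": "ops", "site": "Leeds"}, {"dept": "dev", "site": "York"}]
result = natural_join(emp, dept)
```

When the two relations share no attribute names, the same function yields the Cartesian product, one reason a single "join" can cover a lot of ground.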

On Tue, 1 Feb 2022 at 16:01, Scott Locklin <[email protected]> wrote:

> One of the problems you're going to run into right away with making
> dictionaries a core part of J is the concept of rank, which doesn't really
> exist in K (everything is a rank 1 list, hooked together by dictionaries).
>
> At the end of the day, what I think most people want from maps/dictionaries
> is a sort of syntax sugar for relatively flat boxes. When you slurp in some
> json object in current-year J, it is a box, possibly with a sort of
> suggestive index of key names for the json values. Same with jd-like tables
> (with a few column names you can more or less select by). It would be much
> nicer and more "current year" if you could slurp in some json and then read
> the fields off like an associated list in R. You could write verbs like
> this; making inverted tables with headers.
> https://code.jsoftware.com/wiki/Essays/Inverted_Table
>
> FWIW the reason Pandas looks the way it does is both inspiration from R's
> data.tables/frames (which are associated list rank 1 vectors), but also
> from J and K, which Wes is a big fan of.
>
> Anyway the K way is pretty cool, but just creating an idiom of J for
> "convenient selectors" for the "slurping in a json" and "selecting columns
> of a table" use cases probably makes everyone happy. No need to specify
> anything beyond that.
>
> -SL
>
> On Tue, Feb 1, 2022 at 12:50 PM Danil Osipchuk <[email protected]>
> wrote:
>
> > I would second looking at how k approaches dictionaries regarding
> > operations available and their domain. Uniformity between lists,
> > dictionaries and function calls is really elegant there too (although
> > likely out of reach - but maybe still having some parts of syntactic
> > sugar is possible - like having some forms [  ]  of notation, yes identity
> > functions have to be renamed maybe to x. and y. making it a drastic
> > change, but can we dream?).
> >
> > But putting syntax aside, at the very minimum, to me, if it is not a
> > tree-like structure with keys of arbitrary type, such a thing is not of
> > much added value.  After all J is already extremely well equipped to handle
> > rectangular data and people can manage it perfectly well. It is precisely
> > in tree-like data structures that things get awkward quickly.
> >
> > Regards,
> > Danil
> >
> > Tue, 1 Feb 2022, 10:12 Raul Miller <[email protected]>:
> >
> > > Another model to think about here is K's implementation, where the
> > > symbol table is a tree, and thus supports contexts which do not have
> > > equal length columns.
> > >
> > > --
> > > Raul
> > >
> > > On Mon, Jan 31, 2022 at 9:54 PM Ric Sherlock <[email protected]>
> > > wrote:
> > >
> > > > > otoh, we already have a binding to R where you can deal
> > > > > with dataframes easily – do we want to “compete” here, too?
> > > > >
> > > > > In terms of whether to just adopt/use one of
> > > > > R's/Pandas'/Polars'/Julia's dataframes rather than reinventing the
> > > > > wheel - I think for me that if it were possible to use J primitives
> > > > > to interact with the dataframe directly, that would be really
> > > > > compelling. If that were possible with a binding to an external
> > > > > implementation, then I'm all for it. If not then I think it is
> > > > > still worth discussing J native dataframes.
> > > >
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
