Re: [Jchat] Fw: [Jprogramming] DataFrames in J

Raul Miller Tue, 15 Feb 2022 09:50:03 -0800

Ok... this gives me an opportunity to clean up the code slightly:

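NB. PLUK: boxed lines of the CREDO file (CR removed, split on the file's final character, normally LF)
PLUK=: <;._2 (fread '~user/CREDO')-.CR
pluk=: {{ y=.":y  assert.*/y e.'0123456789'
  r=. y,':'
  for_line.PLUK do.A=.;line
    len=.(#y)<.A i.' '                 NB. digits to compare: key length, capped at #y
    wild=. y ,&I.&('0'=len{.]) A       NB. positions holding '0' in either y or A
    NB. keep the line's text if y and the key agree once the wild positions are masked
    if. y */ .=&('0' wild} len{.]) A do.r=.r,(}.~i.&' ')A end.
  end.r
}}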
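
For anyone reading along, the comparison inside the loop can be pulled out into a verb of its own. This is only a sketch of how I read the test in pluk, not a replacement for it: the name match and its argument order (query on the left, line key on the right) are made up for illustration, and it assumes both arguments are character lists of digits.

```j
NB. Hypothetical helper, not part of the original pluk.
NB. x: query digits, y: key digits of one line (both character lists).
NB. Result is 1 when x and y agree on their shared prefix,
NB. treating '0' on either side as a wildcard.
match=: {{
  len=. x <.&# y                  NB. compare only the shared prefix
  a=. len{.x
  b=. len{.y
  */ (a=b) +. (a='0') +. b='0'    NB. equal, or '0' in either position
}}

NB. '13510' match '03'   gives 1  (the '0' in the key matches any digit)
NB. '13510' match '2'    gives 0  (first digits differ, no wildcard)
```

With this, the if. line in pluk amounts to testing y match (A i.' '){.A for each line A.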


I hope this helps,

-- 
Raul

On Tue, Feb 15, 2022 at 7:56 AM 'Bo Jacoby' via Chat <[email protected]> wrote:
>
> I am grateful for Raul's implementation. I am going to study it. Thank you 
> for the work.
>
> The first line of Raul's program is
>     CREDO=: <;._2 (fread '~user/CREDO')-.CR
> I guess it should have been
>     PLUK=: <;._2 (fread '~user/CREDO')-.CR
>
> Raul is concerned about the proofreading. When a file such as CREDO is 
> given, the job is to analyze the text. Usually, however, the database is 
> built up in steps. The relationships between records should 
> match the relationships between the line numbers: 0>1, 1><2, 10<>01, so that 
> header lines are superordinate to detail lines, and so on.
>
> Thanks!
>
> Bo.
>
>
>
>
>
> On Tuesday, 15 February 2022 at 01:32:50 CET, Raul Miller 
> <[email protected]> wrote:
>
>
>
>
>
> That's kind of difficult to read...
>
> But ... it looks like the BASIC pluk implementation sets up a loop,
> reading numbers, and printing lines of text. That doesn't feel right
> for J -- it would be nicer to have a textual result that can be used
> elsewhere. So, I wrote a pluk which takes a numeric argument along the
> lines of what the BASIC pluk supports, and returns a character list
> similar to what the BASIC program displays.
>
> This implementation requires that the text of the CREDO file be saved
> in the J user directory. (And, if you are copying and pasting from the
> pdf, beware that the final line needs to begin with a space character.
> When I copied and pasted, that space character was lost.)
>
> Anyways, here's the implementation:
>
> CREDO=: <;._2 (fread '~user/CREDO')-.CR
> pluk=: {{ y=.":y  assert.*/y e.'0123456789'
>   r=. y,':'
>   for_line.CREDO do.A=.;line
>     len=.(#y)<.A i.' '
>     wild=. (I.'0'=len{.y),I.'0'=len{.A
>     if. y */ .=&('0' wild} ])&(len{.]) A do.r=.r,(}.~i.&' ')A end.
>   end.r
> }}
>
> Example use:
>   pluk '13510'
> 13510: CREDO IN SPIRITUM QUI CUM PATRE ET FILIO SIMUL ADORATUR AMEN
>   pluk '13520'
> 13520: CREDO IN SPIRITUM QUI CUM PATRE ET FILIO SIMUL GLORIFICATUR AMEN
>
> While this approach is technically somewhat general (you can
> construct numbers which reference sequences), I am not sure how I would
> proofread a file for correctness (for hypothetical files other than
> CREDO).
>
> I guess this feels like it's about half way between a compression
> system and an indexing system.
>
> I hope this helps,
>
> --
> Raul
>
> On Mon, Feb 14, 2022 at 4:28 PM 'Bo Jacoby' via Chat <[email protected]> 
> wrote:
> >
> > Ric requested thoughts on database structure and asked for examples.
> >
> > Consider this example.
> >
> > https://www.dropbox.com/s/gj8r19hd6exyw3s/Norddata89.pdf?dl=0
> >
> > The database browser is 8 lines of BASIC. I have not converted it into J. I 
> > have no experience with file handling in J.
> >
> > The database example is in Latin. The explanation is in Danish.
> >
> > Have fun!
> >
> > Thanks. Bo.
> >
> >
> >
> >
> >
> > ----- Forwarded message -----
> >
> > From: Ric Sherlock <[email protected]>
> > To: Programming JForum <[email protected]>
> > Sent: Monday, 14 February 2022 at 02:31:51 CET
> > Subject: Re: [Jprogramming] DataFrames in J
> >
> >
> > Hi Pascal,
> > From your email examples this looks promising.
> > I'll need to spend a bit of time with your repo though to get a better
> > understanding of what is going on here & what I think about it.
> > Thanks!
> >
> > On Mon, Feb 14, 2022 at 12:27 PM 'Pascal Jasmin' via Programming <
> > [email protected]> wrote:
> >
> > > updated kv to support inverted tables.
> > >
> > > https://github.com/Pascal-J/kv/blob/main/kv.ijs
> > >
> > >
> > > The output for these functions is near the bottom of
> > > https://raw.githubusercontent.com/Pascal-J/kv/main/README.md
> > >
> > > 'Id Name Job Status' kv ,.&.:>"1 |: maybenum each > ','cut each cutLF 0 : 0 NB. from your example
> > >
> > > 3,Jerry,Unemployed,Married
> > > 6,Jan,CEO,Married
> > > 5,Frieda,student,Single
> > > 1,Alex,Waiter,Separated
> > > )
> > >
> > >
> > > A one line dsl version:
> > >
> > > itdsl kvdsL 'Id Name Job Status `3 6 5 1 ; Jerry Jan Frieda Alex ;Unemployed:CEO:student:Waiter; Married Married Single Separated'
> > > NB. note extra garbage spaces
> > >
> > > In a dictionary, an inverted table is stored as a key (the field name) and
> > > an associated boxed table (the field values, each of which can be longer
> > > than 1 character).
> > >
> > > I have an inverted table display (keys as column headers) too.  It can
> > > optionally show just the fields you want.
> > >
> > > A query that retrieves records where Job is 'CEO' or 'student' then
> > > displays in IT format
> > >
> > > itdisp ('CEO;student' padstrmatch 'Job' kvget ]) kvQ it_kvtest_
> > >
> > > or partial 3 column display without the special ; parsing
> > >
> > > 'Id Name Job' itdisp1 (('CEO';'student') padstrmatch 'Job' kvget ]) kvQ it
> > >
> > > kvQ is a "query engine" (too simple to call it that) that takes a boolean
> > > function to filter records "on set of properties/fields"
> > >
> > > Basically an inverted table as a dictionary is a dictionary that only
> > > contains the properties corresponding to fields.  Because kv can store
> > > other dictionaries, if you have additional information to store, it can be
> > > in "peer" keys to your IT dictionary.
> > >
> > >
> > >
> > >
> > > On Sunday, February 13, 2022, 08:49:02 a.m. EST, Ric Sherlock <
> > > [email protected]> wrote:
> > >
> > >
> > >
> > >
> > >
> > > Inspired by recent threads, I've started experimenting with a DataFrame
> > > structure in J.
> > > I began by building off the 'general/misc/inverted' utilities so that a
> > > DataFrame is just a 2-row table, where the first row is a list of labels &
> > > the 2nd is an inverted table.
> > >
> > > It can be installed as 'tables/dataframe' from the following github
> > > repository: https://github.com/tikkanz/jdataframe
> > >
> > > The JArrow bindings that Aaron pointed me at a while back represent a
> > > similar data structure as a 2-column table where the labels are in the
> > > first column and the "columns" are rows in the 2nd column.
> > > (https://github.com/tikkanz/JArrow)
> > > I think I need to play with both for a while to see which works better.
> > >
> > > Obviously there is a lot more to do, but I thought I'd share in case
> > > anyone has any thoughts, suggestions or wants to help take the idea
> > > further.
> > > Ric
> > > ----------------------------------------------------------------------
> > > For information about J forums see http://www.jsoftware.com/forums.htm
