I think your approach is fine.

That said, I picked up the aggregate functions from the jd tutorial.

In other words, something like this:

   require'jd' NB. and inspect the contents of jdwelcome_jd_
   jdrt ''
   jdrt'op'
   jdrt'reads'
   jdrt 'reads_aggregation'

Most of those jdrt calls just present indexes (or chapter headings).
jdrt'reads_aggregation' is the one that starts the tutorial explaining
the aggregation functions.

Anyways, down in that reads_aggregation tutorial is this bit, which I
think you might be interested in:

---------------- 8< ----- clip here ---------- 8< ----------------

db custom aggregations fns are defined in db custom.ijs
define avgnonneg to ignore negative values (null) in getting the average

   custom=: 0 : 0
aggavgx=: 3 : '(+/t)%#t=. (y>:0)#y'
aggavgx addagg 'avgnonneg'
)

   custom fwrite '~temp/jd/test/custom.ijs'
   jdloadcustom_jd_'' NB. load changes

---------------- 8< ----- clip here ---------- 8< ----------------

Here, 'test' (in the custom.ijs path) was the tutorial's argument to
jdadminx.

And, then the tutorial goes on to show that jd'info agg' shows the
newly defined 'avgnonneg' word, and that avgnonneg can be used in a jd
reads sentence.

Or: if what you are doing is conceptually an aggregation, I expect
that you can introduce definitions like this. (I did not need to do
that, for this rosettacode task.)
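
As an illustration of what that definition computes, here is a small
plain J sketch that runs the same verb body outside of Jd. This is not
from the tutorial, and the sample numbers are invented. It assumes, as
the tutorial's comment says, that negative values stand in for nulls:

   aggavgx=: 3 : '(+/t)%#t=. (y>:0)#y'  NB. same body as the tutorial's definition
   aggavgx 4 _1 6 _1 8                  NB. drops the two _1 values, averages 4 6 8
6

The filter (y>:0)#y keeps only the non-negative items, so the divisor
is the count of kept items rather than the length of the whole column.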

I hope this helps,


--
Raul

On Thu, Feb 10, 2022 at 9:14 PM Devon McCormick <devon...@gmail.com> wrote:
>
> Hi Raul,
>
> I've been using jd for the past few months - I have 3e8 simulations of
> poker games for 2 to 11 players - but am still a novice.  I basically pull
> in the entire column I want and manipulate it in J, which is fine, but I'd
> like to be able to do something more sophisticated, like apply a function
> to a column without reading in the whole thing first.
> However, the way I'm doing it has worked fine, even on my machine with only
> 16GB so I am not motivated enough to take the next step and do things
> within the database.
>
> If you figure out anything more advanced, which it sounds like you already
> have, I'm interested in what you have learned.
>
> Thanks,
>
> Devon
>
>
>
> On Thu, Feb 10, 2022 at 6:37 PM Raul Miller <rauldmil...@gmail.com> wrote:
>
> > Thanks again,
> >
> > --
> > Raul
> >
> > On Thu, Feb 10, 2022 at 6:07 PM Eric Iverson <eric.b.iver...@gmail.com>
> > wrote:
> > >
> > > Yes, you can do that.
> > >     jd'info ref' NB. list ref cols
> > >     jd'dropcol colnam'
> > >
> > > https://code.jsoftware.com/wiki/Jd/Ops_info#info
> > > https://code.jsoftware.com/wiki/Jd/Ops_drop#dropcol
> > >
> > >
> > > On Thu, Feb 10, 2022 at 5:44 PM Raul Miller <rauldmil...@gmail.com> wrote:
> > >
> > > > Excellent, thanks. Adding 'first' to the other column labels is easy
> > > > and makes sense to me.
> > > >
> > > > One other question: currently to test the script I am writing, I am
> > > > having to delete the jd files from outside of J. If I fail to do that,
> > > > I get the error "|Jd error: table has ref: op:droptable db:csvload"
> > > > when I re-run my script.
> > > >
> > > > In this context, I do not want to protect the data -- in this context,
> > > > testing my script for unexpected dependencies is key. Is there some
> > > > way of dropping that ref?
> > > >
> > > > Thanks,
> > > >
> > > > --
> > > > Raul
> > > >
> > > > On Thu, Feb 10, 2022 at 2:42 PM Eric Iverson <eric.b.iver...@gmail.com> wrote:
> > > > >
> > > > > Take a look at https://code.jsoftware.com/wiki/Jd if you have not
> > > > > already done so.
> > > > >
> > > > > The docs are weak and we depend more on the tutorials. There are
> > > > > tutorials that cover the area you are interested in.
> > > > >
> > > > > I loaded your files and have taken a quick look.
> > > > >
> > > > > First, I removed all the col relabel stuff as it is noise (and at
> > > > > times in the past has triggered bugs). That is, remove all the txt:
> > > > > and other stuff to make the statement as simple as possible.
> > > > >
> > > > > The by clause requires that there be an aggregation function for
> > > > > each col in the select clause. That is the important part you were
> > > > > missing, although the error was quite misleading.
> > > > >
> > > > > The following is a simple statement that works and has all the
> > > > > important elements from your example.
> > > > >
> > > > >
> > > > > jd'reads count patients.LASTNAME , count visits.VISIT_DATE by patients.PATIENTID  from patients, patients.visits'
> > > > >
> > > > > Perhaps it will help you get further along.
> > > > >
> > > > > If you get stuck again or are still stuck, please ask again.
> > > > >
> > > > > On Thu, Feb 10, 2022 at 2:12 PM Raul Miller <rauldmil...@gmail.com> wrote:
> > > > >
> > > > > > Embarrassing mis-statement:
> > > > > >
> > > > > > I wrote: "I am not trying to learn Jd"
> > > > > >
> > > > > > I meant: I am *now* trying to learn Jd
> > > > > >
> > > > > > --
> > > > > > Raul
> > > > > >
> > > > > > On Thu, Feb 10, 2022 at 1:58 PM Raul Miller <rauldmil...@gmail.com> wrote:
> > > > > > >
> > > > > > > http://rosettacode.org/wiki/Merge_and_aggregate_datasets
> > > > > > >
> > > > > > > After looking at this rosettacode task, I decided that Jd is
> > > > > > > probably best suited for the J task implementation. So, I am not
> > > > > > > trying to learn Jd (I had not had occasion to use it, previously).
> > > > > > >
> > > > > > > After a few minor mishaps, I've stumbled on an issue which I do
> > > > > > > not know how to resolve.
> > > > > > >
> > > > > > > (Previous mishaps: csvload fails with an error suggesting that
> > > > > > > the file does not exist if csvprepare has not been previously run.
> > > > > > > This behavior is undocumented, except by example in the tutorials.
> > > > > > > Also, I need to run the tutorial in a different J session from my
> > > > > > > testing work to prevent the tutorials from breaking my tests.)
> > > > > > >
> > > > > > > Anyways, I'm currently getting an Unrecognized aggregate function
> > > > > > > error, when trying to use the 'max' aggregate function. This
> > > > > > > should demonstrate where I'm at:
> > > > > > >
> > > > > > > require'jd pacman'
> > > > > > > load JDP,'tools/csv_load.ijs'
> > > > > > > F=: jpath '~temp/rosettacode/example/CSV'
> > > > > > > jdcreatefolder_jd_ CSVFOLDER=: F
> > > > > > >
> > > > > > > assert 0<{{)n
> > > > > > > PATIENTID,LASTNAME
> > > > > > > 1001,Hopper
> > > > > > > 4004,Wirth
> > > > > > > 3003,Kemeny
> > > > > > > 2002,Gosling
> > > > > > > 5005,Kurtz
> > > > > > > }} fwrite F,'patients.csv'
> > > > > > >
> > > > > > > assert 0<{{)n
> > > > > > > PATIENTID,VISIT_DATE,SCORE
> > > > > > > 2002,2020-09-10,6.8
> > > > > > > 1001,2020-09-17,5.5
> > > > > > > 4004,2020-09-24,8.4
> > > > > > > 2002,2020-10-08,
> > > > > > > 1001,,6.6
> > > > > > > 3003,2020-11-12,
> > > > > > > 4004,2020-11-05,7.0
> > > > > > > 1001,2020-11-19,5.3
> > > > > > > }} fwrite F,'visits.csv'
> > > > > > >
> > > > > > > csvprepare 'patients';F,'patients.csv'
> > > > > > > csvprepare 'visits';F,'visits.csv'
> > > > > > >
> > > > > > > csvload 'patients';1
> > > > > > > csvload 'visits';1
> > > > > > >
> > > > > > > jd'ref patients PATIENTID  visits PATIENTID'
> > > > > > >
> > > > > > > echo jd ([echo) deb {{)n
> > > > > > >   reads
> > > > > > >      p.PATIENTID,
> > > > > > >      LASTNAME:p.LASTNAME,
> > > > > > >      first v.VISIT_DATE
> > > > > > >     by
> > > > > > >      p.PATIENTID
> > > > > > >     from
> > > > > > >       p:patients,
> > > > > > >       v:p.visits
> > > > > > > }} -.LF
> > > > > > >
> > > > > > > Now, ... one of my thoughts was that maybe this is a type error,
> > > > > > > indicating that 'max' does not have a definition for the type of
> > > > > > > data in this column. However, replacing 'max' with 'first' (which
> > > > > > > should be defined for any type of column) also gives an
> > > > > > > "Unrecognized aggregate function" error.
> > > > > > >
> > > > > > > So...
> > > > > > >
> > > > > > > (1) What am I doing wrong here, and
> > > > > > >
> > > > > > > (2) What should I have looked at to discover this information?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > --
> > > > > > > Raul
> > > > > >
> > > > > > ----------------------------------------------------------------------
> > > > > > For information about J forums see http://www.jsoftware.com/forums.htm
> > > > > >
> > > > >
> >
>
>
> --
>
> Devon McCormick, CFA
>
> Quantitative Consultant
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
