Have you tried storing the data as a vector and the linear indices of the items in another vector? The final step would be to fill the data into a sparse vector and then reshape it to rank 2.
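
A minimal sketch of that idea on a made-up 3-by-4 table. The names shape, rc, vals, lin, flat and mat are only for illustration, and a dense vector stands in for the sparse one to keep the example small:

```j
shape=: 3 4                          NB. hypothetical table: 3 rows, 4 columns
rc=: 0 1 , 0 3 , 1 0 , 2 2 ,: 2 3    NB. (row, column) of each filled cell
vals=: 10 20 30 40 50                NB. the items, in the same order as rc

lin=: shape #. rc                    NB. linear indices: 1 3 4 10 11

flat=: vals lin} (*/ shape) $ 0      NB. a single amend into a flat vector
mat=: shape $ flat                   NB. reshape to rank 2 at the very end
```

The point of collecting vals and lin first would be that the large array is amended once at the end instead of once per row inside the loop.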
However, 37% filled is not that sparse; IMO sparse means at most a few percent filled. Using a sparse array at 37% fill is overkill. What you need is more physical RAM.

On Tue, 11 Oct 2022 at 1:29 AM David Lambert <b49p23t...@gmail.com> wrote:

> This was my original code which ran `forever'. Were the amendments
> truly in place? One array is sparse, the other pre-allocated. Comments
> show observations on task manager. I don't have much experience with
> Windows beyond the usual Office programs. (PS. I now realize I could
> have discarded the "T" and stored the remaining number in the sparse
> array.)
>
> NB. According to task manager, 7GByte free (of 16 GB)
> NB. and j process steadily fluctuated by 10 Mbytes.
>
> NB. is this a mapped file issue on Windows 10
> NB. or were the amendments not in place?
>
> NB. file detail
> NB. fields into rows * columns
> NB. 67653078 *inv 1183748 * 2141
> NB. 37.4618 percent filled
>
> NB. c:/Users/user/Downloads/j904_win64/j904/bin/jconsole.exe
> NB. JVERSION
> NB. Engine: j904/j64avx/windows
> NB. Beta-e: commercial/2022-07-16T19:25:02
> NB. Library: 9.04.03
> NB. Platform: Win 64
> NB. Installer: J904 install
> NB. InstallPath: c:/users/user/downloads/j904_win64/j904
> NB. Contact: www.jsoftware.com
>
> require 'jmf'
>
> testfile=:'c:/Users/user/temp/tc.csv'
> datafile=:'c:/Users/user/ZW/kaggle.com/bosch-production-line-performance/train_categorical.csv'
>
> NB. INF {~ 0 indexes rows
> NB. gets data of first row
> indexes=: (>:@{. + [: i.@<: -~/)@({ ~ 0 1&+)~
>
> tokenize=: 3 :0 NB. y is the literal
>   rows=. _1 , I. LF = y
>   row_tally=. <: # rows
>   row=. col=. 0
>   k=. _1 NB. current data index
>   col_tally=. >: +/ ',' = y {~ 0 indexes rows NB. tally of columns
>   data=: a: $~ col_tally + +/ 'T' = y NB. columns + those with data, skipping ID
>   NB. coor shall be sparse
>   NB. coor=. ((<: # rows) , col_tally) $ _1
>   coor=: 1 $. ((<: # rows) , col_tally) ; 0 1 ; _1 NB. coordinates of data
>   while. row < 9 >. row_tally do.
>     fields=. ([: <;._2 ,&',') y {~ row indexes rows
>     cols=. }. I. a: ~: fields NB. indexes of data in row excluding ID
>     po=. (>: k) + i. # cols NB. positions of these items in data
>     co=. < row ; cols NB. location in sparse array to store po
>     da=. cols { fields
>
>
>     NB. coor is sparse
>     coor=: po co} coor NB.NB.NB. assignments in place?
>     NB. data is preallocated
>     data=: da po} data NB.NB.NB. assignments in place?
>
>
>     k=. k + # cols
>     row=. >: row
>   end.
>   'data and coor are global'
> )
>
>
> JCHAR map_jmf_'INF';testfile ] datafile
>
> tokenize INF
>
> unmap_jmf_'INF'
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm