Yes, both now give the same result. (tblcsv is about 3 times faster on large files.)
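
For reference, a minimal sketch of the loop-free tblcsv discussed in the quoted messages below. The definition is the one Ric Sherlock posted (he marked it untested); the sample CSV text and the names csv and r are made up for illustration and are not from the thread. The thread compares its result with ({. , makenumcol@}.) fixcsv from the tables/csv addon, which is not run here.

```j
NB. tblcsv as posted by Ric Sherlock in the quoted thread.
NB. x is a control string, one letter per column: s=string, d=date, n=number.
tblcsv=: 4 : 0
t=. ([: < ;._1 ',',]) ;._2 y -. '"/'    NB. drop " and /, cut into rows (LF) and fields (,)
idx=. I. x e. 'dn'                      NB. indices of the date and number columns
({.t) , (_99&". &.> idx {"1 }.t) (<a:;idx)} }.t   NB. convert those columns, keep the header row
)

NB. Hypothetical sample data (header row plus two data rows).
csv=: 'name,date,amt',LF,'qu,2004/12/15,-100',LF,'ab,2005/01/02,2000',LF
r=: 'sdn' tblcsv csv
```

Dyadic ". is used here because it accepts - as a negative sign and returns the default (_99) for fields that are not numeric, so the explicit replacement of - with _ in the earlier versions is not needed.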
On Thu, Jan 12, 2012 at 4:41 PM, Ric Sherlock <[email protected]> wrote:
> I suspect the reason that the minus sign didn't convert is that the
> column wasn't converted to numeric because the column header is part
> of the file. If the header doesn't convert to numeric successfully the
> column will fail the condition that the whole column must be numeric
> for it to be converted.
>
> If that is indeed the problem then the following should work:
>
> a=: ({. , makenumcol@}.) fixcsv (1!:1<'sm.csv') -.'"/'
>
> On Fri, Jan 13, 2012 at 10:16 AM, Tom Szczesny <[email protected]> wrote:
> > No. They are not the same. For a very small sample file the results are:
> >
> >    a=: makenumcol fixcsv (1!:1<'sm.csv') -.'"/'
> >    b=:'ssdndssnnnnnnnnnnnnndnns' tblcsv 1!:1 <'sm.csv'
> >    a-:b
> > 0
> >    (<a:;7){a
> > +--++---+++----+----+----+
> > |qu||444|||2000|6761|-100|
> > +--++---+++----+----+----+
> >    (<a:;7){b
> > +--++---+++----+----+----+
> > |qu||444|||2000|6761|_100|
> > +--++---+++----+----+----+
> >
> > The minus sign did not get converted properly in the table/csv case.
> >
> >
> > On Thu, Jan 12, 2012 at 3:46 PM, Ric Sherlock <[email protected]> wrote:
> >
> >> Out of interest does the following give you the same result?
> >> load 'tables/csv'
> >> makenumcol fixcsv csvdata -.'"/'
> >>
> >> Where csvdata is the string you are feeding to tblcsv.
> >>
> >> On Fri, Jan 13, 2012 at 9:09 AM, Tom Szczesny <[email protected]> wrote:
> >> > wow
> >> >
> >> > On Thu, Jan 12, 2012 at 2:38 PM, Ric Sherlock <[email protected]> wrote:
> >> >
> >> >> ( (x e. 'dn')#i.$x ) <--> ( I. x e. 'dn' )
> >> >>
> >> >> You can lose the for loop and the two transposes by amending all the
> >> >> columns at once.
> >> >>
> >> >> It is better/safer to use dyadic ". for converting literals to numbers.
> >> >>
> >> >> So this should work (untested):
> >> >>
> >> >> tblcsv =: 4 : 0
> >> >> t=.([: < ;._1 ',',]) ;._2 y-.'"/'
> >> >> data=. }.t
> >> >> idx=. I. x e. 'dn'
> >> >> data=. (_99&". &.> idx{"1 data) (<a:;idx)} data
> >> >> ({.t),data
> >> >> )
> >> >>
> >> >> Because in-place amendment is not supported for the boxed datatype,
> >> >> you shouldn't lose anything by dropping the ( data=. aa i}data ) so
> >> >> you could just do:
> >> >>
> >> >> tblcsv =: 4 : 0
> >> >> t=.([: < ;._1 ',',]) ;._2 y-.'"/'
> >> >> idx=. I. x e. 'dn'
> >> >> ({.t), (_99&". &.> idx{"1 }.t) (<a:;idx)} }.t
> >> >> )
> >> >>
> >> >>
> >> >> On Fri, Jan 13, 2012 at 6:25 AM, Tom Szczesny <[email protected]> wrote:
> >> >> > Sorry for all the "chatter". I think I'm done now...
> >> >> > Version #4:
> >> >> >
> >> >> > tblcsv =: 4 : 0
> >> >> > t=.([: < ;._1 ',',]) ;._2 y-.'"/'
> >> >> > data=. |:}.t
> >> >> > for_k. (x e. 'dn')#i.$x do.
> >> >> > data=. (><&.>".'0',"1>k{data) k}data
> >> >> > end.
> >> >> > ({.t),|:data
> >> >> > )
> >> >> >
> >> >> >
> >> >> > On Thu, Jan 12, 2012 at 12:05 PM, Tom Szczesny <[email protected]> wrote:
> >> >> >
> >> >> >> Replacing the - with _ is not necessary.
> >> >> >> Version #2 only works because of a typo in the 3rd line: =, instead
> >> >> >> of =.
> >> >> >> Version #3:
> >> >> >>
> >> >> >> tblcsv =: 4 : 0
> >> >> >> r=. y-.'"/'
> >> >> >> r=.([: < ;._1 ',',]) ;._2 r
> >> >> >> ttl=. {.r
> >> >> >> dat=. |:}.r
> >> >> >> for_j. (x e. 'dn')#i.$x do.
> >> >> >> dat=. (><&.>".'0',"1>j{dat) j}dat
> >> >> >> end.
> >> >> >> ttl,|:dat
> >> >> >> )
> >> >> >>
> >> >> >>
> >> >> >> On Thu, Jan 12, 2012 at 11:41 AM, Tom Szczesny <[email protected]> wrote:
> >> >> >>
> >> >> >>> The version I sent earlier appeared to work, but actually does not.
> >> >> >>> This version does:
> >> >> >>>
> >> >> >>> tblcsv =: 4 : 0
> >> >> >>> r=. y-.'"/'
> >> >> >>> r=, '_' ((r='-')#i.$r) } r
> >> >> >>> r=.([: < ;._1 ',',]) ;._2 r
> >> >> >>> ttl=. {.r
> >> >> >>> dat=. |:}.r
> >> >> >>> for_k. (x e. 'dn')#i.$x do.
> >> >> >>> dat=. (><&.>".'0',"1>k{dat) k}dat
> >> >> >>> end.
> >> >> >>> ttl,|:dat
> >> >> >>> )
> >> >> >>>
> >> >> >>> Thanks again to everyone for all the help.
> >> >> >>>
> >> >> >>> On Thu, Jan 12, 2012 at 8:59 AM, Tom Szczesny <[email protected]> wrote:
> >> >> >>>
> >> >> >>>> Thanks much.
> >> >> >>>> Since I create the csv files, I was able to take some shortcuts:
> >> >> >>>> 1) All " are spurious (no commas in columns containing comments)
> >> >> >>>> 2) - only occurs in negative numbers
> >> >> >>>> 3) / only occurs in dates 2004/12/15 (remove the / and don't need
> >> >> >>>> getdate at all)
> >> >> >>>>
> >> >> >>>> This definition is working fine:
> >> >> >>>>
> >> >> >>>> tblcsv =: 4 : 0
> >> >> >>>> r=. y-.'"/'
> >> >> >>>> r=, '_' ((r='-')#i.$r) } r
> >> >> >>>> r=.([: < ;._1 ',',]) ;._2 r
> >> >> >>>> ttl=. {.r
> >> >> >>>> dat=. |:}.r
> >> >> >>>> for_i. (x e. 'dn')#i.$x do.
> >> >> >>>> dat=. (<"".>i{ dat) i} dat
> >> >> >>>> end.
> >> >> >>>> ttl,|:dat
> >> >> >>>> )
> >> >> >>>>
> >> >> >>>> For large files, the biggest time user might be the need to do
> >> >> >>>> transpose twice.
> >> >> >>>> tx=:'ssdndssnnnnnnnnnnnnndnns' tblcsv 1!:1 <'t.csv'
> >> >> >>>>
> >> >> >>>> On Thu, Jan 12, 2012 at 6:16 AM, R.E. Boss <[email protected]> wrote:
> >> >> >>>>
> >> >> >>>>> If you have a lot of dates, say >100k, getdate will be rather slow.
> >> >> >>>>> But since the nub of these dates probably will contain <10k dates,
> >> >> >>>>> you can use nubindex http://www.jsoftware.com/help/dictionary/d221.htm
> >> >> >>>>> For that the following adverb will do (tested long time ago)
> >> >> >>>>>
> >> >> >>>>> nbind=:1 : '](i.!.0~ { u @:]) ~.'
> >> >> >>>>>
> >> >> >>>>>
> >> >> >>>>> R.E. Boss
> >> >> >>>>>
> >> >> >>>>>
> >> >> >>>>> > -----Original message-----
> >> >> >>>>> > From: [email protected] [mailto:general-[email protected]] On behalf of R.E. Boss
> >> >> >>>>> > Sent: Wednesday, 11 January 2012 21:48
> >> >> >>>>> > To: 'General forum'
> >> >> >>>>> > Subject: Re: [Jgeneral] Data from csv files
> >> >> >>>>> >
> >> >> >>>>> > require 'dates'
> >> >> >>>>> >
> >> >> >>>>> >    100 #. getdate'2012/01/12'
> >> >> >>>>> > 20120112
> >> >> >>>>> >
> >> >> >>>>> > I would transpose (|:) the matrix and then work a row at a time,
> >> >> >>>>> > depending on the control vector.
> >> >> >>>>> > Recently I learned techniques to assign different verbs to
> >> >> >>>>> > different items.
> >> >> >>>>> >
> >> >> >>>>> >
> >> >> >>>>> > R.E. Boss
> >> >> >>>>> >
> >> >> >>>>> >
> >> >> >>>>> > > -----Original message-----
> >> >> >>>>> > > From: [email protected] [mailto:general-[email protected]] On behalf of Tom Szczesny
> >> >> >>>>> > > Sent: Wednesday, 11 January 2012 21:13
> >> >> >>>>> > > To: General forum
> >> >> >>>>> > > Subject: Re: [Jgeneral] Data from csv files
> >> >> >>>>> > >
> >> >> >>>>> > > Thanks, that is very nice to know, but ...
> >> >> >>>>> > >
> >> >> >>>>> > > Since the csv files I need were created by me, I also know that
> >> >> >>>>> > > - the only occurrences of " are spuriously added.
> >> >> >>>>> > > - the only occurrences of - are in the representation of negative
> >> >> >>>>> > > numbers, so I can define
> >> >> >>>>> > >
> >> >> >>>>> > > tblcsv=: 3 : 0
> >> >> >>>>> > > r=: (-.y='"')#y
> >> >> >>>>> > > r=: '_' ((r='-')#i.$r) } r
> >> >> >>>>> > > ([: < ;._1 ',',]) ;._2 r
> >> >> >>>>> > > )
> >> >> >>>>> > >
> >> >> >>>>> > > Next, I plan to figure out how to convert the columns with character
> >> >> >>>>> > > strings representing numbers into actual numbers,
> >> >> >>>>> > > and the columns with character strings representing dates ( 2012/01/12 )
> >> >> >>>>> > > into numbers representing dates ( 20120112 ),
> >> >> >>>>> > > where tblcsv becomes dyadic with a control vector like 'SDSSNDNSNS'
> >> >> >>>>> > > as the left argument
> >> >> >>>>> > > indicating which columns are strings, dates & numbers.
> >> >> >>>>> > >
> >> >> >>>>> > > On Wed, Jan 11, 2012 at 2:27 PM, Ric Sherlock <[email protected]> wrote:
> >> >> >>>>> > >
> >> >> >>>>> > > > Note that you don't need to define tblcsv explicitly:
> >> >> >>>>> > > > tblcsv=: ([: <;._1 ','&,);._2
> >> >> >>>>> > > > or
> >> >> >>>>> > > > tblcsv=: ([: <;._1 ',' , ]);._2
> >> >> >>>>> > > >
> >> >> >>>>> > > > On Thu, Jan 12, 2012 at 6:29 AM, Tom Szczesny <[email protected]> wrote:
> >> >> >>>>> > > > > tested ...... works . ..... thanks!
> >> >> >>>>> > > > >
> >> >> >>>>> > > > > On Wed, Jan 11, 2012 at 12:23 PM, R.E. Boss <[email protected]> wrote:
> >> >> >>>>> > > > >
> >> >> >>>>> > > > >> tblcsv =: 3 : 0
> >> >> >>>>> > > > >> ([: <;._1 ',',]) ;._2 y
> >> >> >>>>> > > > >> )
> >> >> >>>>> > > > >> (untested)
> >> >> >>>>> > > > >>
> >> >> >>>>> > > > >> R.E. Boss
> >> >> >>>>> > > > >>
> >> >> >>>>> > > > >>
> >> >> >>>>> > > > >> > -----Original message-----
> >> >> >>>>> > > > >> > From: [email protected] [mailto:general-[email protected]] On behalf of Tom Szczesny
> >> >> >>>>> > > > >> > Sent: Wednesday, 11 January 2012 17:55
> >> >> >>>>> > > > >> > To: General forum
> >> >> >>>>> > > > >> > Subject: Re: [Jgeneral] Data from csv files
> >> >> >>>>> > > > >> >
> >> >> >>>>> > > > >> > Given
> >> >> >>>>> > > > >> >
> >> >> >>>>> > > > >> > cutc =: 3 : 0
> >> >> >>>>> > > > >> > < ;._1 ',',y
> >> >> >>>>> > > > >> > )
> >> >> >>>>> > > > >> >
> >> >> >>>>> > > > >> > tblcsv =: 3 : 0
> >> >> >>>>> > > > >> > cutc ;._2 y
> >> >> >>>>> > > > >> > )
> >> >> >>>>> > > > >> >
> >> >> >>>>> > > > >> > tblcsv 1!:1 <'test.csv'
> >> >> >>>>> > > > >> >
> >> >> >>>>> > > > >> > I assumed that I would be able to represent this as a single
> >> >> >>>>> > > > >> > definition, such as
> >> >> >>>>> > > > >> >
> >> >> >>>>> > > > >> > tblcsv =: 3 : 0
> >> >> >>>>> > > > >> > ( <;._1 ',',) ;._2 y
> >> >> >>>>> > > > >> > )
> >> >> >>>>> > > > >> >
> >> >> >>>>> > > > >> > or
> >> >> >>>>> > > > >> >
> >> >> >>>>> > > > >> > tblcsv =: 3 : 0
> >> >> >>>>> > > > >> > (( <;._1',',)&) ;._2 y
> >> >> >>>>> > > > >> > )
> >> >> >>>>> > > > >> >
> >> >> >>>>> > > > >> > I've tried several other variations, and all result in
> >> >> >>>>> > > > >> > 'syntax error'. Am I missing something, or does the verb
> >> >> >>>>> > > > >> > applied to each 'cut' interval need to be defined separately?
> >> >> >>>>> > > > >> >
> >> >> >>>>> > > > >> > On Mon, Jan 9, 2012 at 5:00 PM, Arthur Anger <[email protected]> wrote:
> >> >> >>>>> > > > >> >
> >> >> >>>>> > > > >> > > I often use Rank to select sub-arrays from an array:
> >> >> >>>>> > > > >> > > <"0 i. 7
> >> >> >>>>> > > > >> > > --Art
> >> >> >>>>> > > > >> > > ------------------
> >> >> >>>>> > > > >> > > Quoting [email protected]:
> >> >> >>>>> > > > >> > > . . .
> >> >> >>>>> > > > >> > > > Message: 2
> >> >> >>>>> > > > >> > > > Date: Mon, 9 Jan 2012 11:46:24 -0500
> >> >> >>>>> > > > >> > > > From: Tom Szczesny <[email protected]>
> >> >> >>>>> > > > >> > > > Subject: Re: [Jgeneral] Data from csv files
> >> >> >>>>> > > > >> > > > To: General forum <[email protected]>
> >> >> >>>>> > > > >> > > > Message-ID:
> >> >> >>>>> > > > >> > > > <CABn7SNYFw2gyAPKcjx1DMLEru97NMTst6zoGx= [email protected]>
> >> >> >>>>> > > > >> > > > Content-Type: text/plain; charset=ISO-8859-1
> >> >> >>>>> > > > >> > > >
> >> >> >>>>> > > > >> > > > As mentioned in the dictionary entry for cut:
> >> >> >>>>> > > > >> > > > the phrase u;._2 y applies the verb u to each interval
> >> >> >>>>> > > > >> > > > created by cut, where the fret is the last item, and marks
> >> >> >>>>> > > > >> > > > the ends of the intervals.
> >> >> >>>>> > > > >> > > >
> >> >> >>>>> > > > >> > > > What is the notation for applying the "each" concept to a
> >> >> >>>>> > > > >> > > > verb independent of cut ?
> >> >> >>>>> > > > >> > > > For example,
> >> >> >>>>> > > > >> > > >    <i.7
> >> >> >>>>> > > > >> > > > +-------------+
> >> >> >>>>> > > > >> > > > |0 1 2 3 4 5 6|
> >> >> >>>>> > > > >> > > > +-------------+
> >> >> >>>>> > > > >> > > >
> >> >> >>>>> > > > >> > > > How do you express
> >> >> >>>>> > > > >> > > > < each i.7
> >> >> >>>>> > > > >> > > > and get 7 individually boxed items?
> >> >> >>>>> > > > >> > > > (I could not find an entry for "each" in the Index, nor in
> >> >> >>>>> > > > >> > > > the Vocabulary.)
> >> >> >>>>> > > > >> > > . . .
> >> >> >>>>> > > > >> > > > End of General Digest, Vol 76, Issue 8
> >> >> >>>>> > > > >> > > > **************************************
> >> >> >>>>> > > > >> > >
> >> >> >>>>> > > > >> > > ----------------------------------------------------------------------
> >> >> >>>>> > > > >> > > For information about J forums see http://www.jsoftware.com/forums.htm
