Raul - in my basic benchmarking, that sped it up by about 5% (200 ms).

Here's my implementation, which tries to be more explicit. It takes
about 5891 ms to run 10000 iterations, compared to 4000 ms for the
other J implementations (43% slower). I'm sure it could be improved;
there is probably more boxing and more use of each than necessary.
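
For anyone wanting to reproduce this kind of timing, here is a minimal
sketch of one way to measure it, using the 6!:2 foreign with a repeat
count. It assumes go (defined further down) is already loaded and that
the file it reads exists. The names avg and total are made up for the
example, and this is not necessarily how the numbers above were taken.

```j
NB. x (6!:2) y runs the sentence y x times and
NB. returns the average number of seconds per run
avg=: 10000 (6!:2) 'go 0'

NB. total for all 10000 runs, converted to milliseconds
total=: 1000 * 10000 * avg
```

Note that go includes the fread, so file I/O is part of what gets
timed.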

I grab the lines in two passes and operate on them.
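
As a rough illustration of the two-pass idea, here is a sketch on a
small in-memory sample. The sample lines and the names sample, first,
active and disabled are hypothetical, and it compares unboxed first
characters rather than using the boxed comparison in starts below, so
treat it as a variation and not as the code from this post.

```j
NB. hypothetical sample lines, already split and boxed
sample=: 'FULLNAME Foo Barber';'# a comment';'; SEEDSREMOVED';'NEEDSPEELING'

NB. first character of every line, as a plain character list
first=: {.@> sample

NB. pass 1: keep lines that are neither comments nor disabled
active=: (-. first e. '#;') # sample

NB. pass 2: keep only the disabled lines
disabled=: (first = ';') # sample
```

Here active holds the FULLNAME and NEEDSPEELING lines, and disabled
holds the single '; SEEDSREMOVED' line.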

Handling the disabled lines is a requirement that both of the prior J
implementations seem to have missed. Rosetta Code says it should also
output

"seedsremoved = false"
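
A minimal sketch of how one disabled line could be turned into that
text, assuming the line has the form '; KEY' with exactly one space
after the semicolon. The names line and key are made up for the
example; tolower comes from the J standard library.

```j
line=: '; SEEDSREMOVED'      NB. sample disabled line
key=: 2 }. line              NB. drop the leading '; '
(tolower key) , ' = false'   NB. lower-cased key with the default appended
```

The implementation below keeps the key in upper case and stores F as
the value instead of building this string.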


starts=: 13 : '(<x) = {. each y'

keyValue=: 4 : 0
DefaultValue=:x
Spaces=:> (' ' (i.&1@:=) each y)
Keys=:Spaces {. each y
Values=: (Spaces+1) }. each y
Values=:(3 : '> ((# y) > 0)  } (DefaultValue;y)') each Values
(Keys,.Values)
)

readConf=: 3 : 0
All=:LF cut y
NB. All lines except comments, disabled and blank
Lines=:(-. (';' starts All) + ('#' starts All) + (CR starts All)) # All
NB. only the disabled lines
Disabled=: 2}. each (';' starts All) # All          NB. chop off ;<space>
('T' keyValue Lines),('F' keyValue Disabled)
)

go=: 3 : 0
readConf (fread 'c:\temp\test.conf')
)


+--------------+-------------------------+
|FULLNAME      |Foo Barber               |
+--------------+-------------------------+
|FAVOURITEFRUIT|banana                   |
+--------------+-------------------------+
|NEEDSPEELING  |T                        |
+--------------+-------------------------+
|OTHERFAMILY   |Rhu Barber, Harry Barber |
+--------------+-------------------------+
|SEEDSREMOVED  |F                        |
+--------------+-------------------------+
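
To illustrate how keyValue arrives at rows like the ones above, here
is a sketch of the split at the first space for a single unboxed line.
The names line, sp and flag are made up for the example; keyValue does
the equivalent over boxed lines using each.

```j
line=: 'OTHERFAMILY Rhu Barber, Harry Barber'
sp=: line i. ' '       NB. index of the first space (11 here)
sp {. line             NB. key: everything before the space
(sp + 1) }. line       NB. value: everything after the space

flag=: 'NEEDSPEELING'
flag i. ' '            NB. no space, so i. returns the length of flag (12)
(1 + flag i. ' ') }. flag   NB. empty value, so the default applies
```

An empty value is what triggers the default (T for ordinary lines, F
for disabled ones) in keyValue.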



On Tue, Jan 14, 2014 at 9:48 AM, Raul Miller <rauldmil...@gmail.com> wrote:
> It might be interesting to try it on a large file.
>
> Here's another state machine implementation that might perform better:
>
> StateMachine=: 2 :0
>   (m;(0 10#:10*".;._2]0 :0);<n)&;:
> )
>
> CleanChrs=: '#;';(' ',TAB);LF;a.-.'#; ',TAB,LF
> NB. comment, space, line, other
>
> clean=: 1 StateMachine CleanChrs
>   1.0  0.0  0.0  2.1  NB. 0: skip whitespace (start here)
>   1.0  1.0  0.0  1.0  NB. 1: comment
>   3.3  4.0  6.0  2.0  NB. 2: word
>   3.0  3.0  6.1  3.0  NB. 3: comment after word
>   3.3  5.3  6.0  2.0  NB. 4: first space after word
>   3.0  5.0  6.1  2.1  NB. 5: extra space after word
>   1.3  0.3  0.3  2.0  NB. 6: line end after word
> )
> NB. .0 continue, .1 start, .3 end
>
> SplitChrs=: (' ',TAB);a.-.' ',TAB
> NB. space, other
>
> split=: 0 StateMachine SplitChrs
>   0.6 1.1 NB. start here
>   2.3 1.0 NB. other (first word)
>   0.6 3.1 NB. first space
>   3.0 3.0 NB. rest
> )
> NB. .6 error
>
> readConf=: split;._2@clean@fread
>
> I think the performance problem you observed is because the first
> version started boxing too early. Here, I save boxing till the end,
> and create fewer boxes, both of which should reduce overhead.
>
> Thanks,
>
> --
> Raul
>
>
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
