My (not very educated) guess is that each SubString object gets its own memory 
allocated. In the past, I've dealt with these problems by using raw-byte 
buffers and working with those, since you can keep reusing a single buffer for 
every line and avoid all memory allocation.

I'm not clear what changes need to happen in Julia to make sure something like 
this doesn't keep allocating new memory. In an ideal world, I'd be able to 
process a file by:

(1) Allocating a single string object that is backed by a large buffer of 
bytes.
(2) Reading each new line from the file into this string object without 
allocating more bytes unless strictly necessary.
(3) Running parse functions on this string object without allocating new 
memory except for the bytes needed to store a float.
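
A rough sketch of the three steps above, working on a raw byte buffer rather 
than a string object. It assumes a recent Julia (1.x), whose syntax differs 
from the version used in the quoted example. The helpers readline_into!, 
parse_float and parse_fields! are hypothetical names, not Base functions, and 
the float parser is deliberately simplified (no exponents, no Inf/NaN, '\n' 
line endings only). Its allocation behaviour has not been measured here.

```julia
# Sketch only: all three helpers below are hypothetical, not part of Base.

# Step (2): read one line into `buf`, reusing its storage.
# Returns the number of bytes read, excluding the trailing '\n'.
function readline_into!(buf::Vector{UInt8}, io::IO)
    n = 0
    while !eof(io)
        b = read(io, UInt8)
        b == UInt8('\n') && break
        n += 1
        # Grow the buffer only when the line does not fit.
        n > length(buf) && resize!(buf, max(2 * length(buf), 16))
        buf[n] = b
    end
    return n
end

# Step (3): parse a plain decimal such as "-12.5" from buf[lo:hi]
# without creating a String or SubString.
function parse_float(buf::Vector{UInt8}, lo::Int, hi::Int)
    lo > hi && throw(ArgumentError("empty field"))
    i = lo
    neg = buf[i] == UInt8('-')
    (neg || buf[i] == UInt8('+')) && (i += 1)
    val = 0.0
    scale = 1.0
    seen_dot = false
    while i <= hi
        b = buf[i]
        if b == UInt8('.') && !seen_dot
            seen_dot = true
        elseif UInt8('0') <= b <= UInt8('9')
            val = 10.0 * val + (b - UInt8('0'))
            seen_dot && (scale *= 10.0)
        else
            throw(ArgumentError("unexpected byte at index $i"))
        end
        i += 1
    end
    val /= scale
    return neg ? -val : val
end

# Split the first `n` bytes of `buf` on ',' and store each value in `fs`.
# Returns the number of fields parsed; `fs` must be long enough to hold them.
function parse_fields!(fs::Vector{Float64}, buf::Vector{UInt8}, n::Int)
    field = 1
    start = 1
    for i in 1:n+1
        if i > n || buf[i] == UInt8(',')
            fs[field] = parse_float(buf, start, i - 1)
            field += 1
            start = i + 1
        end
    end
    return field - 1
end

# Step (1): allocate the buffer and the output array once, then reuse them.
function demo()
    buf = Vector{UInt8}(undef, 64)
    fs = Vector{Float64}(undef, 2)
    io = IOBuffer("1.1,2.2\n3.5,-4.25\n")  # stands in for an open file
    while !eof(io)
        n = readline_into!(buf, io)
        parse_fields!(fs, buf, n)
        println(fs)
    end
end

demo()
```

The one buffer and the one output array are allocated up front and overwritten 
on every line, so in principle the only growth is the occasional resize! when a 
line is longer than any seen before.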

My sense is that this is a little hard in Julia at the moment.

 -- John

On Mar 10, 2014, at 1:30 PM, Keith Campbell <keithcc1...@gmail.com> wrote:

> Hi all,
>      
> I'm trying to minimize memory allocation while doing line-oriented processing 
> on a fairly large set of text files.  SubString and pre-allocated outputs 
> have helped, but I'm still getting memory allocations proportional to the 
> size of the input set and looking for new ideas.
> 
> The toy example below illustrates how the allocations grow.
> Am I right to suspect that float() is the culprit?  Any thoughts on how to 
> cut out the remaining allocations?
> 
> thanks,
> Keith
> 
> julia> function str_with_sub(N)
>            mystr = ascii("1.1,2.2")
>            fs=Array(Float64,2)
>        
>            for i in 1:N
>                dostr!(mystr, fs)
>            end
>        end
> str_with_sub (generic function with 1 method)
> 
> julia> function dostr!(mystr, fs)
>           fs[1] = float(SubString(mystr,1,3))
>           fs[2] = float(SubString(mystr,5,7))
>        end
> dostr! (generic function with 1 method)
> 
> julia> @time str_with_sub(4)
> elapsed time: 0.008214327 seconds (190612 bytes allocated)
> 
> julia> @time str_with_sub(4)
> elapsed time: 8.441e-6 seconds (496 bytes allocated)
> 
> julia> @time str_with_sub(4)
> elapsed time: 6.493e-6 seconds (496 bytes allocated)
> 
> julia> @time str_with_sub(6)
> elapsed time: 7.074e-6 seconds (688 bytes allocated)
> 
> julia> @time str_with_sub(8)
> elapsed time: 7.437e-6 seconds (880 bytes allocated)
> 
