Jeff and I were just discussing a plan to massively reduce the overhead for
strings and eliminate substrings. It's a bit early to get into much detail,
but it should help with a lot of string-related tasks that are pretty
inefficient right now. Unfortunately, that's not much help to you at the
moment.
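
In the meantime, here is a rough sketch of the raw-byte-buffer approach John
describes below. Going by the timings in the quoted example, the allocations
grow by 96 bytes per loop iteration (688 - 496 = 192 bytes for two extra
iterations), i.e. 48 bytes per float(SubString(...)) call. The sketch avoids
both the SubString and the float() call by parsing digits straight out of a
byte vector. Everything in it is an assumption for illustration:
parse_decimal and parse_fields! are hypothetical names, not Base functions;
the parser only handles an optional minus sign, digits and one decimal point
(no exponents, no validation); and it is written in current Julia syntax
rather than the syntax used in the quoted transcript.

```julia
# Hypothetical helper: parse a plain decimal such as "1.1" from buf[lo:hi].
# Assumes well-formed input: optional '-', ASCII digits, at most one '.'.
function parse_decimal(buf::Vector{UInt8}, lo::Int, hi::Int)
    i = lo
    neg = false
    if i <= hi && buf[i] == UInt8('-')
        neg = true
        i += 1
    end
    val = 0.0        # all digits accumulated as if there were no decimal point
    scale = 1.0      # power of ten to divide by at the end
    seen_dot = false
    while i <= hi
        b = buf[i]
        if b == UInt8('.')
            seen_dot = true
        else
            val = val * 10 + (b - UInt8('0'))
            seen_dot && (scale *= 10)
        end
        i += 1
    end
    val /= scale
    return neg ? -val : val
end

# Hypothetical helper: split the first n bytes of buf on ',' and store each
# parsed field into the preallocated output vector fs.
function parse_fields!(fs::Vector{Float64}, buf::Vector{UInt8}, n::Int)
    field = 1
    start = 1
    for i in 1:(n + 1)
        # A comma or the end of the data closes the current field.
        if i > n || buf[i] == UInt8(',')
            fs[field] = parse_decimal(buf, start, i - 1)
            field += 1
            start = i + 1
        end
    end
    return fs
end

buf = Vector{UInt8}("1.1,2.2")   # stands in for a reusable line buffer
fs = zeros(2)                    # preallocated output, as in the toy example
parse_fields!(fs, buf, length(buf))
```

The idea is that buf and fs are allocated once and reused for every line, so
the loop itself should not need to allocate. Filling buf from a file without
allocating is a separate problem that this sketch does not address.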


On Mon, Mar 10, 2014 at 4:52 PM, John Myles White
<johnmyleswh...@gmail.com> wrote:

> My (not very educated) guess is that each SubString object gets its own
> memory allocated. In the past, I've dealt with these problems by using
> raw-byte buffers and working with those, since you can keep reusing a
> single buffer for every line and avoid all memory allocation.
>
> I'm not clear what changes need to happen in Julia to make sure something
> like this doesn't keep allocating new memory. In an ideal world, I'd be
> able to process a file by:
>
> (1) Allocating a single string object that has, at its backend, a large
> buffer of bytes.
> (2) Read in a new line from the file into this string object without
> allocating more bytes unless strictly necessary.
> (3) Run parse functions on this string object without allocating new
> memory except for the bytes needed to store a float.
>
> My sense is that this is a little hard in Julia at the moment.
>
>  -- John
>
>
> On Mar 10, 2014, at 1:30 PM, Keith Campbell <keithcc1...@gmail.com> wrote:
>
> Hi all,
>
> I'm trying to minimize memory allocation while doing line-oriented
> processing on a fairly large set of text files.  SubString and
> pre-allocated outputs have helped, but I'm still getting memory allocations
> proportional to the size of the input set and looking for new ideas.
>
> The toy example below illustrates how the allocations grow.
> Am I right to suspect that float() is the culprit?  Any thoughts on how
> to cut out the remaining allocations?
>
> thanks,
> Keith
>
> julia> function str_with_sub(N)
>            mystr = ascii("1.1,2.2")
>            fs=Array(Float64,2)
>
>            for i in 1:N
>                dostr!(mystr, fs)
>            end
>        end
> str_with_sub (generic function with 1 method)
>
> julia> function dostr!(mystr, fs)
>           fs[1] = float(SubString(mystr,1,3))
>           fs[2] = float(SubString(mystr,5,7))
>        end
> dostr! (generic function with 1 method)
>
> julia> @time str_with_sub(4)
> elapsed time: 0.008214327 seconds (190612 bytes allocated)
>
> julia> @time str_with_sub(4)
> elapsed time: 8.441e-6 seconds (496 bytes allocated)
>
> julia> @time str_with_sub(4)
> elapsed time: 6.493e-6 seconds (496 bytes allocated)
>
> julia> @time str_with_sub(6)
> elapsed time: 7.074e-6 seconds (688 bytes allocated)
>
> julia> @time str_with_sub(8)
> elapsed time: 7.437e-6 seconds (880 bytes allocated)
>
>
>
