Serializing strings was indeed unreasonably slow. I just pushed a
commit that should be a significant improvement.

On Wed, May 21, 2014 at 7:17 AM, Tim Holy <tim.h...@gmail.com> wrote:
> Samuel, rewriting in C is almost never necessary, because Julia is as fast as
> C. It's just that some implementations are faster, and some slower.
>
> I don't want to dissuade you in any way from improving the performance of
> serialize, but you might consider looking at HDF5/JLD.
>
> --Tim
>
> On Tuesday, May 20, 2014 07:34:11 PM Samuel Colvin wrote:
>> Thanks for that. Yes, you're right, I was being dumb.
>>
>> Just to give an example of how slow it is: reading a 48 MB CSV file with a
>> mixture of strings and numbers using DataFrames's readtable, then writing it,
>> gives:
>>
>> julia> @time writetable("data.csv", r)
>>
>> elapsed time: 3.539380236 seconds (77981180 bytes allocated)
>>
>> julia> @time serialize(open("data.dat", "w"), r)
>>
>> elapsed time: 83.743085747 seconds (3439332792 bytes allocated)
>>
>> Lots of time seems to be spent in print_escaped (string.jl).
>>
>> Surely some improvement can be made here? If print_escaped is the
>> bottleneck, couldn't it be rewritten in C?
