I'm not sure what's wrong with sub, but don't use it -- it's definitely slower than just making a copy of the subset you want to write:

s = df[df[:rank_PV] .<= r_max, :]
@time write_results(s, name, "significant", sep, h)

On Thursday, October 27, 2016 at 5:07:31 AM UTC-7, Fred wrote:
>
> Hi,
>
> In the same program, I save a DataFrame "df" to a file and a subset of
> this DataFrame to another file. The problem is that saving the subset
> is much slower than saving the entire DataFrame: 220 times slower.
> It is too slow and I don't know what my mistake is.
>
> Thank you for your advice!
>
> in Julia 0.4.5:
>
> Saving the entire DataFrame
> Saving... results/Stat.csv
> 1.115944 seconds (13.78 M allocations: 319.534 MB, 2.59% gc time)
>
> Saving the subset of the DataFrame
> Saving... significant/Stat.csv
> 246.099835 seconds (41.79 M allocations: 376.189 GB, 4.77% gc time)
> elapsed time: 251.581459853 seconds
>
> in Julia 0.5:
>
> Saving the entire DataFrame
> Saving... results/Stat.csv
> 1.060365 seconds (7.08 M allocations: 116.025 MB, 0.73% gc time)
>
> Saving the subset of the DataFrame
> Saving... significant/Stat.csv
> 226.813587 seconds (37.40 M allocations: 376.268 GB, 2.42% gc time)
> elapsed time: 232.95933586 seconds
>
> ################################################
> # my function to save the results to a file
>
> function write_results(x, name, dir, sep, h)
>     outfile = "$dir/$name"
>     println("Saving...\t", outfile)
>     writetable(outfile, x, separator = sep, header = h)
> end
>
> # save my DataFrame df: very fast
> @time write_results(df, name, "results", sep, h)
>
> # subset DataFrame s
> s = sub(df, (df[:rank_PV] .<= r_max))
>
> # save my subset DataFrame s: incredibly slow!
> @time write_results(s, name, "significant", sep, h)