Aside from the memory allocation concerns already raised, I also think that constructing a DataFrame just to append it to another adds quite a lot of redundancy to the code. For example, I'll have to specify the column names an extra time for each row I append, rather than just once at the beginning. (However, this argument might be moot if the column order is not always well-defined - in that case, I don't really see a way around creating a new DataFrame, since the columns need to be named.)
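To make the redundancy concrete, here is a minimal sketch of the pattern. It assumes the keyword-argument `DataFrame` constructor and `append!` on two DataFrames behave as in Jason's snippet quoted below; the column names come from my original example, and the row values are made up for illustration.

```julia
using DataFrames

# Column names are declared once when the empty frame is set up...
measures = DataFrame(Mean = Float64[], StdDev = Float64[], Category = Symbol[])

# ...and then spelled out again for every row, each time wrapped in a
# throwaway one-row DataFrame. (Values are made up for illustration.)
append!(measures, DataFrame(Mean = [1.0], StdDev = [0.1], Category = [:Fake]))
append!(measures, DataFrame(Mean = [2.0], StdDev = [0.2], Category = [:AlsoFake]))
```

Every appended row repeats all three column names, which is exactly the duplication I would like to avoid.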
I just find the whole procedure of constructing a full data frame only to append it and throw it away very roundabout and complicated.

// T

On Tuesday, May 27, 2014 5:38:34 AM UTC+2, Kevin Squire wrote:
>
> So, the other argument is that, if the types fit, why not make it easy to
> append data to a DataFrame via any iterable? Constructing a DataFrame just
> to append it to another DataFrame and throw it away seems wasteful,
> especially since a new array is allocated for each column, and (I think)
> each array allocates space for 16 elements. That means we're allocating
> and throwing away, e.g., 128 bytes per Float64 column, just so we can
> append one number to the column.
>
> If we had a separate type for DataFrame rows, on the other hand...
>
> Cheers,
> Kevin
>
> On Monday, May 26, 2014, John Myles White <johnmyl...@gmail.com>
> wrote:
>
>> I’m not really opposed to it, but I’m also not super excited about it.
>> It’s a redundant and non-obvious interface: I’ve seen people try to use
>> both vectors and 1-row matrices to do this. That suggests to me there’s no
>> clear right answer, so picking one way arbitrarily (appending only
>> DataFrames to DataFrames) is pretty reasonable.
>>
>> -- John
>>
>> On May 26, 2014, at 8:14 PM, Kevin Squire <kevin.sq...@gmail.com> wrote:
>>
>> It shouldn't be that hard to make the array version work. I might give
>> it a shot, unless that isn't desired.
>>
>> Kevin
>>
>> On Monday, May 26, 2014, Jason Solack <jays...@gmail.com> wrote:
>>
>>> this works for me:
>>>
>>> dfA = DataFrame(A=[1:10], B=[11:20])
>>> dfB = DataFrame(A=11, B=21)
>>> append!(dfA, dfB)
>>>
>>>
>>>
>>> On Monday, May 26, 2014 11:59:28 AM UTC-4, Tomas Lycken wrote:
>>>>
>>>> I'm probably just being incredibly daft, but I can't figure out how to
>>>> add a new row to a DataFrame.
>>>>
>>>> Basically, I have a bunch of data sets for which I want to perform some
>>>> calculations - let's say the mean and standard deviation of something -
>>>> each dataset corresponding to some named category of data. So I do the
>>>> following to construct my new DataFrame
>>>>
>>>> julia> measures = DataFrame()
>>>> julia> measures[:Mean] = Float64[]
>>>> julia> measures[:StdDev] = Float64[]
>>>> julia> measures[:Category] = Symbol[]
>>>>
>>>> Now, I want to add some values that are the results of a calculation on
>>>> a different data set, and I try this:
>>>>
>>>> julia> push!(psispread, [1.0,0.1,:Fake])
>>>> ERROR: no method push!(DataFrame, Array{Any,1})
>>>> julia> append!(psispread, [1.0,0.1,:Fake])
>>>> ERROR: no method append!(DataFrame, Array{Any,1})
>>>> julia> psispread[1,:] = [1.0,0.1,:Fake]
>>>> ERROR: BoundsError()
>>>>  in setindex! at /home/tlycken/.julia/v0.3/DataArrays/src/dataarray.jl:764
>>>>  in insert_single_entry! at /home/tlycken/.julia/v0.3/DataFrames/src/dataframe/dataframe.jl:410
>>>>  in setindex! at /home/tlycken/.julia/v0.3/DataFrames/src/dataframe/dataframe.jl:521
>>>>
>>>> Is there a nice and simple way to add a row to a DataFrame without
>>>> having to do it one value at a time?
>>>>
>>>> // T
>>>>
>>>
>>