Aside from the memory allocation concerns already raised, I also think that 
constructing a dataframe just to add it to another adds quite a lot of 
redundancy in the code. For example, I'll have to specify the column names 
an extra time for each row I append, rather than just once at the 
beginning. (However, this argument might be moot if the column order is not 
always well-defined - in that case, I don't really see a way around 
creating a new dataframe, since the columns need to be named.)
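
To make the redundancy concrete, here is a minimal sketch that uses only the 
keyword constructor and append! from Jason's example. The second row's values 
are made up for illustration, and I'm assuming the keyword constructor also 
accepts empty typed vectors:

```julia
using DataFrames

# Column names are declared once, when the (empty) frame is created.
# Assumption: the keyword constructor accepts empty typed vectors.
measures = DataFrame(Mean=Float64[], StdDev=Float64[], Category=Symbol[])

# Every appended row then has to repeat all three column names,
# and each call builds a temporary one-row DataFrame that is discarded.
append!(measures, DataFrame(Mean=1.0, StdDev=0.1, Category=:Fake))
append!(measures, DataFrame(Mean=2.0, StdDev=0.2, Category=:AlsoFake))
```

The names Mean, StdDev and Category appear three times each for only two rows 
of data.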

I just find the whole procedure of constructing a full data frame, only to 
append it and throw it away, very roundabout and complicated.

// T

On Tuesday, May 27, 2014 5:38:34 AM UTC+2, Kevin Squire wrote:
>
> So, the other argument is that, if the types fit, why not make it easy to 
> append data to a DataFrame via any iterable?  Constructing a DataFrame just 
> to append it to another DataFrame and throw it away seems wasteful, 
> especially since a new array is allocated for each column, and (I think) 
> each array allocates space for 16 elements.  That means we're allocating 
> and throwing away, e.g., 128 bytes per Float64 column, just so we can 
> append one number to the column. 
>
> If we had a separate type for DataFrame rows, on the other hand... 
>
> Cheers,
>   Kevin
>
> On Monday, May 26, 2014, John Myles White <johnmyl...@gmail.com> 
> wrote:
>
>> I’m not really opposed to it, but I’m also not super excited about it. 
>> It’s a redundant and non-obvious interface: I’ve seen people try to use 
>> both vectors and 1-row matrices to do this. That suggests to me there’s no 
>> clear right answer, so picking one way arbitrarily (appending only 
>> DataFrames to DataFrames) is pretty reasonable.
>>
>>  — John
>>
>> On May 26, 2014, at 8:14 PM, Kevin Squire <kevin.sq...@gmail.com> wrote:
>>
>> It shouldn't be that hard to make the array version work.  I might give 
>> it a shot, unless that isn't desired. 
>>
>> Kevin
>>
>> On Monday, May 26, 2014, Jason Solack <jays...@gmail.com> wrote:
>>
>>> this works for me:
>>>
>>> dfA = DataFrame(A=[1:10], B=[11:20])
>>> dfB = DataFrame(A=11, B=21)
>>> append!(dfA, dfB)
>>>
>>>
>>>
>>> On Monday, May 26, 2014 11:59:28 AM UTC-4, Tomas Lycken wrote:
>>>>
>>>> I'm probably just being incredibly daft, but I can't figure out how to 
>>>> add a new row to a DataFrame.
>>>>
>>>> Basically, I have a bunch of data sets for which I want to perform some 
>>>> calculations - let's say the mean and standard deviation of something - 
>>>> each dataset corresponding to some named category of data. So I do the 
>>>> following to construct my new DataFrame:
>>>>
>>>> julia> measures = DataFrame()
>>>> julia> measures[:Mean] = Float64[]
>>>> julia> measures[:StdDev] = Float64[]
>>>> julia> measures[:Category] = Symbol[]
>>>>
>>>> Now, I want to add some values that are the results of a calculation on 
>>>> a different data set, and I try this:
>>>>
>>>> julia> push!(measures, [1.0,0.1,:Fake])
>>>> ERROR: no method push!(DataFrame, Array{Any,1})
>>>> julia> append!(measures, [1.0,0.1,:Fake])
>>>> ERROR: no method append!(DataFrame, Array{Any,1})
>>>> julia> measures[1,:] = [1.0,0.1,:Fake]
>>>> ERROR: BoundsError()
>>>>  in setindex! at /home/tlycken/.julia/v0.3/DataArrays/src/dataarray.jl:764
>>>>  in insert_single_entry! at /home/tlycken/.julia/v0.3/DataFrames/src/dataframe/dataframe.jl:410
>>>>  in setindex! at /home/tlycken/.julia/v0.3/DataFrames/src/dataframe/dataframe.jl:521
>>>>
>>>> Is there a nice and simple way to add a row to a DataFrame without 
>>>> having to do it one value at a time?
>>>>
>>>> // T
>>>>
>>>
>>
