dropna() is only defined for DataArrays. The individual columns in a 
DataFrame are DataArrays, but the DataFrame itself is not. There is an open 
issue for it: <https://github.com/JuliaStats/DataFrames.jl/issues/602>.

To get an Array out of a DataFrame, you are best off building it yourself, I 
think:

complete_cases!(data)
[data[:x1] data[:x2]]
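
If the concatenation still comes back as a DataArray (as [xc;yc] does in the 
output quoted below), calling dropna() on each column first should give a 
plain Array that pairwise() accepts. A rough sketch, assuming the DataFrames 
and Distance versions used in this thread; the inline DataFrame is a 
hypothetical stand-in for the CSV file, using the first three rows quoted 
below:

```julia
using DataFrames   # assumption: the 2014-era API used in this thread
using Distance     # package name as used in this thread

# Hypothetical stand-in for readtable("ct_coord_2.csv", header=false):
# the first three rows of the coordinates quoted below.
data = DataFrame(x1 = [-73.3712, -72.1065, -73.2453],
                 x2 = [41.225, 41.4667, 41.7925])

# Drop any rows containing NA, in place.
complete_cases!(data)

# dropna() turns each DataArray column into a plain Array, so the
# concatenation yields an Array{Float64,2} rather than a DataArray.
X = [dropna(data[:x1]) dropna(data[:x2])]

# pairwise() treats each column as one point, so pass the transpose (2x3).
R = pairwise(Euclidean(), X')
```

Running complete_cases! first keeps the columns the same length, since 
dropna() on its own would drop NAs from each column independently.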

On Friday, July 4, 2014 4:07:54 AM UTC+3, Donald Lacombe wrote:
>
> Johan,
>
> I think there may be an issue with the Data Frames package as I get the 
> following:
>
> julia> data = readtable("test.csv",header=false)
>
> 6x2 DataFrame:
>
>         x1 x2
>
> [1,]     1  7
>
> [2,]     2  8
>
> [3,]     3  9
>
> [4,]     4 10
>
> [5,]     5 11
>
> [6,]     6 12
>
>
>
> julia> convert(Array,data)
>
> MethodError(convert,(Array{T,N},6x2 DataFrame:
>
>         x1 x2
>
> [1,]     1  7
>
> [2,]     2  8
>
> [3,]     3  9
>
> [4,]     4 10
>
> [5,]     5 11
>
> [6,]     6 12
>
> ))
>
>
> julia> dropna(data)
>
> ErrorException("dropna not defined")
>
>
> I read the documentation and they both say the same thing but it doesn't seem 
> to work in my case.
>
>
> Thoughts?
>
>
> Thanks,
>
> Don
>
>
> On Thursday, July 3, 2014 7:54:49 PM UTC-4, Johan Sigfrids wrote:
>>
>> You can use dropna() to convert a DataArray to an Array. This will 
>> obviously drop any missing values. 
>>
>> On Friday, July 4, 2014 2:08:55 AM UTC+3, Donald Lacombe wrote:
>>>
>>> Patrick (and others),
>>>
>>> Another issue that has reared its ugly head is that when I read the 
>>> data using the Data Frames package, I get the following:
>>>
>>> data = readtable("ct_coord_2.csv",header=false)
>>>
>>> 8x2 DataFrame:
>>>
>>>               x1      x2
>>>
>>> [1,]    -73.3712  41.225
>>>
>>> [2,]    -72.1065 41.4667
>>>
>>> [3,]    -73.2453 41.7925
>>>
>>> [4,]    -71.9876   41.83
>>>
>>> [5,]    -72.3365  41.855
>>>
>>> [6,]    -72.7328 41.8064
>>>
>>> [7,]    -72.5231 41.4354
>>>
>>> [8,]    -72.8999 41.3488
>>>
>>>
>>> julia> xc = data[:,1]
>>>
>>> 8-element DataArray{Float64,1}:
>>>
>>>  -73.3712
>>>
>>>  -72.1065
>>>
>>>  -73.2453
>>>
>>>  -71.9876
>>>
>>>  -72.3365
>>>
>>>  -72.7328
>>>
>>>  -72.5231
>>>
>>>  -72.8999
>>>
>>>
>>> julia> yc = data[:,2]
>>>
>>> 8-element DataArray{Float64,1}:
>>>
>>>  41.225 
>>>
>>>  41.4667
>>>
>>>  41.7925
>>>
>>>  41.83  
>>>
>>>  41.855 
>>>
>>>  41.8064
>>>
>>>  41.4354
>>>
>>>  41.3488
>>>
>>>
>>> julia> xc=xc'
>>>
>>> 1x8 DataArray{Float64,2}:
>>>
>>>  -73.3712  -72.1065  -73.2453  -71.9876  …  -72.7328  -72.5231  -72.8999
>>>
>>>
>>> julia> yc=yc'
>>>
>>> 1x8 DataArray{Float64,2}:
>>>
>>>  41.225  41.4667  41.7925  41.83  41.855  41.8064  41.4354  41.3488
>>>
>>>
>>> julia> temp = [xc;yc]
>>>
>>> 2x8 DataArray{Float64,2}:
>>>
>>>  -73.3712  -72.1065  -73.2453  -71.9876  …  -72.7328  -72.5231  -72.8999
>>>
>>>   41.225    41.4667   41.7925   41.83        41.8064   41.4354   41.3488
>>>
>>>
>>> julia> R = pairwise(Euclidean(),temp)
>>>
>>> MethodError(At_mul_B!,(
>>>
>>> 8x8 Array{Float64,2}:
>>>
>>>  2.7273e-316   2.7273e-316   2.67478e-315  …  2.7273e-316   2.7273e-316 
>>>
>>>  2.67736e-315  2.67736e-315  2.67736e-315     2.72726e-316  2.72726e-316
>>>
>>>  2.67727e-315  2.67727e-315  2.67727e-315     2.67727e-315  2.67727e-315
>>>
>>>  2.67727e-315  2.67727e-315  2.67727e-315     2.67727e-315  2.67727e-315
>>>
>>>  4.94066e-324  4.94066e-324  4.94066e-324     9.88131e-324  4.94066e-324
>>>
>>>  2.76235e-318  2.76235e-318  2.76235e-318  …  2.76235e-318  2.76235e-318
>>>
>>>  4.94066e-324  4.94066e-324  4.94066e-324     9.88131e-324  4.94066e-324
>>>
>>>  4.94066e-324  4.94066e-324  4.94066e-324     9.88131e-324  4.94066e-324,
>>>
>>>
>>> 2x8 DataArray{Float64,2}:
>>>
>>>  -73.3712  -72.1065  -73.2453  -71.9876  …  -72.7328  -72.5231  -72.8999
>>>
>>>   41.225    41.4667   41.7925   41.83        41.8064   41.4354   41.3488,
>>>
>>>
>>> 2x8 DataArray{Float64,2}:
>>>
>>>  -73.3712  -72.1065  -73.2453  -71.9876  …  -72.7328  -72.5231  -72.8999
>>>
>>>   41.225    41.4667   41.7925   41.83        41.8064   41.4354   41.3488))
>>>
>>>
>>> I do not think that the Distance package likes the types that are passed to 
>>> the function, i.e. the vectors are DataArrays instead of Arrays. It works 
>>> just fine when I use Tony's idea:
>>>
>>>
>>>  julia> data = readcsv("ct_coord_2.csv",Float64)
>>>
>>> 8x2 Array{Float64,2}:
>>>
>>>  -73.3712  41.225 
>>>
>>>  -72.1065  41.4667
>>>
>>>  -73.2453  41.7925
>>>
>>>  -71.9876  41.83  
>>>
>>>  -72.3365  41.855 
>>>
>>>  -72.7328  41.8064
>>>
>>>  -72.5231  41.4354
>>>
>>>  -72.8999  41.3488
>>>
>>>
>>> julia> xc = data[:,1]
>>>
>>> 8-element Array{Float64,1}:
>>>
>>>  -73.3712
>>>
>>>  -72.1065
>>>
>>>  -73.2453
>>>
>>>  -71.9876
>>>
>>>  -72.3365
>>>
>>>  -72.7328
>>>
>>>  -72.5231
>>>
>>>  -72.8999
>>>
>>>
>>> julia> yc = data[:,2]
>>>
>>> 8-element Array{Float64,1}:
>>>
>>>  41.225 
>>>
>>>  41.4667
>>>
>>>  41.7925
>>>
>>>  41.83  
>>>
>>>  41.855 
>>>
>>>  41.8064
>>>
>>>  41.4354
>>>
>>>  41.3488
>>>
>>>
>>> julia> xc=xc'
>>>
>>> 1x8 Array{Float64,2}:
>>>
>>>  -73.3712  -72.1065  -73.2453  -71.9876  …  -72.7328  -72.5231  -72.8999
>>>
>>>
>>> julia> yc=yc'
>>>
>>> 1x8 Array{Float64,2}:
>>>
>>>  41.225  41.4667  41.7925  41.83  41.855  41.8064  41.4354  41.3488
>>>
>>>
>>> julia> temp = [xc;yc]
>>>
>>> 2x8 Array{Float64,2}:
>>>
>>>  -73.3712  -72.1065  -73.2453  -71.9876  …  -72.7328  -72.5231  -72.8999
>>>
>>>   41.225    41.4667   41.7925   41.83        41.8064   41.4354   41.3488
>>>
>>>
>>> julia> R = pairwise(Euclidean(),temp)
>>>
>>> 8x8 Array{Float64,2}:
>>>
>>>  0.0       1.28762   0.581327  1.51014   …  0.863479  0.873799  0.487347
>>>
>>>  1.28762   0.0       1.18451   0.382214     0.712542  0.417808  0.802085
>>>
>>>  0.581327  1.18451   0.0       1.25833      0.512668  0.805673  0.562309
>>>
>>>  1.51014   0.382214  1.25833   0.0          0.745667  0.665227  1.03141 
>>>
>>>  1.21144   0.451294  0.910982  0.349837     0.399323  0.459258  0.757372
>>>
>>>  0.863479  0.712542  0.512668  0.745667  …  0.0       0.426208  0.487124
>>>
>>>  0.873799  0.417808  0.805673  0.665227     0.426208  0.0       0.386557
>>>
>>>  0.487347  0.802085  0.562309  1.03141      0.487124  0.386557  0.0     
>>>
>>>
>>> There seems to be some issue with the Distance package not accepting Data 
>>> Frames. Of course, the readcsv works fine but this might be an issue for 
>>> others as well.
>>>
>>>
>>> Thanks,
>>>
>>> Don
>>>
>>>
>>>
>>> On Thursday, July 3, 2014 6:49:18 PM UTC-4, Patrick O'Leary wrote:
>>>>
>>>> On Thursday, July 3, 2014 5:36:23 PM UTC-5, Donald Lacombe wrote:
>>>>>
>>>>> I'm no GIS expert (I'm an applied econometrician) and the code I've 
>>>>> written seems to work. The Distance package also works with my "real" 
>>>>> data, which are the centroids of the counties in Connecticut, and I 
>>>>> tested it with Euclidean, Cityblock, and SqEuclidean.
>>>>>
>>>>
>>>> Glad you got something working. Whether those distances are accurate 
>>>> enough depends on how the points are arranged and what you plan to do with 
>>>> it--I can see where it wouldn't make much difference in this case. I can't 
>>>> let the statisticians and image processing folks have all the technical 
>>>> conversation fun in this mailing list, though!
>>>>
>>>
