[julia-users] Re: Array/Cell - a useful distinction, or not?

Ivar Nesje Tue, 29 Apr 2014 07:54:46 -0700

Sorry for nitpicking, but point 3 is wrong, and it might cause trouble in 
the following discussion.


f{T<:Real}(a::Array{T})
Matches any array with an element type that is a subtype of Real, including 
Real itself (e.g. Integer[1,2,BigInt(44)] and Real[1, 3.4])
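
A small sketch of the point, written in current Julia syntax (an assumption: 
the `where` form replaces the f{T<:Real}(...) spelling used in this thread). 
The fallback methods and the variable names are made up purely for 
illustration:

```julia
# Modern spelling of f{T<:Real}(a::Array{T}).
f(a::Array{T}) where {T<:Real} = "parametric"
# Hypothetical fallback, only here to show which arrays the method above rejects.
f(a::Array) = "fallback"

# Matches only arrays whose element type is exactly Real.
g(a::Array{Real}) = "exactly Array{Real}"
g(a::Array) = "fallback"

concrete      = Float64[1.0, 2.0]          # concrete element type
abstract_real = Real[1, 3.4]               # abstract element type Real
abstract_int  = Integer[1, 2, BigInt(44)]  # abstract element type Integer
strings       = ["a", "b"]                 # element type is not a Real

results_f = [f(concrete), f(abstract_real), f(abstract_int), f(strings)]
results_g = [g(concrete), g(abstract_real), g(abstract_int)]
```

Under these assumptions f picks the parametric method for all three numeric 
arrays, while g accepts only the Real[...] one, so the Union from point 3 
should not be needed.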

I have had trouble with this too, but now that I somewhat understand the 
rationale, I'm less frustrated. I'm not at the point of defending the 
current behaviour (yet), so others will have to do that (again).


On Tuesday, 29 April 2014 at 11:38:33 UTC+2, Oliver Woodford wrote:
>
> A habitual MATLAB user, I've been trying out Julia over the last two weeks 
> to see if it might be a suitable replacement. I want something that is as 
> fast to develop in, but has a much faster runtime. Perhaps I'll write 
> about my general thoughts in another post. However, in this thread I want 
> to address one linguistic thing I found confusing.
>
> Ignoring subarrays, dense/sparse arrays, there are two main types of array 
> in Julia. I will call them homogenous and heterogenous. Homogenous arrays 
> are declared as having all elements be the same type: e.g. Array{Float64}. 
> They are efficient to store in memory, as the elements are simply laid out 
> consecutively in memory. Heterogenous arrays have an abstract element type, 
> e.g. Array{Real}. The way Julia interprets this is that every element must 
> be a concrete subtype of Real, but that they don't have to be the same 
> type. Each element can therefore be a different type, with different 
> storage requirements, so these arrays contain a pointer to each element, 
> which is then stored somewhere else - this carries a massive overhead. In 
> MATLAB these arrays would be termed an array and a cell array respectively, 
> so there is a clear distinction. What I found confusing with Julia is that 
> the distinction is less clear.
>
> This confusion was highlighted in a stackoverflow question 
> <http://stackoverflow.com/questions/23326848/julia-arrays-with-abstract-parameters-cause-errors-but-variables-with-abstract>,
> which I'll outline again now:
>
> f(x::Real) = x is equivalent to f{T<:Real}(x::T) = x, but f(x::Array{Real}) 
> = x is different from f{T<:Real}(x::Array{T}) = x.
>
> The second form for input arrays, requiring static parameters, is needed 
> to declare that the array is homogenous, not heterogenous. This seems a 
> funny way of doing things to me because:
> 1. The homogeneity/heterogeneity of the array is a characteristic of the 
> array only, and not of the function
> 2. The static parameter T is not required anywhere else, and the Julia 
> style guide 
> <http://julia.readthedocs.org/en/latest/manual/style-guide/#don-t-use-unnecessary-static-parameters>
> explicitly counsels against the use of such parameters, where they are 
> unnecessary.
> 3. To declare a function which can take homogenous or heterogenous arrays, 
> I believe you'd have to do something like  f{T<:Real}(x::Union(Array{T}, 
> Array{Real})) = x, which seems totally bizarre (due to point 1).
>
> What I would advocate instead is two types of array, one homogenous, one 
> heterogenous. Array for homogenous and Cell for heterogenous would work. 
> It would do away with the need for static parameters in this case, and 
> also, in my view, make people far more aware of when they are using the 
> different types of array. I suspect many beginners are oblivious to the 
> distinction, currently.
>
> In the stackoverflow question, someone suggested two points against this:
> 1. Having an array whose elements are all guaranteed to be some subtype of 
> Real is not particularly useful without specifying which subtype since 
> without that information almost no structural information is being provided 
> to the compiler (e.g. about memory layout, etc.)
> Well, firstly I disagree; there is a lot of structural information being 
> supplied - to read each element, the compiler knows that it just needs to 
> compute an offset, rather than compute an offset, read a pointer and then 
> read another memory location. However, I don't think this is exploited 
> (though it could be) because the function will be recompiled from scratch 
> for each element type. Secondly, this isn't about helping the compiler, 
> it's about making the language more consistent and sensible - helping the 
> *user*.
> 2. You almost always pass homogenous arrays of a concrete type as 
> arguments anyway and the compiler is able to specialize on that.
> Firstly, homogenous arrays that you pass in *always* have a concrete 
> type. Secondly, you don't always know what that type will be. It might be 
> Float64 or UInt8, etc.
>
> I haven't yet heard a convincing counterargument to it making more sense 
> to distinguish homogenous and heterogenous arrays by the array type rather 
> than by static function parameter.
>
> Let the discussion begin...
>
