Issue created: https://github.com/JuliaLang/julia/issues/9147.

On Tue, Nov 25, 2014 at 10:16 AM, Stefan Karpinski <ste...@karpinski.org>
wrote:

> It seems more reasonable to me to always zero uninitialized fields of
> composite values. This is basically free since objects larger than a memory
> page are not common.
>
> On Tue, Nov 25, 2014 at 1:13 AM, Ronald L. Rivest <rivest....@gmail.com>
> wrote:
>
>> Sorry; zeros() does not work here instead of new().  My mistake.
>> Is there a safe alternative to new() that guarantees that all fields
>> will have a definite fixed value?
>>
>> Cheers,
>> Ron
>>
>>
>> On Tuesday, November 25, 2014 1:05:40 AM UTC-5, Ronald L. Rivest wrote:
>>>
>>> The problem also exists for new() (e.g. when initializing a
>>> record/object).  zeros() can apparently be used here instead.
>>>
>>> Cheers,
>>> Ron
>>>
>>> On Tuesday, November 25, 2014 12:29:07 AM UTC-5, Viral Shah wrote:
>>>>
>>>> Much has been already said on this topic.
>>>>
>>>> The Array(...) interface was kind of meant to be low-level for the user
>>>> of scientific computing, only to be used when they know what they are
>>>> doing. You get the raw uninitialized memory as fast as possible.
>>>>
>>>> The user-facing interface was always an array constructor - zeros(),
>>>> ones(), rand(), etc. Some of this is because of our past experience coming
>>>> from a matlab/R-like world.
>>>>
>>>> As Julia has become more popular, we have realized that those not
>>>> coming from matlab/R end up using all the possible constructors. While this
>>>> has raised a variety of issues, I'd like to say that this will not get
>>>> sorted out satisfactorily before the 0.4 release. For a class that may be
>>>> taught soon, the thing to do would be to use the zeros/ones/rand
>>>> constructors to construct arrays, instead of Array(), which currently is
>>>> more for a package developer. I understand that Array() is a much better
>>>> name as Stefan points out, but zeros() is not too terrible - it at least
>>>> clearly tells the user that they get zeroed out arrays.
>>>>
>>>> While we have other "features" that can lead to unsafe code (ccall,
>>>> @inbounds), none of these are things one is likely to run into while
>>>> learning the language.
>>>>
>>>> -viral
>>>>
>>>> On Tuesday, November 25, 2014 1:00:10 AM UTC+5:30, Ronald L. Rivest
>>>> wrote:
>>>>>
>>>>> Regarding initialization:
>>>>>
>>>>>    -- I'm toying with the idea of recommending Julia for an
>>>>>       introductory programming class (rather than Python).
>>>>>
>>>>>    -- For this purpose, the language should not have hazards that
>>>>>       catch the unwary.
>>>>>
>>>>>    -- Not initializing storage is definitely a hazard.  With
>>>>>       uninitialized storage, a program may run fine one day, and
>>>>>       fail mysteriously the next, depending on the contents of
>>>>>       memory.  This is about predictability, reliability,
>>>>>       dependability, and correctness.
>>>>>
>>>>>    -- I would favor a solution like
>>>>>              A = Array(Int64,n)              -- fills with zeros
>>>>>              A = Array(Int64,n,fill=1)       -- to fill with ones
>>>>>              A = Array(Int64,n,fill=None)    -- for an uninitialized array
>>>>>       so that the *default* is an initialized array, but the speed
>>>>>       geeks can get what they want.
>>>>>
>>>>> Cheers,
>>>>> Ron
>>>>>
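The fill= proposal above can be approximated in user code. What follows is only a sketch under stated assumptions: it uses Julia 1.x syntax, where Array(T, n) is spelled Vector{T}(undef, n), and make_array is a hypothetical helper name, not anything in Base. Julia has no None, so `nothing` stands in for the opt-out value.

```julia
# Hypothetical helper (not part of Base): initialized by default,
# uninitialized only when the caller asks for it explicitly.
function make_array(::Type{T}, n::Integer; fill=zero(T)) where {T}
    A = Vector{T}(undef, n)          # raw storage, contents arbitrary
    fill === nothing && return A     # explicit opt-out: leave it uninitialized
    return fill!(A, fill)            # otherwise overwrite every element
end

A = make_array(Int64, 5)                 # all zeros
B = make_array(Int64, 5, fill=1)         # all ones
C = make_array(Int64, 5, fill=nothing)   # uninitialized; caller must fill it
```

The default costs one extra pass over the array, which is the overhead the rest of the thread argues about.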
>>>>> On Monday, November 24, 2014 1:57:14 PM UTC-5, Stefan Karpinski wrote:
>>>>>>
>>>>>> If we can make allocating zeroed arrays faster that's great, but
>>>>>> unless we can close the performance gap all the way and eliminate the
>>>>>> need to allocate uninitialized arrays altogether, this proposal is just a
>>>>>> rename – Unchecked.Array plays the exact same role as the current
>>>>>> Array constructor. It's unclear that this would even address the original
>>>>>> concern since it still *allows* uninitialized allocation of arrays. This
>>>>>> rename would just force people who have used Array correctly in code that
>>>>>> cares about being as efficient as possible even for very large arrays to
>>>>>> change their code and use Unchecked.Array instead.
>>>>>>
>>>>>> On Nov 24, 2014, at 1:36 PM, Jameson Nash <vtj...@gmail.com> wrote:
>>>>>>
>>>>>> I think that Rivest’s question may be a good reason to rethink the
>>>>>> initialization of structs and offer the explicit guarantee that all
>>>>>> unassigned elements will be initialized to 0 (and not just the jl_value_t
>>>>>> pointers). I would argue that the current behavior resulted more from a
>>>>>> desire to avoid clearing the array twice (if the user is about to call
>>>>>> fill, zeros, ones, +, etc.) than an intentional, casual exposure of
>>>>>> uninitialized memory.
>>>>>>
>>>>>> A random array of integers is also a security concern if an attacker
>>>>>> can extract some other information (with some probability) about the
>>>>>> state of the program. Julia is not hardened by design, so you can’t
>>>>>> safely run an unknown code fragment, but you still might have an
>>>>>> unintended memory exposure in a client-facing app. While zeroing memory
>>>>>> doesn’t prevent the user from simply reusing a memory buffer in a
>>>>>> security-unaware fashion (rather than consistently allocating a new one
>>>>>> for each use), it’s not clear to me that the performance penalty would
>>>>>> be all that noticeable for mapping Array(X) to zero(X) and only
>>>>>> providing an internal constructor for grabbing uninitialized memory
>>>>>> (perhaps Base.Unchecked.Array(X) from #8227).
>>>>>>
>>>>>> On Mon Nov 24 2014 at 12:57:22 PM Stefan Karpinski
>>>>>> <stefan.karpin...@gmail.com> wrote:
>>>>>>
>>>>>> There are two rather different issues to consider:
>>>>>>>
>>>>>>> 1. Preventing problems due to inadvertent programmer errors.
>>>>>>> 2. Preventing malicious security attacks.
>>>>>>>
>>>>>>> When we initially made this choice, it wasn't clear if 1 would be a
>>>>>>> big issue but we decided to see how it played out. It hasn't been a
>>>>>>> problem in practice: once people grok that the Array(T, dims...)
>>>>>>> constructor gives uninitialized memory and that the standard usage
>>>>>>> pattern is to call it and then immediately initialize the memory,
>>>>>>> everything is ok. I can't recall a single situation where someone has
>>>>>>> had some terrible bug due to uninitialized int/float arrays.
>>>>>>>
>>>>>>> Regarding 2, Julia is not intended to be a hardened language for
>>>>>>> writing highly secure software. It allows all sorts of unsafe actions:
>>>>>>> pointer arithmetic, direct memory access, calling arbitrary C functions,
>>>>>>> etc. The future of really secure software seems to be small formally
>>>>>>> verified kernels written in statically typed languages that communicate
>>>>>>> with larger unverified systems over restricted channels. Julia might be
>>>>>>> appropriate for the larger unverified system but certainly not for the
>>>>>>> trusted kernel. Adding enough verification to Julia to write secure
>>>>>>> kernels is not inconceivable, but would be a major research effort. The
>>>>>>> implementation would have to check lots of things, including, of course,
>>>>>>> ensuring that all arrays are initialized.
>>>>>>>
>>>>>>> A couple of other points:
>>>>>>>
>>>>>>> Modern OSes protect against data leaking between processes by
>>>>>>> zeroing pages before a process first accesses them. Thus any data
>>>>>>> exposed by Array(T, dims...) comes from the same process and is not a
>>>>>>> security leak.
>>>>>>>
>>>>>>> An uninitialized array of, say, integers is not in itself a security
>>>>>>> concern – the issue is what you do with those integers. The classic
>>>>>>> security hole is to use a "random" value from uninitialized memory to
>>>>>>> access other memory by using it to index into an array or otherwise
>>>>>>> convert it to a pointer. In the presence of bounds checking, however,
>>>>>>> this isn't actually a big concern since you will still either get a
>>>>>>> bounds error or a valid array value – not a meaningful one, of course,
>>>>>>> but still just a value.
>>>>>>>
>>>>>>> Writing programs that are secure against malicious attacks is a
>>>>>>> hard, unsolved problem. So is doing efficient, productive high-level
>>>>>>> numerical programming. Trying to solve both problems at the same time
>>>>>>> seems like a recipe for failing at both.
>>>>>>>
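The bounds-checking point above can be seen in a few lines. This is a sketch in Julia 1.x syntax (an assumption; the thread predates it). The garbage index is whatever happens to be in memory, so which branch you hit is not predictable, but the outcome is always one of the two described.

```julia
A = collect(101:110)

# An arbitrary Int read from uninitialized storage, used as an index.
idx = Vector{Int}(undef, 1)[1]

outcome = try
    A[idx]          # bounds-checked access
catch err
    err             # a BoundsError when idx falls outside 1:10
end

# Either a BoundsError was raised or we got some existing element of A;
# no memory outside the array was touched.
println(outcome isa BoundsError ? "bounds error" : "valid element: $outcome")
```

The guarantee holds only while bounds checking is on; @inbounds removes it.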
>>>>>>> On Nov 24, 2014, at 11:43 AM, David Smith <david...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Some ideas:
>>>>>>>
>>>>>>> Is there a way to return an error for accesses before at least one
>>>>>>> assignment in bits types?  I.e. when the object is created
>>>>>>> uninitialized it is marked "dirty" and only after assignment of some
>>>>>>> user values can it be "cleanly" accessed?
>>>>>>>
>>>>>>> Can Julia provide a thin memory management layer that grabs memory
>>>>>>> from the OS first, zeroes it, and then gives it to the user upon initial
>>>>>>> allocation?  After gc+reallocation it doesn't need to be zeroed again,
>>>>>>> unless the next allocation is larger than anything previous, at which
>>>>>>> time Julia grabs more memory, sanitizes it, and hands it off.
>>>>>>>
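The "dirty until assigned" idea above can be prototyped in user code, at the cost of one extra bit per element and a check on every read. A minimal sketch, assuming Julia 1.x syntax; TrackedVector is a hypothetical type made up for illustration, not something Julia provides.

```julia
# Hypothetical wrapper: raw storage plus a per-element "assigned" flag.
struct TrackedVector{T}
    data::Vector{T}
    assigned::BitVector
end

# Allocate n uninitialized elements, all marked as not yet assigned.
TrackedVector{T}(n::Integer) where {T} =
    TrackedVector{T}(Vector{T}(undef, n), falses(n))

function Base.getindex(v::TrackedVector, i::Integer)
    v.assigned[i] || error("read of element $i before assignment")
    return v.data[i]
end

function Base.setindex!(v::TrackedVector, x, i::Integer)
    v.data[i] = x
    v.assigned[i] = true    # element is now safe to read
    return v
end

v = TrackedVector{Int}(3)
v[1] = 7
first_value = v[1]          # fine: element 1 was assigned
read_failed = try
    v[2]                    # never assigned, so this throws
    false
catch
    true
end
```

The per-read check is exactly the kind of overhead that a numerics-oriented default would want to avoid, which may be why the thread leans toward zeroing instead.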
>>>>>>> On Monday, November 24, 2014 2:48:05 AM UTC-6, Mauro wrote:
>>>>>>>>
>>>>>>>> Pointer types will initialise to undef and any operation on them
>>>>>>>> fails:
>>>>>>>> julia> a = Array(ASCIIString, 5);
>>>>>>>>
>>>>>>>> julia> a[1]
>>>>>>>> ERROR: access to undefined reference
>>>>>>>>  in getindex at array.jl:246
>>>>>>>>
>>>>>>>> But you're right, for bits-types this is not an error and will just
>>>>>>>> return whatever was there before.  I think the reason this will stay
>>>>>>>> that way is that Julia is a numerics-oriented language.  Thus you may
>>>>>>>> want to create a 1GB array of Float64 and then fill it with something,
>>>>>>>> as opposed to first filling it with zeros and then filling it with
>>>>>>>> something.
>>>>>>>> See:
>>>>>>>>
>>>>>>>> julia> @time b = Array(Float64, 10^9);
>>>>>>>> elapsed time: 0.029523638 seconds (8000000144 bytes allocated)
>>>>>>>>
>>>>>>>> julia> @time c = zeros(Float64, 10^9);
>>>>>>>> elapsed time: 0.835062841 seconds (8000000168 bytes allocated)
>>>>>>>>
>>>>>>>> You can argue that the time gain isn't worth the risk but I suspect
>>>>>>>> that others may feel differently.
>>>>>>>>
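For readers on a current Julia, the two behaviours described above look roughly like this. It is a sketch assuming Julia 1.x, where Array(T, n) is written Vector{T}(undef, n) and ASCIIString has been folded into String. Timings are omitted because they depend on the machine and Julia version.

```julia
# Pointer (non-bits) element types: slots start out undefined and reads fail.
a = Vector{String}(undef, 5)
slot_assigned = isassigned(a, 1)      # false until something is stored
caught = try
    a[1]
catch err
    err                               # an UndefRefError
end

# Bits types: no error on read, so the usual pattern is allocate, then fill.
n = 10^6
b = Vector{Float64}(undef, n)         # contents arbitrary at this point
for i in eachindex(b)
    b[i] = i / n                      # every element written before any read
end

c = zeros(Float64, n)                 # the initialized alternative
```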
>>>>>>>> On Mon, 2014-11-24 at 09:28, Ronald L. Rivest <rives...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> > I am just learning Julia...
>>>>>>>> >
>>>>>>>> > I was quite shocked today to learn that Julia does *not*
>>>>>>>> > initialize allocated storage (e.g. to 0 or some default value).
>>>>>>>> > E.g. the code
>>>>>>>> >      A = Array(Int64,5)
>>>>>>>> >      println(A[1])
>>>>>>>> > has unpredictable behavior, may disclose information from
>>>>>>>> > other modules, etc.
>>>>>>>> >
>>>>>>>> > This is really quite unacceptable in a modern programming
>>>>>>>> > language; it is as bad as not checking array reads for
>>>>>>>> > out-of-bounds indices.
>>>>>>>> >
>>>>>>>> > Google for "uninitialized security" to find numerous instances
>>>>>>>> > of security violations and unreliability problems caused by the
>>>>>>>> > use of uninitialized variables, and numerous security advisories
>>>>>>>> > warning of problems caused by the (perhaps inadvertent) use
>>>>>>>> > of uninitialized variables.
>>>>>>>> >
>>>>>>>> > You can't design a programming language today under the naive
>>>>>>>> > assumption that code in that language won't be used in highly
>>>>>>>> > critical applications or won't be under adversarial attack.
>>>>>>>> >
>>>>>>>> > You can't reasonably ask all programmers to properly initialize
>>>>>>>> > their allocated storage manually any more than you can ask them
>>>>>>>> > to test all indices before accessing an array manually; these are
>>>>>>>> > things that a high-level language should do for you.
>>>>>>>> >
>>>>>>>> > The default non-initialization of allocated storage is a
>>>>>>>> > mis-feature that should absolutely be fixed.
>>>>>>>> >
>>>>>>>> > There is no efficiency argument here in favor of uninitialized
>>>>>>>> > storage that can outweigh the security and reliability
>>>>>>>> > disadvantages...
>>>>>>>> >
>>>>>>>> > Cheers,
>>>>>>>> > Ron Rivest
>>>>>>>>
>>>>>>
>>>>>>
>
