Re: [julia-users] Security problem with unitialized memory

Ronald L. Rivest Mon, 24 Nov 2014 22:14:39 -0800

Sorry; zeros() does not work here instead of new().  My mistake.
Is there a safe alternative to new() that guarantees that all fields
will have a definite fixed value?


Cheers,
Ron
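
On the question above about a safe alternative to new(): one possible
approach, sketched here under assumptions, is an inner constructor that
passes a value for every field to new(), so that no field is left
uninitialized. The type name Counter and its fields are hypothetical, and
the syntax is that of current Julia 1.x, which postdates this thread (in
0.3 the keyword would be `type` rather than `mutable struct`).

```julia
# Hypothetical record type; the name and fields are illustrative only.
mutable struct Counter
    count::Int64
    total::Float64
    # Inner constructor: every field gets an explicit value via new(),
    # so nothing holds whatever happened to be in memory beforehand.
    Counter() = new(0, 0.0)
end

c = Counter()
println(c.count, " ", c.total)
```

Calling new() with fewer arguments than there are fields is what leaves
the trailing bits-type fields undefined, so the safeguard here is a
convention the type's author must follow, not a language guarantee.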

On Tuesday, November 25, 2014 1:05:40 AM UTC-5, Ronald L. Rivest wrote:
>
> The problem also exists for new() (e.g. when initializing a
> record/object).  zeros() can apparently be used here instead.
>
> Cheers,
> Ron
>
> On Tuesday, November 25, 2014 12:29:07 AM UTC-5, Viral Shah wrote:
>>
>> Much has been already said on this topic. 
>>
>> The Array(...) interface was kind of meant to be low-level for the user 
>> of scientific computing, only to be used when they know what they are 
>> doing. You get the raw uninitialized memory as fast as possible.
>>
>> The user-facing interface was always an array constructor - zeros(), 
>> ones(), rand(), etc. Some of this is because of our past experience coming 
>> from a matlab/R-like world. 
>>
>> As Julia has become more popular, we have realized that those not coming 
>> from matlab/R end up using all the possible constructors. While this has 
>> raised a variety of issues, I'd like to say that this will not get sorted 
>> out satisfactorily before the 0.4 release. For a class that may be taught 
>> soon, the thing to do would be to use the zeros/ones/rand constructors to 
>> construct arrays, instead of Array(), which currently is more for a package 
>> developer. I understand that Array() is a much better name as Stefan points 
>> out, but zeros() is not too terrible - it at least clearly tells the user 
>> that they get zeroed out arrays.
>>
>> While we have other "features" that can lead to unsafe code (ccall, 
>> @inbounds), none of these are things one is likely to run into while 
>> learning the language.
>>
>> -viral
>>
>> On Tuesday, November 25, 2014 1:00:10 AM UTC+5:30, Ronald L. Rivest wrote:
>>>
>>> Regarding initialization:
>>>
>>>    -- I'm toying with the idea of recommending Julia for an introductory
>>>       programming class (rather than Python).
>>>
>>>    -- For this purpose, the language should not have hazards that catch
>>>       the unwary.
>>>
>>>    -- Not initializing storage is definitely a hazard.  With uninitialized
>>>       storage, a program may run fine one day, and fail mysteriously the
>>>       next, depending on the contents of memory.  This is about
>>>       predictability, reliability, dependability, and correctness.
>>>
>>>    -- I would favor a solution like
>>>              A = Array(Int64,n)              -- fills with zeros
>>>              A = Array(Int64,n,fill=1)       -- to fill with ones
>>>              A = Array(Int64,n,fill=None)    -- for an uninitialized array
>>>       so that the *default* is an initialized array, but the speed geeks
>>>       can get what they want.
>>>
>>> Cheers,
>>> Ron
>>>
>>> On Monday, November 24, 2014 1:57:14 PM UTC-5, Stefan Karpinski wrote:
>>>>
>>>> If we can make allocating zeroed arrays faster that's great, but unless
>>>> we can close the performance gap all the way and eliminate the need to
>>>> allocate uninitialized arrays altogether, this proposal is just a rename
>>>> – Unchecked.Array plays the exact same role as the current Array
>>>> constructor. It's unclear that this would even address the original
>>>> concern since it still *allows* uninitialized allocation of arrays. This
>>>> rename would just force people who have used Array correctly, in code
>>>> that cares about being as efficient as possible even for very large
>>>> arrays, to change their code and use Unchecked.Array instead.
>>>>
>>>> On Nov 24, 2014, at 1:36 PM, Jameson Nash <vtj...@gmail.com> wrote:
>>>>
>>>> I think that Rivest’s question may be a good reason to rethink the 
>>>> initialization of structs and offer the explicit guarantee that all 
>>>> unassigned elements will be initialized to 0 (and not just the jl_value_t 
>>>> pointers). I would argue that the current behavior resulted more from a 
>>>> desire to avoid clearing the array twice (if the user is about to call 
>>>> fill, zeros, ones, +, etc.) than an intentional, casual exposure of 
>>>> uninitialized memory.
>>>>
>>>> A random array of integers is also a security concern if an attacker
>>>> can extract some other information (with some probability) about the state
>>>> of the program. Julia is not hardened by design, so you can’t safely run
>>>> an unknown code fragment, but you still might have an unintended memory
>>>> exposure in a client-facing app. While zeroing memory doesn’t prevent the
>>>> user from simply reusing a memory buffer in a security-unaware fashion
>>>> (rather than consistently allocating a new one for each use), it’s not
>>>> clear to me that the performance penalty would be all that noticeable for
>>>> mapping Array(X) to zero(X), and only providing an internal constructor
>>>> for grabbing uninitialized memory (perhaps Base.Unchecked.Array(X) from
>>>> #8227).
>>>>
>>>> On Mon Nov 24 2014 at 12:57:22 PM Stefan Karpinski
>>>> <stefan.karpin...@gmail.com> wrote:
>>>>
>>>> There are two rather different issues to consider:
>>>>>
>>>>> 1. Preventing problems due to inadvertent programmer errors.
>>>>> 2. Preventing malicious security attacks.
>>>>>
>>>>> When we initially made this choice, it wasn't clear if 1 would be a
>>>>> big issue but we decided to see how it played out. It hasn't been a
>>>>> problem in practice: once people grok that the Array(T, dims...)
>>>>> constructor gives uninitialized memory and that the standard usage
>>>>> pattern is to call it and then immediately initialize the memory,
>>>>> everything is ok. I can't recall a single situation where someone has
>>>>> had some terrible bug due to uninitialized int/float arrays.
>>>>>
>>>>> Regarding 2, Julia is not intended to be a hardened language for
>>>>> writing highly secure software. It allows all sorts of unsafe actions:
>>>>> pointer arithmetic, direct memory access, calling arbitrary C functions,
>>>>> etc. The future of really secure software seems to be small formally
>>>>> verified kernels written in statically typed languages that communicate
>>>>> with larger unverified systems over restricted channels. Julia might be
>>>>> appropriate for the larger unverified system but certainly not for the
>>>>> trusted kernel. Adding enough verification to Julia to write secure
>>>>> kernels is not inconceivable, but would be a major research effort. The
>>>>> implementation would have to check lots of things, including, of course,
>>>>> ensuring that all arrays are initialized.
>>>>>
>>>>> A couple of other points:
>>>>>
>>>>> Modern OSes protect against data leaking between processes by zeroing 
>>>>> pages before a process first accesses them. Thus any data exposed by 
>>>>> Array(T, dims...) comes from the same process and is not a security leak.
>>>>>
>>>>> An uninitialized array of, say, integers is not in itself a security
>>>>> concern – the issue is what you do with those integers. The classic
>>>>> security hole is to use a "random" value from uninitialized memory to
>>>>> access other memory by using it to index into an array or otherwise
>>>>> convert it to a pointer. In the presence of bounds checking, however,
>>>>> this isn't actually a big concern since you will still either get a
>>>>> bounds error or a valid array value – not a meaningful one, of course,
>>>>> but still just a value.
>>>>>
>>>>> Writing programs that are secure against malicious attacks is a hard, 
>>>>> unsolved problem. So is doing efficient, productive high-level numerical 
>>>>> programming. Trying to solve both problems at the same time seems like a 
>>>>> recipe for failing at both.
>>>>>
>>>>> On Nov 24, 2014, at 11:43 AM, David Smith <david...@gmail.com> wrote:
>>>>>
>>>>> Some ideas:
>>>>>
>>>>> Is there a way to return an error for accesses before at least one
>>>>> assignment in bits types?  I.e. when the object is created uninitialized
>>>>> it is marked "dirty" and only after assignment of some user values can
>>>>> it be "cleanly" accessed?
>>>>>
>>>>> Can Julia provide a thin memory management layer that grabs memory
>>>>> from the OS first, zeroes it, and then gives it to the user upon initial
>>>>> allocation?  After gc+reallocation it doesn't need to be zeroed again,
>>>>> unless the next allocation is larger than anything previous, at which
>>>>> time Julia grabs more memory, sanitizes it, and hands it off.
>>>>>
>>>>> On Monday, November 24, 2014 2:48:05 AM UTC-6, Mauro wrote:
>>>>>>
>>>>>> Pointer types will initialise to undef and any operation on them fails:
>>>>>> julia> a = Array(ASCIIString, 5); 
>>>>>>
>>>>>> julia> a[1] 
>>>>>> ERROR: access to undefined reference 
>>>>>>  in getindex at array.jl:246 
>>>>>>
>>>>>> But you're right, for bits-types this is not an error and will just
>>>>>> return whatever was there before.  I think the reason this will stay
>>>>>> that way is that Julia is a numerics-oriented language.  Thus you may
>>>>>> want to create a 1GB array of Float64 and then fill it with something,
>>>>>> as opposed to first filling it with zeros and then filling it with
>>>>>> something. See:
>>>>>>
>>>>>> julia> @time b = Array(Float64, 10^9); 
>>>>>> elapsed time: 0.029523638 seconds (8000000144 bytes allocated) 
>>>>>>
>>>>>> julia> @time c = zeros(Float64, 10^9); 
>>>>>> elapsed time: 0.835062841 seconds (8000000168 bytes allocated) 
>>>>>>
>>>>>> You can argue that the time gain isn't worth the risk but I suspect
>>>>>> that others may feel differently.
>>>>>>
>>>>>> On Mon, 2014-11-24 at 09:28, Ronald L. Rivest <rives...@gmail.com> 
>>>>>> wrote: 
>>>>>> > I am just learning Julia... 
>>>>>> > 
>>>>>> > I was quite shocked today to learn that Julia does *not* 
>>>>>> > initialize allocated storage (e.g. to 0 or some default value). 
>>>>>> > E.g. the code 
>>>>>> >      A = Array(Int64,5) 
>>>>>> >      println(A[1]) 
>>>>>> > has unpredictable behavior, may disclose information from 
>>>>>> > other modules, etc. 
>>>>>> > 
>>>>>> > This is really quite unacceptable in a modern programming
>>>>>> > language; it is as bad as not checking array reads for
>>>>>> > out-of-bounds indices.
>>>>>> > 
>>>>>> > Google for "uninitialized security" to find numerous instances 
>>>>>> > of security violations and unreliability problems caused by the 
>>>>>> > use of uninitialized variables, and numerous security advisories 
>>>>>> > warning of problems caused by the (perhaps inadvertent) use 
>>>>>> > of uninitialized variables. 
>>>>>> > 
>>>>>> > You can't design a programming language today under the naive 
>>>>>> > assumption that code in that language won't be used in highly 
>>>>>> > critical applications or won't be under adversarial attack. 
>>>>>> > 
>>>>>> > You can't reasonably ask all programmers to properly initialize 
>>>>>> > their allocated storage manually any more than you can ask them 
>>>>>> > to test all indices before accessing an array manually; these are 
>>>>>> > things that a high-level language should do for you. 
>>>>>> > 
>>>>>> > The default non-initialization of allocated storage is a 
>>>>>> > mis-feature that should absolutely be fixed. 
>>>>>> > 
>>>>>> > There is no efficiency argument here in favor of uninitialized
>>>>>> > storage that can outweigh the security and reliability
>>>>>> > disadvantages...
>>>>>> > 
>>>>>> > Cheers, 
>>>>>> > Ron Rivest 
>>>>>>
>>>>>> 
>>>>
>>>>
