* IfElse now works on sparse variables.
* IfElse makes the use of GPU shared variables more transparent via
  theano.function's updates and givens parameters.
* Added a_tensor.transpose(axes); axes is optional (James)
  (this and several other API additions below are sketched after this list)
* theano.tensor.transpose(a_tensor, kwargs): we were ignoring kwargs; it is
  now used as the axes.
* a_CudaNdarray_object[*] = int now works (Frederic)
* tensor_variable.size (as in numpy) computes the product of the shape
  elements. (Olivier)
* sparse_variable.size (as in scipy) computes the number of stored values.
  (Olivier)
* sparse_variable[N, N] now works (Li Yao, Frederic)
* sparse_variable[M:N, O:P] now works (Li Yao, Frederic, Pascal)
  M, N, O, and P can be Python ints, scalar tensor variables, None, or
  omitted (sparse_variable[:, :M] and sparse_variable[:M, N:] both work).
* tensor.tensordot can now be moved to the GPU (Sander Dieleman,
Pascal, based on code from Tijmen Tieleman's gnumpy,
http://www.cs.toronto.edu/~tijmen/gnumpy.html)
* Many infer_shape methods implemented for sparse matrix ops. (David W.F.)
* Added theano.sparse.verify_grad_sparse to easily allow testing the grad of
  sparse ops. It supports testing both the full and the structured gradient.
* The keys in our cache now store the hash of constants and not the constant
  values themselves. This is significantly more efficient for big constant
  arrays. (Frederic B.)
* 'theano-cache list' lists key files bigger than 1M (Frederic B.)
* 'theano-cache list' prints a histogram of the number of keys per compiled
  module (Frederic B.)
* 'theano-cache list' prints the number of compiled modules per op class
  (Frederic B.)
* The Theano flag "nvcc.fastmath" is now also used for the cuda_ndarray.cu
  file.
* Add the header_dirs to the hard part of the compilation key. This is
  currently used only by cuda, but if we use libraries that are header-only,
  this can be useful. (Frederic B.)
* Alloc, GpuAlloc are not always pre-computed (constant_folding optimization)
  at compile time if all their inputs are constant.
  (Frederic B., Pascal L., reported by Sander Dieleman)
* New Op tensor.sort(), wrapping numpy.sort (Hani Almousli)
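
A minimal sketch of several of the API additions above (transpose with axes,
.size, tensor.sort, IfElse, and sparse indexing); the variable names, shapes,
and values are illustrative assumptions, not part of this release, and the
sparse part assumes scipy is installed::

    import theano
    import theano.sparse
    import theano.tensor as T
    from theano.ifelse import ifelse

    x = T.tensor3('x')
    xt = x.transpose(2, 0, 1)      # axes argument; plain x.transpose() also works
    n = x.size                     # product of the shape elements, as in numpy
    m = T.matrix('m')
    ms = T.sort(m)                 # new Op wrapping numpy.sort
    c = T.scalar('c')
    z = ifelse(T.gt(c, 0), m, -m)  # IfElse, which now also works on sparse

    s = theano.sparse.csr_matrix('s')
    elem = s[1, 2]                 # single-element sparse indexing
    block = s[1:3, :2]             # slice bounds: Python ints, scalar tensor
                                   # variables, None, or omitted
    nnz = s.size                   # number of stored values, as in scipy
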
New optimizations:
* AdvancedSubtensor1 reuses preallocated memory if available (scan, c|py_nogc
linker) (Frederic)
* dot22, dot22scalar work with complex numbers. (Frederic)
* Generate Gemv/Gemm more often. (James)
* Remove scan when all computations can be moved outside the loop. (Razvan)
* scan optimization done earlier. This allows other optimizations to be
applied. (Frederic, Guillaume, Razvan)
* exp(x) * sigmoid(-x) is now correctly optimized to the more stable form
  sigmoid(x) (see the identity check after this list). (Olivier)
* Added Subtensor(Rebroadcast(x)) => Rebroadcast(Subtensor(x)) optimization.
(Guillaume)
* Made the optimization process faster. (James)
* Allow fusion of elemwise when the scalar op needs support code. (James)
* Better opt that lifts transpose around dot. (James)
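
The stability rewrite above follows from the identity
exp(x) * sigmoid(-x) = exp(x) / (1 + exp(x)) = 1 / (1 + exp(-x)) = sigmoid(x).
A quick numerical check (illustrative only, not part of the release)::

    import numpy

    def sigmoid(v):
        return 1.0 / (1.0 + numpy.exp(-v))

    x = numpy.linspace(-5, 5, 11)
    # The left-hand side overflows for large positive x, which is why the
    # rewritten form is more stable; on this modest range both sides agree.
    assert numpy.allclose(numpy.exp(x) * sigmoid(-x), sigmoid(x))
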
Crashes fixed:
* T.mean crash at graph building time. (Ian)
* "Interactive debugger" crash fix. (Ian, Frederic)
* Do not call gemm with strides of 0; some BLAS implementations refuse it.
  (Pascal Lamblin)
* Optimization crash with gemm and complex. (Frederic)
* GPU crash with elemwise. (Frederic, some reported by Chris Currivan)
* Compilation crash with amdlibm and the GPU. (Frederic)
* IfElse crash. (Frederic)
* Execution crash fix in AdvancedSubtensor1 on 32 bit computers. (Pascal)
* GPU compilation crash on MacOS X. (Olivier)
* Support for OSX Enthought Python Distribution 7.x. (Graham Taylor, Olivier)
* Crash when the subtensor inputs had 0 dimensions and the outputs had
  0 dimensions. (Frederic)
* Crash when the step to subtensor was not 1 in conjunction with some
optimization. (Frederic, reported by Olivier Chapelle)
* Runtime crash related to an optimization with subtensor of alloc (reported
by Razvan, fixed by Frederic)
* Fix dot22scalar cast of integer scalars (Justin Bayer, Frédéric, Olivier)
* Fix runtime crash in gemm, dot22. (Frederic B.)
* Fix on 32-bit computers: make sure all shapes are int64. (Olivier)
* Fix to deque on Python 2.4. (Olivier)
* Fix crash when not using C code (or when using DebugMode, which is not
  used by default) with numpy 1.6.*. Numpy has a bug in its reduction code
  that made it crash. (Pascal)
* Crashes of BLAS functions (Gemv on CPU; Ger, Gemv and Gemm on GPU)
when matrices had non-unit stride in both dimensions (CPU and GPU),
or when matrices had negative strides (GPU only). In those cases,
we are now making copies. (Pascal)
* More cases supported in AdvancedIncSubtensor1. (Olivier D.)
* Fix crash when a broadcasted constant was used as input of an
  elemwise Op and needed to be upcasted to match the op's output.
  (Reported by John Salvatier, fixed by Pascal L.)
* Fixed a memory leak with shared variables (we kept a pointer to the
  original value). (Ian G.)
Known bugs:
* CAReduce with nan in inputs does not return the correct output (`Ticket
  <https://www.assembla.com/spaces/the