Thank you, that makes sense. Looking forward to 0.4!
On Wednesday, September 16, 2015 at 10:20:50 PM UTC-4, Simon Kornblith wrote:
>
> \ is supposed to use pivoted QR by default, which is a standard way to
> solve least squares when X is not full rank. It gives me a LAPACKException
> on 0.3 too, but it works for me on 0.4:
>
> julia> X2 = [1 2 3;
>              1 2 5
>              1 2 -8
>              2 4 23];
>
> julia> y = [2.3, 5.6, 9.2, 10.5];
>
> julia> X2\y
> 3-element Array{Float64,1}:
>   1.28112
>   2.56223
>  -0.146502
>
> You can explicitly tell Julia to solve using pivoted QR, which works on 0.3:
>
> julia> qrfact(X2, pivot=true)\y  # on Julia 0.4: qrfact(X2, Val{true})\y
> 3-element Array{Float64,1}:
>   1.28112
>   2.56223
>  -0.146502
>
> Or you can solve using SVD:
>
> julia> svdfact(X2)\y
> 3-element Array{Float64,1}:
>   1.28112
>   2.56223
>  -0.146502
>
> Pivoted QR should be slightly faster than SVD, but both should be much
> faster than rref.
>
> Simon
>
> On Wednesday, September 16, 2015 at 10:02:49 AM UTC-4, Cedric St-Jean wrote:
>>
>> Sorry, "singular" was the wrong word. I meant that the matrix does not
>> have full rank: the columns are not linearly independent.
>>
>> # Linearly independent columns
>> X1 = [1 2 3;
>>       1 2 5
>>       1 2 -8
>>       1 4 23]
>> # Linearly dependent columns (4th row changed)
>> X2 = [1 2 3;
>>       1 2 5
>>       1 2 -8
>>       2 4 23]
>> y = [2.3, 5.6, 9.2, 10.5]
>>
>> X1 \ y
>> > 3-element Array{Float64,1}:
>> >  -8.18265
>> >   6.94133
>> >  -0.394898
>>
>> X2 \ y
>> > LAPACKException(2)
>>
>> This is because there is no unique solution to the linear regression
>> problem when the matrix does not have full rank. The usual solution is
>> regularization (e.g. ridge regression), but selecting a linearly
>> independent subset of the columns seems like a valid strategy too, if only
>> prediction accuracy matters.
>>
>> Hence: isn't that a valid non-pedagogical use of rref?
>>
>> Cédric
>>
>> On Wednesday, September 16, 2015 at 12:13:55 AM UTC-4, Steven G. Johnson wrote:
>>>
>>> On Tuesday, September 15, 2015 at 10:22:47 AM UTC-4, Cedric St-Jean wrote:
>>>>
>>>> In the discussion of issue 9804
>>>> <https://github.com/JuliaLang/julia/pull/9804#issuecomment-140404574>,
>>>> the general consensus seemed to be:
>>>>
>>>> > Right. "rref" has no use in real numerical coding, but every use in
>>>> > pedagogy.
>>>>
>>>> I had a linear regression yield a PosDefError (the X matrix contains
>>>> only integers and was singular). I solved it by selecting a linearly
>>>> independent subset of columns
>>>> <http://math.stackexchange.com/questions/199125/how-to-remove-linearly-dependent-rows-cols>.
>>>>
>>>> This uses the Row-Echelon-Form, rref. Is there a simpler way of doing
>>>> this that I'm missing?
>>>
>>> A \ x, where A is a non-square matrix, finds the least-squares solution
>>> via QR. This is normally the "right" way to do regression.
>>>
>>> (You can also compute the QR factorization of A explicitly to get a rank
>>> estimate; google "rank revealing QR" for more info on this sort of thing.)
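
[Editor's note, for readers landing here today: the `qrfact`/`svdfact` API above is from Julia 0.3/0.4 and no longer exists. A sketch of the same computations on modern Julia (assuming ≥ 1.7, where the pivoted-QR flag is `ColumnNorm()`; on earlier 1.x it was `Val(true)`), including the pivot-based column selection that replaces the rref approach from the original question:]

```julia
using LinearAlgebra

X2 = [1 2 3;
      1 2 5;
      1 2 -8;
      2 4 23]          # rank 2: column 2 is twice column 1
y = [2.3, 5.6, 9.2, 10.5]

# Rank-deficient least squares via pivoted QR and via SVD
# (qrfact(X2, Val{true}) and svdfact(X2) in the thread's 0.4 syntax).
beta_qr  = qr(X2, ColumnNorm()) \ y
beta_svd = svd(X2) \ y

# Rank-revealing QR also yields a linearly independent column subset,
# which is the rref use case from the question: the first rank(X2)
# entries of the pivot vector F.p index independent columns.
F = qr(X2, ColumnNorm())
r = rank(X2)               # numerical rank (2 here)
indep = sort(F.p[1:r])     # indices of a linearly independent subset
X_sub = X2[:, indep]
beta_sub = X_sub \ y       # ordinary full-rank least squares
```

Both factorization-based solves produce the same fitted values `X2 * beta`, and regressing on the independent column subset attains the same least-squares residual, since the subset spans the same column space.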