[jira] [Commented] (MATH-1403) Collinearity test: QR Decomposition rank incorrect (SVD ok)

2017-02-27 Thread Gilles (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885855#comment-15885855
 ] 

Gilles commented on MATH-1403:
--

Jama's documentation for rank() says:
{noformat}
public int rank()

Matrix rank

Returns:
effective numerical rank, obtained from SVD.
{noformat}
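
For context, an "effective numerical rank" obtained from an SVD is usually the number of singular values that exceed a small tolerance. Below is a minimal sketch of that counting rule, assuming the common tolerance max(m, n) * sigma_max * eps. The class name, the method signature and the sample singular values are made up for illustration, and whether Jama or Commons Math use exactly this tolerance is not confirmed here.

```java
class SvdRankSketch {

    /**
     * Counts the singular values above a relative tolerance.
     * Assumes sigma holds the singular values of an m x n matrix,
     * sorted in descending order (so sigma[0] is the largest).
     */
    static int rank(double[] sigma, int m, int n) {
        if (sigma.length == 0) {
            return 0;
        }
        double eps = Math.ulp(1.0); // 2^-52, double-precision machine epsilon
        // Assumed tolerance; other libraries may scale it differently.
        double tol = Math.max(m, n) * sigma[0] * eps;
        int r = 0;
        for (double s : sigma) {
            if (s > tol) {
                r++;
            }
        }
        return r;
    }

    public static void main(String[] args) {
        // Made-up singular values: two significant ones and one at round-off level.
        double[] sigma = {3.0, 1.0, 1e-17};
        System.out.println("SVD rank: " + rank(sigma, 5, 6)); // prints "SVD rank: 2"
    }
}
```

With this rule the threshold is tied to the largest singular value and the matrix size, so the caller does not have to pick one by hand.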


> Collinearity test: QR Decomposition rank incorrect (SVD ok)
> ---
>
> Key: MATH-1403
> URL: https://issues.apache.org/jira/browse/MATH-1403
> Project: Commons Math
> Issue Type: Bug
> Affects Versions: 3.6.1
> Environment: Linux ubuntu, JDK 8
> Reporter: Hugo Ferrira
>
> Hello,
> I am aware that such a question has been asked before, but I cannot seem to 
> solve this issue for a very simple example. The closest example I have is:
> https://issues.apache.org/jira/browse/MATH-1100
> from which I could not get an answer.
> I am trying to copy an algorithm from R's Caret package that identifies 
> collinear columns of a matrix [1]. I am assuming a "long" matrix and am 
> using the trivial example from the reference above. However, I cannot get this 
> to work because the QR's rank result is incorrect.
> I have the following example:
> import org.apache.commons.math3.linear.RealMatrix;
> import org.apache.commons.math3.linear.RRQRDecomposition;
> import org.apache.commons.math3.linear.Array2DRowRealMatrix;
> import org.apache.commons.math3.linear.SingularValueDecomposition;
> public class QRIssue {
>   public static void main(String[] args) {
>     // each array below is stored as one row of m
>     double[][] am = new double[5][];
>     double[] c1 = new double[] {1.0, 1.0, 1.0, 1.0, 1.0, 1.0};
>     double[] c2 = new double[] {1.0, 1.0, 1.0, 0.0, 0.0, 0.0};
>     double[] c3 = new double[] {0.0, 0.0, 0.0, 1.0, 1.0, 1.0};
>     double[] c4 = new double[] {1.0, 0.0, 0.0, 1.0, 0.0, 0.0};
>     double[] c6 = new double[] {0.0, 0.0, 1.0, 0.0, 0.0, 1.0};
>     am[0] = c1;
>     am[1] = c2;
>     am[2] = c3;
>     am[3] = c4;
>     am[4] = c6;
>     double threshold = 1e-1;
>     // use array, don't copy
>     Array2DRowRealMatrix m = new Array2DRowRealMatrix(am, false);
>     RRQRDecomposition qr = new RRQRDecomposition(m, threshold);
>     RealMatrix r = qr.getR();
>     int numColumns = r.getColumnDimension();
>     int rank = qr.getRank(threshold);
>     System.out.println("QR rank: " + rank);
>     System.out.println("QR is singular: " + !qr.getSolver().isNonSingular());
>     // singular when the rank is not equal to the number of columns
>     System.out.println("QR is singular: " + (numColumns != rank));
>     SingularValueDecomposition sv2 = new SingularValueDecomposition(m);
>     System.out.println("SVD rank: " + sv2.getRank());
>   }
> }
> For SVD I get a rank of 4, which is correct (columns 0, 1, 2 are collinear: 
> c0 = c1 + c2). But for QR I get 5. I have tried several thresholds with no 
> success. For several subsets of the columns above (for example, only 0, 1, 2) 
> I get the correct answer. What am I doing wrong?
> TIA,
> Hugo F.
> 1. https://topepo.github.io/caret/pre-processing.html#lindep



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (MATH-1403) Collinearity test: QR Decomposition rank incorrect (SVD ok)

2017-02-27 Thread Gilles (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885839#comment-15885839
 ] 

Gilles commented on MATH-1403:
--

bq. Unfortunately I am not knowledgeable enough to tackle this task.

You could start by finding a reference algorithm (in a scientific textbook 
or paper) or other code that implements the functionality, and then figure 
out where the key differences are.
Unfortunately, the Javadoc is out of sync: it refers to Jama having this 
same algorithm, whereas it [does not|http://math.nist.gov/javanumerics/jama/doc/].


[jira] [Commented] (MATH-1403) Collinearity test: QR Decomposition rank incorrect (SVD ok)

2017-02-27 Thread Hugo Ferrira (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885640#comment-15885640
 ] 

Hugo Ferrira commented on MATH-1403:


Hello Gilles,

Thanks for the feedback. Unfortunately I am not knowledgeable enough to tackle 
this task.

Finally, I confirmed that the original R code uses the BLAS library. Its 
implementation is also a rank-revealing QR decomposition. What I find 
interesting is that the rank value is obtained after the decomposition and 
no explicit function is called. So these don't seem to be implementations 
of the same algorithm.

As I said, I don't know much about numerical methods. However, if someone can
point me to a simple description of an algorithm I could try and debug it. 
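
As a possible starting point for such a comparison, here is a minimal sketch of one textbook way to estimate rank with a pivoted QR-style factorization: modified Gram-Schmidt with column pivoting. This is not the Commons Math code and not what R does internally; the class name, method names and the relTol parameter are made up for illustration, and the stopping rule (largest remaining column norm relative to the first pivot) is an assumption rather than the rule used by getRank().

```java
class PivotedQrRankSketch {

    /** Euclidean norm of a vector. */
    static double norm(double[] v) {
        double s = 0.0;
        for (double x : v) {
            s += x * x;
        }
        return Math.sqrt(s);
    }

    /**
     * Estimates the rank of a matrix given as an array of rows.
     * relTol is a hypothetical parameter: the loop stops once the largest
     * remaining column norm is at most relTol times the first pivot norm.
     */
    static int rank(double[][] a, double relTol) {
        int m = a.length;
        int n = a[0].length;
        // Column-major working copy, so columns can be swapped by reference.
        double[][] cols = new double[n][m];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                cols[j][i] = a[i][j];
            }
        }
        double firstPivot = 0.0;
        int rank = 0;
        for (int k = 0; k < Math.min(m, n); k++) {
            // Pivot: pick the remaining column with the largest norm.
            int p = k;
            double best = norm(cols[k]);
            for (int j = k + 1; j < n; j++) {
                double nj = norm(cols[j]);
                if (nj > best) {
                    best = nj;
                    p = j;
                }
            }
            if (k == 0) {
                firstPivot = best;
            }
            if (best <= relTol * firstPivot) {
                break; // everything that is left is numerically zero
            }
            double[] tmp = cols[k];
            cols[k] = cols[p];
            cols[p] = tmp;
            // Normalize the pivot column to obtain the next orthonormal vector.
            for (int i = 0; i < m; i++) {
                cols[k][i] /= best;
            }
            // Remove that direction from every remaining column.
            for (int j = k + 1; j < n; j++) {
                double dot = 0.0;
                for (int i = 0; i < m; i++) {
                    dot += cols[k][i] * cols[j][i];
                }
                for (int i = 0; i < m; i++) {
                    cols[j][i] -= dot * cols[k][i];
                }
            }
            rank++;
        }
        return rank;
    }

    public static void main(String[] args) {
        // The matrix from the issue description (each array is one row).
        double[][] m = {
            {1, 1, 1, 1, 1, 1},
            {1, 1, 1, 0, 0, 0},
            {0, 0, 0, 1, 1, 1},
            {1, 0, 0, 1, 0, 0},
            {0, 0, 1, 0, 0, 1}
        };
        System.out.println("pivoted QR rank: " + rank(m, 1e-10));
    }
}
```

Here the rank falls out of the factorization loop itself, which matches the observation that no separate rank function is called after the decomposition.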

Thanks


[jira] [Commented] (MATH-1403) Collinearity test: QR Decomposition rank incorrect (SVD ok)

2017-02-24 Thread Gilles (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882686#comment-15882686
 ] 

Gilles commented on MATH-1403:
--

It looks wrong indeed.
The Javadoc mentions "When a large fall in norm is seen, the rank is returned", 
which is of little help in selecting an appropriate threshold value.

As human resources have become scarce for the Commons Math project, you are 
most welcome to look at the code in order to find the bug.
I've slightly modified your example (transformed into a unit test):
{code}
@Test
public void testMath1403() {
    final double delta = 1e-7; // Test fails when delta <= 1e-8.
    final double[][] m = {
        {1, 1, 1, 1 + delta, 1, 1},
        {1, 1, 1, delta, 0, 0},
        {0, 0, 0, 1, 1, 1},
        {1, 0, 0, 1, 0, 0},
        {0, 0, 1, 0, 0, 1}
    };

    final RRQRDecomposition qr = new RRQRDecomposition(new Array2DRowRealMatrix(m));
    final double dropThreshold = 1e-7; // Test fails when dropThreshold <= 1e-8.
    Assert.assertEquals(4, qr.getRank(dropThreshold));
}
{code}
It hints at a numerical problem...
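
One observation that may help narrow this down: in the unit test above, delta is added to both row 0 and row 1, so row 0 is still exactly row 1 + row 2 and the expected rank remains 4 whatever the value of delta. A minimal check of that dependency in plain Java (the class and method names are made up for illustration):

```java
class DependencyCheckSketch {

    /** Builds the matrix from the unit test for a given delta. */
    static double[][] matrix(double delta) {
        return new double[][] {
            {1, 1, 1, 1 + delta, 1, 1},
            {1, 1, 1, delta, 0, 0},
            {0, 0, 0, 1, 1, 1},
            {1, 0, 0, 1, 0, 0},
            {0, 0, 1, 0, 0, 1}
        };
    }

    /** True when row 0 equals row 1 + row 2 in every column, bit for bit. */
    static boolean firstRowIsSumOfNextTwo(double[][] m) {
        for (int j = 0; j < m[0].length; j++) {
            if (m[0][j] != m[1][j] + m[2][j]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The dependency holds for the passing and the failing values of delta.
        for (double delta : new double[] {1e-7, 1e-8, 0.0}) {
            System.out.println(delta + " -> " + firstRowIsSumOfNextTwo(matrix(delta)));
        }
    }
}
```

Since the input data is exactly rank-deficient in every case, the change in the reported rank between delta = 1e-7 and delta = 1e-8 would have to come from the decomposition or from the rank test, not from the matrix.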

