[jira] [Commented] (MATH-1403) Collinearity test: QR Decomposition rank incorrect (SVD ok)
[ https://issues.apache.org/jira/browse/MATH-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885855#comment-15885855 ]

Gilles commented on MATH-1403:
------------------------------

Jama's documentation says:
{noformat}
public int rank()
    Matrix rank
    Returns:
        effective numerical rank, obtained from SVD.
{noformat}

> Collinearity test: QR Decomposition rank incorrect (SVD ok)
> -----------------------------------------------------------
>
>                 Key: MATH-1403
>                 URL: https://issues.apache.org/jira/browse/MATH-1403
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.6.1
>         Environment: Linux ubuntu
> JDK 8
>            Reporter: Hugo Ferrira
>
> Hello,
> I am aware that such a question has been asked before, but I cannot seem to
> solve this issue for a very simple example. The closest example I have is:
> https://issues.apache.org/jira/browse/MATH-1100
> from which I could not get an answer.
> I am trying to copy an algorithm from R's Caret package that identifies
> collinear columns of a matrix [1]. I am assuming a "long" matrix and am
> using the trivial example from the reference above. However, I cannot get this
> to work because the QR's rank result is incorrect.
> I have the following example:
> import org.apache.commons.math3.linear.RealMatrix;
> import org.apache.commons.math3.linear.RRQRDecomposition;
> import org.apache.commons.math3.linear.Array2DRowRealMatrix;
> import org.apache.commons.math3.linear.SingularValueDecomposition ;
> public class QRIssue {
>     public static void main(String[] args) {
>         double[][] am = new double[5][];
>         double[] c1 = new double[] {1.0, 1.0, 1.0, 1.0, 1.0, 1.0} ;
>         double[] c2 = new double[] {1.0, 1.0, 1.0, 0.0, 0.0, 0.0} ;
>         double[] c3 = new double[] {0.0, 0.0, 0.0, 1.0, 1.0, 1.0} ;
>         double[] c4 = new double[] {1.0, 0.0, 0.0, 1.0, 0.0, 0.0 } ;
>         double[] c6 = new double[] {0.0, 0.0, 1.0, 0.0, 0.0, 1.0 } ;
>         am[0] = c1 ;
>         am[1] = c2 ;
>         am[2] = c3 ;
>         am[3] = c4 ;
>         am[4] = c6 ;
>         Double threshold = 1e-1;
>         Array2DRowRealMatrix m = new Array2DRowRealMatrix( am, false ) ; // use array, don't copy
>         RRQRDecomposition qr = new RRQRDecomposition( m, threshold) ;
>         RealMatrix r = qr.getR() ;
>         int numColumns = r.getColumnDimension() ;
>         int rank = qr.getRank( threshold ) ;
>         System.out.println("QR rank: " + rank) ;
>         System.out.println("QR is singular: " + !qr.getSolver().isNonSingular()) ;
>         System.out.println("QR is singular: " + (numColumns == rank) ) ;
>         SingularValueDecomposition sv2 = new org.apache.commons.math3.linear.SingularValueDecomposition(m);
>         System.out.println("SVD rank: " + sv2.getRank()) ;
>     }
> }
> For SVD I get a rank of 4, which is correct (columns 0,1,2 are collinear: c0
> = c1 + c2). But for QR I get 5. I have tried several thresholds with no
> success. For several subsets of the columns above (for example, only 0,1,2) I get
> the correct answer. What am I doing wrong?
> TIA,
> Hugo F.
> 1. https://topepo.github.io/caret/pre-processing.html#lindep

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
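As a sanity check on the expected value, here is a minimal, library-free sketch that computes the rank of the reporter's matrix by Gaussian elimination with partial pivoting. It is only an independent cross-check, not what Commons Math, Jama or caret do internally; the class name `RankByElimination` and the absolute tolerance `tol` are assumptions made for illustration. Note that `Array2DRowRealMatrix` interprets the outer array index as the row index, so the arrays `c1` ... `c6` in the report are stored as rows of a 5x6 matrix; the rank is the same either way, since `c1 = c2 + c3`.

```java
/** Sketch only: numerical rank by Gaussian elimination with partial pivoting. */
final class RankByElimination {

    /** Returns the number of pivots whose magnitude exceeds tol (hypothetical parameter). */
    static int rank(double[][] input, double tol) {
        final int rows = input.length;
        final int cols = input[0].length;

        // Work on a copy so that the caller's matrix is left untouched.
        final double[][] a = new double[rows][];
        for (int i = 0; i < rows; i++) {
            a[i] = input[i].clone();
        }

        int rank = 0;
        for (int col = 0; col < cols && rank < rows; col++) {
            // Partial pivoting: largest entry of this column at or below row "rank".
            int pivot = rank;
            for (int i = rank + 1; i < rows; i++) {
                if (Math.abs(a[i][col]) > Math.abs(a[pivot][col])) {
                    pivot = i;
                }
            }
            if (Math.abs(a[pivot][col]) <= tol) {
                continue; // no usable pivot in this column
            }
            final double[] tmp = a[pivot];
            a[pivot] = a[rank];
            a[rank] = tmp;

            // Eliminate the entries below the pivot.
            for (int i = rank + 1; i < rows; i++) {
                final double factor = a[i][col] / a[rank][col];
                for (int j = col; j < cols; j++) {
                    a[i][j] -= factor * a[rank][j];
                }
            }
            rank++;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Same data as in the report: five rows of six entries each.
        final double[][] m = {
            {1, 1, 1, 1, 1, 1},
            {1, 1, 1, 0, 0, 0},
            {0, 0, 0, 1, 1, 1},
            {1, 0, 0, 1, 0, 0},
            {0, 0, 1, 0, 0, 1}
        };
        System.out.println("rank = " + rank(m, 1e-10)); // prints "rank = 4"
    }
}
```

All intermediate values are small integers here, so the elimination is exact and the result agrees with the SVD rank of 4 quoted in the report.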
[jira] [Commented] (MATH-1403) Collinearity test: QR Decomposition rank incorrect (SVD ok)
[ https://issues.apache.org/jira/browse/MATH-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885839#comment-15885839 ]

Gilles commented on MATH-1403:
------------------------------

bq. Unfortunately I am not knowledgeable enough to tackle this task.

You could start by finding a reference algorithm (in a scientific textbook or paper) or another code that implements the functionality, and figure out where the key differences are.

Unfortunately, the Javadoc is out of sync: it refers to Jama having this same algorithm, whereas it [doesn't|http://math.nist.gov/javanumerics/jama/doc/].
[jira] [Commented] (MATH-1403) Collinearity test: QR Decomposition rank incorrect (SVD ok)
[ https://issues.apache.org/jira/browse/MATH-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885640#comment-15885640 ]

Hugo Ferrira commented on MATH-1403:
------------------------------------

Hello Gilles,

Thanks for the feedback. Unfortunately I am not knowledgeable enough to tackle this task.

I have also confirmed that the original R code uses the BLAS library. Its implementation is also a rank-revealing QR decomposition. What I find interesting is that the rank value is obtained after the decomposition and no explicit function is called. So these don't seem to be implementations of the same algorithm.

As I said, I don't know much about numerical methods. However, if someone can point me to a simple description of an algorithm, I could try to debug it.

Thanks
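For readers looking for a simple algorithm to compare against, the following is a minimal sketch of a QR factorization with column pivoting, written with modified Gram-Schmidt for brevity. It is a textbook-style illustration only: it is not the Commons Math `RRQRDecomposition` code, nor the routine R calls, and the class name `PivotedQrRank` and the relative tolerance `relTol` are hypothetical. At each step the remaining column with the largest norm is chosen, that norm plays the role of the diagonal entry of R, and the rank is the number of steps taken before it becomes negligible relative to the first pivot.

```java
/** Sketch only: numerical rank via modified Gram-Schmidt QR with column pivoting. */
final class PivotedQrRank {

    /** Euclidean norm of a vector. */
    static double norm(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x * x;
        }
        return Math.sqrt(sum);
    }

    /**
     * Counts the pivots |R_kk| that exceed relTol * |R_00|.
     * relTol is a hypothetical, caller-chosen relative tolerance.
     */
    static int rank(double[][] input, double relTol) {
        final int rows = input.length;
        final int cols = input[0].length;

        // Store the matrix column by column: c[j] is column j of the input.
        final double[][] c = new double[cols][rows];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                c[j][i] = input[i][j];
            }
        }

        double firstPivot = 0.0;
        int rank = 0;
        for (int k = 0; k < Math.min(rows, cols); k++) {
            // Column pivoting: pick the remaining column with the largest norm.
            int best = k;
            double bestNorm = norm(c[k]);
            for (int j = k + 1; j < cols; j++) {
                final double n = norm(c[j]);
                if (n > bestNorm) {
                    best = j;
                    bestNorm = n;
                }
            }
            if (k == 0) {
                firstPivot = bestNorm;
            }
            // bestNorm plays the role of |R_kk|; stop once it is negligible.
            if (bestNorm <= relTol * firstPivot) {
                break;
            }
            final double[] tmp = c[k];
            c[k] = c[best];
            c[best] = tmp;

            // Normalize the pivot column, then remove its direction from the rest.
            for (int i = 0; i < rows; i++) {
                c[k][i] /= bestNorm;
            }
            for (int j = k + 1; j < cols; j++) {
                double dot = 0.0;
                for (int i = 0; i < rows; i++) {
                    dot += c[k][i] * c[j][i];
                }
                for (int i = 0; i < rows; i++) {
                    c[j][i] -= dot * c[k][i];
                }
            }
            rank++;
        }
        return rank;
    }

    public static void main(String[] args) {
        // The 5x6 matrix from this issue (first row = second row + third row).
        final double[][] m = {
            {1, 1, 1, 1, 1, 1},
            {1, 1, 1, 0, 0, 0},
            {0, 0, 0, 1, 1, 1},
            {1, 0, 0, 1, 0, 0},
            {0, 0, 1, 0, 0, 1}
        };
        System.out.println("pivoted QR rank: " + rank(m, 1e-10));
    }
}
```

With the matrix from this issue and `relTol = 1e-10`, the sketch should report a rank of 4: the first four pivots are of order one, while the fifth is at rounding-error level. A production implementation would more likely use Householder reflections, which keep Q orthogonal to working precision for ill-conditioned input.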
[jira] [Commented] (MATH-1403) Collinearity test: QR Decomposition rank incorrect (SVD ok)
[ https://issues.apache.org/jira/browse/MATH-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882686#comment-15882686 ]

Gilles commented on MATH-1403:
------------------------------

It looks wrong indeed.

The Javadoc mentions "When a large fall in norm is seen, the rank is returned", which is of little help in selecting an appropriate threshold value.

As human resources have become scarce for the Commons Math project, you are most welcome to look at the code in order to find the bug.

I've slightly modified your example (transformed into a unit test):
{code}
@Test
public void testMath1403() {
    final double delta = 1e-7; // Test fails when delta <= 1e-8.
    final double[][] m = {
        {1, 1, 1, 1 + delta, 1, 1},
        {1, 1, 1, delta, 0, 0},
        {0, 0, 0, 1, 1, 1},
        {1, 0, 0, 1, 0, 0},
        {0, 0, 1, 0, 0, 1}
    };
    final RRQRDecomposition qr = new RRQRDecomposition(new Array2DRowRealMatrix(m));
    final double dropThreshold = 1e-7; // Test fails when dropThreshold <= 1e-8.
    Assert.assertEquals(4, qr.getRank(dropThreshold));
}
{code}

It hints at a numerical problem...
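On the question of how to select a threshold: one widely used convention (it is, for instance, the default in NumPy's `matrix_rank`) scales machine epsilon by the largest matrix dimension and by the largest singular value. The sketch below applies the same idea to a list of pivot magnitudes. Using it with QR pivots rather than singular values is an assumption on my part, the names `RankTolerance`, `defaultTolerance` and `countAbove` are hypothetical, and nothing here reflects how `RRQRDecomposition.getRank` interprets its `dropThreshold` argument.

```java
/** Sketch only: a scale-aware default tolerance for rank decisions. */
final class RankTolerance {

    /** Machine epsilon for double: the gap between 1.0 and the next larger double. */
    static final double EPS = Math.ulp(1.0);

    /** max(rows, cols) * eps * |largest pivot or singular value|. */
    static double defaultTolerance(int rows, int cols, double largest) {
        return Math.max(rows, cols) * EPS * Math.abs(largest);
    }

    /** Number of values whose magnitude is strictly above tol. */
    static int countAbove(double[] values, double tol) {
        int count = 0;
        for (double v : values) {
            if (Math.abs(v) > tol) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Illustrative pivot magnitudes for a 5x6 matrix; not output from Commons Math.
        final double[] pivots = {1.732, 1.633, 1.0, 0.612, 1e-16};
        final double tol = defaultTolerance(5, 6, pivots[0]);
        System.out.println("tolerance = " + tol);
        System.out.println("rank estimate = " + countAbove(pivots, tol));
    }
}
```

The point of the design is that the tolerance follows the scale of the data, so that multiplying the whole matrix by a constant does not change the reported rank.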