Re: apache-math dependency
There are 30 matches when searching for org.apache.commons.math3 in Mahout java files. On Fri, Aug 9, 2013 at 8:34 PM, Dmitriy Lyubimov dlie...@gmail.com wrote: FYI SSVD does not have that dependency anymore (thanks to fixes to EigenSolver in Mahout). If there are no more methods using it, it can be deleted from pom. -d
Re: apache-math dependency
hm. so it crept in. As far back as i can recollect, we tried to minimize those, and the only limiting factor for us was decompositions. Now all decompositions are available natively in Mahout, so perhaps it is time to review these occurrences. On Mon, Aug 12, 2013 at 2:03 PM, Stevo Slavić ssla...@gmail.com wrote: There are 30 matches when searching for org.apache.commons.math3 in Mahout java files.
Re: apache-math dependency
Here are the uses [1].

Counting the unique usages, we have these:

  2 distribution.PoissonDistribution;
  2 distribution.PascalDistribution;
  2 distribution.NormalDistribution;
  1 util.FastMath;
  1 random.RandomGenerator;
  1 random.MersenneTwister;
  1 primes.Primes;
  1 linear.RealMatrix;
  1 linear.EigenDecomposition;
  1 linear.Array2DRowRealMatrix;
  1 distribution.RealDistribution;
  1 distribution.IntegerDistribution;
  1 analysis.integration.UnivariateIntegrator;
  1 analysis.integration.RombergIntegrator;
  1 analysis.UnivariateFunction;

These are pretty much all to do with statistical distributions. There is a residual EigenDecomposition, but that may be something that Dmitriy has already eliminated.

The files that are involved are:

  3 EigenSolverWrapper.java
  3 DistributionChecks.java
  2 UncommonDistributions.java
  2 RandomWrapper.java
  2 PoissonSamplerTest.java
  1 SamplingLongPrimitiveIterator.java
  1 SamplingIterator.java
  1 RandomUtils.java
  1 PoissonSampler.java
  1 NormalTest.java
  1 BreimanExample.java

I will take a quick hack at these to see what can be done easily.

[1] find . -name '*.java' | xargs grep -nH -e math3

  ./core/src/main/java/org/apache/mahout/cf/taste/impl/common/SamplingLongPrimitiveIterator.java:23:import org.apache.commons.math3.distribution.PascalDistribution;
  ./core/src/main/java/org/apache/mahout/clustering/dirichlet/UncommonDistributions.java:20:import org.apache.commons.math3.distribution.NormalDistribution;
  ./core/src/main/java/org/apache/mahout/clustering/dirichlet/UncommonDistributions.java:21:import org.apache.commons.math3.distribution.RealDistribution;
  ./core/src/main/java/org/apache/mahout/common/iterator/SamplingIterator.java:24:import org.apache.commons.math3.distribution.PascalDistribution;
  ./examples/src/main/java/org/apache/mahout/classifier/df/BreimanExample.java:32:import org.apache.commons.math3.util.FastMath;
  ./math/src/main/java/org/apache/mahout/common/RandomUtils.java:26:import org.apache.commons.math3.primes.Primes;
  ./math/src/main/java/org/apache/mahout/common/RandomWrapper.java:20:import org.apache.commons.math3.random.MersenneTwister;
  ./math/src/main/java/org/apache/mahout/common/RandomWrapper.java:21:import org.apache.commons.math3.random.RandomGenerator;
  ./math/src/main/java/org/apache/mahout/math/random/PoissonSampler.java:21:import org.apache.commons.math3.distribution.PoissonDistribution;
  ./math/src/main/java/org/apache/mahout/math/ssvd/EigenSolverWrapper.java:19:import org.apache.commons.math3.linear.Array2DRowRealMatrix;
  ./math/src/main/java/org/apache/mahout/math/ssvd/EigenSolverWrapper.java:20:import org.apache.commons.math3.linear.EigenDecomposition;
  ./math/src/main/java/org/apache/mahout/math/ssvd/EigenSolverWrapper.java:21:import org.apache.commons.math3.linear.RealMatrix;
  ./math/src/test/java/org/apache/mahout/math/jet/random/DistributionChecks.java:20:import org.apache.commons.math3.analysis.UnivariateFunction;
  ./math/src/test/java/org/apache/mahout/math/jet/random/DistributionChecks.java:21:import org.apache.commons.math3.analysis.integration.RombergIntegrator;
  ./math/src/test/java/org/apache/mahout/math/jet/random/DistributionChecks.java:22:import org.apache.commons.math3.analysis.integration.UnivariateIntegrator;
  ./math/src/test/java/org/apache/mahout/math/random/NormalTest.java:20:import org.apache.commons.math3.distribution.NormalDistribution;
  ./math/src/test/java/org/apache/mahout/math/random/PoissonSamplerTest.java:20:import org.apache.commons.math3.distribution.IntegerDistribution;
  ./math/src/test/java/org/apache/mahout/math/random/PoissonSamplerTest.java:21:import org.apache.commons.math3.distribution.PoissonDistribution;

On Mon, Aug 12, 2013 at 2:11 PM, Dmitriy Lyubimov dlie...@gmail.com wrote: hm. so it crept in. As far back as i can recollect, we tried to minimize those, and the only limiting factor for us was decompositions. Now all decompositions are available natively in Mahout, so perhaps it is time to review these occurrences.
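The per-class tally in the message above can be reproduced from the grep output with a few lines of code. The following is only a rough sketch: the class name Math3UsageTally and its method are hypothetical helpers made up for illustration, not part of Mahout or Commons Math, and it assumes each input line has the `path:line:import org.apache.commons.math3.<class>;` shape shown in [1].

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: tallies Commons Math classes found in `grep -nH` output lines. */
final class Math3UsageTally {
    private static final String PREFIX = "import org.apache.commons.math3.";

    /** Maps each class name (relative to org.apache.commons.math3) to its number of import lines. */
    static Map<String, Integer> countByClass(List<String> grepLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : grepLines) {
            int start = line.indexOf(PREFIX);
            if (start < 0) {
                continue; // not a Commons Math import line
            }
            int end = line.indexOf(';', start);
            if (end < 0) {
                continue; // malformed or truncated line, skip it
            }
            String cls = line.substring(start + PREFIX.length(), end).trim();
            counts.merge(cls, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Three lines taken from the grep output quoted in the thread.
        List<String> sample = Arrays.asList(
            "./math/src/main/java/org/apache/mahout/math/random/PoissonSampler.java:21:import org.apache.commons.math3.distribution.PoissonDistribution;",
            "./math/src/test/java/org/apache/mahout/math/random/PoissonSamplerTest.java:21:import org.apache.commons.math3.distribution.PoissonDistribution;",
            "./math/src/main/java/org/apache/mahout/common/RandomUtils.java:26:import org.apache.commons.math3.primes.Primes;");
        countByClass(sample).forEach((cls, n) -> System.out.println(n + " " + cls));
    }
}
```

Grouping by file instead of by class would only require keying the map on the text before the first colon.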
Re: apache-math dependency
On Mon, Aug 12, 2013 at 2:53 PM, Ted Dunning ted.dunn...@gmail.com wrote: The files that are involved are: 3 EigenSolverWrapper.java

This file should have been deleted on trunk; let me know if not. It was part of the "wean SSVD off apache.math dep" commit.
Re: apache-math dependency
So I checked on these. The non-trivial issues with replacing Commons Math include:

- Poisson and negative binomial distributions. This would be several hours' work to write and test (we have a Colt-inherited negative binomial distribution, but it takes no longer to write a new one than to test an old one).
- Random number generators. This is about an hour or two of work to pull the MersenneTwister implementation into our code.
- Next prime number finder. Not a big deal to replicate, but it would take a few hours to do.
- Quadrature. We use an adaptive integration routine to check distribution properties. This, again, would take a few hours to replace.

I really don't see the benefit to this work.

On Mon, Aug 12, 2013 at 2:53 PM, Ted Dunning ted.dunn...@gmail.com wrote: 2 distribution.PoissonDistribution; 2 distribution.PascalDistribution; 2 distribution.NormalDistribution; 1 util.FastMath; 1 random.RandomGenerator; 1 random.MersenneTwister; 1 primes.Primes; 1 linear.RealMatrix; 1 linear.EigenDecomposition; 1 linear.Array2DRowRealMatrix; 1 distribution.RealDistribution; 1 distribution.IntegerDistribution; 1 analysis.integration.UnivariateIntegrator; 1 analysis.integration.RombergIntegrator; 1 analysis.UnivariateFunction;
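To give a feel for what the quadrature item involves, here is a minimal sketch of a distribution check by numerical integration. It is an assumption-laden illustration, not the code in DistributionChecks.java: it uses a hand-written fixed-depth Romberg rule as a stand-in for the Commons Math integrator (so it is not adaptive), and the names RombergCheck, romberg and standardNormalPdf are invented for this example.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch: checks density properties with a fixed-depth Romberg rule. */
final class RombergCheck {

    /** Romberg integration of f over [a, b] using `levels` rows of the tableau (levels >= 1). */
    static double romberg(DoubleUnaryOperator f, double a, double b, int levels) {
        double[] prev = new double[levels];
        double[] cur = new double[levels];
        double h = b - a;
        prev[0] = 0.5 * h * (f.applyAsDouble(a) + f.applyAsDouble(b));
        for (int i = 1; i < levels; i++) {
            h /= 2;
            // Trapezoid refinement: only the midpoints that are new at this level are evaluated.
            double sum = 0;
            int newPoints = 1 << (i - 1);
            for (int k = 1; k <= newPoints; k++) {
                sum += f.applyAsDouble(a + (2 * k - 1) * h);
            }
            cur[0] = 0.5 * prev[0] + h * sum;
            // Richardson extrapolation along the row.
            double factor = 4;
            for (int j = 1; j <= i; j++) {
                cur[j] = cur[j - 1] + (cur[j - 1] - prev[j - 1]) / (factor - 1);
                factor *= 4;
            }
            double[] tmp = prev;
            prev = cur;
            cur = tmp;
        }
        return prev[levels - 1];
    }

    /** Density of the standard normal distribution. */
    static double standardNormalPdf(double x) {
        return Math.exp(-0.5 * x * x) / Math.sqrt(2 * Math.PI);
    }

    public static void main(String[] args) {
        // The density should integrate to about 1; the tails beyond +/-8 are negligible.
        double mass = romberg(RombergCheck::standardNormalPdf, -8, 8, 12);
        // The integral up to 1.96 should be close to the familiar 0.975 quantile.
        double cdf = romberg(RombergCheck::standardNormalPdf, -8, 1.96, 12);
        System.out.println("total mass ~ " + mass);
        System.out.println("P(X <= 1.96) ~ " + cdf);
    }
}
```

A real replacement would also need error estimation and a stopping rule, which is presumably where the "few hours" in the estimate above would go.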
Re: apache-math dependency
On Mon, Aug 12, 2013 at 3:09 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:
The files that are involved are: 3 EigenSolverWrapper.java
this file should have been deleted on trunk... let me know if not, it was part of wean SSVD off apache.math dep commit

Not an issue. I ran the count against a slightly old version.
Re: apache-math dependency
So you are ok with the apache-math dependency staying?

On Mon, Aug 12, 2013 at 3:09 PM, Ted Dunning ted.dunn...@gmail.com wrote: So I checked on these. The non-trivial issues with replacing Commons Math include:

- Poisson and negative binomial distributions. This would be several hours' work to write and test (we have a Colt-inherited negative binomial distribution, but it takes no longer to write a new one than to test an old one).
- Random number generators. This is about an hour or two of work to pull the MersenneTwister implementation into our code.
- Next prime number finder. Not a big deal to replicate, but it would take a few hours to do.
- Quadrature. We use an adaptive integration routine to check distribution properties. This, again, would take a few hours to replace.

I really don't see the benefit to this work.
Re: apache-math dependency
I am fine with it staying. On Mon, Aug 12, 2013 at 3:14 PM, Dmitriy Lyubimov dlie...@gmail.com wrote: So you are ok with apache-math dependency to stay?
Re: apache-math dependency
Larger part of mahout-math is linear algebra, which is currently broken for sparse part of the equation and which we don't use at all. One part of the problem is that our use for that library is always a fringe case, and as far as i can tell, will always continue to be such. Another part of the problem is that keeping the dependency will invite bypassing Mahout's solvers and, as a result, architecture inconsistency. That said, I guess Ted's argument (which is mainly cost, as i gathered) trumps the two above. On Mon, Aug 12, 2013 at 3:20 PM, Peng Cheng pc...@uowmail.edu.au wrote: seriously, I would prefer the dependency as a good architectural pattern. It encourages other people to use/contribute to it to avoid repetitive work. On 13-08-12 06:16 PM, Ted Dunning wrote: I am fine with it staying.
Re: apache-math dependency
Sorry. typo. it should read Larger part of *apache-math* is linear algebra, which is currently broken for sparse part of the equation and which we don't use at all. On Mon, Aug 12, 2013 at 3:26 PM, Dmitriy Lyubimov dlie...@gmail.com wrote: Larger part of apache-math is linear algebra, which is currently broken for sparse part of the equation and which we don't use at all. One part of the problem is that our use for that library is always a fringe case, and as far as i can tell, will always continue to be such. Another part of the problem is that keeping dependency will invite bypassing Mahout's solvers and, as a result, architecture inconsistency. That said, I guess Ted's argument (which is mainly cost, as i gathered), trumps the two above.
Re: apache-math dependency
Yes. Apache Math linear algebra is very difficult for us to use because their matrices are non-extensible. But there is actually quite a lot of code to do with random distributions, optimization and quadrature. Those are much more likely to be useful to us. On Mon, Aug 12, 2013 at 3:26 PM, Dmitriy Lyubimov dlie...@gmail.com wrote: Larger part of mahout-math is linear algebra, which is currently broken for sparse part of the equation and which we don't use at all. One part of the problem is that our use for that library is always a fringe case, and as far as i can tell, will always continue to be such. Another part of the problem is that keeping dependency will invite bypassing Mahout's solvers and, as a result, architecture inconsistency. That said, I guess Ted's argument (which is mainly cost, as i gathered), trumps the two above.
Re: apache-math dependency
Apologies, I mistook apache-math for mahout-math and didn't know what I was talking about :) On 13-08-12 07:08 PM, Ted Dunning wrote: Yes. Apache Math linear algebra is very difficult for us to use because their matrices are non-extensible. But there is actually quite a lot of code to do with random distributions, optimization and quadrature. Those are much more likely to be useful to us.
Re: apache-math dependency
Your advice isn't so bad even so. No need to reimplement interesting capabilities. On Mon, Aug 12, 2013 at 4:56 PM, Peng Cheng pc...@uowmail.edu.au wrote: Apologies, I mistook apache-math for mahout-math and didn't know what I was talking about :)
apache-math dependency
FYI SSVD does not have that dependency anymore (thanks to fixes to EigenSolver in Mahout). If there are no more methods using it, it can be deleted from pom. -d
[jira] [Updated] (MAHOUT-1281) Wean distributed SSVD from apache-math dependency.
[ https://issues.apache.org/jira/browse/MAHOUT-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy Lyubimov updated MAHOUT-1281: - Resolution: Fixed Status: Resolved (was: Patch Available) Wean distributed SSVD from apache-math dependency. -- Key: MAHOUT-1281 URL: https://issues.apache.org/jira/browse/MAHOUT-1281 Project: Mahout Issue Type: Improvement Affects Versions: 0.8 Reporter: Dmitriy Lyubimov Assignee: Dmitriy Lyubimov Priority: Trivial Fix For: 0.9 Attachments: MAHOUT-1281.patch-1 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAHOUT-1281) Wean distributed SSVD from apache-math dependency.
Dmitriy Lyubimov created MAHOUT-1281: Summary: Wean distributed SSVD from apache-math dependency. Key: MAHOUT-1281 URL: https://issues.apache.org/jira/browse/MAHOUT-1281 Project: Mahout Issue Type: Improvement Reporter: Dmitriy Lyubimov Assignee: Dmitriy Lyubimov Priority: Trivial Fix For: 0.9
[jira] [Updated] (MAHOUT-1281) Wean distributed SSVD from apache-math dependency.
[ https://issues.apache.org/jira/browse/MAHOUT-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy Lyubimov updated MAHOUT-1281: - Attachment: MAHOUT-1281.patch-1 SSVDSolver and LanczosSolver tests seem to be working. Also includes fixes to EigenDecomposition to produce the correct order of eigenvalues and eigenvectors, and undoes the corresponding workarounds in the Lanczos solver. Wean distributed SSVD from apache-math dependency. -- Key: MAHOUT-1281 URL: https://issues.apache.org/jira/browse/MAHOUT-1281 Project: Mahout Issue Type: Improvement Reporter: Dmitriy Lyubimov Assignee: Dmitriy Lyubimov Priority: Trivial Fix For: 0.9 Attachments: MAHOUT-1281.patch-1
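The ordering fix mentioned in the update above concerns keeping eigenvalues and eigenvectors aligned under a consistent sort. The sketch below shows one way such a reordering can be done; it is not the MAHOUT-1281 patch. It assumes, without confirmation from the issue, that the desired order is descending by eigenvalue and that eigenvectors are stored as matrix columns. The class EigenOrdering and its methods are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical sketch: reorders eigenpairs so that eigenvalues are in descending order. */
final class EigenOrdering {

    /** Returns the index permutation that sorts the eigenvalues in descending order. */
    static int[] descendingOrder(double[] eigenvalues) {
        Integer[] idx = new Integer[eigenvalues.length];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        // Compare b against a to get largest-first; the object sort is stable, so ties keep their order.
        Arrays.sort(idx, (a, b) -> Double.compare(eigenvalues[b], eigenvalues[a]));
        int[] order = new int[idx.length];
        for (int i = 0; i < idx.length; i++) {
            order[i] = idx[i];
        }
        return order;
    }

    /** Applies the permutation to the eigenvalue array. */
    static double[] reorder(double[] values, int[] order) {
        double[] out = new double[values.length];
        for (int j = 0; j < order.length; j++) {
            out[j] = values[order[j]];
        }
        return out;
    }

    /** Applies the same permutation to the columns of vectors[row][col] (column j is eigenvector j). */
    static double[][] reorderColumns(double[][] vectors, int[] order) {
        double[][] out = new double[vectors.length][order.length];
        for (int r = 0; r < vectors.length; r++) {
            for (int j = 0; j < order.length; j++) {
                out[r][j] = vectors[r][order[j]];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] values = {1.0, 3.0, 2.0};
        double[][] vectors = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
        int[] order = descendingOrder(values);
        System.out.println(Arrays.toString(reorder(values, order)));
        System.out.println(Arrays.deepToString(reorderColumns(vectors, order)));
    }
}
```

Applying one shared permutation to both arrays is what keeps each eigenvalue paired with its eigenvector, so that callers no longer need ordering workarounds of the kind the update says were removed from the Lanczos solver.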
[jira] [Updated] (MAHOUT-1281) Wean distributed SSVD from apache-math dependency.
[ https://issues.apache.org/jira/browse/MAHOUT-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy Lyubimov updated MAHOUT-1281: - Affects Version/s: 0.8 Status: Patch Available (was: Open) Wean distributed SSVD from apache-math dependency. -- Key: MAHOUT-1281 URL: https://issues.apache.org/jira/browse/MAHOUT-1281 Project: Mahout Issue Type: Improvement Affects Versions: 0.8 Reporter: Dmitriy Lyubimov Assignee: Dmitriy Lyubimov Priority: Trivial Fix For: 0.9 Attachments: MAHOUT-1281.patch-1