Re: apache-math dependency

2013-08-12 Thread Stevo Slavić
There are 30 matches when searching for org.apache.commons.math3 in
Mahout java files.


On Fri, Aug 9, 2013 at 8:34 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:

 FYI SSVD does not have that dependency anymore (thanks to fixes to
 EigenSolver in Mahout). If there are no more methods using it, it can be
 deleted from pom.

 -d



Re: apache-math dependency

2013-08-12 Thread Dmitriy Lyubimov
hm. so it crept in.

As far back as i can recollect, we tried to minimize those and the only
limiting factor for us were decompositions. Now all decompositions are
available natively to Mahout, so perhaps it is time to review these
occurrences.





Re: apache-math dependency

2013-08-12 Thread Ted Dunning
Here are the uses [1]

Counting the unique usages, we have these:

   2 distribution.PoissonDistribution;
   2 distribution.PascalDistribution;
   2 distribution.NormalDistribution;
   1 util.FastMath;
   1 random.RandomGenerator;
   1 random.MersenneTwister;
   1 primes.Primes;
   1 linear.RealMatrix;
   1 linear.EigenDecomposition;
   1 linear.Array2DRowRealMatrix;
   1 distribution.RealDistribution;
   1 distribution.IntegerDistribution;
   1 analysis.integration.UnivariateIntegrator;
   1 analysis.integration.RombergIntegrator;
   1 analysis.UnivariateFunction;

These are pretty much all to do with statistical distributions.

There is a residual EigenDecomposition, but that may be something that
Dmitriy has already eliminated.

The files that are involved are:

   3 EigenSolverWrapper.java
   3 DistributionChecks.java
   2 UncommonDistributions.java
   2 RandomWrapper.java
   2 PoissonSamplerTest.java
   1 SamplingLongPrimitiveIterator.java
   1 SamplingIterator.java
   1 RandomUtils.java
   1 PoissonSampler.java
   1 NormalTest.java
   1 BreimanExample.java

I will take a quick hack at these to see what can be done easily.

[1] find . -name '*.java' | xargs grep -nH -e math3
./core/src/main/java/org/apache/mahout/cf/taste/impl/common/SamplingLongPrimitiveIterator.java:23:import org.apache.commons.math3.distribution.PascalDistribution;
./core/src/main/java/org/apache/mahout/clustering/dirichlet/UncommonDistributions.java:20:import org.apache.commons.math3.distribution.NormalDistribution;
./core/src/main/java/org/apache/mahout/clustering/dirichlet/UncommonDistributions.java:21:import org.apache.commons.math3.distribution.RealDistribution;
./core/src/main/java/org/apache/mahout/common/iterator/SamplingIterator.java:24:import org.apache.commons.math3.distribution.PascalDistribution;
./examples/src/main/java/org/apache/mahout/classifier/df/BreimanExample.java:32:import org.apache.commons.math3.util.FastMath;
./math/src/main/java/org/apache/mahout/common/RandomUtils.java:26:import org.apache.commons.math3.primes.Primes;
./math/src/main/java/org/apache/mahout/common/RandomWrapper.java:20:import org.apache.commons.math3.random.MersenneTwister;
./math/src/main/java/org/apache/mahout/common/RandomWrapper.java:21:import org.apache.commons.math3.random.RandomGenerator;
./math/src/main/java/org/apache/mahout/math/random/PoissonSampler.java:21:import org.apache.commons.math3.distribution.PoissonDistribution;
./math/src/main/java/org/apache/mahout/math/ssvd/EigenSolverWrapper.java:19:import org.apache.commons.math3.linear.Array2DRowRealMatrix;
./math/src/main/java/org/apache/mahout/math/ssvd/EigenSolverWrapper.java:20:import org.apache.commons.math3.linear.EigenDecomposition;
./math/src/main/java/org/apache/mahout/math/ssvd/EigenSolverWrapper.java:21:import org.apache.commons.math3.linear.RealMatrix;
./math/src/test/java/org/apache/mahout/math/jet/random/DistributionChecks.java:20:import org.apache.commons.math3.analysis.UnivariateFunction;
./math/src/test/java/org/apache/mahout/math/jet/random/DistributionChecks.java:21:import org.apache.commons.math3.analysis.integration.RombergIntegrator;
./math/src/test/java/org/apache/mahout/math/jet/random/DistributionChecks.java:22:import org.apache.commons.math3.analysis.integration.UnivariateIntegrator;
./math/src/test/java/org/apache/mahout/math/random/NormalTest.java:20:import org.apache.commons.math3.distribution.NormalDistribution;
./math/src/test/java/org/apache/mahout/math/random/PoissonSamplerTest.java:20:import org.apache.commons.math3.distribution.IntegerDistribution;
./math/src/test/java/org/apache/mahout/math/random/PoissonSamplerTest.java:21:import org.apache.commons.math3.distribution.PoissonDistribution;
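
For anyone who wants to reproduce the tally without a shell, here is a minimal Java sketch of the same kind of survey. It is not part of Mahout: the class name Math3ImportCounter and its countImports method are hypothetical. It counts only import lines, whereas the grep in [1] matches any line containing math3, so the numbers could differ on a tree that mentions math3 elsewhere.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

/** Hypothetical helper (not Mahout code): tallies Commons Math 3 imports under a source root. */
final class Math3ImportCounter {

  // Matches e.g. "import org.apache.commons.math3.primes.Primes;" and captures "primes.Primes".
  private static final Pattern MATH3_IMPORT = Pattern.compile(
      "^\\s*import\\s+(?:static\\s+)?org\\.apache\\.commons\\.math3\\.([\\w.*]+)\\s*;");

  /** Maps each imported name (relative to the math3 package) to the number of import lines naming it. */
  static Map<String, Integer> countImports(Path root) throws IOException {
    Map<String, Integer> counts = new TreeMap<>();
    try (Stream<Path> paths = Files.walk(root)) {
      paths.filter(Files::isRegularFile)
          .filter(p -> p.toString().endsWith(".java"))
          .forEach(p -> {
            try {
              // Assumes the sources are UTF-8, the default for readAllLines.
              for (String line : Files.readAllLines(p)) {
                Matcher m = MATH3_IMPORT.matcher(line);
                if (m.find()) {
                  counts.merge(m.group(1), 1, Integer::sum);
                }
              }
            } catch (IOException e) {
              throw new UncheckedIOException(e);
            }
          });
    }
    return counts;
  }

  public static void main(String[] args) throws IOException {
    Path root = Paths.get(args.length > 0 ? args[0] : ".");
    // Prints one "count name" line per distinct import, sorted by name.
    countImports(root).forEach((name, n) -> System.out.printf("%4d %s%n", n, name));
  }
}
```

Run from the top of a checkout, this prints a count per imported class, similar in spirit to the "unique usages" list above but sorted by name rather than by count.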






Re: apache-math dependency

2013-08-12 Thread Dmitriy Lyubimov
On Mon, Aug 12, 2013 at 2:53 PM, Ted Dunning ted.dunn...@gmail.com wrote:


 The files that are involved are:

3 EigenSolverWrapper.java

This file should have been deleted on trunk; let me know if it was not. Its
removal was part of the commit that weaned SSVD off the apache-math dependency.





Re: apache-math dependency

2013-08-12 Thread Ted Dunning
So I checked on these.  The non-trivial issues with replacing Commons Math
include:

- Poisson and negative binomial distributions.  This would be several hours
work to write and test (we have Colt-inherited negative binomial
distribution, but it takes no longer to write a new one than to test an old
one).

- random number generators.  This is about an hour or two of work to pull
the MersenneTwister implementation into our code.

- next prime number finder.  Not a big deal to replicate, but it would take
a few hours to do.

- quadrature.  We use an adaptive integration routine to check distribution
properties.  This, again, would take a few hours to replace.
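
To make the quadrature item concrete, here is a rough sketch of what such a check could look like if written by hand. It is an assumption-laden illustration, not the code in DistributionChecks.java and not the Commons Math RombergIntegrator: the class RombergSketch, its integrate method, the 20-level cap and the exponential test distribution are all made up for this example. It integrates a density numerically and compares the result with the closed-form CDF.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical stand-in for a library integrator; not Mahout or Commons Math code. */
final class RombergSketch {

  /**
   * Romberg integration of f over [a, b]: trapezoid estimates on successively
   * halved step sizes, refined by Richardson extrapolation.
   */
  static double integrate(DoubleUnaryOperator f, double a, double b, double tol) {
    final int maxLevels = 20; // assumed cap on the number of halvings
    double[] prev = new double[maxLevels];
    double[] cur = new double[maxLevels];
    double h = b - a;
    prev[0] = 0.5 * h * (f.applyAsDouble(a) + f.applyAsDouble(b));

    for (int k = 1; k < maxLevels; k++) {
      h /= 2;
      // Refine the trapezoid rule by adding only the new midpoints.
      double sum = 0;
      int newPoints = 1 << (k - 1);
      for (int i = 1; i <= newPoints; i++) {
        sum += f.applyAsDouble(a + (2 * i - 1) * h);
      }
      cur[0] = 0.5 * prev[0] + h * sum;

      // Richardson extrapolation along the current row of the Romberg table.
      double pow4 = 4;
      for (int j = 1; j <= k; j++) {
        cur[j] = cur[j - 1] + (cur[j - 1] - prev[j - 1]) / (pow4 - 1);
        pow4 *= 4;
      }

      // Stop when two successive diagonal entries agree to the requested tolerance.
      if (Math.abs(cur[k] - prev[k - 1]) <= tol * Math.max(1.0, Math.abs(cur[k]))) {
        return cur[k];
      }
      double[] tmp = prev;
      prev = cur;
      cur = tmp;
    }
    throw new ArithmeticException("Romberg integration did not converge");
  }

  public static void main(String[] args) {
    // Exponential distribution with rate 2: pdf(t) = 2 exp(-2t), cdf(x) = 1 - exp(-2x).
    double rate = 2.0;
    double x = 3.0;
    double integral = integrate(t -> rate * Math.exp(-rate * t), 0.0, x, 1e-10);
    double cdf = 1.0 - Math.exp(-rate * x);
    System.out.println("pdf integrates to cdf: " + (Math.abs(integral - cdf) < 1e-8));
  }
}
```

A production routine would also want a minimum number of refinement levels to guard against accidental early agreement, which is part of why reusing a tested library integrator is attractive.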

I really don't see the benefit to this work.




Re: apache-math dependency

2013-08-12 Thread Ted Dunning
On Mon, Aug 12, 2013 at 3:09 PM, Dmitriy Lyubimov dlie...@gmail.com wrote:

  The files that are involved are:
 
 3 EigenSolverWrapper.java
 
 this file should have been deleted on trunk... let me know if not, it was
 part of wean SSVD off apache.math dep commit


Not an issue. I ran the count against a slightly old version.


Re: apache-math dependency

2013-08-12 Thread Dmitriy Lyubimov
So you are OK with the apache-math dependency staying?





Re: apache-math dependency

2013-08-12 Thread Ted Dunning
I am fine with it staying.





Re: apache-math dependency

2013-08-12 Thread Dmitriy Lyubimov
Larger part of mahout-math is linear algebra, which is currently broken for
sparse part of the equation and which we don't use at all.

One part of the problem is that our use for that library is always a fringe
case, and as far as i can tell, will always continue to be such.

Another part of the problem is that keeping the dependency will invite
bypassing Mahout's solvers and, as a result, architectural inconsistency.

That said, I guess Ted's argument (which is mainly cost, as I gathered)
trumps the two above.


On Mon, Aug 12, 2013 at 3:20 PM, Peng Cheng pc...@uowmail.edu.au wrote:

 seriously, I would prefer the dependency as a good architectural pattern.
 It encourages other people to use/contribute to it to avoid repetitive work.








Re: apache-math dependency

2013-08-12 Thread Dmitriy Lyubimov
Sorry. typo. it should read

 Larger part of *apache-math* is linear algebra, which is currently broken
for sparse part of the equation and which we don't use at all.









Re: apache-math dependency

2013-08-12 Thread Ted Dunning
Yes.  Apache Math linear algebra is very difficult for us to use because
their matrices are non-extensible.

But there is actually quite a lot of code to do with random distributions,
optimization and quadrature. Those are much more likely to be useful to us.





Re: apache-math dependency

2013-08-12 Thread Peng Cheng
Apologies, I mistook apache-math for mahout-math and didn't know what
I was talking about :)











Re: apache-math dependency

2013-08-12 Thread Ted Dunning
Your advice isn't so bad even so.

No need to reimplement interesting capabilities.










apache-math dependency

2013-08-09 Thread Dmitriy Lyubimov
FYI SSVD does not have that dependency anymore (thanks to fixes to
EigenSolver in Mahout). If there are no more methods using it, it can be
deleted from pom.

-d


[jira] [Updated] (MAHOUT-1281) Wean distributed SSVD from apache-math dependency.

2013-07-30 Thread Dmitriy Lyubimov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Lyubimov updated MAHOUT-1281:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 Wean distributed SSVD from apache-math dependency.
 --

 Key: MAHOUT-1281
 URL: https://issues.apache.org/jira/browse/MAHOUT-1281
 Project: Mahout
  Issue Type: Improvement
Affects Versions: 0.8
Reporter: Dmitriy Lyubimov
Assignee: Dmitriy Lyubimov
Priority: Trivial
 Fix For: 0.9

 Attachments: MAHOUT-1281.patch-1




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAHOUT-1281) Wean distributed SSVD from apache-math dependency.

2013-07-10 Thread Dmitriy Lyubimov (JIRA)
Dmitriy Lyubimov created MAHOUT-1281:


 Summary: Wean distributed SSVD from apache-math dependency.
 Key: MAHOUT-1281
 URL: https://issues.apache.org/jira/browse/MAHOUT-1281
 Project: Mahout
  Issue Type: Improvement
Reporter: Dmitriy Lyubimov
Assignee: Dmitriy Lyubimov
Priority: Trivial
 Fix For: 0.9






[jira] [Updated] (MAHOUT-1281) Wean distributed SSVD from apache-math dependency.

2013-07-10 Thread Dmitriy Lyubimov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Lyubimov updated MAHOUT-1281:
-

Attachment: MAHOUT-1281.patch-1

SSVDSolver and LanczosSolver tests seem to be working.

The patch also includes fixes to EigenDecomposition so that it produces the
correct order of eigenvalues and eigenvectors, and it undoes the corresponding
workarounds in the Lanczos solver.
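
As an illustration of what keeping eigenvalues and eigenvectors in a consistent order involves, here is a small hedged sketch. It is not the code from MAHOUT-1281.patch-1: the class EigenOrderSketch and its methods are hypothetical, and the descending-by-eigenvalue convention is an assumption, since the comment above does not say which order is the correct one. The point is only that values and vector columns must be permuted together.

```java
import java.util.Arrays;

/** Hypothetical helper; not taken from the MAHOUT-1281 patch. */
final class EigenOrderSketch {

  /** Returns the indices of the eigenvalues sorted from largest to smallest (assumed convention). */
  static int[] descendingOrder(double[] eigenvalues) {
    Integer[] boxed = new Integer[eigenvalues.length];
    for (int i = 0; i < boxed.length; i++) {
      boxed[i] = i;
    }
    // Arrays.sort on objects is stable, so equal eigenvalues keep their original relative order.
    Arrays.sort(boxed, (Integer x, Integer y) ->
        Double.compare(eigenvalues[y.intValue()], eigenvalues[x.intValue()]));
    int[] order = new int[boxed.length];
    for (int i = 0; i < order.length; i++) {
      order[i] = boxed[i];
    }
    return order;
  }

  /** Applies the permutation to the eigenvalues. */
  static double[] permute(double[] values, int[] order) {
    double[] out = new double[order.length];
    for (int i = 0; i < order.length; i++) {
      out[i] = values[order[i]];
    }
    return out;
  }

  /** Applies the same permutation to the columns; vectors[row][col] holds eigenvector col as a column. */
  static double[][] permuteColumns(double[][] vectors, int[] order) {
    double[][] out = new double[vectors.length][order.length];
    for (int r = 0; r < vectors.length; r++) {
      for (int c = 0; c < order.length; c++) {
        out[r][c] = vectors[r][order[c]];
      }
    }
    return out;
  }
}
```

A caller would compute the permutation once with descendingOrder and pass it to both permute and permuteColumns, so that column i of the result always belongs to eigenvalue i.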

 Wean distributed SSVD from apache-math dependency.
 --

 Key: MAHOUT-1281
 URL: https://issues.apache.org/jira/browse/MAHOUT-1281
 Project: Mahout
  Issue Type: Improvement
Reporter: Dmitriy Lyubimov
Assignee: Dmitriy Lyubimov
Priority: Trivial
 Fix For: 0.9

 Attachments: MAHOUT-1281.patch-1






[jira] [Updated] (MAHOUT-1281) Wean distributed SSVD from apache-math dependency.

2013-07-10 Thread Dmitriy Lyubimov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Lyubimov updated MAHOUT-1281:
-

Affects Version/s: 0.8
   Status: Patch Available  (was: Open)

 Wean distributed SSVD from apache-math dependency.
 --

 Key: MAHOUT-1281
 URL: https://issues.apache.org/jira/browse/MAHOUT-1281
 Project: Mahout
  Issue Type: Improvement
Affects Versions: 0.8
Reporter: Dmitriy Lyubimov
Assignee: Dmitriy Lyubimov
Priority: Trivial
 Fix For: 0.9

 Attachments: MAHOUT-1281.patch-1



