Re: [math] patch submitted for commons-math sparse matrix support

2008-12-10 Thread Luc Maisonobe
Bill Barker a écrit :
> I've recently come up with a need in my day-job for a 
> SparseMatrix/SparseVector implementation, so I would like to offer to 
> help on this project.  I should still have karma for commons (my Apache id 
> is [EMAIL PROTECTED] for those that haven't been around since my last commons 
> commit. I'm primarily a Tomcat committer in Apache-land, but have worked on 
> Modeler and Daemon as well).  I have a degree in math, so I know my way 
> around linear algebra.  I've also been around in commons long enough to know 
> that I should ask nicely first before starting to commit ;).
> 
> I'm still in the process of reviewing the patches, so don't have any 
> specific recommendations yet.  I'll re-post when I do.

I have reviewed the patch on my side. It is mostly OK for me. I would
have preferred a separate class for the Map, as I said in the comment on
JIRA, but this is not mandatory.
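For what it is worth, a separate Map-holding class along those lines might look something like this minimal sketch (the class and method names here are illustrative assumptions, not the ones used in the patch):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of sparse storage kept in its own class, as suggested
// above; absent entries are implicit zeros.
class SparseStorage {
    private final Map<Long, Double> entries = new HashMap<Long, Double>();
    private final int columns;

    SparseStorage(int columns) {
        this.columns = columns;
    }

    // Pack (row, col) into a single map key.
    private long key(int row, int col) {
        return (long) row * columns + col;
    }

    double get(int row, int col) {
        Double v = entries.get(key(row, col));
        return (v == null) ? 0.0 : v;
    }

    void set(int row, int col, double value) {
        if (value == 0.0) {
            entries.remove(key(row, col)); // keep the map sparse
        } else {
            entries.put(key(row, col), value);
        }
    }
}
```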

The patch is huge, but the largest part is due to tests, many of them
borrowed from the existing RealMatrixTest suite, so that is OK. The
new classes already carry the appropriate Apache header from the
authors, so I think this could be considered sufficient for IP
clearance. Does anybody think we should ask for more ?

Bill, what do you think about the patch ?

Luc

> 
> "Sujit Pal" <[EMAIL PROTECTED]> wrote in message 
> news:[EMAIL PROTECTED]
>> I've updated MATH-230 with a fresh patch (this time against trunk, since
>> MATH_2_0 has been moved to trunk) putting back the fallback code Luc
>> pointed out was used for performance.
>>
>> Details are in the bug:
>> http://issues.apache.org/jira/browse/MATH-230
>>
>> There are also some test times comparing the current data[][] version,
>> the getEntry() version I put in, and the restored data[][] implementation.
>> The performance does degrade for large matrices with getEntry(), but not
>> by much for the sizes I was looking at; since the complexity is O(n**2),
>> there is going to be more degradation at larger sizes. Beyond the sizes
>> shown I wasn't able to get the test to complete in either case in a
>> reasonable length of time (~10s).
>>
>> size                 data[][]  getEntry()  restored data[][]
>> [3x2]*[2x4]          0ms       0ms         0ms
>> [6x4]*[4x8]          0ms       0ms         0ms
>> [12x8]*[8x16]        1ms       0ms         0ms
>> [24x16]*[16x32]      2ms       2ms         2ms
>> [48x32]*[32x64]      14ms      16ms        23ms
>> [96x64]*[64x128]     2ms       3ms         3ms
>> [192x128]*[128x256]  24ms      50ms        28ms
>> [384x256]*[256x512]  770ms     813ms       768ms
>>
>> One thing I would like to suggest. As a business programmer, I may have
>> to deal with matrices, but I would prefer not to have to deal with
>> implementing individual matrix methods, which is one reason someone like
>> me would want to use a library like commons-math in the first place.
>> However, as I found recently, I may have to make different
>> implementations of RealMatrix, which I would prefer to do by subclassing
>> and overriding the getEntry() and setEntry() methods. Many algorithms
>> are based on matrices (and they probably assume a JVM with a huge amount
>> of memory), so it makes the code cleaner to use matrices as an
>> abstraction, but I could use a Map, Lucene index, database table, etc as
>> storage for my data. Would it make sense to have a generic impl of
>> RealMatrix, perhaps called AbstractRealMatrixImpl where the getEntry and
>> setEntry are abstract, which is meant for this sort of application?
>> Obviously, this implies that there is a guarantee that getEntry and
>> setEntry are used uniformly throughout the code, except for special
>> cases such as RealMatrixImpl.
>>
>> -sujit
>>
>> On Tue, 2008-12-02 at 14:19 +0100, [EMAIL PROTECTED] wrote:
>>> - "Ted Dunning" <[EMAIL PROTECTED]> a écrit :
>>>
 Is performance in LUD normally achieved by fast element access?  Or is
 it
 done by using higher order operators on a view of a row (or column)?
>>> This is not used for LUD but only for simpler operations (multiplication, 
>>> addition, subtraction). For these operations, the algorithm is almost 
>>> reduced to one operation, so a single getEntry() instead of a direct 
>>> access really is a performance bottleneck. This does not mean it 
>>> is the only point to improve, but it was the easiest one.
>>>
>>> The more I think about it, the more I feel your proposal to change the 
>>> underlying layout is the way to go. I am ready to do it if you want once 
>>> I have completed a first working implementation for SVD (even if this 
>>> first implementation is not fast enough). I would however be interested 
>>> if you could provide a few pointers to me, as you may have noticed, 
>>> linear algebra is not my main field of activity.
>>>
>>> What would you suggest ? How is storage handled in colt ?
>>>
>>> Luc
>>>
>>>
I know that the Colt package made the argument that the latter is far
preferable.

Re: [math] patch submitted for commons-math sparse matrix support

2008-12-09 Thread Bill Barker
I've recently come up with a need in my day-job for a 
SparseMatrix/SparseVector implementation, so I would like to offer to 
help on this project.  I should still have karma for commons (my Apache id 
is [EMAIL PROTECTED] for those that haven't been around since my last commons 
commit. I'm primarily a Tomcat committer in Apache-land, but have worked on 
Modeler and Daemon as well).  I have a degree in math, so I know my way 
around linear algebra.  I've also been around in commons long enough to know 
that I should ask nicely first before starting to commit ;).

I'm still in the process of reviewing the patches, so don't have any 
specific recommendations yet.  I'll re-post when I do.

"Sujit Pal" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> I've updated MATH-230 with a fresh patch (this time against trunk, since
> MATH_2_0 has been moved to trunk) putting back the fallback code Luc
> pointed out was used for performance.
>
> Details are in the bug:
> http://issues.apache.org/jira/browse/MATH-230
>
> There are also some test times comparing the current data[][] version,
> the getEntry() version I put in, and the restored data[][] implementation.
> The performance does degrade for large matrices with getEntry(), but not
> by much for the sizes I was looking at; since the complexity is O(n**2),
> there is going to be more degradation at larger sizes. Beyond the sizes
> shown I wasn't able to get the test to complete in either case in a
> reasonable length of time (~10s).
>
> size                 data[][]  getEntry()  restored data[][]
> [3x2]*[2x4]          0ms       0ms         0ms
> [6x4]*[4x8]          0ms       0ms         0ms
> [12x8]*[8x16]        1ms       0ms         0ms
> [24x16]*[16x32]      2ms       2ms         2ms
> [48x32]*[32x64]      14ms      16ms        23ms
> [96x64]*[64x128]     2ms       3ms         3ms
> [192x128]*[128x256]  24ms      50ms        28ms
> [384x256]*[256x512]  770ms     813ms       768ms
>
> One thing I would like to suggest. As a business programmer, I may have
> to deal with matrices, but I would prefer not to have to deal with
> implementing individual matrix methods, which is one reason someone like
> me would want to use a library like commons-math in the first place.
> However, as I found recently, I may have to make different
> implementations of RealMatrix, which I would prefer to do by subclassing
> and overriding the getEntry() and setEntry() methods. Many algorithms
> are based on matrices (and they probably assume a JVM with a huge amount
> of memory), so it makes the code cleaner to use matrices as an
> abstraction, but I could use a Map, Lucene index, database table, etc as
> storage for my data. Would it make sense to have a generic impl of
> RealMatrix, perhaps called AbstractRealMatrixImpl where the getEntry and
> setEntry are abstract, which is meant for this sort of application?
> Obviously, this implies that there is a guarantee that getEntry and
> setEntry are used uniformly throughout the code, except for special
> cases such as RealMatrixImpl.
>
> -sujit
>
> On Tue, 2008-12-02 at 14:19 +0100, [EMAIL PROTECTED] wrote:
>> - "Ted Dunning" <[EMAIL PROTECTED]> a écrit :
>>
>> > Is performance in LUD normally achieved by fast element access?  Or is
>> > it
>> > done by using higher order operators on a view of a row (or column)?
>>
>> This is not used for LUD but only for simpler operations (multiplication, 
>> addition, subtraction). For these operations, the algorithm is almost 
>> reduced to one operation, so a single getEntry() instead of a direct 
>> access really is a performance bottleneck. This does not mean it 
>> is the only point to improve, but it was the easiest one.
>>
>> The more I think about it, the more I feel your proposal to change the 
>> underlying layout is the way to go. I am ready to do it if you want once 
>> I have completed a first working implementation for SVD (even if this 
>> first implementation is not fast enough). I would however be interested 
>> if you could provide a few pointers to me, as you may have noticed, 
>> linear algebra is not my main field of activity.
>>
>> What would you suggest ? How is storage handled in colt ?
>>
>> Luc
>>
>>
>> >
>> > I know that the Colt package made the argument that the latter is far
>> > preferable.  I also know that Jama didn't do this, but was very hard
>> > to
>> > extend as a result.
>> >
>> > The colt approach was to require the matrices provide row, column and
>> > submatrix view operations and that vectors and matrices support
>> > functionally
>> > oriented destructive updates.  This can be hard for users to
>> > understand so
>> > you generally have to provide more algebraic interfaces as well, but
>> > it can
>> > work wonders for performance.
>> >
>> > On Mon, Dec 1, 2008 at 2:52 PM, Luc Maisonobe <[EMAIL PROTECTED]>
>> > wrote:
>> >
>> > > For now, one thing that bothers me is the removal of the dedicated
>> > > code that explicitly

Re: [math] patch submitted for commons-math sparse matrix support

2008-12-03 Thread Sujit Pal
Ok, thanks very much.

-sujit

On Wed, 2008-12-03 at 22:53 +0100, Luc Maisonobe wrote:
> Sujit Pal a écrit :
> > Thanks, added it to JIRA:
> > https://issues.apache.org/jira/browse/MATH-231
> > 
> > Also, please let me know if you want me to go ahead and submit a patch
> > for this.
> 
> No, this is not needed, thanks.
> 
> > 
> > I also noticed that MATH-230 is open/unassigned. The patch is already
> > in, so all that remains is to review it before deciding to check in or
> > not. Not sure if I need to do anything here, if yes, please let me know.
> 
> I'll review it. I am already late but in the last few days, many
> interesting issues have been discussed about the linear algebra package.
> The current tasks for me are (in random order, counting only linear
> algebra):
>   - completing singular value decomposition implementation
>   - reworking the decomposition API for all algorithms
>   - thinking about changing double[][] to something faster
>   - reviewing your patch for MATH-230
>   - extracting the abstract class for MATH-231
>   - adding a few new methods for row/column handling someone asked for
> off-list
>   - thinking about views/selections
>   - thinking about field genericity
>   - benchmarking
> 
> Luc
> 
> > 
> > -sujit
> > 
> > On Tue, 2008-12-02 at 23:02 +0100, Luc Maisonobe wrote:
> >> Sujit Pal a écrit :
> >>> I've updated MATH-230 with a fresh patch (this time against trunk, since
> >>> MATH_2_0 has been moved to trunk) putting back the fallback code Luc
> >>> pointed out was used for performance.
> >> Thanks.
> >>
> >>> Details are in the bug:
> >>> http://issues.apache.org/jira/browse/MATH-230
> >>>
> >>> There are also some test times comparing the current data[][] version,
> >>> the getEntry() version I put in, and the restored data[][] implementation.
> >>> The performance does degrade for large matrices with getEntry(), but not
> >>> by much for the sizes I was looking at; since the complexity is O(n**2),
> >>> there is going to be more degradation at larger sizes. Beyond the sizes
> >>> shown I wasn't able to get the test to complete in either case in a
> >>> reasonable length of time (~10s).
> >>>
> >>> size                 data[][]  getEntry()  restored data[][]
> >>> [3x2]*[2x4]          0ms       0ms         0ms
> >>> [6x4]*[4x8]          0ms       0ms         0ms
> >>> [12x8]*[8x16]        1ms       0ms         0ms
> >>> [24x16]*[16x32]      2ms       2ms         2ms
> >>> [48x32]*[32x64]      14ms      16ms        23ms
> >>> [96x64]*[64x128]     2ms       3ms         3ms
> >>> [192x128]*[128x256]  24ms      50ms        28ms
> >>> [384x256]*[256x512]  770ms     813ms       768ms
> >> These differences are very small. The benchmarks I did some months ago
> >> rather showed ratios of about 2 or 3 (but there were also other problems
> >> like spurious bounds checking).
> >>
> >>> One thing I would like to suggest. As a business programmer, I may have
> >>> to deal with matrices, but I would prefer not to have to deal with
> >>> implementing individual matrix methods, which is one reason someone like
> >>> me would want to use a library like commons-math in the first place.
> >>> However, as I found recently, I may have to make different
> >>> implementations of RealMatrix, which I would prefer to do by subclassing
> >>> and overriding the getEntry() and setEntry() methods. Many algorithms
> >>> are based on matrices (and they probably assume a JVM with a huge amount
> >>> of memory), so it makes the code cleaner to use matrices as an
> >>> abstraction, but I could use a Map, Lucene index, database table, etc as
> >>> storage for my data. Would it make sense to have a generic impl of
> >>> RealMatrix, perhaps called AbstractRealMatrixImpl where the getEntry and
> >>> setEntry are abstract, which is meant for this sort of application?
> >>> Obviously, this implies that there is a guarantee that getEntry and
> >>> setEntry are used uniformly throughout the code, except for special
> >>> cases such as RealMatrixImpl.
> >> This does make sense to me as it seems we will have several implementations in
> >> the near future. This was probably overkill with our single
> >> implementation up to now. It is also quite easy to do with modern
> >> development environments that allow refactoring, interface extraction
> >> and so on ...
> >> Could you please open a dedicated JIRA issue for this specific point so
> >> we don't forget it ?
> >>
> >> thanks,
> >> Luc
> >>
> >>> -sujit
> >>>
> >>> On Tue, 2008-12-02 at 14:19 +0100, [EMAIL PROTECTED] wrote:
>  - "Ted Dunning" <[EMAIL PROTECTED]> a écrit :
> 
> > Is performance in LUD normally achieved by fast element access?  Or is
> > it
> > done by using higher order operators on a view of a row (or column)?
>  This is not used for LUD but only for simpler operations 
>  (multiplication, addition, subtraction). For these operations, the 
>  algorithm is almost reduced to one operation, so a single getEntry() 
> >>

Re: [math] patch submitted for commons-math sparse matrix support

2008-12-03 Thread Luc Maisonobe
Sujit Pal a écrit :
> Thanks, added it to JIRA:
> https://issues.apache.org/jira/browse/MATH-231
> 
> Also, please let me know if you want me to go ahead and submit a patch
> for this.

No, this is not needed, thanks.

> 
> I also noticed that MATH-230 is open/unassigned. The patch is already
> in, so all that remains is to review it before deciding to check in or
> not. Not sure if I need to do anything here, if yes, please let me know.

I'll review it. I am already late but in the last few days, many
interesting issues have been discussed about the linear algebra package.
The current tasks for me are (in random order, counting only linear
algebra):
  - completing singular value decomposition implementation
  - reworking the decomposition API for all algorithms
  - thinking about changing double[][] to something faster
  - reviewing your patch for MATH-230
  - extracting the abstract class for MATH-231
  - adding a few new methods for row/column handling someone asked for
off-list
  - thinking about views/selections
  - thinking about field genericity
  - benchmarking

Luc

> 
> -sujit
> 
> On Tue, 2008-12-02 at 23:02 +0100, Luc Maisonobe wrote:
>> Sujit Pal a écrit :
>>> I've updated MATH-230 with a fresh patch (this time against trunk, since
>>> MATH_2_0 has been moved to trunk) putting back the fallback code Luc
>>> pointed out was used for performance.
>> Thanks.
>>
>>> Details are in the bug:
>>> http://issues.apache.org/jira/browse/MATH-230
>>>
>>> There are also some test times comparing the current data[][] version,
>>> the getEntry() version I put in, and the restored data[][] implementation.
>>> The performance does degrade for large matrices with getEntry(), but not
>>> by much for the sizes I was looking at; since the complexity is O(n**2),
>>> there is going to be more degradation at larger sizes. Beyond the sizes
>>> shown I wasn't able to get the test to complete in either case in a
>>> reasonable length of time (~10s).
>>>
>>> size                 data[][]  getEntry()  restored data[][]
>>> [3x2]*[2x4]          0ms       0ms         0ms
>>> [6x4]*[4x8]          0ms       0ms         0ms
>>> [12x8]*[8x16]        1ms       0ms         0ms
>>> [24x16]*[16x32]      2ms       2ms         2ms
>>> [48x32]*[32x64]      14ms      16ms        23ms
>>> [96x64]*[64x128]     2ms       3ms         3ms
>>> [192x128]*[128x256]  24ms      50ms        28ms
>>> [384x256]*[256x512]  770ms     813ms       768ms
>> These differences are very small. The benchmarks I did some months ago
>> rather showed ratios of about 2 or 3 (but there were also other problems
>> like spurious bounds checking).
>>
>>> One thing I would like to suggest. As a business programmer, I may have
>>> to deal with matrices, but I would prefer not to have to deal with
>>> implementing individual matrix methods, which is one reason someone like
>>> me would want to use a library like commons-math in the first place.
>>> However, as I found recently, I may have to make different
>>> implementations of RealMatrix, which I would prefer to do by subclassing
>>> and overriding the getEntry() and setEntry() methods. Many algorithms
>>> are based on matrices (and they probably assume a JVM with a huge amount
>>> of memory), so it makes the code cleaner to use matrices as an
>>> abstraction, but I could use a Map, Lucene index, database table, etc as
>>> storage for my data. Would it make sense to have a generic impl of
>>> RealMatrix, perhaps called AbstractRealMatrixImpl where the getEntry and
>>> setEntry are abstract, which is meant for this sort of application?
>>> Obviously, this implies that there is a guarantee that getEntry and
>>> setEntry are used uniformly throughout the code, except for special
>>> cases such as RealMatrixImpl.
>> This does make sense to me as it seems we will have several implementations in
>> the near future. This was probably overkill with our single
>> implementation up to now. It is also quite easy to do with modern
>> development environments that allow refactoring, interface extraction
>> and so on ...
>> Could you please open a dedicated JIRA issue for this specific point so
>> we don't forget it ?
>>
>> thanks,
>> Luc
>>
>>> -sujit
>>>
>>> On Tue, 2008-12-02 at 14:19 +0100, [EMAIL PROTECTED] wrote:
 - "Ted Dunning" <[EMAIL PROTECTED]> a écrit :

> Is performance in LUD normally achieved by fast element access?  Or is
> it
> done by using higher order operators on a view of a row (or column)?
 This is not used for LUD but only for simpler operations (multiplication, 
 addition, subtraction). For these operations, the algorithm is almost 
 reduced to one operation, so a single getEntry() instead of a direct 
access really is a performance bottleneck. This does not mean it is 
 the only point to improve, but it was the easiest one.

 The more I think about it, the more I feel your proposal to change the 
 underlying layout is the way to go. I am ready to

Re: [math] patch submitted for commons-math sparse matrix support

2008-12-03 Thread Sujit Pal
Thanks, added it to JIRA:
https://issues.apache.org/jira/browse/MATH-231

Also, please let me know if you want me to go ahead and submit a patch
for this.

I also noticed that MATH-230 is open/unassigned. The patch is already
in, so all that remains is to review it before deciding to check in or
not. Not sure if I need to do anything here, if yes, please let me know.

-sujit

On Tue, 2008-12-02 at 23:02 +0100, Luc Maisonobe wrote:
> Sujit Pal a écrit :
> > I've updated MATH-230 with a fresh patch (this time against trunk, since
> > MATH_2_0 has been moved to trunk) putting back the fallback code Luc
> > pointed out was used for performance.
> 
> Thanks.
> 
> > 
> > Details are in the bug:
> > http://issues.apache.org/jira/browse/MATH-230
> > 
> > There are also some test times comparing the current data[][] version,
> > the getEntry() version I put in, and the restored data[][] implementation.
> > The performance does degrade for large matrices with getEntry(), but not
> > by much for the sizes I was looking at; since the complexity is O(n**2),
> > there is going to be more degradation at larger sizes. Beyond the sizes
> > shown I wasn't able to get the test to complete in either case in a
> > reasonable length of time (~10s).
> > 
> > size                 data[][]  getEntry()  restored data[][]
> > [3x2]*[2x4]          0ms       0ms         0ms
> > [6x4]*[4x8]          0ms       0ms         0ms
> > [12x8]*[8x16]        1ms       0ms         0ms
> > [24x16]*[16x32]      2ms       2ms         2ms
> > [48x32]*[32x64]      14ms      16ms        23ms
> > [96x64]*[64x128]     2ms       3ms         3ms
> > [192x128]*[128x256]  24ms      50ms        28ms
> > [384x256]*[256x512]  770ms     813ms       768ms
> 
> These differences are very small. The benchmarks I did some months ago
> rather showed ratios of about 2 or 3 (but there were also other problems
> like spurious bounds checking).
> 
> > 
> > One thing I would like to suggest. As a business programmer, I may have
> > to deal with matrices, but I would prefer not to have to deal with
> > implementing individual matrix methods, which is one reason someone like
> > me would want to use a library like commons-math in the first place.
> > However, as I found recently, I may have to make different
> > implementations of RealMatrix, which I would prefer to do by subclassing
> > and overriding the getEntry() and setEntry() methods. Many algorithms
> > are based on matrices (and they probably assume a JVM with a huge amount
> > of memory), so it makes the code cleaner to use matrices as an
> > abstraction, but I could use a Map, Lucene index, database table, etc as
> > storage for my data. Would it make sense to have a generic impl of
> > RealMatrix, perhaps called AbstractRealMatrixImpl where the getEntry and
> > setEntry are abstract, which is meant for this sort of application?
> > Obviously, this implies that there is a guarantee that getEntry and
> > setEntry are used uniformly throughout the code, except for special
> > cases such as RealMatrixImpl.
> 
> This does make sense to me as it seems we will have several implementations in
> the near future. This was probably overkill with our single
> implementation up to now. It is also quite easy to do with modern
> development environments that allow refactoring, interface extraction
> and so on ...
> Could you please open a dedicated JIRA issue for this specific point so
> we don't forget it ?
> 
> thanks,
> Luc
> 
> > 
> > -sujit
> > 
> > On Tue, 2008-12-02 at 14:19 +0100, [EMAIL PROTECTED] wrote:
> >> - "Ted Dunning" <[EMAIL PROTECTED]> a écrit :
> >>
> >>> Is performance in LUD normally achieved by fast element access?  Or is
> >>> it
> >>> done by using higher order operators on a view of a row (or column)?
> >> This is not used for LUD but only for simpler operations (multiplication, 
> >> addition, subtraction). For these operations, the algorithm is almost 
> >> reduced to one operation, so a single getEntry() instead of a direct 
> >> access really is a performance bottleneck. This does not mean it is 
> >> the only point to improve, but it was the easiest one.
> >>
> >> The more I think about it, the more I feel your proposal to change the 
> >> underlying layout is the way to go. I am ready to do it if you want once I 
> >> have completed a first working implementation for SVD (even if this first 
> >> implementation is not fast enough). I would however be interested if you 
> >> could provide a few pointers to me, as you may have noticed, linear 
> >> algebra is not my main field of activity.
> >>
> >> What would you suggest ? How is storage handled in colt ?
> >>
> >> Luc
> >>  
> >>
> >>> I know that the Colt package made the argument that the latter is far
> >>> preferable.  I also know that Jama didn't do this, but was very hard
> >>> to
> >>> extend as a result.
> >>>
> >>> The colt approach was to require the matrices provide row, column and
> >>> submatrix view operations and that vectors and 

Re: [math] patch submitted for commons-math sparse matrix support

2008-12-02 Thread Luc Maisonobe
Sujit Pal a écrit :
> I've updated MATH-230 with a fresh patch (this time against trunk, since
> MATH_2_0 has been moved to trunk) putting back the fallback code Luc
> pointed out was used for performance.

Thanks.

> 
> Details are in the bug:
> http://issues.apache.org/jira/browse/MATH-230
> 
> There are also some test times comparing the current data[][] version,
> the getEntry() version I put in, and the restored data[][] implementation.
> The performance does degrade for large matrices with getEntry(), but not
> by much for the sizes I was looking at; since the complexity is O(n**2),
> there is going to be more degradation at larger sizes. Beyond the sizes
> shown I wasn't able to get the test to complete in either case in a
> reasonable length of time (~10s).
> 
> size                 data[][]  getEntry()  restored data[][]
> [3x2]*[2x4]          0ms       0ms         0ms
> [6x4]*[4x8]          0ms       0ms         0ms
> [12x8]*[8x16]        1ms       0ms         0ms
> [24x16]*[16x32]      2ms       2ms         2ms
> [48x32]*[32x64]      14ms      16ms        23ms
> [96x64]*[64x128]     2ms       3ms         3ms
> [192x128]*[128x256]  24ms      50ms        28ms
> [384x256]*[256x512]  770ms     813ms       768ms

These differences are very small. The benchmarks I did some months ago
rather showed ratios of about 2 or 3 (but there were also other problems
like spurious bounds checking).

> 
> One thing I would like to suggest. As a business programmer, I may have
> to deal with matrices, but I would prefer not to have to deal with
> implementing individual matrix methods, which is one reason someone like
> me would want to use a library like commons-math in the first place.
> However, as I found recently, I may have to make different
> implementations of RealMatrix, which I would prefer to do by subclassing
> and overriding the getEntry() and setEntry() methods. Many algorithms
> are based on matrices (and they probably assume a JVM with a huge amount
> of memory), so it makes the code cleaner to use matrices as an
> abstraction, but I could use a Map, Lucene index, database table, etc as
> storage for my data. Would it make sense to have a generic impl of
> RealMatrix, perhaps called AbstractRealMatrixImpl where the getEntry and
> setEntry are abstract, which is meant for this sort of application?
> Obviously, this implies that there is a guarantee that getEntry and
> setEntry are used uniformly throughout the code, except for special
> cases such as RealMatrixImpl.

This does make sense to me as it seems we will have several implementations in
the near future. This was probably overkill with our single
implementation up to now. It is also quite easy to do with modern
development environments that allow refactoring, interface extraction
and so on ...
Could you please open a dedicated JIRA issue for this specific point so
we don't forget it ?

thanks,
Luc

> 
> -sujit
> 
> On Tue, 2008-12-02 at 14:19 +0100, [EMAIL PROTECTED] wrote:
>> - "Ted Dunning" <[EMAIL PROTECTED]> a écrit :
>>
>>> Is performance in LUD normally achieved by fast element access?  Or is
>>> it
>>> done by using higher order operators on a view of a row (or column)?
>> This is not used for LUD but only for simpler operations (multiplication, 
>> addition, subtraction). For these operations, the algorithm is almost 
>> reduced to one operation, so a single getEntry() instead of a direct access 
>> really is a performance bottleneck. This does not mean it is the only 
>> point to improve, but it was the easiest one.
>>
>> The more I think about it, the more I feel your proposal to change the 
>> underlying layout is the way to go. I am ready to do it if you want once I 
>> have completed a first working implementation for SVD (even if this first 
>> implementation is not fast enough). I would however be interested if you 
>> could provide a few pointers to me, as you may have noticed, linear algebra 
>> is not my main field of activity.
>>
>> What would you suggest ? How is storage handled in colt ?
>>
>> Luc
>>  
>>
>>> I know that the Colt package made the argument that the latter is far
>>> preferable.  I also know that Jama didn't do this, but was very hard
>>> to
>>> extend as a result.
>>>
>>> The colt approach was to require the matrices provide row, column and
>>> submatrix view operations and that vectors and matrices support
>>> functionally
>>> oriented destructive updates.  This can be hard for users to
>>> understand so
>>> you generally have to provide more algebraic interfaces as well, but
>>> it can
>>> work wonders for performance.
>>>
>>> On Mon, Dec 1, 2008 at 2:52 PM, Luc Maisonobe <[EMAIL PROTECTED]>
>>> wrote:
>>>
For now, one thing that bothers me is the removal of the dedicated code
that explicitly avoided getEntry(). This was added a few months ago for
performance reasons, since the previous generic code was painfully slow.
The trick with the ClassCastException allows a really fast check

Re: [math] patch submitted for commons-math sparse matrix support

2008-12-02 Thread Sujit Pal
I've updated MATH-230 with a fresh patch (this time against trunk, since
MATH_2_0 has been moved to trunk) putting back the fallback code Luc
pointed out was used for performance.

Details are in the bug:
http://issues.apache.org/jira/browse/MATH-230

There are also some test times comparing the current data[][] version,
the getEntry() version I put in, and the restored data[][] implementation.
The performance does degrade for large matrices with getEntry(), but not
by much for the sizes I was looking at; since the complexity is O(n**2),
there is going to be more degradation at larger sizes. Beyond the sizes
shown I wasn't able to get the test to complete in either case in a
reasonable length of time (~10s).

size                 data[][]  getEntry()  restored data[][]
[3x2]*[2x4]          0ms       0ms         0ms
[6x4]*[4x8]          0ms       0ms         0ms
[12x8]*[8x16]        1ms       0ms         0ms
[24x16]*[16x32]      2ms       2ms         2ms
[48x32]*[32x64]      14ms      16ms        23ms
[96x64]*[64x128]     2ms       3ms         3ms
[192x128]*[128x256]  24ms      50ms        28ms
[384x256]*[256x512]  770ms     813ms       768ms

One thing I would like to suggest. As a business programmer, I may have
to deal with matrices, but I would prefer not to have to deal with
implementing individual matrix methods, which is one reason someone like
me would want to use a library like commons-math in the first place.
However, as I found recently, I may have to make different
implementations of RealMatrix, which I would prefer to do by subclassing
and overriding the getEntry() and setEntry() methods. Many algorithms
are based on matrices (and they probably assume a JVM with a huge amount
of memory), so it makes the code cleaner to use matrices as an
abstraction, but I could use a Map, Lucene index, database table, etc as
storage for my data. Would it make sense to have a generic impl of
RealMatrix, perhaps called AbstractRealMatrixImpl where the getEntry and
setEntry are abstract, which is meant for this sort of application?
Obviously, this implies that there is a guarantee that getEntry and
setEntry are used uniformly throughout the code, except for special
cases such as RealMatrixImpl.
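
The proposal above could be sketched roughly as follows; AbstractRealMatrixImpl and the method names are only assumptions for illustration, not the actual commons-math API:

```java
// Hypothetical sketch: generic operations written purely in terms of the
// abstract getEntry()/setEntry() accessors, so a subclass can back the
// matrix with a Map, a Lucene index, a database table, etc.
abstract class AbstractRealMatrixImpl {

    abstract int getRowDimension();
    abstract int getColumnDimension();
    abstract double getEntry(int row, int col);
    abstract void setEntry(int row, int col, double value);

    // Generic multiply using only the abstract accessors.
    double[][] multiply(AbstractRealMatrixImpl m) {
        int rows = getRowDimension();
        int inner = getColumnDimension();
        int cols = m.getColumnDimension();
        double[][] out = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double sum = 0.0;
                for (int k = 0; k < inner; k++) {
                    sum += getEntry(i, k) * m.getEntry(k, j);
                }
                out[i][j] = sum;
            }
        }
        return out;
    }
}

// Example subclass: the ordinary array-backed case.
class ArrayBackedMatrix extends AbstractRealMatrixImpl {
    private final double[][] data;
    ArrayBackedMatrix(double[][] data) { this.data = data; }
    int getRowDimension() { return data.length; }
    int getColumnDimension() { return data[0].length; }
    double getEntry(int r, int c) { return data[r][c]; }
    void setEntry(int r, int c, double v) { data[r][c] = v; }
}
```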

-sujit

On Tue, 2008-12-02 at 14:19 +0100, [EMAIL PROTECTED] wrote:
> - "Ted Dunning" <[EMAIL PROTECTED]> a écrit :
> 
> > Is performance in LUD normally achieved by fast element access?  Or is
> > it
> > done by using higher order operators on a view of a row (or column)?
> 
> This is not used for LUD but only for simpler operations (multiplication,
> addition, subtraction). For these operations, the algorithm is almost reduced
> to a single operation, so one getEntry() call instead of a direct array access
> really is a performance bottleneck. This does not mean it is the only point to
> improve, but it was the easiest one.
> 
> The more I think about it, the more I feel your proposal to change the
> underlying layout is the way to go. I am ready to do it, if you want, once I
> have completed a first working implementation for SVD (even if this first
> implementation is not fast enough). I would however be interested if you could
> provide a few pointers; as you may have noticed, linear algebra is not my
> main field of activity.
> 
> What would you suggest? How is storage handled in Colt?
> 
> Luc
>  
> 
> > 
> > I know that the Colt package argued that the latter is far
> > preferable.  I also know that Jama didn't do this, and was very hard
> > to extend as a result.
> > 
> > The Colt approach was to require that matrices provide row, column,
> > and submatrix view operations, and that vectors and matrices support
> > functionally oriented destructive updates.  This can be hard for users
> > to understand, so you generally have to provide more algebraic
> > interfaces as well, but it can work wonders for performance.
> > 
> > On Mon, Dec 1, 2008 at 2:52 PM, Luc Maisonobe <[EMAIL PROTECTED]>
> > wrote:
> > 
> > > For now, one thing that bothers me is the removal of the dedicated
> > > code that explicitly avoided getEntry(). It was added a few months
> > > ago for performance reasons, since the previous generic code was
> > > painfully slow. The trick with the ClassCastException allows a really
> > > fast check, since the virtual machine can completely optimize it out
> > > in some cases; it is an enhancement of what was discussed here:
> > > http://markmail.org/message/36fgg7vx6gzd3ziu. Our discussion at that
> > > time was that the more common case (RealMatrixImpl) should be as
> > > efficient as possible (and Ted now wants it to be even faster by
> > > changing the underlying storage, which is a good thing). This trick
> > > is not beautiful, perfectly generic code; it is a pragmatic way to be
> > > fast in a common case, and removing it will most probably hurt
> > > performance.
> > >
> > 
> > 
> > 
> > -- 
> > Ted Dunning, CTO
> > DeepDyve

Re: [math] patch submitted for commons-math sparse matrix support

2008-12-02 Thread luc . maisonobe

- "Ted Dunning" <[EMAIL PROTECTED]> a écrit :

> Is performance in LUD normally achieved by fast element access?  Or is
> it
> done by using higher order operators on a view of a row (or column)?

This is not used for LUD but only for simpler operations (multiplication,
addition, subtraction). For these operations, the algorithm is almost reduced
to a single operation, so one getEntry() call instead of a direct array access
really is a performance bottleneck. This does not mean it is the only point to
improve, but it was the easiest one.

The more I think about it, the more I feel your proposal to change the
underlying layout is the way to go. I am ready to do it, if you want, once I
have completed a first working implementation for SVD (even if this first
implementation is not fast enough). I would however be interested if you could
provide a few pointers; as you may have noticed, linear algebra is not my
main field of activity.

What would you suggest? How is storage handled in Colt?

Luc
 

> 
> I know that the Colt package argued that the latter is far
> preferable.  I also know that Jama didn't do this, and was very hard
> to extend as a result.
> 
> The Colt approach was to require that matrices provide row, column,
> and submatrix view operations, and that vectors and matrices support
> functionally oriented destructive updates.  This can be hard for users
> to understand, so you generally have to provide more algebraic
> interfaces as well, but it can work wonders for performance.
> 
> On Mon, Dec 1, 2008 at 2:52 PM, Luc Maisonobe <[EMAIL PROTECTED]>
> wrote:
> 
> > For now, one thing that bothers me is the removal of the dedicated code
> > that explicitly avoided getEntry(). It was added a few months ago for
> > performance reasons, since the previous generic code was painfully slow.
> > The trick with the ClassCastException allows a really fast check, since
> > the virtual machine can completely optimize it out in some cases; it is
> > an enhancement of what was discussed here:
> > http://markmail.org/message/36fgg7vx6gzd3ziu. Our discussion at that
> > time was that the more common case (RealMatrixImpl) should be as
> > efficient as possible (and Ted now wants it to be even faster by
> > changing the underlying storage, which is a good thing). This trick is
> > not beautiful, perfectly generic code; it is a pragmatic way to be fast
> > in a common case, and removing it will most probably hurt performance.
> >
> 
> 
> 
> -- 
> Ted Dunning, CTO
> DeepDyve
> 4600 Bohannon Drive, Suite 220
> Menlo Park, CA 94025
> www.deepdyve.com
> 650-324-0110, ext. 738
> 858-414-0013 (m)

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [math] patch submitted for commons-math sparse matrix support

2008-12-01 Thread Sujit Pal
Hi Luc,

Thanks for the feedback. I did not realize that the check for the
ClassCastException was performance related. Let me know your thoughts
once you get a chance to look at the patch, and I can put this back in
and resubmit: i.e., if the argument is a RealMatrixImpl, do what the
RealMatrixImpl-specific methods did; otherwise fall back to the
getEntry() calls.

Also, I think the JVM can detect that the incoming argument is a
RealMatrixImpl and dispatch to the correct method without the code
having to catch a ClassCastException. I have used this pattern in some
other code I've written, where there are multiple overloaded methods,
each taking a particular impl.
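The fast path being discussed can be sketched as follows. This is a simplified illustration with hypothetical classes (Matrix, DenseMatrix), not the actual RealMatrixImpl code: the common dense-plus-dense case works on the raw arrays, and a ClassCastException routes every other implementation through the generic getEntry() path.

```java
// Simplified illustration of the ClassCastException fast-path trick --
// hypothetical classes, not the actual commons-math RealMatrixImpl code.
interface Matrix {
    int rows();
    int cols();
    double getEntry(int i, int j);
}

class DenseMatrix implements Matrix {
    final double[][] data;

    DenseMatrix(double[][] data) { this.data = data; }

    public int rows() { return data.length; }
    public int cols() { return data[0].length; }
    public double getEntry(int i, int j) { return data[i][j]; }

    DenseMatrix add(Matrix m) {
        double[][] out = new double[rows()][cols()];
        try {
            // Fast path: if m is dense too, work on the raw arrays. The JIT
            // can often eliminate the cast check entirely.
            double[][] other = ((DenseMatrix) m).data;
            for (int i = 0; i < out.length; i++)
                for (int j = 0; j < out[i].length; j++)
                    out[i][j] = data[i][j] + other[i][j];
        } catch (ClassCastException cce) {
            // Generic fallback for any other Matrix implementation.
            for (int i = 0; i < out.length; i++)
                for (int j = 0; j < out[i].length; j++)
                    out[i][j] = data[i][j] + m.getEntry(i, j);
        }
        return out;
    }
}
```

The try/catch costs nothing on the fast path; only a non-dense argument pays for the exception and the slower interface calls.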

-sujit

On Mon, 2008-12-01 at 23:52 +0100, Luc Maisonobe wrote:
> Sujit Pal a écrit :
> > Hello,
> > 
> > I am a new, but fairly happy user of commons-math 1.2 and I have begun
> > to use it in my code. I needed a sparse matrix implementation for some
> > data analysis work I was doing, and unfortunately the code with
> > RealMatrixImpl was coming back with an OOME. So I built a
> > SparseRealMatrixImpl as a subclass of RealMatrixImpl, which uses a
> > Map as its backing store, storing only the non-zero
> > entries. I would like to contribute this implementation back to the
> > commons-math project.
> > 
> > I have opened a JIRA enhancement request MATH-230 with a patch against
> > the 2.0 branch. Patch and detailed description of the changes are
> > available here.
> > http://issues.apache.org/jira/browse/MATH-230
> > 
> > I have also had to make some changes to RealMatrixImpl and
> > LUDecompositionImpl, mainly replacing the data[][] references with
> > RealMatrix.getEntry(i,j) calls to insulate the backing store from being
> > accessed directly by calling code, and thus making it possible for
> > subclasses to provide their own internal backing stores.
> > 
> > Would appreciate if you folks can take a look and let me know if it is
> > feasible to add this into the 2.0 release. That way I can continue using
> > commons-math future versions without having to patch it after every
> > upgrade.
> 
> I'll have a look at it tomorrow evening.
> For now, one thing that bothers me is the removal of the dedicated code
> that explicitly avoided getEntry(). It was added a few months ago for
> performance reasons, since the previous generic code was painfully slow.
> The trick with the ClassCastException allows a really fast check, since
> the virtual machine can completely optimize it out in some cases; it is
> an enhancement of what was discussed here:
> http://markmail.org/message/36fgg7vx6gzd3ziu. Our discussion at that
> time was that the more common case (RealMatrixImpl) should be as
> efficient as possible (and Ted now wants it to be even faster by
> changing the underlying storage, which is a good thing). This trick is
> not beautiful, perfectly generic code; it is a pragmatic way to be fast
> in a common case, and removing it will most probably hurt performance.
> 
> Luc
> 
> > 
> > I noticed that there has been some previous discussion about sparse
> > matrix support here:
> > http://markmail.org/message/2k5dk4fbewdtn5rz
> > so I sent a personal email to John Iacona asking him about the
> > code he speaks of here. But he is working on other things, and doesn't
> > know when he will be able to check in his sparse matrix support changes.
> > His approach is a little different from mine (he has a SparseMatrix
> > interface, which is implemented by a SparseMatrixImpl). I considered
> > that approach initially, but came to the conclusion that it would mean
> > quite a bit of code duplication, which I wanted to avoid as far as
> > possible.
> > 
> > Thanks
> > -sujit
> > 
> > 
> > 
> > 
> > 
> > 
> 
> 
> 





Re: [math] patch submitted for commons-math sparse matrix support

2008-12-01 Thread Ted Dunning
Is performance in LUD normally achieved by fast element access?  Or is it
done by using higher order operators on a view of a row (or column)?

I know that the Colt package argued that the latter is far preferable.  I
also know that Jama didn't do this, and was very hard to extend as a result.

The Colt approach was to require that matrices provide row, column, and
submatrix view operations, and that vectors and matrices support functionally
oriented destructive updates.  This can be hard for users to understand, so
you generally have to provide more algebraic interfaces as well, but it can
work wonders for performance.

On Mon, Dec 1, 2008 at 2:52 PM, Luc Maisonobe <[EMAIL PROTECTED]> wrote:

> For now, one thing that bothers me is the removal of the dedicated code
> that explicitly avoided getEntry(). It was added a few months ago for
> performance reasons, since the previous generic code was painfully slow.
> The trick with the ClassCastException allows a really fast check, since
> the virtual machine can completely optimize it out in some cases; it is
> an enhancement of what was discussed here:
> http://markmail.org/message/36fgg7vx6gzd3ziu. Our discussion at that
> time was that the more common case (RealMatrixImpl) should be as
> efficient as possible (and Ted now wants it to be even faster by
> changing the underlying storage, which is a good thing). This trick is
> not beautiful, perfectly generic code; it is a pragmatic way to be fast
> in a common case, and removing it will most probably hurt performance.
>



-- 
Ted Dunning, CTO
DeepDyve
4600 Bohannon Drive, Suite 220
Menlo Park, CA 94025
www.deepdyve.com
650-324-0110, ext. 738
858-414-0013 (m)


Re: [math] patch submitted for commons-math sparse matrix support

2008-12-01 Thread Luc Maisonobe
Sujit Pal a écrit :
> Hello,
> 
> I am a new, but fairly happy user of commons-math 1.2 and I have begun
> to use it in my code. I needed a sparse matrix implementation for some
> data analysis work I was doing, and unfortunately the code with
> RealMatrixImpl was coming back with an OOME. So I built a
> SparseRealMatrixImpl as a subclass of RealMatrixImpl, which uses a
> Map as its backing store, storing only the non-zero
> entries. I would like to contribute this implementation back to the
> commons-math project.
> 
> I have opened a JIRA enhancement request MATH-230 with a patch against
> the 2.0 branch. Patch and detailed description of the changes are
> available here.
> http://issues.apache.org/jira/browse/MATH-230
> 
> I have also had to make some changes to RealMatrixImpl and
> LUDecompositionImpl, mainly replacing the data[][] references with
> RealMatrix.getEntry(i,j) calls to insulate the backing store from being
> accessed directly by calling code, and thus making it possible for
> subclasses to provide their own internal backing stores.
> 
> Would appreciate if you folks can take a look and let me know if it is
> feasible to add this into the 2.0 release. That way I can continue using
> commons-math future versions without having to patch it after every
> upgrade.

I'll have a look at it tomorrow evening.
For now, one thing that bothers me is the removal of the dedicated code
that explicitly avoided getEntry(). It was added a few months ago for
performance reasons, since the previous generic code was painfully slow.
The trick with the ClassCastException allows a really fast check, since
the virtual machine can completely optimize it out in some cases; it is
an enhancement of what was discussed here:
http://markmail.org/message/36fgg7vx6gzd3ziu. Our discussion at that
time was that the more common case (RealMatrixImpl) should be as
efficient as possible (and Ted now wants it to be even faster by
changing the underlying storage, which is a good thing). This trick is
not beautiful, perfectly generic code; it is a pragmatic way to be fast
in a common case, and removing it will most probably hurt performance.

Luc

> 
> I noticed that there has been some previous discussion about sparse
> matrix support here:
> http://markmail.org/message/2k5dk4fbewdtn5rz
> so I sent a personal email to John Iacona asking him about the
> code he speaks of here. But he is working on other things, and doesn't
> know when he will be able to check in his sparse matrix support changes.
> His approach is a little different from mine (he has a SparseMatrix
> interface, which is implemented by a SparseMatrixImpl). I considered
> that approach initially, but came to the conclusion that it would mean
> quite a bit of code duplication, which I wanted to avoid as far as
> possible.
> 
> Thanks
> -sujit
> 
> 
> 
> 
> 
> 





Re: [math] patch submitted for commons-math sparse matrix support

2008-12-01 Thread Ted Dunning
This is often handled by using an offset sparse matrix or vector that is
sparse in that only elements different from the offset are stored.  You can
often do this at the algorithmic level by carrying around the requisite
constant matrix in your derivation.  Obviously, a normal sparse matrix is a
special case.

Offset sparse vectors provide simple ways to express row and column sums as
well.
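Ted's offset-sparse idea might be sketched like this (a hypothetical class, not part of commons-math): only entries that differ from the offset are stored, and with an offset of 0.0 it reduces to an ordinary sparse vector. Row and column sums also become cheap, as he notes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of an "offset sparse" vector: every entry defaults
 *  to a constant offset, and only deviations from it are stored. With
 *  offset 0.0 this is an ordinary sparse vector. */
class OffsetSparseVector {
    private final int size;
    private final double offset;
    private final Map<Integer, Double> entries = new HashMap<>();

    OffsetSparseVector(int size, double offset) {
        this.size = size;
        this.offset = offset;
    }

    double getEntry(int i) {
        return entries.getOrDefault(i, offset);
    }

    void setEntry(int i, double value) {
        // Exact comparison is fine for a sketch; a tolerance may be wiser.
        if (value == offset) entries.remove(i);   // keep the map minimal
        else entries.put(i, value);
    }

    /** Sum is cheap: offset * size plus the stored deviations. */
    double sum() {
        double s = offset * size;
        for (double v : entries.values()) s += v - offset;
        return s;
    }
}
```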

On Mon, Dec 1, 2008 at 2:31 PM, Sujit Pal <[EMAIL PROTECTED]> wrote:

> On a slightly different note, I've been thinking about some other ways
> to reduce the memory footprint (of my application), and I think there
> may be some justification for having things like ConstantVector, ie a
> large non-sparse Vector which contains the same data for all its
> elements. The idea is similar, the getEntry() method would return a
> value that is set into the constructor.
>



-- 
Ted Dunning, CTO
DeepDyve
4600 Bohannon Drive, Suite 220
Menlo Park, CA 94025
www.deepdyve.com
650-324-0110, ext. 738
858-414-0013 (m)


Re: [math] patch submitted for commons-math sparse matrix support

2008-12-01 Thread Sujit Pal
Hi Ted,

I considered that possibility too, and peeked at one of the posts where
this representation was suggested :-).

Pros:
1) the representation is more elegant.

Cons:
1) it would have required me to implement SparseVector, which we should
do anyway, imo. But I wanted to keep the code changes to a minimum.
2) possibly greater memory usage, since we would have references to a bunch
of maps hanging off the int keys. The Point class is a private inner class,
so it is invisible outside the class and, unlike java.awt.Point, carries
almost no baggage; it's simply a struct with equals() and hashCode()
implemented.

Speed-wise, I don't think there is going to be much difference between
the two approaches. The current approach has some overhead in building up
a Point object for each getEntry() and setEntry() call, and the one you
suggest has the overhead of a double lookup. Either one is instantaneous
for all practical purposes.

However, if that Map data structure is preferred, the
only methods that need to change are the getEntry() and setEntry()
methods of SparseRealMatrixImpl, which I can do if needed.
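The two layouts being compared can be sketched side by side. Both are hypothetical sketches using boxed keys, not the actual patch code: a flat map keyed by an immutable (row, col) pair versus a map of row maps.

```java
import java.util.HashMap;
import java.util.Map;

/** Two candidate sparse layouts side by side (sketches, not the patch code). */
class SparseLayouts {
    /** Immutable (row, col) key, in the spirit of the patch's private Point. */
    static final class Point {
        final int row, col;
        Point(int row, int col) { this.row = row; this.col = col; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return row == p.row && col == p.col;
        }
        @Override public int hashCode() { return 31 * row + col; }
    }

    /** Flat layout: one map keyed by the (row, col) pair. */
    final Map<Point, Double> flat = new HashMap<>();

    /** Nested layout: a map of rows, each row a map of column -> value. */
    final Map<Integer, Map<Integer, Double>> nested = new HashMap<>();

    void set(int i, int j, double v) {
        flat.put(new Point(i, j), v);   // one lookup, one key allocation
        nested.computeIfAbsent(i, k -> new HashMap<>()).put(j, v);  // two lookups
    }

    double getFlat(int i, int j) {
        return flat.getOrDefault(new Point(i, j), 0.0);
    }

    double getNested(int i, int j) {
        Map<Integer, Double> row = nested.get(i);
        return (row == null) ? 0.0 : row.getOrDefault(j, 0.0);
    }
}
```

As the surrounding discussion says, the flat layout pays a small allocation per access and the nested layout pays a double lookup; the nested one, however, makes whole-row iteration trivial.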

On a slightly different note, I've been thinking about some other ways
to reduce the memory footprint (of my application), and I think there
may be some justification for having things like ConstantVector, ie a
large non-sparse Vector which contains the same data for all its
elements. The idea is similar, the getEntry() method would return a
value that is set into the constructor.
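A ConstantVector along these lines might look like the following (hypothetical class): one stored value serves every index, and operations such as a dot product collapse to a single multiplication.

```java
/** Hypothetical sketch of the ConstantVector idea: a large non-sparse
 *  vector whose entries all share one value, stored exactly once. */
class ConstantVector {
    private final int size;
    private final double value;

    ConstantVector(int size, double value) {
        this.size = size;
        this.value = value;
    }

    int getDimension() { return size; }

    double getEntry(int index) {
        if (index < 0 || index >= size)
            throw new IndexOutOfBoundsException("index " + index);
        return value;  // same value for every element; O(1) memory overall
    }

    double dotProduct(double[] other) {
        // For a constant vector, the dot product collapses to
        // value * sum(other).
        double s = 0.0;
        for (double x : other) s += x;
        return value * s;
    }
}
```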

-sujit

On Mon, 2008-12-01 at 14:13 -0800, Ted Dunning wrote:
> Sujit,
> 
> What do you think the trade-offs are between the representation that you
> used and another one where you have a Map that contains
> rows or columns?
> 
> On Mon, Dec 1, 2008 at 1:40 PM, Sujit Pal <[EMAIL PROTECTED]> wrote:
> 
> > Hello,
> >
> > I am a new, but fairly happy user of commons-math 1.2 and I have begun
> > to use it in my code. I needed a sparse matrix implementation for some
> > data analysis work I was doing, and unfortunately the code with
> > RealMatrixImpl was coming back with an OOME. So I built a
> > SparseRealMatrixImpl as a subclass of RealMatrixImpl, which uses a
> > Map as its backing store, storing only the non-zero
> > entries. I would like to contribute this implementation back to the
> > commons-math project.
> >
> > I have opened a JIRA enhancement request MATH-230 with a patch against
> > the 2.0 branch. Patch and detailed description of the changes are
> > available here.
> > http://issues.apache.org/jira/browse/MATH-230
> >
> > I have also had to make some changes to RealMatrixImpl and
> > LUDecompositionImpl, mainly replacing the data[][] references with
> > RealMatrix.getEntry(i,j) calls to insulate the backing store from being
> > accessed directly by calling code, and thus making it possible for
> > subclasses to provide their own internal backing stores.
> >
> > Would appreciate if you folks can take a look and let me know if it is
> > feasible to add this into the 2.0 release. That way I can continue using
> > commons-math future versions without having to patch it after every
> > upgrade.
> >
> > I noticed that there has been some previous discussion about sparse
> > matrix support here:
> > http://markmail.org/message/2k5dk4fbewdtn5rz
> > so I sent a personal email to John Iacona asking him about the
> > code he speaks of here. But he is working on other things, and doesn't
> > know when he will be able to check in his sparse matrix support changes.
> > His approach is a little different from mine (he has a SparseMatrix
> > interface, which is implemented by a SparseMatrixImpl). I considered
> > that approach initially, but came to the conclusion that it would mean
> > quite a bit of code duplication, which I wanted to avoid as far as
> > possible.
> >
> > Thanks
> > -sujit
> >
> >
> >
> >
> >
> >
> 
> 





Re: [math] patch submitted for commons-math sparse matrix support

2008-12-01 Thread Ted Dunning
Sujit,

What do you think the trade-offs are between the representation that you
used and another one where you have a Map that contains
rows or columns?

On Mon, Dec 1, 2008 at 1:40 PM, Sujit Pal <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I am a new, but fairly happy user of commons-math 1.2 and I have begun
> to use it in my code. I needed a sparse matrix implementation for some
> data analysis work I was doing, and unfortunately the code with
> RealMatrixImpl was coming back with an OOME. So I built a
> SparseRealMatrixImpl as a subclass of RealMatrixImpl, which uses a
> Map as its backing store, storing only the non-zero
> entries. I would like to contribute this implementation back to the
> commons-math project.
>
> I have opened a JIRA enhancement request MATH-230 with a patch against
> the 2.0 branch. Patch and detailed description of the changes are
> available here.
> http://issues.apache.org/jira/browse/MATH-230
>
> I have also had to make some changes to RealMatrixImpl and
> LUDecompositionImpl, mainly replacing the data[][] references with
> RealMatrix.getEntry(i,j) calls to insulate the backing store from being
> accessed directly by calling code, and thus making it possible for
> subclasses to provide their own internal backing stores.
>
> Would appreciate if you folks can take a look and let me know if it is
> feasible to add this into the 2.0 release. That way I can continue using
> commons-math future versions without having to patch it after every
> upgrade.
>
> I noticed that there has been some previous discussion about sparse
> matrix support here:
> http://markmail.org/message/2k5dk4fbewdtn5rz
> so I sent a personal email to John Iacona asking him about the
> code he speaks of here. But he is working on other things, and doesn't
> know when he will be able to check in his sparse matrix support changes.
> His approach is a little different from mine (he has a SparseMatrix
> interface, which is implemented by a SparseMatrixImpl). I considered
> that approach initially, but came to the conclusion that it would mean
> quite a bit of code duplication, which I wanted to avoid as far as
> possible.
>
> Thanks
> -sujit
>
>
>
>
>
>


-- 
Ted Dunning, CTO
DeepDyve
4600 Bohannon Drive, Suite 220
Menlo Park, CA 94025
www.deepdyve.com
650-324-0110, ext. 738
858-414-0013 (m)


[math] patch submitted for commons-math sparse matrix support

2008-12-01 Thread Sujit Pal
Hello,

I am a new, but fairly happy user of commons-math 1.2 and I have begun
to use it in my code. I needed a sparse matrix implementation for some
data analysis work I was doing, and unfortunately the code with
RealMatrixImpl was coming back with an OOME. So I built a
SparseRealMatrixImpl as a subclass of RealMatrixImpl, which uses a
Map as its backing store, storing only the non-zero
entries. I would like to contribute this implementation back to the
commons-math project.
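The approach described can be sketched roughly as follows. The class and key scheme here are hypothetical (the actual patch subclasses RealMatrixImpl and uses a private Point key); the essential point is that only non-zero entries live in the map.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal sketch of a map-backed sparse matrix storing only non-zero
 *  entries. Hypothetical class; the real patch subclasses RealMatrixImpl
 *  and uses a private Point key rather than a packed long. */
class SparseMatrixSketch {
    private final int rows, cols;
    private final Map<Long, Double> entries = new HashMap<>();

    SparseMatrixSketch(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
    }

    // Pack (row, col) into one long key instead of allocating a key object.
    private long key(int i, int j) {
        return ((long) i << 32) | (j & 0xFFFFFFFFL);
    }

    double getEntry(int i, int j) {
        return entries.getOrDefault(key(i, j), 0.0);
    }

    void setEntry(int i, int j, double v) {
        if (v == 0.0) entries.remove(key(i, j));  // keep only non-zero entries
        else entries.put(key(i, j), v);
    }

    int storedEntries() { return entries.size(); }
}
```

Memory use is proportional to the number of non-zero entries, not rows * cols, which is what avoids the OutOfMemoryError described above for large, mostly empty matrices.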

I have opened a JIRA enhancement request MATH-230 with a patch against
the 2.0 branch. Patch and detailed description of the changes are
available here.
http://issues.apache.org/jira/browse/MATH-230

I have also had to make some changes to RealMatrixImpl and
LUDecompositionImpl, mainly replacing the data[][] references with
RealMatrix.getEntry(i,j) calls to insulate the backing store from being
accessed directly by calling code, and thus making it possible for
subclasses to provide their own internal backing stores.

Would appreciate if you folks can take a look and let me know if it is
feasible to add this into the 2.0 release. That way I can continue using
commons-math future versions without having to patch it after every
upgrade.

I noticed that there has been some previous discussion about sparse
matrix support here:
http://markmail.org/message/2k5dk4fbewdtn5rz
so I sent a personal email to John Iacona asking him about the
code he speaks of here. But he is working on other things, and doesn't
know when he will be able to check in his sparse matrix support changes.
His approach is a little different from mine (he has a SparseMatrix
interface, which is implemented by a SparseMatrixImpl). I considered
that approach initially, but came to the conclusion that it would mean
quite a bit of code duplication, which I wanted to avoid as far as
possible.

Thanks
-sujit







Re: [math] Sparse Matrix Support

2008-05-22 Thread John Iacona
>
> The best way to do this is to open a JIRA issue at
> http://issues.apache.org/jira/browse/MATH and to attach patches to this
> issue.
>

Excellent, I will do that.

> Do you think our current architecture is OK for your work? Can you fit
> sparse matrices in it easily?

Yes, I think the current RealMatrix interface will be fine to add a sparse
implementation to.

Regards,
John


Re: [math] Sparse Matrix Support

2008-05-22 Thread Luc Maisonobe
John Iacona wrote:
> So did anything ever happen with the project in the thread referenced by
> Luc? It seems like that was quite a while ago and I have not seen any
> activity in the area of sparse matrices in a while. But if someone is
> already working in this area I would definitely like to collaborate.

I am not aware of any work on this. Your proposal to contribute is very
appealing to me, so if you want to start something, please go ahead. I
will be glad to review and commit the patches.

The best way to do this is to open a JIRA issue at
http://issues.apache.org/jira/browse/MATH and to attach patches to this
issue.

> 
> As for ideas for what needs to be in the linear package, I think that some
> of the most important applications of matrix computation revolve around fast
> solvers that exploit the matrix structure effectively (i.e., solvers based
> on Cholesky, Triangular, LU, etc. decompositions).

Do you think our current architecture is OK for your work? Can you fit
sparse matrices in it easily?

Luc

> 
> Regards,
> John
> 
> On Thu, May 22, 2008 at 5:30 AM, Edward J. Yoon <[EMAIL PROTECTED]> wrote:
> 
>> Just FYI, if you are interested in matrices, you can also take a look at
>> http://wiki.apache.org/incubator/HamaProposal (Hama has been accepted by
>> the IPMC)
>>
>> Edward
>>
>> On Thu, May 22, 2008 at 4:02 PM,  <[EMAIL PROTECTED]> wrote:
>>> Improving the linear algebra package is a clear need and one of the
>> objectives for version 2.0.
>>> The link to the (short) thread may be this one:
>>> http://markmail.org/message/esfeuzzazz6yeqyk
>>>
>>> I also have another discussion ongoing outside the list with potential users
>> of [math] who would like to have fast linear algebra with support for
>> vectors.
>>> I tried to get involved with the Mahout developers, who were seeking some
>> math skills at some point, but the thread ended after my last message
>> without answers (http://markmail.org/message/a743yyz5ycqr2kdv). The team
>> seems to want to develop its own implementation and is not really
>> interested in our work. I gave up, but I really think this is a pity; we
>> should be able to work together.
>>> Perhaps it is time to put down some ideas on what we need in our linear
>> algebra package to meet most users' needs (SVD, sparse matrices, vectors,
>> efficiency, perhaps complex numbers ...)
>>> Luc
>>>
> 






Re: [math] Sparse Matrix Support

2008-05-22 Thread John Iacona
So did anything ever happen with the project in the thread referenced by
Luc? It seems like that was quite a while ago and I have not seen any
activity in the area of sparse matrices in a while. But if someone is
already working in this area I would definitely like to collaborate.

As for ideas for what needs to be in the linear package, I think that some
of the most important applications of matrix computation revolve around fast
solvers that exploit the matrix structure effectively (i.e., solvers based
on Cholesky, Triangular, LU, etc. decompositions).

Regards,
John

On Thu, May 22, 2008 at 5:30 AM, Edward J. Yoon <[EMAIL PROTECTED]> wrote:

> Just FYI, if you are interested in matrices, you can also take a look at
> http://wiki.apache.org/incubator/HamaProposal (Hama has been accepted by
> the IPMC)
>
> Edward
>
> On Thu, May 22, 2008 at 4:02 PM,  <[EMAIL PROTECTED]> wrote:
> > Improving the linear algebra package is a clear need and one of the
> objectives for version 2.0.
> >
> > The link to the (short) thread may be this one:
> > http://markmail.org/message/esfeuzzazz6yeqyk
> >
> > I also have another discussion ongoing outside the list with potential users
> of [math] who would like to have fast linear algebra with support for
> vectors.
> >
> > I tried to get involved with the Mahout developers, who were seeking some
> math skills at some point, but the thread ended after my last message
> without answers (http://markmail.org/message/a743yyz5ycqr2kdv). The team
> seems to want to develop its own implementation and is not really
> interested in our work. I gave up, but I really think this is a pity; we
> should be able to work together.
> >
> > Perhaps it is time to put down some ideas on what we need in our linear
> algebra package to meet most users' needs (SVD, sparse matrices, vectors,
> efficiency, perhaps complex numbers ...)
> >
> > Luc
> >
>


Re: [math] Sparse Matrix Support

2008-05-22 Thread Edward J. Yoon
Just FYI, if you are interested in matrices, you can also take a look at
http://wiki.apache.org/incubator/HamaProposal (Hama has been accepted by
the IPMC)

Edward

On Thu, May 22, 2008 at 4:02 PM,  <[EMAIL PROTECTED]> wrote:
> Improving the linear algebra package is a clear need and one of the 
> objectives for version 2.0.
>
> The link to the (short) thread may be this one:
> http://markmail.org/message/esfeuzzazz6yeqyk
>
> I also have another discussion ongoing outside the list with potential users of 
> [math] who would like to have fast linear algebra with support for vectors.
>
> I tried to get involved with the Mahout developers, who were seeking some 
> math skills at some point, but the thread ended after my last message without 
> answers (http://markmail.org/message/a743yyz5ycqr2kdv). The team seems to 
> want to develop its own implementation and is not really interested in our 
> work. I gave up, but I really think this is a pity; we should be able to work 
> together.
>
> Perhaps it is time to put down some ideas on what we need in our linear algebra 
> package to meet most users' needs (SVD, sparse matrices, vectors, efficiency, 
> perhaps complex numbers ...)
>
> Luc
>
> - Mail Original -
> De: "John Iacona" <[EMAIL PROTECTED]>
> À: "Commons Developers List" 
> Envoyé: Jeudi 22 Mai 2008 03:56:39 GMT +01:00 Amsterdam / Berlin / Berne / 
> Rome / Stockholm / Vienne
> Objet: Re: [math] Sparse Matrix Support
>
> This is the book I was referring to
> http://www.ec-securehost.com/SIAM/FA02.html
>
> Regards,
> John
>
> On Wed, May 21, 2008 at 5:26 PM, Rory Winston <[EMAIL PROTECTED]>
> wrote:
>
>> Hey John
>>
>> Which book are you referring to?
>>
>> Also, take a look at the matrix classes in the Apache Mahout project - I
>> believe (though don't quote me on it) that they may have some support for
>> sparse matrices.
>>
>> Cheers
>> Rory
>>
>>
>> John Iacona wrote:
>>
>>> I noticed that support for sparse matrix computation was on the commons
>>> math
>>> wishlist. This is a project I would be interested in taking on.
>>>
>>> I would like to add a SparseMatrix interface which would support
>>> functionality analogous to the RealMatrix interface.
>>> I have worked with Tim Davis, the author of the CSparse suite, and would
>>> be basing my code on algorithms outlined in his book.
>>>
>>> On the wishlist page there was a reference to a thread using the old
>>> archive
>>> url format. Does anyone know how to find that thread with the new format?
>>> Searches come up empty.
>>>
>>> Regards,
>>>
>>> John Iacona
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>



-- 
Best regards,
Edward J. Yoon,
http://blog.udanax.org


Re: [math] Sparse Matrix Support

2008-05-22 Thread luc . maisonobe
Improving the linear algebra package is a clear need and one of the objectives 
for version 2.0.

The link to the (short) thread may be this one:
http://markmail.org/message/esfeuzzazz6yeqyk

I also have another discussion ongoing outside the list with potential users of 
[math] who would like to have fast linear algebra with support for vectors.

I tried to get involved with the Mahout developers, who were seeking math 
skills at some point, but the thread ended after my last message without any 
answer (http://markmail.org/message/a743yyz5ycqr2kdv). The team seems to want 
to develop its own implementation and does not seem really interested in our work. 
I gave up, but I really think this is a pity; we should be able to work 
together.

Perhaps it is time to put down some ideas on what we need in our linear algebra 
package to meet most users' needs (SVD, sparse matrices, vectors, efficiency, 
perhaps complex numbers ...)
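[Editor's note: as a rough illustration of the efficiency point above, here is a minimal, hypothetical sketch of a map-backed sparse vector. The class and method names are illustrative only and are not taken from any submitted patch.]

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a sparse vector backed by a map from index to
// value. Absent keys are implicitly zero, so storage is proportional
// to the number of nonzeros, not to the dimension.
public class SparseVectorDemo {
    private final Map<Integer, Double> entries = new HashMap<>();
    private final int dimension;

    public SparseVectorDemo(int dimension) {
        this.dimension = dimension;
    }

    public void setEntry(int index, double value) {
        if (value == 0.0) {
            entries.remove(index);   // never store explicit zeros
        } else {
            entries.put(index, value);
        }
    }

    public double getEntry(int index) {
        return entries.getOrDefault(index, 0.0);
    }

    // The dot product iterates only over this vector's nonzeros,
    // so the cost is O(nnz) rather than O(dimension).
    public double dotProduct(SparseVectorDemo other) {
        double sum = 0.0;
        for (Map.Entry<Integer, Double> e : entries.entrySet()) {
            sum += e.getValue() * other.getEntry(e.getKey());
        }
        return sum;
    }
}
```

For a vector of dimension one million with a handful of nonzero entries, the dot product above touches only those few entries, which is the whole argument for sparse structures in the linear algebra package.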

Luc

- Mail Original -
De: "John Iacona" <[EMAIL PROTECTED]>
À: "Commons Developers List" 
Envoyé: Jeudi 22 Mai 2008 03:56:39 GMT +01:00 Amsterdam / Berlin / Berne / Rome 
/ Stockholm / Vienne
Objet: Re: [math] Sparse Matrix Support

This is the book I was referring to:
http://www.ec-securehost.com/SIAM/FA02.html

Regards,
John

On Wed, May 21, 2008 at 5:26 PM, Rory Winston <[EMAIL PROTECTED]>
wrote:

> Hey John
>
> Which book are you referring to?
>
> Also, take a look at the matrix classes in the Apache Mahout project - I
> believe (though don't quote me on it) that they may have some support for
> sparse matrices.
>
> Cheers
> Rory
>
>
> John Iacona wrote:
>
>> I noticed that support for sparse matrix computation was on the commons
>> math
>> wishlist. This is a project I would be interested in taking on.
>>
>> I would like to add a SparseMatrix interface which would support
>> functionality analogous to the RealMatrix interface.
>> I have worked with Tim Davis, the author of the CSparse suite, and would be
>> basing my code on algorithms outlined in his book.
>>
>> On the wishlist page there was a reference to a thread using the old
>> archive
>> URL format. Does anyone know how to find that thread with the new format?
>> Searches come up empty.
>>
>> Regards,
>>
>> John Iacona
>>
>>
>>
>
>
>
>




Re: [math] Sparse Matrix Support

2008-05-21 Thread John Iacona
This is the book I was referring to:
http://www.ec-securehost.com/SIAM/FA02.html

Regards,
John

On Wed, May 21, 2008 at 5:26 PM, Rory Winston <[EMAIL PROTECTED]>
wrote:

> Hey John
>
> Which book are you referring to?
>
> Also, take a look at the matrix classes in the Apache Mahout project - I
> believe (though don't quote me on it) that they may have some support for
> sparse matrices.
>
> Cheers
> Rory
>
>
> John Iacona wrote:
>
>> I noticed that support for sparse matrix computation was on the commons
>> math
>> wishlist. This is a project I would be interested in taking on.
>>
>> I would like to add a SparseMatrix interface which would support
>> functionality analogous to the RealMatrix interface.
>> I have worked with Tim Davis, the author of the CSparse suite, and would be
>> basing my code on algorithms outlined in his book.
>>
>> On the wishlist page there was a reference to a thread using the old
>> archive
>> URL format. Does anyone know how to find that thread with the new format?
>> Searches come up empty.
>>
>> Regards,
>>
>> John Iacona
>>
>>
>>
>
>
>
>


Re: [math] Sparse Matrix Support

2008-05-21 Thread Rory Winston

Hey John

Which book are you referring to?

Also, take a look at the matrix classes in the Apache Mahout project - I 
believe (though don't quote me on it) that they may have some support for 
sparse matrices.


Cheers
Rory

John Iacona wrote:

I noticed that support for sparse matrix computation was on the commons math
wishlist. This is a project I would be interested in taking on.

I would like to add a SparseMatrix interface which would support
functionality analogous to the RealMatrix interface.
I have worked with Tim Davis, the author of the CSparse suite, and would be
basing my code on algorithms outlined in his book.

On the wishlist page there was a reference to a thread using the old archive
URL format. Does anyone know how to find that thread with the new format?
Searches come up empty.

Regards,

John Iacona

  






[math] Sparse Matrix Support

2008-05-21 Thread John Iacona
I noticed that support for sparse matrix computation was on the commons math
wishlist. This is a project I would be interested in taking on.

I would like to add a SparseMatrix interface which would support
functionality analogous to the RealMatrix interface.
I have worked with Tim Davis, the author of the CSparse suite, and would be
basing my code on algorithms outlined in his book.
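[Editor's note: CSparse stores matrices in compressed sparse column (CSC) form. As a rough, hypothetical sketch of that idea (the class and array names below are illustrative, not from CSparse or any submitted patch), a matrix-vector product over CSC storage looks like this:]

```java
// Hypothetical sketch: y = A*x over compressed sparse column (CSC)
// storage. Only the nonzero entries are stored: for each column j,
// colPtr[j]..colPtr[j+1]-1 index its nonzeros in rowIdx/values.
public class CscDemo {
    final int[] colPtr;     // column pointers, length ncols+1
    final int[] rowIdx;     // row index of each nonzero
    final double[] values;  // value of each nonzero
    final int rows;

    CscDemo(int rows, int[] colPtr, int[] rowIdx, double[] values) {
        this.rows = rows;
        this.colPtr = colPtr;
        this.rowIdx = rowIdx;
        this.values = values;
    }

    // y = A*x in O(nnz) time, independent of the dense m*n size.
    double[] multiply(double[] x) {
        double[] y = new double[rows];
        for (int j = 0; j < colPtr.length - 1; j++) {
            for (int p = colPtr[j]; p < colPtr[j + 1]; p++) {
                y[rowIdx[p]] += values[p] * x[j];
            }
        }
        return y;
    }
}
```

For example, the 3x3 matrix [[1,0,2],[0,3,0],[0,0,4]] would be stored column by column as colPtr = {0,1,2,4}, rowIdx = {0,1,0,2}, values = {1,3,2,4}, and multiplying by x = {1,1,1} yields {3,3,4}.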

On the wishlist page there was a reference to a thread using the old archive
URL format. Does anyone know how to find that thread with the new format?
Searches come up empty.

Regards,

John Iacona