Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hello, 2012/6/16 Sébastien Brisard : > Hi Gilles, > and thanks for your time spent on this issue! > > 2012/6/16 Gilles Sadowski : >>> [...] >>> >>> >> >>> >> > Maybe that we could also impose that it is forbidden to divide by a >>> >> > sparse >>> >> > vector whose default value is zero. This will avoid a check on all >>> >> > entries >>> >> > (at the expense of forbidding legitimate divisions, when all the >>> >> > entries >>> >> > have a non-default value; but then one would wonder why using a sparse >>> >> > vector...). >>> > >>> > [...] >>> >>> Gilles, I have the feeling that your prefered option is to store the >>> sign of the default value. >> >> Not really; if there are no obvious drawbacks, I'd prefer the above option. >> I.e. something like >> >> public OpenMapRealVector ebeDivide(OpenMapRealVector v) { >> if (v.getDefaultEntry() == 0) { >> throw new ZeroException(); >> } >> >> // ... >> } >> >>> In theory, that would solve the problem. I >>> see two objections >>> 1. sparse vectors would be less sparse (most users do not care about >>> the sign of the zero entries!) >> >> I agree; division by zero is in most (all?) cases a bug. So thoses users >> won't care about the above solution (or will be happy that their code fails >> early instead of propagating NaNs). >> > Yes, I think it's a good point (I would be glad if *any* calculation > could be stored when NaNs occured...). > What I can do is add some details in the javadoc, stating that we > opted for the exception option for robustness reasons, which leads to > less efficient code. I can suggest to use visitors if a more efficient > (and fragile !) solution is seeked. > > >>> 2. there is a threshold to decided whether or not an entry is actually >>> zero. So the current implementation already assumes the sign is lost. >> >> That's why the unit test fails. I think that it is more robust to make it >> explicit (by throwing an exception) that division by zero is not supported. >> >>> I'm really confused about the best solution to adopt. >> >> So am I. >> But my preference is to be more robust even if less efficient and not >> supporting corner cases. >> > I will implement the exception option, and alter the unit tests accordingly. > > Best regards, and thanks again for your help on this issue. That was > an interesting discussion! > > Sébastien well, I was hoping that we could close this issue, but I have realized that the problem is more wide-spread. Because the subject of this thread is not very explicit, I'm afraid not everyone is aware of these discussions. I'll create a new thread. Amitiés, Sébastien - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hi Gilles, and thanks for your time spent on this issue! 2012/6/16 Gilles Sadowski : >> [...] >> >> >> >> >> > Maybe that we could also impose that it is forbidden to divide by a >> >> > sparse >> >> > vector whose default value is zero. This will avoid a check on all >> >> > entries >> >> > (at the expense of forbidding legitimate divisions, when all the entries >> >> > have a non-default value; but then one would wonder why using a sparse >> >> > vector...). >> > >> > [...] >> >> Gilles, I have the feeling that your prefered option is to store the >> sign of the default value. > > Not really; if there are no obvious drawbacks, I'd prefer the above option. > I.e. something like > > public OpenMapRealVector ebeDivide(OpenMapRealVector v) { > if (v.getDefaultEntry() == 0) { > throw new ZeroException(); > } > > // ... > } > >> In theory, that would solve the problem. I >> see two objections >> 1. sparse vectors would be less sparse (most users do not care about >> the sign of the zero entries!) > > I agree; division by zero is in most (all?) cases a bug. So thoses users > won't care about the above solution (or will be happy that their code fails > early instead of propagating NaNs). > Yes, I think it's a good point (I would be glad if *any* calculation could be stored when NaNs occured...). What I can do is add some details in the javadoc, stating that we opted for the exception option for robustness reasons, which leads to less efficient code. I can suggest to use visitors if a more efficient (and fragile !) solution is seeked. >> 2. there is a threshold to decided whether or not an entry is actually >> zero. So the current implementation already assumes the sign is lost. > > That's why the unit test fails. I think that it is more robust to make it > explicit (by throwing an exception) that division by zero is not supported. > >> I'm really confused about the best solution to adopt. > > So am I. > But my preference is to be more robust even if less efficient and not > supporting corner cases. > I will implement the exception option, and alter the unit tests accordingly. Best regards, and thanks again for your help on this issue. That was an interesting discussion! Sébastien - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
> [...] > > >> > >> > Maybe that we could also impose that it is forbidden to divide by a > >> > sparse > >> > vector whose default value is zero. This will avoid a check on all > >> > entries > >> > (at the expense of forbidding legitimate divisions, when all the entries > >> > have a non-default value; but then one would wonder why using a sparse > >> > vector...). > > > > [...] > > Gilles, I have the feeling that your prefered option is to store the > sign of the default value. Not really; if there are no obvious drawbacks, I'd prefer the above option. I.e. something like public OpenMapRealVector ebeDivide(OpenMapRealVector v) { if (v.getDefaultEntry() == 0) { throw new ZeroException(); } // ... } > In theory, that would solve the problem. I > see two objections > 1. sparse vectors would be less sparse (most users do not care about > the sign of the zero entries!) I agree; division by zero is in most (all?) cases a bug. So thoses users won't care about the above solution (or will be happy that their code fails early instead of propagating NaNs). > 2. there is a threshold to decided whether or not an entry is actually > zero. So the current implementation already assumes the sign is lost. That's why the unit test fails. I think that it is more robust to make it explicit (by throwing an exception) that division by zero is not supported. > I'm really confused about the best solution to adopt. So am I. But my preference is to be more robust even if less efficient and not supporting corner cases. Best, Gilles - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hi Gilles, 2012/6/16 Gilles Sadowski : > Hi. > >> >> >> >> > [...] >> >> > >> >> >> I fully agree. We could also opt for a less correct, but more >> >> >> efficient solution: we do not store the sign of zero, and return NaN >> >> >> each time v / zero occurs. The result should be NaN anyway, because >> >> >> its sign is undecidable. This specificity would be clearly stated in >> >> >> the javadoc. What do you think? >> >> > >> >> > We had this sort of discussion with the "Complex" class, where there are >> >> > such issues as "standard" vs "correct" vs "documented" etc. E.g. some >> >> > computations return "NaN" where it should be "infinity" (or vice-versa, >> >> > I >> >> > don't remember exactly). See MATH-667: >> >> > https://issues.apache.org/jira/browse/MATH-667 >> >> > >> >> > Maybe that we must be guided with some use-cases for this class. >> >> > Maybe in actual uses, the problems discussed here do not appear, and any >> >> > additional check would destroy the usefulness of this class in such >> >> > cases >> >> > (just a wild guess). Maybe that we should drop support for sparse >> >> > vectors! >> >> > >> >> I fully agree with you. Speaking quite honestly as a user of the >> >> linear package, even if I know that IEEE floats handle infinite values >> >> well, I tend to dish results that contain those values, and try to >> >> cure the cause of these unwanted values. So, as a user, whether the >> >> method returns NaN of infinity does not matter to me (as long as it >> >> does not return zero --or any normal value-- quietly): in any case, I >> >> would do a computation again, this time with "correct" values. >> >> >> >> I also think that in this case, strict adherence to IEEE standards >> >> (which were written for scalar values, not vectors) should not >> >> compromise the performance of the class. >> > >> > Am I mistaken, or is this at odds with a fix that will make the "division" >> > test pass? [I.e. If there is no way to distinguish between +0 and -0 as the >> > default entry then "1 / default" will always be "+inf".] >> > >> >> Do you mean I am bending the rules in favour of this specific case? > > I'm just referring to that failure: > --- > java.lang.AssertionError: entry #14, left = 1.0, right = -0.0 > expected:<-Infinity> but was: > --- > > I.e. you can never make this test pass if you do not keep the sign. > Agreed > >> In >> fact, it could be argued that the returned value *should* be NaN, >> since its sign is not decidable. Isn't that the reason why NaNs were >> created in the first place? If we clearly state that for sparse >> vectors with zero default value, the sign is lost, then the logical >> conclusion is that ebeDivide should return NaN instead of infinity. >> This is only my point of view, and I'm quite happy with any decision >> (since none is going to be perfect...). > > I'm not convinced; I'd rather throw an exception. > OK, that's an option. >> >> >> >> I would be more cautious with your suggestion about getting rid of >> >> sparse vectors. I think it's nice to have them, even if the support is >> >> limited (we should of course concentrate on array-based vectors >> >> first). As long as we clearly document these limits, I'd like to keep >> >> them. Besides, with the new set of abstract tests being constructed >> >> for any vector class, it should be possible in the future to propose >> >> and test a better implementation. The problem we have is clearly lack >> >> of feedback on these features (remember the debates we had on the >> >> "sparseIterators"-- not perfect, but experience only can tell how to >> >> improve). >> > >> > Well, if we know that it doesn't work well and we don't know how to fix it >> > while at the same time keep its usefulness (efficiency-wise), we have to >> > raise the issue of the necessity to keep supporting it even though it seems >> > that nobody uses it. >> > >> > For the time being (staying compatible), we should at least fix the failing >> > unit test problem; hence decide for example (as you said before) that the >> > purpose is to save memory, but the trade-off is loss of performance. >> > >> Yes, but the problem is that the inefficient fix has already been >> implemented... And it does not fix all bugs. >> Anyway, we could probably aim at the most correct solution (still to >> be defined!), at the expense of performance, since a more efficient >> implementation can easily be implemented through the newly implemented >> visitor pattern. > > > If the visitor can compute a correct answer more efficiently, then IMO we > should readily deprecate the method and remove it ASAP. > Sorry, that wasn't what I meant. I meant that a visitor can compute a less correct answer more efficiently. So if users can cope with special values incorrectly handled, this opportunity is offered to them, while the default implementation is more correct. >> >> > Maybe that we could also impose that it is forbidden to divide by a sparse >> > vector whose default
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hi. > >> > >> > [...] > >> > > >> >> I fully agree. We could also opt for a less correct, but more > >> >> efficient solution: we do not store the sign of zero, and return NaN > >> >> each time v / zero occurs. The result should be NaN anyway, because > >> >> its sign is undecidable. This specificity would be clearly stated in > >> >> the javadoc. What do you think? > >> > > >> > We had this sort of discussion with the "Complex" class, where there are > >> > such issues as "standard" vs "correct" vs "documented" etc. E.g. some > >> > computations return "NaN" where it should be "infinity" (or vice-versa, I > >> > don't remember exactly). See MATH-667: > >> > https://issues.apache.org/jira/browse/MATH-667 > >> > > >> > Maybe that we must be guided with some use-cases for this class. > >> > Maybe in actual uses, the problems discussed here do not appear, and any > >> > additional check would destroy the usefulness of this class in such cases > >> > (just a wild guess). Maybe that we should drop support for sparse > >> > vectors! > >> > > >> I fully agree with you. Speaking quite honestly as a user of the > >> linear package, even if I know that IEEE floats handle infinite values > >> well, I tend to dish results that contain those values, and try to > >> cure the cause of these unwanted values. So, as a user, whether the > >> method returns NaN of infinity does not matter to me (as long as it > >> does not return zero --or any normal value-- quietly): in any case, I > >> would do a computation again, this time with "correct" values. > >> > >> I also think that in this case, strict adherence to IEEE standards > >> (which were written for scalar values, not vectors) should not > >> compromise the performance of the class. > > > > Am I mistaken, or is this at odds with a fix that will make the "division" > > test pass? [I.e. If there is no way to distinguish between +0 and -0 as the > > default entry then "1 / default" will always be "+inf".] > > > > Do you mean I am bending the rules in favour of this specific case? I'm just referring to that failure: --- java.lang.AssertionError: entry #14, left = 1.0, right = -0.0 expected:<-Infinity> but was: --- I.e. you can never make this test pass if you do not keep the sign. > In > fact, it could be argued that the returned value *should* be NaN, > since its sign is not decidable. Isn't that the reason why NaNs were > created in the first place? If we clearly state that for sparse > vectors with zero default value, the sign is lost, then the logical > conclusion is that ebeDivide should return NaN instead of infinity. > This is only my point of view, and I'm quite happy with any decision > (since none is going to be perfect...). I'm not convinced; I'd rather throw an exception. > >> > >> I would be more cautious with your suggestion about getting rid of > >> sparse vectors. I think it's nice to have them, even if the support is > >> limited (we should of course concentrate on array-based vectors > >> first). As long as we clearly document these limits, I'd like to keep > >> them. Besides, with the new set of abstract tests being constructed > >> for any vector class, it should be possible in the future to propose > >> and test a better implementation. The problem we have is clearly lack > >> of feedback on these features (remember the debates we had on the > >> "sparseIterators"-- not perfect, but experience only can tell how to > >> improve). > > > > Well, if we know that it doesn't work well and we don't know how to fix it > > while at the same time keep its usefulness (efficiency-wise), we have to > > raise the issue of the necessity to keep supporting it even though it seems > > that nobody uses it. > > > > For the time being (staying compatible), we should at least fix the failing > > unit test problem; hence decide for example (as you said before) that the > > purpose is to save memory, but the trade-off is loss of performance. > > > Yes, but the problem is that the inefficient fix has already been > implemented... And it does not fix all bugs. > Anyway, we could probably aim at the most correct solution (still to > be defined!), at the expense of performance, since a more efficient > implementation can easily be implemented through the newly implemented > visitor pattern. If the visitor can compute a correct answer more efficiently, then IMO we should readily deprecate the method and remove it ASAP. > > > Maybe that we could also impose that it is forbidden to divide by a sparse > > vector whose default value is zero. This will avoid a check on all entries > > (at the expense of forbidding legitimate divisions, when all the entries > > have a non-default value; but then one would wonder why using a sparse > > vector...). > > > That's also an idea to explore. I have to say I still do not see > clearly what we should do. I quickly looked at BLAS, they apparently > do not provide an element-by-element division. I should look at > octave, scilab, numpy
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hi Gilles, > >> >> > [...] >> > >> >> I fully agree. We could also opt for a less correct, but more >> >> efficient solution: we do not store the sign of zero, and return NaN >> >> each time v / zero occurs. The result should be NaN anyway, because >> >> its sign is undecidable. This specificity would be clearly stated in >> >> the javadoc. What do you think? >> > >> > We had this sort of discussion with the "Complex" class, where there are >> > such issues as "standard" vs "correct" vs "documented" etc. E.g. some >> > computations return "NaN" where it should be "infinity" (or vice-versa, I >> > don't remember exactly). See MATH-667: >> > https://issues.apache.org/jira/browse/MATH-667 >> > >> > Maybe that we must be guided with some use-cases for this class. >> > Maybe in actual uses, the problems discussed here do not appear, and any >> > additional check would destroy the usefulness of this class in such cases >> > (just a wild guess). Maybe that we should drop support for sparse vectors! >> > >> I fully agree with you. Speaking quite honestly as a user of the >> linear package, even if I know that IEEE floats handle infinite values >> well, I tend to dish results that contain those values, and try to >> cure the cause of these unwanted values. So, as a user, whether the >> method returns NaN of infinity does not matter to me (as long as it >> does not return zero --or any normal value-- quietly): in any case, I >> would do a computation again, this time with "correct" values. >> >> I also think that in this case, strict adherence to IEEE standards >> (which were written for scalar values, not vectors) should not >> compromise the performance of the class. > > Am I mistaken, or is this at odds with a fix that will make the "division" > test pass? [I.e. If there is no way to distinguish between +0 and -0 as the > default entry then "1 / default" will always be "+inf".] > Do you mean I am bending the rules in favour of this specific case? In fact, it could be argued that the returned value *should* be NaN, since its sign is not decidable. Isn't that the reason why NaNs were created in the first place? If we clearly state that for sparse vectors with zero default value, the sign is lost, then the logical conclusion is that ebeDivide should return NaN instead of infinity. This is only my point of view, and I'm quite happy with any decision (since none is going to be perfect...). >> >> I would be more cautious with your suggestion about getting rid of >> sparse vectors. I think it's nice to have them, even if the support is >> limited (we should of course concentrate on array-based vectors >> first). As long as we clearly document these limits, I'd like to keep >> them. Besides, with the new set of abstract tests being constructed >> for any vector class, it should be possible in the future to propose >> and test a better implementation. The problem we have is clearly lack >> of feedback on these features (remember the debates we had on the >> "sparseIterators"-- not perfect, but experience only can tell how to >> improve). > > Well, if we know that it doesn't work well and we don't know how to fix it > while at the same time keep its usefulness (efficiency-wise), we have to > raise the issue of the necessity to keep supporting it even though it seems > that nobody uses it. > > For the time being (staying compatible), we should at least fix the failing > unit test problem; hence decide for example (as you said before) that the > purpose is to save memory, but the trade-off is loss of performance. > Yes, but the problem is that the inefficient fix has already been implemented... And it does not fix all bugs. Anyway, we could probably aim at the most correct solution (still to be defined!), at the expense of performance, since a more efficient implementation can easily be implemented through the newly implemented visitor pattern. > Maybe that we could also impose that it is forbidden to divide by a sparse > vector whose default value is zero. This will avoid a check on all entries > (at the expense of forbidding legitimate divisions, when all the entries > have a non-default value; but then one would wonder why using a sparse > vector...). > That's also an idea to explore. I have to say I still do not see clearly what we should do. I quickly looked at BLAS, they apparently do not provide an element-by-element division. I should look at octave, scilab, numpy... Sébastien > > Best, > Gilles > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hi. > > > [...] > > > >> I fully agree. We could also opt for a less correct, but more > >> efficient solution: we do not store the sign of zero, and return NaN > >> each time v / zero occurs. The result should be NaN anyway, because > >> its sign is undecidable. This specificity would be clearly stated in > >> the javadoc. What do you think? > > > > We had this sort of discussion with the "Complex" class, where there are > > such issues as "standard" vs "correct" vs "documented" etc. E.g. some > > computations return "NaN" where it should be "infinity" (or vice-versa, I > > don't remember exactly). See MATH-667: > > https://issues.apache.org/jira/browse/MATH-667 > > > > Maybe that we must be guided with some use-cases for this class. > > Maybe in actual uses, the problems discussed here do not appear, and any > > additional check would destroy the usefulness of this class in such cases > > (just a wild guess). Maybe that we should drop support for sparse vectors! > > > I fully agree with you. Speaking quite honestly as a user of the > linear package, even if I know that IEEE floats handle infinite values > well, I tend to dish results that contain those values, and try to > cure the cause of these unwanted values. So, as a user, whether the > method returns NaN of infinity does not matter to me (as long as it > does not return zero --or any normal value-- quietly): in any case, I > would do a computation again, this time with "correct" values. > > I also think that in this case, strict adherence to IEEE standards > (which were written for scalar values, not vectors) should not > compromise the performance of the class. Am I mistaken, or is this at odds with a fix that will make the "division" test pass? [I.e. If there is no way to distinguish between +0 and -0 as the default entry then "1 / default" will always be "+inf".] > > I would be more cautious with your suggestion about getting rid of > sparse vectors. I think it's nice to have them, even if the support is > limited (we should of course concentrate on array-based vectors > first). As long as we clearly document these limits, I'd like to keep > them. Besides, with the new set of abstract tests being constructed > for any vector class, it should be possible in the future to propose > and test a better implementation. The problem we have is clearly lack > of feedback on these features (remember the debates we had on the > "sparseIterators"-- not perfect, but experience only can tell how to > improve). Well, if we know that it doesn't work well and we don't know how to fix it while at the same time keep its usefulness (efficiency-wise), we have to raise the issue of the necessity to keep supporting it even though it seems that nobody uses it. For the time being (staying compatible), we should at least fix the failing unit test problem; hence decide for example (as you said before) that the purpose is to save memory, but the trade-off is loss of performance. Maybe that we could also impose that it is forbidden to divide by a sparse vector whose default value is zero. This will avoid a check on all entries (at the expense of forbidding legitimate divisions, when all the entries have a non-default value; but then one would wonder why using a sparse vector...). Best, Gilles - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hello Gilles, > [...] > >> I fully agree. We could also opt for a less correct, but more >> efficient solution: we do not store the sign of zero, and return NaN >> each time v / zero occurs. The result should be NaN anyway, because >> its sign is undecidable. This specificity would be clearly stated in >> the javadoc. What do you think? > > We had this sort of discussion with the "Complex" class, where there are > such issues as "standard" vs "correct" vs "documented" etc. E.g. some > computations return "NaN" where it should be "infinity" (or vice-versa, I > don't remember exactly). See MATH-667: > https://issues.apache.org/jira/browse/MATH-667 > > Maybe that we must be guided with some use-cases for this class. > Maybe in actual uses, the problems discussed here do not appear, and any > additional check would destroy the usefulness of this class in such cases > (just a wild guess). Maybe that we should drop support for sparse vectors! > I fully agree with you. Speaking quite honestly as a user of the linear package, even if I know that IEEE floats handle infinite values well, I tend to dish results that contain those values, and try to cure the cause of these unwanted values. So, as a user, whether the method returns NaN of infinity does not matter to me (as long as it does not return zero --or any normal value-- quietly): in any case, I would do a computation again, this time with "correct" values. I also think that in this case, strict adherence to IEEE standards (which were written for scalar values, not vectors) should not compromise the performance of the class. I would be more cautious with your suggestion about getting rid of sparse vectors. I think it's nice to have them, even if the support is limited (we should of course concentrate on array-based vectors first). As long as we clearly document these limits, I'd like to keep them. Besides, with the new set of abstract tests being constructed for any vector class, it should be possible in the future to propose and test a better implementation. The problem we have is clearly lack of feedback on these features (remember the debates we had on the "sparseIterators"-- not perfect, but experience only can tell how to improve). Best regards, Sébastien - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
On Tue, Jun 12, 2012 at 01:14:40PM +0200, Sébastien Brisard wrote: > 2012/6/12 Gilles Sadowski : > > Hello. > > > >> > [...] > >> > I agree. > >> > > >> >> > >> >> This does not solve the whole issue, however, because if the default > >> >> entry is zero, its sign is lost, and {finite value} / {zero} is of > >> >> undetermined sign. Any idea regarding this point? > >> > > >> > I guess that we could also keep a flag for the sign. > >> > > >> Do you mean that the default value would be +0 or -0, in some zero > >> entries would be stored? > > > > I didn't get the end of the sentence. > > > Yes, something got lost in translation... > In your solution, some zero entries would be stored, while others > wouldn't. This is slightly inefficient memory-wise. > > > Anyways, what I meant is indeed to be able to specify that the default zero > > is negative (so that "v / zero" has the correct sign). > > > >> Or do you mean to store the sign of each zero > >> entry, which is slightly less efficient than the present > >> implementation, but would work OK? > > > > No. > > > >> These signs could be stored in very > >> compact form (as bits), as they need only seldom be retrieved. > > > > I strongly prefer a "boolean" unless you can devise a way to perform > > calculations that strongly benefit from bit manipulation. > > > > Anyways, let's not try to re-invent the wheel... > > > I fully agree. We could also opt for a less correct, but more > efficient solution: we do not store the sign of zero, and return NaN > each time v / zero occurs. The result should be NaN anyway, because > its sign is undecidable. This specificity would be clearly stated in > the javadoc. What do you think? We had this sort of discussion with the "Complex" class, where there are such issues as "standard" vs "correct" vs "documented" etc. E.g. some computations return "NaN" where it should be "infinity" (or vice-versa, I don't remember exactly). See MATH-667: https://issues.apache.org/jira/browse/MATH-667 Maybe that we must be guided with some use-cases for this class. Maybe in actual uses, the problems discussed here do not appear, and any additional check would destroy the usefulness of this class in such cases (just a wild guess). Maybe that we should drop support for sparse vectors! Gilles - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
2012/6/12 Gilles Sadowski : > Hello. > >> > [...] >> > I agree. >> > >> >> >> >> This does not solve the whole issue, however, because if the default >> >> entry is zero, its sign is lost, and {finite value} / {zero} is of >> >> undetermined sign. Any idea regarding this point? >> > >> > I guess that we could also keep a flag for the sign. >> > >> Do you mean that the default value would be +0 or -0, in some zero >> entries would be stored? > > I didn't get the end of the sentence. > Yes, something got lost in translation... In your solution, some zero entries would be stored, while others wouldn't. This is slightly inefficient memory-wise. > Anyways, what I meant is indeed to be able to specify that the default zero > is negative (so that "v / zero" has the correct sign). > >> Or do you mean to store the sign of each zero >> entry, which is slightly less efficient than the present >> implementation, but would work OK? > > No. > >> These signs could be stored in very >> compact form (as bits), as they need only seldom be retrieved. > > I strongly prefer a "boolean" unless you can devise a way to perform > calculations that strongly benefit from bit manipulation. > > Anyways, let's not try to re-invent the wheel... > I fully agree. We could also opt for a less correct, but more efficient solution: we do not store the sign of zero, and return NaN each time v / zero occurs. The result should be NaN anyway, because its sign is undecidable. This specificity would be clearly stated in the javadoc. What do you think? Sébastien > Regards, > Gilles > >> [...] > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hello. > > [...] > > I agree. > > > >> > >> This does not solve the whole issue, however, because if the default > >> entry is zero, its sign is lost, and {finite value} / {zero} is of > >> undetermined sign. Any idea regarding this point? > > > > I guess that we could also keep a flag for the sign. > > > Do you mean that the default value would be +0 or -0, in some zero > entries would be stored? I didn't get the end of the sentence. Anyways, what I meant is indeed to be able to specify that the default zero is negative (so that "v / zero" has the correct sign). > Or do you mean to store the sign of each zero > entry, which is slightly less efficient than the present > implementation, but would work OK? No. > These signs could be stored in very > compact form (as bits), as they need only seldom be retrieved. I strongly prefer a "boolean" unless you can devise a way to perform calculations that strongly benefit from bit manipulation. Anyways, let's not try to re-invent the wheel... Regards, Gilles > [...] - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hello, 2012/6/11 Gilles Sadowski : > Hi. > >> >> >> + * MATH-803: it is not sufficient to loop through non zero >> >> >> entries of >> >> >> + * this only. Indeed, if this[i] = 0d and v[i] = 0d, then >> >> >> + * this[i] / v[i] = NaN, and not 0d. >> >> >> + */ >> >> >> + final int n = getDimension(); >> >> >> + for (int i = 0; i < n; i++) { >> >> >> + res.setEntry(i, this.getEntry(i) / v.getEntry(i)); >> >> >> } >> >> >> return res; >> >> >> } >> >> > >> >> > I think that this renders the class potentially useless, if we assume >> >> > that >> >> > it is used when "large" vectors are needed. >> >> > Indeed, after this operation, all the entries will be stored; thus >> >> > cancelling the memory efficiency of the class, and probably making the >> >> > returned value slower than an "ArrayRealVector" instance for subsequent >> >> > operations. >> >> > >> >> I'm not sure that all entries would be stored (except if setEntry does >> >> not distinguish between zero values and non-zero values). >> > >> > The problem is when the common values are not the default one, like ... >> > >> >> >> >> > For every method that a "RealVector" argument, there should be a >> >> > specialized >> >> > implementation that take an "OpenMapRealVector". >> >> > >> >> > Also, couldn't we solve some of these problems if the value of the >> >> > default >> >> > entry was stored and mutable? E.g. if the default for "v" and "w" is >> >> > zero, >> >> > then the default for "v / w" will be "NaN". >> > >> > ... in this example. >> > >> Ooops, missed this one! >> Yes that was an improvement I was thinking of (in fact, the unit tests >> are already implemented with this idea in mind). >> This would allow efficient implementation of the cosine for example. >> I didn't think about the application you suggest, but I like this idea >> very much. >> >> How about we leave it as is for the time being, and we open a JIRA >> ticket for this improvement, explicitely mentioning that ebeDivide() >> and ebeMultiply() should be revisited in order to improve their >> efficiency? > > I agree. > >> >> This does not solve the whole issue, however, because if the default >> entry is zero, its sign is lost, and {finite value} / {zero} is of >> undetermined sign. Any idea regarding this point? > > I guess that we could also keep a flag for the sign. > Do you mean that the default value would be +0 or -0, in some zero entries would be stored? Or do you mean to store the sign of each zero entry, which is slightly less efficient than the present implementation, but would work OK? These signs could be stored in very compact form (as bits), as they need only seldom be retrieved. > > Best regards, > Gilles > > P.S. Maybe that we should also have a look at how other libraries handle > this kind of issues. > Thanks for the suggestion, I will have a look. Best regards, Sébastien - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hi. > >> >> + * MATH-803: it is not sufficient to loop through non zero > >> >> entries of > >> >> + * this only. Indeed, if this[i] = 0d and v[i] = 0d, then > >> >> + * this[i] / v[i] = NaN, and not 0d. > >> >> + */ > >> >> + final int n = getDimension(); > >> >> + for (int i = 0; i < n; i++) { > >> >> + res.setEntry(i, this.getEntry(i) / v.getEntry(i)); > >> >> } > >> >> return res; > >> >> } > >> > > >> > I think that this renders the class potentially useless, if we assume > >> > that > >> > it is used when "large" vectors are needed. > >> > Indeed, after this operation, all the entries will be stored; thus > >> > cancelling the memory efficiency of the class, and probably making the > >> > returned value slower than an "ArrayRealVector" instance for subsequent > >> > operations. > >> > > >> I'm not sure that all entries would be stored (except if setEntry does > >> not distinguish between zero values and non-zero values). > > > > The problem is when the common values are not the default one, like ... > > > >> > >> > For every method that a "RealVector" argument, there should be a > >> > specialized > >> > implementation that take an "OpenMapRealVector". > >> > > >> > Also, couldn't we solve some of these problems if the value of the > >> > default > >> > entry was stored and mutable? E.g. if the default for "v" and "w" is > >> > zero, > >> > then the default for "v / w" will be "NaN". > > > > ... in this example. > > > Ooops, missed this one! > Yes that was an improvement I was thinking of (in fact, the unit tests > are already implemented with this idea in mind). > This would allow efficient implementation of the cosine for example. > I didn't think about the application you suggest, but I like this idea > very much. > > How about we leave it as is for the time being, and we open a JIRA > ticket for this improvement, explicitely mentioning that ebeDivide() > and ebeMultiply() should be revisited in order to improve their > efficiency? I agree. > > This does not solve the whole issue, however, because if the default > entry is zero, its sign is lost, and {finite value} / {zero} is of > undetermined sign. Any idea regarding this point? I guess that we could also keep a flag for the sign. Best regards, Gilles P.S. Maybe that we should also have a look at how other libraries handle this kind of issues. - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hello Gilles, 2012/6/11 Gilles Sadowski : > Hello Sébastien. > >> >> + /* >> >> + * MATH-803: it is not sufficient to loop through non zero >> >> entries of >> >> + * this only. Indeed, if this[i] = 0d and v[i] = 0d, then >> >> + * this[i] / v[i] = NaN, and not 0d. >> >> + */ >> >> + final int n = getDimension(); >> >> + for (int i = 0; i < n; i++) { >> >> + res.setEntry(i, this.getEntry(i) / v.getEntry(i)); >> >> } >> >> return res; >> >> } >> > >> > I think that this renders the class potentially useless, if we assume that >> > it is used when "large" vectors are needed. >> > Indeed, after this operation, all the entries will be stored; thus >> > cancelling the memory efficiency of the class, and probably making the >> > returned value slower than an "ArrayRealVector" instance for subsequent >> > operations. >> > >> I'm not sure that all entries would be stored (except if setEntry does >> not distinguish between zero values and non-zero values). > > The problem is when the common values are not the default one, like ... > >> >> > For every method that a "RealVector" argument, there should be a >> > specialized >> > implementation that take an "OpenMapRealVector". >> > >> > Also, couldn't we solve some of these problems if the value of the default >> > entry was stored and mutable? E.g. if the default for "v" and "w" is zero, >> > then the default for "v / w" will be "NaN". > > ... in this example. > Ooops, missed this one! Yes that was an improvement I was thinking of (in fact, the unit tests are already implemented with this idea in mind). This would allow efficient implementation of the cosine for example. I didn't think about the application you suggest, but I like this idea very much. How about we leave it as is for the time being, and we open a JIRA ticket for this improvement, explicitely mentioning that ebeDivide() and ebeMultiply() should be revisited in order to improve their efficiency? This does not solve the whole issue, however, because if the default entry is zero, its sign is lost, and {finite value} / {zero} is of undetermined sign. Any idea regarding this point? Sébastien - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hello Sébastien. > >> + /* > >> + * MATH-803: it is not sufficient to loop through non zero > >> entries of > >> + * this only. Indeed, if this[i] = 0d and v[i] = 0d, then > >> + * this[i] / v[i] = NaN, and not 0d. > >> + */ > >> + final int n = getDimension(); > >> + for (int i = 0; i < n; i++) { > >> + res.setEntry(i, this.getEntry(i) / v.getEntry(i)); > >> } > >> return res; > >> } > > > > I think that this renders the class potentially useless, if we assume that > > it is used when "large" vectors are needed. > > Indeed, after this operation, all the entries will be stored; thus > > cancelling the memory efficiency of the class, and probably making the > > returned value slower than an "ArrayRealVector" instance for subsequent > > operations. > > > I'm not sure that all entries would be stored (except if setEntry does > not distinguish between zero values and non-zero values). The problem is when the common values are not the default one, like ... > > > For every method that a "RealVector" argument, there should be a specialized > > implementation that take an "OpenMapRealVector". > > > > Also, couldn't we solve some of these problems if the value of the default > > entry was stored and mutable? E.g. if the default for "v" and "w" is zero, > > then the default for "v / w" will be "NaN". ... in this example. > > > >> @@ -359,6 +363,25 @@ public class OpenMapRealVector extends S > >> iter.advance(); > >> res.setEntry(iter.key(), iter.value() * > >> v.getEntry(iter.key())); > >> } > >> + /* > >> + * MATH-803: the above loop assumes that 0d * x = 0d for any > >> double x, > >> + * which allows to consider only the non-zero entries of this. > >> However, > >> + * this fails if this[i] == 0d and (v[i] = NaN or v[i] = > >> Infinity). > >> + * > >> + * These special cases are handled below. > >> + */ > >> + if (v.isNaN() || v.isInfinite()) { > >> + final int n = getDimension(); > >> + for (int i = 0; i < n; i++) { > >> + final double y = v.getEntry(i); > >> + if (Double.isNaN(y)) { > >> + res.setEntry(i, Double.NaN); > >> + } else if (Double.isInfinite(y)) { > >> + final double x = this.getEntry(i); > >> + res.setEntry(i, x * y); > >> + } > >> + } > >> + } > > > > This probably can only be a temporary solution: 3 additional loops over all > > the elements... > > > If we cache isNaN and isInfinite, this is "only" one additional loop ;). > I don't like this solution either, but the bugs I have identified are > very real, except if we accept that exceptional cases are not handled > correctly by sparse vectors. I'm personally adverse to this option. > If you have a better idea, please feel free to change that. By the > way, would you like me to revert this commit ? No. But maybe we should already open a ticket signalling that a potential problem has been identified. > > I just would like to mention that sparse vectors allow a gain in > memory. Speed is, in my view, a bonus, but you probably think the > opposite... No, that's fine; less so if you lose both. ;-) Best regards, Gilles - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hi Gilles, 2012/6/11 Gilles Sadowski : > Hello. > >> Date: Mon Jun 11 05:52:16 2012 >> New Revision: 1348721 >> >> URL: http://svn.apache.org/viewvc?rev=1348721&view=rev >> Log: >> MATH-803: >> - modified OpenMapRealVector.ebeMultiply() and ebeDivide() to handle >> special cases 0d * NaN, 0d * Infinity, 0d / 0d and 0d / NaN. >> - added implementation of isNaN() and isInfinite() to >> SparseRealVectorTest.SparseRealVectorTestImpl in order to allow for testing >> of OpenMapRealVector.ebeMultiply() and ebeDivide() with mixed types. >> >> Modified: >> >> commons/proper/math/trunk/src/main/java/org/apache/commons/math3/linear/OpenMapRealVector.java >> >> commons/proper/math/trunk/src/test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java >> >> Modified: >> commons/proper/math/trunk/src/main/java/org/apache/commons/math3/linear/OpenMapRealVector.java >> URL: >> http://svn.apache.org/viewvc/commons/proper/math/trunk/src/main/java/org/apache/commons/math3/linear/OpenMapRealVector.java?rev=1348721&r1=1348720&r2=1348721&view=diff >> == >> --- >> commons/proper/math/trunk/src/main/java/org/apache/commons/math3/linear/OpenMapRealVector.java >> (original) >> +++ >> commons/proper/math/trunk/src/main/java/org/apache/commons/math3/linear/OpenMapRealVector.java >> Mon Jun 11 05:52:16 2012 >> @@ -341,10 +341,14 @@ public class OpenMapRealVector extends S >> public OpenMapRealVector ebeDivide(RealVector v) { >> checkVectorDimensions(v.getDimension()); >> OpenMapRealVector res = new OpenMapRealVector(this); >> - Iterator iter = entries.iterator(); >> - while (iter.hasNext()) { >> - iter.advance(); >> - res.setEntry(iter.key(), iter.value() / v.getEntry(iter.key())); >> + /* >> + * MATH-803: it is not sufficient to loop through non zero entries >> of >> + * this only. Indeed, if this[i] = 0d and v[i] = 0d, then >> + * this[i] / v[i] = NaN, and not 0d. >> + */ >> + final int n = getDimension(); >> + for (int i = 0; i < n; i++) { >> + res.setEntry(i, this.getEntry(i) / v.getEntry(i)); >> } >> return res; >> } > > I think that this renders the class potentially useless, if we assume that > it is used when "large" vectors are needed. > Indeed, after this operation, all the entries will be stored; thus > cancelling the memory efficiency of the class, and probably making the > returned value slower than an "ArrayRealVector" instance for subsequent > operations. > I'm not sure that all entries would be stored (except if setEntry does not distinguish between zero values and non-zero values). > For every method that a "RealVector" argument, there should be a specialized > implementation that take an "OpenMapRealVector". > > Also, couldn't we solve some of these problems if the value of the default > entry was stored and mutable? E.g. if the default for "v" and "w" is zero, > then the default for "v / w" will be "NaN". > >> @@ -359,6 +363,25 @@ public class OpenMapRealVector extends S >> iter.advance(); >> res.setEntry(iter.key(), iter.value() * v.getEntry(iter.key())); >> } >> + /* >> + * MATH-803: the above loop assumes that 0d * x = 0d for any >> double x, >> + * which allows to consider only the non-zero entries of this. >> However, >> + * this fails if this[i] == 0d and (v[i] = NaN or v[i] = Infinity). >> + * >> + * These special cases are handled below. >> + */ >> + if (v.isNaN() || v.isInfinite()) { >> + final int n = getDimension(); >> + for (int i = 0; i < n; i++) { >> + final double y = v.getEntry(i); >> + if (Double.isNaN(y)) { >> + res.setEntry(i, Double.NaN); >> + } else if (Double.isInfinite(y)) { >> + final double x = this.getEntry(i); >> + res.setEntry(i, x * y); >> + } >> + } >> + } > > This probably can only be a temporary solution: 3 additional loops over all > the elements... > If we cache isNaN and isInfinite, this is "only" one additional loop ;). I don't like this solution either, but the bugs I have identified are very real, except if we accept that exceptional cases are not handled correctly by sparse vectors. I'm personally adverse to this option. If you have a better idea, please feel free to change that. By the way, would you like me to revert this commit ? I just would like to mention that sparse vectors allow a gain in memory. Speed is, in my view, a bonus, but you probably think the opposite... Amitiés, Sébastien - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache
Re: svn commit: r1348721 - in /commons/proper/math/trunk/src: main/java/org/apache/commons/math3/linear/OpenMapRealVector.java test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java
Hello. > Date: Mon Jun 11 05:52:16 2012 > New Revision: 1348721 > > URL: http://svn.apache.org/viewvc?rev=1348721&view=rev > Log: > MATH-803: > - modified OpenMapRealVector.ebeMultiply() and ebeDivide() to handle > special cases 0d * NaN, 0d * Infinity, 0d / 0d and 0d / NaN. > - added implementation of isNaN() and isInfinite() to > SparseRealVectorTest.SparseRealVectorTestImpl in order to allow for testing > of OpenMapRealVector.ebeMultiply() and ebeDivide() with mixed types. > > Modified: > > commons/proper/math/trunk/src/main/java/org/apache/commons/math3/linear/OpenMapRealVector.java > > commons/proper/math/trunk/src/test/java/org/apache/commons/math3/linear/SparseRealVectorTest.java > > Modified: > commons/proper/math/trunk/src/main/java/org/apache/commons/math3/linear/OpenMapRealVector.java > URL: > http://svn.apache.org/viewvc/commons/proper/math/trunk/src/main/java/org/apache/commons/math3/linear/OpenMapRealVector.java?rev=1348721&r1=1348720&r2=1348721&view=diff > == > --- > commons/proper/math/trunk/src/main/java/org/apache/commons/math3/linear/OpenMapRealVector.java > (original) > +++ > commons/proper/math/trunk/src/main/java/org/apache/commons/math3/linear/OpenMapRealVector.java > Mon Jun 11 05:52:16 2012 > @@ -341,10 +341,14 @@ public class OpenMapRealVector extends S > public OpenMapRealVector ebeDivide(RealVector v) { > checkVectorDimensions(v.getDimension()); > OpenMapRealVector res = new OpenMapRealVector(this); > -Iterator iter = entries.iterator(); > -while (iter.hasNext()) { > -iter.advance(); > -res.setEntry(iter.key(), iter.value() / v.getEntry(iter.key())); > +/* > + * MATH-803: it is not sufficient to loop through non zero entries of > + * this only. Indeed, if this[i] = 0d and v[i] = 0d, then > + * this[i] / v[i] = NaN, and not 0d. > + */ > +final int n = getDimension(); > +for (int i = 0; i < n; i++) { > +res.setEntry(i, this.getEntry(i) / v.getEntry(i)); > } > return res; > } I think that this renders the class potentially useless, if we assume that it is used when "large" vectors are needed. Indeed, after this operation, all the entries will be stored; thus cancelling the memory efficiency of the class, and probably making the returned value slower than an "ArrayRealVector" instance for subsequent operations. For every method that a "RealVector" argument, there should be a specialized implementation that take an "OpenMapRealVector". Also, couldn't we solve some of these problems if the value of the default entry was stored and mutable? E.g. if the default for "v" and "w" is zero, then the default for "v / w" will be "NaN". > @@ -359,6 +363,25 @@ public class OpenMapRealVector extends S > iter.advance(); > res.setEntry(iter.key(), iter.value() * v.getEntry(iter.key())); > } > +/* > + * MATH-803: the above loop assumes that 0d * x = 0d for any double > x, > + * which allows to consider only the non-zero entries of this. > However, > + * this fails if this[i] == 0d and (v[i] = NaN or v[i] = Infinity). > + * > + * These special cases are handled below. > + */ > +if (v.isNaN() || v.isInfinite()) { > +final int n = getDimension(); > +for (int i = 0; i < n; i++) { > +final double y = v.getEntry(i); > +if (Double.isNaN(y)) { > +res.setEntry(i, Double.NaN); > +} else if (Double.isInfinite(y)) { > +final double x = this.getEntry(i); > +res.setEntry(i, x * y); > +} > +} > +} This probably can only be a temporary solution: 3 additional loops over all the elements... > [...] Best, Gilles - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org