[jira] [Commented] (MATH-578) Decrease DescriptiveStatistics performance from 2.0 to 2.2
[ https://issues.apache.org/jira/browse/MATH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420188#comment-13420188 ]

Thomas Neidhart commented on MATH-578:

I ran the provided test myself; it is indeed the same problem, and it is fixed by the suggested changes.

Decrease DescriptiveStatistics performance from 2.0 to 2.2
--
Key: MATH-578
URL: https://issues.apache.org/jira/browse/MATH-578
Project: Commons Math
Issue Type: Bug
Affects Versions: 2.2
Environment: Linux
Reporter: Paolo Repele
Assignee: Mikkel Meyer Andersen
Priority: Minor
Fix For: 3.1
Attachments: percentile.png

Switching from commons-math 2.0 to 2.2, we noticed that the performance of DescriptiveStatistics.addValue(double) has decreased. I tested with 2 million values.

    DescriptiveStatistics ds = new DescriptiveStatistics();
    for (int i = 0; i < 1000*1000*2; i++) { // 2 million values
        ds.addValue(v);
    }
    ds.getPercentile(50);

The time taken seems to depend on the values inserted into the DescriptiveStatistics:
* with a single value (0)
** 2.0 - takes ~500 ms
** 2.2 - takes more than 10 minutes
* with 50% fixed value (0) and 50% Math.random()
** 2.0 - takes ~500 ms
** 2.2 - takes ~25 ms - ~250 seconds
* with 100% Math.random()
** 2.0 - takes ~500 ms
** 2.2 - takes ~70 ms

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
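For anyone wanting to reproduce the three cases from the description, the inputs could be generated along these lines. This is only a sketch under assumptions: the class and method names are made up for illustration, the report does not say how the 50/50 case was interleaved (here the values are shuffled), and the function being timed is a placeholder where the getPercentile(50) call would go.

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

/** Hypothetical helper for rebuilding the three data sets from the report. */
final class PercentileTimingHarness {
    private PercentileTimingHarness() {}

    /**
     * Builds n values: the given fraction is the constant 0, the rest is
     * uniform random in [0, 1). The shuffle is an assumption; the report
     * does not say how the two kinds of value were mixed.
     */
    static double[] makeData(int n, double constantFraction, long seed) {
        Random rnd = new Random(seed);
        double[] data = new double[n];            // starts as all zeros
        int constants = (int) Math.round(n * constantFraction);
        for (int i = constants; i < n; i++) {
            data[i] = rnd.nextDouble();
        }
        // Fisher-Yates shuffle so the zeros are not clustered at the front.
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            double tmp = data[i];
            data[i] = data[j];
            data[j] = tmp;
        }
        return data;
    }

    /** Times one call of the supplied function, in milliseconds. */
    static long timeMillis(ToDoubleFunction<double[]> fn, double[] data) {
        long start = System.nanoTime();
        fn.applyAsDouble(data);
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

Calling makeData with a constant fraction of 1.0, 0.5 and 0.0 gives the three cases; timing only the percentile call keeps the addValue loop out of the measurement.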
[jira] [Commented] (MATH-578) Decrease DescriptiveStatistics performance from 2.0 to 2.2
[ https://issues.apache.org/jira/browse/MATH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034056#comment-13034056 ]

Mikkel Meyer Andersen commented on MATH-578:

Have you tried more detailed profiling? E.g. in Eclipse to see which methods are using the majority of time?
[jira] [Commented] (MATH-578) Decrease DescriptiveStatistics performance from 2.0 to 2.2
[ https://issues.apache.org/jira/browse/MATH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034061#comment-13034061 ]

Phil Steitz commented on MATH-578:

Thanks for reporting this. I assume the timings include the percentile calculation, right? This could be related to the changes in the Percentile implementation in 2.2. If isolating the timing to just the percentile calculation shows that is where the latency difference is, we should reopen MATH-417. The changes there were to improve Percentile performance, which in most cases they do. The first two results above are disturbing, however. If your data is largely constant and this creates a problem in your application, as a workaround, you can provide an alternative Percentile implementation to DescriptiveStatistics using setPercentileImpl.
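A rough sketch of what a sort-based alternative for the workaround above might compute. This is plain JDK code, not the Commons Math class: the name SortPercentile is made up here, and the position rule (pos = p * (n + 1) / 100, interpolating between neighbours) follows the estimate described in the Percentile javadoc as I understand it. As far as I recall, setPercentileImpl in 2.x expects a UnivariateStatistic that also exposes setQuantile(double); that wrapper is not shown.

```java
import java.util.Arrays;

/** Hypothetical sort-based percentile; an illustration, not library code. */
final class SortPercentile {
    private SortPercentile() {}

    /**
     * Estimates the p-th percentile (0 < p <= 100) by sorting a copy of
     * the input and interpolating linearly between the two neighbours
     * of position p * (n + 1) / 100.
     */
    static double percentile(double[] values, double p) {
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        if (values.length == 0) {
            return Double.NaN;
        }
        double[] sorted = values.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        int n = sorted.length;
        if (n == 1) {
            return sorted[0];
        }
        double pos = p * (n + 1) / 100.0;
        double fpos = Math.floor(pos);
        if (pos < 1) {
            return sorted[0];               // below the first element
        }
        if (pos >= n) {
            return sorted[n - 1];           // at or beyond the last element
        }
        double lower = sorted[(int) fpos - 1];
        double upper = sorted[(int) fpos];
        return lower + (pos - fpos) * (upper - lower);
    }
}
```

Sorting a copy costs more than a selection algorithm on typical data, but the running time should not blow up when most of the values are identical, which is the case the report is about.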
[jira] [Commented] (MATH-578) Decrease DescriptiveStatistics performance from 2.0 to 2.2
[ https://issues.apache.org/jira/browse/MATH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034068#comment-13034068 ]

Mikkel Meyer Andersen commented on MATH-578:

Sorry for my (too) short first answer. Thanks for your proper introduction, Phil. I'll try a more detailed profiling to see what's causing the performance problems.
[jira] [Commented] (MATH-578) Decrease DescriptiveStatistics performance from 2.0 to 2.2
[ https://issues.apache.org/jira/browse/MATH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034070#comment-13034070 ]

Paolo Repele commented on MATH-578:

* Yes, the time was only for the getPercentile() method.
* I added an image where you can see the profile snapshot.

We usually use this library to analyze grids. These grids can be very large, and they can be generated using the same value for all the cells, a continuous function over the grid, or any combination of both. So we really have no way of knowing how a given grid was generated.
[jira] [Commented] (MATH-578) Decrease DescriptiveStatistics performance from 2.0 to 2.2
[ https://issues.apache.org/jira/browse/MATH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034094#comment-13034094 ]

Mikkel Meyer Andersen commented on MATH-578:

Also, it seems like FastMath is new to 2.2. I'll try to investigate what causes this.
[jira] [Commented] (MATH-578) Decrease DescriptiveStatistics performance from 2.0 to 2.2
[ https://issues.apache.org/jira/browse/MATH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034152#comment-13034152 ]

Mikkel Meyer Andersen commented on MATH-578:

As far as I can see, Percentile contributes a lot to the longer execution time, so reopening MATH-417 for datasets of this type might be the right thing to do.
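One way a selection-based percentile can end up slow on "datasets of this type" is sketched below. This is an assumption for illustration only, not a claim about what the 2.2 Percentile code does: SelectDemo and both methods are made-up names. A two-way partition that only tests "less than the pivot" shrinks the range by one element per pass when all values are equal, while a three-way partition handles the equal run in a single pass.

```java
/** Hypothetical illustration; not the Commons Math implementation. */
final class SelectDemo {
    private SelectDemo() {}

    /** Incremented once per element visited during partitioning. */
    static long comparisons;

    /** k-th smallest (0-based) using a two-way Lomuto partition. Reorders a. */
    static double selectTwoWay(double[] a, int k) {
        int lo = 0, hi = a.length - 1;
        while (lo < hi) {
            double pivot = a[hi];
            int store = lo;
            for (int i = lo; i < hi; i++) {
                comparisons++;
                if (a[i] < pivot) {
                    swap(a, i, store++);
                }
            }
            swap(a, store, hi);             // pivot lands at its final index
            if (k == store) {
                return a[k];
            }
            if (k < store) { hi = store - 1; } else { lo = store + 1; }
        }
        return a[lo];
    }

    /** k-th smallest (0-based) using a three-way partition. Reorders a. */
    static double selectThreeWay(double[] a, int k) {
        int lo = 0, hi = a.length - 1;
        while (lo < hi) {
            double pivot = a[lo + (hi - lo) / 2];
            int lt = lo, i = lo, gt = hi;
            while (i <= gt) {
                comparisons++;
                if (a[i] < pivot) {
                    swap(a, lt++, i++);
                } else if (a[i] > pivot) {
                    swap(a, i, gt--);
                } else {
                    i++;
                }
            }
            // Now a[lo..lt-1] < pivot, a[lt..gt] == pivot, a[gt+1..hi] > pivot.
            if (k < lt) { hi = lt - 1; }
            else if (k > gt) { lo = gt + 1; }
            else { return a[k]; }
        }
        return a[lo];
    }

    private static void swap(double[] a, int i, int j) {
        double t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

On an all-zero array the two-way version visits a number of elements that grows roughly with the square of the array length, whereas the three-way version visits each element once. Whether this matches the actual cause here would need to be checked against the profile snapshot and the Percentile source.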