[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-08-31 Thread Luc Maisonobe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luc Maisonobe updated MATH-1120:

Fix Version/s: (was: 4.0)
   3.4

 Need Percentile computations that can be matched with standard spreadsheet 
 formula
 --

 Key: MATH-1120
 URL: https://issues.apache.org/jira/browse/MATH-1120
 Project: Commons Math
  Issue Type: Improvement
Affects Versions: 3.2
Reporter: Venkatesha Murthy TS
  Labels: Percentile
 Fix For: 3.4

 Attachments: 18-jun-percentile-with-estimation-patch, 
 27-jun-refactored-kth-pivoting.patch, excel-percentile-patch, 
 math-1120-removeAndSlice.patch, percentile-with-estimation-patch, r-output.txt

   Original Estimate: 504h
  Remaining Estimate: 504h

 The current Percentile implementation hard-codes the position of the pth 
 quantile as 
 p * (N+1)/100 and returns the kth selected value.
 However, if we need to compare results against standard statistical tools 
 such as MS Excel, it would be good to provide an extensible way of 
 changing how this position is selected, rather than hard-coding it.
 For example, to generate a percentile closely matching MS Excel, the 
 position required may be [p*(N-1)/100]+1.
 Please let me know if I could submit this as a patch.
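
 To make the difference between the two position formulas concrete, here is a 
 minimal sketch. It is not code from Commons Math or from any attached patch: 
 the class and method names are invented for illustration, the square brackets 
 in the Excel formula are read as plain grouping, and linear interpolation 
 between neighbouring sorted values (clamped at both ends) is assumed.

```java
/** Hypothetical helper comparing the two position formulas quoted in the issue. */
final class PercentilePositions {

    /** Position currently hard-coded by Percentile: p * (N + 1) / 100. */
    static double cmPosition(double p, int n) {
        return p * (n + 1) / 100.0;
    }

    /** Position that may be needed to match MS Excel: [p * (N - 1) / 100] + 1. */
    static double excelPosition(double p, int n) {
        return p * (n - 1) / 100.0 + 1.0;
    }

    /**
     * Assumed estimation step: linear interpolation at a 1-based fractional
     * position in sorted data, clamped to the first and last elements.
     */
    static double valueAt(double[] sorted, double pos) {
        if (pos <= 1) {
            return sorted[0];
        }
        if (pos >= sorted.length) {
            return sorted[sorted.length - 1];
        }
        int lower = (int) Math.floor(pos);   // 1-based index of the lower neighbour
        double fraction = pos - lower;
        return sorted[lower - 1] + fraction * (sorted[lower] - sorted[lower - 1]);
    }

    public static void main(String[] args) {
        double[] sorted = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        double p = 25;
        System.out.println(valueAt(sorted, cmPosition(p, sorted.length)));    // prints 2.75
        System.out.println(valueAt(sorted, excelPosition(p, sorted.length))); // prints 3.25
    }
}
```

 For the ten values 1..10 and p = 25, the first formula gives position 2.75 and 
 the second gives 3.25, so the same data and the same p yield two different 
 percentiles. This is the mismatch the issue asks to make configurable.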



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-07-21 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: math-1120-removeAndSlice.patch

Hi,

This patch has the following changes:

a) A small refactoring in replaceAndSlice, which now calls 
Precision.equalsIncludingNaN to handle the NaN check.

b) removeAndSlice is slightly rewritten to:
   i) Optimize the calls to System.arraycopy: it now copies, in bulk, the 
whole run of elements between two occurrences of the removable item (as 
against copying one element after another).
   ii) Fix a bug: the last leg of the copy now correctly checks up to begin + 
length (as against checking up to length).

Please let me know.
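
The bulk-copy idea in (b) can be sketched as follows. This is an illustration 
only, not the code in math-1120-removeAndSlice.patch: the class name, the 
method signature and the sameValue helper are assumptions made for the 
example (sameValue stands in for the NaN-aware equality check mentioned in 
(a)).

```java
import java.util.Arrays;

/** Hypothetical sketch; not the code from the attached patch. */
final class RemoveAndSliceSketch {

    /** Equality that treats NaN as equal to NaN. */
    static boolean sameValue(double a, double b) {
        return (Double.isNaN(a) && Double.isNaN(b)) || a == b;
    }

    /** Returns a copy of values[begin, begin + length) without any element equal to removed. */
    static double[] removeAndSlice(double[] values, int begin, int length, double removed) {
        final int end = begin + length;      // exclusive bound of the slice
        final double[] buffer = new double[length];
        int written = 0;
        int runStart = begin;                // start of the current run of kept elements
        for (int i = begin; i < end; i++) {
            if (sameValue(values[i], removed)) {
                // Bulk-copy the run that precedes this removable element.
                System.arraycopy(values, runStart, buffer, written, i - runStart);
                written += i - runStart;
                runStart = i + 1;
            }
        }
        // Last leg: copy what remains up to begin + length, not up to length.
        System.arraycopy(values, runStart, buffer, written, end - runStart);
        written += end - runStart;
        return Arrays.copyOf(buffer, written);
    }

    public static void main(String[] args) {
        double[] data = {9, 1, Double.NaN, 2, 3, Double.NaN, 4, 9};
        // Slice [1, 7) with the NaNs removed.
        System.out.println(Arrays.toString(removeAndSlice(data, 1, 6, Double.NaN)));
    }
}
```

With begin = 1 and length = 6, bounding the last copy by length instead of 
begin + length would stop one element short, which is the kind of error 
item (ii) describes.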



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-27 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: (was: 27-jun-refactored-pivot+nanchanges.patch)



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-27 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: 27-jun-refactored-kth-pivoting.patch



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-26 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: 27-jun-refactored-pivot+nanchanges.patch

Please find attached the changes for refactoring:
a) PivotingStrategyInterface and PivotingStrategy
b) KthSelector
c) Percentile changes for the pivoting and KthSelector refactoring

In addition, please make sure to apply math-1132.patch before applying this 
patch, as it depends on the NaN changes done in MATH-1132.

Please let me know. 

These are the latest changes as per the discussion on the dev mailing list 
about MATH-1120.
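
For readers unfamiliar with the idea, a k-th element selector that delegates 
the pivot choice to a separate strategy might look roughly like the sketch 
below. It is a plain quickselect written for illustration, not the 
KthSelector from the attached patch; the names KthSelectorSketch, 
PivotChooser and CENTRAL are invented here.

```java
/** Hypothetical sketch of a k-th element selector with a pluggable pivot choice. */
final class KthSelectorSketch {

    /** Chooses a pivot index within work[begin, end). The name is illustrative. */
    interface PivotChooser {
        int pivotIndex(double[] work, int begin, int end);
    }

    /** Simplest possible chooser: the middle of the range. */
    static final PivotChooser CENTRAL = (work, begin, end) -> begin + (end - begin) / 2;

    /** Returns the k-th smallest element (0-based) of work, reordering work in place. */
    static double select(double[] work, int k, PivotChooser chooser) {
        int begin = 0;
        int end = work.length;
        while (end - begin > 1) {
            int pivot = partition(work, begin, end, chooser.pivotIndex(work, begin, end));
            if (k == pivot) {
                return work[k];
            } else if (k < pivot) {
                end = pivot;           // continue in the lower part
            } else {
                begin = pivot + 1;     // continue in the upper part
            }
        }
        return work[k];
    }

    /** Lomuto-style partition of work[begin, end); returns the final index of the pivot value. */
    private static int partition(double[] work, int begin, int end, int pivot) {
        double value = work[pivot];
        swap(work, pivot, end - 1);    // park the pivot at the end of the range
        int store = begin;
        for (int i = begin; i < end - 1; i++) {
            if (work[i] < value) {
                swap(work, i, store++);
            }
        }
        swap(work, store, end - 1);    // move the pivot to its sorted position
        return store;
    }

    private static void swap(double[] a, int i, int j) {
        double tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        double[] work = {7, 1, 5, 3, 9};
        System.out.println(select(work, 2, CENTRAL)); // prints 5.0 (the median)
    }
}
```

Keeping the pivot choice behind an interface means the selection loop does 
not change when a different pivoting approach is tried.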



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-18 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: 18-jun-percentile-with-estimation-patch

Hi Luc, Gilles,

First of all, I am immensely thankful for all your comments on this patch. 
Next, I am attaching my new patch with today's date (18-jun). Please advise 
if I need to remove the old patch file in case it causes confusion.

Please find my response below. The new patch has the suggested changes in the 
switch case for NaN handling. However, I have my own view on the different 
default NaN strategies associated with the types. Please permit me to explain 
(sorry for the long summary).

First,
I would like to leave the default implementation of Percentile as is (meaning 
that in my MATH-1120 patch it is mapped to Type.CM), since otherwise we will 
break existing users' expectations even for non-NaN and non-infinite entries. 
The existing tests do fail if we change the default type (please refer to the 
PercentileTest.java code as well, to see the finer variations that are being 
checked for the different types).

Secondly,
the Percentile.java header comment states, in effect, that NaNs are left as is 
and handled by Java's default sort behavior, with no removal being done. To 
map this behaviour to the new implementation, NaNStrategy.FIXED was what came 
closest, and it did not require any of the existing test cases for the 
existing Percentile behavior to change. What I am reiterating here is that the 
existing behavior tests all pass with the new Type.CM and FIXED. (I have now 
also added several more tests, including ones for the different types.)

Thirdly,
all the R_x types (where x is in [1-9]), as run and verified with the R tool, 
seemed to clearly convey that NaNs need to be removed, which is why I have 
used a different strategy, NaNStrategy#REMOVED, for them.
I agree that having multiple defaults is not wise. However, if we are forced 
to keep Apache CM as a supported type (which is not one of the R_x types) and 
we need to support the multiple variants (R1-R9), then it is inevitable to 
have a type-specific NaNStrategy as needed.
I also feel that NaN handling should be open to overriding, at least in a 
controlled manner, as different use cases may need this variation in NaN 
handling. Therefore, IMO, while we could avoid public access for changing 
these defaults, it is relevant to support these variations of NaN handling on 
a per-type basis and to allow at least subclasses to override them if a rare 
need arises.
The very name NaNStrategy suggests that there are different ways to handle 
NaNs; I feel we would be much too restricted if we stuck to one way of NaN 
handling for all types. Please let me know your thoughts.
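
To illustrate what a type-specific default could mean in practice, here is a 
minimal, self-contained sketch. It is not the code in the patch: the enums 
below are stripped-down stand-ins defined only for this example (just two 
strategies and two types), and the workArray method is an assumed name.

```java
import java.util.Arrays;

/** Hypothetical sketch of per-type NaN handling; names mirror the discussion, not the patch. */
final class NaNHandlingSketch {

    /** Stand-in for the two NaN strategies named in this message. */
    enum NaNStrategy { FIXED, REMOVED }

    /** Estimation types paired with a default NaN strategy, as argued above. */
    enum Type {
        CM(NaNStrategy.FIXED),      // existing behaviour: NaNs are left in the data
        R_7(NaNStrategy.REMOVED);   // R-style type: NaNs are dropped first

        private final NaNStrategy defaultNaNStrategy;

        Type(NaNStrategy defaultNaNStrategy) {
            this.defaultNaNStrategy = defaultNaNStrategy;
        }

        NaNStrategy defaultNaNStrategy() {
            return defaultNaNStrategy;
        }
    }

    /** Returns the array the percentile would be computed on (assumed helper). */
    static double[] workArray(double[] values, NaNStrategy strategy) {
        switch (strategy) {
            case REMOVED:
                return Arrays.stream(values).filter(v -> !Double.isNaN(v)).toArray();
            case FIXED:
            default:
                return values.clone();
        }
    }

    public static void main(String[] args) {
        double[] data = {1, Double.NaN, 3};
        for (Type type : Type.values()) {
            double[] work = workArray(data, type.defaultNaNStrategy());
            System.out.println(type + " -> " + Arrays.toString(work));
        }
    }
}
```

With the input {1, NaN, 3}, the CM type keeps all three entries while the R_7 
type works on two, so the effective N used in the position formula differs 
between the types.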

Next,
regarding the PivotingStrategy: first, I wanted to have all the partitioning, 
pivoting and selection in separate classes/enums rather than inside the main 
class. I have made it static because it is more of a non-functional 
requirement, and I felt that it need not be set for every instance (it is more 
of a global setting that does not vary across types). 
Please correct me here and let me know if it still needs to be per instance. 
I also made it package accessible/settable solely because the medianOf3 method 
had been package level, with the sole intent of allowing it to be overridden 
within that package. Meaning, if someone really needed to tinker with 
pivoting, they need a way to do it, which I have provided by means of a 
strategy.

Next,
about the current patch, which I am going to attach as a new dated patch 
(since you have already started looking at the old one, I will leave that one 
as is): I find many utility-type methods, such as replaceNaN, removeNaN 
(predicated lists) and copyOf(values, begin, length), as well as KthSelector 
with PivotingStrategy, all of which could perhaps make their way to MathArrays 
and MathUtils. Please let me know. If so, I will refactor these changes once 
again and submit the patch.

Thanks for reading this through and for your time in reviewing. Please let me 
know your opinion on all of these.

Thanks,
Venkat.



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-16 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: (was: percentile-with-estimation-patch)



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-16 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: percentile-with-estimation-patch

The preProcess method is removed; its pre-processing is instead rolled into 
the getWorkArray method.
The NaNs in the input array will now be handled by the NaNStrategy set at the 
time of Percentile construction. 

Please let me know.



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-12 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: (was: r-output.txt)



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-12 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: r-output.txt
percentile-with-estimation-patch

Attached is the new patch, which has the following changes.

Firstly, I have verified in my Cygwin environment that the following command 
for patching works (I did an svn revert first and then tried this command):
$ patch -p0 -i ../../vmurthy/patch
patching file 
src/test/java/org/apache/commons/math3/stat/descriptive/rank/PercentileTest.java
patching file 
src/test/java/org/apache/commons/math3/stat/descriptive/rank/MedianTest.java
patching file 
src/main/java/org/apache/commons/math3/stat/descriptive/rank/Percentile.java
patching file 
src/main/java/org/apache/commons/math3/stat/descriptive/rank/Median.java

Next, the following are my responses to the clarifications asked for in the 
email.

 IIUC, in this reference
   http://stat.ethz.ch/R-manual/R-devel/library/stats/html/quantile.html
 what you called EstimationTechnique is referred to as Type.

 Then the R manual uses a numbering: 1 to 9.


Done; I renamed EstimationTechnique to Type.


 Was Commons Math's implementation none of those nine types?

The Commons Math implementation comes very close to R_6 (in fact the index 
calculation is the same); however, it is the max and min limits, which 
determine when x1 and xN need to be considered, that differ between CM and 
R_6. (I have put this in the Javadoc of R_6.)

 I wouldn't name the CM's implementation DEFAULT (and the R's manual
 refers to a paper that recommends type 8).

Renamed DEFAULT to CM, and all the others are named R_1, R_2, etc.
However, the type I have set by default is CM, because the existing behaviour 
should be provided without setting any new configuration. I understand that 
R_8 is recommended; however, making R_8 the default instead of CM may disrupt 
users' expectations/experience too much. Please let me know what you think.

 If it's OK to keep a tight link to the R's description of the variants,
 I'd suggest

 public enum Type {
   CM,  // instead of DEFAULT
   R_1,
   R_2,
   R_3,
   R_4,
   R_5,
   R_6,
   R_7,
   R_8,
   R_9,
   // TYPE_TEN ?
 }

Agreed and taken. Also, I have not implemented type 10, as I do not yet have a 
benchmark such as R for comparison.


 R_9 is not implemented in the patch. Is it intended?
 Then on the Wikipedia page there is an unnamed 10th variant, also
 not implemented.

Well, yes, I did not implement all of them initially. But I have now added 
all of them except the 10th type.

 People knowledgeable in what should be expected from such a
 functionality are most welcome to provide feedback...
Gilles: Thanks so much for the comments. Everyone, please let me know.

I have added an AtomicInteger, not because of threads but as a holder for an 
int (akin to an INOUT parameter). I did not state this explicitly, as I felt 
the variable name lengthHolder suggests the reason contextually. 
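
As an illustration of the holder idiom described above (and not the code in 
the patch), a method can report a second result through an AtomicInteger 
argument. The class and method names below are invented for the example; 
only the parameter name lengthHolder comes from this message.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of using AtomicInteger as a mutable int holder. */
final class LengthHolderSketch {

    /**
     * Moves the non-NaN entries of work to the front, preserving their order,
     * and stores how many there are in lengthHolder. Returns the same array.
     */
    static double[] compactNaNs(double[] work, AtomicInteger lengthHolder) {
        int kept = 0;
        for (double v : work) {
            if (!Double.isNaN(v)) {
                work[kept++] = v;   // kept never runs ahead of the read position
            }
        }
        lengthHolder.set(kept);     // the "out" half of the in/out parameter
        return work;
    }

    public static void main(String[] args) {
        AtomicInteger lengthHolder = new AtomicInteger();
        double[] work = compactNaNs(new double[]{1, Double.NaN, 2, Double.NaN, 3}, lengthHolder);
        System.out.println(lengthHolder.get() + " usable values, first is " + work[0]);
    }
}
```

Java passes int by value, so some mutable wrapper is needed for this 
pattern; a one-element int[] or a small result class would be alternatives.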







[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-08 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: (was: percentile-with-estimation-patch)



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-08 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: percentile-with-estimation-patch

Here is the latest patch with the comments incorporated:

a) Removed the alternateNam
b) Removed the comment "Each enum has a MathJax comment about the formulae".
c) To the best extent possible, I have corrected typos.
d) Removed all references to the R script.
e) No mixing of HTML and MathJax for a single formula, though I would be 
interested to know the reasons for this. 
f) The variable named N is replaced with length for a more descriptive meaning.
g) Added some final keywords.
h) Added href attribute values within double quotes and tried to keep them on 
one line.
i) I will send an email to the dev ML on this point: "EstimationTechnique the 
best possible name? As it will be part of the public API, perhaps you could 
ask this question on the dev ML."
j) Done: created a new constant for (0x1 << MAX_CACHED_LEVELS) - 1.
k) The medianOf3 method now carries a deprecation message pointing to an 
estimation strategy setting, and the method now throws an unsupported 
operation exception, since it would be of no consequence if someone tried to 
override the method.

Otherwise, the list of alternate percentile definitions seems a nice addition 
to the CM stat functionality. 
Thanks for the inputs.

So now, in summary:
a) As medianOf3 was exposed as a package-level method, with the given change 
I am proposing to deprecate it by throwing an unsupported operation exception. 
The reason is that pivoting has been added as a strategy, which can now be set 
if it is really warranted. Again, its access level is kept at package level.

b) Added PivotingStrategy enums such as random and central pivoting, along 
with the median-of-3 approach.

Many thanks for all the comments. Please let me know.
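
The three pivoting approaches named in (b) can be sketched as an enum along 
the following lines. This is an assumption-laden illustration, not the 
PivotingStrategy from the patch: the constant names, the pivotIndex 
signature and the half-open [begin, end) convention are all chosen here for 
the example.

```java
import java.util.Random;

/** Hypothetical sketch of pivot-selection strategies; not the enum from the patch. */
final class PivotingSketch {

    enum PivotingStrategy {

        /** Middle element of the range. */
        CENTRAL {
            @Override
            int pivotIndex(double[] work, int begin, int end) {
                return begin + (end - begin) / 2;
            }
        },

        /** Uniformly random element of the range. */
        RANDOM {
            private final Random random = new Random();

            @Override
            int pivotIndex(double[] work, int begin, int end) {
                return begin + random.nextInt(end - begin);
            }
        },

        /** Index of the median of the first, middle and last elements. */
        MEDIAN_OF_3 {
            @Override
            int pivotIndex(double[] work, int begin, int end) {
                int last = end - 1;
                int middle = begin + (last - begin) / 2;
                double a = work[begin];
                double b = work[middle];
                double c = work[last];
                if (a < b) {
                    if (b < c) {
                        return middle;            // a < b < c
                    }
                    return a < c ? last : begin;  // c is between a and b, or below a
                } else {
                    if (a < c) {
                        return begin;             // b <= a < c
                    }
                    return b < c ? last : middle; // c is between b and a, or below b
                }
            }
        };

        /** Returns an index in [begin, end) to be used as the partition pivot. */
        abstract int pivotIndex(double[] work, int begin, int end);
    }

    public static void main(String[] args) {
        double[] work = {9, 1, 5};
        for (PivotingStrategy strategy : PivotingStrategy.values()) {
            System.out.println(strategy + " -> index " + strategy.pivotIndex(work, 0, work.length));
        }
    }
}
```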




[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-03 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: r-output.txt

Just attaching the R output that I used for verification.

So basically I started with the R1, R2, R3, R4, R7, R8 and DEFAULT estimations.

However, one of the test asserts, with multiple positive infinities, does not 
match for R1, R2, R3 and R4, whereas it matches for R7, R8 and DEFAULT (which 
is Apache Commons). I am still not clear on that and am looking into it. I may 
need some help there.





[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-02 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: (was: percentile-with-estimation-patch)



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-02 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: percentile-with-estimation-patch

Sorry for the inconvenience due to the change in formatting. I modified my IDE 
settings so that it does not change the format of pre-existing code. Here is 
the latest drop under the same patch name (I replaced the earlier one with the 
new one).

I also verified (in Cygwin on Windows) that patch -p0 < 
c:/workspaces/vmurthy/percentile-with-estimation-patch works, reporting that 
it is patching the files Percentile and PercentileTest.



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-02 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: (was: percentile-with-estimation-patch)



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-02 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: percentile-with-estimation-patch

Cleared the Checkstyle, PMD and FindBugs warnings and improved code coverage 
for the changed portions.



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-02 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: (was: percentile-with-estimation-patch)



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-02 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: percentile-with-estimation-patch



[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-01 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: percentile-with-estimation-patch

As per the earlier discussion, I was advised to look at the references for the 
different possible types of computation and come up with a draft.

Here is what I have been thinking.

There are at least 9-10 documented approaches (from 
http://en.wikipedia.org/wiki/Quantile ) to computing the percentile, and the R 
statistical tool also has a reference implementation of these. Each of these 
strategies provides a formula for choosing the index into the array and an 
estimation technique for computing the percentile value from it.

These estimation techniques map naturally onto an enum 
EstimationTechnique (R1, R2, etc., where R1 and R2 are the estimation types 
described in Wikipedia) with the functions below:
int index(double pthQuantile, int N);
double estimate(double[] values, int[] pivotsHeap, double pos, int length)
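
To make the idea concrete, here is a minimal sketch of such an enum. It is not 
the code in the attached patch. The constant names LEGACY and EXCEL_LIKE and 
the methods position and evaluate are hypothetical. The signatures are 
simplified: the position is returned as a fractional 1-based double, and the 
sketch sorts a copy of the data instead of using kth selection with cached 
pivots.

```java
import java.util.Arrays;

/** Hypothetical sketch: each constant supplies its own position formula. */
enum EstimationTechnique {
    /** Position p * (N + 1) / 100, as described for the current Percentile. */
    LEGACY {
        @Override
        double position(double p, int n) {
            return p * (n + 1) / 100.0;
        }
    },
    /** Position [p * (N - 1) / 100] + 1, the spreadsheet-style formula. */
    EXCEL_LIKE {
        @Override
        double position(double p, int n) {
            return p * (n - 1) / 100.0 + 1.0;
        }
    };

    /** Fractional 1-based position of the p-th percentile among n values. */
    abstract double position(double p, int n);

    /** Linear interpolation between the two sorted values around pos. */
    double estimate(double[] sorted, double pos) {
        int n = sorted.length;
        if (n == 0) {
            return Double.NaN;
        }
        if (pos < 1) {
            return sorted[0];          // below the first element: clamp to min
        }
        if (pos >= n) {
            return sorted[n - 1];      // at or past the last element: clamp to max
        }
        double floor = Math.floor(pos);
        int lowerIndex = (int) floor - 1;   // convert 1-based position to 0-based index
        double lower = sorted[lowerIndex];
        double upper = sorted[lowerIndex + 1];
        return lower + (pos - floor) * (upper - lower);
    }

    /** Sorts a copy of the input (an assumption; no kth selection here). */
    double evaluate(double[] values, double p) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        return estimate(sorted, position(p, sorted.length));
    }
}
```

For the values {10, 20, 30, 40, 50} and p = 25, LEGACY gives position 1.5 and 
the estimate 15.0, while EXCEL_LIKE gives position 2.0 and the estimate 20.0.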

In addition, the Percentile class already does median-of-3 based pivoting for 
kth selection. Since pivoting is again a strategy, we could introduce a pivoting 
strategy enum that defaults to median of 3. Further, the kth selection logic 
can now be subsumed inside EstimationTechnique as the estimate method.
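
A pivoting strategy enum along these lines might look like the sketch below. 
The names PivotingStrategy, MEDIAN_OF_3, CENTRAL and pivotIndex are 
hypothetical, and this is an illustration only, not the code in the patch.

```java
/** Hypothetical sketch of a pluggable pivot choice for kth selection. */
enum PivotingStrategy {
    /** Picks the index holding the median of the first, middle and last values. */
    MEDIAN_OF_3 {
        @Override
        int pivotIndex(double[] work, int begin, int end) {
            int inclusiveEnd = end - 1;
            int middle = begin + (inclusiveEnd - begin) / 2;
            double first = work[begin];
            double mid = work[middle];
            double last = work[inclusiveEnd];
            if (first < mid) {
                if (mid < last) {
                    return middle;                          // first < mid < last
                }
                return first < last ? inclusiveEnd : begin; // mid is the largest
            }
            if (first < last) {
                return begin;                               // mid <= first < last
            }
            return mid < last ? inclusiveEnd : middle;      // first is the largest
        }
    },
    /** Picks the central index of the range. */
    CENTRAL {
        @Override
        int pivotIndex(double[] work, int begin, int end) {
            return begin + (end - begin) / 2;
        }
    };

    /** Returns a pivot index in [begin, end); end is exclusive. */
    abstract int pivotIndex(double[] work, int begin, int end);
}
```

With this shape, the selection code asks the strategy for a pivot index and 
needs no other change when the strategy is swapped.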

Changes to Percentile:
----------------------
Percentile gets one or two more constructors to accommodate specifying an 
EstimationTechnique during construction. The default estimation technique 
is the existing Percentile computation logic, which need not be specified; 
the existing constructors will work the same way as before.

Remove the kth selection private methods and move them into a KthSelector class 
(a separate nested class). However, medianOf3 is exposed with package-level 
access and hence needs to be refactored to use the KthSelector class. It could 
also be deprecated, as the method is not strictly percentile logic (it belongs 
more to kth selection).
Add two small methods to get the work array and the cached pivots, which need 
to be passed along to the estimation technique.

I agree with removing my earlier suggestion of ExcelPercentile{Test} and 
look forward to opinions on the new approach.

Please let me know your thoughts on the attached percentile-with-estimation-patch.






[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-06-01 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Description: 
The current Percentile implementation hard-codes the position of the pth 
quantile as p * (N+1)/100 and provides the kth selected value.
However, to compare results against standard statistical tools such as 
MS Excel, it would be good to provide an extensible way of changing how 
this position is selected rather than hard-coding it.
For example, to generate a percentile that closely matches MS Excel, the 
position required may be [p*(N-1)/100]+1.

Please let me know if I could submit this as a patch.

  was:
The current Percentile implementation hard-codes the position of the pth 
quantile as p * (N+1)/100 and provides the kth selected value.
However, to compare results against standard statistical tools such as 
MS Excel, it would be good to provide an extensible way of changing how 
this position is selected rather than hard-coding it.
For example, to generate a percentile that closely matches MS Excel, the 
position required may be [p*(N-1)/100]+1.

I do have a patch ready, with a small change needed in the Percentile class and 
a new ExcelPercentile class written with tests closely matching those of the 
PercentileTest class.
Please let me know if I could submit this as a patch.




[jira] [Updated] (MATH-1120) Need Percentile computations that can be matched with standard spreadsheet formula

2014-05-20 Thread Venkatesha Murthy TS (JIRA)

 [ 
https://issues.apache.org/jira/browse/MATH-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesha Murthy TS updated MATH-1120:
---

Attachment: excel-percentile-patch

Added the first draft patch for review. Now that I have removed ExcelPtile as a 
new main class, and since converting the position prior to calling the 
percentile is not possible, this extension may be necessary. Please let me know.
