henrikingo commented on code in PR #126:
URL: https://github.com/apache/otava/pull/126#discussion_r2790817401


##########
docs/imgs/example.png:
##########


Review Comment:
   Didn't read everything yet, but note that in this example the first change 
point arguably should not be found, because it is actually a single outlier. 
OTOH, at 92 and again at 150, change points arguably should be found.
   
   This leads me to ask what parameters were used for this result, and maybe we 
should make it a habit to always include the parameters used in the picture 
itself - so they never get separated. (An educated guess and familiarity with 
this test data lead me to guess that this image was generated with the old 
defaults of p=0.05 or 0.01, which causes the first change point because the 
p-value is too big/relaxed and the outlier is quite large. Also, the windowing 
mechanism makes the algorithm less robust against really big outliers. And the 
clear but smaller change points at (approx) 92 and 150 have been discarded 
because the difference is less than min_magnitude=0.05.) (In fact, this test 
data set was my main argument for changing the defaults and ultimately for 
de-emphasizing or even deprecating min_magnitude in the future.)



##########
docs/MATH.md:
##########
@@ -0,0 +1,79 @@
+# Change Point Detection
+## Overview
+Otava implements a nonparametric change point detection algorithm designed to 
identify statistically significant distribution changes in time-ordered data. 
The method is primarily based on the **E-Divisive family of algorithms** for 
multivariate change point detection, with some practical adaptations.
+
+At a high level, the algorithm:
+- Measures statistical divergence between segments of a time series
+- Searches for change points using hierarchical segmentation
+- Evaluates significance of candidate splits using statistical hypothesis 
testing
+
+The current implementation prioritizes:
+- Robustness to noisy real-world signals
+- Deterministic behavior
+- Practical runtime for production workloads
+
+A representative example of algorithm application:
+
+![Example](./imgs/example.png "Example")
+
+Here the algorithm detected 4 change points with statistical test showing that 
behavior of the time series changes at them. In other words, data have 
different distribution to the left and to the right of each change point.
+
+## Technical Details
+### Main Idea
+The main idea is to use a divergence measure between distributions to identify 
potential points in time series at which the characteristics of the time series 
changed. Namely, having a time series $$Z_1, \cdots, Z_T$$ (which may be 
multidimensional, i.e. from $$\mathbb{R}^d$$ with $$d\geq1$$) we are testing 
subsequences $$X_\tau = \{ Z_1, Z_2, \cdots, Z_\tau \}$$ and 
$$Y_\tau(\kappa)=\{ Z_{\tau+1}, Z_{\tau+2}, \cdots, Z_\kappa \}$$ for all 
possible $$1 \leq \tau < \kappa \leq T$$ to find such $$\hat{\tau}, 
\hat{\kappa}$$ (called candidates) that maximize the probability that 
$$X_\tau$$ and $$Y_\tau(\kappa)$$ come from different distributions. If the 
probability for the best found $$\hat{\tau}, \hat{\kappa}$$ is above a certain 
threshold, then candidate $$\hat{\tau}$$ is a change point. The process is 
repeated recursively to the left and to right of $$\hat{\tau}$$ until no 
candidate corresponds to a high enough probability. This process yields a 
series of change points $$0 < \hat{\tau}_1 < \hat{\tau}_2 < \cdots < \hat{\tau}_k < T$$.
+
+### Original Work
+The original work was presented in [*"A Nonparametric Approach for Multiple 
Change Point Analysis of Multivariate Data" by Matteson and 
James*](https://arxiv.org/abs/1306.4933). The authors provided extensive 
theoretical reasoning on using the following empirical divergence measure:
+

Review Comment:
   Can't help but smile at your description of that paper as "extensive 
reasoning". But yes, it contains more text than many other math papers, I 
guess. Just enough that I could understand it!



##########
docs/MATH.md:
##########
@@ -0,0 +1,79 @@
+# Change Point Detection
+## Overview
+Otava implements a nonparametric change point detection algorithm designed to 
identify statistically significant distribution changes in time-ordered data. 
The method is primarily based on the **E-Divisive family of algorithms** for 
multivariate change point detection, with some practical adaptations.
+
+At a high level, the algorithm:
+- Measures statistical divergence between segments of a time series
+- Searches for change points using hierarchical segmentation
+- Evaluates significance of candidate splits using statistical hypothesis 
testing
+
+The current implementation prioritizes:
+- Robustness to noisy real-world signals
+- Deterministic behavior
+- Practical runtime for production workloads
+
+A representative example of algorithm application:
+
+![Example](./imgs/example.png "Example")
+
+Here the algorithm detected 4 change points with statistical test showing that 
behavior of the time series changes at them. In other words, data have 
different distribution to the left and to the right of each change point.

Review Comment:
   Here the algorithm detected 4 change points at which the behavior of the 
time series changes in some way. In other words, data have different 
distribution to the left and to the right of each change point. The difference 
could be merely a change in the mean, but the algorithm will also detect other 
kinds of changes, such as a change in variance when mean stays constant.
   
   A statistical test is performed to validate that each of the points is in 
fact a change point. That is to say, it is *statistically significant*. The 
user gets to choose the p-value used for the statistical test. The p-value 
adjusts the balance between finding fewer change points and, on the other hand, 
finding false positives. For example, a p-value of 0.01 means - in theory - 
that 1 in 100 points found is a false positive, while 99 out of 100 ought to be 
real changes.
   
   ---
   Ok, I think I drifted off at some point. I really just tried to make the 
first sentence a bit easier for a layman to read. The second sentence is good 
as is. I leave it to you to decide whether you want to keep any of the rest or 
just discard it.



##########
docs/MATH.md:
##########
@@ -0,0 +1,79 @@
+# Change Point Detection
+## Overview
+Otava implements a nonparametric change point detection algorithm designed to 
identify statistically significant distribution changes in time-ordered data. 
The method is primarily based on the **E-Divisive family of algorithms** for 
multivariate change point detection, with some practical adaptations.
+
+At a high level, the algorithm:
+- Measures statistical divergence between segments of a time series
+- Searches for change points using hierarchical segmentation
+- Evaluates significance of candidate splits using statistical hypothesis 
testing
+
+The current implementation prioritizes:
+- Robustness to noisy real-world signals
+- Deterministic behavior
+- Practical runtime for production workloads
+
+A representative example of algorithm application:
+
+![Example](./imgs/example.png "Example")
+
+Here the algorithm detected 4 change points with statistical test showing that 
behavior of the time series changes at them. In other words, data have 
different distribution to the left and to the right of each change point.
+
+## Technical Details
+### Main Idea
+The main idea is to use a divergence measure between distributions to identify 
potential points in time series at which the characteristics of the time series 
changed. Namely, having a time series $$Z_1, \cdots, Z_T$$ (which may be 
multidimensional, i.e. from $$\mathbb{R}^d$$ with $$d\geq1$$) we are testing 
subsequences $$X_\tau = \{ Z_1, Z_2, \cdots, Z_\tau \}$$ and 
$$Y_\tau(\kappa)=\{ Z_{\tau+1}, Z_{\tau+2}, \cdots, Z_\kappa \}$$ for all 
possible $$1 \leq \tau < \kappa \leq T$$ to find such $$\hat{\tau}, 
\hat{\kappa}$$ (called candidates) that maximize the probability that 
$$X_\tau$$ and $$Y_\tau(\kappa)$$ come from different distributions. If the 
probability for the best found $$\hat{\tau}, \hat{\kappa}$$ is above a certain 
threshold, then candidate $$\hat{\tau}$$ is a change point. The process is 
repeated recursively to the left and to right of $$\hat{\tau}$$ until no 
candidate corresponds to a high enough probability. This process yields a 
series of change points $$0 < \hat{\tau}_1 < \hat{\tau}_2 < \cdots < \hat{\tau}_k < T$$.
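The recursive search described above can be sketched in a few lines of Python. This is an illustrative simplification, not the Otava implementation: `divergence` fixes $$\kappa$$ at the end of each segment, and a raw threshold stands in for a real significance test.

```python
import numpy as np

def divergence(x, y):
    """Simplified stand-in for the Q-hat statistic (alpha = 1, kappa fixed
    at the end of the segment). Illustrative only."""
    def mean_pairwise(z):
        if len(z) < 2:
            return 0.0
        # Full matrix sum over ordered pairs divided by their count equals
        # the average pairwise distance over unordered pairs.
        return np.abs(z[:, None] - z[None, :]).sum() / (len(z) * (len(z) - 1))
    cross = 2.0 * np.mean(np.abs(x[:, None] - y[None, :]))
    tau, kappa = len(x), len(x) + len(y)
    return tau * (kappa - tau) / kappa * (cross - mean_pairwise(x) - mean_pairwise(y))

def find_change_points(series, threshold, min_seg=2, offset=0):
    """Hierarchical segmentation: find the best split, keep it if its
    divergence clears `threshold`, then recurse on both halves."""
    series = np.asarray(series, dtype=float)
    if len(series) < 2 * min_seg:
        return []
    taus = range(min_seg, len(series) - min_seg + 1)
    scores = [divergence(series[:t], series[t:]) for t in taus]
    best = int(np.argmax(scores))
    if scores[best] < threshold:  # stand-in for a real significance test
        return []
    tau = min_seg + best
    return (find_change_points(series[:tau], threshold, min_seg, offset)
            + [offset + tau]
            + find_change_points(series[tau:], threshold, min_seg, offset + tau))
```

Otava's actual algorithm differs in important ways: significance is decided by a statistical test rather than a raw score threshold, and the series is scanned through sliding windows instead of scoring every split of the full series.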
+
+### Original Work
+The original work was presented in [*"A Nonparametric Approach for Multiple 
Change Point Analysis of Multivariate Data" by Matteson and 
James*](https://arxiv.org/abs/1306.4933). The authors provided extensive 
theoretical reasoning on using the following empirical divergence measure:
+
+$$\hat{\mathcal{Q}}(X_\tau,Y_\tau(\kappa);\alpha)=\dfrac{\tau(\kappa - 
\tau)}{\kappa}\hat{\mathcal{E}}(X_\tau,Y_\tau(\kappa);\alpha),$$
+
+$$\hat{\mathcal{E}}(X_\tau,Y_\tau(\kappa);\alpha)=\dfrac{2}{\tau(\kappa - 
\tau)}\sum\limits_{i=1}^\tau\sum\limits_{j=\tau+1}^\kappa \|X_i - Y_j\|^\alpha 
- {\displaystyle \binom{\tau}{2}^{-1}} \sum\limits_{1\leq i < j \leq\tau}\|X_i 
- X_j\|^\alpha - {\displaystyle \binom{\kappa - \tau}{2}^{-1}} 
\sum\limits_{\tau+1\leq i < j \leq\kappa}\|Y_i - Y_j\|^\alpha,$$
+
+where $$\alpha \in (0, 2)$$, usually we take $$\alpha=1$$; $$\|\cdot\|$$ is 
Euclidean distance; and the coefficient in front of the second and third terms 
in $$\hat{\mathcal{E}}$$ are binomial coefficients.
+The candidates are given by
+
+$$(\hat{\tau}, \hat{\kappa}) = \text{arg}\max\limits_{(\tau, 
\kappa)}\hat{\mathcal{Q}}(X_\tau,Y_\tau(\kappa);\alpha).$$
+
+After the candidates are found, one needs to find the probability that 
$$X_{\hat{\tau}}$$ and $$Y_{\hat{\tau}}(\hat{\kappa})$$ come from a different 
distribution. Generally speaking, the time sub-series $$X$$ and $$Y$$ could 
come from any distribution(s), and authors proposed the use of a non-parametric 
permutation test to test for significant difference between them. If the 
candidates are shown to be significant, the process is to be run using 
hierarchical segmentation, i.e., recursively. For more details read the linked 
paper.
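The two formulas above transcribe almost directly into NumPy. The following sketch handles the one-dimensional case for a fixed split; the function names are illustrative, not Otava's API.

```python
import numpy as np

def e_divisive_stat(x, y, alpha=1.0):
    """Empirical divergence E-hat between samples x and y (1-D case),
    following Matteson & James; alpha in (0, 2), default 1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Cross term: 2 times the mean of |X_i - Y_j|^alpha over all pairs.
    cross = 2.0 * np.mean(np.abs(x[:, None] - y[None, :]) ** alpha)
    def within(z):
        """Average of |Z_i - Z_j|^alpha over unordered pairs, i.e. the
        binomial-coefficient-weighted sums in the formula."""
        k = len(z)
        if k < 2:
            return 0.0
        d = np.abs(z[:, None] - z[None, :]) ** alpha
        return d[np.triu_indices(k, 1)].sum() / (k * (k - 1) / 2)
    return cross - within(x) - within(y)

def q_stat(x, y, alpha=1.0):
    """Scaled statistic Q-hat = tau * (kappa - tau) / kappa * E-hat."""
    tau, kappa = len(x), len(x) + len(y)
    return tau * (kappa - tau) / kappa * e_divisive_stat(x, y, alpha)
```

For example, two well-separated constant segments of length 4 give `e_divisive_stat([0.0]*4, [1.0]*4) == 2.0` and `q_stat(...) == 4.0`.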
+
+### Hunter Paper
+While the original paper was theoretically sound, there were a few practical 
issues with the methodology. They were outlined pretty well in [*Hunter: Using 
Change Point Detection to Hunt for Performance Regressions by Fleming et 
al.*](https://arxiv.org/abs/2301.03034). Here is the short outline, with more 
details in the linked paper:
+- High computational cost due to the permutation significance test
+- Non-deterministicity of the results due to the permutation significance test
+- Missing change points in some of the patterns as the time series expands.
+
+The authors proposed a few innovations to resolve the issues. Namely,
+1. **Faster significance test:** replace permutation test with Student's 
t-test, that demonstrated great results in practice - *This helps resolve 
computational cost and non-deterministicity*.
+2. **Fixed-Sized Windows:** Instead of looking at the whole time series, the 
algorithm traverses it through an overlapping sliding window approach - *This 
helps catch special pattern-cases described in the paper*.
+3. **Weak Change Points:** Having two significance thresholds. Algorithm 
starts with a more relaxed threshold to find "weak" change points, and then 
continues by re-evaluating all "weak" change points using stricter threshold to 
yield the final change points - *Using a single threshold could have myopically 
stopped the algorithm. Allowing it to look for more points and filter out the 
"weak" ones later resolves the issue.*
+
+### Otava Implementation
+The current implementation in Apache Otava is effectively the one from Hunter 
paper with an in-house implementation of the algorithm.

Review Comment:
   I would substitute "in-house" with "new" or "re-written" (or "newly 
rewritten"). Perhaps also append "but functionally equivalent".



##########
docs/MATH.md:
##########
@@ -0,0 +1,79 @@
+# Change Point Detection
+## Overview
+Otava implements a nonparametric change point detection algorithm designed to 
identify statistically significant distribution changes in time-ordered data. 
The method is primarily based on the **E-Divisive family of algorithms** for 
multivariate change point detection, with some practical adaptations.
+
+At a high level, the algorithm:
+- Measures statistical divergence between segments of a time series
+- Searches for change points using hierarchical segmentation
+- Evaluates significance of candidate splits using statistical hypothesis 
testing
+
+The current implementation prioritizes:
+- Robustness to noisy real-world signals
+- Deterministic behavior
+- Practical runtime for production workloads
+
+A representative example of algorithm application:
+
+![Example](./imgs/example.png "Example")
+
+Here the algorithm detected 4 change points with statistical test showing that 
behavior of the time series changes at them. In other words, data have 
different distribution to the left and to the right of each change point.
+
+## Technical Details
+### Main Idea
+The main idea is to use a divergence measure between distributions to identify 
potential points in time series at which the characteristics of the time series 
changed. Namely, having a time series $$Z_1, \cdots, Z_T$$ (which may be 
multidimensional, i.e. from $$\mathbb{R}^d$$ with $$d\geq1$$) we are testing 
subsequences $$X_\tau = \{ Z_1, Z_2, \cdots, Z_\tau \}$$ and 
$$Y_\tau(\kappa)=\{ Z_{\tau+1}, Z_{\tau+2}, \cdots, Z_\kappa \}$$ for all 
possible $$1 \leq \tau < \kappa \leq T$$ to find such $$\hat{\tau}, 
\hat{\kappa}$$ (called candidates) that maximize the probability that 
$$X_\tau$$ and $$Y_\tau(\kappa)$$ come from different distributions. If the 
probability for the best found $$\hat{\tau}, \hat{\kappa}$$ is above a certain 
threshold, then candidate $$\hat{\tau}$$ is a change point. The process is 
repeated recursively to the left and to right of $$\hat{\tau}$$ until no 
candidate corresponds to a high enough probability. This process yields a 
series of change points $$0 < \hat{\tau}_1 < \hat{\tau}_2 < \cdots < \hat{\tau}_k < T$$.
+

Review Comment:
   Feel free to copy any of my* rather crude illustrations at 
https://docs.google.com/presentation/d/1VRp9SO6buB164wEMWSM3aPQpIR-LPsk-/edit?usp=sharing&ouid=112878647936194560295&rtpof=true&sd=true
   
   * ... it is okay since I am also a contributor and signed a CLA
   
   Second, if you want to visually illustrate any of the inner workings, I've 
found that plotting the value of \hat{\mathcal{Q}} for each candidate \tau in 
x/y coordinates is very helpful. (It is the peaks of this graph that become 
change point candidates, and the biggest peak goes first.)
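That plotting tip can be sketched as follows (a hypothetical helper, not Otava code): compute \hat{\mathcal{Q}} for every split index \tau, with \kappa fixed at the end of the series, then plot or inspect the peaks.

```python
import numpy as np

def q_profile(series, alpha=1.0):
    """Q-hat as a function of the split index tau, with kappa fixed at
    the end of the series. The peaks are the change point candidates."""
    z = np.asarray(series, dtype=float)
    kappa = len(z)

    def e_hat(x, y):
        cross = 2.0 * np.mean(np.abs(x[:, None] - y[None, :]) ** alpha)
        def within(s):
            if len(s) < 2:
                return 0.0
            # Ordered-pair sum / ordered-pair count == unordered average.
            return (np.abs(s[:, None] - s[None, :]) ** alpha).sum() \
                / (len(s) * (len(s) - 1))
        return cross - within(x) - within(y)

    taus = np.arange(2, kappa - 1)  # keep at least 2 points on each side
    q = np.array([t * (kappa - t) / kappa * e_hat(z[:t], z[t:]) for t in taus])
    return taus, q

# The biggest peak is the first candidate; plot taus against q to see the rest.
taus, q = q_profile([0] * 15 + [3] * 15)
print(int(taus[np.argmax(q)]))  # → 15
```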



##########
docs/imgs/example.png:
##########


Review Comment:
   And lastly, if everything I complained about is actually explained by the 
fact that your implementation now moves the kappa variable over the time axis, 
then I will have to withdraw to my chambers and learn to understand this new 
moving part.



##########
docs/MATH.md:
##########
@@ -0,0 +1,79 @@
+# Change Point Detection

Review Comment:
   Perhaps the input parameters could all be discussed somewhere in one place.
   
   In particular, this would be an opportunity to point out that min_magnitude 
isn't really an input parameter to the algorithm at all, while window_size and 
alpha are, even if rarely exposed to the user.



##########
docs/MATH.md:
##########
@@ -0,0 +1,79 @@
+# Change Point Detection
+## Overview
+Otava implements a nonparametric change point detection algorithm designed to 
identify statistically significant distribution changes in time-ordered data. 
The method is primarily based on the **E-Divisive family of algorithms** for 
multivariate change point detection, with some practical adaptations.
+
+At a high level, the algorithm:
+- Measures statistical divergence between segments of a time series
+- Searches for change points using hierarchical segmentation
+- Evaluates significance of candidate splits using statistical hypothesis 
testing
+
+The current implementation prioritizes:
+- Robustness to noisy real-world signals
+- Deterministic behavior
+- Practical runtime for production workloads
+
+A representative example of algorithm application:
+
+![Example](./imgs/example.png "Example")
+
+Here the algorithm detected 4 change points with statistical test showing that 
behavior of the time series changes at them. In other words, data have 
different distribution to the left and to the right of each change point.
+
+## Technical Details
+### Main Idea
+The main idea is to use a divergence measure between distributions to identify 
potential points in time series at which the characteristics of the time series 
changed. Namely, having a time series $$Z_1, \cdots, Z_T$$ (which may be 
multidimensional, i.e. from $$\mathbb{R}^d$$ with $$d\geq1$$) we are testing 
subsequences $$X_\tau = \{ Z_1, Z_2, \cdots, Z_\tau \}$$ and 
$$Y_\tau(\kappa)=\{ Z_{\tau+1}, Z_{\tau+2}, \cdots, Z_\kappa \}$$ for all 
possible $$1 \leq \tau < \kappa \leq T$$ to find such $$\hat{\tau}, 
\hat{\kappa}$$ (called candidates) that maximize the probability that 
$$X_\tau$$ and $$Y_\tau(\kappa)$$ come from different distributions. If the 
probability for the best found $$\hat{\tau}, \hat{\kappa}$$ is above a certain 
threshold, then candidate $$\hat{\tau}$$ is a change point. The process is 
repeated recursively to the left and to right of $$\hat{\tau}$$ until no 
candidate corresponds to a high enough probability. This process yields a 
series of change points $$0 < \hat{\tau}_1 < \hat{\tau}_2 < \cdots < \hat{\tau}_k < T$$.
+
+### Original Work
+The original work was presented in [*"A Nonparametric Approach for Multiple 
Change Point Analysis of Multivariate Data" by Matteson and 
James*](https://arxiv.org/abs/1306.4933). The authors provided extensive 
theoretical reasoning on using the following empirical divergence measure:
+
+$$\hat{\mathcal{Q}}(X_\tau,Y_\tau(\kappa);\alpha)=\dfrac{\tau(\kappa - 
\tau)}{\kappa}\hat{\mathcal{E}}(X_\tau,Y_\tau(\kappa);\alpha),$$
+
+$$\hat{\mathcal{E}}(X_\tau,Y_\tau(\kappa);\alpha)=\dfrac{2}{\tau(\kappa - 
\tau)}\sum\limits_{i=1}^\tau\sum\limits_{j=\tau+1}^\kappa \|X_i - Y_j\|^\alpha 
- {\displaystyle \binom{\tau}{2}^{-1}} \sum\limits_{1\leq i < j \leq\tau}\|X_i 
- X_j\|^\alpha - {\displaystyle \binom{\kappa - \tau}{2}^{-1}} 
\sum\limits_{\tau+1\leq i < j \leq\kappa}\|Y_i - Y_j\|^\alpha,$$
+
+where $$\alpha \in (0, 2)$$, usually we take $$\alpha=1$$; $$\|\cdot\|$$ is 
Euclidean distance; and the coefficient in front of the second and third terms 
in $$\hat{\mathcal{E}}$$ are binomial coefficients.
+The candidates are given by
+
+$$(\hat{\tau}, \hat{\kappa}) = \text{arg}\max\limits_{(\tau, 
\kappa)}\hat{\mathcal{Q}}(X_\tau,Y_\tau(\kappa);\alpha).$$
+
+After the candidates are found, one needs to find the probability that 
$$X_{\hat{\tau}}$$ and $$Y_{\hat{\tau}}(\hat{\kappa})$$ come from a different 
distribution. Generally speaking, the time sub-series $$X$$ and $$Y$$ could 
come from any distribution(s), and authors proposed the use of a non-parametric 
permutation test to test for significant difference between them. If the 
candidates are shown to be significant, the process is to be run using 
hierarchical segmentation, i.e., recursively. For more details read the linked 
paper.
+
+### Hunter Paper
+While the original paper was theoretically sound, there were a few practical 
issues with the methodology. They were outlined pretty well in [*Hunter: Using 
Change Point Detection to Hunt for Performance Regressions by Fleming et 
al.*](https://arxiv.org/abs/2301.03034). Here is the short outline, with more 
details in the linked paper:
+- High computational cost due to the permutation significance test
+- Non-deterministicity of the results due to the permutation significance test

Review Comment:
   ...which, ironically, could be avoided by running even more permutations, so 
that the significance test has more decimals than the chosen p-value, but 
adding e.g. 100x more permutations was prohibitively slow. (...and it was never 
seriously considered; the lack of determinism was simply accepted and to some 
extent masked by a higher-level system)



##########
docs/imgs/example.png:
##########


Review Comment:
   short version: If you're only including one picture, I don't think this is a 
very good one.



##########
docs/MATH.md:
##########
@@ -0,0 +1,79 @@
+# Change Point Detection
+## Overview
+Otava implements a nonparametric change point detection algorithm designed to 
identify statistically significant distribution changes in time-ordered data. 
The method is primarily based on the **E-Divisive family of algorithms** for 
multivariate change point detection, with some practical adaptations.
+
+At a high level, the algorithm:
+- Measures statistical divergence between segments of a time series
+- Searches for change points using hierarchical segmentation
+- Evaluates significance of candidate splits using statistical hypothesis 
testing
+
+The current implementation prioritizes:
+- Robustness to noisy real-world signals
+- Deterministic behavior
+- Practical runtime for production workloads
+
+A representative example of algorithm application:
+
+![Example](./imgs/example.png "Example")
+
+Here the algorithm detected 4 change points with statistical test showing that 
behavior of the time series changes at them. In other words, data have 
different distribution to the left and to the right of each change point.
+
+## Technical Details
+### Main Idea
+The main idea is to use a divergence measure between distributions to identify 
potential points in time series at which the characteristics of the time series 
changed. Namely, having a time series $$Z_1, \cdots, Z_T$$ (which may be 
multidimensional, i.e. from $$\mathbb{R}^d$$ with $$d\geq1$$) we are testing 
subsequences $$X_\tau = \{ Z_1, Z_2, \cdots, Z_\tau \}$$ and 
$$Y_\tau(\kappa)=\{ Z_{\tau+1}, Z_{\tau+2}, \cdots, Z_\kappa \}$$ for all 
possible $$1 \leq \tau < \kappa \leq T$$ to find such $$\hat{\tau}, 
\hat{\kappa}$$ (called candidates) that maximize the probability that 
$$X_\tau$$ and $$Y_\tau(\kappa)$$ come from different distributions. If the 
probability for the best found $$\hat{\tau}, \hat{\kappa}$$ is above a certain 
threshold, then candidate $$\hat{\tau}$$ is a change point. The process is 
repeated recursively to the left and to right of $$\hat{\tau}$$ until no 
candidate corresponds to a high enough probability. This process yields a 
series of change points $$0 < \hat{\tau}_1 < \hat{\tau}_2 < \cdots < \hat{\tau}_k < T$$.
+
+### Original Work
+The original work was presented in [*"A Nonparametric Approach for Multiple 
Change Point Analysis of Multivariate Data" by Matteson and 
James*](https://arxiv.org/abs/1306.4933). The authors provided extensive 
theoretical reasoning on using the following empirical divergence measure:
+
+$$\hat{\mathcal{Q}}(X_\tau,Y_\tau(\kappa);\alpha)=\dfrac{\tau(\kappa - 
\tau)}{\kappa}\hat{\mathcal{E}}(X_\tau,Y_\tau(\kappa);\alpha),$$
+
+$$\hat{\mathcal{E}}(X_\tau,Y_\tau(\kappa);\alpha)=\dfrac{2}{\tau(\kappa - 
\tau)}\sum\limits_{i=1}^\tau\sum\limits_{j=\tau+1}^\kappa \|X_i - Y_j\|^\alpha 
- {\displaystyle \binom{\tau}{2}^{-1}} \sum\limits_{1\leq i < j \leq\tau}\|X_i 
- X_j\|^\alpha - {\displaystyle \binom{\kappa - \tau}{2}^{-1}} 
\sum\limits_{\tau+1\leq i < j \leq\kappa}\|Y_i - Y_j\|^\alpha,$$
+
+where $$\alpha \in (0, 2)$$, usually we take $$\alpha=1$$; $$\|\cdot\|$$ is 
Euclidean distance; and the coefficient in front of the second and third terms 
in $$\hat{\mathcal{E}}$$ are binomial coefficients.
+The candidates are given by
+
+$$(\hat{\tau}, \hat{\kappa}) = \text{arg}\max\limits_{(\tau, 
\kappa)}\hat{\mathcal{Q}}(X_\tau,Y_\tau(\kappa);\alpha).$$
+
+After the candidates are found, one needs to find the probability that 
$$X_{\hat{\tau}}$$ and $$Y_{\hat{\tau}}(\hat{\kappa})$$ come from a different 
distribution. Generally speaking, the time sub-series $$X$$ and $$Y$$ could 
come from any distribution(s), and authors proposed the use of a non-parametric 
permutation test to test for significant difference between them. If the 
candidates are shown to be significant, the process is to be run using 
hierarchical segmentation, i.e., recursively. For more details read the linked 
paper.
+
+### Hunter Paper
+While the original paper was theoretically sound, there were a few practical 
issues with the methodology. They were outlined pretty well in [*Hunter: Using 
Change Point Detection to Hunt for Performance Regressions by Fleming et 
al.*](https://arxiv.org/abs/2301.03034). Here is the short outline, with more 
details in the linked paper:
+- High computational cost due to the permutation significance test
+- Non-deterministicity of the results due to the permutation significance test
+- Missing change points in some of the patterns as the time series expands.
+
+The authors proposed a few innovations to resolve the issues. Namely,
+1. **Faster significance test:** replace permutation test with Student's 
t-test, that demonstrated great results in practice - *This helps resolve 
computational cost and non-deterministicity*.
+2. **Fixed-Sized Windows:** Instead of looking at the whole time series, the 
algorithm traverses it through an overlapping sliding window approach - *This 
helps catch special pattern-cases described in the paper*.
+3. **Weak Change Points:** Having two significance thresholds. Algorithm 
starts with a more relaxed threshold to find "weak" change points, and then 
continues by re-evaluating all "weak" change points using stricter threshold to 
yield the final change points - *Using a single threshold could have myopically 
stopped the algorithm. Allowing it to look for more points and filter out the 
"weak" ones later resolves the issue.*

Review Comment:
   Algorithm starts with a more relaxed threshold to find a larger set of  
candidate change points, called "weak" change points...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to