GitHub user mikewalch commented on a diff in the pull request:
https://github.com/apache/accumulo/pull/293#discussion_r134035312
--- Diff: core/src/main/java/org/apache/accumulo/core/file/rfile/RollingStats.java ---
@@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.accumulo.core.file.rfile;
+
+import org.apache.commons.math3.stat.StatUtils;
+import org.apache.commons.math3.util.FastMath;
+
+/**
+ * This class supports efficient window statistics. Apache commons math3 has a class called DescriptiveStatistics that supports windows. DescriptiveStatistics
+ * recomputes the statistics over the entire window each time its requested. In a test over 1,000,000 entries with a window size of 1019 that requested stats
+ * for each entry this class took ~50ms and DescriptiveStatistics took ~6,000ms.
+ *
+ * <p>
+ * This class may not be as accurate as DescriptiveStatistics. In unit test its within 1/1000 of DescriptiveStatistics.
+ */
+class RollingStats {
+ private int position;
+ private double window[];
+
+ private double average;
+ private double variance;
+ private double stddev;
+
+ // indicates if the window is full
+ private boolean windowFull;
+
+ private int recomputeCounter = 0;
+
+ RollingStats(int windowSize) {
+ this.windowFull = false;
+ this.position = 0;
+ this.window = new double[windowSize];
+ }
+
+ /**
+ * @see <a href="http://jonisalonen.com/2014/efficient-and-accurate-rolling-standard-deviation/">Efficient and accurate rolling standard deviation</a>
+ */
+ private void update(double n, double o, int w) {
--- End diff --
I guess `n` and `o` stand for new and old. Using `newValue` and
`oldValue` instead would make that clear. What is `w`?
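
If `w` turns out to be the window size, the method with descriptive names might look roughly like the sketch below. This is only an illustration: the diff stops at the signature, so the body here follows the formula from the article linked in the Javadoc rather than the code in this pull request. The class name `RollingStatsSketch`, the seeding constructor, the getters, and the clamp on the variance are all hypothetical.

```java
// Hypothetical sketch; names and body are assumptions, not the code from the pull request.
class RollingStatsSketch {
  private double average;
  private double variance;
  private double stddev;

  // Seeds the statistics from an already full window using a plain two-pass computation.
  RollingStatsSketch(double[] fullWindow) {
    double sum = 0;
    for (double v : fullWindow) {
      sum += v;
    }
    average = sum / fullWindow.length;

    double squaredDeviations = 0;
    for (double v : fullWindow) {
      squaredDeviations += (v - average) * (v - average);
    }
    variance = squaredDeviations / (fullWindow.length - 1);
    stddev = Math.sqrt(variance);
  }

  // newValue enters the window, oldValue leaves it; windowSize is assumed to be what `w` means.
  void update(double newValue, double oldValue, int windowSize) {
    double delta = newValue - oldValue;
    double oldAverage = average;
    average = oldAverage + delta / windowSize;
    // Incremental change in the sample variance when one value replaces another.
    variance += delta * (newValue - average + oldValue - oldAverage) / (windowSize - 1);
    // Assumption: clamp tiny negative values caused by floating point rounding.
    stddev = Math.sqrt(Math.max(variance, 0.0));
  }

  double getMean() {
    return average;
  }

  double getVariance() {
    return variance;
  }

  double getStandardDeviation() {
    return stddev;
  }
}
```

With names like these the call site documents itself, for example `update(newValue, window[position], window.length)`, assuming that is how the method is invoked.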