[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r331672747 ## File path: helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceLatencyGauge.java ## @@ -0,0 +1,94 @@ +package org.apache.helix.monitoring.metrics.implementation; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import com.codahale.metrics.Histogram; +import com.codahale.metrics.SlidingTimeWindowArrayReservoir; +import java.util.concurrent.TimeUnit; +import org.apache.helix.HelixException; +import org.apache.helix.monitoring.metrics.model.LatencyMetric; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +public class RebalanceLatencyGauge extends LatencyMetric { + private static final long VALUE_NOT_SET = -1; + private long _lastEmittedMetricValue = VALUE_NOT_SET; + + /** + * Instantiates a new Histogram dynamic metric. + * @param metricName the metric name + */ + public RebalanceLatencyGauge(String metricName, long slidingTimeWindow) { +super(metricName, new Histogram( +new SlidingTimeWindowArrayReservoir(slidingTimeWindow, TimeUnit.MILLISECONDS))); +_metricName = metricName; + } + + /** + * Calling this method multiple times would simply overwrite the previous state. This is because + * the rebalancer could fail at any point, and we want it to recover gracefully by resetting the + * internal state of this metric. + */ + @Override + public void startMeasuringLatency() { Review comment: Agreed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r331674026 ## File path: helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceLatencyGauge.java ## @@ -0,0 +1,94 @@ +package org.apache.helix.monitoring.metrics.implementation; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import com.codahale.metrics.Histogram; +import com.codahale.metrics.SlidingTimeWindowArrayReservoir; +import java.util.concurrent.TimeUnit; +import org.apache.helix.HelixException; +import org.apache.helix.monitoring.metrics.model.LatencyMetric; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +public class RebalanceLatencyGauge extends LatencyMetric { + private static final long VALUE_NOT_SET = -1; + private long _lastEmittedMetricValue = VALUE_NOT_SET; + + /** + * Instantiates a new Histogram dynamic metric. + * @param metricName the metric name + */ + public RebalanceLatencyGauge(String metricName, long slidingTimeWindow) { +super(metricName, new Histogram( +new SlidingTimeWindowArrayReservoir(slidingTimeWindow, TimeUnit.MILLISECONDS))); +_metricName = metricName; + } + + /** + * Calling this method multiple times would simply overwrite the previous state. This is because + * the rebalancer could fail at any point, and we want it to recover gracefully by resetting the + * internal state of this metric. + */ + @Override + public void startMeasuringLatency() { +reset(); +_startTime = System.currentTimeMillis(); + } + + @Override + public void endMeasuringLatency() { +if (_endTime != 0L) { + throw new HelixException( + String.format("Needs to call startMeasuringLatency first! Metric name: %s", _metricName)); +} +_endTime = System.currentTimeMillis(); +_lastEmittedMetricValue = _endTime - _startTime; +updateValue(_lastEmittedMetricValue); + } + + @Override + public String getMetricName() { +return _metricName; + } + + @Override + public void reset() { Review comment: No, in this case, instead of resetting and continue the logic, I would prefer reset and skip the updateValue(). How about: 1. reset set both value to be -1. 2. if _startTime is -1, ignore the endMeasuringLatency calls. 3. After update value, force reset. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r331362824 ## File path: helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/AbstractTestClusterModel.java ## @@ -38,7 +37,7 @@ import org.mockito.Mockito; import org.testng.annotations.BeforeClass; -import static org.mockito.Mockito.when; +import static org.mockito.Mockito.*; Review comment: Still wildcard. I think @pkuwm has already added it to style file. Please try importing the latest file. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r331361552 ## File path: helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceLatencyGauge.java ## @@ -0,0 +1,94 @@ +package org.apache.helix.monitoring.metrics.implementation; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import com.codahale.metrics.Histogram; +import com.codahale.metrics.SlidingTimeWindowArrayReservoir; +import java.util.concurrent.TimeUnit; +import org.apache.helix.HelixException; +import org.apache.helix.monitoring.metrics.model.LatencyMetric; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +public class RebalanceLatencyGauge extends LatencyMetric { + private static final long VALUE_NOT_SET = -1; + private long _lastEmittedMetricValue = VALUE_NOT_SET; + + /** + * Instantiates a new Histogram dynamic metric. + * @param metricName the metric name + */ + public RebalanceLatencyGauge(String metricName, long slidingTimeWindow) { +super(metricName, new Histogram( +new SlidingTimeWindowArrayReservoir(slidingTimeWindow, TimeUnit.MILLISECONDS))); +_metricName = metricName; + } + + /** + * Calling this method multiple times would simply overwrite the previous state. This is because + * the rebalancer could fail at any point, and we want it to recover gracefully by resetting the + * internal state of this metric. + */ + @Override + public void startMeasuringLatency() { +reset(); +_startTime = System.currentTimeMillis(); + } + + @Override + public void endMeasuringLatency() { +if (_endTime != 0L) { + throw new HelixException( Review comment: This may fail the business logic. Please don't throw Exceptions. If _endTime != 0, we can log the error and ignore this recording. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r331360929 ## File path: helix-core/src/main/java/org/apache/helix/monitoring/metrics/MetricCollector.java ## @@ -0,0 +1,92 @@ +package org.apache.helix.monitoring.metrics; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import java.util.Collection; +import java.util.HashMap; +import java.util.HashSet; +import java.util.Map; +import javax.management.JMException; +import javax.management.ObjectName; +import org.apache.helix.HelixException; +import org.apache.helix.monitoring.metrics.model.Metric; +import org.apache.helix.monitoring.mbeans.MonitorDomainNames; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMBeanProvider; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +/** + * Collects and manages all metrics that implement the {@link Metric} interface. + */ +public abstract class MetricCollector extends DynamicMBeanProvider { + private static final String CLUSTER_NAME_KEY = "ClusterName"; + private static final String ENTITY_NAME_KEY = "EntityName"; + private String _clusterName; + private String _entityName; + private Map _metricMap; + + public MetricCollector(String clusterName, String entityName) { +_clusterName = clusterName; +_entityName = entityName; +_metricMap = new HashMap<>(); + } + + @Override + public DynamicMBeanProvider register() throws JMException { +// First cast all Metric objects to DynamicMetrics +Collection> dynamicMetrics = new HashSet<>(); +_metricMap.values().forEach(metric -> dynamicMetrics.add(metric.getDynamicMetric())); + +// TODO: Create helper methods for mbeanName and ObjectName +String mbeanName = +String.format("%s=%s, %s=%s", CLUSTER_NAME_KEY, _clusterName, ENTITY_NAME_KEY, _entityName); +doRegister(dynamicMetrics, +new ObjectName(String.format("%s:%s", MonitorDomainNames.Rebalancer.name(), mbeanName))); Review comment: Question, is MetricCollector designed for Rebalancer metrics only? If not, here it is hardcoded. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r331361705 ## File path: helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceLatencyGauge.java ## @@ -0,0 +1,94 @@ +package org.apache.helix.monitoring.metrics.implementation; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import com.codahale.metrics.Histogram; +import com.codahale.metrics.SlidingTimeWindowArrayReservoir; +import java.util.concurrent.TimeUnit; +import org.apache.helix.HelixException; +import org.apache.helix.monitoring.metrics.model.LatencyMetric; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +public class RebalanceLatencyGauge extends LatencyMetric { + private static final long VALUE_NOT_SET = -1; + private long _lastEmittedMetricValue = VALUE_NOT_SET; + + /** + * Instantiates a new Histogram dynamic metric. + * @param metricName the metric name + */ + public RebalanceLatencyGauge(String metricName, long slidingTimeWindow) { +super(metricName, new Histogram( +new SlidingTimeWindowArrayReservoir(slidingTimeWindow, TimeUnit.MILLISECONDS))); +_metricName = metricName; + } + + /** + * Calling this method multiple times would simply overwrite the previous state. This is because + * the rebalancer could fail at any point, and we want it to recover gracefully by resetting the + * internal state of this metric. + */ + @Override + public void startMeasuringLatency() { Review comment: I think you will need to add sync keyword for startMeasuringLatency and endMeasuringLatency. There is no guarantee this metric object is used in only one thread. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r331362485 ## File path: helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java ## @@ -818,6 +818,7 @@ private void dropRebalancedRunningTasks(Map> newAssig paMap.put(pId, new PartitionAssignment(instance, TaskPartitionState.DROPPED.name())); jobContext.setPartitionState(pId, TaskPartitionState.DROPPED); // Now it will be dropped and be rescheduled +// drop it from new assignemnt Review comment: Unrelated change? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r331361888 ## File path: helix-core/src/main/java/org/apache/helix/monitoring/metrics/implementation/RebalanceLatencyGauge.java ## @@ -0,0 +1,94 @@ +package org.apache.helix.monitoring.metrics.implementation; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import com.codahale.metrics.Histogram; +import com.codahale.metrics.SlidingTimeWindowArrayReservoir; +import java.util.concurrent.TimeUnit; +import org.apache.helix.HelixException; +import org.apache.helix.monitoring.metrics.model.LatencyMetric; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +public class RebalanceLatencyGauge extends LatencyMetric { + private static final long VALUE_NOT_SET = -1; + private long _lastEmittedMetricValue = VALUE_NOT_SET; + + /** + * Instantiates a new Histogram dynamic metric. + * @param metricName the metric name + */ + public RebalanceLatencyGauge(String metricName, long slidingTimeWindow) { +super(metricName, new Histogram( +new SlidingTimeWindowArrayReservoir(slidingTimeWindow, TimeUnit.MILLISECONDS))); +_metricName = metricName; + } + + /** + * Calling this method multiple times would simply overwrite the previous state. This is because + * the rebalancer could fail at any point, and we want it to recover gracefully by resetting the + * internal state of this metric. + */ + @Override + public void startMeasuringLatency() { +reset(); +_startTime = System.currentTimeMillis(); + } + + @Override + public void endMeasuringLatency() { +if (_endTime != 0L) { + throw new HelixException( + String.format("Needs to call startMeasuringLatency first! Metric name: %s", _metricName)); +} +_endTime = System.currentTimeMillis(); +_lastEmittedMetricValue = _endTime - _startTime; +updateValue(_lastEmittedMetricValue); + } + + @Override + public String getMetricName() { +return _metricName; + } + + @Override + public void reset() { Review comment: Why this is public? Consider the following call sequence: 1. startMeasuringLatency() 2. reset() 3. endMeasuringLatency(), then you record a huge number as the latency. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r331213330 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java ## @@ -117,6 +122,16 @@ private WagedRebalancer(AssignmentMetadataStore assignmentMetadataStore, _rebalanceAlgorithm = algorithm; _mappingCalculator = mappingCalculator; _manager = manager; +// If metricCollector is null, instantiate a version that does not register metrics in order to +// allow rebalancer to proceed +_metricCollector = +metricCollector == null ? new WagedRebalancerMetricCollector() : metricCollector; + } + + @VisibleForTesting Review comment: As you suggested in the other PR, it is not preferred to use "@VisibleForTesting" method, right? Shall we remove this annotation and move it closer to the other protected method? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r331215851 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java ## @@ -447,7 +490,7 @@ private void validateInput(ResourceControllerDataProvider clusterData, * @param currentStateOutput * @param resources * @return The current best possible assignment. If record does not exist in the - * assignmentMetadataStore, return the current state assignment. + * assignmentMetadataStore, return the current state assignment. * @throws HelixRebalanceException */ private Map getBestPossibleAssignment( Review comment: No metric for the this read latency? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r331217506 ## File path: helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/model/AbstractTestClusterModel.java ## @@ -116,11 +115,11 @@ protected ResourceControllerDataProvider setupClusterDataCache() throws IOExcept // 4. Mock two resources, each with 2 partitions on the default instance. // The instance will have the following partitions assigned // Resource 1: -// partition 1 - MASTER -// partition 2 - SLAVE +// partition 1 - MASTER Review comment: Please try not to change it. Or fill the line with "--" Otherwise it becomes less readable. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r331217164 ## File path: helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/TestWagedRebalancer.java ## @@ -47,9 +47,8 @@ import org.testng.annotations.BeforeClass; import org.testng.annotations.Test; -import static org.mockito.Matchers.any; -import static org.mockito.Matchers.anyString; -import static org.mockito.Mockito.when; +import static org.mockito.Matchers.*; Review comment: I think wildcard is not preferred. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r331214906 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java ## @@ -226,16 +247,25 @@ private void refreshBaseline(ResourceControllerDataProvider clusterData, Map> clusterChanges, Map resourceMap, final CurrentStateOutput currentStateOutput) throws HelixRebalanceException { LOG.info("Start calculating the new baseline."); + +// Read the baseline from metadata store + _metricCollector.getMetric(WagedRebalancerMetricNames.GlobalStateReadLatencyGauge.name(), +LatencyMetric.class).startMeasuringLatency(); Map currentBaseline = getBaselineAssignment(_assignmentMetadataStore, currentStateOutput, resourceMap.keySet()); + _metricCollector.getMetric(WagedRebalancerMetricNames.GlobalStateReadLatencyGauge.name(), +LatencyMetric.class).endMeasuringLatency(); + // For baseline calculation // 1. Ignore node status (disable/offline). // 2. Use the baseline as the previous best possible assignment since there is no "baseline" for // the baseline. -Map newBaseline = -calculateAssignment(clusterData, clusterChanges, resourceMap, clusterData.getAllInstances(), -Collections.emptyMap(), currentBaseline); +Map newBaseline = calculateAssignment(clusterData, clusterChanges, +resourceMap, clusterData.getAllInstances(), Collections.emptyMap(), currentBaseline); +// Write the new baseline to metadata store + _metricCollector.getMetric(WagedRebalancerMetricNames.GlobalStateWriteLatencyGauge.name(), +LatencyMetric.class).startMeasuringLatency(); if (_assignmentMetadataStore != null) { try { _assignmentMetadataStore.persistBaseline(newBaseline); Review comment: The BucketAccessor will be a generic component, right? What's the plan to track it's latency if it is used somewhere else outside the rebalancer? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r330689647 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/metrics/implementation/RebalanceLatencyGauge.java ## @@ -0,0 +1,94 @@ +package org.apache.helix.controller.rebalancer.waged.metrics.implementation; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import com.codahale.metrics.Histogram; +import com.codahale.metrics.SlidingTimeWindowArrayReservoir; +import java.util.concurrent.TimeUnit; +import org.apache.helix.HelixException; +import org.apache.helix.controller.rebalancer.waged.metrics.model.LatencyMetric; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +public class RebalanceLatencyGauge extends LatencyMetric { + + /** + * Instantiates a new Histogram dynamic metric. + * @param metricName the metric name + */ + public RebalanceLatencyGauge(String metricName, long slidingTimeWindow) { +super(metricName, new Histogram( +new SlidingTimeWindowArrayReservoir(slidingTimeWindow, TimeUnit.MILLISECONDS))); +_metricName = metricName; + } + + /** + * Calling this method multiple times would simply overwrite the previous state. This is because + * the rebalancer could fail at any point, and we want it to recover gracefully by resetting the + * internal state of this metric. + */ + @Override + public void startMeasuringLatency() { +reset(); +_startTime = System.currentTimeMillis(); + } + + @Override + public void endMeasuringLatency() { +if (_endTime != 0L) { + throw new HelixException( + String.format("Needs to call startMeasuringLatency first! Metric name: %s", _metricName)); +} +_endTime = System.currentTimeMillis(); +updateValue(getLastEmittedMetricValue()); + } + + @Override + public String getMetricName() { +return _metricName; + } + + @Override + public void reset() { +_startTime = 0L; +_endTime = 0L; + } + + @Override + public String toString() { +return String.format("Metric %s's latency is %d", _metricName, getLastEmittedMetricValue()); + } + + /** + * Returns an appropriate metric value at the time of the call. Review comment: nit: update the comments, please. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r330688071 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java ## @@ -391,7 +429,13 @@ private void validateInput(ResourceControllerDataProvider clusterData, Map currentBaseline = Collections.emptyMap(); if (assignmentMetadataStore != null) { try { +_metricCollector.getMetric( + WagedRebalancerMetricCollector.WagedRebalancerMetricNames.StateReadLatencyGauge.name(), +LatencyMetric.class).startMeasuringLatency(); currentBaseline = assignmentMetadataStore.getBaseline(); +_metricCollector.getMetric( + WagedRebalancerMetricCollector.WagedRebalancerMetricNames.StateReadLatencyGauge.name(), +LatencyMetric.class).endMeasuringLatency(); Review comment: Can we have temp var for the metric object? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r330689218 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/metrics/WagedRebalancerMetricCollector.java ## @@ -0,0 +1,76 @@ +package org.apache.helix.controller.rebalancer.waged.metrics; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import com.google.common.annotations.VisibleForTesting; +import javax.management.JMException; +import org.apache.helix.controller.rebalancer.waged.metrics.implementation.RebalanceLatencyGauge; +import org.apache.helix.controller.rebalancer.waged.metrics.model.LatencyMetric; + +public class WagedRebalancerMetricCollector extends MetricCollector { + private static final String WAGED_REBALANCER_ENTITY_NAME = "WagedRebalancer"; + + /** + * This enum class contains all metric names defined for WagedRebalancer. Note that all enums are + * in camel case for readability. + */ + public enum WagedRebalancerMetricNames { +// Per-stage latency metrics +GlobalBaselineCalcLatencyGauge, +PartialRebalanceLatencyGauge, + +// The following latency metrics are related to AssignmentMetadataStore +StateReadLatencyGauge, +StateWriteLatencyGauge + } + + public WagedRebalancerMetricCollector(String clusterName) throws JMException { +super(clusterName, WAGED_REBALANCER_ENTITY_NAME); +createMetrics(); +register(); + } + + @VisibleForTesting + public WagedRebalancerMetricCollector() throws JMException { +super(null, null); +createMetrics(); + } + + /** + * Creates and registers all metrics in MetricCollector for WagedRebalancer. + */ + private void createMetrics() throws JMException { +// Define all metrics +LatencyMetric globalBaselineCalcLatencyGauge = new RebalanceLatencyGauge( +WagedRebalancerMetricNames.GlobalBaselineCalcLatencyGauge.name(), getResetIntervalInMs()); +LatencyMetric partialRebalanceLatencyGauge = new RebalanceLatencyGauge( +WagedRebalancerMetricNames.PartialRebalanceLatencyGauge.name(), getResetIntervalInMs()); +LatencyMetric stateReadLatencyGauge = new RebalanceLatencyGauge( +WagedRebalancerMetricNames.StateReadLatencyGauge.name(), getResetIntervalInMs()); +LatencyMetric stateWriteLatencyGauge = new RebalanceLatencyGauge( +WagedRebalancerMetricNames.StateWriteLatencyGauge.name(), getResetIntervalInMs()); + +// Add to MetricCollector Review comment: nit: Add metrics to the WagedRebalancerMetricCollector. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r330689995 ## File path: helix-core/src/main/java/org/apache/helix/controller/stages/BestPossibleStateCalcStage.java ## @@ -104,7 +109,7 @@ public Object call() { } private BestPossibleStateOutput compute(ClusterEvent event, Map resourceMap, - CurrentStateOutput currentStateOutput) { + CurrentStateOutput currentStateOutput) throws JMException { Review comment: Same here, we don't want any monitor logic to block business. No matter how broken it could be. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r330690200 ## File path: helix-core/src/main/java/org/apache/helix/controller/stages/BestPossibleStateCalcStage.java ## @@ -249,8 +254,16 @@ private boolean validateOfflineInstancesLimit(final ResourceControllerDataProvid // Init rebalancer with the rebalance preferences. Map preferences = cache.getClusterConfig() .getGlobalRebalancePreference(); + +// TODO: Add an appropriate entityName Review comment: I think this has been done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r330688449 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/metrics/MetricCollector.java ## @@ -0,0 +1,92 @@ +package org.apache.helix.controller.rebalancer.waged.metrics; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import java.util.Collection; +import java.util.HashMap; +import java.util.HashSet; +import java.util.Map; +import javax.management.JMException; +import javax.management.ObjectName; +import org.apache.helix.HelixException; +import org.apache.helix.controller.rebalancer.waged.metrics.model.Metric; +import org.apache.helix.monitoring.mbeans.MonitorDomainNames; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMBeanProvider; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +/** + * Collects and manages all metrics that implement the {@link Metric} interface. + */ +public class MetricCollector extends DynamicMBeanProvider { Review comment: It might be safer to mark this class as abstract. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r330688735 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/metrics/WagedRebalancerMetricCollector.java ## @@ -0,0 +1,76 @@ +package org.apache.helix.controller.rebalancer.waged.metrics; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import com.google.common.annotations.VisibleForTesting; +import javax.management.JMException; +import org.apache.helix.controller.rebalancer.waged.metrics.implementation.RebalanceLatencyGauge; +import org.apache.helix.controller.rebalancer.waged.metrics.model.LatencyMetric; + +public class WagedRebalancerMetricCollector extends MetricCollector { + private static final String WAGED_REBALANCER_ENTITY_NAME = "WagedRebalancer"; + + /** + * This enum class contains all metric names defined for WagedRebalancer. Note that all enums are + * in camel case for readability. + */ + public enum WagedRebalancerMetricNames { +// Per-stage latency metrics +GlobalBaselineCalcLatencyGauge, +PartialRebalanceLatencyGauge, + +// The following latency metrics are related to AssignmentMetadataStore +StateReadLatencyGauge, +StateWriteLatencyGauge + } + + public WagedRebalancerMetricCollector(String clusterName) throws JMException { +super(clusterName, WAGED_REBALANCER_ENTITY_NAME); +createMetrics(); +register(); + } + + @VisibleForTesting + public WagedRebalancerMetricCollector() throws JMException { Review comment: Comment the metrics won't be registered while using this constructor. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r330686086 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java ## @@ -134,10 +145,9 @@ public void close() { */ public Map computeNewIdealStates(ResourceControllerDataProvider clusterData, Map resourceMap, final CurrentStateOutput currentStateOutput) - throws HelixRebalanceException { + throws HelixRebalanceException, JMException { Review comment: Don't throw the monitor exception. Just log the error and ignore it. We don't want the monitor logic or bug to block business logic. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r330687419 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java ## @@ -90,31 +92,39 @@ private static AssignmentMetadataStore constructAssignmentStore(HelixManager hel } public WagedRebalancer(HelixManager helixManager, - Map preferences) { + Map preferences, + MetricCollector metricCollector) throws JMException { this(constructAssignmentStore(helixManager), ConstraintBasedAlgorithmFactory.getInstance(preferences), // Use DelayedAutoRebalancer as the mapping calculator for the final assignment output. // Mapping calculator will translate the best possible assignment into the applicable state // mapping based on the current states. // TODO abstract and separate the main mapping calculator logic from DelayedAutoRebalancer -new DelayedAutoRebalancer()); +new DelayedAutoRebalancer(), metricCollector); } private WagedRebalancer(AssignmentMetadataStore assignmentMetadataStore, - RebalanceAlgorithm algorithm, MappingCalculator mappingCalculator) { + RebalanceAlgorithm algorithm, MappingCalculator mappingCalculator, + MetricCollector metricCollector) throws JMException { if (assignmentMetadataStore == null) { LOG.warn("Assignment Metadata Store is not configured properly." + " The rebalancer will not access the assignment store during the rebalance."); } _assignmentMetadataStore = assignmentMetadataStore; _rebalanceAlgorithm = algorithm; _mappingCalculator = mappingCalculator; +_metricCollector = metricCollector; Review comment: As I checked the code about how _metricCollector is leveraged, there is no null check after this point. Since we still have a public constructor that accepts the instance, we need to check for null to prevent NPE here. Or check for null everywhere it is used. Either way is fine to me. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329787035 ## File path: helix-core/src/main/java/org/apache/helix/monitoring/mbeans/MonitorDomainNames.java ## @@ -28,5 +28,6 @@ HelixThreadPoolExecutor, HelixCallback, RoutingTableProvider, - CLMParticipantReport + CLMParticipantReport, + WagedRebalancer Review comment: No, the problem is that, once you have the name, changing it requires us to consider backward compatibility. Better to do it right on the first shot. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329771293 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java ## @@ -170,6 +174,8 @@ public void close() { LOG.info("Finish computing new ideal states for resources: {}", resourceMap.keySet().toString()); +// Register all metrics +_metricCollector.register(); Review comment: We should remove this line, right? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329787483 ## File path: helix-core/src/test/java/org/apache/helix/controller/rebalancer/waged/TestWagedRebalancer.java ## @@ -110,9 +111,9 @@ protected ResourceControllerDataProvider setupClusterDataCache() throws IOExcept } @Test - public void testRebalance() throws IOException, HelixRebalanceException { + public void testRebalance() throws IOException, HelixRebalanceException, JMException { _metadataStore.clearMetadataStore(); -WagedRebalancer rebalancer = new WagedRebalancer(_metadataStore, _algorithm); +WagedRebalancer rebalancer = new WagedRebalancer(_metadataStore, _algorithm, null); Review comment: Not sure about all the usage, but if all test code is passing null, we can just remove this parameter from this constructor. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329780111 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/metrics/MetricCollector.java ## @@ -0,0 +1,97 @@ +package org.apache.helix.controller.rebalancer.waged.metrics; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import java.util.Collection; +import java.util.HashMap; +import java.util.HashSet; +import java.util.Map; +import java.util.Set; +import java.util.stream.Collectors; +import javax.management.JMException; +import javax.management.ObjectName; +import org.apache.helix.HelixException; +import org.apache.helix.controller.rebalancer.waged.metrics.model.Metric; +import org.apache.helix.monitoring.mbeans.MonitorDomainNames; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMBeanProvider; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +/** + * Collects and manages all metrics that implement the {@link Metric} interface. + */ +public class MetricCollector extends DynamicMBeanProvider { + private boolean _registered; + private String _clusterName; + private String _entityName; + private Map _metricMap; + + public MetricCollector(String clusterName, String entityName) { +_clusterName = clusterName; +_entityName = entityName; +_metricMap = new HashMap<>(); + } + + @Override + public DynamicMBeanProvider register() throws JMException { +// First cast all Metric objects to DynamicMetrics +Collection> dynamicMetrics = new HashSet<>(); +_metricMap.values().forEach(metric -> dynamicMetrics.add(metric.getDynamicMetric())); + +// TODO: Create helper methods for mbeanName and ObjectName +String mbeanName = String.format("%s=%s, %s=%s", "MetricCollector", _clusterName, +"MetricCollector", _entityName); +doRegister(dynamicMetrics, new ObjectName( +String.format("%s:%s", MonitorDomainNames.WagedRebalancer.name(), mbeanName))); +_registered = true; +return this; + } + + @Override + public String getSensorName() { +return String.format("%s.%s.%s", "MetricCollector", _clusterName, _entityName); + } + + public boolean isRegistered() { +return _registered; Review comment: If the metric list is static, we should include the fixed list in the metric collector implementation. If the metric list is dynamic, it is dangerous to tell if all the metric has been registered even this method returns true. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329782151 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/metrics/implementation/RebalanceLatencyGauge.java ## @@ -0,0 +1,96 @@ +package org.apache.helix.controller.rebalancer.waged.metrics.implementation; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import com.codahale.metrics.Histogram; +import com.codahale.metrics.SlidingTimeWindowArrayReservoir; +import java.util.concurrent.TimeUnit; +import org.apache.helix.HelixException; +import org.apache.helix.controller.rebalancer.waged.metrics.model.LatencyMetric; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +public class RebalanceLatencyGauge extends LatencyMetric { + private static final long DEFAULT_RESET_INTERVAL_MS = 60 * 60 * 1000; // 1 hour Review comment: This window should be configurable by the system property. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329775755 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java ## @@ -255,15 +287,31 @@ private void refreshBaseline(ResourceControllerDataProvider clusterData, Map> clusterChanges, Map resourceMap, final CurrentStateOutput currentStateOutput) throws HelixRebalanceException { LOG.info("Start calculating the new best possible assignment."); + +// TODO: Consider combining the metrics for both baseline/best possible? +// Read the baseline from metadata store + _metricCollector.getMetric(WagedRebalancerMetricNames.PartialStateReadLatencyGauge.name(), +LatencyMetric.class).startMeasuringLatency(); Map currentBaseline = getBaselineAssignment(_assignmentMetadataStore, currentStateOutput, resourceMap.keySet()); -Map currentBestPossibleAssignment = -getBestPossibleAssignment(_assignmentMetadataStore, currentStateOutput, -resourceMap.keySet()); + _metricCollector.getMetric(WagedRebalancerMetricNames.PartialStateReadLatencyGauge.name(), +LatencyMetric.class).endMeasuringLatency(); + +// Read the best possible assignment from metadata store + _metricCollector.getMetric(WagedRebalancerMetricNames.PartialStateReadLatencyGauge.name(), +LatencyMetric.class).startMeasuringLatency(); +Map currentBestPossibleAssignment = getBestPossibleAssignment( +_assignmentMetadataStore, currentStateOutput, resourceMap.keySet()); + _metricCollector.getMetric(WagedRebalancerMetricNames.PartialStateReadLatencyGauge.name(), +LatencyMetric.class).endMeasuringLatency(); + +// Compute the new assignment Map newAssignment = calculateAssignment(clusterData, clusterChanges, resourceMap, clusterData.getEnabledLiveInstances(), currentBaseline, currentBestPossibleAssignment); + _metricCollector.getMetric(WagedRebalancerMetricNames.PartialStateWriteLatencyGauge.name(), Review comment: Note that if the operation is not done, you should not record any latency. Or the result would be unusable. So we have to record inside the if condition. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329769603 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java ## @@ -19,6 +19,8 @@ * under the License. */ +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableSet; Review comment: Shouldn't all java imports be arranged on the top of the file? I have asked @pkuwm to take a look at the new style file. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329783344 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/metrics/implementation/RebalanceLatencyGauge.java ## @@ -0,0 +1,96 @@ +package org.apache.helix.controller.rebalancer.waged.metrics.implementation; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import com.codahale.metrics.Histogram; +import com.codahale.metrics.SlidingTimeWindowArrayReservoir; +import java.util.concurrent.TimeUnit; +import org.apache.helix.HelixException; +import org.apache.helix.controller.rebalancer.waged.metrics.model.LatencyMetric; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +public class RebalanceLatencyGauge extends LatencyMetric { + private static final long DEFAULT_RESET_INTERVAL_MS = 60 * 60 * 1000; // 1 hour + + /** + * Instantiates a new Histogram dynamic metric. + * @param metricName the metric name + */ + public RebalanceLatencyGauge(String metricName) { +super(metricName, new Histogram( +new SlidingTimeWindowArrayReservoir(DEFAULT_RESET_INTERVAL_MS, TimeUnit.MILLISECONDS))); +_metricName = metricName; +reset(); + } + + /** + * Calling this method multiple times would simply overwrite the previous state. This is because + * the rebalancer could fail at any point, and we want it to recover gracefully by resetting the + * internal state of this metric. + */ + @Override + public void startMeasuringLatency() { +reset(); +_startTime = System.currentTimeMillis(); + } + + @Override + public void endMeasuringLatency() { +if (_endTime != 0L) { + throw new HelixException( + String.format("Needs to call startMeasuringLatency first! Metric name: %s", _metricName)); +} +_endTime = System.currentTimeMillis(); +updateValue(_endTime - _startTime); + } + + @Override + public String getMetricName() { +return _metricName; + } + + @Override + public void reset() { +_startTime = 0L; +_endTime = 0L; Review comment: If you don't have endTime at all, I think it still works. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329786570 ## File path: helix-core/src/main/java/org/apache/helix/controller/stages/BestPossibleStateCalcStage.java ## @@ -249,8 +253,15 @@ private boolean validateOfflineInstancesLimit(final ResourceControllerDataProvid // Init rebalancer with the rebalance preferences. Map preferences = cache.getClusterConfig() .getGlobalRebalancePreference(); + +// TODO: Add an appropriate entityName +if (METRIC_COLLECTOR_THREAD_LOCAL.get() == null) { + METRIC_COLLECTOR_THREAD_LOCAL.set(new MetricCollector(helixManager.getClusterName(), "")); Review comment: I guess the lazy init is not required here. Let's just put it to the constructor of the stage object. It's always better to have the metrics code "disappear", you know. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329773415 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java ## @@ -189,15 +200,25 @@ public void close() { return itemKeys; })); +// Perform Global Baseline Calculation + _metricCollector.getMetric(WagedRebalancerMetricNames.GlobalBaselineCalcLatencyGauge.name(), Review comment: 1. this is too verbose, let's have a temp var for the metric instance. 2. better to be moved into the refreshBaseline/partialRebalance method. Note that if no global rebalance is done, your metric will record a garbage latency number. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329782959 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/metrics/implementation/RebalanceLatencyGauge.java ## @@ -0,0 +1,96 @@ +package org.apache.helix.controller.rebalancer.waged.metrics.implementation; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import com.codahale.metrics.Histogram; +import com.codahale.metrics.SlidingTimeWindowArrayReservoir; +import java.util.concurrent.TimeUnit; +import org.apache.helix.HelixException; +import org.apache.helix.controller.rebalancer.waged.metrics.model.LatencyMetric; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +public class RebalanceLatencyGauge extends LatencyMetric { + private static final long DEFAULT_RESET_INTERVAL_MS = 60 * 60 * 1000; // 1 hour + + /** + * Instantiates a new Histogram dynamic metric. + * @param metricName the metric name + */ + public RebalanceLatencyGauge(String metricName) { +super(metricName, new Histogram( +new SlidingTimeWindowArrayReservoir(DEFAULT_RESET_INTERVAL_MS, TimeUnit.MILLISECONDS))); +_metricName = metricName; +reset(); + } + + /** + * Calling this method multiple times would simply overwrite the previous state. This is because + * the rebalancer could fail at any point, and we want it to recover gracefully by resetting the + * internal state of this metric. + */ + @Override + public void startMeasuringLatency() { +reset(); +_startTime = System.currentTimeMillis(); + } + + @Override + public void endMeasuringLatency() { +if (_endTime != 0L) { + throw new HelixException( + String.format("Needs to call startMeasuringLatency first! Metric name: %s", _metricName)); +} +_endTime = System.currentTimeMillis(); +updateValue(_endTime - _startTime); Review comment: nit: updateValue(getMetricValue())? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329773765 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java ## @@ -226,16 +247,25 @@ private void refreshBaseline(ResourceControllerDataProvider clusterData, Map> clusterChanges, Map resourceMap, final CurrentStateOutput currentStateOutput) throws HelixRebalanceException { LOG.info("Start calculating the new baseline."); + +// Read the baseline from metadata store + _metricCollector.getMetric(WagedRebalancerMetricNames.GlobalStateReadLatencyGauge.name(), Review comment: Same here, please push to the most detailed private method if possible. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329781886 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/metrics/WagedRebalancerMetricNames.java ## @@ -0,0 +1,35 @@ +package org.apache.helix.controller.rebalancer.waged.metrics; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +/** + * This enum class contains all metric names defined for WagedRebalancer. Note that all enums are in camel case for readability. + */ +public enum WagedRebalancerMetricNames { Review comment: It would be much easier if you have WagedRebalancerMetricCollector which holds all the methods, metrics, and the enums. I can see what you want to achieve is a generic MetricCollector for all use cases. But there would be many restrictions and duplicate codes. I would encourage you to have a try. However, please mind the additional cost we will need to pay : ) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329778506 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/metrics/MetricCollector.java ## @@ -0,0 +1,97 @@ +package org.apache.helix.controller.rebalancer.waged.metrics; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import java.util.Collection; +import java.util.HashMap; +import java.util.HashSet; +import java.util.Map; +import java.util.Set; +import java.util.stream.Collectors; +import javax.management.JMException; +import javax.management.ObjectName; +import org.apache.helix.HelixException; +import org.apache.helix.controller.rebalancer.waged.metrics.model.Metric; +import org.apache.helix.monitoring.mbeans.MonitorDomainNames; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMBeanProvider; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +/** + * Collects and manages all metrics that implement the {@link Metric} interface. + */ +public class MetricCollector extends DynamicMBeanProvider { + private boolean _registered; + private String _clusterName; + private String _entityName; + private Map _metricMap; + + public MetricCollector(String clusterName, String entityName) { +_clusterName = clusterName; +_entityName = entityName; +_metricMap = new HashMap<>(); + } + + @Override + public DynamicMBeanProvider register() throws JMException { +// First cast all Metric objects to DynamicMetrics +Collection> dynamicMetrics = new HashSet<>(); +_metricMap.values().forEach(metric -> dynamicMetrics.add(metric.getDynamicMetric())); + +// TODO: Create helper methods for mbeanName and ObjectName +String mbeanName = String.format("%s=%s, %s=%s", "MetricCollector", _clusterName, +"MetricCollector", _entityName); +doRegister(dynamicMetrics, new ObjectName( +String.format("%s:%s", MonitorDomainNames.WagedRebalancer.name(), mbeanName))); Review comment: This means, if we have multiple clusters managed by one supercluster node, the mbeans will be grouped by WagedRebalancer domain name first. Then you have clusters separated. Our convention is group by cluster name first. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329788124 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/metrics/model/Metric.java ## @@ -0,0 +1,54 @@ +package org.apache.helix.controller.rebalancer.waged.metrics.model; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +/** + * Defines a generic metric interface. + */ +public interface Metric { + + /** + * Gets the name of the metric. + */ + String getMetricName(); + + /** + * Resets the internal state of this metric. + */ + void reset(); + + /** + * Prints the metric along with its name. + */ + String print(); Review comment: toString() ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329775047 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedRebalancer.java ## @@ -226,16 +247,25 @@ private void refreshBaseline(ResourceControllerDataProvider clusterData, Map> clusterChanges, Map resourceMap, final CurrentStateOutput currentStateOutput) throws HelixRebalanceException { LOG.info("Start calculating the new baseline."); + +// Read the baseline from metadata store + _metricCollector.getMetric(WagedRebalancerMetricNames.GlobalStateReadLatencyGauge.name(), +LatencyMetric.class).startMeasuringLatency(); Map currentBaseline = getBaselineAssignment(_assignmentMetadataStore, currentStateOutput, resourceMap.keySet()); + _metricCollector.getMetric(WagedRebalancerMetricNames.GlobalStateReadLatencyGauge.name(), +LatencyMetric.class).endMeasuringLatency(); + // For baseline calculation // 1. Ignore node status (disable/offline). // 2. Use the baseline as the previous best possible assignment since there is no "baseline" for // the baseline. -Map newBaseline = -calculateAssignment(clusterData, clusterChanges, resourceMap, clusterData.getAllInstances(), -Collections.emptyMap(), currentBaseline); +Map newBaseline = calculateAssignment(clusterData, clusterChanges, +resourceMap, clusterData.getAllInstances(), Collections.emptyMap(), currentBaseline); +// Write the new baseline to metadata store + _metricCollector.getMetric(WagedRebalancerMetricNames.GlobalStateWriteLatencyGauge.name(), +LatencyMetric.class).startMeasuringLatency(); if (_assignmentMetadataStore != null) { try { _assignmentMetadataStore.persistBaseline(newBaseline); Review comment: Will it be better if the AssignmentMetadataStore.persistBaseline takes a metric object as the input? The main goal is to minimize the metric code footprint. Or it will hurt the code's readability. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329779503 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/metrics/MetricCollector.java ## @@ -0,0 +1,97 @@ +package org.apache.helix.controller.rebalancer.waged.metrics; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import java.util.Collection; +import java.util.HashMap; +import java.util.HashSet; +import java.util.Map; +import java.util.Set; +import java.util.stream.Collectors; +import javax.management.JMException; +import javax.management.ObjectName; +import org.apache.helix.HelixException; +import org.apache.helix.controller.rebalancer.waged.metrics.model.Metric; +import org.apache.helix.monitoring.mbeans.MonitorDomainNames; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMBeanProvider; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.DynamicMetric; + +/** + * Collects and manages all metrics that implement the {@link Metric} interface. + */ +public class MetricCollector extends DynamicMBeanProvider { + private boolean _registered; + private String _clusterName; + private String _entityName; + private Map _metricMap; + + public MetricCollector(String clusterName, String entityName) { +_clusterName = clusterName; +_entityName = entityName; +_metricMap = new HashMap<>(); + } + + @Override + public DynamicMBeanProvider register() throws JMException { +// First cast all Metric objects to DynamicMetrics +Collection> dynamicMetrics = new HashSet<>(); +_metricMap.values().forEach(metric -> dynamicMetrics.add(metric.getDynamicMetric())); + +// TODO: Create helper methods for mbeanName and ObjectName +String mbeanName = String.format("%s=%s, %s=%s", "MetricCollector", _clusterName, +"MetricCollector", _entityName); +doRegister(dynamicMetrics, new ObjectName( +String.format("%s:%s", MonitorDomainNames.WagedRebalancer.name(), mbeanName))); +_registered = true; +return this; + } + + @Override + public String getSensorName() { +return String.format("%s.%s.%s", "MetricCollector", _clusterName, _entityName); Review comment: If you don't include the domain name here, it is possible we emit duplicate ingraphs metrics when there are 2 entity names are the same across 2 different monitor domain. Although I think it won't be the case, the thing is that we won't have a programmatic way to detect the issue. So, please ensure the sensor name includes all the MBean name components. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org
[GitHub] [helix] jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer
jiajunwang commented on a change in pull request #490: Add latency metric components for WAGED rebalancer URL: https://github.com/apache/helix/pull/490#discussion_r329785737 ## File path: helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/metrics/model/LatencyMetric.java ## @@ -0,0 +1,52 @@ +package org.apache.helix.controller.rebalancer.waged.metrics.model; + +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +import com.codahale.metrics.Histogram; +import org.apache.helix.monitoring.mbeans.dynamicMBeans.HistogramDynamicMetric; + +/** + * Represents a latency metric and defines methods to help with calculation. A latency metric gives + * how long a particular stage in the logic took in milliseconds. + */ +public abstract class LatencyMetric extends HistogramDynamicMetric implements Metric { + protected long _startTime; + protected long _endTime; + protected String _metricName; + + /** + * Instantiates a new Histogram dynamic metric. + * @param metricName the metric name + * @param metricObject the metric object + */ + public LatencyMetric(String metricName, Histogram metricObject) { Review comment: Can we just construct a new Histogram inside the constructor? I think there was some restriction preventing us from doing so. But I don't see any problem if we do it here. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org