Github user anew commented on a diff in the pull request:

    https://github.com/apache/incubator-tephra/pull/20#discussion_r90759085
  
    --- Diff: tephra-core/src/main/java/org/apache/tephra/janitor/TransactionPruningPlugin.java ---
    @@ -0,0 +1,90 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *   http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +package org.apache.tephra.janitor;
    +
    +import org.apache.hadoop.conf.Configuration;
    +
    +import java.io.IOException;
    +
    +/**
    + * Data janitor interface to manage the invalid transaction list.
    + *
    + * <p/>
    + * An invalid transaction can only be removed from the invalid list after the data written
    + * by the invalid transactions has been removed from all the data stores.
    + * The term data store is used here to represent a set of tables in a database that have
    + * the same data clean up policy, like all Apache Phoenix tables in an HBase instance.
    + *
    + * <p/>
    + * Typically every data store will have a background job which cleans up the data written by invalid transactions.
    + * Prune upper bound for a data store is defined as the largest invalid transaction whose data has been
    + * cleaned up from that data store.
    + * <pre>
    + * prune-upper-bound = min(max(invalid list), min(in-progress list) - 1)
    + * </pre>
    + * where invalid list and in-progress list are from the transaction snapshot used to clean up the invalid data in the
    + * data store.
    + *
    + * <p/>
    + * There will be one such plugin per data store. The plugins will be executed as part of the Transaction Service.
    + * Each plugin will be invoked periodically to fetch the prune upper bound for its data store.
    + * Invalid transaction list can pruned up to the minimum of prune upper bounds returned by all the plugins.
    + */
    +public interface TransactionPruningPlugin {
    +  /**
    +   * Called once when the Transaction Service starts up.
    +   *
    +   * @param conf configuration for the plugin
    +   */
    +  void initialize(Configuration conf) throws IOException;
    +
    +  /**
    +   * Called periodically to fetch prune upper bound for a data store. The plugin examines the state of data cleanup
    +   * in the data store and determines the smallest invalid transaction whose writes no longer exist in the data
    --- End diff --
    
    or a greatest lower bound for transaction ids that may not be pruned?
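
    For reference, the class Javadoc's formula and the "minimum over all plugins" rule could be sketched roughly as below. This is only an illustration of the arithmetic, not code from the pull request: the class name PruneBoundSketch, both method names, and the -1 sentinel for an empty list are hypothetical, and the handling of empty lists is an assumption the Javadoc does not spell out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Tephra: illustrates the Javadoc formula only.
final class PruneBoundSketch {

  // Assumption: -1 means "nothing can be pruned"; the Javadoc defines no sentinel.
  static final long NOTHING_TO_PRUNE = -1L;

  /**
   * prune-upper-bound = min(max(invalid list), min(in-progress list) - 1),
   * with both lists taken from the snapshot used for the cleanup run.
   */
  static long pruneUpperBound(List<Long> invalid, List<Long> inProgress) {
    if (invalid.isEmpty()) {
      return NOTHING_TO_PRUNE; // assumption: no invalid transactions, nothing to prune
    }
    long maxInvalid = Collections.max(invalid);
    if (inProgress.isEmpty()) {
      return maxInvalid; // assumption: no in-progress transaction limits the bound
    }
    return Math.min(maxInvalid, Collections.min(inProgress) - 1);
  }

  /** The invalid list can be pruned up to the minimum bound over all data stores. */
  static long globalPruneUpperBound(List<Long> perStoreBounds) {
    return perStoreBounds.isEmpty() ? NOTHING_TO_PRUNE : Collections.min(perStoreBounds);
  }

  public static void main(String[] args) {
    // Store A: invalid {3, 7, 12}, in-progress {10, 15} gives min(12, 10 - 1).
    long storeA = pruneUpperBound(Arrays.asList(3L, 7L, 12L), Arrays.asList(10L, 15L));
    // Store B: invalid {3, 5}, no in-progress transactions gives max(invalid).
    long storeB = pruneUpperBound(Arrays.asList(3L, 5L), Collections.<Long>emptyList());
    System.out.println("store A bound: " + storeA);
    System.out.println("store B bound: " + storeB);
    System.out.println("global bound:  " + globalPruneUpperBound(Arrays.asList(storeA, storeB)));
  }
}
```

    Under this reading, every invalid transaction id above the global bound has to stay in the invalid list, which is the sense in which the value is an upper bound on what may be pruned.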

