[CONF] Apache Samza > SEP-7: Samza on Azure

Pawas Chhokra (Confluence) Fri, 30 Jun 2017 15:21:55 -0700

Title: Message Title

Pawas Chhokra edited a page

...

Implement the AzureJobCoordinator on top of current JobCoordinator. Implement
Change the current Latch (lock) and implementation to a distributed lock based implementation, and implement it for Zookeeper.
Implement the distributed Lock and Leader functionality with Lease Blobs in Azure. These are pluggable components. A blob in Azure storage is used for storing large amounts of unstructured data. A Lease Blob is an operation that establishes and manages a lock on a blob for write and delete operations. We will use this service to elect the leader when running in Azure as follows:
- All the processors will try to acquire a lease on the same shared blob in Azure storage
- The processor that acquires the lease becomes the leader and automatically renews the lease at a constant time interval.
- In parallel, all the worker processors keep trying to acquire a lease on the same blob at constant time interval (T1). Acquiring a lease on a blob that has already been leased by someone results in failure.
- When the leader dies, it results in release of the lease, following which one of the worker processes will acquire a new lease on the blob. This ensures that the system will not exist without a leader for more than (T1 + lease acquisition) time.
Implement a barrier in Azure. According to my current research, Azure does not have a watch/subscribe functionality in Azure storage. This makes the barrier implementation a little challenging. Any change in the list of processors in the system will by monitoring each processor's heartbeats. Once the leader is elected, every processor will heartbeat to the leader at regular time intervals. If the leader does not hear from a processor after a particular time, it will assume that that processor has died and call the onProcessorChange() function. This will initiate the generation of a new job model which the leader will publish to the blob. The leader will also keep track of whether every processor has the updated version of the job model. In the background, every worker processor continuously polls the blob to check if any data on it has been modified since the last time it was retrieved. If true, it stops all current processing and retrieves the new job model. All worker processes start processing once they validate with the leader that every processor has got the same updated version of the job model.
Implement the checkpointing mechanism with Azure Storage.
Integrate all of this with the EventHubSystemProducer and EventHubSystemConsumer.

...

The following interfaces will be implemented for Azure:

Lock interface to replace current Latch interface

public interface Lock {



  /**


 * Acquires the lock


 */


 void lock();




  /**


 * Waits for a status notification from the lock


 */


 void await();




  /**


 * Releases the lock


 */


 void unlock();




  /**


 * Perform the write operation once the lock is acquired


 */


 void update();




}

JobCoordinator

public class AzureJobCoordinator implements JobCoordinator {}
Latch

public class AzureLatch implements Latch {}

LeaderElection

public class AzureLeaderElector implements LeaderElector {

@Override


public void setLeaderElectorListener(LeaderElectorListener listener) {




}




/**


 * Try and acquire a lease on the shared blob.


 */


@Override


public void tryBecomeLeader() {


  tryAcquireLeaseOrWait();


}




/**


 * Release the lease


 */


@Override


public void resignLeadership() {




}




@Override


public boolean amILeader() {


  return false;


}




/**


 * Keeps renewing the lease automatically after a fixed time interval.


 * @return true if it was able to renew the lease successfully, false otherwise.


 */


public boolean keepRenewingLease() {


  return false;


}




/**


 * Tries to acquire a lease after a fixed time interval if its not the leader currently.


 * Does nothing if it is the leader currently.


 * Whenever a lease is acquired, it depicts a processor change.


 */


private void tryAcquireLeaseOrWait()


{




}

}

The following config values will be introduced for this implementation:

org.apache.samza.azure.AzureJobCoordinatorFactory for job.coordinator.factory.
Azure Storage Account Name
Azure Storage Account Key

Implementation and Test Plan

...

View page

•

Add comment

•

[CONF] Apache Samza > SEP-7: Samza on Azure

Implementation and Test Plan

Reply via email to