[ 
https://issues.apache.org/jira/browse/MESOS-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-2075:
-----------------------------
    Description: 
To achieve fault-tolerance for the maintenance primitives, we will need to add 
the maintenance information to the registry.

The registry currently stores all of the slave information, which is quite 
large (~17MB for 50,000 slaves from my testing) and results in a protobuf 
object that is extremely expensive to copy.

As far as I can tell, reads / writes to maintenance information are independent 
of reads / writes to the existing 'registry' information. So there are two 
approaches here:

h4. Add maintenance information to 'maintenance' key: (This is the chosen 
method.)
# The advantage of this approach is that we don't further grow the large 
Registry object.
# This approach assumes that writes to 'maintenance' are independent of writes 
to the 'registry'. If these writes are not independent, this approach requires 
that we add transactional support to the State abstraction.
# This approach requires adding compaction to LogStorage.
# This approach likely requires some refactoring to the Registrar.
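
To make the independence assumption concrete, here is a minimal sketch of a versioned key-value store holding 'registry' and 'maintenance' under separate keys. It is a toy model only: `ToyState`, `Variable`, and their methods are hypothetical names for illustration, not the actual State abstraction or its API. Each key carries its own version, so a write to 'maintenance' neither copies nor invalidates the large 'registry' value, but nothing here makes a write to both keys atomic.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Variable:
    """A named value plus a version token (hypothetical, for illustration)."""
    name: str
    value: bytes
    version: str = field(default_factory=lambda: uuid.uuid4().hex)


class ToyState:
    """Hypothetical in-memory store with per-key versioning."""

    def __init__(self):
        self._entries = {}

    def fetch(self, name):
        # Return the stored variable, or an empty one if the key is unset.
        return self._entries.get(name, Variable(name, b"", version=""))

    def store(self, variable):
        # Succeed only if the caller holds the latest version of this key.
        current = self._entries.get(variable.name)
        current_version = current.version if current else ""
        if variable.version != current_version:
            return None
        updated = Variable(variable.name, variable.value)
        self._entries[variable.name] = updated
        return updated


state = ToyState()
registry = state.store(Variable("registry", b"x" * 1000, version=""))
maintenance = state.store(Variable("maintenance", b"schedule-v1", version=""))

# Updating 'maintenance' does not touch the 'registry' entry.
updated = state.store(
    Variable("maintenance", b"schedule-v2", version=maintenance.version))

# A second write based on the old version is rejected.
stale = state.store(
    Variable("maintenance", b"schedule-v3", version=maintenance.version))
```

If a single logical operation ever had to change both keys, a failure between the two `store` calls would leave them inconsistent, which is the case where transactional support would be needed.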

h4. Add maintenance information to 'registry' key:
# The advantage of this approach is that it's the easiest to implement.
# This will further grow the single 'registry' object, but doesn't preclude it 
being split apart in the future.
# This approach may require using the diff support in LogStorage and/or adding 
compression support to LogStorage snapshots to deal with the increased size of 
the registry.
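
As a rough illustration of the snapshot compression idea, the sketch below compresses a synthetic stand-in for a registry snapshot with zlib. The record layout, the host names, and the helper names are all assumptions made up for this example; a real serialized registry would be a protobuf and may compress quite differently, so this says nothing about the sizes to expect in practice.

```python
import zlib


def make_snapshot(slave_count):
    # Synthetic stand-in for a serialized registry: one text record per slave.
    records = [
        f"slave-{i:05d}:host-{i:05d}.example.com:5051".encode()
        for i in range(slave_count)
    ]
    return b"\n".join(records)


def compress_snapshot(snapshot):
    # Compress the whole snapshot before it would be written to storage.
    return zlib.compress(snapshot, 6)


def decompress_snapshot(blob):
    # Recover the original bytes when the snapshot is read back.
    return zlib.decompress(blob)


snapshot = make_snapshot(50_000)
blob = compress_snapshot(snapshot)
print(len(snapshot), len(blob))
```

Compression trades CPU time on every read and write for a smaller stored value, so it reduces storage and transfer size but does not by itself make the in-memory object cheaper to copy.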

  was:
To achieve fault-tolerance for the maintenance primitives, we will need to add 
the maintenance information to the registry.

The registry currently stores all of the slave information, which is quite 
large (~17MB for 50,000 slaves from my testing) and results in a protobuf 
object that is extremely expensive to copy.

As far as I can tell, reads / writes to maintenance information are independent 
of reads / writes to the existing 'registry' information. So there are two 
approaches here:

h4. Add maintenance information to 'maintenance' key:
# The advantage of this approach is that we don't further grow the large 
Registry object.
# This approach assumes that writes to 'maintenance' are independent of writes 
to the 'registry'. If these writes are not independent, this approach requires 
that we add transactional support to the State abstraction.
# This approach requires adding compaction to LogStorage.
# This approach likely requires some refactoring to the Registrar.

h4. Add maintenance information to 'registry' key:
# The advantage of this approach is that it's the easiest to implement.
# This will further grow the single 'registry' object, but doesn't preclude it 
being split apart in the future.
# This approach may require using the diff support in LogStorage and/or adding 
compression support to LogStorage snapshots to deal with the increased size of 
the registry.


> Add maintenance information to the replicated registry.
> -------------------------------------------------------
>
>                 Key: MESOS-2075
>                 URL: https://issues.apache.org/jira/browse/MESOS-2075
>             Project: Mesos
>          Issue Type: Task
>          Components: master
>            Reporter: Benjamin Mahler
>            Assignee: Joseph Wu
>              Labels: mesosphere, twitter
>
> To achieve fault-tolerance for the maintenance primitives, we will need to 
> add the maintenance information to the registry.
> The registry currently stores all of the slave information, which is quite 
> large (~17MB for 50,000 slaves from my testing) and results in a protobuf 
> object that is extremely expensive to copy.
> As far as I can tell, reads / writes to maintenance information are 
> independent of reads / writes to the existing 'registry' information. So 
> there are two approaches here:
> h4. Add maintenance information to 'maintenance' key: (This is the chosen 
> method.)
> # The advantage of this approach is that we don't further grow the large 
> Registry object.
> # This approach assumes that writes to 'maintenance' are independent of 
> writes to the 'registry'. If these writes are not independent, this approach 
> requires that we add transactional support to the State abstraction.
> # This approach requires adding compaction to LogStorage.
> # This approach likely requires some refactoring to the Registrar.
> h4. Add maintenance information to 'registry' key:
> # The advantage of this approach is that it's the easiest to implement.
> # This will further grow the single 'registry' object, but doesn't preclude 
> it being split apart in the future.
> # This approach may require using the diff support in LogStorage and/or 
> adding compression support to LogStorage snapshots to deal with the increased 
> size of the registry.



