[
https://issues.apache.org/jira/browse/MAILBOX-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194605#comment-13194605
]
Ioan Eugen Stan commented on MAILBOX-103:
-----------------------------------------
We can use ZooKeeper to implement this. Full thread:
http://mail-archives.apache.org/mod_mbox/zookeeper-user/201201.mbox/%3CCAFvdMiCeMRxJaRg56zAFMRQSMB_oxRMzAYJ7e%3DJOQVf94Wscdg%40mail.gmail.com%3E
Use plain ZooKeeper and rely on the znode version for sequence generation for
both UIDs and ModSeq.
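To make the idea concrete, here is a minimal sketch of a version-backed
sequence. It is an in-memory stand-in, not the ZooKeeper client: the class and
method names (VersionedNode, UidSequenceSketch, nextUid) are hypothetical. With
a real ensemble the equivalent step would presumably be an unconditional
ZooKeeper.setData(path, data, -1) followed by reading getVersion() from the
returned Stat.

```java
// In-memory stand-in for a single znode; this is NOT the ZooKeeper client API.
// Assumption: with a real ensemble the write would be
// ZooKeeper.setData(path, data, -1), and the new version would be read
// from the Stat object that call returns.
class VersionedNode {
    // A freshly created znode starts at data version 0.
    private int version = 0;

    // Mimics an unconditional write: each write bumps the version by one
    // and returns the new value atomically with respect to other writers.
    synchronized int setData() {
        version++;
        return version;
    }
}

// Hypothetical provider that hands out the node version as the next UID.
class UidSequenceSketch {
    private final VersionedNode uidNode = new VersionedNode();

    // Every call performs one write, so no two callers see the same value.
    long nextUid() {
        return uidNode.setData();
    }
}
```

A ModSeq generator could follow the same pattern with a second node, so that
the two sequences advance independently.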
A single ZooKeeper ensemble should scale well into the millions. Beyond
that we can use multiple ensembles, with each ensemble managing one
shard of the mailboxes.
The first thing that comes to mind is the way Debian stores packages
[3]: the first letter of the package name is used as a directory that
groups all packages starting with that letter.
In the same way, one ensemble can handle all mailboxes that start with
0-4 and another those that start with 5-9. Assuming the mailboxes are
generated uniformly, this splits the load in half and gives us
horizontal scalability.
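A rough sketch of that routing, assuming mailbox identifiers begin with a
decimal digit. The class name EnsembleRouter and both connect strings are
made up for illustration:

```java
// Picks a ZooKeeper ensemble from the first character of a mailbox identifier.
// Assumption: identifiers start with a decimal digit (0-9).
class EnsembleRouter {
    // Hypothetical connect strings, for illustration only.
    static final String LOW_ENSEMBLE = "zk-a1:2181,zk-a2:2181,zk-a3:2181";
    static final String HIGH_ENSEMBLE = "zk-b1:2181,zk-b2:2181,zk-b3:2181";

    static String ensembleFor(String mailboxId) {
        if (mailboxId == null || mailboxId.isEmpty()) {
            throw new IllegalArgumentException("mailbox id must not be empty");
        }
        char first = mailboxId.charAt(0);
        if (first >= '0' && first <= '4') {
            return LOW_ENSEMBLE;   // shard for identifiers starting with 0-4
        }
        if (first >= '5' && first <= '9') {
            return HIGH_ENSEMBLE;  // shard for identifiers starting with 5-9
        }
        throw new IllegalArgumentException(
                "mailbox id must start with a digit: " + mailboxId);
    }
}
```

Splitting further would only mean narrowing the ranges, for example one
ensemble per leading digit.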
[1]
http://zookeeper.apache.org/doc/current/zookeeperOver.html#fg_zkPerfReliability
[2] http://wiki.apache.org/hadoop/ZooKeeper/ServiceLatencyOverview
[3] ftp://ftp.be.debian.org/debian/pool/main
> [gsoc2011] Design and implement Distributed UID generation
> ----------------------------------------------------------
>
> Key: MAILBOX-103
> URL: https://issues.apache.org/jira/browse/MAILBOX-103
> Project: James Mailbox
> Issue Type: New Feature
> Components: hbase
> Affects Versions: 0.4
> Reporter: Eric Charles
> Fix For: 0.4
>
>
> Context: IMAP4rev1 (RFC 3501) requires that every message is identified by a
> stable 32-bit Unique Identifier (UID) assigned in incremental sequence. This
> is now achieved in James IMAP subproject (http://james.apache.org/imap) with
> a UidProvider interface implemented in memory. This implementation does not
> allow distributed working of the solution.
> Task: A DistributedUidProvider must be designed and implemented. The design
> can rely on a distributed memory cache such as Hazelcast, or any other
> solution (Hadoop, HBase, Cassandra, ...).
> Mentor: eric at apache dot org
> Complexity: medium