A few lines of Java in a partitioning or rack-aware strategy might be able to 
achieve this. 
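
As a rough sketch only, the placement rule such a strategy would need to encode looks something like the following. This is plain Java and does not use Cassandra's actual replication strategy classes; the class name, method name and data center labels are all made up for illustration, and wiring the rule into a real strategy is left open.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: decides which data centers should hold a copy of a row,
// given the data center where the row originated. Not Cassandra API code.
final class ReplicaPlacementSketch {

    // Assumed label for the hub data center that must see all data.
    static final String HUB_DC = "NewYork";

    // Data stays in its origin data center and is additionally copied to the hub.
    // Data that originates in the hub is not copied anywhere else.
    static List<String> replicaDataCenters(String originDc) {
        List<String> targets = new ArrayList<>();
        targets.add(originDc);
        if (!originDc.equals(HUB_DC)) {
            targets.add(HUB_DC);
        }
        return targets;
    }

    public static void main(String[] args) {
        for (String dc : new String[] {"Tokyo", "London", "NewYork"}) {
            System.out.println(dc + " -> " + replicaDataCenters(dc));
        }
    }
}
```

The point of the rule is that Tokyo never receives London's data and vice versa, while New York ends up with everything for the Hadoop/Pig jobs.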

--Joe

--
Typed with big fingers on a small keyboard. 

On Apr 8, 2011, at 13:17, Patrick Julien <pjul...@gmail.com> wrote:

> We have a pilot project running where all our historical data
> worldwide would be stored using Cassandra.  So far, we have been
> successful at getting the write and read throughput we need; in fact,
> we are coming in more than 27% over our needed capacity and well beyond
> what we were able to achieve with MySQL, which is very impressive.
> 
> However, one thing that escapes me is how we should organize different
> data center access.
> 
> The scenario is the following:
> 
> - We have data centers in North America, London, Tokyo and so on.
> - The relative cost of data centers is very different, e.g., TCO for
> one server in Tokyo is about the same as for 5 such servers in New
> York.
> - We want to have access to all the data from North America, hence we
> would run Hadoop/Pig queries from the New York/North America data
> center only.
> 
> The problem is this: we would like the historical data from Tokyo to
> stay in Tokyo and only be replicated to New York.  The data from London
> should stay in London and only be replicated to New York, and so on for
> all data centers.
> 
> Is this currently possible with Cassandra?  I believe we would need to
> run multiple clusters and migrate data manually from each data center to
> North America to achieve this.  Any suggestions would be welcome.
