Hive is beginning to implement Region support where one metastore will
manage multiple filesystems and jobtrackers. When a query creates a
table it will then be copied to one or more datacenters. In addition,
the query planner will intelligently attempt to run queries in regions
only where all the
If you want to start an open source project for this I am sure that there are
others with the same problem who might be very willing to help out. :)
--Bobby Evans
On 4/19/12 4:31 PM, "Michael Segel" wrote:
I don't know of any open source solution in doing this...
And yeah, it's something one can't talk about ;-)
On Apr 19, 2012, at 4:28 PM, Robert Evans wrote:
Where I work we have done some things like this, but none of them are open
source, and I have not really been directly involved with the details of it. I
can guess about what it would take, but that is all it would be at this point.
--Bobby
On 4/17/12 5:46 PM, "Abhishek Pratap Singh" wrote:
Thanks Bobby, I'm looking for something like this. Now the question is
what the best strategy is for doing Hot/Hot or Hot/Warm.
I need to consider the CPU and network bandwidth, and also need to decide
from which layer this replication should start.
Regards,
Abhishek
On Mon, Apr 16, 2012 at 7:08 AM,
Hi Abhishek,
Manu is correct about High Availability within a single colo. I realize that
in some cases you have to have fail over between colos. I am not aware of any
turnkey solution for things like that, but generally what you want to do is to
run two clusters, one in each colo, either ho
Hi Abhishek,
1. Use multiple directories for *dfs.name.dir* & *dfs.data.dir* etc
   * Recommendation: write to *two local directories on different
     physical volumes*, and to an *NFS-mounted* directory
     – Data will be preserved even in the event of a total failure of the
       NameNode machines
* Recommendati
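
A minimal sketch of the first recommendation above, assuming the Hadoop 1.x property name dfs.name.dir set in hdfs-site.xml. The three paths are hypothetical placeholders (two local volumes plus one NFS mount), not values taken from this thread:

```xml
<!-- hdfs-site.xml fragment (assumed Hadoop 1.x naming); all paths are hypothetical -->
<property>
  <name>dfs.name.dir</name>
  <!-- Comma-separated list: the NameNode writes its metadata to every
       directory listed, so each entry holds a full copy. -->
  <value>/disk1/hdfs/name,/disk2/hdfs/name,/mnt/nfs/hdfs/name</value>
</property>
```

Because each listed directory holds a full copy of the NameNode metadata, the copy on the NFS mount should survive a total loss of the NameNode's local disks.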
Thanks Robert.
Is there a best practice or design that can address High Availability
to a certain extent?
~Abhishek
On Wed, Apr 11, 2012 at 12:32 PM, Robert Evans wrote:
No, it does not. Sorry.
On 4/11/12 1:44 PM, "Abhishek Pratap Singh" wrote:
Hi All,
Just wanted to know if Hadoop supports more than one data centre. This is
basically for DR purposes and High Availability, where if one centre goes
down the other can be brought up.
Regards,
Abhishek