Exactly right, Vaibhav.
vaibhav thapliyal wrote:
I think neither of these would contribute much to load balancing. HDFS
replication is mostly a safeguard against single points of failure in a
Hadoop cluster. However, data center replication would ensure the
availability of an Accumulo instance.
On 16 October 2016 at 21:02, Yamini Joshi <yamini.1...@gmail.com> wrote:
In other words, what helps in load balancing? HDFS replication or
Data center replication?
Best regards,
Yamini Joshi
On Sat, Oct 15, 2016 at 10:44 PM, Yamini Joshi <yamini.1...@gmail.com> wrote:
So HDFS is for durability while replication is for availability?
I'm assuming that the client is unaware of the replicated
instance and queries the DB with no knowledge of which
instance/table will return the result.
Best regards,
Yamini Joshi
On Thu, Oct 13, 2016 at 11:46 AM, Josh Elser <josh.el...@gmail.com> wrote:
I'm not familiar with MongoDB. Perhaps someone else can
confirm this for you.
Yamini Joshi wrote:
So, can I say that if I have a table split across nodes (i.e. num
tablets > 1) and HDFS replication in my system, it is sort of
equivalent to a sharded and replicated MongoDB architecture?
Best regards,
Yamini Joshi
On Thu, Oct 13, 2016 at 11:06 AM, Josh Elser <josh.el...@gmail.com> wrote:
The Accumulo (Data Center) Replication feature is for having multiple
active Accumulo clusters all containing the same data.
HDFS provides replication as a means for durability of the data it is
storing. The files that Accumulo creates on one HDFS instance are
replicated by HDFS. This does not help if your entire cluster becomes
unavailable. That is what the data center replication Accumulo feature
solves.
While both can be called "replication", they serve very different
purposes.
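To make the distinction above concrete, here is a minimal toy model in Python. It is only a sketch and does not use any Accumulo or HDFS API: the `Cluster` class, the `readable_anywhere` helper, and the node and block names are all hypothetical, invented for illustration. It assumes each cluster keeps several copies of a block on different datanodes (HDFS replication) and that a second, independent cluster holds the same data (data center replication).

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for one Accumulo instance on one HDFS instance (hypothetical)."""
    name: str
    up: bool = True
    # block id -> names of the datanodes holding a copy (models HDFS replication)
    blocks: dict = field(default_factory=dict)
    dead_nodes: set = field(default_factory=set)

    def can_read(self, block: str) -> bool:
        # If the whole cluster is down, no number of in-cluster copies helps.
        if not self.up:
            return False
        holders = self.blocks.get(block, set())
        # Readable as long as at least one copy sits on a live datanode.
        return bool(holders - self.dead_nodes)


def readable_anywhere(clusters, block: str) -> bool:
    """True if at least one cluster can still serve the block."""
    return any(c.can_read(block) for c in clusters)


# Two clusters holding the same data (models data center replication).
primary = Cluster("primary", blocks={"b1": {"dn1", "dn2", "dn3"}})
peer = Cluster("peer", blocks={"b1": {"dnA", "dnB", "dnC"}})

# Losing one datanode: the copies inside the same cluster cover it.
primary.dead_nodes.add("dn1")
node_loss_ok = primary.can_read("b1")

# Losing the entire primary cluster: only the peer cluster can serve the data.
primary.up = False
primary_ok = primary.can_read("b1")
any_ok = readable_anywhere([primary, peer], "b1")
```

In this sketch, in-cluster copies cover the lost datanode, while only the second cluster covers the loss of the whole primary. The model says nothing about how a client would find or switch to the peer cluster.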
Yamini Joshi wrote:
Hello
I was going through some Accumulo docs and found out about replication.
To enable replication, one needs to make some config settings as
described in
https://github.com/apache/accumulo/blob/master/docs/src/main/asciidoc/chapters/replication.txt.
I cannot seem to grasp the difference between this replication conf and
the replication on HDFS level. What exactly is the use case for
replication? Are the replicated instances visible to the clients?
Best regards,
Yamini Joshi