Re: Hbase federated cluster for messages

2016-08-20 Thread Mikhail Antonov
Just out of curiosity, is there anything particular about your deployment
or use case that raised this specific concern about NameNode performance?

HDFS clusters with 80 datanodes would be considered medium-sized; there are
plenty of (much) bigger clusters out there in the field,
and HBase clusters with 80 nodes aren't very uncommon either.
Fine-tuning a cluster of this size for a specific workload
would certainly require some planning and work: setting a number of
parameters related to heap/memstore/block cache sizing,
GC settings, RPC scheduler settings, replication settings, and a bunch of
other things. But why the NameNode?

-Mikhail
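[As an illustration of the kinds of knobs involved: the property names below are real HBase settings, but the values are purely hypothetical placeholders for a read-heavy workload, not recommendations for this cluster.]

```python
# Illustrative sketch only: a few of the tuning areas mentioned above,
# expressed as hbase-site.xml property names with placeholder values.
# The names are real HBase properties; the values are hypothetical
# and entirely workload-dependent.
example_tuning = {
    # Fraction of RegionServer heap given to memstores (write path)
    "hbase.regionserver.global.memstore.size": 0.4,
    # Fraction of RegionServer heap given to the block cache (read path)
    "hfile.block.cache.size": 0.4,
    # Share of RPC call queues dedicated to reads vs. writes
    "hbase.ipc.server.callqueue.read.ratio": 0.6,  # ~60% reads in this thread's workload
}

for prop, value in sorted(example_tuning.items()):
    print(f"{prop} = {value}")
```

Note that memstore and block cache fractions compete for the same heap; HBase refuses to start if their sum leaves too little room for everything else.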


Re: Hbase federated cluster for messages

2016-08-20 Thread Alexandr Porunov
Thank you Dima

Best regards,
Alexandr


Re: Hbase federated cluster for messages

2016-08-20 Thread Dima Spivak
Yup.



-- 
-Dima


Re: Hbase federated cluster for messages

2016-08-20 Thread Alexandr Porunov
So, will it be OK if we have 80 datanodes (8 TB on each node) and only one
NameNode? Will it work for the messaging system? We will have 2x
replication, so there are 320 TB of data per year (640 TB with
replication), 13 R+W ops/sec, and each message is 100 bytes or 1024 bytes.
Is it possible to handle such a load with HBase?

Sincerely,
Alexandr
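[For what it's worth, the storage figures in this message roughly check out. A back-of-envelope sketch, assuming decimal terabytes and a 12-month year — the only inputs taken from the thread are the message counts and sizes:]

```python
# Back-of-envelope check of the capacity figures quoted in this thread:
# 15 billion 1024-byte messages plus 120 billion 100-byte messages per month.
TB = 10**12  # decimal terabyte

monthly_bytes = 15e9 * 1024 + 120e9 * 100   # raw bytes ingested per month
yearly_tb = monthly_bytes * 12 / TB          # raw TB per year
replicated_tb = yearly_tb * 2                # at 2x HDFS replication

print(f"raw per month:  {monthly_bytes / TB:.1f} TB")   # ~27.4 TB
print(f"raw per year:   {yearly_tb:.0f} TB")            # ~328 TB
print(f"2x replicated:  {replicated_tb:.0f} TB")        # ~657 TB
```

These land close to the 320 TB / 640 TB figures above; per-cell overhead (row keys, qualifiers, timestamps, HFile indexes) would push the real numbers somewhat higher.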



Re: Hbase federated cluster for messages

2016-08-19 Thread Dima Spivak
You can easily store that much data as long as you don't have small files,
which is typically why people turn to federation.

-Dima
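[One way to see why file count, not raw bytes, is the limit: the NameNode keeps an in-memory object per file and per block, commonly estimated at roughly 150 bytes each — a rule of thumb, not a measured figure. A hedged sketch comparing HBase-sized HFiles against a pathological small-file layout for the same 640 TB:]

```python
# Rough NameNode heap estimate. ~150 bytes of heap per file/block object
# is a common HDFS sizing rule of thumb (an assumption here, not a measurement).
BYTES_PER_OBJECT = 150
BLOCK_SIZE = 128 * 10**6          # default 128 MB HDFS block size
TOTAL = 640 * 10**12              # 640 TB of data, as discussed in this thread

def namenode_heap_gb(avg_file_size):
    """Estimate NameNode heap (GB) needed to track TOTAL bytes at a given file size."""
    files = TOTAL // avg_file_size
    blocks_per_file = max(1, -(-avg_file_size // BLOCK_SIZE))  # ceiling division
    objects = files * (1 + blocks_per_file)                    # one per file + one per block
    return objects * BYTES_PER_OBJECT / 10**9

big = namenode_heap_gb(10 * 10**9)   # 10 GB HFiles
small = namenode_heap_gb(10**6)      # 1 MB files
print(f"10 GB files: ~{big:.2f} GB heap; 1 MB files: ~{small:.0f} GB heap")
```

With HBase-sized HFiles the metadata footprint stays well under a gigabyte of heap; it is the many-small-files pattern that pushes a single NameNode toward the limits federation is meant to address.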



Re: Hbase federated cluster for messages

2016-08-19 Thread Alexandr Porunov
We are talking about Facebook. So, there are 25 TB per month: 15 billion
messages of 1024 bytes and 120 billion messages of 100 bytes per month.

I thought that they used only HBase to handle such a huge amount of data.
If they used their own implementation of HBase, then I have no questions.

Sincerely,
Alexandr



Re: Hbase federated cluster for messages

2016-08-19 Thread Dima Spivak
I'd +1 what Vladimir says. How much data (in TBs/PBs) and how many files
are we talking about here? I'd say that use cases that benefit from HBase
don't tend to hit the kind of HDFS file limits that federation seeks to
address.

-Dima



Re: Hbase federated cluster for messages

2016-08-19 Thread Vladimir Rodionov
FB has its own "federation". It is proprietary code, I presume.

-Vladimir




Re: Hbase federated cluster for messages

2016-08-19 Thread Alexandr Porunov
No, there isn't. But I want to figure out how to configure that type of
cluster in case there is a particular reason. How can Facebook handle
such a huge number of ops without federation? I don't think they just
have one namenode server and one standby namenode server. That isn't
possible; I am sure they use federation.



Re: Hbase federated cluster for messages

2016-08-19 Thread Vladimir Rodionov
>> I am not sure how to do it but I have to configure a federated cluster with
>> hbase to store a huge amount of messages (client to client) (40% writes, 60%
>> reads).

Any particular reason for a federated cluster? How huge is "huge," and
what is the message size?

-Vladimir



Re: Hbase federated cluster for messages

2016-08-19 Thread Alexandr Porunov
Hi Dima,

But isn't that a bottleneck then?
Wouldn't our throughput be limited by a single namenode server?

Sincerely,
Alexandr



Re: Hbase federated cluster for messages

2016-08-19 Thread Dima Spivak
As far as I know, HBase doesn't support spreading tables across namespaces;
you'd have to point it at one namenode at a time. I've heard of people
trying to run multiple HBase instances in order to get access to all their
HDFS data, but it doesn't tend to be much fun.

-Dima
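
To make "point it at one namenode at a time" concrete: HBase's
`hbase.rootdir` names a single filesystem URI in hbase-site.xml, so the
entire HBase instance lives under one namenode (or one HA nameservice).
A minimal sketch, with a hypothetical hostname:

```xml
<!-- hbase-site.xml: the whole HBase instance is rooted under ONE
     namenode; the hostname below is a made-up example. -->
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://nn1.example.com:8020/hbase</value>
</property>
<!-- With HDFS HA the value names one nameservice instead, e.g.
     hdfs://mycluster/hbase -- but it is still a single namespace. -->
```

There is no way to list several namenodes here, which is why a federated
namespace doesn't buy HBase anything by itself.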

On Fri, Aug 19, 2016 at 11:51 AM, Alexandr Porunov <
alexandr.poru...@gmail.com> wrote:

> Hello,
>
> I am not sure how to do it, but I have to configure a federated cluster with
> hbase to store a huge amount of messages (client to client) (40% writes, 60%
> reads). Does somebody have any ideas or examples of how to configure it?
>
> Of course we can configure hdfs in federated mode, but as far as I can tell
> it isn't suitable for hbase. If we want to save a message from client 1 to
> client 2 in the hbase cluster, then how does hbase know which namespace it
> has to save it in? Which namenode will be responsible for that message? How
> can we read client messages?
>
> Give me any ideas, please.
>
> Sincerely,
> Alexandr
>
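
For context on the namespace question quoted above (an illustration, not
something from the thread): HDFS federation routes by *path* through a
client-side ViewFs mount table, not by application keys, so the directory
layout — not the message — determines the namenode. And because
`hbase.rootdir` is a single path, all of HBase would land under one mount
point anyway. A hypothetical core-site.xml sketch (cluster and host names
are made-up examples):

```xml
<!-- core-site.xml: ViewFs mount table for a federated cluster. -->
<property>
  <name>fs.defaultFS</name>
  <value>viewfs://clusterX</value>
</property>
<!-- /hbase (and thus every HBase table and message) maps to nn1 -->
<property>
  <name>fs.viewfs.mounttable.clusterX.link./hbase</name>
  <value>hdfs://nn1.example.com:8020/hbase</value>
</property>
<!-- other data can be served by a second namenode -->
<property>
  <name>fs.viewfs.mounttable.clusterX.link./logs</name>
  <value>hdfs://nn2.example.com:8020/logs</value>
</property>
```

So per-message routing across namenodes never arises: either all of HBase
sits behind one namenode, or you run multiple HBase instances, as noted
above.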



-- 
-Dima