Okay, at what scale do you have experience?

-----Original Message-----
From: Dima Spivak [mailto:dimaspi...@apache.org]
Sent: Monday, July 17, 2017 7:40 PM
To: user@hbase.apache.org
Subject: Re: Hbase on docker container with persistent storage

No, not at the scale you're looking at.

On Mon, Jul 17, 2017 at 6:36 AM Udbhav Agarwal <udbhav.agar...@syncoms.com> wrote:

> Hi Dima,
> I have not been able to containerize HDFS so far. Do you have any reference
> that I can use to go ahead with that?
>
> Thanks,
> Udbhav
>
> -----Original Message-----
> From: Dima Spivak [mailto:dimaspi...@apache.org]
> Sent: Monday, July 17, 2017 6:37 PM
> To: user@hbase.apache.org
> Subject: Re: Hbase on docker container with persistent storage
>
> Hi Udbhav,
>
> How have you containerized HDFS to run on Docker across 80 hosts? The
> answer to that would guide how you might add HBase into the mix.
>
> On Mon, Jul 17, 2017 at 5:33 AM Udbhav Agarwal <udbhav.agar...@syncoms.com> wrote:
>
> > Hi Dima,
> > Hope you are doing well.
> > Using HBase on a single host is performant because right now I am not
> > dealing with terabytes of data. For now the data size is very small
> > (around 1 GB). I am using this setup to test my application.
> > As a next step I have to grow the data as well as the storage and check
> > performance, so I will need to use HBase deployed on 70-80 servers.
> > Now can you please let me know how I can containerize HBase so as to
> > be able to use HBase backed by HDFS on 70-80 host machines and not lose
> > data if the container itself dies for some reason?
> >
> > Thanks,
> > Udbhav
> >
> > From: Dima Spivak [mailto:dimaspi...@apache.org]
> > Sent: Friday, July 14, 2017 10:11 PM
> > To: Udbhav Agarwal <udbhav.agar...@syncoms.com>; user@hbase.apache.org
> > Cc: dimaspi...@apache.org
> > Subject: Re: Hbase on docker container with persistent storage
> >
> > If running HBase on a single host is performant enough for you, why
> > use HBase at all? How are you currently storing your data?
> >
> > On Fri, Jul 14, 2017 at 6:07 AM Udbhav Agarwal <udbhav.agar...@syncoms.com> wrote:
> > Additionally, can you please provide me some links which can guide
> > me in setting up such a system with volumes? Thank you.
> >
> > Udbhav
> > -----Original Message-----
> > From: Udbhav Agarwal [mailto:udbhav.agar...@syncoms.com]
> > Sent: Friday, July 14, 2017 6:31 PM
> > To: user@hbase.apache.org
> > Cc: dimaspi...@apache.org
> > Subject: RE: Hbase on docker container with persistent storage
> >
> > Thank you Dima for the response.
> > Let me reiterate what I want to achieve in my case. I am using HBase to
> > persist my big data (terabytes and petabytes) coming from various sources
> > through Spark Streaming and Kafka. Spark Streaming and Kafka are running
> > as separate microservices inside different and exclusive containers.
> > These containers communicate over an HTTP service protocol.
> > Currently I am using an HBase setup on 4 VMs on a single host machine.
> > I have a microservice inside a container to connect to this HBase.
> > This whole setup is functional and I am able to persist data into as
> > well as get data from HBase into Spark Streaming. My use case is
> > real-time ingestion into HBase as well as real-time queries from HBase.
> > Now I am planning to deploy HBase itself inside a container. I
> > want to know what the options for this are. In how many possible
> > ways can I achieve this? If I use container volumes, will they
> > be able to hold such an amount of data (TBs and PBs)? How will I set up
> > HDFS inside volumes? How can I use the power of a distributed file
> > system there? Is this the best way?
> >
> > Thanks,
> > Udbhav
> > -----Original Message-----
> > From: Dima Spivak [mailto:dimaspi...@apache.org]
> > Sent: Friday, July 14, 2017 3:44 AM
> > To: hbase-user <user@hbase.apache.org>
> > Subject: Re: Hbase on docker container with persistent storage
> >
> > Udbhav,
> >
> > Volumes are Docker's way of having folders or files from the host
> > machine bypass the union filesystem used within a Docker container.
> > As such, if a container with a volume is killed, the data from that
> > volume should remain there. That said, if whatever caused the
> > container to die affects the filesystem within the container, it
> > would also affect the data on the host.
> >
> > Running HBase in the manner you've described is not typical in
> > anything resembling a production environment, but if you explain
> > more about your use case, we could provide more advice. That said,
> > how you'd handle data locality and, in particular, multi-host
> > deployments of HBase in this manner is more of a concern for me than
> > volume data corruption. What kind of scale do you need to support?
> > What kind of performance do you expect?
> >
> > -Dima
> >
> > On Thu, Jul 13, 2017 at 12:18 AM, Samir Ahmic <ahmic.sa...@gmail.com> wrote:
> >
> > > Hi Udbhav,
> > > Great work on HBase Docker deployment was done in
> > > https://issues.apache.org/jira/browse/HBASE-12721 and you may start
> > > your journey from there. As for the rest of your questions, maybe
> > > there are some folks here who were doing similar testing and may
> > > give you more info.
> > >
> > > Regards
> > > Samir
> > >
> > > On Thu, Jul 13, 2017 at 7:57 AM, Udbhav Agarwal <udbhav.agar...@syncoms.com> wrote:
> > >
> > > > Hi All,
> > > > I need to run HBase 0.98 backed by HDFS in a Docker container and
> > > > want to prevent data loss if the container restarts.
> > > > As per my understanding of Docker containers, they work in such a
> > > > way that if a container is stopped or killed, all information
> > > > related to it is lost. This implies that if I am running HBase in a
> > > > container and have stored some data in some tables, and the
> > > > container is then stopped, the data will be lost. I need a way to
> > > > prevent this data loss.
> > > > I have gone through the concept of volumes in Docker.
> > > > Is it possible to prevent this data loss with this approach? What
> > > > if the volume gets corrupted? Is there any instance of the volume
> > > > running there which can be stopped and can cause data loss?
> > > > Is there a possibility that I can use HDFS running on some external
> > > > host outside Docker with my HBase running inside Docker? Is such a
> > > > scenario possible? If yes, how?
> > > > Thank you in advance.
> > > >
> > > >
> > > > Thanks,
> > > > Udbhav Agarwal
> > > >
> > >
> >
> > --
> > -Dima
>
> --
> -Dima

--
-Dima
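
For readers following the volume discussion above: Dima's point that data in a volume lives on the host and should remain after the container is killed can be illustrated with a small sketch. It only builds the `docker run` command line using Docker's `-v HOST_PATH:CONTAINER_PATH` bind-mount flag. The image name, container name, and both paths are hypothetical placeholders, not anything used in this thread.

```python
import shlex


def docker_run_with_volume(image, host_dir, container_dir, name):
    """Build a `docker run` argument list that bind-mounts host_dir into the container.

    Data written under container_dir is stored in host_dir on the host, so it
    is still there after the container is stopped or removed.
    """
    return [
        "docker", "run", "-d",
        "--name", name,
        # -v HOST_PATH:CONTAINER_PATH is Docker's bind-mount syntax.
        "-v", f"{host_dir}:{container_dir}",
        image,
    ]


# Hypothetical values: the image name, container name, and paths are placeholders.
cmd = docker_run_with_volume(
    image="example/hbase:0.98",
    host_dir="/data/hbase",
    container_dir="/hbase-data",
    name="hbase-test",
)
print(shlex.join(cmd))
```

This covers only the single-host persistence question. It does nothing for the data locality and multi-host concerns raised in the thread, and a process that corrupts files under the mounted path corrupts them on the host too.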