Re: AntsDB 18.05.27 is released with TPC-C benchmark and LGPL

2018-05-28 Thread Dima Spivak
Hey Water,

Just as an FYI, LGPL is still considered incompatible with the Apache
license and so will generally be a non-starter for organizations who pay
attention to category X dependencies (see
https://www.apache.org/legal/resolved.html#category-x).

On Mon, May 28, 2018 at 12:22 AM Water Guo  wrote:

> In case people are wondering what AntsDB is...
>
> It is a database virtualization software that brings MySQL compatibility
> to HBase.
>
> > On May 28, 2018, at 3:19 AM, Water Guo  wrote:
> >
> > Dear HBase community:
> >
> > A new version of AntsDB has been released. To address some of the
> feedback from the community, the source code is now published under the LGPL
> license. It is much less restrictive than the previous AGPL license. I hope
> it will ease concerns about making derivative works.
> >
> > I also conducted a TPC-C benchmark using BenchmarkSQL 4.1.1 from the
> PostgreSQL community. It shows that AntsDB can handle row locks and
> transaction commit/rollback very efficiently. The result is published at
> http://www.antsdb.com/?p=207.
> >
> > As always your feedback is welcome and please follow the project on
> GitHub at https://github.com/waterguo/antsdb
> >
> > ~water
>
> --
-Dima


Re: AntsDB is released with MySQL compatibility for HBase

2018-05-09 Thread Dima Spivak
LGPL is not compatible with ASL, either. See: https://www.apache.org/legal/resolved.html#category-x

-Dima

On Wed, May 9, 2018 at 10:59 AM, Sanel Zukan  wrote:

> What about Connector/J from MariaDB? It is LGPL and (I think) that
> should make it easier to mix with Apache license.
>
> Best,
> Sanel
>
> Water Guo  writes:
> > I am not an expert on software licenses. But my gut feeling is I
> can’t claim Apache if I am using code from the MySQL JDBC driver (GPL), can I?
> >
> >> On May 8, 2018, at 11:49 AM, Stack  wrote:
> >>
> >> On Mon, May 7, 2018 at 11:58 AM, Water Guo 
> wrote:
> >>
> >>> No, it is not. AntsDB has no dependency on MySQL code. It is written in
> >>> Java and uses a couple of open source libraries.
> >>>
> >>>
> >> Sorry. There seems to be a misunderstanding. I was just asking why
> >> antsdb has a GPL license rather than, say, an Apache one:
> >> https://github.com/waterguo/antsdb/blob/master/LICENSE.txt
> >>
> >> Thanks,
> >> S
> >>
> >>
>  Sounds great Water. Lets take it for a spin. Quick question, why the
> "GNU
>  Affero General Public License, version 3" Is it up from mysql? Thanks,
>  S
> >>>
> >>> On Mon, May 7, 2018 at 10:09 AM, Water Guo 
> wrote:
> >>>
>  Dear HBase Community,
> 
>  I’d like to take this opportunity to introduce my open source project
>  AntsDB. It is a database virtualization software that brings MySQL
>  compatibility to HBase. It means you can use any MySQL bindings such
> as
>  JDBC, ODBC, PHP, Perl to manipulate data in HBase. It supports most
> MySQL
>  DDLs and all DMLs, transaction control, table locks, row locks etc.
> Up to
>  date applications such as MySQL console, MySQL command lines,
> >>> BenchmarkSQL,
>  MediaWiki, SonarQube, DBeaver, SquirrelSQL and many others can run
> >>> directly
>  on HBase using AntsDB layer. The project is hosted at
>  https://github.com/waterguo/antsdb.
> 
>  AntsDB is designed to support high concurrency, low latency
> applications.
>  It uses local storage as cache so it can further reduce the latency of
>  HBase. We have benchmarked AntsDB using YCSB. The result is at
>  http://www.antsdb.com/?p=171.
> 
>  People always ask me how it is different from Phoenix. While Phoenix
> is
>  building a powerful SQL layer for HBase, we want to focus on backward
>  compatibility. We want applications built for MySQL to be usable
> directly on HBase, and people who are familiar with traditional
> >>> relational
>  databases to be able to adopt the HBase/Hadoop stack with ease.
>  database can adopt HBase/Hadoop stack with ease.
> 
>  I’d be very glad if you find the project useful and your feedback
> is
>  very welcome.
> 
>  Thanks
>  -water
> 
> >>>
> >>>
>


Re: Hbase master doesnt start from commandline

2018-03-19 Thread Dima Spivak
Hey Muni,

This is probably a better question for Cloudera support. Especially if you
happen to be using Cloudera Manager, the normal command line functionality
may not apply.


-Dima

On Mon, Mar 19, 2018 at 10:41 AM, Muni Adusumalli  wrote:

> Hi,
>
> We have a cloudera cluster running in production.
> One of the hosts went down as it ran out of disk space when running hbase
> backups.
> After that the master is marked as Busy in Cloudera.
> I can't restart the master using the command line.
> All the region servers are up, but the master is missing.
>
> Any inputs on what to look for?
>
> Regards,
> RekDev
>


Re: “Could not find or load main class” when running with systemctl

2017-08-31 Thread Dima Spivak
I'd think journalctl would have more logs on systemd stuff, no? Can you try
running that against your service (journalctl -u, if memory serves me).
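
For example, something along these lines should dump whatever systemd
captured for the unit (the name hbase.service here is just an assumption
based on the unit file quoted later in this thread; the hbasemaster[...]
lines in your output suggest the actual unit name may differ):

  journalctl -u hbase.service --no-pager   # everything for the unit
  journalctl -u hbase.service -b -e        # current boot only, jump to the end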

My guess from the capitalized "Cannot" is that it's running some command
which is failing and then trying to pass that failing arg's output into
the start script.

-Dima

On Thu, Aug 31, 2017 at 12:00 PM, Stack  wrote:

> HBase has its own set of logs. Are you unable to find any? Or have
> systemctl direct stdout/stderr somewhere that persists past startup.
> S
>
> On Thu, Aug 31, 2017 at 11:02 AM, Alexandr Porunov <
> alexandr.poru...@gmail.com> wrote:
>
> > There is not much information:
> >
> > Aug 31 20:48:57 jblur.com hbasemaster[12138]: Java HotSpot(TM) 64-Bit
> > Server VM warning: ignoring option PermSize=128m; support was removed in
> > 8.0
> > Aug 31 20:48:57 jblur.com hbasemaster[12138]: Java HotSpot(TM) 64-Bit
> > Server VM warning: ignoring option MaxPermSize=128m; support was removed
> in
> > 8.0
> > Aug 31 20:48:57 jblur.com hbasemaster[12138]: Java HotSpot(TM) 64-Bit
> > Server VM warning: ignoring option PermSize=128m; support was removed in
> > 8.0
> > Aug 31 20:48:57 jblur.com hbasemaster[12138]: Java HotSpot(TM) 64-Bit
> > Server VM warning: ignoring option MaxPermSize=128m; support was removed
> in
> > 8.0
> > Aug 31 20:48:57 jblur.com hbasemaster[12138]: Error: Could not find or
> > load
> > main class Cannot
> >
> > Log files show nothing interesting:
> >
> > core file size  (blocks, -c) unlimited
> > data seg size   (kbytes, -d) unlimited
> > scheduling priority (-e) 0
> > file size   (blocks, -f) unlimited
> > pending signals (-i) 160324
> > max locked memory   (kbytes, -l) 64
> > max memory size (kbytes, -m) unlimited
> > open files  (-n) 65536
> > pipe size(512 bytes, -p) 8
> > POSIX message queues (bytes, -q) 819200
> > real-time priority  (-r) 0
> > stack size  (kbytes, -s) 8192
> > cpu time   (seconds, -t) unlimited
> > max user processes  (-u) 65536
> > virtual memory  (kbytes, -v) unlimited
> > file locks  (-x) unlimited
> > Thu Aug 31 20:48:43 EEST 2017 Starting master on jblur.com
> > core file size  (blocks, -c) unlimited
> > data seg size   (kbytes, -d) unlimited
> > scheduling priority (-e) 0
> > file size   (blocks, -f) unlimited
> > pending signals (-i) 160324
> > max locked memory   (kbytes, -l) 64
> > max memory size (kbytes, -m) unlimited
> > open files  (-n) 65536
> > pipe size(512 bytes, -p) 8
> > POSIX message queues (bytes, -q) 819200
> > real-time priority  (-r) 0
> > stack size  (kbytes, -s) 8192
> > cpu time   (seconds, -t) unlimited
> > max user processes  (-u) 65536
> > virtual memory  (kbytes, -v) unlimited
> > file locks  (-x) unlimited
> > Thu Aug 31 20:48:56 EEST 2017 Starting master on jblur.com
> > core file size  (blocks, -c) unlimited
> > data seg size   (kbytes, -d) unlimited
> > scheduling priority (-e) 0
> > file size   (blocks, -f) unlimited
> > pending signals (-i) 160324
> > max locked memory   (kbytes, -l) 64
> > max memory size (kbytes, -m) unlimited
> > open files  (-n) 65536
> > pipe size(512 bytes, -p) 8
> > POSIX message queues (bytes, -q) 819200
> > real-time priority  (-r) 0
> > stack size  (kbytes, -s) 8192
> > cpu time   (seconds, -t) unlimited
> > max user processes  (-u) 65536
> > virtual memory  (kbytes, -v) unlimited
> > file locks  (-x) unlimited
> >
> > Do you have any ideas?
> >
> > On Thu, Aug 31, 2017 at 7:16 PM, Stack  wrote:
> >
> > > On Wed, Aug 30, 2017 at 3:14 AM, Alexandr Porunov <
> > > alexandr.poru...@gmail.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > I am trying to run hbase 1.3.1 with systemctl on fedora 26 but
> without
> > > > success.
> > > >
> > > > If I execute `/usr/lib/hbase/bin/start-hbase.sh` it starts just
> fine.
> > > > But I need to write systemctl wrapper around hbase.
> > > > I don't understand what I am doing wrong. I am using the hdfs user to
> start
> > > > hbase in both cases.
> > > >
> > > > Here is my /etc/systemd/system/hbase.service :
> > > >
> > > > [Unit]
> > > > Description=Hbase Service
> > > >
> > > > [Service]
> > > > User=hdfs
> > > > Group=hadoop
> > > > Type=forking
> > > > ExecStart=/usr/lib/hbase/bin/start-hbase.sh
> > > > ExecStop=/usr/lib/hbase/bin/stop-hbase.sh
> > > > TimeoutStartSec=2min
> > > > Restart=on-failure
> > > >
> > > > [Install]
> > > > WantedBy=multi-user.target
> > > >
> > > > After `systemctl start 

Re: [ANNOUNCE] New HBase committer Mike Drob

2017-08-01 Thread Dima Spivak
Yay Mike!


-Dima

On Tue, Aug 1, 2017 at 9:26 AM, Jonathan Hsieh  wrote:

> Congrats Mike!
>
> On Tue, Aug 1, 2017 at 8:38 AM, Josh Elser  wrote:
>
> > On behalf of the Apache HBase PMC, I'm pleased to announce that Mike Drob
> > has accepted the PMC's invitation to become a committer.
> >
> > Mike has been doing some great things lately in the project and this is a
> > simple way that we can express our thanks. As my boss likes to tell me:
> the
> > reward for a job well-done is more work to do! We're all looking forward
> to
> > your continued involvement :)
> >
> > Please join me in congratulating Mike!
> >
> > - Josh
> >
>
>
>
> --
> // Jonathan Hsieh (shay)
> // HBase Team, Engineering Manager, Cloudera
> // j...@cloudera.com // @jmhsieh
>


Re: distributing new regions immediately

2017-07-27 Thread Dima Spivak
Presplitting tables is typically how this is addressed in production cases.
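
For instance, in the HBase shell you can create the table with split points
up front (the split keys below are just placeholders; pick ones that match
your rowkey distribution):

  create 'mytable', 'cf', SPLITS => ['a', 'f', 'k', 'p', 'u']

or let one of the built-in split algorithms generate them for you:

  hbase org.apache.hadoop.hbase.util.RegionSplitter mytable HexStringSplit -c 10 -f cf

Either way the table starts life already spread across the cluster instead of
waiting for organic splits to catch up with the ingest.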

On Thu, Jul 27, 2017 at 12:17 PM jeff saremi  wrote:

> We haven't done enough testing for me to say this with certainty but as we
> insert data and new regions get created, it could be a while before those
> regions are distributed. As such, if the data injection continues, the
> load on the region server becomes overwhelming.
>
> Is there a way to expedite the distribution of regions among available
> region servers?
>
> thanks
>
> --
-Dima


Re: Hbase on docker container with persistent storage

2017-07-19 Thread Dima Spivak
I've run HDFS/HBase in Docker containers across a handful of hosts while
working on changes to the clusterdock project [1]. More often, though, I've
worked with multiple Docker containers on a single machine (albeit with
lots of storage) to test the components.

1. https://github.com/clusterdock/

-Dima

On Tue, Jul 18, 2017 at 9:52 PM, Udbhav Agarwal <udbhav.agar...@syncoms.com>
wrote:

> Okay, what scale do you have experience with?
>
> -Original Message-----
> From: Dima Spivak [mailto:dimaspi...@apache.org]
> Sent: Monday, July 17, 2017 7:40 PM
> To: user@hbase.apache.org
> Subject: Re: Hbase on docker container with persistent storage
>
> No, not at the scale you're looking at.
>
> On Mon, Jul 17, 2017 at 6:36 AM Udbhav Agarwal <udbhav.agar...@syncoms.com
> >
> wrote:
>
> > Hi Dima,
> > I have been unable to containerize HDFS till now. Do you have any reference
> > which I can use to go ahead with that?
> >
> > Thanks,
> > Udbhav
> >
> > -Original Message-
> > From: Dima Spivak [mailto:dimaspi...@apache.org]
> > Sent: Monday, July 17, 2017 6:37 PM
> > To: user@hbase.apache.org
> > Subject: Re: Hbase on docker container with persistent storage
> >
> > Hi Udbhav,
> >
> > How have you containerized HDFS to run on Docker across 80 hosts? The
> > answer to that would guide how you might add HBase into the mix.
> >
> > On Mon, Jul 17, 2017 at 5:33 AM Udbhav Agarwal
> > <udbhav.agar...@syncoms.com
> > >
> > wrote:
> >
> > > Hi Dima,
> > > Hope you are doing well.
> > > Using hbase on a single host is performant because now I am not
> > > dealing with Terabytes of data. For now data size is very
> > > less.(around
> > > 1 gb). This setup I am using to test my application.
> > >As a next step I have to grow the data as well as
> > > storage and check performance. So I will need to use hbase deployed
> > > on
> > > 70-80 servers.
> > >Now can you please let me know how can I containerize
> > > hbase so as to be able to use hbase backed by hdfs using 70-80 host
> > > machines and not loose data if the container itself dies due to some
> > reason?
> > >
> > > Thanks,
> > > Udbhav
> > >
> > > From: Dima Spivak [mailto:dimaspi...@apache.org]
> > > Sent: Friday, July 14, 2017 10:11 PM
> > > To: Udbhav Agarwal <udbhav.agar...@syncoms.com>;
> > > user@hbase.apache.org
> > > Cc: dimaspi...@apache.org
> > > Subject: Re: Hbase on docker container with persistent storage
> > >
> > > If running HBase on a single host is performant enough for you, why
> > > use HBase at all? How are you currently storing your data?
> > >
> > > On Fri, Jul 14, 2017 at 6:07 AM Udbhav Agarwal
> > > <udbhav.agar...@syncoms.com <mailto:udbhav.agar...@syncoms.com>>
> wrote:
> > > Additionally, can you please provide me some links which can guide
> > > me to setup up such system with volumes ? Thank you.
> > >
> > > Udbhav
> > > -Original Message-
> > > From: Udbhav Agarwal [mailto:udbhav.agar...@syncoms.com]
> > > Sent: Friday, July 14, 2017 6:31 PM
> > > To: user@hbase.apache.org<mailto:user@hbase.apache.org>
> > > Cc: dimaspi...@apache.org<mailto:dimaspi...@apache.org>
> > > Subject: RE: Hbase on docker container with persistent storage
> > >
> > > Thank you Dima for the response.
> > > Let me reiterate what I want to achieve in my case. I am
> > > using hbase to persist my bigdata(Terabytes and petabytes) coming
> > > from various sources through spark streaming and kafka.  Spark
> > > streaming and kafka are running as separate microservices inside
> > > different and
> > excusive containers.
> > > These containers are communicating with http service protocol.
> > > Currently I am using hbase setup on 4 VMs on a single host machine.
> > > I have a microservice inside a container to connect to this hbase.
> > > This whole setup is functional and I am able to persist data into as
> > > well as get data from hbase into spark streaming. My use case is of
> > > real time ingestion into hbase as well as real time query from hbase.
> > > Now I am planning to deploy hbase itself inside container. I
> > > want to know what are the options for this. In how many possible
> > > ways I can achieve this ? If I 

Re: Hbase on docker container with persistent storage

2017-07-17 Thread Dima Spivak
No, not at the scale you're looking at.

On Mon, Jul 17, 2017 at 6:36 AM Udbhav Agarwal <udbhav.agar...@syncoms.com>
wrote:

> Hi Dima,
> I have been unable to containerize HDFS till now. Do you have any reference which
> I can use to go ahead with that?
>
> Thanks,
> Udbhav
>
> -Original Message-
> From: Dima Spivak [mailto:dimaspi...@apache.org]
> Sent: Monday, July 17, 2017 6:37 PM
> To: user@hbase.apache.org
> Subject: Re: Hbase on docker container with persistent storage
>
> Hi Udbhav,
>
> How have you containerized HDFS to run on Docker across 80 hosts? The
> answer to that would guide how you might add HBase into the mix.
>
> On Mon, Jul 17, 2017 at 5:33 AM Udbhav Agarwal <udbhav.agar...@syncoms.com
> >
> wrote:
>
> > Hi Dima,
> > Hope you are doing well.
> > Using hbase on a single host is performant because now I am not
> > dealing with Terabytes of data. For now data size is very less.(around
> > 1 gb). This setup I am using to test my application.
> >As a next step I have to grow the data as well as
> > storage and check performance. So I will need to use hbase deployed on
> > 70-80 servers.
> >Now can you please let me know how can I containerize
> > hbase so as to be able to use hbase backed by hdfs using 70-80 host
> > machines and not loose data if the container itself dies due to some
> reason?
> >
> > Thanks,
> > Udbhav
> >
> > From: Dima Spivak [mailto:dimaspi...@apache.org]
> > Sent: Friday, July 14, 2017 10:11 PM
> > To: Udbhav Agarwal <udbhav.agar...@syncoms.com>; user@hbase.apache.org
> > Cc: dimaspi...@apache.org
> > Subject: Re: Hbase on docker container with persistent storage
> >
> > If running HBase on a single host is performant enough for you, why
> > use HBase at all? How are you currently storing your data?
> >
> > On Fri, Jul 14, 2017 at 6:07 AM Udbhav Agarwal
> > <udbhav.agar...@syncoms.com <mailto:udbhav.agar...@syncoms.com>> wrote:
> > Additionally, can you please provide me some links which can guide me
> > to setup up such system with volumes ? Thank you.
> >
> > Udbhav
> > -Original Message-
> > From: Udbhav Agarwal [mailto:udbhav.agar...@syncoms.com]
> > Sent: Friday, July 14, 2017 6:31 PM
> > To: user@hbase.apache.org<mailto:user@hbase.apache.org>
> > Cc: dimaspi...@apache.org<mailto:dimaspi...@apache.org>
> > Subject: RE: Hbase on docker container with persistent storage
> >
> > Thank you Dima for the response.
> > Let me reiterate what I want to achieve in my case. I am using
> > hbase to persist my bigdata(Terabytes and petabytes) coming from
> > various sources through spark streaming and kafka.  Spark streaming
> > and kafka are running as separate microservices inside different and
> excusive containers.
> > These containers are communicating with http service protocol.
> > Currently I am using hbase setup on 4 VMs on a single host machine. I
> > have a microservice inside a container to connect to this hbase. This
> > whole setup is functional and I am able to persist data into as well
> > as get data from hbase into spark streaming. My use case is of real
> > time ingestion into hbase as well as real time query from hbase.
> > Now I am planning to deploy hbase itself inside container. I
> > want to know what are the options for this. In how many possible ways
> > I can achieve this ? If I use volumes of container, will they be able
> > to hold such amount of data (TBs & PBs) ? How will I setup up hdfs
> inside volumes ?
> > how can I use the power of distributed file system there? Is this the
> > best way ?
> >
> >
> > Thanks,
> > Udbhav
> > -Original Message-
> > From: Dima Spivak [mailto:dimaspi...@apache.org]
> > Sent: Friday, July 14, 2017 3:44 AM
> > To: hbase-user <user@hbase.apache.org<mailto:user@hbase.apache.org>>
> > Subject: Re: Hbase on docker container with persistent storage
> >
> > Udbhav,
> >
> > Volumes are Docker's way of having folders or files from the host
> > machine bypass the union filesystem used within a Docker container. As
> > such, if a container with a volume is killed, the data from that
> > volume should remain there. That said, if whatever caused the
> > container to die affects the filesystem within the container, it would
> also affect the data on the host.
> >
> > Running HBase in the manner you've described is not typical in
>

Re: Hbase on docker container with persistent storage

2017-07-17 Thread Dima Spivak
Hi Udbhav,

How have you containerized HDFS to run on Docker across 80 hosts? The
answer to that would guide how you might add HBase into the mix.

On Mon, Jul 17, 2017 at 5:33 AM Udbhav Agarwal <udbhav.agar...@syncoms.com>
wrote:

> Hi Dima,
> Hope you are doing well.
> Using hbase on a single host is performant because now I am not dealing
> with Terabytes of data. For now data size is very less.(around 1 gb). This
> setup I am using to test my application.
>As a next step I have to grow the data as well as storage
> and check performance. So I will need to use hbase deployed on 70-80
> servers.
>Now can you please let me know how can I containerize hbase
> so as to be able to use hbase backed by hdfs using 70-80 host machines and
> not loose data if the container itself dies due to some reason?
>
> Thanks,
> Udbhav
>
> From: Dima Spivak [mailto:dimaspi...@apache.org]
> Sent: Friday, July 14, 2017 10:11 PM
> To: Udbhav Agarwal <udbhav.agar...@syncoms.com>; user@hbase.apache.org
> Cc: dimaspi...@apache.org
> Subject: Re: Hbase on docker container with persistent storage
>
> If running HBase on a single host is performant enough for you, why use
> HBase at all? How are you currently storing your data?
>
> On Fri, Jul 14, 2017 at 6:07 AM Udbhav Agarwal <udbhav.agar...@syncoms.com
> <mailto:udbhav.agar...@syncoms.com>> wrote:
> Additionally, can you please provide me some links which can guide me to
> setup up such system with volumes ? Thank you.
>
> Udbhav
> -Original Message-
> From: Udbhav Agarwal [mailto:udbhav.agar...@syncoms.com]
> Sent: Friday, July 14, 2017 6:31 PM
> To: user@hbase.apache.org<mailto:user@hbase.apache.org>
> Cc: dimaspi...@apache.org<mailto:dimaspi...@apache.org>
> Subject: RE: Hbase on docker container with persistent storage
>
> Thank you Dima for the response.
> Let me reiterate what I want to achieve in my case. I am using
> hbase to persist my bigdata(Terabytes and petabytes) coming from various
> sources through spark streaming and kafka.  Spark streaming and kafka are
> running as separate microservices inside different and excusive containers.
> These containers are communicating with http service protocol. Currently I
> am using hbase setup on 4 VMs on a single host machine. I have a
> microservice inside a container to connect to this hbase. This whole setup
> is functional and I am able to persist data into as well as get data from
> hbase into spark streaming. My use case is of real time ingestion into
> hbase as well as real time query from hbase.
> Now I am planning to deploy hbase itself inside container. I want
> to know what are the options for this. In how many possible ways I can
> achieve this ? If I use volumes of container, will they be able to hold
> such amount of data (TBs & PBs) ? How will I setup up hdfs inside volumes ?
> how can I use the power of distributed file system there? Is this the best
> way ?
>
>
> Thanks,
> Udbhav
> -Original Message-
> From: Dima Spivak [mailto:dimaspi...@apache.org]
> Sent: Friday, July 14, 2017 3:44 AM
> To: hbase-user <user@hbase.apache.org<mailto:user@hbase.apache.org>>
> Subject: Re: Hbase on docker container with persistent storage
>
> Udbhav,
>
> Volumes are Docker's way of having folders or files from the host machine
> bypass the union filesystem used within a Docker container. As such, if a
> container with a volume is killed, the data from that volume should remain
> there. That said, if whatever caused the container to die affects the
> filesystem within the container, it would also affect the data on the host.
>
> Running HBase in the manner you've described is not typical in anything
> resembling a production environment, but if you explain more about your use
> case, we could provide more advice. That said, how you'd handle data
> locality and, in particular, multi-host deployments of HBase in this manner
> is more of a concern for me than volume data corruption. What kind of scale
> do you need to support? What kind of performance do you expect?
>
> -Dima
>
> On Thu, Jul 13, 2017 at 12:18 AM, Samir Ahmic <ahmic.sa...@gmail.com
> <mailto:ahmic.sa...@gmail.com>> wrote:
>
> > Hi Udbhav,
> > Great work on hbase docker deployment was done in
> > https://issues.apache.org/jira/browse/HBASE-12721 you may start your
> > journey from there.  As for rest of your questions maybe there are
> > some folks here that were doing similar testing and may give you more
> info.
> >
> > Regards
> > Samir
> >
> > On Thu, Jul 13, 2017 a

Re: Hbase on docker container with persistent storage

2017-07-14 Thread Dima Spivak
If running HBase on a single host is performant enough for you, why use
HBase at all? How are you currently storing your data?

On Fri, Jul 14, 2017 at 6:07 AM Udbhav Agarwal <udbhav.agar...@syncoms.com>
wrote:

> Additionally, can you please provide me some links which can guide me to
> setup up such system with volumes ? Thank you.
>
> Udbhav
> -Original Message-
> From: Udbhav Agarwal [mailto:udbhav.agar...@syncoms.com]
> Sent: Friday, July 14, 2017 6:31 PM
> To: user@hbase.apache.org
> Cc: dimaspi...@apache.org
> Subject: RE: Hbase on docker container with persistent storage
>
> Thank you Dima for the response.
> Let me reiterate what I want to achieve in my case. I am using
> hbase to persist my bigdata(Terabytes and petabytes) coming from various
> sources through spark streaming and kafka.  Spark streaming and kafka are
> running as separate microservices inside different and excusive containers.
> These containers are communicating with http service protocol. Currently I
> am using hbase setup on 4 VMs on a single host machine. I have a
> microservice inside a container to connect to this hbase. This whole setup
> is functional and I am able to persist data into as well as get data from
> hbase into spark streaming. My use case is of real time ingestion into
> hbase as well as real time query from hbase.
> Now I am planning to deploy hbase itself inside container. I want
> to know what are the options for this. In how many possible ways I can
> achieve this ? If I use volumes of container, will they be able to hold
> such amount of data (TBs & PBs) ? How will I setup up hdfs inside volumes ?
> how can I use the power of distributed file system there? Is this the best
> way ?
>
>
> Thanks,
> Udbhav
> -Original Message-
> From: Dima Spivak [mailto:dimaspi...@apache.org]
> Sent: Friday, July 14, 2017 3:44 AM
> To: hbase-user <user@hbase.apache.org>
> Subject: Re: Hbase on docker container with persistent storage
>
> Udbhav,
>
> Volumes are Docker's way of having folders or files from the host machine
> bypass the union filesystem used within a Docker container. As such, if a
> container with a volume is killed, the data from that volume should remain
> there. That said, if whatever caused the container to die affects the
> filesystem within the container, it would also affect the data on the host.
>
> Running HBase in the manner you've described is not typical in anything
> resembling a production environment, but if you explain more about your use
> case, we could provide more advice. That said, how you'd handle data
> locality and, in particular, multi-host deployments of HBase in this manner
> is more of a concern for me than volume data corruption. What kind of scale
> do you need to support? What kind of performance do you expect?
>
> -Dima
>
> On Thu, Jul 13, 2017 at 12:18 AM, Samir Ahmic <ahmic.sa...@gmail.com>
> wrote:
>
> > Hi Udbhav,
> > Great work on hbase docker deployment was done in
> > https://issues.apache.org/jira/browse/HBASE-12721 you may start your
> > journey from there.  As for rest of your questions maybe there are
> > some folks here that were doing similar testing and may give you more
> info.
> >
> > Regards
> > Samir
> >
> > On Thu, Jul 13, 2017 at 7:57 AM, Udbhav Agarwal <
> > udbhav.agar...@syncoms.com>
> > wrote:
> >
> > > Hi All,
> > > I need to run hbase 0.98 backed by hdfs on docker container and want
> > > to stop the data lost if the container restarts.
> > >As per my understanding of docker containers, they
> > > work in a way that if any of the container is stopped/killed , every
> > > information related to it gets killed. It implies if I am running
> > > hbase in a
> > container
> > > and I have stored some data in some tables and consequently if the
> > > container is stopped then the data will be lost. I need a way in
> > > which I can stop this data loss.
> > >I have gone through concept of volume in docker. Is
> > > it possible to stop this data loss with this approach? What if
> > > volume gets corrupted? Is there any instance of volume running there
> > > which can be stopped and can cause data loss ?
> > >Is there a possibility that I can use hdfs running at
> > > some external host outside the docker and my hbase running inside
> > > docker ? Is such scenario possible ? If yes, How ?
> > >Thank you in advance.
> > >
> > >
> > > Thanks,
> > > Udbhav Agarwal
> > >
> > >
> >
>
-- 
-Dima


Re: Hbase on docker container with persistent storage

2017-07-13 Thread Dima Spivak
Udbhav,

Volumes are Docker's way of having folders or files from the host machine
bypass the union filesystem used within a Docker container. As such, if a
container with a volume is killed, the data from that volume should remain
there. That said, if whatever caused the container to die affects the
filesystem within the container, it would also affect the data on the host.

Running HBase in the manner you've described is not typical in anything
resembling a production environment, but if you explain more about your use
case, we could provide more advice. That said, how you'd handle data
locality and, in particular, multi-host deployments of HBase in this manner
is more of a concern for me than volume data corruption. What kind of scale
do you need to support? What kind of performance do you expect?
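
To make the volume idea concrete, a bind mount looks something like this
(the paths and image name here are only placeholders):

  docker run -d --name regionserver \
    -v /data/hbase:/hbase-data \
    my-hbase-image

Anything written under /hbase-data inside the container ends up in
/data/hbase on the host and survives the container being removed, though, as
noted above, it won't survive damage to the host filesystem itself.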

-Dima

On Thu, Jul 13, 2017 at 12:18 AM, Samir Ahmic  wrote:

> Hi Udbhav,
> Great work on hbase docker deployment was done in
> https://issues.apache.org/jira/browse/HBASE-12721 you may start your
> journey from there.  As for rest of your questions maybe there are some
> folks here that were doing similar testing and may give you more info.
>
> Regards
> Samir
>
> On Thu, Jul 13, 2017 at 7:57 AM, Udbhav Agarwal <
> udbhav.agar...@syncoms.com>
> wrote:
>
> > Hi All,
> > I need to run hbase 0.98 backed by hdfs in a docker container and want to
> > prevent data loss if the container restarts.
> >As per my understanding of docker containers, they work in
> > a way that if any container is stopped/killed, all information
> > related to it is lost. This implies that if I am running hbase in a
> container
> > and I have stored some data in some tables and consequently if the
> > container is stopped then the data will be lost. I need a way in which I
> > can stop this data loss.
> >I have gone through concept of volume in docker. Is it
> > possible to stop this data loss with this approach? What if volume gets
> > corrupted? Is there any instance of volume running there which can be
> > stopped and can cause data loss ?
> >Is there a possibility that I can use hdfs running at some
> > external host outside the docker and my hbase running inside docker ? Is
> > such scenario possible ? If yes, How ?
> >Thank you in advance.
> >
> >
> > Thanks,
> > Udbhav Agarwal
> >
> >
>


Re: [ANNOUNCE] New HBase committer Allan Yang

2017-06-09 Thread Dima Spivak
Yay Allan!

On Thu, Jun 8, 2017 at 8:49 PM Yu Li  wrote:

> On behalf of the Apache HBase PMC, I am pleased to announce that Allan Yang
> has accepted the PMC's invitation to become a committer on the
> project. We appreciate all of Allan's generous contributions thus far and
> look forward to his continued involvement.
>
> Congratulations and welcome, Allan!
>
-- 
-Dima


Re: Hbase indexer for SOLR

2017-06-08 Thread Dima Spivak
Hey Fred,

Sorry we can't be more helpful, it's just that you're asking about a piece
of software that we don't use :(. A quick search turned up Cloudera's docs
on using Lily [1], so maybe give those a shot and then direct questions to
the Lily list Busbey pointed to if things are still misbehaving?

[1]
https://www.cloudera.com/documentation/enterprise/latest/topics/search_use_hbase_indexer_service.html

-Dima

On Thu, Jun 8, 2017 at 8:04 AM, Sean Busbey <bus...@apache.org> wrote:

> Hi Fred!
>
> Are you sure your message went through? The last message I see on the
> lily hbase indexer user list is from May 21st.
>
> As we mentioned before, the HBase project isn't familiar with the lily
> hbase indexer; it's a third party project unrelated to our community.
> I'm afraid we're unlikely to be of much help.
>
> Since that particular step is titled "Start Solr" it's possible the
> solr user mailing list might be able to help[1]. Just keep in mind
> it's very likely they will also point you back towards the NGData
> maintained user list.
>
> [1]: https://lists.apache.org/list.html?solr-u...@lucene.apache.org
>
>
> On Thu, Jun 8, 2017 at 1:38 AM, F. T. <bibo...@hotmail.fr> wrote:
> > Hi Dima,
> >
> > I didn't find any help on their user mailing list about this problem.
> And I don't know where the :2181/solr comes from.
> >
> > Here is my "hbase-indexer-site.xml" configuration file :
> >
> >
> > <property>
> >   <name>hbase.zookeeper.quorum</name>
> >   <value>MyIpAddress</value>
> > </property>
> >
> > <property>
> >   <name>hbaseindexer.zookeeper.connectstring</name>
> >   <value>MyIpAddress:2181</value>
> > </property>
> >
> >
> > The only place where I mentioned /solr is at the indexer creation using
> :
> >
> >
> > ./hbase-indexer add-indexer -n myindexer -c 
> > ../Fred_Indexer/indexdemo-indexer.xml
> -cp solr.zk=MyIpAddress:2181/solr -cp solr.collection=collection1
> >
> >
> > Any idea ?
> >
> > Fred
> >
> >
> > 
> > De : Dima Spivak <dimaspi...@apache.org>
> > Envoyé : mercredi 7 juin 2017 16:54
> > À : user@hbase.apache.org
> > Objet : Re: Hbase indexer for SOLR
> >
> > The :2181/solr config looks suspect to me, but as Busbey points out,
> > questions of how to successfully set up Lily are probably better suited
> for
> > their user mailing list.
> >
> > On Wed, Jun 7, 2017 at 7:41 AM F. T. <bibo...@hotmail.fr> wrote:
> >
> >> Thanks for for your answer. That's it, I use the Lily Hbase Indexer. As
> >> told before, I can create/delete an indexer. I can launch Hbase-indexer
> >> server command. It's stable until I insert a row into Hbase.
> >>
> >> Here is msg I get :
> >>
> >>
> >> 17/06/07 16:10:23 INFO mortbay.log: Started
> >> SelectChannelConnector@0.0.0.0:11060
> >> 17/06/07 16:12:41 INFO hbase.Server: Connection from MyIpAddress port:
> >> 37592 with version info: version: "1.2.3" url: "git://
> >> kalashnikov.att.net/Users/stack/checkouts/hbase.git.commit" revision:
> >> "bd63744624a26dc3350137b564fe746df7a721a4" user: "stack" date: "Mon
> Aug 29
> >> 15:13:42 PDT 2016" src_checksum: "0ca49367ef6c3a680888bbc4f1485d18"
> >> 17/06/07 16:12:42 INFO zookeeper.ZooKeeper: Initiating client
> connection,
> >> connectString=MyIpAddress:2181/solr sessionTimeout=3
> >> watcher=org.apache.solr.common.cloud.SolrZkClient$3@12dd2209
> >> 17/06/07 16:12:42 INFO zookeeper.ClientCnxn: Opening socket connection
> to
> >> server MyServer/MyIpAddress:2181. Will not attempt to authenticate using
> >> SASL (unknown error)
> >> 17/06/07 16:12:42 INFO zookeeper.ClientCnxn: Socket connection
> established
> >> to MyServer/MyIpAddress:2181, initiating session
> >> 17/06/07 16:12:42 INFO zookeeper.ClientCnxn: Session establishment
> >> complete on server MyServer/MyIpAddress:2181, sessionid =
> >> 0x15c81c312060085, negotiated timeout = 3
> >> 17/06/07 16:12:42 INFO zookeeper.ZooKeeper: Session: 0x15c81c312060085
> >> closed
> >> 17/06/07 16:12:42 INFO zookeeper.ClientCnxn: EventThread shut down
> >> 17/06/07 16:12:42 ERROR impl.SepEventExecutor: Error while processing
> event
> >> java.lang.RuntimeException: org.apache.solr.common.SolrException:
> Cannot
> >> connect to cluster at MyIpAddress:2181/solr: cluster not found/not ready
> >>
> >>
> >> Thanks again
> >>

Re: What is Dead Region Servers and how to clear them up?

2017-05-26 Thread Dima Spivak
Actually, it's a "Please give us the details another member of the project
already asked for."

This is a community mailing list, which means we volunteer our time to help
people with questions. If you're looking for customer support, you should
be taking your question to a consultant or vendor that provides such
services. Being a jerk is incredibly counterproductive.

-Dima

On Fri, May 26, 2017 at 11:03 AM, jeff saremi <jeffsar...@hotmail.com>
wrote:

> Thank you for the GFY answer
>
> And i guess to figure out how to fix these I can always go through the
> HBase source code.
>
>
> ____
> From: Dima Spivak <dimaspi...@apache.org>
> Sent: Friday, May 26, 2017 9:58:00 AM
> To: hbase-user
> Subject: Re: What is Dead Region Servers and how to clear them up?
>
> Sending this back to the user mailing list.
>
> RegionServers can die for many reasons. Looking at your RegionServer log
> files should give hints as to why it's happening.
>
>
> -Dima
>
> On Fri, May 26, 2017 at 9:48 AM, jeff saremi <jeffsar...@hotmail.com>
> wrote:
>
> > I had posted this to the user mailing list and I have not got any direct
> > answer to my question.
> >
> > Where do dead RS's come from and how can they be cleaned up? Someone in
> > the midst of developers should know this.
> >
> > thanks
> >
> > Jeff
> >
> > 
> > From: jeff saremi <jeffsar...@hotmail.com>
> > Sent: Thursday, May 25, 2017 10:23:17 AM
> > To: user@hbase.apache.org
> > Subject: Re: What is Dead Region Servers and how to clear them up?
> >
> > I'm still looking to get hints on how to remove the dead regions. thanks
> >
> > 
> > From: jeff saremi <jeffsar...@hotmail.com>
> > Sent: Wednesday, May 24, 2017 12:27:06 PM
> > To: user@hbase.apache.org
> > Subject: Re: What is Dead Region Servers and how to clear them up?
> >
> > i'm trying to eliminate the dead region servers.
> >
> > 
> > From: Ted Yu <yuzhih...@gmail.com>
> > Sent: Wednesday, May 24, 2017 12:17:40 PM
> > To: user@hbase.apache.org
> > Subject: Re: What is Dead Region Servers and how to clear them up?
> >
> > bq. running hbck (many times
> >
> > Can you describe the specific inconsistencies you were trying to resolve
> ?
> > Depending on the inconsistencies, advice can be given on the best known
> > hbck command arguments to use.
> >
> > Feel free to pastebin master log if needed.
> >
> > On Wed, May 24, 2017 at 12:10 PM, jeff saremi <jeffsar...@hotmail.com>
> > wrote:
> >
> > > these are the things I have done so far:
> > >
> > >
> > > - restarting master (few times)
> > >
> > > - running hbck (many times; this tool does not seem to be doing
> anything
> > > at all)
> > >
> > > - checking the list of region servers in ZK (none of the dead ones are
> > > listed here)
> > >
> > > - checking the WALs under /WALs. Out of 11 dead ones only 3
> > > are listed here with "-splitting" at the end of their names and they
> > > contain one single file like: 1493846660401..meta.1493922323600.meta
> > >
> > >
> > >
> > >
> > > 
> > > From: jeff saremi <jeffsar...@hotmail.com>
> > > Sent: Wednesday, May 24, 2017 9:04:11 AM
> > > To: user@hbase.apache.org
> > > Subject: What is Dead Region Servers and how to clear them up?
> > >
> > > Apparently having dead region servers is so common that a section of
> the
> > > master console is dedicated to that?
> > > How can we clean this up (preferably in an automated fashion)? Why
> isn't
> > > this being done by HBase automatically?
> > >
> > >
> > > thanks
> > >
> >
>


Re: What is Dead Region Servers and how to clear them up?

2017-05-26 Thread Dima Spivak
Sending this back to the user mailing list.

RegionServers can die for many reasons. Looking at your RegionServer log
files should give hints as to why it's happening.
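
A quick way to surface the cause is to grep the logs on the affected hosts
for the abort message (the log path below is just a guess; adjust for your
install):

  grep -B5 'ABORTING region server' /var/log/hbase/hbase-*-regionserver-*.log
  grep -E 'FATAL|ERROR' /var/log/hbase/hbase-*-regionserver-*.log | tail -50

Long GC pauses that expire the ZooKeeper session and full disks are two of
the more common things that show up this way.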


-Dima

On Fri, May 26, 2017 at 9:48 AM, jeff saremi  wrote:

> I had posted this to the user mailing list and I have not got any direct
> answer to my question.
>
> Where do dead RS's come from and how can they be cleaned up? Someone in
> the midst of developers should know this.
>
> thanks
>
> Jeff
>
> 
> From: jeff saremi 
> Sent: Thursday, May 25, 2017 10:23:17 AM
> To: user@hbase.apache.org
> Subject: Re: What is Dead Region Servers and how to clear them up?
>
> I'm still looking to get hints on how to remove the dead regions. thanks
>
> 
> From: jeff saremi 
> Sent: Wednesday, May 24, 2017 12:27:06 PM
> To: user@hbase.apache.org
> Subject: Re: What is Dead Region Servers and how to clear them up?
>
> i'm trying to eliminate the dead region servers.
>
> 
> From: Ted Yu 
> Sent: Wednesday, May 24, 2017 12:17:40 PM
> To: user@hbase.apache.org
> Subject: Re: What is Dead Region Servers and how to clear them up?
>
> bq. running hbck (many times
>
> Can you describe the specific inconsistencies you were trying to resolve ?
> Depending on the inconsistencies, advice can be given on the best known
> hbck command arguments to use.
>
> Feel free to pastebin master log if needed.
>
> On Wed, May 24, 2017 at 12:10 PM, jeff saremi 
> wrote:
>
> > these are the things I have done so far:
> >
> >
> > - restarting master (few times)
> >
> > - running hbck (many times; this tool does not seem to be doing anything
> > at all)
> >
> > - checking the list of region servers in ZK (none of the dead ones are
> > listed here)
> >
> > - checking the WALs under /WALs. Out of 11 dead ones only 3
> > are listed here with "-splitting" at the end of their names and they
> > contain one single file like: 1493846660401..meta.1493922323600.meta
> >
> >
> >
> >
> > 
> > From: jeff saremi 
> > Sent: Wednesday, May 24, 2017 9:04:11 AM
> > To: user@hbase.apache.org
> > Subject: What is Dead Region Servers and how to clear them up?
> >
> > Apparently having dead region servers is so common that a section of the
> > master console is dedicated to that?
> > How can we clean this up (preferably in an automated fashion)? Why isn't
> > this being done by HBase automatically?
> >
> >
> > thanks
> >
>


Re: [DISCUSS] Status of the 0.98 release line

2017-04-10 Thread Dima Spivak
+1

-Dima

On Mon, Apr 10, 2017 at 12:08 PM, Stack  wrote:

> I agree we should EOL 0.98.
> St.Ack
>
> On Mon, Apr 10, 2017 at 11:43 AM, Andrew Purtell 
> wrote:
>
> > Please speak up if it is incorrect to interpret the lack of responses as
> > indicating consensus on declaring 0.98 EOL.
> >
> > I believe we should declare 0.98 EOL.
> >
> >
> > On Wed, Mar 29, 2017 at 6:56 AM, Sean Busbey  wrote:
> >
> > > Hi Folks!
> > >
> > > Back in January our Andrew Purtell stepped down as the release
> > > manager for the 0.98 release line.
> > >
> > > On the resultant dev@hbase thread[1] folks seemed largely in favor of
> > > declaring end-of-maintenance for the 0.98 line.
> > >
> > > Now that it's been a couple of months, does anyone have concerns about
> > > pushing forward on that?
> > >
> > > Do folks who listen on user@hbase but not dev@hbase have any concerns?
> > >
> > > As with any end-of-maintenance branch, the PMC would consider on a
> > > case-by-case basis doing a future release of the branch should a
> > > critical security vulnerability show up.
> > >
> > >
> > > [1]: https://s.apache.org/DjCi
> > >
> > > -busbey
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> >- Andy
> >
> > If you are given a choice, you believe you have acted freely. - Raymond
> > Teller (via Peter Watts)
> >
>


Re: hbase has problems with two hostname

2017-01-17 Thread Dima Spivak
Is there any other DNS server running that might be confusing reverse
lookup? What happens if you run `host YOUR_RS_IP_ADDRESS`?

And what kind of machines are you using in your deployment?
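
As a sanity check, forward and reverse lookups should round-trip to the same
name on every node, e.g. (using a host from your log):

  host bjsh19-16-34.qbos.com   # should return 10.19.16.34
  host 10.19.16.34             # should return bjsh19-16-34.qbos.com, not the wjsa-tsl05 alias

If the reverse lookup (or the first matching /etc/hosts entry) comes back
with the alias, that's the kind of mismatch the master is complaining about
in the log you posted.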

Cheers,

On Mon, Jan 16, 2017 at 11:34 PM C R <cuirong198...@hotmail.com> wrote:

> Thanks,
>
>
>
>
>
> I deployed my HBase very simply, which has one Master and three
> regionservers.
>
>
>
>
>
> [hbase@bjsh19-16-30 conf]$ more regionservers
>
> bjsh19-16-33.qbos.com
>
> bjsh19-16-34.qbos.com
>
> bjsh19-16-35.qbos.com
>
> [hbase@bjsh19-16-30 conf]$ more hbase-site.xml
>
>
>
> ...
>
>
>
> <configuration>
>
>   <property>
>     <name>zookeeper.znode.parent</name>
>     <value>/hbase117</value>
>   </property>
>
>   <property>
>     <name>hbase.rootdir</name>
>     <value>hdfs://bidc/hbase117</value>
>   </property>
>
>   <property>
>     <name>hbase.zookeeper.quorum</name>
>     <value>bjsh19-16-30.qbos.com,bjsh19-16-31.qbos.com,bjsh19-16-32.qbos.com</value>
>   </property>
>
>   <property>
>     <name>hbase.cluster.distributed</name>
>     <value>true</value>
>   </property>
>
>   <property>
>     <name>hbase.zookeeper.property.clientPort</name>
>     <value>2181</value>
>   </property>
>
> </configuration>
>
>
>
>
>
> The special thing is the file /etc/hosts, which maps one IP to two
> hostnames on all nodes, so it produces this message:
>
>
>
> ...
>
>
>
> the server that tried to transition was wjsa-tsl05,16020,1484623636195 not
> the expected bjsh19-16-34.qbos.com,16020,1484623636195
>
>
>
> ...
>
>
>
>
>
> 
>
> From: Dima Spivak <dimaspi...@apache.org>
>
> Sent: January 17, 2017 4:50
>
> To: user@hbase.apache.org
>
> Subject: Re: hbase has problems with two hostname
>
>
>
> Hi C R,
>
>
>
> Like many Hadoop-like services, HBase is pretty temperamental about
>
> requiring forward and reverse DNS to work properly. FWIW, the configuration
>
> file where you can populate RegionServers doesn't tend to matter as long as
>
> the hbase-site.xml file is populated correctly (it's just used to start
>
> daemons from one place).
>
>
>
> If you pass along more details about how exactly you're deploying HBase, we
>
> might be able to give more advice.
>
>
>
> On Mon, Jan 16, 2017 at 8:00 PM C R <cuirong198...@hotmail.com> wrote:
>
>
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> > more /etc/hosts
>
> >
>
> >
>
> > ...
>
> >
>
> >
>
> >
>
> >
>
> > 10.19.16.31  bjsh19-16-31.qbos.com  wjsa-tsl02
>
> >
>
> >
>
> > ...
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> > There will be six regionservers listed in the web console, but
>
> >
>
> > only three in the configuration file; metadata tables also are not
> online
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> > Hmaster will be dead after a while.
>
> >
>
> >
>
> > what should I do?
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> > snapshot:
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> >
>
> > 2017-01-17 11:45:24,394 INFO
>
> >  [MASTER_SERVER_OPERATIONS-bjsh19-16-30:16000-0]
> master.AssignmentManager:
>
> > Assigning
> hbase:namespace,,1484623643279.30fab746cb3b6ceadcbda421459204b9.
>
> > to bjsh19-16-34.qbos.com,16020,1484623636195
>
> >
>
> >
>
> > 2017-01-17 11:45:24,395 INFO  [bjsh19-16-30:16000.activeMasterManager]
>
> > master.AssignmentManager: Joined the cluster in 23ms, failover=true
>
> >
>
> >
>
> > 2017-01-17 11:50:24,314 FATAL [bjsh19-16-30:16000.activeMasterManager]
>
> > master.HMaster: Failed to become active master
>
> >
>
> >
>
> > java.io.IOException: Timedout 30ms waiting for namespace table to be
>
> > assigned
>
> >
>
> >
>
> > at
>
> >
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:104)
>
> >
>
> >
>

Re: hbase has problems with two hostname

2017-01-16 Thread Dima Spivak
Hi C R,

Like many Hadoop-like services, HBase is pretty temperamental about
requiring forward and reverse DNS to work properly. FWIW, the configuration
file where you can populate RegionServers doesn't tend to matter as long as
the hbase-site.xml file is populated correctly (it's just used to start
daemons from one place).

If you pass along more details about how exactly you're deploying HBase, we
might be able to give more advice.

On Mon, Jan 16, 2017 at 8:00 PM C R  wrote:

>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> more /etc/hosts
>
>
> ...
>
>
>
>
> 10.19.16.31  bjsh19-16-31.qbos.com  wjsa-tsl02
>
>
> ...
>
>
>
>
>
>
>
>
> There will be six regionservers listed in the web console, but
>
> only three in the configuration file; metadata tables also are not online
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> Hmaster will be dead after a while.
>
>
> what should I do?
>
>
>
>
>
>
>
> snapshot:
>
>
>
>
>
>
>
>
>
>
>
> 2017-01-17 11:45:24,394 INFO
>  [MASTER_SERVER_OPERATIONS-bjsh19-16-30:16000-0] master.AssignmentManager:
> Assigning hbase:namespace,,1484623643279.30fab746cb3b6ceadcbda421459204b9.
> to bjsh19-16-34.qbos.com,16020,1484623636195
>
>
> 2017-01-17 11:45:24,395 INFO  [bjsh19-16-30:16000.activeMasterManager]
> master.AssignmentManager: Joined the cluster in 23ms, failover=true
>
>
> 2017-01-17 11:50:24,314 FATAL [bjsh19-16-30:16000.activeMasterManager]
> master.HMaster: Failed to become active master
>
>
> java.io.IOException: Timedout 30ms waiting for namespace table to be
> assigned
>
>
> at
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:104)
>
>
> at
> org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:986)
>
>
> at
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:780)
>
>
> at
> org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:183)
>
>
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1652)
>
>
> at java.lang.Thread.run(Thread.java:745)
>
>
> 2017-01-17 11:50:24,315 FATAL [bjsh19-16-30:16000.activeMasterManager]
> master.HMaster: Master server abort: loaded coprocessors are: []
>
>
> 2017-01-17 11:50:24,316 FATAL [bjsh19-16-30:16000.activeMasterManager]
> master.HMaster: Unhandled exception. Starting shutdown.
>
>
>
>
>
>
>
>
>
>
>
>
>
> ...
>
>
>
>
>
>
>
>
>
> 2017-01-17 11:27:17,926 INFO  [regionserver/
> bjsh19-16-34.qbos.com/10.19.16.34:16020] regionserver.HRegionServer:
> Serving as wjsa-tsl05,16020,1
>
>
> 484623636195, RpcServer on bjsh19-16-34.qbos.com/10.19.16.34:16020,
> sessionid=0x154563e43e30179
>
>
> 2017-01-17 11:27:17,934 INFO  [regionserver/
> bjsh19-16-34.qbos.com/10.19.16.34:16020] quotas.RegionServerQuotaManager:
> Quota support disabled
>
>
> 2017-01-17 11:27:23,966 INFO
>  [PriorityRpcServer.handler=14,queue=0,port=16020]
> regionserver.RSRpcServices: Open hbase:namespace,,148462364327
>
>
> 9.30fab746cb3b6ceadcbda421459204b9.
>
>
> 2017-01-17 11:27:24,008 WARN  [RS_OPEN_REGION-bjsh19-16-34:16020-0]
> zookeeper.ZKAssign: regionserver:16020-0x154563e43e30179, quorum=bjsh19-16
>
>
> -30:2181,bjsh19-16-31:2181,bjsh19-16-32:2181, baseZNode=/hbase115new
> Attempt to transition the unassigned node for 30fab746cb3b6ceadcbda421459
>
>
> 204b9 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING failed, the server
> that tried to transition was wjsa-tsl05,16020,1484623636195 not the
>
>
> expected bjsh19-16-34.qbos.com,16020,1484623636195
>
>
> 2017-01-17 11:27:24,008 WARN  [RS_OPEN_REGION-bjsh19-16-34:16020-0]
> coordination.ZkOpenRegionCoordination: Failed transition from OFFLINE to O
>
>
> PENING for region=30fab746cb3b6ceadcbda421459204b9
>
>
> 2017-01-17 11:27:24,008 WARN  [RS_OPEN_REGION-bjsh19-16-34:16020-0]
> handler.OpenRegionHandler: Region was hijacked? Opening cancelled for enco
>
>
> dedName=30fab746cb3b6ceadcbda421459204b9
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>


Re: Hbase Support for C#/.Net

2017-01-10 Thread Dima Spivak
It really depends on your use case, so your best bet is to implement proof
of concepts of each and benchmark them yourself with your workload.

Cheers,

On Mon, Jan 9, 2017 at 11:07 PM Manjeet Singh <manjeet.chand...@gmail.com>
wrote:

> Thanks Dima for writing
>
>
>
> We just completed one project of Hbase-Spark combination we have another
>
> client on .Net platform.
>
> I have seen good support for java and spark for Hbase.
>
>
>
> I am looking for a solution for .Net, and as it's quite new for us too, what
>
> does the community say to go with, so I can start looking more into that area?
>
>
>
> I find few which were recommended in blogs but still I am not sure about
>
> these.
>
>
>
> can you please help me to figure out the best one? Fast reading and writing
>
> are the major areas.
>
>
>
> Thanks
>
> Manjeet
>
>
>
> On Tue, Jan 10, 2017 at 1:13 AM, Dima Spivak <dimaspi...@apache.org>
> wrote:
>
>
>
> > What criteria are important to you when looking for "best results"? #1
> and
>
> > #2 have been what I've used in the past and both have worked quite well.
>
> >
>
> > -Dima
>
> >
>
> > On Mon, Jan 9, 2017 at 1:06 AM, Manjeet Singh <
> manjeet.chand...@gmail.com>
>
> > wrote:
>
> >
>
> > > Hi All,
>
> > >
>
> > > I have to find which is the best way to query on Hbase will give best
>
> > > result
>
> > >
>
> > > options are as below if any one can help
>
> > >
>
> > >
>
> > >
>
> > >1. REST API
>
> > >2. Using Thrift:
>
> > >   1. HBase and Thrift in .NET C# Tutorial
>
> > >   http://pawelrychlicki.pl/Article/Details/52/hbase-and-
>
> > > thrift-in-net-c-sharp-tutorial-c-sharp-45-and-thrift-093
>
> > >3. Commercial ODBC/ADO.NET Connectors
>
> > >   1. http://www.cdata.com/drivers/hbase/ado/
>
> > >   2. http://www.simba.com/drivers/hbase-odbc-jdbc/
>
> > >4. Apache Drill
>
> > >5. Apache Phoenix
>
> > >
>
> > > Thanks
>
> > > Manjeet
>
> > > --
>
> > > luv all
>
> > >
>
> >
>
>
>
>
>
>
>
> --
>
> luv all
>
>


Re: Storing XML file in Hbase

2016-11-28 Thread Dima Spivak
Hi Mich,

How many files are you looking to store? How often do you need to read
them? What's the total size of all the files you need to serve?

Cheers,
Dima

On Mon, Nov 28, 2016 at 7:04 AM Mich Talebzadeh 
wrote:

> Hi,
>
> Storing XML files in Big Data: are there any strategies to create multiple
> column families or just one column family, and in that case how many columns
> would be optimal?
>
> thanks
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn *
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> >*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>


Re: what is the Hbase cluster requirement as per Industry standard

2016-11-04 Thread Dima Spivak
The HBase reference guide [1] has some suggestions with respect to sizing
of hardware and tuning for performance. There is no industry standard
because the best way to configure HBase is dependent upon how you are using
HBase.

As for the ZK question, Google can point you to resources that describe
what happens when your quorum goes from an odd number of servers to an even
number.

1. https://hbase.apache.org/book.html

Cheers,

On Friday, November 4, 2016, Manjeet Singh 
wrote:

> Hi All,
>
> I have below points if anyone answer it
>
> Q1. How many hardware cores are required by HBase? What is the calculation? At
> the end of the day one needs to be sure how many cores are required.
>
>
>
> Q2. what is the RAM distribution calculation on each RS, Master, Java Heap,
> Client? Please consider my requirement: we insert 12 GB of data per day, I
> have applied Snappy and FastDiff, and I perform random get/put operations in
> bulk using a Spark job (15 min window, max data size 4 GB and min 300 MB).
>
> Right now All RS having 12 GB of RAM
> Master having 6 GB of RAM
> 45 GB RAM for Spark
> 5% RAM free for Cloudera
> java Heap 4 GB
> Client 4GB
>
>
> Q3. For HBase HA, is ZooKeeper 100% required? (The question is what happens if any
> ZooKeeper node goes down, as the quorum should be an odd number.)
>
>
> Q4. Is there any industry standard configuration in one doc that I can use
> for H/W sizing and RAM allocation.
>
>
>
>
> Thanks
> Manjeet
>
>
>
>
>
>
>
>
>
> --
> luv all
>


-- 
-Dima


Re: Talks from hbaseconeast2016 have been posted

2016-10-27 Thread Dima Spivak
Here's a link:
https://www.youtube.com/channel/UCy25rIFxWRBokFg-2Cm83BQ/videos?sort=dd_id=0=0

;) Stack

On Thursday, October 27, 2016, Stack  wrote:

> Some good stuff in here:
>
> + BigTable Lead on some interesting tricks done to make the service more
> robust
> + Compare of SQL tiers by the boys from Splice
> + Mikhail (looking like a gangster!) on the dead-but-not-dead RegionServer
>
> Then there's our JMS on HBase+Spark,  Joep and Sangjin on HBase as store
> for Yarn Timeline Service v2. Lots of meaty material.
>
> Yours,
> The Program Committee
> P.S. Thanks Carter Page for assemblage.
>


-- 
-Dima


Re: Graph search

2016-10-27 Thread Dima Spivak
Hey Cheyenne,

HBase itself only provides primitives for operations like put, get, and
scan, so you'd need to implement any particular search algorithms at the
application level or seek out existing projects that could add such
functionality. Projects like Giraph and Titan come to mind.
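
If you do roll your own, the usual pattern is an adjacency list: one row per
vertex, one column per outgoing edge, with the traversal (BFS, shortest path,
etc.) living in your application code. A toy sketch in the HBase shell, with
made-up table and edge names:

  create 'graph', 'e'
  put 'graph', 'alice', 'e:bob', '1'
  put 'graph', 'alice', 'e:carol', '1'
  put 'graph', 'bob', 'e:dave', '1'
  get 'graph', 'alice'   # returns alice's neighbours; repeat per hop for a BFS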

On Thursday, October 27, 2016, Cheyenne Forbes <
cheyenne.osanu.for...@gmail.com> wrote:

>  is there any way to perform a graph search on a Hbase database?
>


-- 
-Dima


Re: Hbase Row key lock

2016-10-23 Thread Dima Spivak
If your typical use case sees 50 clients simultaneously trying to update
the same row, then a strongly consistent data store that writes to disk for
fault tolerance may not be for you. That said, such a use case seems
extremely unusual to me and I'd ask why you're trying to update the same
row in such a manner.

On Sunday, October 23, 2016, Manjeet Singh <manjeet.chand...@gmail.com>
wrote:

> Hi Dima,
>
> I didn't get it. The point is: assume I have 50 different clients, all with the same
> rowkey, all wanting to update the same rowkey at the same time. Now just tell me what
> will happen? Who will get what value?
>
> Thanks
> Manjeet
>
> On Mon, Oct 24, 2016 at 12:12 AM, Dima Spivak <dimaspi...@apache.org
> <javascript:;>> wrote:
>
> > Unless told not to, HBase will always write to memory and append to the
> WAL
> > on disk before returning and saying the write succeeded. That's by design
> > and the same write pattern that companies like Apple and Facebook have
> > found works for them at scale. So what's there to solve?
> >
> > On Sunday, October 23, 2016, Manjeet Singh <manjeet.chand...@gmail.com
> <javascript:;>>
> > wrote:
> >
> > > Hi All,
> > >
> > > I have read below mention blog and it also said Hbase holds the lock on
> > > rowkey level
> > > https://blogs.apache.org/hbase/entry/apache_hbase_
> internals_locking_and
> > > (0) Obtain Row Lock
> > > (1) Write to Write-Ahead-Log (WAL)
> > > (2) Update MemStore: write each cell to the memstore
> > > (3) Release Row Lock
> > >
> > >
> > > SO question is how to solve this if I have very frequent update on
> Hbase
> > >
> > > Thanks
> > > Manjeet
> > >
> > > On Wed, Aug 17, 2016 at 9:54 AM, Manjeet Singh <
> > manjeet.chand...@gmail.com <javascript:;>
> > > <javascript:;>>
> > > wrote:
> > >
> > > > Hi All
> > > >
> > > > Can anyone help me about how and in which version of Hbase support
> > Rowkey
> > > > lock ?
> > > > I have seen article about rowkey lock but it was about .94 version it
> > > said
> > > > that if row key not exist and any update request come and that rowkey
> > not
> > > > exist then in this case Hbase hold the lock for 60 sec.
> > > >
> > > > currently I am using Hbase 1.2.2 version
> > > >
> > > > Thanks
> > > > Manjeet
> > > >
> > > >
> > > >
> > > > --
> > > > luv all
> > > >
> > >
> > >
> > >
> > > --
> > > luv all
> > >
> >
> >
> > --
> > -Dima
> >
>
>
>
> --
> luv all
>


-- 
-Dima


Re: Hbase Row key lock

2016-10-23 Thread Dima Spivak
Unless told not to, HBase will always write to memory and append to the WAL
on disk before returning and saying the write succeeded. That's by design
and the same write pattern that companies like Apple and Facebook have
found works for them at scale. So what's there to solve?
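
If the worry is many clients hammering the same row, the usual answer is to push
the mutation server-side (Increment, Append, checkAndPut) rather than having each
client read, modify, and write back. A rough sketch against the Java client, with
table and column names made up for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HotRowUpdateSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("counters"))) {
      byte[] row = Bytes.toBytes("hot-row");
      byte[] cf = Bytes.toBytes("d");

      // Atomic, server-side increment: concurrent callers all succeed; each one briefly
      // takes the row lock, applies its delta, and releases it.
      table.incrementColumnValue(row, cf, Bytes.toBytes("hits"), 1L);

      // Optimistic update: the Put is applied only if the current value matches the
      // expected one; losers re-read and retry in application code.
      Put put = new Put(row).addColumn(cf, Bytes.toBytes("state"), Bytes.toBytes("done"));
      boolean applied = table.checkAndPut(row, cf, Bytes.toBytes("state"),
          Bytes.toBytes("pending"), put);
      System.out.println("checkAndPut applied: " + applied);
    }
  }
}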

On Sunday, October 23, 2016, Manjeet Singh 
wrote:

> Hi All,
>
> I have read below mention blog and it also said Hbase holds the lock on
> rowkey level
> https://blogs.apache.org/hbase/entry/apache_hbase_internals_locking_and
> (0) Obtain Row Lock
> (1) Write to Write-Ahead-Log (WAL)
> (2) Update MemStore: write each cell to the memstore
> (3) Release Row Lock
>
>
> SO question is how to solve this if I have very frequent update on Hbase
>
> Thanks
> Manjeet
>
> On Wed, Aug 17, 2016 at 9:54 AM, Manjeet Singh  >
> wrote:
>
> > Hi All
> >
> > Can anyone help me about how and in which version of Hbase support Rowkey
> > lock ?
> > I have seen article about rowkey lock but it was about .94 version it
> said
> > that if row key not exist and any update request come and that rowkey not
> > exist then in this case Hbase hold the lock for 60 sec.
> >
> > currently I am using Hbase 1.2.2 version
> >
> > Thanks
> > Manjeet
> >
> >
> >
> > --
> > luv all
> >
>
>
>
> --
> luv all
>


-- 
-Dima


Re: setup two hbase instances on Mac?

2016-10-20 Thread Dima Spivak
So what you could essentially do is use the Apache HBase topology for
clusterdock, running the clusterdock_run ./bin/start_cluster command twice,
once for each cluster you want to start. I can provide specific command
line arguments if you let me know which version of HBase you're hoping to
use. The only pre-req would be Docker for Mac preinstalled on your machine.

-Dima

On Thu, Oct 20, 2016 at 4:03 PM, Demai Ni <nid...@gmail.com> wrote:

> Actually I don't have a good reason of 'not use container', except that I
> already have homebrew install hadoop and hbase on my laptop, hence like to
> just keep using it.
>
> Thanks for the instruction through the blog. A quick question to clarify:
>  the blog is for multi-node cluster, instead of setting up two clusters on
> the same machine? If I follow the instruction, which step should I revise?
>
> At this moment, I don't need the management/monitoring features from CDH,
> and no need to install other components yet. Hence, looking for a simpler
> way. I hope(maybe unrealistic) that I can get just a 2nd set for
> configuration files, such as hbase-site.xml, hbase-env, etcs, then I will
> be able to run two isolated(not resource wise) hbase instances.
>
> Again, thanks a lot for the tip. I may give it a try if simple
> configuration change not available.
>
> Demai
>
> Demai
>
> On Thu, Oct 20, 2016 at 3:04 PM, Dima Spivak <dimaspi...@apache.org>
> wrote:
>
> > Any reason to not use the container way via clusterdock [1]? I do
> > replication testing on my Mac for this using it and have had pretty good
> > results.
> >
> > 1.
> > http://blog.cloudera.com/blog/2016/08/multi-node-clusters-
> > with-cloudera-quickstart-for-docker/
> >
> > -Dima
> >
> > On Thu, Oct 20, 2016 at 2:51 PM, Demai Ni <nid...@gmail.com> wrote:
> >
> > > hi, folks,
> > >
> > > I am trying to setup a simple development environment on my Mac Book.
> And
> > > like to have multiple instances of HBases, for some testing of
> > replication,
> > > backup. etc. And wondering there is any instruction to setup for
> multiple
> > > instances(not the VM/container way).
> > >
> > > Here is what I did so far. Install one through homebrew, and build
> > another
> > > one from source code.
> > >
> > > To make my life easier, I setup the following alias:
> > > /* first HBase, is installed by homebrew and using HDFS as storage, by
> > > specify hbase.rootdir */
> > > alias hDFSHBaseShell='/usr/local/bin/hbase shell'
> > > alias
> > > hstart='/usr/local/Cellar/hadoop/2.7.3/sbin/start-dfs.
> > > sh;/usr/local/Cellar/hadoop/2.7.3/sbin/start-yarn.sh'
> > > alias
> > > hstop='/usr/local/Cellar/hadoop/2.7.3/sbin/stop-yarn.
> > > sh;/usr/local/Cellar/hadoop/2.7.3/sbin/stop-dfs.sh'
> > > alias startHDFSHBase='/usr/local/bin/start-hbase.sh'
> > > alias stopHDFSHBase='/usr/local/bin/stop-hbase.sh'
> > >
> > >
> > > /* 2nd HBase, is build locally from git clone, and using localfile
> system
> > > as storage */
> > > /* changed HBASE_PID_DIR in hbase-env.sh to avoid conflict with the
> first
> > > instance
> > > alias localHBaseShell='/Users/demai/hbase/bin/hbase shell'
> > > alias startLocalHBase='/Users/demai/hbase/bin/start-hbase.sh'
> > > alias stopLocalHBase='/Users/demai/hbase/bin/stop-hbase.sh'
> > >
> > > Still, it is not enough as the start/stop hbase will only bring up on
> > > instance, and bring it down regardless which stop-hbase.sh I used. I
> > guess
> > > more port configuration, like describe there :
> > > http://blog.cloudera.com/blog/2013/07/guide-to-using-apache-
> hbase-ports/
> > ,
> > > need to change in hbase-site.xml?
> > >
> > > Before, I go down the manually port configuration route. Just wondering
> > > whether anyone already done it? To save me some time of random
> > shooting
> > > :-)
> > >
> > > Many thanks. BTW, I did a bit google using 'multiple hbase instances',
> > but
> > > search results doesn't exactly match this environment.
> > >
> > > Demai
> > >
> >
>


Re: Hbase cluster not getting UP one Region server get down

2016-10-20 Thread Dima Spivak
It can be lots of things, Manjeet. You've gotta do a bit of troubleshooting
yourself first; a long dump of your machine specs doesn't change that.

Can you describe what happened before/after the node went down? The log
just says server isn't running, so we can't tell much from that alone.

-Dima

On Wed, Oct 19, 2016 at 10:53 PM, Manjeet Singh 
wrote:

> I want to add few more points
>
>
> below is my cluster configuration
>
>
>
>
>
>
>
>
>
> Node     CPU        Cores   Disks      OS (RAID-1)   Data disks                      RAM (GB)
> Node-1   2x6 Core   12      6x300 GB   300           Single 900 GB RAID-10           96
> Node-2   2x6 Core   12      6x300 GB   300           300 GB x 6 individual RAID-0    80
> Node-3   2x6 Core   12      6x300 GB   300           300 GB x 6 individual RAID-0    80
> Node-4   2x6 Core   12      8x300 GB   300           300 GB x 6 individual RAID-0    80
>
> Node-1 components: HBase Master, HDFS NameNode, ZooKeeper Server, Spark History Server, Phoenix, HDFS Balancer, Spark gateway, MySQL; YARN (MR2 included) JobHistory Server and ResourceManager. Role: Name Node.
> Node-2 components: HDFS DataNode, HBase RegionServer, ZooKeeper Server, Spark, HBase Master; YARN (MR2 included) NodeManager. Role: Data Node, Spark Node.
> Node-3 components: HDFS DataNode, HBase RegionServer, ZooKeeper Server, Spark; YARN (MR2 included) NodeManager. Role: Data Node, Spark Node.
> Node-4 components: HDFS DataNode, HBase RegionServer, Spark; YARN (MR2 included) NodeManager. Role: Data Node, Spark Node.
>
>
>
>
>
>
>
>
> I noticed that HBase was taking more time while reading, so I used the below
> properties to improve its performance:
>
> Property                                   Original value   Changed value
> hfile.block.cache.size                     0.4              0.6
> hbase.regionserver.global.memstore.size    0.4              0.2
>
>
> below is some more information
>
> I have a Spark ETL job on the same cluster and I have the below parameters
> after running this job:
>
>
>
> Parameter                                                               Value
> Number of pipelines                                                     2 (Kafka)
> Raw size of Kafka messages                                              21 GB
> Data rate                                                               1 MB/sec per pipeline
> Size of aggregated data in HBase                                        2.6 GB (with Snappy and major compaction)
> Batch duration                                                          30 sec
> Sliding window duration                                                 900 sec (15 minutes)
> CPU utilization                                                         63.2%
> Number of executors                                                     3 per pipeline
> Allocated RAM                                                           3 GB per pipeline
> Cluster network IO                                                      3.2 MB/sec
> Cluster disk IO                                                         3.5 MB/sec
> Max time taken by Spark ETL to process 900 MB of data (domain)          2 hours
> Max time taken by Spark ETL to process 900 MB of data (application)     30 minutes
> Total time taken by the Kafka simulator to push the data into Kafka     6 hours
> Total time taken by Spark ETL to process all the data                   7 hours
> Number of SQL queries                                                   10
> Number of profiles                                                      9
> Number of rows in HBase                                                 11,015,719
>
>
> Thanks
> Manjeet
>
>
> On Thu, Oct 20, 2016 at 10:45 AM, Manjeet Singh <
> manjeet.chand...@gmail.com>
> wrote:
>
> > Hi All
> > Can any one help me to figure out the root cause I have 4 node cluster
> and
> > one data node get down , I did not understand why my Hbase Master not
> able
> > to get up
> >
> > I have belo log
> >
> > ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server
> > is not running yet
> > at org.apache.hadoop.hbase.master.HMaster.
> > checkServiceStarted(HMaster.java:2296)
> > at org.apache.hadoop.hbase.master.MasterRpcServices.
> > isMasterRunning(MasterRpcServices.java:936)
> > at org.apache.hadoop.hbase.protobuf.generated.
> > MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55654)
> > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:
> 2170)
> > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.
> java:109)
> > at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(
> > RpcExecutor.java:133)
> > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.
> > java:108)
> > at java.lang.Thread.run(Thread.java:745)
> >
> >
> > Thanks
> > Manjeet
> >
> > --
> > luv all
> >
>
>
>
> --
> luv all
>


Re: setup two hbase instances on Mac?

2016-10-20 Thread Dima Spivak
Any reason to not use the container way via clusterdock [1]? I do
replication testing on my Mac for this using it and have had pretty good
results.

1.
http://blog.cloudera.com/blog/2016/08/multi-node-clusters-with-cloudera-quickstart-for-docker/

-Dima

On Thu, Oct 20, 2016 at 2:51 PM, Demai Ni  wrote:

> hi, folks,
>
> I am trying to setup a simple development environment on my Mac Book. And
> like to have multiple instances of HBases, for some testing of replication,
> backup. etc. And wondering there is any instruction to setup for multiple
> instances(not the VM/container way).
>
> Here is what I did so far. Install one through homebrew, and build another
> one from source code.
>
> To make my life easier, I setup the following alias:
> /* first HBase, is installed by homebrew and using HDFS as storage, by
> specify hbase.rootdir */
> alias hDFSHBaseShell='/usr/local/bin/hbase shell'
> alias
> hstart='/usr/local/Cellar/hadoop/2.7.3/sbin/start-dfs.
> sh;/usr/local/Cellar/hadoop/2.7.3/sbin/start-yarn.sh'
> alias
> hstop='/usr/local/Cellar/hadoop/2.7.3/sbin/stop-yarn.
> sh;/usr/local/Cellar/hadoop/2.7.3/sbin/stop-dfs.sh'
> alias startHDFSHBase='/usr/local/bin/start-hbase.sh'
> alias stopHDFSHBase='/usr/local/bin/stop-hbase.sh'
>
>
> /* 2nd HBase, is build locally from git clone, and using localfile system
> as storage */
> /* changed HBASE_PID_DIR in hbase-env.sh to avoid conflict with the first
> instance
> alias localHBaseShell='/Users/demai/hbase/bin/hbase shell'
> alias startLocalHBase='/Users/demai/hbase/bin/start-hbase.sh'
> alias stopLocalHBase='/Users/demai/hbase/bin/stop-hbase.sh'
>
> Still, it is not enough as the start/stop hbase will only bring up on
> instance, and bring it down regardless which stop-hbase.sh I used. I guess
> more port configuration, like describe there :
> http://blog.cloudera.com/blog/2013/07/guide-to-using-apache-hbase-ports/,
> need to change in hbase-site.xml?
>
> Before, I go down the manually port configuration route. Just wondering
> whether anyone already done it? To save me some time of random shooting
> :-)
>
> Many thanks. BTW, I did a bit google using 'multiple hbase instances', but
> search results doesn't exactly match this environment.
>
> Demai
>


Re: HBase restart without region reassigning

2016-10-17 Thread Dima Spivak
Hey Alexander,

Could something be amiss in your network settings? Seeing phantom datanodes
could be tripping things up. Are these physical machines or instances in
the cloud?

On Monday, October 17, 2016, Alexander Ilyin  wrote:

> Hi,
>
> We have a 7-node HBase cluster (version 1.1.2) and we change some of its
> settings from time to time which requires a restart. The problem is that
> every time after the restart load balancer reassigns the regions making
> data locality low.
>
> To address this issue we tried the settings described here:
> https://issues.apache.org/jira/browse/HBASE-6389,
> "hbase.master.wait.on.regionservers.interval" in particular. We tried it
> two times in slightly different ways but neither of them worked. First time
> we did a rolling restart (master, then each of datanodes) and we saw 14
> datanodes instead of 7 in Master UI. Half of them had the regions on it
> while the other half was empty. We restarted master only and we got 7 empty
> datanodes in Master UI. After that we rollbacked the setting.
>
> Second time we restarted master and datanodes at the same time but master
> failed to read meta table, moved it to a different datanode and reassigned
> the regions again.
>
> Please advise on how to use hbase.master.wait.on.regionservers.* settings
> properly. Launching major compactions for all the tables after each config
> change seems to be an overkill. Attaching Master server logs with relevant
> lines for two attempts mentioned above.
>
> Thanks in advance.
>


-- 
-Dima


Re: [ANNOUNCE] Stephen Yuan Jiang joins Apache HBase PMC

2016-10-14 Thread Dima Spivak
Congrats, Stephen!

-Dima

On Fri, Oct 14, 2016 at 11:27 AM, Enis Söztutar  wrote:

> On behalf of the Apache HBase PMC, I am happy to announce that Stephen has
> accepted our invitation to become a PMC member of the Apache HBase project.
>
> Stephen has been working on HBase for a couple of years, and is already a
> committer for more than a year. Apart from his contributions in proc v2,
> hbck and other areas, he is also helping for the 2.0 release which is the
> most important milestone for the project this year.
>
> Welcome to the PMC Stephen,
> Enis
>


Re: [Query :] hbase rebalancing the data after adding new nodes in cluster

2016-10-07 Thread Dima Spivak
Yeah, just to reinforce what Ted is saying, DO NOT run HDFS's balancer if
you use HBase. Doing so will move blocks in such a way as to destroy data
locality and negatively impact HBase performance (until a major compaction
in HBase is done).
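
For reference, triggering that major compaction afterwards is a one-liner, either
major_compact 'my_table' from the shell or via the Admin API; a minimal sketch
(the table name is just a placeholder):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class MajorCompactSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      // Asynchronously requests a major compaction; the rewritten HFiles end up local
      // to the RegionServers currently hosting each region, restoring locality.
      admin.majorCompact(TableName.valueOf("my_table"));
    }
  }
}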

On Friday, October 7, 2016, Ted Yu  wrote:

> For #1, it depends on whether major compaction is disabled. If major
> compaction is enabled, timing of major compaction would affect the data
> locality.
>
> For #2, no. hdfs rebalance is orthogonal to hbase.
>
> For #3, perform major compaction at earliest convenience.
>
> On Thu, Oct 6, 2016 at 11:47 PM, Manjeet Singh  >
> wrote:
>
> > Hi All,
> > I have question on re balance, my query is how hbase rebalancing the data
> > after adding new nodes in cluster
> >  > hbase-rebalancing-after-node-additions>
> > ?
> >
> >
> >
> > 1.Do I need to explicitly rebalance hbase after adding the new node
> in
> > cluster?
> >
> > 2.On my cloudera I have hdfs rebalance does its take care of hbase
> data
> > to be balance?
> >
> > 3.What is the best way to make sure that both hadoop and hbase are
> > rebalanced and work fine?
> >
> >
> > Thanks
> >
> > Manjeet
> >
> > --
> > luv all
> >
>


-- 
-Dima


Re: where clause on Phoenix view built on Hbase table throws error

2016-10-05 Thread Dima Spivak
I think you might have better luck with Phoenix questions on the Phoenix
user mailing list. :)

-Dima

On Wed, Oct 5, 2016 at 7:34 AM, Mich Talebzadeh 
wrote:

> Thanks John.
>
> 0: jdbc:phoenix:rhes564:2181> select "Date","volume" from "tsco" where
> "Date" = '1-Apr-08';
> +---+---+
> |   Date|  volume   |
> +---+---+
> | 1-Apr-08  | 49664486  |
> +---+---+
> 1 row selected (0.016 seconds)
>
> BTW I believe double quotes in enclosing phoenix column names are needed
> for case sensitivity on Hbase?
>
>
> Also does Phoenix have type conversion from VARCHAR to integer etc? Is
> there such document
>
> Regards
>
>
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJ
> d6zP6AcPCCdOABUrV8Pw
>  Jd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 5 October 2016 at 15:24, John Leach  wrote:
>
> >
> > Remove the double quotes and try single quote.  Double quotes refers to
> an
> > identifier…
> >
> > Cheers,
> > John Leach
> >
> > > On Oct 5, 2016, at 9:21 AM, Mich Talebzadeh  >
> > wrote:
> > >
> > > Hi,
> > >
> > > I have this Hbase table already populated
> > >
> > > create 'tsco','stock_daily'
> > >
> > > and populated using
> > > $HBASE_HOME/bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv
> > > -Dimporttsv.separator=',' -Dimporttsv.columns="HBASE_ROW_KEY,
> > > stock_info:stock,stock_info:ticker,stock_daily:Date,stock_
> > daily:open,stock_daily:high,stock_daily:low,stock_daily:
> > close,stock_daily:volume"
> > > tsco hdfs://rhes564:9000/data/stocks/tsco.csv
> > > This works OK. In Hbase I have
> > >
> > > hbase(main):176:0> scan 'tsco', LIMIT => 1
> > > ROWCOLUMN+CELL
> > > TSCO-1-Apr-08
> > > column=stock_daily:Date, timestamp=1475525222488, value=1-Apr-08
> > > TSCO-1-Apr-08
> > > column=stock_daily:close, timestamp=1475525222488, value=405.25
> > > TSCO-1-Apr-08
> > > column=stock_daily:high, timestamp=1475525222488, value=406.75
> > > TSCO-1-Apr-08
> > > column=stock_daily:low, timestamp=1475525222488, value=379.25
> > > TSCO-1-Apr-08
> > > column=stock_daily:open, timestamp=1475525222488, value=380.00
> > > TSCO-1-Apr-08
> > > column=stock_daily:stock, timestamp=1475525222488, value=TESCO PLC
> > > TSCO-1-Apr-08
> > > column=stock_daily:ticker, timestamp=1475525222488, value=TSCO
> > > TSCO-1-Apr-08
> > > column=stock_daily:volume, timestamp=1475525222488, value=49664486
> > >
> > > In Phoenix I have a view "tsco" created on Hbase table as follows:
> > >
> > > 0: jdbc:phoenix:rhes564:2181> create view "tsco" (PK VARCHAR PRIMARY
> KEY,
> > > "stock_daily"."Date" VARCHAR, "stock_daily"."close" VARCHAR,
> > > "stock_daily"."high" VARCHAR, "stock_daily"."low" VARCHAR,
> > > "stock_daily"."open" VARCHAR, "stock_daily"."ticker" VARCHAR,
> > > "stock_daily"."stock" VARCHAR, "stock_daily"."volume" VARCHAR)
> > >
> > > So all good.
> > >
> > > This works
> > >
> > > 0: jdbc:phoenix:rhes564:2181> select "Date","volume" from "tsco" limit
> 2;
> > > +---+---+
> > > |   Date|  volume   |
> > > +---+---+
> > > | 1-Apr-08  | 49664486  |
> > > | 1-Apr-09  | 24877341  |
> > > +---+---+
> > > 2 rows selected (0.011 seconds)
> > >
> > > However, I don't seem to be able to use where clause!
> > >
> > > 0: jdbc:phoenix:rhes564:2181> select "Date","volume" from "tsco" where
> > > "Date" = "1-Apr-08";
> > > Error: ERROR 504 (42703): Undefined column. columnName=1-Apr-08
> > > (state=42703,code=504)
> > > org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703):
> > > Undefined column. columnName=1-Apr-08
> > >
> > > Why does it think a predicate "1-Apr-08" is a column.
> > >
> > > Any ideas?
> > >
> > > Thanks
> > >
> > >
> > >
> > > Dr Mich Talebzadeh
> > >
> > >
> > >
> > > LinkedIn * https://www.linkedin.com/profile/view?id=
> > AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> > >  Jd6zP6AcPCCd
> > OABUrV8Pw>*
> > >
> > >
> > >
> > > http://talebzadehmich.wordpress.com
> > >
> > >
> > > *Disclaimer:* Use it at your own risk. Any and all responsibility for
> any
> > > loss, damage or destruction of data or any other property which may
> arise
> > > from relying on this email's technical content is explicitly
> disclaimed.
> > > The author will in no case be liable for any monetary damages arising
> > from
> > > such loss, damage or 

Re: HBase thrift C# impersonation

2016-10-04 Thread Dima Spivak
Hey Kumar,

The ref guide section on enabling security for the Thrift gateway [1] is a
good place to start. Have you gone through that?
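
From memory of that section, hbase.thrift.security.qop on its own only sets the
SASL protection level on the binary transport; the doAs impersonation path
additionally wants the gateway in HTTP mode with proxy-user support turned on,
roughly along these lines in hbase-site.xml (plus matching hadoop.proxyuser.*
entries in core-site.xml for whichever user the Thrift gateway runs as, so treat
this as a sketch to adapt rather than a drop-in config). The client then adds a
doAs=<user-to-impersonate> parameter to its HTTP requests.

<property>
  <name>hbase.regionserver.thrift.http</name>
  <value>true</value>
</property>
<property>
  <name>hbase.thrift.support.proxyuser</name>
  <value>true</value>
</property>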

1. http://hbase.apache.org/book.html#security.gateway.thrift.doas

-Dima

On Tue, Oct 4, 2016 at 4:59 AM, kumar r  wrote:

> Hi,
>
> I need example for C# HBase thrift with doAs header.
>
> First of all, setting the below property isn't enough to enable
> authentication/impersonation?
>
> <property>
>   <name>hbase.thrift.security.qop</name>
>   <value>auth-conf</value>
> </property>
>
> After setting this property, i cannot access HBase via C# thrift. I need
> example to access HBase with doAs via C# thrift client.
>
> Help me to get it work.
>
> Thanks in advance,
> Kumar
>


Re: [apache/incubator-trafodion] [TRAFODION-1519]Use free tool to build windows ODBC (#116)

2016-10-04 Thread Dima Spivak
You hit the HBase user list instead of the Trafodion one. :) Moving
user@hbase.apache.org to bcc.

-Dima

On Tue, Oct 4, 2016 at 8:21 AM, Dave Birdsall 
wrote:

> Forwarding this to the Trafodion user list.
>
>
>
> *From:* helloHuiW [mailto:notificati...@github.com]
> *Sent:* Tuesday, October 4, 2016 6:48 AM
> *To:* apache/incubator-trafodion 
> *Cc:* DaveBirdsall ; Mention <
> ment...@noreply.github.com>
> *Subject:* Re: [apache/incubator-trafodion] [TRAFODION-1519]Use free tool
> to build windows ODBC (#116)
>
>
>
> Who can provide the ODBC Driver for Windows
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
>  issuecomment-251392666>,
> or mute the thread
>  enXUEgeA_ks5qwlkqgaJpZM4GNFgD>
> .
>


Re: Viable approaches to fail over HBase cluster across data centers

2016-09-26 Thread Dima Spivak
Sounds like the problem Ted Malaska was trying to solve with his
multicluster client [1], though not sure that has gone anywhere for a
while.

1. https://github.com/tmalaska/HBase.MCC

On Monday, September 26, 2016, Sreeram  wrote:

> Dear All,
>
>  Please let me know your thoughts on viable approaches to fail over HBase
> cluster across data centers in case of a primary data center outage. The
> deployment scenario has zero data loss as one of the primary design goals.
> Deployment scenario is Active-Passive. In case of active cluster being
> down, there must be zero data loss fail over to the passive cluster.
>
> I understand that the built-in table level replication using 'add_peer'
> might still lead to data loss since it is asynchronous.
>
> As a related note, is there is a way to specify the location (e.g. network
> drive) where HBase WAL files in HDFS need to be written to ? The network
> drive has synchronous replication across data centers. If the WAL files can
> be written to the replicated network drives, can we recover in-flight data
> in the passive cluster and resume operations from there ?
>
> Regards,
> Sreeram
>


-- 
-Dima


Re: Increased response time of hbase calls

2016-09-22 Thread Dima Spivak
Hey Deepak,

Assuming I understand your question, I think you'd be better served
reaching out to MapR directly. Our community isn't involved in M7 so the
average user (or dev) wouldn't know about the ins and outs of that
offering.

On Wednesday, September 21, 2016, Deepak Khandelwal <
dkhandelwal@gmail.com> wrote:

> Hi all
>
> I am facing an issue while accessing data from an hbase m7 table which has
> about 50 million records.
>
> In a single Api request, we make 3 calls to hbase m7.
> 1. Single Multi get to fetch about 30 records
> 2. Single multi-put to update about 500 records
> 3. Single multi-get to fetch about 15 records
>
> We consistently get the response in less than 200 seconds for approx
> 99%calls. We have a tps of about 200 with 8vm's.
> But we get issue everyday between 4pm and 6pm when Api response time gets
> significant increase to from 200ms to 7-8sec. This happens because we have
> a daily batch load That runs between 4and 6pm that puts multiple entries
> into same hbase table.
>
> We are trying to find a solution to this problem that why response time
> increases when batch load runs. We cannot change the time of batch job. Is
> there anything we could do to resolve this issue?any help or pointers would
> be much appreciated. Thanks
>


-- 
-Dima


Re: Loading HBase table into HDFS

2016-09-21 Thread Dima Spivak
Hey Karthik,

This blog post [1] by our very own JD Cryans is a good place to start
understanding bulk load.
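
The short version of the flow described there, assuming a TSV input file and a
target table that already exists (the paths, column names, and table name below
are placeholders to adapt): have ImportTsv write HFiles instead of issuing Puts,
then hand those HFiles to the RegionServers with the completebulkload step.

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf1:col1 \
  -Dimporttsv.bulk.output=hdfs:///tmp/hfiles \
  mytable hdfs:///tmp/input.tsv

hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs:///tmp/hfiles mytable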

1.
http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/

On Wednesday, September 21, 2016, karthi keyan 
wrote:

> Can any one please guide me to load the HBase table in to HDFS with
> specific columnfamily.
>
> thank,
> karthik
>


-- 
-Dima


Re: [Urgent] - HBase - Addition of new region servers - Presplit based

2016-09-20 Thread Dima Spivak
At what rate are you ingesting data, Viswa?

On Monday, September 19, 2016, Viswanathan J 
wrote:

> Thanks Eric.
>
> So addition of new region servers will not impact the regions which I
> splitted as 3 while writing and reading.
>
> After adding 2 region servers I will change the pre-split to 5 for r/w.
>
> On Mon, Sep 19, 2016 at 8:05 PM, Eric Pogrelis  >
> wrote:
>
> > V,
> >
> > Regionservers (what you said you're adding) and regions are two different
> > things. If you are manually splitting such that you have only 3 regions,
> > then if you add additional regionservers, those regionservers may not be
> > serving any regions. To put that a better way, if you have a single table
> > with 3 regions, then there are only 4 regions in total -- your 3-region
> > table + the region for the hbase meta table. So, if you have 5
> > regionservers but only 4 regions to be served, then one regionserver will
> > be idle. To take advantage of the additional regionservers you would want
> > to modify your split such that there are enough regions that all the
> > regionservers have a region to serve.
> >
> > On Mon, Sep 19, 2016 at 9:16 AM, Viswanathan J <
> jayamviswanat...@gmail.com 
> > > wrote:
> >
> >> Thanks Eric for the update.
> >>
> >> Re-balancing will happen automatically, but while writing/reading in
> >> HBase we're forming the rowkey by pre-splitting based on 3 regions. So
> if I
> >> add new regions to the cluster while reading data from 5 rs it will be
> >> impacted because written data only on 3 regions?
> >>
> >> Thanks,
> >> Viswa.J
> >>
> >> On Sun, Sep 18, 2016 at 11:17 PM, Eric Pogrelis  >
> >> wrote:
> >>
> >>> Absent other changes, the existing regions will simply be re-balanced
> >>> across the new region server count of 5.
> >>>
> >>> On Fri, Sep 16, 2016 at 9:46 AM, Viswanathan J <
> >>> jayamviswanat...@gmail.com > wrote:
> >>>
>  Hi,
> 
>  Currently I have 3 region servers and reading/storing data based on
>  pre-split(hashing). If I need to add 2 more region servers how it will
>  impact while reading and writing the data. Because currently pre-split
>  hashing only based on 3 rs.
> 
>  Please help.
> 
>  --
>  Regards,
>  Viswa.J
> 
>  --
> 
>  ---
>  You received this message because you are subscribed to the Google
>  Groups "CDH Users" group.
>  To unsubscribe from this group and stop receiving emails from it, send
>  an email to cdh-user+unsubscr...@cloudera.org .
>  For more options, visit https://groups.google.com/a/cl
>  oudera.org/d/optout.
> 
> >>>
> >>> --
> >>>
> >>> ---
> >>> You received this message because you are subscribed to the Google
> >>> Groups "CDH Users" group.
> >>> To unsubscribe from this group and stop receiving emails from it, send
> >>> an email to cdh-user+unsubscr...@cloudera.org .
> >>> For more options, visit https://groups.google.com/a/cl
> >>> oudera.org/d/optout.
> >>>
> >>
> >>
> >>
> >> --
> >> Regards,
> >> Viswa.J
> >>
> >> --
> >>
> >> ---
> >> You received this message because you are subscribed to the Google
> Groups
> >> "CDH Users" group.
> >> To unsubscribe from this group and stop receiving emails from it, send
> an
> >> email to cdh-user+unsubscr...@cloudera.org .
> >> For more options, visit https://groups.google.com/a/
> cloudera.org/d/optout
> >> .
> >>
> >
> > --
> >
> > ---
> > You received this message because you are subscribed to the Google Groups
> > "CDH Users" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to cdh-user+unsubscr...@cloudera.org .
> > For more options, visit https://groups.google.com/a/
> cloudera.org/d/optout.
> >
>
>
>
> --
> Regards,
> Viswa.J
>


-- 
-Dima


Re: [DISCUSS] Drop the support of jdk7 at a future 1.x release

2016-09-08 Thread Dima Spivak
I'd worry about doing this from both the client-server compatibility side
as well as when it comes to upgrades. Having to go between Java
versions is way scarier for ops people than just swapping JARs.

On Thursday, September 8, 2016, Duo Zhang  wrote:

> The main reason is the asynchronous api we want to introduce in HBase
> today. See HBASE-13784 and HBASE-16505.
>
> The CompletableFuture in java8 is very suitable to use as the return value
> of a async method. We can not use it if we still want to support java7, and
> sadly, there is no candidate which is good enough to replace
> CompletableFuture. ListenableFuture in guava or Promise in netty are good,
> but we do not want to expose third-party classes in our public
> API(especially guava, you know...). And we can also implement our own
> ListenableFuture but it just a copy of guava. Or introduce a simple
> Callback interface which does not need much code(for us) but this is a code
> style around 2000s so people will not like it...
>
> And luckily, I found that in our documentation
>
> http://hbase.apache.org/book.html#basic.prerequisites
>
> We only say that 1.3 will be compatible with jdk7, not all 1.x.
>
> So here I propose that we drop the support of jdk7 in a future 1.x release,
> maybe 1.4? Thus we can use CompletableFuture in both master and branch-1.
>
> Thanks.
>


-- 
-Dima


Re: [ANNOUNCE] Duo Zhang (张铎) joins the Apache HBase PMC

2016-09-06 Thread Dima Spivak
Yay Duo!

On Tuesday, September 6, 2016, Stack  wrote:

> On behalf of the Apache HBase PMC I am pleased to announce that 张铎
> has accepted our invitation to become a PMC member on the Apache
> HBase project. Duo has healthy notions on where the project should be
> headed and over the last year and more has been working furiously to take
> us there.
>
> Please join me in welcoming Duo to the HBase PMC!
>
> One of us!
> St.Ack
>


-- 
-Dima


Re: HBase on docker NotServingRegionException because of hostname alisas

2016-09-05 Thread Dima Spivak
Hey Pierre,

Sorry, I just don't think it's worth the time trying to debug this
framework when a more robust one exists. Perhaps try reaching out to
"kiwenlau?"

-Dima

On Mon, Sep 5, 2016 at 9:49 PM, Pierre Caserta <pierre.case...@gmail.com>
wrote:

> Thanks Dima,
> Now even if I use a network called hadoopnet.com <http://hadoopnet.com/>
> I still have the same problem.
> Here are my regionservers that get detected:
>
> Region Servers
> Base Stats
>  <http://192.168.99.100:33224/master-status#tab_baseStats>Memory
>  <http://192.168.99.100:33224/master-status#tab_memoryStats>Requests
>  <http://192.168.99.100:33224/master-status#tab_requestStats>Storefiles
>  <http://192.168.99.100:33224/master-status#tab_storeStats>Compactions
>  <http://192.168.99.100:33224/master-status#tab_compactStas>
> ServerName  Start time  Version Requests Per Second Num.
> Regions
> hadoop-slave1.hadoopnet.com,16020,1473137128613 <http://hadoop-slave1.
> hadoopnet.com:16030/rs-status>Tue Sep 06 04:45:28 UTC 20161.2.2
>  0   0
> hadoop-slave1.hadoopnet.com.hadoopnet.com,16020,1473137128613 <
> http://hadoop-slave1.hadoopnet.com.hadoopnet.com:60010/rs-status>
> Tue Sep 06 04:45:28 UTC 2016Unknown 0   0
> hadoop-slave2.hadoopnet.com,16020,1473137127975 <http://hadoop-slave2.
> hadoopnet.com:16030/rs-status>Tue Sep 06 04:45:27 UTC 20161.2.2
>  0   0
> hadoop-slave2.hadoopnet.com.hadoopnet.com,16020,1473137127975 <
> http://hadoop-slave2.hadoopnet.com.hadoopnet.com:60010/rs-status>
> Tue Sep 06 04:45:27 UTC 2016Unknown 0   0
> Total:4 2 nodes with inconsistent version   0   0
> instead of just hadoop-slave1.hadoopnet.com,16020,1473137128613 <
> http://hadoop-slave1.hadoopnet.com:16030/rs-status> and
> hadoop-slave2.hadoopnet.com,16020,1473137127975 <http://hadoop-slave2.
> hadoopnet.com:16030/rs-status>
> This is the script I used to start the hadoop cluster
>
> ---
> #!/bin/bash
>
> # the default node number is 3
> N=${1:-3}
>
>
> NETWORK=hadoopnet.com
> docker rm -f zk.$NETWORK &> /dev/null
> echo "start zk container..."
> docker run -p 2181:2181 --name zk.$NETWORK --hostname zk.$NETWORK
> --net=$NETWORK -itd -v conf:/opt/zookeeper/conf -v data:/tmp/zookeeper
> jplock/zookeeper
>
> # start hadoop master container
> docker rm -f hadoop-master.$NETWORK &> /dev/null
> echo "start hadoop-master container..."
> docker run -itd \
> --net=$NETWORK \
> -P \
> --name hadoop-master.$NETWORK \
> --hostname hadoop-master.$NETWORK \
> --add-host zk.$NETWORK:$(docker inspect -f "{{with index
> .NetworkSettings.Networks \"${NETWORK}\"}}{{.IPAddress}}{{end}}"
> zk.$NETWORK) \
> casertap/hhb
>
>
> # start hadoop slave container
> i=1
> while [ $i -lt $N ]
> do
> docker rm -f hadoop-slave$i.$NETWORK &> /dev/null
> echo "start hadoop-slave$i container..."
> docker run -itd \
> --net=$NETWORK \
> --name hadoop-slave$i.$NETWORK \
> --hostname hadoop-slave$i.$NETWORK \
>   --publish-all=false \
>   --add-host hadoop-master.$NETWORK:$(docker inspect -f
> "{{with index .NetworkSettings.Networks \"${NETWORK}\"}}{{.IPAddress}}{{end}}"
> hadoop-master.$NETWORK) \
>   --add-host zk.$NETWORK:$(docker inspect -f "{{with index
> .NetworkSettings.Networks \"${NETWORK}\"}}{{.IPAddress}}{{end}}"
> zk.$NETWORK) \
> casertap/hhb
> i=$(( $i + 1 ))
> done
>
> # get into hadoop master container
> docker exec -it hadoop-master.$NETWORK bash
> ---
>
> Thanks,
> pierre
>
> > On 6 Sep 2016, at 08:47, Dima Spivak <dimaspi...@apache.org> wrote:
> >
> > Sounds good, Pierre. FWIW, if you want a preview, here's how to get a
> > 5-node HBase cluster running based on the master branch of HBase in
> about a
> > minute:
> >
> > 1. Source the clusterdock.sh script that defines the clusterdock_ helper
> > functions: source /dev/stdin <<< "$(curl -sL
> > http://tiny.cloudera.com/clusterdock.sh <http://tiny.cloudera.com/
> clusterdock.sh>)"
> > 2. Start up a cluster: CLUSTERDOCK_TOPOLOGY_IMAGE=
> > hbasejenkinsuser-docker-hbase.bintray.io/dev/clusterdock:
> apache_hbase_topology
> > clusterdock_run ./bin/start_cluster -r
> > hbasejenkinsuser-docker-hbase.bintray.io --namespace dev

Re: HBase on docker NotServingRegionException because of hostname alisas

2016-09-05 Thread Dima Spivak
Sounds good, Pierre. FWIW, if you want a preview, here's how to get a
5-node HBase cluster running based on the master branch of HBase in about a
minute:

1. Source the clusterdock.sh script that defines the clusterdock_ helper
functions: source /dev/stdin <<< "$(curl -sL
http://tiny.cloudera.com/clusterdock.sh)"
2. Start up a cluster: CLUSTERDOCK_TOPOLOGY_IMAGE=hbasejenkinsuser-docker-hbase.bintray.io/dev/clusterdock:apache_hbase_topology clusterdock_run ./bin/start_cluster -r hbasejenkinsuser-docker-hbase.bintray.io --namespace dev apache_hbase --hbase-version=master --hadoop-version=2.7.1 --secondary-nodes='node-{2..5}'

And that's it. Feel free to put a -h for help information (put it right
after the ./bin/start_cluster for details about the function, or after the
apache_hbase for details about the Apache HBase topology).

-Dima

On Mon, Sep 5, 2016 at 3:44 PM, Pierre Caserta <pierre.case...@gmail.com>
wrote:

> Thanks for your answer.
> I will check the ticket https://issues.apache.org/jira/browse/HBASE-15961
> <https://issues.apache.org/jira/browse/HBASE-15961> regularly and try
> clusterdock as soon as the documentation comes out.
> I will try to use hostname with domain like: master.hadoopnet.com <
> http://master.hadoopnet.com/> and network named hadoopnet.com <
> http://hadoopnet.com/> to try if this resolve the problem.
> Currently my hostnames are hadoop-master, hadoop-slave1 and hadoop-slave2,
> maybe that is the problem.
>
> > On 5 Sep 2016, at 23:31, Dima Spivak <dimaspi...@apache.org> wrote:
> >
> > clusterdock uses --net=host for running the framework out of a container,
> > but each Hadoop/HBase cluster itself runs with its own bridge network.
> Just
> > suggesting clusterdock since it's what we now use for testing HBase
> > releases and it looks a bit more sophisticated than this other project
> > (e.g. no need to rebuild images for different cluster sizes).
> >
> > The error you're seeing is caused by not using the FQDN of the containers
> > when referring to them; Docker networks use the network name as the
> domain.
> >
> > On Monday, September 5, 2016, Pierre Caserta <pierre.case...@gmail.com
> <mailto:pierre.case...@gmail.com>>
> > wrote:
> >
> >> That is a good script thanks but I would like to understand exactly what
> >> is the problem with my config without adding another level of
> abstraction
> >> and just running the clusterdock command.
> >> In your script I can see that you are using --net=host. I think this is
> >> the main difference compared to what I am doing which is creating a
> bridge
> >> network for the hadoop cluster.
> >> I have only 3 machines: hadoop-master, hadoop-slave1, hadoop-slave2.
> >>
> >> Why do those strange hadoop-slave2.hadoopnet alias appear in the web ui?
> >> It looks like the network name is used as part of the hostname.
> >> Any idea what it is happening in my case?
> >>
> >> Pierre
> >>
> >>> On 5 Sep 2016, at 16:48, Dima Spivak <dimaspi...@apache.org
> >> <javascript:;>> wrote:
> >>>
> >>> You should try the Apache HBase topology for clusterdock that was
> >> committed
> >>> a few months back. See HBASE-12721 for details.
> >>>
> >>> On Sunday, September 4, 2016, Pierre Caserta <pierre.case...@gmail.com
> <mailto:pierre.case...@gmail.com>
> >> <javascript:;>>
> >>> wrote:
> >>>
> >>>> Hi,
> >>>> I am building a fully distributed hbase cluster with unmanaged
> >> zookeeper.
> >>>> I pretty much used this example and install hbase on top of it:
> >>>> https://github.com/kiwenlau/hadoop-cluster-docker
> >>>>
> >>>> Hadoop and hdfs works fine but I get this exception with hbase:
> >>>>
> >>>>   2016-09-05 06:27:12,268 INFO  [hadoop-master:16000.
> >> activeMasterManager]
> >>>> zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at
> >>>> address=hadoop-slave2,16020,1473052276351,
> exception=org.apache.hadoop.
> >>>> hbase.NotServingRegionException: Region hbase:meta,,1 is not online
> on
> >>>> hadoop-slave2.hadoopnet,16020,1473056813966
> >>>>   at org.apache.hadoop.hbase.regionserver.HRegionServer.
> >>>> getRegionByEncodedName(HRegionServer.java:2910)
> >>>>
> >>>> This is bloking because any command I enter on the hbase shell will
> >> return
> >>>> the following error:
>

Re: HBase on docker NotServingRegionException because of hostname alisas

2016-09-05 Thread Dima Spivak
clusterdock uses --net=host for running the framework out of a container,
but each Hadoop/HBase cluster itself runs with its own bridge network. Just
suggesting clusterdock since it's what we now use for testing HBase
releases and it looks a bit more sophisticated than this other project
(e.g. no need to rebuild images for different cluster sizes).

The error you're seeing is caused by not using the FQDN of the containers
when referring to them; Docker networks use the network name as the domain.
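
As a quick sanity check (the container names below are the ones from your script,
and this is only a rough sketch; getent may or may not be present in the image),
you can confirm which names the embedded DNS actually serves and then make sure
hbase-site.xml, the regionservers file, and hbase.zookeeper.quorum use exactly
those names:

docker exec hadoop-master hostname -f                            # what this container thinks its FQDN is
docker exec hadoop-master getent hosts hadoop-slave1.hadoopnet   # how peers resolve it on the bridge network
docker exec hadoop-master cat /etc/resolv.conf                   # confirms the container uses Docker's embedded DNS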

On Monday, September 5, 2016, Pierre Caserta <pierre.case...@gmail.com>
wrote:

> That is a good script thanks but I would like to understand exactly what
> is the problem with my config without adding another level of abstraction
> and just running the clusterdock command.
> In your script I can see that you are using --net=host. I think this is
> the main difference compared to what I am doing which is creating a bridge
> network for the hadoop cluster.
> I have only 3 machines: hadoop-master, hadoop-slave1, hadoop-slave2.
>
> Why do those strange hadoop-slave2.hadoopnet alias appear in the web ui?
> It looks like the network name is used as part of the hostname.
> Any idea what it is happening in my case?
>
> Pierre
>
> > On 5 Sep 2016, at 16:48, Dima Spivak <dimaspi...@apache.org
> <javascript:;>> wrote:
> >
> > You should try the Apache HBase topology for clusterdock that was
> committed
> > a few months back. See HBASE-12721 for details.
> >
> > On Sunday, September 4, 2016, Pierre Caserta <pierre.case...@gmail.com
> <javascript:;>>
> > wrote:
> >
> >> Hi,
> >> I am building a fully distributed hbase cluster with unmanaged
> zookeeper.
> >> I pretty much used this example and install hbase on top of it:
> >> https://github.com/kiwenlau/hadoop-cluster-docker
> >>
> >> Hadoop and hdfs works fine but I get this exception with hbase:
> >>
> >>2016-09-05 06:27:12,268 INFO  [hadoop-master:16000.
> activeMasterManager]
> >> zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at
> >> address=hadoop-slave2,16020,1473052276351, exception=org.apache.hadoop.
> >> hbase.NotServingRegionException: Region hbase:meta,,1 is not online on
> >> hadoop-slave2.hadoopnet,16020,1473056813966
> >>at org.apache.hadoop.hbase.regionserver.HRegionServer.
> >> getRegionByEncodedName(HRegionServer.java:2910)
> >>
> >> This is bloking because any command I enter on the hbase shell will
> return
> >> the following error:
> >>
> >>ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is
> >> initializing
> >>
> >> The containers are runned using --net=hadoopnet
> >> which is a network create as such:
> >>
> >>docker network create --driver=bridge hadoopnet
> >>
> >> The hbase webui is showing this:
> >>
> >>Region Servers
> >>ServerName  Start time  Version Requests Per Second Num.
> >> Regions
> >>hadoop-slave1,16020,1473056814064   Mon Sep 05 06:26:54 UTC 2016
> >> 1.2.2   0   0
> >>hadoop-slave1.hadoopnet,16020,1473056814064 Mon Sep 05 06:26:54 UTC
> >> 2016Unknown 0   0
> >>hadoop-slave2,16020,1473056813966   Mon Sep 05 06:26:53 UTC 2016
> >> 1.2.2   0   0
> >>hadoop-slave2.hadoopnet,16020,1473056813966 Mon Sep 05 06:26:53 UTC
> >> 2016Unknown 0   0
> >>Total:4 2 nodes with inconsistent version   0   0
> >>
> >> I should have only 2 regionservers but 2 strange hadoop-slave1.hadoopnet
> >> and hadoop-slave2.hadoopnet are added to the list.
> >> When I look at zk using:
> >>
> >>/usr/local/hbase/bin/hbase zkcli -server zk:2181 ls /hbase/rs
> >>
> >> I only see my 2 regionserver: hadoop-slave1,16020,1473056814064 and
> >> hadoop-slave2,16020,1473056813966
> >>
> >> Looking at the zookeeper.MetaTableLocator: Failed verification error I
> see
> >> that  hadoop-slave2,16020,1473052276351 and
> hadoop-slave2.hadoopnet,16020,1473056813966
> >> get mixed up.
> >>
> >> here is my config on all server
> >>
> >>
> >>
> >>
> >>
> >>  
> >>hbase.rootdir
> >>hdfs://hadoop-master:9000/hbase
> >>  The directory shared by region servers. Should
> >> be fully-qualified to include the filesystem to use. E.g:
> >> hdfs:/

Re: HBase on docker NotServingRegionException because of hostname alisas

2016-09-05 Thread Dima Spivak
You should try the Apache HBase topology for clusterdock that was committed
a few months back. See HBASE-12721 for details.

On Sunday, September 4, 2016, Pierre Caserta 
wrote:

> Hi,
> I am building a fully distributed hbase cluster with unmanaged zookeeper.
> I pretty much used this example and install hbase on top of it:
> https://github.com/kiwenlau/hadoop-cluster-docker
>
> Hadoop and hdfs works fine but I get this exception with hbase:
>
> 2016-09-05 06:27:12,268 INFO  [hadoop-master:16000.activeMasterManager]
> zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at
> address=hadoop-slave2,16020,1473052276351, exception=org.apache.hadoop.
> hbase.NotServingRegionException: Region hbase:meta,,1 is not online on
> hadoop-slave2.hadoopnet,16020,1473056813966
> at org.apache.hadoop.hbase.regionserver.HRegionServer.
> getRegionByEncodedName(HRegionServer.java:2910)
>
> This is bloking because any command I enter on the hbase shell will return
> the following error:
>
> ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is
> initializing
>
> The containers are runned using --net=hadoopnet
> which is a network create as such:
>
> docker network create --driver=bridge hadoopnet
>
> The hbase webui is showing this:
>
> Region Servers
> ServerName  Start time  Version Requests Per Second Num.
> Regions
> hadoop-slave1,16020,1473056814064   Mon Sep 05 06:26:54 UTC 2016
> 1.2.2   0   0
> hadoop-slave1.hadoopnet,16020,1473056814064 Mon Sep 05 06:26:54 UTC
> 2016Unknown 0   0
> hadoop-slave2,16020,1473056813966   Mon Sep 05 06:26:53 UTC 2016
> 1.2.2   0   0
> hadoop-slave2.hadoopnet,16020,1473056813966 Mon Sep 05 06:26:53 UTC
> 2016Unknown 0   0
> Total:4 2 nodes with inconsistent version   0   0
>
> I should have only 2 regionservers but 2 strange hadoop-slave1.hadoopnet
> and hadoop-slave2.hadoopnet are added to the list.
> When I look at zk using:
>
> /usr/local/hbase/bin/hbase zkcli -server zk:2181 ls /hbase/rs
>
> I only see my 2 regionserver: hadoop-slave1,16020,1473056814064 and
> hadoop-slave2,16020,1473056813966
>
> Looking at the zookeeper.MetaTableLocator: Failed verification error I see
> that  hadoop-slave2,16020,1473052276351 and 
> hadoop-slave2.hadoopnet,16020,1473056813966
> get mixed up.
>
> here is my config on all server
>
> 
> 
>
> <configuration>
>   <property>
>     <name>hbase.rootdir</name>
>     <value>hdfs://hadoop-master:9000/hbase</value>
>     <description>The directory shared by region servers. Should
>     be fully-qualified to include the filesystem to use. E.g:
>     hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR</description>
>   </property>
>   <property>
>     <name>hbase.master</name>
>     <value>hdfs://hadoop-master:6</value>
>     <description>The host and port that the HBase master runs at.</description>
>   </property>
>   <property>
>     <name>hbase.cluster.distributed</name>
>     <value>true</value>
>     <description>The mode the cluster will be in. Possible values are
>     false: standalone and pseudo-distributed setups with managed Zookeeper
>     true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)</description>
>   </property>
>   <property>
>     <name>hbase.master.info.port</name>
>     <value>60010</value>
>     <description>The UI interface of HBase master runs.</description>
>   </property>
>   <property>
>     <name>hbase.zookeeper.quorum</name>
>     <value>zk</value>
>     <description>string m_e_m_b_e_r_s is replaced by list of hosts separated
>     by comma. Its generated by configure-slaves.sh on master node</description>
>   </property>
>   <property>
>     <name>hbase.zookeeper.property.maxClientCnxns</name>
>     <value>300</value>
>   </property>
>   <property>
>     <name>hbase.zookeeper.property.datadir</name>
>     <value>/tmp/zookeeper</value>
>     <description>location of storage of zookeeper data</description>
>   </property>
>   <property>
>     <name>hbase.zookeeper.property.clientPort</name>
>     <value>2181</value>
>   </property>
> </configuration>
>
> I created a stack overflow question as well: http://stackoverflow.com/questions/39325041/hbase-on-docker-notservingregionexception-because-of-hostname-alisas
>
> Thanks,
> Pierre



-- 
-Dima


Re: [from EMC Isilon] -- pls. review "Why HBase on EMC Isilon" post

2016-09-01 Thread Dima Spivak
Perhaps it'd be useful to have performance benchmarks with an Isilon setup
vs. a non-Isilon setup? I saw a graphic promising 50x better performance
for [ostensibly un-released] "Project Nitro," but don't get any sense of
how much faster things would be for a user today. There's also been a lot
of work in the upcoming Hadoop 3.0 release to support erasure coding and it
might be worthwhile to address this if one of the main selling points of
Isilon is its storage efficiency.


-Dima

On Thu, Sep 1, 2016 at 9:21 AM, Ted Yu  wrote:

> Interesting.
>
> Minor correction:
>
> bq. The locations of all files and regions are kept in a special metadata
> table “*hbase:meta*”
>
> The locations of hfiles are not tracked in hbase:meta
>
>
>
> On Thu, Sep 1, 2016 at 1:52 AM, Chernov, Arseny 
> wrote:
>
> > Dear colleagues at User@HBase ,
> >
> > I really value your time and thank you for attention. I understand that,
> > without context setting, my mail carries a substantial risk to trigger a
> > huge off-top. This is not my intention at all.
> >
> > So, brief context is: we at EMC Isilon see strong uptake of HBase in
> > mid-market and enterprise datacenters, where sustained practices of ITIL,
> > compliance, legacy architectures and cost control , -- oftentimes are not
> > correlating well with true “Big Table” web-scale concepts of this
> brilliant
> > database. We see emergence of small, order of 100-s of TB-s, virtualised
> > HBase deployments, in different pockets of enterprise, as well.
> >
> > I humbly ask you to review and share your feedback (1x1 or in this forum)
> > your PoV-s on “Why HBase on Isilon” blogpost that I’ve just published:
> >
> > Why HBase on EMC Isilon – Top 5 Reasons http://bit.ly/2bSmKgc
> >
> > I’m very keen to learn more from this respected community, as well as
> > possibly answer your questions about EMC Isilon and share (to extent of
> > what could be shared) what we see.
> >
> > --
> > Arseny Chernov
> > 陈毅誠
> >
> >
> >
>


Re: How to deal OutOfOrderScannerNextException

2016-08-31 Thread Dima Spivak
:) Glad it got sorted out.

On Wednesday, August 31, 2016, Kang Minwoo <minwoo.k...@outlook.com> wrote:

> Hi, I found what is problem.
>
>
> In my case, I use "hbase.client.rpc.compressor=
> org.apache.hadoop.io.compress.SnappyCodec" option.
>
> While "org.apache.hadoop.hbase.client.ResultScanner" called next()
> method, IPC Client occured "java.lang.UnsatisfiedLinkError:
> org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z" Exception.
>
>
> But HBase wrap that exception. So I saw "java.lang.RuntimeException:
> org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of
> OutOfOrderScannerNextException: was there a rpc timeout?" error.
>
>
> I set unable "hbase.client.rpc.compressor" option.
>
> My hbase java client works well.
>
>
> Thanks alot for your comment, It is helpful for debug.
>
>
> I have a suggestion. If hbase show raw exception (not wrap), it is more
> helpful for debug.
>
>
> Yours sincerely,
>
> Minwoo
>
> 
> From: Kang Minwoo <minwoo.k...@outlook.com <javascript:;>>
> Sent: Wednesday, August 31, 2016 9:48:07 AM
> To: user@hbase.apache.org <javascript:;>
> Subject: RE: How to deal OutOfOrderScannerNextException
>
> Thanks for your help.
>
> I have to upgrade my client dependencies.
>
>
> I will try!
>
> 
> From: Dima Spivak <dspi...@cloudera.com <javascript:;>>
> Sent: Wednesday, August 31, 2016 9:43:42 AM
> To: user@hbase.apache.org <javascript:;>
> Subject: Re: How to deal OutOfOrderScannerNextException
>
> Hey Minwoo,
>
> Gotcha. Unfortunately, we don't guarantee compatibility between that client
> and the HBase server you're running, so the only way to solve the issue
> will probably be to upgrade your client dependencies.
>
> On Tuesday, August 30, 2016, Kang Minwoo <minwoo.k...@outlook.com
> <javascript:;>> wrote:
>
> > Because I have a number of hbase cluster.
> >
> > They are different version.
> >
> >
> > Legacy Hbase cluster version is 0.96.2-hadoop2.
> >
> > So I have to maintain 0.96.2-hadoop2.
> >
> > 
> > From: Dima Spivak <dspi...@cloudera.com <javascript:;> <javascript:;>>
> > Sent: Wednesday, August 31, 2016 9:32:59 AM
> > To: user@hbase.apache.org <javascript:;> <javascript:;>
> > Subject: Re: How to deal OutOfOrderScannerNextException
> >
> > Any reason to not use the 1.2.2 client library? You're likely hitting a
> > compatibility issue.
> >
> > On Tuesday, August 30, 2016, Kang Minwoo <minwoo.k...@outlook.com
> <javascript:;>
> > <javascript:;>> wrote:
> >
> > > Hi Dima Spivak,
> > >
> > >
> > > Thanks for interesting my problem.
> > >
> > >
> > > Hbase server version is 1.2.2
> > >
> > > Java Hbase library version is 0.96.2-hadoop2 at hbase-client,
> > > 0.96.2-hadoop2 at hbase-hadoop-compat.
> > >
> > >
> > > Here is an excerpt of the code.
> > >
> > > 
> > > 
> > >
> > > 
> > >
> > > ResultScanner rs = keyTable.getScanner(scan); ==> Exception is here.
> > > List<Result> list = new ArrayList<Result>();
> > > try {
> > > for (Result r : rs) {
> > > list.add(r);
> > > }
> > > } finally {
> > > rs.close();
> > > }
> > > return list;
> > > 
> > > 
> > >
> > >
> > > Here is a stacktrace.
> > > 
> > > 
> > > org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of
> > > OutOfOrderScannerNextException: was there a rpc timeout?
> > > at org.apache.hadoop.hbase.client.ClientScanner.next(
> > > ClientScanner.java:384)
> > > at org.apache.hadoop.hbase.client.MetaScanner.metaScan(
> > > MetaScanner.java:177)
> > > at org.apache.hadoop.hbase.client.HConnectionManager$
> > > HConnectionImplementation.prefetchRegionCache(
> > > HConnectionManager.java:1107)
> > > at org.apache.hadoop.hbase.client.HConnectionManager$
> > > HConnectionImplementation.locateRegionInMeta(
> > H

Re: How to deal OutOfOrderScannerNextException

2016-08-30 Thread Dima Spivak
Hey Minwoo,

Gotcha. Unfortunately, we don't guarantee compatibility between that client
and the HBase server you're running, so the only way to solve the issue
will probably be to upgrade your client dependencies.
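
For example, assuming a Maven build, that would mean moving the client artifacts
up to match the server, along the lines of:

<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>1.2.2</version>
</dependency>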

On Tuesday, August 30, 2016, Kang Minwoo <minwoo.k...@outlook.com> wrote:

> Because I have a number of hbase cluster.
>
> They are different version.
>
>
> Legacy Hbase cluster version is 0.96.2-hadoop2.
>
> So I have to maintain 0.96.2-hadoop2.
>
> ________
> From: Dima Spivak <dspi...@cloudera.com <javascript:;>>
> Sent: Wednesday, August 31, 2016 9:32:59 AM
> To: user@hbase.apache.org <javascript:;>
> Subject: Re: How to deal OutOfOrderScannerNextException
>
> Any reason to not use the 1.2.2 client library? You're likely hitting a
> compatibility issue.
>
> On Tuesday, August 30, 2016, Kang Minwoo <minwoo.k...@outlook.com
> <javascript:;>> wrote:
>
> > Hi Dima Spivak,
> >
> >
> > Thanks for interesting my problem.
> >
> >
> > Hbase server version is 1.2.2
> >
> > Java Hbase library version is 0.96.2-hadoop2 at hbase-client,
> > 0.96.2-hadoop2 at hbase-hadoop-compat.
> >
> >
> > Here is an excerpt of the code.
> >
> > 
> > 
> >
> > 
> >
> > ResultScanner rs = keyTable.getScanner(scan); ==> Exception is here.
> > List<Result> list = new ArrayList<Result>();
> > try {
> > for (Result r : rs) {
> > list.add(r);
> > }
> > } finally {
> > rs.close();
> > }
> > return list;
> > 
> > 
> >
> >
> > Here is a stacktrace.
> > 
> > 
> > org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of
> > OutOfOrderScannerNextException: was there a rpc timeout?
> > at org.apache.hadoop.hbase.client.ClientScanner.next(
> > ClientScanner.java:384)
> > at org.apache.hadoop.hbase.client.MetaScanner.metaScan(
> > MetaScanner.java:177)
> > at org.apache.hadoop.hbase.client.HConnectionManager$
> > HConnectionImplementation.prefetchRegionCache(
> > HConnectionManager.java:1107)
> > at org.apache.hadoop.hbase.client.HConnectionManager$
> > HConnectionImplementation.locateRegionInMeta(
> HConnectionManager.java:1167)
> > at org.apache.hadoop.hbase.client.HConnectionManager$
> > HConnectionImplementation.locateRegion(HConnectionManager.java:1059)
> > at org.apache.hadoop.hbase.client.HConnectionManager$
> > HConnectionImplementation.locateRegion(HConnectionManager.java:1016)
> > at org.apache.hadoop.hbase.client.HConnectionManager$
> > HConnectionImplementation.getRegionLocation(HConnectionManager.java:857)
> > at org.apache.hadoop.hbase.client.RegionServerCallable.
> > prepare(RegionServerCallable.java:72)
> > at org.apache.hadoop.hbase.client.ScannerCallable.
> > prepare(ScannerCallable.java:118)
> > at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(
> > RpcRetryingCaller.java:119)
> > at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(
> > RpcRetryingCaller.java:96)
> > at org.apache.hadoop.hbase.client.ClientScanner.
> > nextScanner(ClientScanner.java:264)
> > at org.apache.hadoop.hbase.client.ClientScanner.
> > initializeScannerInConstruction(ClientScanner.java:169)
> > at org.apache.hadoop.hbase.client.ClientScanner.(
> > ClientScanner.java:164)
> > at org.apache.hadoop.hbase.client.ClientScanner.(
> > ClientScanner.java:107)
> > at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:720)
> > at com.my.app.reader.hbase.HBaseReader.getResult(HBaseReader.java:1)
> > 
> > 
> >
> > Yours sincerely,
> > Minwoo
> >
> >
> > 
> > From: Dima Spivak <dspi...@cloudera.com>
> > Sent: Tuesday, August 30, 2016, 11:58:12 PM
> > To: user@hbase.apache.org
> > Subject: Re: How to deal OutOfOrderScannerNextException
> >
> > Hey Minwoo,
> >
> > What version of HBase are you running? Also, can you post an excerpt of
> the
> > code you're trying to run when you get this Exception?
> >
> > 

Re: How to deal OutOfOrderScannerNextException

2016-08-30 Thread Dima Spivak
Any reason to not use the 1.2.2 client library? You're likely hitting a
compatibility issue.
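
If the upgrade takes a while, the generic mitigations for the "was there a rpc
timeout?" flavour of this exception are to give the scanner more time and to ship
fewer rows per RPC. A rough sketch only: the property name below is the one used by
1.x clients (treat it as an assumption for 0.96), keyTable is the table from the
snippet below, and none of this fixes the underlying client/server mismatch:

  // The Configuration must be the one the HTable/connection was created with.
  Configuration conf = HBaseConfiguration.create();
  conf.setInt("hbase.client.scanner.timeout.period", 120000); // 2 minutes per next() call
  Scan scan = new Scan();
  scan.setCaching(100); // fewer rows per round trip keeps each next() call short
  ResultScanner rs = keyTable.getScanner(scan);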

On Tuesday, August 30, 2016, Kang Minwoo <minwoo.k...@outlook.com> wrote:

> Hi Dima Spivak,
>
>
> Thanks for taking an interest in my problem.
>
>
> Hbase server version is 1.2.2
>
> Java Hbase library version is 0.96.2-hadoop2 at hbase-client,
> 0.96.2-hadoop2 at hbase-hadoop-compat.
>
>
> Here is an excerpt of the code.
>
> 
> 
>
> 
>
> ResultScanner rs = keyTable.getScanner(scan); ==> Exception is here.
> List list = new ArrayList();
> try {
> for (Result r : rs) {
> list.add(r);
> }
> } finally {
> rs.close();
> }
> return list;
> 
> 
>
>
> Here is a stacktrace.
> 
> 
> org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of
> OutOfOrderScannerNextException: was there a rpc timeout?
> at org.apache.hadoop.hbase.client.ClientScanner.next(
> ClientScanner.java:384)
> at org.apache.hadoop.hbase.client.MetaScanner.metaScan(
> MetaScanner.java:177)
> at org.apache.hadoop.hbase.client.HConnectionManager$
> HConnectionImplementation.prefetchRegionCache(
> HConnectionManager.java:1107)
> at org.apache.hadoop.hbase.client.HConnectionManager$
> HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1167)
> at org.apache.hadoop.hbase.client.HConnectionManager$
> HConnectionImplementation.locateRegion(HConnectionManager.java:1059)
> at org.apache.hadoop.hbase.client.HConnectionManager$
> HConnectionImplementation.locateRegion(HConnectionManager.java:1016)
> at org.apache.hadoop.hbase.client.HConnectionManager$
> HConnectionImplementation.getRegionLocation(HConnectionManager.java:857)
> at org.apache.hadoop.hbase.client.RegionServerCallable.
> prepare(RegionServerCallable.java:72)
> at org.apache.hadoop.hbase.client.ScannerCallable.
> prepare(ScannerCallable.java:118)
> at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(
> RpcRetryingCaller.java:119)
> at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(
> RpcRetryingCaller.java:96)
> at org.apache.hadoop.hbase.client.ClientScanner.
> nextScanner(ClientScanner.java:264)
> at org.apache.hadoop.hbase.client.ClientScanner.
> initializeScannerInConstruction(ClientScanner.java:169)
> at org.apache.hadoop.hbase.client.ClientScanner.(
> ClientScanner.java:164)
> at org.apache.hadoop.hbase.client.ClientScanner.(
> ClientScanner.java:107)
> at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:720)
> at com.my.app.reader.hbase.HBaseReader.getResult(HBaseReader.java:1)
> 
> 
>
> Yours sincerely,
> Minwoo
>
>
> 
> From: Dima Spivak <dspi...@cloudera.com>
> Sent: Tuesday, August 30, 2016, 11:58:12 PM
> To: user@hbase.apache.org
> Subject: Re: How to deal OutOfOrderScannerNextException
>
> Hey Minwoo,
>
> What version of HBase are you running? Also, can you post an excerpt of the
> code you're trying to run when you get this Exception?
>
> On Tuesday, August 30, 2016, Kang Minwoo <minwoo.k...@outlook.com> wrote:
>
> > Hello Hbase users.
> >
> >
> > While using the HBase client library in Java, I got
> > OutOfOrderScannerNextException.
> >
> > Here is stacktrace.
> >
> >
> > --
> >
> > java.lang.RuntimeException: org.apache.hadoop.hbase.
> DoNotRetryIOException:
> > Failed after retry of OutOfOrderScannerNextException: was there a rpc
> > timeout?
> > org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(
> > AbstractClientScanner.java:94)
> > --
> >
> > This error occurred when using the scan method.
> > When I used the HBase shell, there was no exception.
> > But in the Java client, OutOfOrderScannerNextException occurs when using
> > scan. (The get method is fine.)
> >
> > If someone knows how to deal with OutOfOrderScannerNextException, please share
> > your knowledge.
> > It would be very helpful.
> >
> > Yours sincerely,
> > Minwoo
> >
> >
>
> --
> -Dima
>


-- 
-Dima


Re: ApacheCon Seville CFP closes September 9th

2016-08-30 Thread Dima Spivak
If you're trying to unsubscribe from HBase's mailing lists, go to
https://hbase.apache.org/mail-lists.html and follow the instructions.

On Tuesday, August 30, 2016, Mark Prakash  wrote:

> I am no longer involved in Hadoop, can someone please tell me how to un
> subscribe to these email thread?
>
> On Tue, Aug 30, 2016 at 11:46 AM, Jason Barber  > wrote:
>
> > Unsubscribe
> >
> > > On Aug 30, 2016, at 10:03 AM, Rich Bowen  > wrote:
> > >
> > > It's traditional. We wait for the last minute to get our talk proposals
> > > in for conferences.
> > >
> > > Well, the last minute has arrived. The CFP for ApacheCon Seville closes
> > > on September 9th, which is less than 2 weeks away. It's time to get
> your
> > > talks in, so that we can make this the best ApacheCon yet.
> > >
> > > It's also time to discuss with your developer and user community
> whether
> > > there's a track of talks that you might want to propose, so that you
> > > have more complete coverage of your project than a talk or two.
> > >
> > > For Apache Big Data, the relevant URLs are:
> > > Event details:
> > > http://events.linuxfoundation.org/events/apache-big-data-europe
> > > CFP:
> > > http://events.linuxfoundation.org/events/apache-big-data-
> > europe/program/cfp
> > >
> > > For ApacheCon Europe, the relevant URLs are:
> > > Event details: http://events.linuxfoundation.
> org/events/apachecon-europe
> > > CFP: http://events.linuxfoundation.org/events/apachecon-europe/
> > program/cfp
> > >
> > > This year, we'll be reviewing papers "blind" - that is, looking at the
> > > abstracts without knowing who the speaker is. This has been shown to
> > > eliminate the "me and my buddies" nature of many tech conferences,
> > > producing more diversity, and more new speakers. So make sure your
> > > abstracts clearly explain what you'll be talking about.
> > >
> > > For further updates about ApacheCon, follow us on Twitter, @ApacheCon,
> > > or drop by our IRC channel, #apachecon on the Freenode IRC network.
> > >
> > > --
> > > Rich Bowen
> > > WWW: http://apachecon.com/
> > > Twitter: @ApacheCon
> >
>


-- 
-Dima


Re: How to deal OutOfOrderScannerNextException

2016-08-30 Thread Dima Spivak
Hey Minwoo,

What version of HBase are you running? Also, can you post an excerpt of the
code you're trying to run when you get this Exception?

On Tuesday, August 30, 2016, Kang Minwoo  wrote:

> Hello Hbase users.
>
>
> While using the HBase client library in Java, I got
> OutOfOrderScannerNextException.
>
> Here is stacktrace.
>
>
> --
>
> java.lang.RuntimeException: org.apache.hadoop.hbase.DoNotRetryIOException:
> Failed after retry of OutOfOrderScannerNextException: was there a rpc
> timeout?
> org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(
> AbstractClientScanner.java:94)
> --
>
> This error occurred when using the scan method.
> When I used the HBase shell, there was no exception.
> But in the Java client, OutOfOrderScannerNextException occurs when using
> scan. (The get method is fine.)
>
> If someone knows how to deal with OutOfOrderScannerNextException, please share
> your knowledge.
> It would be very helpful.
>
> Yours sincerely,
> Minwoo
>
>

-- 
-Dima


Re: Avro schema getting changed dynamically

2016-08-30 Thread Dima Spivak
Moving dev@ to bcc. Please don't email the developer mailing list with
questions about how to set up HBase for your use case.
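
That said, the Avro side of this is usually handled with schema resolution rather than
anything HBase-specific: keep decoding old cells with the schema they were written
with, and give any newly added field (like "timestamp") a "default" in the new schema
so old records can still be read with it. A minimal sketch with the Avro Java API
(variable names here are placeholders):

  Schema writerSchema = new Schema.Parser().parse(oldSchemaJson); // schema used at write time
  Schema readerSchema = new Schema.Parser().parse(newSchemaJson); // new schema; added fields need a "default"
  GenericDatumReader<GenericRecord> reader =
      new GenericDatumReader<GenericRecord>(writerSchema, readerSchema);
  Decoder decoder = DecoderFactory.get().binaryDecoder(valueBytesFromHBase, null);
  GenericRecord record = reader.read(null, decoder);

The usual companion to this is storing a small schema id or version alongside the value
so the right writer schema can be looked up at read time.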

On Monday, August 29, 2016, Manjeet Singh 
wrote:

> I want to add a few more points
>
> I am using Java native Api for Hbase get/put
>
> and below is the example
>
> Assume I have the schema below and I am inserting data using this schema
> into HBase, but later I have new values coming and I need to fit them
> into my schema; for this I need to create a new schema, as shown in the
> example below.
> In example 2 I have a new field, timestamp.
>
> example 1
> { "type" : "record", "name" : "twitter_schema", "namespace" :
> "com.miguno.avro", "fields" : [ { "name" : "username", "type" : "string",
> "doc" : "Name of the user account on Twitter.com" }, { "name" : "tweet",
> "type" : "string", "doc" : "The content of the user's Twitter message" } ]
> }
>
>
> example 2
>
>
> { "type" : "record", "name" : "twitter_schema", "namespace" :
> "com.miguno.avro", "fields" : [ { "name" : "username", "type" : "string",
> "doc" : "Name of the user account on Twitter.com" }, { "name" : "tweet",
> "type" : "string", "doc" : "The content of the user's Twitter message" }, {
> "name" : "timestamp", "type" : "long", "doc" : "Unix epoch time in seconds"
> } ], }
>
>
>
>
> Thanks
> Manjeet
>
>
>
>
>
> On Tue, Aug 30, 2016 at 11:47 AM, Manjeet Singh <
> manjeet.chand...@gmail.com >
> wrote:
>
> > Hi All,
> >
> > I have a use case to put data in Avro format into HBase. I have frequent
> > read/write operations, but that is not a problem.
> >
> > The problem is: what if my Avro schema gets changed, how should I deal with it?
> > Keep in mind: what about the older data already inserted in HBase now that we
> > have a new schema?
> >
> > can anyone suggest me solution for the same
> >
> > Thanks
> > Manjeet
> >
> > --
> > luv all
> >
>
>
>
> --
> luv all
>


-- 
-Dima


Re: HBase for Small Key Value Tables

2016-08-29 Thread Dima Spivak
(Though if it is only 7 GB, why not just store it in memory?)

On Sunday, August 28, 2016, Dima Spivak <dspi...@cloudera.com> wrote:

> If your data can all fit on one machine, HBase is not the best choice. I
> think you'd be better off using a simpler solution for small data and leave
> HBase for use cases that require proper clusters.
>
> On Sunday, August 28, 2016, Manish Maheshwari <mylogi...@gmail.com
> <javascript:_e(%7B%7D,'cvml','mylogi...@gmail.com');>> wrote:
>
>> We dont want to invest into another DB like Dynamo, Cassandra and Already
>> are in the Hadoop Stack. Managing another DB would be a pain. Why HBase
>> over RDMS, is because we call HBase via Spark Streaming to lookup the
>> keys.
>>
>> Manish
>>
>> On Mon, Aug 29, 2016 at 1:47 PM, Dima Spivak <dspi...@cloudera.com>
>> wrote:
>>
>> > Hey Manish,
>> >
>> > Just to ask the naive question, why use HBase if the data fits into
>> such a
>> > small table?
>> >
>> > On Sunday, August 28, 2016, Manish Maheshwari <mylogi...@gmail.com>
>> wrote:
>> >
>> > > Hi,
>> > >
>> > > We have a scenario where HBase is used like a Key Value Database to
>> map
>> > > Keys to Regions. We have over 5 Million Keys, but the table size is
>> less
>> > > than 7 GB. The read volume is pretty high - About 50x of the
>> put/delete
>> > > volume. This causes hot spotting on the Data Node and the region is
>> not
>> > > split. We cannot change the maxregionsize parameter as that will
>> impact
>> > > other tables too.
>> > >
>> > > Our idea is to manually inspect the row key ranges and then split the
>> > > region manually and assign them to different region servers. We will
>> > > continue to then monitor the rows in one region to see if needs to be
>> > > split.
>> > >
>> > > Any experience of doing this on HBase. Is this a recommended approach?
>> > >
>> > > Thanks,
>> > > Manish
>> > >
>> >
>> >
>> > --
>> > -Dima
>> >
>>
>
>
> --
> -Dima
>
>

-- 
-Dima


Re: HBase for Small Key Value Tables

2016-08-29 Thread Dima Spivak
If your data can all fit on one machine, HBase is not the best choice. I
think you'd be better off using a simpler solution for small data and leave
HBase for use cases that require proper clusters.

On Sunday, August 28, 2016, Manish Maheshwari <mylogi...@gmail.com> wrote:

> We dont want to invest into another DB like Dynamo, Cassandra and Already
> are in the Hadoop Stack. Managing another DB would be a pain. Why HBase
> over RDMS, is because we call HBase via Spark Streaming to lookup the keys.
>
> Manish
>
> On Mon, Aug 29, 2016 at 1:47 PM, Dima Spivak <dspi...@cloudera.com
> <javascript:;>> wrote:
>
> > Hey Manish,
> >
> > Just to ask the naive question, why use HBase if the data fits into such
> a
> > small table?
> >
> > On Sunday, August 28, 2016, Manish Maheshwari <mylogi...@gmail.com
> <javascript:;>> wrote:
> >
> > > Hi,
> > >
> > > We have a scenario where HBase is used like a Key Value Database to map
> > > Keys to Regions. We have over 5 Million Keys, but the table size is
> less
> > > than 7 GB. The read volume is pretty high - About 50x of the put/delete
> > > volume. This causes hot spotting on the Data Node and the region is not
> > > split. We cannot change the maxregionsize parameter as that will impact
> > > other tables too.
> > >
> > > Our idea is to manually inspect the row key ranges and then split the
> > > region manually and assign them to different region servers. We will
> > > continue to then monitor the rows in one region to see if needs to be
> > > split.
> > >
> > > Any experience of doing this on HBase. Is this a recommended approach?
> > >
> > > Thanks,
> > > Manish
> > >
> >
> >
> > --
> > -Dima
> >
>


-- 
-Dima


Re: HBase for Small Key Value Tables

2016-08-28 Thread Dima Spivak
Hey Manish,

Just to ask the naive question, why use HBase if the data fits into such a
small table?
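
If HBase does stay in the picture, two things may help without touching other tables:
hbase.hregion.max.filesize can be overridden per table (MAX_FILESIZE), and a region can
be split manually at a chosen row key. A sketch, assuming the 1.x Admin API; the table
name and split key below are placeholders:

  try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
       Admin admin = conn.getAdmin()) {
      TableName table = TableName.valueOf("key_value_table");

      // Per-table MAX_FILESIZE (here 1 GB) instead of changing hbase.hregion.max.filesize globally.
      HTableDescriptor desc = admin.getTableDescriptor(table);
      desc.setMaxFileSize(1024L * 1024L * 1024L);
      admin.modifyTable(table, desc);

      // Manual split at a known hot point in the key space.
      admin.split(table, Bytes.toBytes("middle_row_key"));
  }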

On Sunday, August 28, 2016, Manish Maheshwari  wrote:

> Hi,
>
> We have a scenario where HBase is used like a Key Value Database to map
> Keys to Regions. We have over 5 Million Keys, but the table size is less
> than 7 GB. The read volume is pretty high - About 50x of the put/delete
> volume. This causes hot spotting on the Data Node and the region is not
> split. We cannot change the maxregionsize parameter as that will impact
> other tables too.
>
> Our idea is to manually inspect the row key ranges and then split the
> region manually and assign them to different region servers. We will
> continue to then monitor the rows in one region to see if needs to be
> split.
>
> Any experience of doing this on HBase. Is this a recommended approach?
>
> Thanks,
> Manish
>


-- 
-Dima


Re: Hbase Heap Size problem and Native API response is slow

2016-08-28 Thread Dima Spivak
And what kind of performance do you see vs. what you expect to see? How big
is your cluster in production/how much total data will you be storing in
production?

On Sunday, August 28, 2016, Manjeet Singh 
wrote:

> Hi
> I performed this testing on 2 node cluster where its i7 core processor with
> 16 gb ram 8 core on each node.
>
> I have very frequent get put operation on hbase using spark streaming and
> sql where we r aggregate data on spark group and saving it to hbase
> Can you give us more specifics about what kind of performance you're
> expecting, Manjeet, and what kind of performance you're actually seeing?
> Also, how big is your cluster (i.e. number of nodes, amount of RAM/CPU per
> node)? It's also important to realize that performance can be impacted by the
> write patterns of the data you're trying to query; if compactions haven't
> occurred at the time that you try to do your reads, HBase may have to go to
> disk repeatedly to access HFiles, even when only accessing columns within
> one row.
>
> On Sat, Aug 27, 2016 at 11:12 AM, Manjeet Singh <
> manjeet.chand...@gmail.com >
> wrote:
>
> > Thanks Vladrodionov for your reply
> > I took this design from twiter where a rowkey is twitter id and twites
> and
> > hastag in column
> >
> > I hv mob no or ip by which domain visited in column qualifyer.
> >
> > can you plz tell me how can I index my row key with qualam idk how many
> > column I hv
> > On 27 Aug 2016 22:21, "Vladimir Rodionov"  > wrote:
> >
> > > >> Problem is its very slow
> > >
> > > rows are not indexed by column qualifier, and you need to scan all of
> > them.
> > > I suggest you consider different row-key design or
> > > add additional index-table for your table.
> > >
> > > -Vlad
> > >
> > > On Sat, Aug 27, 2016 at 4:12 AM, Manjeet Singh <
> > manjeet.chand...@gmail.com 
> > > >
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > can anybody suggest me the improvement in my below code
> > > > Purpose os this code to get column qualifier by prefix scan
> > > > Problem is its very slow
> > > >
> > > >
> > > > public static ArrayList getColumnQualifyerByPrefixScan
> (String
> > > > rowKey, String prefix) {
> > > >
> > > > ArrayList list = null;
> > > > try {
> > > >
> > > > FilterList filterList = new FilterList(FilterList.
> > > Operator.MUST_PASS_ALL);
> > > > Filter filterB = new QualifierFilter(CompareFilter.CompareOp.EQUAL,
> > > > new BinaryPrefixComparator(Bytes.toBytes(prefix)));
> > > > filterList.addFilter(filterB);
> > > >
> > > > list = new ArrayList();
> > > >
> > > > Get get1 = new Get(rowKey.getBytes());
> > > > get1.setFilter(filterList);
> > > > Result rs1 = hTable.get(get1);
> > > > int i = 0;
> > > > for (KeyValue kv : rs1.raw()) {
> > > >  list.add(new String(kv.getQualifier()) + " ");
> > > > }
> > > > } catch (Exception e) {
> > > > //System.out.println(e.getMessage());
> > > >
> > > > }
> > > > return list;
> > > > }
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, Aug 26, 2016 at 7:56 PM, Manjeet Singh <
> > > manjeet.chand...@gmail.com 
> > > > >
> > > > wrote:
> > > >
> > > > > Hi All
> > > > >
> > > > > I am using wide table approach where I have might have more
> > 1,00,
> > > > > column qualifier
> > > > >
> > > > > I am getting problem as below
> > > > > Heap size problem by using scan on shell , as a solution I increase
> > > java
> > > > > heap size by using cloudera manager to 4 GB
> > > > >
> > > > >
> > > > > second I have below Native API code It took very long time to
> process
> > > can
> > > > > any one help me on same?
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > public static ArrayList getColumnQualifyerByPrefixScan
> > (String
> > > > > rowKey, String prefix) {
> > > > >
> > > > > ArrayList list = null;
> > > > > try {
> > > > >
> > > > > FilterList filterList = new FilterList(FilterList.
> > > > Operator.MUST_PASS_ALL);
> > > > > Filter filterB = new QualifierFilter(CompareFilter.
> CompareOp.EQUAL,
> > > > > new BinaryPrefixComparator(Bytes.toBytes(prefix)));
> > > > > filterList.addFilter(filterB);
> > > > >
> > > > > list = new ArrayList();
> > > > >
> > > > > Get get1 = new Get(rowKey.getBytes());
> > > > > get1.setFilter(filterList);
> > > > > Result rs1 = hTable.get(get1);
> > > > > int i = 0;
> > > > > for (KeyValue kv : rs1.raw()) {
> > > > > list.add(new String(kv.getQualifier()) + " ");
> > > > > }
> > > > > } catch (Exception e) {
> > > > > //System.out.println(e.getMessage());
> > > > >
> > > > > }
> > > > > return list;
> > > > > }
> > > > >
> > > > > Thanks
> > > > > Manjeet
> > > > > --
> > > > > luv all
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > luv all
> > > >
> > >
> >
>
>
>
> --
> -Dima
>


-- 
-Dima


Re: Hbase Heap Size problem and Native API response is slow

2016-08-27 Thread Dima Spivak
Can you give us more specifics about what kind of performance you're
expecting, Manjeet, and what kind of performance you're actually seeing?
Also, how big is your cluster (i.e. number of nodes, amount of RAM/CPU per
node)? It's also important to realize that performance can be impacted by the
write patterns of the data you're trying to query; if compactions haven't
occurred at the time that you try to do your reads, HBase may have to go to
disk repeatedly to access HFiles, even when only accessing columns within
one row.
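
On the code itself, ColumnPrefixFilter is usually a better fit than QualifierFilter
plus BinaryPrefixComparator here, since it can seek to the prefix inside the row
instead of checking every cell. A sketch against the 1.x client API, reusing the
hTable, rowKey and prefix from the snippet quoted below:

  Get get = new Get(Bytes.toBytes(rowKey));
  get.setFilter(new ColumnPrefixFilter(Bytes.toBytes(prefix)));
  Result result = hTable.get(get);
  List<String> qualifiers = new ArrayList<String>();
  for (Cell cell : result.rawCells()) {
      // Copy out just the qualifier bytes of each matching cell.
      qualifiers.add(Bytes.toString(CellUtil.cloneQualifier(cell)));
  }

With 100,000+ qualifiers in one row it can also be worth paging through the row with
Scan.setBatch() rather than pulling everything back in a single Get.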

On Sat, Aug 27, 2016 at 11:12 AM, Manjeet Singh 
wrote:

> Thanks Vladimir Rodionov for your reply.
> I took this design from Twitter, where the row key is the Twitter id and the
> tweets and hashtags are in columns.
>
> I have a mobile number or IP as the row key, and the domains visited are in the
> column qualifiers.
>
> Can you please tell me how I can index my row key by qualifier? I don't know how
> many columns I have.
> On 27 Aug 2016 22:21, "Vladimir Rodionov"  wrote:
>
> > >> Problem is its very slow
> >
> > rows are not indexed by column qualifier, and you need to scan all of
> them.
> > I suggest you consider different row-key design or
> > add additional index-table for your table.
> >
> > -Vlad
> >
> > On Sat, Aug 27, 2016 at 4:12 AM, Manjeet Singh <
> manjeet.chand...@gmail.com
> > >
> > wrote:
> >
> > > Hi All,
> > >
> > > Can anybody suggest an improvement to my code below?
> > > The purpose of this code is to get column qualifiers by prefix scan.
> > > The problem is that it is very slow.
> > >
> > >
> > > public static ArrayList<String> getColumnQualifyerByPrefixScan(String
> > > rowKey, String prefix) {
> > >
> > > ArrayList<String> list = null;
> > > try {
> > >
> > > FilterList filterList = new FilterList(FilterList.
> > Operator.MUST_PASS_ALL);
> > > Filter filterB = new QualifierFilter(CompareFilter.CompareOp.EQUAL,
> > > new BinaryPrefixComparator(Bytes.toBytes(prefix)));
> > > filterList.addFilter(filterB);
> > >
> > > list = new ArrayList<String>();
> > >
> > > Get get1 = new Get(rowKey.getBytes());
> > > get1.setFilter(filterList);
> > > Result rs1 = hTable.get(get1);
> > > int i = 0;
> > > for (KeyValue kv : rs1.raw()) {
> > >  list.add(new String(kv.getQualifier()) + " ");
> > > }
> > > } catch (Exception e) {
> > > //System.out.println(e.getMessage());
> > >
> > > }
> > > return list;
> > > }
> > >
> > >
> > >
> > >
> > >
> > > On Fri, Aug 26, 2016 at 7:56 PM, Manjeet Singh <
> > manjeet.chand...@gmail.com
> > > >
> > > wrote:
> > >
> > > > Hi All
> > > >
> > > > I am using wide table approach where I have might have more
> 1,00,
> > > > column qualifier
> > > >
> > > > I am getting problem as below
> > > > Heap size problem by using scan on shell , as a solution I increase
> > java
> > > > heap size by using cloudera manager to 4 GB
> > > >
> > > >
> > > > second I have below Native API code It took very long time to process
> > can
> > > > any one help me on same?
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > public static ArrayList getColumnQualifyerByPrefixScan
> (String
> > > > rowKey, String prefix) {
> > > >
> > > > ArrayList list = null;
> > > > try {
> > > >
> > > > FilterList filterList = new FilterList(FilterList.
> > > Operator.MUST_PASS_ALL);
> > > > Filter filterB = new QualifierFilter(CompareFilter.CompareOp.EQUAL,
> > > > new BinaryPrefixComparator(Bytes.toBytes(prefix)));
> > > > filterList.addFilter(filterB);
> > > >
> > > > list = new ArrayList();
> > > >
> > > > Get get1 = new Get(rowKey.getBytes());
> > > > get1.setFilter(filterList);
> > > > Result rs1 = hTable.get(get1);
> > > > int i = 0;
> > > > for (KeyValue kv : rs1.raw()) {
> > > > list.add(new String(kv.getQualifier()) + " ");
> > > > }
> > > > } catch (Exception e) {
> > > > //System.out.println(e.getMessage());
> > > >
> > > > }
> > > > return list;
> > > > }
> > > >
> > > > Thanks
> > > > Manjeet
> > > > --
> > > > luv all
> > > >
> > >
> > >
> > >
> > > --
> > > luv all
> > >
> >
>



-- 
-Dima


Re: Accessing different HBase versions from the same JVM

2016-08-26 Thread Dima Spivak
Sadly, there is no "easy" way to do it (blame filesystem changes and the
rpc differences, among other things).

A while back, someone posted about how he was able to do a snapshot export
between 0.94 and 0.98 [1] but this is not officially supported. Perhaps
someone else has ideas?

1.
http://mail-archives.apache.org/mod_mbox/hbase-user/201412.mbox/%3cc18855da-cb0f-4500-8d35-c79f78106...@digitalenvoy.net%3E

On Friday, August 26, 2016, Enrico Olivelli - Diennea <
enrico.olive...@diennea.com> wrote:

> Thank you Dima for your quick answer.
>
> Do you think that it would be possible to create a shaded version of the
> 0.94  client (with all the dependencies) and let it live inside the same
> JVM of a pure 1.2.2 client ?
>
> My real need is to copy data from a 0.94 cluster to a new 1.2.2
> installation, but temporary continuing to read from 0.94 in order not to
> provide down time
>
> what is the best way to copy data from a 0.94 cluster to a new cluster of
> different hbase major versions ?
>
> can you give me some link ?
>
> Thanks
> Enrico
>
>
> Il giorno ven, 26/08/2016 alle 00.04 -0700, Dima Spivak ha scritto:
>
> I would say no; 0.94 is not wire compatible with 1.2.2  because the former
> uses Hadoop IPC and the latter uses protocol buffers. Sorry, Enrico.
>
> On Friday, August 26, 2016, Enrico Olivelli - Diennea <
> enrico.olive...@diennea.com> wrote:
>
>
>
> Hi,
> I would like to connect to both a 0.94 hbase cluster and a 1.2.2 hbase
> cluster from the same JVM
> I think that 0.94 client code is not compatible with 1.2.2
>
> do you think it is possible ?
>
> Thank you
>
>
> --
> Enrico Olivelli
> Software Development Manager @Diennea
> Tel.: (+39) 0546 066100 - Int. 925
> Viale G.Marconi 30/14 - 48018 Faenza (RA)
>
> MagNews - E-mail Marketing Solutions
> http://www.magnews.it
> Diennea - Digital Marketing Solutions
> http://www.diennea.com
>
>
> 
>
> Iscriviti alla nostra newsletter per rimanere aggiornato su digital ed
> email marketing! http://www.magnews.it/newsletter/
>
> The information in this email is confidential and may be legally
> privileged. If you are not the intended recipient please notify the sender
> immediately and destroy this email. Any unauthorized, direct or indirect,
> disclosure, copying, storage, distribution or other use is strictly
> forbidden.
>
>
>
>
>
>
>
> --
> Enrico Olivelli
> Software Development Manager @Diennea
> Tel.: (+39) 0546 066100 - Int. 925
> Viale G.Marconi 30/14 - 48018 Faenza (RA)
>
> MagNews - E-mail Marketing Solutions
> http://www.magnews.it
> Diennea - Digital Marketing Solutions
> http://www.diennea.com
>
>
> 
>
> Iscriviti alla nostra newsletter per rimanere aggiornato su digital ed
> email marketing! http://www.magnews.it/newsletter/
>
> The information in this email is confidential and may be legally
> privileged. If you are not the intended recipient please notify the sender
> immediately and destroy this email. Any unauthorized, direct or indirect,
> disclosure, copying, storage, distribution or other use is strictly
> forbidden.
>


-- 
-Dima


Re: Accessing different HBase versions from the same JVM

2016-08-26 Thread Dima Spivak
I would say no; 0.94 is not wire compatible with 1.2.2  because the former
uses Hadoop IPC and the latter uses protocol buffers. Sorry, Enrico.

On Friday, August 26, 2016, Enrico Olivelli - Diennea <
enrico.olive...@diennea.com> wrote:

> Hi,
> I would like to connect to both a 0.94 hbase cluster and a 1.2.2 hbase
> cluster from the same JVM
> I think that 0.94 client code is not compatible with 1.2.2
>
> do you think it is possible ?
>
> Thank you
>
>
> --
> Enrico Olivelli
> Software Development Manager @Diennea
> Tel.: (+39) 0546 066100 - Int. 925
> Viale G.Marconi 30/14 - 48018 Faenza (RA)
>
> MagNews - E-mail Marketing Solutions
> http://www.magnews.it
> Diennea - Digital Marketing Solutions
> http://www.diennea.com
>
>
> 
>
> Iscriviti alla nostra newsletter per rimanere aggiornato su digital ed
> email marketing! http://www.magnews.it/newsletter/
>
> The information in this email is confidential and may be legally
> privileged. If you are not the intended recipient please notify the sender
> immediately and destroy this email. Any unauthorized, direct or indirect,
> disclosure, copying, storage, distribution or other use is strictly
> forbidden.
>


-- 
-Dima


Re: HBase: Copy some data from One Table to Another

2016-08-25 Thread Dima Spivak
Hi George,

All rows? Or just a handful of rows? How big is your table?
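
Either way, for a handful of rows a client-side scan-and-put is usually enough, and for
whole tables the bundled org.apache.hadoop.hbase.mapreduce.CopyTable job is the usual
tool. A sketch of the former, assuming a 1.x client; the table names and row bounds are
placeholders, and both tables must already exist:

  try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
       Table source = conn.getTable(TableName.valueOf("source_table"));
       Table dest = conn.getTable(TableName.valueOf("dest_table"))) {
      Scan scan = new Scan(Bytes.toBytes("startRow"), Bytes.toBytes("stopRow"));
      try (ResultScanner scanner = source.getScanner(scan)) {
          for (Result result : scanner) {
              Put put = new Put(result.getRow());
              for (Cell cell : result.rawCells()) {
                  put.add(cell); // keeps family, qualifier, timestamp and value as-is
              }
              dest.put(put);
          }
      }
  }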

On Thursday, August 25, 2016, GEORGE, MURALIDHARAN  wrote:

> Hello Support Team
>
> Greetings!
>
> I have a question in HBase:
>
> I want to copy data from some rows in one HBase table to another HBase
> table. Please let me know.
> Thanks!
>
> Best Regards
> G. Murali
> Visitors Data Patterns Team
>
>

-- 
-Dima


Re: Hbase table size with replicas

2016-08-25 Thread Dima Spivak
That version of HDFS didn't have the change. See HADOOP-6857 for details.

On Thursday, August 25, 2016, marjana  wrote:

> Hm I only see one number:
>
> 2.4 M  /apps/hbase/data/data/default/FACT_AMERICAN
>
> This is 2.3.0 version.
>
>
>
> --
> View this message in context: http://apache-hbase.679495.n3.
> nabble.com/Hbase-table-size-with-replicas-tp4082087p4082089.html
> Sent from the HBase User mailing list archive at Nabble.com.
>


-- 
-Dima


Re: How to get Last 1000 records from 1 millions records

2016-08-24 Thread Dima Spivak
Hey Manjeet,

How much data are you actually trying to get the last 1000 records for? If
you're dealing at the scale of only millions of rows, HBase may not be the
best choice for this type of problem.

On Wed, Aug 24, 2016 at 12:05 PM, Manjeet Singh 
wrote:

> Hi all
>
> HBase doesn't provide sorting on columns, but row keys are stored in sorted
> form, with smaller values first and greater values last, for
>
> example
> 1
> 2
> 3
> 4
> 5
> 6
> 7
> and so on
>
> Assume I have 1 million records but I want to look at the last 1000 records only.
> Is there any way to do this? I don't want to perform any calculation on the
> client side, so maybe a filter can help with it?
>
> Thanks
> Manjeet
>
> --
> luv all
>



-- 
-Dima


Re: How to backport MOB to Hbase 1.2.2

2016-08-21 Thread Dima Spivak
Hey Anil,

No, you're totally right; CDH 5.4 shipped with MOB, but on an HBase based
on the upstream 1.0 release. I can tell you firsthand that the time and
effort undertaken at Cloudera and Intel to make it production-ready (and
convince ourselves of that through rigorous testing) was pretty
significant, so someone looking to "roll their own" based on an Apache
release is in for some long nights.

On Sunday, August 21, 2016, anil gupta <anilgupt...@gmail.com> wrote:

> Hi Dima,
>
> I was under impression that some CDH5.x GA release shipped MOB. Is that
> wrong?
>
> Thanks,
> Anil
>
> On Sat, Aug 20, 2016 at 10:48 PM, Dima Spivak <dspi...@cloudera.com> wrote:
>
> > Nope, you'd be in uncharted territory there, my friend, and definitely
> not
> > in a place that would be production-ready. Sorry to be the bearer of bad
> > news :(.
> >
> > On Saturday, August 20, 2016, Ascot Moss <ascot.m...@gmail.com> wrote:
> >
> > > I have read HBASE-15370.   We have to wait quite a while for HBase 2.0,
> > >  this is the reason why I want to try out MOB now in HBase 1.2.2 in my
> > test
> > > environment, any steps and guide to do the backport?
> > >
> > >
> > > On Sun, Aug 21, 2016 at 12:44 PM, Dima Spivak <dspi...@cloudera.com> wrote:
> > >
> > > > Hi Ascot,
> > > >
> > > > MOB won't be backported into any pre-2.0 HBase branch. HBASE-15370
> > > tracked
> > > > the effort and an email thread on the dev list ("[DISCUSS] Criteria
> for
> > > > including MOB feature backport in branch-1" started by Ted Yu on
> March
> > > 3rd
> > > > of this year) has additional rationale as to why that is.
> > > >
> > > > Cheers,
> > > >
> > > > On Saturday, August 20, 2016, Ascot Moss <ascot.m...@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I want to use MOB in Hbase 1.2.2, can anyone advise the step to
> > > backport
> > > > > MOB to HBase 1.2.2?
> > > > >
> > > > > Regards
> > > > >
> > > >
> > > >
> > > > --
> > > > -Dima
> > > >
> > >
> >
> >
> > --
> > -Dima
> >
>
>
>
> --
> Thanks & Regards,
> Anil Gupta
>


-- 
-Dima


Re: How to backport MOB to Hbase 1.2.2

2016-08-20 Thread Dima Spivak
Nope, you'd be in uncharted territory there, my friend, and definitely not
in a place that would be production-ready. Sorry to be the bearer of bad
news :(.

On Saturday, August 20, 2016, Ascot Moss <ascot.m...@gmail.com> wrote:

> I have read HBASE-15370.   We have to wait quite a while for HBase 2.0,
>  this is the reason why I want to try out MOB now in HBase 1.2.2 in my test
> environment, any steps and guide to do the backport?
>
>
> On Sun, Aug 21, 2016 at 12:44 PM, Dima Spivak <dspi...@cloudera.com> wrote:
>
> > Hi Ascot,
> >
> > MOB won't be backported into any pre-2.0 HBase branch. HBASE-15370
> tracked
> > the effort and an email thread on the dev list ("[DISCUSS] Criteria for
> > including MOB feature backport in branch-1" started by Ted Yu on March
> 3rd
> > of this year) has additional rationale as to why that is.
> >
> > Cheers,
> >
> > On Saturday, August 20, 2016, Ascot Moss <ascot.m...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I want to use MOB in Hbase 1.2.2, can anyone advise the step to
> backport
> > > MOB to HBase 1.2.2?
> > >
> > > Regards
> > >
> >
> >
> > --
> > -Dima
> >
>


-- 
-Dima


How to backport MOB to Hbase 1.2.2

2016-08-20 Thread Dima Spivak
Hi Ascot,

MOB won't be backported into any pre-2.0 HBase branch. HBASE-15370 tracked
the effort and an email thread on the dev list ("[DISCUSS] Criteria for
including MOB feature backport in branch-1" started by Ted Yu on March 3rd
of this year) has additional rationale as to why that is.

Cheers,

On Saturday, August 20, 2016, Ascot Moss wrote:

> Hi,
>
> I want to use MOB in Hbase 1.2.2, can anyone advise the step to backport
> MOB to HBase 1.2.2?
>
> Regards
>


-- 
-Dima


Re: Hbase federated cluster for messages

2016-08-20 Thread Dima Spivak
Yup.

On Saturday, August 20, 2016, Alexandr Porunov <alexandr.poru...@gmail.com>
wrote:

> So, will it be ok if we have 80 data nodes (8TB on each node) and only one
> namenode? Will it works for the messaging system? We will have 2x
> replication so there are 320 TB of data (per year) (640 TB with
> replication). 13 R+W ops/sec. Each message 100 bytes or 1024 bytes.
> Is it possible to handle such load with hbase?
>
> Sincerely,
> Alexandr
>
> On Sat, Aug 20, 2016 at 8:44 AM, Dima Spivak <dspi...@cloudera.com
> <javascript:;>> wrote:
>
> > You can easily store that much data as long as you don't have small
> files,
> > which is typically why people turn to federation.
> >
> > -Dima
> >
> > On Friday, August 19, 2016, Alexandr Porunov <alexandr.poru...@gmail.com
> <javascript:;>>
> > wrote:
> >
> > > We are talking about facebook. So, there are 25 TB per month. 15
> billion
> > > messages with 1024 bytes and 120 billion messages with 100 bytes per
> > month.
> > >
> > > I thought that they used only hbase to handle such a huge data If they
> > used
> > > their own implementation of hbase then I haven't questions.
> > >
> > > Sincerely,
> > > Alexandr
> > >
> > > On Sat, Aug 20, 2016 at 1:39 AM, Dima Spivak <dspi...@cloudera.com
> <javascript:;>
> > > <javascript:;>> wrote:
> > >
> > > > I'd +1 what Vladimir says. How much data (in TBs/PBs) and how many
> > files
> > > > are we talking about here? I'd say that use cases that benefit from
> > HBase
> > > > don't tend to hit the kind of HDFS file limits that federation seeks
> to
> > > > address.
> > > >
> > > > -Dima
> > > >
> > > > On Fri, Aug 19, 2016 at 2:19 PM, Vladimir Rodionov <
> > > vladrodio...@gmail.com <javascript:;> <javascript:;>
> > > > >
> > > > wrote:
> > > >
> > > > > FB has its own "federation". It is a proprietary code, I presume.
> > > > >
> > > > > -Vladimir
> > > > >
> > > > >
> > > > > On Fri, Aug 19, 2016 at 1:22 PM, Alexandr Porunov <
> > > > > alexandr.poru...@gmail.com <javascript:;> <javascript:;>> wrote:
> > > > >
> > > > > > No. There isn't. But I want to figure out how to configure that
> > type
> > > of
> > > > > > cluster in the case if there is particular reason. How facebook
> can
> > > > > handle
> > > > > > such a huge amount of ops without federation? I don't think that
> > they
> > > > > just
> > > > > > have one namenode server and one standby namenode server. It
> isn't
> > > > > > possible. I am sure that they use federation.
> > > > > >
> > > > > > On Fri, Aug 19, 2016 at 10:08 PM, Vladimir Rodionov <
> > > > > > vladrodio...@gmail.com <javascript:;> <javascript:;>>
> > > > > > wrote:
> > > > > >
> > > > > > > >> I am not sure how to do it but I have to configure federated
> > > > cluster
> > > > > > > with
> > > > > > > >> hbase to store huge amount of messages (client to client)
> (40%
> > > > > writes,
> > > > > > > 60%
> > > > > > > >> reads).
> > > > > > >
> > > > > > > Any particular reason for federated cluster? How huge is huge
> > > amount
> > > > > and
> > > > > > > what is the message size?
> > > > > > >
> > > > > > > -Vladimir
> > > > > > >
> > > > > > > On Fri, Aug 19, 2016 at 11:57 AM, Dima Spivak <
> > > dspi...@cloudera.com <javascript:;> <javascript:;>>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > As far as I know, HBase doesn't support spreading tables
> across
> > > > > > > namespaces;
> > > > > > > > you'd have to point it at one namenode at a time. I've heard
> of
> > > > > people
> > > > > > > > trying to run multiple HBase instances in order to get access
> > to
> > > > all
> > > > > 

Re: Hbase federated cluster for messages

2016-08-19 Thread Dima Spivak
You can easily store that much data as long as you don't have small files,
which is typically why people turn to federation.

-Dima

On Friday, August 19, 2016, Alexandr Porunov <alexandr.poru...@gmail.com>
wrote:

> We are talking about facebook. So, there are 25 TB per month. 15 billion
> messages with 1024 bytes and 120 billion messages with 100 bytes per month.
>
> I thought that they used only hbase to handle such a huge data If they used
> their own implementation of hbase then I haven't questions.
>
> Sincerely,
> Alexandr
>
> On Sat, Aug 20, 2016 at 1:39 AM, Dima Spivak <dspi...@cloudera.com
> <javascript:;>> wrote:
>
> > I'd +1 what Vladimir says. How much data (in TBs/PBs) and how many files
> > are we talking about here? I'd say that use cases that benefit from HBase
> > don't tend to hit the kind of HDFS file limits that federation seeks to
> > address.
> >
> > -Dima
> >
> > On Fri, Aug 19, 2016 at 2:19 PM, Vladimir Rodionov <
> vladrodio...@gmail.com <javascript:;>
> > >
> > wrote:
> >
> > > FB has its own "federation". It is a proprietary code, I presume.
> > >
> > > -Vladimir
> > >
> > >
> > > On Fri, Aug 19, 2016 at 1:22 PM, Alexandr Porunov <
> > > alexandr.poru...@gmail.com <javascript:;>> wrote:
> > >
> > > > No. There isn't. But I want to figure out how to configure that type
> of
> > > > cluster in the case if there is particular reason. How facebook can
> > > handle
> > > > such a huge amount of ops without federation? I don't think that they
> > > just
> > > > have one namenode server and one standby namenode server. It isn't
> > > > possible. I am sure that they use federation.
> > > >
> > > > On Fri, Aug 19, 2016 at 10:08 PM, Vladimir Rodionov <
> > > > vladrodio...@gmail.com <javascript:;>>
> > > > wrote:
> > > >
> > > > > >> I am not sure how to do it but I have to configure federated
> > cluster
> > > > > with
> > > > > >> hbase to store huge amount of messages (client to client) (40%
> > > writes,
> > > > > 60%
> > > > > >> reads).
> > > > >
> > > > > Any particular reason for federated cluster? How huge is huge
> amount
> > > and
> > > > > what is the message size?
> > > > >
> > > > > -Vladimir
> > > > >
> > > > > On Fri, Aug 19, 2016 at 11:57 AM, Dima Spivak <
> dspi...@cloudera.com <javascript:;>>
> > > > > wrote:
> > > > >
> > > > > > As far as I know, HBase doesn't support spreading tables across
> > > > > namespaces;
> > > > > > you'd have to point it at one namenode at a time. I've heard of
> > > people
> > > > > > trying to run multiple HBase instances in order to get access to
> > all
> > > > > their
> > > > > > HDFS data, but it doesn't tend to be much fun.
> > > > > >
> > > > > > -Dima
> > > > > >
> > > > > > On Fri, Aug 19, 2016 at 11:51 AM, Alexandr Porunov <
> > > > > > alexandr.poru...@gmail.com <javascript:;>> wrote:
> > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > > I am not sure how to do it but I have to configure federated
> > > cluster
> > > > > with
> > > > > > > hbase to store huge amount of messages (client to client) (40%
> > > > writes,
> > > > > > 60%
> > > > > > > reads). Does somebody have any idea or examples how to
> configure
> > > it?
> > > > > > >
> > > > > > > Of course we can configure hdfs in a federated mode but as for
> me
> > > it
> > > > > > isn't
> > > > > > > suitable for hbase. If we want to save message from client 1 to
> > > > client
> > > > > 2
> > > > > > in
> > > > > > > the hbase cluster then how hbase know in which namespace it
> have
> > to
> > > > > save
> > > > > > > it? Which namenode will be responsible for that message? How we
> > can
> > > > > read
> > > > > > > client messages?
> > > > > > >
> > > > > > > Give me any ideas, please
> > > > > > >
> > > > > > > Sincerely,
> > > > > > > Alexandr
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > -Dima
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > -Dima
> >
>


-- 
-Dima


Re: Hbase federated cluster for messages

2016-08-19 Thread Dima Spivak
I'd +1 what Vladimir says. How much data (in TBs/PBs) and how many files
are we talking about here? I'd say that use cases that benefit from HBase
don't tend to hit the kind of HDFS file limits that federation seeks to
address.

-Dima

On Fri, Aug 19, 2016 at 2:19 PM, Vladimir Rodionov <vladrodio...@gmail.com>
wrote:

> FB has its own "federation". It is a proprietary code, I presume.
>
> -Vladimir
>
>
> On Fri, Aug 19, 2016 at 1:22 PM, Alexandr Porunov <
> alexandr.poru...@gmail.com> wrote:
>
> > No. There isn't. But I want to figure out how to configure that type of
> > cluster in the case if there is particular reason. How facebook can
> handle
> > such a huge amount of ops without federation? I don't think that they
> just
> > have one namenode server and one standby namenode server. It isn't
> > possible. I am sure that they use federation.
> >
> > On Fri, Aug 19, 2016 at 10:08 PM, Vladimir Rodionov <
> > vladrodio...@gmail.com>
> > wrote:
> >
> > > >> I am not sure how to do it but I have to configure federated cluster
> > > with
> > > >> hbase to store huge amount of messages (client to client) (40%
> writes,
> > > 60%
> > > >> reads).
> > >
> > > Any particular reason for federated cluster? How huge is huge amount
> and
> > > what is the message size?
> > >
> > > -Vladimir
> > >
> > > On Fri, Aug 19, 2016 at 11:57 AM, Dima Spivak <dspi...@cloudera.com>
> > > wrote:
> > >
> > > > As far as I know, HBase doesn't support spreading tables across
> > > namespaces;
> > > > you'd have to point it at one namenode at a time. I've heard of
> people
> > > > trying to run multiple HBase instances in order to get access to all
> > > their
> > > > HDFS data, but it doesn't tend to be much fun.
> > > >
> > > > -Dima
> > > >
> > > > On Fri, Aug 19, 2016 at 11:51 AM, Alexandr Porunov <
> > > > alexandr.poru...@gmail.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I am not sure how to do it but I have to configure federated
> cluster
> > > with
> > > > > hbase to store huge amount of messages (client to client) (40%
> > writes,
> > > > 60%
> > > > > reads). Does somebody have any idea or examples how to configure
> it?
> > > > >
> > > > > Of course we can configure hdfs in a federated mode but as for me
> it
> > > > isn't
> > > > > suitable for hbase. If we want to save message from client 1 to
> > client
> > > 2
> > > > in
> > > > > the hbase cluster then how hbase know in which namespace it have to
> > > save
> > > > > it? Which namenode will be responsible for that message? How we can
> > > read
> > > > > client messages?
> > > > >
> > > > > Give me any ideas, please
> > > > >
> > > > > Sincerely,
> > > > > Alexandr
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > -Dima
> > > >
> > >
> >
>



-- 
-Dima


Re: Hbase federated cluster for messages

2016-08-19 Thread Dima Spivak
As far as I know, HBase doesn't support spreading tables across namespaces;
you'd have to point it at one namenode at a time. I've heard of people
trying to run multiple HBase instances in order to get access to all their
HDFS data, but it doesn't tend to be much fun.

-Dima

On Fri, Aug 19, 2016 at 11:51 AM, Alexandr Porunov <
alexandr.poru...@gmail.com> wrote:

> Hello,
>
> I am not sure how to do it but I have to configure federated cluster with
> hbase to store huge amount of messages (client to client) (40% writes, 60%
> reads). Does somebody have any idea or examples how to configure it?
>
> Of course we can configure hdfs in a federated mode but as for me it isn't
> suitable for hbase. If we want to save message from client 1 to client 2 in
> the hbase cluster then how hbase know in which namespace it have to save
> it? Which namenode will be responsible for that message? How we can read
> client messages?
>
> Give me any ideas, please
>
> Sincerely,
> Alexandr
>



-- 
-Dima


Re: Hbase Row key lock

2016-08-17 Thread Dima Spivak
Row locks on the client side were deprecated in 0.94 (see HBASE-7341) and
removed in 0.96 (see HBASE-7315). As you note, they could lead to deadlocks
and also had problems when region moves or splits occurred.

Is there a specific reason you're looking for this functionality, Manjeet?
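
For what it's worth, single-row operations are already atomic without any explicit
lock, and read-modify-write cases are usually covered by checkAndPut/checkAndDelete,
Increment, Append or mutateRow. A sketch against a 1.x client; the table, family and
values below are placeholders:

  Put put = new Put(Bytes.toBytes("rowKey"));
  put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("status"), Bytes.toBytes("UPDATED"));

  // Apply the Put only if cf:status still holds "NEW"; returns false if another writer got there first.
  boolean applied = table.checkAndPut(
      Bytes.toBytes("rowKey"),
      Bytes.toBytes("cf"), Bytes.toBytes("status"),
      Bytes.toBytes("NEW"),
      put);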

-Dima

On Tuesday, August 16, 2016, Manjeet Singh 
wrote:

> Hi All
>
> Can anyone help me with how, and in which version, HBase supports row key
> locks?
> I have seen an article about row key locks, but it was about the 0.94 version. It
> said that if a row key does not exist and an update request comes for that row
> key, then in this case HBase holds the lock for 60 seconds.
>
> Currently I am using HBase version 1.2.2.
>
> Thanks
> Manjeet
>
>
>
> --
> luv all
>


-- 
-Dima


Re: (BUG)ShortCircuitLocalReads Failed when enabled replication

2016-08-11 Thread Dima Spivak
Hey Yang,

Looks like HDFS is having trouble with a block. Have you tried running
hadoop fsck?

-Dima

On Thursday, August 11, 2016, Ming Yang  wrote:

> The cluster enabled shortCircuitLocalReads.
> 
> dfs.client.read.shortcircuit
> true
> 
>
> When enabled replication,we found a large number of error logs.
> 1.shortCircuitLocalReads(fail everytime).
> 2.Try reading via the datanode on targetAddr(success).
> How to make shortCircuitLocalReads successfully when enabled replication?
>
> 2016-08-03 10:46:21,721 DEBUG
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
> Opening
> log for replication dn7%2C60020%2C1470136216957.1470192327030 at 16999670
> 2016-08-03 10:46:21,723 WARN org.apache.hadoop.hdfs.DFSClient:
> BlockReaderLocal requested with incorrect offset: Offset 0 and length
> 17073479 don't match block blk_4137524355009640437_53760530 ( blockLen
> 16999670 )
> 2016-08-03 10:46:21,723 WARN org.apache.hadoop.hdfs.DFSClient:
> BlockReaderLocal: Removing blk_4137524355009640437_53760530 from cache
> because local file
> /sdd/hdfs/dfs/data/blocksBeingWritten/blk_4137524355009640437 could not be
> opened.
> 2016-08-03 10:46:21,724 INFO org.apache.hadoop.hdfs.DFSClient: Failed to
> read block blk_4137524355009640437_53760530 on local
> machinejava.io.IOException: Offset 0 and length 17073479 don't match block
> blk_4137524355009640437_53760530 ( blockLen 16999670 )
> at org.apache.hadoop.hdfs.BlockReaderLocal.(
> BlockReaderLocal.java:287)
> at
> org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(
> BlockReaderLocal.java:171)
> at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(
> DFSClient.java:358)
> at org.apache.hadoop.hdfs.DFSClient.access$800(DFSClient.java:74)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.
> blockSeekTo(DFSClient.java:2073)
> at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(
> DFSClient.java:2224)
> at java.io.DataInputStream.read(DataInputStream.java:149)
> at java.io.DataInputStream.readFully(DataInputStream.java:195)
> at java.io.DataInputStream.readFully(DataInputStream.java:169)
> at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)
> at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1486)
> at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1475)
> at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1470)
> at
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$
> WALReader.(SequenceFileLogReader.java:55)
> at
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(
> SequenceFileLogReader.java:178)
> at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:734)
> at
> org.apache.hadoop.hbase.replication.regionserver.
> ReplicationHLogReaderManager.openReader(ReplicationHLogReaderManager.
> java:69)
> at
> org.apache.hadoop.hbase.replication.regionserver.
> ReplicationSource.openReader(ReplicationSource.java:574)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(
> ReplicationSource.java:364)
> 2016-08-03 10:46:21,724 INFO org.apache.hadoop.hdfs.DFSClient: Try reading
> via the datanode on /192.168.7.139:50010
>


-- 
-Dima


Re: Blk Load via Collection

2016-08-06 Thread Dima Spivak
Hey Manjeet,

Let me move dev@ to bcc and add user@ as the main recipient. You're most
likely to get good advice from HBase users who might well have faced this
question themselves. :)
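
One thing worth checking in the meantime is how the writes are batched: very large
List<Put> objects are often better handed to a BufferedMutator, which buffers and
flushes in manageable chunks. A sketch against a 1.x client; the table and column
names are placeholders:

  try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
       BufferedMutator mutator = conn.getBufferedMutator(TableName.valueOf("my_table"))) {
      for (int i = 0; i < 1000000; i++) {
          Put put = new Put(Bytes.toBytes("row-" + i));
          put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value-" + i));
          mutator.mutate(put); // buffered client-side, flushed automatically as the buffer fills
      }
      mutator.flush();
  }

For bulk reads, Table.get(List<Get>) does batch the lookups; keeping each batch to a
few thousand Gets avoids holding very large result arrays in memory at once.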

Cheers,
  Dima

On Saturday, August 6, 2016, Manjeet Singh 
wrote:

> Hi All
>
> I am writing Java native API code for HBase responsible for bulk read/write
> operations.
> For this I used a List of Puts and a List of Gets, comparing and then putting the
> updated list.
> I am facing two problems. At some point in time my system hangs; it seems I am
> missing some configuration which might be responsible for some cache (my
> understanding).
>
> Second, I am using it to improve performance in terms of mass updates, but it is
> actually degrading the performance.
>
> Can anyone suggest the correct approach or configuration?
>
> Thanks
> Manjeet
>
> --
> luv all
>


-- 
-Dima


Re: How to configure HBase in a HA mode?

2016-08-05 Thread Dima Spivak
Glad you got it sorted out. Happy HBase-ing! :)

-Dima

On Friday, August 5, 2016, Alexandr Porunov <alexandr.poru...@gmail.com>
wrote:

> Hello Dima,
>
> I figured out what it was. The problem was with the old znode which hasn't
> been configured properly. I've removed /hbase znode (rmr /hbase) and
> restarted hbase. Now it works properly.
>
> Thanks again for the help
>
> Sincerely,
> Alexandr
>
> On Fri, Aug 5, 2016 at 2:35 PM, Alexandr Porunov <alexandr.poru...@gmail.com> wrote:
>
> > Hello Dima,
> >
> > I have 4 nodes
> > hadoopActiveMaster - Zookeeper, NN active master, journal, zkfc
> > hadoopStandby - Zookeeper, NN standby master, journal, zkfc
> > hadoopSlave1 - Zookeeper, data node, journal
> > hadoopSlave2 - data node
> >
> > /etc/hosts - http://paste.openstack.org/show/550399/
> > /usr/hadoop/etc/hadoop/hdfs-site.xml - http://paste.openstack.org/
> > show/550396/
> > /usr/hadoop/etc/hadoop/core-site.xml - http://paste.openstack.org/
> > show/550397/
> > /usr/hbase/conf/hbase-site.xml - http://paste.openstack.org/show/550398/
> > /usr/hbase/logs/hbase-hadoop-master-hadoopActiveMaster.log -
> > http://paste.openstack.org/show/550402/
> >
> > Also I have put in "/usr/hbase/conf/hbase-env.sh":
> > export HBASE_MANAGES_ZK=false
> >
> > Best regards,
> > Alexandr
> >
> > On Fri, Aug 5, 2016 at 2:05 PM, Dima Spivak <dspi...@cloudera.com> wrote:
> >
> >> Hey Alexandr,
> >>
> >> What does your hbase-site and hdfs-site look like? Wanna upload them to
> >> Gist or something similar and then paste a link?
> >>
> >> -Dima
> >>
> >> On Friday, August 5, 2016, Alexandr Porunov <alexandr.poru...@gmail.com> wrote:
> >>
> >> > Hello Dima,
> >> >
> >> > Thank you for advice. But the problem haven't disappeared. When I
> start
> >> > HMaster on nn1 and nn2 nodes they work but when I try to connect to
> the
> >> nn1
> >> > (http://nn1:16010/) HMaster on nn1 crashes. HMaster on nn2 continue
> be
> >> > available via http://nn2:16010/ . Don't you know why it is happens?
> >> >
> >> > Here is my logs from nn1:
> >> >
> >> > Fri Aug  5 13:04:20 EEST 2016 Starting master on nn1
> >> > core file size  (blocks, -c) 0
> >> > data seg size   (kbytes, -d) unlimited
> >> > scheduling priority (-e) 0
> >> > file size   (blocks, -f) unlimited
> >> > pending signals (-i) 3904
> >> > max locked memory   (kbytes, -l) 64
> >> > max memory size (kbytes, -m) unlimited
> >> > open files  (-n) 1024
> >> > pipe size(512 bytes, -p) 8
> >> > POSIX message queues (bytes, -q) 819200
> >> > real-time priority  (-r) 0
> >> > stack size  (kbytes, -s) 8192
> >> > cpu time   (seconds, -t) unlimited
> >> > max user processes  (-u) 3904
> >> > virtual memory  (kbytes, -v) unlimited
> >> > file locks  (-x) unlimited
> >> > 2016-08-05 13:04:21,531 INFO  [main] util.VersionInfo: HBase 1.1.5
> >> > 2016-08-05 13:04:21,531 INFO  [main] util.VersionInfo: Source code
> >> > repository git://diocles.local/Volumes/hbase-1.1.5/hbase
> >> > revision=239b80456118175b340b2e562a5568b5c744252e
> >> > 2016-08-05 13:04:21,531 INFO  [main] util.VersionInfo: Compiled by
> >> ndimiduk
> >> > on Sun May  8 20:29:26 PDT 2016
> >> > 2016-08-05 13:04:21,531 INFO  [main] util.VersionInfo: From source
> with
> >> > checksum 7ad8dc6c5daba19e4aab081181a2457d
> >> > 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> >> > env:PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:
> >> > /usr/java/default/bin:/usr/hadoop/bin:/usr/hadoop/sbin:/
> usr/hadoop/bin
> >> > 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> >> > env:HISTCONTROL=ignoredups
> >> > 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> >> > env:HBASE_REGIONSERVER_OPTS= -XX:PermSize=128m -XX:MaxPermSize=128m
> >> > 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> >> > env:MAIL=/var/spool/mail/hadoop
> >> > 2016-08-05 13:04:22,244 INFO

Re: How to configure HBase in a HA mode?

2016-08-05 Thread Dima Spivak
Hey Alexandr,

What does your hbase-site and hdfs-site look like? Wanna upload them to
Gist or something similar and then paste a link?

-Dima

On Friday, August 5, 2016, Alexandr Porunov 
wrote:

> Hello Dima,
>
> Thank you for advice. But the problem haven't disappeared. When I start
> HMaster on nn1 and nn2 nodes they work but when I try to connect to the nn1
> (http://nn1:16010/) HMaster on nn1 crashes. HMaster on nn2 continue be
> available via http://nn2:16010/ . Don't you know why it is happens?
>
> Here is my logs from nn1:
>
> Fri Aug  5 13:04:20 EEST 2016 Starting master on nn1
> core file size  (blocks, -c) 0
> data seg size   (kbytes, -d) unlimited
> scheduling priority (-e) 0
> file size   (blocks, -f) unlimited
> pending signals (-i) 3904
> max locked memory   (kbytes, -l) 64
> max memory size (kbytes, -m) unlimited
> open files  (-n) 1024
> pipe size(512 bytes, -p) 8
> POSIX message queues (bytes, -q) 819200
> real-time priority  (-r) 0
> stack size  (kbytes, -s) 8192
> cpu time   (seconds, -t) unlimited
> max user processes  (-u) 3904
> virtual memory  (kbytes, -v) unlimited
> file locks  (-x) unlimited
> 2016-08-05 13:04:21,531 INFO  [main] util.VersionInfo: HBase 1.1.5
> 2016-08-05 13:04:21,531 INFO  [main] util.VersionInfo: Source code
> repository git://diocles.local/Volumes/hbase-1.1.5/hbase
> revision=239b80456118175b340b2e562a5568b5c744252e
> 2016-08-05 13:04:21,531 INFO  [main] util.VersionInfo: Compiled by ndimiduk
> on Sun May  8 20:29:26 PDT 2016
> 2016-08-05 13:04:21,531 INFO  [main] util.VersionInfo: From source with
> checksum 7ad8dc6c5daba19e4aab081181a2457d
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:
> /usr/java/default/bin:/usr/hadoop/bin:/usr/hadoop/sbin:/usr/hadoop/bin
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:HISTCONTROL=ignoredups
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:HBASE_REGIONSERVER_OPTS= -XX:PermSize=128m -XX:MaxPermSize=128m
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:MAIL=/var/spool/mail/hadoop
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:LD_LIBRARY_PATH=:/usr/hadoop/lib
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:LOGNAME=hadoop
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:HBASE_REST_OPTS=
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:PWD=/usr/hadoop
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:HADOOP_PREFIX=/usr/hadoop
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:HBASE_ROOT_LOGGER=INFO,RFA
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:LESSOPEN=||/usr/bin/lesspipe.sh %s
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:SHELL=/bin/bash
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:HBASE_ENV_INIT=true
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:HBASE_MASTER_OPTS= -XX:PermSize=128m -XX:MaxPermSize=128m
> 2016-08-05 13:04:22,244 INFO  [main] util.ServerCommandLine:
> env:HBASE_MANAGES_ZK=false
> 2016-08-05 13:04:22,245 INFO  [main] util.ServerCommandLine:
> env:HBASE_NICENESS=0
> 2016-08-05 13:04:22,245 INFO  [main] util.ServerCommandLine:
> env:HBASE_OPTS=-XX:+UseConcMarkSweepGC   -XX:PermSize=128m
> -XX:MaxPermSize=128m -Dhbase.log.dir=/usr/hbase/bin/../logs
> -Dhbase.log.file=hbase-hadoop-master-nn1.log
> -Dhbase.home.dir=/usr/hbase/bin/.. -Dhbase.id.str=hadoop
> -Dhbase.root.logger=INFO,RFA -Djava.library.path=/usr/hadoop/lib
> -Dhbase.security.logger=INFO,RFAS
> 2016-08-05 13:04:22,245 INFO  [main] util.ServerCommandLine:
> env:HBASE_START_FILE=/tmp/hbase-hadoop-master.autorestart
> 2016-08-05 13:04:22,245 INFO  [main] util.ServerCommandLine:
> env:HBASE_SECURITY_LOGGER=INFO,RFAS
> 2016-08-05 13:04:22,245 INFO  [main] util.ServerCommandLine:
> env:LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;
> 35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;
> 41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=
> 01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=
> 01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.
> tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:
> *.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.
> lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.
> tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:
> *.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:
> *.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:
> *.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;
> 35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=
> 

Re: How to configure HBase in a HA mode?

2016-08-04 Thread Dima Spivak
Hey Alexandr,

In that case, you'd use what you have set in your hdfs-site.xml for
the dfs.nameservices property (followed by the HBase directory under HDFS).
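
For illustration only, assuming the nameservice in hdfs-site.xml is called
"mycluster", the hbase-site.xml entry would look roughly like:

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://mycluster/hbase</value>
</property>

Note there is no host:port in the value; HBase also needs to see the HA client
settings (e.g. by having hdfs-site.xml on its classpath) so it can resolve the
logical name to whichever NameNode is currently active.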

-Dima

On Thu, Aug 4, 2016 at 12:54 PM, Alexandr Porunov <
alexandr.poru...@gmail.com> wrote:

> Hello,
>
> I don't understand one parameter from hbase-site.xml :
>
> <property>
>   <name>hbase.rootdir</name>
>   <value>hdfs://hdfsHost:8020/hbase</value>
> </property>
>
> What we have to put in that parameter if we configured HDFS cluster in HA
> mode? I mean we have 2 name nodes (nn1, nn2) and 2 data nodes (dn1, dn2)
> then which node we have to use in "hbase.rootdir" parameter?
>
> The most logical answer is the name node which is currently active. But if
> we will use active name node and it fails then hbase cluster becomes
> unavailable even if our nn2 will change its status to active. Hbase cluster
> will not understand that we have changed our active NN.
>
> Moreover, I have configured HBase cluster with the following parameter:
>
> <property>
>   <name>hbase.rootdir</name>
>   <value>hdfs://nn1:8020/hbase</value>
> </property>
>
> It doesn't work.
> 1. HMaster starts
> 2. I put "http://nn1:16010" into browser
> 3. HMaster disappears
>
> Here is my logs/hbase-hadoop-master-nn1.log :
> http://paste.openstack.org/show/549232/
>
> Please, help me to find out how to configure it
>
> Sincerely,
>
> Alexandr
>



-- 
-Dima


Re: (BUG)Failed to read block error when enable replication

2016-08-02 Thread Dima Spivak
Hey Yang,

Looks like HDFS is having trouble with a block. Have you tried running
hadoop fsck?
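
Something along these lines (the path is just an example) will show whether the
block is healthy and where its replicas live:

  hadoop fsck / -files -blocks -locations

You can also point it at the directory holding the WALs instead of / to narrow
the output.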

-Dima

On Tuesday, August 2, 2016, Ming Yang  wrote:

> When enabled replication,we found a large number of error logs.Is the
> cluster configuration incorrect?
>
> 2016-08-03 10:46:21,721 DEBUG
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening
> log for replication dn7%2C60020%2C1470136216957.1470192327030 at 16999670
> 2016-08-03 10:46:21,723 WARN org.apache.hadoop.hdfs.DFSClient:
> BlockReaderLocal requested with incorrect offset: Offset 0 and length
> 17073479 don't match block blk_4137524355009640437_53760530 ( blockLen
> 16999670 )
> 2016-08-03 10:46:21,723 WARN org.apache.hadoop.hdfs.DFSClient:
> BlockReaderLocal: Removing blk_4137524355009640437_53760530 from cache
> because local file
> /sdd/hdfs/dfs/data/blocksBeingWritten/blk_4137524355009640437 could not be
> opened.
> 2016-08-03 10:46:21,724 INFO org.apache.hadoop.hdfs.DFSClient: Failed to
> read block blk_4137524355009640437_53760530 on local
> machinejava.io.IOException: Offset 0 and length 17073479 don't match block
> blk_4137524355009640437_53760530 ( blockLen 16999670 )
> at
> org.apache.hadoop.hdfs.BlockReaderLocal.(BlockReaderLocal.java:287)
> at
>
> org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:171)
> at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:358)
> at org.apache.hadoop.hdfs.DFSClient.access$800(DFSClient.java:74)
> at
>
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2073)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2224)
> at java.io.DataInputStream.read(DataInputStream.java:149)
> at java.io.DataInputStream.readFully(DataInputStream.java:195)
> at java.io.DataInputStream.readFully(DataInputStream.java:169)
> at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)
> at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1486)
> at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1475)
> at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1470)
> at
>
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.(SequenceFileLogReader.java:55)
> at
>
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:178)
> at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:734)
> at
>
> org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.openReader(ReplicationHLogReaderManager.java:69)
> at
>
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:574)
> at
>
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:364)
>
> 2016-08-03 10:46:21,724 INFO org.apache.hadoop.hdfs.DFSClient: Try reading
> via the datanode on /192.168.7.139:50010
>


-- 
-Dima


Re: issue starting regionserver with SASL authentication failed

2016-08-02 Thread Dima Spivak
Hm, not sure what to say. The error seems to be pointing at not having a
TGT...
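
A quick sanity check, with the principal and keytab path below being examples
only, is to grab a ticket explicitly and confirm it is there before starting
the region server:

  kinit -kt /etc/security/keytabs/hbase.service.keytab hbase/$(hostname -f)@EXAMPLE.COM
  klist

If klist shows no ticket, the GSS "Failed to find any Kerberos tgt" error above
is exactly what you'd expect.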

-Dima

On Tue, Aug 2, 2016 at 12:45 AM, Aneela Saleem <ane...@platalytics.com>
wrote:

> Yes, I have kinit'd as the service user. But still getting error
>
> On Tue, Aug 2, 2016 at 3:05 AM, Dima Spivak <dspi...@cloudera.com> wrote:
>
> > The stacktrace suggests you don't have a ticket-granting ticket. Have you
> > kinit'd as the service user?
> >
> > -Dima
> >
> > On Sun, Jul 31, 2016 at 11:19 PM, Aneela Saleem <ane...@platalytics.com>
> > wrote:
> >
> > > Hi Dima,
> > >
> > > I followed the official reference guide now, but still same error.
> > > Attached is the hbase-site.xml file, please have a look. What's wrong
> > there?
> > >
> > > On Thu, Jul 28, 2016 at 11:58 PM, Dima Spivak <dspi...@cloudera.com>
> > > wrote:
> > >
> > >> I haven't looked in detail at your hbase-site.xml, but if you're
> running
> > >> Apache HBase (and not a CDH release), I might recommend using the
> > official
> > >> reference guide [1] to configure your cluster instead of the CDH 4.2.0
> > >> docs
> > >> since those would correspond to HBase 0.94, and might well have
> > different
> > >> steps required to set up security. If you are trying out CDH HBase, be
> > >> sure
> > >> to use up-to-date documentation for your release.
> > >>
> > >> Let us know how it goes.
> > >>
> > >> [1] https://hbase.apache.org/book.html#hbase.secure.configuration
> > >>
> > >> -Dima
> > >>
> > >> On Thu, Jul 28, 2016 at 10:09 AM, Aneela Saleem <
> ane...@platalytics.com
> > >
> > >> wrote:
> > >>
> > >> > Hi Dima,
> > >> >
> > >> > I'm running Hbase version 1.2.2
> > >> >
> > >> > On Thu, Jul 28, 2016 at 8:35 PM, Dima Spivak <dspi...@cloudera.com>
> > >> wrote:
> > >> >
> > >> > > Hi Aneela,
> > >> > >
> > >> > > What version of HBase are you running?
> > >> > >
> > >> > > -Dima
> > >> > >
> > >> > > On Thursday, July 28, 2016, Aneela Saleem <ane...@platalytics.com
> >
> > >> > wrote:
> > >> > >
> > >> > > > Hi,
> > >> > > >
> > >> > > > I have successfully configured Zookeeper with Kerberos
> > >> authentication.
> > >> > > Now
> > >> > > > i'm facing issue while configuring HBase with Kerberos
> > >> authentication.
> > >> > I
> > >> > > > have followed this link
> > >> > > > <
> > >> > >
> > >> >
> > >>
> >
> http://www.cloudera.com/documentation/archive/cdh/4-x/4-2-0/CDH4-Security-Guide/cdh4sg_topic_8_2.html
> > >> > > >.
> > >> > > > Attached are the configuration files, i.e., hbase-site.xml and
> > >> > > > zk-jaas.conf.
> > >> > > >
> > >> > > > Following are the logs from regionserver:
> > >> > > >
> > >> > > > 016-07-28 17:44:56,881 WARN  [regionserver/hadoop-master/
> > >> > > > 192.168.23.206:16020] regionserver.HRegionServer: error telling
> > >> master
> > >> > > we
> > >> > > > are up
> > >> > > > com.google.protobuf.ServiceException: java.io.IOException: Could
> > not
> > >> > set
> > >> > > > up IO Streams to hadoop-master/192.168.23.206:16000
> > >> > > > at
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:240)
> > >> > > > at
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
> > >> > > > at
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$Regi

Re: issue starting regionserver with SASL authentication failed

2016-08-01 Thread Dima Spivak
The stacktrace suggests you don't have a ticket-granting ticket. Have you
kinit'd as the service user?

-Dima

On Sun, Jul 31, 2016 at 11:19 PM, Aneela Saleem <ane...@platalytics.com>
wrote:

> Hi Dima,
>
> I followed the official reference guide now, but still same error.
> Attached is the hbase-site.xml file, please have a look. What's wrong there?
>
> On Thu, Jul 28, 2016 at 11:58 PM, Dima Spivak <dspi...@cloudera.com>
> wrote:
>
>> I haven't looked in detail at your hbase-site.xml, but if you're running
>> Apache HBase (and not a CDH release), I might recommend using the official
>> reference guide [1] to configure your cluster instead of the CDH 4.2.0
>> docs
>> since those would correspond to HBase 0.94, and might well have different
>> steps required to set up security. If you are trying out CDH HBase, be
>> sure
>> to use up-to-date documentation for your release.
>>
>> Let us know how it goes.
>>
>> [1] https://hbase.apache.org/book.html#hbase.secure.configuration
>>
>> -Dima
>>
>> On Thu, Jul 28, 2016 at 10:09 AM, Aneela Saleem <ane...@platalytics.com>
>> wrote:
>>
>> > Hi Dima,
>> >
>> > I'm running Hbase version 1.2.2
>> >
>> > On Thu, Jul 28, 2016 at 8:35 PM, Dima Spivak <dspi...@cloudera.com>
>> wrote:
>> >
>> > > Hi Aneela,
>> > >
>> > > What version of HBase are you running?
>> > >
>> > > -Dima
>> > >
>> > > On Thursday, July 28, 2016, Aneela Saleem <ane...@platalytics.com>
>> > wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > I have successfully configured Zookeeper with Kerberos
>> authentication.
>> > > Now
>> > > > i'm facing issue while configuring HBase with Kerberos
>> authentication.
>> > I
>> > > > have followed this link
>> > > > <
>> > >
>> >
>> http://www.cloudera.com/documentation/archive/cdh/4-x/4-2-0/CDH4-Security-Guide/cdh4sg_topic_8_2.html
>> > > >.
>> > > > Attached are the configuration files, i.e., hbase-site.xml and
>> > > > zk-jaas.conf.
>> > > >
>> > > > Following are the logs from regionserver:
>> > > >
>> > > > 016-07-28 17:44:56,881 WARN  [regionserver/hadoop-master/
>> > > > 192.168.23.206:16020] regionserver.HRegionServer: error telling
>> master
>> > > we
>> > > > are up
>> > > > com.google.protobuf.ServiceException: java.io.IOException: Could not
>> > set
>> > > > up IO Streams to hadoop-master/192.168.23.206:16000
>> > > > at
>> > > >
>> > >
>> >
>> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:240)
>> > > > at
>> > > >
>> > >
>> >
>> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
>> > > > at
>> > > >
>> > >
>> >
>> org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8982)
>> > > > at
>> > > >
>> > >
>> >
>> org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2284)
>> > > > at
>> > > >
>> > >
>> >
>> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:906)
>> > > > at java.lang.Thread.run(Thread.java:745)
>> > > > Caused by: java.io.IOException: Could not set up IO Streams to
>> > > > hadoop-master/192.168.23.206:16000
>> > > > at
>> > > >
>> > >
>> >
>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:785)
>> > > > at
>> > > >
>> > >
>> >
>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
>> > > > at
>> > > >
>> > >
>> >
>> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
>> > > > at
>> > >
>> org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
>> > > > at
>> > > >
>> > >
>> >
>> org.apache.had

Re: Hbase USERT

2016-07-28 Thread Dima Spivak
Hey Ankit,

Moving the dev list to bcc and adding the user mailing list as the
recipient. Maybe a fellow user can offer some suggestions.

All the best,
  Dima

On Thursday, July 28, 2016, ankit beohar  wrote:

> Hi Hbase,
>
> My use case is :- I am getting files and I want to insert the records in
> hbase with rowkey if rowkey available I have to update the values with
> old+new values.
>
> For this I wrote MR job and get the values of each rowkey and in If else I
> manage my update and insert but with only 0.1 millions records hbase region
> server goes down.
>
> Any idea on this?
>
> I tried to incorporate Phoenix upsert also but with this same error occurs.
>
> Please help me out this.
>
> Best Regards,
> ANKIT BEOHAR
>


Re: issue starting regionserver with SASL authentication failed

2016-07-28 Thread Dima Spivak
I haven't looked in detail at your hbase-site.xml, but if you're running
Apache HBase (and not a CDH release), I might recommend using the official
reference guide [1] to configure your cluster instead of the CDH 4.2.0 docs
since those would correspond to HBase 0.94, and might well have different
steps required to set up security. If you are trying out CDH HBase, be sure
to use up-to-date documentation for your release.

Let us know how it goes.

[1] https://hbase.apache.org/book.html#hbase.secure.configuration

-Dima

On Thu, Jul 28, 2016 at 10:09 AM, Aneela Saleem <ane...@platalytics.com>
wrote:

> Hi Dima,
>
> I'm running Hbase version 1.2.2
>
> On Thu, Jul 28, 2016 at 8:35 PM, Dima Spivak <dspi...@cloudera.com> wrote:
>
> > Hi Aneela,
> >
> > What version of HBase are you running?
> >
> > -Dima
> >
> > On Thursday, July 28, 2016, Aneela Saleem <ane...@platalytics.com>
> wrote:
> >
> > > Hi,
> > >
> > > I have successfully configured Zookeeper with Kerberos authentication.
> > Now
> > > i'm facing issue while configuring HBase with Kerberos authentication.
> I
> > > have followed this link
> > > <
> >
> http://www.cloudera.com/documentation/archive/cdh/4-x/4-2-0/CDH4-Security-Guide/cdh4sg_topic_8_2.html
> > >.
> > > Attached are the configuration files, i.e., hbase-site.xml and
> > > zk-jaas.conf.
> > >
> > > Following are the logs from regionserver:
> > >
> > > 016-07-28 17:44:56,881 WARN  [regionserver/hadoop-master/
> > > 192.168.23.206:16020] regionserver.HRegionServer: error telling master
> > we
> > > are up
> > > com.google.protobuf.ServiceException: java.io.IOException: Could not
> set
> > > up IO Streams to hadoop-master/192.168.23.206:16000
> > > at
> > >
> >
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:240)
> > > at
> > >
> >
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
> > > at
> > >
> >
> org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8982)
> > > at
> > >
> >
> org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2284)
> > > at
> > >
> >
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:906)
> > > at java.lang.Thread.run(Thread.java:745)
> > > Caused by: java.io.IOException: Could not set up IO Streams to
> > > hadoop-master/192.168.23.206:16000
> > > at
> > >
> >
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:785)
> > > at
> > >
> >
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
> > > at
> > >
> >
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
> > > at
> > org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
> > > at
> > >
> >
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
> > > ... 5 more
> > > Caused by: java.lang.RuntimeException: SASL authentication failed. The
> > > most likely cause is missing or invalid credentials. Consider 'kinit'.
> > > at
> > >
> >
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$1.run(RpcClientImpl.java:685)
> > > at java.security.AccessController.doPrivileged(Native Method)
> > > at javax.security.auth.Subject.doAs(Subject.java:415)
> > > at
> > >
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> > > at
> > >
> >
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.handleSaslConnectionFailure(RpcClientImpl.java:643)
> > > at
> > >
> >
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:751)
> > > ... 9 more
> > > Caused by: javax.security.sasl.SaslException: GSS initiate failed
> [Caused
> > > by GSSException: No valid credentials provided (Mechanism level: Failed
> > to
> > > find any Kerberos tgt)]
> > > at
> > >
> >
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
> > > at
> > >
> >
> org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HB

Re: issue starting regionserver with SASL authentication failed

2016-07-28 Thread Dima Spivak
Hi Aneela,

What version of HBase are you running?

-Dima

On Thursday, July 28, 2016, Aneela Saleem  wrote:

> Hi,
>
> I have successfully configured Zookeeper with Kerberos authentication. Now
> i'm facing issue while configuring HBase with Kerberos authentication. I
> have followed this link
> .
> Attached are the configuration files, i.e., hbase-site.xml and
> zk-jaas.conf.
>
> Following are the logs from regionserver:
>
> 016-07-28 17:44:56,881 WARN  [regionserver/hadoop-master/
> 192.168.23.206:16020] regionserver.HRegionServer: error telling master we
> are up
> com.google.protobuf.ServiceException: java.io.IOException: Could not set
> up IO Streams to hadoop-master/192.168.23.206:16000
> at
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:240)
> at
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
> at
> org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8982)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2284)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:906)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Could not set up IO Streams to
> hadoop-master/192.168.23.206:16000
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:785)
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
> at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
> at
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
> ... 5 more
> Caused by: java.lang.RuntimeException: SASL authentication failed. The
> most likely cause is missing or invalid credentials. Consider 'kinit'.
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$1.run(RpcClientImpl.java:685)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.handleSaslConnectionFailure(RpcClientImpl.java:643)
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:751)
> ... 9 more
> Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused
> by GSSException: No valid credentials provided (Mechanism level: Failed to
> find any Kerberos tgt)]
> at
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
> at
> org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740)
> ... 9 more
> Caused by: GSSException: No valid credentials provided (Mechanism level:
> Failed to find any Kerberos tgt)
> at
> sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
> at
> sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
> at
> sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
> at
> sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
> at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
> at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
> at
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
>
>
> Please have a look, whats going wrong here?
>
> Thanks
>
>


Re: Using hbase for transactional applications

2016-07-25 Thread Dima Spivak
Hey Kanagha,

What kind of scale are you looking at for running your application? How big
do you envision the cluster needing to be to handle your use case?

-Dima

On Mon, Jul 25, 2016 at 1:09 PM, Dave Birdsall 
wrote:

> Hi,
>
> For ecommerce, you'll likely want solid ACID transactions. Apache Trafodion
> (incubating) has a scalable distributed transaction engine. You can use
> Trafodion for your SQL and transactional needs and use HBase as the
> backend.
>
> See http://trafodion.apache.org/.
>
> Dave
>
> -Original Message-
> From: Kanagha [mailto:er.kana...@gmail.com]
> Sent: Monday, July 25, 2016 12:38 PM
> To: user@hbase.apache.org
> Subject: Using hbase for transactional applications
>
> Hi,
>
> I am investigating into designing a simple ecommerce application using
> Hbase
> as the database.
>
> I'm planning to use Apache Phoenix and Hbase (backend) and use a Spring
> application.
>
> What are the other recommended approaches?
>
> Thanks
> Kanagha
>


Re: Running Hbase java tests under “test” folder for specific/remote hbase server(hbase 1.2.1)

2016-07-18 Thread Dima Spivak
Hello,

No, the tests in test folders under most modules are unit tests that spin
up miniclusters and/or use internal hooks to test small sections of code in
isolation. For end-to-end tests that can be run on clusters, check out the
tests in the hbase-it module. Details on running those can be found in the
reference guide.
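
As a rough sketch (the class names are examples from the hbase-it module; check
the ref guide for the ones in your release), you can point the tests at a live
cluster with something like:

  HBASE_CONF_DIR=/path/to/cluster/conf \
    hbase org.apache.hadoop.hbase.IntegrationTestsDriver

or run a single test such as org.apache.hadoop.hbase.IntegrationTestIngest the
same way.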

Cheers,
  Dima

On Monday, July 18, 2016, buddyhbase hbase  wrote:

> Awaiting your help here, Thanks.
>
> 
> From: buddyhbase hbase
> Sent: 12 July 2016 12:40:56
> To: user@hbase.apache.org ; d...@hbase.apache.org
> 
> Subject: Running Hbase java tests under “test” folder for specific/remote
> hbase server(hbase 1.2.1)
>
>
> Hi,
>
>
> I could see lot of Java tests under test folder of each hbase modules, ex:
> ~/src/hbase-1.2.1/hbase-client/src/test , Now when I run this with mvn test
> it will spawn a local Hbase server and run the tests against it, but I
> would want to give the configuration of deployed hbase
> server(hbase-site.xml) to these tests, so that it could test the deployed
> server, is there a way ?
>
>
> Please let me know, Thanks
>
> hbasebuddy
>
>


Re: HBASE Install/Configure

2016-07-16 Thread Dima Spivak
Hi Anil,

Have you tried the steps documented in the HBase ref guide [1] yet? If so,
can you describe your environment a bit more? How many machines are you
trying to run HBase across? Do you have an HDFS cluster set up already?

Cheers,
  Dima

[1] https://hbase.apache.org/book.html
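
One quick check for the "node /hbase is not in ZooKeeper" message is to look
for the parent znode directly:

  hbase zkcli
  ls /

If /hbase is missing there, the master hasn't created it yet, which usually
points back at the hbase.rootdir and ZooKeeper quorum settings.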

On Thursday, July 14, 2016, kha...@hotmail.com  wrote:

> Hi Friends,
> I am newbie to HBASE and trying to configure HBASE on UBUNTU 14.04. After
> configuration and starting hbase (get to hbase CLI). If I run any HBASE
> command, I get The node /hbase is not in ZooKeeper.
> Please find attached my HBASE-SITE.xml and zoo.cfg.
>
> Appreciate all your help/support in advance
> Regards - Anil Khiani
>
> <configuration>
>
> <property>
> <name>hbase.rootdir</name>
> <value>hdfs://localhost/hbase</value>
> <description>Enter the HBase NameNode server hostname</description>
> </property>
>
> <property>
> <name>hbase.cluster.distributed</name>
> <value>true</value>
> </property>
>
> <property>
>   <name>hbase.zookeeper.property.clientPort</name>
>   <value>2181</value>
>   <description>The port at which the clients will connect.
>   </description>
> </property>
>
> <property>
> <name>hbase.master.port</name>
> <value>2080</value>
> <description>The port the HBase Master should bind
> to.</description>
> </property>
>
> <property>
>   <name>hbase.zookeeper.quorum</name>
>   <value>localhost</value>
>   <description>Comma separated list of servers in the ZooKeeper Quorum.
>   For example,
> "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
>   By default this is set to localhost for local and pseudo-distributed
> modes
>   of operation. For a fully-distributed setup, this should be set to a
> full
>   list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in
> hbase-env.sh
>   this is the list of servers which we will start/stop ZooKeeper on.
>   </description>
> </property>
>
> <property>
>   <name>hbase.zookeeper.property.dataDir</name>
>   <value>/var/lib/zookeeper</value>
>   <description>Property from ZooKeeper's config zoo.cfg.
>   The directory where the snapshot is stored.
>   </description>
> </property>
>
> <property>
> <name>zookeeper.znode.parent</name>
> <value>/hbase</value>
> </property>
>
> </configuration>
>
> Zoo.cfg
>
> dataDir=/var/lib/zookeeper
> server.1=localhost:2888:3888
>
>
>
>
>
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/HBASE-Install-Configure-tp4081233.html
> Sent from the HBase User mailing list archive at Nabble.com.
>


Re: How to get last one month of data from hbase table.

2016-07-15 Thread Dima Spivak
TimestampsFilter*. My bad. :)

-Dima

On Friday, July 15, 2016, Dima Spivak <dspi...@cloudera.com> wrote:

> If you have a Thrift server running, you can use HappyBase (
> https://happybase.readthedocs.io/en/stable/) to get a pretty nifty Python
> API. That along with using a scan with the TimestampFilter should get you
> what you want.
>
> -Dima
>
> On Thursday, July 14, 2016, Mahesh Sankaran <sankarmahes...@gmail.com
> <javascript:_e(%7B%7D,'cvml','sankarmahes...@gmail.com');>> wrote:
>
>> Hi all,
>>
>> My client wants last one month of data in  hbase table.
>> I know we can export using timestamp range. But we dont know starting
>> timestamp of particular month in our hbase table which contains millions
>> of
>> rows.
>> Is there any python api to find starting timestamp of particular month or
>> date in hbase table.
>> Any help would be appreciated.
>>
>>
>> Best regards,
>>
>> Mahesh
>>
>


Re: How to get last one month of data from hbase table.

2016-07-15 Thread Dima Spivak
If you have a Thrift server running, you can use HappyBase (
https://happybase.readthedocs.io/en/stable/) to get a pretty nifty Python
API. That along with using a scan with the TimestampFilter should get you
what you want.
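
If all you need is a rough cut by time, even the plain HBase shell can do a
time-range scan; the timestamps below are placeholder epoch milliseconds:

  scan 'mytable', {TIMERANGE => [1466380800000, 1468972800000], LIMIT => 10}

The same TIMERANGE idea applies to whatever client API you end up scripting it
with.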

-Dima

On Thursday, July 14, 2016, Mahesh Sankaran 
wrote:

> Hi all,
>
> My client wants last one month of data in  hbase table.
> I know we can export using timestamp range. But we dont know starting
> timestamp of particular month in our hbase table which contains millions of
> rows.
> Is there any python api to find starting timestamp of particular month or
> date in hbase table.
> Any help would be appreciated.
>
>
> Best regards,
>
> Mahesh
>


Re: Caused by: java.io.InterruptedIOException

2016-07-11 Thread Dima Spivak
Hm, sorry to say that you might have better luck at getting to the bottom
of this at Hortonworks’ user forums since people there are more likely to
have played with Apex and the like. The stack trace you included is
actually a pretty generic one so it’d be hard for someone to see those
lines and chime in with a specific cause without way more details about
your use case and the configuration of your cluster. If I had to guess,
there’s probably some tuning that might be needed within your HBase/Apex
configuration, but you’re venturing into things that people familiar with
HDP are more likely to be able to help out with.

-Dima

On Mon, Jul 11, 2016 at 4:59 PM, Raja.Aravapalli <raja.aravapa...@target.com
> wrote:

>
>
> Thanks for the response Dima,
>
> Please find below some of the details:
>
>
> Hadoop Distribution: Hortonworks
> Version: HBase 1.1.2.2.3.4.0
> Cluster size: 40nodes
>
> About Application:
>
> We have an Apache Apex application, where in we have one of the programs,
> which runs in 3 instances and does PUT operations simultaneously on a same
> Hbase table.
>
>
> Application is running fine from 3 -4 days, Although application recovered
> automatically, want to know more details on these exceptions in the log:
>
>
> Caused by: java.io.InterruptedIOException: #2, interrupted.
> currentNumberOfTask=1
>  at
>
> org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1661)
>  at
>
> org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1687)
>  at
>
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:208)
>  at
>
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:183)
>  at
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1449)
>  at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1040)
>
>
>
> Thanks a lot in advance.
>
>
> Regards,
> Raja.
>
>
>
> On 7/11/16, 4:10 PM, "Dima Spivak" <dspi...@cloudera.com> wrote:
>
> >Hey Raja,
> >
> >We'll need more details about your setup (HBase version, size/topology of
> >cluster, server specs, etc.) and the applications you're running before we
> >can even start giving ideas of things to try. Wanna pass those along?
> >
> >-Dima
> >
> >On Monday, July 11, 2016, Raja.Aravapalli <raja.aravapa...@target.com>
> >wrote:
> >
> >>
> >> Hi,
> >>
> >>
> >> One of applications which does put operations on Hbase table, is failing
> >> with below exception in the log. Can someone please debug the issue and
> fix
> >> this. I am new to Hbase.
> >>
> >>
> >> Caused by: java.io.InterruptedIOException: #2, interrupted.
> >> currentNumberOfTask=1
> >> at
> >>
> org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1661)
> >> at
> >>
> org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1687)
> >> at
> >>
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:208)
> >> at
> >>
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:183)
> >> at
> >> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1449)
> >> at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1040)
> >>
> >>
> >>
> >>
> >> Regards,
> >> Raja.
> >>
>


Re: Caused by: java.io.InterruptedIOException

2016-07-11 Thread Dima Spivak
Hey Raja,

We'll need more details about your setup (HBase version, size/topology of
cluster, server specs, etc.) and the applications you're running before we
can even start giving ideas of things to try. Wanna pass those along?

-Dima

On Monday, July 11, 2016, Raja.Aravapalli 
wrote:

>
> Hi,
>
>
> One of applications which does put operations on Hbase table, is failing
> with below exception in the log. Can someone please debug the issue and fix
> this. I am new to Hbase.
>
>
> Caused by: java.io.InterruptedIOException: #2, interrupted.
> currentNumberOfTask=1
> at
> org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1661)
> at
> org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1687)
> at
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:208)
> at
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:183)
> at
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1449)
> at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1040)
>
>
>
>
> Regards,
> Raja.
>


Re: Escaping separator in data while bulk loading using importtsv tool and ingesting numeric values

2016-07-07 Thread Dima Spivak
Hi Mahesha,

1.) HBase stores all values as byte arrays, so there's no typing to speak
of. ImportTsv is simply ingesting what it sees, quotes included (or not).

2.) ImportTsv doesn't support escaping, if I'm reading the code correctly. (
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
)
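
So the usual workaround is to pick a separator that can never occur in the data
(the default tab, or preprocess the file) rather than trying to escape it. For
reference, a plain invocation for the word-count example looks something like
this; the table name and input path are placeholders:

  hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
    -Dimporttsv.separator=, \
    -Dimporttsv.columns=HBASE_ROW_KEY,f:count \
    wordcount /user/hadoop/wordcount.csv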

All the best,
  Dima

On Thursday, July 7, 2016, Mahesha999  wrote:

> I am using importtsv tool to ingest data. I have some doubts. I am using
> hbase 1.1.5.
>
> First does it ingest non-string/numeric values? I was referring  this link
> <
> http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/
> >
> detailing importtsv in cloudera distribution. It says:"it interprets
> everything as strings". So I was guessing what does that mean.
>
> I am using simple wordcount example where first column is a word and second
> column is word count.
>
> When I keep file as follows:
>
> "access","1"
> "about","1"
>
> and ingest and then do scan on hbase shell it gives following output:
>
>  about column=f:count,
> timestamp=1467716881104, value="1"
>  accesscolumn=f:count,
> timestamp=1467716881104, value="1"
>
> When I keep file as follows (double quotes surrounding count is removed):
>
> "access",1
> "about",1
>
> and ingest and then do scan on hbase shell it gives following output
> (double
> quotes surrounding count is not there):
>
>  about column=f:count,
> timestamp=1467716881104, value=1
>  accesscolumn=f:count,
> timestamp=1467716881104, value=1
>
> So as you can see there are no double quotes in count's value. *Q1. Does
> that mean it is stored as integer and not as string? * The cloudera's
> article suggests that custom MR job needs to be written for ingesting
> non-string values. However I am not able to get what does that mean if
> above
> is ingesting integer values.
>
> Also another doubt I am having is that whether I can escape the column
> separator when it appears inside the column value. For example in
> importtsv,
> we can specify the separator as follows:
>
> -Dimporttsv.separator=,
>
> However what if I have employee data where first column is employee name
> and
> second column as address? My file will have rows resembling to something
> like this:
>
> "mahesh","A6,Hyatt Appartment"
>
> That second comma makes importtsv think that there are three columns and
> throwing BadTsvLineException("Excessive columns").
>
> Thus I tried escaping comma with backslash ('\') and just for sake of
> curiosity escaping backslash with another backslash (that is "\\"). So my
> file had following lines:
>
> "able","1\"
> "z","1\"
> "za","1\\1"
>
> When I ran scan on hbase shell, it gave following output:
>
>  able  column=f:count,
> timestamp=1467716881104, value="1\x5C"
>  z column=f:count,
> timestamp=1467716881104, value="1\x5C"
>  zacolumn=f:count,
> timestamp=1467716881104, value="1\x5C\x5C1"
>
> *Q2. So it seems that instead of escaping character following backslash, it
> encodes backslash as "\x5C". Is it like that? Is there no way to escape
> column separator while bulk loading data using importtsv?*
>
>
>
>
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/Escaping-separator-in-data-while-bulk-loading-using-importtsv-tool-and-ingesting-numeric-values-tp4081081.html
> Sent from the HBase User mailing list archive at Nabble.com.
>


Re: NoClassDefFoundError org/apache/hadoop/hbase/HBaseConfiguration

2016-07-05 Thread Dima Spivak
Hey Robert,

Probably a better question to ask over at u...@spark.apache.org.
hbase-common.jar would be the artifact you'd wanna put on the class path,
though.
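
Roughly, with the jar locations and versions below being examples only, either
of these gets the HBase classes onto the driver's classpath:

  spark-shell --jars /usr/lib/hbase/lib/hbase-common-1.2.1.jar,/usr/lib/hbase/lib/hbase-client-1.2.1.jar

  spark-shell --driver-class-path "$(hbase classpath)"

The --jars option also ships the jars to the executors, which
spark.driver.extraClassPath alone does not.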

-Dima

On Tue, Jul 5, 2016 at 3:39 PM, Robert James <srobertja...@gmail.com> wrote:

> I'm using spark-shell.  The perplexing thing is that if I load it via
> spark-shell --jars, it seems to work.  However, if I load it via
> spark.driver.extraClassPath in the config file, it seems to fail.
> What is the difference between --jars (command line) and
> spark.driver.extraClassPath (config)?
>
> On 7/5/16, Dima Spivak <dspi...@cloudera.com> wrote:
> > Hey Robert,
> >
> > HBaseConfiguration is part of the hbase-common module of the HBase
> project.
> > Are you using Maven to provide dependencies or just running java -cp?
> >
> > -Dima
> >
> > On Monday, July 4, 2016, Robert James <srobertja...@gmail.com> wrote:
> >
> >> When trying to load HBase via Spark, I get NoClassDefFoundError
> >> org/apache/hadoop/hbase/HBaseConfiguration errors.
> >>
> >> How do I provide that class to Spark?
> >>
> >
>


Re: NoClassDefFoundError org/apache/hadoop/hbase/HBaseConfiguration

2016-07-05 Thread Dima Spivak
Hey Robert,

HBaseConfiguration is part of the hbase-common module of the HBase project.
Are you using Maven to provide dependencies or just running java -cp?

-Dima

On Monday, July 4, 2016, Robert James  wrote:

> When trying to load HBase via Spark, I get NoClassDefFoundError
> org/apache/hadoop/hbase/HBaseConfiguration errors.
>
> How do I provide that class to Spark?
>


Re: Delete row that has columns with future timestamp

2016-06-26 Thread Dima Spivak
Hey M.,

Just to follow up on what JMS said, this was fixed in April 2014 (details
at https://issues.apache.org/jira/browse/HBASE-10118), so running a version
of HBase in which the patch went in is probably your best option.

-Dima

On Sunday, June 26, 2016, Jean-Marc Spaggiari 
wrote:

> Hi,
>
> This is a known issue and I think it is solved is more recent versions.  Do
> you have the option to upgrade?
>
> JMS
> Le 2016-06-26 07:00, "M. BagherEsmaeily"  > a écrit :
>
> > these problem doesn't solve with major compact!! Assuming the problem is
> > solved with major compact, in this case, it's still a bug.
> >
> > On Sun, Jun 26, 2016 at 3:08 PM, Lise Regnier  >
> > wrote:
> >
> > > you need to run a major compact after deletion
> > > lise
> > >
> > > > On 26 Jun 2016, at 11:20, M. BagherEsmaeily  >
> > > wrote:
> > > >
> > > > Hello
> > > > I use HBase version 0.98.9-hadoop1 with Hadoop version 1.2.1 . when i
> > > > delete row that has columns with future timestamp, delete not affect
> > and
> > > > row still surviving.
> > > >
> > > > For example when i put a row with future timestamp:
> > > > Put p = new Put(Bytes.toBytes("key1"));
> > > > p.add(Bytes.toBytes("C"), Bytes.toBytes("q1"), 2L,
> > > > Bytes.toBytes("test-val"));
> > > > table.put(p);
> > > >
> > > > After put, when i scan my table, the result is:
> > > > ROW COLUMN+CELL
> > > > key1column=C:q1, timestamp=2, value=test-val
> > > >
> > > > When i delete this row with following code:
> > > > Delete d = new Delete(Bytes.toBytes("key1"));
> > > > table.delete(d);
> > > >
> > > > OR with this code:
> > > > Delete d = new Delete(Bytes.toBytes("key1"), Long.MAX_VALUE);
> > > > table.delete(d);
> > > >
> > > > After each two deletes the result of scan is:
> > > > ROW COLUMN+CELL
> > > > key1column=C:q1, timestamp=2, value=test-val
> > > >
> > > > And raw scan result is:
> > > > ROW COLUMN+CELL
> > > > key1column=C:, timestamp=1466931500501, type=DeleteFamily
> > > > key1column=C:q1, timestamp=2, value=test-val
> > > >
> > > >
> > > > But when i change the timestamp of delete to Long.MAX_VALUE-1, this
> > > delete
> > > > works. Can anyone help me with this?
> > >
> > >
> >
>


Re: unsubscribe me please from this mailing list

2016-06-24 Thread Dima Spivak
Please email user-unsubscr...@hbase.apache.org, Fateme.

Cheers,
  Dima

On Thursday, June 23, 2016, fateme Abiri 
wrote:

>
> unsubscribe me please from this mailing list
> Thanks & Regards
>
>
>
> Fateme Abiri
> Software Engineer
> M.Sc. Degree
> Ferdowsi University of Mashhad, Iran
>
>
>
>


Re: `hbase classpath` command causes “File name too long” error

2016-06-22 Thread Dima Spivak
You weren't setting the classpath. In Bash, you can't put a $ in front of
the variable name when you're assigning it to a value.
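
In other words:

  $CLASSPATH=`hbase classpath`    # wrong: bash expands $CLASSPATH, then tries to run
                                  # the huge result as a command ("File name too long")
  CLASSPATH=$(hbase classpath)    # right: no $ on the left-hand side of an assignment
  export CLASSPATH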

-Dima

On Wednesday, June 22, 2016, Mahesha999  wrote:

> hey thanks. That worked. Seems that my lack of experience with Linux
> causing
> trouble. Can u tell me what was going?
>
>
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/hbase-classpath-command-causes-File-name-too-long-error-tp4080706p4080814.html
> Sent from the HBase User mailing list archive at Nabble.com.
>


Re: Command to get HBase table owner

2016-06-08 Thread Dima Spivak
Oops, spoke too soon. Looks like owner was deprecated back in HBASE-6188,
but is still sticking around (e.g. you can get it from HTableDescriptor
with getOwnerString and we even updated our TestAccessController to include
it). Not sure why it remains deprecated after all this time, so maybe
someone more seasoned than me can explain the justification…

Things like replication_scope can be seen by using 'describe' from the
HBase shell. Other metadata (e.g. modified time) is not kept for HBase
tables as far as I’m aware.
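
For example, from the HBase shell:

  describe 'tablename'
  user_permission 'tablename'   # lists ACL entries if the AccessController is enabled

is about as much table metadata as you can get out of the box.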

-Dima

On Wed, Jun 8, 2016 at 12:44 AM, Dima Spivak <dspi...@cloudera.com> wrote:

> Hi Kumar,
>
> Which version of HBase do you run? Recent releases have moved to ACLs for
> table permissions in place of an "owner" construct.
>
> -Dima
>
>
> On Wednesday, June 8, 2016, kumar r <kumarc...@gmail.com> wrote:
>
>> Hi,
>>
>> Is there any command to get complete description about hbase table such as
>> owner, database, modified time, etc.
>>
>> In hive, i can get those information using
>>
>> desc formatted tablename;
>>
>> But in hbase desc 'tablename' shows size, version, replication_scope, etc.
>>
>> I want to get owner details of hbase table.
>>
>> Thanks,
>> Kumar
>>
>


Re: Command to get HBase table owner

2016-06-08 Thread Dima Spivak
Hi Kumar,

Which version of HBase do you run? Recent releases have moved to ACLs for
table permissions in place of an "owner" construct.

-Dima

On Wednesday, June 8, 2016, kumar r  wrote:

> Hi,
>
> Is there any command to get complete description about hbase table such as
> owner, database, modified time, etc.
>
> In hive, i can get those information using
>
> desc formatted tablename;
>
> But in hbase desc 'tablename' shows size, version, replication_scope, etc.
>
> I want to get owner details of hbase table.
>
> Thanks,
> Kumar
>


Re: hbase 'transparent encryption' feature is production ready or not?

2016-06-06 Thread Dima Spivak
FWIW, some engineers at Cloudera who worked on adding encryption at rest to
HDFS wrote a blog post on this where they describe negligible performance
impacts on write and only a slight performance degradation on large reads (
http://blog.cloudera.com/blog/2015/01/new-in-cdh-5-3-transparent-encryption-in-hdfs/).
Obviously, your mileage may vary, but in my internal testing, I can also
say I haven't seen much (if any) impact with encryption zones enabled.
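
If you do go the HDFS route, setting up a zone is fairly mechanical. As a
sketch, with made-up key and path names and assuming a Hadoop KMS is already
running:

  hadoop key create hbase-key
  hdfs dfs -mkdir /hbase-encrypted
  hdfs crypto -createZone -keyName hbase-key -path /hbase-encrypted

Anything written under that path afterwards is encrypted transparently to HBase.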

-Dima

On Monday, June 6, 2016, Liu, Ming (Ming)  wrote:

> Hi, Andrew again,
>
> I still have a question that if we move the encryption to HDFS level, we
> no longer can enable encryption per table I think?
> I assume encryption will impact performance to some extent, so we may
> would like to enable it per table. Is there any performance tests that
> shows how much overhead encryption can introduce? If very small, then I am
> very happy to do it in HDFS and encrypt all data.
> I still not start to study HSM, but if for example, we can setup a
> separate storage, like a NFS, which can be mounted to each node of HBase
> cluster, and we put the key there, is it an acceptable plan?
>
> Thanks,
> Ming
>
> -Original Message-
> From: Andrew Purtell [mailto:apurt...@apache.org ]
> Sent: June 3, 2016 12:27
> To: user@hbase.apache.org 
> Cc: Zhang, Yi (Eason) >
> Subject: Re: Re: hbase 'transparent encryption' feature is production ready or
> not?
>
> > We are now confident to use this feature.
>
> You should test carefully for your use case in any case.
>
> > HSM is a good option, I am new to it. But will look at it.
>
> I recommend using HDFS's transparent encryption feature instead of HBase
> transparent encryption if you're only just now thinking about HSMs and key
> protection in general. Storing the master key on the same nodes as the
> encrypted data will defeat protection. This should be offloaded to a
> protected domain. Hadoop ships with a software KMS that, while it has
> limitations, can be set up on a specially secured server and HDFS TDE can
> take advantage of it. (HBase TDE doesn't support the Hadoop KMS.)
>
> Advice offered for what it's worth (smile)
>
>
> On Thu, Jun 2, 2016 at 9:16 PM, Liu, Ming (Ming)  > wrote:
>
> > Thank you Andrew!
> >
> > What we hear must be rumor :-) We are now confident to use this feature.
> >
> > HSM is a good option, I am new to it. But will look at it.
> >
> > Thanks,
> > Ming
> > -Original Message-
> > From: Andrew Purtell [mailto:apurt...@apache.org ]
> > Sent: June 3, 2016 8:59
> > To: user@hbase.apache.org 
> > Cc: Zhang, Yi (Eason) >
> > Subject: Re: hbase 'transparent encryption' feature is production ready or
> not?
> >
> > > We heard from various sources that it is not production ready before.
> >
> > ​Said by whom, specifically? ​
> >
> > ​> During our tests, we do find out it works not very stable, but
> > probably due to our lack of experience of this feature.
> >
> > If you have something repeatable, please consider filing a JIRA to
> > report the problem.
> >
> > > And, we now save the encryption key in the disk, so we were
> > > wondering,
> > this is something not secure.
> >
> > Data keys are encrypted with a master key which must be protected. The
> > out of the box key provider stores the master key in a local keystore.
> > That's not sufficient protection. In a production environment you will
> > want to use a HSM. Most (all?) HSMs support the keystore API. If that
> > is not sufficient, our KeyProvider API is extensible for the solution
> > you choose to employ in production.
> >
> > ​Have you looked at HDFS transparent encryption?
> >
> > https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/
> > TransparentEncryption.html Because it works at the HDFS layer it's a
> > more general solution. Be careful what version of Hadoop you use if
> > opting for HDFS TDE, though. Pick the most recent release. Slightly
> > older versions (like 2.6.0) had fatal bugs if used in conjunction with
> > HBase.
> >
> >
> >
> > On Thu, Jun 2, 2016 at 5:52 PM, Liu, Ming (Ming)  >
> > wrote:
> >
> > > Hi, all,
> > >
> > > We are trying to deploy the 'transparent encryption' feature of
> > > HBase , described in HBase reference guide:
> > > https://hbase.apache.org/book.html#hbase.encryption.server  , in our
> > > product.
> > > We heard from various sources that it is not production ready before.
> > >
> > > During our tests, we do find out it works not very stable, but
> > > probably due to our lack of experience of this feature. It works
> > > sometime, sometimes not work, and retry the same configuration, it
> > > work again. We were using HBase 1.0.
> > >
> > > Could anyone give us some information that this feature is already
> > > stable and can be used in a production environment?
> > >
> > > And, we now save the 

Re: HBase Master is shutting down with error

2016-06-01 Thread Dima Spivak
Hey Pranavan,

You’ll likely have more luck on the user@hbase.apache.org mailing list.

Cheers,
  Dima

On Wed, Jun 1, 2016 at 9:39 AM, Pranavan Theivendiram <
pranavan...@cse.mrt.ac.lk> wrote:

> Hi Devs,
>
> I am Pranavan from Sri Lanka. I am doing a GSoC project for apache
> pheonix. Please help me in the following problem.
>
> I set up a cluster with hadoop, hbase, and zookeeper. I am running a
> single node. But HMaster is failing with the following error.
>
> 2016-06-01 22:01:27,273 INFO
>  [megala-Inspiron-N5110:6.activeMasterManager]
> master.ActiveMasterManager: Deleting ZNode for
> /hbase/backup-masters/megala-inspiron-n5110,6,1464798671030 from backup
> master directory
> 2016-06-01 22:01:27,632 INFO
>  [megala-Inspiron-N5110:6.activeMasterManager]
> master.ActiveMasterManager: Registered Active
> Master=megala-inspiron-n5110,6,1464798671030
> 2016-06-01 22:01:28,148 FATAL
> [megala-Inspiron-N5110:6.activeMasterManager] master.HMaster: Failed to
> become active master
> java.lang.IllegalStateException
>
> I attached the log file as well.
> Can anyone help me on this problem?
>
> The versions of the components are listed below
>
>1. hadoop 2.6.4
>2. hbase 1.2.1
>3. zookeeper 3.4.6
>
>
> Thanks
> *T. Pranavan*
> *Junior Consultant | Department of Computer Science & Engineering
> ,University of Moratuwa*
> *Mobile| *0775136836
>


Re: Best way to pass configuration properties to MRv2 jobs

2016-04-18 Thread Dima Spivak
Probably better off asking on the Hadoop user mailing list (
u...@hadoop.apache.org) than the HBase one… :)

-Dima

On Mon, Apr 18, 2016 at 2:57 AM, Henning Blohm 
wrote:

> Hi,
>
> in our Hadoop 2.6.0 cluster, we need to pass some properties to all Hadoop
> processes so they can be referenced using ${...} syntax in configuration
> files. This works reasonably well using HADOOP_NAMENODE_OPTS and the like.
>
> For Map/Reduce jobs however, we need to speficy not only
>
> mapred.child.java.opts
>
> to pass system properties, in addition we need to set
>
> yarn.app.mapreduce.am.command-opts
>
> for anything that is referenced in Hadoop configuration files.
>
> In the end however almost all the properties passed are available as
> environment variables as well.
>
> Hence my question:
>
> * Is it possible to use reference environment variables in configuration
> files directly?
> * Does anybody know of a simpler way to make sure some system properties
> are _always_ set for all Yarn processes?
>
> Thanks,
> Henning
>


Re: Hbase Fully distribution mode - Cannot resolve regionserver hostname

2015-07-19 Thread Dima Spivak
Hi Pubudu,

Please stop adding dev@ back to this email's recipient list; this isn't an
issue for HBase developers to address. Looking at what you've posted from
hbase-site.xml, I recommend you start with the HBase Ref Guide,
specifically section 2.4 on Fully Distributed setups [1]. I'd also
recommend trying to use hostnames instead of IP addresses in your
configurations, if possible; I've had issues in the past with DNS when
providing HBase with IP addresses instead of FQDNs.

[1] http://hbase.apache.org/book.html#quickstart_fully_distributed
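
A quick way to see whether the master can resolve what the region server
reported (the hostname below is from your log and used only as an example):

  getent hosts pod-36

HBase is fussy about forward and reverse DNS agreeing, so if that lookup fails
on the master, or resolves differently than on the region server itself, you
tend to see exactly this kind of "unknown host" and double-registered-server
behavior.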

-Dima

On Sat, Jul 18, 2015 at 6:33 AM, Pubudu Gunatilaka pubudu...@gmail.com
wrote:

 Hi,

 I looked into HBASE-12954. But could not figure out a way of solving the
 issue. I noticed one thing in region servers. When I try to add a region
 server separately, it adds as two servers. One from the ip address and
 other from the hostname. Is this an expected outcome? I have attached a
 screen shot.

 Following are the configurations of my hbase-site.xml.

 <configuration>
 <property>
 <name>hbase.rootdir</name>
 <value>hdfs://172.17.0.205:54310/hbase</value>
 </property>

 <property>
 <name>hbase.cluster.distributed</name>
 <value>true</value>
 </property>

 <property>
 <name>hbase.zookeeper.property.dataDir</name>
 <value>/opt/HBASE/zookeeper</value>
 </property>

 <property>
   <name>hbase.zookeeper.quorum</name>
   <value>172.17.0.207</value>
 </property>

 <property>
   <name>hbase.master.info.bindAddress</name>
   <value>172.17.0.216</value>
 </property>

 <property>
   <name>hbase.zookeeper.property.clientPort</name>
   <value>2181</value>
 </property>

 </configuration>

 Any help on this is appreciated.

 Thank you!



 On Sat, Jul 18, 2015 at 12:20 AM, Dima Spivak dspi...@cloudera.com
 wrote:

 +user@, dev@ to bcc

 Pubudu,

 I think you'll get more help on an issue like this on the users list.

 -Dima

 -- Forwarded message --
 From: Ted Yu yuzhih...@gmail.com
 Date: Fri, Jul 17, 2015 at 5:40 AM
 Subject: Re: Hbase Fully distribution mode - Cannot resolve regionserver
 hostname
 To: d...@hbase.apache.org d...@hbase.apache.org


 Have you looked at
 HBASE-12954 Ability impaired using HBase on multihomed hosts

 Cheers

 On Fri, Jul 17, 2015 at 3:32 AM, Pubudu Gunatilaka pubudu...@gmail.com
 wrote:

  Hi Devs,
 
  I am trying to run Hbase in fully distributed mode. So first I started
  master node. Then I started regionserver. But I am getting following
 error.
 
  2015-07-17 05:12:02,260 WARN  [pod-35:16020.activeMasterManager]
  master.AssignmentManager: Failed assignment of hbase:meta,,1.1588230740 to
  pod-36,16020,1437109916288, trying to assign elsewhere instead; try=1 of 10
  java.net.UnknownHostException: unknown host: pod-36
  at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.<init>(RpcClientImpl.java:296)
  at org.apache.hadoop.hbase.ipc.RpcClientImpl.createConnection(RpcClientImpl.java:129)
  at org.apache.hadoop.hbase.ipc.RpcClientImpl.getConnection(RpcClientImpl.java:1278)
  at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1152)
  at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
  at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
  at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:21711)
  at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:712)
  at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2101)
  at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1567)
  at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1545)
  at org.apache.hadoop.hbase.master.AssignmentManager.assignMeta(AssignmentManager.java:2630)
  at org.apache.hadoop.hbase.master.HMaster.assignMeta(HMaster.java:820)
  at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:685)
  at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:165)
  at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1428)
  at java.lang.Thread.run(Thread.java:745)
 
 
  This error occurs because the master node cannot resolve the hostname of
  the region server. My requirement is to automate the HBase installation
  with 1 master node and 4 region servers, but at the moment I have no way of
  updating the master's /etc/hosts file. Can I solve the problem from the
  HBase configuration side?
 
  If HBase could communicate with IP addresses, or could use the hostname
  that the region server already sends to the master, without updating
Fwd: Hbase Fully distribution mode - Cannot resolve regionserver hostname

2015-07-17 Thread Dima Spivak
+user@, dev@ to bcc

Pubudu,

I think you'll get more help on an issue like this on the users list.

-Dima

-- Forwarded message --
From: Ted Yu yuzhih...@gmail.com
Date: Fri, Jul 17, 2015 at 5:40 AM
Subject: Re: Hbase Fully distribution mode - Cannot resolve regionserver
hostname
To: d...@hbase.apache.org d...@hbase.apache.org


Have you looked at
HBASE-12954 Ability impaired using HBase on multihomed hosts

Cheers

On Fri, Jul 17, 2015 at 3:32 AM, Pubudu Gunatilaka pubudu...@gmail.com
wrote:

 Hi Devs,

 I am trying to run HBase in fully distributed mode. First I started the
 master node, then I started a region server, but I am getting the following
 error.

 2015-07-17 05:12:02,260 WARN  [pod-35:16020.activeMasterManager]
 master.AssignmentManager: Failed assignment of hbase:meta,,1.1588230740 to
 pod-36,16020,1437109916288, trying to assign elsewhere instead; try=1 of 10
 java.net.UnknownHostException: unknown host: pod-36
 at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.<init>(RpcClientImpl.java:296)
 at org.apache.hadoop.hbase.ipc.RpcClientImpl.createConnection(RpcClientImpl.java:129)
 at org.apache.hadoop.hbase.ipc.RpcClientImpl.getConnection(RpcClientImpl.java:1278)
 at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1152)
 at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
 at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
 at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:21711)
 at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:712)
 at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2101)
 at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1567)
 at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1545)
 at org.apache.hadoop.hbase.master.AssignmentManager.assignMeta(AssignmentManager.java:2630)
 at org.apache.hadoop.hbase.master.HMaster.assignMeta(HMaster.java:820)
 at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:685)
 at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:165)
 at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1428)
 at java.lang.Thread.run(Thread.java:745)


 This error occurs because the master node cannot resolve the hostname of the
 region server. My requirement is to automate the HBase installation with
 1 master node and 4 region servers, but at the moment I have no way of
 updating the master's /etc/hosts file. Can I solve the problem from the
 HBase configuration side?

 This issue could be solved if HBase communicated with IP addresses, or used
 the hostname that the region server already sends to the master, without any
 update to the /etc/hosts file. A similar approach can be found in Hadoop:
 once a datanode connects to the namenode, the namenode can communicate with
 the datanode without the /etc/hosts file being updated.
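
 If it helps, I also came across the hbase.regionserver.dns.interface and
 hbase.regionserver.dns.nameserver settings. I have not verified them on my
 version, but something like the following (eth0 and 172.17.0.1 are
 placeholders) might control which hostname the region server reports:

   <property>
     <name>hbase.regionserver.dns.interface</name>
     <!-- placeholder: network interface the region server derives its hostname from -->
     <value>eth0</value>
   </property>
   <property>
     <name>hbase.regionserver.dns.nameserver</name>
     <!-- placeholder: DNS server used for the lookup -->
     <value>172.17.0.1</value>
   </property>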

 Any help on this is appreciated.

 Thank you!

 --

 *Pubudu Gunatilaka*


