It appears that my issue was caused by the missing sections I
mentioned in the second post. I ran a job with these settings, and my
job finished in under 6 hours. Thanks for your suggestions; they also give me
ideas for issues moving forward.
scan.setCaching(500); // 1 is the default
Hi Chien,
4. From 50-150k per *second* to 100-150k per *minute*, as stated
above, so reads went *down* significantly. I think you must have
misread.
I will take into account some of your other suggestions.
Thanks,
Colin
On Tue, Apr 12, 2016 at 8:19 PM, Chien Le wrote:
Some things I would look at:
1. Node statistics, both the mapper and regionserver nodes. Make sure
they're on fully healthy nodes (no disk issues, no half duplex, etc) and
that they're not already saturated from other jobs.
2. Is there a common regionserver behind the remaining mappers/regions? If
I've noticed that I've omitted
scan.setCaching(500);        // 1 is the default in Scan, which will be bad for MapReduce jobs
scan.setCacheBlocks(false);  // don't set to true for MR jobs
which appear to be suggestions from examples. Still I am not sure if
this explains the significant request
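For reference, here is a minimal sketch of those Scan settings wired into a mapper job via TableMapReduceUtil; the table name, mapper class, and output types are illustrative, and this assumes the 0.94-era HBase client API, so it is not runnable standalone:

```java
// Illustrative only: assumes the HBase client library on the classpath.
Scan scan = new Scan();
scan.setCaching(500);        // default is 1, far too low for a MapReduce full scan
scan.setCacheBlocks(false);  // avoid churning the region server block cache
TableMapReduceUtil.initTableMapperJob(
    "my_table",        // hypothetical source table
    scan,
    MyMapper.class,    // hypothetical TableMapper subclass
    Text.class,        // mapper output key
    Result.class,      // mapper output value
    job);
```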
Excuse my double post. I thought I deleted my draft, and then
constructed a cleaner, more detailed, more readable mail.
On Tue, Apr 12, 2016 at 10:26 PM, Colin Kincaid Williams wrote:
> After trying to get help with distcp on hadoop-user and cdh-user
> mailing lists, I've given
Hi Dilon,
Sounds like your table was not pre-split, from the behavior you are
describing. But when you say you are bulk loading the data using MR, is
this an MR job that does Put(s) into HBase, or one that just generates
HFiles (if using importtsv you have both options) that are later on bulk
I searched (current) 0.98 and branch-1 where I found:
./hbase-client/src/main/java/org/apache/hadoop/hbase/security/token/TokenUtil.java
Looking at both 0.98[1] and 0.98.6[2] on github I see TokenUtil as
part of hbase-server.
Is it necessary for us to add this call to TokenUtil to all MR jobs
Please take a look at
HBASE-12493 User class should provide a way to re-use existing token
which went into 0.98.9
FYI
On Thu, May 14, 2015 at 8:37 AM, Edward C. Skoviak edward.skov...@gmail.com
wrote:
bq. it has been moved to be a part of the hbase-server package
I searched (current) 0.98 and branch-1 where I found:
./hbase-client/src/main/java/org/apache/hadoop/hbase/security/token/TokenUtil.java
FYI
On Wed, May 13, 2015 at 11:45 AM, Edward C. Skoviak
edward.skov...@gmail.com wrote:
I'm
Hi Shahab,
Thanks for the response.
I have added the @Override and somehow that worked. I have pasted the new
Reducer code below.
Though, I did not understand the difference here, i.e. what I have done
differently. It might be for a very silly reason, though.
Hi,
The @Override annotation worked because without it the reduce method in
the superclass (Reducer) was being invoked, which basically writes the
input from the mapper class to the context object. Try to look up the
source code for the Reducer class online and you'll realize that.
Hope that
Parkirat,
This is a core Java concept which is mainly related to how Class
inheritance works in Java and how the @Override annotation is used, and is
not Hadoop specific. (It is also used while implementing interfaces since
JDK 6.)
You can read about it here:
Hi Parkirat,
I don't think that HBase is causing the problems. You might already know
this but need to add the reducer class to the job as you add the mapper.
Also, if you want to read from a HBase table in a MapReduce job, you need
to implement the TableMapper for the mapper and if you want to
Thanks All for replying to my thread.
I have further investigated the issue and found that Hadoop is not
running/respecting any reducer for my jobs, irrespective of whether it is
normal MapReduce or the HBase API for MapReduce.
I am pasting word count example that I have run and the input and output
file
Add the @Override annotation at the top of the 'reduce' method and then try
(just like you are doing for the 'map' method):
public class WordCountReducer extends Reducer<Text, IntWritable, Text,
IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values,
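The core-Java behavior behind this advice can be demonstrated without Hadoop at all; the classes below are illustrative stand-ins, not Hadoop's actual API:

```java
// Without @Override (or with a mismatched signature), a subclass method
// can become a new overload, and the superclass version keeps running.
public class OverrideDemo {
    static class BaseReducer {
        // Stand-in for Hadoop's default Reducer.reduce(), which just
        // passes mapper output through to the context.
        String reduce(String key) {
            return key;
        }
    }

    static class WordCountReducer extends BaseReducer {
        // @Override makes the compiler verify this really replaces the
        // superclass method; a typo'd signature would fail to compile
        // instead of silently never being invoked.
        @Override
        String reduce(String key) {
            return key + "=reduced";
        }
    }

    public static void main(String[] args) {
        BaseReducer r = new WordCountReducer();
        System.out.println(r.reduce("apple")); // the subclass version runs
    }
}
```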
Hi Parkirat,
I don't follow the reducer problem you're having. Can you post your code
that configures the job? I assume you're using TableMapReduceUtil someplace.
Your reducer is removing duplicate values? Sounds like you need to update
its logic to only emit a value once. Pastebin-ing your
Hi Ted,
I am trying your solution. But I got the same error message.
Thanks
Did you create the table prior to launching your program ?
If so, when you scan hbase:meta table, do you see row(s) for it ?
Cheers
On Feb 4, 2014, at 12:53 AM, Murali muralidha...@veradistech.com wrote:
Hi Ted,
I am trying your solution. But I got the same error message.
Thanks
Hi Ted,
I am using HBase 0.96 version. But I am also getting the below error
message
14/02/03 10:18:32 ERROR mapreduce.TableOutputFormat:
org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find
region for after 35 tries.
Exception in thread "main"
Murali:
Are you using 0.96.1.1 ?
Can you show us the command line you used ?
Meanwhile I assume the HBase cluster is functional - you can use shell to
insert data.
Cheers
On Mon, Feb 3, 2014 at 8:33 PM, Murali muralidha...@veradistech.com wrote:
Hi Ted,
I am using HBase 0.96 version.
Hi Ted
Thanks for your reply. I am using HBase version 0.96.0. I can insert a
record using a shell command. I am running the below command to run my
MapReduce job. It is a word count example: reading a text file from an HDFS
path and inserting the counts into an HBase table.
hadoop jar hb.jar
See the sample command in
http://hbase.apache.org/book.html#trouble.mapreduce :
HADOOP_CLASSPATH=`hbase classpath` hadoop jar
On Mon, Feb 3, 2014 at 9:33 PM, Murali muralidha...@veradistech.com wrote:
Hi Ted
Thanks for your reply. I am using HBase version 0.96.0. I can insert a
record
Have you considered using MultiTableInputFormat ?
Cheers
On Mon, Jan 27, 2014 at 9:14 AM, daidong daidon...@gmail.com wrote:
Dear all,
I am writing a MapReduce application processing HBase table. In each map,
it needs to read data from another HBase table, so I use the 'setup'
function
I agree that we should find the cause for why initialization got stuck.
I noticed empty catch block:
} catch (IOException e) {
}
Can you add some logging there to see what might have gone wrong ?
Thanks
On Mon, Jan 27, 2014 at 11:56 AM, daidong daidon...@gmail.com wrote:
Dear
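As a plain-Java illustration of that suggestion, here is the shape of a catch block that logs instead of swallowing the exception; the failing call is simulated:

```java
import java.io.IOException;

// An empty catch block hides failures; at minimum, log the exception so
// a stuck initialization can be diagnosed.
public class CatchLoggingDemo {
    static String initTable() {
        try {
            throw new IOException("cannot locate region"); // simulated failure
        } catch (IOException e) {
            // Instead of an empty block, record what went wrong.
            System.err.println("table init failed: " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(initTable());
    }
}
```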
Why do you use 0.95 which was a developer release ?
See http://hbase.apache.org/book.html#d243e520
Cheers
On Fri, Jan 24, 2014 at 8:40 AM, daidong daidon...@gmail.com wrote:
Dear all,
I have a simple HBase MapReduce application and try to run it on a
12-node cluster using this command:
Thanks Ted, I actually tried to modify HBase, so I chose this developer
release.
So, you are thinking this is a version problem that should disappear if I
switched to 0.96?
2014/1/24 Ted Yu yuzhih...@gmail.com
Why do you use 0.95 which was a developer release ?
See
Hi,
Is your table properly served? Are you able to see it on the Web UI? Is your
HBCK reporting everything correctly?
JM
2013/7/11 S. Zhou myx...@yahoo.com
I am running a very simple MR HBase job (reading from a tiny HBase table
and outputs nothing). I run it on a pseudo-distributed HBase
Yes, I can see the table through hbase shell and web ui (localhost:60010). hbck
reports ok
From: Jean-Marc Spaggiari jean-m...@spaggiari.org
To: user@hbase.apache.org; S. Zhou myx...@yahoo.com
Sent: Thursday, July 11, 2013 11:01 AM
Subject: Re: HBase mapreduce job: unable to find region for a table
Hi,
Is your table properly served? Are you able to see it on the Web UI? Is
your HBCK
Here you have several examples:
http://hbase.apache.org/book/mapreduce.example.html
http://sujee.net/tech/articles/hadoop/hbase-map-reduce-freq-counter/
http://bigdataprocessing.wordpress.com/2012/07/27/hadoop-hbase-mapreduce-examples/
Thanks. But I don't know why, when the client buffer size is increased, I get
a bad result; is it related to other parameters? I give 8 GB of heap to
each regionserver.
On Mon, Jan 21, 2013 at 12:34 PM, Harsh J ha...@cloudera.com wrote:
Hi Farrokh,
This isn't an HDFS question - please ask these
Give put(List<Put> puts) a shot and see if it works for you.
Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com
On Mon, Jan 21, 2013 at 11:41 AM, Farrokh Shahriari
mohandes.zebeleh...@gmail.com wrote:
Hi there,
Is there any way to use an ArrayList of Puts in the map function to
And also, how can I use autoflush / the client-side buffer in the map function
for inserting data into an HBase table?
You are using TableOutputFormat, right? There autoFlush is turned OFF. You can
use the config param hbase.client.write.buffer to set the client-side buffer
size.
-Anoop-
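A hedged example of that setting in hbase-site.xml on the client side (the value is in bytes; the default in this era was about 2 MB, and 8 MB here is just an illustration to tune for your workload):

```xml
<property>
  <name>hbase.client.write.buffer</name>
  <value>8388608</value> <!-- 8 MB client-side write buffer -->
</property>
```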
Subject: RE: Hbase MapReduce
It's weird that HBase aggregate functions don't use MapReduce; this means
that the performance will be very poor.
Is it a must to use coprocessors?
Is there a much easier way to improve the functions' performance?
Regards, Dalia.
You have to use MapReduce for that.
In the HBase in Practice book, there are lots of great examples for this.
On 11/24/2012 12:15 PM, Dalia Sobhy wrote:
Dear all,
I wanted to ask a question..
Do Hbase Aggregate Functions such as rowcount, getMax, get Average use
MapReduce to
Hi, but you do not need to use M/R. You could also use coprocessors.
See this site:
https://blogs.apache.org/hbase/entry/coprocessor_introduction
- in the section Endpoints
An aggregation coprocessor ships with hbase that should match your
requirements.
You just need to load it and eventually
Do you think it would be a good idea to temper the use of CoProcessors?
This kind of reminds me of when people first started using stored procedures...
Sent from a remote device. Please excuse any typos...
Mike Segel
On Nov 24, 2012, at 11:46 AM, tom t...@arcor.de wrote:
Hi, but you do not
Hello Amlan,
Issue is still unresolved...Will get fixed in 0.96.0.
Regards,
Mohammad Tariq
On Mon, Aug 6, 2012 at 5:01 PM, Amlan Roy amlan@cleartrip.com wrote:
Hi,
While writing a MapReduce job for HBase, can I use multiple tables as input?
I think
Hi,
Isn't it the case that you can always initiate a scanner inside a map
job, referring to another table than the one that was set into the
configuration by TableMapReduceUtil.initTableMapperJob(...)?
Hope this serves as temporary solution.
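A sketch of that temporary solution, assuming the old (0.92-era) HTable API; the table and class names are illustrative, and it is not runnable without the HBase libraries:

```java
public class JoinMapper extends TableMapper<Text, Text> {
    private HTable sideTable; // the second table, opened once per task

    @Override
    protected void setup(Context context) throws IOException {
        sideTable = new HTable(HBaseConfiguration.create(context.getConfiguration()),
                "other_table"); // hypothetical second input table
    }

    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context)
            throws IOException, InterruptedException {
        // Look up the matching row in the second table for each input row.
        Result side = sideTable.get(new Get(row.get()));
        // ... combine 'value' and 'side', then context.write(...)
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        sideTable.close();
    }
}
```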
On 08/06/2012 02:35 PM, Mohammad Tariq wrote:
Hi Amlan,
I think if you share your usecase regarding two tables as inputs, people on
the mailing list may be able to help you better. For example, are you
looking at joining the two tables? What are the sizes of the tables etc?
Best Regards,
Sonal
Crux: Reporting for HBase
What is the best solution available in 0.92.0 (I
understand the best solution is coming in version 0.96.0).
Regards,
Amlan
-Original Message-
From: Ioakim Perros [mailto:imper...@gmail.com]
Sent: Monday, August 06, 2012 5:11 PM
To: user@hbase.apache.org
Subject: Re: HBase MapReduce - Using multiple tables as source
On Mon, Aug 6, 2012 at 3:22 PM, Wei Tan w...@us.ibm.com wrote:
I understand that this is achievable by running multiple MR jobs, each
with a different output table specified in the reduce class. What I want
is to scan a source table once and generate multiple tables at one time.
Thanks,
It's available just as a patch on trunk for now.
You won't find it in 0.92.0.
./zahoor
On 06-Aug-2012, at 5:01 PM, Amlan Roy amlan@cleartrip.com wrote:
https://issues.apache.org/jira/browse/HBASE-3996
My first guess would be to check if all the KVs are using the same
qualifier, because then it's basically the same cell 10 times.
J-D
On Mon, May 14, 2012 at 6:50 PM, Ben Kim benkimkim...@gmail.com wrote:
Hello!
I'm writing a mapreduce code to read a SequenceFile and write it to hbase
table.
Oops, I made a mistake while copy-pasting.
The reducer initialization code should be like this:
TableMapReduceUtil.initTableReducerJob("rs_system", MyTableReducer.class,
itemTableJob);
On Tue, May 15, 2012 at 10:50 AM, Ben Kim benkimkim...@gmail.com wrote:
Hello!
I'm writing a mapreduce code to read a
Take a look at HBASE-3996 where Stack has some comments outstanding.
Cheers
On Tue, Apr 3, 2012 at 5:52 AM, Shawn Quinn squ...@moxiegroup.com wrote:
Hello,
I have a table whose key is structured as eventType + time, and I need to
periodically run a map reduce job on the table which will
Sounds good, thanks Ted. I'll give it a whirl and add any
comments/findings to the Jira issue.
-Shawn
On Tue, Apr 3, 2012 at 10:45 AM, Ted Yu yuzhih...@gmail.com wrote:
Stack said he might help implement his suggestions if Eran is busy.
The patch doesn't depend on recent changes to the
I tried to run the program from Eclipse, but during that I could not see
any job running on the jobtracker/tasktracker web UI pages. I observed that
in Eclipse the LocalJobRunner is executing, so the job is not submitted
to the whole cluster but executes on the namenode machine alone.
You don't need the conf dir in the jar, in fact you really don't want
it there. I don't know where that alert is coming from, would be nice
if you gave more details.
J-D
On Fri, Dec 9, 2011 at 6:45 AM, Vamshi Krishna vamshi2...@gmail.com wrote:
Hi,
I want to run a mapreduce program to insert
I had the same issue.
The problem for me turned out to be that the hbase.zookeeper.quorum was
not set in hbase-site.xml in the server that submitted the mapreduce
job. Ironically, this is also the same server that was running hbase
master. This defaulted to 127.0.0.1 which was where the
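In other words, the submitting machine's hbase-site.xml needs something like the following (hostnames illustrative); without it, hbase.zookeeper.quorum falls back to localhost:

```xml
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
```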
HBase doesn't have its own MapReduce system, it uses Hadoop's. How
are you launching your jobs?
On Mon, Sep 12, 2011 at 2:32 AM, Jimson K. James
jimson.ja...@nestgroup.net wrote:
Hi All,
When I run Hadoop mapreduce jobs, the job statistics and status are
displayed in the jobtracker/task
Subject: Re: Hbase Mapreduce jobs Dashboard
this issue is still not resolved...
Unfortunately, calling HConnectionManager.deleteConnection(conf, true); after
the MR job is finished does not close the connection to ZooKeeper.
We have 3 zookeeper nodes;
by default there is a limit of 10 connections allowed from a single client,
so after
Try getting the ZooKeeperWatcher from the connection on your way out
and explicitly shutdown the zk connection (see TestZooKeeper unit test
for example).
St.Ack
On Thu, Jul 28, 2011 at 6:01 AM, Andre Reiter a.rei...@web.de wrote:
this issue is still not resolved...
unfortunately calling
A 10-connection maximum is too low. It has been recommended on the list to go
up to as many as 2000 connections. This doesn't fix your problem, but it is
something you should probably have in your configuration.
~Jeff
On 7/28/2011 10:00 AM, Stack wrote:
Try getting the ZooKeeperWatcher from the
Subject: Re: HBase MapReduce Zookeeper
I guess I know the reason why HConnectionManager.deleteConnection(conf,
true); does not work for me.
In the MR job I'm using TableInputFormat; if you have a look at the source code,
in the method
public void setConf(Configuration configuration)
there is a line creating the HTable like this:
Yes, that's the connection leak.
Use deleteAllConnections(true), and it will close all open connections.
- Ruben
From: Andre Reiter a.rei...@web.de
To: user@hbase.apache.org
Sent: Thu, July 28, 2011 4:55:52 PM
Subject: Re: HBase MapReduce Zookeeper
i guess
hi Ruben, St.Ack
thanks a lot for your help!
finally, the problem seems to be solved by a pretty sick workaround:
I did it like Bryan Keller described in this issue:
https://issues.apache.org/jira/browse/HBASE-3792
@Ruben: thanks for the urls to that issues
cheers
andre
Maybe job.setJarByClass() can solve this problem.
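For context, a sketch of that call during job setup (class and job names are illustrative; assumes the Hadoop MapReduce API and is not runnable standalone):

```java
Job job = new Job(conf, "hbase-import");
// Point Hadoop at the jar containing your job classes so task JVMs on
// the cluster can load the Map class, instead of only the local JVM.
job.setJarByClass(MyMapper.class);
```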
On Thu, Jul 28, 2011 at 7:06 PM, air cnwe...@gmail.com wrote:
-- Forwarded message --
From: air cnwe...@gmail.com
Date: 2011/7/28
Subject: HBase Mapreduce cannot find Map class
To: CDH Users cdh-u...@cloudera.org
import
Hi St.Ack,
thanks for your reply
but finally I miss the point: what would be the options to solve our issue?
andre
Can you reuse Configuration instances though the configuration changes?
Else in your Mapper#cleanup, call HTable.close() then try
HConnectionManager.deleteConnection(table.getConfiguration()) after
close (could be issue with executors used by multi* operations not
completing before delete of
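A sketch of that cleanup sequence for the 0.90.x API (field names illustrative; not runnable without the HBase libraries):

```java
@Override
protected void cleanup(Context context) throws IOException {
    table.close(); // flush and release the HTable first
    // Then drop the cached connection keyed by this table's Configuration.
    HConnectionManager.deleteConnection(table.getConfiguration(), true);
}
```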
Hi Stack,
just to make it clear: the connections to ZooKeeper being kept open are
not on our mappers (tasktrackers) but on the client, which schedules the MR job.
I think the mappers are just fine, as they are.
andre
Stack wrote:
Can you reuse Configuration instances though the
Then similarly, can you do the deleteConnection above in your client
or reuse the Configuration client-side that you use setting up the
job?
St.Ack
On Wed, Jul 20, 2011 at 12:13 AM, Andre Reiter a.rei...@web.de wrote:
Hi Stack,
just to make clear, actually the connections to the zookeeper
Hi St.Ack,
actually, calling HConnectionManager.deleteConnection(conf, true); does not
close the connection to ZooKeeper;
I can still see the connection established...
andre
Stack wrote:
Then similarly, can you do the deleteConnection above in your client
or reuse the Configuration
Andre:
So you didn't see the following in client log (HConnectionManager line 1067)
?
LOG.info("Closed zookeeper sessionid=0x" +
Long.toHexString(this.zooKeeper.getZooKeeper().getSessionId()));
HConnectionManager.deleteConnection(conf, true) is supposed to close zk
connection in
unfortunately there was no such LOG entry... :-(
our versions:
hadoop-0.20.2-CDH3B4
hbase-0.90.1-CDH3B4
zookeeper-3.3.2-CDH3B4
either the map HConnectionManager.HBASE_INSTANCES does not contain the
connection for the current config, or HConnectionImplementation.zooKeeper is
null
but the
This seems to be cdh related.
either the map HConnectionManager.HBASE_INSTANCES does not contain the
connection for the current config
You need to pass the same conf object.
In trunk, I added the following:
public static void deleteStaleConnection(HConnection connection) {
See
Hi Ted,
thanks for the reply,
at the moment I'm just wondering why the client creates a ZooKeeper connection
at all.
All the client has to do is schedule an MR job, which is done by connecting
to the jobtracker and providing all the needed stuff: config, some extra
resources in the
My guess is that it needs to ask the master for the regions so it can
make the splits used by mapper tasks (to find master, needs to ask zk,
etc.). Check it out yourself under the mapreduce package?
St.Ack
On Wed, Jul 20, 2011 at 3:06 PM, Andre Reiter a.rei...@web.de wrote:
Hi Ted,
thanks for
Hi there-
re: that we have to reuse the Configuration object
You are probably referring to this...
http://hbase.apache.org/book.html#client.connections
... yes, that is general guidance on client connection..
re: do i have to create a pool of Configuration objects, to share them
Hi Doug,
thanks a lot for the reply.
It's clear that there is a parameter for maxClientCnxns, which is 10 by default.
Of course I could increase it to something big, but like I said, the old
connections are still there, and I cannot imagine that it is correct
behaviour to leave them open.
Configuration is not Comparable. It's instance identity that is used when
comparing Configurations down in the guts of HConnectionManager in
0.90.x HBase, so even if you reuse a Configuration and tweak it per
job, as far as HCM is concerned it's the 'same'.
Are you seeing otherwise?
St.Ack
On Tue, Jul