Re: Hbase with Hadoop

2011-10-12 Thread Ramya Sunil
Hi Jignesh,

I have been running quite a few hbase tests on Hadoop 0.20.205 without any
issues on both secure and non-secure clusters.

I have seen the error you mentioned when one has not specified the hbase
config directory.

Can you please try "hbase --config <hbase conf dir> shell"
and check if that solves the problem?

Thanks
Ramya


On Wed, Oct 12, 2011 at 4:50 PM, Matt Foley  wrote:

> Hi Jignesh,
> Not clear what's going on with your ZK, but as a starting point, the
> hsync/flush feature in 205 was implemented with an on-off switch. Make sure
> you've turned it on by setting dfs.support.append to true in the
> hdfs-site.xml config file.
>
> Also, are you installing Hadoop with security turned on or off?
>
> I'll gather some other config info that should help.
> --Matt
>
>
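For reference, the append switch Matt mentions is a single property in
hdfs-site.xml; a minimal entry looks like this:

  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
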
> On Wed, Oct 12, 2011 at 1:47 PM, Jignesh Patel 
> wrote:
>
> > When I tried to run Hbase 0.90.4 with hadoop-0.20.205.0, I got the
> > following error:
> >
> > Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/hbase shell
> > HBase Shell; enter 'help' for list of supported commands.
> > Type "exit" to leave the HBase Shell
> > Version 0.90.4, r1150278, Sun Jul 24 15:53:29 PDT 2011
> >
> > hbase(main):001:0> status
> >
> > ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able
> > to connect to ZooKeeper but the connection closes immediately. This could
> > be a sign that the server has too many connections (30 is the default).
> > Consider inspecting your ZK server logs for that error and then make sure
> > you are reusing HBaseConfiguration as often as you can. See HTable's
> > javadoc for more information.
> >
> >
> > And when I tried to stop HBase, I continuously see dots being printed and
> > no sign of it stopping. Not sure why it doesn't simply stop.
> >
> > stopping hbase...........
> >
> >
> > On Oct 12, 2011, at 3:19 PM, Jignesh Patel wrote:
> >
> > > The new plugin works after deleting eclipse and reinstalling it.
> > > On Oct 12, 2011, at 2:39 PM, Jignesh Patel wrote:
> > >
> > >> I have installed Hadoop-0.20.205.0, but when I replace the hadoop
> > >> 0.20.204.0 eclipse plugin with the 0.20.205.0 one, eclipse does not
> > >> recognize it.
> > >>
> > >> -Jignesh
> > >> On Oct 12, 2011, at 12:31 PM, Vinod Gupta Tankala wrote:
> > >>
> > >>> it's free and open source too... basically, their releases are ahead
> > >>> of public releases of hadoop/hbase - from what I understand, major bug
> > >>> fixes and enhancements are checked in to their branch first and then
> > >>> eventually make it to public release branches.
> > >>>
> > >>> thanks
> > >>>
> > >>> On Wed, Oct 12, 2011 at 9:26 AM, Jignesh Patel 
> > wrote:
> > >>>
> >  Sorry to hear that.
> >  Is CDH3 open source or a paid version?
> > 
> >  -jignesh
> >  On Oct 12, 2011, at 11:58 AM, Vinod Gupta Tankala wrote:
> > 
> > > for what it's worth, I was in a similar situation/dilemma a few days
> > > ago and got frustrated figuring out what version combination of
> > > hadoop/hbase to use and how to build hadoop manually to be compatible
> > > with hbase. the build process didn't work for me either.
> > > eventually, I ended up using the cloudera distribution and I think it
> > > saved me a lot of headache and time.
> > >
> > > thanks
> > >
> > > On Tue, Oct 11, 2011 at 8:29 PM, jigneshmpatel <jigneshmpa...@gmail.com> wrote:
> > >
> > >> Matt,
> > >> Thanks a lot. Just wanted to have some more information. If hadoop
> > >> 0.20.205.0 is voted in by the community members, will it then become a
> > >> major release? And what if it is not approved by the community members?
> > >>
> > >> And as you said, I would like to use 0.90.3 if it works. If it is ok,
> > >> can you share the details of those configuration changes?
> > >>
> > >> -Jignesh


Re: Hbase with Hadoop

2011-10-13 Thread Ramya Sunil
Hi Jignesh,

"--config" (i.e. - - config) is the option to use and not "-config".
Alternatively you can also set HBASE_CONF_DIR.

Below is the exact command line:

$ hbase --config /home/ramya/hbase/conf shell
hbase(main):001:0> create 'newtable','family'
0 row(s) in 0.5140 seconds

hbase(main):002:0> list 'newtable'
TABLE
newtable
1 row(s) in 0.0120 seconds

OR

$ export HBASE_CONF_DIR=/home/ramya/hbase/conf
$ hbase shell

hbase(main):001:0> list 'newtable'
TABLE

newtable

1 row(s) in 0.3860 seconds


Thanks
Ramya


On Thu, Oct 13, 2011 at 8:30 AM, jigneshmpatel wrote:

> There is no option like -config; see below:
>
> Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/hbase -config ./config
> shell
> Unrecognized option: -config
> Could not create the Java virtual machine.


Re: Hbase with Hadoop

2011-10-13 Thread Ramya Sunil
Jignesh,

I don't see zookeeper running on your master. On my cluster, jps reads the following:

$ jps
15315 Jps
13590 HMaster
15235 HQuorumPeer

Can you please shut down your HMaster and run the following first:
$ hbase-daemon.sh start zookeeper

And then start your HBase master and regionservers?

Thanks
Ramya

On Thu, Oct 13, 2011 at 12:01 PM, Jignesh Patel  wrote:

> OK, --config worked, but it is showing me the same error. How do I resolve this?
>
> http://pastebin.com/UyRBA7vX


Re: Hbase with Hadoop

2011-10-13 Thread Ramya Sunil
You already have zookeeper running on 2181 according to your jps output.
That is why the master is complaining.
Can you please stop zookeeper, verify that no daemons are running on 2181,
and restart your master?
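
One way to verify that nothing is still listening on 2181 before
restarting (a sketch; either command should work on Linux or OS X):

$ lsof -i :2181
$ netstat -an | grep 2181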

On Thu, Oct 13, 2011 at 12:37 PM, Jignesh Patel  wrote:

> Ramya,
>
>
> Based on "Hbase the definite guide" it seems zookeeper being started by
> hbase no need to start it separately(may be this is changed for 0.90.4.
> Anyways now  following is the updated status.
>
> Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/start-hbase.sh
> starting master, logging to
> /users/hadoop-user/hadoop-hbase/logs/hbase-hadoop-user-master-Jignesh-MacBookPro.local.out
> Couldnt start ZK at requested address of 2181, instead got: 2182. Aborting.
> Why? Because clients (eg shell) wont be able to find this ZK quorum
> Jignesh-MacBookPro:hadoop-hbase hadoop-user$ jps
> 41486 HQuorumPeer
> 38814 SecondaryNameNode
> 41578 Jps
> 38878 JobTracker
> 38726 DataNode
> 38639 NameNode
> 38964 TaskTracker


Re: Hbase with Hadoop

2011-10-13 Thread Ramya Sunil
Jignesh,

I have been able to deploy Hbase 0.90.3 and 0.90.4 with hadoop-0.20.205.
Below are the steps I followed:

1. Make sure none of the HBase master, regionservers, or zookeeper are
running. As Matt pointed out, turn on append.
2. hbase-daemon.sh --config $HBASE_CONF_DIR start zookeeper
3. hbase-daemon.sh --config $HBASE_CONF_DIR start master
4. hbase-daemon.sh --config $HBASE_CONF_DIR start regionserver
5. hbase --config $HBASE_CONF_DIR shell
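
If the master aborts with the "Couldnt start ZK at requested address of
2181, instead got: 2182" message seen earlier in the thread, an old
HQuorumPeer is usually still holding 2181. For reference, a minimal
single-node hbase-site.xml consistent with these steps might look like the
following (the rootdir URL and hostname are illustrative assumptions, not
the actual settings used here):

  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>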


Hope it helps.
Ramya



On Thu, Oct 13, 2011 at 4:11 PM, Jignesh Patel  wrote:

> Is there a way to resolve this weird problem?
>
> > bin/start-hbase.sh is supposed to start zookeeper, but it doesn't start it.
> > But on the other side, if zookeeper is up and running, then it says
>
> > Couldnt start ZK at requested address of 2181, instead got: 2182.
> > Aborting. Why? Because clients (eg shell) wont be able to find this ZK
> > quorum
>
>
>
> On Oct 13, 2011, at 5:40 PM, Jignesh Patel wrote:
>
> > Ok, now the problem is:
> >
> > if I only use bin/start-hbase.sh, then it doesn't start zookeeper.
> >
> > But if I use bin/hbase-daemon.sh start zookeeper before starting
> > bin/start-hbase.sh, then it tries to start zookeeper at port 2181 and I
> > get the following error:
> >
> > Couldnt start ZK at requested address of 2181, instead got: 2182.
> > Aborting. Why? Because clients (eg shell) wont be able to find this ZK
> > quorum
> >
> >
> > So I am wondering: if bin/start-hbase.sh is trying to start zookeeper,
> > then when zookeeper is not running it should start it. I only get the
> > error when zookeeper is already running.
> >
> >
> > -Jignesh


Re: streaming cacheArchive shared libraries

2011-08-05 Thread Ramya Sunil
Hi Keith,

I have tried the exact use case you have mentioned and it works fine for me.
Below is the command line for the same:

[ramya]$ jar vxf samplelib.jar
 created: META-INF/
 inflated: META-INF/MANIFEST.MF
 inflated: libhdfs.so

[ramya]$ hadoop dfs -put samplelib.jar samplelib.jar

[ramya]$ hadoop jar hadoop-streaming.jar -input InputDir -mapper "ls
testlink/libhdfs.so" -reducer NONE -output out -cacheArchive
hdfs://:/user/ramya/samplelib.jar#testlink

[ramya]$ hadoop dfs -cat out/*
testlink/libhdfs.so
testlink/libhdfs.so
testlink/libhdfs.so
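
For completeness, an archive like the one above can be built from a bare
.so with the jar tool (a sketch, assuming libhdfs.so is in the current
directory):

$ jar cf samplelib.jar libhdfs.so
$ jar tf samplelib.jar
META-INF/
META-INF/MANIFEST.MF
libhdfs.so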


Hope it helps.

Thanks
Ramya

On 8/5/11 10:10 AM, "Keith Wiley"  wrote:


I can use cacheFile to load .so files into the distributed cache and it
works fine (the streaming executable links against the .so and runs), but I
can't get it to work with -cacheArchive.  It always says it can't find the
.so file.  I realize that if you jar a directory, the directory will be
recreated when you unjar, but I've tried jar'ing a file directly.  It is
easily verified that unjarring such a file reproduces the original file as a
sibling of the jar file itself.  So it seems to me that cacheArchive should
have transferred the jar file to the cwd of my task, unjarred it, and
produced a .so file right there, but it doesn't link up with the executable.
 Like I said, I know this basic approach works just fine with cacheFile.
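
For comparison, a -cacheFile invocation along the lines Keith describes as
working might look like this (a sketch; the namenode host/port and paths
are placeholders):

$ hadoop jar hadoop-streaming.jar -input InputDir -mapper "ls libhdfs.so" \
    -reducer NONE -output out \
    -cacheFile hdfs://namenode:8020/user/keith/libhdfs.so#libhdfs.so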

What could be the problem here?  I can't easily see the files on the cluster
since it is a remote cluster with limited access.  I don't believe I can ssh
to any individual machine to investigate the files that are created for a
task...but I think I have worked through the process logically and I'm not
sure what I'm doing wrong.

Thoughts?


Keith Wiley *kwi...@keithwiley.com* keithwiley.com
music.keithwiley.com

"Luminous beings are we, not this crude matter."
   --  Yoda



Re: Jobs failing on submit

2011-08-26 Thread Ramya Sunil
Hi John,

How many tasktrackers do you have? Can you check if your tasktrackers are
running and the total available map and reduce capacity in your cluster?
Can you also post the configuration of the scheduler you are using? You
might also want to check the jobtracker logs. It would help in further
debugging.
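
A quick way to list the live tasktrackers from the command line (a sketch;
the slot capacity numbers also appear on the jobtracker web UI):

$ hadoop job -list-active-trackers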

Thanks
Ramya

On Fri, Aug 26, 2011 at 7:50 AM, John Armstrong wrote:

> One of my colleagues has noticed this problem for a while, and now it's
> biting me.  Jobs seem to be failing before ever really starting.  It seems
> to be limited (so far) to running in pseudo-distributed mode, since that's
> where he saw the problem and where I'm now seeing it; it hasn't come up on
> our cluster (yet).
>
> So here's what happens:
>
> $ java -classpath $MY_CLASSPATH MyLauncherClass -conf my-config.xml -D
> extra.properties=extravalues
> ...
> launcher output
> ...
> 11/08/26 10:35:54 INFO input.FileInputFormat: Total input paths to process
> : 2
> 11/08/26 10:35:54 INFO mapred.JobClient: Running job:
> job_201108261034_0001
> 11/08/26 10:35:55 INFO mapred.JobClient:  map 0% reduce 0%
>
> and it just sits there.  If I look at the jobtracker's web view the number
> of submissions increments, but nothing shows up as a running, completed,
> failed, or retired job.  If I use the command line probe I find
>
> $ hadoop job -list
> 1 jobs currently running
> JobId                  State  StartTime      UserName  Priority  SchedulingInfo
> job_201108261034_0001  4      1314369354247  hdfs      NORMAL    NA
>
> If I try to kill this job, nothing happens; it remains in the list with
> state 4 (failed?).  I've tried telling the mapper JVM to suspend so I can
> find it in netstat and attach a debugger from IDEA, but it seems that the
> job never gets to the point of even spinning up a JVM to run the mapper.
>
> Any ideas what might be going wrong?  Thanks.
>


Re: Jobs failing on submit

2011-08-26 Thread Ramya Sunil
On Fri, Aug 26, 2011 at 11:50 AM, John Armstrong wrote:

> On Fri, 26 Aug 2011 11:46:42 -0700, Ramya Sunil 
> wrote:
> > How many tasktrackers do you have? Can you check if your tasktrackers
> are
> > running and the total available map and reduce capacity in your cluster?
>
> In pseudo-distributed there's one tasktracker, which is running, and the
> total map and reduce capacity is reported by the jobtracker at 6 slots
> each.
>
> > Can you also post the configuration of the scheduler you are using? You
> > might also want to check the jobtracker logs. It would help in further
> > debugging.
>
> Any ideas what I should be looking for that could cause a job to list as
> failed before launching any task JVMs and without reporting back to the
> launcher that it's failed?  Am I correct in interpreting "state 4" as
> "failure"?
>

State "4" indicates that the job is still in the PREP state and not a job
failure. We have seen these kinds of errors when either the cluster does not
have tasktrackers to run the tasks or when the queue to which the job is
submitted does not have sufficient capacity.
In the logs, if you are able to see "Adding task (MAP/REDUCE)
...for tracker 'tracker_'", that means the task was
scheduled to be run on the TT. One can then look at the TT logs to check why
the tasks did not begin execution.
If you do not see this log message, that implies the cluster does not have
enough resources, due to which the JT is unable to schedule the tasks.
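
A grep along these lines is a quick way to check for that message (a
sketch; the log directory and file name pattern depend on your install):

$ grep "Adding task" $HADOOP_LOG_DIR/hadoop-*-jobtracker-*.log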

Thanks
Ramya