Re: Hbase with Hadoop

2011-10-13 Thread giridharan kesavan

Jignesh,

passing --config path_to_hbase_configs would help.

Like:
bin/hbase --config path_to_hbase_configs shell

-Giri

On 10/12/11 4:50 PM, Matt Foley wrote:

Hi Jignesh,
Not clear what's going on with your ZK, but as a starting point, the
hsync/flush feature in 205 was implemented with an on-off switch. Make sure
you've turned it on by setting dfs.support.append to true in the
hdfs-site.xml config file.
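For reference, a minimal sketch of that entry in conf/hdfs-site.xml (restart
HDFS after changing it):

<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>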

Also, are you installing Hadoop with security turned on or off?

I'll gather some other config info that should help.
--Matt


On Wed, Oct 12, 2011 at 1:47 PM, Jignesh Patel <jign...@websoft.com> wrote:


When I tried to run HBase 0.90.4 with hadoop-0.20.205.0 I got the following
error:

Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.90.4, r1150278, Sun Jul 24 15:53:29 PDT 2011

hbase(main):001:0> status

ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able
to connect to ZooKeeper but the connection closes immediately. This could be
a sign that the server has too many connections (30 is the default).
Consider inspecting your ZK server logs for that error and then make sure
you are reusing HBaseConfiguration as often as you can. See HTable's javadoc
for more information.


And when I tried to stop HBase I continuously see dots being printed and no
sign of it stopping. Not sure why it doesn't simply stop.

stopping
hbase.......


On Oct 12, 2011, at 3:19 PM, Jignesh Patel wrote:


The new plugin works after deleting Eclipse and reinstalling it.
On Oct 12, 2011, at 2:39 PM, Jignesh Patel wrote:


I have installed Hadoop-0.20.205.0, but when I replace the Hadoop
0.20.204.0 Eclipse plugin with the 0.20.205.0 one, Eclipse is not
recognizing it.

-Jignesh
On Oct 12, 2011, at 12:31 PM, Vinod Gupta Tankala wrote:


It's free and open source too. Basically, their releases are ahead of public
releases of hadoop/hbase - from what I understand, major bug fixes and
enhancements are checked into their branch first and then eventually make
it to public release branches.

thanks

On Wed, Oct 12, 2011 at 9:26 AM, Jignesh Patel <jign...@websoft.com> wrote:

Sorry to hear that.
Is CDH3 open source or a paid version?

-jignesh
On Oct 12, 2011, at 11:58 AM, Vinod Gupta Tankala wrote:


For what it's worth, I was in a similar situation/dilemma a few days ago and
got frustrated figuring out what version combination of hadoop/hbase to use
and how to build hadoop manually to be compatible with hbase. The build
process didn't work for me either.
Eventually, I ended up using the Cloudera distribution and I think it saved
me a lot of headache and time.

thanks

On Tue, Oct 11, 2011 at 8:29 PM, jigneshmpatel <jigneshmpa...@gmail.com> wrote:


Matt,
Thanks a lot. Just wanted to have some more information. If hadoop
0.20.205.0 is voted in by the community members, will it then become a major
release? And what if it is not approved by the community members?

And as you said, I would like to use 0.90.3 if it works. If it is OK, can
you share the details of those configuration changes?

-Jignesh

--
View this message in context:
http://lucene.472066.n3.nabble.com/Hbase-with-Hadoop-tp3413950p3414658.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.







--
-Giri



Re: Hbase with Hadoop

2011-10-13 Thread Jignesh Patel
Actually the real problem is here:

http://pastebin.com/jyvpivt6

Moreover, I didn't find any option like -config.

-Jignesh




Re: Hbase with Hadoop

2011-10-13 Thread jigneshmpatel
Another thing: I am using Hadoop in pseudo-distributed mode on a single node.
But even if I don't start HBase I still get the same error.

ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able
to connect to ZooKeeper but the connection closes immediately. This could be
a sign that the server has too many connections (30 is the default).
Consider inspecting your ZK server logs for that error and then make sure
you are reusing HBaseConfiguration as often as you can. See HTable's javadoc
for more information.
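For context, the shell locates ZooKeeper through the hbase-site.xml picked up
from its config dir; a minimal single-node sketch (the values here are
assumptions, adjust to your setup):

<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>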


 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Hbase-with-Hadoop-tp3413950p3418992.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.


Re: Hbase with Hadoop

2011-10-13 Thread jigneshmpatel
There is no command like -config, see below:

Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/hbase -config ./config
shell
Unrecognized option: -config
Could not create the Java virtual machine.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Hbase-with-Hadoop-tp3413950p3418924.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.


Re: Hbase with Hadoop

2011-10-13 Thread Harsh J
You'll need two hyphens before 'config'.




Re: Hbase with Hadoop

2011-10-13 Thread Matt Foley
Hi Jignesh,
The option is --config (with a double dash), not -config (with a single
dash). Please let me know if that works.

--Matt





Re: Hbase with Hadoop

2011-10-13 Thread Ramya Sunil
Hi Jignesh,

--config (i.e., two dashes) is the option to use, not -config.
Alternatively you can also set HBASE_CONF_DIR.

Below is the exact command line:

$ hbase --config /home/ramya/hbase/conf shell
hbase(main):001:0> create 'newtable','family'
0 row(s) in 0.5140 seconds

hbase(main):002:0> list 'newtable'
TABLE
newtable
1 row(s) in 0.0120 seconds

OR

$ export HBASE_CONF_DIR=/home/ramya/hbase/conf
$ hbase shell

hbase(main):001:0> list 'newtable'
TABLE
newtable
1 row(s) in 0.3860 seconds


Thanks
Ramya





Re: Hbase with Hadoop

2011-10-13 Thread Jignesh Patel
OK, --config worked but it is showing me the same error. How do I resolve this?

http://pastebin.com/UyRBA7vX




Re: Hbase with Hadoop

2011-10-13 Thread Ramya Sunil
Jignesh,

I don't see ZooKeeper running on your master. My cluster reads the following:

$ jps
15315 Jps
13590 HMaster
15235 HQuorumPeer

Can you please shut down your HMaster and run the following first:
$ hbase-daemon.sh start zookeeper

And then start your HMaster and regionservers?
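A sketch of that sequence, run from the HBase directory (script names as
shipped with 0.90.x; assumes a single-node setup):

$ bin/stop-hbase.sh                       # stop the running master first
$ bin/hbase-daemon.sh start zookeeper     # bring ZooKeeper up on its own
$ bin/hbase-daemon.sh start master        # then the master
$ bin/hbase-daemon.sh start regionserver  # and the regionserver(s)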

Thanks
Ramya





Re: Hbase with Hadoop

2011-10-13 Thread Jignesh Patel
Ramya,


Based on "HBase: The Definitive Guide" it seems ZooKeeper is started by HBase,
so there is no need to start it separately (maybe this has changed for
0.90.4). Anyway, following is the updated status:

Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/start-hbase.sh
starting master, logging to 
/users/hadoop-user/hadoop-hbase/logs/hbase-hadoop-user-master-Jignesh-MacBookPro.local.out
Couldnt start ZK at requested address of 2181, instead got: 2182. Aborting. 
Why? Because clients (eg shell) wont be able to find this ZK quorum
Jignesh-MacBookPro:hadoop-hbase hadoop-user$ jps
41486 HQuorumPeer
38814 SecondaryNameNode
41578 Jps
38878 JobTracker
38726 DataNode
38639 NameNode
38964 TaskTracker




cannot use distcp in some s3 buckets

2011-10-13 Thread Raimon Bosch
Hi,

I've been having some problems with one of our s3 buckets. I have asked on
Amazon support with no luck yet:
https://forums.aws.amazon.com/thread.jspa?threadID=78001

I'm getting this exception only with our oldest s3 bucket with this command:
hadoop distcp s3://MY_BUCKET_NAME/logfile-20110815.gz
/tmp/logfile-20110815.gz

java.lang.IllegalArgumentException: Invalid hostname in URI
s3://MY_BUCKET_NAME/logfile-20110815.gz /tmp/logfile-20110815.gz
at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:41)
at
org.apache.hadoop.fs.s3.Jets3tFileSystemStore.initialize(Jets3tFileSystemStore.java:82)

As you can see, hadoop is rejecting my URL before starting to do the
authorization steps. Has anyone run into a similar issue? I have already
tested the same operation on newer s3 buckets and the command works
correctly.

Thanks in advance,
Raimon Bosch.


Re: Hbase with Hadoop

2011-10-13 Thread Ramya Sunil
You already have zookeeper running on 2181 according to your jps output.
That is the reason the master seems to be complaining.
Can you please stop zookeeper, verify that no daemons are running on 2181,
and restart your master?
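A quick way to check the port before restarting (assuming nc and lsof are
available on the box):

$ echo stat | nc localhost 2181   # ZooKeeper's four-letter 'stat' command; replies if ZK is up
$ lsof -i :2181                   # shows which process, if any, is holding the port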





Re: cannot use distcp in some s3 buckets

2011-10-13 Thread Raimon Bosch
By the way,

The URL I'm trying has a '_' in the bucket name. Could this be the problem?






Re: cannot use distcp in some s3 buckets

2011-10-13 Thread Tom White
On Thu, Oct 13, 2011 at 2:06 PM, Raimon Bosch <raimon.bo...@gmail.com> wrote:
 By the way,

 The URL I'm trying has a '_' in the bucket name. Could this be the problem?

Yes, underscores are not permitted in hostnames.
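To illustrate: the bucket name becomes the host part of the s3:// URI, so a
hyphen parses as a hostname but an underscore does not (bucket names below
are made up; credentials assumed to be in the Hadoop config):

$ hadoop distcp s3://my-bucket/logfile.gz /tmp/logfile.gz   # hyphen: valid hostname, proceeds
$ hadoop distcp s3://my_bucket/logfile.gz /tmp/logfile.gz   # underscore: rejected up front with
                                                            # "Invalid hostname in URI", as above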

Cheers,
Tom








Re: Hbase with Hadoop

2011-10-13 Thread Jignesh Patel
OK, now the problem is:

if I only use bin/start-hbase.sh then it doesn't start zookeeper.

But if I use bin/hbase-daemon.sh start zookeeper before starting
bin/start-hbase.sh, then it will try to start zookeeper at port 2181 and then
I have the following error:

Couldnt start ZK at requested address of 2181, instead got: 2182. Aborting.
Why? Because clients (eg shell) wont be able to find this ZK quorum

So I am wondering: if bin/start-hbase.sh is trying to start zookeeper, then
when zookeeper is not running it should start it. I only get the error if
zookeeper is already running.
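For what it's worth, whether start-hbase.sh manages its own ZooKeeper is
controlled by HBASE_MANAGES_ZK in conf/hbase-env.sh; a sketch of the two
modes (0.90.x behavior as I understand it):

# in conf/hbase-env.sh:
# true  => start-hbase.sh starts and stops its own HQuorumPeer; don't also run
#          'hbase-daemon.sh start zookeeper' yourself, or the ports will clash
# false => HBase expects an externally managed ZooKeeper quorum
export HBASE_MANAGES_ZK=true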


-Jignesh





Re: Hbase with Hadoop

2011-10-13 Thread Jignesh Patel
Is there a way to resolve this weird problem?

bin/start-hbase.sh is supposed to start zookeeper but it doesn't. But on the
other side, if zookeeper is up and running, then it says:

 Couldnt start ZK at requested address of 2181, instead got: 2182. Aborting.
 Why? Because clients (eg shell) wont be able to find this ZK quorum






Web crawler in hadoop - unresponsive after a while

2011-10-13 Thread Aishwarya Venkataraman
Hello,

I am trying to make my web crawling go faster with Hadoop. My mapper just
consists of a single line and my reducer is an IdentityReducer:

while read line; do
  #result=`wget -O - --timeout=500 http://$line 2>&1`
  echo $result
done

I am crawling about 50,000 sites. But my mapper always seems to time out
after some time. The crawler just becomes unresponsive, I guess.
I am not able to see which site is causing the problem, as the mapper
deletes the output if the job fails. I am running a single-node hadoop
cluster currently.
Is this the problem?

Did anyone else have a similar problem? I am not sure why this is
happening. Can I prevent the mapper from deleting intermediate outputs?

I tried running the mapper against 10-20 sites as opposed to 50k sites and
that worked fine.
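For the question about timeouts and deleted outputs, two MR1 job settings may
help; a sketch of a streaming invocation with both set (the jar path, input,
and script names are assumptions):

# mapred.task.timeout: ms a task may go without reporting progress before being killed
# keep.failed.task.files: keep a failed task's working files around for inspection
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
    -D mapred.task.timeout=1800000 \
    -D keep.failed.task.files=true \
    -input sites.txt -output crawl-out \
    -file crawl.sh -mapper crawl.sh \
    -reducer org.apache.hadoop.mapred.lib.IdentityReducer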

Thanks,
Aishwarya


Re: Web crawler in hadoop - unresponsive after a while

2011-10-13 Thread bejoy . hadoop
Hi Aishwarya
    To debug this issue you don't necessarily need the intermediate output.
If there is any error/exception, you can get it from your job logs directly.
In your case the job turns unresponsive; to do further troubleshooting you
can include log statements in your program, rerun the job, and obtain the
records that create the problem from your logs.
    In a direct manner you can obtain your logs from the JobTracker web UI,
http://host:50030/jobtracker.jsp. From your job, drill down to the task, and
on the right side you can see options to display your task tracker logs.
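In a streaming job like this one, the simplest "log statement" is a line on
stderr: a streaming task's stderr lands in exactly those task logs, and a
line of the form reporter:status:... on stderr also updates the task's
status, which helps against timeouts. A sketch of the mapper with that added
(wget flags carried over from the original, and the fetch line uncommented):

while read line; do
  echo "fetching $line" >&2                   # plain stderr: shows up in the task log
  echo "reporter:status:fetching $line" >&2   # streaming protocol: updates task status
  result=`wget -O - --timeout=500 "http://$line" 2>&1`
  echo "$result"
done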
    On top of this I'd like to add: since you mentioned a single node, I
assume it is either in standalone or pseudo-distributed mode. These setups
are basically for development and testing of functionality. If you are
looking for better performance of your jobs, you need to leverage the
parallel processing power of hadoop. You need at least a mini cluster for
performance benchmarking and for processing relatively large data volumes.

Hope it helps!



Regards
Bejoy K S

wordcount example throwing null pointer with ConcurrentHashMap

2011-10-13 Thread Santosh Belda


Hi,

I have set up Hadoop on a single node and it worked fine, but when executing
the wordcount example the following error is thrown. Is this a configuration
issue?

 bin/hadoop jar hadoop-examples-0.20.2-cdh3u1.jar wordcount
/user/hduser/testfiles /user/hduser/output
11/10/14 10:29:53 INFO input.FileInputFormat: Total input paths to process :
3
11/10/14 10:29:53 WARN snappy.LoadSnappy: Snappy native library is available
11/10/14 10:29:53 INFO util.NativeCodeLoader: Loaded the native-hadoop
library
11/10/14 10:29:53 INFO snappy.LoadSnappy: Snappy native library loaded
11/10/14 10:29:53 INFO mapred.JobClient: Running job: job_201110141028_0001
11/10/14 10:29:54 INFO mapred.JobClient:  map 0% reduce 0%
11/10/14 10:29:59 INFO mapred.JobClient:  map 66% reduce 0%
11/10/14 10:30:01 INFO mapred.JobClient: Task Id :
attempt_201110141028_0001_r_00_0, Status : FAILED
Error: java.lang.NullPointerException
at
java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2824)
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2744)

11/10/14 10:30:02 INFO mapred.JobClient:  map 100% reduce 0%
11/10/14 10:30:03 INFO mapred.JobClient: Task Id :
attempt_201110141028_0001_r_00_1, Status : FAILED
Error: java.lang.NullPointerException
at
java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2824)
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2744)

11/10/14 10:30:05 INFO mapred.JobClient: Task Id :
attempt_201110141028_0001_r_00_2, Status : FAILED
Error: java.lang.NullPointerException
at
java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2824)
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2744)

11/10/14 10:30:08 INFO mapred.JobClient: Job complete: job_201110141028_0001
11/10/14 10:30:08 INFO mapred.JobClient: Counters: 18
11/10/14 10:30:08 INFO mapred.JobClient:   Job Counters
11/10/14 10:30:08 INFO mapred.JobClient: Launched reduce tasks=4
11/10/14 10:30:08 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=9167
11/10/14 10:30:08 INFO mapred.JobClient: Total time spent by all reduces
waiting after reserving slots (ms)=0
11/10/14 10:30:08 INFO mapred.JobClient: Total time spent by all maps
waiting after reserving slots (ms)=0
11/10/14 10:30:08 INFO mapred.JobClient: Launched map tasks=3
11/10/14 10:30:08 INFO mapred.JobClient: Data-local map tasks=3
11/10/14 10:30:08 INFO mapred.JobClient: Failed reduce tasks=1
11/10/14 10:30:08 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=3292
11/10/14 10:30:08 INFO mapred.JobClient:   FileSystemCounters
11/10/14 10:30:08 INFO mapred.JobClient: FILE_BYTES_READ=740427
11/10/14 10:30:08 INFO mapred.JobClient: HDFS_BYTES_READ=2863597
11/10/14 10:30:08 INFO mapred.JobClient: FILE_BYTES_WRITTEN=2161157
11/10/14 10:30:08 INFO mapred.JobClient:   Map-Reduce Framework
11/10/14 10:30:08 INFO mapred.JobClient: Combine output records=87431
11/10/14 10:30:08 INFO mapred.JobClient: Map input records=58570
11/10/14 10:30:08 INFO mapred.JobClient: Spilled Records=138742
11/10/14 10:30:08 INFO mapred.JobClient: Map output bytes=4774081
11/10/14 10:30:08 INFO mapred.JobClient: Combine input records=487561
11/10/14 10:30:08 INFO mapred.JobClient: Map output records=487561
11/10/14 10:30:08 INFO mapred.JobClient: SPLIT_RAW_BYTES=361

-- 
View this message in context: 
http://old.nabble.com/wordcount-example-throwing-null-pointer-with-ConcurrentHashMap-tp32650178p32650178.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.