Re: Hbase with Hadoop
Jignesh, passing --config path_to_hbase_configs would help. Like:

bin/hbase --config path_to_hbase_configs shell

-Giri

On 10/12/11 4:50 PM, Matt Foley wrote:

Hi Jignesh, Not clear what's going on with your ZK, but as a starting point, the hsync/flush feature in 205 was implemented with an on-off switch. Make sure you've turned it on by setting dfs.support.append to true in the hdfs-site.xml config file. Also, are you installing Hadoop with security turned on or off? I'll gather some other config info that should help. --Matt

On Wed, Oct 12, 2011 at 1:47 PM, Jignesh Patel <jign...@websoft.com> wrote:

When I tried to run HBase 0.90.4 with hadoop-0.20.205.0 I got the following error:

Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.90.4, r1150278, Sun Jul 24 15:53:29 PDT 2011

hbase(main):001:0> status

ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information.

And when I tried to stop HBase I continuously saw dots being printed and no sign of it stopping. Not sure why it doesn't simply stop:

stopping hbase.......

On Oct 12, 2011, at 3:19 PM, Jignesh Patel wrote:

The new plugin works after deleting Eclipse and reinstalling it.

On Oct 12, 2011, at 2:39 PM, Jignesh Patel wrote:

I have installed Hadoop-0.20.205.0, but when I replace the Hadoop 0.20.204.0 Eclipse plugin with the 0.20.205.0 one, Eclipse is not recognizing it. -Jignesh

On Oct 12, 2011, at 12:31 PM, Vinod Gupta Tankala wrote:

It's free and open source too. Basically, their releases are ahead of public releases of Hadoop/HBase - from what I understand, major bug fixes and enhancements are checked in to their branch first and then eventually make it to the public release branches. Thanks

On Wed, Oct 12, 2011 at 9:26 AM, Jignesh Patel <jign...@websoft.com> wrote:

Sorry to hear that. Is CDH3 an open-source or a paid version? -jignesh

On Oct 12, 2011, at 11:58 AM, Vinod Gupta Tankala wrote:

For what it's worth, I was in a similar situation/dilemma a few days ago and got frustrated figuring out which version combination of Hadoop/HBase to use and how to build Hadoop manually to be compatible with HBase. The build process didn't work for me either. Eventually I ended up using the Cloudera distribution, and I think it saved me a lot of headache and time. Thanks

On Tue, Oct 11, 2011 at 8:29 PM, jigneshmpatel <jigneshmpa...@gmail.com> wrote:

Matt, Thanks a lot. Just wanted to have some more information. If Hadoop 0.20.205.0 is voted on by the community members, will it become a major release? And what if it is not approved by the community members? And as you said, I would like to use 0.90.3 if it works. If it is ok, can you share the details of those configuration changes? -Jignesh

--
View this message in context: http://lucene.472066.n3.nabble.com/Hbase-with-Hadoop-tp3413950p3414658.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.

--
-Giri
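Matt's dfs.support.append pointer above boils down to one property in hdfs-site.xml (the file lives in the Hadoop conf directory; the exact path varies by install). A minimal sketch of the property block:

```xml
<!-- hdfs-site.xml: the on-off switch for the hsync/flush (append)
     support in 0.20.205 that HBase relies on. -->
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
```

The HDFS daemons need a restart after changing this for it to take effect.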
Re: Hbase with Hadoop
Actually the real problem is here: http://pastebin.com/jyvpivt6

Moreover, I didn't find any command like -config.

-Jignesh

On Oct 13, 2011, at 2:02 AM, giridharan kesavan wrote:

Jignesh, passing --config path_to_hbase_configs would help. Like:

bin/hbase --config path_to_hbase_configs shell

-Giri
Re: Hbase with Hadoop
Another thing: I am using Hadoop in pseudo-distributed single-node mode. But even if I don't start HBase I get the same error:

ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information.

--
View this message in context: http://lucene.472066.n3.nabble.com/Hbase-with-Hadoop-tp3413950p3418992.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: Hbase with Hadoop
There is no command like -config; see below:

Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/hbase -config ./config shell
Unrecognized option: -config
Could not create the Java virtual machine.

--
View this message in context: http://lucene.472066.n3.nabble.com/Hbase-with-Hadoop-tp3413950p3418924.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: Hbase with Hadoop
You'll need two hyphens before 'config'.

On 13-Oct-2011, at 9:00 PM, jigneshmpatel wrote:

There is no command like -config; see below:

Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/hbase -config ./config shell
Unrecognized option: -config
Could not create the Java virtual machine.
Re: Hbase with Hadoop
Hi Jignesh, the option is --config (with a double dash), not -config (with a single dash). Please let me know if that works. --Matt

On Thu, Oct 13, 2011 at 8:30 AM, jigneshmpatel <jigneshmpa...@gmail.com> wrote:

There is no command like -config; see below:

Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/hbase -config ./config shell
Unrecognized option: -config
Could not create the Java virtual machine.
Re: Hbase with Hadoop
Hi Jignesh,

--config (i.e. with a double dash) is the option to use, not -config. Alternatively you can also set HBASE_CONF_DIR. Below is the exact command line:

$ hbase --config /home/ramya/hbase/conf shell
hbase(main):001:0> create 'newtable','family'
0 row(s) in 0.5140 seconds

hbase(main):002:0> list 'newtable'
TABLE
newtable
1 row(s) in 0.0120 seconds

OR

$ export HBASE_CONF_DIR=/home/ramya/hbase/conf
$ hbase shell
hbase(main):001:0> list 'newtable'
TABLE
newtable
1 row(s) in 0.3860 seconds

Thanks
Ramya

On Thu, Oct 13, 2011 at 8:30 AM, jigneshmpatel <jigneshmpa...@gmail.com> wrote:

There is no command like -config; see below:

Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/hbase -config ./config shell
Unrecognized option: -config
Could not create the Java virtual machine.
Re: Hbase with Hadoop
OK, --config worked but it is showing me the same error. How do I resolve this? http://pastebin.com/UyRBA7vX

On Oct 13, 2011, at 1:34 PM, Ramya Sunil wrote:

Hi Jignesh, --config (i.e. with a double dash) is the option to use, not -config. Alternatively you can also set HBASE_CONF_DIR.
Re: Hbase with Hadoop
Jignesh, I don't see ZooKeeper running on your master. My cluster reads the following:

$ jps
15315 Jps
13590 HMaster
15235 HQuorumPeer

Can you please shut down your HMaster and run the following first:

$ hbase-daemon.sh start zookeeper

And then start your HBase master and regionservers?

Thanks
Ramya

On Thu, Oct 13, 2011 at 12:01 PM, Jignesh Patel <jign...@websoft.com> wrote:

OK, --config worked but it is showing me the same error. How do I resolve this? http://pastebin.com/UyRBA7vX
Re: Hbase with Hadoop
Ramya,

Based on "HBase: The Definitive Guide" it seems ZooKeeper is started by HBase, so there is no need to start it separately (maybe this changed for 0.90.4). Anyway, the following is the updated status:

Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/start-hbase.sh
starting master, logging to /users/hadoop-user/hadoop-hbase/logs/hbase-hadoop-user-master-Jignesh-MacBookPro.local.out
Couldnt start ZK at requested address of 2181, instead got: 2182. Aborting. Why? Because clients (eg shell) wont be able to find this ZK quorum

Jignesh-MacBookPro:hadoop-hbase hadoop-user$ jps
41486 HQuorumPeer
38814 SecondaryNameNode
41578 Jps
38878 JobTracker
38726 DataNode
38639 NameNode
38964 TaskTracker

On Oct 13, 2011, at 3:23 PM, Ramya Sunil wrote:

Jignesh, I don't see ZooKeeper running on your master. Can you please shut down your HMaster, run "hbase-daemon.sh start zookeeper" first, and then start your HBase master and regionservers?

Thanks
Ramya
cannot use distcp in some s3 buckets
Hi,

I've been having some problems with one of our S3 buckets. I have asked on Amazon support with no luck yet: https://forums.aws.amazon.com/thread.jspa?threadID=78001

I'm getting this exception, only with our oldest S3 bucket, with this command:

hadoop distcp s3://MY_BUCKET_NAME/logfile-20110815.gz /tmp/logfile-20110815.gz

java.lang.IllegalArgumentException: Invalid hostname in URI s3://MY_BUCKET_NAME/logfile-20110815.gz
        at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:41)
        at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.initialize(Jets3tFileSystemStore.java:82)

As you can see, Hadoop is rejecting my URL before starting the authorization steps. Has anyone been in a similar situation? I have already tested the same operation on newer S3 buckets and the command works correctly.

Thanks in advance,
Raimon Bosch.
Re: Hbase with Hadoop
You already have ZooKeeper running on 2181 according to your jps output. That is the reason the master seems to be complaining. Can you please stop ZooKeeper, verify that no daemons are running on 2181, and restart your master?

On Thu, Oct 13, 2011 at 12:37 PM, Jignesh Patel <jign...@websoft.com> wrote:

Ramya, based on "HBase: The Definitive Guide" it seems ZooKeeper is started by HBase, so there is no need to start it separately (maybe this changed for 0.90.4). Anyway, the following is the updated status:

Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/start-hbase.sh
starting master, logging to /users/hadoop-user/hadoop-hbase/logs/hbase-hadoop-user-master-Jignesh-MacBookPro.local.out
Couldnt start ZK at requested address of 2181, instead got: 2182. Aborting. Why? Because clients (eg shell) wont be able to find this ZK quorum

Jignesh-MacBookPro:hadoop-hbase hadoop-user$ jps
41486 HQuorumPeer
38814 SecondaryNameNode
41578 Jps
38878 JobTracker
38726 DataNode
38639 NameNode
38964 TaskTracker
Re: cannot use distcp in some s3 buckets
By the way, the URL I'm trying has a '_' in the bucket name. Could this be the problem?

2011/10/13 Raimon Bosch <raimon.bo...@gmail.com>:

Hi, I've been having some problems with one of our S3 buckets. I have asked on Amazon support with no luck yet: https://forums.aws.amazon.com/thread.jspa?threadID=78001

I'm getting this exception, only with our oldest S3 bucket, with this command:

hadoop distcp s3://MY_BUCKET_NAME/logfile-20110815.gz /tmp/logfile-20110815.gz

java.lang.IllegalArgumentException: Invalid hostname in URI s3://MY_BUCKET_NAME/logfile-20110815.gz
        at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:41)
        at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.initialize(Jets3tFileSystemStore.java:82)
Re: cannot use distcp in some s3 buckets
On Thu, Oct 13, 2011 at 2:06 PM, Raimon Bosch <raimon.bo...@gmail.com> wrote:

By the way, the URL I'm trying has a '_' in the bucket name. Could this be the problem?

Yes, underscores are not permitted in hostnames.

Cheers,
Tom
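Tom's point can be checked directly with java.net.URI, which is what underlies the "Invalid hostname in URI" check in Hadoop's S3 filesystem code: for an authority that is not a syntactically valid hostname (underscores are not allowed), getHost() returns null. A small sketch (the bucket names here are made up):

```java
import java.net.URI;

public class BucketHostCheck {
    // Returns the host java.net.URI parses out of an s3:// URL, or null
    // when the authority is not a valid hostname (e.g. it contains '_').
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    public static void main(String[] args) {
        // Underscore in the bucket name: no valid host is recognized.
        System.out.println(hostOf("s3://my_bucket/logfile-20110815.gz")); // null
        // Hyphens are fine in hostnames.
        System.out.println(hostOf("s3://my-bucket/logfile-20110815.gz")); // my-bucket
    }
}
```

This is why the exception fires before any S3 authorization happens: the URI never yields a host to connect to.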
Re: Hbase with Hadoop
OK, now the problem is: if I only use bin/start-hbase.sh then it doesn't start ZooKeeper. But if I use bin/hbase-daemon.sh start zookeeper before running bin/start-hbase.sh, then it tries to start ZooKeeper on port 2181 and I get the following error:

Couldnt start ZK at requested address of 2181, instead got: 2182. Aborting. Why? Because clients (eg shell) wont be able to find this ZK quorum

So I am wondering: if bin/start-hbase.sh tries to start ZooKeeper, then when ZooKeeper is not running it should start it. I only get the error if ZooKeeper is already running.

-Jignesh

On Oct 13, 2011, at 4:53 PM, Ramya Sunil wrote:

You already have ZooKeeper running on 2181 according to your jps output. That is the reason the master seems to be complaining. Can you please stop ZooKeeper, verify that no daemons are running on 2181, and restart your master?
Re: Hbase with Hadoop
Is there a way to resolve this weird problem? bin/start-hbase.sh is supposed to start ZooKeeper but it doesn't. On the other side, if ZooKeeper is up and running then it says:

Couldnt start ZK at requested address of 2181, instead got: 2182. Aborting. Why? Because clients (eg shell) wont be able to find this ZK quorum

On Oct 13, 2011, at 5:40 PM, Jignesh Patel wrote:

OK, now the problem is: if I only use bin/start-hbase.sh then it doesn't start ZooKeeper. But if I use bin/hbase-daemon.sh start zookeeper before running bin/start-hbase.sh, then it tries to start ZooKeeper on port 2181 and I get the error above. I only get the error if ZooKeeper is already running.

-Jignesh
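The abort message hinges on whether anything is already bound to the ZooKeeper client port (2181 is the default hbase.zookeeper.property.clientPort). Besides jps, a quick programmatic check is to try binding the port yourself; a sketch (class and method names are made up for illustration):

```java
import java.io.IOException;
import java.net.ServerSocket;

public class ZkPortCheck {
    // True if nothing is currently listening on the port, i.e. we can bind it.
    static boolean portFree(int port) {
        try (ServerSocket s = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int port = 2181; // default ZooKeeper client port
        System.out.println(port + (portFree(port) ? " is free" : " is already in use"));
    }
}
```

If 2181 is already taken when start-hbase.sh launches its own HQuorumPeer, the new instance ends up on the next port (2182), which matches the "instead got: 2182" abort seen in this thread.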
Web crawler in hadoop - unresponsive after a while
Hello,

I am trying to make my web crawling go faster with Hadoop. My mapper consists of a single loop and my reducer is an IdentityReducer:

while read line; do
  result=`wget -O - --timeout=500 http://$line 2>&1`
  echo $result
done

I am crawling about 50,000 sites, but my mapper always seems to time out after a while. The crawler just becomes unresponsive, I guess. I am not able to see which site is causing the problem, as the mapper deletes the output if the job fails.

I am running a single-node Hadoop cluster currently. Is this the problem? Did anyone else have a similar problem? I am not sure why this is happening. Can I prevent the mapper from deleting intermediate outputs?

I tried running the mapper against 10-20 sites as opposed to 50k sites and that worked fine.

Thanks,
Aishwarya
Re: Web crawler in hadoop - unresponsive after a while
Hi Aishwarya,

To debug this issue you don't necessarily need the intermediate output. If there is any error/exception, you can get it from your job logs directly. In your case the job turns unresponsive; to do further troubleshooting you can add log statements to your program, rerun it, and obtain the records that create the problem from your logs.

You can obtain your logs directly from the JobTracker web UI at http://host:50030/jobtracker.jsp. From your job, drill down to the task, and on the right side you can see options to display your task tracker logs.

On top of this I'd like to add: since you mentioned a single node, I assume it is either in standalone or pseudo-distributed mode. These setups are basically for development and testing of functionality. If you are looking for better performance of your jobs, you need to leverage the parallel processing power of Hadoop. You need at least a mini cluster for performance benchmarking and for processing relatively large volumes of data.

Hope it helps!

Regards
Bejoy K S

--Original Message--
From: Aishwarya Venkataraman
Sender: avenk...@eng.ucsd.edu
To: common-user@hadoop.apache.org
ReplyTo: common-user@hadoop.apache.org
Subject: Web crawler in hadoop - unresponsive after a while
Sent: Oct 14, 2011 08:20
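One likely culprit in the loop above is the timeout itself: wget's --timeout is in seconds, so 500 means over eight minutes per site, and a handful of slow hosts can trip Hadoop's default 10-minute task timeout. The same fetch can be sketched in Java with hard connect/read timeouts and a per-site failure marker, so the offending site shows up in the output instead of stalling the task (class and method names here are illustrative, not from the original post):

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class SiteFetcher {
    // Fetch one site with hard connect/read timeouts; on any failure,
    // return a marker line instead of hanging, so the bad site is
    // visible in the map output rather than silently stalling the task.
    static String fetch(String site, int timeoutMs) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) new URL("http://" + site).openConnection();
            conn.setConnectTimeout(timeoutMs);
            conn.setReadTimeout(timeoutMs);
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes());
            }
        } catch (IOException e) {
            return "FETCH_FAILED " + site + " " + e;
        }
    }
}
```

With a marker like FETCH_FAILED in the emitted records, a grep over the job output (or the task logs) identifies the problem sites even when individual fetches fail.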
wordcount example throwing null pointer with ConcurrentHashMap
Hi, I have set up Hadoop on a single node and it worked fine, but when executing the wordcount example the following error is thrown. Is this a configuration issue?

bin/hadoop jar hadoop-examples-0.20.2-cdh3u1.jar wordcount /user/hduser/testfiles /user/hduser/output

11/10/14 10:29:53 INFO input.FileInputFormat: Total input paths to process : 3
11/10/14 10:29:53 WARN snappy.LoadSnappy: Snappy native library is available
11/10/14 10:29:53 INFO util.NativeCodeLoader: Loaded the native-hadoop library
11/10/14 10:29:53 INFO snappy.LoadSnappy: Snappy native library loaded
11/10/14 10:29:53 INFO mapred.JobClient: Running job: job_201110141028_0001
11/10/14 10:29:54 INFO mapred.JobClient: map 0% reduce 0%
11/10/14 10:29:59 INFO mapred.JobClient: map 66% reduce 0%
11/10/14 10:30:01 INFO mapred.JobClient: Task Id : attempt_201110141028_0001_r_00_0, Status : FAILED
Error: java.lang.NullPointerException
        at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2824)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2744)
11/10/14 10:30:02 INFO mapred.JobClient: map 100% reduce 0%
11/10/14 10:30:03 INFO mapred.JobClient: Task Id : attempt_201110141028_0001_r_00_1, Status : FAILED
Error: java.lang.NullPointerException
        at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2824)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2744)
11/10/14 10:30:05 INFO mapred.JobClient: Task Id : attempt_201110141028_0001_r_00_2, Status : FAILED
Error: java.lang.NullPointerException
        at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2824)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2744)
11/10/14 10:30:08 INFO mapred.JobClient: Job complete: job_201110141028_0001
11/10/14 10:30:08 INFO mapred.JobClient: Counters: 18
11/10/14 10:30:08 INFO mapred.JobClient:   Job Counters
11/10/14 10:30:08 INFO mapred.JobClient:     Launched reduce tasks=4
11/10/14 10:30:08 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=9167
11/10/14 10:30:08 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
11/10/14 10:30:08 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
11/10/14 10:30:08 INFO mapred.JobClient:     Launched map tasks=3
11/10/14 10:30:08 INFO mapred.JobClient:     Data-local map tasks=3
11/10/14 10:30:08 INFO mapred.JobClient:     Failed reduce tasks=1
11/10/14 10:30:08 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=3292
11/10/14 10:30:08 INFO mapred.JobClient:   FileSystemCounters
11/10/14 10:30:08 INFO mapred.JobClient:     FILE_BYTES_READ=740427
11/10/14 10:30:08 INFO mapred.JobClient:     HDFS_BYTES_READ=2863597
11/10/14 10:30:08 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=2161157
11/10/14 10:30:08 INFO mapred.JobClient:   Map-Reduce Framework
11/10/14 10:30:08 INFO mapred.JobClient:     Combine output records=87431
11/10/14 10:30:08 INFO mapred.JobClient:     Map input records=58570
11/10/14 10:30:08 INFO mapred.JobClient:     Spilled Records=138742
11/10/14 10:30:08 INFO mapred.JobClient:     Map output bytes=4774081
11/10/14 10:30:08 INFO mapred.JobClient:     Combine input records=487561
11/10/14 10:30:08 INFO mapred.JobClient:     Map output records=487561
11/10/14 10:30:08 INFO mapred.JobClient:     SPLIT_RAW_BYTES=361

--
View this message in context: http://old.nabble.com/wordcount-example-throwing-null-pointer-with-ConcurrentHashMap-tp32650178p32650178.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.