Re: Hadoop Map task - No Task Attempts found

2012-03-26 Thread T Vinod Gupta
did you check the hadoop job logs to see what could be going on?
thanks

On Mon, Mar 26, 2012 at 12:14 PM, V_sriram vsrira...@gmail.com wrote:


 Hi,

 I am running a MapReduce program that scans HBase and gets me the
 required data. The map phase runs till 99.47% and after that it simply
 waits for the remaining 5 map tasks to complete. But those remaining 5 map
 tasks sit at 0.00% without any task attempt. Any ideas, please?
 --
 View this message in context:
 http://old.nabble.com/Hadoop-Map-task---No-Task-Attempts-found-tp33544760p33544760.html
 Sent from the Hadoop core-user mailing list archive at Nabble.com.




problem running hadoop map reduce due to zookeeper ensemble not found

2012-03-02 Thread T Vinod Gupta
can someone tell me the right way to do this? i created a jar that
creates a map reduce job and submits it, but i get this error when i run it
-

12/03/02 21:42:13 ERROR zookeeper.ZKConfig: no clientPort found in zoo.cfg
12/03/02 21:42:13 ERROR mapreduce.TableInputFormat:
org.apache.hadoop.hbase.ZooKeeperConnectionException: java.io.IOException:
Unable to determine ZooKeeper ensemble
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1000)
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:303)
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.&lt;init&gt;(HConnectionManager.java:294)
at
org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:156)
at org.apache.hadoop.hbase.client.HTable.&lt;init&gt;(HTable.java:167)
at org.apache.hadoop.hbase.client.HTable.&lt;init&gt;(HTable.java:145)
at
org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:91)
at
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
at
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at
org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:882)
at
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:448)
Caused by: java.io.IOException: Unable to determine ZooKeeper ensemble
at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:92)
at
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.&lt;init&gt;(ZooKeeperWatcher.java:119)
at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:998)
... 17 more

this is on a standalone hbase installation.. when i try to run it on a
different machine with a distributed hbase installation, i get the same
error..
i run it simply by invoking java with the jar name and the class name that
has the main in it
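
A common cause of "Unable to determine ZooKeeper ensemble" is simply that no
hbase-site.xml naming the ensemble is on the submitting JVM's classpath, so
HBase falls back to zoo.cfg and finds nothing there either. A minimal
hbase-site.xml sketch (the host names and port below are placeholders, not
values taken from this thread):

```xml
<configuration>
  <!-- Comma-separated list of ZooKeeper ensemble hosts -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk-host1,zk-host2,zk-host3</value>
  </property>
  <!-- Port the ZooKeeper servers listen on (2181 is the usual default) -->
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
```

Putting the directory containing this file on the classpath of the JVM that
submits the job (e.g. via HADOOP_CLASSPATH) is usually enough for
TableInputFormat to find the ensemble; alternatively the same two properties
can be set programmatically on the job's Configuration.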

thanks


hadoop 0.20.2+923.142-1 release notes?

2012-01-19 Thread T Vinod Gupta
hi,
i have a running hadoop/hbase production environment with the below hadoop
version -
Hadoop 0.20.2-cdh3u1
Subversion file:///tmp/topdir/BUILD/hadoop-0.20.2-cdh3u1 -r
bdafb1dbffd0d5f2fbc6ee022e1c8df6500fd638
Compiled by root on Mon Jul 18 09:40:22 PDT 2011
From source with checksum 3127e3d410455d2bacbff7673bf3284c

it's been running for a while and hadoop is very stable. but now i need to
install the hadoop-0.20-native package since i want to use snappy compression
for my hbase table. but when i try to yum install that, the following
dependencies would be brought in.

Updating for dependencies:
 hadoop-0.20noarch  0.20.2+923.142-1   cloudera-cdh3
29 M
 hadoop-0.20-datanode   noarch  0.20.2+923.142-1   cloudera-cdh3
 5.0 k
 hadoop-0.20-jobtracker noarch  0.20.2+923.142-1   cloudera-cdh3
 5.1 k
 hadoop-0.20-namenode   noarch  0.20.2+923.142-1   cloudera-cdh3
 5.1 k
 hadoop-0.20-secondarynamenode  noarch  0.20.2+923.142-1   cloudera-cdh3
 5.1 k
 hadoop-0.20-tasktrackernoarch  0.20.2+923.142-1   cloudera-cdh3
 5.1 k

i am hesitant to take chances here with a running stable production
environment. i want to make sure this doesn't break anything. how do i find
out the delta between 0.20.2+923.142-1 and what i already have?

thanks


setting mapred.map.child.java.opts not working

2012-01-11 Thread T Vinod Gupta
Hi,
Can someone help me asap? when i run my mapred job, it fails with this
error -
12/01/12 02:58:36 INFO mapred.JobClient: Task Id :
attempt_201112151554_0050_m_71_0, Status : FAILED
Error: Java heap space
attempt_201112151554_0050_m_71_0: log4j:ERROR Failed to flush writer,
attempt_201112151554_0050_m_71_0: java.io.IOException: Stream closed
attempt_201112151554_0050_m_71_0:   at
sun.nio.cs.StreamEncoder.ensureOpen(StreamEncoder.java:44)
attempt_201112151554_0050_m_71_0:   at
sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:139)
attempt_201112151554_0050_m_71_0:   at
java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
attempt_201112151554_0050_m_71_0:   at
org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:58)
attempt_201112151554_0050_m_71_0:   at
org.apache.hadoop.mapred.TaskLogAppender.flush(TaskLogAppender.java:94)
attempt_201112151554_0050_m_71_0:   at
org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:260)
attempt_201112151554_0050_m_71_0:   at
org.apache.hadoop.mapred.Child$2.run(Child.java:142)


so i updated my mapred-site.xml with these settings -

  &lt;property&gt;
    &lt;name&gt;mapred.map.child.java.opts&lt;/name&gt;
    &lt;value&gt;-Xmx2048M&lt;/value&gt;
  &lt;/property&gt;

  &lt;property&gt;
    &lt;name&gt;mapred.reduce.child.java.opts&lt;/name&gt;
    &lt;value&gt;-Xmx2048M&lt;/value&gt;
  &lt;/property&gt;

also, when i run my jar, i provide
-Dmapred.map.child.java.opts=-Xmx4000m at the end.
in spite of this, the task is not getting the max heap size i'm setting.

where did i go wrong?

after changing mapred-site.xml, i restarted jobtracker and tasktracker.. is
that not good enough?

thanks


Re: setting mapred.map.child.java.opts not working

2012-01-11 Thread T Vinod Gupta
Harsh, did you mean my &lt;job id&gt;_conf.xml? for some strange reason, i do see
these 3 lines -

&lt;property&gt;&lt;!--Loaded from
/media/ephemeral3/hadoop/mapred/local/jobTracker/job_201201120656_0001.xml--&gt;&lt;name&gt;mapred.reduce.child.java.opts&lt;/name&gt;&lt;value&gt;-Xmx2048M&lt;/value&gt;&lt;/property&gt;
&lt;property&gt;&lt;!--Loaded from
/media/ephemeral3/hadoop/mapred/local/jobTracker/job_201201120656_0001.xml--&gt;&lt;name&gt;mapred.child.java.opts&lt;/name&gt;&lt;value&gt;-Xmx200m&lt;/value&gt;&lt;/property&gt;
&lt;property&gt;&lt;!--Loaded from
/media/ephemeral3/hadoop/mapred/local/jobTracker/job_201201120656_0001.xml--&gt;&lt;name&gt;mapred.map.child.java.opts&lt;/name&gt;&lt;value&gt;-Xmx2048M&lt;/value&gt;&lt;/property&gt;

the 1st and 3rd are what i set. but i don't know if the middle property
overrides the others.

btw, my hadoop version is below -

Hadoop 0.20.2-cdh3u1
Subversion file:///tmp/topdir/BUILD/hadoop-0.20.2-cdh3u1 -r
bdafb1dbffd0d5f2fbc6ee022e1c8df6500fd638
Compiled by root on Mon Jul 18 09:40:22 PDT 2011
From source with checksum 3127e3d410455d2bacbff7673bf3284c

thanks

On Wed, Jan 11, 2012 at 10:57 PM, Koji Noguchi knogu...@yahoo-inc.com wrote:

  but those new settings have not yet been
  added to mapred-default.xml.
 
 It's intentionally left out.
 If set in mapred-default.xml, user's mapred.child.java.opts would be
 ignored
 since mapred.{map,reduce}.child.java.opts would always win.

 Koji

 On 1/11/12 9:34 PM, George Datskos george.dats...@jp.fujitsu.com
 wrote:

  Koji, Harsh
 
  mapred-478 seems to be in v1, but those new settings have not yet been
  added to mapred-default.xml.  (for backwards compatibility?)
 
 
  George
 
  On 2012/01/12 13:50, Koji Noguchi wrote:
  Hi Harsh,
 
  Wasn't MAPREDUCE-478 in 1.0 ?  Maybe the Jira is not up to date.
 
  Koji
 
 
  On 1/11/12 8:44 PM, Harsh J ha...@cloudera.com wrote:
 
  These properties are not available on Apache Hadoop 1.0 (Formerly
  known as 0.20.x). This was a feature introduced in 0.21
  (https://issues.apache.org/jira/browse/MAPREDUCE-478), and is
  available today on 0.22 and 0.23 line of releases.
 
  For 1.0/0.20, use mapred.child.java.opts, that applies to both map
  and reduce commonly.
 
  Would also be helpful if you can tell us what doc guided you to use
  these property names instead of the proper one, so we can fix it.
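
  Harsh's suggestion above, written out as a mapred-site.xml fragment for the
  0.20/1.0 line (the -Xmx value here is only an example, not a recommendation
  from the thread):

```xml
<!-- On 0.20/1.0 this single property applies to both map and reduce tasks;
     the separate mapred.{map,reduce}.child.java.opts variants from
     MAPREDUCE-478 only exist from 0.21 onward. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>
```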
 
  On Thu, Jan 12, 2012 at 8:44 AM, T Vinod Gupta tvi...@readypulse.com
  wrote:
  Hi,
  Can someone help me asap? when i run my mapred job, it fails with this
  error -
  12/01/12 02:58:36 INFO mapred.JobClient: Task Id :
  attempt_201112151554_0050_m_71_0, Status : FAILED
  Error: Java heap space
  attempt_201112151554_0050_m_71_0: log4j:ERROR Failed to flush
 writer,
  attempt_201112151554_0050_m_71_0: java.io.IOException: Stream
 closed
  attempt_201112151554_0050_m_71_0:   at
  sun.nio.cs.StreamEncoder.ensureOpen(StreamEncoder.java:44)
  attempt_201112151554_0050_m_71_0:   at
  sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:139)
  attempt_201112151554_0050_m_71_0:   at
  java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
  attempt_201112151554_0050_m_71_0:   at
  org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:58)
  attempt_201112151554_0050_m_71_0:   at
 
 org.apache.hadoop.mapred.TaskLogAppender.flush(TaskLogAppender.java:94)
  attempt_201112151554_0050_m_71_0:   at
  org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:260)
  attempt_201112151554_0050_m_71_0:   at
  org.apache.hadoop.mapred.Child$2.run(Child.java:142)
 
 
  so i updated my mapred-site.xml with these settings -
 
    &lt;property&gt;
      &lt;name&gt;mapred.map.child.java.opts&lt;/name&gt;
      &lt;value&gt;-Xmx2048M&lt;/value&gt;
    &lt;/property&gt;

    &lt;property&gt;
      &lt;name&gt;mapred.reduce.child.java.opts&lt;/name&gt;
      &lt;value&gt;-Xmx2048M&lt;/value&gt;
    &lt;/property&gt;
 
  also, when i run my jar, i provide
  -Dmapred.map.child.java.opts=-Xmx4000m at the end.
  in spite of this, the task is not getting the max heap size i'm setting.
 
  where did i go wrong?
 
  after changing mapred-site.xml, i restarted jobtracker and
 tasktracker.. is
  that not good enough?
 
  thanks
 
 
 




how to specify class name to run in mapreduce job

2012-01-10 Thread T Vinod Gupta
hi,
how can i specify which class' main method to run as a job when i do
mapreduce? let's say my jar has 4 classes and each one of them has a main
method. i want to pass the class name in the 'hadoop jar &lt;jar file&gt;
&lt;classname&gt;' command. this would be similar to running the stock tools that
ship out of the box inside hbase or other hadoop jars.
currently, i solve this problem by setting the main class in the project,
building the jar, and running it. but i want to be able to flip that at run
time instead of compile time.
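
For reference, 'hadoop jar &lt;jar&gt; &lt;classname&gt; &lt;args&gt;' already does exactly
this when the jar's manifest has no Main-Class entry: RunJar treats the first
argument as the class to run. If the manifest must keep a Main-Class, one
option is to make that main a tiny dispatcher. A generic reflection-based
sketch (the class names below are hypothetical; Hadoop also ships
org.apache.hadoop.util.ProgramDriver for the same purpose):

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical target job class, standing in for one of the 4 classes in the jar.
class HelloJob {
    static String lastArg;
    public static void main(String[] args) {
        lastArg = args.length > 0 ? args[0] : "";
    }
}

// Dispatcher: set this as the jar's Main-Class, then run e.g.
//   hadoop jar mytools.jar HelloJob <job args>
public class Dispatcher {
    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: Dispatcher <class-with-main> [args...]");
            System.exit(1);
        }
        // Load the requested class and invoke its static main with the remaining args.
        Class<?> target = Class.forName(args[0]);
        Method main = target.getMethod("main", String[].class);
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        main.invoke(null, (Object) rest); // static method, so null receiver
    }
}
```

The `(Object) rest` cast matters: without it, varargs invocation would spread
the String[] into separate arguments instead of passing it as args.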

thanks


can't run a simple mapred job

2012-01-09 Thread T Vinod Gupta
Hi,
I have an hbase/hadoop setup on my instance in aws. I am able to run the
simple wordcount map reduce example but not a custom one that i wrote. here
is the error that i get -

[ec2-user@ip-10-68-145-124 bin]$ hadoop jar HBaseTest.jar
com.akanksh.information.hbasetest.HBaseSweeper
12/01/09 11:27:27 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
12/01/09 11:27:27 INFO mapred.JobClient: Cleaning up the staging area
hdfs://ip-10-68-145-124.ec2.internal:9100/media/ephemeral1/hadoop/mapred/staging/ec2-user/.staging/job_201112151554_0006
Exception in thread "main" java.lang.RuntimeException:
java.lang.InstantiationException
at
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:869)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
at
com.akanksh.information.hbasetest.HBaseSweeper.main(HBaseSweeper.java:86)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.lang.InstantiationException
at
sun.reflect.InstantiationExceptionConstructorAccessorImpl.newInstance(InstantiationExceptionConstructorAccessorImpl.java:48)
at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
at
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:113)
... 14 more

does this sound like a standard problem?

here is my main method - nothing much in it -

public static void main(String args[]) throws Exception {
    Configuration config = HBaseConfiguration.create();

    Job job = new Job(config, "HBaseSweeper");
    job.setJarByClass(HBaseSweeper.class);
    Scan scan = new Scan();
    scan.setCaching(500);
    scan.setCacheBlocks(false);

    TableMapReduceUtil.initTableMapperJob(TABLE_NAME, scan,
        SweeperMapper.class,
        ImmutableBytesWritable.class, Delete.class, job);
    job.setOutputFormatClass(FileOutputFormat.class);
    job.setInputFormatClass(TextInputFormat.class);
    boolean b = job.waitForCompletion(true);
    if (!b) {
        throw new IOException("error with job!");
    }
}
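
One plausible culprit, as an assumption from reading the trace rather than
anything confirmed in the thread: FileOutputFormat in the mapreduce API is an
abstract class, so when JobClient instantiates the configured output format
via ReflectionUtils it would hit exactly this InstantiationException; a
concrete class (e.g. NullOutputFormat, or TableOutputFormat for writing back
to HBase) would avoid it. A tiny self-contained demo of the failure mode:

```java
// Reflectively instantiating an abstract class fails with
// InstantiationException -- the same exception ReflectionUtils.newInstance
// reports in a stack trace like the one above when the configured class
// (here standing in for FileOutputFormat) is abstract.
abstract class AbstractFormat {
    public AbstractFormat() {}
}

public class InstantiationDemo {
    public static void main(String[] args) {
        try {
            AbstractFormat.class.getDeclaredConstructor().newInstance();
            System.out.println("instantiated (unexpected)");
        } catch (InstantiationException e) {
            // Thrown because AbstractFormat is abstract.
            System.out.println("InstantiationException");
        } catch (ReflectiveOperationException e) {
            System.out.println(e.getClass().getSimpleName());
        }
    }
}
```

Note also that initTableMapperJob already configures TableInputFormat, so the
later setInputFormatClass(TextInputFormat.class) call would override it even
once the output format is fixed.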

can someone help? i will really appreciate it.

thanks
vinod