Map Task Capacity Not Changing

2011-12-15 Thread Joey Krabacher
I have looked up how to up this value on the web and have tried all
suggestions to no avail.

Any help would be great.

Here is some background:

Version: 0.20.2, r911707
Compiled: Fri Feb 19 08:07:34 UTC 2010 by chrisdo

Nodes: 5
Current Map Task Capacity : 10  <--- this is what I want to increase.

What I have tried :

Adding

  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>8</value>
    <final>true</final>
  </property>
to mapred-site.xml on NameNode.  I also added this to one of the
datanodes for the hell of it and that didn't work either.

Thanks.


Re: Map Task Capacity Not Changing

2011-12-16 Thread Joey Krabacher
Turns out my tasktrackers (on the datanodes) are not starting properly,
so I guess they are taking some alternate route??

They appear to be up and running... even though when I run
stop-mapred.sh it says "data01: no tasktracker to stop".

--Joey

On Thu, Dec 15, 2011 at 5:37 PM, James Warren
 wrote:
> (moving to mapreduce-user@, bcc'ing common-user@)
>
> Hi Joey -
>
> You'll want to change the value on all of your servers running tasktrackers
> and then restart each tasktracker to reread the configuration.
>
> cheers,
> -James
>
> On Thu, Dec 15, 2011 at 3:30 PM, Joey Krabacher wrote:
>
>> I have looked up how to up this value on the web and have tried all
>> suggestions to no avail.
>>
>> Any help would be great.
>>
>> Here is some background:
>>
>> Version: 0.20.2, r911707
>> Compiled: Fri Feb 19 08:07:34 UTC 2010 by chrisdo
>>
>> Nodes: 5
>> Current Map Task Capacity : 10  <--- this is what I want to increase.
>>
>> What I have tried :
>>
>> Adding
>>   <property>
>>     <name>mapred.tasktracker.map.tasks.maximum</name>
>>     <value>8</value>
>>     <final>true</final>
>>   </property>
>> to mapred-site.xml on NameNode.  I also added this to one of the
>> datanodes for the hell of it and that didn't work either.
>>
>> Thanks.
>>
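
For reference, a rough sketch of what James's suggestion amounts to on a small
cluster like this one (the hostnames and install path here are placeholders,
and it assumes the tarball layout used above):

  # copy the updated config to every node that runs a tasktracker
  for host in data01 data02 data03 data04 data05; do
    scp conf/mapred-site.xml $host:/path/to/hadoop/conf/
  done

  # restart MapReduce so each tasktracker rereads mapred-site.xml
  bin/stop-mapred.sh
  bin/start-mapred.sh
  # (or, per node: bin/hadoop-daemon.sh stop tasktracker && bin/hadoop-daemon.sh start tasktracker)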


Re: Map Task Capacity Not Changing

2011-12-16 Thread Joey Krabacher
pid files are there; I checked for running processes with the same
IDs and they all checked out.

--Joey

On Fri, Dec 16, 2011 at 5:40 PM, Rahul Jain  wrote:
> You might be suffering from HADOOP-7822; I'd suggest you verify your pid
> files and fix the problem by hand if it is the same issue.
>
> -Rahul
>
> On Fri, Dec 16, 2011 at 2:40 PM, Joey Krabacher wrote:
>
>> Turns out my tasktrackers(on the datanodes) are not starting properly
>>
>> so I guess they are taking some alternate route??
>>
>> because they are up and running...even though when I run
>> stop-mapred.sh it says "data01: no tasktracker to stop"
>>
>> --Joey
>>
>> On Thu, Dec 15, 2011 at 5:37 PM, James Warren
>>  wrote:
>> > (moving to mapreduce-user@, bcc'ing common-user@)
>> >
>> > Hi Joey -
>> >
>> > You'll want to change the value on all of your servers running
>> tasktrackers
>> > and then restart each tasktracker to reread the configuration.
>> >
>> > cheers,
>> > -James
>> >
>> > On Thu, Dec 15, 2011 at 3:30 PM, Joey Krabacher > >wrote:
>> >
>> >> I have looked up how to up this value on the web and have tried all
>> >> suggestions to no avail.
>> >>
>> >> Any help would be great.
>> >>
>> >> Here is some background:
>> >>
>> >> Version: 0.20.2, r911707
>> >> Compiled: Fri Feb 19 08:07:34 UTC 2010 by chrisdo
>> >>
>> >> Nodes: 5
>> >> Current Map Task Capacity : 10  <--- this is what I want to increase.
>> >>
>> >> What I have tried :
>> >>
>> >> Adding
>> >>   <property>
>> >>     <name>mapred.tasktracker.map.tasks.maximum</name>
>> >>     <value>8</value>
>> >>     <final>true</final>
>> >>   </property>
>> >> to mapred-site.xml on NameNode.  I also added this to one of the
>> >> datanodes for the hell of it and that didn't work either.
>> >>
>> >> Thanks.
>> >>
>>


Re: java.net.ConnectException: Connection refused

2011-12-22 Thread Joey Krabacher
If you are just trying to list the HDFS content, then use:

bin/hadoop fs -ls

See if that command gets you what you're looking for.
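
If you do need to point the client at a specific filesystem, the -fs option
(or fs.default.name in core-site.xml) should name the NameNode, not a
DataNode. A rough sketch, assuming the NameNode RPC port is the default 8020
and with the hostname as a placeholder:

  bin/hadoop fs -fs hdfs://<namenode-host>:8020 -ls /

The connection refused errors above are consistent with pointing the client
at datanode1, since a DataNode does not normally listen on 8020.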

--Joey

On Thu, Dec 22, 2011 at 8:45 PM, warren  wrote:
> hi everyone
>
> I useing hadoop-0.20.203.0,one master and one slave.
>
> I can see one live datanode at: http://localhost:50070/dfshealth.jsp
>
> when I type:bin/hadoop dfs -fs datanode1 -ls /
>
> I found that:
>
> 11/12/23 10:44:21 WARN fs.FileSystem: "datanode1" is a deprecated filesystem
> name. Use "hdfs://datanode1/" instead.
> 11/12/23 10:44:25 INFO ipc.Client: Retrying connect to server:
> datanode1/10.238.11.198:8020. Already tried 0 time(s).
> 11/12/23 10:44:26 INFO ipc.Client: Retrying connect to server:
> datanode1/10.238.11.198:8020. Already tried 1 time(s).
> 11/12/23 10:44:27 INFO ipc.Client: Retrying connect to server:
> datanode1/10.238.11.198:8020. Already tried 2 time(s).
> 11/12/23 10:44:28 INFO ipc.Client: Retrying connect to server:
> datanode1/10.238.11.198:8020. Already tried 3 time(s).
> 11/12/23 10:44:29 INFO ipc.Client: Retrying connect to server:
> datanode1/10.238.11.198:8020. Already tried 4 time(s).
> 11/12/23 10:44:30 INFO ipc.Client: Retrying connect to server:
> datanode1/10.238.11.198:8020. Already tried 5 time(s).
> 11/12/23 10:44:31 INFO ipc.Client: Retrying connect to server:
> datanode1/10.238.11.198:8020. Already tried 6 time(s).
> 11/12/23 10:44:32 INFO ipc.Client: Retrying connect to server:
> datanode1/10.238.11.198:8020. Already tried 7 time(s).
> 11/12/23 10:44:33 INFO ipc.Client: Retrying connect to server:
> datanode1/10.238.11.198:8020. Already tried 8 time(s).
> 11/12/23 10:44:34 INFO ipc.Client: Retrying connect to server:
> datanode1/10.238.11.198:8020. Already tried 9 time(s).
> Bad connection to FS. command aborted. exception: Call to
> datanode1/10.238.11.198:8020 failed on connection exception:
> java.net.ConnectException: Connection refused
>
> please help me thanks


Re: Hadoop configuration

2011-12-24 Thread Joey Krabacher
have you checked your log files for any clues?
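
A rough sketch of the kind of checks that usually narrow this down (log paths
assume a default tarball install; the hostnames are the ones from your
/etc/hosts):

  # look for errors on each node
  tail -n 100 logs/hadoop-*-tasktracker-*.log
  tail -n 100 logs/hadoop-*-datanode-*.log

  # for "too much fetch failure", confirm every node can resolve and reach
  # every other node by hostname
  getent hosts master slave
  ping -c 1 master
  ping -c 1 slave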

--Joey

On Sat, Dec 24, 2011 at 3:15 AM, Humayun kabir  wrote:
> Hi Uma,
>
> Thank you very much for your tips. We tried it in 3 nodes in virtual box as
> you suggested. But still we are facing problem. Here is our all
> configuration file to all nodes. please take a look and show us some ways
> to solve it. It was nice and it would be great if you help us in this
> regards.
>
> core-site.xml < http://pastebin.com/Twn5edrp >
> hdfs-site.xml < http://pastebin.com/k4hR4GE9 >
> mapred-site.xml < http://pastebin.com/gZuyHswS >
>
> /etc/hosts < http://pastebin.com/5s0yhgnj >
>
> output < http://paste.ubuntu.com/780807/ >
>
>
> Hope you will understand and extend your helping hand towards us.
>
> Have a nice day.
>
> Regards
> Humayun
>
> On 23 December 2011 17:31, Uma Maheswara Rao G  wrote:
>
>> Hi Humayun ,
>>
>>  Lets assume you have JT, TT1, TT2, TT3
>>
>>  Now you should configure the \etc\hosts like below examle
>>
>>      10.18.xx.1 JT
>>
>>      10.18.xx.2 TT1
>>
>>      10.18.xx.3 TT2
>>
>>      10.18.xx.4 TT3
>>
>>   Configure the same set in all the machines, so that all task trackers
>> can talk each other with hostnames correctly. Also pls remove some entries
>> from your files
>>
>>   127.0.0.1 localhost.localdomain localhost
>>
>>   127.0.1.1 humayun
>>
>>
>>
>> I have seen others already suggested many links for the regular
>> configuration items. Hope you might clear about them.
>>
>> hope it will help...
>>
>> Regards,
>>
>> Uma
>>
>> 
>>
>> From: Humayun kabir [humayun0...@gmail.com]
>> Sent: Thursday, December 22, 2011 10:34 PM
>> To: common-user@hadoop.apache.org; Uma Maheswara Rao G
>> Subject: Re: Hadoop configuration
>>
>> Hello Uma,
>>
>> Thanks for your cordial and quick reply. It would be great if you explain
>> what you suggested to do. Right now we are running on following
>> configuration.
>>
>> We are using hadoop on virtual box. when it is a single node then it works
>> fine for big dataset larger than the default block size. but in case of
>> multinode cluster (2 nodes) we are facing some problems. We are able to
>> ping both "Master->Slave" and "Slave->Master".
>> Like when the input dataset is smaller than the default block size(64 MB)
>> then it works fine. but when the input dataset is larger than the default
>> block size then it shows ‘too much fetch failure’ in reduce state.
>> here is the output link
>> http://paste.ubuntu.com/707517/
>>
>> this is our /etc/hosts file
>>
>> 192.168.60.147 humayun # Added by NetworkManager
>> 127.0.0.1 localhost.localdomain localhost
>> ::1 humayun localhost6.localdomain6 localhost6
>> 127.0.1.1 humayun
>>
>> # The following lines are desirable for IPv6 capable hosts
>> ::1 localhost ip6-localhost ip6-loopback
>> fe00::0 ip6-localnet
>> ff00::0 ip6-mcastprefix
>> ff02::1 ip6-allnodes
>> ff02::2 ip6-allrouters
>> ff02::3 ip6-allhosts
>>
>> 192.168.60.1 master
>> 192.168.60.2 slave
>>
>>
>> Regards,
>>
>> -Humayun.
>>
>>
>> On 22 December 2011 15:47, Uma Maheswara Rao G > > wrote:
>> Hey Humayun,
>>
>>  To solve the too many fetch failures problem, you should configure host
>> mapping correctly.
>> Each tasktracker should be able to ping from each other.
>>
>> Regards,
>> Uma
>> 
>> From: Humayun kabir [humayun0...@gmail.com]
>> Sent: Thursday, December 22, 2011 2:54 PM
>> To: common-user@hadoop.apache.org
>> Subject: Hadoop configuration
>>
>> someone please help me to configure hadoop such as core-site.xml,
>> hdfs-site.xml, mapred-site.xml etc.
>> please provide some example. it is badly needed. because i run in a 2 node
>> cluster. when i run the wordcount example then it gives the result too
>> mutch fetch failure.
>>
>>


Re: Map Task Capacity Not Changing

2011-12-29 Thread Joey Krabacher
To follow up on what I have found:

I opened up some of the logs on the datanodes and found this message:
"Can not start task tracker because java.net.BindException: Address
already in use"

It was using the default port setting from mapred-default.xml, which was 50060.
I decided to try adding

  <property>
    <name>mapred.task.tracker.http.address</name>
    <value>0.0.0.0:0</value>
  </property>

to mapred-site.xml so that the first open port would be selected.
This works, and it also allows the tasktracker to start normally, which in
turn allows the mapred.tasktracker.map.tasks.maximum setting to take effect.
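
For anyone hitting the same thing, a quick way to see what was holding the
default port before falling back to an ephemeral one (run on the datanode;
assumes Linux netstat is available):

  netstat -tlnp 2>/dev/null | grep 50060

and the new capacity should show up on the JobTracker web UI, which is
typically http://<jobtracker-host>:50030/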

Thanks for all who helped.

--Joey

On Sat, Dec 17, 2011 at 1:42 AM, Joey Krabacher  wrote:
> pid files are there, I checked for running processes with the sameID's
> and they all checked out.
> --Joey
> On Fri, Dec 16, 2011 at 5:40 PM, Rahul Jain  wrote:
>> You might be suffering from HADOOP-7822; I'd suggest you verify your pid
>> files and fix the problem by hand if it is the same issue.
>>
>> -Rahul
>>
>> On Fri, Dec 16, 2011 at 2:40 PM, Joey Krabacher wrote:
>>
>>> Turns out my tasktrackers(on the datanodes) are not starting properly
>>>
>>> so I guess they are taking some alternate route??
>>>
>>> because they are up and running...even though when I run
>>> stop-mapred.sh it says "data01: no tasktracker to stop"
>>>
>>> --Joey
>>>
>>> On Thu, Dec 15, 2011 at 5:37 PM, James Warren
>>>  wrote:
>>> > (moving to mapreduce-user@, bcc'ing common-user@)
>>> >
>>> > Hi Joey -
>>> >
>>> > You'll want to change the value on all of your servers running
>>> tasktrackers
>>> > and then restart each tasktracker to reread the configuration.
>>> >
>>> > cheers,
>>> > -James
>>> >
>>> > On Thu, Dec 15, 2011 at 3:30 PM, Joey Krabacher >> >wrote:
>>> >
>>> >> I have looked up how to up this value on the web and have tried all
>>> >> suggestions to no avail.
>>> >>
>>> >> Any help would be great.
>>> >>
>>> >> Here is some background:
>>> >>
>>> >> Version: 0.20.2, r911707
>>> >> Compiled: Fri Feb 19 08:07:34 UTC 2010 by chrisdo
>>> >>
>>> >> Nodes: 5
>>> >> Current Map Task Capacity : 10  <--- this is what I want to increase.
>>> >>
>>> >> What I have tried :
>>> >>
>>> >> Adding
>>> >>   <property>
>>> >>     <name>mapred.tasktracker.map.tasks.maximum</name>
>>> >>     <value>8</value>
>>> >>     <final>true</final>
>>> >>   </property>
>>> >> to mapred-site.xml on NameNode.  I also added this to one of the
>>> >> datanodes for the hell of it and that didn't work either.
>>> >>
>>> >> Thanks.
>>> >>
>>>


Re: Unable to build pig from Trunk

2011-12-29 Thread Joey Krabacher
Try pinging http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar
to see if your server can connect to that URL.
If not, you have some kind of connection issue with outgoing requests.
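
A quick sketch of that check (assuming curl is installed; plain ping only
tests the host, not the HTTP path):

  ping -c 3 repo2.maven.org
  curl -I http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar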

--Joey

On Thu, Dec 29, 2011 at 11:28 PM, praveenesh kumar  wrote:
> Hi everyone,
> I am trying to build Pig from SVN trunk on hadoop 0.20.205.
> While doing that, I am getting the following error : Any idea why its
> happening ?
>
> Thanks,
> Praveenesh
>
>
> root@lxe [/usr/local/hadoop/pig/new/trunk] $ --> ant jar-withouthadoop
> -verbose
> Apache Ant version 1.6.5 compiled on June 5 2007
> Buildfile: build.xml
> Detected Java version: 1.5 in: /usr/java/jdk1.6.0_25/jre
> Detected OS: Linux
> parsing buildfile /usr/local/hadoop/pig/new/trunk/build.xml with URI =
> file:///usr/local/hadoop/pig/new/trunk/build.xml
> Project base dir set to: /usr/local/hadoop/pig/new/trunk
>  [property] Loading /root/build.properties
>  [property] Unable to find property file: /root/build.properties
>  [property] Loading /usr/local/hadoop/pig/new/trunk/build.properties
>  [property] Unable to find property file:
> /usr/local/hadoop/pig/new/trunk/build.properties
> Override ignored for property test.log.dir
> Property ${clover.home} has not been set
> [available] Unable to find ${clover.home}/lib/clover.jar to set property
> clover.present
> Property ${repo} has not been set
> Override ignored for property build.dir
> Override ignored for property dist.dir
> Property ${zookeeper.jarfile} has not been set
> Build sequence for target(s) `jar-withouthadoop' is [ivy-download,
> ivy-init-dirs, ivy-probe-antlib, ivy-init-antlib, ivy-init, ivy-resolve,
> ivy-compile, init, cc-compile, prepare, genLexer, genParser, genTreeParser,
> gen, compile, jar-withouthadoop]
> Complete build sequence is [ivy-download, ivy-init-dirs, ivy-probe-antlib,
> ivy-init-antlib, ivy-init, ivy-resolve, ivy-compile, init, cc-compile,
> prepare, genLexer, genParser, genTreeParser, gen, compile,
> jar-withouthadoop, forrest.check, ivy-javadoc, javadoc-all, docs,
> ivy-jdiff, write-null, api-xml, api-report, jar, package, tar, source-jar,
> patch.check, makepom, ivy-releaseaudit, releaseaudit, ivy-test,
> compile-test, pigunit-jar, javadoc, javadoc-jar, package-release,
> clover.setup, jarWithSvn, piggybank, test-e2e-local, assert-pig-jar-exists,
> ready-to-publish, copy-jar-to-maven, jar-withouthadoopWithOutSvn,
> compile-sources, clover.info, clover, clean-sign, sign,
> test-e2e-deploy-local, ivy-publish-local, ant-task-download, mvn-taskdef,
> test-commit, test-smoke, copypom, maven-artifacts, published, set-version,
> test-unit, test-e2e, test-contrib, test, jar-withouthadoopWithSvn,
> clover.check, check-for-findbugs, test-core, ivy-buildJar,
> checkstyle.check, tar-release, rpm, clean, smoketests-jar, mvn-install,
> test-e2e-undeploy, ivy-checkstyle, jar-all, test-pigunit, signanddeploy,
> simpledeploy, mvn-deploy, findbugs, buildJar-withouthadoop, checkstyle,
> buildJar, findbugs.check, test-patch, jarWithOutSvn, test-e2e-deploy,
> hudson-test-patch, compile-sources-all-warnings, test-contrib-internal,
> include-meta, deb, eclipse-files, generate-clover-reports, ]
>
> ivy-download:
>      [get] Getting:
> http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar
>      [get] To: /usr/local/hadoop/pig/new/trunk/ivy/ivy-2.2.0.jar
>      [get] Error getting
> http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar to
> /usr/local/hadoop/pig/new/trunk/ivy/ivy-2.2.0.jar
>
> BUILD FAILED
> /usr/local/hadoop/pig/new/trunk/build.xml:1443: java.net.ConnectException:
> Connection timed out
>        at org.apache.tools.ant.taskdefs.Get.execute(Get.java:80)
>        at
> org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:275)
>        at org.apache.tools.ant.Task.perform(Task.java:364)
>        at org.apache.tools.ant.Target.execute(Target.java:341)
>        at org.apache.tools.ant.Target.performTasks(Target.java:369)
>        at
> org.apache.tools.ant.Project.executeSortedTargets(Project.java:1216)
>        at org.apache.tools.ant.Project.executeTarget(Project.java:1185)
>        at
> org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:40)
>        at org.apache.tools.ant.Project.executeTargets(Project.java:1068)
>        at org.apache.tools.ant.Main.runBuild(Main.java:668)
>        at org.apache.tools.ant.Main.startAnt(Main.java:187)
>        at org.apache.tools.ant.launch.Launcher.run(Launcher.java:246)
>        at org.apache.tools.ant.launch.Launcher.main(Launcher.java:67)
> Caused by: java.net.ConnectException: Connection timed out
>        at java.net.PlainSocketImpl.socketConnect(Native Method)
>        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
>        at
> java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
>        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
>        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>        at 

Re: Unable to build pig from Trunk

2011-12-29 Thread Joey Krabacher
Can you ping any URL successfully?
Try www.google.com, www.yahoo.com or something like that.

If you can't ping any of those, then you are probably behind a firewall
and you'll have to poke a hole in it to get to the outside world.

Or you can download the jar that it is trying to find (ivy-2.2.0.jar)
from another computer and copy it to the one you are building on.
I would put it in this folder: /usr/local/hadoop/pig/new/trunk/ivy/
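
A rough sketch of both options (the build host name and proxy address are
placeholders; the jar path matches the ivy-download target above):

  # option 1: fetch the jar elsewhere and copy it to the build box
  wget http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar
  scp ivy-2.2.0.jar root@buildhost:/usr/local/hadoop/pig/new/trunk/ivy/

  # option 2: if you have an HTTP proxy, pass it to Ant as JVM properties so
  # the ivy-download <get> task can reach the repository
  ANT_OPTS="-Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=3128" \
    ant jar-withouthadoop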

--Joey

On Thu, Dec 29, 2011 at 11:41 PM, praveenesh kumar  wrote:
> When I am pinging its saying "Unknown host."..
> Is there any kind of proxy setting we need to do, when building from ant ?
>
> Thanks,
> Praveenesh
>
>
> On Fri, Dec 30, 2011 at 11:02 AM, Joey Krabacher wrote:
>
>> Try pinging
>> http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar
>> to see if your server can connect to that URL.
>> If not you have some kind of connection issue with outgoing requests.
>>
>> --Joey
>>
>> On Thu, Dec 29, 2011 at 11:28 PM, praveenesh kumar 
>> wrote:
>> > Hi everyone,
>> > I am trying to build Pig from SVN trunk on hadoop 0.20.205.
>> > While doing that, I am getting the following error : Any idea why its
>> > happening ?
>> >
>> > Thanks,
>> > Praveenesh
>> >
>> >
>> > root@lxe [/usr/local/hadoop/pig/new/trunk] $ --> ant jar-withouthadoop
>> > -verbose
>> > Apache Ant version 1.6.5 compiled on June 5 2007
>> > Buildfile: build.xml
>> > Detected Java version: 1.5 in: /usr/java/jdk1.6.0_25/jre
>> > Detected OS: Linux
>> > parsing buildfile /usr/local/hadoop/pig/new/trunk/build.xml with URI =
>> > file:///usr/local/hadoop/pig/new/trunk/build.xml
>> > Project base dir set to: /usr/local/hadoop/pig/new/trunk
>> >  [property] Loading /root/build.properties
>> >  [property] Unable to find property file: /root/build.properties
>> >  [property] Loading /usr/local/hadoop/pig/new/trunk/build.properties
>> >  [property] Unable to find property file:
>> > /usr/local/hadoop/pig/new/trunk/build.properties
>> > Override ignored for property test.log.dir
>> > Property ${clover.home} has not been set
>> > [available] Unable to find ${clover.home}/lib/clover.jar to set property
>> > clover.present
>> > Property ${repo} has not been set
>> > Override ignored for property build.dir
>> > Override ignored for property dist.dir
>> > Property ${zookeeper.jarfile} has not been set
>> > Build sequence for target(s) `jar-withouthadoop' is [ivy-download,
>> > ivy-init-dirs, ivy-probe-antlib, ivy-init-antlib, ivy-init, ivy-resolve,
>> > ivy-compile, init, cc-compile, prepare, genLexer, genParser,
>> genTreeParser,
>> > gen, compile, jar-withouthadoop]
>> > Complete build sequence is [ivy-download, ivy-init-dirs,
>> ivy-probe-antlib,
>> > ivy-init-antlib, ivy-init, ivy-resolve, ivy-compile, init, cc-compile,
>> > prepare, genLexer, genParser, genTreeParser, gen, compile,
>> > jar-withouthadoop, forrest.check, ivy-javadoc, javadoc-all, docs,
>> > ivy-jdiff, write-null, api-xml, api-report, jar, package, tar,
>> source-jar,
>> > patch.check, makepom, ivy-releaseaudit, releaseaudit, ivy-test,
>> > compile-test, pigunit-jar, javadoc, javadoc-jar, package-release,
>> > clover.setup, jarWithSvn, piggybank, test-e2e-local,
>> assert-pig-jar-exists,
>> > ready-to-publish, copy-jar-to-maven, jar-withouthadoopWithOutSvn,
>> > compile-sources, clover.info, clover, clean-sign, sign,
>> > test-e2e-deploy-local, ivy-publish-local, ant-task-download, mvn-taskdef,
>> > test-commit, test-smoke, copypom, maven-artifacts, published,
>> set-version,
>> > test-unit, test-e2e, test-contrib, test, jar-withouthadoopWithSvn,
>> > clover.check, check-for-findbugs, test-core, ivy-buildJar,
>> > checkstyle.check, tar-release, rpm, clean, smoketests-jar, mvn-install,
>> > test-e2e-undeploy, ivy-checkstyle, jar-all, test-pigunit, signanddeploy,
>> > simpledeploy, mvn-deploy, findbugs, buildJar-withouthadoop, checkstyle,
>> > buildJar, findbugs.check, test-patch, jarWithOutSvn, test-e2e-deploy,
>> > hudson-test-patch, compile-sources-all-warnings, test-contrib-internal,
>> > include-meta, deb, eclipse-files, generate-clover-reports, ]
>> >
>> > ivy-download:
>> >      [get] Getting:
>> > http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar
>> >      [get] To: /usr/local/hadoop/pig/new/trunk/ivy/ivy-2.2

Re: Unable to build pig from Trunk

2011-12-29 Thread Joey Krabacher
.java:246)
>        at org.apache.tools.ant.launch.Launcher.main(Launcher.java:67)
>
> Total time: 0 seconds
>
>
>
> On Fri, Dec 30, 2011 at 11:11 AM, praveenesh kumar 
> wrote:
>
>> When I am pinging its saying "Unknown host."..
>> Is there any kind of proxy setting we need to do, when building from ant ?
>>
>> Thanks,
>> Praveenesh
>>
>>
>>
>> On Fri, Dec 30, 2011 at 11:02 AM, Joey Krabacher wrote:
>>
>>> Try pinging
>>> http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar
>>> to see if your server can connect to that URL.
>>> If not you have some kind of connection issue with outgoing requests.
>>>
>>> --Joey
>>>
>>> On Thu, Dec 29, 2011 at 11:28 PM, praveenesh kumar 
>>> wrote:
>>> > Hi everyone,
>>> > I am trying to build Pig from SVN trunk on hadoop 0.20.205.
>>> > While doing that, I am getting the following error : Any idea why its
>>> > happening ?
>>> >
>>> > Thanks,
>>> > Praveenesh
>>> >
>>> >
>>> > root@lxe [/usr/local/hadoop/pig/new/trunk] $ --> ant jar-withouthadoop
>>> > -verbose
>>> > Apache Ant version 1.6.5 compiled on June 5 2007
>>> > Buildfile: build.xml
>>> > Detected Java version: 1.5 in: /usr/java/jdk1.6.0_25/jre
>>> > Detected OS: Linux
>>> > parsing buildfile /usr/local/hadoop/pig/new/trunk/build.xml with URI =
>>> > file:///usr/local/hadoop/pig/new/trunk/build.xml
>>> > Project base dir set to: /usr/local/hadoop/pig/new/trunk
>>> >  [property] Loading /root/build.properties
>>> >  [property] Unable to find property file: /root/build.properties
>>> >  [property] Loading /usr/local/hadoop/pig/new/trunk/build.properties
>>> >  [property] Unable to find property file:
>>> > /usr/local/hadoop/pig/new/trunk/build.properties
>>> > Override ignored for property test.log.dir
>>> > Property ${clover.home} has not been set
>>> > [available] Unable to find ${clover.home}/lib/clover.jar to set property
>>> > clover.present
>>> > Property ${repo} has not been set
>>> > Override ignored for property build.dir
>>> > Override ignored for property dist.dir
>>> > Property ${zookeeper.jarfile} has not been set
>>> > Build sequence for target(s) `jar-withouthadoop' is [ivy-download,
>>> > ivy-init-dirs, ivy-probe-antlib, ivy-init-antlib, ivy-init, ivy-resolve,
>>> > ivy-compile, init, cc-compile, prepare, genLexer, genParser,
>>> genTreeParser,
>>> > gen, compile, jar-withouthadoop]
>>> > Complete build sequence is [ivy-download, ivy-init-dirs,
>>> ivy-probe-antlib,
>>> > ivy-init-antlib, ivy-init, ivy-resolve, ivy-compile, init, cc-compile,
>>> > prepare, genLexer, genParser, genTreeParser, gen, compile,
>>> > jar-withouthadoop, forrest.check, ivy-javadoc, javadoc-all, docs,
>>> > ivy-jdiff, write-null, api-xml, api-report, jar, package, tar,
>>> source-jar,
>>> > patch.check, makepom, ivy-releaseaudit, releaseaudit, ivy-test,
>>> > compile-test, pigunit-jar, javadoc, javadoc-jar, package-release,
>>> > clover.setup, jarWithSvn, piggybank, test-e2e-local,
>>> assert-pig-jar-exists,
>>> > ready-to-publish, copy-jar-to-maven, jar-withouthadoopWithOutSvn,
>>> > compile-sources, clover.info, clover, clean-sign, sign,
>>> > test-e2e-deploy-local, ivy-publish-local, ant-task-download,
>>> mvn-taskdef,
>>> > test-commit, test-smoke, copypom, maven-artifacts, published,
>>> set-version,
>>> > test-unit, test-e2e, test-contrib, test, jar-withouthadoopWithSvn,
>>> > clover.check, check-for-findbugs, test-core, ivy-buildJar,
>>> > checkstyle.check, tar-release, rpm, clean, smoketests-jar, mvn-install,
>>> > test-e2e-undeploy, ivy-checkstyle, jar-all, test-pigunit, signanddeploy,
>>> > simpledeploy, mvn-deploy, findbugs, buildJar-withouthadoop, checkstyle,
>>> > buildJar, findbugs.check, test-patch, jarWithOutSvn, test-e2e-deploy,
>>> > hudson-test-patch, compile-sources-all-warnings, test-contrib-internal,
>>> > include-meta, deb, eclipse-files, generate-clover-reports, ]
>>> >
>>> > ivy-download:
>>> >      [get] Getting:
>>> > http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar
>>> >      [get] To: /usr/local/hadoop/pig/new/trunk/ivy/ivy

Re: datanode failing to start.

2012-01-04 Thread Joey Krabacher
Have you checked your logs?
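
A quick sketch of where to look, based on the paths in the startup output
above (the daemon also writes a matching .log file next to each .out file):

  cd /Users/admin/hadoop/hadoop-0.20.203.0
  tail -n 50 logs/hadoop-root-datanode-Hoot-2.local.out
  tail -n 50 logs/hadoop-root-datanode-Hoot-2.local.log

That said, "NoClassDefFoundError: server" usually means the JVM was handed a
bare "server" argument, so it is also worth re-checking the edits you made in
conf/hadoop-env.sh for a "-server" (or similar) option that lost its leading
dash.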

--Joey

On Wed, Jan 4, 2012 at 4:37 PM, Dave Kelsey  wrote:
>
> java version 1.6.0_29
> hadoop: 0.20.203.0
>
> I'm attempting to setup the pseudo-distributed config on a mac 10.6.8.
> I followed the steps from the QuickStart
> (http://wiki.apache.org./hadoop/QuickStart) and succeeded with Stage 1:
> Standalone Operation.
> I followed the steps for Stage 2: Pseudo-distributed Configuration.
> I set the JAVA_HOME variable in conf/hadoop-env.sh and I changed tools.jar
> to the location of classes.jar (a mac version of tools.jar)
> I've modified the three .xml files as described in the QuickStart.
> ssh'ing to localhost has been configured and works with passwordless
> authentication.
> I formatted the namenode with "bin/hadoop namenode -format" as the
> instructions say
>
> This is what I see when I run bin/start-all.sh
>
> root# bin/start-all.sh
> starting namenode, logging to
> /Users/admin/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-root-namenode-Hoot-2.local.out
> localhost: starting datanode, logging to
> /Users/admin/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-root-datanode-Hoot-2.local.out
> localhost: Exception in thread "main" java.lang.NoClassDefFoundError: server
> localhost: Caused by: java.lang.ClassNotFoundException: server
> localhost:     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> localhost:     at java.security.AccessController.doPrivileged(Native Method)
> localhost:     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> localhost:     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> localhost:     at
> sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> localhost:     at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> localhost: starting secondarynamenode, logging to
> /Users/admin/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-root-secondarynamenode-Hoot-2.local.out
> starting jobtracker, logging to
> /Users/admin/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-root-jobtracker-Hoot-2.local.out
> localhost: starting tasktracker, logging to
> /Users/admin/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-root-tasktracker-Hoot-2.local.out
>
> There are 4 processes running:
> ps -fax | grep hadoop | grep -v grep | wc -l
>      4
>
> They are:
> SecondaryNameNode
> TaskTracker
> NameNode
> JobTracker
>
>
> I've searched to see if anyone else has encountered this and not found
> anything
>
> Dave Kelsey
>


Re: Memory exception in the mapper

2012-05-23 Thread Joey Krabacher
My experience with this sort of problem tells me it is one of two things, and
maybe both:

1. there are some optimizations to the code that can be made (variable
re-creation inside of loops, etc.)
2. something has gone horribly wrong with the logic in the mapper.

To troubleshoot, I would output some log entries at specific points in the
mapper (but be careful not to log every execution of the mapper, because this
could cause major issues with the disk filling up and that sort of thing).
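
A rough way to keep an eye on how much the task logs are using while testing
(this assumes the default layout where task logs land under
${HADOOP_LOG_DIR}/userlogs):

  du -sk logs/userlogs/* 2>/dev/null | sort -rn | head
  df -h logs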

Hope that helps.

/* Joey */

On Wed, May 23, 2012 at 2:16 PM, Mark Kerzner wrote:

> Hi, all,
>
> I got the exception below in the mapper. I already have my global Hadoop
> heap at 5 GB, but is there a specific other setting? Or maybe I should
> troubleshoot for memory?
>
> But the same application works in the IDE.
>
> Thank you!
>
> Mark
>
> *stderr logs*
>
> Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError:
> Java heap space
>at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
>at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
>at
> org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
>at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
>at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
> Exception in thread "communication thread" java.lang.OutOfMemoryError:
> Java heap space
>
> Exception: java.lang.OutOfMemoryError thrown from the
> UncaughtExceptionHandler in thread "communication thread"
>


Re: Memory exception in the mapper

2012-05-23 Thread Joey Krabacher
MergeQueue.merge(Merger.java:381)
>at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
>at
>
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1548)
>at
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1180)
>at
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:582)
>at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
>at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>at java.security.AccessController.doPrivileged(Native Method)
>at javax.security.auth.Subject.doAs(Subject.java:396)
>at
>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
>at org.apache.hadoop.mapred.Child.main(Child.java:264)
>
> On Wed, May 23, 2012 at 2:47 PM, Joey Krabacher  >wrote:
>
> > My experience with this sort of problem tells me one of two things and
> > maybe both:
> >
> > 1. there are some optimizations to the code that can be made (variable
> > re-creation inside of loops, etc.)
> > 2. something has gone horribly wrong with the logic in the mapper.
> >
> > To troubleshoot I would output some log entries at specific points in the
> > mapper (be careful not to log every execution of the mapper because this
> > could cause major issues with the disk filling up and that sort of
> thing.)
> >
> > Hope that helps.
> >
> > /* Joey */
> >
> > On Wed, May 23, 2012 at 2:16 PM, Mark Kerzner  > >wrote:
> >
> > > Hi, all,
> > >
> > > I got the exception below in the mapper. I already have my global
> Hadoop
> > > heap at 5 GB, but is there a specific other setting? Or maybe I should
> > > troubleshoot for memory?
> > >
> > > But the same application works in the IDE.
> > >
> > > Thank you!
> > >
> > > Mark
> > >
> > > *stderr logs*
> > >
> > > Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError:
> > > Java heap space
> > >at
> > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
> > >at
> > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
> > >at
> > > org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
> > >at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
> > >at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
> > > Exception in thread "communication thread" java.lang.OutOfMemoryError:
> > > Java heap space
> > >
> > > Exception: java.lang.OutOfMemoryError thrown from the
> > > UncaughtExceptionHandler in thread "communication thread"
> > >
> >
>


Re: Memory exception in the mapper

2012-05-23 Thread Joey Krabacher
No problem, glad I could help.

In our test environment I have lots of output and logging turned on, but as
soon as it goes to production, all output and logging is reduced to the bare
minimum. Basically, in production we only log caught exceptions.

I would take it out unless you absolutely need it, IMHO.
If your jobs are not mission-critical and do not need to run as smoothly as
possible, then it's not as important to remove those.

/* Joey */

On Wed, May 23, 2012 at 10:21 PM, Mark Kerzner wrote:

> Joey,
>
> that did the trick!
>
> Actually, I am writing to the log with System.out.println() statements, and
> I write about 12,000 lines, would that be a problem? I don't really need
> this output, so if you think it's inadvisable, I will remove that.
>
> Also, I hope that if I have not 6,000 maps but 12,000 or even 30,000, it
> will still work.
>
> Well, I will see pretty soon, I guess, with more data.
>
> Again, thank you.
>
> Sincerely,
> Mark
>
> On Wed, May 23, 2012 at 9:43 PM, Joey Krabacher  >wrote:
>
> > Mark,
> >
> > Have you tried tweaking the mapred.child.java.opts property in your
> > mapred-site.xml?
> >
> > <property>
> >   <name>mapred.child.java.opts</name>
> >   <value>-Xmx2048m</value>
> > </property>
> >
> > This might help.
> > It looks like the fatal error came right after the log truncater fired
> off.
> > Are you outputting anything to the logs manually, or have you looked at
> the
> > user logs to see if there is anything taking up lots of room?
> >
> > / * Joey */
> >
> >
> > On Wed, May 23, 2012 at 9:35 PM, Mark Kerzner  > >wrote:
> >
> > > Joey,
> > >
> > > my errors closely resembles this
> > > one<
> > >
> >
> http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201006.mbox/%3caanlktikr3df4ce-tgiphv9_-evfoed_5-t684nf4y...@mail.gmail.com%3E
> > > >in
> > > the archives. I can now be much more specific with the errors message,
> > > and it is quoted below. I tried -Xmx3096. But I got the same error.
> > >
> > > Thank you,
> > > Mark
> > >
> > >
> > > syslog logs
> > > 2012-05-23 20:04:52,349 WARN org.apache.hadoop.util.NativeCodeLoader:
> > > Unable to load native-hadoop library for your platform... using
> > > builtin-java classes where applicable
> > > 2012-05-23 20:04:52,519 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> > > Initializing JVM Metrics with processName=MAP, sessionId=
> > > 2012-05-23 20:04:52,695 INFO org.apache.hadoop.util.ProcessTree: setsid
> > > exited with exit code 0
> > > 2012-05-23 20:04:52,699 INFO org.apache.hadoop.mapred.Task:  Using
> > > ResourceCalculatorPlugin :
> > > org.apache.hadoop.util.LinuxResourceCalculatorPlugin@d56b37
> > > 2012-05-23 20:04:52,813 INFO org.apache.hadoop.mapred.MapTask:
> > io.sort.mb =
> > > 100
> > > 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: data
> > buffer
> > > = 79691776/99614720
> > > 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: record
> > > buffer = 262144/327680
> > > 2012-05-23 20:04:53,010 WARN
> > > org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library
> > not
> > > loaded
> > > 2012-05-23 20:12:29,120 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 0; bufend = 79542629; bufvoid = 99614720
> > > 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > 0;
> > > kvend = 228; length = 327680
> > > 2012-05-23 20:12:31,248 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 0
> > > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 79542629; bufend = 53863940; bufvoid = 99614720
> > > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 228; kvend = 431; length = 327680
> > > 2012-05-23 20:13:03,294 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 1
> > > 2012-05-23 20:13:48,121 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 53863940; bufend = 31696780; bufvoid = 99614720
> &g

Re: master and slaves are running but they seem disconnected

2012-06-09 Thread Joey Krabacher
Not sure, but I did notice that safe mode is still on. I would investigate
that and see if the other nodes show up.
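
A quick way to check from the namenode (and to leave safe mode once you are
sure nothing is actually missing):

  bin/hadoop dfsadmin -safemode get
  bin/hadoop dfsadmin -safemode leave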

/* Joey */
On Jun 9, 2012 2:52 PM, "Pierre Antoine DuBoDeNa"  wrote:

> Hello everyone..
>
> I have a cluster of 5 VMs, 1 as master/slave the rest are slaves. I run
> bin/start-all.sh everything seems ok i get no errors..
>
> I check with jps in all server they run:
>
> master:
> 22418 Jps
> 21497 NameNode
> 21886 SecondaryNameNode
> 21981 JobTracker
> 22175 TaskTracker
> 21688 DataNode
>
> slave:
> 3161 Jps
> 2953 DataNode
> 3105 TaskTracker
>
> But  in the web interface i get only 1 server connected.. is like the
> others are ignored.. Any clue why this can happen? where to look for
> errors?
>
> The hdfs web interface:
>
> Live Nodes<
> http://fusemaster.cs.columbia.edu:50070/dfsnodelist.jsp?whatNodes=LIVE>
> : 1 Dead Nodes<
> http://fusemaster.cs.columbia.edu:50070/dfsnodelist.jsp?whatNodes=DEAD>
> : 0
> it doesn't even show the rest slaves as dead..
>
> can it be a networking issue? (but i start all processes from master and it
> starts all processes to all others).
>
> best,
> PA
>