[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209011#comment-13209011
 ] 

Tsz Wo (Nicholas), SZE commented on MAPREDUCE-3583:
---------------------------------------------------

Sorry, I had assumed BigInteger was used only for checking overflow.  If the 
value of stime is expected to exceed Long.MAX_VALUE, it is okay to use 
BigInteger for the moment.  We may improve it later on.
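
As a rough illustration of the point above, here is a minimal sketch of 
reading a tick counter with BigInteger.  It assumes the field arrives as a 
decimal string from /proc/<pid>/stat; the class and method names are 
hypothetical and are not taken from the attached patches.

```java
import java.math.BigInteger;

// Hypothetical helper, not part of ProcfsBasedProcessTree.
final class StimeParse {
    static final BigInteger LONG_MAX = BigInteger.valueOf(Long.MAX_VALUE);

    // Parse the raw decimal field without any 64-bit limit.
    public static BigInteger parseTicks(String field) {
        return new BigInteger(field);
    }

    // True when the value is non-negative and small enough for a Java long.
    public static boolean fitsInLong(BigInteger v) {
        return v.signum() >= 0 && v.compareTo(LONG_MAX) <= 0;
    }

    public static void main(String[] args) {
        // Value taken from the stack trace in this issue.
        BigInteger stime = parseTicks("18446743988060683582");
        System.out.println(stime + " fits in long: " + fitsInLong(stime));
    }
}
```

A caller could keep the BigInteger as is, or narrow it with longValue() only 
after fitsInLong() returns true.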

                
> ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException
> -----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3583
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3583
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.205.0
>         Environment: 64-bit Linux:
> asf011.sp2.ygridcore.net
> Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 
> 17:42:25 UTC 2011 x86_64 GNU/Linux
>            Reporter: Zhihong Yu
>            Assignee: Zhihong Yu
>            Priority: Critical
>         Attachments: mapreduce-3583-trunk-v2.txt, 
> mapreduce-3583-trunk-v2.txt, mapreduce-3583-trunk-v3.txt, 
> mapreduce-3583-trunk-v4.txt, mapreduce-3583-trunk-v5.txt, 
> mapreduce-3583-trunk-v6.txt, mapreduce-3583-trunk.txt, mapreduce-3583-v2.txt, 
> mapreduce-3583-v3.txt, mapreduce-3583-v4.txt, mapreduce-3583-v5.txt, 
> mapreduce-3583.txt
>
>
> HBase PreCommit builds frequently gave us NumberFormatException.
> From 
> https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/:
> {code}
> 2011-12-20 01:44:01,180 WARN  [main] mapred.JobClient(784): No job jar file 
> set.  User classes may not be found. See JobConf(Class) or 
> JobConf#setJar(String).
> java.lang.NumberFormatException: For input string: "18446743988060683582"
>       at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
>       at java.lang.Long.parseLong(Long.java:422)
>       at java.lang.Long.parseLong(Long.java:468)
>       at 
> org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413)
>       at 
> org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148)
>       at 
> org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401)
>       at org.apache.hadoop.mapred.Task.initialize(Task.java:536)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>       at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {code}
> From the Hadoop 0.20.205 source code, it looks like ppid was 
> 18446743988060683582, causing the NFE:
> {code}
>         // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss)
>          pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)),
> {code}
> You can find information on the OS at the beginning of 
> https://builds.apache.org/job/PreCommit-HBASE-Build/553/console:
> {code}
> asf011.sp2.ygridcore.net
> Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 
> 17:42:25 UTC 2011 x86_64 GNU/Linux
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 20
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 16382
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 60000
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 2048
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
> 60000
> Running in Jenkins mode
> {code}
> From Nicholas Sze:
> {noformat}
> It looks like the ppid is a 64-bit positive integer, but a Java long is 
> signed and so only works with 63-bit positive integers.  In your case,
>   2^64 > 18446743988060683582 > 2^63.
> Therefore, there is a NFE. 
> {noformat}
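
The range argument in the note above can be checked directly.  The sketch 
below is only an illustration with hypothetical names; it is not code from 
Hadoop.  It confirms that Long.parseLong rejects the value, that the value 
lies between 2^63 and 2^64, and, using Long.parseUnsignedLong (available 
since Java 8), what signed number the same 64 bits represent.

```java
import java.math.BigInteger;

// Hypothetical demonstration class, not part of Hadoop.
final class UnsignedRangeCheck {
    // True when Long.parseLong throws NumberFormatException for the input.
    public static boolean parseLongFails(String decimal) {
        try {
            Long.parseLong(decimal);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    // True when 2^63 <= value < 2^64, i.e. it needs all 64 bits unsigned.
    public static boolean between2Pow63And2Pow64(String decimal) {
        BigInteger v = new BigInteger(decimal);
        BigInteger two63 = BigInteger.ONE.shiftLeft(63);
        BigInteger two64 = BigInteger.ONE.shiftLeft(64);
        return v.compareTo(two63) >= 0 && v.compareTo(two64) < 0;
    }

    // Reinterpret the unsigned 64-bit value as a signed long (same bits).
    public static long asSignedBits(String decimal) {
        return Long.parseUnsignedLong(decimal);
    }

    public static void main(String[] args) {
        String s = "18446743988060683582";
        System.out.println("parseLong fails: " + parseLongFails(s));
        System.out.println("in [2^63, 2^64): " + between2Pow63And2Pow64(s));
        System.out.println("as signed long:  " + asSignedBits(s));
    }
}
```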
> I propose changing allProcessInfo to Map<String, ProcessInfo>, so that we 
> avoid this problem by not parsing large integers.
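
The proposal above could look roughly like the following sketch.  The 
ProcessInfo class here is a simplified, hypothetical stand-in for the one in 
ProcfsBasedProcessTree, and ProcessTable is an invented wrapper; the only 
point is that a string key never goes through Integer.parseInt or 
Long.parseLong.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper used only to illustrate a string-keyed map.
final class ProcessTable {
    // Simplified stand-in for Hadoop's ProcessInfo; keeps only the raw id.
    static final class ProcessInfo {
        final String pid;

        ProcessInfo(String pid) {
            this.pid = pid;
        }
    }

    // Keyed by the id exactly as read from procfs, with no numeric parsing.
    private final Map<String, ProcessInfo> allProcessInfo = new HashMap<>();

    public void add(String pid) {
        allProcessInfo.put(pid, new ProcessInfo(pid));
    }

    public ProcessInfo get(String pid) {
        return allProcessInfo.get(pid);
    }

    public static void main(String[] args) {
        ProcessTable table = new ProcessTable();
        // An id too large for a long is stored and looked up without error.
        table.add("18446743988060683582");
        System.out.println(table.get("18446743988060683582").pid);
    }
}
```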

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
