[ https://issues.apache.org/jira/browse/HDFS-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605064#comment-13605064 ]

rajat agarwal commented on HDFS-2541:
-------------------------------------

{code}
long period = Math.min(scanPeriod, 
                           Math.max(blockMap.size(),1) * 600 * 1000L);
{code}

The problem here is that once blockMap.size() exceeds 3579139, the intermediate 
product blockMap.size() * 600 is evaluated in int arithmetic, overflows 
Integer.MAX_VALUE, and wraps to a negative value. Had it been written as
{code}
long period = Math.min(scanPeriod, 
                           Math.max(blockMap.size(),1) * 600L * 1000L);
{code}
"period" would have been dependent on scanPeriod only OR always positive.

Now there can be two cases here:

1) If scanPeriod takes the default value (21*24 hours, i.e. three weeks), or 
anything less than 596 hours (since 596*3600*1000 < Integer.MAX_VALUE), "period" 
would always be a positive value, even after the cast to int.
2) If the scan period is more than that, the cast (int)period would again be 
negative. In this case we can have something like this (a fuller standalone 
sketch follows the snippet):

{code}
if ((int) period < 0) {
    period = scanPeriod;
}

return System.currentTimeMillis() - scanPeriod +
           random.nextInt((int) period);
{code}
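
For illustration, here is a self-contained sketch combining both fixes; the 
class, method, and parameter names are hypothetical, not the actual 
DataBlockScanner code. One caveat on the fallback above: when scanPeriod itself 
exceeds Integer.MAX_VALUE milliseconds (roughly 596 hours), (int)scanPeriod can 
also wrap negative, so this sketch clamps to Integer.MAX_VALUE instead of 
falling back to scanPeriod:

{code}
import java.util.Random;

public class ScanTimeSketch {
    private static final Random random = new Random();

    // Hypothetical stand-in for DataBlockScanner.getNewBlockScanTime().
    static long getNewBlockScanTime(long scanPeriod, int blockMapSize) {
        // 600L keeps the whole product in long arithmetic (fix #1).
        long period = Math.min(scanPeriod,
                               Math.max(blockMapSize, 1) * 600L * 1000L);
        // Guard the narrowing cast: a cast that wraps to <= 0 would make
        // Random.nextInt(int) throw, so clamp to Integer.MAX_VALUE (fix #2).
        if ((int) period <= 0) {
            period = Integer.MAX_VALUE;
        }
        return System.currentTimeMillis() - scanPeriod
                + random.nextInt((int) period);
    }
}
{code}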
                
> For a sufficiently large value of blocks, the DN Scanner may request a random 
> number with a negative seed value.
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-2541
>                 URL: https://issues.apache.org/jira/browse/HDFS-2541
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 0.20.1
>            Reporter: Harsh J
>            Assignee: Harsh J
>             Fix For: 0.23.1, 1.1.0
>
>         Attachments: BSBugTest.java, HDFS-2541.patch
>
>
> Running off 0.20-security, I noticed that one could get the following 
> exception when scanners are used:
> {code}
> DataXceiver
> java.lang.IllegalArgumentException: n must be positive
>     at java.util.Random.nextInt(Random.java:250)
>     at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.getNewBlockScanTime(DataBlockScanner.java:251)
>     at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.addBlock(DataBlockScanner.java:268)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:432)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:122)
> {code}
> This is because the period, determined in the DataBlockScanner (0.20+) or 
> BlockPoolSliceScanner (0.23+), is cast to an integer before it is sent to a 
> Random.nextInt(...) call. For sufficiently large values of the long 'period', 
> the cast integer may be negative. This is not accounted for. I'll attach a 
> sample test that shows this possibility with the numbers.
> We should ensure we do a Math.abs(...) before we send it to the 
> Random.nextInt(...) call to avoid this (see the sketch after this report).
> With this bug, the maximum # of blocks a scanner may hold in its blocksMap 
> without opening up the chance of hitting this exception (intermittent, as 
> blocks continue to grow) would be 3582718.
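
As a footnote to the Math.abs(...) suggestion above, a minimal sketch of that 
guard (names here are illustrative, not from the patch): Math.abs alone is not 
quite sufficient, because Math.abs(Integer.MIN_VALUE) is still negative and 
Random.nextInt(0) also throws, so both cases need a fallback:

{code}
import java.util.Random;

public class AbsGuardSketch {
    private static final Random random = new Random();

    // Illustrative guard around the narrowing cast described above.
    static int randomOffset(long period) {
        int p = Math.abs((int) period); // the cast may wrap negative; abs
                                        // flips most such values positive...
        if (p <= 0) {                   // ...but abs(Integer.MIN_VALUE) stays
            p = Integer.MAX_VALUE;      // negative, and nextInt(0) throws,
        }                               // so clamp both cases
        return random.nextInt(p);
    }
}
{code}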
