[ https://issues.apache.org/jira/browse/MAPREDUCE-6992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Philip Zeyliger resolved MAPREDUCE-6992.
----------------------------------------
    Resolution: Duplicate

I agree; this is a dupe. Thanks!

> Race for temp dir in LocalDistributedCacheManager.java
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-6992
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6992
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Philip Zeyliger
>
> When localizing distributed cache files in "local" mode,
> LocalDistributedCacheManager.java chooses a "unique" directory based on a
> millisecond time stamp. When jobs run with some parallelism, two of them
> can pick the same time stamp and collide on that directory.
> The error message looks like
> {code}
> bq. java.io.FileNotFoundException: jenkins/mapred/local/1508958341829_tmp
> does not exist
> {code}
> I ran into this in Impala's data loading. There, we run a HiveServer2 which
> runs in MapReduce. If multiple queries are submitted simultaneously to the
> HS2, they conflict on this directory. Googling found that StreamSets ran into
> something very similar-looking; see
> https://issues.streamsets.com/browse/SDC-5473.
> I believe the buggy code is (link:
> https://github.com/apache/hadoop/blob/2da654e34a436aae266c1fbdec5c1067da8d854e/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalDistributedCacheManager.java#L94)
> {code}
>     // Generating unique numbers for FSDownload.
>     AtomicLong uniqueNumberGenerator =
>         new AtomicLong(System.currentTimeMillis());
> {code}
> Notably, a similar code path uses an actual random number generator
> ({{LocalJobRunner.java}},
> https://github.com/apache/hadoop/blob/2da654e34a436aae266c1fbdec5c1067da8d854e/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalJobRunner.java#L912).
> {code}
>   public String getStagingAreaDir() throws IOException {
>     Path stagingRootDir = new Path(conf.get(JTConfig.JT_STAGING_AREA_ROOT,
>         "/tmp/hadoop/mapred/staging"));
>     UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
>     String user;
>     randid = rand.nextInt(Integer.MAX_VALUE);
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
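
For illustration, here is a minimal standalone sketch of the collision described in the report. It is not the Hadoop code: the class `TempDirNaming` and both helper methods are hypothetical, the `_tmp` suffix only mirrors the error message above, and the use of `incrementAndGet()` to derive the name is an assumption. It shows that two counters seeded from the same millisecond produce the same name, while `Files.createTempDirectory` (one possible alternative, not necessarily the fix adopted upstream) yields distinct directories.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicLong;

public class TempDirNaming {
    // Mirrors the reported pattern: every caller builds its own counter,
    // seeded from a clock reading, then derives a directory name from it.
    // (Hypothetical helper; the "_tmp" suffix is taken from the error message.)
    static String timestampName(long nowMillis) {
        AtomicLong uniqueNumberGenerator = new AtomicLong(nowMillis);
        return uniqueNumberGenerator.incrementAndGet() + "_tmp";
    }

    // Hypothetical alternative: ask the filesystem to create a directory
    // with a name that was not in use at creation time.
    static Path freshTempDir(Path parent) throws IOException {
        return Files.createTempDirectory(parent, "localize_");
    }

    public static void main(String[] args) throws IOException {
        // Two "jobs" that start within the same millisecond see the same seed.
        long now = System.currentTimeMillis();
        String a = timestampName(now);
        String b = timestampName(now);
        System.out.println("timestamp-based names equal: " + a.equals(b));

        Path parent = Files.createTempDirectory("mapred_local_demo");
        Path p1 = freshTempDir(parent);
        Path p2 = freshTempDir(parent);
        System.out.println("createTempDirectory names equal: " + p1.equals(p2));
    }
}
```

The AtomicLong only guarantees uniqueness within one instance; it does nothing for two instances that happen to start from the same clock value, which is the situation with simultaneous queries in one process.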