HuanWang created YARN-3791:
------------------------------

             Summary: FSDownload
                 Key: YARN-3791
                 URL: https://issues.apache.org/jira/browse/YARN-3791
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
    Affects Versions: 2.4.0
         Environment: Linux 2.6.32-279.el6.x86_64
            Reporter: HuanWang

We inadvertently set two FTP source paths:

{ { ftp://10.27.178.207:21/home/cbt/1213/jxf.sql, 1433225510000, FILE, null },pending,[(container_20150608111420_41540_1213_1503_)],4237640867118938,DOWNLOADING}
{ { ftp://10.27.89.13:21/home/cbt/common/2/sql.jar, 1433225415000, FILE, null },pending,[(container_20150608111420_41540_1213_1503_)],4237640866988089,DOWNLOADING}

The first one is a wrong path, and only one resource was set to it. But following the log, I saw that once the download from the first path had started, the resources of all subsequent jobs were downloaded from ftp://10.27.178.207 by default. The log is:

<code>
2015-06-09 11:14:34,653 INFO [AsyncDispatcher event handler] localizer.ResourceLocalizationService (ResourceLocalizationService.java:addResource(544)) - Downloading public rsrc:{ ftp://10.27.89.13:21/home/cbt/common/2/sql.jar, 1433225415000, FILE, null }
2015-06-09 11:14:34,653 INFO [AsyncDispatcher event handler] localizer.ResourceLocalizationService (ResourceLocalizationService.java:addResource(544)) - Downloading public rsrc:{ ftp://10.27.178.207:21/home/cbt/1213/jxf.sql, 1433225510000, FILE, null }
2015-06-09 11:14:37,883 INFO [Public Localizer] localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(672)) - Failed to download rsrc { { ftp://10.27.178.207:21/home/cbt/1213/jxf.sql, 1433225510000, FILE, null },pending,[(container_20150608111420_41540_1213_1503_)],4237640867118938,DOWNLOADING}
java.io.IOException: Login failed on server - 10.27.178.207, port - 21
    at org.apache.hadoop.fs.ftp.FTPFileSystem.connect(FTPFileSystem.java:133)
    at org.apache.hadoop.fs.ftp.FTPFileSystem.getFileStatus(FTPFileSystem.java:390)
    at com.suning.cybertron.superion.util.FSDownload.copy(FSDownload.java:172)
    at com.suning.cybertron.superion.util.FSDownload.call(FSDownload.java:279)
    at com.suning.cybertron.superion.util.FSDownload.call(FSDownload.java:52)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2015-06-09 11:14:37,885 INFO [Public Localizer] localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource ftp://10.27.178.207:21/home/cbt/1213/jxf.sql transitioned from DOWNLOADING to FAILED
2015-06-09 11:14:37,886 INFO [Public Localizer] localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(672)) - Failed to download rsrc { { ftp://10.27.89.13:21/home/cbt/common/2/sql.jar, 1433225415000, FILE, null },pending,[(container_20150608111420_41540_1213_1503_)],4237640866988089,DOWNLOADING}
java.io.IOException: Login failed on server - 10.27.178.207, port - 21
    at org.apache.hadoop.fs.ftp.FTPFileSystem.connect(FTPFileSystem.java:133)
    at org.apache.hadoop.fs.ftp.FTPFileSystem.getFileStatus(FTPFileSystem.java:390)
    at com.suning.cybertron.superion.util.FSDownload.copy(FSDownload.java:172)
    at com.suning.cybertron.superion.util.FSDownload.call(FSDownload.java:279)
    at com.suning.cybertron.superion.util.FSDownload.call(FSDownload.java:52)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2015-06-09 11:14:37,886 INFO [AsyncDispatcher event handler] container.Container (ContainerImpl.java:handle(853)) - Container container_20150608111420_41540_1213_1503_ transitioned from LOCALIZING to LOCALIZATION_FAILED
2015-06-09 11:14:37,887 INFO [Public Localizer] localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource
ftp://10.27.89.13:21/home/cbt/common/2/sql.jar transitioned from DOWNLOADING to FAILED
2015-06-09 11:14:37,887 INFO [AsyncDispatcher event handler] localizer.LocalResourcesTrackerImpl (LocalResourcesTrackerImpl.java:handle(133)) - Container container_20150608111420_41540_1213_1503_ sent RELEASE event on a resource request { ftp://10.27.89.13:21/home/cbt/common/2/sql.jar, 1433225415000, FILE, null } not present
<code>

I debugged the YARN code and found that the key point is org.apache.hadoop.fs.FileSystem#cache. The source code is here:

<code>
private FileSystem getInternal(URI uri, Configuration conf, Key key) throws IOException{
  FileSystem fs;
  synchronized (this) {
    fs = map.get(key);
  }
  if (fs != null) {
    return fs;
  }

  fs = createFileSystem(uri, conf);
  synchronized (this) { // refetch the lock again
    FileSystem oldfs = map.get(key);
    if (oldfs != null) { // a file system is created while lock is releasing
      fs.close(); // close the new file system
      return oldfs; // return the old file system
    }

    // now insert the new file system into the map
    if (map.isEmpty()
        && !ShutdownHookManager.get().isShutdownInProgress()) {
      ShutdownHookManager.get().addShutdownHook(clientFinalizer, SHUTDOWN_HOOK_PRIORITY);
    }
    fs.key = key;
    map.put(key, fs);
    if (conf.getBoolean("fs.automatic.close", true)) {
      toAutoClose.add(key);
    }
    return fs;
  }
}

=============================
FTPFileSystem.java

@Override
public void initialize(URI uri, Configuration conf) throws IOException { // get
  super.initialize(uri, conf);
  // get host information from uri (overrides info in conf)
  String host = uri.getHost();
  host = (host == null) ? conf.get("fs.ftp.host", null) : host;
  if (host == null) {
    throw new IOException("Invalid host specified");
  }
  conf.set("fs.ftp.host", host);

  // get port information from uri, (overrides info in conf)
  int port = uri.getPort();
  port = (port == -1) ?
      FTP.DEFAULT_PORT : port;
  conf.setInt("fs.ftp.host.port", port);

  // get user/password information from URI (overrides info in conf)
  String userAndPassword = uri.getUserInfo();
  if (userAndPassword == null) {
    userAndPassword = (conf.get("fs.ftp.user." + host, null) + ":" + conf
        .get("fs.ftp.password." + host, null));
    if (userAndPassword == null) {
      throw new IOException("Invalid user/passsword specified");
    }
  }
  String[] userPasswdInfo = userAndPassword.split(":");
  conf.set("fs.ftp.user." + host, userPasswdInfo[0]);
  if (userPasswdInfo.length > 1) {
    conf.set("fs.ftp.password." + host, userPasswdInfo[1]);
  } else {
    conf.set("fs.ftp.password." + host, null);
  }
  setConf(conf);
  this.uri = uri;
}
<code>

First, we have a resource on ftp://10.27.89.13:21. The cache stores this key, and fs.ftp.host=10.27.89.13, fs.ftp.user.10.27.89.13=XX and fs.ftp.password.10.27.89.13 are set in conf.

Second, a resource on ftp://10.27.178.207 arrives. It does not exist in the cache yet, so the cache stores this key, and fs.ftp.host=10.27.178.207, fs.ftp.user.10.27.178.207=XX and fs.ftp.password.10.27.178.207 are set in conf.

The key point is this:

<code>
/**
 * Connect to the FTP server using configuration parameters *
 *
 * @return An FTPClient instance
 * @throws IOException
 */
private FTPClient connect() throws IOException {
  FTPClient client = null;
  Configuration conf = getConf();
  String host = conf.get("fs.ftp.host");
  int port = conf.getInt("fs.ftp.host.port", FTP.DEFAULT_PORT);
  String user = conf.get("fs.ftp.user." + host);
  String password = conf.get("fs.ftp.password."
      + host);
  client = new FTPClient();
  client.connect(host, port);
  int reply = client.getReplyCode();
  if (!FTPReply.isPositiveCompletion(reply)) {
    throw new IOException("Server - " + host
        + " refused connection on port - " + port);
  } else if (client.login(user, password)) {
    client.setFileTransferMode(FTP.BLOCK_TRANSFER_MODE);
    client.setFileType(FTP.BINARY_FILE_TYPE);
    client.setBufferSize(DEFAULT_BUFFER_SIZE);
  } else {
    throw new IOException("Login failed on server - " + host + ", port - "
        + port);
  }
  return client;
}
<code>

FTPFileSystem uses conf to get the host, port, username and password. After the first two steps, fs.ftp.host in conf is set to 10.27.178.207.

Third, a resource on ftp://10.27.89.13:21 arrives. The cache finds that an entry for it already exists, so the cached FTPFileSystem is used to connect. But fs.ftp.host in conf is now 10.27.178.207, so the client logs in to the wrong server, which matches the "Login failed on server - 10.27.178.207" error logged for sql.jar. It's confusing!
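
The three steps above can be replayed without Hadoop. The following is only a minimal sketch, assuming (as the analysis suggests) that the same conf object is handed to every file system that gets created. `SharedConfDemo`, `FakeFtpFs`, `replay` and the plain `Map` standing in for `Configuration` are hypothetical names for illustration, not Hadoop APIs. The `copyConf` flag is there purely for contrast and is not meant as the fix for this issue.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, Hadoop-free model of the behaviour described in this report. */
public class SharedConfDemo {

    /** Hypothetical stand-in for FTPFileSystem: it remembers the conf it was given. */
    static final class FakeFtpFs {
        private final Map<String, String> conf;

        FakeFtpFs(String host, Map<String, String> conf, boolean copyConf) {
            // copyConf == false mirrors initialize(): the caller's conf object is mutated.
            this.conf = copyConf ? new HashMap<>(conf) : conf;
            this.conf.put("fs.ftp.host", host);
        }

        /** Mirrors connect(): the host is read back from conf, not from the URI. */
        String hostUsedByConnect() {
            return conf.get("fs.ftp.host");
        }
    }

    /** Replays the three steps; returns the host the first cached instance would dial. */
    public static String replay(boolean copyConf) {
        Map<String, String> sharedConf = new HashMap<>();  // stand-in for Configuration
        Map<String, FakeFtpFs> cache = new HashMap<>();    // stand-in for the FileSystem cache

        // Step 1: a resource on 10.27.89.13 creates and caches a file system.
        cache.put("ftp://10.27.89.13:21",
                new FakeFtpFs("10.27.89.13", sharedConf, copyConf));
        // Step 2: a resource on 10.27.178.207 is a cache miss; creating it rewrites fs.ftp.host.
        cache.put("ftp://10.27.178.207:21",
                new FakeFtpFs("10.27.178.207", sharedConf, copyConf));
        // Step 3: another resource on 10.27.89.13 is a cache hit.
        return cache.get("ftp://10.27.89.13:21").hostUsedByConnect();
    }

    public static void main(String[] args) {
        System.out.println("shared conf: " + replay(false)); // prints "shared conf: 10.27.178.207"
        System.out.println("copied conf: " + replay(true));  // prints "copied conf: 10.27.89.13"
    }
}
```

With the shared conf, the instance cached for 10.27.89.13 reports 10.27.178.207, which is the same mismatch seen in the log.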