> On Oct. 6, 2017, 12:26 a.m., Robert Kanter wrote:
> > tools/src/main/java/org/apache/oozie/tools/diag/AppInfoCollector.java
> > Lines 173-174 (patched)
> > <https://reviews.apache.org/r/62459/diff/10/?file=1845962#file1845962line173>
> >
> >     ``LogAggregationUtils`` is marked ``@Private``, so we shouldn't be using it. Hadoop can change things incompatibly here.
> 
> Attila Sasvari wrote:
>     What do you think about borrowing/inlining/copying those functions from Hadoop 2.6 as well? I have been thinking about it, and it is probably better than bringing back the initial ``ExecutorService``-based approach. I am thinking of a class like:
> 
>     ```java
>     // TODO: once OOZIE-2983 ("Stream the Launcher AM Logs") is done, remove it.
>     public class OozieLauncherLogFetcher {
>         private static final String TMP_FILE_SUFFIX = ".tmp";
>         private final Configuration hadoopConfig;
> 
>         public OozieLauncherLogFetcher(final Configuration hadoopConfig) {
>             this.hadoopConfig = hadoopConfig;
>         }
> 
>         // Borrowed code from org.apache.hadoop.yarn.logaggregation.LogCLIHelpers
>         private static void logDirNotExist(String remoteAppLogDir) {
>             System.out.println(remoteAppLogDir + " does not exist.");
>             System.out.println("Log aggregation has not completed or is not enabled.");
>         }
> 
>         // Borrowed code from org.apache.hadoop.yarn.logaggregation.LogCLIHelpers
>         private static void emptyLogDir(String remoteAppLogDir) {
>             System.out.println(remoteAppLogDir + " does not have any log files.");
>         }
> 
>         // Borrowed code from org.apache.hadoop.yarn.logaggregation.LogAggregationUtils
>         public static String getRemoteNodeLogDirSuffix(Configuration conf) {
>             return conf.get(YarnConfiguration.NM_REMOTE_APP_LOG_DIR_SUFFIX,
>                     YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR_SUFFIX);
>         }
> 
>         // Borrowed code from org.apache.hadoop.yarn.logaggregation.LogAggregationUtils
>         public static Path getRemoteLogSuffixedDir(Path remoteRootLogDir, String user, String suffix) {
>             return suffix != null && !suffix.isEmpty()
>                     ? new Path(getRemoteLogUserDir(remoteRootLogDir, user), suffix)
>                     : getRemoteLogUserDir(remoteRootLogDir, user);
>         }
> 
>         // Borrowed code from org.apache.hadoop.yarn.logaggregation.LogAggregationUtils
>         public static Path getRemoteLogUserDir(Path remoteRootLogDir, String user) {
>             return new Path(remoteRootLogDir, user);
>         }
> 
>         // Borrowed code from org.apache.hadoop.yarn.logaggregation.LogAggregationUtils
>         public static Path getRemoteAppLogDir(Path remoteRootLogDir, ApplicationId appId, String user, String suffix) {
>             return new Path(getRemoteLogSuffixedDir(remoteRootLogDir, user, suffix), appId.toString());
>         }
> 
>         // Borrowed code from org.apache.hadoop.yarn.logaggregation.LogCLIHelpers
>         public int dumpAllContainersLogs(ApplicationId appId, String appOwner, PrintStream out) throws IOException {
>             Path remoteRootLogDir = new Path(hadoopConfig.get(YarnConfiguration.NM_REMOTE_APP_LOG_DIR,
>                     YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR));
>             String logDirSuffix = getRemoteNodeLogDirSuffix(hadoopConfig);
>             Path remoteAppLogDir = getRemoteAppLogDir(remoteRootLogDir, appId, appOwner, logDirSuffix);
> 
>             RemoteIterator<FileStatus> nodeFiles;
>             try {
>                 Path qualifiedLogDir = FileContext.getFileContext(hadoopConfig).makeQualified(remoteAppLogDir);
>                 nodeFiles = FileContext.getFileContext(qualifiedLogDir.toUri(), hadoopConfig)
>                         .listStatus(remoteAppLogDir);
>             } catch (FileNotFoundException fileNotFoundException) {
>                 logDirNotExist(remoteAppLogDir.toString());
>                 return -1;
>             }
> 
>             boolean foundAnyLogs = false;
> 
>             while (true) {
>                 // Skip in-progress aggregation files; stop when the listing is exhausted.
>                 FileStatus thisNodeFile;
>                 do {
>                     if (!nodeFiles.hasNext()) {
>                         if (!foundAnyLogs) {
>                             emptyLogDir(remoteAppLogDir.toString());
>                             return -1;
>                         }
>                         return 0;
>                     }
>                     thisNodeFile = nodeFiles.next();
>                 } while (thisNodeFile.getPath().getName().endsWith(TMP_FILE_SUFFIX));
> 
>                 AggregatedLogFormat.LogReader reader =
>                         new AggregatedLogFormat.LogReader(hadoopConfig, thisNodeFile.getPath());
>                 try {
>                     AggregatedLogFormat.LogKey key = new AggregatedLogFormat.LogKey();
>                     DataInputStream valueStream = reader.next(key);
> 
>                     while (valueStream != null) {
>                         String containerString = "\n\nContainer: " + key + " on " + thisNodeFile.getPath().getName();
>                         out.println(containerString);
>                         out.println(StringUtils.repeat("=", containerString.length()));
> 
>                         while (true) {
>                             try {
>                                 AggregatedLogFormat.LogReader.readAContainerLogsForALogType(valueStream, out,
>                                         thisNodeFile.getModificationTime());
>                                 foundAnyLogs = true;
>                             } catch (EOFException eofException) {
>                                 key = new AggregatedLogFormat.LogKey();
>                                 valueStream = reader.next(key);
>                                 break;
>                             }
>                         }
>                     }
>                 } finally {
>                     reader.close();
>                 }
>             }
>         }
>     }
>     ```
I think this is an acceptable solution until we figure out OOZIE-2983. We might also try to push for a public API for this in Hadoop itself, so we can just use that.

- Peter


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62459/#review187223
-----------------------------------------------------------


On Oct. 4, 2017, 2:18 p.m., Attila Sasvari wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62459/
> -----------------------------------------------------------
> 
> (Updated Oct. 4, 2017, 2:18 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> A diagnostic tool that collects job and other information from Oozie into a zip file.
> 
> 
> Diffs
> -----
> 
>   docs/src/site/twiki/DG_CommandLineTool.twiki d4047671876dcc3279a2ec379bc1d003f5e6f1aa 
>   pom.xml 0b94484da1c97618e9168cea0ebbfff7f70f723c 
>   tools/pom.xml 7306a14e7b237977be00f8fe28e34573540fd508 
>   tools/src/main/bin/oozie-diag-bundle-collector.sh PRE-CREATION 
>   tools/src/main/java/org/apache/oozie/tools/diag/AppInfoCollector.java PRE-CREATION 
>   tools/src/main/java/org/apache/oozie/tools/diag/ArgParser.java PRE-CREATION 
>   tools/src/main/java/org/apache/oozie/tools/diag/DiagBundleCollectorDriver.java PRE-CREATION 
>   tools/src/main/java/org/apache/oozie/tools/diag/DiagBundleCompressor.java PRE-CREATION 
>   tools/src/main/java/org/apache/oozie/tools/diag/DiagBundleEntryWriter.java PRE-CREATION 
>   tools/src/main/java/org/apache/oozie/tools/diag/DiagOozieClient.java PRE-CREATION 
>   tools/src/main/java/org/apache/oozie/tools/diag/MetricsCollector.java PRE-CREATION 
>   tools/src/main/java/org/apache/oozie/tools/diag/ServerInfoCollector.java PRE-CREATION 
>   tools/src/test/java/org/apache/oozie/tools/diag/TestAppInfoCollector.java PRE-CREATION 
>   tools/src/test/java/org/apache/oozie/tools/diag/TestArgParser.java PRE-CREATION 
>   tools/src/test/java/org/apache/oozie/tools/diag/TestMetricsCollector.java PRE-CREATION 
>   tools/src/test/java/org/apache/oozie/tools/diag/TestServerInfoCollector.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/62459/diff/10/
> 
> 
> Testing
> -------
> 
> - new unit tests: TestOozieDiagBundleCollector
> - started Oozie with a pseudo Hadoop cluster, submitted a couple of workflows, and executed the following commands:
>   -- ``bin/oozie-diag-bundle-collector.sh`` (usage info printed),
>   -- ``bin/oozie-diag-bundle-collector.sh -numworkflows 2000 -oozie http://localhost:11000/oozie -output /tmp``,
>   -- ``bin/oozie-diag-bundle-collector.sh -jobs 0000001-170918144116149-oozie-asas-W -oozie http://localhost:11000/oozie -output .`` (verified the zip the tool generated).
> 
> 
> Thanks,
> 
> Attila Sasvari
> 
> 