Re: Review Request 27396: Pushing component logs to HDFS from ambari-agent
On Nov. 3, 2014, 3:53 p.m., Jonathan Hurley wrote:
> What happens when this agent code is installed in a cluster that's not running HDFS (GlusterFS for example)?

Cabir Zounaidou wrote:
> Presently, the command will fail and will not push the log files to HDFS. For log pushing to work, HDFS must be running. The LogHandlerCommand on the ambari-server side can skip pushing the configs if the HDFS service is not available. In that case, no jobs will be scheduled.

When you say that the LogHandlerCommand can skip pushing configs (so the jobs won't be scheduled), do you mean that you've implemented it this way, or that you'd need to change it so that the commands are skipped? If it works this way today, I can +1 the patch, but I didn't see it in the heartbeat handler.

- Jonathan

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27396/#review59638
---

On Nov. 6, 2014, 9:11 p.m., Cabir Zounaidou wrote:

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27396/
---

(Updated Nov. 6, 2014, 9:11 p.m.)

Review request for Ambari, Alejandro Fernandez, Jonathan Hurley, John Speidel, Mahadev Konar, Nate Cole, and Yusaku Sako.

Bugs: AMBARI-1522
https://issues.apache.org/jira/browse/AMBARI-1522

Repository: ambari

Description
---
The idea is to run a scheduler which computes diffs in a component log and pushes the diff (only if available) to HDFS at a fixed interval.

1. Each component will have its own monitor.
2. The apscheduler threadpool controls the monitor execution.
3. If the log directory is not available because the component (or service) is not running, the monitor skips processing.
4. It saves the last read line index and last modified time for the next iteration.
5. It uses the HDFS shell utility to push the log patch to HDFS.
6. Right now, the component log directory is configured in a JSON file. In the next iteration, it will be detected automatically from the stack config.
Diffs
---
ambari-agent/src/main/python/ambari_agent/AmbariConfig.py ca2e80c
ambari-agent/src/main/python/ambari_agent/Controller.py dc3a1cf
ambari-agent/src/main/python/ambari_agent/LogSchedulerHandler.py PRE-CREATION
ambari-agent/src/main/python/ambari_agent/loghandler/__init__.py PRE-CREATION
ambari-agent/src/main/python/ambari_agent/loghandler/config.py PRE-CREATION
ambari-agent/src/main/python/ambari_agent/loghandler/hdfsapi.py PRE-CREATION
ambari-agent/src/main/python/ambari_agent/loghandler/monitor.py PRE-CREATION
ambari-agent/src/main/python/ambari_agent/loghandler/util.py PRE-CREATION
ambari-agent/src/test/python/ambari_agent/TestHdfsApi.py PRE-CREATION
ambari-agent/src/test/python/ambari_agent/TestLogHandler.py PRE-CREATION
ambari-agent/src/test/python/ambari_agent/dummy_files/log_handler_config.json PRE-CREATION

Diff: https://reviews.apache.org/r/27396/diff/

Testing
---
- Ran the ambari-agent with the patch and verified:
  1. Started ambari-agent with the patch.
  2. The scheduler started successfully.
  3. The log files are pushed to HDFS successfully. Verified using the HDFS shell utility.
- Ran the tests successfully:

[INFO]
[INFO]
[INFO] Building Ambari Agent 1.3.0-SNAPSHOT
[INFO]
[INFO]
:
--
Ran 324 tests in 8.282s
OK
:
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 14.759s
[INFO] Finished at: Thu Oct 30 14:58:21 PDT 2014
[INFO] Final Memory: 10M/4079M
[INFO]

Thanks,
Cabir Zounaidou
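
Steps 3 and 4 of the description (skip when the log is absent, remember the last read line index and last modified time) could be sketched roughly as follows. This is only an illustration and not the code from the patch: the function name `read_new_lines`, the JSON layout of the metadata file, and the handling of a rotated log are all assumptions.

```python
import json
import os


def read_new_lines(log_path, metadata_path):
    """Return the lines appended to log_path since the previous call.

    The last read line index and the log's modification time are kept in
    metadata_path (a small JSON file; this layout is an assumption).
    """
    if not os.path.isfile(log_path):
        # Component not running or log not created yet: skip this round.
        return []

    state = {"last_line": 0, "mtime": 0.0}
    if os.path.isfile(metadata_path):
        with open(metadata_path) as metadata_file:
            state = json.load(metadata_file)

    mtime = os.path.getmtime(log_path)
    if mtime == state["mtime"]:
        # Nothing was written since the last iteration.
        return []

    with open(log_path) as log_file:
        lines = log_file.readlines()

    if len(lines) < state["last_line"]:
        # The log was rotated or truncated; start again from the top.
        state["last_line"] = 0

    new_lines = lines[state["last_line"]:]

    with open(metadata_path, "w") as metadata_file:
        json.dump({"last_line": len(lines), "mtime": mtime}, metadata_file)

    return new_lines
```

A scheduled monitor would call this once per interval and push the returned lines only when the list is non-empty. Reading the whole file on every pass keeps the sketch short; a byte offset would scale better for large logs.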
Re: Review Request 27396: Pushing component logs to HDFS from ambari-agent
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27396/#review60601
---

Ship it!

Ship It!

- Nate Cole
Re: Review Request 27396: Pushing component logs to HDFS from ambari-agent
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27396/
---

(Updated Nov. 7, 2014, 2:11 a.m.)

Changes
---
Nate, updated as per your comments. Can you please review?

Thanks,
Cabir Zounaidou
Re: Review Request 27396: Pushing component logs to HDFS from ambari-agent
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27396/
---

(Updated Nov. 5, 2014, 3:49 p.m.)

Changes
---
Fixed as per Nate's review comments.

Thanks,
Cabir Zounaidou
Re: Review Request 27396: Pushing component logs to HDFS from ambari-agent
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27396/
---

(Updated Nov. 4, 2014, 10:11 p.m.)

Changes
---
Fixed testcase failure.

Diffs (updated)
---
ambari-agent/src/main/python/ambari_agent/Controller.py dc3a1cf
ambari-agent/src/main/python/ambari_agent/LogSchedulerHandler.py PRE-CREATION
ambari-agent/src/main/python/ambari_agent/loghandler/__init__.py PRE-CREATION
ambari-agent/src/main/python/ambari_agent/loghandler/config.py PRE-CREATION
ambari-agent/src/main/python/ambari_agent/loghandler/hdfsapi.py PRE-CREATION
ambari-agent/src/main/python/ambari_agent/loghandler/monitor.py PRE-CREATION
ambari-agent/src/main/python/ambari_agent/loghandler/util.py PRE-CREATION
ambari-agent/src/test/python/ambari_agent/TestLogHandler.py PRE-CREATION
ambari-agent/src/test/python/ambari_agent/dummy_files/log_handler_config.json PRE-CREATION

Diff: https://reviews.apache.org/r/27396/diff/

Thanks,
Cabir Zounaidou
Re: Review Request 27396: Pushing component logs to HDFS from ambari-agent
On Nov. 3, 2014, 8:53 p.m., Jonathan Hurley wrote:
> What happens when this agent code is installed in a cluster that's not running HDFS (GlusterFS for example)?

Presently, the command will fail and will not push the log files to HDFS. For log pushing to work, HDFS must be running. The LogHandlerCommand on the ambari-server side can skip pushing the configs if the HDFS service is not available. In that case, no jobs will be scheduled.

On Nov. 3, 2014, 8:53 p.m., Jonathan Hurley wrote:
> ambari-agent/src/main/python/ambari_agent/loghandler/hdfsapi.py, line 78
> https://reviews.apache.org/r/27396/diff/2-3/?file=746522#file746522line78
>
> command_args (spelling)

Fixed in the latest patch.

- Cabir

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27396/#review59638
---
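
The exchange above concerns how the agent behaves when no HDFS is available and how the HDFS shell command is assembled. A hedged sketch of an agent-side guard is shown below. It is not the patch's `hdfsapi.py`: the function names are hypothetical, the use of `hdfs dfs -appendToFile` as the push command is an assumption, `shutil.which` requires Python 3.3 or later, and Kerberos (keytab) handling for secured clusters is not covered.

```python
import shutil
import subprocess


def build_append_command(local_patch, hdfs_path, run_as=None):
    """Build the argument list that appends a local file to an HDFS file."""
    command_args = ["hdfs", "dfs", "-appendToFile", local_patch, hdfs_path]
    if run_as:
        # Run as another user (for example 'hdfs'); assumes sudo is set up.
        command_args = ["sudo", "-u", run_as] + command_args
    return command_args


def push_patch(local_patch, hdfs_path, run_as=None):
    """Push a log patch; return False when no hdfs client is installed."""
    if shutil.which("hdfs") is None:
        # No HDFS client on this host (e.g. a GlusterFS-only cluster): skip.
        return False
    command_args = build_append_command(local_patch, hdfs_path, run_as)
    return subprocess.call(command_args) == 0
```

Passing the arguments as a list avoids shell quoting problems with log paths. Whether the skip belongs on the agent or on the ambari-server side is exactly the open question in the thread; this sketch only shows the agent-side variant.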
Re: Review Request 27396: Pushing component logs to HDFS from ambari-agent
On Nov. 3, 2014, 8:45 p.m., Nate Cole wrote:

Thanks for the review.

On Nov. 3, 2014, 8:45 p.m., Nate Cole wrote:
> ambari-agent/src/main/python/ambari_agent/loghandler/config.py, lines 27-49
> https://reviews.apache.org/r/27396/diff/2/?file=746521#file746521line27
>
> You should be using an AmbariConfig instance and create a section in that file to handle setting all these. AmbariConfig is the external property file for things like this.

Thanks for the suggestion. I will move this to AmbariConfig.

On Nov. 3, 2014, 8:45 p.m., Nate Cole wrote:
> ambari-agent/src/main/python/ambari_agent/loghandler/hdfsapi.py, lines 71-75
> https://reviews.apache.org/r/27396/diff/2/?file=746522#file746522line71
>
> We may have an HdfsExecute common library for things like this.

Will move to the common library.

On Nov. 3, 2014, 8:45 p.m., Nate Cole wrote:
> ambari-agent/src/main/python/ambari_agent/loghandler/monitor.py, lines 52-53
> https://reviews.apache.org/r/27396/diff/2/?file=746523#file746523line52
>
> This style is generally deprecated in favor of the string.format() method.

Fixed locally.

On Nov. 3, 2014, 8:45 p.m., Nate Cole wrote:
> ambari-agent/src/main/python/ambari_agent/loghandler/monitor.py, lines 166-173
> https://reviews.apache.org/r/27396/diff/2/?file=746523#file746523line166
>
> Use the with syntax.

Fixed locally.

On Nov. 3, 2014, 8:45 p.m., Nate Cole wrote:
> ambari-agent/src/main/python/ambari_agent/loghandler/monitor.py, lines 178-180
> https://reviews.apache.org/r/27396/diff/2/?file=746523#file746523line178
>
> No error handling here can leave an open file. Use this style and the file won't be left hanging.
>
>     with open(self.__metadata_path, 'w') as file:
>         file.write(...)

Fixed locally.

On Nov. 3, 2014, 8:45 p.m., Nate Cole wrote:
> ambari-agent/src/main/python/ambari_agent/loghandler/util.py, lines 30-33
> https://reviews.apache.org/r/27396/diff/2/?file=746524#file746524line30
>
> All you have is statics. This doesn't need to be in a class, can just be loghandler/util with methods in it.

Fixed.

On Nov. 3, 2014, 8:45 p.m., Nate Cole wrote:
> ambari-agent/src/main/python/ambari_agent/loghandler/monitor.py, lines 183-189
> https://reviews.apache.org/r/27396/diff/2/?file=746523#file746523line183
>
> See comment about AmbariConfig.

Will use AmbariConfig.

- Cabir

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27396/#review59598
---
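
Two of the style comments in this review (prefer `str.format()` over `%` formatting, and write files inside a `with` block) can be illustrated with a small sketch. The function names and the comma-separated metadata format are made up for the example and are not taken from `monitor.py`.

```python
def describe_monitor(component, log_dir):
    # str.format() instead of the older "%s in %s" % (a, b) style.
    return "Monitoring component {0} in {1}".format(component, log_dir)


def save_metadata(metadata_path, last_line, last_modified):
    # The with block closes the file even if write() raises an exception,
    # so no file handle is left open on error.
    with open(metadata_path, "w") as metadata_file:
        metadata_file.write("{0},{1}\n".format(last_line, last_modified))
```

Both helpers are plain module-level functions, which also matches the comment that a class holding only static methods can simply be a module.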
Re: Review Request 27396: Pushing component logs to HDFS from ambari-agent
On Nov. 1, 2014, 2:34 a.m., Jonathan Hurley wrote:

Jonathan, thanks very much for the review.

On Nov. 1, 2014, 2:34 a.m., Jonathan Hurley wrote:
> ambari-agent/src/main/python/ambari_agent/LogSchedulerHandler.py, line 52
> https://reviews.apache.org/r/27396/diff/2/?file=746519#file746519line52
>
> I believe that this will create your own scheduler instance (along with its own locks, pools, and jobstore). However, we might want to confirm this. At the very least, maybe the alert and log schedulers should define a jobstore aside from using the default name?

Yes, verified that each scheduler instance creates its own locks, thread pools and jobstore. I am not sure we need to specify the jobstore explicitly; I verified that a new jobstore instance is created for each scheduler instance.

On Nov. 1, 2014, 2:34 a.m., Jonathan Hurley wrote:
> ambari-agent/src/main/python/ambari_agent/LogSchedulerHandler.py, line 167
> https://reviews.apache.org/r/27396/diff/2/?file=746519#file746519line167
>
> I know the documentation says that you can give the job a name on creation; I wasn't able to get it to work; I'm guessing you weren't either.

I am able to get the job name to work. Will fix the code to use the optional name parameter:

    scheduler.add_interval_job(callback, seconds=self.__interval, name=jobname)

On Nov. 1, 2014, 2:34 a.m., Jonathan Hurley wrote:
> ambari-agent/src/main/python/ambari_agent/loghandler/config.py, line 61
> https://reviews.apache.org/r/27396/diff/2/?file=746521#file746521line61
>
> Perhaps log what component name you were looking for? Same for other areas where an Exception is constructed due to lack of a required parameter.

The exception is raised because the component name is missing. Changed the message to 'component name required'.

On Nov. 1, 2014, 2:34 a.m., Jonathan Hurley wrote:
> ambari-agent/src/main/python/ambari_agent/LogSchedulerHandler.py, line 179
> https://reviews.apache.org/r/27396/diff/2/?file=746519#file746519line179
>
> This will produce data in ambari-agent.out - I think we'd want to keep this in ambari-agent.log by using logger.exception(). Same goes for other areas where you have traceback.

Changed to log an info message, because this file does not have to exist and will be generated automatically by the execution command.

On Nov. 1, 2014, 2:34 a.m., Jonathan Hurley wrote:
> ambari-agent/src/main/python/ambari_agent/loghandler/hdfsapi.py, line 31
> https://reviews.apache.org/r/27396/diff/2/?file=746522#file746522line31
>
> I'm not sure this command will work for all deployments. In some cases, the hdfs user is the only one that can run these commands; and in security-enabled environments, that means using keytabs. Can you verify that this command will work in the above scenarios? Same goes for other hdfs commands.

Good point. Verified that it is an issue for the 'ambari' user running hdfs commands in secured mode. I will check whether using the 'hdfs' user is okay for this use case.

On Nov. 1, 2014, 2:34 a.m., Jonathan Hurley wrote:
> ambari-agent/src/main/python/ambari_agent/LogSchedulerHandler.py, line 54
> https://reviews.apache.org/r/27396/diff/2/?file=746519#file746519line54
>
> make_cachedir seems to always be True; is it supposed to be configurable?

Made it an input parameter.

- Cabir

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27396/#review59451
---
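
The `logger.exception()` suggestion in this review can be illustrated with the standard `logging` module. The sketch below is an assumption-laden example, not the patch's code: `load_saved_state` and its message are hypothetical, and the author ultimately chose an info message instead because the file is allowed to be missing.

```python
import logging

logger = logging.getLogger(__name__)


def load_saved_state(path):
    """Return the saved state text, or None when the file cannot be read."""
    try:
        with open(path) as state_file:
            return state_file.read()
    except IOError:
        # logger.exception() logs at ERROR level and appends the current
        # traceback, sending it through the configured log handlers rather
        # than printing it to stderr.
        logger.exception("Unable to read state file {0}".format(path))
        return None
```

With the agent's logging configured to write to a file, the traceback would land in that log instead of the process's captured standard error output.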
Re: Review Request 27396: Pushing component logs to HDFS from ambari-agent
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27396/
---

(Updated Nov. 3, 2014, 8:30 p.m.)

Changes
---
Addressed Jonathan's review comments.

Thanks,
Cabir Zounaidou
Re: Review Request 27396: Pushing component logs to HDFS from ambari-agent
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27396/
---

(Updated Oct. 31, 2014, 11:50 p.m.)

Changes
---
Added update_config method to reschedule jobs.

Thanks,
Cabir Zounaidou
Re: Review Request 27396: Pushing component logs to HDFS from ambari-agent
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27396/#review59451
---

ambari-agent/src/main/python/ambari_agent/LogSchedulerHandler.py
https://reviews.apache.org/r/27396/#comment100733

    I believe that this will create your own scheduler instance (along with its own locks, pools, and jobstore). However, we might want to confirm this. At the very least, maybe the alert and log schedulers should define a jobstore aside from using the default name?

ambari-agent/src/main/python/ambari_agent/LogSchedulerHandler.py
https://reviews.apache.org/r/27396/#comment100732

    make_cachedir seems to always be True; is it supposed to be configurable?

ambari-agent/src/main/python/ambari_agent/LogSchedulerHandler.py
https://reviews.apache.org/r/27396/#comment100735

    Why instantiate a new Scheduler here? start(self) also creates a new Scheduler. Maybe this should set Scheduler to None and then have start(self) check for None to instantiate one?

ambari-agent/src/main/python/ambari_agent/LogSchedulerHandler.py
https://reviews.apache.org/r/27396/#comment100736

    I know the documentation says that you can give the job a name on creation; I wasn't able to get it to work; I'm guessing you weren't either.

ambari-agent/src/main/python/ambari_agent/LogSchedulerHandler.py
https://reviews.apache.org/r/27396/#comment100734

    This will produce data in ambari-agent.out - I think we'd want to keep this in ambari-agent.log by using logger.exception(). Same goes for other areas where you have traceback.

ambari-agent/src/main/python/ambari_agent/loghandler/config.py
https://reviews.apache.org/r/27396/#comment100737

    Perhaps log what component name you were looking for? Same for other areas where an Exception is constructed due to lack of a required parameter.

ambari-agent/src/main/python/ambari_agent/loghandler/hdfsapi.py
https://reviews.apache.org/r/27396/#comment100738

    I'm not sure this command will work for all deployments. In some cases, the hdfs user is the only one that can run these commands; and in security-enabled environments, that means using keytabs. Can you verify that this command will work in the above scenarios? Same goes for other hdfs commands.

- Jonathan Hurley
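
The suggestion to set the scheduler to None and instantiate it only in start() could look roughly like the sketch below. The class `LazySchedulerHandler` and its `scheduler_factory` parameter are hypothetical names for illustration; in the agent the factory would presumably be APScheduler's `Scheduler` class, which is an assumption here and is not imported so that the example stays self-contained.

```python
class LazySchedulerHandler(object):
    """Hypothetical handler that creates its scheduler only when started."""

    def __init__(self, scheduler_factory):
        # scheduler_factory: any zero-argument callable that returns an
        # object with start() and shutdown() methods (assumption).
        self._scheduler_factory = scheduler_factory
        self._scheduler = None

    def start(self):
        # Create and start a scheduler only if none is currently active,
        # so calling start() twice does not build a second instance.
        if self._scheduler is None:
            self._scheduler = self._scheduler_factory()
            self._scheduler.start()

    def stop(self):
        # Shut down and forget the scheduler so a later start() makes a
        # fresh one.
        if self._scheduler is not None:
            self._scheduler.shutdown()
            self._scheduler = None
```

Taking the factory as a parameter also makes the handler easy to unit test with a fake scheduler, without threads or a real job store.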