Hi,

W. D. Sadeep wrote on Tue, Oct 28, 2025 at 09:21:36PM +0800:
> I'm thinking of parsing the /var/www/logs/access.log from httpd for
> purposes like identifying bot activity using fgrep, grep, cut, sed,
> sort, and uniq.

In general, parsers are notorious for inviting bugs, and bugs are
notorious for causing security issues. For that reason, in many
programs that need parsers, the parsers are the parts that you want
to run with the least privilege.

> Is it safe to do that in a cron job?
>
> I see requests that appear to embed scripts. So, I'm wondering if
> it's naive to parse them like that.

Writing a log parser in the sh(1) language - and your question above
sounds a bit as if that is what you are planning to do - does not
strike me as a particularly wise choice. The sh(1) language is
notorious for meta-character, word-splitting, whitespace, and quoting
issues, so a shell script is rarely good for security. Picking a
safer language may be better even if you are quite experienced with
shell programming.
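Just to illustrate the idea, here is a minimal, untested sketch in
Perl that counts requests per client address. It assumes the client
address is the first whitespace-separated field of each log line, as
in the common and combined log formats, and it treats the rest of the
line as opaque, untrusted bytes:

  #!/usr/bin/perl
  use strict;
  use warnings;

  # Count requests per client address. Nothing from the log line
  # is ever passed to a shell or otherwise interpreted; the line
  # is only matched and counted.
  my %count;
  while (my $line = <STDIN>) {
      my ($addr) = $line =~ /^(\S+)/ or next;
      $count{$addr}++;
  }
  printf "%8d %s\n", $count{$_}, $_
      for sort { $count{$b} <=> $count{$a} } keys %count;

Running it requires no arguments that a shell might mangle (the
script name is made up, of course):

  $ perl count_clients.pl < /var/www/logs/access.log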
> If that's the case, are there any precautions I can take?

Use a dedicated user account that has no access to anything else,
such that, if the parser spirals out of control, the worst that can
happen is that your log report gets corrupted. In particular, do not
run the parser as the root or daemon or www user or any other system
user, and make sure that the user you set up for that purpose cannot
write to the /var/www/logs/ directory, where it might otherwise
destroy the very logs it was supposed to merely read.
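For example, roughly like this, as root (the account name
"_logreport" and all paths are merely placeholders; also check that
the new account can actually read the log file, and that it cannot
write to /var/www/logs/):

  # useradd -m -s /sbin/nologin _logreport
  # crontab -u _logreport -e

with a single crontab line such as:

  30 5 * * * perl /home/_logreport/count_clients.pl < /var/www/logs/access.log > /home/_logreport/report.txt

That way, the parser runs once a day with no privileges beyond
reading the log and writing the report into its own home directory.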
Also, many log analysis programs exist - maybe one of the existing
programs fits your needs?

I admit that more than twenty years ago, i wrote a log parser for
some machines i was running back then (and i used Perl, which was
almost certainly a reasonable choice), but i'm no longer sure that
rolling my own was a particularly good idea. Anyway, i never
published it, and it certainly wasn't good enough for publishing.

Yours,
  Ingo