Hi Bob, and welcome! My responses are interleaved with yours, below.

On Fri, May 02, 2014 at 11:19:26PM +0100, Bob Williams wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi,
>
> I'm fairly new to coding and python. My system is linux (openSUSE
> 13.1).

Nice to know. And I see you have even more information about your system in your email signature, including your email client and uptime. But what you don't tell us is what version of Python you're using. I'm going to guess that it is something in the 3.x range, since you call print as a function rather than a statement, but I can't be sure. Fortunately in this case I don't think the exact version matters.

[...]
> fullPath = [] # declare (initially empty) lists
> truncPath = []
>
> with codecs.open('/var/log/rsyncd.log', 'r') as rsyncd_log:
>     for line in rsyncd_log.readlines():
>         fullPath += [line.decode('utf-8', 'ignore').strip()]

A small note about performance here. If your log files are very large (say, hundreds of thousands or millions of lines) this part will use more time and memory than it needs to. There are two problems.

First, rsyncd_log.readlines() reads the entire file in one go. Since you then copy every line into fullPath, you end up with two large lists of lines. There are ways to solve that and process the lines lazily, one line at a time, without needing to store the whole file.

Second, there is this:

    fullPath += [line.decode('utf-8', 'ignore').strip()]

which builds a throwaway one-item list for every line in the log. It also sits right next to a genuine trap: the spelled-out form, fullPath = fullPath + [...], copies the whole of fullPath every time through the loop, and that is an O(N**2) algorithm. (On a list, += extends in place, so your version escapes the worst of it, but the two look so alike that it is worth knowing the difference.) Do you know that terminology? Very briefly:

O(1) means approximately constant time: tripling the size of the input makes no difference to the processing time.

O(N) means linear time: tripling the input triples the processing time.

O(N**2) means quadratic time: tripling the input increases the processing time not by a factor of three, but by a factor of three squared, or nine.

With small files and fast computers, you won't notice. But with huge files and a slow computer, that could be painful.
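
If you want to see the quadratic growth in numbers, here is a rough back-of-the-envelope model, not a benchmark. It assumes that rebuilding a list with items = items + [x] copies every existing item plus the new one, and that append writes each item once (ignoring the occasional resize). The function names are just ones I made up for illustration.

```python
def copies_when_rebuilding(n):
    """Count item copies if every step does: items = items + [x]."""
    copies = 0
    size = 0
    for _ in range(n):
        copies += size + 1  # all the old items are copied, plus the new one
        size += 1
    return copies


def copies_when_appending(n):
    """Count item writes if every step does: items.append(x)."""
    return n  # one write per item (assumption: resizes are ignored)


for n in (1000, 3000):
    print(n, copies_when_rebuilding(n), copies_when_appending(n))
```

Tripling n triples the append count, but the rebuilding count goes up by a factor of about nine.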

Instead, a better approach is:

    fullPath.append(line.decode('utf-8', 'ignore').strip())

which avoids the O(N**2) performance trap.

> if fullPath[-1][0:10] == today:
>     print("\n Rsyncd.log has been modified in the last 24 hours...")
> else:
>     print("\n No recent rsync activity. Nothing to do.\n")
>     sys.exit()
>
> # Search for lines starting with today's date and containing 'recv'
> # Strip everything up to and including 'recv' and following last '/'
> path separator
> for i in range(0, len(fullPath)):
>     if fullPath[i][0:10] == today and 'recv' in fullPath[i]:
>         print("got there")
>         begin = fullPath[i].find('recv ')
>         end = fullPath[i].rfind('/')
>         fullPath[i] = fullPath[i][begin+5:end]
>         truncPath.append(fullPath[i])
>         print(" ...and the following new albums have been added:\n")
>     else:
>         print(" ...but no new music has been downloaded.\n")
>         sys.exit()

Now at last we get to your immediate problem: the above is intended to iterate over the lines of fullPath. But it starts at the beginning of the file, which may not be today. The first time you hit a line which is not from today, or which doesn't contain 'recv', the else branch runs and the program exits before it gets a chance to advance to the more recent days. That probably means that it looks at the first line in the log, determines that it is not today, and exits.

I'm going to suggest a more streamlined algorithm. Most of it is actual Python code, assuming you're using Python 3. Only the "process this line" part needs to be re-written.

    new_activity = False  # Nothing has happened today.
    with open('/var/log/rsyncd.log', 'r',
              encoding='utf-8', errors='ignore') as rsyncd_log:
        for line in rsyncd_log:
            line = line.strip()
            if line[0:10] == today and 'recv' in line:
                new_activity = True
                process this line  # <== fix this
    if not new_activity:
        print("no new albums have been added today")

This has the benefit that every line is touched only once, not three times as in your version. Performance is linear in the number of lines, and the whole file never has to sit in memory at once. You should be able to adapt this to your needs.
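
In case it helps, here is one possible way to fill in the "process this line" step, reusing your own find/rfind slicing. Treat it as a sketch rather than the finished program: the function name new_albums, the sample log lines and the date format are all invented for illustration (I haven't seen your real log), and skipping duplicate album paths is my own assumption about what you want.

```python
def new_albums(lines, today):
    """Return the album paths received today, each listed once."""
    albums = []
    seen = set()  # assumption: report each album once, not once per track
    for line in lines:
        line = line.strip()
        if line[0:10] == today and 'recv' in line:
            begin = line.find('recv ')
            end = line.rfind('/')
            if begin == -1 or end < begin + 5:
                continue  # no path after 'recv ', nothing to extract
            album = line[begin + 5:end]
            if album not in seen:
                seen.add(album)
                albums.append(album)
    return albums


# Made-up sample lines; the real rsyncd.log layout may differ.
sample = [
    "2014/05/01 09:00:00 recv Old Artist/Old Album/01 Track.flac",
    "2014/05/02 10:00:00 recv Some Artist/Some Album/01 Track.flac",
    "2014/05/02 10:00:05 recv Some Artist/Some Album/02 Track.flac",
    "2014/05/02 10:01:00 building file list",
]

albums = new_albums(sample, "2014/05/02")
if albums:
    print("...and the following new albums have been added:")
    for album in albums:
        print("   ", album)
else:
    print("no new albums have been added today")
```

Because the function only loops over its first argument, you can hand it the open file object (rsyncd_log) instead of a list, and the lines will still be read lazily, one at a time.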

Good luck, and feel free to ask questions!

--
Steven