Re: [Tutor] saveLine decked

2018-11-11 Thread Avi Gross
Peter,

Appreciated. I wrote something like this in another message before reading
yours. Indeed one of the things I found was the deque class in the
collections module. 

But I was not immediately clear on whether that would be directly
applicable. Their maximum sounded like if you exceeded it, it might either
reject the addition or throw an error. The behavior I wanted was sort of a
sliding window protocol where the oldest entry scrolled off the screen or
was simply removed. Sort of like what you might do with a moving average
that takes the average of just the last 20 days of a stock price.

But I searched some more and stand corrected. 

"New in version 2.4.

If maxlen is not specified or is None, deques may grow to an arbitrary
length. Otherwise, the deque is bounded to the specified maximum length.
Once a bounded length deque is full, when new items are added, a
corresponding number of items are discarded from the opposite end. Bounded
length deques provide functionality similar to the tail filter in Unix. They
are also useful for tracking transactions and other pools of data where only
the most recent activity is of interest."

That sounds exactly like what is needed. As long as you keep adding at the
end (using the append method) it should eventually remove from the beginning
automatically.  No need to use count and selectively remove or pop manually.
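
To tie this back to the moving-average comparison, here is a minimal sketch of that idea with a bounded deque. The function name moving_averages and the sample prices are made up for illustration; only deque and its maxlen argument come from the library.

```python
from collections import deque

def moving_averages(prices, window=20):
    """Yield the average of the most recent `window` prices after each new price."""
    recent = deque(maxlen=window)   # the oldest price falls off automatically
    for price in prices:
        recent.append(price)
        yield sum(recent) / len(recent)

# A window of 2 keeps the example small.
print(list(moving_averages([10, 20, 30, 40], window=2)))
# [10.0, 15.0, 25.0, 35.0]
```

Note that nothing in the loop counts entries or removes anything by hand.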

And despite all the additional functionality, I suspect it is tuned and
perhaps has parts written in C for added speed. I do note that any larger
log file used in the application discussed may throw things on the deque
many times but only ask it to display rarely, so the former should be
optimized.

But one question before I go, Columbo style. The manual page
(https://docs.python.org/2/library/collections.html ) suggests you call deque
with an iterable. That would not necessarily meet our need, as giving it the
entire file would just grind through every line without any logic and keep
just the last N lines. We could arrange the logic in our own iterator, such
as a generator function that reads a line at a time from the open file and
yields each line, but that too is problematic as to how and when you stop
and print the results. But on second look, the iterable is optional and I
tried creating a deque using just a maxlen=3 argument for illustration.

>>> from collections import deque

>>> a=deque(maxlen=3)
>>> a
deque([], maxlen=3)
>>> a.append('line 1\n')
>>> a
deque(['line 1\n'], maxlen=3)
>>> a.append('line 2\n')
>>> a.append('line 3\n')
>>> a
deque(['line 1\n', 'line 2\n', 'line 3\n'], maxlen=3)
>>> a.append('line N\n')
>>> a
deque(['line 2\n', 'line 3\n', 'line N\n'], maxlen=3)

OK, that looks right so all you need to figure out is how to print it in a
format you want.
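
For the log-file case, a rough sketch of appending by hand and stopping at the first problem line might look like this. The function name context_before_error, the "ERROR" marker and the sample log lines are all my own assumptions, not anything from the original program.

```python
from collections import deque

def context_before_error(lines, n=3, marker="ERROR"):
    """Return up to the last n lines seen, ending with the first line containing marker."""
    recent = deque(maxlen=n)
    for line in lines:
        recent.append(line)
        if marker in line:
            return list(recent)
    return []   # no line contained the marker

log = ["start\n", "step 1\n", "step 2\n", "ERROR: boom\n", "cleanup\n"]

print(context_before_error(log))
# ['step 1\n', 'step 2\n', 'ERROR: boom\n']

# Handing the whole iterable to deque keeps the last n lines of everything instead.
print(list(deque(log, maxlen=3)))
# ['step 2\n', 'ERROR: boom\n', 'cleanup\n']
```

The second call shows why passing the whole file to deque is not the same thing: it has no way to stop at the error.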

As it happens, deque has an str and a repr that seem the same when I try to
print:

>>> a.__str__()
"deque(['line 2\\n', 'line 3\\n', 'line N\\n'], maxlen=3)"
>>> a.__repr__()
"deque(['line 2\\n', 'line 3\\n', 'line N\\n'], maxlen=3)"

So you either need to subclass deque to get your own printable version (or
use any number of other Python tricks, since you can), or do something
manually.

>>> for line in a: print(line)

line 2

line 3

line N

OK, that works but my \n characters at the end of some items might suggest
using end='' in the 3.X version of print for a smaller display.
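
As for the subclassing route, a minimal sketch could look like the following. The class name LineBuffer is hypothetical and this is only one way to do it; it overrides str() and leaves everything else to deque.

```python
from collections import deque

class LineBuffer(deque):
    """A deque whose str() is the stored lines joined together (hypothetical helper)."""
    def __str__(self):
        return "".join(self)

buf = LineBuffer(maxlen=3)
for text in ("line 1\n", "line 2\n", "line 3\n", "line N\n"):
    buf.append(text)

# print() calls str(), and end="" avoids a doubled newline after the last line.
print(buf, end="")
```

Since each stored line already ends in \n, joining with the empty string is enough.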

Summary: the method Peter mentions is a decent solution with no programming
or debugging overhead. It is even flexible enough, if you choose, to store
or display the lines backwards as in showing the last line that showed the
error, followed by successively earlier lines. 
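
A small sketch of both backwards options, reusing the sample lines from above (the variable names are just for illustration):

```python
from collections import deque

recent = deque(["line 2\n", "line 3\n", "line N\n"], maxlen=3)

# Display newest first without changing the stored order.
for line in reversed(recent):
    print(line, end="")

# Or store newest first: appendleft() discards from the right end when full.
newest_first = deque(maxlen=3)
for line in ["line 1\n", "line 2\n", "line 3\n", "line N\n"]:
    newest_first.appendleft(line)

print(list(newest_first))
# ['line N\n', 'line 3\n', 'line 2\n']
```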

Why use a limited solution when you can play with a full deck?





-Original Message-
From: Tutor  On Behalf Of
Peter Otten
Sent: Sunday, November 11, 2018 2:43 PM
To: tutor@python.org
Subject: Re: [Tutor] saveLine


Re: [Tutor] saveLine

2018-11-11 Thread Peter Otten
Avi Gross wrote:

> Alan and others have answered the questions posed and what I am asking now
> is to look at the function he proposed to keep track of the last five
> lines.
> 
> There is nothing wrong with it but I wonder what alternatives people would
> prefer. His code is made for exactly 5 lines to be buffered and is quite
> efficient. But what if you wanted N lines buffered, perhaps showing a
> smaller number of lines on some warnings or errors and the full N in other
> cases?

The standard library features collections.deque. With that:

buffer = collections.deque(maxlen=N)
save_line = buffer.append

This will start with an empty buffer. To preload the buffer:

buffer = collections.deque(itertools.repeat("", N), maxlen=N)

To print the buffer:

print_buffer = sys.stdout.writelines

or, more general:

def print_buffer(items, end=""):
    for item in items:
        print(item, end=end)

Also, for smallish N:

def print_buffer(items, end=""):
    print(*items, sep=end)
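
The printing variants above are not quite interchangeable, and a quick experiment shows where they differ. This is only a sketch: print_loop, print_star and captured are names invented here so the versions can sit side by side, and the output is captured in a string purely so it can be shown with repr().

```python
import contextlib
import io
import sys
from collections import deque

def print_loop(items, end=""):
    for item in items:
        print(item, end=end)

def print_star(items, end=""):
    print(*items, sep=end)

def captured(func, items):
    """Run func(items) and return what it wrote to standard output."""
    out = io.StringIO()
    with contextlib.redirect_stdout(out):
        func(items)
    return out.getvalue()

buffer = deque(["a\n", "b\n"], maxlen=2)

print(repr(captured(lambda items: sys.stdout.writelines(items), buffer)))  # 'a\nb\n'
print(repr(captured(print_loop, buffer)))                                  # 'a\nb\n'
print(repr(captured(print_star, buffer)))                                  # 'a\nb\n\n'
```

The print(*items, sep=end) form puts end between the items only, and print() then adds its own newline after the last one, hence the extra \n when the lines already carry one.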

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] saveLine

2018-11-11 Thread Avi Gross
Alan and others have answered the questions posed and what I am asking now
is to look at the function he proposed to keep track of the last five lines.

There is nothing wrong with it but I wonder what alternatives people would
prefer. His code is made for exactly 5 lines to be buffered and is quite
efficient. But what if you wanted N lines buffered, perhaps showing a
smaller number of lines on some warnings or errors and the full N in other
cases?

Here is Alan's code for comparison:

buffer = ['','','','','']

def saveLine(line, buff):
    buff[0] = buff[1]
    buff[1] = buff[2]
    buff[2] = buff[3]
    buff[3] = buff[4]
    buff[4] = line

Again, that works fine. If N was not 5, I would suggest initializing the
buffer might look like this:

buffed = 5
buffer = [''] * buffed

Then instead of changing the list in place, I might change the function to
return a new list created by taking the slice containing all but the
first item and concatenating the new entry onto it:

def saveLineN(line, buff):
    buff = buff[1:] + [line]
    return buff

Clearly less efficient but more general. And, yes, the return statement
could be the entire function as in:

def saveLineN(line, buff):
    return buff[1:] + [line]


Here is a transcript of it running:

>>> buffer
['', '', '', '', '']
>>> saveLineN('a', buffer)
['', '', '', '', 'a']
>>> buffer = saveLineN('a', buffer)
>>> buffer
['', '', '', '', 'a']
>>> buffer = saveLineN('b', buffer)
>>> buffer = saveLineN('c', buffer)
>>> buffer = saveLineN('d', buffer)
>>> buffer = saveLineN('e', buffer)
>>> buffer
['a', 'b', 'c', 'd', 'e']
>>> buffer = saveLineN('6th', buffer)
>>> buffer
['b', 'c', 'd', 'e', '6th']

So perhaps making the changes in place might make sense.

buff.pop(0) would remove the zeroth item, with the side effect of returning
that item, which can simply be ignored.

>>> buffer = ['a', 'b', 'c', 'd', 'e']
>>> buffer.pop(0)
'a'
>>> buffer
['b', 'c', 'd', 'e']

And it can be extended in-line:

>>> buffer.append('6th')
>>> buffer
['b', 'c', 'd', 'e', '6th']

Sorry, I mean appended, not extended! LOL!

So to make this compact and less wasteful, I consider using del buffer[0]:

>>> del buffer[0]
>>> buffer
['b', 'c', 'd', 'e']

So here is a version that might be more efficient. It deletes the first
item/line of the buffer, then appends the new line in place:

def saveLineN2(line, buff):
    del buff[0]
    buff.append(line)

Here is a transcript of it in use, using N=3 to be different:

>>> buffed = 3
>>> buffer = [''] * buffed
>>> buffer
['', '', '']
>>> saveLineN2('First Line\n', buffer)
>>> buffer
['', '', 'First Line\n']
>>> saveLineN2('Second Line\n', buffer)
>>> saveLineN2('Third Line\n', buffer)
>>> buffer
['First Line\n', 'Second Line\n', 'Third Line\n']
>>> saveLineN2('nth Line\n', buffer)
>>> buffer
['Second Line\n', 'Third Line\n', 'nth Line\n']

I can think of many other ways to do this, arguably some are more weird than
others. There is the obvious one which does all the changes in one line as
in:

buff[0],buff[1],buff[2] = buff[1],buff[2],line

Of course, for 5 you change that a bit. Might even be a tad more efficient.
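
If you want that one-line shuffle without writing out every index, slice assignment is one way to get it for any N. This is just a sketch, and the name saveLineSlice is made up here.

```python
def saveLineSlice(line, buff):
    """Shift everything left by one and put line last, changing buff in place."""
    buff[:] = buff[1:] + [line]   # slice assignment keeps the same list object

buffer = ['a', 'b', 'c']
saveLineSlice('d', buffer)
print(buffer)
# ['b', 'c', 'd']
```

It still builds a temporary list on the right-hand side, so I would not claim it is any faster.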

There is also the odd concept of not scrolling along but dealing with things
at print time. I mean you can have a list with N entries and a variable
that holds an index from 0 to N-1. You store a new line at buffer[index] each
time, then you increment index modulo N, which wraps it back to 0 after every
N stores. At print time, you print buffer[index:] + buffer[:index] and you
have the same result.
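
A minimal sketch of that index-based idea, wrapped in a small class so the list and the index travel together. The class and method names are invented for illustration.

```python
class RingBuffer:
    """Fixed-size buffer that overwrites the oldest slot (illustrative sketch)."""

    def __init__(self, n):
        self.slots = [""] * n
        self.index = 0          # position of the next write, always 0..n-1

    def save(self, line):
        self.slots[self.index] = line
        self.index = (self.index + 1) % len(self.slots)

    def in_order(self):
        # The slot at self.index holds the oldest entry, so start there.
        return self.slots[self.index:] + self.slots[:self.index]

ring = RingBuffer(3)
for line in ["a", "b", "c", "d"]:
    ring.save(line)

print(ring.slots)       # ['d', 'b', 'c']
print(ring.in_order())  # ['b', 'c', 'd']
```

Nothing moves when a line is saved; the reordering cost is only paid at print time.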

Does anyone have comments on what methods may be better for some purposes or
additional ways to do this? I mean besides reading all lines into memory and
holding on to them and indexing backwards.

For that matter, you can revisit the question of using a list of lines and
consider a dictionary variant which might work better for larger values of
N. No need to slide a buffer window along, just maintain a modular index and
overwrite the key value as in buffer[index] = line
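
Here is roughly what I mean by the dictionary variant, as a sketch only. The helper names save_line and oldest_first are hypothetical, and I have not measured whether a dictionary actually beats a list here.

```python
def save_line(line, buff, count, n):
    """Store line under key count % n and return the updated count."""
    buff[count % n] = line
    return count + 1

def oldest_first(buff, count, n):
    """Return the stored lines from oldest to newest, skipping unused keys."""
    keys = ((count + i) % n for i in range(n))
    return [buff[k] for k in keys if k in buff]

N = 3
buffer, count = {}, 0
for line in ["a", "b", "c", "d", "e"]:
    count = save_line(line, buffer, count, N)

print(oldest_first(buffer, count, N))
# ['c', 'd', 'e']
```

One small bonus is that keys that were never written simply do not exist, so there are no blank placeholders to skip when fewer than N lines have arrived.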

One more comment, if I may. 

Alan mentions but does not define a printBuffer() function. Using any of the
methods above, you can end up with an error happening early on so the kind
of fixed-length buffer mentioned above contains blank entries (not even a
'\n') so it probably should suppress printing items of zero length or that
are empty. And, in some of the methods shown above, it may be worth starting
with an empty buffer and adding lines up to some N and only then removing
the first entry each time. That would complicate the code a bit but make
printing trivial.
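
Both ideas in that last paragraph can be sketched briefly. The names save_line_growing and print_nonempty are made up for illustration, and Alan's actual printBuffer() may well look different.

```python
def save_line_growing(line, buff, n):
    """Append line; once buff holds more than n items, drop the oldest."""
    buff.append(line)
    if len(buff) > n:
        del buff[0]

def print_nonempty(buff):
    """Print the buffered lines, skipping empty placeholders."""
    for line in buff:
        if line:
            print(line, end="")

buffer = []
for line in ["one\n", "two\n"]:
    save_line_growing(line, buffer, 3)
print(buffer)            # ['one\n', 'two\n'] with no blank padding

print_nonempty(["", "", "First Line\n"])   # prints only First Line
```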




