If the consensus is "Let's add ten lines to the recipes" I'm all aboard,
ignore the rest:
if I could have googled a good answer I would have stopped there. I won't
argue the necessity or obviousness of itertools.groupby, just its name:
* I myself am a false negative that wanted the RLE behavior
If you understand what iterators do, the fact that itertools.groupby
collects contiguous elements is both obvious and necessary. Iterators
might be infinitely long... you cannot ask for every "A" that might
eventually occur in an infinite sequence of letters.
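A small sketch of the point about contiguous grouping, for anyone reading this later. The helper name `runs` is made up for illustration and is not part of itertools; the sketch only assumes the documented behavior of `groupby`, `cycle`, and `islice`.

```python
from itertools import cycle, groupby, islice


def runs(iterable):
    """Lazily yield (value, run_length) for each contiguous run."""
    for key, group in groupby(iterable):
        # Count the run without ever looking past the next change of value.
        yield key, sum(1 for _ in group)


# The two separate runs of "A" are reported separately, not merged.
print(list(runs("AABAAA")))  # [('A', 2), ('B', 1), ('A', 3)]

# Because only contiguous elements are collected, this also works on an
# endless iterator, as long as we stop asking for runs at some point.
endless = cycle("AAB")
print(list(islice(runs(endless), 3)))  # [('A', 2), ('B', 1), ('A', 2)]
```

Collecting every "A" in `endless` could never finish, which is why grouping by contiguous runs is the only behavior an iterator-based tool can offer.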
On Sat, Jun 10, 2017 at 10:08 PM, Nea
Agreed to a degree about providing it as code, but it may also be worth
mentioning that zlib itself implements rle [1], and if there was ever
a desire to go "python all the way down" you need an RLE somewhere anyway
:)
That said, I'll be pretty happy with anything that replaces an hour of
go
11.06.17 05:20, Neal Fultz wrote:
I am very new to this, but on a different forum and after a couple
conversations, I really wished Python came with run-length encoding
built-in; after all, it ships with zip, which is much more complicated :)
The general idea is to be able to go back and forth
On 11 June 2017 at 13:35, David Mertz wrote:
> You are right. I made a thinko.
>
> List construction from an iterator is O(N) just as is `sum(1 for _ in it)`.
> Both of them need to march through every element. But as a constant
> multiplier, just constructing the list should be faster than need
In my experience, RLE isn't something you often find on its own.
Usually it's used as part of some compression scheme that also
has ways of encoding verbatim runs of data and maybe other
things.
So I'm skeptical that it can be usefully provided as a library
function. It seems more like a design p
On 11 June 2017 at 13:35, Neal Fultz wrote:
> Whoops, scratch that part about encode /decode.
Aye, decode is a relatively straightforward nested comprehension:
    def run_length_decode(iterable):
        return (item for item, item_count in iterable for __ in
                range(item_count))
It's only encod
On 11 June 2017 at 13:27, Joshua Morton wrote:
> David: You're absolutely right, s/2/3 in my prior post!
>
> Neal: As for why zip (at first I thought you meant the zip function, not the
> zip compression scheme) is included and rle is not, zip is (or was), I
> believe, used as part of python's pac
In the other direction, e.g.,
    def expand_rle(rle):
        from itertools import repeat, chain
        return list(chain.from_iterable(repeat(x, n) for x, n in rle))
Then
>>> expand_rle([('a', 5), ('bc', 3)])
['a', 'a', 'a', 'a', 'a', 'bc', 'bc', 'bc']
As to why zip is in the distribution,
I would also submit there's some value in the obvious readability of
z = runlength.encode(sequence)
vs
z = [(k, len(list(g))) for k, g in itertools.groupby(sequence)]
but that's my personal opinion. Everyone is welcome to use my code, but I
probably won't submit to pypi for a two function mo
God no! Not in the Python 2 docs! ... if the recipe belongs somewhere it's
in the Python 3 docs. Although, I suppose it could go under 2 also, since
it's not actually a behavior change in the feature-frozen interpreter. But
as a Python instructor (and someone who remembers the cool new features o
On 6/10/2017 11:27 PM, Joshua Morton wrote:
Neal: As for why zip (at first I thought you meant the zip function, not
the zip compression scheme) is included and rle is not, zip is (or was),
I believe, used as part of python's packaging infrastructure, hopefully
someone else can correct me if t
You are right. I made a thinko.
List construction from an iterator is O(N) just as is `sum(1 for _ in
it)`. Both of them need to march through every element. But as a constant
multiplier, just constructing the list should be faster than needing an
addition (Python append is O(1) because of smar
Whoops, scratch that part about encode /decode.
On Sat, Jun 10, 2017 at 8:33 PM, Neal Fultz wrote:
> Yes, I mean zip compression :)
>
> Also, everyone's been posting decode functions, but encode is a bit harder
> :).
>
> I think it should be equally easy to go one direction as the other.
> Hopef
Yes, I mean zip compression :)
Also, everyone's been posting decode functions, but encode is a bit harder
:).
I think it should be equally easy to go one direction as the other.
Hopefully this email chain builds up enough info to update the docs for
posterity / future me.
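For posterity, here is a minimal sketch of going in both directions, pieced together from the snippets posted in this thread. The names `rle_encode` and `rle_decode` are placeholders chosen for illustration, not an existing standard library API.

```python
from itertools import chain, groupby, repeat


def rle_encode(iterable):
    """Return a list of (item, count) pairs, one per contiguous run."""
    return [(key, len(list(group))) for key, group in groupby(iterable)]


def rle_decode(pairs):
    """Expand (item, count) pairs back into a flat list of items."""
    return list(chain.from_iterable(repeat(item, count) for item, count in pairs))


data = list("aaaabbc")
encoded = rle_encode(data)
print(encoded)                      # [('a', 4), ('b', 2), ('c', 1)]
print(rle_decode(encoded) == data)  # True
```

Both functions return lists for simplicity; a lazy version would return generators instead and could then handle iterators that are too long to hold in memory.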
On Sat, Jun 10, 2017 at
If what you really want is sparse matrices, you should use those:
https://docs.scipy.org/doc/scipy/reference/sparse.html.
Or maybe from the experimental Dask offshoot that I contributed a few lines
to: https://github.com/mrocklin/sparse.
Either of those will be about two orders of magnitude faste
David: You're absolutely right, s/2/3 in my prior post!
Neal: As for why zip (at first I thought you meant the zip function, not
the zip compression scheme) is included and rle is not, zip is (or was), I
believe, used as part of python's packaging infrastructure, hopefully
someone else can correct
On 2017-06-11 00:13, David Mertz wrote:
Bernardo Sulzbach posted a much prettier version than mine that is a bit
shorter. But his is also somewhat slower (and I believe asymptotically
so as the number of equal elements in subsequence goes up). He needs to
sum up a bunch of 1's repeatedly rath
Bernardo Sulzbach posted a much prettier version than mine that is a bit
shorter. But his is also somewhat slower (and I believe asymptotically so
as the number of equal elements in subsequence goes up). He needs to sum
up a bunch of 1's repeatedly rather than do the O(1) `len()` function.
For a
Thanks, that's cool. Maybe the root problem is that the docs aren't using
the right words when I google. Run-length-encoding is particularly relevant
for sparse matrices, but there's probably a library for those as well. On
the data science side of things, there's a few hundred R packages that us
Another is
[(k, len(list(g))) for k, g in groupby(l)]
It might be worth adding it to the list of recipes either at
https://docs.python.org/2/library/itertools.html#itertools.groupby or at
https://docs.python.org/2/library/itertools.html#recipes, though.
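If something like this did go into the recipes, there are at least two ways to count each run. The sketch below puts them side by side; the function names `rle_with_len` and `rle_with_sum` are invented here for comparison only.

```python
from itertools import groupby


def rle_with_len(iterable):
    # Materializes each run as a temporary list, then takes its length.
    return [(k, len(list(g))) for k, g in groupby(iterable)]


def rle_with_sum(iterable):
    # Counts each run element by element without building a list.
    return [(k, sum(1 for _ in g)) for k, g in groupby(iterable)]


sample = [1, 1, 1, 2, 2, 3, 1, 1]
print(rle_with_len(sample))                          # [(1, 3), (2, 2), (3, 1), (1, 2)]
print(rle_with_len(sample) == rle_with_sum(sample))  # True
```

The two give identical results. Which is faster in practice likely depends on run lengths and the interpreter, so it is worth measuring with `timeit` on representative data before picking one.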
On Sat, Jun 10, 2017 at 8:07 PM David Me
Here's a one-line version:
from itertools import groupby
rle_encode = lambda it: (
    (l[0], len(l)) for g in groupby(it) for l in [list(g[1])])
Since "not every one line function needs to be in the standard library" is
a guiding principle of Python, and even more so of `itertools`, probably
this
On 2017-06-10 23:20, Neal Fultz wrote:
Hello python-ideas,
I am very new to this, but on a different forum and after a couple
conversations, I really wished Python came with run-length encoding
built-in; after all, it ships with zip, which is much more complicated :)
The general idea is to
Hello python-ideas,
I am very new to this, but on a different forum and after a couple
conversations, I really wished Python came with run-length encoding
built-in; after all, it ships with zip, which is much more complicated :)
The general idea is to be able to go back and forth between two
rep