Re: [Python-ideas] Introduce typing.SupportsFsPath

2018-10-08 Thread Guido van Rossum
In typeshed there is os.PathLike which is close. You should be able to use
Union[str, os.PathLike[str]] for what you want (or define an alias).
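
A minimal sketch of such an alias follows. The names StrPath, ConfigLocation and describe are made up for illustration; the PathLike part is written as a string so the alias can also be defined on interpreters where os.PathLike is not subscriptable at runtime.

```python
import os
from typing import Union

# Hypothetical alias name; the quoted form avoids subscripting os.PathLike at runtime.
StrPath = Union[str, "os.PathLike[str]"]


class ConfigLocation:
    """Illustrative class implementing the PEP 519 protocol without pathlib."""

    def __init__(self, directory: str, name: str) -> None:
        self.directory = directory
        self.name = name

    def __fspath__(self) -> str:
        return os.path.join(self.directory, self.name)


def describe(path: StrPath) -> str:
    # os.fspath() accepts str, bytes, and any object with __fspath__.
    return os.path.basename(os.fspath(path))
```

Both describe("/etc/app.cfg") and describe(ConfigLocation("/etc", "app.cfg")) are accepted by the annotation.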

We generally don't want to add more things to typing that aren't closely
related to the type system. (Adding the io and re classes was already less
than ideal, and we don't want to do more of those.)

On Mon, Oct 8, 2018 at 3:10 PM  wrote:

> Hello,
>
> Since __fspath__ was introduced in PEP 519, it is possible to create
> classes whose instances represent file system paths.
> But there is no corresponding type in the "typing" module, so I
> cannot annotate functions that accept any object supporting
> the __fspath__ protocol.
>
> Please note that "Path" is not a replacement for "SupportsFsPath": the
> point of PEP 519 is that I can implement new classes (with no
> dependency on "Path") that implement the __fspath__ protocol.
>
> robert
>


-- 
--Guido van Rossum (python.org/~guido)
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Introduce typing.SupportsFsPath

2018-10-08 Thread robert . hoelzl

Hello,

Since __fspath__ was introduced in PEP 519, it is possible to create
classes whose instances represent file system paths.
But there is no corresponding type in the "typing" module, so I
cannot annotate functions that accept any object supporting
the __fspath__ protocol.

Please note that "Path" is not a replacement for "SupportsFsPath": the
point of PEP 519 is that I can implement new classes (with no
dependency on "Path") that implement the __fspath__ protocol.

robert


Re: [Python-ideas] Support parsing stream with `re`

2018-10-08 Thread cs

On 08Oct2018 13:36, Ram Rachum  wrote:

> I'm not an expert on memory. I used Process Explorer to look at the
> process. The Working Set of the current run is 11GB. The Private Bytes is
> 708MB. Actually, see all the info here:
> https://www.dropbox.com/s/tzoud028pzdkfi7/screenshot_TURING_2018-10-08_133355.jpg?dl=0

And the process's virtual size is about 353GB, which matches having your file
mmapped (its contents are now part of your process's virtual memory space).

> I've got 16GB of RAM on this computer, and Process Explorer says it's
> almost full, just ~150MB left. This is physical memory.


I'd say this is expected behaviour. As you access the memory it is paged into 
physical memory, and since it may be wanted again (the OS can't tell) it isn't 
paged out until that becomes necessary to make room for other virtual pages.


I suspect (but would need to test to find out) that sequentially reading the 
file instead of memory mapping it might not be so aggressive because your 
process would be reusing that same small pool of memory to hold data as you 
scan the file.
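
A rough sketch of what sequential reading could look like, so that only one chunk is held at a time. The function name and chunk size are arbitrary, and the approach is only valid for patterns that match a single byte, since a longer match straddling two chunks would be missed.

```python
import re


def count_matches_chunked(path, pattern=rb"\n", chunk_size=1 << 20):
    """Count matches of a single-byte pattern, reading the file chunk by chunk."""
    regex = re.compile(pattern)
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # the previous chunk becomes garbage here
            if not chunk:
                break
            total += sum(1 for _ in regex.finditer(chunk))
    return total
```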


Cheers,
Cameron Simpson 


Re: [Python-ideas] add a time decorator to timeit.py

2018-10-08 Thread Amjad Ben Hedhili
> Summary: Python's timeit.timeit() has an undocumented feature /
> implementation detail that gives much of what the original poster has
> asked for. Perhaps revising the docs will solve the problem.

Although timeit can be used with a callable, you need to wrap the call in a
lambda expression if the function takes arguments:
```
import timeit

def func_to_time(a, b):
    ...

# a and b must already be defined in the calling scope
timeit.timeit(lambda: func_to_time(a, b), globals=globals())
```
and you can't use it as a decorator.
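
To make the request concrete, here is a sketch of what such a decorator might look like. It is hypothetical and not part of the timeit module; the name timed and the last_elapsed attribute are invented for this example.

```python
import functools
import timeit


def timed(number=1000):
    """Hypothetical decorator factory; timeit itself offers nothing like this."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Time `number` calls with the given arguments via a zero-argument lambda.
            wrapper.last_elapsed = timeit.timeit(
                lambda: func(*args, **kwargs), number=number
            )
            # One extra call produces the return value for the caller.
            return func(*args, **kwargs)

        wrapper.last_elapsed = None
        return wrapper
    return decorator


@timed(number=100)
def func_to_time(a, b):
    return a + b
```

After func_to_time(2, 3), the total time for the 100 timed calls is available as func_to_time.last_elapsed. Note that the function body runs number + 1 times per call, which matters if it has side effects.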

From: Python-ideas on behalf of Jonathan Fine
Sent: Sunday, 7 October 2018 09:15
To: python-ideas
Subject: Re: [Python-ideas] add a time decorator to timeit.py

Summary: Python's timeit.timeit() has an undocumented feature /
implementation detail that gives much of what the original poster has
asked for. Perhaps revising the docs will solve the problem.

This thread has prompted me to look at timeit again. Usually, I look
at the command line help first.

>>> import timeit
>>> help(timeit)
Classes:
Timer
Functions:
timeit(string, string) -> float
repeat(string, string) -> list
default_timer() -> float

This time, to my surprise, I found the following works:

>>> def fn(): return 2 + 2
>>> timeit.timeit(fn)
0.10153918000287376

Until today, as I recall, I didn't know this.

Now for: https://docs.python.org/3/library/timeit.html

I don't see any examples there that show that timeit.timeit can take
a callable as its first argument. So my ignorance can, I hope, be
forgiven.

Now for: https://github.com/python/cpython/blob/3.7/Lib/timeit.py#L100

This contains, for both the stmt and setup parameters, explicit tests such as

    if isinstance(stmt, str):
        # string case
    elif callable(stmt):
        # callable case

So I think it's an undocumented feature, rather than an implementation detail.

And if you're a software historian, now perhaps look at
https://github.com/python/cpython/commits/3.7/Lib/timeit.py

And also, if you wish, for the tests for timeit.py.

--
Jonathan


Re: [Python-ideas] Why is design-by-contracts not widely adopted?

2018-10-08 Thread Terry Reedy

On 10/8/2018 10:26 AM, Steven D'Aprano wrote:

> On Sun, Oct 07, 2018 at 04:24:58PM -0400, Terry Reedy wrote:
>
>> https://www.win.tue.nl/~wstomv/edu/2ip30/references/design-by-contract/index.html
>>
>> defines contracts as "precise (legally unambiguous) specifications" (5.2
>> Business Contracting/Sub-contracting Metaphor)
>
> You are quoting that out of context. The full context says (emphasis
> added):
>
>     IN THE BUSINESS WORLD, contracts are precise (legally unambiguous)
>     specifications that define the obligations and benefits of the
>     (usually two) parties involved.


This is silly.  Every quote that is not complete is literally 'out of 
context'.  However, 'quoting out of context', in the colloquial sense, 
means selectively quoting so as to distort the original meaning, whereas 
I attempted to focus on the core meaning I was about to discuss.


Marko asked an honest question about why things obvious to him are not
obvious to others.  I attempted to give an honest answer.  If my answer
suggested that I have not understood Marko properly, as is likely, he
can use it as a hint as to how to communicate his position better.


>> I said above that functions may be specified by
>> process rather than result.
>
> Fine. What of it? Can you describe what the function does?
>
> "It sorts the list in place."
>
> "It deletes the given record from the database."

> These are all post-conditions.

No they are not.  They are descriptions of the process.  Additional 
mental work is required to turn them into formal descriptions of the 
result that can be coded.  Marko appears to claim that such coded formal 
descriptions are easier to read and understand than the short English 
description.  I disagree.  It is therefore not obvious to me that the 
extra work is worthwhile.



>> def append_first(seq):
>>     "Append seq[0] to seq."
>
> [...]

The snipped body (revised to omit irrelevant 'return'):

    seq.append(seq[0])

>> But with duck-typing, no post condition is possible.
>
> That's incorrect.
>
> def append_first(seq):
>     require:
>         len(seq) > 0

seq does not necessarily have a __len__ method.

>         hasattr(seq, "append")

The actual precondition is that seq[0] be in the domain of seq.append.
The only absolutely sure way to test this is to run the code body.  Or
one could run seq[0] and check it against the preconditions, if formally
specified, of seq.append.

>     ensure:
>         len(seq) == len(OLD.seq) + 1
>         seq[0] == seq[-1]

Not even all sequences implement negative indexing.

This is true for lists, as I said, but not for every object that meets
the preconditions.  As others have said, duck typing means that we don't
know what unexpected things methods of user-defined classes might do.


class Unexpected():
    def __init__(self, first):
        self.first = first
    def __getitem__(self, key):
        if key == 0:
            return self.first
        else:
            raise ValueError(f'key {key} does not equal 0')
    def append(self, item):
        if isinstance(item, int):
            self.last = item
        else:
            raise TypeError(f'item {item} is not an int')

def append_first(seq):
    seq.append(seq[0])

x = Unexpected(42)
append_first(x)
print(x.first, x.last)
# 42 42
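
As a rough illustration of the point, the sketch below writes the quoted require/ensure clauses as plain assert statements (an assumption; the thread's contract syntax is pseudocode) and applies them to a list and to a compact stand-in for the class above. The names append_first_checked and outcome are invented here.

```python
class Unexpected:
    """Compact stand-in for the class above: supports seq[0] and append only."""

    def __init__(self, first):
        self.first = first

    def __getitem__(self, key):
        if key == 0:
            return self.first
        raise ValueError(f"key {key} does not equal 0")

    def append(self, item):
        self.last = item


def append_first_checked(seq):
    old_len = len(seq)               # snapshot standing in for OLD.seq
    seq.append(seq[0])
    assert len(seq) == old_len + 1   # ensure: exactly one element was added
    assert seq[0] == seq[-1]         # ensure: it equals the first element


items = [1, 2]
append_first_checked(items)          # the checks pass for a list

try:
    append_first_checked(Unexpected(42))
except TypeError as exc:
    # len() is not defined for Unexpected, so the check itself fails to run.
    outcome = f"contract could not be evaluated: {exc}"
else:
    outcome = "contract held"
```

The bare function works on Unexpected, but the contract rejects it before the body runs, because the contract assumes more of the sequence protocol than the body does.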

A more realistic example:

def add(a, b): return a + b

The simplified precondition is that a.__add__ exists and applies to b or 
that b.__radd__ exists and applies to a.  I see no point in formally 
specifying this as part of 'def add' as it is part of the language 
definition.  It is not just laziness that makes me averse to such 
redundancy.
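
For readers less familiar with that part of the language definition, a small sketch of the dispatch: the Metres class is invented for illustration and defines only __radd__, so it can appear on the right of + but not on the left.

```python
class Metres:
    """Hypothetical type that only supports being the right-hand operand of +."""

    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, (int, float)):
            return Metres(other + self.value)
        return NotImplemented


def add(a, b):
    return a + b


# int.__add__ returns NotImplemented for a Metres operand, so Python
# falls back to Metres.__radd__.
total = add(2, Metres(3))

try:
    add(Metres(1), 2)   # no Metres.__add__, and int.__radd__ declines
except TypeError:
    left_operand_supported = False
else:
    left_operand_supported = True
```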


Even ignoring user classes, a *useful* post-condition that applies to
both numbers and sequences is hard to write.  I believe + is
associative for both, so that a + (b + b) = (a + b) + b, but


--
Terry Jan Reedy



Re: [Python-ideas] Better error messages for missing optional stdlib packages

2018-10-08 Thread Marcus Harnisch

On 10/08/2018 12:29 AM, Terry Reedy wrote:


> On 10/3/2018 4:29 PM, Marcus Harnisch wrote:
>
>> When trying to import lzma on one of my machines, I was surprised to
>> get a normal import error like for any other module.
>
> What was the traceback and message?  Did you get an import error for
> one of the three imports in lzma.py?  I don't know why you would
> expect anything else.  Any import in any stdlib module can potentially
> fail if the file is buggy, corrupted, or missing.


$ /usr/bin/python3
Python 3.7.0 (default, Oct  4 2018, 03:21:59)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import lzma
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/lzma.py", line 27, in <module>
ModuleNotFoundError: No module named '_lzma'
>>>


>> According to the docs lzma has been part of stdlib since 3.3. Further
>> digging revealed that the error is due to the fact that xz wasn't
>> compiled in when building Python.
>
> Perhaps this is a buggy build.

This, I reckon, depends on the perspective and the definition of
“buggy”. If the build process finishes without error, can we assume that
the build is not buggy? If we make claims along the lines of “nobody in
their right mind would build Python without lzma” it would only be fair
to break the build if liblzma can't be detected.
Unless I missed something, it is only after the build has finished
successfully that a message is printed listing the modules that
setup.py couldn't detect.

Here is a list of modules, which I believe are affected:

$ grep -F missing.append setup.py
    missing.append('spwd')
    missing.append('readline')
    missing.append('_ssl')
    missing.append('_hashlib')
    missing.append('_sqlite3')
    missing.append('_dbm')
    missing.append('_gdbm')
    missing.append('nis')
    missing.append('_curses')
    missing.append('_curses_panel')
    missing.append('zlib')
    missing.append('zlib')
    missing.append('zlib')
    missing.append('_bz2')
    missing.append('_lzma')
    missing.append('_elementtree')
    missing.append('ossaudiodev')
    missing.append('_tkinter')
    missing.append('_uuid')


> Have you complained to the distributor?

After finding the root cause of the missing import I did file a request
for including lzma in future releases of the distribution.


All I am asking is that unsuspecting users not be left in the dark when
it comes to debugging unexpected import errors. I believe a missing
stdlib module qualifies as “unexpected”. This could take the form of
documentation or of an import error handler that prints a helpful
message when a stdlib module can't be found.
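
To sketch the second option at user level (not as a change to the import system itself): the helper below is hypothetical, and both the BUILD_HINTS mapping and the message wording are assumptions made up for this example.

```python
import importlib

# Illustrative mapping from extension module to the C library it wraps.
# The entries and the hint wording are assumptions, not text from CPython.
BUILD_HINTS = {
    "_lzma": "liblzma (xz)",
    "_bz2": "libbz2",
    "_sqlite3": "libsqlite3",
}


def import_with_hint(name, hints=BUILD_HINTS):
    """Import a module, adding a build hint if a known extension is missing."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError as exc:
        library = hints.get(exc.name)
        if library is None:
            raise  # not a known optional extension; keep the normal error
        raise ModuleNotFoundError(
            f"No module named {exc.name!r}: this Python appears to have been "
            f"built without {library} support",
            name=exc.name,
        ) from exc
```

With this, import_with_hint("lzma") on the build above would still fail, but the message would point at the missing library rather than only at '_lzma'.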


Regards,
Marcus


Re: [Python-ideas] support toml for pyproject support

2018-10-08 Thread Guido van Rossum
On Mon, Oct 8, 2018 at 4:53 AM Erik Bray  wrote:

> If I had the energy to argue it I would also argue against using TOML
> in those PEPs.  I personally don't especially care for TOML and what's
> "obvious" to Tom is not at all obvious to me.  I'd rather just stick
> with YAML or perhaps something even simpler than either one.
>

I feel the same way. (Somebody was requesting extensive TOML support for
mypy and was also waving those PEPs in front of us.)

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-ideas] support toml for pyproject support

2018-10-08 Thread Pradyun Gedam
On Mon, Oct 8, 2018 at 12:49 PM Jimmy Girardet  wrote:
>
> Hi,

Hi Jimmy and welcome! :)

>
> I don't know if this was already debated, as I don't know how to search
> the whole archive of the list.
>
>
> For now, the adoption of the pyproject.toml file is more difficult because
> toml is not in the standard library.
>
> Each tool which wants to use pyproject.toml has to add a toml lib as a
> conditional or hard dependency.
>
> Since toml is now the standard configuration file format, it's strange
> that Python does not support it in the stdlib, just as it would have been
> strange not to have the configparser module.
>

Let's wait till TOML hits 1.0 before adding it to the standard
library. It's still at 0.5 right now.

I am personally in favor of adding a standard library module for TOML,
after it hits 1.0 and there's some stability after the release.

> I know it's complicated to add more and more things to the stdlib, but I
> really think it is necessary for Python packaging to be more consistent.
>

TOML has a fairly unambiguous specification so I don't think the
choice of library should really affect what data gets loaded. If there
are differences across implementations, due to the TOML specification
unintentionally being ambiguous, please do file an issue on GitHub. :)

>
> Maybe we could think about a read-only lib to limit the added code.

I don't think that would be as helpful as possibly a round-tripping
parser-writer combination but I'll refrain from pushing for that
*right now*.

>
>
> If it's conceivable, I'd be happy to help in it.
>
>
> Nice Day guys and girls.
>
> Jimmy

Cheers,
Pradyun (pip maintainer, TOML Core member)

>
>


Re: [Python-ideas] Why is design-by-contracts not widely adopted?

2018-10-08 Thread Steven D'Aprano
On Tue, Oct 09, 2018 at 01:21:57AM +1100, Chris Angelico wrote:

> > > Yet we keep having use-cases shown to us involving one person with one
> > > module, and another person with another module, and the interaction
> > > between the two.
> >
> > Do we? I haven't noticed anything that matches that description,
> > although I admit I haven't read every single post in these threads
> > religiously.
> 
> Try this:

Thanks for the example, that's from one of the posts I haven't read.


> If you're regularly changing your function contracts, such that you
> need to continually test  in case something in the other package
> changed, then yes, that's exactly what I'm talking about.

Presumably you're opposed to continuous integration testing too.


> I'm tired of debating this.

Is that what you were doing? I had wondered.

http://www.montypython.net/scripts/argument.php

*wink*




-- 
Steve


Re: [Python-ideas] Why is design-by-contracts not widely adopted?

2018-10-08 Thread Marko Ristin-Kaufmann
Hi Chris,

I hope you don't mind me responding though you would like to stop
participating. This message is meant for other readers in case they are
interested.

> Alice tests her package A with some test data D_A. Now assume Betty did
> not write any contracts for her package B. When Alice tests her package,
> she is actually making an integration test. While she controls the inputs
> to B from A, she can only observe the results from B, but not whether they
> are correct by coincidence or B did its job correctly. Let's denote D'_B
> the data that is given to B from her original test data D_A during Alice's
> integration testing.
> >
>
> If you're regularly changing your function contracts, such that you
> need to continually test  in case something in the other package
> changed, then yes, that's exactly what I'm talking about.
>

The user story I put above had nothing to do with change. I was describing how
manually performing integration tests between A and B is tedious for us
(since it involves some form of manual recording of the inputs and outputs
of module B and adapting the unit tests of B), while contracts
are much better (*for us*) since they incur little overhead (write them
once for B, anybody runs them automatically).

I did not want to highlight the *change* in my user story, but the ease of
integration tests with contracts. If it were not for contracts, we would
have never performed them.

Cheers,
Marko



On Mon, 8 Oct 2018 at 16:22, Chris Angelico  wrote:

> On Mon, Oct 8, 2018 at 11:11 PM Steven D'Aprano 
> wrote:
> >
> > On Mon, Oct 08, 2018 at 09:32:23PM +1100, Chris Angelico wrote:
> > > On Mon, Oct 8, 2018 at 9:26 PM Steven D'Aprano 
> wrote:
> > > > > In other words, you change the *public interface* of your functions
> > > > > all the time? How do you not have massive breakage all the time?
> > > >
> > > > I can't comment about Marko's actual use-case, but *in general*
> > > > contracts are aimed at application *internal* interfaces, not so much
> > > > library *public* interfaces.
> > >
> > > Yet we keep having use-cases shown to us involving one person with one
> > > module, and another person with another module, and the interaction
> > > between the two.
> >
> > Do we? I haven't noticed anything that matches that description,
> > although I admit I haven't read every single post in these threads
> > religiously.
>
> Try this:
>
> On Mon, Oct 8, 2018 at 5:11 PM Marko Ristin-Kaufmann
>  wrote:
> > Alice tests her package A with some test data D_A. Now assume Betty did
> not write any contracts for her package B. When Alice tests her package,
> she is actually making an integration test. While she controls the inputs
> to B from A, she can only observe the results from B, but not whether they
> are correct by coincidence or B did its job correctly. Let's denote D'_B
> the data that is given to B from her original test data D_A during Alice's
> integration testing.
> >
>
> If you're regularly changing your function contracts, such that you
> need to continually test  in case something in the other package
> changed, then yes, that's exactly what I'm talking about.
>
> I'm tired of debating this. Have fun. If you love contracts so much,
> marry them. I'm not interested in using them, because nothing in any
> of these threads has shown me any good use-cases that aren't just
> highlighting bad coding practices.
>
> ChrisA


Re: [Python-ideas] Support parsing stream with `re`

2018-10-08 Thread Chris Angelico
On Mon, Oct 8, 2018 at 11:15 PM Anders Hovmöller  wrote:
>
>
> However, another possibility is that the regexp is consuming lots of memory.
>
> The regexp seems simple enough (b'.'), so I doubt it is leaking memory like
> mad; I'm guessing you're just seeing the OS page in as much of the file as it
> can.
>
>
> Yup. Windows will aggressively fill up your RAM in cases like this
> because after all why not?  There's no use to having memory just
> sitting around unused.  For read-only, non-anonymous mappings it's not
> much problem for the OS to drop pages that haven't been recently
> accessed and use them for something else.  So I wouldn't be too
> worried about the process chewing up RAM.
>
> I feel like this is veering more into python-list territory for
> further discussion though.
>
>
> Last time I worked on Windows, which admittedly was a long time ago, the file
> cache was not attributed to a process, so this doesn't seem to be relevant to
> this situation.

Depends whether it's a file cache or a memory-mapped file, though. On
Linux, if I open a file, read it, then close it, I'm not using that
file any more, but it might remain in cache (which will mean that
re-reading it will be fast, regardless of whether that's from the same
or a different process). That usage shows up as either "buffers" or
"cache", and doesn't belong to any process.

In contrast, a mmap'd file is memory that you do indeed own. If the
system runs short of physical memory, it can simply discard those
pages (rather than saving them to the swap file), but they're still
owned by one specific process, and should count in that process's
virtual memory.

(That's based on my knowledge of Linux today and OS/2 back in the 90s.
It may or may not be accurate to Windows, but I suspect it won't be
very far wrong.)
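
For reference, a minimal sketch of the memory-mapped approach under discussion, assuming a non-empty file and a bytes pattern (the function name is made up). The re module accepts the mmap object directly, and the pages it touches are file-backed and read-only, so the OS can drop them without writing anything to swap.

```python
import mmap
import re


def count_matches_mmap(path, pattern):
    """Count regex matches over a read-only memory mapping of the file."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; this raises ValueError for an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = 0
            for _ in re.finditer(pattern, mm):
                count += 1
            return count
```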

ChrisA


Re: [Python-ideas] Why is design-by-contracts not widely adopted?

2018-10-08 Thread Steven D'Aprano
On Sun, Oct 07, 2018 at 04:24:58PM -0400, Terry Reedy wrote:

> A mathematical function is defined or specified by an input domain, 
> output range, and a mapping from inputs to outputs.  The mapping can be 
> defined either by an explicit listing of input-output pairs or by a rule 
> specifying either a) the process, what is done to inputs to produce 
> outputs or, b) the result, how the output relates to the input.

Most code does not define pure mathematical functions, unless you're 
writing in Haskell :-)


> https://www.win.tue.nl/~wstomv/edu/2ip30/references/design-by-contract/index.html
>  
> 
> defines contracts as "precise (legally unambiguous) specifications" (5.2 
> Business Contracting/Sub-contracting Metaphor)

You are quoting that out of context. The full context says (emphasis 
added):

IN THE BUSINESS WORLD, contracts are precise (legally unambiguous) 
specifications that define the obligations and benefits of the 
(usually two) parties involved.

and later goes on to say:

How does this apply to software correctness?

Consider the execution of a routine. The called routine provides 
a service - it is a supplier. The caller is the client that is 
requesting the service. We can impose a contract that spells out
precisely the obligations and benefits of both the caller (client)
and the callee (supplier). This contract SERVES AS THE INTERFACE 
SPECIFICATION FOR THE ROUTINE.

(I would add *executable* interface specification.)


> It is not obvious to me 
> that the metaphor of contracts adds anything worthwhile to the idea of 
> 'function'.

It doesn't. That's not what the metaphor is for.

Design By Contract is not a redefinition of "function", it is a software 
methodology, a paradigm for helping programmers reason better about 
functions and specify the interface so that bugs are discovered earlier.


> 1. Only a small sliver of human interactions are governed by formal 
> legal contracts read, understood, and agreed to by both (all) parties.

Irrelevant.


> 2. The idealized definition is naive in practice.  Most legal contracts, 
> unlike the example in the link article, are written in language that 
> most people cannot read. 

Irrelevant. 

Dicts aren't actual paper books filled with definitions of words, floats 
don't actually float, neural nets are not made of neurons nor can you 
catch fish in them, and software contracts are code, not legally binding 
contracts. It is a *metaphor*.


> Many contracts are imprecise and legally 
> ambiguous, which is why we have contract dispute courts.  And even then, 
> the expense means that most people who feel violated in a transaction do 
> not use courts.

Is this a critique of the legal system? What relevance does it have to 
Design By Contract?

Honestly Terry, you seem to be arguing:

"Hiring a lawyer is too expensive, and that's why Design By 
Contract doesn't work as a software methodology."


> Post-conditions specify a function by result.  I claim that this is not 
> always sensible.

In this context, "result" can mean either "the value returned by the 
function" OR "the action performed by the function (its side-effect)".

Post-conditions can check both.


> I said above that functions may be specified by 
> process rather than result. 

Fine. What of it? Can you describe what the function does?

"It sorts the list in place."

"It deletes the given record from the database."

"It deducts the given amount from Account A and transfers it
to Account B, guaranteeing that either both transactions occur
or neither of them, but never one and not the other."

These are all post-conditions. Write them as code, and they are 
contracts. If you can't write them as code, okay, move on to the next 
function.
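
As a sketch of what "write them as code" could mean for the first example, using plain assert statements rather than any particular contracts library (that choice, and the function name, are assumptions for illustration):

```python
from collections import Counter


def sort_in_place(items):
    before = Counter(items)   # snapshot of the old contents (needs hashable items)
    items.sort()
    # Post-condition 1: neighbours are in non-decreasing order.
    assert all(a <= b for a, b in zip(items, items[1:]))
    # Post-condition 2: the same elements with the same multiplicities remain.
    assert Counter(items) == before
```

The two assertions together say "it sorts the list in place" without restating how the sorting is done.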

(By the way, since you started off talking about mathematical functions, 
functions which perform a process rather than return a result aren't 
mathematical functions.)


> Ironically, the contract metaphor 
> reinforces my claim.  Many contracts, such as in teaching and medicine, 
> only specify process and explicitly disclaim any particular result of 
> concern to the client.

Irrelevant.


> > b) If you write contracts in text, they will become stale over time 
> 
> Not true for good docstrings.  We very seldom change the essential 
> meaning of public functions.

What about public functions while they are still under active 
development with an unstable interface?


> How has "Return the sine of x (measured in radians).", for math.sin, 
> become stale?  Why would it ever?

Of course a stable function with a fixed API is unlikely to change. 
What's your point? The sin() function implementation on many platforms 
probably hasn't changed in 10 or even 20 years. (It probably just calls 
the hardware routines.) Should we conclude that unit testing is 
therefore bunk and nobody needs to write unit tests?


> What formal executable post condition 
> would help someone who does not understand 'sine', or

Re: [Python-ideas] Why is design-by-contracts not widely adopted?

2018-10-08 Thread Chris Angelico
On Mon, Oct 8, 2018 at 11:11 PM Steven D'Aprano  wrote:
>
> On Mon, Oct 08, 2018 at 09:32:23PM +1100, Chris Angelico wrote:
> > On Mon, Oct 8, 2018 at 9:26 PM Steven D'Aprano  wrote:
> > > > In other words, you change the *public interface* of your functions
> > > > all the time? How do you not have massive breakage all the time?
> > >
> > > I can't comment about Marko's actual use-case, but *in general*
> > > contracts are aimed at application *internal* interfaces, not so much
> > > library *public* interfaces.
> >
> > Yet we keep having use-cases shown to us involving one person with one
> > module, and another person with another module, and the interaction
> > between the two.
>
> Do we? I haven't noticed anything that matches that description,
> although I admit I haven't read every single post in these threads
> religiously.

Try this:

On Mon, Oct 8, 2018 at 5:11 PM Marko Ristin-Kaufmann
 wrote:
> Alice tests her package A with some test data D_A. Now assume Betty did not 
> write any contracts for her package B. When Alice tests her package, she is 
> actually making an integration test. While she controls the inputs to B from 
> A, she can only observe the results from B, but not whether they are correct 
> by coincidence or B did its job correctly. Let's denote D'_B the data that is 
> given to B from her original test data D_A during Alice's integration testing.
>

If you're regularly changing your function contracts, such that you
need to continually test  in case something in the other package
changed, then yes, that's exactly what I'm talking about.

I'm tired of debating this. Have fun. If you love contracts so much,
marry them. I'm not interested in using them, because nothing in any
of these threads has shown me any good use-cases that aren't just
highlighting bad coding practices.

ChrisA


Re: [Python-ideas] support toml for pyproject support

2018-10-08 Thread David Mertz
I agree here. I briefly urged against using the less used TOML format, but
I have no real skin in the game around packaging. I like YAML, but that's
also not in the standard library, even if more widely used.

But given that packaging is committed to TOML, I think that's a strong case
for including a library in stdlib. The PEP 517/518 authors had their
reasons that were accepted. Now there is a broad ecosystem that is built on
that choice. Let's support it.

On Mon, Oct 8, 2018, 8:03 AM Anders Hovmöller  wrote:

>
> >> He's referring to PEPs 518 and 517 [1], which indeed standardize on
> >> TOML as a file format for Python package build metadata.
> >>
> >> I think moving anything into the stdlib would be premature though –
> >> TOML libraries are under active development, and the general trend in
> >> the packaging space has been to move things *out* of the stdlib (e.g.
> >> there's repeated rumblings about moving distutils out), because the
> >> stdlib release cycle doesn't work well for packaging infrastructure.
> >
> > If I had the energy to argue it I would also argue against using TOML
> > in those PEPs.  I personally don't especially care for TOML and what's
> > "obvious" to Tom is not at all obvious to me.  I'd rather just stick
> > with YAML or perhaps something even simpler than either one.
>
> This thread isn't about regretting past decisions but what makes sense
> given current realities though.
>
> / Anders
>


Re: [Python-ideas] Debugging: some problems and possible solutions

2018-10-08 Thread Rhodri James

On 04/10/18 19:10, Jonathan Fine wrote:

> In response to my problem-solution pair (fixing a typo)
>> TITLE: Debug print() statements cause doctests to fail
>
> Rhodri James wrote:
>> Or write your debug output to stderr?
>
> Perhaps I've been too concise. If so, I apologise. My proposal is that
> the system be set up so that
>     debug(a, b, c)
> sends output to the correct stream, whatever it should be.
>
> Rhodri:  Thank you for your contribution. Are you saying that because
> the developer can write
>     print(a, b, c, file=sys.stderr)
> there's not a problem to solve here?


Exactly so.  If you want a quick drop of debug information, print() will 
do that just fine.  If you want detailed or tunable information, that's 
what the logging module is for.  I'm not sure where on the line between 
the two your debug() sits and what it's supposed to offer that is better 
than either of the alternatives.
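
A minimal sketch of the two alternatives, for illustration only. The names
quick_debug, configure and the logger name "myapp.debug" are hypothetical
and not part of any proposal in this thread. Doctests compare what is
written to stdout, so output sent to stderr should leave them unaffected.

```python
import logging
import sys


def quick_debug(*values):
    """Quick drop of debug information: print to stderr, not stdout."""
    print(*values, file=sys.stderr)


# Tunable alternative: a named logger whose level decides what is shown.
logger = logging.getLogger("myapp.debug")  # hypothetical logger name


def configure(level=logging.DEBUG, stream=None):
    """Attach one handler; StreamHandler(None) writes to sys.stderr."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(level)
```

Calling configure(logging.INFO) once at start-up would then silence
logger.debug() calls while keeping logger.info() and above.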


--
Rhodri James *-* Kynesim Ltd


Re: [Python-ideas] Support parsing stream with `re`

2018-10-08 Thread Ram Rachum
Thanks for your help everybody! I'm very happy to have learned about mmap.

On Mon, Oct 8, 2018 at 3:27 PM Richard Damon 
wrote:

> On 10/8/18 8:11 AM, Ram Rachum wrote:
> > " Windows will aggressively fill up your RAM in cases like this
> > because after all why not?  There's no use to having memory just
> > sitting around unused."
> >
> > Two questions:
> >
> > 1. Is the "why not" sarcastic, as in you're agreeing it's a waste?
> > 2. Will this be different on Linux? Which command do I run on Linux to
> > verify that the process isn't taking too much RAM?
> >
> >
> > Thanks,
> > Ram.
> I would say the 'why not' isn't being sarcastic but pragmatic. (And I
> would expect Linux to work similarly). After all if you have a system
> with X amount of memory, and total memory demand for the other processes
> is 10% of X, what is the issue with letting one process use 80% of X
> with memory usage that is easy to clear out if something else wants it?
> A read only page that is already backed on the disk is trivial to make
> available for another usage.
>
> Memory just sitting idle is the real waste.
>
> --
> Richard Damon


Re: [Python-ideas] Support parsing stream with `re`

2018-10-08 Thread Richard Damon
On 10/8/18 8:11 AM, Ram Rachum wrote:
> " Windows will aggressively fill up your RAM in cases like this
> because after all why not?  There's no use to having memory just
> sitting around unused."
>
> Two questions:
>
> 1. Is the "why not" sarcastic, as in you're agreeing it's a waste?
> 2. Will this be different on Linux? Which command do I run on Linux to
> verify that the process isn't taking too much RAM?
>
>
> Thanks,
> Ram.
I would say the 'why not' isn't being sarcastic but pragmatic. (And I
would expect Linux to work similarly). After all if you have a system
with X amount of memory, and total memory demand for the other processes
is 10% of X, what is the issue with letting one process use 80% of X
with memory usage that is easy to clear out if something else wants it?
A read only page that is already backed on the disk is trivial to make
available for another usage.

Memory just sitting idle is the real waste.

-- 
Richard Damon



Re: [Python-ideas] Support parsing stream with `re`

2018-10-08 Thread Anders Hovmöller

>> However, another possibility is that the regexp is consuming lots of memory.
>> 
>> The regexp seems simple enough (b'.'), so I doubt it is leaking memory like
>> mad; I'm guessing you're just seeing the OS page in as much of the file as it
>> can.
> 
> Yup. Windows will aggressively fill up your RAM in cases like this
> because after all why not?  There's no use to having memory just
> sitting around unused.  For read-only, non-anonymous mappings it's not
> much problem for the OS to drop pages that haven't been recently
> accessed and use them for something else.  So I wouldn't be too
> worried about the process chewing up RAM.
> 
> I feel like this is veering more into python-list territory for
> further discussion though.

Last time I worked on Windows, which admittedly was a long time ago, the file
cache was not attributed to a process, so this doesn't seem to be relevant to
this situation.

/ Anders


Re: [Python-ideas] Support parsing stream with `re`

2018-10-08 Thread Ram Rachum
" Windows will aggressively fill up your RAM in cases like this because
after all why not?  There's no use to having memory just sitting around
unused."

Two questions:

1. Is the "why not" sarcastic, as in you're agreeing it's a waste?
2. Will this be different on Linux? Which command do I run on Linux to
verify that the process isn't taking too much RAM?


Thanks,
Ram.


On Mon, Oct 8, 2018 at 3:02 PM Erik Bray  wrote:

> On Mon, Oct 8, 2018 at 12:20 PM Cameron Simpson  wrote:
> >
> > On 08Oct2018 10:56, Ram Rachum  wrote:
> > >That's incredibly interesting. I've never used mmap before.
> > >However, there's a problem.
> > >I did a few experiments with mmap now, this is the latest:
> > >
> > >path = pathlib.Path(r'P:\huge_file')
> > >
> > >with path.open('r') as file:
> > >    mmap = mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ)
> >
> > Just a remark: don't tromp on the "mmap" name. Maybe "mapped"?
> >
> > >    for match in re.finditer(b'.', mmap):
> > >        pass
> > >
> > >The file is 338GB in size, and it seems that Python is trying to load it
> > >into memory. The process is now taking 4GB RAM and it's growing. I saw
> > >the same behavior when searching for a non-existing match.
> > >
> > >Should I open a Python bug for this?
> >
> > Probably not. First figure out what is going on. BTW, how much RAM have
> > you got?
> >
> > As you access the mapped file the OS will try to keep it in memory in
> > case you need that again. In the absence of competition, most stuff will
> > get paged out to accommodate it. That's normal. All the data are "clean"
> > (unmodified) so the OS can simply release the older pages instantly if
> > something else needs the RAM.
> >
> > However, another possibility is that the regexp is consuming lots of
> > memory.
> >
> > The regexp seems simple enough (b'.'), so I doubt it is leaking memory
> > like mad; I'm guessing you're just seeing the OS page in as much of the
> > file as it can.
>
> Yup. Windows will aggressively fill up your RAM in cases like this
> because after all why not?  There's no use to having memory just
> sitting around unused.  For read-only, non-anonymous mappings it's not
> much problem for the OS to drop pages that haven't been recently
> accessed and use them for something else.  So I wouldn't be too
> worried about the process chewing up RAM.
>
> I feel like this is veering more into python-list territory for
> further discussion though.
>
>


Re: [Python-ideas] Why is design-by-contracts not widely adopted?

2018-10-08 Thread Steven D'Aprano
On Mon, Oct 08, 2018 at 09:32:23PM +1100, Chris Angelico wrote:
> On Mon, Oct 8, 2018 at 9:26 PM Steven D'Aprano  wrote:
> > > In other words, you change the *public interface* of your functions
> > > all the time? How do you not have massive breakage all the time?
> >
> > I can't comment about Marko's actual use-case, but *in general*
> > contracts are aimed at application *internal* interfaces, not so much
> > library *public* interfaces.
> 
> Yet we keep having use-cases shown to us involving one person with one
> module, and another person with another module, and the interaction
> between the two.

Do we? I haven't noticed anything that matches that description, 
although I admit I haven't read every single post in these threads 
religiously.

But "application" != "one module" or "one developer". I fail to see the 
contradiction. An application can be split over dozens of modules, 
written by teams of developers. Whether one or a dozen modules, it still 
has no public interface that third-party code can call. It is *all* 
internal.

Obviously if you are using contracts in public library code, the way 
you will manage them is different from the way you would manage them if 
you are using them for private or internal code.

That's no different from (say) docstrings and doctests: there are 
implied stability promises for those in *some* functions (the public 
ones) but not *other* functions (the private ones).

Of course some devs don't believe in stability promises, and treat all 
APIs as unstable. So what? That has nothing to do with contracts. People 
can "move fast and break everything" in any programming style they like.


> Which way is it? Do the contracts change frequently or not? 

"Mu."

https://en.wikipedia.org/wiki/Mu_(negative)

They change as frequently as you, the developer writing them, chooses to 
change them. Just like your tests, your type annotations, your doc 
strings, and every other part of your code.


> Are they public or not? 

That's up to you.

Contracts were originally designed for application development, where 
the concept of "public" versus "private" is meaningless. The philosophy 
of DbC is always going to be biased towards that mind-set.

Nevertheless, people can choose to use them for library code where there 
is a meaningful distinction. If they do so, then how they choose to 
manage the contracts is up to them.

If you want to make a contract a public part of the interface, then you 
can (but that would rule out disabling that specific contract, at least 
for pre-conditions).

If you only want to use it for internal interfaces, you can do that too. 

If you want to mix and match and make some contracts internal and some 
public, there is no DbC Police to tell you that you can't.


> How are we supposed to understand the point of contracts 

You could start by reading the explanations given on the Eiffel page, 
which I've linked to about a bazillion times. Then you could read about 
another bazillion blog posts and discussions that describe it (some pro, 
some con, some mixed). And you can read the Wikipedia page that shows 
how DbC is supported natively by at least 17 languages (not just Eiffel) 
and via libraries in at least 15 others. Not just new experimental 
languages, but old, established and conservative languages like Java, C 
and Ada.

There are heaps of discussions on DbC on Stackoverflow:

https://stackoverflow.com/search?q=design%20by%20contract

and a good page on wiki.c2:

http://wiki.c2.com/?DesignByContract

TIL: Pre- and postconditions were first supported natively by Barbara
Liskov's CLU in the 1970s.

This is not some "weird bizarre Eiffel thing", as people seem to 
believe. If it hasn't quite gone mainstream, it is surely at least as 
common as functional programming style. It has been around for over 
forty years in one way or another, not four weeks, and is a standard, 
well-established if minority programming style and development process.

Of course it is always valid to debate the pros and cons of DbC versus 
other development paradigms, but questioning the very basis of DbC as 
people here keep doing is as ludicrous and annoying as questioning the 
basis of OOP or FP or TDD would be.

Just as functional programming is a paradigm that says (among other 
things) "no side effects", "no global variables holding state" etc, and 
we can choose to apply that paradigm even in non-FP languages, so DbC is 
in part a paradigm that tells you how to design the internals of your 
application.

We can apply the same design concepts to any code we want, even if we're 
not buying into the whole Contract metaphor:

- pre-conditions can be considered argument validation;

- post-conditions can be considered a kind of test;

- class invariants can be considered a kind of defensive assertion.
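
As a rough illustration of those three ideas using nothing but plain
assert statements (no DbC library), here is a minimal sketch. The names
safe_sqrt and Account are hypothetical examples, not taken from anybody's
code in this thread, and the asserts vanish if Python is run with -O.

```python
import math


def safe_sqrt(x):
    # Pre-condition: a form of argument validation.
    assert x >= 0, "pre-condition violated: x must be non-negative"
    result = math.sqrt(x)
    # Post-condition: a kind of test that runs on every call.
    assert math.isclose(result * result, x, rel_tol=1e-9, abs_tol=1e-12), \
        "post-condition violated: result squared must equal x"
    return result


class Account:
    """Hypothetical class whose invariant is: balance is never negative."""

    def __init__(self, balance=0):
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self):
        # Class invariant: a defensive assertion checked after each change.
        assert self.balance >= 0, "invariant violated: negative balance"

    def withdraw(self, amount):
        # Pre-condition on the method's argument.
        assert 0 < amount <= self.balance, "pre-condition violated"
        self.balance -= amount
        self._check_invariant()
```

A real DbC framework adds inheritance of contracts, better error messages
and fine-grained switches, none of which this sketch attempts.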



> if the use-cases being shown all involve bad code
> and/or bad coding practices?

How do you draw that conclusion?


> Contracts, apparently, allow people to violate vers

Re: [Python-ideas] Support parsing stream with `re`

2018-10-08 Thread Erik Bray
On Mon, Oct 8, 2018 at 12:20 PM Cameron Simpson  wrote:
>
> On 08Oct2018 10:56, Ram Rachum  wrote:
> >That's incredibly interesting. I've never used mmap before.
> >However, there's a problem.
> >I did a few experiments with mmap now, this is the latest:
> >
> >path = pathlib.Path(r'P:\huge_file')
> >
> >with path.open('r') as file:
> >    mmap = mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ)
>
> Just a remark: don't tromp on the "mmap" name. Maybe "mapped"?
>
> >    for match in re.finditer(b'.', mmap):
> >        pass
> >
> >The file is 338GB in size, and it seems that Python is trying to load it
> >into memory. The process is now taking 4GB RAM and it's growing. I saw the
> >same behavior when searching for a non-existing match.
> >
> >Should I open a Python bug for this?
>
> Probably not. First figure out what is going on. BTW, how much RAM have you
> got?
>
> As you access the mapped file the OS will try to keep it in memory in case you
> need that again. In the absence of competition, most stuff will get paged out
> to accommodate it. That's normal. All the data are "clean" (unmodified) so the
> OS can simply release the older pages instantly if something else needs the
> RAM.
>
> However, another possibility is that the regexp is consuming lots of memory.
>
> The regexp seems simple enough (b'.'), so I doubt it is leaking memory like
> mad; I'm guessing you're just seeing the OS page in as much of the file as it
> can.

Yup. Windows will aggressively fill up your RAM in cases like this
because after all why not?  There's no use to having memory just
sitting around unused.  For read-only, non-anonymous mappings it's not
much problem for the OS to drop pages that haven't been recently
accessed and use them for something else.  So I wouldn't be too
worried about the process chewing up RAM.

I feel like this is veering more into python-list territory for
further discussion though.


Re: [Python-ideas] support toml for pyproject support

2018-10-08 Thread Anders Hovmöller

>> He's referring to PEPs 518 and 517 [1], which indeed standardize on
>> TOML as a file format for Python package build metadata.
>> 
>> I think moving anything into the stdlib would be premature though –
>> TOML libraries are under active development, and the general trend in
>> the packaging space has been to move things *out* of the stdlib (e.g.
>> there's repeated rumblings about moving distutils out), because the
>> stdlib release cycle doesn't work well for packaging infrastructure.
> 
> If I had the energy to argue it I would also argue against using TOML
> in those PEPs.  I personally don't especially care for TOML and what's
> "obvious" to Tom is not at all obvious to me.  I'd rather just stick
> with YAML or perhaps something even simpler than either one.

This thread isn't about regretting past decisions but what makes sense given 
current realities though.

/ Anders


Re: [Python-ideas] Why is design-by-contracts not widely adopted?

2018-10-08 Thread Marko Ristin-Kaufmann
Hi Chris,

> In other words, you change the *public interface* of your functions
> > all the time? How do you not have massive breakage all the time?
>
> I can't comment about Marko's actual use-case, but *in general*
> contracts are aimed at application *internal* interfaces, not so much
> library *public* interfaces.
>

Sorry, I might have misunderstood the question -- I was referring to
modules used within the company, not outside. Of course, public libraries
put on pypi don't change their interfaces weekly.

Just to clear up the confusion, both Steve and I would claim that the
contracts do count as part of the interface.

For everything internal, we make changes frequently (including to the
interface) and, more often than not, the docstring is not updated when the
implementation of the function is. Contracts help our team catch breaking
changes more easily. When we change the behavior of a function, we use
"Find usage" in PyCharm, manually fix whatever was obviously affected by
the changed implementation, then statically check with mypy that the
changed return type did not affect the callers, and contracts (of other
functions!) catch some of the bugs during testing that we missed when we
changed the implementation. End-to-end tests with testing contracts turned
off catch some more bugs on the real data, and then it goes into
production where hopefully we see no errors.
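
To make the idea of contracts that can be switched off a bit more
concrete, here is a minimal sketch of a pre-condition decorator. It is an
assumption-laden toy, not the library our team uses: the names require,
CONTRACTS_ENABLED and average are all hypothetical.

```python
import functools

CONTRACTS_ENABLED = True  # hypothetical global switch for contract checks


def require(condition, description):
    """Attach a pre-condition; `condition` gets the function's arguments."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The switch is read at call time, so checks can be turned off
            # without re-decorating the function.
            if CONTRACTS_ENABLED and not condition(*args, **kwargs):
                raise ValueError(
                    f"{func.__name__}: pre-condition violated: {description}"
                )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@require(lambda items: len(items) > 0, "items must not be empty")
def average(items):
    return sum(items) / len(items)
```

With the switch on, a caller that starts passing an empty list fails at the
call boundary with a message naming the violated condition, rather than
with a ZeroDivisionError somewhere inside the implementation.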

Cheers,
Marko



On Mon, 8 Oct 2018 at 12:32, Chris Angelico  wrote:

> On Mon, Oct 8, 2018 at 9:26 PM Steven D'Aprano 
> wrote:
> > > In other words, you change the *public interface* of your functions
> > > all the time? How do you not have massive breakage all the time?
> >
> > I can't comment about Marko's actual use-case, but *in general*
> > contracts are aimed at application *internal* interfaces, not so much
> > library *public* interfaces.
>
> Yet we keep having use-cases shown to us involving one person with one
> module, and another person with another module, and the interaction
> between the two. Which way is it? Do the contracts change frequently
> or not? Are they public or not? How are we supposed to understand the
> point of contracts if the use-cases being shown all involve bad code
> and/or bad coding practices?
>
> Contracts, apparently, allow people to violate versioning expectations
> and feel good about it.
>
> (Am I really exaggerating all that much here?)
>
> ChrisA


Re: [Python-ideas] support toml for pyproject support

2018-10-08 Thread Erik Bray
On Mon, Oct 8, 2018 at 12:23 PM Nathaniel Smith  wrote:
>
> On Mon, Oct 8, 2018 at 2:55 AM, Steven D'Aprano  wrote:
> >
> > On Mon, Oct 08, 2018 at 09:10:40AM +0200, Jimmy Girardet wrote:
> >> Each tool which wants to use pyproject.toml has to add a toml lib  as a
> >> conditional or hard dependency.
> >>
> >> Since toml is now the standard configuration file format,
> >
> > It is? Did I miss the memo? Because I've never even heard of TOML before
> > this very moment.
>
> He's referring to PEPs 518 and 517 [1], which indeed standardize on
> TOML as a file format for Python package build metadata.
>
> I think moving anything into the stdlib would be premature though –
> TOML libraries are under active development, and the general trend in
> the packaging space has been to move things *out* of the stdlib (e.g.
> there's repeated rumblings about moving distutils out), because the
> stdlib release cycle doesn't work well for packaging infrastructure.

If I had the energy to argue it I would also argue against using TOML
in those PEPs.  I personally don't especially care for TOML and what's
"obvious" to Tom is not at all obvious to me.  I'd rather just stick
with YAML or perhaps something even simpler than either one.


Re: [Python-ideas] Support parsing stream with `re`

2018-10-08 Thread Ram Rachum
I'm not an expert on memory. I used Process Explorer to look at the
Process. The Working Set of the current run is 11GB. The Private Bytes is
708MB. Actually, see all the info here:
https://www.dropbox.com/s/tzoud028pzdkfi7/screenshot_TURING_2018-10-08_133355.jpg?dl=0

I've got 16GB of RAM on this computer, and Process Explorer says it's
almost full, just ~150MB left. This is physical memory.

To your question: The loop does iterate, i.e. finding multiple matches.

On Mon, Oct 8, 2018 at 1:20 PM Cameron Simpson  wrote:

> On 08Oct2018 10:56, Ram Rachum  wrote:
> >That's incredibly interesting. I've never used mmap before.
> >However, there's a problem.
> >I did a few experiments with mmap now, this is the latest:
> >
> >path = pathlib.Path(r'P:\huge_file')
> >
> >with path.open('r') as file:
> >    mmap = mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ)
>
> Just a remark: don't tromp on the "mmap" name. Maybe "mapped"?
>
> >    for match in re.finditer(b'.', mmap):
> >        pass
> >
> >The file is 338GB in size, and it seems that Python is trying to load it
> >into memory. The process is now taking 4GB RAM and it's growing. I saw the
> >same behavior when searching for a non-existing match.
> >
> >Should I open a Python bug for this?
>
> Probably not. First figure out what is going on. BTW, how much RAM have
> you got?
>
> As you access the mapped file the OS will try to keep it in memory in case
> you need that again. In the absence of competition, most stuff will get
> paged out to accommodate it. That's normal. All the data are "clean"
> (unmodified) so the OS can simply release the older pages instantly if
> something else needs the RAM.
>
> However, another possibility is that the regexp is consuming lots of memory.
>
> The regexp seems simple enough (b'.'), so I doubt it is leaking memory
> like mad; I'm guessing you're just seeing the OS page in as much of the
> file as it can.
>
> Also, does the loop iterate? i.e. does it find multiple matches as the
> memory gets consumed, or is the first iteration blocking and consuming gobs
> of memory before the first match comes back? A print() call will tell you
> that.
>
> Cheers,
> Cameron Simpson 
>
>


Re: [Python-ideas] Why is design-by-contracts not widely adopted?

2018-10-08 Thread Chris Angelico
On Mon, Oct 8, 2018 at 9:26 PM Steven D'Aprano  wrote:
> > In other words, you change the *public interface* of your functions
> > all the time? How do you not have massive breakage all the time?
>
> I can't comment about Marko's actual use-case, but *in general*
> contracts are aimed at application *internal* interfaces, not so much
> library *public* interfaces.

Yet we keep having use-cases shown to us involving one person with one
module, and another person with another module, and the interaction
between the two. Which way is it? Do the contracts change frequently
or not? Are they public or not? How are we supposed to understand the
point of contracts if the use-cases being shown all involve bad code
and/or bad coding practices?

Contracts, apparently, allow people to violate versioning expectations
and feel good about it.

(Am I really exaggerating all that much here?)

ChrisA


Re: [Python-ideas] Why is design-by-contracts not widely adopted?

2018-10-08 Thread Steven D'Aprano
On Mon, Oct 08, 2018 at 04:29:34PM +1100, Chris Angelico wrote:
> On Mon, Oct 8, 2018 at 4:26 PM Marko Ristin-Kaufmann
>  wrote:
> >> Not true for good docstrings.  We very seldom change the essential
> >> meaning of public functions.
> >
> > In my team, we have a stale docstring once every two weeks or even more 
> > often.

"At Resolver we've found it useful to short-circuit any doubt and just
refer to comments in code as 'lies'. "
--Michael Foord paraphrases Christian Muirhead on python-dev, 2009-03-22


> If it weren't for doctests and contracts, I could imagine we would 
> have them even more often :)
> >
> 
> In other words, you change the *public interface* of your functions
> all the time? How do you not have massive breakage all the time?

I can't comment about Marko's actual use-case, but *in general* 
contracts are aimed at application *internal* interfaces, not so much 
library *public* interfaces.

That's not to say that contracts can't be used for libraries at all, but 
they're not so useful for public interfaces that could be called by 
arbitrary third-parties. They are more useful for internal interfaces, 
where you don't break anyone's code but your own if you change the API.

Think about it this way: you probably wouldn't hesitate much to change 
the interface of a _private method or function, aside from discussing it 
with your dev team. Sure it will break some code, but you have tests to 
identify the breakage, and maybe refactoring tools to help. And of 
course the contracts themselves are de facto tests. Such changes are 
manageable. And since it's a private function, nobody outside of your 
team need care.

Same with contracts. (At least in the ideal case.)


-- 
Steve


Re: [Python-ideas] support toml for pyproject support

2018-10-08 Thread Nathaniel Smith
On Mon, Oct 8, 2018 at 2:55 AM, Steven D'Aprano  wrote:
>
> On Mon, Oct 08, 2018 at 09:10:40AM +0200, Jimmy Girardet wrote:
>> Each tool which wants to use pyproject.toml has to add a toml lib  as a
>> conditional or hard dependency.
>>
>> Since toml is now the standard configuration file format,
>
> It is? Did I miss the memo? Because I've never even heard of TOML before
> this very moment.

He's referring to PEPs 518 and 517 [1], which indeed standardize on
TOML as a file format for Python package build metadata.

I think moving anything into the stdlib would be premature though –
TOML libraries are under active development, and the general trend in
the packaging space has been to move things *out* of the stdlib (e.g.
there's repeated rumblings about moving distutils out), because the
stdlib release cycle doesn't work well for packaging infrastructure.

-n

[1] https://www.python.org/dev/peps/pep-0518/
https://www.python.org/dev/peps/pep-0517

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-ideas] Support parsing stream with `re`

2018-10-08 Thread Cameron Simpson

On 08Oct2018 10:56, Ram Rachum  wrote:

> That's incredibly interesting. I've never used mmap before.
> However, there's a problem.
> I did a few experiments with mmap now, this is the latest:
>
> path = pathlib.Path(r'P:\huge_file')
>
> with path.open('r') as file:
>     mmap = mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ)

Just a remark: don't tromp on the "mmap" name. Maybe "mapped"?

>     for match in re.finditer(b'.', mmap):
>         pass
>
> The file is 338GB in size, and it seems that Python is trying to load it
> into memory. The process is now taking 4GB RAM and it's growing. I saw the
> same behavior when searching for a non-existing match.
>
> Should I open a Python bug for this?


Probably not. First figure out what is going on. BTW, how much RAM have you 
got?


As you access the mapped file the OS will try to keep it in memory in case you 
need that again. In the absence of competition, most stuff will get paged out 
to accommodate it. That's normal. All the data are "clean" (unmodified) so the 
OS can simply release the older pages instantly if something else needs the 
RAM.


However, another possibility is that the regexp is consuming lots of memory.

The regexp seems simple enough (b'.'), so I doubt it is leaking memory like 
mad; I'm guessing you're just seeing the OS page in as much of the file as it 
can.


Also, does the loop iterate? i.e. does it find multiple matches as the memory 
gets consumed, or is the first iteration blocking and consuming gobs of memory 
before the first match comes back? A print() call will tell you that.
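
Something along these lines would show whether the loop is iterating. It is
only a sketch: count_matches and report_every are hypothetical names, the
file is opened in binary mode, and the mapping is bound to "mapped" so the
mmap module name stays usable.

```python
import mmap
import re


def count_matches(path, pattern, report_every=1_000_000):
    """Count matches of a bytes pattern in a memory-mapped file."""
    count = 0
    with open(path, "rb") as file:
        # Length 0 maps the whole file; ACCESS_READ keeps it read-only.
        with mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            for match in re.finditer(pattern, mapped):
                count += 1
                if count % report_every == 0:
                    # Progress output: proves the loop is advancing.
                    print(f"{count} matches, last at offset {match.start()}")
    return count
```

If the progress lines keep appearing while memory use climbs, the growth is
most likely the OS caching pages of the file rather than the regexp
accumulating anything. Note that an empty file cannot be mapped this way.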


Cheers,
Cameron Simpson 


Re: [Python-ideas] support toml for pyproject support

2018-10-08 Thread Steven D'Aprano
Hi Jimmy, and welcome,


On Mon, Oct 08, 2018 at 09:10:40AM +0200, Jimmy Girardet wrote:
> Hi,
> 
> I don't know if this was already debated  but I don't know how to search
> in the whole archive of the list.
> 
> 
> For  now the  adoption of pyproject.toml file is more difficult because
> toml is not in the standard library.

It is true that using third-party libraries is more difficult than using 
the std lib. That alone is not a reason to add a library to the std lib.


> Each tool which wants to use pyproject.toml has to add a toml lib  as a
> conditional or hard dependency.
> 
> Since toml is now the standard configuration file format, 

It is? Did I miss the memo? Because I've never even heard of TOML before 
this very moment.

Google Trends doesn't really support your assertion that TOML has become 
"the standard" for config files:

# compare TOML, JSON and YAML
https://trends.google.com/trends/explore?q=%2Fg%2F11c5zwr35t,%2Fm%2F05cntt,%2Fm%2F01w6k2

although it is trending upwards:

https://trends.google.com/trends/explore?q=%2Fg%2F11c5zwr35t



> it's strange
> that Python does not support it in the stdlib, just as it would have been
> strange not to have the configparser module.

We don't even ship a YAML library, and that seems to be far more popular 
than TOML. On the other hand, we do ship a plist library.


> I know it's complicated to add more and more thing to the stdlib but I
> really think it is necessary for python packaging being more consistent.
> 
> 
> Maybe we could think about a readonly lib to limit the added code.

What is a readonly lib?



-- 
Steve


Re: [Python-ideas] Support parsing stream with `re`

2018-10-08 Thread Ram Rachum
That's incredibly interesting. I've never used mmap before.

However, there's a problem.

I did a few experiments with mmap now, this is the latest:

import mmap
import pathlib
import re

path = pathlib.Path(r'P:\huge_file')

with path.open('r') as file:
    mmap = mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ)
    for match in re.finditer(b'.', mmap):
        pass

The file is 338GB in size, and it seems that Python is trying to load it
into memory. The process is now taking 4GB RAM and it's growing. I saw the
same behavior when searching for a non-existing match.

Should I open a Python bug for this?

On Sun, Oct 7, 2018 at 7:49 PM <2...@jmunch.dk> wrote:

> On 18-10-07 16.15, Ram Rachum wrote:
>  > I tested it now and indeed bytes patterns work on memoryview objects.
>  > But how do I use this to scan for patterns through a stream without
>  > loading it to memory?
>
> An mmap object is one of the things you can make a memoryview of,
> although looking again, it seems you don't even need to, you can
> just re.search the mmap object directly.
>
> re.search'ing the mmap object means the operating system takes care of
> the streaming for you, reading in parts of the file only as necessary.
>
> regards, Anders


[Python-ideas] support toml for pyproject support

2018-10-08 Thread Jimmy Girardet
Hi,

I don't know if this has already been debated, and I don't know how to
search the whole archive of the list.


For now, the adoption of the pyproject.toml file is more difficult because
toml is not in the standard library.

Each tool that wants to use pyproject.toml has to add a toml lib as a
conditional or hard dependency.

Since toml is now the standard configuration file format, it's strange
that Python does not support it in the stdlib, just as it would have been
strange not to have the configparser module.


I know it's complicated to add more and more things to the stdlib, but I
really think it is necessary for Python packaging to be more consistent.


Maybe we could think about a readonly lib to limit the added code.


If it's conceivable, I'd be happy to help with it.


Nice Day guys and girls.

Jimmy

