Re: [RELEASE] Python 3.9.0a6 is now available for testing
> m[-1]!= '\n'and'\n'or' '

This is

    '\n' if m[-1] != '\n' else ' '

... just written in a way that was common before the if-expression was
invented, using "and" and "or". In itself, it's fine. The problem is
that the space in "and '\n'" is omitted (presumably for brevity, for
some reason?), leaving us with and'\n', which looks like a string
prefix (similar to r'\n' or f'\n') but with a prefix of "and", which
isn't a valid string prefix.

The problem is that something isn't disambiguating this the same way as
the 3.8 parser did (I'd say it's the "new parser", but Robin showed the
same behaviour with "-X oldparser", which makes me wonder...). Anyway,
that's what I think is going on. I'll leave it to the parser experts to
understand what's happening and decide on a fix :-)

Paul

On Wed, 29 Apr 2020 at 20:54, Rhodri James wrote:
>
> On 29/04/2020 20:23, Schachner, Joseph wrote:
> >> norm=lambda m: m+(m and(m[-1]!= '\n'and'\n'or' ')or'\n')
> > Parentheses  1  2  1  0
> > quotes  1 0  1 0  1 0  1 0
> >
> > OK I don't see any violation of quoting or parentheses matching. Still
> > trying to figure out what this lambda does.
>
> Presumably it's something to do with recognising string prefixes?
>
> --
> Rhodri James *-* Kynesim Ltd
> --
> https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
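To spell out the idiom discussed above (a minimal sketch, not from the
original thread - the variable names are illustrative only): before
conditional expressions existed, "COND and A or B" was the usual way to
choose between two values, and it behaves the same as "A if COND else
B" as long as A is truthy:

    m = "hello"
    # Old-style and/or idiom: relies on '\n' being truthy
    old = m[-1] != '\n' and '\n' or ' '
    # Modern conditional expression - equivalent, and clearer
    new = '\n' if m[-1] != '\n' else ' '
    assert old == new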
Re: pip UX Studies - help improve the usability of pip
We've had some questions as to whether this survey is legitimate. I can confirm it is (speaking as a pip core developer). The link to a page describing this work is https://pyfound.blogspot.com/2019/12/moss-czi-support-pip.html, if anyone wants to find out more. Paul Moore On Sat, 7 Mar 2020 at 01:49, Bernard Tyers - Sane UX Design wrote: > > Hi there, > > My name is Bernard Tyers. I'm a UX designer and have recently started > working on the PSF project to improve the usability of pip, funded by > MOSS/CZI. > > I want to let you know about the pip UX Studies we've started today, and > encourage you to sign-up and take part. > > The pip Team is looking for Python users who use pip to take part in our > UX Studies. Your input will have a direct impact on improving pip. > > We want to speak with as diverse a group as possible. We'd particularly > like to speak with people with accessibility needs. > > You _don't_ have to be a Python expert to take part - I can't stress > this enough! > > You can find out all the details you'll need and find the sign-up link > on this blogpost: > > http://www.ei8fdb.org/thoughts/2020/03/pip-ux-study-recruitment/ > > If you do have questions I've not answered there, let me know. > > We'd also appreciate some signal boosting to reach as wide an audience > as possible. Please share the blog post with people in different Python > using communities. > > If you're a Twitter/Mastodon user we'd appreciate a signal boost of > these also: > > https://twitter.com/bernardtyers/status/123603961730017 > https://social.ei8fdb.org/@bernard/103778645553767728 > > > Thank you for your attention! > > Best wishes, > > Bernard > -- > > Bernard Tyers, User research & Interaction Design > > T: @bernardtyers > M: @bern...@social.ei8fdb.org > PGP Key: keybase.io/ei8fdb > > > I work on User-Centred Design, Open Source Software and Privacy. > -- > https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
Re: Possible Addition to Python Language: Marked Sub-condition
On Sun, 8 Mar 2020 at 15:02, Shrinivas Kulkarni wrote:
>
> Hello Everyone
>
> While writing python code, I frequently come across the need to do
> certain tasks based on combined conditions.
>
> Much of the task for all the sub-conditions are common but some are
> specific to one or more of these sub-conditions.
>
> A simplified example:
>
> ## Code ##
> if (color == BLUE and count == 20) or (color == RED and count % 5 == 0):
>     rotate_the_wheel()  # Common to the two sub-conditions
>     if (color == BLUE and count == 20):  # First sub-condition
>         set_signal()
>     if (color == RED and count % 5 == 0):  # Second sub-condition
>         clear_signal()
>     proc_post_rotate()  # Common to the two sub-conditions
>
> I am not sure if there is a better way to do this. If not, maybe there
> can be an extension to the language, which would allow marking a
> sub-condition just like we mark a sub-expression in a regular
> expression.

I would have thought that simply naming the sub-conditions would be
sufficient:

    blue_20 = (color == BLUE and count == 20)
    red_5 = (color == RED and count % 5 == 0)

    if blue_20 or red_5:
        rotate_the_wheel()  # Common to the two sub-conditions
        if blue_20:  # First sub-condition
            set_signal()
        if red_5:  # Second sub-condition
            clear_signal()
        proc_post_rotate()  # Common to the two sub-conditions

I don't know how experienced you are with Python programming, but if
you had framed your question as "how do I modify this code to avoid
repeating the conditions?" you would likely have been given this advice
on the python-list mailing list, or other similar Python programming
help resources. Starting with a proposed language change before you've
explored the existing options isn't likely to be the best approach (and
exploring them would likely have meant you could resolve your issue
without needing to bring it to python-ideas at all).

Paul
--
https://mail.python.org/mailman/listinfo/python-list
Re: Sandboxing eval() (was: Calculator)
On Sun, 19 Jan 2020 at 17:45, wrote:
>
> Is it actually possible to build a "sandbox" around eval, permitting it
> only to do some arithmetic and use some math functions, but no
> filesystem access or module imports?

No. This has been tried before, and it simply isn't safe in the face of
malicious input.

> I have an application that loads calculation recipes (a few lines of
> variable assignments and arithmetic) from a database.
>
>     exec(string, globals, locals)
>
> with locals containing the input variables, and globals has a
> __builtin__ object with a few math functions. It works, but is it safe?

If you trust the source, it's OK, but a creative attacker who had the
ability to create a recipe could execute arbitrary code. If you require
safety, you really need to write your own parser/evaluator.

Paul
--
https://mail.python.org/mailman/listinfo/python-list
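For anyone wondering what "write your own parser/evaluator" might look
like: one common approach is to parse the expression with the stdlib
ast module and walk the tree, allowing only a whitelist of node types.
A minimal sketch (Python 3.8+, numbers and basic arithmetic only -
extending it to a whitelist of math functions is left as an exercise):

    import ast
    import operator

    # Whitelisted binary and unary operators
    _BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
                ast.Mult: operator.mul, ast.Div: operator.truediv,
                ast.Pow: operator.pow, ast.Mod: operator.mod}
    _UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

    def safe_eval(expr):
        """Evaluate a purely arithmetic expression, rejecting anything else."""
        def _eval(node):
            if isinstance(node, ast.Expression):
                return _eval(node.body)
            if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                return node.value
            if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
                return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
            if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
                return _UNARY_OPS[type(node.op)](_eval(node.operand))
            raise ValueError("disallowed syntax: %r" % node)
        return _eval(ast.parse(expr, mode="eval"))

    print(safe_eval("2 + 3 * (4 - 1)"))  # 11

Because the walker refuses any node type it doesn't recognise, things
like attribute access, subscripting and function calls are rejected
outright, which is what closes the usual escape routes out of eval().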
Re: Relative import cannot find .so submodule?
Did you build the extension as a debug build, but you're trying to use
it in a release build of Python? That may not work (it may depend on
the OS - I don't know the compatibility details on macOS)...

On Mon, 13 Jan 2020 at 16:02, Pieter van Oostrum wrote:
>
> Patrick Stinson writes:
>
> > I have a module named rtmidi, and its C submodule named
> > rtmidi/_rtmidi. The distutils script builds successfully and
> > successfully creates a build/lib dir with a rtmidi dir in it and the
> > submodule file rtmidi/_rtmidi.cpython-36dm-darwin.so. I have set
> > PYTHONPATH to this lib dir, but rtmidi/__init__.py gives the
> > following error:
> >
> > Traceback (most recent call last):
> >   File "main.py", line 6, in <module>
> >     from pkmidicron import MainWindow, util, ports
> >   File "/Users/patrick/dev/pkmidicron/pkmidicron/__init__.py", line 1, in <module>
> >     from .mainwindow import *
> >   File "/Users/patrick/dev/pkmidicron/pkmidicron/mainwindow.py", line 2, in <module>
> >     import rtmidi
> >   File "/Users/patrick/dev/pkmidicron/pyrtmidi/build/lib/rtmidi/__init__.py", line 1, in <module>
> >     from ._rtmidi import *
> > ModuleNotFoundError: No module named 'rtmidi._rtmidi'
> >
> > How does the module finder work in the import system? I assume it
> > automatically resolves the name _rtmidi.cpython-36dm-darwin.so to
> > _rtmidi? I didn't have any luck reading the docs on the import system.
>
> Are you running python 3.6?
>
> I tried this on python 3.7 and it worked, but the file is called
> _rtmidi.cpython-37m-darwin.so there (37 for python3.7, and the d is
> missing, I don't know what that indicates).
> --
> Pieter van Oostrum
> www: http://pieter.vanoostrum.org/
> PGP key: [8DAE142BE17999C4]
> --
> https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
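For context: the "d" in the cpython-36dm suffix marks a debug
(--with-pydebug) build, and the "m" is the old pymalloc ABI flag
(dropped in Python 3.8). You can check which suffixes your own
interpreter will accept for extension modules - a quick sketch:

    import importlib.machinery
    import sysconfig

    # e.g. '.cpython-37m-darwin.so' on a release 3.7 on macOS
    print(sysconfig.get_config_var('EXT_SUFFIX'))
    # All suffixes the import system will accept for extension modules
    print(importlib.machinery.EXTENSION_SUFFIXES)

If the .so's tag isn't in that list, the import system will simply not
see the file, which produces exactly the ModuleNotFoundError quoted
above.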
Re: Python3 - How do I import a class from another file
On Tue, 10 Dec 2019 at 21:12, R.Wieser wrote:
> And although you have been fighting me over when the __del__ method is
> called, it /is/ called directly as a result of an "del instance" and
> the refcount goes zero. There is /no/ delay. (with the only exception
> is when a circular reference exists).

You do understand that the reference counting garbage collector is an
implementation detail of the CPython implementation *only*, don't you?
The (implementation independent) language semantics makes no assertion
about what form of garbage collection is used, and under other garbage
collectors, there can be an indefinite delay between the last reference
to a value being lost and the object being collected (which is when
__del__ gets called).

There is not even a guarantee that CPython will retain the reference
counting GC in future versions. Removing it would be a big change, but
not impossible.

If all you are interested in is the semantics of the current CPython
release, then your statements are true. But why would anyone here know
that you were looking at the situation from such a limited perspective?
Your "logic" seems to be full of hidden assumptions and unstated
qualifications. And your attitude seems to be confrontational and
aggressive. Frankly, it's unlikely that you're going to learn much
without a change in your approach.

Paul
--
https://mail.python.org/mailman/listinfo/python-list
Re: keying by identity in dict and set
On Mon, 28 Oct 2019 at 10:01, Steve White wrote:
>
> Hi Chris,
>
> I'm afraid you've missed my point. As I said in the initial post, I
> have read the documentation.
>
> I think the documentation does not adequately explain how the
> hashtable (or hashtables generally) work internally.

As stated in the docs, "The only required property is that objects
which compare equal have the same hash value". The docs do *not*
explain how dictionaries work internally, and that's deliberate. You
should not rely on the internal workings of dictionaries, because your
code may then not work on other Python implementations (or even on
later versions of CPython).

Of course, if you're only interested in working on CPython, then you
can rely on the details of the dictionary implementation, but in that
case you're expected to look at the implementation (and be prepared for
it to change over time!)

Paul
--
https://mail.python.org/mailman/listinfo/python-list
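The quoted rule is the whole contract: if you want dict keys matched by
identity rather than by equality, wrap the object explicitly rather
than relying on dict internals. A minimal sketch (the IdentityKey name
is made up for illustration, not from the thread):

    class IdentityKey:
        """Wrap an object so dict/set lookups use 'is', not '=='."""
        __slots__ = ('obj',)
        def __init__(self, obj):
            self.obj = obj
        def __eq__(self, other):
            return isinstance(other, IdentityKey) and self.obj is other.obj
        def __hash__(self):
            return id(self.obj)

    a, b = [1, 2], [1, 2]     # equal but distinct (and unhashable) lists
    d = {IdentityKey(a): 'a', IdentityKey(b): 'b'}
    print(len(d))             # 2 - equal values, different identities
    print(d[IdentityKey(a)])  # 'a'

Note that the contract still holds: two wrappers compare equal only
when they hold the same object, in which case id() - and hence the
hash - is necessarily the same.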
Re: installation problem
On Thu, 24 Oct 2019 at 00:50, fateme jbr wrote: > > Dear Python team > > I have installed Python 3.7.4 on windows 10. I have access to IDLE and I > can run simple programs, but when I type python in command window nothing > happens. I wanna install pip and afterward some libraries and it is when > the problem occurs. > > why doesn't prompt window recognize python. What shall I do? You probably didn't choose "add Python to your PATH" when installing (it's an option, that's off by default). You can manually add Python to your PATH (if you don't know how to do this, there are plenty of resources that can help - it's a fairly basic thing you need to be aware of if you're using the command line), or you can reinstall, or "repair" your installation and select the option. Or you can use the Python launcher, py.exe, which is on PATH and which will launch Python for you: * Run python: py * Run pip: py -m pip Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Black
IMO, if you care enough to not like black's formatting choices, you probably shouldn't use it. The point of black is to *not* care about formatting, but leave the decisions to the tool. If you're not doing that, then it's probably the wrong tool for you. There may be other code formatters that are more configurable - I honestly don't know - and which therefore may be more useful to you once configured to your liking. But black is (as I understand it) about *not* worrying about formatting choices, so you can get on to other more important decisions. Personally, I don't really like some of black's choices. So I only use it on projects where I'm tired of worrying about formatting and readability and just want something that's OK. Over time, I'm finding that's true of more and more of my projects :-) Regarding your question on spaces, I like some of your spacing choices, but not all of them. So we could have a debate over precisely which spaces I agree with and which I don't. But I'm not really interested in doing that. Black would probably help if we had to work together, but unless we do, you can do what you want and I won't mind :-) Paul On Mon, 21 Oct 2019 at 15:22, wrote: > > What do people think about black? > > I'm asking because one of my personal preferences is to use spaces for > clarity: > > 1. right = mystr[ start : ] > > black version right=mystr[start:] > > 2. mtime = time.asctime( time.localtime( info.st_mtime ) ) > > black version mtime = time.asctime(time.localtime(info.st_mtime)) > > Is there a reason why PEP8 doesn't like these spaces? > -- > https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
Re: Curious about library inclusion
Probably low. There would need to be a clear justification as to why
having the library in the stdlib (and hence tied to Python's release
schedule for updates/bug fixes etc - which is typically a really severe
limitation for a newish library) would be better than having it
available as a 3rd party library on PyPI. It would also mean that it
would only be available in Python 3.9+, unless a parallel backport
module was maintained on PyPI, making the argument for inclusion in the
stdlib even weaker.

Basically, someone needs to explain the benefits of having *this*
library in the stdlib, and demonstrate that they are compelling.
(Generic arguments like "being in the stdlib means no need for a 3rd
party dependency" don't count because they would apply to any library
on PyPI equally...) That's a pretty high bar to reach. Not impossible,
by any means, but it needs a lot more than "this is a neat library".

As another measure, look at various other libraries on PyPI and ask
yourself why *this* library needs to be in the stdlib more than those
others. The answer to that question would be a good start for an
argument to include the library.

Paul

On Thu, 10 Oct 2019 at 11:37, Antoon Pardon wrote:
>
> That seems to have been thoroughly garbled so I start again.
>
> I was wondering how likely it would be that piped iterators like shown
> in
> http://code.activestate.com/recipes/580625-collection-pipeline-in-python/
> would make it into a future python version? Once I started using them
> (and included some more) I found I really liked using them. For
> instance I used to write the following a lot:
>
>     for line in some_file:
>         line = line.strip()
>         lst = line.split(':')
>         do_stuff(lst)
>
> Now I seem to drift into writing:
>
>     for lst in some_file | Strip | Split(':'):
>         do_stuff(lst)
>
> where Strip and Split are defined as follows:
>
>     Strip = Map(methodcaller('strip'))
>
>     def Split(st):
>         return Map(methodcaller('split', st))
>
> I also found that Apply can be used as a decorator, that easily turns
> a generator into a piped version. So what are the odds?
>
> --
> Antoon.
>
> --
> https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
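For the curious: the pipeline trick in the quoted example needs no
language support at all - it just overloads the | operator via __ror__.
A minimal sketch of a Map that works like the one quoted (illustrative
only, not the recipe's exact code):

    from operator import methodcaller

    class Map:
        """Right-hand side of 'iterable | Map(func)' pipelines."""
        def __init__(self, func):
            self.func = func
        def __ror__(self, iterable):   # invoked for: iterable | self
            return (self.func(item) for item in iterable)

    Strip = Map(methodcaller('strip'))

    def Split(sep):
        return Map(methodcaller('split', sep))

    lines = [' a:b \n', ' c:d \n']
    for lst in lines | Strip | Split(':'):
        print(lst)   # ['a', 'b'] then ['c', 'd']

Because lists and generators don't define __or__ for these operands,
Python falls back to Map.__ror__, which is what makes the left-to-right
chaining work.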
Re: Python 3.6 on Windows - does a python3 alias get created by installation?
No, the Windows builds do not provide versioned executables (python3.exe or python3.7.exe). Generally, the recommended way to launch Python on Windows is via the py.exe launcher (py -3.7, or just py for the default), but if you have Python on your PATH then python works. The reason pip has versioned executables is because that's how pip defines its entry points. It's cross-platform and unrelated to the conventions the Python core installers follow. Yes, it's all a bit confusing :-) Paul On Wed, 9 Oct 2019 at 17:37, Malcolm Greene wrote: > > I'm jumping between Linux, Mac and Windows environments. On Linux and Mac we > can invoke Python via python3 but on Windows it appears that only python > works. Interestingly, Windows supports both pip and pip3 flavors. Am I > missing something? And yes, I know I can manually create a python3 alias by > copying python.exe to python3.exe but that approach has its own set of > nuances on locked down servers plus the hassle of keeping these python3 > copies up-to-date across Python updates. > > Also curious: Do the Windows versions of Python 3.7 and 3.8 provide a python3 > alias to start Python? > > Thanks! > Malcolm > -- > https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
Re: Announcing config-path an OS independent configuration file paths
On Mon, 30 Sep 2019 at 15:51, Barry Scott wrote:
>
> > On 30 Sep 2019, at 14:17, Paul Moore wrote:
> >
> > How does this compare to the existing appdirs module on PyPI?
>
> I did not know about appdirs.

Fair enough ;-)

> It does not seem to have separate read vs. save paths.
> Required for the XDG specification, where a path of config folders is
> defined.
>
> On 1st run the config may be in a system directory which the user
> cannot write to.
> Saving a config file will go into the user's config file.
> On the 2nd run that user's config should be read, not the system
> config.
>
> appdirs covers the location of more types of files, like cache and
> data.

Interesting - thanks for the comparison, I'll take a detailed look next
time I need this type of functionality.

Paul
--
https://mail.python.org/mailman/listinfo/python-list
Re: Announcing config-path an OS independent configuration file paths
How does this compare to the existing appdirs module on PyPI?

On Mon, 30 Sep 2019 at 13:15, Barry Scott wrote:
>
> See https://pypi.org/project/config-path/ for documentation.
>
> Install using pip:
>
>     python -m pip install config-path
>
> I write tools that run on macOS, Unix and Windows systems. Each
> operating system has its own conventions on where config files should
> be stored. Following the conventions is not always straightforward.
>
> After a question raised here on Python users about config file
> locations I was inspired to write this module to help with that
> problem.
>
> The config_path library works out which path to use for configuration
> folders and files in an operating-system-independent way.
>
> Each operating system has particular conventions for where an
> application is expected to store its configuration. The information
> provided to ConfigPath is used to figure out an appropriate file path
> or folder path for the application's configuration data.
>
> Supports Windows, macOS and most unix systems using the 'XDG Base
> Directory Specification'.
>
> Example for the "widget" app from "example.com" that uses JSON for its
> config:
>
>     from config_path import ConfigPath
>     conf_path = ConfigPath( 'example.com', 'widget', '.json' )
>
>     path = conf_path.saveFilePath( mkdir=True )
>     with path.open( 'w' ) as f:
>         f.write( config_data )
>
> And to read the config:
>
>     path = conf_path.readFilePath()
>     if path is not None:
>         # path exists and config can be read
>         with path.open() as f:
>             config = json.load( f )
>     else:
>         config = default_config
>
> Barry
>
> --
> https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
Re: TypeError: decoding str is not supported
On Sat, 28 Sep 2019 at 10:53, Peter Otten <__pete...@web.de> wrote:
>
> Hongyi Zhao wrote:
>
> > Hi,
> >
> > I have some code comes from python 2 like the following:
> >
> > str('a', encoding='utf-8')
>
> This fails in Python 2
>
> >>> str("a", encoding="utf-8")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: str() takes at most 1 argument (2 given)
>
> ...unless you have redefined str, e. g. with
>
> >>> str = unicode
> >>> str("a", encoding="utf-8")
> u'a'
>
> > But for python 3, this will fail as follows:
> >
> > str('a', encoding='utf-8')
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > TypeError: decoding str is not supported
> >
> > How to fix it?
>
> Don't try to decode an already decoded string; use it directly:
>
> "a"

To explain a little further, one of the biggest differences between
Python 2 and Python 3 is that in Python 3 you *have* to be clear on
which data is encoded byte sequences (which need a decode to turn them
into text strings, but cannot be encoded, because they already are) and
which are text strings (which don't need to be, and can't be, decoded,
but which can be encoded if you want to get a byte sequence).

If you're not clear whether some data is a byte string or a text
string, you will get in a muddle, and Python 2 won't help you (but it
will sometimes produce mojibake without generating an error) whereas
Python 3 will tend to throw errors flagging the issue (but it may
sometimes be stricter than you are used to).

Thinking that saying `str = unicode` is a reasonable thing to do is a
pretty strong indication that you're not clear on whether your data is
text or bytes - either that, or you're hoping to make a "quick fix".
But as you've found, quick fixes tend to result in a cascade of further
issues that *also* need quick fixes.

The right solution here (and by far the cleanest one) is to review your
code as a whole, and have a clear separation between bytes data and
text data. The usual approach people use for this is to decode bytes
into text as soon as they're read into your program, and only ever use
genuine text data within your program - so you should only ever be
using encode/decode in the I/O portion of your application, where it's
pretty clear when you have encoded bytes coming in or going out.

Hope this helps,
Paul
--
https://mail.python.org/mailman/listinfo/python-list
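A tiny illustration of the "decode at the boundary" pattern described
above (a sketch - the filenames are made up):

    # I/O boundary: bytes come in, decode them exactly once
    with open('data.bin', 'rb') as f:
        raw = f.read()              # bytes
    text = raw.decode('utf-8')      # str from here on

    # ... all internal processing works on str ...
    result = text.upper()

    # I/O boundary: encode exactly once on the way out
    with open('out.bin', 'wb') as f:
        f.write(result.encode('utf-8'))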
Re: How to get only valid python package index
On Mon, 23 Sep 2019 at 19:15, Vijay Kumar Kamannavar wrote: > > Hellom > > As per https://pypi.org/simple/ we have ~2,00,000 packages. i feel there > are lot of packages found to be dummy/Experimental. Where can we get the > properly maintained package list for python? There is no "properly maintained package list" in the sense that I suspect you mean it, i.e. a curated list where the maintainers guarantee a particular level of quality or support for the available packages. PyPI is an open index and anyone can register an account and upload packages, without restriction. > If not, atleast please let me know what kind of strategy i shoud use to > decide whether package is valid or not? The responsibility for reviewing and assessing the quality of packages lies with the user, so you'll need to assess each package for yourself, in much the same way that you would assess any other open source package - you can look at existing code, blog posts or articles to get a sense of what packages are considered good, or "best of breed", or you can assess the code and documentation against whatever standards you wish to apply. It shouldn't take long if you read some articles to get a sense of some of the more well-known packages (things like requests, numpy, pandas, matplotlib, django, ...) but what is best for you depends entirely on what you are trying to do with Python. Hope this helps, Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: For the code to generate `zen of python'.
It's basically a decryption of the string s in the same module, which
is encoded using the ROT13 algorithm -
https://en.wikipedia.org/wiki/ROT13. This isn't meant to be secure;
it's basically a little bit of fun obfuscating the actual text.

The code creates a dictionary mapping encoded characters to their
decoded equivalents. Decoding is done by adding 13 to the letter's
position in the alphabet (the i, wrapping round from 25 back to 0 via
the % 26), where 65 and 97 are the ASCII codes for 'A' and 'a'. That's
about it, really.

Paul

On Wed, 7 Aug 2019 at 14:17, Hongyi Zhao wrote:
>
> Hi here,
>
> I noticed that the `zen of python' is generated by the following code:
>
>     d = {}
>     for c in (65, 97):
>         for i in range(26):
>             d[chr(i+c)] = chr((i+13) % 26 + c)
>
>     print("".join([d.get(c, c) for c in s]))
>
> But the above code is not so easy for me to figure out. Could someone
> please give me some explanations on it?
>
> Regards
> --
> .: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.
> --
> https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
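Incidentally, Python 3 ships a ROT13 text transform in the codecs
module, so the whole table-building dance can be replaced by a
one-liner - a quick check in the interpreter:

    import codecs
    print(codecs.encode("Orngvshy vf orggre guna htyl.", "rot13"))
    # Beautiful is better than ugly.

(ROT13 is its own inverse, so encoding and decoding are the same
operation.)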
Re: installing of python
(a) By default, if you're using a user install, Python is installed to
%LOCALAPPDATA%\Programs\Python.

(b) This list is text-only, so the screenshot didn't appear - I'm
therefore only guessing what your issue is here.

(c) Does the command

    py -V

work? That should run Python and give the version number.

(d) If you did a default install, Python is not added to your user
PATH, so you need to use the "py" launcher as I showed in (c) above. If
you want Python adding to PATH, you need to specify that when
installing (or manually add it to your PATH afterwards).

Paul

On Mon, 17 Jun 2019 at 10:31, jaydip rajpara wrote:
>
> this pic of my c drive after installing python 3.7.2. No python folder
> generated
> --
> https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
Re: Implementing C++'s getch() in Python
On Sat, 25 May 2019 at 12:12, wrote: > > I'm working on Python 3.7 under Windows. I need a way to input characters > without echoing them on screen, something that getch() did effectively in > C++. I read about the unicurses, ncurses and curses modules, which I was not > able to install using pip. > > Is there any way of getting this done? On Windows, the msvcrt module exposes getch: https://docs.python.org/3.7/library/msvcrt.html#msvcrt.getch -- https://mail.python.org/mailman/listinfo/python-list
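A minimal usage sketch for the pointer above - note that on Python 3,
msvcrt.getch() returns a bytes object (a single byte), so decode it if
you want a str:

    import msvcrt

    print("Press any key (it will not be echoed)...")
    ch = msvcrt.getch()        # blocks until a key is pressed, no echo
    print("You pressed:", ch.decode('ascii', errors='replace'))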
Re: What does <generator object <genexpr> at 0x0402C7B0> mean?
On Wed, 22 May 2019 at 02:45, Terry Reedy wrote:
>
> On 5/21/2019 9:11 PM, CrazyVideoGamez wrote:
> > I tried doing a list comprehension. I typed:
> >
> > favorite_fruits = ['watermelon', 'blackberries']
> > print(fruit for fruit in favorite_fruits)
> >
> > And I got:
> >
> > <generator object <genexpr> at 0x0402C7B0>
> >
> > What does this mean
>
> It means that the expression (fruit for fruit in favorite_fruits)
> evaluates to a generator object.
>
> > and what do I have to fix?
>
> Perhaps you wanted to run the generator, perhaps like this:
>
> >>> favorite_fruits = ['watermelon', 'blackberries']
> >>> print(*(fruit for fruit in favorite_fruits))
> watermelon blackberries

Or maybe you just wanted a loop?

    for fruit in favorite_fruits:
        print(fruit)

Generator and list comprehension syntax is a somewhat more advanced
feature of Python (not *very* advanced, particularly in the case of
list comprehensions, but enough so to be tricky for a newcomer). I
don't know how much experience the OP has with Python, but as a general
rule, it's always better to stick with the simplest approach that does
what you want, and statements like loops are simpler than complex
one-line expressions (even if the complex one-liners make you look cool
;-))

Paul
--
https://mail.python.org/mailman/listinfo/python-list
Re: Why Python has no equivalent of JDBC of Java?
On Tue, 21 May 2019 at 13:50, Adriaan Renting wrote: > > > I think it's partially a design philosophy difference. > > Java was meant to be generic, run anywhere and abstract and hide > differences in its underlying infrastructure. This has led to the Java > VM, and also JDBC I guess. > > Python was more of a script interpreted C-derivative, much closer to > the bare metal, and thus much less effort was made to hide and > abstract. In practice, I think it was more to do with the "Pure Java" philosophy/movement, which resulted in a lot of investment into reinventing/re-implementing code in Java - in this particular case, the network protocols that database clients and servers use to communicate. Because a commercial product like Oracle doesn't document those protocols, open-source reimplementations are hard, if not impossible. The Java drivers for Oracle are supplied by Oracle themselves - Oracle could also provide pure-Python implementations of the protocols, but they don't - so Python interfaces have to rely on the libraries Oracle *do* provide. The same is true of other database interfaces - although in the case of open source databases it *is* possible to implement the protocol in pure Python. It's just far less convenient when interfacing to the existing C libraries is pretty straightforward. For Java interfaces, linking to "native" libraries is more complex, and generally frowned on, so there's pressure to implement a "pure Java" solution. Not having to manage native binaries is a big advantage Java has, certainly. But conversely, it's meant that building the Java ecosystem required a massive amount of effort. Luckily, commercial interests have paid for much of that effort and have made the resulting libraries freely available. Who knows where we would be if Python had received a similar level of investment :-) Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: How to concatenate strings with iteration in a loop?
On Tue, 21 May 2019 at 09:25, Frank Millman wrote:
>
> On 2019-05-21 9:42 AM, Madhavan Bomidi wrote:
> > Hi,
> >
> > I need to create an array as below:
> >
> > tempStr = year+','+mon+','+day+','+str("{:6.4f}".format(UTCHrs[k]))+','+ \
> >     str("{:9.7f}".format(AExt[k,0]))+','+str("{:9.7f}".format(AExt[k,1]))+','+ \
> >     str("{:9.7f}".format(AExt[k,2]))+','+str("{:9.7f}".format(AExt[k,3]))+','+ \
> >     str("{:9.7f}".format(AExt[k,4]))+','+str("{:9.7f}".format(AExt[k,5]))+','+ \
> >     str("{:9.7f}".format(AExt[k,6]))+','+str("{:9.7f}".format(AExt[k,7]))+','+ \
> >     str("{:9.7f}".format(AExt[k,8]))+','+str("{:9.7f}".format(AExt[k,9]))
> >
> > k is a row index
> >
> > Can some one suggest me how I can iterate the column index along with
> > row index to concatenate the string as per the above format?
> >
> > Thanks in advance
>
> The following (untested) assumes that you are using a reasonably
> up-to-date Python that has the 'f' format operator.
>
>     tempStr = f'{year},{mon},{day},{UTCHrs[k]:6.4f}'
>     for col in range(10):
>         tempStr += f',{AExt[k, col]:9.7f}'

As a minor performance note (not really important with only 10 items,
but better to get into good habits from the start):

    temp = [f'{year},{mon},{day},{UTCHrs[k]:6.4f}']
    for col in range(10):
        temp.append(f',{AExt[k, col]:9.7f}')
    tempStr = ''.join(temp)

Repeated concatenation of immutable strings (which is what Python has)
is O(N**2) in the number of chunks added, because of the need to
repeatedly copy the string.

Paul
--
https://mail.python.org/mailman/listinfo/python-list
Re: Conway's game of Life, just because.
Golly <http://golly.sourceforge.net/> supports both bounded and
unbounded universes. I don't know how it does it, and (obviously) it
will hit limits *somewhere*, but I consider support of unbounded
universes to mean that any failure modes will not be attributable to
limits on the value of co-ordinates (for the pedants out there,
ignoring issues such as numbers too big to represent in memory...)

It does limit *editing* of patterns to co-ordinates below a billion (to
quote the "Known Limitations" page). But that's a distinct issue (I
guess related to the GUI toolkit being used).

Disclaimer: I've only made very light use of Golly; most of the above
is inferred from the manual and reports of how others have used it.

Paul

On Wed, 8 May 2019 at 12:38, Richard Damon wrote:
>
> On 5/8/19 4:26 AM, Paul Moore wrote:
> > On Wed, 8 May 2019 at 03:39, Richard Damon wrote:
> >> My experience is that the wrap around is common, as otherwise the
> >> hard edge causes a discontinuity in the rules at the edge, so any
> >> pattern that reaches the edge no longer has a valid result. The
> >> torus effect still perturbs the result, but that perturbation is
> >> effectively that the universe was tiled with an infinite grid of
> >> the starting pattern, so represents a possible universe.
> > In my experience, "simple" implementations that use a fixed array
> > often wrap around because the inaccuracies (compared to the correct
> > infinite-area result) are less disruptive for simple examples. But
> > more full-featured implementations that I've seen don't have a fixed
> > size. I assume they don't use a simple array as their data model, but
> > rather use something more complex, probably something that's O(number
> > of live cells) rather than something that's O(maximum co-ordinate
> > value ** 2).
> >
> > Paul
>
> An implementation that creates an infinite grid to work on doesn't
> need to worry about what happens on the 'edge' as there isn't one.
>
> I suspect an implementation that makes an effectively infinite grid
> might not either, though it may include code to try and keep the
> pattern roughly 'centered' to keep away from the dragons at the edge.
>
> If, as is likely with a 'high efficiency' language with fixed sized
> integers, the coordinates wrap around (max_int + 1 => min_int) then it
> naturally still is a torus, though processing that case may add
> complexity to keep the computation of a typical cell O(1). You might
> end up with numbers becoming floating point, where some_big_number + 1
> -> the same some_big_number, which would lead to issues with the
> stability of the data structure, so maybe some hard size limit is
> imposed to prevent that.
>
> So it comes to this: if there is an edge that you might see, the
> normal processing is to wrap to make the edge less disruptive. If you
> aren't apt to see the edge, then it really doesn't matter how it
> behaves (sort of like how people aren't concerned about the Y10k
> issue)
>
> --
> Richard Damon
>
> --
> https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
Re: Conway's game of Life, just because.
On Wed, 8 May 2019 at 03:39, Richard Damon wrote: > My experience is that the wrap around is common, as otherwise the hard > edge causes a discontinuity in the rules at the edge, so any pattern > that reaches the edge no longer has a valid result. The torus effect > still perturbs the result, but that perturbation is effectively that the > universe was tiled with an infinite grid of the starting pattern, so > represents a possible universe. In my experience, "simple" implementations that use a fixed array often wrap around because the inaccuracies (compared to the correct infinite-area result) are less disruptive for simple examples. But more full-featured implementations that I've seen don't have a fixed size. I assume they don't use a simple array as their data model, but rather use something more complex, probably something that's O(number of live cells) rather than something that's O(maximum co-ordinate value ** 2). Paul -- https://mail.python.org/mailman/listinfo/python-list
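A sketch of the O(number of live cells) representation mentioned above:
the universe is just a set of live (x, y) co-ordinates, so there is no
edge at all (this is illustrative, not how any particular program such
as Golly actually does it):

    from collections import Counter

    def step(live):
        """One Life generation; 'live' is a set of (x, y) tuples."""
        neighbours = Counter(
            (x + dx, y + dy)
            for x, y in live
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        )
        # Birth on exactly 3 neighbours; survival on 2 or 3
        return {cell for cell, n in neighbours.items()
                if n == 3 or (n == 2 and cell in live)}

    glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
    for _ in range(4):
        glider = step(glider)
    print(sorted(glider))   # the glider, moved one cell diagonally

With arbitrary-precision Python ints for the co-ordinates, the
wrap-around question simply never arises.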
Re: Change in cache tag in Python 3.8 - pip confused
On Tue, 7 May 2019 at 22:26, Chris Angelico wrote:
> So the next question is: Is this actually a problem? If it's something
> that can only ever happen to people who build prerelease Pythons, it's
> probably not an issue. Is there any way that a regular installation of
> Python could ever change its cache_tag? What I experienced would be
> extremely confusing if it ever happened live ("but it IS installed, no
> it ISN'T installed...").

I'm probably not the person to answer, because I know *way* too much
about how pip works to judge what's confusing :-) However, I'd say
that:

1. It's mostly about what core Python/distutils does (compiled module
names, and cache tags, are core language things - see PEP 3147 for the
details).

2. In a normal release, Python wouldn't break compatibility like this,
so this would be a non-issue in a standard x.y.z Python release.

3. Python doesn't guarantee ABI compatibility across minor releases,
and the site-packages locations would normally be different anyway (I
know they are on Windows, and I assume they would be on Unix) so you
won't have this issue in anything other than X.Y.Z and X.Y.W (where W
!= Z). I'd normally recommend reinstalling everything when changing a
point release too, but I know not everyone does that. ABI compatibility
between point releases is maintained, though, so it shouldn't be a
problem.

Short answer: it's only ever going to be a problem for people building
their own Python from git, and they are probably going to know enough
to debug the issue themselves (even if, like you, they find it hard to
work out what happened...)

> When pip installed python-lzo from source, it must have constructed
> the file name somehow. I've tried digging through pip's sources, but
> searching for "cache_tag" came up blank, and I'm not sure where it
> actually figures out what name to save the .so file under. There's a
> lot of code in pip, though, and I freely admit to having gotten quite
> lost in the weeds :|

The build is done in distutils (or maybe setuptools), so that's where
you should probably look. Pip just gets a bunch of files from
setuptools and installs them, at least for up-to-date projects and
versions of pip - it's the "PEP 517" behaviour, if you're into recent
packaging standards. For older pip versions, or projects not updated to
recent standards, pip just says "hey, setuptools, install this stuff
and tell me the names of all the files you created".

Good luck digging into the setuptools/distutils code - take rations,
torches and emergency flares, it's dangerous in there :-)

Paul
--
https://mail.python.org/mailman/listinfo/python-list
Re: Change in cache tag in Python 3.8 - pip confused
On Tue, 7 May 2019 at 21:53, Chris Angelico wrote: > > I've been building Python 3.8 pre-alphas and alphas for a while now, > and ran into a weird problem. Not sure if this is significant or not, > and hoping to get other people's views. > > It seems that the value of sys.implementation.cache_tag changed from > "cpython-38m" to just "cpython-38" at some point. That means that a > statement like "import lzo" will no longer find a file called > "lzo.cpython-38m-x86_64-linux-gnu.so", which I had had prior to the > change. The trouble is, pip *did* recognize that file, and said that > the python-lzo package was still installed. > > Solution: "pip uninstall python-lzo" and then reinstall it. Now it's > created "lzo.cpython-38-x86_64-linux-gnu.so" and all is well. > > Does anyone else know about how pip detects existing files, and > whether it could be brought more in sync with the import machinery? I don't know if this is what you were after, but pip decides if a project FOO is installed by looking for the FOO-X.Y.Z.dist-info directory in site-packages. The RECORD file in that directory contains a list of all files installed when the package was originally installed (that's what pip uses to decide what to uninstall). Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: pip as an importable module?
Pip doesn't have a programming API, so no, you can't do this. Having said that, it looks like pip-conflict-checker only uses `pip.get_installed_distributions`, and setuptools (pkg_resources, specifically) has an API that does this, so you could probably relatively easily fix the code to use that. The stuff here is probably what you want: https://setuptools.readthedocs.io/en/latest/pkg_resources.html#getting-or-creating-distributions Paul On Wed, 1 May 2019 at 15:05, Skip Montanaro wrote: > > I'm trying to get pip-conflict-checker working with Python 3.6: > > https://github.com/ambitioninc/pip-conflict-checker > > It would seem that the pip API has changed since this tool was last > updated, as what's there appears only to be used in support of pip as > a command line tool. Is there something comparable which is being > maintained, or a description of how to adapt to the new > pip-as-a-module API? > > Thx, > > Skip > -- > https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
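For reference, a quick sketch of the pkg_resources equivalent pointed
at above (this is real pkg_resources API, though the tool's actual
conflict-checking logic is not reproduced here):

    import pkg_resources

    # Roughly what pip.get_installed_distributions() used to provide
    for dist in pkg_resources.working_set:
        print(dist.project_name, dist.version)
        # Each requirement this distribution declares
        for req in dist.requires():
            print("   requires:", req)

Cross-checking each declared requirement against the installed
versions is then a straightforward loop over that data.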
Re: pip unable to find an extension of a lib.
From https://pypi.org/project/PyQt5-sip/#files, it looks like the
project only distributes wheels (no source archive). You don't say what
platform you're using, but if it's Linux, the fact that you have a
debug Python probably means you need a different ABI, so the standard
wheels that they provide aren't compatible. So if there's no compatible
wheel and no source, there's nothing that can be installed.

    pip install -v pyqt5

might give you some more information about why the files available are
not being considered as suitable, but I think the above is likely.

Paul

On Fri, 12 Apr 2019 at 06:44, dieter wrote:
>
> Vincent Vande Vyvre writes:
> > ...
> > Using Python-3.7.2 (compiled with --with-pydebug) in a venv, I've
> > encountered this problem:
> >
> > $ pip install --upgrade pip setuptools wheel
> >
> > Successfully installed setuptools-41.0.0 wheel-0.33.1
> > ---
> > ...
> > $ pip install pyqt5
> > ...
> > Collecting PyQt5_sip<4.20,>=4.19.14 (from pyqt5)
> > Could not find a version that satisfies the requirement
> > PyQt5_sip<4.20,>=4.19.14 (from pyqt5) (from versions: )
> > No matching distribution found for PyQt5_sip<4.20,>=4.19.14 (from pyqt5)
> > ...
> > $ pip search pyqt5
> > ...
> > PyQt5-sip (4.19.15) - Python extension module support
>
> There is a spelling difference: "PyQt5_sip" versus "PyQt5-sip" --
> not sure, however, whether this is important.
>
> --
> https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
Re: fs package question - cp -p semantics when copying files?
On Wed, 3 Apr 2019 at 16:06, Skip Montanaro wrote: > > > From a brief look at the docs, there's an on_copy callback to copy_fs. > > Maybe you could use the getinfo/setinfo methods to copy over the > > timestamps and any other file metadata that you want in that callback? > > Yes, I had gotten to that point when I first posted to the > PyFilesystem Google Group. I had tried to figure things out before > posting, but hadn't deciphered the docs, source, or test functions. It > seems clear I need to generate a dictionary which maps "namespaces" to > particular values, but the docs are not clear on what the values are. > In particular, it seems the values have different structure for > different aspects of the overall set of attributes I want to modify. Yeah, the getinfo/setinfo stuff confused me too. But I thought it might be worth mentioning in case you hadn't spotted it. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: fs package question - cp -p semantics when copying files?
On Wed, 3 Apr 2019 at 14:55, Skip Montanaro wrote: > It's part of a larger application which needs to copy files from a number > of locations, not all of which are filesystems on local or remote hosts > (think zip archives, s3 buckets, tar files, etc). In that application, I am > using the fs package (PyFilesystem). Since it collects files from a number > of locations (some static, some updated monthly, some daily), it would be > handy if I could preserve file timestamps for post mortem debugging. >From a brief look at the docs, there's an on_copy callback to copy_fs. Maybe you could use the getinfo/setinfo methods to copy over the timestamps and any other file metadata that you want in that callback? Paul -- https://mail.python.org/mailman/listinfo/python-list
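A rough sketch of that on_copy idea, assuming PyFilesystem2's copy_fs()
and its callback signature; the source/destination names are made up,
and the exact raw-info keys that setinfo() accepts can vary by
filesystem, so treat this as illustrative rather than definitive:

    import fs
    import fs.copy

    def preserve_times(src_fs, src_path, dst_fs, dst_path):
        # After each file copy, push the raw 'details' timestamps across
        raw = src_fs.getinfo(src_path, namespaces=['details']).raw
        dst_fs.setinfo(dst_path, {
            'details': {
                'accessed': raw['details'].get('accessed'),
                'modified': raw['details'].get('modified'),
            }
        })

    src = fs.open_fs('zip://archive.zip')   # any filesystem works here
    dst = fs.open_fs('dest_dir')
    fs.copy.copy_fs(src, dst, on_copy=preserve_times)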
Re: Library for parsing binary structures
On Fri, 29 Mar 2019 at 23:21, Cameron Simpson wrote: > > On 27Mar2019 18:41, Paul Moore wrote: > >I'm looking for a library that lets me parse binary data structures. > >The stdlib struct module is fine for simple structures, but when it > >gets to more complicated cases, you end up doing a lot of the work by > >hand (which isn't that hard, and is generally perfectly viable, but > >I'm feeling lazy ;-)) > > I wrote my own: cs.binary, available on PyPI. The PyPI page has is > module docs, which I think are ok: > > https://pypi.org/project/cs.binary/ Nice, thanks - that's exactly the sort of pointer I was looking for. I'll take a look and see how it works for my use case. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Library for parsing binary structures
On Fri, 29 Mar 2019 at 16:16, Peter J. Holzer wrote:
> Obviously you need some way to describe the specific binary format you
> want to parse - in other words, a grammar. The library could then use
> the grammar to parse the input - either by interpreting it directly,
> or by generating (Python) code from it. The latter has the advantage
> that it has to be done only once, not every time you want to parse a
> file.
>
> If that sounds familiar, it's what yacc does. Except that it does it
> for text files, not binary files. I am not aware of any generic binary
> parser generator for Python. I have read research papers about such
> generators for (I think) C and Java, but I don't remember the names
> and I'm not sure if the generators got beyond the proof of concept
> stage.

That's precisely what I'm looking at. The construct library
(https://pypi.org/project/construct/) basically does that, but using a
DSL implemented in Python rather than generating Python code from a
grammar. In fact, the problem I had with my recursive data structure
turned out to be solvable in construct - as the DSL effectively builds
a data structure describing the grammar, I was able to convert the
problem of writing a recursive grammar into one of writing a recursive
data structure:

    type_layouts = {}
    layout1 = <...>
    layout2 = <...>
    type_layouts[1] = layout1
    type_layouts[2] = layout2
    data_layout = <...>

However, the resulting parser works, but it gives horrible error
messages. This is a normal problem with generated parsers; there are
plenty of books and articles covering how to persuade tools like yacc
to produce usable error reports on parse failures. There don't seem to
be any particularly good error reporting features in construct
(although I haven't looked closely), so I'm actually now looking at
writing a hand-crafted parser, just to control the error reporting[1].
I don't know which solution I'll ultimately use, but it's an
interesting exercise doing it both ways. And parsing binary data,
unlike parsing text, is actually easy enough that hand crafting a
parser isn't that much of a bother - maybe that's why there's less
existing work in this area.

Paul

[1] The errors I'm reporting on are likely to be errors in my parsing
code at this point, rather than errors in the data, but the problem is
pretty much the same either way ;-)
--
https://mail.python.org/mailman/listinfo/python-list
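For anyone hitting the same recursion question: construct supports
recursive grammars via LazyBound, which defers evaluation of a subcon
until parse time. A minimal sketch of a TLV-ish format like the one
described later in this thread (illustrative only - not the actual
layouts elided above; this variant tags every element, and it assumes a
construct version where LazyBound takes a no-argument lambda):

    from construct import (Struct, Byte, Int32ul, Array, Switch,
                           LazyBound, this)

    value = Struct(
        "type" / Byte,
        "data" / Switch(
            this.type,
            {
                1: Byte,                      # primitive: single byte
                2: Int32ul,                   # primitive: 32-bit integer
                3: Struct(                    # list: (count, tagged items)
                    "count" / Byte,
                    # each element is a full tagged value - the recursion
                    "items" / Array(this.count, LazyBound(lambda: value)),
                ),
            },
        ),
    )

    # list of two values: a byte 5 and a little-endian int 42
    print(value.parse(b"\x03\x02" b"\x01\x05" b"\x02\x2a\x00\x00\x00"))

The lambda is what breaks the chicken-and-egg problem: "value" isn't
looked up until parse time, by which point the name is defined.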
Re: Library for parsing binary structures
On Thu, 28 Mar 2019 at 08:15, dieter wrote: > What you have is a generalized deserialization problem. > It can be solved with a set of deserializers. Yes, and thanks for the suggested code structure. As I say, I can certainly do the parsing "by hand", and the way you describe is very similar to how I'd approach that. My real interest is in whether any libraries exist to do this sort of thing (there are plenty of parser libraries for text, pyparsing being the obvious one, but far fewer for binary structures). Paul -- https://mail.python.org/mailman/listinfo/python-list
Library for parsing binary structures
I'm looking for a library that lets me parse binary data structures.
The stdlib struct module is fine for simple structures, but when it
gets to more complicated cases, you end up doing a lot of the work by
hand (which isn't that hard, and is generally perfectly viable, but I'm
feeling lazy ;-))

I know of Construct, which is a nice declarative language, but it's
either weak, or very badly documented, when it comes to recursive
structures. (I really like Construct, and if I could only understand
the docs better I may well not need to look any further, but as it is,
I can't see anything showing how to do recursive structures...)

I am specifically trying to parse a structure that looks something like
the following:

Multiple instances of:
  - a type byte
  - a chunk of data structured based on the type

Types include primitives like byte, integer, etc., as well as (type
byte, count, data) - data is "count" occurrences of data of the given
type. That last one is a list, and yes, you can have lists of lists, so
the structure is recursive.

Does anyone know of any other binary data parsing libraries that can
handle recursive structures reasonably cleanly? I'm already *way* past
the point where it would have been quicker for me to write the parsing
code by hand rather than trying to find a "quick way", so the question
is honestly mostly about finding out what people recommend for jobs
like this rather than actually needing something specific to this
problem. But I do keep hitting the need to parse binary structures, and
having something in my toolbox for the future would be really nice.

Paul
--
https://mail.python.org/mailman/listinfo/python-list
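For concreteness, a hand-rolled recursive parser for roughly the format
described above (a sketch only - the type codes and integer width are
invented for illustration, not taken from any real format):

    import struct

    def parse_tagged(data, pos):
        """Parse one (type byte, payload) value; return (value, new_pos)."""
        tag = data[pos]
        return parse_payload(tag, data, pos + 1)

    def parse_payload(tag, data, pos):
        if tag == 1:                      # single byte
            return data[pos], pos + 1
        if tag == 2:                      # little-endian 32-bit integer
            (n,) = struct.unpack_from('<i', data, pos)
            return n, pos + 4
        if tag == 3:                      # list: element type, count, payloads
            elem_tag, count = data[pos], data[pos + 1]
            pos += 2
            items = []
            for _ in range(count):        # recursion: elements may be lists
                value, pos = parse_payload(elem_tag, data, pos)
                items.append(value)
            return items, pos
        raise ValueError(f'unknown type byte {tag} at offset {pos - 1}')

    # list of two int32s: 42 and 7
    data = b'\x03\x02\x02' + b'\x2a\x00\x00\x00' + b'\x07\x00\x00\x00'
    print(parse_tagged(data, 0))   # ([42, 7], 11)

The ValueError with an explicit offset is the hook for the "control the
error reporting" point made elsewhere in the thread - a hand-written
parser can say exactly where and why it gave up.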
Re: Syntax for one-line "nonymous" functions in "declaration style"
On Wed, 27 Mar 2019 at 12:27, Alexey Muranov wrote: > > On mer., mars 27, 2019 at 10:10 AM, Paul Moore > wrote: > > On Wed, 27 Mar 2019 at 08:25, Alexey Muranov > > wrote: > >> > >> Whey you need a simple function in Python, there is a choice > >> between a > >> normal function declaration and an assignment of a anonymous > >> function > >> (defined by a lambda-expression) to a variable: > >> > >> def f(x): return x*x > >> > >> or > >> > >> f = lambda x: x*x > >> > >> It would be however more convenient to be able to write instead just > >> > >> f(x) = x*x > > > > Why? Is saving a few characters really that helpful? So much so that > > it's worth adding a *third* method of defining functions, which would > > need documenting, adding to training materials, etc, etc? > > Because i think i would prefer to write it this way. That's not likely to be sufficient reason for changing a language that's used by literally millions of people. > (Almost no new documentation or tutorials would be needed IMHO.) Documentation would be needed to explain how the new construct worked, for people who either wanted to use it or encountered it in other people's code. While it may be obvious to you how it works, it likely won't be to others, and there will probably be edge cases you haven't considered that others will find and ask about. Your interest in improving the language is great, but there are a great many practical considerations in any change, and if you actually want your idea to progress, you'll need to be prepared to address those. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Syntax for one-line "nonymous" functions in "declaration style"
On Wed, 27 Mar 2019 at 08:25, Alexey Muranov wrote: > > Whey you need a simple function in Python, there is a choice between a > normal function declaration and an assignment of a anonymous function > (defined by a lambda-expression) to a variable: > > def f(x): return x*x > > or > > f = lambda x: x*x > > It would be however more convenient to be able to write instead just > > f(x) = x*x Why? Is saving a few characters really that helpful? So much so that it's worth adding a *third* method of defining functions, which would need documenting, adding to training materials, etc, etc? -1 on this. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about the @staticmethod decorator
On Sun, 17 Mar 2019 at 18:18, Arup Rakshit wrote: > > I am reading a book where the author says that: > > In principle, it would also be possible to implement any @staticmethod > completely outside of the class at module scope without any loss of > functionality — so you may want to consider carefully whether a particular > function should be a module scope function or a static method. The > @staticmethod decorator merely facilitates a particular organisation of the > code allowing us to place what could otherwise be free functions within > classes. > > I didn’t get quiet well this block of text. My first question is how would I > make a module level function as static method of a class. Can anyone give me > an example of this? What are the contexts that would let you to think if they > are good fit inside the class or module level scope functions? The point the author is trying to make is that there's no practical difference between def say_hello(name): print("Hello,", name) and class Talker: @staticmethod def say_hello(name): print("Hello,", name) You refer to the first as "say_hello", and the second as "Talker.say_hello", but otherwise they are used identically. The static method has no access to the class or instance variables, so it has no special capabilities that the standalone "say_hello" function has. So, to rephrase the words you used, @staticmethod lets you organise your code in a certain way, but doesn't offer any extra capabilities over module-level functions. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Convert Windows paths to Linux style paths
On Wed, 13 Mar 2019 at 08:13, eryk sun wrote: > > On 3/12/19, Paul Moore wrote: > > > > Do you care about case sensitivity (for example, is it important to you > > whether filenames "foo" and "FOO" map to the same file or not on > > Linux, given that they do on Windows)? > > That's no longer a given in Windows, since NTFS in Windows 10 supports > case-sensitive directories that override the Windows API. I know, but I thought my answer to the OP included enough complexities for them to think about without going into this level of detail ;-) But yes, "converting filenames" is a hard problem, unless you really do just want a simple text based transformation. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Convert Windows paths to Linux style paths
On Tue, 12 Mar 2019 at 14:54, Malcolm Greene wrote: > > Looking for best practice technique for converting Windows style paths to > Linux paths. Is there an os function or pathlib method that I'm missing or is > it some combination of replacing Windows path separators with Linux path > separators plus some other platform specific logic? You need to explain what you mean by "converting". How would you want "C:\Windows" to be converted? Or "\\myserver\myshare\path\to\file.txt"? Or "\\?\D:\very long path"? What if the path is too long for POSIX filename length limitations? How do you want to handle Unicode? Do you care about case sensitivity (for example, is it important to you whether filenames "foo" and "FOO" map to the same file or not on Linux, given that they do on Windows)? It's quite possible that your answer to any or all of these questions are "I don't need to consider these cases". But we don't know which cases matter to you unless you clarify, and the *general* problem of "converting filenames" between operating systems is essentially impossible, because semantics are radically different. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: dash/underscore on name of package uploaded on pypi
On Thu, 28 Feb 2019 at 16:14, ast wrote: > > Hello > > I just uploaded a package on pypi, whose name is "arith_lib" > > The strange thing is that on pypi the package is renamed "arith-lib" > The underscore is substitued with a dash The version with a dash is the normalised form of the name - see https://www.python.org/dev/peps/pep-0503/#normalized-names Paul -- https://mail.python.org/mailman/listinfo/python-list
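PEP 503 defines the normalisation as lowercasing the name and
collapsing any run of ".", "-" and "_" into a single "-", which is easy
to reproduce:

    import re

    def normalize(name):
        # The exact regex given in PEP 503
        return re.sub(r"[-_.]+", "-", name).lower()

    print(normalize("arith_lib"))  # arith-lib

Tools like pip apply the same normalisation when looking packages up,
so "pip install arith_lib" and "pip install arith-lib" resolve to the
same project.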
Re: Deletion of Environmental Variables
On Mon, 7 Jan 2019 at 06:37, Terry Reedy wrote:
> The pydev recommended way to run pip on windows is
>
> py -x.y pip
>
> as this installs the package requested into the x.y site-packages
> directory.

That should be

    py -3.7 -m pip ...

(note the extra -m).

Paul
--
https://mail.python.org/mailman/listinfo/python-list
Re: Transparently treating tar files and zip archives as directories
Pyfilesystem (https://pypi.org/project/fs/) does something like this - it might be what you're after, Paul On Sun, 6 Jan 2019 at 22:32, Skip Montanaro wrote: > > I find it useful in some of the work I do to treat Zip archives as if > they were directories. I don't think it would be too difficult to make > it pretty much transparent, so that you could execute something like: > > fileobj = magic_open("/path/to/some/archive.zip/some/internal/path/magic.txt") > > or equivalent for writing and other list-y, glob-y sorts of things. > (In fact, I'd be mildly surprised if I couldn't find something on PyPI > if I spent a few minutes searching.) > > As I considered that idea the other day, I thought, "Hmmm... might be > useful for tar files as well." Alas, the tarfile module API didn't > seem like it would support such a higher level API anywhere near as > easily. Is that a fundamental property/shortcoming of the tarfile > format, or is it just a function of the tarfile module's API? > > Skip > -- > https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
Re: Question about slight deviations when using integer division with large integers.
On Mon, 31 Dec 2018 at 09:00, Christian Seberino wrote: > > Thanks. I didn’t post new code. I was just referring back to original > post. I need to duplicate the exact behavior of Java’s BigIntegers. > > I’m guessing difference between Java and Python is that Java BigIntegers do > not switch to floor for negatives. Presumably Java BigInteger division consistently rounds to zero, whereas Python's // operator consistently rounds down. > Possible to tweak rounding of Python to be like Java? The correct answer is to have your developers understand the behaviour of the language they are using, and not assume it's like another language that they are more familiar with. But I appreciate that's not always easy. So what you are looking for are ways to help your developers avoid errors. Python doesn't have any way to change the behaviour of the // operator. I assume Java doesn't have a way to make BigInteger division round down, either? I doubt that using a custom-defined class in Python would help much - developers would likely not use it, unless they understood the reason for it (at which point, they wouldn't need it!). Code reviews, focusing on the use of the // operator, might be an answer (hopefully only needed short term until your developers understood the different behaviours of Java and Python). Or maybe some form of coding convention (all uses of // must be covered by unit tests that check all combinations of signs of the 2 operands). Or maybe a function java_style_divide(), that your conventions mandate must be used in preference to // Paul -- https://mail.python.org/mailman/listinfo/python-list
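For what it's worth, a sketch of what that java_style_divide() might look like - truncating towards zero, as Java's BigInteger division does, rather than flooring like //:

def java_style_divide(a, b):
    # Divide the magnitudes (exact for ints), then reapply the sign,
    # which gives round-towards-zero behaviour.
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q

assert java_style_divide(-7, 2) == -3   # whereas -7 // 2 == -4
assert java_style_divide(7, -2) == -3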
Re: Decoding a huge JSON file incrementally
(Sorry, hit "Send" too soon on the last try!) On Thu, 20 Dec 2018 at 17:22, Chris Angelico wrote: > > On Fri, Dec 21, 2018 at 2:44 AM Paul Moore wrote: > > > > I'm looking for a way to incrementally decode a JSON file. I know this > > has come up before, and in general the problem is not soluble (because > > in theory the JSON file could be a single object). In my particular > > situation, though, I have a 9GB file containing a top-level array > > object, with many elements. So what I could (in theory) do is to parse > > an element at a time, yielding them. > > > > The problem is that the stdlib JSON library reads the whole file, > > which defeats my purpose. What I'd like is if it would read one > > complete element, then just enough far ahead to find out that the > > parse was done, and return the object it found (it should probably > > also return the "next token", as it can't reliably push it back - I'd > > check that it was a comma before proceeding with the next list > > element). > > It IS possible to do an incremental parse, but for that to work, you > would need to manually strip off the top-level array structure. What > you'd need to use would be this: > > https://docs.python.org/3/library/json.html#json.JSONDecoder.raw_decode > > It'll parse stuff and then tell you about what's left. Since your data > isn't coming from a ginormous string, but is coming from a file, > you're probably going to need something like this: > > def get_stuff_from_file(f): > buffer = "" > dec = json.JSONDecoder() > while "not eof": > while "no object yet": > try: obj, pos = dec.raw_decode(buffer) > except JSONDecodeError: buffer += f.read(1024) > else: break > yield obj > buffer = buffer[pos:].lstrip().lstrip(",") Ah, right. I'd found that function, but as it took input from a string rather than a file-like object, I'd dismissed it. I didn't think of decoding partial reads. That's a nice trick, thanks! > Proper error handling is left as an exercise for the reader, both in > terms of JSON errors and file errors. Also, the code is completely > untested. Have fun :) Yeah, once you have the insight that you can attempt to parse a block at a time, the rest is just a "simple matter of programming" :-) > The basic idea is that you keep on grabbing more data till you can > decode an object, then you keep whatever didn't get used up ("pos" > points to whatever didn't get consumed). Algorithmic complexity should > be O(n) as long as your objects are relatively small, and you can > optimize disk access by tuning your buffer size to be at least the > average size of an object. Got it, thanks. > Hope that helps. Yes it does, a lot. Much appreciated. Paul -- https://mail.python.org/mailman/listinfo/python-list
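For anyone finding this thread later, here is Chris's sketch filled out into a self-contained form - still assuming the caller has stripped the opening "[" of the top-level array, and still with only token error handling:

import json

def get_stuff_from_file(f):
    buffer = ""
    dec = json.JSONDecoder()
    while True:
        # Keep reading until a complete object parses.
        while True:
            try:
                obj, pos = dec.raw_decode(buffer)
                break
            except json.JSONDecodeError:
                chunk = f.read(1024)
                if not chunk:
                    return  # EOF with no further object
                buffer += chunk
        yield obj
        # Drop the consumed object plus any separating comma.
        buffer = buffer[pos:].lstrip().lstrip(",")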
Decoding a huge JSON file incrementally
I'm looking for a way to incrementally decode a JSON file. I know this has come up before, and in general the problem is not soluble (because in theory the JSON file could be a single object). In my particular situation, though, I have a 9GB file containing a top-level array object, with many elements. So what I could (in theory) do is to parse an element at a time, yielding them. The problem is that the stdlib JSON library reads the whole file, which defeats my purpose. What I'd like is if it would read one complete element, then just far enough ahead to find out that the parse was done, and return the object it found (it should probably also return the "next token", as it can't reliably push it back - I'd check that it was a comma before proceeding with the next list element). I couldn't see a way to get the stdlib json library to read "just as much as needed" in this way. Did I miss a trick? Or alternatively, is there a JSON decoder library on PyPI that supports this sort of usage? I'd rather not have to implement my own JSON parser if I can avoid it. Thanks, Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: [OT] master/slave debate in Python
On Wed, 26 Sep 2018 at 16:30, Ian Kelly wrote: > Also: a human slave is not "a person being treated like a computer" > and I find it highly disrespectful that you would move to trivialize > slavery like that. I have no idea what it must feel like to be a slave (other than the trite and obvious idea that "it must be awful"). Unfortunately, debates like this do nothing to help me understand or empathise with the people suffering in that way, or people dealing with the aftermath of historical cases. I'm more than happy to ensure that we are not causing pain or being disrespectful of the suffering of others, but rather than simply making the whole issue feel like a censorship debate, I'd rather we were helping people to understand and empathise, so that they would *of their own accord* act in an appropriate way. Self-censorship based on understanding and empathy is far more reasonable than any sort of externally-imposed rules. But discussing what it means to be a slave, or the implications of slavery on our culture(s) is way off-topic for this list, so I'd prefer not to debate it further here. I'm sure anyone interested in understanding more can easily find more appropriate forums to participate in. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Any SML coders able to translate this to Python?
On Fri, 7 Sep 2018 at 15:10, Paul Moore wrote: > > On Fri, 7 Sep 2018 at 14:06, Steven D'Aprano > wrote: > > > > On Thu, 06 Sep 2018 13:48:54 +0300, Marko Rauhamaa wrote: > > > > > Chris Angelico : > > >> The request was to translate this into Python, not to slavishly imitate > > >> every possible semantic difference even if it won't actually affect > > >> behaviour. > > > > > > I trust Steven to be able to refactor the code into something more > > > likable. His only tripping point was the meaning of the "let" construct. > > > > Thanks for the vote of confidence :-) > > > > However I have a follow up question. Why the "let" construct in the first > > place? Is this just a matter of principle, "put everything in its own > > scope as a matter of precautionary code hygiene"? Because I can't see any > > advantage to the inner function: > > My impression is that this is just functional programming "good > style". As you say, it's not needed, it's just "keep things valid in > the smallest range possible". Probably also related to the > mathematical style of naming sub-expressions. Also, it's probably the > case that in a (compiled) functional language like SML, the compiler > can optimise this to avoid any actual inner function, leaving it as > nothing more than a temporary name. It's also worth noting that functional languages don't typically have variables or assignments (more accurately, such things aren't fundamental to the programming model the way they are in imperative languages). So although technically let introduces a new scope, in practical terms it's basically just "how functional programmers do assignments". Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Any SML coders able to translate this to Python?
On Fri, 7 Sep 2018 at 14:06, Steven D'Aprano wrote: > > On Thu, 06 Sep 2018 13:48:54 +0300, Marko Rauhamaa wrote: > > > Chris Angelico : > >> The request was to translate this into Python, not to slavishly imitate > >> every possible semantic difference even if it won't actually affect > >> behaviour. > > > > I trust Steven to be able to refactor the code into something more > > likable. His only tripping point was the meaning of the "let" construct. > > Thanks for the vote of confidence :-) > > However I have a follow up question. Why the "let" construct in the first > place? Is this just a matter of principle, "put everything in its own > scope as a matter of precautionary code hygiene"? Because I can't see any > advantage to the inner function: My impression is that this is just functional programming "good style". As you say, it's not needed, it's just "keep things valid in the smallest range possible". Probably also related to the mathematical style of naming sub-expressions. Also, it's probably the case that in a (compiled) functional language like SML, the compiler can optimise this to avoid any actual inner function, leaving it as nothing more than a temporary name. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Any SML coders able to translate this to Python?
On Tue, 4 Sep 2018 at 13:31, Steven D'Aprano wrote: > > I have this snippet of SML code which I'm trying to translate to Python: > > fun isqrt n = if n=0 then 0 > else let val r = isqrt (n/4) > in > if n < (2*r+1)^2 then 2*r > else 2*r+1 > end > > > I've tried reading up on SML and can't make heads or tails of the > "let...in...end" construct. > > > The best I've come up with is this: > > def isqrt(n): > if n == 0: > return 0 > else: > r = isqrt(n/4) > if n < (2*r+1)**2: > return 2*r > else: > return 2*r+1 > > but I don't understand the let ... in part so I'm not sure if I'm doing > it right. I've not used SML much, but what you have looks right. let ... in is basically a local binding "let x = 12 in x+2" is saying "the value of x+2 with x set to 12". As I'm sure you realise (but I'll add it here for the sake of any newcomers who read this), the recursive approach is not natural (or efficient) in Python, whereas it's the natural approach in functional languages like SML. In Python an iterative solution would be better. Paul -- https://mail.python.org/mailman/listinfo/python-list
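For completeness, a sketch of the iterative rewrite alluded to above - the same recurrence with the recursion unrolled, using // so everything stays an integer (note that the n/4 in the quoted translation silently produces floats on Python 3):

def isqrt(n):
    # Collect n, n//4, n//16, ... then rebuild r from the bottom up,
    # mirroring the recursive definition exactly.
    stack = []
    while n:
        stack.append(n)
        n //= 4
    r = 0
    for m in reversed(stack):
        r = 2*r + 1 if m >= (2*r + 1)**2 else 2*r
    return r

assert [isqrt(i) for i in (0, 1, 8, 9, 10, 16)] == [0, 1, 2, 3, 3, 4]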
Re: Question about floating point
On Sat, 1 Sep 2018 at 12:31, Frank Millman wrote: > > "Frank Millman" wrote in message news:pm3l2m$kv4$1...@blaine.gmane.org... > > > > I know about this gotcha - > > > > >>> x = 1.1 + 2.2 > > >>> x > > 3.3003 > > > [...] > > I have enjoyed the discussion, and I have learnt a lot about floating point. > Thanks to all. > > I have just noticed one oddity which I thought worth a mention. > > >>> from decimal import Decimal as D > >>> f"{D('1.1')+D('2.2'):.60f}" > '3.3000' > >>> '{:.60f}'.format(D('1.1') + D('2.2')) > '3.3000' > >>> '%.60f' % (D('1.1') + D('2.2')) > '3.2998223643160599749535322189331054687500' > >>> > > The first two format methods behave as expected. The old-style '%' operator > does not. > > Frank Presumably, Decimal has a custom formatting method. The old-style % formatting doesn't support custom per-class formatting, so %.60f converts its argument to float and then prints it. Paul -- https://mail.python.org/mailman/listinfo/python-list
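This is easy to confirm - the %-formatted output matches what you get by converting to float first, while format() asks the Decimal to render itself:

from decimal import Decimal

d = Decimal('1.1') + Decimal('2.2')   # exactly Decimal('3.3')
print('{:.60f}'.format(d))            # 3.3000...000 (exact decimal digits)
print('%.60f' % d)                    # 3.2999...    (the digits of float(d))
print('%.60f' % float(d))             # identical to the previous line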
Re: __init__ patterns
On Thu, 30 Aug 2018 at 14:07, Tim wrote: > > I saw a thread on reddit/python where just about everyone said they never put > code in their __init__ files. > > Here's a stackoverflow thread saying the same thing. > https://stackoverflow.com/questions/1944569/how-do-i-write-good-correct-package-init-py-files > > That's new to me. I like to put functions in there that other modules within > the module need. > Thought that was good practice DRY and so forth. And I never do 'from > whatever import *' > Ever. > > The reddit people said they put all their stuff into different modules and > leave init empty. > > What do you do? I like my pattern but I'm willing to learn. What matters is the user interface, not where you put your code "behind the scenes". If your documented interface is import foo foo.do_something() then it's perfectly OK (IMO) for do_something to be implemented in foo/__init__.py. It's *also* perfectly OK for it to be implemented in foo/internals.py and imported into __init__.py. Whatever makes your development process easier. Of course, if you *document* that do_something is available as foo.internals.do_something, then you can no longer take the first option - that's up to you, though :-) Conversely, if your package interface is import foo.utilities foo.utilities.do_something() then do_something needs to be implemented in foo/utilities.py (assuming utilities isn't itself a subpackage :-)) Whether you choose to have a convenience alias of foo.do_something() in that case determines whether you also need to import it in foo/__init__.py, but that's fine. tl;dr; Design your documented interface, and make sure that's as you (and your users) want it. Don't let anyone tell you how you should structure your internal code, it's none of their business. Paul -- https://mail.python.org/mailman/listinfo/python-list
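As a concrete (and entirely hypothetical) sketch of the "implement in a private module, re-export from the package" option:

# foo/internals.py
def do_something():
    return "something done"

# foo/__init__.py
from .internals import do_something

# Users then just write:
#     import foo
#     foo.do_something()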
Re: Path-like objects in the standard library
On Fri, 24 Aug 2018 at 09:57, Torsten Bronger wrote: > > Hallöchen! > > Path-like objects are accepted by all path-processing functions in > the standard library since Python 3.6. Unfortunately, this is not > made explicit everywhere. In particular, if I pass a Path in the > first argument of subprocess.run, do I use an implementation detail > of CPython? Because on > https://docs.python.org/3.6/library/subprocess.html, only for the > cwd argument the acceptance of Paths is stated explicitly. > > The same goes for all functions in the shutil module. > https://docs.python.org/3.6/library/shutil.html does not mention the > path-like protocol anywhere, but the CPython modules seem to accept > them anyway. I would imagine that doc fixes would be gratefully accepted. Or if they are rejected, that rejection would be confirmation that the support is intended as an implementation detail (and a fix to the documentation to explicitly state that would be reasonable). Personally, I'd expect that the intention is that you can rely on it. Paul -- https://mail.python.org/mailman/listinfo/python-list
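Until the docs spell it out, the cautious approach is to apply the path-like protocol explicitly at the call site with os.fspath() (Python 3.6+), so the library function only ever sees a plain string. A minimal sketch with hypothetical paths:

import os
import shutil
from pathlib import Path

src = Path("data") / "in.txt"
dst = Path("data") / "out.txt"
shutil.copyfile(os.fspath(src), os.fspath(dst))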
Re: Partitioning a list
On Wed, 22 Aug 2018 at 17:33, Richard Damon wrote: > Paul, my understanding of the problem is that you want to create multiple > divisions of the larger group into smaller groups, such that if two people > are in the same group in 1 division, they never appear together in other > divisions. > > One sample problem which I have had to solve with a similar restriction (but > in my case was to minimize not eliminate repeats) was scheduling races. Say > you have 9 cars you want to race in triples, and no pair of cars should ever > meet twice. One option would be the following matches: > 1,2,3; 4,5,6; 7,8,9 > 1,4,7; 2,5,8; 3,6,9 > 1,5,9; 2,6,7; 3,4,8 > 1,6,8; 2,4,9; 3,5,7 > You can show this is the most number of triples you can get out of 9, as > every number has been paired with every other number once, so to create > another triple, you must get a duplicate pairing. > I don’t know of any general algorithm to generate these. I don't know of a general algorithm for that either. What I'd do is: 1. Generate all the combinations 2. Run through them, keeping track of all pairings we've seen in already-accepted combinations 3. If the new combination has no "already seen" pairings, yield it and add its pairings to the set of already seen ones, otherwise skip it. I think that gives the expected result. Paul -- https://mail.python.org/mailman/listinfo/python-list
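A minimal sketch of those three steps. It is greedy (it simply takes combinations in lexicographic order), so it faithfully implements the description, though I haven't proved it always finds the maximum number of groups:

from itertools import combinations

def disjoint_groups(items, k):
    seen = set()  # every pair used by an already-accepted group
    for group in combinations(items, k):
        pairs = set(combinations(group, 2))
        if pairs.isdisjoint(seen):
            seen |= pairs
            yield group

for g in disjoint_groups(range(1, 10), 3):
    print(g)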
Re: Partitioning a list
On Wed, 22 Aug 2018 at 00:44, Poul Riis wrote: > > I would like to list all possible ways to put N students in groups of k > students (suppose that k divides N) with the restriction that no two students > should ever meet each other in more than one group. > I think this is a classical problem and I think there must be a python > solution out there but I cannot find it. For instance, numpy's array_split > only lists one (trivial) split. > I would be happy if someone could refer me to a general python algorithm > solving the problem. The basic problem seems to be a simple case of looking for combinations of k items from N, which is something itertools can help you with. I don't understand the restriction (or at least, if I understand the basic requirement then the restriction makes no sense to me, as a set of combinations only puts any one student in one group, so "meeting someone in more than one group" isn't possible). But what I'd be inclined to do, at least as a first approach (refine it if it's too slow for your real case) is to take each basic solution, and check if it satisfies the extra constraint - keep the ones that do. Combinatorial problems typically grow very fast as N (and/or k) increase, so that approach may not work as a practical solution, but it will at least mean that you code your requirement in an executable form (which you can test for small numbers) and then you'll have a more precise problem to describe when looking for help tuning it. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: RFC -- custom operators
On Tue, 7 Aug 2018 at 09:03, Steven D'Aprano wrote: > I'll looking for comments on custom binary operators: would it be useful, > if so, what use-cases do you have? I've never found a need for custom binary operators. I can imagine some *theoretical* cases where they might be useful (but no actual use cases!) but those would almost certainly require relaxing one or more of the restrictions you listed, so they do not even count as theoretical support for your suggested proposal. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Dealing with errors in interactive subprocess running python interpreter that freeze the process
On Thu, 2 Aug 2018 at 20:58, wrote: > > Sorry, but there's no "simple" answer here for you (although you may > > well be able to get something that works well enough for your specific > > needs - but it's not obvious from your snippet of code what you're > > trying to achieve). > > To send and receive text from a subprocess..even when there are exceptions. For an arbitrary program as the subprocess? You need some sort of threading in your process, or maybe some sort of async processing with non-blocking reads/writes, as you cannot assume any particular interleaving of input and output on the part of the child. If you control the child process, you can implement a custom protocol to do the communication, and avoid many of the problems that way. If you're trying to "wrap" something like the Python interpreter, it's certainly possible, but it's tricky to get all the edge cases right (I know, I've tried!) and you probably need to run it with the -u flag (or with PYTHONUNBUFFERED set) to avoid IO buffers making your life miserable ;-) Or, as you say you're on Unix, you may be able to use ptys to do this - I gather that's what they are designed for, but as a Windows programmer myself, I know very little about them. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Dealing with errors in interactive subprocess running python interpreter that freeze the process
On Wed, 1 Aug 2018 at 21:17, wrote: > > I can run python3 interactively in a subprocess w/ Popen but > if I sent it text, that throws an exception, the process freezes > instead of just printing the exception like the normal interpreter.. > why? how fix? Here is my code below. > > (I suspect when there is an exception, there is NO output to stdin so that > the problem is the line below that tries to read from stdin never finishes. > Maybe I need a different readline that can "survive" when there is no output > and won't block?) > > > > import subprocess > > interpreter = subprocess.Popen(['python3', '-i'], >stdin = subprocess.PIPE, >stdout = subprocess.PIPE, >stderr = subprocess.PIPE) > > while True: > exp = input(">>> ").encode() + b"\n" > interpreter.stdin.write(exp) > interpreter.stdin.flush() > print(interpreter.stdout.readline().strip()) > interpreter.stdin.close() > interpreter.terminate() You're only reading one line from stdout, but an exception is multiple lines. So the subprocess is still trying to write while you're wanting to give it input again. This is a classic way to get a deadlock, which is basically what you're seeing. Add to that the fact that there are likely IO buffers in the subprocess that mean it's not necessarily passing output back to you at the exact time you expect it to (and the subprocess probably has different buffering behaviour when the IO is to pipes rather than to the console) and it gets complex fast. As others have mentioned, separate threads for the individual pipes may help, or if you need to go that far there are specialised libraries, I believe (pexpect is one, but from what I know it's fairly Unix-specific, so I'm not very familiar with it). Sorry, but there's no "simple" answer here for you (although you may well be able to get something that works well enough for your specific needs - but it's not obvious from your snippet of code what you're trying to achieve). Paul -- https://mail.python.org/mailman/listinfo/python-list
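A rough sketch of the separate-thread approach, for the record - using -u to sidestep the buffering issue mentioned above, folding stderr into stdout so tracebacks arrive on the same stream, and omitting error handling and clean shutdown:

import subprocess
import threading

interpreter = subprocess.Popen(
    ['python3', '-u', '-i'],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    universal_newlines=True)

def drain():
    # Runs independently of the writer, so a multi-line traceback
    # can never block the input loop.
    for line in interpreter.stdout:
        print(line, end='')

threading.Thread(target=drain, daemon=True).start()

while True:
    exp = input(">>> ") + "\n"
    interpreter.stdin.write(exp)
    interpreter.stdin.flush()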
Re: Are dicts supposed to raise comparison errors
On Wed, 1 Aug 2018 at 18:43, Steven D'Aprano wrote: > > On Wed, 01 Aug 2018 16:22:16 +0100, Paul Moore wrote: > > > If they've reported to you that your code produces warnings under -b, > > your response can quite reasonably be "thanks for the information, we've > > reviewed our bytes/string handling and can confirm that it's safe, so > > there's no fixes needed in reportlab". > > I'm sorry, I don't understand this reasoning. (Perhaps I have missed > something.) Robin says his code runs under both Python2 and Python3. He's > getting a warning that the behaviour has changed between the two, and > there's a dubious comparison being made between bytes and strings. > Consequently, there's a very real chance that he has a dicts which have > one key in Python 2 but two in Python 3: Rightly or wrongly, I'm trusting Robin's assertion that he doesn't believe there's a problem with his code. After all, the change in behaviour between Python 2 and 3 has been explained a couple of times in this thread, so I'm assuming Robin understands it, and is not simply asserting unverified that he thinks his code is OK. Certainly, if Robin hasn't checked that the warning isn't flagging an actual issue with his code, then he should do so. Is that not obvious? If it's not, then my apologies for assuming it was. My point was that it's a *warning*, and as such it's perfectly possible for a warning to *not* need addressing (other than to suppress or ignore it once you're happy that doing so is the right approach). Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Are dicts supposed to raise comparison errors
On Wed, 1 Aug 2018 at 16:10, Robin Becker wrote: > > On 01/08/2018 14:38, Chris Angelico wrote: > > t's a warning designed to help people port code from Py2 to Py3. It's > > not meant to catch every possible comparison. Unless you are actually > > porting Py2 code and are worried that you'll be accidentally comparing > > bytes and text, just*don't use the -b switch* and there will be no > > problems. > > > > I don't understand what the issue is here. > > I don't either, I have never used the -b flag until the issue was raised on > bitbucket. If someone is testing a program with > reportlab and uses that flag then they get a lot of warnings from this > dictionary assignment. Probably the code needs tightening > so that we insist on using native strings everywhere; that's quite hard for > py2/3 compatible code. They should probably use the warnings module to disable the warning in library code that they don't control, in that case. If they've reported to you that your code produces warnings under -b, your response can quite reasonably be "thanks for the information, we've reviewed our bytes/string handling and can confirm that it's safe, so there's no fixes needed in reportlab". Paul -- https://mail.python.org/mailman/listinfo/python-list
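Something like this, perhaps - scoping the filter to the calls into code you don't control (build_the_report() is a hypothetical stand-in for whatever triggers the warning):

import warnings

with warnings.catch_warnings():
    warnings.simplefilter("ignore", BytesWarning)
    build_the_report()  # the reportlab calls go here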
Re: Are dicts supposed to raise comparison errors
On 31 July 2018 at 09:32, Robin Becker wrote: > On 31/07/2018 09:16, Paul Moore wrote: >> >> On 31 July 2018 at 08:40, Robin Becker wrote: >>> >>> A bitbucket user complains that python 3.6.6 with -Wall -b prints >>> warnings >>> for some reportlab code; the >>> example boils down to the following >>> >>> ## >>> C:\code\hg-repos\reportlab\tmp>cat tb.py >>> if __name__=='__main__': >>> d={'a':1} >>> d[b'a'] = d['a'] >>> ## >>> >> .. >> v.1500 64 bit (AMD64)] on win32 >> Type "help", "copyright", "credits" or "license" for more information. >>>>> >>>>> b'a' == 'a' >> >> True >>>>> >>>>> b'a' == u'a' >> >> True >>>>> >>>>> >> >> which is basically the sort of thing that -b should warn about. >> Specifically the quoted code would end up with a dictionary with 2 >> entries on Python 3, but 1 entry on Python 2. >> >> Paul >> > yes but I didn't do the compare so this warning seems entirely spurious and > wrong. It's not an error to put 1 and 1.0 and 'a' into a dict. Should I get > a warning if the hashes of two different types happen to clash so that an > int needs to be checked against a string? No, but it does seem reasonable (to me, at least) that you'd get a warning if the behaviour of a[1] = 12 a[1.0] = 99 were to change so that the dict had two separate entries. That's exactly what happened here - Python 3 behaves differently than Python 2, and the -b flag is to enable warnings about such cases. If you feel the warning is spurious then you can simply not use -b. Or suppress the warning, I guess. But it seems to me that it's an opt-in warning of something that could cause problems, so I don't really see why it's such a big problem. Paul -- https://mail.python.org/mailman/listinfo/python-list
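The OP's snippet demonstrates the difference directly:

d = {'a': 1}
d[b'a'] = d['a']
print(len(d))   # Python 3: 2, since b'a' != 'a'; Python 2: 1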
Re: Are dicts supposed to raise comparison errors
On 31 July 2018 at 08:40, Robin Becker wrote: > A bitbucket user complains that python 3.6.6 with -Wall -b prints warnings > for some reportlab code; the > example boils down to the following > > ## > C:\code\hg-repos\reportlab\tmp>cat tb.py > if __name__=='__main__': > d={'a':1} > d[b'a'] = d['a'] > ## > > > C:\code\hg-repos\reportlab\tmp>\python36\python -Wall -b tb.py > tb.py:3: BytesWarning: Comparison between bytes and string > d[b'a'] = d['a'] > > I had always assumed that dicts didn't care about the type of keys although > some types might cause issue with hashability, but obviously the > implementation seems to be comparing b'a' with 'a' (I suppose because they > hash to the same chain). > > Is this code erroneous or is the warning spurious or wrong? The warning seems right to me. Behaviour differs between Python 2 and Python 3: >py -Wall -b Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:59:51) [MSC v.1914 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> b'a' == 'a' __main__:1: BytesWarning: Comparison between bytes and string False >>> ^Z >py -2 Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:24:40) [MSC v.1500 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> b'a' == 'a' True >>> b'a' == u'a' True >>> which is basically the sort of thing that -b should warn about. Specifically the quoted code would end up with a dictionary with 2 entries on Python 3, but 1 entry on Python 2. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Better way / regex to extract values form a dictionary
import os

def return_filename_test_case(filepath):
    filename = os.path.basename(filepath)
    testcase = filename.partition('_')[0]
    return filename, testcase

On 21 July 2018 at 12:37, Ganesh Pal wrote: > I have one of the dictionary values in the below format > > '/usr/local/ABCD/EDF/ASASAS/GTH/HELLO/MELLO/test04_Failures.log' > '/usr/local/ABCD/EDF/GTH/HEL/OOLO/MELLO/test02_Failures.log' > '/usr/local/ABCD/EDF/GTH/BEL/LO/MELLO/test03_Failures.log' > > I need to extract the file name in the path example, say test04_Failure.log > and testcase no i.e test04 > > > Here is my solutions: > > gpal-cwerzvd-1# vi filename.py > import re > > Common_dict = {} > Common_dict['filename'] = > '/usr/local/ABCD/EDF/GTH/HELLO/MELLO/test04_Failures.log' > > def return_filename_test_case(filepath): > if filepath: >filename = re.findall(r'(test\d{1,4}_\S+)', filepath) > if filename: >testcase = re.findall(r'(test\d{1,4})', ''.join(filename)) > > return filename, testcase > > > if Common_dict['filename']: >path = Common_dict['filename'] >fname, testcase = return_filename_test_case(path) >print fname, testcase > > > op: > qerzvd-1# python filename.py > ['test04_Failures.log'] > ['test04'] > > > Please suggest how can this code can be optimized further looks messy , > what would be your one liner or a simple solution to return both test-case > no and filename > > I am on Python 2.7 and Linux > > > Regards, > Ganesh > -- > https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
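For the record, what the suggested function returns for the OP's sample path:

>>> return_filename_test_case('/usr/local/ABCD/EDF/ASASAS/GTH/HELLO/MELLO/test04_Failures.log')
('test04_Failures.log', 'test04')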
Re: Reading EmailMessage from file
On 16 July 2018 at 02:31, Skip Montanaro wrote: >> What are you actually trying to do? You're talking like you're trying >> to read an existing RFC822 email-with-headers from a file, but you're >> showing code that creates a new email with body content set from >> a file, which is a completely different thing. > > Yes, that's exactly what I'm trying to do. A bit more context... I'm > trying to port SpamBayes from Python 2 to Python 3. The file I > attached which failed to come through was exactly what you suggested, > an email in a file. That is what the example from the 3.7 docs > suggested I should be able to do. Had the message in the file been > encoded as utf-8, that would have worked. I just tested it with > another message which is utf-8-encoded. As I understand it, an email is a logical object, and when it's saved to disk it should be done in some mailbox format or other. So rather than grabbing raw file data and trying to deserialise it, maybe you'd be better using one of the classes in the mailbox module (https://docs.python.org/3.7/library/mailbox.html)? Alternatively, maybe email.parser.message_from_binary_file (with a file object opened in binary) would do what you're after? Note: I've not actually used any of these methods myself... Paul -- https://mail.python.org/mailman/listinfo/python-list
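A minimal sketch of that second suggestion - parsing from a file opened in binary mode avoids having to guess the encoding up front, and policy.default gives back the modern EmailMessage class ("message.eml" is a hypothetical filename):

import email
from email import policy

with open("message.eml", "rb") as f:
    msg = email.message_from_binary_file(f, policy=policy.default)

print(msg["Subject"])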
Re: Quick survey: locals in comprehensions (Python 3 only)
On 26 June 2018 at 11:09, Chris Angelico wrote: > On Tue, Jun 26, 2018 at 8:04 PM, Antoon Pardon wrote: >> On 26-06-18 11:22, Steven D'Aprano wrote: >>> On Tue, 26 Jun 2018 10:20:38 +0200, Antoon Pardon wrote: >>> > def test(): > a = 1 > b = 2 > result = [value for key, value in locals().items()] > return result >>> [...] >>> I would expect an UnboundLocalError: local variable 'result' referenced before assignment. >>> Well, I did say that there's no right or wrong answers, but that >>> surprises me. Which line do you expect to fail, and why do you think >>> "result" is unbound? >> >> I would expect the third statement to fail because IMO we call the locals >> function before result is bound. But result is a local variable so the >> locals function will try to reference it, hence the UnboundLocalError. > > Would you expect the same behaviour from this function? > > def test(): > a = 1 > b = 2 > result = locals() > return result Regardless of the answer to that question, the message here is basically "people don't have good intuition of how locals() works". Or to put it another way, "code that uses locals() is already something you should probably be checking the docs for if you care about the details of what it does". Which agrees with my immediate reaction when I saw the original question :-) Paul -- https://mail.python.org/mailman/listinfo/python-list
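For anyone puzzling over the original example later: the detail that matters in Python 3 is that the outermost iterable of a comprehension is evaluated in the enclosing scope, while the rest of the comprehension runs in its own implicit function scope. So (on CPython, at least):

def test():
    a = 1
    b = 2
    # locals() runs in test()'s scope, before "result" is bound,
    # so the comprehension iterates over {'a': 1, 'b': 2}.
    result = [value for key, value in locals().items()]
    return result

print(test())   # [1, 2]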
Re: Python for beginners or not? [was Re: syntax difference]
On 25 June 2018 at 11:53, Steven D'Aprano wrote: > And the specific line you reference is *especially* a joke, one which > flies past nearly everyone's head: > > There should be one-- and preferably only one --obvious way to do it. > > > Notice the dashes? There are *two* traditional ways to use an pair of em- > dashes for parenthetical asides: > > 1. With no space--like this--between the parenthetical aside and the text; > > 2. With a single space on either side -- like this -- between the aside > and the rest of the text. > > Not satisfied with those two ways, Tim invented his own. > > > https://bugs.python.org/issue3364 > > > (Good grief, its been nearly ten years since that bug report. I remember > it like it was yesterday.) Thank you for that bit of history! I never knew that (and indeed had missed that part of the joke). Tim's contributions to Python are always to be treasured :-) Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Quick survey: locals in comprehensions (Python 3 only)
On 24 June 2018 at 06:03, Steven D'Aprano wrote: > I'd like to run a quick survey. There is no right or wrong answer, since > this is about your EXPECTATIONS, not what Python actually does. > > Given this function: > > > def test(): > a = 1 > b = 2 > result = [value for key, value in locals().items()] > return result > > what would you expect the result of calling test() to be? Is that the > result you think is most useful? In your opinion, is this a useful > feature, a misfeature, a bug, or "whatever"? > > I'm only looking for answers for Python 3. (The results in Python 2 are > genuinely weird :-) My immediate reaction was "that's not something I'd want to do, so I don't care (but I've a feeling it would be weird)". On thinking some more, I decided that [1, 2] made sense (but I still didn't actually care). After reading Chris Angelico's analysis, I went back to my first opinion (that I don't care, but I suspect it might be weird). I'm aware of the background for this question. Is there any equivalent question that doesn't use locals()? The reason I ask is that I see locals() as "digging into implementation stuff" and sort of expect it to act oddly in situations like this... Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: PEP8 compliance
On 13 June 2018 at 17:35, T Berger wrote: > I did make the changes in IDLE, but I thought I must be in the wrong place. > The line of code I got in terminal was: > /Users/TamaraB/Desktop/mymodules/vsearch.py:1:25: E231 missing whitespace > after ':' > def search4vowels(phrase:str)->set: > > I thought the 1:25: E231 couldn't be referring to anything in my text editor. > But now I see that 1:25 refers to the first line, 25th spot (I suppose the 25 > refers to the spot BEFORE which I'm supposed to add white space. I don't know > what the E231 refers to, but it doesn't seem helpful. That's the correct interpretation of that message. Well done (and I don't mean that patronisingly) - messages from tools like whatever it was that reported these errors to you are often rooted in assumptions which are *far* from obvious to someone new to programming. To expand a little: The 1 is as you say the line on which the tool spotted the problem. Program text is viewed (by tools just as by people) as a block of lines of text, numbered starting from line 1. Tools will number blank lines (lines with nothing on them) equally with lines with text on them - sometimes people number only the non-blank lines, but programming tools don't typically do that. The 25 does refer to the position on the line that the tool is referring to. Position is measured in characters. You say "spot", and that's as good a term as any. Characters as counted by a computer include letters, numbers, punctuation, and even spaces. You can think of it as "column on the screen" in this case and not be far wrong. The E231 is a code for the specific error that the tool found - so it means "missing whitespace". The text of the message is all you need to deal with, but having a unique, concise code can help, for example when looking up information in the documentation or the source code of the tool. It's very helpful to quote error numbers like this when reporting problems or asking for help, as they are more precise (to people who know how to interpret them) than the textual message. But reporting the text as well is crucial, as it saves people having to look up the code to know what you're talking about! > And, no, I'm not going to make these picayune changes that actually make the > code harder to read. Adding a white space between "phrase:" and "str" just > splits apart a conceptual unit in a complicated line of code. I was just > doing this exercise in my workbook. That's a very good attitude. There *are* good reasons for many of the style recommendations, and as you learn more you may be persuaded to change your view, but style guides are all ultimately about making your code "readable", and it sounds like you are already developing a good sense of how you want to group and present your code. That's something many programmers can take a long time (years, in some cases) to develop, and a good sense of style is often (IMO) what separates good programmers from mediocre/bad ones. Reading other people's code is often a very good way to develop a sense of style, if you get the chance to do so. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Why exception from os.path.exists()?
On 4 June 2018 at 13:01, Steven D'Aprano wrote: >> Turns out that this is a limitation on Windows as well. The \0 is not >> allowed for Windows, macOS and Posix. > > We -- all of us, including myself -- have been terribly careless all > through this discussion. The fact is, this should not be an OS limitation > at all. It is a *file system* limitation. > > If I can mount a HFS or HFS-Plus disk on Linux, it can include file names > with embedded NULs or slashes. (Only the : character is illegal in HFS > file names.) It shouldn't matter what the OS is, if I have drivers for > HFS and can mount a HFS disk, I ought to be able to sensibly ask for file > names including NUL. Agreed, being completely precise in this situation is both pretty complicated, and essential. The question of what are legal characters in a filename is, as you say, a filesystem related issue. People traditionally forget this point, but in these days of cross-platform filesystem mounting, networked filesystems[1], etc, it's more and more relevant, and thankfully people are getting more aware of the point. But there's also the question of what capability the kernel API has to express the queries. The fact that the Unix API (and the Windows one, in most cases - although as Eryk Sun pointed out there are exceptions in the Windows kernel API) uses NUL-terminated strings means that querying the filesystem about filenames with embedded \0 characters isn't possible *at the OS level*. (As another example, the fact that the Unix kernel treats filenames as byte strings means that there are translation issues querying an NTFS filesystem that uses Unicode (UTF-16) natively - and vice versa when Windows queries a Unix-native filesystem). So "it's complicated" is about the best we can say :-) Paul [1] And of course if you mount (say) an NTFS filesystem over NFS, you have *two* filesystems involved, each adding its own layer of restrictions and capabilities. -- https://mail.python.org/mailman/listinfo/python-list
Re: Why exception from os.path.exists()?
On 2 June 2018 at 12:28, Chris Angelico wrote: > On Sat, Jun 2, 2018 at 9:13 PM, Steven D'Aprano > wrote: >> On Sat, 02 Jun 2018 20:58:43 +1000, Chris Angelico wrote: >> > Windows isn't POSIX compliant. Technically, Windows is POSIX compliant. You have to turn off a bunch of features, turn on another bunch of features, and what you get is the bare minimum POSIX compliance possible, but it's enough to tick the check box for POSIX compliance. >>> >>> Really? I didn't know that Windows path names were POSIX compliant. Or >>> do you have to use the Cygwin fudge to count Windows as POSIX? And what >>> about POSIX signal handling? >>> >>> Citation needed, big-time. >> >> https://en.wikipedia.org/wiki/Microsoft_POSIX_subsystem >> >> https://technet.microsoft.com/en-us/library/bb463220.aspx >> >> https://brianreiter.org/2010/08/24/the-sad-history-of-the-microsoft-posix- >> subsystem/ > > Can someone confirm whether or not all the listed signals are actually > supported? We know that Ctrl-C maps to the internal Windows interrupt > handler, and "kill process" maps to the internal Windows "terminate", > but can you send a different process all the different signals and > handle them differently? > > I also can't find anything about path names there. What does POSIX say > about the concept of relative paths? Does Windows comply with that? > > "Windows has some features which are compatible with the equivalent > POSIX features" is not the same as "Technically, Windows is POSIX > compliant". My apologies, I don't have time to hunt out complete references now, but my recollection is that Windows (the OS) is POSIX compliant (as noted, with certain configurations, etc). However, the Win32 API (which is what most people think of when they say "Windows") is not POSIX compatible. As an example, Windows (the kernel) has the capability to implement fork(), but this isn't exposed via the Win32 API. To implement fork() you need to go to the raw kernel layer. Which is basically what the Windows Linux subsystem (bash on Windows 10) does - it's a user-level implementation of the POSIX API using Win32 kernel calls. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Sorting NaNs
On 2 June 2018 at 08:32, Peter J. Holzer wrote: > Browsing through older messages I came upon a thread discussing the > treatment of NaNs by median(). Since you have to (partially) sort the > values to compute the median, I played around with sorted(): > > Python 3.5.3 (default, Jan 19 2017, 14:11:04) > [GCC 6.3.0 20170118] on linux > Type "help", "copyright", "credits" or "license" for more information. > sorted([3, 5, float('NaN'), 1]) > [3, 5, nan, 1] > > What? Does NaN cause sorted to return the original list? > sorted([3, 5, float('NaN'), 1, 0.5]) > [3, 5, nan, 0.5, 1] > > Nope. Does it partition the list into sublists, which are sorted > individually? > sorted([3, 5, -8, float('NaN'), 1, 0.5]) > [-8, 0.5, 1, 3, 5, nan] sorted([3, 5, -8, float('NaN'), 1, 0.5, 33]) > [-8, 0.5, 1, 3, 5, nan, 33] > > Also nope. It looks like NaNs just mess up sorting in an unpredictable > way. Is this the intended behaviour or just an accident of > implementation? (I think it's the latter: I can see how a sort algorithm > which doesn't treat NaN specially would produce such results.) I'd simply assume it's the result of two factors: 1. The behaviour of comparisons involving NaN values is weird (not undefined, as far as I know NaN behaviour is very well defined, but violates a number of normally fundamental properties of comparisons) 2. The precise behaviour of the sort algorithm use by Python is not mandated by the language. A consequence of (1) is that there is no meaningful definition of "a sorted list of numbers" if that list includes NaN, as the definition "being sorted" relies on properties like "only one of a < b and b < a is true" (precisely which properties depend on how you define "sorted" which no-one normally cares about because under normal assumptions they are all equivalent). A list including NaNs therefore cannot be permuted into a sorted order (which is the basic language-mandated detail of sorted() - there are others, but this is what matters here). So call it an accident of implementation of you like. Or "sorting a list with NaNs in it is meaningless" if you prefer. Or "undefined behaviour" if you're a fan of the language in the C standard. Paul -- https://mail.python.org/mailman/listinfo/python-list
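The first factor is easy to demonstrate - every ordering comparison involving a NaN is false, which silently breaks the assumptions a sort relies on:

nan = float('nan')
print(nan < 1, nan > 1, nan == 1)   # False False False
print(nan == nan)                   # False - not even equal to itself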
Re: version
On 2 June 2018 at 02:34, Mike McClain wrote: > It looks like what I was wanting is something like 'C's #if, a > compiler conditional. > > Does python have anything like that to tell the interpreter to ignore > a line that is not a comment or a quoted string? No, it doesn't. Honestly, if you are writing "play" scripts as you say, I wouldn't bother trying to make them cross-version compatible. Move all your old scripts to a "python2" subdirectory and forget them. Write new scripts as Python 3 only. As an exercise in understanding the differences between Python 2 and 3, porting some of the scripts in your "python2" directory to Python 3 would possibly be a useful thing to do. If you have important scripts that you use a lot that are in Python 2 form, continue running them under Python 2 until you have some time (and maybe the understanding, if they are complex) to port them, and then switch. Paul -- https://mail.python.org/mailman/listinfo/python-list
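The closest runtime substitute is an ordinary version check - there's no preprocessing, so both branches must still be syntactically valid in both versions:

import sys

if sys.version_info >= (3,):
    get_input = input
else:
    get_input = raw_input   # only looked up when running Python 2

line = get_input("? ")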
Re: Indented multi-line strings
On 1 June 2018 at 22:57, Chris Angelico wrote: > How will a method be worse than a standalone function? Please explain > this. Because the standalone function already exists (in the textwrap module). There's nothing wrong with adding string methods, but where they don't add anything that can't be done already within the core language and stdlib, then they have to get past the hurdle of "status quo wins". > A method is a lot easier to discover than a stdlib module > function, If you're using Google or the python docs, I'm not sure it is easier, and I'm certain it's not a *lot* easier. If you're using the REPL, then yes, finding a string method is easier than hunting out a random function that takes a string argument. > which is in turn IMMENSELY more discoverable than anything > on pypi. True. But discoverability of anything gets a lot better when it's more widely used, so if this is a common need, I'd hope that discoverability of any approach would improve over time. > If you dislike adding features to a language on the basis that it > makes the language harder to learn, remember that you instead force > one of three even worse options: > > 1) Messy code because people unindent inside their source code, > creating wonky indentation (which Python usually avoids) > > 2) Forcing readers to look up the third-party module you're using > before they can understand your code > > 3) Forcing readers to look up your ad-hoc function before > understanding your code. > > All of these make it harder to understand your code, specifically > BECAUSE the language doesn't have the requisite feature. Well-written > language features are good, not bad, for readability. You missed the option that's actually the case: 0) Expecting readers to look up a stdlib module in order to understand your code. And you aren't comparing like with like, if this were a string method users would have to look up that method, just as much as they would have to look up a stdlib function. But honestly, if someone needs to look up the definition a function or method called "dedent", then either there is something more than normally complex going on, or they are not a native English speaker and this is the least of their worries, or something similar. No-one is saying a method is *worse* than a standalone function - they are just saying it's *not sufficiently better* to justify creating a string method that replicates an existing stdlib function. Paul -- https://mail.python.org/mailman/listinfo/python-list
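The existing spelling, for comparison:

import textwrap

# The backslash stops the literal starting with a blank line;
# dedent() then strips the common leading whitespace.
sql = textwrap.dedent("""\
    SELECT name, value
    FROM settings
    WHERE active
""")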
Re: Indented multi-line strings
On 1 June 2018 at 16:36, Chris Angelico wrote: > On Sat, Jun 2, 2018 at 12:57 AM, Paul Moore wrote: >> Why does this need to be a string method? Why can't it be a standalone >> function? Maybe you should publish an implementation on PyPI, collect >> some data on how popular it is, and then if it's widely used, propose >> it for inclusion in the stdlib at that point? By making it a string >> method, you're also restricting its use to users of recent versions of >> Python, whereas a PyPI implementation would work for everyone. > > The biggest reason to make it a string method is to give the > possibility of optimization. Python cannot optimize int(1.2) down to > the constant 1 because you might have shadowed int; but a method on a > string literal cannot be shadowed, and could potentially be > constant-folded. Include that in the initial post to preempt this > recommendation. So the optimisation should probably be an explicit part of the proposal. Without the optimisation, factors like "won't be usable in code that wants to support older Python", "why not just make it a standalone function", etc. will probably result in the proposal not getting accepted. On 1 June 2018 at 16:20, Dan Strohl wrote: > > Good point, so, basically, there already is a function for this built in > textwrap.dedent() and textwrap.indent(), I would think (hope) that that would > answer that question. OK, so unless the argument is "provide a string method, that's guaranteed to be constant folded"[1] I suspect that it's pretty unlikely that a proposal to add a string method that simply replicated the textwrap functions would get very far. But regardless, there's no point in me trying to second guess what might come up on python-ideas, you should just post there and see what reception you get. Paul [1] There's two possibilities here, of course. First, provide an implementation for CPython that includes the constant folding, or second, make constant folding a language guarantee that other implementations also have to implement. I doubt the second option is going to be practical, though. -- https://mail.python.org/mailman/listinfo/python-list
Re: Why exception from os.path.exists()?
On 1 June 2018 at 15:38, Grant Edwards wrote: > On 2018-06-01, Paul Moore wrote: > >> Python allows strings with embedded \0 characters, so it's possible >> to express that question in Python - os.path.exists('a\0b'). What >> can be expressed in terms of the low-level (C-based) operating >> system API shouldn't be relevant. > > Python allows floating point numbers, so it is possible to express > this question in python: os.path.exists(3.14159). Is the fact that > the underlying OS/filesystem can't identify files via a floating point > number relevent? Should it return False or raise ValueError? I'm not sure if you're asking a serious question here, or trying to make some sort of point, but os.path.exists is documented as taking a string, so passing a float should be a TypeError. And it is. But as I already said, this is a huge amount of effort spent on a pretty trivial corner case, so I'll duck out of this thread now. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Indented multi-line strings
On 1 June 2018 at 15:36, Dan Strohl via Python-list wrote:
> So... how does one go about suggesting changes to the built-in types? I
> could take a whack at the code for it, but my C skills are nowhere near what
> should probably be needed for something this close to the core of the
> language.

I'm not sure if adding a couple of methods is a PEP type of thing. It
would probably have to go via python-ideas, but if it gets the OK
there I doubt it would need a PEP.

There are a few key questions I'd expect to see come up. Why does this
need to be a string method? Why can't it be a standalone function?
Maybe you should publish an implementation on PyPI, collect some data
on how popular it is, and then if it's widely used, propose it for
inclusion in the stdlib at that point? By making it a string method,
you're also restricting its use to users of recent versions of Python,
whereas a PyPI implementation would work for everyone.

None of these are showstoppers - many proposals have got past them -
but it's worth having at least thought through your answers to them,
so you can present the idea in the best light.

Paul
--
https://mail.python.org/mailman/listinfo/python-list
Re: Why exception from os.path.exists()?
On 1 June 2018 at 13:15, Barry Scott wrote:
> I think the reason for the \0 check is that if the string is passed to the
> operating system with the \0 you can get surprising results.
>
> If \0 was not checked for you would be able to get True from:
>
> os.path.exists('/home\0ignore me')
>
> This is because a POSIX system only sees '/home'.

So because the OS API can't handle filenames with \0 in them (because
that API uses null-terminated strings) Python has to special-case its
handling of the check. That's fine.

> Surely ValueError is reasonable?

Well, if the OS API can't handle filenames with embedded \0, we can be
sure that such a file doesn't exist - so returning False is reasonable.

> Once you know that all of the string you provided is given to the operating
> system it can then do whatever checks it sees fit to and return a suitable
> result.

As the programmer, I don't care. The Python interpreter should take
care of that for me, and if I say "does file 'a\0b' exist?" I want an
answer. And I don't see how anything other than "no it doesn't" is
correct.

Python allows strings with embedded \0 characters, so it's possible to
express that question in Python - os.path.exists('a\0b'). What can be
expressed in terms of the low-level (C-based) operating system API
shouldn't be relevant.

Disclaimer - the Python "os" module *does* expose low-level
OS-dependent functionality, so it's not necessarily reasonable to
extend this argument to other functions in os. But it seems like a
pretty solid argument in this particular case.

> As an aside Windows has lots of special filenames that you have to know about
> if you are writing robust file handling. AUX, COM1, \this\is\also\COM1 etc.

I don't think that's relevant in this context.

Paul
--
https://mail.python.org/mailman/listinfo/python-list
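If the stdlib behaviour stays as it is, the semantics I'm arguing for are easy to get in user code. A small sketch (exists_lenient is a made-up name, not a stdlib function):

    import os.path

    def exists_lenient(path):
        # Treat names the OS cannot even represent as "does not exist"
        try:
            return os.path.exists(path)
        except ValueError:  # e.g. embedded NUL in the path
            return False

    print(exists_lenient('a\0b'))  # False, rather than raising ValueError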
Re: Why exception from os.path.exists()?
On 31 May 2018 at 16:11, Steven D'Aprano wrote:
> On Thu, 31 May 2018 22:46:35 +1000, Chris Angelico wrote:
> [...]
>>> Most other analogous reasons *don't* generate an exception, nor is that
>>> possibility mentioned in the specification:
>>>
>>> https://docs.python.org/3/library/os.path.html?#os.path.exists
>>>
>>> Is the behavior a bug? Shouldn't it be:
>>>
>>> >>> os.path.exists("\0")
>>> False
>>
>> A Unix path name cannot contain a null byte, so what you have is a
>> fundamentally invalid name. ValueError is perfectly acceptable.
>
> It should still be documented.
>
> What does it do on Windows if the path is illegal?

Returns False (confirmed with paths of '?' and ':', among others).

Paul
--
https://mail.python.org/mailman/listinfo/python-list
Re: Why exception from os.path.exists()?
On 31 May 2018 at 15:01, Chris Angelico wrote:
> Can someone on Windows see if there are other path names that raise
> ValueError there? Windows has a whole lot more invalid characters, and
> invalid names as well.

On Windows:

>>> os.path.exists('\0')
ValueError: stat: embedded null character in path
>>> os.path.exists('?')
False
>>> os.path.exists('\u77412')
False
>>> os.path.exists('\t')
False

Honestly, I think the OP's point is correct. os.path.exists should
simply return False if the filename has an embedded \0 - at least on
Unix. I don't know if Windows allows \0 in filenames, but if it does,
then os.path.exists should respect that...

Although I wouldn't consider this anything even remotely like a
significant issue...

Paul
--
https://mail.python.org/mailman/listinfo/python-list
Re: syntax oddities
On 18 May 2018 at 12:08, Rhodri James wrote: > On 17/05/18 23:44, Paul wrote: >> >> I've been using email for thirty years, including thousands of group >> emails >> at many tech companies, and no one has ever suggested, let alone insisted >> on, bottom posting. > > I've been using email for thirty years, etc, etc, and I've always insisted > on proper quoting, trimming and interspersed posting. Clearly you've never > worked in the right companies ;-) There are two completely independent cultures here. In "Corporate" cultures like where I work (where IT and business functions interact a lot, and business users typically use tools like Outlook) top-posting is common, conventional, and frankly, effective. Conversely, in purely technical communities like open source, where conventions originated in low-bandwidth channels like early networks, interspersed posting, heavy trimming and careful quoting are the norm. I've participated in both communities for 30 years or more, and you deal with people in the way that they find most comfortable. It's polite to follow the conventions of the community that you're interacting with - so on this mailing list, for example, quoting and posting inline is the norm and top-posting is considered impolite. Arguing about how the community's conventions are wrong is also impolite :-) I'm reminded of the old stereotypes of Brits speaking English NICE AND LOUDLY to foreigners to help them understand what we're saying... (Disclaimer: I'm a Brit, so I'm poking fun at myself here :-)) Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: what does := means simply?
On 17 May 2018 at 12:58, bartc wrote:
> On 17/05/2018 04:54, Steven D'Aprano wrote:
>>
>> On Thu, 17 May 2018 05:33:38 +0400, Abdur-Rahmaan Janhangeer wrote:
>>
>>> what does := proposes to do?
>
>> A simple example (not necessarily a GOOD example, but a SIMPLE one):
>>
>> print(x := 100, x+1, x*2, x**3)
>
> It's also not a good example because it assumes left-to-right evaluation
> order of the arguments. Even if Python guarantees that, it might be a
> problem if the code is ever ported anywhere else.

It's a good example, because it makes it clear that the benefits of :=
are, at least in some cases, somewhat dependent on the fact that Python
evaluates arguments left to right :-)

Paul
--
https://mail.python.org/mailman/listinfo/python-list
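For anyone who wants to see it run: on Python 3.8 or later (where PEP 572 eventually landed), print's arguments are evaluated left to right, so x is bound before the later expressions use it:

    >>> print(x := 100, x + 1, x * 2, x ** 3)
    100 101 200 1000000
    >>> x   # the binding is in the enclosing scope, by design
    100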
Re: object types, mutable or not?
On 16 May 2018 at 14:23, Ned Batchelder wrote: > I've also experimented with different ways to better say "everything is an > object". One possibility is, "any right-hand side of an assignment is an > object," though that is a bit tortured. C++ called that an "rvalue". And then went on to define things that could go on the left hand side of an assignment as "lvalues". And now we have two confusing concepts to explain - see what happens when you let a standards committee define your language? :-) Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Python-list Digest, Vol 176, Issue 16
On 14 May 2018 at 20:02, Paul wrote:
> 1) I understand the added cost of verifying the sequence. However, this
> appears to be a one-time cost. E.g., if I submit this,
>
> random.choices(lm, cum_weights=[25,26,36,46,136], k=400)
>
> then the code will do an O(log n) operation 400 times.
>
> If verification was added, then the code would do an O(log n) operation
> 400 times, plus an O(n) operation done *one* time. So, I'm not sure
> that this would be a significant efficiency hit (except in rare cases).

That's a good point. But as I don't have any need myself for
random.choices with a significant population size (the only case where
this matters) I'll leave it to those who do use the functionality to
decide on that point.

Regards,
Paul
--
https://mail.python.org/mailman/listinfo/python-list
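For the sake of argument, the proposed one-time check could be as simple as this sketch (is_nondecreasing is a made-up helper, not anything in the random module):

    def is_nondecreasing(seq):
        # One O(n) pass over the cumulative weights, paid once per
        # choices() call rather than once per draw
        return all(a <= b for a, b in zip(seq, seq[1:]))

    assert is_nondecreasing([25, 26, 36, 46, 136])  # the valid input above
    assert not is_nondecreasing([25, 20, 36])       # the kind of input the
                                                    # proposal would reject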
Re: random.choices() Suggest that the code confirm that cum_weights sequence is in ascending order
On 14 May 2018 at 14:07, Steven D'Aprano wrote: > On Mon, 14 May 2018 12:59:28 +0100, Paul Moore wrote: > >> The problem is that supplying cum_weights allows the code to run in >> O(log n) by using bisection. This is significantly faster on large >> populations. Adding a test that the cumulative weights are nondecreasing >> would add an O(n) step to the code. >> >> So while I understand the OP's problem, I don't think it's soluble >> without making the cum_weights argument useless in practice. > > How does O(N) make it "useless"? There are lots of O(N) algorithms, even > O(N**2) and O(2**N) which are nevertheless still useful. Well, I've never seen an actual use case for this argument (I can't think of a case where I'd even have cumulative weights rather than weights, and obviously calculating the cumulative weights from the actual weights is what we're trying to avoid). And if you have cum_weights and O(n) is fine, then calculating weights from cum_weights is acceptable (although pointless, as it simply duplicates work). So the people who *really* need cum_weights are those who have the cumulative weights already, and cannot afford an O(n) precalculation step. But yes, clearly in itself an O(n) algorithm isn't useless. And agreed, in most cases whether random.choices() is O(n) or O(log n) is irrelevant in practice. Paul -- https://mail.python.org/mailman/listinfo/python-list
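As a concrete illustration of why computing cum_weights yourself is usually pointless, the conversion is a one-liner (using the cumulative weights quoted earlier in this thread):

    >>> from itertools import accumulate
    >>> weights = [25, 1, 10, 10, 90]
    >>> list(accumulate(weights))
    [25, 26, 36, 46, 136]

That is exactly the O(n) step random.choices performs for you when you pass weights=, so passing cum_weights= only wins if you already have the running totals to hand.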
Re: random.choices() Suggest that the code confirm that cum_weights sequence is in ascending order
On 14 May 2018 at 13:53, Chris Angelico wrote:
> On Mon, May 14, 2018 at 10:49 PM, Paul Moore wrote:
>> On 14 May 2018 at 13:27, Chris Angelico wrote:
>>> On Mon, May 14, 2018 at 9:59 PM, Paul Moore wrote:
>>>> The problem is that supplying cum_weights allows the code to run in
>>>> O(log n) by using bisection. This is significantly faster on large
>>>> populations. Adding a test that the cumulative weights are
>>>> nondecreasing would add an O(n) step to the code.
>>>
>>> Hang on - are the 'n' and 'log n' there referring to the same n?
>>
>> Yes. The number of elements in the sample population (which is the
>> same as the number of entries in the weights/cum_weights arrays).
>
> Okay, cool. Thanks. I was a little confused as to whether the weights
> were getting grouped up or not. Have seen too many cases where someone
> panics about an O(n²) on a tiny n that's unrelated to the important
> O(n) on a huge n :)

Yeah, for all of *my* uses of the functions in random, n is so small
as to make all this irrelevant. But when I looked into how cum_weights
worked, I realised it's aimed at people passing significant-sized data
sets. And they would probably be hit hard by a change from O(log n) to
O(n).

One thing I always liked about C++ was the way the standard library
documented the complexity guarantees of many of its operations. It not
only made it easier to know what was costly and what wasn't, it also
made it much clearer which functions were intended for use on large
data sets. I sort of miss that information in Python - not least
because functions like random.choices are often a lot faster than I'd
naively expect.

Paul
--
https://mail.python.org/mailman/listinfo/python-list
Re: random.choices() Suggest that the code confirm that cum_weights sequence is in ascending order
On 14 May 2018 at 13:27, Chris Angelico wrote:
> On Mon, May 14, 2018 at 9:59 PM, Paul Moore wrote:
>> The problem is that supplying cum_weights allows the code to run in
>> O(log n) by using bisection. This is significantly faster on large
>> populations. Adding a test that the cumulative weights are
>> nondecreasing would add an O(n) step to the code.
>
> Hang on - are the 'n' and 'log n' there referring to the same n?

Yes. The number of elements in the sample population (which is the
same as the number of entries in the weights/cum_weights arrays).

See https://github.com/python/cpython/blob/master/Lib/random.py#L382
for details, but basically calculating cum_weights from weights costs
O(n), and locating the right index into the population by doing a
bisection search (bisect.bisect) on the cum_weights sequence costs
O(log n). Using the cum_weights argument rather than the weights
argument skips the O(n) step.

If it's possible to check that cum_weights is nondecreasing in
O(log n) time (either directly here, or in bisect.bisect), then the
check wouldn't affect the algorithmic complexity of that case (it
would affect the constants, but I assume we don't care too much about
that). But I don't know of a way of doing that.

Improving the documentation is of course free of runtime cost. And
making it clear that "you should only use cum_weights if you know what
you're doing - in particular, only if you already have the cumulative
weights, so you aren't paying an O(n) cost to work them out" would
seem entirely reasonable to me.

Paul
--
https://mail.python.org/mailman/listinfo/python-list
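To show the shape of the fast path, here is a rough sketch of the selection step, simplified from the linked code (treat it as illustrative rather than exact):

    import bisect
    from random import random

    def one_choice(population, cum_weights):
        # A single draw costs one O(log n) bisection; nothing here
        # ever walks the whole cum_weights list
        total = cum_weights[-1]
        i = bisect.bisect(cum_weights, random() * total)
        return population[i]

Any validation of cum_weights would be an extra O(n) pass on top of this, which is precisely the cost the argument exists to avoid.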
Re: random.choices() Suggest that the code confirm that cum_weights sequence is in ascending order
The problem is that supplying cum_weights allows the code to run in O(log n) by using bisection. This is significantly faster on large populations. Adding a test that the cumulative weights are nondecreasing would add an O(n) step to the code. So while I understand the OP's problem, I don't think it's soluble without making the cum_weights argument useless in practice. Better documentation might be worthwhile (although I don't personally find the current docs confusing, so suggestions for improvements would be helpful). Paul On 14 May 2018 at 12:36, Steven D'Aprano wrote: > Hi Paul, and welcome! > > On Sun, 13 May 2018 17:48:47 -0700, Paul wrote: > >> Hi, >> I just learned how to use random.choices(). > [...] >> Consequently, I specified 'cum_weights' with a sequence which wasn't in >> ascending order. I got back k results but I determined that they >> weren't correct (eg, certain population values were never returned). >> >> Since the non-ascending sequence, which I had supplied, could not >> possibly be valid input, why isn't this checked (and an error returned)? >> Returning incorrect results (which could be hard to spot as being >> incorrect) is much more dangerous. Also, checking that the list is in >> ascending order need only be done once, and seems like it would be >> inexpensive. > > Sounds like a reasonable feature request to me. > > > https://bugs.python.org/issue33494 > > > > -- > Steve > > -- > https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
Re: Python 2.7.15 Windows MSI removes previous 2.7.x binaries?
It's intended behaviour (to my knowledge). Or at least, we don't
intend for people to install two different patch versions in parallel
(at least not with the official installers). I thought this behaviour
was always the case. It may be related to the installer technology
involved, though, so it may have changed when we switched to WiX.

You're right that the details don't seem to be well documented. It
might be worth raising a docs bug on bugs.python.org asking for the
details to be clarified.

Paul

On 1 May 2018 at 18:28, wrote:
> I downloaded the 64-bit Windows MSI for Python 2.7.15 and upon finishing the
> installation, I noted that prior Python installs had effectively been
> removed, only leaving a Lib and Scripts folder in the directory to which said
> prior version had been installed. For what it's worth, it appears this is
> only true if I install for 'All users' rather than for 'just this user'.
>
> I can't find anything in the release notes, docs or on this mailing list that
> describes the motivation for doing this. I personally find that having
> multiple versions installed on Windows is very helpful for being able to
> target a distribution version for the purposes of constructing virtualenvs.
>
> Can anyone say whether this is intended behavior?
> --
> https://mail.python.org/mailman/listinfo/python-list
--
https://mail.python.org/mailman/listinfo/python-list
Re: venv: make installed modules visible
On 1 May 2018 at 17:06, Rich Shepard wrote:
> Activating venv and trying to run the project Python tells me it cannot
> find the wxPython4 modules:
>
> Traceback (most recent call last):
> File "./openEDMS.py", line 12, in
> import wx
> ModuleNotFoundError: No module named 'wx'
>
> I've read the Python3 venv standard library doc page (section 28.3)
> without seeing how to make installed modules (such as wxPython, psycopg2,
> and SQLAlchemy) visible in the virtual environment. I suspect that EnvBuilder
> is involved but I'm not seeing how to apply this class ... if that's how
> modules are made visible in the venv.
>
> A clue is needed.

Maybe you need --system-site-packages?

Paul
--
https://mail.python.org/mailman/listinfo/python-list
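For concreteness, something along these lines should work - the paths here are illustrative, and note that the option has to be given when the environment is created:

    python3 -m venv --system-site-packages ~/venvs/openEDMS
    source ~/venvs/openEDMS/bin/activate
    python -c "import wx; print(wx.version())"   # should now find wxPython

Alternatively, for an existing environment, you can edit the pyvenv.cfg file at its root and set include-system-site-packages = true.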
Re: www.python.org down
It's working for me now. Paul On 30 April 2018 at 18:38, Jorge Gimeno wrote: > Not sure who to report to, but the site comes back with a 503. Anyone know > where I can direct this to? > > -Jorge L. Gimeno > -- > https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
Re: Installation of tensorflow via pip -- messages?
On 26 April 2018 at 21:18, Terry Reedy wrote: >> If my memory is correct, this is the default for path directories. > > The Python entries do, as added by the Windows Installer written by a > Microsoft engineer, so this must at least be a correct alternative. It's definitely acceptable - there's no doubt the pip 10.0.1 behaviour is a bug (that's been fixed). No-one is arguing otherwise. The suggestion to remove the backslashes was nothing more than a workaround that can be used until the next release of pip. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Installation of tensorflow via pip -- messages?
On 26 April 2018 at 20:04, Virgil Stokes wrote: > IMHO it would have been useful to have "warning" somewhere in these > messages. Ha, I'd never even noticed that it didn't... I think it's in a different colour, FWIW, but your point is good. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Installation of tensorflow via pip -- messages?
On 26 April 2018 at 19:33, Virgil Stokes wrote: > Why am I getting this message, that I need to consider adding this directory > to PATH when it is already in PATH? > Note, all of these *.exe files are in C:\Python36\Scripts. The PATH entry ends with a backslash, which is confusing the check done by pip. It's a known issue and has been fixed in the development version of pip, so it'll be resolved in the next release. In the meantime, you can either remove the redundant trailing backslash from your PATH, or just ignore the warning. Paul -- https://mail.python.org/mailman/listinfo/python-list
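To illustrate the failure mode (this is a simplification of the idea, not pip's literal code), on Windows:

    >>> import os.path
    >>> on_path = 'C:\\Python36\\Scripts\\'   # trailing backslash, as in PATH
    >>> scripts = 'C:\\Python36\\Scripts'
    >>> on_path == scripts                    # a naive string comparison misses it
    False
    >>> os.path.normpath(on_path) == os.path.normpath(scripts)
    True

Normalising both sides before comparing makes the trailing separator irrelevant, which is broadly the sort of change that went into the development version.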
Re: Determining whether a package or module should be installed globally using pip
On 25 April 2018 at 16:32, Rodrigo Acosta wrote:
> Is there a rule of thumb in deciding where to install a package? What makes a
> package, other than security vulnerabilities, better to install globally e.g.
> using sudo pip install, or by changing directory to a tmp folder, or by using
> virtualenv?

Generally, I would say you should *never* use "sudo pip install". The
system-installed Python is managed by your distribution package
manager, and you should only ever make changes to that using
vendor-supplied packages ("apt-get install python-requests" or
whatever the incantation might be).

If you're installing packages for a project, virtualenvs are a very
good idea. (You can also use higher-level tools like pew or pipenv to
make management of virtual environments easier.) Or if you just want
to install packages for your own use in ad-hoc scripts, etc, then you
can install them using "pip install --user" - this will make them
available in the system Python without altering system-managed files
or directories (note that I *didn't* use sudo!)

Hope this helps,
Paul
--
https://mail.python.org/mailman/listinfo/python-list
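So for day-to-day use the pattern looks something like this (requests is just an example package):

    # Per-project work: an isolated environment, nothing touches the system
    python3 -m venv .venv
    source .venv/bin/activate
    pip install requests

    # Personal ad-hoc scripts: user site-packages, still no sudo
    python3 -m pip install --user requests

Packages installed with --user land in your user site directory; running "python -m site --user-site" will show you where that is.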
Pip 10.0.1 has been released
On behalf of the PyPA, I am pleased to announce that pip 10.0.1 has just been released. This release fixes a number of issues with the initial release of pip 10.0, notably: * A problem with running the "pip.exe" wrapper script on Windows from a directory with a space in the name. * A problem with get-pip.py needing to be renamed on Windows to avoid triggering a check in pip that aborts the run. * A problem with build isolation when pip is installed as --user * An issue with the vendored msgpack library on older versions of Python 2.7 * A problem with pip installing from non-editable VCS URLs Thanks to all the people who reported issues and helped with the fixes. Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: The basics of the logging module mystify me
On 19 April 2018 at 02:00, Skip Montanaro wrote:
> I really don't like the logging module, but it looks like I'm stuck
> with it. Why aren't simple/obvious things either simple or obvious?

If you can use non-stdlib things there are alternatives. I've heard
good things about logbook (https://logbook.readthedocs.io/en/stable/),
although I will say I've never tried it myself.

I do agree that the stdlib logging module, while technically powerful,
is frustratingly clumsy to use in all of the relatively simple
situations where I've felt it might be helpful :-(

Paul
--
https://mail.python.org/mailman/listinfo/python-list
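To be fair to the stdlib, the minimal incantation isn't huge once you know it - the frustration is mostly in discovering it. A sketch of the simple case:

    import logging

    # One-time setup, ideally done only in the program's entry point
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )

    log = logging.getLogger(__name__)
    log.info("processing %s items", 42)  # lazy %-style formatting, by convention

The traps tend to be elsewhere: handler configuration in libraries, propagation, and the fact that a second call to basicConfig is silently ignored once the root logger has handlers.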
Pip 10.0 has been released
On behalf of the PyPA, I am pleased to announce that pip 10.0 has just been released. This release has been the culmination of many months of work by the community. To install pip 10.0, you can run python -m pip install --upgrade pip or use get-pip, as described in https://pip.pypa.io/en/latest/installing. If you are using a version of pip supplied by your distribution vendor, vendor-supplied upgrades will be available in due course (or you can use pip 10 in a virtual environment). (One minor issue with using get-pip on Windows - when you download get-pip.py, rename it to something that doesn't include "pip" in the name, such as "gp.py", as the standard name triggers a check in pip that aborts the run - this is being tracked in https://github.com/pypa/pip/issues/5219). Highlights of the new release: * Python 2.6 is no longer supported - if you need pip on Python 2.6, you should stay on pip 9, which is the last version to support Python 2.6. * Support for PEP 518, which allows projects to specify what packages they require in order to build from source. (PEP 518 support is currently limited, with full support coming in future versions - see the documentation for details). * Significant improvements in Unicode handling for non-ASCII locales on Windows. * A new "pip config" command. * The default upgrade strategy has become "only-if-needed" * Many bug fixes and minor improvements. In addition, the previously announced reorganisation of pip's internals has now taken place. Unless you are the author of code that imports the pip module (or a user of such code), this change will not affect you. If you are affected, please report the issue to the author of the offending code (refer them to https://mail.python.org/pipermail/distutils-sig/2017-October/031642.html for the details of the announcement). Thanks to everyone who put so much effort into the new release. Many of the contributions came from community members, whether in the form of code, participation in design discussions, or bug reports. The pip development team is extremely grateful to everyone in the community for their contributions. Thanks, Paul -- https://mail.python.org/mailman/listinfo/python-list
Re: Asynchronous processing is more efficient -- surely not?
On 4 April 2018 at 08:27, Steven D'Aprano wrote:
> "Asynchronous programming has been gaining a lot of traction in the past
> few years, and for good reason. Although it can be more difficult than
> the traditional linear style, it is also much more efficient."
>
> I can agree with the first part of the first sentence (gaining a lot of
> traction), and the first part of the second sentence (more difficult than
> the traditional style), but the second part? Asynchronous processing is
> *more efficient*?

I'd need to know what "efficient" meant. Obviously you're never going
to get more than 100% utilisation of a single core with async (because
of the GIL), but I can easily imagine async making more complete use
of that core by having less time spent waiting for I/O. Whether you
describe that as "more efficient" use of the CPU, or something else, I
don't know.

Honestly, that paragraph reads more like sales blurb than anything
else, so I'd be inclined to take it with a pinch of salt anyway. IMO,
async has proved useful for handling certain types of I/O-bound
workloads with lower overheads[1] than traditional multi-threaded or
multi-process designs. Whether it's a good fit for any particular
application is something you'd have to test, as with anything else.

Paul

[1] I found it really hard to avoid saying "more efficiently" there.
Not sure what that implies other than that the phrase means whatever
you want it to mean!!!
--
https://mail.python.org/mailman/listinfo/python-list
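As a toy demonstration of the "less time waiting" point, with asyncio.sleep standing in for real network I/O:

    import asyncio
    import time

    async def fake_request(i):
        await asyncio.sleep(1)   # pretend this is a slow network call
        return i

    async def main():
        # Ten "requests" overlap their waits, so this takes ~1s, not ~10s
        return await asyncio.gather(*(fake_request(i) for i in range(10)))

    start = time.perf_counter()
    results = asyncio.get_event_loop().run_until_complete(main())
    print(len(results), "results in", round(time.perf_counter() - start, 1), "s")

That is the sense in which async can be "more efficient": it overlaps the waits, it doesn't make the CPU-bound parts any faster.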