[Python-ideas] Re: Adding str.remove()

2021-05-01 Thread Cameron Simpson
On 01May2021 05:30, David Mertz  wrote:
>I was actually thinking about this before the recent "string comprehension"
>thread.  I wasn't really going to post the idea, but it's similar enough
>that I am nudged to.  Moreover, since PEP 616 added str.removeprefix() and
>str.removesuffix(), this feels like a natural extension of that.
>
>I find myself very often wanting to remove several substrings of similar
>lines to get at "the good bits" for my purpose.  Log files are a good
>example of this, but it arises in lots of other contexts I encounter.
>Let's take a not-absurd hypothetical:
>
>GET [http://example.com/picture] 200 image/jpeg
>POST [http://nowhere.org/data] 200 application/json
>PUT [https://example.org/page] 200 text/html
>
>For each of these lines, I'd like to see the URL and the MIME type only.
>The new str.removeprefix() helps some, but not as much as I would like
>since the "remove a tuple of prefixes" idea was rejected for PEP 616.  But
>even past that, very often much of what I want to remove is in the middle,
>not at the start or the end.

This is not a good way to tidy up log lines. Try parsing each line into fields:

GET
http://example.com/picture
200
image/jpeg

and then only looking at the fields you care about.

>I know I can use regular expressions here.  However, they are definitely a
>higher cognitive burden, and especially so for those who haven't taught
>them and written about them a lot, as I have.  Even for me, I'd rather not
>think about regexen if I don't *have to*.

Though for this, they are ok. Or even just:

method, _url_, code, mimetype = line.split(None,3)

There shouldn't be any whitespace in a log line URL - it should be 
percent encoded.
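A rough sketch of the field-based approach, for illustration only: the function names `parse_by_split` and `parse_by_regex` and the pattern `LOG_RE` are assumptions made for this example, not anything from the thread. Both assume the line shape shown in the quoted sample (method, bracketed URL, status, MIME type).

```python
import re

# Assumed line shape, based on the quoted examples: METHOD [URL] STATUS MIMETYPE
LOG_RE = re.compile(
    r"(?P<method>[A-Z]+) \[(?P<url>[^\]]*)\] (?P<code>\d{3}) (?P<mimetype>\S+)"
)


def parse_by_split(line):
    """Split on whitespace; assumes the URL itself contains no whitespace."""
    method, url, code, mimetype = line.split(None, 3)
    # Remove the surrounding square brackets from the URL field only.
    return url.strip("[]"), mimetype.strip()


def parse_by_regex(line):
    """Match the whole line against an anchored pattern."""
    m = LOG_RE.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"unrecognised log line: {line!r}")
    return m["url"], m["mimetype"]
```

Either way the URL is treated as one opaque field, so a path segment that happens to look like a method name or status code is left alone.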

>So probably I'll do something
>like this:
>
>for line in lines:
>    for noise in ('GET', 'POST', 'PUT', '200', '[', ']'):
>        line = line.replace(noise, '')

This is a very bad way to do this. What about the URL
"http://example.com/foo/PUT/bah"? Badness ensues. It's worse than using
a well written regexp.

>    process_line(line)
>
>That's not horrible, but it would be nicer to write:
>
>for line in lines:
>    process_line(line.remove(('GET', 'POST', 'PUT', '200', '[', ']')))

I'm -1 on this idea.

As you note, str.replace already exists and does what your line.remove 
does, just on a single substring basis. It's a trivial exercise to write 
an mreplace(s,substrs) function. Just do it and put it in your personal 
kit, and import it.
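A minimal sketch of such a helper, assuming the signature hinted at above plus an optional `replacement` argument (that argument and the `NOISE` tuple are additions for illustration):

```python
def mreplace(s, substrs, replacement=""):
    """Replace every occurrence of each substring in substrs, in order."""
    for sub in substrs:
        s = s.replace(sub, replacement)
    return s


# The noise strings from the quoted example.
NOISE = ("GET", "POST", "PUT", "200", "[", "]")
```

Called on `"PUT [http://example.com/foo/PUT/bah] 200 text/html"`, this also removes the `PUT` inside the URL path, which is exactly the failure mode objected to above.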

>Of course, if I really needed this as much as I seem to be suggesting, I
>know how to write a function `remove_strings()`... and I confess I have not
>done that. Or at least I haven't done it in some standard "my_utils" module
>I always import.  Nonetheless, a string method would feel even more natural
>than a function taking the string as an argument.

A method is almost always "easier/natural", but how many do we really 
want? If you really want this, write a StrMixin with a bunch of nice 
methods, subclass str, and promote your lines to your new subclass.  
Methods managed!
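One way the mixin idea might look, as a sketch only: the class names `StrMixin` and `LogLine` and the `remove` method are hypothetical and chosen for this example.

```python
class StrMixin:
    """Extra convenience methods for str subclasses (illustrative only)."""

    def remove(self, substrs):
        # Drop every occurrence of each substring, then re-wrap the result
        # in the same subclass so further mixin methods stay available.
        s = str(self)
        for sub in substrs:
            s = s.replace(sub, "")
        return type(self)(s)


class LogLine(StrMixin, str):
    """A str with the mixin methods attached."""
```

Promoting a line is then `LogLine(line)`, after which `LogLine(line).remove(("GET", "200"))` reads like the proposed method without changing `str` itself.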

Cheers,
Cameron Simpson 
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NIOOPW2LJ754EVTAVEDQWFW2RTCD2CH7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: String comprehension

2021-05-01 Thread Steven D'Aprano
On Sat, May 01, 2021 at 06:21:43AM -, Valentin Berlier wrote:

> > The builtin iterables bytearray, bytes, enumerate, filter,
> > frozenset, map, memoryview, range, reversed, tuple and zip suggest
> > differently.
> 
> enumerate, filter, map, range, reversed and zip don't apply because 
> they're not collections,

You didn't say anything about *collections*, you talked about builtin 
*iterables*.

And range is a collection:

>>> import collections.abc
>>> isinstance(range(10), collections.abc.Collection)
True


> you wouldn't be able to store the result of 
> the computation anywhere.

I don't know what this objection means. The point of iterators like map, 
zip and filter is to *avoid* performing the computation until it is 
required.
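A small demonstration of that laziness, offered as a sketch (the helper `square` and the variable names are made up for the example): nothing is computed when the `map` object is created, only when it is consumed.

```python
import collections.abc

calls = []


def square(x):
    calls.append(x)  # record each time work is actually performed
    return x * x


lazy = map(square, range(3))
nothing_yet = list(calls)  # snapshot taken before consuming: still empty
results = list(lazy)       # consuming the iterator performs the computation

# map objects are iterators, not collections; range objects are collections.
map_is_collection = isinstance(lazy, collections.abc.Collection)
range_is_collection = isinstance(range(3), collections.abc.Collection)
```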


-- 
Steve
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/E452U6MTPRL4G7SYP6BJBPFUYHZPCNJ5/


[Python-ideas] Re: String comprehension

2021-05-01 Thread Valentin Berlier
> you talked about builtin *iterables*

My mistake, I reused the terminology used by the original author to make it 
easier to follow.

> The point of iterators like map, zip and filter is to *avoid* performing the 
> computation until it is required.

Of course. Maybe I wasn't clear enough. I don't know why we're bringing up 
these operators in a discussion about comprehensions. And what would a "range" 
comprehension even look like? To me, the fact that there are no comprehensions for
enumerate, filter, map, range, reversed and zip doesn't contribute to making 
dict, list and set exceptional cases.

As I said we're left with bytearray, frozenset and memoryview. These are much 
less frequently used and don't even have a literal form so expecting 
comprehensions for them would be a bit nonsensical. On the other hand strings, 
bytes, lists, dicts and sets all have literal forms but only lists, dicts and 
sets have comprehensions. Three out of five doesn't make them exceptional cases 
so it's only logical to at least consider the idea of adding comprehensions for 
strings (and bytes) too.
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/MU5CMPQJ2REDMDOKHFJTJCF2B3F5LIPN/


[Python-ideas] Re: String comprehension

2021-05-01 Thread Stestagg
On Fri, 30 Apr 2021 at 17:08, David Álvarez Lombardi 
wrote:

> I propose a syntax for constructing/filtering strings analogous to the one
> available for all other builtin iterables. It could look something like
> this.
>
> >>> dirty = "f8sjGe7"
> >>> clean = c"char for char in dirty if char in string.ascii_letters"
> >>> clean
> 'fsjGe'
>
> Currently, the best way to do this (in the general case) seems to be the
> following.
> >>> clean = "".join(char for char in dirty if char in string.ascii_letters)
>
> But I think the proposed syntax would be superior for two main reasons.
>

I’m not against a specialised string generator construct per se (I’m not
for it either :) as it’s not a problem I have experienced, and I’ve been
doing a lot of string parsing/formatting at scale recently) but that
doesn’t mean your use-cases are invalid.

To me, the chosen syntax is problematic.  The idea of introducing
structural logic by using “” seems likely to cause confusion. Across all
languages I use, quotes are generally and almost always used to introduce
constant values.  Sometimes, maybe, there are macro related things that may
use quoting, but as a developer, if I see quotes, I’m thinking: the runtime
will treat this as a constant.

Having a special case where the quotes are a glorified function call just
feels very wrong to me.  And likely to be confusing.

Steve




>- Consistency with the comprehension style for all other iterables
>(which seems to be one of the most beloved features of python)
>- Confusion surrounding the str.join(iter) syntax is very well
>documented, and I believe it is particularly unintuitive when the
>string is empty
>
> I also believe the following reasons carry some weight.
>
>- Skips unnecessary type switching from str to iter and back to str
>- Much much MUCH more readable/intuitive
>
> Please let me know what you all think. It was mentioned (by @rhettinger)
> in the PBT issue  that this will
> likely require a PEP which I would happily write if there is a positive
> response.
>
> --
>
> *David Álvarez Lombardi*
> Machine Learning Spanish Linguist
> Amazon | Natural Language Understanding
>   Boston, Massachusetts
>   alvarezdqal 
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/MVQGP4GGTIWQRJTSY5S6SDYES6JVOOGK/
>
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/KKE25ZZP62CLTWQWKLZCNL7NKTRIRSL6/


[Python-ideas] Re: String comprehension

2021-05-01 Thread Brendan Barnwell

On 2021-04-30 11:14, David Álvarez Lombardi wrote:


> To me, it is hard to see how any argument against this design (for
> anything other than implementation-difficulty or something along these
> lines) can be anything but an argument against iter comprehensions in
> general... but if someone disagrees, please say so.


	The difference between your proposal and existing comprehensions is 
that strings are very different from lists, dicts, sets, and generators 
(which are the things we currently have comprehensions for).  The syntax 
for those objects is Python syntax, which is strict and can include 
expressions that have meaning that is interpreted by Python.  But 
strings can contain *anything*, and in general (apart from f-strings) 
their content is not parsed by Python.  You can't do this:


[wh4t3ver I feel like!!!  okay?^@^&]

But you can do this:

"wh4t3ver I feel like!!!  okay?^@^&"

	This means that the way people think about and visually comprehend 
strings is quite different from other Python types.  You propose to have 
the string delimiters now contain actual Python code that Python will 
parse and run, but this isn't what people are used to seeing between 
quote marks.


	I think the closest existing thing to your string comprehensions is not 
any existing comprehension, but rather f-strings, which are the one 
place where Python does potentially parse and execute code in a string. 
 However, f-strings are different in notable ways.


	First, the code in f-strings is delimited (by curly braces), so it is 
visually distinguished from "freeform" text within the string.  Second, 
f-strings do not restrict the normal usage of strings for freeform text 
content (apart from making the curly brace characters special).  So 
`f"wh4t3ver I feel like!!!  okay?^@^&"` is a valid f-string just like 
it's a valid string.  In your proposal (I assume), something like 
`c"item for item in other_seq and then the string text continues here"` 
would have to be a syntax error.  That is, unlike f-strings (or any 
other existing kind of string), the string comprehension would "claim" 
the entire string and you could no longer put normal string content in 
there.
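To make the contrast concrete, here is a short sketch (variable names are illustrative): an f-string without braces is just ordinary text, and code is evaluated only inside the delimited `{...}` part.

```python
seq = "abc"

# Without braces, the f prefix changes nothing: the text is taken literally.
plain = "wh4t3ver I feel like!!!  okay?^@^&"
formatted = f"wh4t3ver I feel like!!!  okay?^@^&"

# With braces, only the delimited expression is parsed and run by Python;
# the freeform text on either side is untouched.
mixed = f"freeform text, then {''.join(reversed(seq))}, then more text"
```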


	Your proposal is focusing on strings as iterables and drawing a 
parallel with other kinds of iterables for which we have comprehensions. 
 But strings aren't like other iterables because they're primarily 
vessels for freeform text content, not structured data.


	For the same reason, string comprehensions are likely to be less 
useful.  I would look doubtfully on code that tried to do anything 
complex in a string comprehension, in the same way that I would look 
doubtfully on code that used f-strings with huge, complex expressions. 
It would be more readable to do whatever data preparation you need to do 
before creating the string and then use a simpler final step to create 
the string itself.


	Also, string comprehensions would only facilitate the creation of 
simple "linear" strings which draw their content sequentially from 
iterables.  I find that in practice, if I want to create a string, 
programmatically, I'm not doing that.  Rather, I'm pulling disparate 
content from different places and putting it together in a template-like 
fashion, in the way that f-strings or str.format() facilitate.  So I 
don't think this proposal would have much practical use in string creation.


	So overall I think your proposed string comprehensions would tend to 
make Python code less readable in the relatively rare cases where they 
were useful at all.


--
Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is no 
path, and leave a trail."

   --author unknown
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/B2SY2UI7FNOBTCZILYEHPXECJNJ2YBLH/


[Python-ideas] Re: Changing The Theme of Python Docs Site

2021-05-01 Thread Stephen J. Turnbull
Abdur-Rahmaan Janhangeer writes:

 > I have been reading the Python docs since long.
 > I have enjoyed it, it has great pieces of information.
 > You have how-tos, faqs etc. Really awesome to read.

Thank you! ;-)

 > However, I feel that the style is a bit bland and off putting
 > for newcomers. I suggest we consider changing the theme
 > to a crisper and cleaner look. I find docs such as Masonite
 > really enjoyable to read: https://docs.masoniteproject.com/

 > I do hope we can advance something along these lines.

Despite my personal stylistic preferences (see below), I think this
could be a move toward docs that make people happy and proud to see
their contribution in pixels, so I'm supportive in general.  You
should join the python-docs group
(https://mail.python.org/mailman3/lists/docs.python.org/)
and get in touch with the docs working group
https://mail.python.org/archives/list/python-...@python.org/message/LELGQN3HMOJXWD4QCPBL5EZVFAFX7SGC/
to most effectively promote your ideas.  That's where the folks who do
a lot of work specifically on docs hang out.

To get started on the discussion, just as one person's opinion:  I'm
not a fan of the Masonite color scheme, to be honest.  (White on navy
for the screen shots, to be specific.)  I find the Python docs' pastel
backgrounds less jarring, and easily readable.  But I'm not going to
shoot down that hill, let alone die on it. ;-)

Also, I don't really see a big difference with the Python
documentation, and I prefer the denser Python text to that on the
front page of Masonite (but I'm a *very* text-oriented person, drives
my poor students nuts!)

I didn't much like the dual sidebar layout.

Overall I'd recommend tweaking the current theme to try to improve
your readability, rather than moving to a completely different theme.

I don't know if anybody else feels that way, so YMMV.

Steve


-- 
Associate Professor  Division of Policy and Planning Science
http://turnbull.sk.tsukuba.ac.jp/ Faculty of Systems and Information
Email: turnb...@sk.tsukuba.ac.jp   University of Tsukuba
Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/G2DE72ZXJWPVA46GABQQPRDJOUMOAD3R/


[Python-ideas] Re: Adding str.remove()

2021-05-01 Thread David Mertz
On Sat, May 1, 2021, 3:17 AM Cameron Simpson  wrote:

> >Let's take a not-absurd hypothetical:
> >
> >GET [http://example.com/picture] 200 image/jpeg
> >POST [http://nowhere.org/data] 200 application/json
> >PUT [https://example.org/page] 200 text/html
>
> Though for this, they are ok. Or even just:
>
> method, _url_, code, mimetype = line.split(None,3)
>

Notice that in my example, I have extra square brackets that I want to get
rid of as well, which your line doesn't do. Of course an extra line or two
could. But often enough, I want to remove certain fixed substrings in lines
that don't have a uniform delimiter like a space.

> There shouldn't be any whitespace in a log line URL - it should be
> percent encoded.
>

Lots of things "should be" :-). Sadly, I deal with "actually existing data."
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/EN65SJDKJ75WWM7R7M5O6JYUBO67RCFJ/


[Python-ideas] TACE16 text encoding for Tamil language

2021-05-01 Thread Stephen J. Turnbull
You wrote:

 > I want to use this encoding
 > 
 > for Tamil language text

As written, it sounds like you just want help.  If so, this list is
for proposals to change Python itself (including the standard
library), and this should have been posted to python-list or to
StackExchange.

If you do mean to propose this for the stdlib, it is highly unlikely
to get in as proposed since the encoding commandeers private space in
the BMP, which is a scarce resource.  We *can* do that, but it's very
likely that the general sentiment will be "do it in a PyPI module,
then it *can't* cause anybody else any trouble."  In principle it's
not our job to "fix" Unicode.  That's the work of the relevant
national standards body for Tamil and the Unicode Consortium.  (I am
not authoritative, so if that's what you want, don't take my word for
it.  I just want you to be prepared for what I expect to be strong
pushback, and what the argument will be.)

About the proposal:

If you are planning to use TACE16 as an interchange format, you don't
need a codec; you just treat it as normal UTF-8 (or any other UTF, for
that matter).  Python does not care whether a character is standard or
private, it just adds it to the str the codec is building.

If you propose to use the codec to translate standard Unicode to
TACE16 as the internal format, the obvious (rough) idea would be to
just plug the converter you have written into the stdlib's Unicode
codecs as a post-processor when there is a Unicode character in the
(standard) Tamil block.  This would then handle both the standard
Unicode encoding for Tamil, as well as TACE16 (because it would just
pass through the UTF-8 part, and the converter would ignore it).

You may want two separate codecs for output: one which produces TACE16
for you, and another which produces standard Unicode for anyone who
doesn't have TACE16 capability.

Exactly how to do that is above my pay grade, it depends on how the
postprocessor works, which depends on Tamil language knowledge that I
don't have.  Whether to rewrite the converter in C is up to you, it's
possible to call Python from C.

 > Two basic questions,
 > 
 >1. How do I approach writing a new text encoding codec for
 >   python and register it with the codec module.

Start here:
/Users/steve/src/Python/cpython/Doc/library/codecs.rst
/Users/steve/src/Python/cpython/Doc/c-api/codec.rst

To write them in C, follow the code in 
Likely needed (forgot where the Unicode codecs live, try codecs.[ch] first):
/Users/steve/src/Python/cpython/Python/codecs.c
/Users/steve/src/Python/cpython/Include/codecs.h
/Users/steve/src/Python/cpython/Objects/stringlib/codecs.h
/Users/steve/src/Python/cpython/Objects/unicodectype.c
/Users/steve/src/Python/cpython/Lib/codecs.py
/Users/steve/src/Python/cpython/Modules/_codecsmodule.c
Probably not needed:
/Users/steve/src/Python/cpython/Modules/cjkcodecs
/Users/steve/src/Python/cpython/Modules/clinic/_codecsmodule.c.h
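For the pure-Python route, a minimal sketch of registering a codec that wraps UTF-8 with a post-processing step might look like the following. Everything specific here is an assumption for illustration: the codec name `utf-8-toyremap`, the helper names, and above all the one-entry mapping (TAMIL LETTER KA, U+0B95, to the first private-use code point, U+E000), which is NOT the real TACE16 table. Only `codecs.register` and `codecs.CodecInfo` are actual stdlib API.

```python
import codecs

CODEC_NAME = "utf-8-toyremap"  # hypothetical name, not a real codec

# Hypothetical one-entry mapping standing in for a real conversion table.
_DECODE_TABLE = str.maketrans({"\u0b95": "\ue000"})
_ENCODE_TABLE = str.maketrans({"\ue000": "\u0b95"})


def _encode(text, errors="strict"):
    # Map private-use characters back to standard Unicode, then emit UTF-8.
    data = text.translate(_ENCODE_TABLE).encode("utf-8", errors)
    return data, len(text)


def _decode(data, errors="strict"):
    # Decode as ordinary UTF-8, then post-process into the internal form.
    raw = bytes(data)
    text = raw.decode("utf-8", errors).translate(_DECODE_TABLE)
    return text, len(raw)


def _search(name):
    # Lookup names arrive lower-cased; newer Pythons also turn hyphens
    # into underscores, so normalise before comparing.
    if name.replace("-", "_") == "utf_8_toyremap":
        return codecs.CodecInfo(name=CODEC_NAME, encode=_encode, decode=_decode)
    return None


codecs.register(_search)
```

After registration, `data.decode("utf-8-toyremap")` and `text.encode("utf-8-toyremap")` go through the two functions above. A production codec would also want incremental and stream variants, which this sketch omits.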

 >2. How would I convert utf-8 encoded pattern for regex into the
 >   custom codec so that the pattern and input string for
 >   re.match/search is consistent.

You don't.  That's the point of the codec: you convert all text
(including source program text) into an internal "abstract text" type
(ie, str), and then it "just works".  Instead, you would read program
text as utf-8-tace16 by placing a PEP 263 coding cookie in one of the
first two lines of your program, like this:

# -*- encoding: utf-8-tace16 -*-

If you think that's ugly, read the PEP for alternative forms.  If you
want to avoid it entirely, I'm not sure it's possible, but python-list
or StackExchange are better places to ask.
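As a small illustration of "it just works" (a sketch using plain UTF-8, since no TACE16 codec exists in the stdlib): once the input has been decoded to `str`, the pattern and the text are the same kind of object, and `re` needs no encoding-specific handling.

```python
import re

raw = "மொழி: தமிழ்".encode("utf-8")  # bytes as they might arrive from a file
text = raw.decode("utf-8")           # a custom codec name would go here instead
match = re.search("தமிழ்", text)     # pattern and input are both plain str
```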

Regards,
Steve

Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/6KE3I2GU2YP4YXW2NZOGF7WY3E77TIYJ/


[Python-ideas] Re: TACE16 text encoding for Tamil language

2021-05-01 Thread Jonathan Goble
On Sat, May 1, 2021 at 11:17 AM Stephen J. Turnbull <
turnbull.stephen...@u.tsukuba.ac.jp> wrote:

> Start here:
> /Users/steve/src/Python/cpython/Doc/library/codecs.rst
> /Users/steve/src/Python/cpython/Doc/c-api/codec.rst
>
> To write them in C, follow the code in
> Likely needed (forgot where the Unicode codecs live, try codecs.[ch]
> first):
> /Users/steve/src/Python/cpython/Python/codecs.c
> /Users/steve/src/Python/cpython/Include/codecs.h
> /Users/steve/src/Python/cpython/Objects/stringlib/codecs.h
> /Users/steve/src/Python/cpython/Objects/unicodectype.c
> /Users/steve/src/Python/cpython/Lib/codecs.py
> /Users/steve/src/Python/cpython/Modules/_codecsmodule.c
> Probably not needed:
> /Users/steve/src/Python/cpython/Modules/cjkcodecs
> /Users/steve/src/Python/cpython/Modules/clinic/_codecsmodule.c.h
>

I assume the "cpython" part of these paths here is your local clone of the
CPython GitHub repo? (Otherwise these local filepaths from your computer
don't make sense.)
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/AYYSZ76AV3L2PVVXO542VNDGIVS35HFL/


[Python-ideas] Improving sys.executable for embedded Python scenarios

2021-05-01 Thread Gregory Szorc
The way it works today, if you have an application embedding Python, your
sys.argv[0] is (likely) your main executable and sys.executable is probably
None or the empty string (per the stdlib docs which say not to set
sys.executable if there isn't a path to a known `python` executable).

Unfortunately, since sys.executable is a str, the executable it points to
must behave as `python` does. This means that your application embedding
and distributing its own Python must provide a `python` or `python`-like
standalone executable and use it for sys.executable and this executable
must be independent from your main application because the run-time
behavior is different. (Yes, you can employ symlink hacks and your
executable can sniff argv[0] and dispatch to your app or `python`
accordingly. But symlinks aren't reliable on Windows and this still
requires multiple files/executables.) **This limitation effectively
prevents the existence of single file application binaries who also want to
expose a full `python`-like environment, as there's no standard way to
advertise a mechanism to invoke `python` that isn't a standalone executable
with no arguments.**

While applications embedding Python may not have an explicit `python`
executable, they do likely have the capability to instantiate a
`python`-like environment at run-time: they have the interpreter after all,
they "just" need to provide a mechanism to invoke Py_RunMain() with an
interpreter config initialized using the "python" profile.

**I'd like to propose a long-term replacement to sys.executable that
enables applications embedding Python to advertise a mechanism for invoking
the same executable such that they get a `python` experience.**

The easiest way to do this is to introduce a list[str] variant. Let's call
it sys.python_interpreter. Here's how it would work.

Say I've produced myapp.exe, a Windows application. If you run `myapp.exe
python --`, the executable behaves like `python`. e.g. `myapp.exe python --
-c 'print("hello, world")'` would be equivalent to `python -c
'print("hello, world")'`. The app would set `sys.python_interpreter =
["myapp.exe", "python", "--"]`. Then Python code wanting to invoke a Python
interpreter would do something like
`subprocess.run(sys.python_interpreter)` and automatically dispatch through
the same executable.

For applications not wanting to expose a `python`-like capability, they
would simply set sys.python_interpreter to None or [], just like they do
with sys.executable today. In fact, I imagine Python's initialization would
automatically set sys.python_interpreter to [sys.executable] by default and
applications would have to opt in to a more advanced PyConfig field to make
sys.python_interpreter different. This would make sys.python_interpreter
behaviorally backwards compatible, so code bases could use
sys.python_interpreter as a modern substitute for sys.executable, if
available, without that much risk.
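A sketch of how calling code might consume this, assuming the proposal were adopted: `sys.python_interpreter` does not exist in any released Python, so the code below falls back to `sys.executable`, and the helper names `python_command` and `run_python` are made up for the example.

```python
import subprocess
import sys


def python_command():
    """Return the argv prefix that launches a `python`-like interpreter."""
    # Hypothetical attribute from the proposal; absent on current Pythons.
    cmd = getattr(sys, "python_interpreter", None)
    if cmd:
        return list(cmd)
    if sys.executable:
        return [sys.executable]
    raise RuntimeError("no way to launch a Python interpreter is advertised")


def run_python(args):
    """Run the interpreter with extra arguments and capture its output."""
    return subprocess.run(
        python_command() + list(args),
        capture_output=True,
        text=True,
        check=True,
    )
```

With the `myapp.exe` example, `python_command()` would return `["myapp.exe", "python", "--"]`, and `run_python(["-c", "print('hello, world')"])` would dispatch through the same executable.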

Some applications may want more advanced mechanisms than command line
arguments to dispatch off of. For example, maybe you want to key off an
environment variable to activate "Python mode."  This scenario is a bit
harder to implement, as it would require yet another advertisement on how
to invoke `python`. If subprocess had a "builder" interface for iteratively
constructing a process invocation, we could expose a stdlib function to
return a builder preconfigured to invoke `python`. But since such an
interface doesn't exist, there's not as clean a solution for cases that
require something more advanced than additional process arguments. Maybe we
could make sys.python_interpreter a tuple[list[str], dict[str, str]] where
that dict is environment variables to set. Doable. But I'm unconvinced the
complexity is warranted, especially since the application has full control
over interpreter initialization and can set most of the settings that
they'd want to set through environment variables (e.g. PYTHONHOME) as part
of initializing the `python`-like environment.

Yes, there will be a long tail of applications needing to adapt to the
reality that sys.python_interpreter exists and is a list. Checks like `if
sys.executable == sys.argv[0]` will need to become more complicated. Maybe
we could expose a simple "am I a Python interpreter process" in the stdlib?
(The inverse "am I not a Python interpreter executable" question could also
benefit from stdlib standardization, as there are unofficial mechanisms
like sys.frozen and sys._MEIPASS attempting to answer this question.)
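For reference, the unofficial check that code tends to use today looks roughly like this (a sketch: `looks_frozen` is a made-up name, and both attributes are conventions set by freezing tools rather than a stdlib guarantee):

```python
import sys


def looks_frozen():
    """Best-effort guess at whether we run inside a frozen/bundled app."""
    # sys.frozen is set by several freezing tools; sys._MEIPASS by PyInstaller.
    return bool(getattr(sys, "frozen", False)) or hasattr(sys, "_MEIPASS")
```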

Anyway, as it stands, sys.executable just doesn't work for applications
embedding Python who want to expose a full `python`-like environment from
single executable distributions. I think the introduction of a new API to
allow applications to "self-dispatch" to a Python interpreter could
eventually lead to significant ergonomic wins for embedded Python
applications. This would make Python a more attractive target for
embedding, which benefits the larger Python ecosys

[Python-ideas] Re: String comprehension

2021-05-01 Thread Christopher Barker
On Fri, Apr 30, 2021 at 11:15 PM Valentin Berlier 
wrote:

> > You could say that f-strings are redundant because they can't do
> anything that str.format can't, but  they make it possible to shave off the
> static overhead of going through python's protocols and enable additional
> optimizations.


But that was not the primary motivator for adding them to the language.

Nor is it the primary motivator for using them. I really like f-strings,
and I have never even thought about their performance characteristics.

With regard to the possible performance benefits of “string
comprehensions”: Python is already poorly performant when working with
strings character by character. Which is one reason we have nifty string
methods like .replace() and .translate. (And join).

I’d bet that many (most?) potential “string comprehensions” would perform
better if done with string methods, even if they were optimized.

Another note that I don’t think has been said explicitly— yes strings are
Sequences, but they are a very special case in that they can contain only
one type of thing: length-1 strings. Which massively reduces the possible
kinds of comprehensions one might write, and I suspect most of those are
already covered by string methods.
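Taking the filtering example from the original proposal, a sketch of the two spellings (variable names are illustrative, and the `translate` version assumes the unwanted characters are exactly the ASCII digits):

```python
import string

dirty = "f8sjGe7"

# The generator-expression spelling from the original proposal.
via_join = "".join(ch for ch in dirty if ch in string.ascii_letters)

# A string-method spelling: delete the ASCII digits in a single call.
via_translate = dirty.translate(str.maketrans("", "", string.digits))
```

Both produce `'fsjGe'` for this input; which one is faster on real data would need measuring.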

(Actually, I think this is a similar point to the one made by David Mertz.)

-CHB

-- 
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/Z3J727MCT46XPDUAEQLH7ZWEKO7QZKTX/


[Python-ideas] Re: String comprehension

2021-05-01 Thread Joao S. O. Bueno
I started out seeing this, as the people objecting put it, as something that is
really outside of the scope.

But it just occurred to me that having to use str.join _inside_ an f-string
expression is somewhat cumbersome.

I mean, think of a typical repr for a sequence class:

return f"MyClass({', '.join(str(item) for item in self) } )"


So maybe, rather than going for another kind of string, or string comprehensions,
we could have a format spec accepted by the format mini-language that does a
"map to str and join" when the item is a generator?

This might satisfy the OP's request, introduce no fundamental changes in the
way we think about the language, _and_ be somewhat useful.

The example above  could become

return f"MyClass({self:, j})"

The "j" suffix means: use ", " as the separator, and map the items to "str".
That is if the option is kept as terse as the other indicators in the format
mini-language; it could also be made more readable (bikeshed at will).
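Something close to this can be approximated today with a wrapper that implements `__format__`. This is only a sketch: the `Joined` class is hypothetical, and it treats the whole format spec as the separator rather than implementing the proposed "j" suffix.

```python
class Joined:
    """Wrap an iterable so that formatting it maps items to str and joins."""

    def __init__(self, items):
        self.items = items

    def __format__(self, spec):
        # Treat the format spec as the separator; default to ", ".
        sep = spec if spec else ", "
        return sep.join(str(item) for item in self.items)


class MyClass:
    def __init__(self, *items):
        self._items = items

    def __iter__(self):
        return iter(self._items)

    def __repr__(self):
        return f"MyClass({Joined(self):, })"
```

The spec after the colon in `{Joined(self):, }` is the two-character string `", "`, so the wrapper joins with a comma and a space.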

(Other than that, I hope it is clear I am with Steven, Chris, Christopher et al.
on the objections to the 'string comprehension' proposal as it is.)

On Sat, 1 May 2021 at 17:36, Christopher Barker  wrote:

>
>
> On Fri, Apr 30, 2021 at 11:15 PM Valentin Berlier 
> wrote:
>
>> > You could say that f-strings are redundant because they can't do
>> anything that str.format can't, but  they make it possible to shave off the
>> static overhead of going through python's protocols and enable additional
>> optimizations.
>
>
> But that was not the primary motivator for adding them to the language.
>
> Nor is it the primary motivator for using them. I really like f-strings,
> and I have never even thought about their performance characteristics.
>
> With regard to the possible performance benefits of “string
> comprehensions”: Python is already poorly performant when working with
> strings character by character. Which is one reason we have nifty string
> methods like .replace() and .translate. (And join).
>
> I’d bet that many (most?) potential “string comprehensions” would perform
> better if done with string methods, even if they were optimized.
>
> Another note that I don’t think has been said explicitly— yes strings are
> Sequences, but they are a very special case in that they can contain only
> one type of thing: length-1 strings. Which massively reduces the possible
> kinds of comprehensions one might write, and I suspect most of those are
> already covered by string methods.
>
> [actually, I think this is a similar point to that made by David Mertz]
>
> -CHB
>
> --
> Christopher Barker, PhD (Chris)
>
> Python Language Consulting
>   - Teaching
>   - Scientific Software Development
>   - Desktop GUI and Web Development
>   - wxPython, numpy, scipy, Cython
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OX7F5N5KPOEBD27OBZAGWQBTA3SDJNAO/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Namespaces!

2021-05-01 Thread Matt del Valle
Hi all!

So this is a proposal for a new soft language keyword:

namespace

I started writing this up a few hours ago and then realized as it was
starting to get away from me that there was no way this was going to be
even remotely readable in email format, so I created a repo for it instead
and will just link to it here. The content of the post is the README.md,
which github will render for you at the following link:

https://github.com/matthewgdv/namespace

I'll give the TLDR form here, but I would ask that before you reply please
read the full thing first, since I don't think a few bullet-points give the
necessary context. You might end up bringing up points I've already
addressed. It will also be very hard to see the potential benefits without
seeing some actual code examples.

TLDR:

- In a single sentence: this proposal aims to add syntactic sugar for
setting and accessing module/class/local attributes with dots in their name

- the syntax for the namespace keyword is similar to the simplest form of a
class definition statement (one that implicitly inherits from object), so:

namespace some_name:
   ...  # code goes here

- any name bound within the namespace block is bound in exactly the same
way it would be bound if the namespace block were not there, except that
the namespace's name and a dot are prepended to the key when being inserted
into the module/class/locals dict.

- a namespace block leaves behind an object that serves to process
attribute lookups on it by prepending its name plus a dot to the lookup and
then delegating it to whatever object it is in scope of
(module/class/locals)

- This would allow for small performance wins by replacing the use of class
declarations that is currently common in python for namespacing, as well as
making the writer's intent explicit

- Crucially, it allows for namespacing the content of classes, by grouping
together related methods. This improves code clarity and is useful for
library authors who design their libraries with IDE autocompletion in mind.
This cannot currently be done by nesting classes.
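
The lookup and binding rules in the bullets above can be roughly emulated
with existing Python. This is a sketch under stated assumptions, not the
proposed implementation: NamespaceProxy is a hypothetical name, and a plain
dict stands in for the module/class/locals dict.

```python
class NamespaceProxy:
    """Hypothetical stand-in for the object a namespace block leaves behind."""

    def __init__(self, name, scope):
        # Store internal state without going through our own __setattr__.
        object.__setattr__(self, "_name", name)
        object.__setattr__(self, "_scope", scope)

    def __getattr__(self, attr):
        # Prepend the namespace name plus a dot, then delegate to the scope.
        key = f"{self._name}.{attr}"
        try:
            return self._scope[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, attr, value):
        # Bind the value in the enclosing scope under a dotted key.
        self._scope[f"{self._name}.{attr}"] = value


scope = {}  # stands in for a module/class/locals dict
scope["constants"] = constants = NamespaceProxy("constants", scope)
constants.PI = 3.14159
constants.E = 2.71828
```

After this runs, scope holds the keys "constants.PI" and "constants.E"
alongside "constants" itself, which is the flat layout the TLDR describes.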

I live in the UK so I'm going to bed now (after working on this for like
the last 6 hours). I'll be alive again in maybe 8 hours or so and will be
able to reply to any posts here then.

Cheers everyone :)
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/7DAX2JTKZKLRT4CKKBRACNBJLHQUCN6E/


[Python-ideas] Re: Namespaces!

2021-05-01 Thread David Mertz
So this is exactly the same as `types.SimpleNamespace`, but with special
syntax?!
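
For comparison, a short sketch of what types.SimpleNamespace does today. The
names constants, PI and E are made up for illustration.

```python
from types import SimpleNamespace

# Attributes can be given at construction time or assigned afterwards.
constants = SimpleNamespace(PI=3.14159)
constants.E = 2.71828

# The attributes live in the namespace object's own __dict__.
stored = vars(constants)
```

Here the values are stored on the SimpleNamespace instance itself, so nothing
is bound in the enclosing module/class/locals dict under a dotted key.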

On Sat, May 1, 2021, 7:57 PM Matt del Valle  wrote:

> Hi all!
>
> So this is a proposal for a new soft language keyword:
>
> namespace
>
> I started writing this up a few hours ago and then realized as it was
> starting to get away from me that there was no way this was going to be
> even remotely readable in email format, so I created a repo for it instead
> and will just link to it here. The content of the post is the README.md,
> which github will render for you at the following link:
>
> https://github.com/matthewgdv/namespace
>
> I'll give the TLDR form here, but I would ask that before you reply please
> read the full thing first, since I don't think a few bullet-points give the
> necessary context. You might end up bringing up points I've already
> addressed. It will also be very hard to see the potential benefits without
> seeing some actual code examples.
>
> TLDR:
>
> - In a single sentence: this proposal aims to add syntactic sugar for
> setting and accessing module/class/local attributes with dots in their name
>
> - the syntax for the namespace keyword is similar to the simplest form of
> a class definition statement (one that implicitly inherits from object), so:
>
> namespace some_name:
>    ...  # code goes here
>
> - any name bound within the namespace block is bound in exactly the same
> way it would be bound if the namespace block were not there, except that
> the namespace's name and a dot are prepended to the key when being inserted
> into the module/class/locals dict.
>
> - a namespace block leaves behind an object that serves to process
> attribute lookups on it by prepending its name plus a dot to the lookup and
> then delegating it to whatever object it is in scope of
> (module/class/locals)
>
> - This would allow for small performance wins by replacing the use of
> class declarations that is currently common in python for namespacing, as
> well as making the writer's intent explicit
>
> - Crucially, it allows for namespacing the content of classes, by grouping
> together related methods. This improves code clarity and is useful for
> library authors who design their libraries with IDE autocompletion in mind.
> This cannot currently be done by nesting classes.
>
> I live in the UK so I'm going to bed now (after working on this for like
> the last 6 hours). I'll be alive again in maybe 8 hours or so and will be
> able to reply to any posts here then.
>
> Cheers everyone :)
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/SFXURMEILWU2BALQZ7RL7ESZLMHS2GFO/


[Python-ideas] Re: Improving sys.executable for embedded Python scenarios

2021-05-01 Thread Gregory P. Smith
On Sat, May 1, 2021 at 10:49 AM Gregory Szorc 
wrote:

> The way it works today, if you have an application embedding Python, your
> sys.argv[0] is (likely) your main executable and sys.executable is probably
> None or the empty string (per the stdlib docs which say not to set
> sys.executable if there isn't a path to a known `python` executable).
>
> Unfortunately, since sys.executable is a str, the executable it points to
> must behave as `python` does. This means that your application embedding
> and distributing its own Python must provide a `python` or `python`-like
> standalone executable and use it for sys.executable and this executable
> must be independent from your main application because the run-time
> behavior is different. (Yes, you can employ symlink hacks and your
> executable can sniff argv[0] and dispatch to your app or `python`
> accordingly. But symlinks aren't reliable on Windows and this still
> requires multiple files/executables.) **This limitation effectively
> prevents the existence of single file application binaries who also want to
> expose a full `python`-like environment, as there's no standard way to
> advertise a mechanism to invoke `python` that isn't a standalone executable
> with no arguments.**
>

Minor nit: I wouldn't use the words "must behave" above. We have been using
sys.executable = None at work for the past five years. The issues we run
into are predominantly in unit tests that try to launch an interpreter via
a subprocess of sys.executable, and the bulk of those are in CPython's own
test suite (which I vote "doesn't really count").

Regardless, not needing to tweak even those would be a convenience, and it
could open doors for more application frameworks that make such
environment assumptions and are thus hard to distribute stand-alone. For
example, it would open the door for multiprocessing spawn mode within
stand-alone embedded binaries.

> While applications embedding Python may not have an explicit `python`
> executable, they do likely have the capability to instantiate a
> `python`-like environment at run-time: they have the interpreter after all,
> they "just" need to provide a mechanism to invoke Py_RunMain() with an
> interpreter config initialized using the "python" profile.
>
> **I'd like to propose a long-term replacement to sys.executable that
> enables applications embedding Python to advertise a mechanism for invoking
> the same executable such that they get a `python` experience.**
>
> The easiest way to do this is to introduce a list[str] variant. Let's call
> it sys.python_interpreter. Here's how it would work.
>
> Say I've produced myapp.exe, a Windows application. If you run `myapp.exe
> python --`, the executable behaves like `python`. e.g. `myapp.exe python --
> -c 'print("hello, world")'` would be equivalent to `python -c
> 'print("hello, world")'`. The app would set `sys.python_interpreter =
> ["myapp.exe", "python", "--"]`. Then Python code wanting to invoke a Python
> interpreter would do something like
> `subprocess.run(sys.python_interpreter)` and automatically dispatch through
> the same executable.
>

Yep, that seems reasonable. Unfortunately the command line arguments are a
global namespace, but when building a standalone python executable app it is
doable to choose, at build time, a unique "launch me as a standalone python
interpreter" arg that will never conflict with the application's own
arguments. Nobody's application wants this specific unique per-build
---$(uuid) flag in argv[1], right? ;) ...
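
To make the dispatch concrete, here is a minimal sketch of the argv check
such an application might do at startup. It is written in Python only for
illustration; a real embedding application would presumably do this in its
native entry point before calling Py_RunMain(). The function name and the
marker value are assumptions, not part of the proposal.

```python
# Hypothetical marker; a per-build unique flag could be used instead.
PYTHON_MODE_MARKER = ["python", "--"]


def split_python_mode(argv, marker=PYTHON_MODE_MARKER):
    """Return the interpreter arguments if argv asks for python mode, else None."""
    n = len(marker)
    if argv[1:1 + n] == list(marker):
        # Everything after the marker is meant for the interpreter.
        return argv[1 + n:]
    return None


interp_args = split_python_mode(["myapp.exe", "python", "--", "-c", "print(1)"])
app_args = split_python_mode(["myapp.exe", "serve", "--port", "80"])
```

The first call yields the arguments to hand to the interpreter; the second
returns None, so the application would run normally.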

There's still an API challenge to decide on here: people using
sys.executable also expect to pass flags to the python interpreter.  Do we
make an API guarantee that the final flag in sys.python_interpreter is
always a terminator that separates python flags from application flags (--
or otherwise)?
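
A sketch of how calling code might consume the proposed attribute while
staying compatible with current interpreters. sys.python_interpreter does not
exist in CPython today, so the getattr() fallback is what actually runs; the
helper name interpreter_command is hypothetical.

```python
import subprocess
import sys


def interpreter_command():
    """Return the argv prefix that launches a Python interpreter, or None."""
    # sys.python_interpreter is the attribute proposed in this thread; on
    # current CPython getattr() returns None and we fall back.
    prefix = getattr(sys, "python_interpreter", None)
    if prefix:
        return list(prefix)
    if sys.executable:
        return [sys.executable]
    return None  # no python-like entry point is advertised


cmd = interpreter_command()
if cmd is not None:
    # Interpreter flags are appended after the advertised prefix, which is
    # why a guaranteed trailing terminator in the prefix would matter.
    result = subprocess.run(
        cmd + ["-c", "print('hello, world')"],
        capture_output=True,
        text=True,
        check=True,
    )
    output = result.stdout.strip()
```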

> For applications not wanting to expose a `python`-like capability, they
> would simply set sys.python_interpreter to None or [], just like they do
> with sys.executable today.
>

Yep.  Though that should be done at stand alone python application build
time to avoid any command line of the binary possibly launching as a plain
interpreter.  (this isn't security, anyone with access to read the stand
alone executable can figure out how to construct a raw interpreter usable
in their environment from that)


> In fact, I imagine Python's initialization would automatically set
> sys.python_interpreter to [sys.executable] by default and applications
> would have to opt in to a more advanced PyConfig field to make
> sys.python_interpreter different. This would make sys.python_interpreter
> behaviorally backwards compatible, so code bases could use
> sys.python_interpreter as a modern substitute for sys.executable, if
> available, without that much risk.
>

+1

-gps


>
> Some applications may want more advanced mechanisms than command line
> arguments to dispatch off of. For example, maybe you want to key off an
> environment variable to activate "Python mode."  This scenario is a bi

[Python-ideas] Re: String comprehension

2021-05-01 Thread Valentin Berlier
> But that was not the primary motivator for adding them to the language.

I don't think the original author thinks that way either about string 
comprehensions. I was asked about the kind of speed benefits that string 
comprehensions would have over using a generator with "".join() and I used 
f-strings as an example because the benefits would be similar.

By the way, now that I think about it, comprehensions would fit into f-string 
interpolation pretty nicely.

f"""
Guest list ({len(people)} people):
{person.name + '\n' for person in people}
"""
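
For reference, a sketch of how the same output can be produced with today's
syntax, building the joined part first. The people list and its name
attribute are made-up sample data.

```python
from types import SimpleNamespace

# Hypothetical sample data standing in for real guest objects.
people = [SimpleNamespace(name="Ada"), SimpleNamespace(name="Grace")]

# Join outside the f-string, one name per line.
names = "".join(f"{person.name}\n" for person in people)

guest_list = f"""
Guest list ({len(people)} people):
{names}
"""
```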

> Which massively reduces the possible kinds of comprehensions one might write, 
> and I suspect most of those are already covered by string methods.

I actually replied to David Mertz about this. String comprehensions can derive 
substrings from any iterable, just as the only requirement for using a 
generator expression in "".join() is that it produces strings. Comprehensions 
can also have nested loops, which can come in handy at times. And of course this 
doesn't mean I'm going to advocate for using them with complex predicates.
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/J4UNAY3QY7M34UU5OIMZJCZR5XV7O676/


[Python-ideas] Comprehensions within f-strings

2021-05-01 Thread Valentin Berlier
Recently there's been some discussion around string comprehensions, and I 
wanted to look at a specific variant of the proposal in a bit more detail.
Original thread: 
https://mail.python.org/archives/list/python-ideas@python.org/thread/MVQGP4GGTIWQRJTSY5S6SDYES6JVOOGK/

Let's say I have a matrix of numbers:

from random import randint

matrix = [[randint(7, 50) / randint(1, 3) for _ in range(4)] for _ in range(4)]

I want to format and display each row so that the columns are nicely lined up. 
Maybe also display the sum of the row at the end of each line:

for row in matrix:
    print(''.join(f'{n:>8.3f}' for n in row) + f' | {sum(row):>8.3f}')

This gives me a nicely formatted table. Now with the proposal:

for row in matrix:
    print(f'{n for n in row:>8.3f} | {sum(row):>8.3f}')

The idea is that you would be able to embed a comprehension in f-string 
interpolations, and that the format specifier at the end would be applied to 
all the generated items. This has a few advantages compared to the first 
version. It's a bit shorter, and I find it easier to see what shape the 
output will have. It would also be faster since the interpreter would be 
able to append the formatted numbers directly to the final string. The first 
version needs to create a temporary string object for each number and then feed 
it into the iterator protocol before joining them together.
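
The "one format specifier applied to every item" behaviour can be sketched
with a small helper in current Python. The fixed matrix and the helper name
format_row are assumptions made so that the result is reproducible.

```python
# Fixed sample data instead of random values, so the output is predictable.
matrix = [[7.0, 12.5, 3.0, 40.0], [1.0, 2.0, 3.0, 4.0]]


def format_row(row, spec=">8.3f"):
    """Apply one format spec to every item, then append the row sum."""
    cells = "".join(format(n, spec) for n in row)
    # A nested replacement field reuses the same spec for the sum.
    return f"{cells} | {sum(row):{spec}}"


lines = [format_row(row) for row in matrix]
```

This still creates a temporary string per number, so it says nothing about
the performance argument; it only shows the intended output shape.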
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/HSTXKN7OUUR34IXLVMXR65XVPNWPVEL5/


[Python-ideas] Re: TACE16 text encoding for Tamil language

2021-05-01 Thread Stephen J. Turnbull
Jonathan Goble writes:

 > I assume the "cpython" part of these paths here is your local clone of the
 > CPython GitHub repo? (Otherwise these local filepaths from your computer
 > don't make sense.)

Thanks for catching that!

Sorry, I was concentrating on stifling irrelevant Unicode
politics. :-)  You need a local clone of the GitHub repo, and the
various possibly relevant files are in

Doc/library/codecs.rst  (these two are available online)
Doc/c-api/codec.rst

Lib/codecs.py

Python/codecs.c
Include/codecs.h
Objects/stringlib/codecs.h
Objects/unicodectype.c
Modules/_codecsmodule.c
Modules/cjkcodecs
Modules/clinic/_codecsmodule.c.h

Steve
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/GXLGRWVKL4VOTVZDR57VZXRVVJB22RZ2/


[Python-ideas] Re: Adding str.remove()

2021-05-01 Thread Stephen J. Turnbull
David Mertz writes:

 > Lots of things "should be" :-). Sadly, I deal with "actually existing data."

What I would do to experience your kind of sadness!  I spend most of
my time working around (or doing theory instead of working on)
"actually nonexisting data". ;-)

Steve
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/AWW544SZ5WREMFN2DOAWLF43JFOG5TDM/