Not sure this conversation actually relates to any proposals at this point,
but ...

On Fri, Apr 8, 2022 at 9:54 AM Chris Angelico <ros...@gmail.com> wrote:

> You're misunderstanding the difference between "application" and
> "library" here.


No, I'm not -- you're misunderstanding my (poorly made, I guess) point.

> Those are four separate libraries, and each one has a
> single purpose: encoding/decoding stuff. It is not the application.


Of course it's not -- my point is that my application is using a bunch of
third party libraries, and a number of them are using JSON, and clearly
they are all using it in a somewhat different way, and the people
writing that library code absolutely don't want some global settings to
change how they work.
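To make that concrete, here is a toy sketch (the library names and helpers are made up) of two libraries that each pass their own json options explicitly -- a process-wide default for json would silently break at least one of them:

```python
import json

# Hypothetical library A: compact, canonical output for a wire protocol.
def lib_a_dump(obj):
    return json.dumps(obj, separators=(",", ":"), sort_keys=True)

# Hypothetical library B: human-readable output for config files.
def lib_b_dump(obj):
    return json.dumps(obj, indent=2)

data = {"b": 1, "a": 2}
print(lib_a_dump(data))  # -> {"a":2,"b":1}
print(lib_b_dump(data))  # pretty-printed, insertion order preserved
```

Each call site states its own requirements, so neither library can be broken by the other -- which is exactly what a global json configuration would undo.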

os.chdir()
> is global, we just accept that it belongs to the application, not a
> library.
>

Sure -- but I'd actually say that a "current working dir" is not actually
helpful -- libraries shouldn't use it, ever.

It can be handy for command line applications, but as you say, it's only
the application that should be working with it.
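As a sketch of why (the helper names here are mine): a library that builds paths off the cwd gets a different answer whenever the application calls os.chdir(), while one that takes an explicit base directory does not:

```python
import os
import tempfile
from pathlib import Path

def fragile_path(relpath):
    # BAD for a library: the answer depends on the process-wide cwd,
    # which the application may change at any moment.
    return Path.cwd() / relpath

def robust_path(base, relpath):
    # GOOD: the caller supplies the anchor directory explicitly.
    return Path(base) / relpath

start = Path.cwd()
with tempfile.TemporaryDirectory() as tmp:
    os.chdir(tmp)
    moved = fragile_path("data.txt")           # now points inside tmp
    stable = robust_path(start, "data.txt")    # unaffected by chdir
os.chdir(start)                                # restore the cwd
```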


> General rule: A library is allowed to change things that belong to the
> application if, and only if, it is at the behest of the application.
> That's a matter of etiquette rather than a hard-and-fast rule, but we
> decry badly-behaved libraries for violating it, rather than blaming
> the feature for being global.
>

Sure -- but I'm talking about the application changing global state that
then affects how libraries will work -- that can only be helpful if there's
a very well established standard protocol -- like current working
directory, and maybe logging.

> Granted
> > * Python is dynamic and has a global module namespace, so packages CAN
> monkey patch and make a mess of virtually anything.
> > * "Well behaved" packages would not mess with the global configuration.
>

Exactly.


> > But that doesn't mean that it wouldn't happen -- why encourage it? Why
> have a global registry and then tell people not to use it?
>
> For precisely the same reason that we have so many other global
> registries. It is simplest and cleanest to maintain consistency rather
> than try to have per-module boundaries.
>

I'm not necessarily saying that a global registry is always a bad idea, but
I think it's a bad idea for most things -- and it is for Decimal behavior,
and for units.


>  I've fielded multiple questions from people who do "import
> sys" in one module, and then try to use "sys.argv" in another module,
> not realising that the namespace into which the name 'sys' was
> imported belonged only to that module. It's not too hard to explain,
> but it's a thing that has to be learned.


But that is the very idea of non-global namespaces -- you aren't going to
get far in Python if you don't get that.

Only if it's expected to be configured with some granularity. And, as
> with decimal.localcontext(), it's perfectly possible to have scopes
> much smaller than modules. So my question to you, just as to D'Aprano,
> is: why should this be at the module scope, not global, and not
> narrower?
>

I do like the narrower option provided by decimal.localcontext().
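For example, the precision change below is scoped to the with-block, so no library code outside it is affected:

```python
from decimal import Decimal, localcontext

# Narrow scope: the precision change applies only inside the with-block.
with localcontext() as ctx:
    ctx.prec = 4
    inside = Decimal(1) / Decimal(7)   # rounded to 4 significant digits

# Back to the default context (28 significant digits).
outside = Decimal(1) / Decimal(7)
print(inside)    # -> 0.1429
print(outside)   # full default precision
```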

As for module scope, not global:

The principle here is that the person writing the code needs to control
the context that is used -- only that person, at that time, knows what's
appropriate -- the "application developer" may have no idea whatsoever how
Decimal is being used in third-party libraries. In fact, they may not even
know that it is being used.

You could say that library writers need to be careful not to use the global
context -- which I agree with -- but then it's a really bad idea to make
that the default, or even easy to do. And given the rarity of a large
application not using any third-party libs, I don't see the benefit of a
global context at all.

Contrast this with, e.g. logging -- in that case, a third-party lib
generally will want to simply log stuff to the application-defined logging
system; it does not need to know (or care) where a debug message is sent,
only that it is sent where the application configuration wants it to go.

> I'm not sure if this is really a good analogy, but it reminds me of the
> issues with system locale settings:
>


> The reason for having it centralized on the computer has always been
> that different applications could then agree on something.


Sure -- having a locale is a fine idea; the problem is when a programming
language uses that locale by default, or even worse, without even the
choice of overriding it. If an application wants to, for instance,
"display this information in the locale-appropriate way for the local
computer" -- that's a perfect use case.

But "read this text file, that could have come from anywhere, using the
compteres' locale settings" was always a bad idea.

Sure -- you may actually want to do that -- but it should be an
explicit choice, not the default.
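That explicit choice is a one-keyword fix with Python's built-in open() (the file name below is arbitrary):

```python
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "notes-example.txt")

# Explicit choice: name the encoding you expect, instead of inheriting
# whatever the local machine's locale happens to be.
with open(path, "w", encoding="utf-8") as f:
    f.write("naïve café")

with open(path, encoding="utf-8") as f:
    text = f.read()

# open(path) with no encoding= is the locale-dependent version: it may
# work on the developer's machine and break on someone else's.
```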

> text, and that it should assume that eight-bit data is most likely to
> be in Greek. Since text files don't have inherent metadata identifying
> their encodings, it's not unreasonable to let the system decide it.
>

Well, it wasn't unreasonable back in the day, but it is now -- the odds
that a text file comes from the local system are slim, and even worse,
it's very unlikely that the code is being written and tested on a system
with the same settings.


> > Datetime handling has the same issues -- I think the C libs STILL use
> the system timezone settings. And an early version of the numpy datetime
> implementation did too -- really bad idea.
> >
> > In short: The context in which code is run should be in complete control
> of the person writing the code, not the person writing the "application".
>
> Not sure what you mean there. Obviously any date/time with inherent
> timezone data should simply use that, but if a library is parsing
> something like "2022-04-09 02:46:17", should every single library have
> a way for you to tell it what timezone that is, or should it just use
> the system settings?


That ISO string has no TZ offset -- it is a naive datetime, and it
absolutely, positively should be interpreted as such. That is EXACTLY what
was wrong with the first numpy datetime implementation -- it interpreted an
ISO string without an offset as "local time" (which I believe the ISO spec
says) and so applied the local timezone -- which was almost always the
wrong thing to do, period. We all had to go through machinations to work
around that.
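For the record, Python's own datetime does the right thing here -- parsing attaches no timezone, and localizing is a separate, explicit step chosen by the caller:

```python
from datetime import datetime, timezone

s = "2022-04-09 02:46:17"
dt = datetime.fromisoformat(s)
print(dt.tzinfo)    # -> None: no offset in the string, so it's naive

# Attaching a zone is a separate, explicit decision, e.g.:
as_utc = dt.replace(tzinfo=timezone.utc)
```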


> I put it to you that this is something that
> belongs to the application, unless there's a VERY VERY VERY good
> reason for the library to override that.


I think there's some confusion here: I'm not saying that libraries should
override system settings -- I'm saying libraries should only use these
types of system settings when explicitly asked to -- not by default -- and,
worst of all, libraries should not use system settings that can't be
overridden by the application (which is what the old C time functions did
(still do?)).

Again, the behavior of some code should be clear and obvious to the person
writing the code. If I write code such as:

np.datetime64('2022-04-10T09:33:00')

I should know exactly what I'm going to get (which I do now -- numpy fixed
this a good while ago -- but in its first implementation, it would
literally produce a different result depending on the local computer's
settings).

That doesn't mean that:

np.datetime64('2022-04-10T09:33:00', apply_locale_tz_offset=True)

isn't useful; it's just that it shouldn't be the default, or even worse, a
non-overridable default -- e.g. a "global context".

> And if you mean the interpretation of timezones themselves...


of course not, no. That belongs in its own library, which libraries that
need it can then choose (or not) to use.


>  One single global tzdata is absolutely fine, thank you very
> much.


Of course it is -- I'm not saying that nothing global is useful; I'm saying
that sets of global defaults and the like can be very useful, but they
should always be explicitly specified. If I'm writing a library, I may
choose to depend on pytz. But when I write the code, I'm making that
choice -- I'm not writing code simply hoping that the application using my
code has made the right choice of how to deal with timezones.

> Again: practical use case with units:
>


> Cool. The global repository that I suggest would be completely
> independent, unless you choose to synchronize them. The registry that
> you have would be used by your tools, and source code would use the
> interpreter-wide ones. This is not a conflict. Of course, since you
> have all the functionality already, it would make a lot of sense to
> offer an easy way to register all of your library's units with the
> system repository, thus making them all available; but that would be
> completely optional to both you and your users.
>

But if I did that, then one lib registering my units with the global
repository might break some other lib's use of the global repository.

A global "default units" is fine, but then anyone wanting to override it
should be working with a copy, not changing the same one that other
packages might be using.

Which I believe is exactly what pint does, for instance.
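A toy sketch of that copy-don't-share pattern (the names and values below are made up; pint's actual UnitRegistry is much richer, but each instance is similarly independent):

```python
import copy

# Shared, read-only defaults: conversion factors to meters.
DEFAULT_UNITS = {"m": 1.0, "km": 1000.0, "mi": 1609.344}

def make_registry(**overrides):
    # Each caller gets its own private copy; the shared defaults
    # are never mutated, so other packages are unaffected.
    reg = copy.deepcopy(DEFAULT_UNITS)
    reg.update(overrides)
    return reg

mine = make_registry(smoot=1.7018)   # my units, my copy
```

Registering "smoot" in `mine` cannot break another lib that built its own registry from the same defaults -- which is the whole point.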

-CHB


-- 
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/5MDATAMKRUN6MMC5PDQESUEN7FR6D4I4/
Code of Conduct: http://python.org/psf/codeofconduct/
