Custom importer and errors

2024-04-15 Thread Fabiano Sidler via Python-list

Hi folks!

I'd like to split my package tree into several IDE projects and build a custom
importer to import
    'top.child1.child2'
from the directory
    /top.child1.child2/__init__.py
so basically replacing the dots with slashes and having the package content
lying directly in the project folder. I have come up with this:

=== usercustomize.py ===
import sys
from importlib.machinery import ModuleSpec
from pathlib import Path

Loader = type(__spec__.loader)

class IdeHelper:
    @classmethod
    def find_spec(cls, name, path, target=None):
        for dirname in sys.path:
            dirobj = Path(dirname)
            if dirobj.name == name:
                break
        else:
            return None
        origin = str(dirobj.joinpath('__init__.py').absolute())
        ret = ModuleSpec(name, Loader(name, origin), origin=origin)
        return ret

sys.meta_path.append(IdeHelper)

I think I'm heading in the right direction with this. Unfortunately, I'm
getting errors while importing a subpackage. With 'import top.child1' the
error is
    ModuleNotFoundError: No module named 'top.child1'; 'top' is not a package

whereas with 'from top import child1' the error changes to
    ImportError: cannot import name 'child1' from 'top' (unknown location)

How can I make this work?

Best wishes,
Fabiano
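
One possible direction, offered as a sketch and not as a confirmed fix: both error messages are consistent with the spec for 'top' not being marked as a package. A ModuleSpec built without is_package=True or submodule_search_locations yields a module with no __path__, and a spec without a file location yields a module with no __file__. The variant below uses importlib.util.spec_from_file_location with submodule_search_locations so that both are set. The is_file() check is an addition made for illustration and is not part of the original post.

```python
import sys
from importlib.util import spec_from_file_location
from pathlib import Path


class IdeHelper:
    """Meta path finder: map module 'a.b.c' to a sys.path entry named 'a.b.c'."""

    @classmethod
    def find_spec(cls, name, path=None, target=None):
        for dirname in sys.path:
            dirobj = Path(dirname)
            init = dirobj / "__init__.py"
            # Assumption: only accept folders that really contain __init__.py.
            if dirobj.name == name and init.is_file():
                break
        else:
            return None
        # submodule_search_locations marks the module as a package, so it
        # gets a __path__ and 'import top.child1' can proceed past 'top'.
        return spec_from_file_location(
            name,
            str(init.absolute()),
            submodule_search_locations=[str(dirobj.absolute())],
        )


sys.meta_path.append(IdeHelper)
```

With this finder installed, every project folder (for example one named 'top' and one named 'top.child1') still has to be on sys.path itself, as in the original approach.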

--
https://mail.python.org/mailman/listinfo/python-list


Errors

2023-11-02 Thread Zigocut Technologies via Python-list
Hello,
I would like to develop mobile applications using the Kivy Python framework,
but I am having difficulty. After installing Python 3.9, this is the first
error I get:

    C:\WINDOWS\system32>python3 --version
    Python was not found; run without arguments to install from the
    Microsoft Store, or disable this shortcut from
    Settings > Manage App Execution Aliases.

I also get a second error:

    WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None,
    status=None)) after connection broken by
    'NewConnectionError(': Failed to establish a new connection: [Errno 11001]
    getaddrinfo failed')': /simple/kivy-deps-gstreamer-dev/

How can I resolve these errors? I am running Windows 10.
Best Regards
*Owner Zigocut Technologies *
*+256774306868/701067528*
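
A "getaddrinfo failed" error generally means that a host name could not be resolved, which points at DNS, proxy, or connectivity problems and not at Kivy itself. As a hedged diagnostic sketch, the helper below (can_resolve is a hypothetical name chosen here) checks whether a host name resolves from the machine running Python. It only tests name resolution, not whether pip can actually download anything.

```python
import socket


def can_resolve(host, port=443):
    """Return True if the host name resolves, False on a lookup failure."""
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror:
        # This is the exception family behind "getaddrinfo failed".
        return False
    return True


# Example use (needs a network connection, so it is left commented out):
# print(can_resolve("pypi.org"))
```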


Re: String to Float, without introducing errors

2022-12-19 Thread Ángel GR

On 2022-12-19 16:14, MRAB wrote:
To be fair, I don't think I've ever seen that notation either! I've 
only ever seen the form 6.67430E-11 ± 0.00015E-11, which is much clearer.


We use it regularly in our experimental data: 6.3(4), 15.002(10). Things 
would become complex using exponential forms for errors, especially when 
starting to play with exponents: 6.67430E-11 ± 1.5E-15.

--
Ángel GR



Re: String to Float, without introducing errors

2022-12-19 Thread Michael F. Stemper

On 19/12/2022 09.14, MRAB wrote:

On 2022-12-19 14:10, Peter J. Holzer wrote:

On 2022-12-19 09:25:17 +1100, Chris Angelico wrote:

On Mon, 19 Dec 2022 at 07:57, Stefan Ram  wrote:
> G = Decimal( 6.6743015E-11 )
> r = Decimal( 6.371E6 )
> M = Decimal( 5.9722E24 )

What's the point of using Decimal if you start with nothing more than
float accuracy?


Right. He also interpreted the notation "6.67430(15)E-11" wrong. The
digits in parentheses represent the uncertainty in the same number of
last digits. So "6.67430(15)E-11" means "something between 6.67430E-11 -
0.00015E-11 and 6.67430E-11 + 0.00015E-11". The r value has only a
precision of 1 km and I'm not sure how accurate the mass is. Let's just
assume (for the sake of the argument) that these are actually accurate in
all given digits.


[...] showing IMHO a fundamental misunderstanding of the numbers they are working with.



To be fair, I don't think I've ever seen that notation either! I've only ever 
seen the form 6.67430E-11 ± 0.00015E-11, which is much clearer.


See, for instance:

In particular, the "concise form".

For more detail, see:


--
Michael F. Stemper
Isaiah 58:6-7


Re: String to Float, without introducing errors

2022-12-19 Thread Sabrina Almodóvar
On 17/12/2022 18:55, Stefan Ram wrote:
> Grant Edwards  writes:
>> Yes, fixed point (or decimal) is a better fit for what he's doing. but
>> I suspect that floating point would be a better fit for the problem
>> he's trying to solve.
> 
>   I'd like to predict that within the next ten posts in this
>   thread someone will mention "What Every Computer Scientist
>   Should Know About Floating-Point Arithmetic".

Lol!  Speaking of which, let us guide ourselves by

  https://floating-point-gui.de/basic/

and learn that calculations such as ``0.1 + 0.4 work correctly.''
That's the Internet.  You learn a thing on a Tuesday, and on a Wednesday
you're a teacher.  That's why we need peer review.


Re: String to Float, without introducing errors

2022-12-19 Thread Greg Ewing

On 19/12/22 9:24 am, Stefan Ram wrote:

 So what's the time until a mass of one gram
 arrives at the ground versus a mass of ten grams? I think
 one needs "Decimal" to calculate this!


Or you can be smarter about how you calculate it.

Differentiating t with respect to m gives

dt/dm = -0.5 * sqrt(2 * s * r**2 / (G * (M + m)**3))

which, since m is much smaller than M, is approximately

   -0.5 * sqrt(2 * s * r**2 / (G * M**3))

So

>>> G = 6.6743015E-11
>>> r = 6.371E6
>>> M = 5.9722E24
>>> dtdm = -0.5 * sqrt(2*s*(r**2) / (G * M**3))
>>> dtdm * (1/1000 - 10/1000)
3.4004053539917275e-28

which agrees with your Decimal calculation to 3 digits,
and should be as precise as the input numbers (about
4 digits in this case).

This is a good example of why it's important to choose
an appropriate numerical algorithm!
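
The session above uses sqrt and s without defining them in this message. The sketch below makes it self-contained under one loud assumption: s = 1.0 (a drop height of one metre), a value chosen here because it appears to reproduce the posted figure. It also shows why the direct approach fails in float arithmetic: M + m rounds back to M, so the two fall times come out identical.

```python
from math import sqrt

G = 6.6743015e-11  # gravitational constant, as posted
r = 6.371e6        # Earth radius in metres, as posted
M = 5.9722e24      # Earth mass in kg, as posted
s = 1.0            # ASSUMPTION: drop height in metres, not stated in this message


def fall_time(m):
    """Fall time for a test mass m, written out directly."""
    return sqrt(2 * s * r**2 / (G * (M + m)))


# Naive float subtraction: M + m == M for gram-sized m, so the
# difference between the two fall times vanishes completely.
naive = fall_time(0.001) - fall_time(0.010)

# Derivative-based estimate from the message above.
dtdm = -0.5 * sqrt(2 * s * r**2 / (G * M**3))
estimate = dtdm * (0.001 - 0.010)

print(naive)
print(estimate)
```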

--
Greg


Re: String to Float, without introducing errors

2022-12-19 Thread Peter J. Holzer
On 2022-12-19 15:14:14 +, MRAB wrote:
> On 2022-12-19 14:10, Peter J. Holzer wrote:
> > He also interpreted the notation "6.67430(15)E-11" wrong. The
> > digits in parentheses represent the uncertainty in the same number of
> > last digits. So "6.67430(15)E-11" means "something between 6.67430E-11 -
> > 0.00015E-11 and 6.67430E-11 + 0.00015E-11".
[...]
> To be fair, I don't think I've ever seen that notation either!

I probably saw it first on Wikipedia, quite a few years ago. Since
then I've also encountered it in physical and astronomical papers (I'm
neither a physicist nor an astronomer but I occasionally read the
original papers if what I read in the "mainstream media"[1] or hear on
youtube seems suspect).

> I've only ever seen the form 6.67430E-11 ± 0.00015E-11, which is much
> clearer.

Yeah, it's definitely not the pinnacle of intuitiveness. I freely admit
that I looked it up before posting just to make sure that I wasn't
confused about its meaning.

Another problem (but it shares that with the ± notation) is that it's
not clear what that number actually represents. Is it one sigma or two?
Or something else? Is the distribution even symmetric?

hp

[1] I'm not quite happy with that term.

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: String to Float, without introducing errors

2022-12-19 Thread MRAB

On 2022-12-19 14:10, Peter J. Holzer wrote:

On 2022-12-19 09:25:17 +1100, Chris Angelico wrote:

On Mon, 19 Dec 2022 at 07:57, Stefan Ram  wrote:
> G = Decimal( 6.6743015E-11 )
> r = Decimal( 6.371E6 )
> M = Decimal( 5.9722E24 )

What's the point of using Decimal if you start with nothing more than
float accuracy?


Right. He also interpreted the notation "6.67430(15)E-11" wrong. The
digits in parentheses represent the uncertainty in the same number of
last digits. So "6.67430(15)E-11" means "something between 6.67430E-11 -
0.00015E-11 and 6.67430E-11 + 0.00015E-11". The r value has only a
precision of 1 km and I'm not sure how accurate the mass is. Let's just
assume (for the sake of the argument) that these are actually accurate in
all given digits.

So G is between 6.67415E-11 and 6.67445E-11, r is between 6.3705E6 and
6.3715E6 and M is between 5.97215E24 and 5.97225E24. If we compute the
time for those deviations you will find that the differences are many
orders of magnitude greater than the effect you wanted to show. And that
still ignores the fact that a vacuum won't be perfect (and collisions
with a few stray atoms might have a similarly tiny effect), that gravity
isn't constant while the weight falls (it's getting closer to the center
of the earth and it's moving past other masses on its way) that Newton's
law is only an approximation, etc. So while the effect is (almost
certainly) real, the numbers are garbage.

I think there's a basic numeracy problem here. This is unfortunately all
too common, even among scientists. The OP apparently rounded their
numbers to 8 significant digits (thereby introducing an error of about
1E-8) and then insisted that the additional error of 1E-15 introduced by
the decimal to float conversion was unacceptable, showing IMHO a
fundamental misunderstanding of the numbers they are working with.

To be fair, I don't think I've ever seen that notation either! I've 
only ever seen the form 6.67430E-11 ± 0.00015E-11, which is much clearer.



Re: String to Float, without introducing errors

2022-12-19 Thread Thomas Passin

On 12/19/2022 9:10 AM, Peter J. Holzer wrote:

On 2022-12-19 09:25:17 +1100, Chris Angelico wrote:

On Mon, 19 Dec 2022 at 07:57, Stefan Ram  wrote:

G = Decimal( 6.6743015E-11 )
r = Decimal( 6.371E6 )
M = Decimal( 5.9722E24 )


What's the point of using Decimal if you start with nothing more than
float accuracy?


Right. He also interpreted the notation "6.67430(15)E-11" wrong. The
digits in parentheses represent the uncertainty in the same number of
last digits. So "6.67430(15)E-11" means "something between 6.67430E-11 -
0.00015E-11 and 6.67430E-11 + 0.00015E-11". The r value has only a
precision of 1 km and I'm not sure how accurate the mass is. Let's just
assume (for the sake of the argument) that these are actually accurate in
all given digits.

So G is between 6.67415E-11 and 6.67445E-11, r is between 6.3705E6 and
6.3715E6 and M is between 5.97215E24 and 5.97225E24. If we compute the
time for those deviations you will find that the differences are many
orders of magnitude greater than the effect you wanted to show. And that
still ignores the fact that a vacuum won't be perfect (and collisions
with a few stray atoms might have a similarly tiny effect), that gravity
isn't constant while the weight falls (it's getting closer to the center
of the earth and it's moving past other masses on its way) that Newton's
law is only an approximation, etc. So while the effect is (almost
certainly) real, the numbers are garbage.

I think there's a basic numeracy problem here. This is unfortunately all
too common, even among scientists. The OP apparently rounded their
numbers to 8 significant digits (thereby introducing an error of about
1E-8) and then insisted that the additional error of 1E-15 introduced by
the decimal to float conversion was unacceptable, showing IMHO a
fundamental misunderstanding of the numbers they are working with.

 hp


In a way, this example shows both things - the potential value of using 
Decimal numbers, and a degree of innumeracy.  It also misses a chance to 
illustrate how to approach a problem in the simplest and most 
informative way.  Here's what I mean -


We can imagine that the input numbers really are exact, with the 
remaining digits filled in with zeros.  Then it might really be the case 
that - if you wanted to do this computation with precision and could 
assume all those other effects could be neglected - using Decimals would 
be a good thing to do. So OK, let's say that's demonstrated.  No need to 
nit-pick it further.


As a physics problem, though, you would generally be interested in two 
kinds of things:


1. Could there be such an effect, and if so would it be large enough to 
be interesting, whether in theory or in practice?
2. Can there be any feasible way to demonstrate the proposed effect by 
measurements?


The first thing one should do is to find a way to estimate the magnitude 
of the effect so it can be compared with some of those other phenomena 
(non-constant gravity, etc) to see if it's worth doing a full 
computation at all, or even spending any more time on the matter.  There 
is likely to be a way to make such an estimate without needing to resort 
to extremely high precision - you would only need to get within perhaps 
an order of magnitude.  Your real task, then, is to find that way.


For example, you would probably be able to estimate the precision needed 
without actually doing the calculation.  That in itself might turn out 
to be enough.






Re: String to Float, without introducing errors

2022-12-19 Thread Peter J. Holzer
On 2022-12-19 09:25:17 +1100, Chris Angelico wrote:
> On Mon, 19 Dec 2022 at 07:57, Stefan Ram  wrote:
> > G = Decimal( 6.6743015E-11 )
> > r = Decimal( 6.371E6 )
> > M = Decimal( 5.9722E24 )
> 
> What's the point of using Decimal if you start with nothing more than
> float accuracy?

Right. He also interpreted the notation "6.67430(15)E-11" wrong. The
digits in parentheses represent the uncertainty in the same number of
last digits. So "6.67430(15)E-11" means "something between 6.67430E-11 -
0.00015E-11 and 6.67430E-11 + 0.00015E-11". The r value has only a
precision of 1 km and I'm not sure how accurate the mass is. Let's just
assume (for the sake of the argument) that these are actually accurate in
all given digits.
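
As an illustration of the rule just described, here is a minimal sketch of a parser for the concise notation. The function name parse_concise and the accepted grammar (digits, optional fraction, parenthesised uncertainty digits, optional exponent) are assumptions made for this example; real-world data may use variants it does not handle.

```python
import re
from decimal import Decimal

# value digits, optional fraction, "(uncertainty digits)", optional exponent
_CONCISE = re.compile(r"^([+-]?\d+)(?:\.(\d+))?\((\d+)\)(?:[eE]([+-]?\d+))?$")


def parse_concise(text):
    """Split e.g. '6.67430(15)E-11' into (value, uncertainty) as Decimals."""
    match = _CONCISE.match(text.strip())
    if match is None:
        raise ValueError(f"not in concise notation: {text!r}")
    int_part, frac, unc, exp = match.groups()
    frac = frac or ""
    exp = int(exp or 0)
    value = Decimal(f"{int_part}.{frac}" if frac else int_part).scaleb(exp)
    # The uncertainty digits line up with the last digits of the value.
    uncertainty = Decimal(unc).scaleb(exp - len(frac))
    return value, uncertainty


value, uncertainty = parse_concise("6.67430(15)E-11")
print(value - uncertainty, value + uncertainty)
```

Decimal is used here only so the digits are kept exactly as written; nothing in the sketch depends on it otherwise.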

So G is between 6.67415E-11 and 6.67445E-11, r is between 6.3705E6 and
6.3715E6 and M is between 5.97215E24 and 5.97225E24. If we compute the
time for those deviations you will find that the differences are many
orders of magnitude greater than the effect you wanted to show. And that
still ignores the fact that a vacuum won't be perfect (and collisions
with a few stray atoms might have a similarly tiny effect), that gravity
isn't constant while the weight falls (it's getting closer to the center
of the earth and it's moving past other masses on its way) that Newton's
law is only an approximation, etc. So while the effect is (almost
certainly) real, the numbers are garbage.
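
The comparison described above can be sketched numerically. The bounds are the ones given in the paragraph; the drop height s = 1.0 is an assumption (it is not stated in this message), and the figure 3.4e-28 for the size of the effect is the value posted elsewhere in this thread.

```python
from math import sqrt

s = 1.0  # ASSUMPTION: drop height in metres


def fall_time(G, r, M):
    """Fall time with the test mass neglected."""
    return sqrt(2 * s * r**2 / (G * M))


# Shortest time: strongest gravity (large G and M, small r); longest: the opposite.
t_min = fall_time(G=6.67445e-11, r=6.3705e6, M=5.97225e24)
t_max = fall_time(G=6.67415e-11, r=6.3715e6, M=5.97215e24)

spread = t_max - t_min
effect = 3.4e-28  # 1 g versus 10 g difference, as posted in the thread

print(spread)
print(spread / effect)
```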

I think there's a basic numeracy problem here. This is unfortunately all
too common, even among scientists. The OP apparently rounded their
numbers to 8 significant digits (thereby introducing an error of about
1E-8) and then insisted that the additional error of 1E-15 introduced by
the decimal to float conversion was unacceptable, showing IMHO a
fundamental misunderstanding of the numbers they are working with.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: String to Float, without introducing errors

2022-12-18 Thread Cameron Simpson

On 19Dec2022 08:53, Cameron Simpson  wrote:
I'm no expert on floating point coding for precision, but I believe 
that trying to work with values "close together" in magnitude is 
important because values of different scales inherently convert one of 
them to the other scale (i.e. similar sized exponent part) with 
corresponding loss of precision in the mantissa part. That may require 
you to form your calculations carefully.


This depends on the operation. With addition/subtraction you'd see this 
directly. With multiplication etc it isn't really the case. But there 
are both more and less effective ways to arrange your floating point 
math in terms of preserving what precision you have.
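
A small sketch of that distinction, using values picked purely for illustration: adding a small number to a very large one can lose the small one entirely, while multiplying by the large number does not have that problem. math.fsum is one standard-library way to add without the intermediate loss.

```python
import math

big, small = 1e16, 1.0

# Addition: 'small' is below the spacing of floats near 'big' and is lost.
lost = (big + small) - big

# math.fsum tracks the intermediate rounding errors and keeps it.
kept = math.fsum([big, small, -big])

# Multiplication: scaling up and back down returns the original value here.
roundtrip = (big * 3.0) / big

print(lost, kept, roundtrip)
```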


Cheers,
Cameron Simpson 


Re: String to Float, without introducing errors

2022-12-18 Thread Greg Ewing

On 19/12/22 6:35 am, Paul St George wrote:

So I am working on a physics paper with a colleague. We have a theory about 
Newtons Cradle.

We want to illustrate the paper with animations.

Because there is a problem, I am investigating in all areas. ... I would like 
to be in control of or fully aware of what goes on under the bonnet.


When you convert a string to a float, you're already getting the closest
possible value in binary floating point.
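
This can be checked directly. The sketch below takes one of the values from the original question, converts it with float(), and uses Decimal to look at the exact binary value that was stored. math.ulp needs Python 3.9 or later; the precision of 60 digits is an arbitrary choice that is large enough for this example.

```python
import math
from decimal import Decimal, localcontext

text = "-64550.727"
x = float(text)

with localcontext() as ctx:
    ctx.prec = 60  # enough digits to subtract without rounding
    exact = Decimal(x)                     # the binary value actually stored
    error = abs(exact - Decimal(text))     # distance from the written digits
    half_ulp = Decimal(math.ulp(x)) / 2    # half the gap between neighbouring floats

print(exact)
print(error <= half_ulp)   # closest possible float
print(repr(x) == text)     # and it prints back unchanged
```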

For things like physics simulations, you need to design your algorithms
so that they're tolerant of small inaccuracies in the representation of
your numbers. If those are causing you problems, it sounds like there
is some kind of numerical instability in your algorithm that needs to
be addressed.

It's also possible that there is just a bug somewhere in your code, and
the problem really has nothing to do with floating point inaccuracies.

If you can post some code we might be able to help you further.

--
Greg


Re: String to Float, without introducing errors

2022-12-18 Thread Chris Angelico
On Mon, 19 Dec 2022 at 07:57, Stefan Ram  wrote:
> G = Decimal( 6.6743015E-11 )
> r = Decimal( 6.371E6 )
> M = Decimal( 5.9722E24 )

What's the point of using Decimal if you start with nothing more than
float accuracy?

ChrisA
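
To make the point concrete, here is a short sketch comparing the two ways of building the Decimal. Passing a float literal converts the already-rounded binary value; passing a string keeps the digits as written.

```python
from decimal import Decimal

from_float = Decimal(6.6743015e-11)     # exact value of the nearest double
from_text = Decimal("6.6743015E-11")    # exactly the digits that were typed

print(from_float)
print(from_text)
print(from_float == from_text)
```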


Re: String to Float, without introducing errors

2022-12-18 Thread Cameron Simpson

On 18Dec2022 18:35, Paul St George  wrote:

So I am working on a physics paper with a colleague. We have a theory about 
Newtons Cradle. We answer the question why when you lift and drop balls 1 and 
2, balls 4 and 5 rise up. I could say more, but ... (if you are interested 
please write to me).

We want to illustrate the paper with animations. The theory includes distortion 
of the balls and this distortion is very very small. So, I am sent data with 
locations and dimensions to 13 decimal places. Something strange is happening 
with the animations: the balls are not moving smoothly. I do not know (yet) 
where the problem lies so it is difficult to provide a clear narrative.

Because there is a problem, I am investigating in all areas. This brings me to 
the question I asked here. I am not expecting six decimal places or three 
decimal places to be as accurate as thirteen decimal places, but I would like 
to be in control of or fully aware of what goes on under the bonnet.


First the short take: your machine probably is quite precise, and float 
is far more performant than the other numeric types available. Your 
source data seem to have more round-off than the rounding in a float.


Under the bonnet:

A Python float is effectively a base-2 value in scientific notation.  
Internally it has a base-2 mantissa and base-2 exponent. This page:

https://docs.python.org/3/library/stdtypes.html#typesnumeric
says that CPython's float uses C's "double" floating point type
(you are almost certainly using the CPython implementation) and thus 
you're using the machine's floating point implementation.


I believe that almost all modern CPUs implement IEEE 754 floating point:
https://en.wikipedia.org/wiki/IEEE_754

Because they're base 2, various values in other bases will not be 
precisely representable as a float. For example, 1/3 (which you will 
know is _also_ not representable precisely as a base-10 value such as 
0.333).


You can get specifics of your Python's floating point from 
`sys.float_info`, i.e.:


from sys import float_info

Then look at float_info.epsilon etc. Details:
https://docs.python.org/3/library/sys.html#sys.float_info

Here's my machine:

Python 3.10.6 (main, Aug 11 2022, 13:47:18) [Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> from sys import float_info
>>> float_info
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)


Values of note: mant_dig=53 (53 base-2 bits), dig=15 (15 decimal digits 
of precision).


You might want to look at sys.float_repr_style here:
https://docs.python.org/3/library/sys.html#sys.float_repr_style
which affects how Python writes floats out. In particular this text:

 If the string has value 'short' then for a finite float x, repr(x) 
 aims to produce a short string with the property that 
 float(repr(x)) == x. This is the usual behaviour in Python 3.1 and 
 later.


Again, on my machine:

>>> 64550.727
64550.727
>>> 64550.728
64550.728
>>> 64550.72701
64550.72701
>>> 64550.7270101
64550.7270101
>>> 64550.727010101
64550.727010101
>>> 64550.72701010101
64550.72701010101
>>> 64550.7270101010101
64550.72701010101
>>> 64550.727010101010101
64550.72701010101
>>> 64550.72701010101010101
64550.72701010101
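
The same observation as a script, offered as a sketch: once the literal carries more digits than a float can hold, the extra digits no longer select a different float, and repr() falls back to the shortest string that still round-trips.

```python
a = float("64550.72701010101")
b = float("64550.7270101010101")
c = float("64550.72701010101010101")

print(a == b == c)          # the longer literals map to the same float
print(repr(c))              # shortest string that identifies that float
print(float(repr(c)) == c)  # and it round-trips
```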


On 17 Dec 2022, at 16:54:05 EST 2022, Thomas Passin wrote:

On 12/17/2022 3:45 PM, Paul St George wrote:

Thanks to all!
It was the rounding error that I needed to avoid (as Peter J. Holzer 
suggested). The use of decimal solved it and just in time. I was about to 
truncate the number, get each of the characters from the string mantissa, and 
then do something like this:

64550.727

64550 + (7 * 0.1) + (2 * 0.01) + (7 * 0.001)

Now I do not need to!


Good, because if you do that using floats it will be less precise than 
float(64550.727). (Which I see Alan has already stated.)


Your source file contains strings like "64550.727". They look to already 
be less than 13 digits of precision as written i.e. some round off 
already took place when that file was written. Do you know the precision 
of the source data?


I suspect that rather than chasing a "perfect" representation of your 
source data, which is already rounded off, you:

- see if the source values can be obtained more precisely
- figure out which operations in your simulation contribute to the 
  motion roughness you see


I'm no expert on floating point coding for precision, but I believe that 
trying to work with values "close together" in magnitude is important 
because values of different scales inherently convert one of them to the 
other scale (i.e. similar sized exponent part) with corresponding loss 
of precision in the mantissa part. T

Re: String to Float, without introducing errors

2022-12-18 Thread dn

On 18/12/2022 10.55, Stefan Ram wrote:

Grant Edwards  writes:

Yes, fixed point (or decimal) is a better fit for what he's doing. but
I suspect that floating point would be a better fit for the problem
he's trying to solve.


   I'd like to predict that within the next ten posts in this
   thread someone will mention "What Every Computer Scientist
   Should Know About Floating-Point Arithmetic".



Thank you for doing-so.

More specific: https://docs.python.org/3/tutorial/floatingpoint.html


Perhaps in the rush to 'answer', the joke is on 'us' - that we might 
first need to more carefully-understand the OP's use-case, requirements, 
and constraints?



The joke sours when remembering that this 'mystery' generates frequent 
questions 'here' (and on other Python fora) - not as many as 'why don't 
I see a pretty-GUI when I fire-up Python on MS-Windows?' but it is a 
more sophisticated realisation and deserves a detailed response (such as 
the thought-provoking illustration-code appearing today).



Is it a consequence of Python lowering 'the barrier to entry'? Good 
thing? Bad thing?


(we first started noticing this sort of issue in our (non-Python) MOOCs, 
several pre-COVID years ago: when we first started, the trainee was 
typically recent post-grad; whereas today we enrol folk with a much 
wider range of ages and backgrounds. It is likely that more dev.work has 
gone into 'the bottom end', than into new/higher sophistications - even 
given IT's rate-of-change!)


--
--
Regards,
=dn


String to Float, without introducing errors

2022-12-18 Thread Paul St George
So I am working on a physics paper with a colleague. We have a theory about 
Newtons Cradle. We answer the question why when you lift and drop balls 1 and 
2, balls 4 and 5 rise up. I could say more, but ... (if you are interested 
please write to me).

We want to illustrate the paper with animations. The theory includes distortion 
of the balls and this distortion is very very small. So, I am sent data with 
locations and dimensions to 13 decimal places. Something strange is happening 
with the animations: the balls are not moving smoothly. I do not know (yet) 
where the problem lies so it is difficult to provide a clear narrative.

Because there is a problem, I am investigating in all areas. This brings me to 
the question I asked here. I am not expecting six decimal places or three 
decimal places to be as accurate as thirteen decimal places, but I would like 
to be in control of or fully aware of what goes on under the bonnet.







 
>> On 17 Dec 2022, at 16:54:05 EST 2022, Thomas Passin wrote:
On 12/17/2022 3:45 PM, Paul St George wrote:
> Thanks to all!
> It was the rounding error that I needed to avoid (as Peter J. Holzer 
> suggested). The use of decimal solved it and just in time. I was about to 
> truncate the number, get each of the characters from the string mantissa, and 
> then do something like this:
> 
> 64550.727
> 
> 64550 + (7 * 0.1) + (2 * 0.01) + (7 * 0.001)
> 
> Now I do not need to!

And that approach would not have helped you, because each of those 
calculations would be done as floating point, and you wouldn't have 
gotten any more precision (and maybe less) than simply doing 
float('64550.727').
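
A sketch of that comparison, using Decimal only as a measuring stick: the direct conversion is the closest float to the written digits, so the digit-by-digit sum cannot be closer and may be further away. The precision of 60 digits is an arbitrary choice that is large enough for this example.

```python
from decimal import Decimal, localcontext

text = "64550.727"
direct = float(text)
pieces = 64550 + (7 * 0.1) + (2 * 0.01) + (7 * 0.001)

with localcontext() as ctx:
    ctx.prec = 60  # enough digits to subtract without rounding
    exact = Decimal(text)
    err_direct = abs(Decimal(direct) - exact)
    err_pieces = abs(Decimal(pieces) - exact)

print(err_direct)
print(err_pieces)
print(err_pieces >= err_direct)
```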

Here is a small but interesting discussion thread about float vs Decimal:

https://stackoverflow.com/questions/32053647/comparing-python-decimals-created-from-float-and-string

Would you mind telling us why that degree of precision (that is, decimal 
vs float) matters for your problem?


>> On 17 Dec 2022, at 13:11, Alan Gauld wrote:
>>
>> On 17/12/2022 11:51, Paul St George wrote:
>>> I have a large/long array of numbers in an external file. The numbers look 
>>> like this:
>>>
>>> -64550.727
>>> -64511.489
>>> -64393.637
>>> -64196.763
>>> -63920.2
>>
>>> When I bring the numbers into my code, they are Strings. To use the
>>> numbers in my code, I want to change the Strings to Float type
>>> because the code will not work with Strings but I do not want
>>> to change the numbers in any other way.
>>
>> That may be impossible. Float type is not exact and the conversion
>> will be the closest binary representation of your decimal number.
>> It will be very close but it may be slightly different when you
>> print it, for example. (You can usually deal with that by using
>> string formatting features.)
>>
>> Another option is to use the decimal numeric type. That has other
>> compromises associated with it but, if retaining absolute decimal
>> accuracy is your primary goal, it might suit you better.
>>
>>
>> -- 
>> Alan G
>> Author of the Learn to Program web site
>> http://www.alan-g.me.uk/
>> http://www.amazon.com/author/alan_gauld
>> Follow my photo-blog on Flickr at:
>> http://www.flickr.com/photos/alangauldphotos
>>
>>
> 











Re: String to Float, without introducing errors

2022-12-18 Thread rbowman
On Sun, 18 Dec 2022 11:14:28 -0500, Dennis Lee Bieber wrote:

> .. And maybe lament the days when a 3-digit result was acceptable in
> math class -- being the typical capability in reading a standard (10"
> scale) slide rule.

Arguably more thought was given to what those three digits meant in the 
real world. For example, is calculating a latitude to 6 decimal places 
meaningful when the data was gathered by a GPS receiver with 5m accuracy? 
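
A rough back-of-the-envelope version of that question, as a sketch: it assumes a spherical Earth and reuses the radius quoted elsewhere in this thread, so the numbers are approximate.

```python
from math import pi

r = 6.371e6  # Earth radius in metres, the value used elsewhere in this thread

# ASSUMPTION: spherical Earth, so one degree of latitude is 1/360 of a circumference.
metres_per_degree = 2 * pi * r / 360
metres_per_microdegree = metres_per_degree * 1e-6   # the 6th decimal place

gps_accuracy_m = 5.0
accuracy_in_degrees = gps_accuracy_m / metres_per_degree

print(metres_per_microdegree)   # size of the 6th decimal place, in metres
print(accuracy_in_degrees)      # 5 m expressed in degrees of latitude
```

On these assumptions the sixth decimal place corresponds to roughly a tenth of a metre, well below the stated 5 m accuracy.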


Re: String to Float, without introducing errors

2022-12-18 Thread Thomas Passin
Thanks for filling us in!  I wouldn't think the animations themselves 
would need such precision, though perhaps the calculations of the forces 
and motions do.  One way to check might be to perturb the initial 
conditions a bit and see if the changes in the motions seem to be 
correspondingly small.  That might help you work out if the problem is 
with the physics solution or the animation.  It would be easy to get 
some numerical instability in the calculations, for example if you are 
inverting matrices.


On 12/18/2022 2:00 PM, Paul St George wrote:

So I am working on a physics paper with a colleague. We have a theory about 
Newtons Cradle. We answer the question why when you lift and drop balls 1 and 
2, balls 4 and 5 rise up. I could say more, but ... (if you are interested 
please write to me).

We want to illustrate the paper with animations. The theory includes distortion 
of the balls and this distortion is very very small. So, I am sent data with 
locations and dimensions to 13 decimal places. Something strange is happening 
with the animations: the balls are not moving smoothly. I do not know (yet) 
where the problem lies so it is difficult to provide a clear narrative.

Because there is a problem, I am investigating in all areas. This brings me to 
the question I asked here. I am not expecting six decimal places or three 
decimal places to be as accurate as thirteen decimal places, but I would like 
to be in control of or fully aware of what goes on under the bonnet.

Here is a picture:
https://paulstgeorge.com/newton/cyclography.html
Thanks,
Paul



  

On 17 Dec 2022, at 16:54:05 EST 2022, Thomas Passin wrote:

On 12/17/2022 3:45 PM, Paul St George wrote:

Thanks to all!
It was the rounding error that I needed to avoid (as Peter J. Holzer 
suggested). The use of decimal solved it and just in time. I was about to 
truncate the number, get each of the characters from the string mantissa, and 
then do something like this:

64550.727

64550 + (7 * 0.1) + (2 * 0.01) + (7 * 0.001)

Now I do not need to!


And that approach would not have helped you, because each of those
calculations would be done as floating point, and you wouldn't have
gotten any more precision (and maybe less) than simply doing
float('64550.727').

Here is a small but interesting discussion thread about float vs Decimal:

https://stackoverflow.com/questions/32053647/comparing-python-decimals-created-from-float-and-string

Would you mind telling us why that degree of precision (that is, decimal
vs float) matters for your problem?



On 17 Dec 2022, at 13:11, Alan Gauld wrote:

On 17/12/2022 11:51, Paul St George wrote:

I have a large/long array of numbers in an external file. The numbers look like 
this:

-64550.727
-64511.489
-64393.637
-64196.763
-63920.2



When I bring the numbers into my code, they are Strings. To use the
numbers in my code, I want to change the Strings to Float type
because the code will not work with Strings but I do not want
to change the numbers in any other way.


That may be impossible. Float type is not exact and the conversion
will be the closest binary representation of your decimal number.
It will be very close but it may be slightly different when you
print it, for example. (You can usually deal with that by using
string formatting features.)

Another option is to use the decimal numeric type. That has other
compromises associated with it but, if retaining absolute decimal
accuracy is your primary goal, it might suit you better.


--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

--
https://mail.python.org/mailman/listinfo/python-list


String to Float, without introducing errors

2022-12-18 Thread Paul St George
So I am working on a physics paper with a colleague. We have a theory about 
Newton's Cradle. We answer the question of why, when you lift and drop balls 1 
and 2, balls 4 and 5 rise up. I could say more, but ... (if you are interested 
please write to me).

We want to illustrate the paper with animations. The theory includes distortion 
of the balls and this distortion is very very small. So, I am sent data with 
locations and dimensions to 13 decimal places. Something strange is happening 
with the animations: the balls are not moving smoothly. I do not know (yet) 
where the problem lies so it is difficult to provide a clear narrative.

Because there is a problem, I am investigating in all areas. This brings me to 
the question I asked here. I am not expecting six decimal places or three 
decimal places to be as accurate as thirteen decimal places, but I would like 
to be in control of or fully aware of what goes on under the bonnet.

Here is a picture:
https://paulstgeorge.com/newton/cyclography.html
Thanks,
Paul



 
>> On 17 Dec 2022, at 16:54:05 EST 2022, Thomas Passin wrote:
On 12/17/2022 3:45 PM, Paul St George wrote:
> Thanks to all!
> It was the rounding error that I needed to avoid (as Peter J. Holzer 
> suggested). The use of decimal solved it and just in time. I was about to 
> truncate the number, get each of the characters from the string mantissa, and 
> then do something like this:
> 
> 64550.727
> 
> 64550 + (7 * 0.1) + (2 * 0.01) + (7 * 0.001)
> 
> Now I do not need to!

And that approach would not have helped you, because each of those 
calculations would be done as floating point, and you wouldn't have 
gotten any more precision (and maybe less) than simply doing 
float('64550.727').

Here is a small but interesting discussion thread about float vs Decimal:

https://stackoverflow.com/questions/32053647/comparing-python-decimals-created-from-float-and-string

Would you mind telling us why that degree of precision (that is, decimal 
vs float) matters for your problem?


>> On 17 Dec 2022, at 13:11, Alan Gauld wrote:
>>
>> On 17/12/2022 11:51, Paul St George wrote:
>>> I have a large/long array of numbers in an external file. The numbers look 
>>> like this:
>>>
>>> -64550.727
>>> -64511.489
>>> -64393.637
>>> -64196.763
>>> -63920.2
>>
>>> When I bring the numbers into my code, they are Strings. To use the
>>> numbers in my code, I want to change the Strings to Float type
>>> because the code will not work with Strings but I do not want
>>> to change the numbers in any other way.
>>
>> That may be impossible. Float type is not exact and the conversion
>> will be the closest binary representation of your decimal number.
>> It will be very close but it may be slightly different when you
>> print it, for example. (You can usually deal with that by using
>> string formatting features.)
>>
>> Another option is to use the decimal numeric type. That has other
>> compromises associated with it but, if retaining absolute decimal
>> accuracy is your primary goal, it might suit you better.
>>
>>
>> -- 
>> Alan G
>> Author of the Learn to Program web site
>> http://www.alan-g.me.uk/
>> http://www.amazon.com/author/alan_gauld
>> Follow my photo-blog on Flickr at:
>> http://www.flickr.com/photos/alangauldphotos
>>
>>
>

-- 
https://mail.python.org/mailman/listinfo/python-list


RE: String to Float, without introducing errors

2022-12-17 Thread avi.e.gross
As often seems to happen, someone asks something that may not be fully clear 
and others chime in and extend the question.

Was the original question how to read in a single column of numbers from a file 
that are all numeric and NOT integers and be able to use them?

If so, the answer was quite trivial using the conversion function of your 
choice. Handling errors or what to do with something like a blank or NA are 
nice ideas if asked about.

My answer would be to ask if this was an assignment where they are EXPECTED to 
do things a certain way to master a concept, or part of a serious attempt to 
get things done.

For the latter case, it may make plenty of sense considering a single column of 
text as just a special case for the kind of multi-column files often read in 
from formatted data files and use some functionality from add-on modules like 
numpy or pandas that also allow you to deal with many other concerns.

Note such utilities also often make a decent guess on what data type a column 
should be turned into and it is possible they may occasionally decide based on 
the data that the contents all HAPPEN to be integer or there is at least one 
that makes it choose character. So any such use may well require the subsequent 
use of functions that check the dtype and, if needed, do an explicit conversion 
to what you really really really want.



-----Original Message-----
From: Python-list On Behalf Of Mats Wichmann
Sent: Saturday, December 17, 2022 1:42 PM
To: python-list@python.org
Subject: Re: String to Float, without introducing errors

On 12/17/22 07:15, Thomas Passin wrote:
> You have strings, and you want to end up with numbers.  The numbers 
> are not integers.  Other responders have gone directly to whether you 
> should use float or decimal as the conversion, but that is a secondary matter.
> 
> If you have integers, convert with
> 
> integer = int(number_string)
>> -64550.727

they pretty clearly aren't integers:

>> -64511.489
>> -64393.637
>> -64196.763
>> -63920.2
>> -63563.037
>> -63124.156
>> -62602.254
>> -61995.895
>> -61303.548
>> -60523.651
>> -59654.66

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread Chris Angelico
On Sun, 18 Dec 2022 at 09:46, Stefan Ram  wrote:
>
> Grant Edwards  writes:
> >Yes, fixed point (or decimal) is a better fit for what he's doing, but
> >I suspect that floating point would be a better fit for the problem
> >he's trying to solve.
>
>   I'd like to predict that within the next ten posts in this
>   thread someone will mention "What Every Computer Scientist
>   Should Know About Floating-Point Arithmetic".
>
> |>>> 0.1 + 0.2 - 0.3
> |5.551115123125783e-17
>

Looks like someone just did.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread dn

On 18/12/2022 01.39, Peter J. Holzer wrote:

On 2022-12-17 12:51:17 +0100, Paul St George wrote:

I have a large/long array of numbers in an external file. The numbers
look like this:

-64550.727
-64511.489
-64393.637

[...]


When I bring the numbers into my code, they are Strings. To use the
numbers in my code, I want to change the Strings to Float type because
the code will not work with Strings but I do not want to change the
numbers in any other way.



>>> s = "-64550.727"
>>> f = float(s)
>>> f
-64550.727
>>> type(f)
<class 'float'>


(Contrary to the other people posting in this thread I don't think float
is the wrong type for the job. It might be, but you haven't given enough
details to tell whether the inevitable rounding error matters or not. In
my experience in almost all cases where people think it matters it
really doesn't.)


Agreed: (ultimately) insufficient information-provided.
(but that probably doesn't matter either - as the OP seems to have come 
to a decision)



Agreed: probably doesn't matter.


'The world' agrees with both, having decided that Numerical Analysis is 
no longer a necessary ComSc study.


In the 'good old days', Numerical Analysis included contemplation of the 
difficulties and differences between "precision" and "accuracy". Thus, 
the highly accurate calculation of less-than-precise numbers - or was it 
precise values subject to less-than-accurate computation?

(rhetorical!)

Sort of like giving highly-accurate answers to a less-than-precise 
(complete) question, by presuming one can ignore the latter.


--
Regards,
=dn
--
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread Thomas Passin

On 12/17/2022 3:45 PM, Paul St George wrote:

Thanks to all!
It was the rounding error that I needed to avoid (as Peter J. Holzer 
suggested). The use of decimal solved it and just in time. I was about to 
truncate the number, get each of the characters from the string mantissa, and 
then do something like this:

64550.727

64550 + (7 * 0.1) + (2 * 0.01) + (7 * 0.001)

Now I do not need to!


And that approach would not have helped you, because each of those 
calculations would be done as floating point, and you wouldn't have 
gotten any more precision (and maybe less) than simply doing 
float('64550.727').
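
The point above can be checked with a small sketch. It compares the one-step 
conversion with the digit-by-digit sum, using Decimal only as an exact 
measuring stick. The variable names are illustrative, not from the thread.

```python
from decimal import Decimal

text = "64550.727"

# One rounding step: float() picks the binary double nearest to the decimal string.
direct = float(text)

# The digit-by-digit sum: each product and each addition may round again.
pieced = 64550 + (7 * 0.1) + (2 * 0.01) + (7 * 0.001)

# Decimal keeps the digits exactly as written in the string.
exact = Decimal(text)

# Decimal(some_float) converts the stored binary value exactly,
# so these differences are the true representation errors.
direct_error = abs(Decimal(direct) - exact)
pieced_error = abs(Decimal(pieced) - exact)

print("direct error:", direct_error)
print("pieced error:", pieced_error)
```

Because float() returns the nearest representable double, the pieced-together 
value can match it at best; it can never be closer to the written number.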


Here is a small but interesting discussion thread about float vs Decimal:

https://stackoverflow.com/questions/32053647/comparing-python-decimals-created-from-float-and-string

Would you mind telling us why that degree of precision (that is, decimal 
vs float) matters for your problem?




On 17 Dec 2022, at 13:11, Alan Gauld  wrote:

On 17/12/2022 11:51, Paul St George wrote:

I have a large/long array of numbers in an external file. The numbers look like 
this:

-64550.727
-64511.489
-64393.637
-64196.763
-63920.2



When I bring the numbers into my code, they are Strings. To use the
numbers in my code, I want to change the Strings to Float type
because the code will not work with Strings but I do not want
to change the numbers in any other way.


That may be impossible. Float type is not exact and the conversion
will be the closest binary representation of your decimal number.
It will be very close but it may be slightly different when you
print it, for example. (You can usually deal with that by using
string formatting features.)

Another option is to use the decimal numeric type. That has other
compromises associated with it but, if retaining absolute decimal
accuracy is your primary goal, it might suit you better.


--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos






--
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread Peter J. Holzer
On 2022-12-17 21:45:06 +0100, Paul St George wrote:
> It was the rounding error that I needed to avoid (as Peter J.
> Holzer suggested). The use of decimal solved it and just in time. I
> was about to truncate the number, get each of the characters from the
> string mantissa, and then do something like this:
> 
> 64550.727
> 
> 64550 + (7 * 0.1) + (2 * 0.01) + (7 * 0.001)

That wouldn't have helped. In fact it would have made matters worse
because instead of a single rounding operation you now have nine!

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread Chris Angelico
On Sun, 18 Dec 2022 at 08:22, Grant Edwards  wrote:
>
> On 2022-12-17, Chris Angelico  wrote:
>
> >> It was the rounding error that I needed to avoid (as Peter
> >> J. Holzer suggested). The use of decimal solved it and just in
> >> time. I was about to truncate the number, get each of the
> >> characters from the string mantissa, and then do something like
> >> this:
> >>
> >> 64550.727
> >>
> >> 64550 + (7 * 0.1) + (2 * 0.01) + (7 * 0.001)
> >>
> >> Now I do not need to!
> >
> > It sounds like fixed-point arithmetic might be a better fit for what
> > you're doing.
>
> Yes, fixed point (or decimal) is a better fit for what he's doing, but
> I suspect that floating point would be a better fit for the problem
> he's trying to solve.
>

Hard to judge, given how little info we have on the actual problem.
Could go either way.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread Grant Edwards
On 2022-12-17, Chris Angelico  wrote:

>> It was the rounding error that I needed to avoid (as Peter
>> J. Holzer suggested). The use of decimal solved it and just in
>> time. I was about to truncate the number, get each of the
>> characters from the string mantissa, and then do something like
>> this:
>>
>> 64550.727
>>
>> 64550 + (7 * 0.1) + (2 * 0.01) + (7 * 0.001)
>>
>> Now I do not need to!
>
> It sounds like fixed-point arithmetic might be a better fit for what
> you're doing.

Yes, fixed point (or decimal) is a better fit for what he's doing, but
I suspect that floating point would be a better fit for the problem
he's trying to solve.

--
Grant



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread Chris Angelico
On Sun, 18 Dec 2022 at 07:46, Paul St George  wrote:
>
> Thanks to all!
> It was the rounding error that I needed to avoid (as Peter J. Holzer 
> suggested). The use of decimal solved it and just in time. I was about to 
> truncate the number, get each of the characters from the string mantissa, and 
> then do something like this:
>
> 64550.727
>
> 64550 + (7 * 0.1) + (2 * 0.01) + (7 * 0.001)
>
> Now I do not need to!

It sounds like fixed-point arithmetic might be a better fit for what
you're doing.
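
A minimal sketch of what fixed-point could look like here, assuming three 
decimal places are enough (the sample data shows at most three). The names 
SCALE, to_fixed and to_text are made up for this illustration. Values are held 
as integer thousandths, so addition and subtraction are exact.

```python
from decimal import Decimal

SCALE = 1000  # assumption: three decimal places, i.e. store thousandths


def to_fixed(text: str) -> int:
    """Convert a decimal string to an integer count of thousandths."""
    scaled = Decimal(text) * SCALE
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{text!r} has more than three decimal places")
    return int(scaled)


def to_text(value: int) -> str:
    """Render an integer count of thousandths back as a decimal string."""
    sign = "-" if value < 0 else ""
    whole, frac = divmod(abs(value), SCALE)
    return f"{sign}{whole}.{frac:03d}"


samples = ["-64550.727", "-64511.489", "-63920.2"]
fixed = [to_fixed(s) for s in samples]
total = sum(fixed)  # integer arithmetic, no rounding

print(fixed)
print(to_text(total))
```

Data with more decimal places would need a larger SCALE; the sketch raises 
ValueError rather than silently dropping digits.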

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread Paul St George
Thanks to all!
It was the rounding error that I needed to avoid (as Peter J. Holzer 
suggested). The use of decimal solved it and just in time. I was about to 
truncate the number, get each of the characters from the string mantissa, and 
then do something like this:

64550.727

64550 + (7 * 0.1) + (2 * 0.01) + (7 * 0.001)

Now I do not need to!





> On 17 Dec 2022, at 13:11, Alan Gauld  wrote:
> 
> On 17/12/2022 11:51, Paul St George wrote:
>> I have a large/long array of numbers in an external file. The numbers look 
>> like this:
>> 
>> -64550.727
>> -64511.489
>> -64393.637
>> -64196.763
>> -63920.2
> 
>> When I bring the numbers into my code, they are Strings. To use the 
>> numbers in my code, I want to change the Strings to Float type 
>> because the code will not work with Strings but I do not want 
>> to change the numbers in any other way.
> 
> That may be impossible. Float type is not exact and the conversion
> will be the closest binary representation of your decimal number.
> It will be very close but it may be slightly different when you
> print it, for example. (You can usually deal with that by using
> string formatting features.)
> 
> Another option is to use the decimal numeric type. That has other
> compromises associated with it but, if retaining absolute decimal
> accuracy is your primary goal, it might suit you better.
> 
> 
> -- 
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
> 
> 

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread Thomas Passin

On 12/17/2022 1:41 PM, Mats Wichmann wrote:

On 12/17/22 07:15, Thomas Passin wrote:
You have strings, and you want to end up with numbers.  The numbers 
are not integers.  Other responders have gone directly to whether you 
should use float or decimal as the conversion, but that is a secondary 
matter.


If you have integers, convert with

integer = int(number_string)

-64550.727


they pretty clearly aren't integers:


Of course they aren't.  That's why I gave the line with float() too. 
It's useful to see that there is a basic pattern here:  Given a string, 
expect to need to convert it to what you actually want, int, float, 
decimal, whatever.



-64511.489
-64393.637
-64196.763
-63920.2
-63563.037
-63124.156
-62602.254
-61995.895
-61303.548
-60523.651
-59654.66


--
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread Mats Wichmann

On 12/17/22 07:15, Thomas Passin wrote:
You have strings, and you want to end up with numbers.  The numbers are 
not integers.  Other responders have gone directly to whether you should 
use float or decimal as the conversion, but that is a secondary matter.


If you have integers, convert with

integer = int(number_string)

-64550.727


they pretty clearly aren't integers:


-64511.489
-64393.637
-64196.763
-63920.2
-63563.037
-63124.156
-62602.254
-61995.895
-61303.548
-60523.651
-59654.66

--
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread Thomas Passin
You have strings, and you want to end up with numbers.  The numbers are 
not integers.  Other responders have gone directly to whether you should 
use float or decimal as the conversion, but that is a secondary matter.


If you have integers, convert with

integer = int(number_string)

If you don't have integers, convert with

number = float(number_string)

If the number is not an integer and you need to be extremely precise 
with the fractional part - for example, if you need to keep exact track 
of exact amounts of money - you can use decimal instead of float.  But 
in this example it seems unlikely that you need that.


The thing to be aware of that hasn't been mentioned so far is that the 
conversion might cause an exception if there is something wrong with one 
of the number strings.  If you don't handle it, your program will quit 
running when it hits the error.  That might not matter to you.  If it 
matters, you can handle the exception in the program, but first you will 
need to decide what to do about such an error: should that number be 
omitted, should it be replaced with a placeholder, should it be replaced 
with float("NaN"), or what else?


In your case here, I'd suggest not handling the error until you get the 
basics of your program working.  Then if you need to, figure out how you 
want to handle exceptions.
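
For later reference, the choices listed above (omit, placeholder, NaN) might 
be sketched like this. The function name parse_numbers and its on_error 
parameter are hypothetical, chosen only for this example.

```python
import math


def parse_numbers(lines, on_error="nan"):
    """Convert an iterable of strings to floats.

    on_error (hypothetical parameter for this sketch):
      "nan"   - substitute float("nan") for a bad entry
      "skip"  - omit the bad entry
      "raise" - let the ValueError propagate
    """
    numbers = []
    for line_no, raw in enumerate(lines, start=1):
        try:
            numbers.append(float(raw))
        except ValueError:
            if on_error == "raise":
                raise
            if on_error == "nan":
                numbers.append(float("nan"))
            # "skip": append nothing, just report the problem
            print(f"line {line_no}: could not convert {raw!r}")
    return numbers


data = ["-64550.727", "oops", "-63920.2"]
with_nan = parse_numbers(data)
skipped = parse_numbers(data, on_error="skip")
print(with_nan)
print(skipped)
```

Note that float() already tolerates surrounding whitespace and newlines, so 
lines read straight from a file usually need no stripping first.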



On 12/17/2022 6:51 AM, Paul St George wrote:

I have a large/long array of numbers in an external file. The numbers look like 
this:

-64550.727
-64511.489
-64393.637
-64196.763
-63920.2
-63563.037
-63124.156
-62602.254
-61995.895
-61303.548
-60523.651
-59654.66
...

When I bring the numbers into my code, they are Strings. To use the numbers in 
my code, I want to change the Strings to Float type because the code will not 
work with Strings but I do not want to change the numbers in any other way.

So, I want my Strings (above) to be these numbers.

-64550.727
-64511.489
-64393.637
-64196.763
-63920.2
-63563.037
-63124.156
-62602.254
-61995.895
-61303.548
-60523.651
-59654.66
...

Please help!

--
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread Peter J. Holzer
On 2022-12-17 12:51:17 +0100, Paul St George wrote:
> I have a large/long array of numbers in an external file. The numbers
> look like this:
> 
> -64550.727
> -64511.489
> -64393.637
[...]
> 
> When I bring the numbers into my code, they are Strings. To use the
> numbers in my code, I want to change the Strings to Float type because
> the code will not work with Strings but I do not want to change the
> numbers in any other way.

>>> s = "-64550.727"
>>> f = float(s)
>>> f
-64550.727
>>> type(f)
<class 'float'>

(Contrary to the other people posting in this thread I don't think float
is the wrong type for the job. It might be, but you haven't given enough
details to tell whether the inevitable rounding error matters or not. In
my experience in almost all cases where people think it matters it
really doesn't.)

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread Alan Gauld
On 17/12/2022 11:51, Paul St George wrote:
> I have a large/long array of numbers in an external file. The numbers look 
> like this:
> 
> -64550.727
> -64511.489
> -64393.637
> -64196.763
> -63920.2

> When I bring the numbers into my code, they are Strings. To use the 
> numbers in my code, I want to change the Strings to Float type 
> because the code will not work with Strings but I do not want 
> to change the numbers in any other way.

That may be impossible. Float type is not exact and the conversion
will be the closest binary representation of your decimal number.
It will be very close but it may be slightly different when you
print it, for example. (You can usually deal with that by using
string formatting features.)

Another option is to use the decimal numeric type. That has other
compromises associated with it but, if retaining absolute decimal
accuracy is your primary goal, it might suit you better.
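
A short sketch of both points, using one of the sample values: what float 
actually stores, how formatting hides the difference, and what Decimal keeps. 
The variable names are illustrative.

```python
from decimal import Decimal

s = "-64550.727"
f = float(s)

# repr() gives the shortest string that round-trips to the same float,
# so the value usually *looks* unchanged.
shown = repr(f)
print(shown)

# Decimal(f) converts the stored binary value exactly, revealing the
# digits the float really holds (slightly off from the written number).
stored = Decimal(f)
print(stored)

# String formatting controls how many digits are displayed.
formatted = f"{f:.3f}"
print(formatted)

# Decimal(s) keeps the digits exactly as written in the file.
d = Decimal(s)
print(d)
```

The compromises with Decimal are mainly speed and the fact that many numeric 
libraries expect float, so values may need converting at the boundary.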


-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: String to Float, without introducing errors

2022-12-17 Thread Weatherby,Gerard
https://docs.python.org/3/library/decimal.html


From: Python-list  on 
behalf of Paul St George 
Sent: Saturday, December 17, 2022 6:51:17 AM
To: python-list@python.org 
Subject: String to Float, without introducing errors


I have a large/long array of numbers in an external file. The numbers look like 
this:

-64550.727
-64511.489
-64393.637
-64196.763
-63920.2
-63563.037
-63124.156
-62602.254
-61995.895
-61303.548
-60523.651
-59654.66
...

When I bring the numbers into my code, they are Strings. To use the numbers in 
my code, I want to change the Strings to Float type because the code will not 
work with Strings but I do not want to change the numbers in any other way.

So, I want my Strings (above) to be these numbers.

-64550.727
-64511.489
-64393.637
-64196.763
-63920.2
-63563.037
-63124.156
-62602.254
-61995.895
-61303.548
-60523.651
-59654.66
...

Please help!

-- 
https://mail.python.org/mailman/listinfo/python-list


String to Float, without introducing errors

2022-12-17 Thread Paul St George
I have a large/long array of numbers in an external file. The numbers look like 
this:

-64550.727
-64511.489
-64393.637
-64196.763
-63920.2
-63563.037
-63124.156
-62602.254
-61995.895
-61303.548
-60523.651
-59654.66
...

When I bring the numbers into my code, they are Strings. To use the numbers in 
my code, I want to change the Strings to Float type because the code will not 
work with Strings but I do not want to change the numbers in any other way.

So, I want my Strings (above) to be these numbers.

-64550.727
-64511.489
-64393.637
-64196.763
-63920.2
-63563.037
-63124.156
-62602.254
-61995.895
-61303.548
-60523.651
-59654.66
...

Please help!

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-11-08 Thread Alex Hall
On Sunday, October 9, 2022 at 12:09:45 PM UTC+2, Antoon Pardon wrote:
> I would like a tool that tries to find as many syntax errors as possible 
> in a python file. I know there is the risk of false positives when a 
> tool tries to recover from a syntax error and proceeds but I would 
> prefer that over the current python strategy of quitting after the first 
> syntax error. I just want a tool for syntax errors. No style 
> enforcements. Any recommendations? -- Antoon Pardon

Bit late here, coming from the Pycoder's Weekly email newsletter, but I'm 
surprised that I don't see any mentions of 
[parso](https://parso.readthedocs.io/en/latest/):

> Parso is a Python parser that supports error recovery and round-trip parsing 
> for different Python versions (in multiple Python versions). Parso is also 
> able to list multiple syntax errors in your python file.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-12 Thread Peter J. Holzer
On 2022-10-11 14:11:56 -0400, Thomas Passin wrote:
> To bring things back to the context of the original post, actual web
> browsers are extremely tolerant of HTML syntax errors (including incorrect
> nesting of tags) in the documents they receive.

HTML5 actually specifies exactly how to recover from errors. So since
every sequence of bytes results in a well-defined DOM tree you might
argue (a bit tongue in cheek) that there are no syntax errors in HTML5.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-12 Thread Peter J. Holzer
On 2022-10-13 11:23:40 +1100, Chris Angelico wrote:
> On Thu, 13 Oct 2022 at 11:19, Peter J. Holzer  wrote:
> > On 2022-10-11 09:47:52 +1100, Chris Angelico wrote:
> > > On Tue, 11 Oct 2022 at 09:18, Cameron Simpson  wrote:
> > > >
> > > Consider:
> > >
> > > if condition # no colon
> > > code
> > > else:
> > > code
> > >
> > > To actually "restart" parsing, you have to make a guess of some sort.
> >
> > Right. At least one of the papers on parsing I read over the last few
> > years (yeah, I really should try to find them again) argued that the
> > vast majority of syntax errors are either a missing token, a superfluous
> > token or a combination of the two. So one strategy with good results
> > is to heuristically try to insert or delete single tokens and check
> > which results in the longest distance to the next error.
> >
> > Checking multiple possible fixes has its cost, especially since you have
> > to do that at every error. So you can argue that it is better for
> > productivity if you discover one error in 0.1 seconds than 10 errors in
> > 5 seconds.
> 
> Maybe; but what if you report 10 errors in 5 seconds, but 8 of them
> are spurious? You've reported two useful errors in a sea of noise.
> Even if it's the other way around (8 where you nailed it and correctly
> reported the error, 2 that are nonsense), is it actually helpful?

Humans are pattern-matching animals. It is quite possible that seeing a
bunch of related errors makes the fix more obvious than seeing them in
isolation.

No, I haven't done any studies on this. Yes, it is possible that all
those compiler writers who spent lots of work on error recovery over the
last 50 years (or longer) are delusional.


> > > > I grew up with C and Pascal compilers which would _happily_ produce many
> > > > complaints, usually accurate, and all manner of syntactic errors. They
> > > > didn't stop at the first syntax error.
> > >
> > > Yes, because they work with a much simpler grammar.
> >
> > I very much doubt that. Python doesn't have a particularly complicated
> > grammar, and C certainly doesn't have a particularly simple one.
> >
> > The argument that it's impossible in Python (unlike any other language),
> > because Python is oh so special doesn't hold water.
> >
> 
> Never said it's because Python is special; there are a LOT of
> languages that are at least as complicated.

And almost all of their compilers do try to recover from errors.

> But I do think that Pascal, especially, has a significantly simpler
> grammar than Python does.

Incidentally, Turbo Pascal was the one other example of a compiler which
*didn't* try to recover.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-12 Thread Chris Angelico
On Thu, 13 Oct 2022 at 11:23, dn  wrote:
> # add an extra character within identifier, as if 'new' identifier
> 28  assert expected_value == fyibonacci_number
> UUU
>
> # these all trivial SYNTAX errors - could have tried leaving-out a
> keyword, but ...

Just to be clear, this last one is not actually a *syntax* error -
it's a misspelled name, but contextually, that is clearly a name and
nothing else. These are much easier to report multiples of, and
typical syntax highlighters will do so.

Your other two examples were both syntactic discrepancies though.
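
The distinction can be seen with the standard library alone. In this sketch 
the line from the quoted example parses cleanly, and the problem only shows up 
as a NameError when the code runs. The globals dict and the "<sketch>" 
filename are made up for the illustration.

```python
import ast

source = "assert expected_value == fyibonacci_number\n"

# Parsing succeeds: the misspelled identifier is still a valid name.
tree = ast.parse(source)
print(type(tree).__name__)

# The mistake only surfaces at run time, when the name is looked up.
# optimize=0 keeps the assert statement even if Python was started with -O.
error = None
try:
    code = compile(tree, "<sketch>", "exec", optimize=0)
    exec(code, {"expected_value": 1})
except NameError as exc:
    error = exc

print("NameError:", error)
```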

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-12 Thread Chris Angelico
On Thu, 13 Oct 2022 at 11:19, Peter J. Holzer  wrote:
>
> On 2022-10-11 09:47:52 +1100, Chris Angelico wrote:
> > On Tue, 11 Oct 2022 at 09:18, Cameron Simpson  wrote:
> > >
> > Consider:
> >
> > if condition # no colon
> > code
> > else:
> > code
> >
> > To actually "restart" parsing, you have to make a guess of some sort.
>
> Right. At least one of the papers on parsing I read over the last few
> years (yeah, I really should try to find them again) argued that the
> vast majority of syntax errors are either a missing token, a superfluous
> token or a combination of the two. So one strategy with good results
> is to heuristically try to insert or delete single tokens and check
> which results in the longest distance to the next error.
>
> Checking multiple possible fixes has its cost, especially since you have
> to do that at every error. So you can argue that it is better for
> productivity if you discover one error in 0.1 seconds than 10 errors in
> 5 seconds.

Maybe; but what if you report 10 errors in 5 seconds, but 8 of them
are spurious? You've reported two useful errors in a sea of noise.
Even if it's the other way around (8 where you nailed it and correctly
reported the error, 2 that are nonsense), is it actually helpful? Bear
in mind that, if you can discover one syntax error in 0.1 seconds, you
can do that check *the moment the user types a key* in the editor
(which is more-or-less what happens with most syntax highlighting
editors - some have a small delay to avoid being too noisy with error
reporting, but same difference). Why report false errors when you can
report errors one by one and know that they're true?
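
To make the trade-off concrete, here is a toy sketch of the single-token 
repair heuristic quoted above: try a few one-token edits on the failing line 
and keep the one that pushes the next error furthest away. It is not how any 
real parser does it; it simply re-runs compile() per candidate, and the two 
candidate edits are assumptions picked for the missing-colon example.

```python
import math


def first_error_line(source):
    """Return the line number of the first syntax error, or None if clean."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError as exc:
        return exc.lineno
    return None


def candidate_repairs(source, lineno):
    """Yield (description, repaired_source) pairs for the broken line."""
    lines = source.splitlines()
    broken = lines[lineno - 1]

    def rebuilt(new_line):
        return "\n".join(lines[:lineno - 1] + [new_line] + lines[lineno:]) + "\n"

    # Insert a single token: a trailing colon.
    yield "append ':'", rebuilt(broken + ":")

    # Delete a single token: drop the last whitespace-separated word.
    words = broken.split()
    if len(words) > 1:
        indent = broken[:len(broken) - len(broken.lstrip())]
        yield "drop last word", rebuilt(indent + " ".join(words[:-1]))


def best_repair(source):
    """Pick the candidate whose next error is furthest away (or gone)."""
    lineno = first_error_line(source)
    if lineno is None:
        return None  # nothing to repair (or no line number reported)

    def distance(candidate):
        next_error = first_error_line(candidate[1])
        return math.inf if next_error is None else next_error

    return max(candidate_repairs(source, lineno), key=distance)


broken = "if condition\n    x = 1\nelse:\n    x = 2\n"
description, repaired = best_repair(broken)
print(first_error_line(broken), description)
```

Each error costs one extra compile per candidate, which is the cost being 
weighed in the discussion, and a wrong guess can still produce spurious 
follow-on errors.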

> > > I grew up with C and Pascal compilers which would _happily_ produce many
> > > complaints, usually accurate, and all manner of syntactic errors. They
> > > didn't stop at the first syntax error.
> >
> > Yes, because they work with a much simpler grammar.
>
> I very much doubt that. Python doesn't have a particularly complicated
> grammar, and C certainly doesn't have a particularly simple one.
>
> The argument that it's impossible in Python (unlike any other language),
> because Python is oh so special doesn't hold water.
>

Never said it's because Python is special; there are a LOT of
languages that are at least as complicated. Try giving multiple useful
errors when there's a syntactic problem in SQL, for instance. But I do
think that Pascal, especially, has a significantly simpler grammar
than Python does.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-12 Thread dn

On 09/10/2022 23.09, Antoon Pardon wrote:
I would like a tool that tries to find as many syntax errors as possible 
in a python file. I know there is the risk of false positives when a 
tool tries to recover from a syntax error and proceeds but I would 
prefer that over the current python strategy of quitting after the first 
syntax error. I just want a tool for syntax errors. No style 
enforcements. Any recommendations? -- Antoon Pardon



Am not sure if I have really understood the problem being addressed, because 
it seems 'answered' - perhaps the question says more about the tool-set 
being utilised...



As someone who used to manually check and re-check code before 
submitting (first punched-cards, and later edited files) source to a 
compiler, it took some re-education to learn what to expect from a 
modern/language-intelligent IDE.


The topic was a major interest back in the days of batch-compilers. Plus 
we had other tools, eg CREF/XREF utilities which produced 
cross-references of identifier usage - and illustrated typos in 
identifiers, usage before value-assignment, etc (per request from one 
respondent).



Using an IDE which is inspecting source-code as it is being typed (or 
when an existing file is opened) will suggest what might/should be typed 
'next' (a mixed blessing IMHO!), and secondly highlights errors until 
they are noticed and dealt-with. Some, especially warnings, can be 
safely ignored - and yes, some are spurious and SHOULD be ignored!


PyCharm* displays a number of indicators. The least intrusive appears in 
the top-right corner of the editor-tab listing, eg 8 errors, 2 warnings. 
So, apparently not 'stopping' at first error found.


Within the source-code itself, there are high-lights and under-lines (in 
and amongst the syntax highlighting presentation/theme) - which I 
suppose are easier to notice during data-entry if one is a touch-typist. 
Accordingly, not much of a context for multiple errors to be committed 
during a single coding-session, but remaining un-noticed until 'the end'.



For illustration, I took a simple tutorial* routine and deliberately 
introduced some/many of the types of error discussed within this thread. 
It would have been ideal to attach a graphic but here are some lines of 
code, under which I have attempted to represent a highlighted character 
(related to the line above) with an "H", and a (red) under-lined token 
with a "U". So, this is a feeble-attempt to show how the source is 
displayed and annotated by the IDE:


# mis-type the tuple-assignment by adding semi-colon
# which might also confuse Python into thinking of a second instruction
17 i, j = 0;, 1
  H  UH

# replace under-line/under-score with space: s/b expected_value
25 for expected value, fibonacci_number in \
   UU  

# mis-type the name of the zip built-in function
26 z ip( SERIES, fibonacci_generator() ):
   U 

# add an extra character within identifier, as if 'new' identifier
28  assert expected_value == fyibonacci_number
   UUU

# these are all trivial SYNTAX errors - could have tried leaving-out a keyword, but ...



Assuming the problem is not noticed/handled as the text is being typed, 
and in addition to the coder reviewing the work, recognising problems, 
and dealing with them him-/her-self; the IDE offers two follow-up 
mechanisms:


1 a means to jump 'focus' from the site of one error to the next, 
whereupon a pop-up will describe the error, eg (line 28) "Unresolved 
reference 'expected_value'"; which illustrates one problem in-isolation. 
In this case, line 28 is 'at fault' despite the fact that the 'error' is 
a consequence of THE problem on line 25!


2 a "Problems" Tool Window can be displayed, which will list every error 
and warning, with pretty, colored, icons, and the same message per 
example above, together with the relevant line-number, (the first two 
entries, as-listed, are 'warnings', and the rest are described as "errors"):


Need more values to unpack:17
Statement seems to have no effect:17
# so it has picked-up both of my nefarious intentions

Statement expected, found Py:COMMA:17
# as above
# NB the "Py:COMMA" is from tokenize (per @Chris contribution(s))
'in' expected:25
# logical, but confused by the space
Unresolved reference 'value':25
# pretty-much had no chance with so many faults in one statement!
Unresolved reference 'fibonacci_number':25
# ditto
Unresolved reference 'z':26
# absolutely!
':' expected:26
# evidently re-started after the "in" and did what it could with the "z"
Unresolved reference 'expected_value':28
# it would be "resolved" but for t

Re: What to use for finding as many syntax errors as possible.

2022-10-12 Thread Peter J. Holzer
On 2022-10-11 09:47:52 +1100, Chris Angelico wrote:
> On Tue, 11 Oct 2022 at 09:18, Cameron Simpson  wrote:
> >
> Consider:
> 
> if condition # no colon
>     code
> else:
>     code
> 
> To actually "restart" parsing, you have to make a guess of some sort.

Right. At least one of the papers on parsing I read over the last few
years (yeah, I really should try to find them again) argued that the
vast majority of syntax errors is either a missing token, a superfluous
token or a combination of the two. So one strategy with good results
is to heuristically try to insert or delete single tokens and check
which results in the longest distance to the next error.

Checking multiple possible fixes has its cost, especially since you have
to do that at every error. So you can argue that it is better for
productivity if you discover one error in 0.1 seconds than 10 errors in
5 seconds.
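
A rough sketch of the insert-or-delete heuristic described above, for
illustration only. It is not taken from any of the papers mentioned. The
candidate token list, the crude regex "tokenizer" and the names
next_error_line, single_token_fixes and best_fix are all assumptions made
for this example, and "distance" is measured simply as the line number of
the next error that ast.parse reports.

```python
import ast
import re

# Assumption: a tiny hand-picked set of tokens worth trying to insert.
CANDIDATE_TOKENS = [":", ")", "]", "}", ","]


def next_error_line(source):
    """Line number of the first syntax error, or infinity if the source parses."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return exc.lineno or 1
    return float("inf")


def single_token_fixes(source):
    """Yield variants with one token deleted or inserted on the error line."""
    err = next_error_line(source)
    if err == float("inf"):
        return
    lines = source.split("\n")
    idx = min(err, len(lines)) - 1
    line = lines[idx]
    # Crude tokens: runs of word characters, or single punctuation characters.
    spans = [m.span() for m in re.finditer(r"\w+|[^\w\s]", line)]
    # Deletions: drop one crude token at a time.
    for start, end in spans:
        changed = line[:start] + line[end:]
        yield "\n".join(lines[:idx] + [changed] + lines[idx + 1:])
    # Insertions: try each candidate before every token and at the line end.
    for pos in [start for start, _ in spans] + [len(line)]:
        for tok in CANDIDATE_TOKENS:
            changed = line[:pos] + tok + line[pos:]
            yield "\n".join(lines[:idx] + [changed] + lines[idx + 1:])


def best_fix(source):
    """Return the variant whose next error is furthest away, or None."""
    current = next_error_line(source)
    best = max(single_token_fixes(source), key=next_error_line, default=None)
    if best is None or next_error_line(best) <= current:
        return None
    return best
```

Calling best_fix("if x\n    y = 1\nelse:\n    y = 2\n") tries a few dozen
variants of the first line and keeps the one that parses furthest. That
also shows the cost side of the argument: every candidate is a full
re-parse.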


> > I grew up with C and Pascal compilers which would _happily_ produce many
> > complaints, usually accurate, and all manner of syntactic errors. They
> > didn't stop at the first syntax error.
> 
> Yes, because they work with a much simpler grammar.

I very much doubt that. Python doesn't have a particularly complicated
grammar, and C certainly doesn't have a particularly simple one.

The argument that it's impossible in Python (unlike any other language),
because Python is oh so special doesn't hold water.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-11 Thread Cameron Simpson

On 11Oct2022 17:45, Thomas Passin  wrote:
Personally, I'd most likely go for a decent programming editor that you 
can set up to run a program on your file, use that to run a checker, 
like pyflakes for instance, and run that from time to time.  You could 
run it when you save a file.  Even if it only showed one error at a 
time, it would make quick work of correcting mistakes.  And it wouldn't 
need to trigger an entire tool chain each time.


Aye.

I've got my editor (vim) configured to run an autoformatter on my code 
when I save (this can be turned off, and parse errors prevent any 
reformatting).


Linters I run by hand from the adjacent shell window, via a small script 
which runs my preferred linters with their preferred options.


My current workplace triggers the CI workflow when you push commits 
upstream, and you can make branch names which do not trigger the CI 
stuff.


So there's a decent separation between saving (and testing or locally 
running the dev code) from the CI cycle.


Cheers,
Cameron Simpson 
--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-11 Thread Thomas Passin

On 10/11/2022 5:09 PM, Thomas Passin wrote:

The OP wants to get help with problems in 
his files even if it isn't perfect, and I think that's reasonable to 
wish for.  The link to a post about the lezer parser in a recent message 
on this thread is partly about how a real, practical parser can do some 
error correction in mid-flight, for the purposes of a programming editor 
(as opposed to one that has to build a correct program).


One editor that seems to do what the OP wants is Visual Studio Code.  It 
will mark apparent errors - not just syntax errors - not limited to one 
per page.  Sometimes it can even suggest corrections.  I personally 
dislike the visual clutter the markings impose, but I imagine I could 
get used to it.


VSC uses a Microsoft system they call "PyLance" - see

https://devblogs.microsoft.com/python/announcing-pylance-fast-feature-rich-language-support-for-python-in-visual-studio-code/

Of course, you don't get something complex for free, and in this case 
the cost is having to run a separate server to do all this analysis on 
the fly.  However, VSC handles all of that behind the scenes so you 
don't have to.


Personally, I'd most likely go for a decent programming editor that you 
can set up to run a program on your file, use that to run a checker, 
like pyflakes for instance, and run that from time to time.  You could 
run it when you save a file.  Even if it only showed one error at a 
time, it would make quick work of correcting mistakes.  And it wouldn't 
need to trigger an entire tool chain each time.


My editor of choice for setting up helper "tools" like this on Windows 
is Editplus (non-free but cheap and very worth it), and I have both 
py_compile and pyflakes set up this way in it.  However, as I mentioned 
in an earlier post, the Leo Editor 
(https://github.com/leo-editor/leo-editor) does this for you 
automatically when you save, so it's very convenient.  That's what I 
mostly work in.
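
As an illustration of the kind of helper an editor could run on save, here
is a minimal sketch that uses only the built-in compile(). It is not how
Editplus, Leo or pyflakes do it; check_source and check_file are
hypothetical names for this example. Like Python itself, it reports only
the first syntax error.

```python
def check_source(source, filename="<string>"):
    """Return None if source compiles, else a one-line 'file:line:col: msg'."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:
        return f"{filename}:{exc.lineno}:{exc.offset}: {exc.msg}"
    return None


def check_file(path):
    """Read a file and run the same check on its contents."""
    with open(path, encoding="utf-8") as fh:
        return check_source(fh.read(), path)
```

An editor "tool" entry would then call check_file on the current file and
show the returned line, if any, in its output pane.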

--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-11 Thread Thomas Passin

On 10/11/2022 4:00 PM, Chris Angelico wrote:

On Wed, 12 Oct 2022 at 05:23, Thomas Passin  wrote:


On 10/11/2022 3:10 AM, avi.e.gr...@gmail.com wrote:

I see resemblances to something like how a web page is loaded and operated.
I mean very different but at some level not so much.

I mean a typical web page is read in as HTML with various keyword regions
expected such as  ...  or  ...  with things
often cleanly nested in others. The browser makes nodes galore in some kind
of tree format with an assortment of objects whose attributes or methods
represent aspects of what it sees. The resulting treelike structure has
names like DOM.


To bring things back to the context of the original post, actual web
browsers are extremely tolerant of HTML syntax errors (including
incorrect nesting of tags) in the documents they receive.  They usually
recover silently from errors and are able to display the rest of the
page.  Usually they manage this correctly.


Having had to debug tiny errors in HTML pages that resulted in
extremely weird behaviour, I'm not sure that I agree that they usually
manage correctly. Fundamentally, they guess, and guesswork is never
reliable.


Still, browsers generally do a very decent job of recovery, even though 
perfection isn't possible.  The OP wants to get help with problems in 
his files even if it isn't perfect, and I think that's reasonable to 
wish for.  The link to a post about the lezer parser in a recent message 
on this thread is partly about how a real, practical parser can do some 
error correction in mid-flight, for the purposes of a programming editor 
(as opposed to one that has to build a correct program).


--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-11 Thread Chris Angelico
On Wed, 12 Oct 2022 at 05:23, Thomas Passin  wrote:
>
> On 10/11/2022 3:10 AM, avi.e.gr...@gmail.com wrote:
> > I see resemblances to something like how a web page is loaded and operated.
> > I mean very different but at some level not so much.
> >
> > I mean a typical web page is read in as HTML with various keyword regions
> > expected such as  ...  or  ...  with things
> > often cleanly nested in others. The browser makes nodes galore in some kind
> > of tree format with an assortment of objects whose attributes or methods
> > represent aspects of what it sees. The resulting treelike structure has
> > names like DOM.
>
> To bring things back to the context of the original post, actual web
> browsers are extremely tolerant of HTML syntax errors (including
> incorrect nesting of tags) in the documents they receive.  They usually
> recover silently from errors and are able to display the rest of the
> page.  Usually they manage this correctly.

Having had to debug tiny errors in HTML pages that resulted in
extremely weird behaviour, I'm not sure that I agree that they usually
manage correctly. Fundamentally, they guess, and guesswork is never
reliable.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-11 Thread Thomas Passin

On 10/11/2022 3:10 AM, avi.e.gr...@gmail.com wrote:

I see resemblances to something like how a web page is loaded and operated.
I mean very different but at some level not so much.

I mean a typical web page is read in as HTML with various keyword regions
expected such as  ...  or  ...  with things
often cleanly nested in others. The browser makes nodes galore in some kind
of tree format with an assortment of objects whose attributes or methods
represent aspects of what it sees. The resulting treelike structure has
names like DOM.


To bring things back to the context of the original post, actual web 
browsers are extremely tolerant of HTML syntax errors (including 
incorrect nesting of tags) in the documents they receive.  They usually 
recover silently from errors and are able to display the rest of the 
page.  Usually they manage this correctly.  The OP would like to have a 
parser or checker that could do the same, plus giving an output showing 
where each of the errors happened.


I can imagine such a parser also reporting which lines it had to skip 
before it was able to recover.
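
A naive sketch of such a checker, to make the idea concrete. This is an
assumption-laden illustration, not an existing tool: each time ast.parse
reports an error, the offending line is replaced by an indented "pass" and
the file is parsed again. collect_syntax_errors is a hypothetical name, and
the approach can produce exactly the kind of spurious follow-on errors
discussed elsewhere in this thread.

```python
import ast


def collect_syntax_errors(source, max_errors=20):
    """Return (errors, skipped): errors as (lineno, message), skipped line numbers."""
    lines = source.split("\n")
    errors = []
    skipped = []
    while len(errors) < max_errors:
        try:
            ast.parse("\n".join(lines))
        except SyntaxError as exc:
            lineno = min(exc.lineno or 1, len(lines))
            if lineno in skipped:
                break  # replacing this line did not help; give up
            errors.append((lineno, exc.msg))
            skipped.append(lineno)
            old = lines[lineno - 1]
            indent = old[:len(old) - len(old.lstrip())]
            # Keep the indentation so enclosing blocks still have a body.
            lines[lineno - 1] = indent + "pass"
        else:
            break
    return errors, skipped
```

The skipped list is the "which lines it had to skip" report; everything on
those lines goes unchecked.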

--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-11 Thread Chris Angelico
On Tue, 11 Oct 2022 at 18:12,  wrote:
>
> Thanks for a rather detailed explanation of some of what we have been
> discussing, Chris. The overall outline is about what I assumed was there but
> some of the details were, to put it politely, fuzzy.
>
> I see resemblances to something like how a web page is loaded and operated.
> I mean very different but at some level not so much.
>
> I mean a typical web page is read in as HTML with various keyword regions
> expected such as  ...  or  ...  with things
> often cleanly nested in others. The browser makes nodes galore in some kind
> of tree format with an assortment of objects whose attributes or methods
> represent aspects of what it sees. The resulting treelike structure has
> names like DOM.

Yes. The basic idea of "tokenize, parse, compile" can be used for
pretty much any language - even English, although its grammar is a bit
more convoluted than most programming languages, with many weird
backward compatibility features! I'll parse your last sentence above:

LETTERS The
SPACE
LETTERS resulting
SPACE
... you get the idea
LETTERS like
SPACE
LETTERS DOM
FULLSTOP # or call this token PERIOD if you're American

Now, we can group those tokens into meaningful sets.

Sentence(type=Statement,
    subject=Noun(name="structure", addenda=[
        Article(type=The),
        Adjective(name="treelike"),
    ]),
    verb=Verb(type=Being, name="has", addenda=[]),
    object=Noun(name="name", plural=True, addenda=[
        Adjective(phrase=Phrase(verb=Verb(name="like"), object=Noun(name="DOM"))),
    ]),
)

Grammar nerds will probably dispute some of the awful shorthanding I
did here, but I didn't want to devise thousands of AST nodes just for
this :)

> To a certain approximation, this tree starts a certain way but is regularly
> being manipulated (or perhaps a copy is) as it regularly is looked at to see
> how to display it on the screen at the moment based on the current tree
> contents and another set of rules in Cascading Style Sheets.

Yep; the DOM tree is initialized from the HTML (usually - it's
possible to start a fresh tree with no HTML) and then can be
manipulated afterwards.

> These are not at all the same thing but share a certain set of ideas and
> methods and can be very powerful as things interact.

Oh absolutely. That's why there are languages designed to help you
define other languages.

> In effect the errors in the web situation have such analogies too as in what
> happens if a region of HTML is not well-formed or uses a keyword not
> recognized.

And they're horribly horribly messy, due to a few decades of
sloppy HTML programmers and the desire to still display the page even
if things are messed up :) But, again, there's a huge difference
between syntactic errors (like omitting a matching angle bracket) and
semantic errors (a keyword not known, like using  when you
should have used ). In the latter case, you can still build a
DOM tree, but you have an unknown element; in the former case, you
have to guess at what the author meant, just to get anything going at
all.

> There was a guy around a few years ago who suggested he would create a
> system where you could create a series of some kind of configuration files
> for ANY language and his system would then compile or run programs for each
> and every such language? Was that on this forum? What ever happened to him?

That was indeed on this forum, and I have no idea what happened to
him. Maybe he realised that all he'd invented was the Unix shebang?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-11 Thread Weatherby,Gerard
Sure it does. They’re optional and not enforced at runtime, but I find them 
useful when writing code in PyCharm:

import os
from os import DirEntry

de: DirEntry
for de in os.scandir('/tmp'):
    print(de.name)

de = 7
print(de)

Predeclaring de allows me to do the tab completion thing with DirEntry fields / 
methods

From: Python-list  on 
behalf of avi.e.gr...@gmail.com 
Date: Monday, October 10, 2022 at 10:11 PM
To: python-list@python.org 
Subject: RE: What to use for finding as many syntax errors as possible.

Michael,

A reasonable question. Python lets you initialize variables but has no
explicit declarations.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-11 Thread Antoon Pardon




Op 10/10/2022 om 19:08 schreef Robert Latest via Python-list:

Antoon Pardon wrote:

I would like a tool that tries to find as many syntax errors as possible
in a python file.

I'm puzzled as to when such a tool would be needed. How many syntax errors can
you realistically put into a single Python file before compiling it for the
first time?


Why are you puzzled? I don't need to make that many syntax errors to find
such a tool useful.

--
Antoon Pardon
--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-11 Thread Roel Schroeven

Op 10/10/2022 om 19:08 schreef Robert Latest via Python-list:

Antoon Pardon wrote:
> I would like a tool that tries to find as many syntax errors as possible 
> in a python file.


I'm puzzled as to when such a tool would be needed. How many syntax errors can
you realistically put into a single Python file before compiling it for the
first time?

I've been following the discussion from a distance and the whole time 
I've been wondering the same thing. Especially when you have unit tests, 
as Antoon said he has, I can't really imagine a situation where you add 
so much code in one go without running it that you introduce a painful 
amount of syntax errors.


My solution would be to use a modern IDE with a linter, possibly with 
style warnings disabled, which will flag syntax errors as soon as you 
type them. Possibly combined with a TDD-style tactic which also prevents 
large amounts of errors (any errors) from building up. But I have the 
impression that none of those fits in Antoon's workflow.


--
"Peace cannot be kept by force. It can only be achieved through understanding."
-- Albert Einstein

--
https://mail.python.org/mailman/listinfo/python-list


RE: What to use for finding as many syntax errors as possible.

2022-10-11 Thread avi.e.gross
Thanks for a rather detailed explanation of some of what we have been
discussing, Chris. The overall outline is about what I assumed was there but
some of the details were, to put it politely, fuzzy.

I see resemblances to something like how a web page is loaded and operated.
I mean very different but at some level not so much.

I mean a typical web page is read in as HTML with various keyword regions
expected such as  ...  or  ...  with things
often cleanly nested in others. The browser makes nodes galore in some kind
of tree format with an assortment of objects whose attributes or methods
represent aspects of what it sees. The resulting treelike structure has
names like DOM.

To a certain approximation, this tree starts a certain way but is regularly
being manipulated (or perhaps a copy is) as it regularly is looked at to see
how to display it on the screen at the moment based on the current tree
contents and another set of rules in Cascading Style Sheets. But bits and
pieces of JavaScript are also embedded or imported that can read aspects of
the tree (and more) and modify the contents and arrange for all kinds of
asynchronous events when bits of code are invoked such as when you click a
button or hover or when an image finishes loading or every 100 milliseconds.
It can insert new objects into the DOM too. And of course there can be
interactions with restricted local storage as well as with servers and code
running there.

It is quite a mess but in some ways I see analogies. Your program reads a
stream of data and looks for tokens and eventually turns things into a tree
of sorts that represents relationships to a point. Additional structures
eventually happen at run time that let you store collections of references
to variables such as environments or namespaces and the program derived from
the trees makes changes as it goes and in a language like Python can even
possibly change the running program in some ways.

These are not at all the same thing but share a certain set of ideas and
methods and can be very powerful as things interact. In the web case, the
CSS may search for regions with some class or ID or that are the third
element of a bullet list and more, using powerful tools like jQuery, and
make changes. A CSS rule that previously ignored some region as not having a
particular class, might start including it after a JavaScript segment is
aroused while waiting on an event listener for say a mouse hovering over an
area and then changes that part of the DOM (like a node) to be in that
class. Suddenly the area on your screen changes background or whatever the
CSS now dictates. We have multiple systems written in an assortment of
"languages" that complement each other. Some running programs, especially
ones that use asynchronous methods like threads or callbacks on events, such
as a GUI, can effectively do similar things. 

In effect the errors in the web situation have such analogies too as in what
happens if a region of HTML is not well-formed or uses a keyword not
recognized. This becomes even more interesting in XML where anything can be
a keyword and you often need other kinds of files (often also in ML) to
define what the XML can be like and what restrictions it may have such as
can a  have multiple authors but only one optional publication date
and so on. It can be fascinating and highly technical. So I am up for a
challenge of studying anything from early compilers for languages of my
youth to more recent ways including some like what you show.

I have time to kill and this might be more fun than other things, for a
while.

There was a guy around a few years ago who suggested he would create a
system where you could create a series of some kind of configuration files
for ANY language and his system would then compile or run programs for each
and every such language? Was that on this forum? What ever happened to him?

But although what he promised seemed a bit too much, I can see from your
comments below how in some ways a limited amount of that might be done for
some subset of languages which can be parsed and manipulated as described. 

-Original Message-
From: Python-list  On
Behalf Of Chris Angelico
Sent: Monday, October 10, 2022 11:55 PM
To: python-list@python.org
Subject: Re: What to use for finding as many syntax errors as possible.

On Tue, 11 Oct 2022 at 14:26,  wrote:
>
> I stand corrected Chris, and others, as I pay the sin tax.
>
> Yes, there are many kinds of errors that logically fall into different 
> categories or phases of evaluation of a program and some can be 
> determined by a more static analysis almost on a line by line (or 
> "statement" or "expression", ...)  basis and others need to sort of 
> simulate some things and look back and forth to detect possible 
> incompatibilities and yet others can only be detected at run time and 
> likely way more categories depending on the language.
>
> But 

What to use for finding as many syntax errors as possible.

2022-10-10 Thread avi.e.gross
I think we are in agreement here, Chris. My point is that the error
detection and correction is now done at levels where there is not much need
to use earlier and inefficient methods like parity bits set aside. We use
protocols like TCP and IP and layers above them and above those to maintain
the integrity of packets and sessions and forms of encryption allowing
things like authentication. There is tons of overhead, even when some is
fairly efficient, but we hardly notice it unless things go wrong.

So written language sent (as in this email/post) does not need lots of
redundancy and all the extra effort is, IMNSHO, largely wasted. If I
see a bear, I do not wish to check their genitals or DNA to determine their
irrelevant gender before asking someone to run from it. If I happen to know
the gender, as in a zoo, gender only matters for things like breeding
purposes. I do not want to memorize terms in languages that have not only
words like lion and lioness or duck and drake and goose and gander, but for
EVERYTHING in some sense so I can say the equivalent of ANIMAL-male and
ANIMAL-female with unique words. Life would be so much simpler if I could
say your dog was nice and not be corrected that it was a bitch and I used
the wrong word endings. If I really wanted to say it was a female dog, well
I could just add a qualifier. Most of the time, who cares?

The same applies to so much grammatical nonsense which is also usually
riddled with endless exceptions to the many rules. Make the languages simple
with little redundancy and thus far easier to learn.

I can say similar things about some programming languages that either have
way too many rules or too few of the right ones.

There are tradeoffs and if you want a powerful language it will likely not
be easy to control. If you want a very regulated language, you may find it
not very useful as many things are hard to do and others not possible. I know
that strongly typed languages often have to allow some method of cheating
such as unions of data types, or using a parent class as the sort of
object-type to allow disparate objects to live together. Python is far from
the most complex but as noted, it is not trivial to evaluate even the syntax
past errors.

But I admit it is fun and a challenge to learn both kinds and I spent much
of my time doing so. I like the flexibility of seeing different approaches
and holding contradictions in my mind while accepting both and yet neither!
LOL!


-Original Message-
From: Python-list  On
Behalf Of Chris Angelico
Sent: Monday, October 10, 2022 11:24 PM
To: python-list@python.org
Subject: Re: What to use for finding as many syntax errors as possible.

On Tue, 11 Oct 2022 at 14:13,  wrote:
> With the internet today, we are used to expecting error correction to 
> come for free. Do you really need one of every 8 bits to be a parity 
> bit, which only catches maybe half of the errors...

Fortunately, we have WAY better schemes than simple parity, which was only
really a thing in the modem days. (Though I would say that there's still a
pretty clear distinction between a good message where everything has correct
parity, and line noise where half of them
don't.) Hamming codes can correct one-bit errors (and detect two-bit
errors) at a price of log2(size)+1 bits of space. Here's a great
rundown:

https://www.youtube.com/watch?v=X8jsijhllIA

There are other schemes too, but Hamming codes are beautifully elegant and
easy to understand.
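
For readers who prefer code to video, here is a small sketch of the classic
Hamming(7,4) code: 4 data bits, 3 parity bits, any single flipped bit is
located and corrected. The function names are made up for this example.
This plain version does not include the extra overall parity bit needed to
also detect two-bit errors.

```python
def hamming74_encode(data):
    """Encode 4 data bits as the 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(word):
    """Return (corrected codeword, 1-based position of the flipped bit or 0)."""
    c = list(word)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome spells out the bad position
    if position:
        c[position - 1] ^= 1
    return c, position
```

The neat part is that the three parity checks, read as a binary number,
point directly at the damaged bit.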

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Chris Angelico
On Tue, 11 Oct 2022 at 14:26,  wrote:
>
> I stand corrected Chris, and others, as I pay the sin tax.
>
> Yes, there are many kinds of errors that logically fall into different
> categories or phases of evaluation of a program and some can be determined
> by a more static analysis almost on a line by line (or "statement" or
> "expression", ...)  basis and others need to sort of simulate some things
> and look back and forth to detect possible incompatibilities and yet others
> can only be detected at run time and likely way more categories depending on
> the language.
>
> But when I run the Python interpreter on code, aren't many such phases done
> interleaved and at once as various segments of code are parsed and examined
> and perhaps compiled into block code and eventually executed?

Hmm, depends what you mean. Broadly speaking, here's how it goes:

0) Early pre-parse steps that don't really matter to most programs,
like checking character set. We'll ignore these.
1) Tokenize the text of the program into a sequence of
potentially-meaningful units.
2) Parse those tokens into some sort of meaningful "sentence".
3) Compile the syntax tree into actual code.
4) Run that code.
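
Steps 1 to 4 can be driven by hand from the standard library, which may
help make the separation concrete. This is only a sketch of the conceptual
pipeline, not of what the interpreter does internally. The one-line source
and the "<demo>" filename are made up for the example.

```python
import ast
import io
import tokenize

source = "answer = 6 * 7\n"

# Step 1: tokenize the text into a flat sequence of tokens.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Step 2: parse into an abstract syntax tree.
tree = ast.parse(source)

# Step 3: compile the tree into a code object.
code_obj = compile(tree, "<demo>", "exec")

# Step 4: run the code object in a fresh namespace.
namespace = {}
exec(code_obj, namespace)
print(namespace["answer"])
```

A syntax error can surface at step 1, 2 or 3, which is part of why "find
all the syntax errors" is not a single well-defined job.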

Example:
>>> code = """def f():
...     print("Hello, world", 1>=2)
...     print(Ellipsis, ...)
...     return True
... """
>>>

In step 1, all that happens is that a stream of characters (or bytes,
depending on your point of view) gets broken up into units.

>>> import tokenize
>>> for t in tokenize.tokenize(iter(code.encode().splitlines(keepends=True)).__next__):
...     print(tokenize.tok_name[t.exact_type], t.string)

It's pretty spammy, but you can see how the compiler sees the text.
Note that, at this stage, there's no real difference between the NAME
"def" and the NAME "print" - there are no language keywords yet.
Basically, all you're doing is figuring out punctuation and stuff.

Step 2 is what we'd normally consider "parsing". (It may well happen
concurrently and interleaved with tokenizing, and I'm giving a
simplified and conceptualized pipeline here, but this is broadly what
Python does.) This compares the stream of tokens to the grammar of a
Python program and attempts to figure out what it means. At this
point, the linear stream turns into a recursive syntax tree, but it's
still very abstract.

>>> import ast
>>> ast.dump(ast.parse(code))
"Module(body=[FunctionDef(name='f', args=arguments(posonlyargs=[],
args=[], kwonlyargs=[], kw_defaults=[], defaults=[]),
body=[Expr(value=Call(func=Name(id='print', ctx=Load()),
args=[Constant(value='Hello, world'), Compare(left=Constant(value=1),
ops=[GtE()], comparators=[Constant(value=2)])], keywords=[])),
Expr(value=Call(func=Name(id='print', ctx=Load()),
args=[Name(id='Ellipsis', ctx=Load()), Constant(value=Ellipsis)],
keywords=[])), Return(value=Constant(value=True))],
decorator_list=[])], type_ignores=[])"

(Side point: I would rather like to be able to
pprint.pprint(ast.parse(code)) but that isn't a thing, at least not
currently.)
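
On Python 3.9 and later, ast.dump() accepts an indent argument, which gets close to the pretty-printed view wished for above. A small sketch, with a variable name (pretty) chosen only for illustration:

```python
import ast

code = '''def f():
    print("Hello, world", 1>=2)
    print(Ellipsis, ...)
    return True
'''

# indent= was added to ast.dump() in Python 3.9; it puts each node on its own line.
pretty = ast.dump(ast.parse(code), indent=4)
print(pretty)
```

The exact fields shown vary a little between Python versions, so the output is best read rather than compared byte for byte.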

This is where the vast majority of SyntaxErrors come from. Your code
is a sequence of tokens, but those tokens don't mean anything. It
doesn't make sense to say "print(def f[return)]" even though that'd
tokenize just fine. The trouble with the notion of "keeping going
after finding an error" is that, when you find an error, there are
almost always multiple possible ways that this COULD have been
interpreted differently. It's as likely to give nonsense results as
actually useful ones.

(Note that, in contrast to the tokenization stage, this version
distinguishes between the different types of word. The "def" has
resulted in a FunctionDef node, the "print" is a Name lookup, and both
"..." and "True" have now become Constant nodes - previously, "..."
was a special Ellipsis token, but "True" was just a NAME.)

Step 3: the abstract syntax tree gets compiled into actual runnable
code. This is where that small handful of other SyntaxErrors come
from. With these errors, you absolutely _could_ carry on and report
multiple; but it's not very likely that there'll actually *be* more
than one of them in a file. Here's some perfectly valid AST parsing:

>>> ast.dump(ast.parse("from __future__ import the_past"))
"Module(body=[ImportFrom(module='__future__',
names=[alias(name='the_past')], level=0)], type_ignores=[])"
>>> ast.dump(ast.parse("from __future__ import braces"))
"Module(body=[ImportFrom(module='__future__',
names=[alias(name='braces')], level=0)], type_ignores=[])"

Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Chris Angelico
On Tue, 11 Oct 2022 at 14:13,  wrote:
> With the internet today, we are used to expecting error correction to come
> for free. Do you really need one of every 8 bits to be a parity bit, which
> only catches maybe half of the errors...

Fortunately, we have WAY better schemes than simple parity, which was
only really a thing in the modem days. (Though I would say that
there's still a pretty clear distinction between a good message where
everything has correct parity, and line noise where half of them
don't.) Hamming codes can correct one-bit errors (and detect two-bit
errors) at a price of log2(size)+1 bits of space. Here's a great
rundown:

https://www.youtube.com/watch?v=X8jsijhllIA

There are other schemes too, but Hamming codes are beautifully elegant
and easy to understand.
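
To make the idea concrete, here is a minimal sketch of the classic Hamming(7,4) code: 4 data bits protected by 3 parity bits at positions 1, 2 and 4. The function names are made up for this example, and the sketch leaves out the extra overall parity bit that the two-bit detection mentioned above relies on, so it only corrects single-bit errors.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword (positions 1..7, parity at 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # checks positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # checks positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, syndrome); a nonzero syndrome is the corrected position."""
    bits = list(codeword)
    syndrome = 0
    for position, bit in enumerate(bits, start=1):
        if bit:
            syndrome ^= position  # XOR of the positions of all set bits
    if syndrome:
        bits[syndrome - 1] ^= 1  # flip the single bit the syndrome points at
    return [bits[2], bits[4], bits[5], bits[6]], syndrome


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1  # simulate one bit of line noise at position 5
print(hamming74_decode(word))
```

The syndrome is simply the XOR of the positions of the set bits: it is zero for a valid codeword and equals the position of the damaged bit after a single flip.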

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


RE: What to use for finding as many syntax errors as possible.

2022-10-10 Thread avi.e.gross
I stand corrected Chris, and others, as I pay the sin tax.

Yes, there are many kinds of errors that logically fall into different
categories or phases of evaluation of a program and some can be determined
by a more static analysis almost on a line by line (or "statement" or
"expression", ...)  basis and others need to sort of simulate some things
and look back and forth to detect possible incompatibilities and yet others
can only be detected at run time and likely way more categories depending on
the language.

But when I run the Python interpreter on code, aren't many such phases done
interleaved and at once as various segments of code are parsed and examined
and perhaps compiled into block code and eventually executed? 

So is the OP asking for something other than a Python Interpreter that
normally halts after some kind of error? Tools like a linter may indeed fit
that mold. 

This may limit some of the objections of when an error makes it hard for the
parser to find some recovery point to continue from as no code is being run
and no harmful side effects happen by continuing just an analysis. 

Time to go read some books about modern ways to evaluate a language based on
more mathematical rules including more precisely what is syntax versus ...

Suggestions?

-----Original Message-----
From: Python-list  On
Behalf Of Chris Angelico
Sent: Monday, October 10, 2022 10:42 PM
To: python-list@python.org
Subject: Re: What to use for finding as many syntax errors as possible.

On Tue, 11 Oct 2022 at 13:10,  wrote:
> If the above is:
>
> import grumpy as np
>
> Then what happens if the code tries to find a file named "grumpy" 
> somewhere and cannot locate it and this is considered a syntax error 
> rather than a run-time error for whatever reason? Can you continue 
> when all kinds of functionality is missing and code asking to make a 
> np.array([1,2,3]) clearly fails?

That's not a syntax error. Syntax is VERY specific. It is an error in Python
to attempt to add 1 to "one", it is an error to attempt to look up the
upper() method on None, it is an error to try to use a local variable you
haven't assigned to yet, and it is an error to open a file that doesn't
exist. But not one of these is a *syntax* error.

Syntax errors are detected at the parsing stage, before any code gets run.
The vast majority of syntax errors are grammar errors, where the code
doesn't align with the parseable text of a Python program.
(Non-grammatical parsing errors include using a "nonlocal" statement with a
name that isn't found in any surrounding scope, using "await"
in a non-async function, and attempting to import braces from the
future.)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


RE: What to use for finding as many syntax errors as possible.

2022-10-10 Thread avi.e.gross
Cameron, or OP if you prefer,

I think by now you have seen a suggestion that languages make choices and
highly structured ones can be easier to "recover" from errors and try to
continue than some with way more complex possibilities that look rather
unstructured.

What is the error in code like this?

A,b,c,d = 1,2,

Or is it an error at all?

Many languages have no concept of doing anything like the above and some
tolerate a trailing comma and some set anything not found to some form of
NULL or uninitialized and some ...
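
For Python specifically, the question can be checked directly. A small sketch, using lowercase names and a hypothetical variable (message) to hold the result: the statement parses without complaint, because a trailing comma is legal in a tuple, and only fails when it runs.

```python
import ast

src = "a, b, c, d = 1, 2,"

# Parsing succeeds: the right-hand side is just the tuple (1, 2).
ast.parse(src)

try:
    exec(src, {})
except ValueError as exc:
    # Unpacking two values into four names fails at run time.
    message = str(exc)
else:
    message = None

print(message)
```

So in Python this is an error, but not a syntax error.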

If you look at human language, some are fairly simple and some are way too
organized. But in a way it can make sense. Languages with gender will often
ask you to change the spelling and often how you pronounce things not only
based on whether a noun is male/female or even neuter but also insist you
change the form of verbs or adjectives and so on that in effect give
multiple signals that all have to line up to make a valid and understandable
sentence. Heck, in conversations, people can often leave out parts of  a
sentence such as whether you are talking about "I" or "you" or "she" or "we"
because the rest of the words in the sentence redundantly force only one
choice to be possible. 

So some such annoying grammars (in my opinion) are error
detection/correction codes in disguise. In days before microphones and
speakers, it was common to not hear people well, like on a stage a hundred
feet away with other ambient noises. Missing a word or two might still allow
you to get the point, as other parts of the sentence supplied such redundancy.
Many languages have similar strictures letting you know multiple times if
something is singular or plural. And I think another reason was what I call
stranger detection. People who learn some vocabulary might still not speak
correctly and be identifiable as strangers, as in spies.

Do we need this in the modern age? Who knows! But it makes me prefer some
languages over others albeit other reasons may ...

With the internet today, we are used to expecting error correction to come
for free. Do you really need one of every 8 bits to be a parity bit, which
only catches maybe half of the errors, when the internals of your computer are
relatively error free and even the outside is protected by things like
various protocols used in making and examining packets and demanding some be
sent again if some checksum does not match? Tons of checking is built in so
at your level you rarely think about it. If you get a message, it usually is
either 99.% accurate, or you do not have it shown to you at all. I am
not talking about SPAM but about errors of transmission.

So my analogies are that if you want a very highly structured language that
can recover somewhat from errors, Python may not be it.

And over the years as features are added or modified, the structure tends to
get more complex. And R is not alone. Many surviving languages continue to
evolve and borrow from each other and any program that you run today that
could partially recover and produce pages of possible errors, may blow up
when new features are introduced.

And with UNICODE, the number of possible "errors" in what is placed in code
for languages like Julia that allow them in most places ...


-----Original Message-----
From: Python-list  On
Behalf Of Cameron Simpson
Sent: Monday, October 10, 2022 6:17 PM
To: python-list@python.org
Subject: Re: What to use for finding as many syntax errors as possible.

On 11Oct2022 08:02, Chris Angelico  wrote:
>There's a huge difference between non-fatal errors and syntactic 
>errors. The OP wants the parser to magically skip over a fundamental 
>syntactic error and still parse everything else correctly. That's never 
>going to work perfectly, and the OP is surprised at this.

The OP is not surprised by this, and explicitly expressed awareness that
resuming a parse had potential for "misparsing" further code.

I remain of the opinion that one could resume a parse at the next unindented
line and get reasonable results a lot of the time.

In fact, I expect that one could resume tokenising at almost any line which
didn't seem to be inside a string and often get reasonable results.

I grew up with C and Pascal compilers which would _happily_ produce many
complaints, usually accurate, and all manner of syntactic errors. They
didn't stop at the first syntax error.

All you need in principle is a parser which goes "report syntax error here,
continue assuming ". For Python that might mean "pretend a
missing final colon" or "close open brackets" etc, depending on the context.
If you make conservative implied corrections you can get a reasonable
continued parse, enough to find further syntax errors.

I remember the Pascal compiler in particular had a really good "you missed a
semicolon _back there_" mode which was almost alwa

Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Chris Angelico
On Tue, 11 Oct 2022 at 13:10,  wrote:
> If the above is:
>
> import grumpy as np
>
> Then what happens if the code tries to find a file named "grumpy" somewhere
> and cannot locate it and this is considered a syntax error rather than a
> run-time error for whatever reason? Can you continue when all kinds of
> functionality is missing and code asking to make a np.array([1,2,3]) clearly
> fails?

That's not a syntax error. Syntax is VERY specific. It is an error in
Python to attempt to add 1 to "one", it is an error to attempt to look
up the upper() method on None, it is an error to try to use a local
variable you haven't assigned to yet, and it is an error to open a
file that doesn't exist. But not one of these is a *syntax* error.
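
A quick way to see the distinction is to parse each of those four mistakes and then run it. This is only a sketch; the snippets, the results dictionary and the file path are invented for the demonstration.

```python
import ast

snippets = {
    "TypeError": '1 + "one"',
    "AttributeError": "None.upper()",
    "UnboundLocalError": "def f():\n    print(x)\n    x = 1\nf()",
    "FileNotFoundError": 'open("/nonexistent-dir-for-demo/nothing.txt")',
}

results = {}
for expected, src in snippets.items():
    ast.parse(src)  # every snippet is syntactically valid
    try:
        exec(compile(src, "<demo>", "exec"), {})
    except Exception as exc:
        # The failure only shows up once the code actually runs.
        results[expected] = type(exc).__name__

print(results)
```

None of the snippets trips the parser; each one raises its exception only when executed.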

Syntax errors are detected at the parsing stage, before any code gets
run.  The vast majority of syntax errors are grammar errors, where the
code doesn't align with the parseable text of a Python program.
(Non-grammatical parsing errors include using a "nonlocal" statement
with a name that isn't found in any surrounding scope, using "await"
in a non-async function, and attempting to import braces from the
future.)
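
Those three non-grammatical cases can be told apart from grammar errors by splitting the work in two: ast.parse() accepts them, and the SyntaxError only appears when the tree is handed to compile(). A sketch, with the list names (cases, messages) being illustrative:

```python
import ast

cases = [
    "def f():\n    nonlocal x\n",        # no enclosing scope binds x
    "def f():\n    await q\n",           # await in a non-async function
    "from __future__ import braces\n",   # not a real future feature
]

messages = []
for src in cases:
    tree = ast.parse(src)  # the grammar is satisfied, so this succeeds
    try:
        compile(tree, "<demo>", "exec")
    except SyntaxError as exc:
        # Raised by the compiler stage, not by the parser.
        messages.append(exc.msg)

for msg in messages:
    print(msg)
```

The message wording differs between Python versions, so it is printed here rather than compared.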

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


RE: What to use for finding as many syntax errors as possible.

2022-10-10 Thread avi.e.gross
Michael,

A reasonable question. Python lets you initialize variables but has no
explicit declarations. Languages differ and I juggle attributes of many in
my mind and am reacting to the original question NOT about whether and how
Python should report many possible errors all at once but how ANY language
can be expected to do this well. Many others do have a variable declaration
phase or an optional declaration or perhaps just a need to declare a
function prototype so it can be used by others even if the formal function
creation will happen later in the code.

But what I meant in a Python context was something like this:

Wronk = who cares # this should fail
...
if (Wronk > 5): ...
...
Wronger = Wronk + 1
...
X = minimum(Wronk, Wronger, 12)

The first line does not parse well so you have an error. But in any case as
the line makes no sense, Wronk is not initialized to anything. Later code
may use it  in various ways and some of those may be seen as errors for an
assortment of reasons, then at one point the code does provide a value for
Wronk and suddenly code beyond that has no seeming errors. The above
examples are not meant to be real but just give a taste that programs with
holes in them for any reason may not be consistent. The only relatively
guaranteed test for sanity has to start at the top and encounter no errors
or missing parts based on anything such as I/O errors.

And I suggest there are some things sort of declared in python such as:

import numpy as np

Yes, that brings in code from a module if it works and initializes a
variable called np to sort of point at the module or its namespace or
whatever, depending on the language. It is an assignment but also a way to
let the program know things. If the above is:

import grumpy as np

Then what happens if the code tries to find a file named "grumpy" somewhere
and cannot locate it and this is considered a syntax error rather than a
run-time error for whatever reason? Can you continue when all kinds of
functionality is missing and code asking to make a np.array([1,2,3]) clearly
fails?

Many of us here are talking past each other.

Yes, it would be nice to get lots of info and arguably we may eventually
have machine-learning or AI programs a bit more like SPAM detectors that
look for patterns commonly found and try to fix your program from common
errors or at least do a temporary patch so they can continue searching for
more errors. This could result in the best case in guessing right every
time. If you allowed it to actually fix your code, it might be like people
who let their spelling be corrected and do not proofread properly and send
out something embarrassing or just plain wrong!

And it will compile or be interpreted without complaint albeit not do
exactly what it is supposed to!




-----Original Message-----
From: Python-list  On
Behalf Of Michael F. Stemper
Sent: Monday, October 10, 2022 9:22 AM
To: python-list@python.org
Subject: Re: What to use for finding as many syntax errors as possible.

On 09/10/2022 10.49, Avi Gross wrote:
> Anton
> 
> There likely are such programs out there but are there universal 
> agreements on how to figure out when a new safe zone of code starts 
> where error testing can begin?
> 
> For example a file full of function definitions might find an error in 
> function 1 and try to find the end of that function and resume 
> checking the next function.  But what if a function defines local
functions within it?
> What if the mistake in one line of code could still allow checking the 
> next line rather than skipping it all?
> 
> My guess is that finding 100 errors might turn out to be misleading. 
> If you fix just the first, many others would go away. If you spell a 
> variable name wrong when declaring it, a dozen uses of the right name may
cause errors.
> Should you fix the first or change all later ones?

How does one declare a variable in python? Sometimes it'd be nice to be able
to have declarations and any undeclared variable be flagged.

When I was writing F77 for a living, I'd (temporarily) put:
   IMPLICIT CHARACTER*3
at the beginning of a program or subroutine that I was modifying, in order
to have any typos flagged.

I'd love it if there was something similar that I could do in python.

--
Michael F. Stemper
87.3% of all statistics are made up by the person giving them.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Thomas Passin

On 10/10/2022 9:21 AM, Michael F. Stemper wrote:

On 09/10/2022 10.49, Avi Gross wrote:

Anton

There likely are such programs out there but are there universal agreements
on how to figure out when a new safe zone of code starts where error
testing can begin?

For example a file full of function definitions might find an error in
function 1 and try to find the end of that function and resume checking the
next function.  But what if a function defines local functions within it?
What if the mistake in one line of code could still allow checking the next
line rather than skipping it all?

My guess is that finding 100 errors might turn out to be misleading. If you
fix just the first, many others would go away. If you spell a variable name
wrong when declaring it, a dozen uses of the right name may cause errors.
Should you fix the first or change all later ones?


How does one declare a variable in python? Sometimes it'd be nice to
be able to have declarations and any undeclared variable be flagged.

When I was writing F77 for a living, I'd (temporarily) put:
   IMPLICIT CHARACTER*3
at the beginning of a program or subroutine that I was modifying,
in order to have any typos flagged.

I'd love it if there was something similar that I could do in python.


The Leo editor (https://github.com/leo-editor/leo-editor) will notify 
you of undeclared variables (and some syntax errors) each time you save 
your (Python) file.


--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Chris Angelico
On Tue, 11 Oct 2022 at 09:18, Cameron Simpson  wrote:
>
> On 11Oct2022 08:02, Chris Angelico  wrote:
> >There's a huge difference between non-fatal errors and syntactic
> >errors. The OP wants the parser to magically skip over a fundamental
> >syntactic error and still parse everything else correctly. That's
> >never going to work perfectly, and the OP is surprised at this.
>
> The OP is not surprised by this, and explicitly expressed awareness that
> resuming a parse had potential for "misparsing" further code.
>
> I remain of the opinion that one could resume a parse at the next
> unindented line and get reasonable results a lot of the time.

The next line at the same indentation level as the line with the
error, or the next flush-left line? Either way, there's a weird and
arbitrary gap before you start parsing again, and you still have no
indication of what could make sense. Consider:

if condition # no colon
    code
else:
    code

To actually "restart" parsing, you have to make a guess of some sort.
Maybe you can figure out what the user meant to do, and parse
accordingly; but if that's the case, keep going immediately, don't
wait for an unindented line. If you wait for a blank line followed by
an unindented line, that might help with a notion of "next logical
unit of code", but it's very much dependent on the coding style, and
if you have a codebase that's so full of syntax errors that you
actually want to see more than one, you probably don't have a codebase
with pristine and beautiful code layout.

> In fact, I expect that one could resume tokenising at almost any line
> which didn't seem to be inside a string and often get reasonable
> results.

"Seem to be"? On what basis?

> I grew up with C and Pascal compilers which would _happily_ produce many
> complaints, usually accurate, and all manner of syntactic errors. They
> didn't stop at the first syntax error.

Yes, because they work with a much simpler grammar. But even then,
most syntactic errors (again, this is not to be confused with semantic
errors - if you say "char *x = 1.234;" then there's no parsing
ambiguity but it's not going to compile) cause a fair degree of
nonsense afterwards.

The waters are a bit muddied by some things being called "syntax
errors" when they're actually nothing at all to do with the parser.
For instance:

>>> def f():
...     await q
...
  File "", line 2
SyntaxError: 'await' outside async function

This is not what I'm talking about; there's no parsing ambiguity here,
and therefore no difficulty whatsoever in carrying on with the
parsing. You could ast.parse() this code without an error. But
resuming after a parsing error is fundamentally difficult, impossible
without guesswork.

> All you need in principle is a parser which goes "report syntax error
> here, continue assuming ". For Python that might mean
> "pretend a missing final colon" or "close open brackets" etc, depending
> on the context. If you make conservative implied corrections you can get
> a reasonable continued parse, enough to find further syntax errors.

And, more likely, you'll generate a lot of nonsense. Take something like this:

items = [
    item[1],
    item2],
    item[3],
]

As a human, you can easily see what the problem is. Try teaching a
parser how to handle this. Most likely, you'll generate a spurious
error - maybe the indentation, maybe the intended end of the list -
but there's really only one error here. Reporting multiple errors
isn't actually going to be at all helpful.
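
What CPython itself makes of this snippet can be checked with compile(). A small sketch, assuming the list body was indented uniformly in the original post; the variable names are illustrative:

```python
snippet = (
    "items = [\n"
    "    item[1],\n"
    "    item2],\n"
    "    item[3],\n"
    "]\n"
)

try:
    compile(snippet, "<demo>", "exec")
    reported = None
except SyntaxError as exc:
    # IndentationError is a subclass of SyntaxError, so it is caught here too.
    reported = (type(exc).__name__, exc.lineno, exc.msg)

print(reported)
```

Exactly one error comes back, and the line it names need not be the line with the typo.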

> I remember the Pascal compiler in particular had a really good "you
> missed a semicolon _back there_" mode which was almost always correct, a
> nice boon when correcting mistakes.
>

Ahh yes. Design a language with strict syntactic requirements, and
it's not too hard to find where the programmer has omitted them. Thing
is Python just doesn't HAVE those semicolons. Let's say that a
variant Python required you to put a U+251C ├ at the start of every
statement, and U+2524 ┤ at the end of the statement. A whole lot of
classes of error would be extremely easy to notice and correct, and
thus you could resume parsing; but that isn't benefiting the
programmer any. When you don't have that kind of information
duplication, it's a lot harder to figure out how to cheat the fix and
go back to parsing.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Cameron Simpson

On 09/10/2022 10.49, Avi Gross wrote:
My guess is that finding 100 errors might turn out to be misleading. If you
fix just the first, many others would go away. If you spell a variable name
wrong when declaring it, a dozen uses of the right name may cause errors.
Should you fix the first or change all later ones?


Just to this, these are semantic errors, not syntax errors. Linters do 
an ok job of spotting these. Antoon is after _syntax errors_.


On 10Oct2022 08:21, Michael F. Stemper  wrote:

How does one declare a variable in python? Sometimes it'd be nice to
be able to have declarations and any undeclared variable be flagged.


Linters do pretty well at this. They can trace names and their use 
compared to their first definition/assignment (often - there are of 
course some constructs which are correct but unclear to a static 
analysis - certainly one of my linters occasionally says "possible 
undefine use" to me because there may be a path to use before set). This 
is particularly handy for typos, which often make for "use before set" 
or "set and not used".
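
As a rough illustration of that kind of name tracing (and not how pyflakes or pylint actually work), here is a deliberately crude sketch. The helper undefined_names is hypothetical; it ignores scopes and control flow entirely and just reports names that are read somewhere but never bound anywhere in the file.

```python
import ast
import builtins


def undefined_names(source):
    """Return sorted (lineno, name) pairs for names that are read but never bound."""
    tree = ast.parse(source)
    bound = set(dir(builtins))
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, (ast.Store, ast.Del)):
            bound.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.alias):
            # "import os.path" binds "os"; "import numpy as np" binds "np".
            bound.add((node.asname or node.name).split(".")[0])
    missing = {
        (node.lineno, node.id)
        for node in ast.walk(tree)
        if isinstance(node, ast.Name)
        and isinstance(node.ctx, ast.Load)
        and node.id not in bound
    }
    return sorted(missing)


sample = "total = 0\nfor item in items:\n    totl = total + item\nprint(totla)\n"
print(undefined_names(sample))
```

Real linters also track scopes, ordering, exception handler names and much more; this only shows why a typo tends to surface as a name with no definition.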



I'd love it if there was something similar that I could do in python.


Have you used any lint programmes? My "lint" script runs pyflakes and 
pylint.


Cheers,
Cameron Simpson 
--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Cameron Simpson

On 11Oct2022 08:02, Chris Angelico  wrote:

There's a huge difference between non-fatal errors and syntactic
errors. The OP wants the parser to magically skip over a fundamental
syntactic error and still parse everything else correctly. That's
never going to work perfectly, and the OP is surprised at this.


The OP is not surprised by this, and explicitly expressed awareness that 
resuming a parse had potential for "misparsing" further code.


I remain of the opinion that one could resume a parse at the next 
unindented line and get reasonable results a lot of the time.
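
The idea can be tried with nothing more than the built-in compile(). This is a rough sketch and not an existing tool: find_syntax_errors is a hypothetical helper, and restarting at a flush-left line such as "else:" or "except:" will produce a spurious report of its own.

```python
def find_syntax_errors(source, filename="<sketch>"):
    """Collect (line, message) pairs, restarting after each error at the
    next non-blank line that starts in column 0."""
    lines = source.splitlines(keepends=True)
    errors = []
    start = 0  # index of the first line of the part still to be checked
    while start < len(lines):
        chunk = "".join(lines[start:])
        try:
            compile(chunk, filename, "exec")
            break  # the rest of the file compiles cleanly
        except SyntaxError as exc:
            # exc.lineno counts from the start of the chunk, not of the file.
            bad = start + (exc.lineno or 1) - 1
            errors.append((bad + 1, exc.msg))
            # Skip ahead to the next unindented, non-blank line.
            start = bad + 1
            while start < len(lines) and (
                not lines[start].strip() or lines[start][0] in " \t"
            ):
                start += 1
    return errors


sample = "x = 1 +\ny = 2\ndef f()\n    return 3\nz = 4\n"
for line, message in find_syntax_errors(sample):
    print(line, message)
```

Because every restart throws away the context before it, the later reports are guesses; they are most trustworthy when the file is a flat sequence of top-level definitions.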


In fact, I expect that one could resume tokenising at almost any line 
which didn't seem to be inside a string and often get reasonable 
results.


I grew up with C and Pascal compilers which would _happily_ produce many 
complaints, usually accurate, and all manner of syntactic errors. They 
didn't stop at the first syntax error.


All you need in principle is a parser which goes "report syntax error 
here, continue assuming ". For Python that might mean 
"pretend a missing final colon" or "close open brackets" etc, depending 
on the context. If you make conservative implied corrections you can get 
a reasonable continued parse, enough to find further syntax errors.


I remember the Pascal compiler in particular had a really good "you 
missed a semicolon _back there_" mode which was almost always correct, a 
nice boon when correcting mistakes.


Cheers,
Cameron Simpson 
--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Robert Latest via Python-list
Antoon Pardon wrote:
> I would like a tool that tries to find as many syntax errors as possible 
> in a python file.

I'm puzzled as to when such a tool would be needed. How many syntax errors can
you realistically put into a single Python file before compiling it for the
first time?

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Robert Latest via Python-list
Michael F. Stemper wrote:
> How does one declare a variable in python? Sometimes it'd be nice to
> be able to have declarations and any undeclared variable be flagged.

To my knowledge, the closest to that is using __slots__ in class definitions.
Many a time have I assigned to misspelled class members until I discovered
__slots__.
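
A minimal sketch of what that looks like in practice (the Point class and the caught variable are made up for the example):

```python
class Point:
    # Only these attribute names may be assigned on instances.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
try:
    p.z = 3  # misspelled or unplanned attribute
except AttributeError as exc:
    caught = str(exc)
else:
    caught = None

print(caught)
```

This only guards instance attributes, and only at run time; ordinary local and global names are not affected, and a subclass that does not define __slots__ gets a normal __dict__ back.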
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Robert Latest via Python-list
 wrote:
> Cameron,
>
> Your suggestion makes me shudder!

Me, too

> Removing all earlier lines of code is often guaranteed to generate errors as
> variables you are using are not declared or initiated, modules are not
> imported and so on.

all of which aren't syntax errors, so the method should still work. Ugly as
hell though. I can't think of a reason to want to find multiple syntax errors
in a file.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Michael F. Stemper

On 09/10/2022 10.49, Avi Gross wrote:

Anton

There likely are such programs out there but are there universal agreements
on how to figure out when a new safe zone of code starts where error
testing can begin?

For example a file full of function definitions might find an error in
function 1 and try to find the end of that function and resume checking the
next function.  But what if a function defines local functions within it?
What if the mistake in one line of code could still allow checking the next
line rather than skipping it all?

My guess is that finding 100 errors might turn out to be misleading. If you
fix just the first, many others would go away. If you spell a variable name
wrong when declaring it, a dozen uses of the right name may cause errors.
Should you fix the first or change all later ones?


How does one declare a variable in python? Sometimes it'd be nice to
be able to have declarations and any undeclared variable be flagged.

When I was writing F77 for a living, I'd (temporarily) put:
  IMPLICIT CHARACTER*3
at the beginning of a program or subroutine that I was modifying,
in order to have any typos flagged.

I'd love it if there was something similar that I could do in python.

--
Michael F. Stemper
87.3% of all statistics are made up by the person giving them.
--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Chris Angelico
On Tue, 11 Oct 2022 at 06:34, Peter J. Holzer  wrote:
>
> On 2022-10-10 09:23:27 +1100, Chris Angelico wrote:
> > On Mon, 10 Oct 2022 at 06:50, Antoon Pardon  wrote:
> > > I just want a parser that doesn't give up on encountering the first syntax
> > > error. Maybe do some semantic checking like checking the number of 
> > > parameters.
> >
> > That doesn't make sense though.
>
> I think you disagree with most compiler authors here.
>
> > It's one thing to keep going after finding a non-syntactic error, but
> > an error of syntax *by definition* makes parsing the rest of the file
> > dubious.
>
> Dubious but still useful.

There's a huge difference between non-fatal errors and syntactic
errors. The OP wants the parser to magically skip over a fundamental
syntactic error and still parse everything else correctly. That's
never going to work perfectly, and the OP is surprised at this.

> > What would it even *mean* to not give up?
>
> Read the blog post on Lezer for some ideas:
> https://marijnhaverbeke.nl/blog/lezer.html
>
> This is in the context of an editor.

Incidentally, that's actually where I would expect to see that kind of
feature show up the most - syntax highlighters will often be designed
to "carry on, somehow" after a syntax error, even though it often
won't make any sense (just look at what happens to your code
highlighting when you omit a quote character). It still won't always
be any use, but you do see *some* attempt at it.

But if the OP would be satisfied with that, I rather doubt that this
thread would even have happened. Unless, of course, the OP still lives
in the dark ages when no text editor available had any suitable
features for code highlighting.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Peter J. Holzer
On 2022-10-10 09:23:27 +1100, Chris Angelico wrote:
> On Mon, 10 Oct 2022 at 06:50, Antoon Pardon  wrote:
> > I just want a parser that doesn't give up on encountering the first syntax
> > error. Maybe do some semantic checking like checking the number of 
> > parameters.
> 
> That doesn't make sense though.

I think you disagree with most compiler authors here.

> It's one thing to keep going after finding a non-syntactic error, but
> an error of syntax *by definition* makes parsing the rest of the file
> dubious.

Dubious but still useful.

> What would it even *mean* to not give up?

Read the blog post on Lezer for some ideas:
https://marijnhaverbeke.nl/blog/lezer.html

This is in the context of an editor. But the same problem applies to
compilers. It's not very important if a compile run only takes a second
or so but even then it might be helpful to see several error messages
and not only one at a time. It becomes much more important as compile
times get longer (as an extreme[1] example, when I worked on a largeish
cobol program in the 1980s, compiling the thing took about half an hour.
I really wanted to fix *everything* before starting the compiler again.)

Marijn isn't the only person who revisited this problem recently[2].
I've read a few other blog posts and papers on that topic at about the
same time.

hp

[1] Yes, there are programs where a full compile takes much longer than
that. But you can usually get away with recompiling only a small
part, so you don't have to wait that long during normal development.
That COBOL compiler couldn't do that.

[2] "Recently" means "in the last 10 years or so".

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Cameron Simpson

On 10Oct2022 09:04, Antoon Pardon wrote:
>> It is easy to get the syntax right before submitting to such a
>> pipeline.  I usually run a linter on my code for serious commits, and
>> I've got a `lint1` alias which basically runs the short fast flavour of
>> that which does a syntax check and the very fast less thorough lint
>> phase.
>
> If you have a linter that doesn't quit after the first syntax error,
> please provide a link. I already tried pylint and it also quits after
> the first syntax error.


I don't have such a linter. I did outline an approach for you to write 
one of your own by wrapping an existing parser program.


I have a personal "lint" script which runs a few linters. The first 
check is `py_compile` which quits at the first syntax error. The other 
linters are not even tried if that fails.
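
A script of that shape might look roughly like the sketch below. The
linter commands in `LINTERS` and the function name `lint` are assumptions
made for illustration; the post does not say which linters are actually
run.

```python
import py_compile
import subprocess
import sys

# Assumed linter commands; substitute whatever linters are installed.
LINTERS = [
    [sys.executable, "-m", "pyflakes"],
    [sys.executable, "-m", "pylint", "--errors-only"],
]


def lint(path, linters=LINTERS):
    """Return True if `path` compiles and every linter exits with status 0."""
    try:
        # The syntax check comes first; it stops at the first syntax error.
        py_compile.compile(path, doraise=True)
    except py_compile.PyCompileError as exc:
        print(exc.msg, file=sys.stderr)
        return False  # the other linters are not even tried
    ok = True
    for cmd in linters:
        # Each linter is run as a subprocess with the file as last argument.
        if subprocess.run([*cmd, path]).returncode != 0:
            ok = False
    return ok
```

Note that `py_compile.compile` also writes a bytecode file next to the
source as a side effect, which is usually harmless for a pre-commit check.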


I do not know what your editing environment is; I'd have thought that
some IDEs should make the first syntax error very obvious and easy to go
to, and give an obvious indication that the file as a whole is syntactically
good/bad. If you have such, between them you could fairly easily resolve
syntax errors rapidly, perhaps rapidly enough to make up for a
stop-at-the-first-fail syntax check.


Cheers,
Cameron Simpson 
--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-10 Thread Antoon Pardon



On 10/10/2022 at 00:45, Cameron Simpson wrote:
> On 09Oct2022 21:46, Antoon Pardon wrote:
>>> Is it that onerous to fix one thing and run it again? It was once
>>> when you handed in punch cards and waited a day or on very busy
>>> machines.
>>
>> Yes I find it onerous, especially since I have a pipeline with unit
>> tests and other tools that all have to redo their work each time a bug
>> is corrected.
>
> It is easy to get the syntax right before submitting to such a
> pipeline.  I usually run a linter on my code for serious commits, and
> I've got a `lint1` alias which basically runs the short fast flavour of
> that which does a syntax check and the very fast less thorough lint phase.

If you have a linter that doesn't quit after the first syntax error,
please provide a link. I already tried pylint and it also quits after
the first syntax error.


--
Antoon Pardon
--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Cameron Simpson

On 10Oct2022 00:41, avi.e.gr...@gmail.com wrote:
> Your suggestion makes me shudder!

And fair enough too. I don't do this for me, I'm just suggesting an
approach which might bring something to Antoon's objective.

> Removing all earlier lines of code is often guaranteed to generate errors as
> variables you are using are not declared or initiated, modules are not
> imported and so on.

Antoon's interested in syntax errors.

> Removing just the line or three where the previous error happened would also
> have a good chance of invalidating something.

Doubtless. He accepts that any such resume-the-parse can bring
misleading error messages. Antoon is not expecting magic, just getting
several complaints instead of just the first syntax error.


I must admit I sympathise a bit, as one of my own major irks is command 
line tools which moan about the first bad option instead of noting it 
and moving on to complain about other things as well, then quitting 
after the command line parse. Pure laziness a lot of the time IMO; I've 
done it myself, but do like to make multiple complaints when it's 
feasible.


Cheers,
Cameron Simpson 
--
https://mail.python.org/mailman/listinfo/python-list


RE: What to use for finding as many syntax errors as possible.

2022-10-09 Thread avi.e.gross
Cameron,

Your suggestion makes me shudder!

Removing all earlier lines of code is often guaranteed to generate errors as
variables you are using are not declared or initiated, modules are not
imported and so on.

Removing just the line or three where the previous error happened would also
have a good chance of invalidating something.

Someone who really wants to be able to isolate large parts of their code so
that an error in one does not compromise lots of remaining code, might
build their code in small units on the level of single functions per file
and do lots of imports. They can then ask for all the files to be
pseudo-compiled to byte-code and that might provide lots of errors to look
at in one pass.
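
As a rough illustration of that idea, the sketch below compiles each file
separately and collects one syntax error per file. The name `check_files`
is hypothetical, and this is only one way to do it; the standard library's
`compileall` module offers something similar while also writing bytecode
files.

```python
from pathlib import Path


def check_files(paths):
    """Compile each file on its own and collect one syntax error per file."""
    problems = []
    for path in paths:
        # Read bytes so that an encoding declaration in the file is honoured.
        source = Path(path).read_bytes()
        try:
            compile(source, str(path), "exec")
        except SyntaxError as exc:
            # A broken file does not stop the others from being checked.
            problems.append((str(path), exc.lineno, exc.msg))
    return problems
```

Each file still yields at most its first syntax error, so the gain comes
only from how finely the code is split up.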

But asking for a one-file version to find errors and somehow go past them
and look for more is more daunting but of course can be done with partial
accuracy and usefulness at best.

As an analogy, if tolerated, think of a spell-checker on a document that can
find oodles of words spelled wrong. Unfortunately, a spell corrector can
drive us nuts if it knows little about context. If it sees a word like
"reid" should it just change it to "read" or "red" or perhaps "reed" or look
to see if the real problem is it is supposed to be unified (no space) with a
word before or after? Will it know if the word appears in a context where a
language like Latin or French or German or Hungarian is being quoted and
perhaps it is spelled right, or if wrong, has other more likely corrections?

Now if you add a grammar detector, and it knows you are looking for an
adjective or a verb or a noun, it may do better.

I use Google translate quite  a bit as a tool as I often have to type in
various languages and it provides a handy keyboard or lets me check if I
used the right grammar especially in languages with silly ideas that objects
can have 2 or even three genders. So putting in phrases like "this xyz" can
result in language-specific text that tells me if it is masculine or
feminine or perhaps neuter. But the reason I mention it is how often it is
WRONG. I mean many languages have multiple words that are spelled the same
but used and pronounced differently in various contexts. The English word
"read" can sound like reed or like red so past tense sounds different as in
I read that book last week versus please read it to me now. But some
languages such as Hebrew which generally may not show the vowels, can get
totally confused in this program as humans often need lots of context to
figure out whether the current short word is in a context where it means
"you" (feminine and singular) and is pronounced "aht", or it is a way of
showing that what follows is a direct object, loosely means "the" in a
redundant way, and is pronounced "eht". Quite a few words have three or more possible
ways to pronounce the same letters and without vowel guides need context and
sometimes some spreadsheet-like ingenuity as multiple other words are also
in limbo and once resolved can impact what other words may now mean.
Obviously adding back the vowels makes things clear so people who are used
to seeing books written that old way can get hopelessly lost reading a
modern newspaper.

End of digression, just assume I could have gone on for many pages
describing my annoyances at what Google translate does to many other
languages that show the imperfections in what is really a great and powerful
tool.

Well parsing any program in most languages can be equally complex and
require lots of context. For example, you can often use the same identifier
to be the name of a regular variable or the name of a function and sometimes
other things such as the name of a module. They can often be disambiguated
in context. Perhaps the same name followed by parentheses should be a
function call while a name followed by :: or ::: might in that language
require it to be the name of a module/package. If followed by [ it might
need to be something indexable such as an array or list and so on. So say
there is an error in the variable. Can the interpreter or linter figure out
what the error is and almost repair it? Can it see a variable name like
"alpXha" and note there is no such identifier in the current namespace but
there is one called "alpha" that might be the one without the X? But what if
what is missing is an open paren or maybe the matching close paren? Does it
know if the problem is a bad variable name or a bad function invocation or
one of many other possible problems? Code with a random blemish is often not
easily figured out. If I type the name of a function without parentheses, it
could be an attempt to call the function with no arguments (an error though
in many languages) or it could be I want to pass the function itself as an
argument in functional programming. But if I have another variable of type
array, might it not be parentheses missing but square br
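
The "alpXha" versus "alpha" guess mentioned above can be approximated with
the standard library's `difflib`. This is only a sketch of the guessing
step; `suggest_name` and the sample namespace are made up for illustration,
and a real tool would still not know whether the name is the actual
problem.

```python
import difflib


def suggest_name(unknown, known_names):
    """Return known identifiers that look like a misspelling of `unknown`."""
    # get_close_matches ranks candidates by similarity and drops weak ones.
    return difflib.get_close_matches(unknown, known_names, n=3, cutoff=0.6)


namespace = ["alpha", "beta", "gamma"]  # hypothetical names in scope
print(suggest_name("alpXha", namespace))
```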

Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Thomas Passin



On 10/9/2022 1:29 PM, Peter J. Holzer wrote:
> On 2022-10-09 12:59:09 -0400, Thomas Passin wrote:
>> https://stackoverflow.com/questions/4284313/how-can-i-check-the-syntax-of-python-script-without-executing-it
>>
>> People seemed especially enthusiastic about the one-liner from jmd_dk.
>
> I don't think that one-liner solves Antoon's requirement of continuing
> after an error. It uses just the normal python parser so it has exactly
> the same limitations.

Yes, of course. Interesting, though. py_compile tends to be what I use 
for a quick check. I linked to the page mostly for the other 
possibilities, as you mentioned below:


> Some of the mentioned tools may do what Antoon wants, though.
>
>  hp
>
>

--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Cameron Simpson

On 09Oct2022 21:46, Antoon Pardon wrote:
>> Is it that onerous to fix one thing and run it again? It was once when
>> you handed in punch cards and waited a day or on very busy machines.
>
> Yes I find it onerous, especially since I have a pipeline with unit tests
> and other tools that all have to redo their work each time a bug is
> corrected.

It is easy to get the syntax right before submitting to such a pipeline.
I usually run a linter on my code for serious commits, and I've got a
`lint1` alias which basically runs the short fast flavour of that which
does a syntax check and the very fast less thorough lint phase.


I say this just to ease your write/run-tests cycle.

Regarding your main request, had you considered writing your own wrapper 
tool? Something which ran something like:


python -We:invalid -m py_compile your_python_file.py

If there's an error, report it, then make a new file commencing with the
next unindented line after the error, with all preceding lines
commented out (to keep the line numbers the same). Then run the check
again. Repeat until the file's empty or there are no errors.


This doesn't sound very complex.
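
A minimal sketch of the loop described above, under some assumptions: it
calls the built-in `compile()` in-process instead of spawning
`python -m py_compile`, it works on a string, not a file on disk, and the
names `find_syntax_errors` and `_is_unindented` are invented for the
example. Lines before the resume point are replaced by `#` so the line
numbers stay the same.

```python
def _is_unindented(line):
    """True for a non-empty line that starts in column 0."""
    return bool(line) and not line[0].isspace()


def find_syntax_errors(source, filename="<string>", max_errors=50):
    """Report several syntax errors by re-checking after each one."""
    lines = source.splitlines()
    errors = []
    start = 0  # index of the first line not yet commented out
    while start < len(lines) and len(errors) < max_errors:
        try:
            compile("\n".join(lines) + "\n", filename, "exec")
        except SyntaxError as exc:
            # Clamp the reported line so that every pass makes progress.
            lineno = min(max(exc.lineno or 0, start + 1), len(lines))
            errors.append((lineno, exc.msg))
            # Look for the next unindented line after the error line.
            resume = lineno
            while resume < len(lines) and not _is_unindented(lines[resume]):
                resume += 1
            # Comment out everything before it, keeping the line count.
            for i in range(start, resume):
                lines[i] = "#"
            start = resume
        else:
            break  # the remainder compiles cleanly
    return errors
```

Resuming at an unindented `else:` or a closing bracket will produce
spurious reports, so later entries in the result are hints, not facts.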

Cheers,
Cameron Simpson 
--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Chris Angelico
On Mon, 10 Oct 2022 at 06:50, Antoon Pardon  wrote:
> I just want a parser that doesn't give up on encountering the first syntax
> error. Maybe do some semantic checking like checking the number of parameters.

That doesn't make sense though. It's one thing to keep going after
finding a non-syntactic error, but an error of syntax *by definition*
makes parsing the rest of the file dubious. What would it even *mean*
to not give up? How should it interpret the following lines of code?
All it can do is report the error.

You know, if you'd not made this thread, the time you saved would have
been enough for quite a few iterations of "fix one syntactic error,
run it again to find the next".

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Karsten Hilbert
On Sun, Oct 09, 2022 at 07:51:12PM +0200, Antoon Pardon wrote:

> >But the point is: you can't (there is no way to) be sure the
> >9+ errors really are errors.
> >
> >Unless you further constrict what sorts of errors you are
> >looking for and what margin of error or leeway for false
> >positives you want to allow.
>
> Look, when I was at the university we had to program in Pascal and
> the compiler we used continued parsing until the end. Sure there
> were times that after a number of reported errors the number of
> false positives became so high it was useless trying to find the
> remaining true ones, but it still was more efficient to correct the
> obvious ones, than to only correct the first one.
>
> I don't need to be sure. Even the occasional wrong correction
> is probably still more efficient than quitting after the first
> syntax error.

A-ha, so you further defined your context.

Under which I can agree to the objective :-)

Best,
Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Barry


> On 9 Oct 2022, at 18:54, Antoon Pardon  wrote:
> 
> 
> 
> On 9/10/2022 at 19:23, Karsten Hilbert wrote:
>> On Sun, Oct 09, 2022 at 06:59:36PM +0200, Antoon Pardon wrote:
>> 
>>> On 9/10/2022 at 17:49, Avi Gross wrote:
>>>> My guess is that finding 100 errors might turn out to be misleading. If you
>>>> fix just the first, many others would go away.
>>> At this moment I would prefer a tool that reported 100 errors, which would
>>> allow me to easily correct 10 real errors, over the python strategy which
>>> quits after having found one syntax error.
>> But the point is: you can't (there is no way to) be sure the
>> 9+ errors really are errors.
>> 
>> Unless you further constrict what sorts of errors you are
>> looking for and what margin of error or leeway for false
>> positives you want to allow.
> 
> Look when I was at the university we had to program in Pascal and
> the compiler we used continued parsing until the end. Sure there
> were times that after a number of reported errors the number of
> false positives became so high it was useless trying to find the
> remaining true ones, but it still was more efficient to correct the
> obvious ones, than to only correct the first one.

If it’s very fast to syntax check then one at a time is fine.
Python is very fast to syntax check, so I personally do not need the
multi-error version.
My editor has syntax check on a key and it’s instant to drop me a syntax error.

Barry

> 
> I don't need to be sure. Even the occasional wrong correction
> is probably still more efficient than quitting after the first
> syntax error.
> 
> -- 
> Antoon.
> -- 
> https://mail.python.org/mailman/listinfo/python-list
> 

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Antoon Pardon




On 9/10/2022 at 21:44, Avi Gross wrote:
> But an error like setting the size of a fixed length data structure to the
> right size may result in oodles of errors about being out of range that
> magically get fixed by one change. Sometimes too much info just gives you a
> headache.

So? The user of such a tool doesn't need to go through all the provided
information. If, after correcting a few errors, the user finds that the rest
of the information gives him a headache, he can just ignore all that and
run a new iteration.

--
Antoon Pardon
--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Peter J. Holzer
On 2022-10-09 15:18:19 -0400, Avi Gross wrote:
> Antoon,  it may also relate to an interpreter versus compiler issue.
> 
> Something like a compiler for C does not do anything except write code in
> an assembly language. It can choose to keep going after an error and start
> looking some more from a less stable place.
> 
> Interpreters for Python have to catch interrupts as they go and often run
> code in small batches. Continuing to evaluate after an error could cause
> weird effects.

I don't think this is really an issue. A python file is completely
compiled to byte code before execution starts.
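
A small demonstration of that point, as a sketch: the helper name
`run_and_capture` and the sample source are invented for the example. The
first line of the sample would print something if it ever ran, but the
syntax error on the second line is raised while compiling, before any of
the code executes.

```python
import contextlib
import io


def run_and_capture(source):
    """Compile and run `source`, returning (printed output, syntax error)."""
    out = io.StringIO()
    error = None
    with contextlib.redirect_stdout(out):
        try:
            # compile() has to succeed for the whole source before exec runs.
            exec(compile(source, "<demo>", "exec"), {})
        except SyntaxError as exc:
            error = exc
    return out.getvalue(), error


output, error = run_and_capture("print('first line ran')\nx = = 1\n")
print(repr(output), type(error).__name__)
```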

It's true that a syntax error before an import prevents that import, but
since imports are usually at the start of a file, a syntax error will
only rarely prevent the import (and files intended to be imported
generally don't have weird side effects anyway).

One issue could be that compilers which generate executables are
generally thorough and slow, while the compilers which generate
byte-code for immediate consumption by an interpreter are generally
simple and fast. So there is more incentive for the former to discover
as many errors as possible and they are also better equipped to do this.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Antoon Pardon




On 9/10/2022 at 21:18, Avi Gross wrote:
> Antoon,  it may also relate to an interpreter versus compiler issue.
>
> Something like a compiler for C does not do anything except write code in
> an assembly language. It can choose to keep going after an error and start
> looking some more from a less stable place.
>
> Interpreters for Python have to catch interrupts as they go and often run
> code in small batches. Continuing to evaluate after an error could cause
> weird effects.
>
> So what you want is closer to a lint program that does not run code at all,
> or merely writes pseudocode to a file to be run faster later.

I just want a parser that doesn't give up on encountering the first syntax
error. Maybe do some semantic checking like checking the number of parameters.

> I will say that often enough a program could report more possible errors.
> Putting your code into multiple files and modules may mean you could
> cleanly evaluate the code and return multiple errors from many modules as
> long as they are distinct. Finding all errors is not possible if recovery
> from one is not guaranteed.

I don't need it to find all errors. As long as it reasonably accurately
finds a significant number of them.

> Is it that onerous to fix one thing and run it again? It was once when you
> handed in punch cards and waited a day or on very busy machines.

Yes I find it onerous, especially since I have a pipeline with unit tests
and other tools that all have to redo their work each time a bug is corrected.
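
The semantic check mentioned above (checking the number of parameters)
could be sketched with the standard `ast` module as below. This is a
deliberately limited illustration, not an existing tool: `check_call_arity`
is a made-up name, only functions defined at module level and called by
plain name are considered, calls using keyword or star arguments are
skipped, and the file must already be free of syntax errors for
`ast.parse` to succeed.

```python
import ast


def check_call_arity(source, filename="<string>"):
    """Report calls whose positional argument count cannot match the def."""
    tree = ast.parse(source, filename)  # raises SyntaxError on bad syntax

    # Record (required, maximum) positional counts for module-level functions.
    limits = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            args = node.args
            total = len(args.posonlyargs) + len(args.args)
            required = total - len(args.defaults)
            maximum = None if args.vararg else total
            limits[node.name] = (required, maximum)

    problems = []
    for node in ast.walk(tree):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            continue
        if node.func.id not in limits:
            continue
        # Keyword and star arguments are skipped to keep the sketch simple.
        if node.keywords or any(isinstance(a, ast.Starred) for a in node.args):
            continue
        required, maximum = limits[node.func.id]
        given = len(node.args)
        if given < required or (maximum is not None and given > maximum):
            problems.append((node.lineno, node.func.id, given))
    return sorted(problems)
```

Because names can be rebound at run time, anything this reports is a
likely mistake, not a certain one.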

--
Antoon.
--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Avi Gross
I will say that those of us, meaning me, who express reservations are not
arguing it is a bad idea to get more info in one sweep. Many errors come in
bunches.

If I keep calling some function with the wrong number or type of arguments,
it may be the same in a dozen places in my code. The first error report may
make me search for the other places so I fix it all at once. Telling me
where some instances are might speed that a bit.

As long as it is understood that further errors are a heuristic and
possibly misleading,  fine.

But an error like setting the size of a fixed length data structure to the
right size may result in oodles of errors about being out of range that
magically get fixed by one change. Sometimes too much info just gives you a
headache.

But a tool like you described could have uses even if imperfect. If you are
teaching a course and students submit programs, could you grade the one
with a single error higher than one with 5 errors shown imperfectly and
fail the one with 600?

On Sun, Oct 9, 2022, 1:53 PM Antoon Pardon  wrote:

>
>
> On 9/10/2022 at 19:23, Karsten Hilbert wrote:
> > On Sun, Oct 09, 2022 at 06:59:36PM +0200, Antoon Pardon wrote:
> >
> >> On 9/10/2022 at 17:49, Avi Gross wrote:
> >>> My guess is that finding 100 errors might turn out to be misleading.
> >>> If you fix just the first, many others would go away.
> >> At this moment I would prefer a tool that reported 100 errors, which
> >> would allow me to easily correct 10 real errors, over the python
> >> strategy which quits after having found one syntax error.
> > But the point is: you can't (there is no way to) be sure the
> > 9+ errors really are errors.
> >
> > Unless you further constrict what sorts of errors you are
> > looking for and what margin of error or leeway for false
> > positives you want to allow.
>
> Look, when I was at the university we had to program in Pascal and
> the compiler we used continued parsing until the end. Sure there
> were times that after a number of reported errors the number of
> false positives became so high it was useless trying to find the
> remaining true ones, but it still was more efficient to correct the
> obvious ones, than to only correct the first one.
>
> I don't need to be sure. Even the occasional wrong correction
> is probably still more efficient than quitting after the first
> syntax error.
>
> --
> Antoon.
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Avi Gross
Antoon,  it may also relate to an interpreter versus compiler issue.

Something like a compiler for C does not do anything except write code in
an assembly language. It can choose to keep going after an error and start
looking some more from a less stable place.

Interpreters for Python have to catch interrupts as they go and often run
code in small batches. Continuing to evaluate after an error could cause
weird effects.

So what you want is closer to a lint program that does not run code at all,
or merely writes pseudocode to a file to be run faster later.

Many languages now have blocks of code that are not really evaluated
till later. Some code is built on the fly. And some errors are not errors
at first. Many languages let you not declare a variable before using it or
allow it to change types. In some, the text is lazily evaluated as late as
possible.

I will say that often enough a program could report more possible errors.
Putting your code into multiple files and modules may mean you could
cleanly evaluate the code and return multiple errors from many modules as
long as they are distinct. Finding all errors is not possible if recovery
from one is not guaranteed.

Take a language that uses a semicolon to end a statement. If absent usually
there would be some error but often something on the next line. Your
evaluator could do an experiment and add a semicolon and try again. This
might work 90% of the time but sometimes the error was not ending the line
with a backslash to make it continue properly, or an indentation issue and
even spelling error. No guarantees.

Is it that onerous to fix one thing and run it again? It was once when you
handed in punch cards and waited a day or on very busy machines.

On Sun, Oct 9, 2022, 1:03 PM Antoon Pardon  wrote:

>
>
> On 9/10/2022 at 17:49, Avi Gross wrote:
> > My guess is that finding 100 errors might turn out to be misleading. If
> > you fix just the first, many others would go away.
>
> At this moment I would prefer a tool that reported 100 errors, which would
> allow me to easily correct 10 real errors, over the python strategy which
> quits after having found one syntax error.
>
> --
> Antoon.
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread MRAB

On 2022-10-09 18:51, Antoon Pardon wrote:
> On 9/10/2022 at 19:23, Karsten Hilbert wrote:
>> On Sun, Oct 09, 2022 at 06:59:36PM +0200, Antoon Pardon wrote:
>>
>>> On 9/10/2022 at 17:49, Avi Gross wrote:
>>>> My guess is that finding 100 errors might turn out to be misleading. If you
>>>> fix just the first, many others would go away.
>>> At this moment I would prefer a tool that reported 100 errors, which would
>>> allow me to easily correct 10 real errors, over the python strategy which quits
>>> after having found one syntax error.
>> But the point is: you can't (there is no way to) be sure the
>> 9+ errors really are errors.
>>
>> Unless you further constrict what sorts of errors you are
>> looking for and what margin of error or leeway for false
>> positives you want to allow.
>
> Look, when I was at the university we had to program in Pascal and
> the compiler we used continued parsing until the end. Sure there
> were times that after a number of reported errors the number of
> false positives became so high it was useless trying to find the
> remaining true ones, but it still was more efficient to correct the
> obvious ones, than to only correct the first one.
>
> I don't need to be sure. Even the occasional wrong correction
> is probably still more efficient than quitting after the first
> syntax error.

When I did some programming in COBOL, a single omitted "." would 
completely confuse the compiler and it was best to fix that one error 
and then try again.


On the other hand, TurboPascal would also stop on the first error and 
put the cursor at the error position in the IDE, but as it compiled 
quickly, it wasn't a problem. It was no slower than it would've been if 
it had found multiple errors and you pressed a key to advance to the 
next error.

--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Weatherby,Gerard
PyCharm.

Does a good job of separating "these are really errors" from "do you really
mean that" warnings from "this word is spelled right".

https://www.jetbrains.com/pycharm/

From: Python-list  on 
behalf of Antoon Pardon 
Date: Sunday, October 9, 2022 at 6:11 AM
To: python-list@python.org 
Subject: What to use for finding as many syntax errors as possible.

I would like a tool that tries to find as many syntax errors as possible
in a python file. I know there is the risk of false positives when a
tool tries to recover from a syntax error and proceeds but I would
prefer that over the current python strategy of quitting after the first
syntax error. I just want a tool for syntax errors. No style
enforcements. Any recommendations?

--
Antoon Pardon
--
https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Antoon Pardon




On 9/10/2022 at 19:23, Karsten Hilbert wrote:
> On Sun, Oct 09, 2022 at 06:59:36PM +0200, Antoon Pardon wrote:
>
>> On 9/10/2022 at 17:49, Avi Gross wrote:
>>> My guess is that finding 100 errors might turn out to be misleading. If you
>>> fix just the first, many others would go away.
>> At this moment I would prefer a tool that reported 100 errors, which would
>> allow me to easily correct 10 real errors, over the python strategy which quits
>> after having found one syntax error.
> But the point is: you can't (there is no way to) be sure the
> 9+ errors really are errors.
>
> Unless you further constrict what sorts of errors you are
> looking for and what margin of error or leeway for false
> positives you want to allow.

Look, when I was at the university we had to program in Pascal and
the compiler we used continued parsing until the end. Sure there
were times that after a number of reported errors the number of
false positives became so high it was useless trying to find the
remaining true ones, but it still was more efficient to correct the
obvious ones, than to only correct the first one.

I don't need to be sure. Even the occasional wrong correction
is probably still more efficient than quitting after the first
syntax error.

--
Antoon.
--
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Peter J. Holzer
On 2022-10-09 19:23:41 +0200, Karsten Hilbert wrote:
> On Sun, Oct 09, 2022 at 06:59:36PM +0200, Antoon Pardon wrote:
> > On 9/10/2022 at 17:49, Avi Gross wrote:
> > >My guess is that finding 100 errors might turn out to be misleading. If you
> > >fix just the first, many others would go away.
> >
> > At this moment I would prefer a tool that reported 100 errors, which would
> > allow me to easily correct 10 real errors, over the python strategy which
> > quits after having found one syntax error.
> 
> But the point is: you can't (there is no way to) be sure the
> 9+ errors really are errors.

As a human who knows Python in many cases you can be sure. Sometimes you
aren't sure, then you leave that one for the next iteration. No big
deal. This isn't the 1960s when you sent your punched cards in and got
the result back next week. So neither the parser nor you need to be
perfect. Just better than one error at a time.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Peter J. Holzer
On 2022-10-09 12:59:09 -0400, Thomas Passin wrote:
> https://stackoverflow.com/questions/4284313/how-can-i-check-the-syntax-of-python-script-without-executing-it
> 
> People seemed especially enthusiastic about the one-liner from jmd_dk.

I don't think that one-liner solves Antoon's requirement of continuing
after an error. It uses just the normal python parser so it has exactly
the same limitations.
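
To illustrate the limitation: any check built on the standard parser, for
example through `ast.parse`, raises a single SyntaxError and stops. The
following is a minimal sketch and not the Stack Overflow one-liner itself;
the `SOURCE` string and the helper name `first_syntax_error` are made up
for illustration.

```python
import ast

# Hypothetical sample input with two independent syntax errors
# (line 1 and line 3).
SOURCE = "x = = 1\ny = 2\nz = 3 +\n"


def first_syntax_error(source):
    """Return (lineno, message) for the first syntax error, or None."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return exc.lineno, exc.msg
    return None


# Only the error on line 1 is reported; the one on line 3 stays hidden
# until line 1 has been fixed and the check is run again.
print(first_syntax_error(SOURCE))
```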

Some of the mentioned tools may do what Antoon wants, though.

hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at        |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"




Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Karsten Hilbert
On Sun, Oct 09, 2022 at 06:59:36PM +0200, Antoon Pardon wrote:

> On 9/10/2022 at 17:49, Avi Gross wrote:
> > My guess is that finding 100 errors might turn out to be misleading. If you
> > fix just the first, many others would go away.
>
> At this moment I would prefer a tool that reported 100 errors, which would
> allow me to easily correct 10 real errors, over the python strategy which
> quits after having found one syntax error.

But the point is: you can't (there is no way to) be sure the
9+ errors really are errors, unless you further restrict what
sorts of errors you are looking for and what margin of error or
leeway for false positives you want to allow.

Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B


Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Thomas Passin

https://stackoverflow.com/questions/4284313/how-can-i-check-the-syntax-of-python-script-without-executing-it

People seemed especially enthusiastic about the one-liner from jmd_dk.

On 10/9/2022 12:17 PM, Peter J. Holzer wrote:
> On 2022-10-09 12:09:17 +0200, Antoon Pardon wrote:
> > I would like a tool that tries to find as many syntax errors as possible in
> > a python file. I know there is the risk of false positives when a tool tries
> > to recover from a syntax error and proceeds but I would prefer that over the
> > current python strategy of quitting after the first syntax error. I just want
> > a tool for syntax errors. No style enforcements. Any recommendations?
>
> There seems to have been increased interest in good error recovery over
> the last few years. I thought I had bookmarked a bunch of projects, but the
> only one I can find right now is Lezer
> (https://marijnhaverbeke.nl/blog/lezer.html) which is part of the
> CodeMirror (https://codemirror.net/) editor. Python is listed as a
> currently supported language, so you might want to check that out.
>
> Disclaimer: I haven't used CodeMirror, so I can't say anything about
> its quality. The blog entry about Lezer was interesting, though.
>
> hp






Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Antoon Pardon




On 9/10/2022 at 17:49, Avi Gross wrote:

> My guess is that finding 100 errors might turn out to be misleading. If you
> fix just the first, many others would go away.


At this moment I would prefer a tool that reported 100 errors, which would
allow me to easily correct 10 real errors, over the python strategy which quits
after having found one syntax error.

--
Antoon.



Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Peter J. Holzer
On 2022-10-09 12:09:17 +0200, Antoon Pardon wrote:
> I would like a tool that tries to find as many syntax errors as possible in
> a python file. I know there is the risk of false positives when a tool tries
> to recover from a syntax error and proceeds but I would prefer that over the
> current python strategy of quitting after the first syntax error. I just want
> a tool for syntax errors. No style enforcements. Any recommendations?

There seems to have been increased interest in good error recovery over
the last few years. I thought I had bookmarked a bunch of projects, but the
only one I can find right now is Lezer
(https://marijnhaverbeke.nl/blog/lezer.html) which is part of the
CodeMirror (https://codemirror.net/) editor. Python is listed as a
currently supported language, so you might want to check that out.

Disclaimer: I haven't used CodeMirror, so I can't say anything about
its quality. The blog entry about Lezer was interesting, though.

hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at        |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"




Re: What to use for finding as many syntax errors as possible.

2022-10-09 Thread Avi Gross
Antoon,

There likely are such programs out there but are there universal agreements
on how to figure out when a new safe zone of code starts where error
testing can begin?

For example a file full of function definitions might find an error in
function 1 and try to find the end of that function and resume checking the
next function.  But what if a function defines local functions within it?
What if the mistake in one line of code could still allow checking the next
line rather than skipping it all?
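
As a rough illustration of the "resume at the next function" idea, the
sketch below treats every statement that starts in column 0 as a new safe
zone and compiles each chunk on its own. Everything here is an assumption
made for the example: the names `split_top_level` and `check_chunks` are
invented, and the column-0 heuristic is deliberately naive. It gives false
positives for decorators, for multi-line strings whose continuation lines
start in column 0, and whenever an error spills across a chunk boundary.

```python
import re

# Column-0 lines that continue the previous statement instead of
# starting a new one (clause keywords and closing brackets).
CONTINUES = re.compile(r"(?:else|elif|except|finally)\b|[)\]}]")


def split_top_level(source):
    """Split source into (start_lineno, lines) chunks at column-0 statements."""
    chunks = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        at_column_zero = line[:1].strip() != ""
        starts_new = (
            at_column_zero
            and not line.startswith("#")
            and not CONTINUES.match(line)
        )
        if starts_new or not chunks:
            chunks.append((lineno, [line]))
        else:
            chunks[-1][1].append(line)
    return chunks


def check_chunks(source):
    """Compile each chunk separately; return (lineno, message) per bad chunk."""
    errors = []
    for start, lines in split_top_level(source):
        try:
            compile("\n".join(lines) + "\n", "<chunk>", "exec")
        except SyntaxError as exc:
            # Translate the chunk-relative line number back to the file.
            errors.append((start + (exc.lineno or 1) - 1, exc.msg))
    return errors


# Hypothetical input: the first and third functions are broken,
# the second one is fine.
SAMPLE = (
    "def one():\n"
    "    return 1 +\n"
    "\n"
    "def two():\n"
    "    return 2\n"
    "\n"
    "def three()\n"
    "    pass\n"
)

for lineno, message in check_chunks(SAMPLE):
    print(f"line {lineno}: {message}")
```

Local functions defined inside a function stay in the same chunk here, so
an error in one of them still hides later errors in that same top-level
function, which is exactly the open question raised above.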

My guess is that finding 100 errors might turn out to be misleading. If you
fix just the first, many others would go away. If you spell a variable name
wrong when declaring it, a dozen uses of the right name may cause errors.
Should you fix the first or change all later ones?



On Sun, Oct 9, 2022, 6:11 AM Antoon Pardon wrote:

> I would like a tool that tries to find as many syntax errors as possible
> in a python file. I know there is the risk of false positives when a
> tool tries to recover from a syntax error and proceeds but I would
> prefer that over the current python strategy of quitting after the first
> syntax error. I just want a tool for syntax errors. No style
> enforcements. Any recommendations?
> -- Antoon Pardon


What to use for finding as many syntax errors as possible.

2022-10-09 Thread Antoon Pardon
I would like a tool that tries to find as many syntax errors as possible
in a Python file. I know there is the risk of false positives when a
tool tries to recover from a syntax error and proceeds, but I would
prefer that over the current Python strategy of quitting after the first
syntax error. I just want a tool for syntax errors. No style
enforcement. Any recommendations?

-- Antoon Pardon



Re: Unable to compile my C Extension on Windows: unresolved external link errors

2021-11-15 Thread Eryk Sun
On 11/14/21, Marco Sulla wrote:
> On Sun, 14 Nov 2021 at 16:42, Barry Scott wrote:
>
>> On macOS .dylib and Unix .so, it's being extern that does this.
>
> And extern is the default. I understand now.

Per Include/exports.h and Include/pyport.h, Python should be built in
Unix with "default" visibility (per global/local binding) for the API
PyAPI_FUNC(RTYPE) and PyAPI_DATA(RTYPE) symbols. Everything else with
global binding will be hidden via the "-fvisibility=hidden" compiler
option that's configured in the makefile. For example:

$ readelf -s Objects/dictobject.o | grep HIDDEN | cut -b 40-
GLOBAL HIDDEN 4 _pydict_global_version
GLOBAL HIDDEN 1 _PyDict_ClearFreeList
GLOBAL HIDDEN 1 _PyDict_Fini
GLOBAL HIDDEN 1 _PyDictKeys_StringLookup
GLOBAL HIDDEN 9 _Py_dict_lookup
GLOBAL HIDDEN 1 _PyDict_GetItemHint
GLOBAL HIDDEN 1 _PyDict_LoadGlobal
GLOBAL HIDDEN 1 _PyDict_Pop_KnownHash
GLOBAL HIDDEN 1 _PyDict_FromKeys
GLOBAL HIDDEN 1 _PyDict_KeysSize
GLOBAL HIDDEN 1 _PyDict_NewKeysForClass
GLOBAL HIDDEN 1 _PyObject_InitializeDict
GLOBAL HIDDEN 1 _PyObject_MakeDictFromIns
GLOBAL HIDDEN 1 _PyObject_StoreInstanceAt
GLOBAL HIDDEN 1 _PyObject_GetInstanceAttr
GLOBAL HIDDEN 1 _PyObject_IsInstanceDictE
GLOBAL HIDDEN 1 _PyObject_VisitInstanceAt
GLOBAL HIDDEN 1 _PyObject_ClearInstanceAt
GLOBAL HIDDEN 1 _PyObject_FreeInstanceAtt
GLOBAL HIDDEN 1 _PyObjectDict_SetItem
GLOBAL HIDDEN 1 _PyDictKeys_DecRef
GLOBAL HIDDEN 1 _PyDictKeys_GetVersionFor

These hidden symbols get linked in the executable or shared-object
with local binding. For example:

$ readelf -s python | grep _PyDict_FromKeys | cut -b 40-
LOCAL  DEFAULT   16 _PyDict_FromKeys

I suggest testing your project with Python built as a shared library,
i.e. --enable-shared. The local binding on internal symbols may be a
problem in this case.
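
From inside Python, a quick way to see whether the dynamic loader can
resolve a given C symbol in the running interpreter is `ctypes.pythonapi`.
This is a hedged sketch rather than a substitute for `readelf`: the helper
name `is_exported` is invented here, and the result for the underscore
names depends on the Python version, the platform and how the interpreter
was built, so no particular output is claimed for them.

```python
import ctypes


def is_exported(name):
    """Return True if the loader can resolve `name` in the running Python."""
    try:
        # Attribute access on ctypes.pythonapi performs the symbol lookup
        # and raises AttributeError when the symbol cannot be found.
        getattr(ctypes.pythonapi, name)
    except AttributeError:
        return False
    return True


# PyDict_New is part of the public C API; the two underscore names are
# taken from the readelf listing above and may or may not resolve.
for symbol in ("PyDict_New", "_PyDict_FromKeys", "_Py_dict_lookup"):
    print(symbol, is_exported(symbol))
```

An extension that links against a name for which this prints False can be
expected to fail to link or load on that particular build.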


Re: Unable to compile my C Extension on Windows: unresolved external link errors

2021-11-14 Thread Marco Sulla
On Sun, 14 Nov 2021 at 16:42, Barry Scott  wrote:
>
> Sorry iPad sent the message before it was complete...
>
> > On 14 Nov 2021, at 10:38, Marco Sulla  wrote:
> >
> > Okay, now the problem seems to be another: I get the same "unresolved
> > external link" errors, but only for internal functions.
> >
> > This seems quite normal. The public .lib does not expose the internals
> > of Python.
> > The strange fact is: why can I compile it on Linux and MacOS? Their
> > external libraries expose the internal functions?
>
> Windows is not Linux is not macOS,
> The toolchain on each OS has its own strengths, weaknesses and quirks.
>
> On Windows, DLLs only allow access to the symbols that are explicitly listed
> to be accessible.

Where are those symbols listed?

> On macOS .dylib and Unix .so, it's being extern that does this.

And extern is the default. I understand now.

> Maybe you could copy the code that you want and add it to your code?
> Change any conflicting symbols of course.

It's quite hard. I have to compile dictobject.c, which needs a lot of
internal functions. And I suppose that every internal function may
require 1 or more other internal functions.

I have two other solutions:
* compile a whole Python DLL with the symbols I need and link against
it. I have to ship this DLL with my code, which is ugly.
* drop support for the C extension for Windows users and give
them only the slow, pure-Python version.

Since my interest in Windows is now close to zero, I think I'll opt for
the last one for now.


Re: Unable to compile my C Extension on Windows: unresolved external link errors

2021-11-14 Thread Barry Scott
Sorry iPad sent the message before it was complete...

> On 14 Nov 2021, at 10:38, Marco Sulla  wrote:
> 
> Okay, now the problem seems to be another: I get the same "unresolved
> external link" errors, but only for internal functions.
> 
> This seems quite normal. The public .lib does not expose the internals
> of Python.
> The strange fact is: why can I compile it on Linux and MacOS? Their
> external libraries expose the internal functions?

Windows is not Linux is not macOS.
The toolchain on each OS has its own strengths, weaknesses and quirks.

On Windows, DLLs only allow access to the symbols that are explicitly listed
to be accessible.
On macOS .dylib and Unix .so, it's being extern that does this.

> 
> Anyway, is there a way to compile Python on Windows in such a way that
> I get a shared library that exposes all the functions?

Yes, you can do your own build of Python that exposes the symbols you want.
But that build will be private to you and will not allow others to use your work
(on the assumption that they will not use your private build of Python).

Maybe you could copy the code that you want and add it to your code?
Change any conflicting symbols, of course.

Barry

> 
> On Sat, 13 Nov 2021 at 12:17, Marco Sulla wrote:
>>
>> Sorry, the problem is that I downloaded the 32-bit version of the VS
>> compiler and the 64-bit version of Python.
>>
>> On Sat, 13 Nov 2021 at 11:10, Barry Scott wrote:
>>>
>>>> On 13 Nov 2021, at 09:00, Barry wrote:
>>>>
>>>>> On 12 Nov 2021, at 22:53, Marco Sulla wrote:
>>>>>
>>>>> It seems that on Windows it doesn't find python3.lib,
>>>>> even if I put it in the path. So I get the `unresolved external link`
>>>>> errors.
>>>>
>>>> I think you need the python310.lib (not sure of file name) to get to the
>>>> internal symbols.
>>>
>>> Another thing that you will need to check is that the symbols you are
>>> after have been exposed in the DLL at all. Being external in the source
>>> is not enough; they also have to be listed in the .DLL's def file (is
>>> that the right term?) as well.
>>>
>>> If it's not clear yet, you are going to have to read a lot of source code
>>> and understand the tool chain used on Windows to solve this.
>>>
>>>> You can use the objdump(?) utility to check that the symbols are in the
>>>> lib.
>>>>
>>>> Barry
>>>
>>> Barry



Re: Unable to compile my C Extension on Windows: unresolved external link errors

2021-11-14 Thread Marco Sulla
Okay, now the problem seems to be a different one: I get the same
"unresolved external link" errors, but only for internal functions.

This seems quite normal. The public .lib does not expose the internals
of Python.
The strange thing is: why can I compile it on Linux and macOS? Do their
shared libraries expose the internal functions?

Anyway, is there a way to compile Python on Windows in such a way that
I get a shared library that exposes all the functions?

On Sat, 13 Nov 2021 at 12:17, Marco Sulla wrote:
>
> Sorry, the problem is that I downloaded the 32-bit version of the VS
> compiler and the 64-bit version of Python.
>
> On Sat, 13 Nov 2021 at 11:10, Barry Scott wrote:
> >
> > > On 13 Nov 2021, at 09:00, Barry wrote:
> > >
> > >> On 12 Nov 2021, at 22:53, Marco Sulla wrote:
> > >>
> > >> It seems that on Windows it doesn't find python3.lib,
> > >> even if I put it in the path. So I get the `unresolved external link`
> > >> errors.
> > >
> > > I think you need the python310.lib (not sure of file name) to get to the
> > > internal symbols.
> >
> > Another thing that you will need to check is that the symbols you are
> > after have been exposed in the DLL at all. Being external in the source
> > is not enough; they also have to be listed in the .DLL's def file (is
> > that the right term?) as well.
> >
> > If it's not clear yet, you are going to have to read a lot of source code
> > and understand the tool chain used on Windows to solve this.
> >
> > > You can use the objdump(?) utility to check that the symbols are in the
> > > lib.
> > >
> > > Barry
> >
> > Barry


Re: Unable to compile my C Extension on Windows: unresolved external link errors

2021-11-13 Thread Marco Sulla
Sorry, the problem is that I downloaded the 32-bit version of the VS
compiler and the 64-bit version of Python.
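
A mismatch like this can be checked from the Python side before building.
The snippet below is a minimal sketch using only the standard library; the
helper name `interpreter_bits` is made up for the example, and it reports
only the interpreter's word size, not which compiler toolset is installed.

```python
import platform
import struct


def interpreter_bits():
    """Return the pointer width of the running interpreter in bits."""
    # struct.calcsize("P") is the size of a C pointer (void *) in bytes.
    return struct.calcsize("P") * 8


# The extension has to be built with a toolchain that targets the same
# word size as the interpreter that will import it.
print(f"Python is a {interpreter_bits()}-bit build on {platform.machine()}")
```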

On Sat, 13 Nov 2021 at 11:10, Barry Scott wrote:
>
> > On 13 Nov 2021, at 09:00, Barry wrote:
> >
> >> On 12 Nov 2021, at 22:53, Marco Sulla wrote:
> >>
> >> It seems that on Windows it doesn't find python3.lib,
> >> even if I put it in the path. So I get the `unresolved external link`
> >> errors.
> >
> > I think you need the python310.lib (not sure of file name) to get to the
> > internal symbols.
>
> Another thing that you will need to check is that the symbols you are
> after have been exposed in the DLL at all. Being external in the source
> is not enough; they also have to be listed in the .DLL's def file (is
> that the right term?) as well.
>
> If it's not clear yet, you are going to have to read a lot of source code
> and understand the tool chain used on Windows to solve this.
>
> > You can use the objdump(?) utility to check that the symbols are in the lib.
> >
> > Barry
>
> Barry

