Re: XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\r\n\r\n\r\n\r\n'

2021-09-29 Thread hongy...@gmail.com
On Thursday, September 30, 2021 at 5:20:04 AM UTC+8, Peter J. Holzer wrote:
> On 2021-09-29 01:22:03 -0700, hongy...@gmail.com wrote: 
> > I tried to convert a xls file into csv with the following command, but 
> > failed: 
> > 
> > $ in2csv --sheet 'Sheet1' 2021-2022-1.xls 
> > XLRDError: Unsupported format, or corrupt file: Expected BOF record; found 
> > b'\r\n\r\n\r\n\r\n' 
> > 
> > The above testing file is located at here [1]. 
> > 
> > [1] https://github.com/hongyi-zhao/temp/blob/master/2021-2022-1.xls
> Why is that file name .xls when it's obviously an HTML file? 

Good catch! Thank you for pointing this out. This file is automatically 
exported from my university's teaching management system, and it was assigned 
the .xls extension by default.
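
One way to check what such an export really contains is to look at its first bytes instead of trusting the extension. The following is only a sketch: the function name sniff_spreadsheet and the labels it returns are made up for illustration. It assumes nothing beyond three facts: legacy .xls files are OLE2 compound documents, .xlsx files are ZIP archives, and HTML or XML exports start with '<' once leading whitespace is skipped.

```python
# Magic bytes of an OLE2 compound document, the container used by legacy .xls.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header signature of a ZIP archive, the container used by .xlsx.
ZIP_MAGIC = b"PK\x03\x04"


def sniff_spreadsheet(data: bytes) -> str:
    """Guess the real format of a spreadsheet export from its leading bytes.

    The returned labels are hypothetical, chosen for this sketch only.
    """
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    if data.startswith(ZIP_MAGIC):
        return "zip"
    # HTML/XML exports may begin with blank lines before the first tag.
    if data.lstrip().startswith(b"<"):
        return "markup"
    return "unknown"
```

Reading the first few kilobytes in binary mode, for example open('2021-2022-1.xls', 'rb').read(4096), and passing them to the function should be enough to tell a real workbook from a renamed web page.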

HZ
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: OT: AttributeError

2021-09-29 Thread Greg Ewing

On 30/09/21 7:28 am, dn wrote:

Oh yes! The D2 kit - I kept those books for years...
https://www.electronixandmore.com/adam/temp/6800trainer/mek6800d2.html


My 6800 system was nowhere near as fancy as that! It was the
result of replacing the CPU in my homebrew Miniscamp.

--
Greg




Re: OT: AttributeError

2021-09-29 Thread Rob Cliffe via Python-list
Ah, Z80s (deep sigh).  Those were the days!  You could disassemble the 
entire CP/M operating system (including the BIOS), and still have many 
Kb to play with!  Real programmers don't need gigabytes!


On 29/09/2021 03:03, 2qdxy4rzwzuui...@potatochowder.com wrote:

On 2021-09-29 at 09:21:34 +1000,
Chris Angelico  wrote:


On Wed, Sep 29, 2021 at 9:10 AM <2qdxy4rzwzuui...@potatochowder.com> wrote:

On 2021-09-29 at 11:38:22 +1300,
dn via Python-list  wrote:


For those of us who remember/can compute in binary, octal, hex, or
decimal as-needed:
Why do programmers confuse All Hallows'/Halloween for Christmas Day?

That one is also very old.  (Yes, I know the answer.  No, I will not
spoil it for those who might not.)  What do I have to do to gain the
insight necessary to have discovered that question and answer on my own?

You'd have to be highly familiar with numbers in different notations,
to the extent that you automatically read 65 and 0x41 as the same
number ...

I do that.  And I have done that, with numbers that size, since the late
1970s (maybe the mid 1970s, for narrow definitions of "different").

There's at least one more [sideways, twisted] leap to the point that you
even think of translating the names of those holidays into an arithmetic
riddle.


... Or, even better, to be able to read off a hex dump and see E8 03
and instantly read it as "1,000 little-endian".

59395 big endian.  Warning, flamebait ahead:  Who thinks in little
endian?  (I was raised on 6502s and 680XX CPUs; 8080s and Z80s always
did things backwards.)




RE: XML Considered Harmful

2021-09-29 Thread Avi Gross via Python-list
I think that to make electricity comprehend, you need a room temperature
superconductor. The Cooper Pairs took a while to comprehend but now ...

I think, seriously, we have established the problems with guessing that
others are using the language in a way we assume. 

So how many comprehensions does Python have?

[] - list comprehension
{} - dictionary OR set comprehension
() - generator expression

Tuples are incomprehensible and I wonder if any other comprehensions might
make sense to add, albeit we may need new symbols.
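
For reference, here is a small sketch of the forms listed above. The sample data and variable names are arbitrary. Note that bare braces give a dict or a set depending on whether the items are key: value pairs, and that a tuple is built by handing a generator expression to tuple().

```python
words = ["spam", "eggs", "spam"]

lengths_list = [len(w) for w in words]      # list comprehension
lengths_dict = {w: len(w) for w in words}   # dict comprehension (key: value)
unique_set = {w for w in words}             # set comprehension (no colon)
lengths_gen = (len(w) for w in words)       # generator expression, evaluated lazily

# There is no tuple comprehension; pass a generator expression to tuple().
lengths_tuple = tuple(len(w) for w in words)

# An empty pair of braces is always a dict, never a set.
empty = {}
```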

-----Original Message-----
From: Python-list On Behalf Of Michael F. Stemper
Sent: Wednesday, September 29, 2021 9:04 AM
To: python-list@python.org
Subject: Re: XML Considered Harmful

On 28/09/2021 18.21, Greg Ewing wrote:
> On 29/09/21 4:37 am, Michael F. Stemper wrote:
>> I'm talking about something made
>> from tons of iron and copper that is oil-filled and rotates at 1800 rpm.
> 
> To avoid confusion, we should rename them "electricity comprehensions".

Hah!

--
Michael F. Stemper
If you take cranberries and stew them like applesauce they taste much more
like prunes than rhubarb does.


Re: XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\r\n\r\n\r\n\r\n'

2021-09-29 Thread Peter J. Holzer
On 2021-09-29 01:22:03 -0700, hongy...@gmail.com wrote:
> I tried to convert a xls file into csv with the following command, but failed:
> 
> $ in2csv --sheet 'Sheet1'  2021-2022-1.xls
> XLRDError: Unsupported format, or corrupt file: Expected BOF record; found 
> b'\r\n\r\n\r\n\r\n'
> 
> The above testing file is located at here [1].
> 
> [1] https://github.com/hongyi-zhao/temp/blob/master/2021-2022-1.xls

Why is that file name .xls when it's obviously an HTML file?

hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at        |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"




Re: OT: AttributeError

2021-09-29 Thread dn via Python-list
On 29/09/2021 19.16, Greg Ewing wrote:
> On 29/09/21 3:03 pm, 2qdxy4rzwzuui...@potatochowder.com wrote:
>> Who thinks in little
>> endian?  (I was raised on 6502s and 680XX CPUs; 8080s and Z80s always
>> did things backwards.)
> 
> The first CPU I wrote code for was a National SC/MP, which doesn't
> have an endianness, since it never deals with more than a byte at
> a time. The second was a 6800, which is big-endian. That's definitely
> more convenient when you're hand-assembling code! I can see the
> advantages of little-endian when you're implementing a CPU, though.


Oh yes! The D2 kit - I kept those books for years...
https://www.electronixandmore.com/adam/temp/6800trainer/mek6800d2.html
-- 
Regards,
=dn


Re: NUmpy

2021-09-29 Thread Christian Gollwitzer

Am 29.09.21 um 18:16 schrieb Jorge Conforte:



Hi,

I have a netcdf file "uwnd_850_1981.nc" and I'm using the commands to 
read it:


Your code is incomplete:

from numpy import dtype
fileu = 'uwnd_850_1981.nc'
ncu = Dataset(fileu, 'r')


Where is "Dataset" defined?


uwnd=ncu.variables['uwnd'][:]


and I had:

:1: DeprecationWarning: `np.bool` is a deprecated alias for the 
builtin `bool`. To silence this warning, use `bool` by itself. Doing 
this will not modify any behavior and is safe. If you specifically 
wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: 
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations


I don't know why I get this message. My numpy version is 1.21.2.


Please, how can I solve this?


First, it is only a warning, therefore it should still work. Second, the 
problem is not in the code that you posted. Possibly it is in the definition 
of "Dataset". Maybe the netCDF file contains boolean values and the 
package you use to read it should be updated?
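
To see where such a warning comes from, or to silence it until the package is updated, the standard warnings module can help. This is a sketch under assumptions: legacy_reader is a hypothetical stand-in for whatever third-party call emits the warning, not the real netCDF reader.

```python
import warnings


def legacy_reader():
    """Hypothetical stand-in for a third-party call that emits the warning."""
    warnings.warn(
        "`np.bool` is a deprecated alias for the builtin `bool`.",
        DeprecationWarning,
        stacklevel=2,
    )
    return [1.0, 2.0]


def find_origin():
    """Turn the warning into an exception so its origin can be inspected."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        try:
            legacy_reader()
        except DeprecationWarning as exc:
            return str(exc)
    return None


def read_quietly():
    """Suppress the warning for this one call only."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return legacy_reader()
```

Running the whole script with python -W error::DeprecationWarning has a similar effect to find_origin: the first such warning is raised as an exception, and the traceback shows which package triggered it.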


Christian


ANN: Wing Python IDE 8.0.4 has been released

2021-09-29 Thread Wingware
Wing 8.0.4 adds Close Unmodified Others to the editor tab's context 
menu, documents using sitecustomize to automatically start debug, fixes 
the debugger on some Windows systems, improves icon rendering with some 
Windows scaling factors, and makes several other improvements.


Details:  https://wingware.com/news/2021-09-28
Downloads:   https://wingware.com/downloads

== About Wing ==

Wing is a light-weight but full-featured Python IDE designed 
specifically for Python, with powerful editing, code inspection, 
testing, and debugging capabilities. Wing's deep code analysis provides 
auto-completion, auto-editing, and refactoring that speed up 
development. Its top notch debugger works with any Python code, locally 
or on a remote host, container, or cluster. Wing also supports 
test-driven development, version control, UI color and layout 
customization, and includes extensive documentation and support.


Wing is available in three product levels:  Wing Pro is the 
full-featured Python IDE for professional developers, Wing Personal is a 
free Python IDE for students and hobbyists (omits some features), and 
Wing 101 is a very simplified free Python IDE for beginners (omits many 
features).


Learn more at https://wingware.com/




Re: OT: AttributeError

2021-09-29 Thread MRAB

On 2021-09-29 03:03, 2qdxy4rzwzuui...@potatochowder.com wrote:

On 2021-09-29 at 09:21:34 +1000,
Chris Angelico  wrote:


On Wed, Sep 29, 2021 at 9:10 AM <2qdxy4rzwzuui...@potatochowder.com> wrote:
>
> On 2021-09-29 at 11:38:22 +1300,
> dn via Python-list  wrote:
>
> > For those of us who remember/can compute in binary, octal, hex, or
> > decimal as-needed:
> > Why do programmers confuse All Hallows'/Halloween for Christmas Day?
>
> That one is also very old.  (Yes, I know the answer.  No, I will not
> spoil it for those who might not.)  What do I have to do to gain the
> insight necessary to have discovered that question and answer on my own?

You'd have to be highly familiar with numbers in different notations,
to the extent that you automatically read 65 and 0x41 as the same
number ...


I do that.  And I have done that, with numbers that size, since the late
1970s (maybe the mid 1970s, for narrow definitions of "different").

There's at least one more [sideways, twisted] leap to the point that you
even think of translating the names of those holidays into an arithmetic
riddle.


... Or, even better, to be able to read off a hex dump and see E8 03
and instantly read it as "1,000 little-endian".


59395 big endian.  Warning, flamebait ahead:  Who thinks in little
endian?  (I was raised on 6502s and 680XX CPUs; 8080s and Z80s always
did things backwards.)


6502 is little-endian.
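
For anyone who would rather let Python do the byte-order arithmetic, here is a minimal sketch using int.from_bytes on the two bytes from the hex dump. The variable names are arbitrary.

```python
raw = bytes([0xE8, 0x03])  # the two bytes as they appear in the hex dump

# Little-endian: the least significant byte comes first, so this is 0x03E8.
little = int.from_bytes(raw, byteorder="little")

# Big-endian: the most significant byte comes first, so this is 0xE803.
big = int.from_bytes(raw, byteorder="big")

print(little, big)  # prints "1000 59395"
print(65 == 0x41)   # prints "True": same number, different notation
```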


NUmpy

2021-09-29 Thread Jorge Conforte



Hi,

I have a netcdf file "uwnd_850_1981.nc" and I'm using the commands to 
read it:


from numpy import dtype
fileu = 'uwnd_850_1981.nc'
ncu = Dataset(fileu, 'r')
uwnd = ncu.variables['uwnd'][:]


and I had:

:1: DeprecationWarning: `np.bool` is a deprecated alias for the 
builtin `bool`. To silence this warning, use `bool` by itself. Doing 
this will not modify any behavior and is safe. If you specifically 
wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: 
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations


I don't know why I get this message. My numpy version is 1.21.2.


Please, how can I solve this?


Thanks,


Conrado



Re: XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\r\n\r\n\r\n\r\n'

2021-09-29 Thread J.O. Aho

On 29/09/2021 13.10, hongy...@gmail.com wrote:

On Wednesday, September 29, 2021 at 5:40:58 PM UTC+8, J.O. Aho wrote:

On 29/09/2021 10.22, hongy...@gmail.com wrote:

I tried to convert a xls file into csv with the following command, but failed:

$ in2csv --sheet 'Sheet1' 2021-2022-1.xls
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found 
b'\r\n\r\n\r\n\r\n'

The above testing file is located at here [1].

[1] https://github.com/hongyi-zhao/temp/blob/master/2021-2022-1.xls

Any hints for fixing this problem?

You need to delete the 13 first lines in the file


Yes. After deleting the top 3 lines, the problem has been fixed.


or you see to that your code does first trim the data before start xml parse it.


Yes. I really want to do this trick programmatically, but how do I do it 
without manually editing the file?



You could do something like loading the XML into a string (myxmlstr) and 
then find the first < in that string:


xmlstart = myxmlstr.find('<')

xmlstr = myxmlstr[xmlstart:]

Then use xmlstr in the XML parser. It is admittedly not as convenient as 
loading the file directly into the XML parser.


I'm not saying this is the best way of doing it; I'm sure some Python wiz 
here would have a smarter solution.
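
A slightly fuller sketch of the same idea, including the file read, might look like this. The function names trim_to_first_tag and load_trimmed are made up for illustration, and the utf-8 default is an assumption about the export, not something known about this file.

```python
from pathlib import Path


def trim_to_first_tag(text: str) -> str:
    """Return text starting at the first '<', dropping anything before it."""
    start = text.find("<")
    if start == -1:
        # str.find returns -1 when nothing matches; slicing from -1 would
        # silently keep only the last character, so fail loudly instead.
        raise ValueError("no '<' found; this does not look like markup")
    return text[start:]


def load_trimmed(path: str, encoding: str = "utf-8") -> str:
    """Read a file as text and trim it; the default encoding is a guess."""
    raw = Path(path).read_text(encoding=encoding, errors="replace")
    return trim_to_first_tag(raw)
```

The result of load_trimmed('2021-2022-1.xls') can then be handed to whichever parser suits the content. xml.etree.ElementTree.fromstring would only work if the content is well-formed XML; an HTML export needs an HTML parser instead.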


--

 //Aho



Re: XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\r\n\r\n\r\n\r\n'

2021-09-29 Thread hongy...@gmail.com
On Wednesday, September 29, 2021 at 8:12:08 PM UTC+8, J.O. Aho wrote:
> On 29/09/2021 13.10, hongy...@gmail.com wrote: 
> > On Wednesday, September 29, 2021 at 5:40:58 PM UTC+8, J.O. Aho wrote: 
> >> On 29/09/2021 10.22, hongy...@gmail.com wrote: 
> >>> I tried to convert a xls file into csv with the following command, but 
> >>> failed: 
> >>> 
> >>> $ in2csv --sheet 'Sheet1' 2021-2022-1.xls 
> >>> XLRDError: Unsupported format, or corrupt file: Expected BOF record; 
> >>> found b'\r\n\r\n\r\n\r\n' 
> >>> 
> >>> The above testing file is located at here [1]. 
> >>> 
> >>> [1] https://github.com/hongyi-zhao/temp/blob/master/2021-2022-1.xls 
> >>> 
> >>> Any hints for fixing this problem? 
> >> You need to delete the 13 first lines in the file 
> > 
> > Yes. After deleting the top 3 lines, the problem has been fixed. 
> > 
> >> or you see to that your code does first trim the data before start xml 
> >> parse it. 
> > 
> > Yes. I really want to do this trick programmatically, but how do I do it 
> > without manually editing the file?
> You could do something like loading the XML into a string (myxmlstr)

How do I do this? As you have seen, the file refuses to load at all.

> and then find the first < in that string 
> 
> xmlstart = myxmlstr.find('<') 
> 
> xmlstr = myxmlstr[xmlstart:] 
> 
> then use the xmlstr in the xml parser, sure not as convenient as loading 
> the file directly to the xml parser. 
> 
> I don't say this is the best way of doing it, I'm sure some python wiz 
> here would have a smarter solution. 

Another very strange thing: I trimmed the first 3 lines of the original file 
and saved the result as a new file named 2021-2022-1-trimmed-top-3-lines.xls. [1]

Then I read the file with the following Python script, named pandas-excel.py:

--
import pandas as pd

excel_file='2021-2022-1-trimmed-top-3-lines.xls'

#print(pd.ExcelFile(excel_file).sheet_names)

newpd=pd.read_excel(excel_file, sheet_name='Sheet1')

for i in newpd.index:
    if i > 1:
        for j in newpd.columns:
            if int(j.split()[1]) > 2:
                if not pd.isnull(newpd.loc[i][j]):
                    print(newpd.loc[i][j])
--

$ python pandas-excel.py | sort -u
汽车实用英语 [1-8]周 1-4节 38 汽车楼413基础电气实训室II 汽修1932 
汽车车载网络系统的检测与修复 [1-12]周 1-4节 38 汽车楼416安全、舒适系统实训室 汽修1932 

OTOH, I also tried to read the file with in2csv as follows:

$ in2csv --sheet Sheet1 2021-2022-1-trimmed-top-3-lines.xls 2>/dev/null | tr ',' '\n' | \
  sed -re '/^$/d' | sort -u | awk '{print length($0),$0}' | sort -k1n | tail -3 | cut -d ' ' -f2-
汽车实用英语 [1-8]周 1-4节 38 汽车楼413基础电气实训室II 汽修1932 
智能网联汽车概论 [1-8]周 6-9节 45 汽车楼511汽车营销策划实训室 汽销1931 
汽车车载网络系统的检测与修复 [1-12]周 1-4节 38 汽车楼416安全、舒适系统实训室 汽修1932 

As you can see, the above two methods give different results. I'm very puzzled 
by this phenomenon. Any hints/tips/comments will be greatly appreciated.

[1] 
https://github.com/hongyi-zhao/temp/blob/master/2021-2022-1-trimmed-top-3-lines.xls

Regards,
HZ


Re: XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\r\n\r\n\r\n\r\n'

2021-09-29 Thread hongy...@gmail.com
On Wednesday, September 29, 2021 at 5:40:58 PM UTC+8, J.O. Aho wrote:
> On 29/09/2021 10.22, hongy...@gmail.com wrote: 
> > I tried to convert a xls file into csv with the following command, but 
> > failed: 
> > 
> > $ in2csv --sheet 'Sheet1' 2021-2022-1.xls 
> > XLRDError: Unsupported format, or corrupt file: Expected BOF record; found 
> > b'\r\n\r\n\r\n\r\n' 
> > 
> > The above testing file is located at here [1]. 
> > 
> > [1] https://github.com/hongyi-zhao/temp/blob/master/2021-2022-1.xls 
> > 
> > Any hints for fixing this problem?
> You need to delete the 13 first lines in the file 

Yes. After deleting the top 3 lines, the problem has been fixed. 

> or you see to that your code does first trim the data before start xml parse 
> it. 

Yes. I really want to do this trick programmatically, but how do I do it 
without manually editing the file?

HZ


Re: XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\r\n\r\n\r\n\r\n'

2021-09-29 Thread J.O. Aho

On 29/09/2021 10.22, hongy...@gmail.com wrote:

I tried to convert a xls file into csv with the following command, but failed:

$ in2csv --sheet 'Sheet1'  2021-2022-1.xls
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found 
b'\r\n\r\n\r\n\r\n'

The above testing file is located at here [1].

[1] https://github.com/hongyi-zhao/temp/blob/master/2021-2022-1.xls

Any hints for fixing this problem?


You need to delete the first 13 lines in the file, or see to it that 
your code first trims the data before it starts to XML-parse it.


--

 //Aho


Re: XML Considered Harmful

2021-09-29 Thread Michael F. Stemper

On 28/09/2021 18.21, Greg Ewing wrote:

On 29/09/21 4:37 am, Michael F. Stemper wrote:

I'm talking about something made
from tons of iron and copper that is oil-filled and rotates at 1800 rpm.


To avoid confusion, we should rename them "electricity comprehensions".


Hah!

--
Michael F. Stemper
If you take cranberries and stew them like applesauce they taste much
more like prunes than rhubarb does.


Re: OT: AttributeError

2021-09-29 Thread Greg Ewing

On 29/09/21 3:03 pm, 2qdxy4rzwzuui...@potatochowder.com wrote:

Who thinks in little
endian?  (I was raised on 6502s and 680XX CPUs; 8080s and Z80s always
did things backwards.)


The first CPU I wrote code for was a National SC/MP, which doesn't
have an endianness, since it never deals with more than a byte at
a time. The second was a 6800, which is big-endian. That's definitely
more convenient when you're hand-assembling code! I can see the
advantages of little-endian when you're implementing a CPU, though.

--
Greg


XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\r\n\r\n\r\n\r\n'

2021-09-29 Thread hongy...@gmail.com
I tried to convert an xls file into csv with the following command, but it failed:

$ in2csv --sheet 'Sheet1'  2021-2022-1.xls
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found 
b'\r\n\r\n\r\n\r\n'

The test file is located here [1].

[1] https://github.com/hongyi-zhao/temp/blob/master/2021-2022-1.xls

Any hints for fixing this problem?

Regards,
HZ


Automated data testing, checking, validation, reporting for data assurance

2021-09-29 Thread Shaozhong SHI
There appear to be a few options for this.

Has anyone tested and got experience with automated data testing,
validation and reporting?

Can anyone enlighten me?

Regards,

David