On 29/09/2021 13.10, hongy...@gmail.com wrote:
On Wednesday, September 29, 2021 at 5:40:58 PM UTC+8, J.O. Aho wrote:
On 29/09/2021 10.22, hongy...@gmail.com wrote:
I tried to convert an xls file into csv with the following command, but it failed:
$ in2csv --sheet 'Sheet1' 2021-2022-1.xls
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found
b'\r\n\r\n\r\n\r\n'
The test file above is located here [1].
[1] https://github.com/hongyi-zhao/temp/blob/master/2021-2022-1.xls
Any hints for fixing this problem?
You need to delete the first 13 lines in the file
Yes. After deleting the top 3 lines, the problem has been fixed.
or see to it that your code first trims the data before it starts parsing the XML.
Yes. I really want to do this trick programmatically, but how do I do it
without manually editing the file?
You could do something like loading the XML into a string (myxmlstr) and
then finding the first < in that string:
xmlstart = myxmlstr.find('<')
xmlstr = myxmlstr[xmlstart:]
then use xmlstr in the XML parser. It's certainly not as convenient as
loading the file directly into the XML parser.
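
A minimal sketch of that idea, assuming the rest of the file really is
well-formed XML once the leading blank lines are gone (I haven't verified
that for this particular file). The function name
parse_xml_skipping_preamble is made up for illustration, and it works on
bytes rather than a str so that any encoding declaration in the file is
left for the parser to handle:

```python
import xml.etree.ElementTree as ET


def parse_xml_skipping_preamble(data: bytes) -> ET.Element:
    """Drop everything before the first '<' and parse the rest as XML.

    Hypothetical helper, not part of in2csv or xlrd.
    """
    start = data.find(b"<")
    if start == -1:
        raise ValueError("no '<' found; input does not look like XML")
    return ET.fromstring(data[start:])


# Leading CRLFs like the ones shown in the XLRDError message.
sample = b"\r\n\r\n\r\n\r\n<Workbook><Row>1</Row></Workbook>"
root = parse_xml_skipping_preamble(sample)
print(root.tag)
```

For a real file you would pass in the raw bytes, for example
pathlib.Path(filename).read_bytes(), instead of the sample above.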
I'm not saying this is the best way of doing it; I'm sure some Python wiz
here will have a smarter solution.
--
//Aho