On 29/09/2021 13.10, [email protected] wrote:
> On Wednesday, September 29, 2021 at 5:40:58 PM UTC+8, J.O. Aho wrote:
>> On 29/09/2021 10.22, [email protected] wrote:
>>> I tried to convert a xls file into csv with the following command, but failed:
>>>
>>> $ in2csv --sheet 'Sheet1' 2021-2022-1.xls
>>> XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\r\n\r\n\r\n\r\n'
>>>
>>> The above testing file is located at here [1].
>>>
>>> [1] https://github.com/hongyi-zhao/temp/blob/master/2021-2022-1.xls
>>>
>>> Any hints for fixing this problem?
>>
>> You need to delete the 13 first lines in the file
>
> Yes. After deleting the top 3 lines, the problem has been fixed.
>
>> or you see to that your code does first trim the data before start xml parse it.
>
> Yes. I really want to do this trick programmatically, but how do I do it without manually editing the file?

You could do something like loading the XML into a string (myxmlstr) and then finding the first < in that string:
xmlstart = myxmlstr.find('<')
xmlstr = myxmlstr[xmlstart:]
Then pass xmlstr to the XML parser. Sure, it's not as convenient as loading
the file directly into the XML parser.
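
Put together, a minimal sketch could look like the one below. It assumes the file really is XML-style markup with junk in front of it. The function name parse_xml_skipping_preamble and the sample string are made up for illustration, and xml.etree.ElementTree is just one parser you could use here.

```python
import xml.etree.ElementTree as ET


def parse_xml_skipping_preamble(text):
    """Parse XML after dropping everything before the first '<'."""
    start = text.find('<')
    if start == -1:
        # str.find returns -1 when there is no match at all.
        raise ValueError("no '<' found, so this does not look like XML")
    return ET.fromstring(text[start:])


# Hypothetical sample: blank lines ahead of the real markup.
sample = "\r\n\r\n\r\n\r\n<table><tr><td>1</td><td>2</td></tr></table>"
root = parse_xml_skipping_preamble(sample)
cells = [td.text for td in root.iter('td')]
print(cells)
```

For the real file you would first read its contents into a string and pass that in. Whether the result then parses cleanly depends on the markup being well-formed, which I have not checked.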
I'm not saying this is the best way of doing it; I'm sure some Python wiz here has a smarter solution.
-- 
//Aho
-- 
https://mail.python.org/mailman/listinfo/python-list
