[issue37984] Unable parse csv on latin iso or binary mode
Yhojann Aguilera added the comment: For big files (e.g. >= 1 GB), the whole content cannot be loaded into memory as a single string; a file stream opened with open() is needed. -- ___ Python tracker <https://bugs.python.org/issue37984> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
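To illustrate the streaming point, here is a minimal sketch, assuming the file is ISO-8859-1 and semicolon-delimited as in the original report. The helper name count_rows and the sample bytes are hypothetical. csv.reader pulls rows lazily from the file object, so only one row is held in memory at a time.

```python
import csv
import os
import tempfile


def count_rows(path, encoding="iso-8859-1", delimiter=";"):
    """Count rows while holding only one row in memory at a time."""
    total = 0
    # newline="" lets the csv module handle CRLF line endings itself.
    with open(path, mode="r", encoding=encoding, newline="") as handle:
        for _row in csv.reader(handle, delimiter=delimiter, quotechar='"'):
            total += 1
    return total


# Small stand-in for a large export (hypothetical sample data).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"id;name\r\n1;Mu\xd1oz\r\n2;Pe\xd1a\r\n")
    sample_path = tmp.name

row_count = count_rows(sample_path)
os.remove(sample_path)
print(row_count)
```

The same loop should work unchanged on a multi-gigabyte file, since nothing here calls read() on the whole stream.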
[issue37984] Unable parse csv on latin iso or binary mode
Yhojann Aguilera added the comment: Thanks, that works fine, but why not offer the option to work in binary mode anyway? The delimiters can be represented as byte values, and in Python it is difficult to autodetect the character encoding of a file.
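The csv module has no bytes mode, but a byte-transparent read can be approximated. This is a sketch of one possible workaround, not something proposed in this thread. It relies on Latin-1 mapping each byte 0x00-0xFF to the code point with the same value, so decoding cannot fail and encoding the fields back restores the original bytes. The helper name read_csv_bytes and the sample bytes are hypothetical.

```python
import csv
import io


def read_csv_bytes(binary_stream, delimiter=";"):
    """Yield each row as a list of bytes fields from a binary stream."""
    # Latin-1 is used purely as a lossless byte <-> str mapping here,
    # not as a claim about the real encoding of the data.
    text = io.TextIOWrapper(binary_stream, encoding="latin-1", newline="")
    for row in csv.reader(text, delimiter=delimiter, quotechar='"'):
        yield [field.encode("latin-1") for field in row]


# In-memory stand-in for open(path, "rb"); the sample bytes are hypothetical.
source = io.BytesIO(b"id;name\r\n1;Mu\xd1oz\r\n2;\xff\xfe raw\r\n")
rows = list(read_csv_bytes(source))
print(rows)
```

With this approach the delimiter is still passed as a one-character str, but any single byte value can be expressed as chr(value). The actual encoding of each field can then be decided later, per field, if it is ever needed.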
[issue37984] Unable parse csv on latin iso or binary mode
New submission from Yhojann Aguilera:

Unable to parse a CSV file with a Latin ISO charset.

    with open('./exported.csv', newline='') as csvFileHandler:
        csvHandler = csv.reader(csvFileHandler, delimiter=';', quotechar='"')
        for line in csvHandler:

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 1032: invalid continuation byte

I tried binary mode in open(), but it says: "binary mode doesn't take a newline argument". OK, so I changed newline to a bytes value, newline=b'', but it says: "open() argument 6 must be str or None, not bytes". OK, so I removed the newline argument:

    _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

OK, so the csv module does not support binary read mode. Next I tried Latin ISO:

    with open('./exported.csv', mode='r', encoding='ISO-8859', newline='') as csvFileHandler:

    UnicodeDecodeError: 'charmap' codec can't decode byte 0xd1 in position 1032: character maps to <undefined>

But the charset is Latin ISO:

    $ file exported.csv
    exported.csv: ISO-8859 text, with very long lines, with CRLF line terminators

OK, so I changed to ISO-8859-8:

    UnicodeDecodeError: 'charmap' codec can't decode byte 0xd1 in position 1032: character maps to <undefined>

I am unable to load the file. Why not offer the option to work in binary mode? The delimiters can be represented as byte values.

--
components: Unicode
messages: 350836
nosy: Yhojann Aguilera, ezio.melotti, vstinner
priority: normal
severity: normal
status: open
title: Unable parse csv on latin iso or binary mode
type: behavior
versions: Python 3.7
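The errors in the report can be reproduced on a few bytes. The sketch below assumes the file is actually ISO-8859-1 (Latin-1), which the thread does not confirm: `file` reports only the ISO-8859 family, not the specific part. The sample bytes and the helper name try_decode are hypothetical.

```python
import csv
import io

# Hypothetical sample containing byte 0xd1, as in the reported error.
raw = b'id;name\r\n1;"Mu\xd1oz"\r\n'


def try_decode(data, encoding):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        print(f"{encoding}: {exc.reason}")
        return None


as_utf8 = try_decode(raw, "utf-8")         # 0xd1 starts an incomplete sequence
as_hebrew = try_decode(raw, "iso-8859-8")  # 0xd1 is unassigned in part 8
as_latin1 = try_decode(raw, "iso-8859-1")  # every byte is assigned in part 1

# Parse the successfully decoded text with the delimiter from the report.
reader = csv.reader(io.StringIO(as_latin1, newline=""), delimiter=";", quotechar='"')
rows = list(reader)
print(rows)
```

If the assumption holds for the real file, passing encoding='iso-8859-1' to open() in the original snippet should be enough. If the data comes from a Windows export, 'cp1252' is another candidate worth checking.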
[issue18748] io.IOBase destructor silence I/O error on close() by default
Yhojann Aguilera added the comment: I expect Python to tell me what the problem is when an error occurs. The core-dump abort happens at a lower level than Python, because Python is not able to recognize or handle the error. The underlying cause is that I exceeded the maximum number of threads per process supported by the kernel. What I expect is for Python to raise an exception when this limit is exceeded, or when it cannot access the thread's pointer in memory. The issue is not whether the script is right or wrong, but that Python is unable to recognize and handle the problem. A generic message saying that an error occurred, without indicating where or how, is a Python bug. -- ___ Python tracker <https://bugs.python.org/issue18748> ___
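For context, when the operating system refuses to create a thread, CPython normally raises RuntimeError ("can't start new thread") from Thread.start(). The sketch below shows how a script might handle that case. It assumes the failure surfaces as that exception, which may not be true of the abort described in this thread. The helper name start_workers is hypothetical.

```python
import threading


def start_workers(target, count):
    """Start up to `count` threads, stopping cleanly if one cannot be created."""
    started = []
    for _ in range(count):
        worker = threading.Thread(target=target)
        try:
            worker.start()
        except RuntimeError as exc:
            # Thread-creation failure is normally reported here,
            # e.g. "can't start new thread".
            print(f"stopped after {len(started)} threads: {exc}")
            break
        started.append(worker)
    for worker in started:
        worker.join()
    return len(started)


started_count = start_workers(lambda: None, 4)
print(started_count)
```

Bounding the number of threads, for example with concurrent.futures.ThreadPoolExecutor(max_workers=...), avoids reaching the kernel limit in the first place.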
[issue18748] io.IOBase destructor silence I/O error on close() by default
Yhojann Aguilera added the comment: The same problem occurs with Python 3.6.8 on Ubuntu 18.04 LTS. For now, I work around it by running LD_PRELOAD=libgcc_s.so.1 python3 ... For more details and proofs of concept, see https://github.com/WHK102/wss/issues/2 -- nosy: +Yhojann Aguilera versions: +Python 3.6
[issue35071] Canot send real string from c api to module (corrupted string)
New submission from Yhojann Aguilera:

Functions such as PyUnicode_FromString treat their char array argument as a printf format. Example: with PyUnicode_FromString("a%22b"), the module interprets the %22 as 22 blank spaces. A double quote gets a backslash added in the module.

PoC: I tried to send a string from C++ to Python using:

    PyObject* pyString = PyUnicode_FromString("/abc/def.html/a%22.php?abc=&def=%22;%00s%01");
    PyObject* pyArgs = Py_BuildValue("(z)", pyString);
    ...
    PyObject_CallObject(pFunc, pyArgs);

But in the script the string is wrong:

    def function(data):
        print(data)

The result is:

    /abc/def.html/a bogus %pp?abc=&def=%;(null)%

--
components: Library (Lib)
messages: 328484
nosy: Yhojann Aguilera
priority: normal
severity: normal
status: open
title: Canot send real string from c api to module (corrupted string)
type: behavior
versions: Python 3.6
___ Python tracker <https://bugs.python.org/issue35071> ___
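As a quick check of the claim about PyUnicode_FromString, the function can be called from Python itself through ctypes. This is a sketch that assumes CPython and does not reproduce the reporter's C++ program. PyUnicode_FromString is documented as taking a plain UTF-8 encoded C string, while format processing belongs to PyUnicode_FromFormat.

```python
import ctypes

# C signature: PyObject* PyUnicode_FromString(const char *u)
from_string = ctypes.pythonapi.PyUnicode_FromString
from_string.argtypes = [ctypes.c_char_p]
from_string.restype = ctypes.py_object

# Same literal as in the report; the percent signs are ordinary characters.
raw = b"/abc/def.html/a%22.php?abc=&def=%22;%00s%01"
result = from_string(raw)
print(result)
```

If the string survives this call intact, the corruption probably comes from a later step. One possibility, not confirmed in this thread, is the Py_BuildValue("(z)", pyString) call: the "z" format unit expects a C string (const char *), whereas an existing PyObject* is normally passed with "O", as in Py_BuildValue("(O)", pyString).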