[issue37984] Unable parse csv on latin iso or binary mode

2019-08-29 Thread Yhojann Aguilera


Yhojann Aguilera  added the comment:

For big files (e.g. >= 1 GB) the whole string cannot be loaded into memory; a 
file stream opened with open() is needed.
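
A minimal sketch of the streaming approach described above, assuming the file 
is really ISO-8859-1 and uses the ';' delimiter from the original report. The 
helper name iter_rows is hypothetical and only for illustration.

```python
import csv


def iter_rows(path, encoding="iso-8859-1"):
    """Yield one parsed row at a time instead of reading the whole file."""
    # newline="" lets the csv module handle line endings itself.
    # The encoding default is an assumption about the reporter's file.
    with open(path, encoding=encoding, newline="") as handle:
        yield from csv.reader(handle, delimiter=";", quotechar='"')
```

Because the function is a generator, only the lines currently being parsed are 
held in memory, so the file size does not matter.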

--

___
Python tracker 
<https://bugs.python.org/issue37984>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37984] Unable parse csv on latin iso or binary mode

2019-08-29 Thread Yhojann Aguilera


Yhojann Aguilera  added the comment:

Thanks, that works fine, but why not offer an option to work in binary mode 
anyway? The delimiters can be represented as binary values. In Python it is 
difficult to autodetect the character encoding of a file.
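
One possible way to get byte-level behaviour without binary support in csv, 
sketched under the assumption that the delimiter and quote characters are 
plain ASCII. ISO-8859-1 maps each of the 256 byte values to exactly one code 
point, so decoding cannot fail and every field can be converted back to its 
original bytes. The function name parse_bytes_transparently is hypothetical.

```python
import csv
import io


def parse_bytes_transparently(raw):
    """Parse CSV bytes without knowing their real encoding.

    The data is decoded as ISO-8859-1 only as a lossless byte-to-character
    mapping; each field is encoded back to bytes after parsing.
    """
    text = io.StringIO(raw.decode("iso-8859-1"), newline="")
    reader = csv.reader(text, delimiter=";", quotechar='"')
    return [[field.encode("iso-8859-1") for field in row] for row in reader]
```

For a large file the same idea applies to open(path, encoding='iso-8859-1', 
newline=''), with the real encoding applied to the fields later if it becomes 
known.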

--




[issue37984] Unable parse csv on latin iso or binary mode

2019-08-29 Thread Yhojann Aguilera


New submission from Yhojann Aguilera :

Unable to parse a CSV file with a Latin ISO charset.

import csv

with open('./exported.csv', newline='') as csvFileHandler:
    csvHandler = csv.reader(csvFileHandler, delimiter=';', quotechar='"')
    for line in csvHandler:
        ...  # loop body not shown in the report

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 1032: 
invalid continuation byte

I tried opening the file in binary mode with open(), but it says: binary mode 
doesn't take a newline argument. Replacing the newline value with bytes 
(newline=b'') gives: open() argument 6 must be str or None, not bytes. 
Removing the newline argument gives: _csv.Error: iterator should return 
strings, not bytes (did you open the file in text mode?).
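
A sketch of one way around the errors above when only a binary stream is 
available: wrap it in io.TextIOWrapper, which accepts both an encoding and 
newline=''. The function name reader_from_binary is hypothetical, and the 
encoding still has to be supplied by the caller.

```python
import csv
import io


def reader_from_binary(binary_stream, encoding):
    """Wrap an already-open binary stream so csv.reader receives str lines."""
    # newline="" disables newline translation, as the csv module expects.
    text_stream = io.TextIOWrapper(binary_stream, encoding=encoding, newline="")
    return csv.reader(text_stream, delimiter=";", quotechar='"')
```

The stream could come from open(path, 'rb'), a socket file, or an archive 
member; the wrapper decodes incrementally, so the data is not loaded at once.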

OK, the csv module does not support binary read mode. Trying Latin ISO instead:

with open('./exported.csv', mode='r', encoding='ISO-8859', newline='') as csvFileHandler:

UnicodeDecodeError: 'charmap' codec can't decode byte 0xd1 in position 1032: 
character maps to <undefined>

But the charset is latin iso:

$ file exported.csv 
exported.csv: ISO-8859 text, with very long lines, with CRLF line terminators

Ok, change to ISO-8859-8:

UnicodeDecodeError: 'charmap' codec can't decode byte 0xd1 in position 1032: 
character maps to <undefined>

Unable to load the file. Why not offer an option to work in binary mode? The 
delimiters can be represented as binary values.
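
A small check that may help narrow down which codec fits, using only the byte 
0xd1 from the tracebacks above. It assumes nothing about the rest of the file; 
the helper name try_decode is hypothetical. ISO-8859-8 leaves 0xd1 unassigned, 
while ISO-8859-1 maps it to a capital N with tilde.

```python
SAMPLE = b"\xd1"  # the byte reported in the tracebacks


def try_decode(raw, encoding):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


for name in ("utf-8", "iso-8859-8", "iso-8859-1"):
    print(name, repr(try_decode(SAMPLE, name)))
```

The output of file(1) only says "ISO-8859" without a part number, so picking 
ISO-8859-1 here is a guess that should be checked against the actual data.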

--
components: Unicode
messages: 350836
nosy: Yhojann Aguilera, ezio.melotti, vstinner
priority: normal
severity: normal
status: open
title: Unable parse csv on latin iso or binary mode
type: behavior
versions: Python 3.7




[issue18748] io.IOBase destructor silence I/O error on close() by default

2019-07-26 Thread Yhojann Aguilera


Yhojann Aguilera  added the comment:

I expect Python to tell me what the problem is when an error occurs. The 
core-dump abort happens at a lower level than Python, because Python is not 
able to recognize or handle the error.

The underlying problem is that I exceeded the maximum number of process 
threads supported by the kernel.

What I expect is for Python to raise an exception when this limit is exceeded 
or when it cannot access the in-memory pointer of the process thread.

The question is not whether the script is right or wrong, but that Python is 
unable to recognize and handle the problem. A generic message that says an 
error occurred, without indicating where or how, is a Python bug.
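
For the case where thread creation itself is refused, CPython raises 
RuntimeError from Thread.start(), and that can be handled as in the sketch 
below. This does not cover the abort described in this comment; the function 
name start_workers is hypothetical.

```python
import threading


def start_workers(target, count):
    """Start up to `count` threads; stop early if the OS refuses another one."""
    started = []
    for _ in range(count):
        worker = threading.Thread(target=target)
        try:
            worker.start()
        except RuntimeError as exc:  # e.g. "can't start new thread"
            print(f"stopped after {len(started)} threads: {exc}")
            break
        started.append(worker)
    return started
```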

--

___
Python tracker 
<https://bugs.python.org/issue18748>
___



[issue18748] io.IOBase destructor silence I/O error on close() by default

2019-07-03 Thread Yhojann Aguilera


Yhojann Aguilera  added the comment:

Same problem using Python 3.6.8 on Ubuntu 18.04 LTS.

For now, I work around this using:

LD_PRELOAD=libgcc_s.so.1 python3 ...

For more details and PoCs: https://github.com/WHK102/wss/issues/2

--
nosy: +Yhojann Aguilera
versions: +Python 3.6




[issue35071] Canot send real string from c api to module (corrupted string)

2018-10-25 Thread Yhojann Aguilera


New submission from Yhojann Aguilera :

Functions such as PyUnicode_FromString appear to treat their char array 
argument as a printf format. Example: after PyUnicode_FromString("a%22b"); the 
module interprets the %22 as 22 blank spaces. A double quote arrives in the 
module with a backslash added. PoC:

I tried to send a string from C++ to a Python string using:

PyObject* pyString = PyUnicode_FromString("/abc/def.html/a%22.php?abc=&def=%22;%00s%01");

PyObject* pyArgs = Py_BuildValue("(z)", pyString);
...
PyObject_CallObject(pFunc, pyArgs);

But in the script the string is corrupted:

def function(data):
    print(data)

The result is:

/abc/def.html/a  bogus %pp?abc=&def=%;(null)%

--
components: Library (Lib)
messages: 328484
nosy: Yhojann Aguilera
priority: normal
severity: normal
status: open
title: Canot send real string from c api to module (corrupted string)
type: behavior
versions: Python 3.6

___
Python tracker 
<https://bugs.python.org/issue35071>
___