Hi,

I am not sure whether this will help, but I want to add a piece of code.

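import csv

# Sniffer.sniff() expects decoded text (str), not bytes.
sniffer = csv.Sniffer()
dialect = sniffer.sniff(first_line)  # first_line: the first line of the CSV

# The sniffed dialect is a class; its settings can be inspected like this:
dialect.__dict__
mappingproxy({'__module__': 'csv', '_name': 'sniffed', 'lineterminator': '\r\n',
'quoting': 0, '__doc__': None, 'doublequote': False, 'delimiter': ',',
'quotechar': '"', 'skipinitialspace': False})


# Each setting is also available as an attribute:
lineterminator = dialect.lineterminator
quoting = dialect.quoting
doublequote = dialect.doublequote
delimiter = dialect.delimiter
quotechar = dialect.quotechar
skipinitialspace = dialect.skipinitialspace


# Pass the dialect with the dialect= keyword. It is a class, not a mapping,
# so csv.DictReader(self.file_open, **dialect) raises a TypeError.
# self.file_open must be a file object opened in text mode.
csv.DictReader(self.file_open, dialect=dialect)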


Try this.
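
As a minimal end-to-end sketch of the sniffing idea above: the helper name read_rows, the 4096-character sample size and the candidate delimiters ",;|" are assumptions chosen for illustration, and the input is assumed to be already decoded to str.

```python
import csv
import io


def read_rows(text, candidates=",;|"):
    """Sniff the delimiter from the start of `text`, then parse it into dicts."""
    # Only a bounded sample is handed to the sniffer (size is an assumption).
    sample = text[:4096]
    # Restricting `delimiters` keeps the sniffer from picking an unrelated character.
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    # The dialect is passed by keyword; DictReader takes the header from the first row.
    reader = csv.DictReader(io.StringIO(text), dialect=dialect)
    return list(reader)


rows = read_rows("name;city\nAna;Lima\nBo;Oslo\n")
for row in rows:
    print(row["name"], row["city"])
```

csv.Sniffer raises csv.Error when it cannot determine a delimiter, so a view would probably want to catch that and fall back to a default.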

-
Naresh Jonnala
Hindustan.


On Saturday, July 25, 2020 at 8:03:44 AM UTC+5:30, Liu Zheng wrote:
>
> Yes. You are right. Pandas' default behavior is as follows:
>
> encoding = sys.getfilesystemencoding() or "utf-8"
>
> I tried to open a simple CSV file encoded as UTF-16-LE (popular on Windows), 
> and got the following error:
>
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: 
> invalid start byte
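
The byte 0xff in that error is the first byte of the UTF-16-LE byte order mark (FF FE). A minimal sketch, assuming the upload starts with a BOM, of choosing the codec from the BOM before decoding; the helper name decode_upload and the UTF-8 fallback are assumptions for illustration, and files without a BOM still need a detector or user input.

```python
import codecs

# BOM -> codec that consumes it. UTF-32 is checked before UTF-16 because
# the UTF-32-LE BOM begins with the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_upload(raw, fallback="utf-8"):
    """Decode bytes using the BOM if one is present, else the fallback codec."""
    for bom, codec in _BOMS:
        if raw.startswith(bom):
            return raw.decode(codec)
    return raw.decode(fallback)


# A UTF-16-LE file with a BOM; raw.decode("utf-8") would fail on byte 0xff.
raw = codecs.BOM_UTF16_LE + "id,name\n1,Ana\n".encode("utf-16-le")
text = decode_upload(raw)
print(repr(text))
```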
>
> On Sat, Jul 25, 2020 at 5:13 AM Ronaldo Mata <[email protected]> wrote:
>
>> Hi, Pandas requires you to know the encoding and delimiter beforehand when 
>> you use pd.read_csv(filepath, encoding=" ", delimiter=" "), so I think it 
>> is the same problem.
>>
>> On Fri, Jul 24, 2020 at 3:42 PM, Jani Tiainen <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I can highly recommend using pandas to read CSV files. It does a pretty 
>>> good job of guessing a lot of things without extra config. 
>>>
>>> Of course, it is one more dependency. 
>>>
>>>
>>> On Fri, Jul 24, 2020 at 17:09, Ronaldo Mata <[email protected]> wrote:
>>>
>>>> Yes, I will try it. I will let you know if anything comes up.
>>>>
>>>> On Wed, Jul 22, 2020 at 12:24 PM, Liu Zheng <[email protected]> wrote:
>>>>
>>>>> Hi, 
>>>>>
>>>>> Are you sure that the file used for detection is the same file that 
>>>>> you opened and decoded, and that gave you incorrect data?
>>>>>
>>>>> By the way, ASCII is a proper subset of UTF-8. If chardet said it is 
>>>>> ASCII, decoding it using UTF-8 should always work.
>>>>>
>>>>> If your file contains non-ASCII UTF-8 bytes, maybe it's a bug in 
>>>>> chardet? You can try it directly first, without mixing it with Django's 
>>>>> requests. Make sure you can detect and decode the file locally in a 
>>>>> test program. Then put it into the app.
>>>>>
>>>>> If you share the file, I'm also glad to help you try it.
>>>>>
>>>>> On Thu, 23 Jul 2020 at 12:04 AM, Ronaldo Mata <[email protected]> wrote:
>>>>>
>>>>>> Hi Kovy, this is not solved. Liu Zheng, using 
>>>>>> chardet.detect(request.FILES['file'].read()) returns the encoding 
>>>>>> "ascii", which is not correct. For example, I uploaded a file encoded 
>>>>>> as UTF-7 and the result is wrong. Then I tried 
>>>>>> request.FILES['file'].read().decode('ascii'), and it does not work: it 
>>>>>> returns bad data. 
>>>>>> For example, the string "@" comes back as "+AEA-".
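
A UTF-7 file consists only of ASCII bytes, so a detector that reports "ascii" is consistent with the raw data, and decoding as ASCII leaves the escape sequences in place. A small sketch of the difference, assuming the encoding is known to be UTF-7; the sample bytes are made up for illustration.

```python
# Hypothetical upload content: "a@b" written in UTF-7, where "@" became "+AEA-".
raw = b"a+AEA-b"

# Every byte is valid ASCII, so this succeeds but keeps the UTF-7 escape.
as_ascii = raw.decode("ascii")

# Decoding with the utf-7 codec turns the escape back into "@".
as_utf7 = raw.decode("utf-7")

print(as_ascii)
print(as_utf7)
```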
>>>>>>
>>>>>> On Wed, Jul 22, 2020 at 11:16, Kovy Jacob <[email protected]> wrote:
>>>>>>
>>>>>>> I'm confused. I don't know if I can help.
>>>>>>>
>>>>>>> On Jul 22, 2020, at 11:11 AM, Liu Zheng <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi, glad you solved the problem. Yes, both request.FILES['file'] 
>>>>>>> and the chardet file handle are binary handles. A binary handle 
>>>>>>> presents the raw data. chardet takes a sequence of raw bytes and then 
>>>>>>> detects the encoding. With its prediction, if you want to open that 
>>>>>>> piece of data in text mode, you can use the .decode(<encoding format>) 
>>>>>>> method of the bytes object to get a Python string.
>>>>>>>
>>>>>>> On Wed, 22 Jul 2020 at 11:04 PM, Kovy Jacob <[email protected]> wrote:
>>>>>>>
>>>>>>>> That's probably not the proper answer, but that's the best I can 
>>>>>>>> do. Sorry :-(
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jul 22, 2020, at 10:46 AM, Ronaldo Mata <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Yes, the problem here is that the files will be uploaded by the user, 
>>>>>>>> so I don't know what delimiter I will receive. This is not a base 
>>>>>>>> command that I am using; it is logic that I want to incorporate into 
>>>>>>>> a view.
>>>>>>>>
>>>>>>>> On Wed, Jul 22, 2020 at 10:43, Kovy Jacob <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Ah, so is the problem that you don't always know what the 
>>>>>>>>> delimiter is when you read the file? If yes, what is the use case 
>>>>>>>>> for this? You might not need a universal solution; maybe just put 
>>>>>>>>> all the info into a CSV yourself, manually.
>>>>>>>>>
>>>>>>>>> On Jul 22, 2020, at 10:39 AM, Ronaldo Mata <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hi Kovy, I'm using the csv module, but I need to handle the 
>>>>>>>>> delimiters of the files: sometimes they come separated by ",", 
>>>>>>>>> other times by ";" and rarely by "|". 
>>>>>>>>>
>>>>>>>>> On Wed, Jul 22, 2020 at 10:28, Kovy Jacob <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Could you just use the standard python csv module?
>>>>>>>>>>
>>>>>>>>>> On Jul 22, 2020, at 10:25 AM, Ronaldo Mata <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Liu, thanks for your answer.
>>>>>>>>>>
>>>>>>>>>> This has been a headache. I am trying to read the file using 
>>>>>>>>>> csv.DictReader. Initially I had an error trying to get the dict keys 
>>>>>>>>>> when iterating over the rows, and I thought it could be the encoding 
>>>>>>>>>> (which is why I wanted to prepare the view to use the correct 
>>>>>>>>>> encoding). That is why I asked my question.
>>>>>>>>>>
>>>>>>>>>> 1) Your first approach doesn't work: if I send a UTF-8 file, 
>>>>>>>>>> chardet returns ascii as the encoding. It seems 
>>>>>>>>>> request.FILES['file'].read() returns bytes with that encoding.
>>>>>>>>>>
>>>>>>>>>> 2) In the end I realized that the problem was the delimiter of 
>>>>>>>>>> the CSV, but predicting it is another problem.
>>>>>>>>>>
>>>>>>>>>> Anyway, it was a task that I had to do and that was my 
>>>>>>>>>> limitation. I think there must be a library that does all this; 
>>>>>>>>>> uploading a CSV file is common practice in many web apps.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 21, 2020 at 13:47, Liu Zheng <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi. First of all, I think it's impossible to perfectly detect 
>>>>>>>>>>> encoding without further information. See the answer in this SO 
>>>>>>>>>>> post: 
>>>>>>>>>>> https://stackoverflow.com/questions/436220/how-to-determine-the-encoding-of-text
>>>>>>>>>>> There are many packages and tools to help detect the encoding, but 
>>>>>>>>>>> keep in mind that they only give educated guesses. (Most of the 
>>>>>>>>>>> time, the guess is correct, but do check the dev page to see whether 
>>>>>>>>>>> there are known issues related to your problem.)
>>>>>>>>>>>
>>>>>>>>>>> Now let's say you have decided to use chardet. Check its doc 
>>>>>>>>>>> page for the usage: 
>>>>>>>>>>> https://chardet.readthedocs.io/en/latest/usage.html#usage
>>>>>>>>>>> There is more than one possible solution. Here are some examples:
>>>>>>>>>>>
>>>>>>>>>>> 1. If the files uploaded to your server are all expected to be 
>>>>>>>>>>> small CSV files (less than a few MB, with few concurrent uploads), 
>>>>>>>>>>> you can do the following:
>>>>>>>>>>>
>>>>>>>>>>> import chardet
>>>>>>>>>>>
>>>>>>>>>>> # in the view that handles the uploaded file (assuming the file 
>>>>>>>>>>> # input name is just "file")
>>>>>>>>>>> file_content = request.FILES['file'].read()
>>>>>>>>>>> result = chardet.detect(file_content)  # dict with 'encoding' and 'confidence'
>>>>>>>>>>>
>>>>>>>>>>> 2. Also, chardet seems to support incremental (line-by-line) 
>>>>>>>>>>> detection 
>>>>>>>>>>> https://chardet.readthedocs.io/en/latest/usage.html#example-detecting-encoding-incrementally
>>>>>>>>>>>
>>>>>>>>>>> Given this, we can also read from request.FILES line by line 
>>>>>>>>>>> and pass each line to chardet:
>>>>>>>>>>>
>>>>>>>>>>> from chardet.universaldetector import UniversalDetector
>>>>>>>>>>>
>>>>>>>>>>> #somewhere in a view function
>>>>>>>>>>> detector = UniversalDetector()
>>>>>>>>>>> file_handle = request.FILES['file']
>>>>>>>>>>> for line in file_handle:
>>>>>>>>>>>     detector.feed(line)
>>>>>>>>>>>     if detector.done: break
>>>>>>>>>>> detector.close()
>>>>>>>>>>> # result available as a dict at detector.result
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tuesday, July 21, 2020 at 7:09:35 AM UTC+8, Ronaldo Mata 
>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> How do I deal with encoding when reading a CSV file in a 
>>>>>>>>>>>> view?
>>>>>>>>>>>>
>>>>>>>>>>>> I have a view to upload a CSV file. In this view I read the file 
>>>>>>>>>>>> and save each row as a new record.
>>>>>>>>>>>>
>>>>>>>>>>>> My bug appears when I try to upload a CSV file with a 
>>>>>>>>>>>> different encoding (not UTF-8).
>>>>>>>>>>>>
>>>>>>>>>>>> How do I handle this in Django (using request.FILES)? I was 
>>>>>>>>>>>> researching and found chardet, but I don't know how to pass it 
>>>>>>>>>>>> request.FILES. I need help, please.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -- 
>>>>>>>>>>> You received this message because you are subscribed to the 
>>>>>>>>>>> Google Groups "Django users" group.
>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from 
>>>>>>>>>>> it, send an email to [email protected] <javascript:>.
>>>>>>>>>>> To view this discussion on the web visit 
>>>>>>>>>>> https://groups.google.com/d/msgid/django-users/64307441-0e65-45a2-b917-ece15a4ea729o%40googlegroups.com
>>>>>>>>>>>  
>>>>>>>>>>> <https://groups.google.com/d/msgid/django-users/64307441-0e65-45a2-b917-ece15a4ea729o%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>>> .
>>>>>>>>>>>
>>>>>>>>>>
>

