Hi, glad you solved the problem. Yes, both request.FILES['file'] and the file handle that chardet reads from are binary handles. A binary handle exposes the raw bytes. chardet takes a sequence of raw bytes and then guesses the encoding. Given its prediction, if you want to read that piece of data as text, you can call the .decode(<encoding format>) method of the bytes object to get a Python string.
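Here is a minimal sketch of that decode step, in case it helps. It assumes you already have an encoding guess (for example the 'encoding' value of the dict that chardet.detect() returns, which can be None when chardet has no guess, so check it first). The function name rows_from_upload, its parameters, and the sample data are made up for illustration. Since the real issue earlier in this thread turned out to be the delimiter, the sketch also lets the standard library's csv.Sniffer guess it from the three candidates mentioned (",", ";" and "|"). Like chardet, the sniffer only makes an educated guess.

```python
import csv
import io


def rows_from_upload(raw: bytes, encoding: str, delimiters: str = ",;|"):
    """Decode raw upload bytes and parse them into a list of dict rows.

    Hypothetical helper: `encoding` stands in for an encoding guess such
    as the one chardet reports; it is not detected here.
    """
    # bytes.decode() turns the binary payload into a Python str.
    # errors="replace" substitutes a marker character instead of raising
    # if the guessed encoding does not fit every byte.
    text = raw.decode(encoding, errors="replace")

    # Let csv.Sniffer guess the delimiter from a sample of the text,
    # restricted to the candidate delimiters named in this thread.
    dialect = csv.Sniffer().sniff(text[:4096], delimiters=delimiters)

    # DictReader uses the first row as the keys for every following row.
    return list(csv.DictReader(io.StringIO(text), dialect=dialect))


# Made-up sample: a semicolon-separated file encoded as Latin-1.
raw = "name;city\nJosé;Mérida\nAna;Lima\n".encode("latin-1")
rows = rows_from_upload(raw, "latin-1")
```

In a view, raw would be request.FILES['file'].read(). csv.Sniffer raises csv.Error when it cannot settle on a delimiter, so you would probably want to catch that and report the problem to the user.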
On Wed, 22 Jul 2020 at 11:04 PM, Kovy Jacob <[email protected]> wrote:

> That’s probably not the proper answer, but that’s the best I can do. Sorry
> :-(
>
> On Jul 22, 2020, at 10:46 AM, Ronaldo Mata <[email protected]> wrote:
>
> Yes, the problem here is that the files will be uploaded by the user, so I
> don't know what delimiter I will receive. This is not a base command that I
> am using; it is logic that I want to incorporate in a view.
>
> On Wed, Jul 22, 2020 at 10:43, Kovy Jacob (<[email protected]>) wrote:
>
>> Ah, so is the problem that you don’t always know what the delimiter is
>> when you read it? If yes, what is the use case for this? You might not need
>> a universal solution, maybe just put all the info into a csv yourself,
>> manually.
>>
>> On Jul 22, 2020, at 10:39 AM, Ronaldo Mata <[email protected]> wrote:
>>
>> Hi Kovy, I'm using the csv module, but I need to handle the delimiters of
>> the files: sometimes they come separated by ",", others by ";" and rarely
>> by "|".
>>
>> On Wed, Jul 22, 2020 at 10:28, Kovy Jacob (<[email protected]>) wrote:
>>
>>> Could you just use the standard python csv module?
>>>
>>> On Jul 22, 2020, at 10:25 AM, Ronaldo Mata <[email protected]> wrote:
>>>
>>> Hi Liu, thanks for your answer.
>>>
>>> This has been a headache. I am trying to read the file using
>>> csv.DictReader. Initially I had an error trying to get the dict keys when
>>> iterating over the rows, and I thought it could be the encoding (which is
>>> why I wanted to prepare the view to use the correct encoding). That is
>>> why I asked my question.
>>>
>>> 1) Your first approach doesn't work: if I send a utf-8 file, chardet
>>> returns ascii as the encoding. It seems request.FILES['file'].read()
>>> returns a binary with that encoding.
>>>
>>> 2) In the end I realized that the problem was the delimiter of the csv,
>>> but predicting it is another problem.
>>>
>>> Anyway, it was a task that I had to do and that was my limitation. I
>>> think there must be a library that does all this; uploading a csv file is
>>> common practice in many web apps.
>>>
>>> On Tue, Jul 21, 2020 at 13:47, Liu Zheng (<[email protected]>) wrote:
>>>
>>>> Hi. First of all, I think it's impossible to perfectly detect encoding
>>>> without further information. See the answer in this SO post:
>>>> https://stackoverflow.com/questions/436220/how-to-determine-the-encoding-of-text
>>>> There are many packages and tools to help detect encoding format, but
>>>> keep in mind that they are only giving educated guesses. (Most of the
>>>> time, the guess is correct, but do check the dev page to see whether
>>>> there are known issues related to your problem.)
>>>>
>>>> Now let's say you have decided to use chardet. Check its doc page for
>>>> the usage: https://chardet.readthedocs.io/en/latest/usage.html#usage
>>>> You'll have more than one solution. Here are some examples:
>>>>
>>>> 1. If the files uploaded to your server are all expected to be small
>>>> csv files (less than a few MB and not many users do it concurrently), you
>>>> can do the following:
>>>>
>>>> # In the view that handles the uploaded file (assume the file input
>>>> # name is just "file"):
>>>> import chardet
>>>>
>>>> file_content = request.FILES['file'].read()
>>>> result = chardet.detect(file_content)  # dict with an 'encoding' key
>>>>
>>>> 2. Also, chardet seems to support incremental (line-by-line) detection:
>>>> https://chardet.readthedocs.io/en/latest/usage.html#example-detecting-encoding-incrementally
>>>>
>>>> Given this, we can also read from request.FILES line by line and pass
>>>> each line to chardet:
>>>>
>>>> from chardet.universaldetector import UniversalDetector
>>>>
>>>> # Somewhere in a view function
>>>> detector = UniversalDetector()
>>>> file_handle = request.FILES['file']
>>>> for line in file_handle:
>>>>     detector.feed(line)
>>>>     if detector.done:
>>>>         break
>>>> detector.close()
>>>> # The result is available as a dict at detector.result
>>>>
>>>> On Tuesday, July 21, 2020 at 7:09:35 AM UTC+8, Ronaldo Mata wrote:
>>>>>
>>>>> How to deal with encoding when you try to read a csv file in a view.
>>>>>
>>>>> I have a view to upload a csv file; in this view I read the file and
>>>>> save each row as a new record.
>>>>>
>>>>> My bug appears when I try to upload a csv file with a different
>>>>> encoding (not UTF-8).
>>>>>
>>>>> How do I handle this in Django (using request.FILES)? I was researching
>>>>> and I found chardet, but I don't know how to pass it a request.FILES. I
>>>>> need help please.
>>>>>
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Django users" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/django-users/64307441-0e65-45a2-b917-ece15a4ea729o%40googlegroups.com
>>>> <https://groups.google.com/d/msgid/django-users/64307441-0e65-45a2-b917-ece15a4ea729o%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .

