If all else fails, this untested tool (attached as utf16toascii.py) might translate bad files to good ones. Run it like this:
python utf16toascii.py bad_file name_for_new_good_file

On Thu, May 30, 2013 at 4:10 PM, Bill Freeman <[email protected]> wrote:

> This file is encoded in UTF-16 with a byte order mark. That is to say,
> other than starting with \xff\xfe (the two-byte byte order mark), every
> other byte is NUL (\x00). There are actually 1449 useful characters in
> this 2900-byte file. A converted version is attached. json.load() is
> happy with it.
>
> I suspect that it was produced correctly, but that the act of opening it
> in a Windows editor converted it to "wide" characters, which Windows has
> preferred for a while now. I don't know how to tell Windows to give you
> the actual byte size of a file, rather than rounding up to a number of
> "k". You could use the following Python incantation:
>
> >>> with open('the_file') as fp: print len(fp.read())
>
> The length of my file, downloaded but not opened in an editor, should be
> 1449. The length of the bad one should be 2900. The question remains
> about the length of the file as produced by dumpdata, but before opening
> it in an editor. If it is already bad, it must be cmd.exe's ">" operation
> that is performing the conversion, or possibly the default encoding in
> that Python. Though if you are using the same Python for the loaddata,
> it should have the same default encoding; I'm not sure that applies to
> files read directly, rather than sent to stdout.
>
> If the editor is what's doing it, there are editors that won't. IDLE,
> which comes with a lot of Windows Python installs, has an editor that is
> a possibility. Other Windows users may want to comment.
>
> On Thu, May 30, 2013 at 3:27 PM, Gitonga Mbaya <[email protected]> wrote:
>
>> I just did a fresh dump and I realise the difference is not that
>> drastic. The extra stuff must come from trying to edit it. Here is a
>> fresh file from the dump...
>>
>> On Thursday, May 30, 2013 9:50:26 PM UTC+3, ke1g wrote:
>>
>>> Can you load the file using json.load()? I.e., is that one of the
>>> things that you have already tried?
>>>
>>> On Thu, May 30, 2013 at 2:32 PM, Gitonga Mbaya <[email protected]> wrote:
>>>
>>>> Everything you suggest I had already tried. Without the indent, same
>>>> result. Dumping an XML file, same thing. The only thing I didn't try
>>>> was loading it in a different project.
>>>>
>>>> I am doing all this on Windows 7, on the same machine.
>>>>
>>>> On Thursday, May 30, 2013 8:57:42 PM UTC+3, ke1g wrote:
>>>>
>>>>> Try again without the indent (just for grins).
>>>>>
>>>>> Are the two systems on the same box, or did you have to transfer the
>>>>> file over a network, or via a flash drive, or the like?
>>>>>
>>>>> If two boxes, is one Windows and the other not? (Line boundaries
>>>>> differ, though I would hope that the json tools would be proof
>>>>> against that.)
>>>>>
>>>>> Are there non-ASCII characters in any of the strings? (Encodings
>>>>> could differ.)
>>>>>
>>>>> See if you can make it work for one application. E.g.:
>>>>>
>>>>> python manage.py dumpdata books > file.json
>>>>>
>>>>> and in the other project:
>>>>>
>>>>> python manage.py loaddata fixture/file.json
>>>>>
>>>>> (You should be able to leave off the fixture/ if that's where you
>>>>> have put it.)
>>>>>
>>>>> Try again in the XML format:
>>>>>
>>>>> python manage.py dumpdata --format xml > file.xml
>>>>>
>>>>> python manage.py loaddata file.xml
>>>>>
>>>>> (I'm pretty sure that loaddata figures out the format for itself; at
>>>>> least it doesn't document a format switch. I've never tried this, so
>>>>> it's possible that loaddata only supports JSON.)
>>>>> Bill
>>>>>
>>>>> On Thu, May 30, 2013 at 1:38 PM, Gitonga Mbaya <[email protected]> wrote:
>>>>>
>>>>>> Bill,
>>>>>>
>>>>>> These are the exact steps I follow:
>>>>>>
>>>>>> python manage.py dumpdata --indent=4 > fixtures/data.json
>>>>>>
>>>>>> python manage.py loaddata fixtures/data.json
>>>>>>
>>>>>> That is when I get:
>>>>>>
>>>>>> DeserializationError: No JSON object could be decoded
>>>>>>
>>>>>> I checked the JSON using http://jsonlint.com/ and it was reported
>>>>>> as being valid. (The JSON is reproduced at the end of this post for
>>>>>> your info.)
>>>>>>
>>>>>> I opened the file using Notepad++, copied it all into regular
>>>>>> Notepad.exe, and then saved it as a new .json file. When I run the
>>>>>> loaddata command with that new file, it works just fine.
>>>>>>
>>>>>> When I copy and paste the code from Notepad.exe back into a new
>>>>>> file in Notepad++ and save that, the resultant file works just fine
>>>>>> as well.
>>>>>>
>>>>>> This link:
>>>>>> http://stackoverflow.com/questions/8732799/django-fixtures-jsondecodeerror
>>>>>> suggested that the Unicode text file needed to be converted to
>>>>>> ASCII. It was also pointed out that, in a hex editor, the file
>>>>>> should start with the byte 5B and not any other byte. Sure enough,
>>>>>> in the hex editor the file straight from the dump began with FF FE,
>>>>>> but the Notepad-saved JSON file began with 5B. Could it be my setup
>>>>>> that is at fault, producing the wrong JSON file dump?
>>>>>> [
>>>>>>     {
>>>>>>         "pk": 1,
>>>>>>         "model": "books.publisher",
>>>>>>         "fields": {
>>>>>>             "state_province": "MA",
>>>>>>             "city": "Cambdridge",
>>>>>>             "name": "O'Reilly Media",
>>>>>>             "country": "USA",
>>>>>>             "website": "www.oreilly.com",
>>>>>>             "address": "73 Prince Street"
>>>>>>         }
>>>>>>     },
>>>>>>     {
>>>>>>         "pk": 2,
>>>>>>         "model": "books.publisher",
>>>>>>         "fields": {
>>>>>>             "state_province": "CA",
>>>>>>             "city": "Bakersfield",
>>>>>>             "name": "Randomn House",
>>>>>>             "country": "USA",
>>>>>>             "website": "www.randomn.com",
>>>>>>             "address": "234 Hollywood Boulevard"
>>>>>>         }
>>>>>>     },
>>>>>>     {
>>>>>>         "pk": 3,
>>>>>>         "model": "books.publisher",
>>>>>>         "fields": {
>>>>>>             "state_province": "NY",
>>>>>>             "city": "New York",
>>>>>>             "name": "Pearson Vue",
>>>>>>             "country": "USA",
>>>>>>             "website": "www.pearson.com",
>>>>>>             "address": "1 Wall Street"
>>>>>>         }
>>>>>>     },
>>>>>>     {
>>>>>>         "pk": 1,
>>>>>>         "model": "books.author",
>>>>>>         "fields": {
>>>>>>             "first_name": "Eric",
>>>>>>             "last_name": "Meyer",
>>>>>>             "email": ""
>>>>>>         }
>>>>>>     },
>>>>>>     {
>>>>>>         "pk": 2,
>>>>>>         "model": "books.author",
>>>>>>         "fields": {
>>>>>>             "first_name": "Seth",
>>>>>>             "last_name": "Meyer",
>>>>>>             "email": ""
>>>>>>         }
>>>>>>     },
>>>>>>     {
>>>>>>         "pk": 3,
>>>>>>         "model": "books.author",
>>>>>>         "fields": {
>>>>>>             "first_name": "Vincent",
>>>>>>             "last_name": "Meyer",
>>>>>>             "email": ""
>>>>>>         }
>>>>>>     },
>>>>>>     {
>>>>>>         "pk": 1,
>>>>>>         "model": "books.book",
>>>>>>         "fields": {
>>>>>>             "publisher": 1,
>>>>>>             "authors": [
>>>>>>                 1
>>>>>>             ],
>>>>>>             "isbn": 123456789,
>>>>>>             "publication_date": null,
>>>>>>             "title": "CSS: The Definitive Guide"
>>>>>>         }
>>>>>>     },
>>>>>>     {
>>>>>>         "pk": 2,
>>>>>>         "model": "books.book",
>>>>>>         "fields": {
>>>>>>             "publisher": 3,
>>>>>>             "authors": [
>>>>>>                 2
>>>>>>             ],
>>>>>>             "isbn": 987654321,
>>>>>>             "publication_date": null,
>>>>>>             "title": "Primer on Banking"
>>>>>>         }
>>>>>>     },
>>>>>>     {
>>>>>>         "pk": 3,
>>>>>>         "model": "books.book",
>>>>>>         "fields": {
>>>>>>             "publisher": 2,
>>>>>>             "authors": [
>>>>>>                 1,
>>>>>>                 2
>>>>>>             ],
>>>>>>             "isbn": 543216789,
>>>>>>             "publication_date": null,
>>>>>>             "title": "Frolicking on the Beach"
>>>>>>         }
>>>>>>     }
>>>>>> ]
>>>>>>
>>>>>> On Sunday, March 4, 2012 12:04:08 AM UTC+3, Vincent Bastos wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am having trouble importing data using loaddata from a .json
>>>>>>> file that I created from a dumpdata export. I have a production
>>>>>>> application which runs MySQL on one server and a development
>>>>>>> machine which runs SQLite. I simply executed
>>>>>>> ./manage.py dumpdata > file.json on the production machine, but
>>>>>>> when I execute ./manage.py loaddata file.json I get the error:
>>>>>>>
>>>>>>> ValueError: No JSON object could be decoded
>>>>>>>
>>>>>>> I would appreciate some troubleshooting direction, as I could not
>>>>>>> find anything that would help me in the docs.
>>>>>>>
>>>>>>> Cheers

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/django-users?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
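For readers hitting the same error: the diagnosis above (a UTF-16 file with a byte order mark, which the JSON machinery of the day refused to decode) can be reproduced in a few lines. This sketch is not from the thread; the one-record fixture string is made up for illustration, and the 0x5B / FF FE checks mirror the hex-editor test described above.

```python
import codecs
import json

# A one-record stand-in for the dumpdata fixture discussed above.
fixture = '[{"pk": 1, "model": "books.publisher", "fields": {"name": "O\'Reilly Media"}}]'

good = fixture.encode("ascii")
# What a Windows "Unicode" save produces: a UTF-16-LE byte order mark
# followed by two bytes per character.
bad = codecs.BOM_UTF16_LE + fixture.encode("utf-16-le")

# The hex-editor check: valid JSON starts with 0x5B ("["); the mangled
# file starts with FF FE.
assert good[:1] == b"\x5b"
assert bad[:2] == b"\xff\xfe"

# Bill's length arithmetic: every character doubled, plus the 2-byte BOM
# (1449 characters -> 2900 bytes in his case).
assert len(bad) == 2 * len(good) + 2

# Decoding the bad bytes as ASCII fails, which is roughly what the
# Python 2 json.load() tripped over; the utf-16 codec, by contrast,
# consumes the BOM and recovers the original JSON.
try:
    bad.decode("ascii")
except UnicodeDecodeError:
    print("ascii decode fails, as expected")
print(json.loads(bad.decode("utf-16"))[0]["pk"])  # 1
```

Note that modern Python 3 `json.loads` will itself sniff UTF-16 input with a BOM, so the failure mode is specific to the era of the thread; the byte-level diagnosis still applies.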
utf16toascii.py
Description: Binary data
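The attachment itself is stored as binary data in this archive, so its contents are not visible here. Below is a plausible reconstruction of what a converter matching the invocation at the top of the thread would do: decode the file as UTF-16 (BOM included) and write it back out as plain ASCII. This is a guess based on the surrounding discussion, not Bill's actual script.

```python
#!/usr/bin/env python
"""Sketch of a UTF-16-to-ASCII fixture converter.

Usage: python utf16toascii.py bad_file name_for_new_good_file
"""
import sys


def convert(src_path, dst_path):
    # The "utf-16" codec consumes the leading \xff\xfe (or \xfe\xff)
    # byte order mark and picks the matching byte order automatically.
    with open(src_path, "rb") as src:
        text = src.read().decode("utf-16")
    # dumpdata fixtures in this thread are plain ASCII JSON, so a strict
    # ASCII encode should succeed; non-ASCII content would raise loudly
    # rather than be silently mangled.
    with open(dst_path, "wb") as dst:
        dst.write(text.encode("ascii"))


if __name__ == "__main__" and len(sys.argv) == 3:
    convert(sys.argv[1], sys.argv[2])
```

A fixture containing non-ASCII strings would need `text.encode("utf-8")` instead, since ASCII cannot represent it.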

