If all else fails, this untested tool (attached) might translate bad files
to good ones.  Run it like this:

  python utf16toascii.py bad_file name_for_new_good_file

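For anyone reading this without the attachment, here is a minimal sketch
of what a converter along these lines could look like.  It is not the
attached script.  The function name convert_utf16_to_ascii and its return
value are made up for illustration, and it assumes the bad file starts
with a UTF-16 byte order mark and holds only ASCII characters:

```python
import codecs


def convert_utf16_to_ascii(src_path, dst_path):
    """Rewrite a UTF-16 file that starts with a byte order mark as ASCII.

    Hypothetical helper, not the attached tool.  Returns True if the input
    was converted, False if it had no UTF-16 byte order mark and was
    copied through unchanged.
    """
    # Read raw bytes so nothing is translated on the way in.
    with open(src_path, 'rb') as src:
        raw = src.read()

    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The 'utf-16' codec reads the byte order mark, picks the matching
        # byte order, and leaves the mark out of the decoded text.
        text = raw.decode('utf-16')
        # Raises UnicodeEncodeError if the text has non-ASCII characters.
        raw = text.encode('ascii')
        converted = True
    else:
        converted = False

    # Write raw bytes so nothing is translated on the way out either.
    with open(dst_path, 'wb') as dst:
        dst.write(raw)
    return converted
```

Calling convert_utf16_to_ascii('bad_file', 'name_for_new_good_file')
mirrors the command line above.  A file that does not start with FF FE
or FE FF is copied as-is, so running it on an already good file should
be harmless.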

On Thu, May 30, 2013 at 4:10 PM, Bill Freeman <[email protected]> wrote:

> This file is encoded in UTF-16 with a byte order mark.  That is to say,
> other than starting with \xff\xfe (the two-byte byte order mark),
> every other byte is nul (\x00).  There are actually 1449 useful
> characters in this 2900 byte file.  A converted version is attached.
> json.load() is happy with it.
>
> I suspect that it was produced correctly, but the act of opening it in a
> Windows editor converted it to "wide" characters, which Windows has
> preferred for a while now.  I don't know how to tell windows to give you
> the actual byte size of a file, rather than rounding up to a number of
> "k".  You could use the following python incantation:
>
>     >>> with open('the_file', 'rb') as fp: print(len(fp.read()))
>
> The length of my file, downloaded but not opened in an editor, should be
> 1449.  The length of the bad one should be 2900.  The question remains
> about the length of the file as produced by dumpdata, but before opening in
> an editor.  If it is already bad, it must be cmd.exe's ">" operation that
> is performing the conversion, or possibly the default encoding in that
> python.  If you are using the same python for the loaddata, it should
> have the same default encoding, but I'm not sure that applies to files
> read directly, rather than sent to stdout.
>
> If the editor is what's doing it, there are editors that won't.  IDLE,
> which comes with a lot of Windows python installs, has an editor that is a
> possibility.  Other Windows users may want to comment.
>
>
> On Thu, May 30, 2013 at 3:27 PM, Gitonga Mbaya <[email protected]> wrote:
>
>> I just did a fresh dump and I realise the difference is not that drastic.
>> The extra stuff must come from trying to edit it. Here is a fresh file from
>> the dump...
>>
>>
>> On Thursday, May 30, 2013 9:50:26 PM UTC+3, ke1g wrote:
>>
>>> Can you load the file using json.load()?  I.e., is that one of the
>>> things that you have already tried?
>>>
>>>
>>> On Thu, May 30, 2013 at 2:32 PM, Gitonga Mbaya <[email protected]> wrote:
>>>
>>>> All you suggest I had already tried. Without indent, same result.
>>>> dumping an xml file, same thing. The only thing I didn't try was loading it
>>>> in a different project.
>>>>
>>>> I am doing all this on Windows 7 on the same machine.
>>>>
>>>> On Thursday, May 30, 2013 8:57:42 PM UTC+3, ke1g wrote:
>>>>
>>>>> Try again without the indent (just for grins).
>>>>>
>>>>> Are the two systems on the same box, or did you have to transfer it
>>>>> over a network, or via a flash drive, or the like?
>>>>>
>>>>> If two boxes, is one Windows and the other not?  (Line boundaries
>>>>> differ, though I would hope that the json tools would be proof against
>>>>> that.)
>>>>>
>>>>> Are there non-ASCII characters in any of the strings?  (Encodings
>>>>> could differ.)
>>>>>
>>>>> See if you can make it work for one application.   E.g.:
>>>>>
>>>>>   python manage.py dumpdata books > file.json
>>>>>
>>>>> and in the other project:
>>>>>
>>>>>   loaddata fixture/file.json
>>>>>
>>>>> (You should be able to leave off the fixture/ if that's where you have
>>>>> put it.)
>>>>>
>>>>> Try again in the XML format:
>>>>>
>>>>>   python manage.py dumpdata --format xml > file.xml
>>>>>
>>>>>   python manage.py loaddata file.xml
>>>>>
>>>>> (I'm pretty sure that loaddata figures out the format for itself, at
>>>>> least it doesn't document a format switch.  I've never tried this, so it's
>>>>> possible that loaddata only supports JSON.)
>>>>>
>>>>> Bill
>>>>>
>>>>>
>>>>> On Thu, May 30, 2013 at 1:38 PM, Gitonga Mbaya <[email protected]> wrote:
>>>>>
>>>>>> Bill,
>>>>>>
>>>>>> These are the exact steps I follow:
>>>>>>
>>>>>> python manage.py dumpdata --indent=4 > fixtures/data.json
>>>>>>
>>>>>> python manage.py loaddata fixtures/data.json
>>>>>>
>>>>>> That is when I get:
>>>>>>
>>>>>> DeserializationError: No JSON object could be decoded
>>>>>>
>>>>>> I checked the json code using http://jsonlint.com/ and it was
>>>>>> reported as being valid. (The json code is reproduced at the end of this
>>>>>> post for your info)
>>>>>>
>>>>>> I opened the file using Notepad++, copied it all into regular
>>>>>> Notepad.exe and then saved it as a new json file. When I do the loaddata
>>>>>> command with that new file it works just fine.
>>>>>>
>>>>>> When I copy paste the code from Notepad.exe back into a new file on
>>>>>> Notepad++ and save that, the resultant file works just fine as well.
>>>>>>
>>>>>> This link:
>>>>>> http://stackoverflow.com/questions/8732799/django-fixtures-jsondecodeerror
>>>>>> suggested that the unicode text file needed to be converted to ascii. It
>>>>>> was also pointed out that the file in a hex editor should start with 5B
>>>>>> and not any other byte. Sure enough, in the hex editor, the file straight
>>>>>> from the dump began with FF FE, but the notepad saved json file began
>>>>>> with 5B. Could it be my setup that is at fault producing the wrong json
>>>>>> file dump?
>>>>>>
>>>>>> [
>>>>>>     {
>>>>>>         "pk": 1,
>>>>>>         "model": "books.publisher",
>>>>>>         "fields": {
>>>>>>             "state_province": "MA",
>>>>>>             "city": "Cambdridge",
>>>>>>             "name": "O'Reilly Media",
>>>>>>             "country": "USA",
>>>>>>             "website": "www.oreilly.com",
>>>>>>             "address": "73 Prince Street"
>>>>>>         }
>>>>>>     },
>>>>>> {
>>>>>>         "pk": 2,
>>>>>>         "model": "books.publisher",
>>>>>>         "fields": {
>>>>>>             "state_province": "CA",
>>>>>>             "city": "Bakersfield",
>>>>>>             "name": "Randomn House",
>>>>>>             "country": "USA",
>>>>>>             "website": "www.randomn.com",
>>>>>>             "address": "234 Hollywood Boulevard"
>>>>>>         }
>>>>>>     },
>>>>>> {
>>>>>>         "pk": 3,
>>>>>>         "model": "books.publisher",
>>>>>>         "fields": {
>>>>>>             "state_province": "NY",
>>>>>>             "city": "New York",
>>>>>>             "name": "Pearson Vue",
>>>>>>             "country": "USA",
>>>>>>             "website": "www.pearson.com",
>>>>>>             "address": "1 Wall Street"
>>>>>>         }
>>>>>>     },
>>>>>>     {
>>>>>>         "pk": 1,
>>>>>>         "model": "books.author",
>>>>>>         "fields": {
>>>>>>             "first_name": "Eric",
>>>>>>             "last_name": "Meyer",
>>>>>>             "email": ""
>>>>>>         }
>>>>>>     },
>>>>>>     {
>>>>>>         "pk": 2,
>>>>>>         "model": "books.author",
>>>>>>         "fields": {
>>>>>>             "first_name": "Seth",
>>>>>>             "last_name": "Meyer",
>>>>>>             "email": ""
>>>>>>         }
>>>>>>     },
>>>>>>         {
>>>>>>         "pk": 3,
>>>>>>         "model": "books.author",
>>>>>>         "fields": {
>>>>>>             "first_name": "Vincent",
>>>>>>             "last_name": "Meyer",
>>>>>>             "email": ""
>>>>>>         }
>>>>>>     },
>>>>>> {
>>>>>>         "pk": 1,
>>>>>>         "model": "books.book",
>>>>>>         "fields": {
>>>>>>             "publisher": 1,
>>>>>>             "authors": [
>>>>>>                 1
>>>>>>             ],
>>>>>>             "isbn": 123456789,
>>>>>>             "publication_date": null,
>>>>>>             "title": "CSS: The Definitive Guide"
>>>>>>         }
>>>>>>     },
>>>>>>     {
>>>>>>         "pk": 2,
>>>>>>         "model": "books.book",
>>>>>>         "fields": {
>>>>>>             "publisher": 3,
>>>>>>             "authors": [
>>>>>>                 2
>>>>>>             ],
>>>>>>             "isbn": 987654321,
>>>>>>             "publication_date": null,
>>>>>>             "title": "Primer on Banking"
>>>>>>         }
>>>>>>     },
>>>>>>     {
>>>>>>         "pk": 3,
>>>>>>         "model": "books.book",
>>>>>>         "fields": {
>>>>>>             "publisher": 2,
>>>>>>             "authors": [
>>>>>>                 1,2
>>>>>>             ],
>>>>>>             "isbn": 543216789,
>>>>>>             "publication_date": null,
>>>>>>             "title": "Frolicking on the Beach"
>>>>>>         }
>>>>>>     }
>>>>>> ]
>>>>>>
>>>>>> On Sunday, March 4, 2012 12:04:08 AM UTC+3, Vincent Bastos wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am having trouble importing data using loaddata from a .json file
>>>>>>> that I created from a dumpdata export. I have a production application
>>>>>>> which runs MySQL on one server and a development machine which runs
>>>>>>> SQLite. I simply executed ./manage.py dumpdata > file.json on the
>>>>>>> production machine, but when I execute ./manage.py loaddata file.json
>>>>>>> I get the error:
>>>>>>>
>>>>>>> ValueError: No JSON object could be decoded
>>>>>>>
>>>>>>> I would appreciate some sort of trouble shooting direction, as I
>>>>>>> could not find anything that would help me in the docs.
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>  --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Django users" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to django-users...@googlegroups.com.
>>>>>> To post to this group, send email to [email protected].
>>>>>> Visit this group at http://groups.google.com/group/django-users?hl=en.
>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>>
>>>>>>
>>>>>>



Attachment: utf16toascii.py
Description: Binary data
