Hi Darcy, this may not be the most elegant answer, but with the basic
testing I can do now I believe it will work.

Add the following function to the top of the file in question
<https://github.com/archesproject/arches/blob/stable/3.x/arches/management/commands/packages.py#L352>,
right under the import statements:

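def encode_utf8(val):
    # unicode objects encode directly; anything without a usable .encode()
    # (integers, None, ...) is converted with str() first.
    try:
        utf8 = val.encode('utf-8')
    except (AttributeError, UnicodeError):
        utf8 = str(val).encode('utf-8')
    return utf8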

Then, change this line

csvwriter.writerow({k: str(v).encode('utf8') for k, v in csv_record.items()})

to

csvwriter.writerow({k: encode_utf8(v) for k, v in csv_record.items()})

Let me know if you have any trouble with that. It will probably be easiest to
just modify the file in your virtual environment.
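
For reference, here is a minimal sketch of the same idea using explicit type
checks instead of try/except. The names to_utf8_bytes and fails_as_ascii and
the sample record are hypothetical, made up for illustration; none of them
are part of Arches. The explicit encode('ascii') call only mirrors the
implicit ASCII step that str() performs on a unicode object in Python 2.7.

```python
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str      # Python 3 (only so the sketch runs there too)


def to_utf8_bytes(val):
    """Hypothetical variant of encode_utf8 with explicit type checks."""
    if isinstance(val, bytes):
        # Already a byte string (a plain str on Python 2): leave it untouched.
        return val
    if not isinstance(val, text_type):
        # Integers, None, etc.: convert to text first.
        val = text_type(val)
    return val.encode('utf-8')


def fails_as_ascii(val):
    """Return True if a text value cannot be encoded as ASCII."""
    try:
        val.encode('ascii')
    except UnicodeEncodeError:
        return True
    return False


# Made-up record containing an en dash (u'\u2013') and an integer.
record = {'notes': u'1900\u20131950', 'count': 3}
encoded = {k: to_utf8_bytes(v) for k, v in record.items()}

print(fails_as_ascii(record['notes']))
print(encoded)
```

This only matters for the Python 2.7 csv module, which expects byte strings;
on Python 3 the csv module takes text directly, so no manual encoding would
be needed there.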

Adam


On Mon, Sep 18, 2017 at 8:04 PM, Darcy Christ <da...@1000camels.com> wrote:

> Hi Adam,
>
> Still not sure how to get past this. I see it is an n-dash (I should have
> looked this up, rather than assumed it was related to Chinese).
>
> The question for me is why is this code assuming all ascii since it
> allowed an ndash into the database?
>
> Is there a way to fix this code, rather than update the content?
>
>
> Darcy
>
>
> On Saturday, September 16, 2017 at 12:59:44 AM UTC+10, Adam Cox wrote:
>>
>> Darcy, it looks like this is not a Chinese character, but a long dash.
>>
>> This issue seems to be well summed up in this Stack Overflow answer:
>> https://stackoverflow.com/a/5387966/3873885
>>
>> Essentially, in this line
>>
>> csvwriter.writerow({k: str(v).encode('utf8') for k, v in
>> csv_record.items()})
>>
>> the str() operation is encoding v (which in this case is a unicode
>> object) to ASCII, the default encoding for a str object in Python 2.7.
>> Then, that ASCII-encoded string is further encoded into UTF-8. I assume the
>> initial str() operation is meant to handle integers and other non-text
>> objects, but you've found an example where, because u'\u2013' (unicode
>> character 2013
>> <http://www.fileformat.info/info/unicode/char/2013/index.htm>) cannot be
>> encoded in ASCII, it hits an error even before it has a chance to encode to
>> UTF-8.
>> So, I think that line in the code could be improved.
>>
>> It looks like that line comes from the related resource export
>> <https://github.com/archesproject/arches/blob/stable/3.x/arches/management/commands/packages.py#L352>
>> process. Maybe you have a long dash in one of the notes that you have about
>> a resource to resource relationship? Otherwise, I'm really not sure where
>> that problem character would come from...
>>
>> Hope that's at least a little helpful.
>>
>> Adam
>>
>> On Thu, Sep 14, 2017 at 8:01 PM, Darcy Christ <da...@1000camels.com>
>> wrote:
>>
>>> I am having trouble exporting data from v3
>>>
>>> I have added this to my config:
>>>
>>> EXPORT_CONFIG = os.path.normpath(os.path.join(PACKAGE_ROOT,
>>> 'source_data', 'business_data', 'resource_export_mappings.json'))
>>>
>>>
>>> And then I get an error while trying to export. Given that it is related
>>> to encoding, could it be any Chinese characters I might have in the data?
>>>
>>>
>>> (hkarches) [hkarches@heritage hongkong]$ python manage.py packages -o
>>> export_resources -d '../hongkong_data'
>>> operation: export_resources
>>> package: hongkong
>>> Writing 3 ACTIVITY.E7 resources
>>> Writing 370 INFORMATION_RESOURCE.E73 resources
>>> Writing 1205 HERITAGE_RESOURCE.E18 resources
>>> Writing 545 ACTOR.E39 resources
>>> Writing 6 HISTORICAL_EVENT.E5 resources
>>> Writing 0 HERITAGE_RESOURCE_GROUP.E27 resources
>>> Traceback (most recent call last):
>>>   File "manage.py", line 28, in <module>
>>>     execute_from_command_line(sys.argv)
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/django/core/management/__init__.py",
>>> line 399, in execute_from_command_line
>>>     utility.execute()
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/django/core/management/__init__.py",
>>> line 392, in execute
>>>     self.fetch_command(subcommand).run_from_argv(self.argv)
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/django/core/management/base.py",
>>> line 242, in run_from_argv
>>>     self.execute(*args, **options.__dict__)
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/django/core/management/base.py",
>>> line 285, in execute
>>>     output = self.handle(*args, **options)
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/arches/management/commands/packages.py",
>>> line 106, in handle
>>>     self.export_resources(package_name, options['dest_dir'])
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/arches/management/commands/packages.py",
>>> line 351, in export_resources
>>>     csvwriter.writerow({k: str(v).encode('utf8') for k, v in
>>> csv_record.items()})
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/arches/management/commands/packages.py",
>>> line 351, in <dictcomp>
>>>     csvwriter.writerow({k: str(v).encode('utf8') for k, v in
>>> csv_record.items()})
>>> UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in
>>> position 220: ordinal not in range(128)
>>>
>>> --
>>> -- To post, send email to arches...@googlegroups.com. To unsubscribe,
>>> send email to archesprojec...@googlegroups.com. For more information,
>>> visit https://groups.google.com/d/forum/archesproject?hl=en
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "Arches Project" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to archesprojec...@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
