Thanks Richard. It looks like this is what I had better do!
Lucy

> On 30 Jan 2016, at 20:21, Richard Jennings <[email protected]> wrote:
> 
> Hi Lucy,
> 
> I just wanted to say that I agree with Koen in that I use OpenOffice to 
> manipulate my authority documents, resource graphs and in preparing my 
> .arches files and find that it works very well in terms of handling issues 
> such as yours. I can't recommend it highly enough after having similar 
> problems with Excel.
> 
> Best wishes,
> 
> Richard
> 
> 
> 
>> On Monday, January 25, 2016 at 9:39:35 AM UTC, Koen Van Daele wrote:
>> Hi Lucy,
>> 
>> 
>> character encodings are one of those nasty issues in computing that nobody 
>> likes tackling. If you want a detailed, yet fairly easy to follow analysis 
>> on why that is, see http://www.joelonsoftware.com/articles/Unicode.html 
>> (Cthulhu is waiting for you there though...)
>> 
>> Basically, what Arches does (expecting UTF-8) is the best thing possible.
>> That way most human languages can be integrated in Arches, and all you need
>> to do is make sure your data is UTF-8. Unfortunately Excel makes that bloody
>> impossible. I think Excel saves that file in the ISO-8859-1 encoding. That
>> encoding just doesn't know the characters you're trying to save (ISO-8859-1
>> only contains 191 characters). So, it's not just Arches. I can't read them
>> either. (Excel should be telling you when saving as CSV that you will lose
>> information.) It still wouldn't work, since your csv file already contains
>> illegal ISO-8859-1 characters.
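
The mismatch described above can be reproduced in a few lines. This is only a minimal sketch in Python 3 (the traceback further down this thread comes from Python 2.7), and it assumes the spreadsheet wrote single-byte ISO-8859-1 text, which is a guess about Excel's output rather than something confirmed in this thread. The variable names are illustrative.

```python
# The alternative name reported as failing in this thread.
name = "Ptolemaîos Philadelphos"

# Assumption: the file was saved in a single-byte encoding (ISO-8859-1).
latin1_bytes = name.encode("iso-8859-1")
utf8_bytes = name.encode("utf-8")

# In ISO-8859-1 the "î" is one byte; in UTF-8 it takes two bytes.
print(len(latin1_bytes), len(utf8_bytes))

# Decoding the ISO-8859-1 bytes as UTF-8 fails with a UnicodeDecodeError,
# the same kind of error the loader reported.
decode_error = None
try:
    latin1_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    decode_error = exc
    print("not valid UTF-8:", exc)

# Bytes that really are UTF-8 decode back to the original text.
print(utf8_bytes.decode("utf-8") == name)
```

The point of the sketch is that the text itself is fine; the error comes from reading bytes with a different encoding than the one used to write them.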
>> 
>> And it's not just Excel, the whole Windows ecosystem is fundamentally flawed 
>> in that regard. I myself run Linux where character encoding is handled 
>> correctly and UTF-8 is the default. No idea how they do it on a Mac.
>> 
>> 
>> So, I think using OpenOffice is your best bet. Or just open the csv file you 
>> have in Notepad++ (or similar text editor), save the file as UTF-8 and fix 
>> the problems manually. But then you'd have to do that every time you want to 
>> change something.
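
If the manual "save as UTF-8" step has to be repeated often, it could be scripted. The following is a rough sketch, not part of Arches: the function name `reencode_to_utf8` is hypothetical, and the default source encoding `cp1252` is an assumption about what the spreadsheet produced. If that guess is wrong, the output will contain the wrong characters without any error being raised.

```python
def reencode_to_utf8(src, dst, source_encoding="cp1252"):
    """Read src using source_encoding and write the same text to dst as UTF-8.

    newline="" keeps the original line endings untouched, which matters
    for CSV files.
    """
    with open(src, "r", encoding=source_encoding, newline="") as infile:
        text = infile.read()
    with open(dst, "w", encoding="utf-8", newline="") as outfile:
        outfile.write(text)
    return dst
```

It would be called with the exported file and a new output name, for example `reencode_to_utf8("export.csv", "export_utf8.csv")`, where both file names are placeholders.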
>> 
>> 
>> Cheers,
>> 
>> Koen
>> 
>> From: [email protected] <[email protected]> on behalf of Lucy FJ 
>> <[email protected]>
>> Sent: Sunday 24 January 2016 12:28
>> To: Arches Project
>> Subject: Re: [Arches] Diacriticals in authority and .Arches files problems
>>  
>> Hi Koen,
>> 
>> Thank you for this information. I did try out some of the suggestions on
>> Google for using Excel to create UTF-8 files, because I like using Excel and
>> know it well, but they are overcomplicated and produce a CSV file in
>> UTF-8-BOM format, which I believe will not work in Arches. It looks like I
>> will need to download OpenOffice as you suggest. Must all files loaded into
>> Arches be UTF-8 only?
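
On the BOM question: the traceback quoted further down passes through Python's `utf_8_sig` codec, which skips a leading byte order mark. Whether every Arches loader tolerates a BOM is not confirmed in this thread, so treat the following only as a sketch of how that codec behaves in Python 3. The sample label is borrowed from Alexei's test rows.

```python
import codecs

label = "San José"

# The same text as UTF-8 bytes, with and without a leading BOM.
with_bom = codecs.BOM_UTF8 + label.encode("utf-8")
without_bom = label.encode("utf-8")

# "utf-8-sig" drops a leading BOM if present and otherwise acts like UTF-8.
decoded_with_bom = with_bom.decode("utf-8-sig")
decoded_without_bom = without_bom.decode("utf-8-sig")

# Plain "utf-8" keeps the BOM as an invisible U+FEFF character at the start.
decoded_plain = with_bom.decode("utf-8")

print(decoded_with_bom == decoded_without_bom == label)
print(decoded_plain.startswith("\ufeff"))
```

A reader that decodes with plain UTF-8 would see the extra U+FEFF glued to the first column name, which is the usual way a BOM causes trouble in CSV headers.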
>> 
>> Lucy
>> 
>>> On Friday, January 22, 2016 at 4:24:42 PM UTC+2, Koen Van Daele wrote:
>>> Hi Lucy,
>>> 
>>> 
>>> as far as I know Excel (all versions) is notoriously bad at handling 
>>> things like character encodings. This rather old Stack Overflow question 
>>> seems to confirm that:
>>> 
>>> http://stackoverflow.com/questions/4221176/excel-to-csv-with-utf8-encoding 
>>> It does offer some workarounds, but none of them are very nice.
>>> 
>>> I would suggest writing your CSV files with LibreOffice/OpenOffice. You 
>>> should be able to install it and it's free. While it's not always an exact 
>>> replacement for Excel, when it comes to character encodings, it just works. 
>>> By default it will save things as UTF-8 (at least under Linux it does) and 
>>> it will ask you if you want to save in a different encoding.
>>> 
>>> 
>>> Cheers,
>>> 
>>> Koen
>>> 
>>> 
>>> On Friday 22 January 2016 15:05:52 UTC+1, Lucy FJ wrote:
>>>> 
>>>> Hi Adam and Alexei,
>>>>  
>>>> I forgot to add that the diacriticals are in the altnames at rows 132 to 
>>>> 136 when editing in Excel.
>>>>  
>>>> Lucy
>>>> ----- Original Message -----
>>>> From: Adam Cox
>>>> To: Lucy Fletcher-Jones
>>>> Cc: Alexei Peters ; Arches Project
>>>> Sent: Thursday, January 21, 2016 5:36 PM
>>>> Subject: Re: [Arches] Diacriticals in authority and .Arches files problems
>>>> 
>>>> Hi Lucy, you can check the encoding in Notepad++. Open your authority 
>>>> document with that program, and click the Encoding menu. Your file should 
>>>> be in "UTF-8" or "UTF-8 without BOM" (depends on the version of Notepad++ 
>>>> you have). The î character should work as far as I know...
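
The same check can be done without an editor. The sketch below is an illustration only: `find_non_utf8_lines` is a hypothetical helper, not an Arches function. It reads the file as raw bytes and reports which lines fail to decode as UTF-8, which helps locate the offending row before loading.

```python
def find_non_utf8_lines(path):
    """Return (line_number, error_message) pairs for lines that are not valid UTF-8."""
    problems = []
    # Open in binary mode so no decoding happens until we ask for it.
    with open(path, "rb") as handle:
        for line_number, raw_line in enumerate(handle, start=1):
            try:
                raw_line.decode("utf-8")
            except UnicodeDecodeError as exc:
                problems.append((line_number, str(exc)))
    return problems
```

An empty result means every line decodes as UTF-8. It does not prove the file was saved as UTF-8 on purpose, since a file containing only plain ASCII passes the check whatever encoding was selected.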
>>>> 
>>>>> On Thu, Jan 21, 2016 at 7:18 AM, 'Lucy Fletcher-Jones' via Arches Project 
>>>>> <[email protected]> wrote:
>>>>> Hi Alexei,
>>>>>  
>>>>> Thank you for looking into this. I am glad to hear that Arches should 
>>>>> support diacriticals.
>>>>>  
>>>>> Here is the error message on loading the 'Ruler' Authority document:
>>>>>  
>>>>> RULER_AUTHORITY_DOCUMENT.csv
>>>>>  
>>>>> ERRORS IN FILE: RULER_AUTHORITY_DOCUMENT.values.csv
>>>>>  
>>>>> ERRORS IN FILE: RULER_AUTHORITY_DOCUMENT.csv
>>>>>  
>>>>> ERROR: Make sure the file is saved with UTF-8 encoding
>>>>> 'utf8' codec can't decode byte 0xea in position 30: invalid continuation byte
>>>>> Traceback (most recent call last):
>>>>>   File "/opt/projects/ENV/lib/python2.7/site-packages/arches/management/commands/package_utils/authority_files.py", line 112, in load_authority_file
>>>>>     for row in rows:
>>>>>   File "/opt/projects/ENV/lib/python2.7/site-packages/unicodecsv/py2.py", line 217, in next
>>>>>     row = csv.DictReader.next(self)
>>>>>   File "/usr/local/lib/python2.7/csv.py", line 104, in next
>>>>>     row = self.reader.next()
>>>>>   File "/opt/projects/ENV/lib/python2.7/site-packages/unicodecsv/py2.py", line 128, in next
>>>>>     for value in row]
>>>>>   File "/opt/projects/ENV/lib/python2.7/encodings/utf_8_sig.py", line 22, in decode
>>>>>     (output, consumed) = codecs.utf_8_decode(input, errors, True)
>>>>> UnicodeDecodeError: 'utf8' codec can't decode byte 0xea in position 30: invalid continuation byte
>>>>>  
>>>>> ERROR in row 31 (Legacyoid (RULER_UID:30) not found.  Make sure your 
>>>>> ParentConceptid in the  
>>>>>  
>>>>> This caused further errors in the Ruler Values files as can be seen from 
>>>>> above.
>>>>> I do not have a copy of the authority file that caused the error as I have 
>>>>> since corrected it and changed it in a few places. But the alternative 
>>>>> name was
>>>>>  
>>>>> Ptolemaîos Philadelphos
>>>>>  
>>>>> and I believe it was the circumflex above the 'i' that caused the 
>>>>> problem. Certainly when I removed the circumflex, the file loaded OK.
>>>>>  
>>>>> Thank you,
>>>>> Lucy
>>>>>  
>>>>>  
>>>>> ----- Original Message -----
>>>>> From: Alexei Peters
>>>>> To: Lucy FJ
>>>>> Cc: Arches Project
>>>>> Sent: Wednesday, January 20, 2016 8:24 PM
>>>>> Subject: Re: [Arches] Diacriticals in authority and .Arches files problems
>>>>> 
>>>>> Hi Lucy,
>>>>> The .arches file should support diacritics.  I'm actually surprised that 
>>>>> the authority files don't.  I just tested a local file and I was able to 
>>>>> add these records:
>>>>> 
>>>>> conceptid,PrefLabel,AltLabels,ParentConceptid,ConceptType,Provider
>>>>> 20000001-0000-0000-0000-000000000000,Portland,,CITY_AUTHORITY_DOCUMENT.csv,Index,GCI
>>>>> 20000002-0000-0000-0000-000000000000,San Francisco,The Bay Area,CITY_AUTHORITY_DOCUMENT.csv,Index,GCI
>>>>> 20000003-0000-0000-0000-000000000000,San Jose,San José,CITY_AUTHORITY_DOCUMENT.csv,Index,GCI
>>>>> 
>>>>> Notice that the alt label for San Jose is San José.
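
To see that rows like these parse cleanly when the bytes really are UTF-8, here is a small Python 3 sketch using only the standard library. It is not the Arches loader (which, judging by the traceback in this thread, used `unicodecsv` on Python 2.7); the in-memory sample simply stands in for a file on disk.

```python
import csv
import io

# The sample rows from the message above, encoded as UTF-8 bytes.
sample = (
    "conceptid,PrefLabel,AltLabels,ParentConceptid,ConceptType,Provider\n"
    "20000001-0000-0000-0000-000000000000,Portland,,CITY_AUTHORITY_DOCUMENT.csv,Index,GCI\n"
    "20000002-0000-0000-0000-000000000000,San Francisco,The Bay Area,CITY_AUTHORITY_DOCUMENT.csv,Index,GCI\n"
    "20000003-0000-0000-0000-000000000000,San Jose,San José,CITY_AUTHORITY_DOCUMENT.csv,Index,GCI\n"
).encode("utf-8")

# Decode the bytes as UTF-8 while reading, then let csv split the columns.
text_stream = io.TextIOWrapper(io.BytesIO(sample), encoding="utf-8", newline="")
rows = list(csv.DictReader(text_stream))

for row in rows:
    print(row["PrefLabel"], "->", row["AltLabels"] or "(none)")
```

With a real file, the `io.BytesIO(sample)` part would be replaced by `open(path, "rb")`, or the file could be opened directly in text mode with `encoding="utf-8"` and `newline=""`.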
>>>>> 
>>>>> Can you share the authority file that you're having trouble with?
>>>>> Cheers,
>>>>> Alexei
>>>>> 
>>>>> 
>>>>> Director of Web Development - Farallon Geographics, Inc. - 971.227.3173
>>>>> 
>>>>>> On Wed, Jan 20, 2016 at 12:32 AM, Lucy FJ <[email protected]> wrote:
>>>>>> Hi all,
>>>>>> We have been loading customised authority files and have noticed that 
>>>>>> Arches rejects words with diacriticals (accents etc). This is not a 
>>>>>> problem for us as we were happy to remove them, and if we really want 
>>>>>> them we can enter them through the RDM. But will this problem occur when 
>>>>>> loading resource data through .arches? We need to input place names as 
>>>>>> alternative names using diacriticals and it would be much easier if we 
>>>>>> can do this via .arches files. We know we can input them using the 
>>>>>> resource data manager but obviously when dealing with about 3000 
>>>>>> entries, this is time consuming.
>>>>>> Any ideas?
>>>>>> Lucy
>>>>>> 
>>>>>> --
>>>>>> -- To post, send email to [email protected]. To unsubscribe, 
>>>>>> send email to [email protected]. For more information, 
>>>>>> visit https://groups.google.com/d/forum/archesproject?hl=en
>>>>>> ---
>>>>>> You received this message because you are subscribed to the Google 
>>>>>> Groups "Arches Project" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>>> an email to [email protected].
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>> 
>>>> 
>> 
> 

