Hi euler,

Thanks for your suggestion. I tried uploading our cleaned CSV into Google 
spreadsheet and, as you said, most of the accent marks are retained. The 
problem is that � represents different special characters depending on 
where it appears: in the name field it can be Š or Å, and in the abstract 
field it represents ". So we can't do a one-step find and replace. 

We ended up identifying all the names with special characters in our 
cleaned-up file (out of 941 records we found 36 with special characters in 
the names, so not too bad). We put the same list of names with the correct 
encoding in a second column and did a find and replace from that list. 
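
For anyone who would rather script this step than do it by hand, here is a 
minimal sketch in Python of the same idea: a two-column list of garbled and 
corrected names, applied only to the name field. The function names, the 
column name passed in, and the "||" separator between multiple names in one 
cell are assumptions for illustration, and the names in the usage note are 
made up, not taken from our data.

```python
import csv
import io


def load_corrections(csv_text):
    """Read a two-column list (garbled name, corrected name) into a dict."""
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def fix_names(rows, name_field, corrections, separator="||"):
    """Return copies of rows with garbled names in name_field replaced.

    Only exact matches from the corrections list are changed, so a
    replacement character that stands for a quotation mark in another
    field (such as the abstract) is left alone.
    """
    fixed = []
    for row in rows:
        new_row = dict(row)
        # Assumption: several names in one cell are joined by `separator`.
        names = new_row[name_field].split(separator)
        new_row[name_field] = separator.join(corrections.get(n, n) for n in names)
        fixed.append(new_row)
    return fixed
```

With a corrections list containing the pair "\ufffdkoda, Jan" and 
"Škoda, Jan", calling fix_names(rows, "dc.contributor.author", corrections) 
would rewrite that author and leave every other column as it was. The 
abstracts would still need a separate pass for the quotation marks.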

A colleague suggested using OpenOffice to open the CSV file right after 
exporting it from DSpace so that the encoding is set correctly (for future 
reference). It's good to know that Google spreadsheet has the same function. 
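
As a rough sketch of what setting the encoding at open time looks like 
outside a spreadsheet program, the Python below decodes the exported bytes 
with an explicit encoding and counts rows that already contain the � 
replacement character. It assumes the export is UTF-8, which we have not 
verified for our installation, and the function names are hypothetical.

```python
import csv
import io


def read_csv_bytes(raw, encoding="utf-8"):
    """Decode raw CSV bytes with an explicit encoding and parse the rows.

    Decoding is strict, so a wrong encoding raises UnicodeDecodeError
    instead of silently producing replacement characters.
    """
    text = raw.decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


def count_damaged(rows):
    """Count rows in which any field contains the U+FFFD replacement character."""
    return sum(
        1 for row in rows
        if any("\ufffd" in (value or "") for value in row.values())
    )
```

Reading the file as bytes (open(path, "rb").read()) right after export and 
checking that count_damaged returns 0 would confirm the encoding is intact 
before any editing starts.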

Thanks again for your help. 
Xiping 

On Thursday, March 10, 2016 at 9:18:07 PM UTC-6, euler wrote:
>
> Hi Xiping,
>
> My suggestion is to import or upload your (cleaned-up) csv file into 
> Google spreadsheets. I tested your sample data and accent marks are 
> retained (not 100% though). See my attached screenshot. All you have to do 
> now is to find and replace all the � characters. After cleaning it up, 
> download as CSV and then import it back to DSpace. Take note that once you 
> have downloaded it as CSV, you should refrain from editing it in MS Excel 
> because, in my experience, it will mess up your encoding again. It may seem 
> a tedious task, but I would rather do it this way than start all over again.
>
> Hope this helps.
>
> Good luck and best regards,
> euler
>
> On Friday, March 11, 2016 at 1:18:05 AM UTC+8, Xiping Liu wrote:
>>
>> Hello everyone, 
>>
>> A few months ago we started a project of cleaning up our electronic 
>> thesis and dissertation records from DSpace. We exported our data from 
>> DSpace as a CSV file, and after the cleanup we are ready to import the data 
>> back into DSpace. But we noticed that some of the names (accent marks and 
>> quotes) in our data are not showing correctly (I am assuming the encoding 
>> was not set correctly at the very beginning, after we exported). Since we 
>> have already done our cleanup in this file, it would be really painful to go 
>> back and re-export the file from DSpace (so we can set the encoding 
>> correctly this time) and redo all the editing. Is there any way we can 
>> correct the encoding after we import the data back into DSpace? Or any 
>> suggestions to solve this problem? I have attached a small sample of our 
>> data. 
>>
>> Your help is greatly appreciated. 
>>
>> Xiping 
>>
>>
>>
