cas-metadata should handle this escaping/unescaping in its
SerDe capabilities.

Kostas, can you provide the exact file that I can test on and upload it to
JIRA?
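
As a rough illustration of the round trip meant here (this is not cas-metadata's code; Python's standard html module stands in for whatever the SerDe layer would use, and the sample value is made up): escape on write, unescape on read, so that every spelling of the single quote comes back as the same character.

```python
import html


def escape_value(value: str) -> str:
    # Escape &, <, > and both quote characters before the value is written out.
    return html.escape(value, quote=True)


def unescape_value(value: str) -> str:
    # Turn named and numeric character references back into plain characters.
    return html.unescape(value)


# Hypothetical metadata value containing the characters under discussion.
original = "patient's sample <A & B> \"raw\""
escaped = escape_value(original)
restored = unescape_value(escaped)

# Three common spellings of the escaped single quote; all decode to "'".
variants = ["&#39;", "&#x27;", "&apos;"]
decoded = [unescape_value(v) for v in variants]
```

If the reader side always unescapes like this, it should not matter which tool produced the escaped form.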

------------------------
Chris Mattmann
chris.mattm...@gmail.com




-----Original Message-----
From: Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
Reply-To: <dev@oodt.apache.org>
Date: Thursday, October 9, 2014 at 2:59 AM
To: "dev@oodt.apache.org" <dev@oodt.apache.org>
Subject: Re: How to ingest files when metadata contain non standard
characters?

>Hi Kos,
>Thanks for the reply.
>
>On Wed, Oct 8, 2014 at 5:16 PM, Konstantinos Mavrommatis <
>kmavromma...@celgene.com> wrote:
>
>> I escaped the characters using the CGI::escapeHTML function from the CGI
>> perl module.
>>
>
>Wow. I am surprised at this one. I wonder if this is a bug which results in
>the discrepancy or if this is intentional behaviour!
>
>
>>
>> The differences between the two versions (mine escaped vs yours escaped)
>> is in the encoding of the single quote "'" character, if I am not
>>mistaken.
>> I want to clarify this because your email came as simple ASCII (not
>>HTML)
>>
>
>Yes that is correct.
>
>
>>
>> I did try your command and it worked !!!
>>
>
>OK grand.
>
>
>>
>> Now the question is how to do this encoding (your version) ☺
>>
>>
>Is this the question? My thoughts would be that this should be
>encapsulated
>within OODT somewhere and that it should not be necessary to escape
>everything as you/we have been doing. This is extremely time consuming and
>painful.
>
>I escaped everything here
>http://www.freeformatter.com/html-escape.html
>
>and compared the strings here
>http://text-compare.com/
>
>The latter resource will verify that it is the single quote that is the
>offending char here.
>Thanks
>Lewis
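
The string comparison described in the quoted message can also be done locally. A minimal sketch, assuming two escaped versions of the same value that differ only in how the single quote is written; both sample strings are invented for illustration and are not the actual output of CGI::escapeHTML or of the site linked above:

```python
from difflib import SequenceMatcher


def differing_spans(a: str, b: str):
    """Return (text_in_a, text_in_b) pairs for each region where a and b differ."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical escaped strings: same text, different encoding of "'".
first = "it&#39;s &lt;ok&gt;"
second = "it&apos;s &lt;ok&gt;"

spans = differing_spans(first, second)
for left, right in spans:
    print(repr(left), "vs", repr(right))
```

Only the region around the quote entity is reported; the escaped angle brackets match in both strings and are skipped.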

