Thanks Kostas. Can you upload it somewhere and then point to it here?
The mailing list strips attachments.

Cheers,
Chris

------------------------
Chris Mattmann
chris.mattm...@gmail.com




-----Original Message-----
From: Konstantinos Mavrommatis <kmavromma...@celgene.com>
Reply-To: <dev@oodt.apache.org>
Date: Thursday, October 9, 2014 at 5:48 AM
To: "dev@oodt.apache.org" <dev@oodt.apache.org>
Subject: RE: How to ingest files when metadata contain non standard
characters?

>Thanks Chris,
>
>Attached is an offending file before escaping.
>For the record, the Perl module HTML::Entities does provide an
>alternative to escapeHTML that produces acceptable files.
>
>Thanks
>K
>
>
>> -----Original Message-----
>> From: Chris Mattmann [mailto:chris.mattm...@gmail.com]
>> Sent: Wednesday, October 08, 2014 11:38 AM
>> To: dev@oodt.apache.org
>> Subject: Re: How to ingest files when metadata contain non standard
>> characters?
>> 
>> cas-metadata should handle this escaping/unescaping in its SerDe
>> capabilities.
>> 
>> Kostas, can you provide the exact file that I can test on and upload it
>> to JIRA?
>> 
>> ------------------------
>> Chris Mattmann
>> chris.mattm...@gmail.com
>> 
>> 
>> 
>> 
>> -----Original Message-----
>> From: Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
>> Reply-To: <dev@oodt.apache.org>
>> Date: Thursday, October 9, 2014 at 2:59 AM
>> To: "dev@oodt.apache.org" <dev@oodt.apache.org>
>> Subject: Re: How to ingest files when metadata contain non standard
>> characters?
>> 
>> >Hi Kos,
>> >Thanks for the reply.
>> >
>> >On Wed, Oct 8, 2014 at 5:16 PM, Konstantinos Mavrommatis <
>> >kmavromma...@celgene.com> wrote:
>> >
>> >> I escaped the characters using the CGI::escapeHTML function from the
>> >> CGI perl module.
>> >>
>> >
>> >Wow. I am surprised at this one. I wonder if this is a bug which
>> >results in the discrepancy or if this is intentional behaviour!
>> >
>> >
>> >>
>> >> The difference between the two versions (mine escaped vs yours
>> >> escaped) is in the encoding of the single quote "'" character, if I
>> >> am not mistaken.
>> >> I want to clarify this because your email came as simple ASCII (not
>> >> HTML)
>> >>
>> >
>> >Yes that is correct.
>> >
>> >
>> >>
>> >> I did try your command and it worked !!!
>> >>
>> >
>> >OK grand.
>> >
>> >
>> >>
>> >> Now the question is how to do this encoding (your version) ☺
>> >>
>> >>
>> >Is this the question? My thoughts would be that this should be
>> >encapsulated within OODT somewhere and that it should not be necessary
>> >to escape everything as you/we have been doing. This is extremely time
>> >consuming and painful.
>> >
>> >I escaped everything here
>> >http://www.freeformatter.com/html-escape.html
>> >
>> >and compared the strings here
>> >http://text-compare.com/
>> >
>> >The latter resource will verify that it is the single quote that is
>> >the offending char here.
>> >Thanks
>> >Lewis
>> 
>
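
The comparison described above (escape the same text two ways, then see
which characters come out differently) can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the sample
string and the helper names escape_named, escape_numeric and
differing_chars are hypothetical, and the two escapers are Python
standard-library stand-ins, not CGI::escapeHTML, HTML::Entities, or
anything in cas-metadata.

```python
import html
from xml.sax.saxutils import escape as xml_escape


def escape_named(text):
    # Stand-in escaper 1 (assumption): named entities for both quote characters.
    return xml_escape(text, {"'": "&apos;", '"': "&quot;"})


def escape_numeric(text):
    # Stand-in escaper 2 (assumption): html.escape writes the single quote
    # as the numeric reference &#x27;.
    return html.escape(text, quote=True)


def differing_chars(text, escape_a, escape_b):
    # Escape each distinct character with both functions and keep the ones
    # whose encodings disagree.
    return sorted({ch for ch in text if escape_a(ch) != escape_b(ch)})


# Hypothetical metadata value containing the usual special characters.
sample = "O'Brien's sample & <tag> \"quoted\""

print(differing_chars(sample, escape_named, escape_numeric))
```

With these two stand-ins, only the single quote is reported, which is
the kind of discrepancy discussed in this thread. Which encoding a given
consumer accepts still has to be checked against that consumer.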

