How to ingest files when metadata contain non standard characters?

2014-10-08 Thread Konstantinos Mavrommatis
Hi, I am trying to ingest a large number of files. The metadata for these files exist in .met files. Many of the metadata fields contain characters like '$' etc. Running the crawler on these metadata results in failure. When I try to escape the characters using HTML encode e.g. '>' becomes &gt; etc

Re: How to ingest files when metadata contain non standard characters?

2014-10-08 Thread Lewis John Mcgibbney
Hi Kos, I take you up on your challenge ;) However I don't know if this will fix it. On Tue, Oct 7, 2014 at 11:31 PM, Konstantinos Mavrommatis kmavromma...@celgene.com wrote: <val>sailfish quant --index /reference/v1/Homo-sapiens/GRCh37.p12/SailFishIndex --libtype 'T=PE:O=:S=AS' -1 (gunzip -c

what is batch stub? Is it necessary?

2014-10-08 Thread Mallder, Valerie
Hello, I am still having trouble getting my CAS PGE crawler task to run due to http://localhost:2001 being down. I have spent the last 2 days tracing through the resource manager code and tracked this down to line 146 of LRUScheduler where the XmlRpcBatchMgr is failing to execute the task

RE: what is batch stub? Is it necessary?

2014-10-08 Thread Mallder, Valerie
Well then, I'm proud to be a member :) (I think ) Valerie A. Mallder New Horizons Deputy Mission System Engineer Johns Hopkins University/Applied Physics Laboratory -Original Message- From: Bruce Barkstrom [mailto:brbarkst...@gmail.com] Sent: Wednesday, October 08, 2014 4:54

Re: what is batch stub? Is it necessary?

2014-10-08 Thread Ramirez, Paul M (398J)
Valerie, I would have thought it would have just not used a batch stub by default. That said if you go into the $OODT_HOME/resmgr/bin there should be a script to start a batch stub. Right now on my phone I forget the name of the script but if you more the file you will see the Java class name

RE: what is batch stub? Is it necessary?

2014-10-08 Thread Mallder, Valerie
Hi Paul, Thank you for replying. I found the batch_stub script (Duh!). I was so busy looking at java code that I never looked in the bin dir. So, I ran batch_stub with the expected port number, and I made a little more progress, batch_stub actually received the job! The learn by example

Re: HttpClient NoClassDefFoundError For the url-downloader Script of the Apache OODT Crawler

2014-10-08 Thread Lewis John Mcgibbney
Dynamite Angela. Thanks for your persistence. On Sat, Oct 4, 2014 at 8:22 PM, MengYing Wang mengyingwa...@gmail.com wrote: Dear Lewis, Done. This is the jira url: https://issues.apache.org/jira/browse/OODT-756 Best, Mengying Wang On Tue, Sep 30, 2014 at 6:59 PM, Lewis John Mcgibbney

Re: what is batch stub? Is it necessary?

2014-10-08 Thread Verma, Rishi (398J)
Hi Valerie, All I am trying to do is run crawler_launcher as a workflow task in the CAS PGE environment. Interesting. I have a working example here [1] you can look at that does this exact thing. So, if batchstub is necessary in this scenario, please tell me what it is, why it is

RE: what is batch stub? Is it necessary?

2014-10-08 Thread Mallder, Valerie
Hi Rishi, Thank you very much for pointing me to your working example. This is very helpful. My pgeConfig looks very similar to yours. So, I commented out the resource manager like you suggested and tried running again without the resource manager. And my problem still exists. The problem is

RE: How to ingest files when metadata contain non standard characters?

2014-10-08 Thread Konstantinos Mavrommatis
Hi Lewis, I escaped the characters using the CGI::escapeHTML function from the CGI perl module. The difference between the two versions (mine escaped vs yours escaped) is in the encoding of the single quote ' character, if I am not mistaken. I want to clarify this because your email came as

Re: what is batch stub? Is it necessary?

2014-10-08 Thread Bruce Barkstrom
Take courage. It's essential. Believe it or not you have a new persona. Welcome to your new community. On Wed, Oct 8, 2014 at 5:03 PM, Mallder, Valerie valerie.mall...@jhuapl.edu wrote: Well then, I'm proud to be a member :) (I think ) Valerie A. Mallder New Horizons Deputy Mission

Re: what is batch stub? Is it necessary?

2014-10-08 Thread Verma, Rishi (398J)
Hi Val, Yep - here’s a link to the tasks.xml file: https://github.com/riverma/xdata-jpl-netscan/blob/master/oodt-netscan/workflow/src/main/resources/policy/tasks.xml The problem is that the ExternScriptTaskInstance is unable to recognize the command line arguments that I want to pass to the

Re: what is batch stub? Is it necessary?

2014-10-08 Thread Lewis John Mcgibbney
Folks, Is it possible to create a parent issue for defining XSDs for all of the XML files we need in OODT? I do not know them all, but from this thread alone, it is clear that we could do with setting some kind of restrictions on what can be included within task and configuration XML within OODT.

A Suggestion on Developing Documentation Based on the History of Experimentation

2014-10-08 Thread Bruce Barkstrom
During the last month, I managed to get a fairly difficult installation task to work on software I felt I had a critical need for. I've attached the documentation I wrote as I went through the experience describing what I had to do. I think we often denigrate writing documentation at the level

Re: How to ingest files when metadata contain non standard characters?

2014-10-08 Thread Lewis John Mcgibbney
Hi Kos, Thanks for the reply. On Wed, Oct 8, 2014 at 5:16 PM, Konstantinos Mavrommatis kmavromma...@celgene.com wrote: I escaped the characters using the CGI::escapeHTML function from the CGI perl module. Wow. I am surprised at this one. I wonder if this is a bug which results in the discrepancy

Re: How to ingest files when metadata contain non standard characters?

2014-10-08 Thread Lewis John Mcgibbney
In addition, if you can get to the bottom of what you think the intended behaviour is here, please feel free to log a ticket in Jira https://issues.apache.org/jira/browse/OODT/?selectedTab=com.atlassian.jira.jira-projects-plugin:issues-panel On Wed, Oct 8, 2014 at 5:59 PM, Lewis John Mcgibbney

Re: what is batch stub? Is it necessary?

2014-10-08 Thread Ramirez, Paul M (398J)
+1 billion --Paul Sent from my iPhone On Oct 8, 2014, at 5:55 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Folks, Is it possible to create a parent issue for defining XSDs for all of the XML files we need in OODT? I do not know them all, but from this thread alone, it is

Re: HttpClient NoClassDefFoundError For the url-downloader Script of the Apache OODT Crawler

2014-10-08 Thread Chris Mattmann
Yep woot! Committed! Chris Mattmann chris.mattm...@gmail.com -Original Message- From: Lewis John Mcgibbney lewis.mcgibb...@gmail.com Reply-To: dev@oodt.apache.org Date: Thursday, October 9, 2014 at 12:01 AM To: MengYing Wang mengyingwa...@gmail.com Cc: Chris

Re: How to ingest files when metadata contain non standard characters?

2014-10-08 Thread Chris Mattmann
cas-metadata should handle this escaping/unescaping in its SerDe capabilities. Kostas, can you provide the exact file that I can test on and upload it to JIRA? Chris Mattmann chris.mattm...@gmail.com -Original Message- From: Lewis John Mcgibbney

RE: How to ingest files when metadata contain non standard characters?

2014-10-08 Thread Konstantinos Mavrommatis
Thanks Chris, attached is an offending file before escape. For the record perl module HTML::Entities does provide an escapeHTML alternative that produces acceptable files. Thanks K -Original Message- From: Chris Mattmann [mailto:chris.mattm...@gmail.com] Sent: Wednesday, October

Re: what is batch stub? Is it necessary?

2014-10-08 Thread Chris Mattmann
Hi Val, I don't think you need to run a CAS-PGE task to call crawler_launcher. If you define blocks in the <output>..</output> section of the XML file, a crawler will be forked in the job working directory of CAS-PGE and crawl your specified output. I believe that will accomplish the same goal of

Re: what is batch stub? Is it necessary?

2014-10-08 Thread Chris Mattmann
ack, git push Chris Mattmann chris.mattm...@gmail.com -Original Message- From: Ramirez, Paul M (398J) paul.m.rami...@jpl.nasa.gov Reply-To: dev@oodt.apache.org Date: Thursday, October 9, 2014 at 4:37 AM To: dev@oodt.apache.org dev@oodt.apache.org Subject: Re:

Re: How to ingest files when metadata contain non standard characters?

2014-10-08 Thread Chris Mattmann
Thanks Kostas. Can you upload it somewhere and then point to it here? The mailing list strips attachments. Cheers, Chris Chris Mattmann chris.mattm...@gmail.com -Original Message- From: Konstantinos Mavrommatis kmavromma...@celgene.com Reply-To: dev@oodt.apache.org

Re: what is batch stub? Is it necessary?

2014-10-08 Thread Cameron Goodale
Valerie, This could be nothing, or it could be the root cause...your output XML tags are malformed. <!-- Files to ingest --> <output/> </output> Should be: <!-- Files to ingest --> <output> </output> No trailing slash in the opening tag. It might be failing since it cannot parse the XML
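
One way to test the suggestion above is to run the configuration through an XML parser before handing it to the workflow. The sketch below uses only Python's standard library. The two strings are illustrative fragments modeled on the tags quoted in the message, and the enclosing pgeConfig root element is an assumption made so that each fragment is a complete document; neither is the actual file from the thread.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments; the pgeConfig wrapper is assumed for illustration.
malformed = "<pgeConfig><!-- Files to ingest --><output/></output></pgeConfig>"
fixed = "<pgeConfig><!-- Files to ingest --><output></output></pgeConfig>"

print(is_well_formed(malformed))
print(is_well_formed(fixed))
```

In the first fragment the self-closing tag is followed by a stray closing tag, which the parser reports as a mismatch; the second fragment parses cleanly.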

RE: How to ingest files when metadata contain non standard characters?

2014-10-08 Thread Konstantinos Mavrommatis
Here is the offending file before escape: <cas:metadata xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas"> <keyval> <key>derived_from</key> <val>/gpfs/celgene/reference/v1/Homo-sapiens/GRCh37.p12/SailFishIndex</val>
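
For readers hitting the same problem, the following is a minimal sketch of writing a .met file with XML-escaped values, using Python's standard library. It is not how cas-metadata serializes metadata internally; the helper name build_met and the sample command string are hypothetical, and the element layout (keyval, key, val) follows the file quoted above. Only &, < and > need escaping inside element text; characters such as $ and the single quote can stay as they are.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Namespace as it appears in the quoted .met file.
CAS_NS = "http://oodt.jpl.nasa.gov/1.0/cas"


def build_met(metadata):
    """Build a cas:metadata document from a dict of key -> list of values.

    Hypothetical helper: escapes &, < and > in keys and values so the
    result is well-formed XML.
    """
    lines = ['<cas:metadata xmlns:cas="%s">' % CAS_NS]
    for key, values in metadata.items():
        lines.append("  <keyval>")
        lines.append("    <key>%s</key>" % escape(key))
        for value in values:
            lines.append("    <val>%s</val>" % escape(value))
        lines.append("  </keyval>")
    lines.append("</cas:metadata>")
    return "\n".join(lines)


# Made-up value containing characters that break unescaped XML.
command = "tool --opt 'A=<B>&C' --cost $5 -1 <(gunzip -c reads.fq.gz)"
met_text = build_met({"command": [command]})

# Round trip: the parser should hand back the original string.
parsed = ET.fromstring(met_text)
print(parsed.find("keyval/val").text == command)
```

Parsing the generated text and comparing the value against the original string is a cheap check that the escaping is lossless before running the crawler on a large batch.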