Hi Chris,

Thanks for your emails from Friday and Saturday. They were extremely helpful, and I have now worked my way through all of my setup issues so far. I was using the new style CAS-PGE keys because Rishi used those in his working example, and at one point I had been trying to mimic that example. I didn't know anything about the legacy mode flag. But after changing the keys to the old style, everything started to come together a little faster.

To start my crawler at the beginning of my pipeline, I chose the approach that Rishi had taken, where the pgeConfig starts the crawler_launcher script directly rather than relying on the definition of the output files to trigger the crawler to run. This was because the met extractor I am using is a perl script, and your DRAT system only shows how to specify a class for the metFileWriterClass attribute. I could not find anything to show me how to specify a perl script as the met file writer instead. Also, having to specify the file types to crawl for by using a regular expression in the config file seemed like an unnecessary duplication of the definition of the file types to crawl for (since the regular expression just specifies a mime-type that is already defined in mime-types.xml). I didn't want to specify the mime-type in two different places.

So, now I've been through the process of setting up my CAS-PGE to ingest the raw science and spacecraft housekeeping telemetry files, and run a postIngestSuccess action to copy some of those files to a new directory for further processing.

The next step in my pipeline may be a little tricky. I need to query the filemgr and pass the results of the query (it could be a long list) as an input parameter to the first algorithm in the pipeline, which happens to be another perl script. Does anyone know how I might do this in a PGE config file?

Thanks!!
Val

Valerie A. Mallder
New Horizons Deputy Mission System Engineer
Johns Hopkins University/Applied Physics Laboratory

> -----Original Message-----
> From: Mattmann, Chris A (3980) [mailto:[email protected]]
> Sent: Friday, October 10, 2014 7:59 PM
> To: [email protected]
> Subject: Re: Failed to build PgeConfig, exception in PathUtils.doDynamicReplacement
>
> Thanks Val, you are close!
>
> Looking at what you show below, I think the issue is one of the following:
>
> 1. You are using the new style CAS-PGE Keys.
> Try using the old ones (e.g., the ones present here:
> http://svn.apache.org/repos/asf/oodt/trunk/pge/src/main/resources/examples/WorkflowTask/tasks.xml
>
> (note the "_" and not the "/" used in keys).
> These keys are used by CAS-PGE if you see something in your wmgr bin script
> (or your resource manager batch_stub script) stating "legacyMode=true").
> If you are using RADIX, I believe that's the case:
>
> http://s.apache.org/hO
>
> 2. Regarding better building environments. The m2e plugin for Eclipse is fantastic
> nowadays and can literally checkout a multi-module Maven project from SVN
> (when paired with Subversive as a plugin or Subclipse). Another thing to check out
> is this page on the wiki:
>
> https://cwiki.apache.org/confluence/display/OODT/OODT+Eclipse+Developer+Help
>
> Let me know if that fixes it. Sorry for all the trouble :) Trust me it will be worth it.
> CAS-PGE is awesome once working.
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: [email protected]
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> -----Original Message-----
> From: <Mallder>, Valerie <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Friday, October 10, 2014 at 3:40 PM
> To: "[email protected]" <[email protected]>
> Subject: Failed to build PgeConfig, exception in PathUtils.doDynamicReplacement
>
> >Hi All,
> >
> >I'm still working on this! Still trying to get a CAS-PGE Task to run.
> >I'm almost there.
> >Now it at least tries to build the PGE config file. But the
> >XmlFilePgeConfigBuilder is failing. It looks like it is picking up a
> >null string somewhere, but it also looks like it is crashing in a
> >strange place. It is unable to read the .xml file that contains my
> ><pgeConfig>...</pgeConfig>, but the correct path and file are shown in
> >the output log messages I've included below, which also show the stack
> >trace. It looks like it is failing during a recursive call to
> >PathUtils.doDynamicReplacement. I saw some chatter about some prior
> >errors in this code in the mailing list archives from 2010 and 2012. So
> >I am hoping someone might remember and be able to tell me if I am doing
> >something that is causing this error. I included my tasks.xml and
> >fei-crawler-pge-config.xml files after the log messages.
> >
> >At this point, I think I need to start running these processes in an
> >environment where I can debug this better. The runtime output and
> >stack traces simply aren't enough for me to track this down. I am a
> >newbie to using Eclipse and mvn, so does anyone have some notes on the
> >best way to import oodt-0.7 into an Eclipse workspace and tell it to
> >build it using mvn? I am hoping I can leverage other people's
> >knowledge of how to do this so that I can do it quickly and not waste
> >another week. I imported the oodt-0.7 directory into Eclipse so I could
> >view the files more easily, but I just made Eclipse link to where the
> >files are located rather than copy them into a workspace folder. If
> >there's a better way, please tell me. And now, how do I tell Eclipse to
> >build stuff?
> >
> >Thanks!!
> >Valerie
> >
> >
> >Using CATALINA_BASE: /homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/tomcat
> >Using CATALINA_HOME: /homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/tomcat
> >Using CATALINA_TMPDIR: /homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/tomcat/temp
> >Using JRE_HOME: /project/jedi/users/jedi-pipeline/jdk1.7.0_55
> >Workflow Manager started PID file (/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/workflow/run/cas.workflow.pid).
> >Oct 10, 2014 5:53:52 PM org.apache.oodt.cas.workflow.system.XmlRpcWorkflowManager loadProperties
> >INFO: Loading Workflow Manager Configuration Properties from: [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/workflow/etc/workflow.properties]
> >Oct 10, 2014 5:53:52 PM org.apache.oodt.cas.workflow.engine.ThreadPoolWorkflowEngineFactory getResmgrUrl
> >INFO: No Resource Manager URL provided or malformed URL: executing jobs locally. URL: [null]
> >Oct 10, 2014 5:53:52 PM org.apache.oodt.cas.workflow.system.XmlRpcWorkflowManager <init>
> >INFO: Workflow Manager started by malldva1
> >Oct 10, 2014 5:54:12 PM org.apache.oodt.cas.workflow.system.XmlRpcWorkflowManager handleEvent
> >INFO: WorkflowManager: Received event: startJediPipeline
> >Oct 10, 2014 5:54:12 PM org.apache.oodt.cas.workflow.system.XmlRpcWorkflowManager handleEvent
> >INFO: WorkflowManager: Workflow jediWorkflowName retrieved for event startJediPipeline
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread checkTaskRequiredMetadata
> >INFO: Task: [feiCrawlerTaskName] has no required metadata fields
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread executeTaskLocally
> >INFO: Executing task: [feiCrawlerTaskName] locally
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >INFO: Converting workflow configuration to static metadata...
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding static metadata: key = [PGETask/WorkflowManagerUrl] value = [http://localhost:9001]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding static metadata: key = [PGETask/Ingest/CrawlerConfigFile] value = [file:/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/crawler/policy/crawler-config.xml]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding static metadata: key = [PGETask/Name] value = [feiCrawlerTaskName]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding static metadata: key = [PGETask/Ingest/ActionIds] value = [MoveFileToLevel0Dir]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding static metadata: key = [PGETask/DumpMetadata] value = [true]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding static metadata: key = [PGETask/Query/ClientTransferServiceFactory] value = [org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding static metadata: key = [PGE_HOME] value = [/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/pge]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding static metadata: key = [PGETask/Query/FileManagerUrl] value = [http://localhost:9000]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding static metadata: key = [PGETask/Ingest/MimeExtractorRepo] value = [file:/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/extensions/policy/mime-extractor-map.xml]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding static metadata: key = [PGETask/ConfigFilePath] value = [file:/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/extensions/config/fei-crawler-pge-config.xml]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding static metadata: key = [PGETask/Ingest/FileManagerUrl] value = [http://localhost:9000]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >INFO: Loading workflow context metadata...
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding dynamic metadata: key = [WorkflowInstId] value = [f8730997-50c7-11e4-b9aa-57625eee7ebd]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding dynamic metadata: key = [JobId] value = [f8730997-50c7-11e4-b9aa-57625eee7ebd]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding dynamic metadata: key = [WorkflowManagerUrl] value = [http://slothrop.jhuapl.edu:9001]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding dynamic metadata: key = [TaskId] value = [urn:oodt:feiCrawlerTaskId]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeMetadata
> >FINEST: Adding dynamic metadata: key = [ProcessingNode] value = [slothrop.jhuapl.edu]
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeConfig
> >INFO: Create PgeConfig...
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance createPgeConfig
> >INFO: Using default PgeConfigBuilder: org.apache.oodt.cas.pge.config.XmlFilePgeConfigBuilder
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.pge.PGETaskInstance run
> >SEVERE: PGETask FAILED!!! : Failed to build PgeConfig : Failed to parse value: null
> >java.io.IOException: Failed to build PgeConfig : Failed to parse value: null
> >    at org.apache.oodt.cas.pge.config.XmlFilePgeConfigBuilder.build(XmlFilePgeConfigBuilder.java:87)
> >    at org.apache.oodt.cas.pge.PGETaskInstance.createPgeConfig(PGETaskInstance.java:230)
> >    at org.apache.oodt.cas.pge.PGETaskInstance.run(PGETaskInstance.java:123)
> >    at org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread.executeTaskLocally(IterativeWorkflowProcessorThread.java:574)
> >    at org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread.run(IterativeWorkflowProcessorThread.java:321)
> >    at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source)
> >    at java.lang.Thread.run(Thread.java:745)
> >Caused by: java.lang.Exception: Failed to parse value: null
> >    at org.apache.oodt.cas.pge.util.XmlHelper.fillIn(XmlHelper.java:501)
> >    at org.apache.oodt.cas.pge.util.XmlHelper.fillIn(XmlHelper.java:480)
> >    at org.apache.oodt.cas.pge.config.XmlFilePgeConfigBuilder.build(XmlFilePgeConfigBuilder.java:77)
> >    ... 6 more
> >Caused by: java.lang.NullPointerException
> >    at java.util.regex.Matcher.getTextLength(Matcher.java:1234)
> >    at java.util.regex.Matcher.reset(Matcher.java:308)
> >    at java.util.regex.Matcher.<init>(Matcher.java:228)
> >    at java.util.regex.Pattern.matcher(Pattern.java:1088)
> >    at org.apache.oodt.cas.metadata.util.PathUtils.doDynamicDateToMillisReplacement(PathUtils.java:321)
> >    at org.apache.oodt.cas.metadata.util.PathUtils.doDynamicReplacement(PathUtils.java:96)
> >    at org.apache.oodt.cas.pge.util.XmlHelper.fillIn(XmlHelper.java:488)
> >    ... 8 more
> >
> >org.apache.oodt.cas.workflow.structs.exceptions.WorkflowTaskInstanceException: PGETask FAILED!!! : Failed to build PgeConfig : Failed to parse value: null
> >    at org.apache.oodt.cas.pge.PGETaskInstance.run(PGETaskInstance.java:150)
> >    at org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread.executeTaskLocally(IterativeWorkflowProcessorThread.java:574)
> >    at org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread.run(IterativeWorkflowProcessorThread.java:321)
> >    at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source)
> >    at java.lang.Thread.run(Thread.java:745)
> >Caused by: java.io.IOException: Failed to build PgeConfig : Failed to parse value: null
> >    at org.apache.oodt.cas.pge.config.XmlFilePgeConfigBuilder.build(XmlFilePgeConfigBuilder.java:87)
> >    at org.apache.oodt.cas.pge.PGETaskInstance.createPgeConfig(PGETaskInstance.java:230)
> >    at org.apache.oodt.cas.pge.PGETaskInstance.run(PGETaskInstance.java:123)
> >    ... 4 more
> >Caused by: java.lang.Exception: Failed to parse value: null
> >    at org.apache.oodt.cas.pge.util.XmlHelper.fillIn(XmlHelper.java:501)
> >    at org.apache.oodt.cas.pge.util.XmlHelper.fillIn(XmlHelper.java:480)
> >    at org.apache.oodt.cas.pge.config.XmlFilePgeConfigBuilder.build(XmlFilePgeConfigBuilder.java:77)
> >    ... 6 more
> >Caused by: java.lang.NullPointerException
> >    at java.util.regex.Matcher.getTextLength(Matcher.java:1234)
> >    at java.util.regex.Matcher.reset(Matcher.java:308)
> >    at java.util.regex.Matcher.<init>(Matcher.java:228)
> >    at java.util.regex.Pattern.matcher(Pattern.java:1088)
> >    at org.apache.oodt.cas.metadata.util.PathUtils.doDynamicDateToMillisReplacement(PathUtils.java:321)
> >    at org.apache.oodt.cas.metadata.util.PathUtils.doDynamicReplacement(PathUtils.java:96)
> >    at org.apache.oodt.cas.pge.util.XmlHelper.fillIn(XmlHelper.java:488)
> >    ... 8 more
> >Oct 10, 2014 5:54:13 PM org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread executeTaskLocally
> >WARNING: Exception executing task: [feiCrawlerTaskName] locally: Message: PGETask FAILED!!! : Failed to build PgeConfig : Failed to parse value: null
> >
> >
> >Content of tasks.xml
> >
> ><cas:tasks xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
> >  <task id="urn:oodt:feiCrawlerTaskId" name="feiCrawlerTaskName" class="org.apache.oodt.cas.pge.StdPGETaskInstance">
> >    <conditions/>
> >
> >    <configuration>
> >      <property name="PGETask/Name" value="feiCrawlerTaskName"/>
> >      <property name="PGETask/ConfigFilePath" value="file:/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/extensions/config/fei-crawler-pge-config.xml"/>
> >      <property name="PGETask/DumpMetadata" value="true"/>
> >      <property name="PGETask/WorkflowManagerUrl" value="http://localhost:9001" />
> >      <property name="PGETask/Query/FileManagerUrl" value="http://localhost:9000" />
> >      <property name="PGETask/Ingest/FileManagerUrl" value="http://localhost:9000"/>
> >
> >      <property name="PGETask/Query/ClientTransferServiceFactory" value="org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory"/>
> >      <property name="PGETask/Ingest/CrawlerConfigFile" value="file:/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/crawler/policy/crawler-config.xml"/>
> >      <property name="PGETask/Ingest/MimeExtractorRepo" value="file:/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/extensions/policy/mime-extractor-map.xml"/>
> >      <property name="PGETask/Ingest/ActionIds" value="MoveFileToLevel0Dir"/>
> >      <property name="PGE_HOME" value="/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/pge"/>
> >    </configuration>
> >
> >    <requiredMetFields/>
> >  </task>
> ></cas:tasks>
> >
> >
> >Contents of fei-crawler-pge-config.xml
> >
> ><pgeConfig>
> >
> >  <!-- How to run the PGE -->
> >  <!-- just echoing the current directory to a file so I can see if this thing ever starts -->
> >  <exe dir="/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/pge/jobs" shell="/bin/sh">
> >    <cmd>echo "Current Working Directory is `pwd`" > /homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/logs/pge.log</cmd>
> >  </exe>
> >
> >  <!-- Files to ingest -->
> >  <output>
> >    <!-- trying this approach. Telling PGE there are output files should invoke the crawler. -->
> >    <dir="/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/data/staging"/>
> >  </output>
> >
> >  <!-- Custom metadata to add to output files -->
> >  <customMetadata>
> >
> >    <!-- each of these directories exist -->
> >    <metadata key="JobDir" val="[OODT_HOME]/data/pge/jobs" envReplace="true"/>
> >    <metadata key="JobInputDir" val="[JobDir]/input"/>
> >    <metadata key="JobOutputDir" val="[JobDir]/output"/>
> >    <metadata key="JobLogDir" val="[JobDir]/logs"/>
> >
> >  </customMetadata>
> ></pgeConfig>
> >
> >Valerie A. Mallder
> >
> >New Horizons Deputy Mission System Engineer
> >The Johns Hopkins University/Applied Physics Laboratory
> >11100 Johns Hopkins Rd (MS 23-282), Laurel, MD 20723
> >240-228-7846 (Office) 410-504-2233 (Blackberry)
> >
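
The innermost cause in the stack trace above is a NullPointerException raised from java.util.regex.Pattern.matcher, which is how the JDK reacts when it is handed a null string. The sketch below only reproduces that JDK behaviour in isolation. The class name, the helper method, and the regular expression are illustrative assumptions and are not taken from OODT's PathUtils; the sample value "[JobDir]/input" is borrowed from the pgeConfig above.

```java
import java.util.regex.Pattern;

public class NullMatcherDemo {

    // Hypothetical helper: reports whether Pattern.matcher rejects the
    // given input with a NullPointerException.
    public static boolean matcherThrowsNpe(String input) {
        // Illustrative pattern for "[Key]" placeholders; not OODT's own regex.
        Pattern placeholder = Pattern.compile("\\[([^\\]]+)\\]");
        try {
            placeholder.matcher(input);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A null value (for example, a lookup that found nothing) fails
        // before any matching is attempted.
        System.out.println(matcherThrowsNpe(null));             // prints true
        // A real string is accepted.
        System.out.println(matcherThrowsNpe("[JobDir]/input")); // prints false
    }
}
```

Under these assumptions, a "Failed to parse value: null" message suggests that some value reached the replacement step as null, which fits with a configuration key not being found under the name CAS-PGE expected.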
