Hi Chris, I got a bunch of error messages when running the crawler_launcher script. First off, I think I need to understand how a crawler works. Can I get some materials to help me write configuration files for crawler_launcher?
Honestly I am not familiar with Crawler. But I will try to file a JIRA issue to update the Crawler user guide.

Thanks,
Yunhee

2012/8/9 Mattmann, Chris A (388J) <[email protected]>:
> Hi YunHee,
>
> Sorry, we need to update the docs, that is for sure. Can you help
> us remember by filing a JIRA issue to update the Crawler user
> guide and to fix the URL there?
>
> As for crawlerId, yes it's obsolete, you can find the modern
> 0.4 and 0.5-trunk options by running ./crawler_launcher -h
>
> Cheers,
> Chris
>
> On Aug 7, 2012, at 7:03 AM, YunHee Kang wrote:
>
>> Hi Chris and Sheryl,
>>
>> I understood my mistake after fixing the wrong URL with the "/".
>> But the same wrong URL is used as an option of
>> crawler_launcher on the Apache OODT
>> homepage (http://oodt.apache.org/components/maven/crawler/user/).
>> --filemgrUrl http://localhost:9000/ \
>> So it confused me.
>>
>> I tried to run the command below, following the home
>> page of Apache OODT.
>> $ ./crawler_launcher --crawlerId MetExtractorProductCrawler
>> ERROR: Invalid option: 'crawlerId'
>>
>> But the error shown above occurred.
>> Is the option 'crawlerId' obsolete?
>>
>> Thanks,
>> Yunhee
>>
>>
>> 2012/8/7 Mattmann, Chris A (388J) <[email protected]>:
>>> Perfect, Sheryl, my thoughts exactly.
>>>
>>> Cheers,
>>> Chris
>>>
>>> On Aug 6, 2012, at 10:01 AM, Sheryl John wrote:
>>>
>>>> Hi Yunhee,
>>>>
>>>> Check out this OODT wiki for crawler:
>>>> https://cwiki.apache.org/confluence/display/OODT/OODT+Crawler+Help
>>>>
>>>> Did you try giving 'http://localhost:8000' without the "/" at the end?
>>>> Also, specify
>>>> 'org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory'
>>>> for the 'clientTransferer' option.
>>>>
>>>>
>>>> On Mon, Aug 6, 2012 at 9:46 AM, YunHee Kang <[email protected]> wrote:
>>>>
>>>>> Hi Chris,
>>>>>
>>>>> I got an error message when I tried to run crawler_launcher by using a
>>>>> shell script. The error message may be caused by a wrong URL of
>>>>> filemgr.
>>>>> $ ./crawler_launcher.sh
>>>>> ERROR: Validation Failures: - Value 'http://localhost:8000/' is not
>>>>> allowed for option
>>>>> [longOption='filemgrUrl',shortOption='fm',description='File Manager
>>>>> URL'] - Allowed values = [http://.*:\d*]
>>>>>
>>>>> The following is the shell script that I wrote:
>>>>> $ cat crawler_launcher.sh
>>>>> #!/bin/sh
>>>>> export STAGE_AREA=/home/yhkang/oodt-0.5/cas-pushpull/staging/TESL2CO2
>>>>> ./crawler_launcher \
>>>>> -op --launchStdCrawler \
>>>>> --productPath $STAGE_AREA\
>>>>> --filemgrUrl http://localhost:8000/\
>>>>> --failureDir /tmp \
>>>>> --actionIds DeleteDataFile MoveDataFileToFailureDir Unique \
>>>>> --metFileExtension tmp \
>>>>> --clientTransferer
>>>>> org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferer
>>>>>
>>>>> I am wondering if there is a problem in the URL of the filemgr or
>>>>> elsewhere.
>>>>>
>>>>> Thanks,
>>>>> Yunhee
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> -Sheryl
>>>
>>>
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Chris Mattmann, Ph.D.
>>> Senior Computer Scientist
>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>> Office: 171-266B, Mailstop: 171-246
>>> Email: [email protected]
>>> WWW: http://sunset.usc.edu/~mattmann/
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Adjunct Assistant Professor, Computer Science Department
>>> University of Southern California, Los Angeles, CA 90089 USA
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>
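
For anyone hitting the same validation failure: the error quoted in this thread says the filemgrUrl value must match [http://.*:\d*], which would explain why the trailing "/" was rejected. Below is a minimal shell sketch of that check. It assumes the launcher matches the pattern against the whole value, which the error message suggests but nothing in this thread confirms. The helper names normalize_filemgr_url and is_allowed_filemgr_url are hypothetical and not part of OODT, and the pattern is only approximated here as a POSIX extended regex.

```shell
#!/bin/sh
# Hypothetical helper (not part of OODT): strip any trailing "/" from a URL.
normalize_filemgr_url() {
    url=$1
    # Remove trailing slashes one at a time until none remain.
    while [ "${url%/}" != "$url" ]; do
        url=${url%/}
    done
    printf '%s\n' "$url"
}

# Hypothetical helper: approximate the allowed-values pattern [http://.*:\d*]
# from the error message as an anchored POSIX ERE (assumes a whole-value match).
is_allowed_filemgr_url() {
    printf '%s\n' "$1" | grep -Eq '^http://.*:[0-9]*$'
}

# The value from the failing script, normalized before use.
FILEMGR_URL=$(normalize_filemgr_url "http://localhost:8000/")

if is_allowed_filemgr_url "$FILEMGR_URL"; then
    echo "ok: $FILEMGR_URL"
else
    echo "rejected: $FILEMGR_URL" >&2
fi
```

The normalized value could then be passed as --filemgrUrl "$FILEMGR_URL" in a launcher script. Running ./crawler_launcher -h, as Chris suggests, remains the way to confirm the options your version actually accepts.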
