Hey Ricky,

On Feb 10, 2012, at 12:41 PM, Nguyen, Ricky wrote:

> How would I add metadata to existing (already ingested to FileMgr) products,
> without re-ingesting (producing a new product ID and lucene document because
> the product hasn't changed)?

Great question! In general, here are a couple of ways to do this:

1. via XML-RPC - there is a method called updateMetadata, introduced in
0.4-SNAPSHOT (trunk) with OODT-256, that provides an "updateMetadata"
capability: http://s.apache.org/0Ww

2. use CAS curator and its REST API here:

http://oodt.apache.org/components/maven/curator/api/index.html

One of the underlying methods (not sure if it's documented at the above) is a
method to update the metadata for an existing product. Caveat: this only works
with the LuceneCatalog, and the curator includes a forked version of it that
Paul Ramirez made, which adds an updateMetadata capability. It would be great
to bring this back into the trunk and re-align them; I just haven't had the
cycles yet.

>
> Are these possible solutions: Can crawler run multiple metExtractors on each
> file to be ingested? Or perhaps there is a way to get PGE tasks to update an
> existing product's metadata?

Yep, the crawler can run multiple met extractors. To do this, you have to
either use the AutoDetectCrawler or develop a met extractor that runs the
series of other extractors that you want.

Regarding the PGE tasks, you can certainly leverage them for their control
flow to do metadata updates (which happen *after* execution but *before*
crawling) via PGE extractors as well.

HTH!

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
