Hey Ricky, I've run into this a number of times myself, and recently Paul Ramirez and I were talking about it too. Paul even said he would try to fix it (ha! I'm signing him up for work :P ). Actually, I'll just look at it myself.
In the meantime, the workaround is exactly the one you stated. Ingest a file without the Unique action; that gets you a catalog. Then you can simply delete the file if you want using fmquery | fmdel, and Unique works just fine.

Cheers,
Chris

On Nov 18, 2011, at 4:52 PM, Nguyen, Ricky wrote:

> Hi,
>
> I am trying to run a crawler using "--actionIds Unique". Since this is the
> first time I am ingesting a file into FileMgr, the user guide [1] says that
> the catalog dir MUST NOT exist so that Lucene can create it. However, the
> crawler fails with the error:
>
> IOException when opening index directory: [/Users/rnguyen/vpicu/data/catalog]
> for search: Message: /Users/rnguyen/vpicu/data/catalog is not a directory
>
> Seems like crawler is trying to search for a product (to determine its
> uniqueness), but the catalog hasn't been created yet. I guess since I have no
> catalog, the workaround is to omit the "Unique" action.
>
> But if I use crawler as a daemon, it would be useful to leave "Unique" as an
> action. Any thoughts on the right course?
>
> Thanks,
> Ricky
>
> [1] http://oodt.apache.org/components/maven/filemgr/user/basic.html

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
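P.S. For the daemon case, one possible stopgap until this is fixed is to have whatever script launches the crawler decide whether to pass "Unique" based on whether the catalog is there yet. The snippet below is only a rough sketch of that idea, not anything that ships with OODT: the function name `crawler_action_ids` is made up, and treating "the directory exists and is non-empty" as "Lucene has created the catalog" is an assumption on my part.

```python
import os


def crawler_action_ids(catalog_dir, wanted=("Unique",)):
    """Return the action IDs that look safe to pass to the crawler right now.

    Hypothetical helper, not part of OODT. Assumption: the Lucene catalog
    is usable once its directory exists and contains at least one file.
    """
    catalog_ready = os.path.isdir(catalog_dir) and bool(os.listdir(catalog_dir))
    if catalog_ready:
        return list(wanted)
    # No catalog yet: drop "Unique" so the first ingest can create the index.
    return [action for action in wanted if action != "Unique"]


if __name__ == "__main__":
    ids = crawler_action_ids("/Users/rnguyen/vpicu/data/catalog")
    # Print the IDs space-separated so a launcher script can splice them
    # into its "--actionIds" argument (or omit the flag if nothing prints).
    print(" ".join(ids))
```

The helper never creates the catalog directory itself, since the user guide says it must not exist before the first ingest. Once one product has gone in, later launches would pick up "Unique" again.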
