I ran into this too. My solution was to add the ingest of blah.txt (blatantly stolen from the quick-start) to my start-up scripts and just leave it in the catalog.
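A minimal sketch of the start-up check behind that workaround, not anything taken from OODT itself. It assumes the only thing that matters is whether the catalog path is already a directory (which is what the IOException in the quoted message complains about). The helper names `catalog_exists` and `action_ids` are hypothetical, and the seed ingest is left to whatever command your start-up script already uses.

```python
import os
import tempfile


def catalog_exists(catalog_dir):
    """Return True once the catalog directory has been created on disk."""
    return os.path.isdir(catalog_dir)


def action_ids(catalog_dir, wanted=("Unique",)):
    """Return the crawler action IDs that are safe to use right now.

    "Unique" is dropped while the catalog directory is missing, because the
    uniqueness check has to search an index that does not exist yet.
    """
    if catalog_exists(catalog_dir):
        return list(wanted)
    return [action for action in wanted if action != "Unique"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        catalog = os.path.join(root, "catalog")
        print(action_ids(catalog))  # catalog not created yet
        os.mkdir(catalog)           # stand-in for the seed ingest creating it
        print(action_ids(catalog))
```

A start-up script could call `catalog_exists` first, run the seed ingest only when it returns False, and then start the crawler daemon with "Unique" enabled.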

What are fmquery and fmdel? I was wondering how to remove something from the catalog.

Thanks,
Tim

On Nov 18, 2011, at 4:55 PM, Mattmann, Chris A (388J) wrote:

> Hey Ricky,
>
> I've run into this a number of times myself, and recently Paul Ramirez and I
> were talking about this too. Paul even said he would try and fix it (ha! I'm
> signing him up for work :P ). Actually, I'll just look at it myself.
>
> In the meanwhile, the workaround is exactly the one you stated. Ingest a
> file; that gets you a catalog. Then you can simply delete the file if you
> want using fmquery | fmdel, and then Unique works just fine.
>
> Cheers,
> Chris
>
> On Nov 18, 2011, at 4:52 PM, Nguyen, Ricky wrote:
>
>> Hi,
>>
>> I am trying to run a crawler using "--actionIds Unique". Since this is the
>> first time I am ingesting a file into FileMgr, the user guide [1] says that
>> the catalog dir MUST NOT exist so that Lucene can create it. However, the
>> crawler fails with the error:
>>
>> IOException when opening index directory:
>> [/Users/rnguyen/vpicu/data/catalog] for search: Message:
>> /Users/rnguyen/vpicu/data/catalog is not a directory
>>
>> It seems the crawler is trying to search for a product (to determine its
>> uniqueness), but the catalog hasn't been created yet. I guess since I have
>> no catalog, the workaround is to omit the "Unique" action.
>>
>> But if I use the crawler as a daemon, it would be useful to leave "Unique"
>> as an action. Any thoughts on the right course?
>>
>> Thanks,
>> Ricky
>>
>> [1] http://oodt.apache.org/components/maven/filemgr/user/basic.html
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: [email protected]
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----------------------------------------------------------------
Tim Stough
NASA/Caltech Jet Propulsion Lab
Senior System Architect
Data Understanding Group (Section 388)
818-393-5347 (office)
626-644-6574 (cell)
-----------------------------------------------------------------
