Do you want to index the text in the attachments? If so, you are probably better off creating a separate document for the mail body and for each attachment. A field in each attachment document could give the id of the main email document, and the main email document could contain a multivalued field listing all of the attachment ids.
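
Here is a rough sketch of that layout, in case it helps. It uses plain Java maps rather than any Solr classes so that it runs on its own; in SolrJ each map would correspond to one SolrInputDocument. The field names (id, text, parent_id, attachment_ids) and the "-att-N" id scheme are made up for illustration and are not from any existing schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class EmailDocs {
    // Hypothetical field names; adapt them to your own schema.
    static final String ID = "id";
    static final String TEXT = "text";
    static final String PARENT_ID = "parent_id";
    static final String ATTACHMENT_IDS = "attachment_ids";

    /** Returns the main email document first, then one document per attachment. */
    static List<Map<String, Object>> build(String emailId, String body,
                                           List<String> attachmentTexts) {
        List<Map<String, Object>> docs = new ArrayList<>();
        List<String> attachmentIds = new ArrayList<>();

        Map<String, Object> main = new LinkedHashMap<>();
        main.put(ID, emailId);
        main.put(TEXT, body);
        main.put(ATTACHMENT_IDS, attachmentIds); // multivalued field, filled in below
        docs.add(main);

        for (int i = 0; i < attachmentTexts.size(); i++) {
            String attachmentId = emailId + "-att-" + i; // hypothetical id scheme
            attachmentIds.add(attachmentId);

            Map<String, Object> attachment = new LinkedHashMap<>();
            attachment.put(ID, attachmentId);
            attachment.put(TEXT, attachmentTexts.get(i));
            attachment.put(PARENT_ID, emailId); // points back to the main email document
            docs.add(attachment);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String> texts = List.of("report text", "invoice text");
        for (Map<String, Object> doc : build("msg-1", "See attached.", texts)) {
            System.out.println(doc);
        }
    }
}
```

With this layout a hit on an attachment document can be traced back to its email through parent_id, and the email document can list its attachments through attachment_ids.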

On Thu, Mar 25, 2010 at 10:14 AM, Chris Hostetter <hossman_luc...@fucit.org> wrote:
>
> : > I tried calling the addFile() twice (one call for each file) and no
> : > error but nothing getting indexed as well.
>        ...
> : Write your own RequestHandler that uses the existing ExtractingRequestHandler
> : to actually parse the streams, and then you combine the results arbitrarily in
> : your handler, eventually sending an AddUpdateCommand to the update processor.
> : You can obtain both the update processor and SolrCell instance from
> : req.getCore().
>
> The key bit being: yes you can attach multiple files to your request,
> and yes the SolrQueryRequest abstraction can handle that (it appears as
> two "ContentStreams" to the RequestHandler) but the existing
> ExtractingRequestHandler assumes there will only be one ContentStream and
> constructs one document for it -- the API isn't really designed around
> the idea of how to generate a single SolrInputDocument from multiple
> ContentStreams (where would you get the "title" from? etc...)
>
> There was talk about trying to generalize this, but i don't think anyone
> else has looked into it much. Here's one reference, but i definitely
> remember a more recent thread about this idea...
>
> http://n3.nabble.com/ExtractingRequestHandler-and-XmlUpdateHandler-tt492202.html#a492211
>
>
> -Hoss
>

--
Lance Norskog
goks...@gmail.com