> I indexed my data by using the index-more plugin and added my required field
> (like content_type) to schema.xml.
> Now how can I search on pdf files (a kind of content_type) using this new
> index? What query should I enter to have a search on pdf files?

This is a Solr-specific question and depends on the fieldType defined in Solr
for the content type field. Refer to the Solr manual or mailing list.

> On Thu, Sep 29, 2011 at 9:33 AM, ahmad ajiloo <[email protected]> wrote:
> > How can I use the index-more plugin? I'm new to Nutch and need your help
> > in detail!
> > thanks
> >
> > On Wed, Sep 14, 2011 at 12:54 PM, Markus Jelsma <[email protected]> wrote:
> >> I just wrote this on the Solr list. Use the index-more plugin, or
> >> copyField the url to an extension field in which you can use a char
> >> pattern replace filter to skip everything up to the first dot.
> >>
> >> > Hello
> >> > I want to search articles via Solr, so I need to find only specific
> >> > files like doc, docx, and pdf.
> >> > I don't need any html pages. Thus the result of our search should only
> >> > consist of doc, docx, and pdf files.
> >> >
> >> > I'm using Nutch to crawl web pages and sending Nutch's data to Solr
> >> > for indexing. There is an approach to search on specific file types:
> >> > put the file extension into my index. I have no idea about the type of
> >> > schema Nutch uses when indexing into Solr, whether it creates a
> >> > specific field for file extension, and/or how we can modify the Nutch
> >> > indexer to create a field like that for ourselves.
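
For the PDF-only search asked about at the top of this thread, one possible
approach is a Solr filter query (fq) on the content type field. The sketch
below only builds the request URL. It assumes the field is named content_type
(as in the question), that the indexed value is the MIME type application/pdf,
and that Solr answers at http://localhost:8983/solr/select; all three are
assumptions to check against your own schema.xml and setup. Whether an exact
match like this works depends on the fieldType: it fits a plain string field,
while a tokenized text field may need a different query.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_filter_url(base_url, field, value, user_query="*:*", rows=10):
    """Build a Solr select URL that restricts results to field == value.

    The value is wrapped in double quotes so that characters such as '/'
    in a MIME type are not interpreted by the query parser.
    """
    # Escape backslashes and double quotes inside the quoted value.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "q": user_query,                 # main query; *:* matches all documents
        "fq": f'{field}:"{escaped}"',    # filter query restricting the type
        "rows": rows,                    # number of results to return
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and field name; adjust to your installation.
url = build_filter_url(
    "http://localhost:8983/solr/select", "content_type", "application/pdf"
)
print(url)

# Decode the query string again to inspect the parameters Solr would see.
print(parse_qs(urlparse(url).query))
```

Using fq instead of putting the restriction into q keeps the type filter
separate from the user's search terms, so the same filter can be reused with
any query text passed as user_query.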

