In the job I set up the following connections:
1. Repository: WinShare
2. Transformation: Allowed Documents
3. Transformation: TikaExternal
4. Transformation: MetadataExtractor
5. Output: SolrShare
So, in Allowed Documents I entered the allowed MIME types and extensions.

In the field mapping I added
[screenshot not preserved]
and I unchecked “Keep all metadata”.

In the metadata expressions I checked “Keep all incoming metadata” and “Remove empty metadata values”.

Obviously, my Solr schema has to contain the fields last_author and author, besides the fields that I specified in the Schema tab of the SolrShare output connection:
[screenshot not preserved]

It works: in the Solr index I find the added fields last_author and author (where they aren’t empty).

I hope that my approach is the right way to set up the ManifoldCF-Solr-Tika architecture.

Thanks a lot, Karl, for your patience.

Mario

From: Karl Wright <daddy...@gmail.com>
Sent: Tuesday, 16 October 2018 13:11
To: user@manifoldcf.apache.org
Subject: Re: Add field to Output Solr

If it's not in your PDFs, Tika won't extract it. If you merely want to copy another field, you can use the Metadata Adjuster transformer to do that.

Karl

On Tue, Oct 16, 2018 at 4:38 AM Bisonti Mario <mario.biso...@vimar.com> wrote:

Hello,

I am using Tika server to process PDF, DOC, and similar files.

I configured:
[screenshot not preserved]
in my Solr output connection, so when I index the documents I see these fields:

id
last_modified
resourcename
content_type
allow_token_document
deny_token_document
allow_token_share
deny_token_share
stream_size
creator
deny_token_parent
allow_token_parent
content
_version_

In my Solr schema I have the field last_author, which I would like to be indexed.
How can I add it?

Thanks a lot
Mario
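
For anyone following this thread, here is a minimal sketch of one way to check that last_author actually reached the index. It only builds a query URL for Solr's standard select handler and filters an already-parsed JSON response; it does not contact a server. The base URL `http://localhost:8983/solr` and the core name `share` are assumptions for illustration and are not taken from the messages above, and the helper names are hypothetical.

```python
from urllib.parse import urlencode


def build_field_check_url(base_url, core, field, rows=5):
    """Build a Solr select URL matching documents where `field` has any value.

    `base_url` and `core` are placeholders; adjust them to your installation.
    """
    params = {
        "q": f"{field}:[* TO *]",  # open range query: the field must be present
        "fl": f"id,{field}",       # return only the id and the field being checked
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


def docs_with_field(response, field):
    """Return the ids of documents in a parsed Solr JSON response that carry `field`."""
    docs = response.get("response", {}).get("docs", [])
    return [doc["id"] for doc in docs if doc.get(field)]


if __name__ == "__main__":
    # Assumed host and core name; open the printed URL in a browser or with curl.
    print(build_field_check_url("http://localhost:8983/solr", "share", "last_author"))
```

If the query returns no documents, Karl's point applies: the source files probably do not carry that metadata, so Tika has nothing to extract.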