Hi Pavan,

You can apply xdmp:document-filter to many binary formats, including mp3 and mp4. It extracts metadata such as file size and content MIME type, as well as document properties from office documents and EXIF tags from images. It will also attempt to extract the actual text, but that only works if the text is stored in the file in machine-readable form. Text contained inside images or video streams, for example, will not be captured. This includes images embedded in office docs, image-only PDFs, and captions and subtitles on images and videos. You would need an OCR-type solution for that.
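
As a rough illustration of the above, here is a minimal XQuery sketch. It assumes the mp3 has already been loaded as a binary document; the URI "/media/example.mp3" and the ".xhtml" sidecar naming are hypothetical choices made for this example, not something prescribed by MarkLogic. It filters the binary, stores the resulting XHTML as a separate document so the extracted metadata can be indexed and searched, and returns the meta elements for inspection.

```xquery
xquery version "1.0-ml";

declare namespace xhtml = "http://www.w3.org/1999/xhtml";

(: Hypothetical URI of a binary document that is already in the database. :)
let $uri := "/media/example.mp3"

(: Run the filter; the result is XHTML holding metadata and any extractable text. :)
let $filtered := xdmp:document-filter(fn:doc($uri))

return (
  (: Assumed convention: keep the filter output next to the binary as a sidecar document. :)
  xdmp:document-insert(fn:concat($uri, ".xhtml"), $filtered),

  (: Return the extracted metadata entries so you can see what was found. :)
  $filtered//xhtml:meta
)
```

Which meta entries come back depends on the file. For audio and video you should expect metadata only, not any spoken or on-screen text.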

Kind regards,
Geert

From: <[email protected]> on behalf of GUPTA Pavan <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Thursday, July 20, 2017 at 9:19 AM
To: "[email protected]" <[email protected]>
Subject: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 format

Hi Team,

I am trying to ingest .mp4 and .mp3 files and make them searchable. I understand that these files are treated as binary files. I have already made other binary files searchable, such as .doc, .ppt, and .pdf, but could not do the same for .mp4 or .mp3. Can you please direct me on how to achieve this, and tell me whether I need to enable or set up a content processing framework for it?

Thanks in advance!

Regards,
Pavan
