Hi Geert,

> MarkLogic 9 also allows storing simple key/value pairs in hidden document
> metadata, which is more efficient than document properties

I am interested in that new feature. Is there an explanation somewhere of how
it works (regarding reindexing, ...)?

Thanks,
Andreas



2017-07-20 11:33 GMT+02:00 Geert Josten <geert.jos...@marklogic.com>:

> Hi Pavan,
>
> If you need to store both the binary itself, and the meta info + textual
> contents, there are two general approaches:
>
> - put meta info and textual contents in document properties
> - store them separately as normal documents with a reference to the
> database URI of the actual binary
>
> MarkLogic 9 also allows storing simple key/value pairs in hidden document
> metadata, which is more efficient than document properties or separate
> docs, but it is probably too limited for this use case.
>
> You can store transcripts of videos, including timestamps, as XML, which
> would work for both the two-doc and the doc-prop approach.
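To make the transcript idea concrete, here is a minimal sketch of what such a timestamped XML document could look like and how the time a word was spoken could be looked up. The element names (`transcript`, `binary-uri`, `segment`), the example URI, and both helper functions are hypothetical, chosen only for illustration; they are not a MarkLogic schema or API, and loading the result into the database (as a separate document or as a properties fragment) is not shown.

```python
import xml.etree.ElementTree as ET


def build_transcript(binary_uri, segments):
    """Build a transcript element that points back at the binary by URI.

    segments is a list of (start_seconds, end_seconds, text) tuples.
    All element and attribute names here are illustrative assumptions.
    """
    root = ET.Element("transcript")
    ET.SubElement(root, "binary-uri").text = binary_uri
    for start, end, text in segments:
        seg = ET.SubElement(
            root, "segment", start=f"{start:.1f}", end=f"{end:.1f}"
        )
        seg.text = text
    return root


def find_spoken(root, word):
    """Return the start times (seconds) of segments containing the word."""
    needle = word.lower()
    return [
        float(seg.get("start"))
        for seg in root.iter("segment")
        if needle in (seg.text or "").lower().split()
    ]


# Hypothetical example data: a three-segment transcript for one video.
transcript = build_transcript(
    "/media/intro.mp4",
    [
        (0.0, 4.5, "welcome to the product demo"),
        (4.5, 9.0, "first we load the binary file"),
        (9.0, 12.0, "then we search the transcript"),
    ],
)

xml_text = ET.tostring(transcript, encoding="unicode")
print(xml_text)
print(find_spoken(transcript, "transcript"))
```

In the two-doc approach the `binary-uri` element would carry the reference to the binary; in the doc-prop approach the same fragment could live in the properties of the binary itself, so the reference would be implicit.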
>
> Document properties allow storing complete XML fragments and are
> associated with the same database URI as the actual document (in this case
> the binary data). They are included in indexing automatically. You just need
> to indicate that you want to include properties fragments in searching and
> faceting.
>
> There are out-of-the-box CPF pipelines for Document Filtering. There is
> one that saves the result in doc properties, and one that saves the
> result in a separate doc. It should be possible to enable those via the
> Admin UI.
>
> Kind regards,
> Geert
>
> From: GUPTA Pavan <pavan.gu...@soprasteria.com>
> Date: Thursday, July 20, 2017 at 11:07 AM
> To: MarkLogic Developer Discussion <general@developer.marklogic.com>,
> Geert Josten <geert.jos...@marklogic.com>
> Subject: RE: [MarkLogic Dev General] Binary Document Ingestion in MP4 and
> MP3 format
>
> Hello Geert,
>
>
>
> Thanks for the information. I would also like to know how I can store the
> content (meaning the spoken words) of a video and find the time at which
> it was spoken, the way we load the content of any document file into
> metadata.
>
> Is there a CPF pipeline I need to apply, or can you suggest a library?
>
>
>
> Thanks In Advance!
>
>
>
>
>
> Regards,
>
> Pavan
>
>
>
> *From:* general-boun...@developer.marklogic.com [mailto:general-bounces@
> developer.marklogic.com <general-boun...@developer.marklogic.com>] *On
> Behalf Of *Geert Josten
> *Sent:* Thursday, July 20, 2017 2:27 PM
> *To:* MarkLogic Developer Discussion
> *Subject:* Re: [MarkLogic Dev General] Binary Document Ingestion in MP4
> and MP3 format
>
>
>
> Hi Pavan,
>
>
>
> You can apply xdmp:document-filter to many binary formats, including mp3
> and mp4. It will extract meta information like file size and content mime
> type, and for instance document properties from office documents, and exif
> tags from images. It will also attempt to extract actual text, but that will
> only work if such text is inside the file in a machine-readable form. E.g.
> text contained inside images or video streams will not be captured. This
> includes images embedded in office docs, image PDFs, and also captions and
> subtitles on images and videos. You would need an OCR kind of solution for
> that.
>
>
>
> Kind regards,
>
> Geert
>
>
>
> *From: *<general-boun...@developer.marklogic.com> on behalf of GUPTA
> Pavan <pavan.gu...@soprasteria.com>
> *Reply-To: *MarkLogic Developer Discussion
> <general@developer.marklogic.com>
> *Date: *Thursday, July 20, 2017 at 9:19 AM
> *To: *"general@developer.marklogic.com" <general@developer.marklogic.com>
> *Subject: *[MarkLogic Dev General] Binary Document Ingestion in MP4 and
> MP3 format
>
>
>
> Hi Team,
>
>
>
> I am trying to ingest .mp4 and .mp3 files and make them searchable. I
> have read that these files are treated as binary files.
>
>
>
> I have also seen how to make binary files searchable, and I have done so
> for .doc, .ppt, .pdf, etc. files, but could not do it for .mp4 or .mp3.
>
>
>
> Actually I want to make the files searchable.
>
>
>
> Can you please tell me how to achieve this, and whether I need to
> enable or set up any content processing framework for it?
>
>
>
> Thanks In Advance!
>
>
>
>
>
> Regards,
>
> Pavan
>
> _______________________________________________
> General mailing list
> General@developer.marklogic.com
> Manage your subscription at:
> http://developer.marklogic.com/mailman/listinfo/general
>
>


-- 
Andreas Hubmer
Senior IT Consultant

EBCONT enterprise technologies GmbH
Millennium Tower
Handelskai 94-96
A-1200 Vienna

Mobile: +43 664 60651861
Fax: +43 2772 512 69-9
Email: andreas.hub...@ebcont.com
Web: http://www.ebcont.com

OUR TEAM IS YOUR SUCCESS

UID-Nr. ATU68135644
HG St.Pölten - FN 399978 d
