Sounds like that to me, but I don’t know the details. It is indexed, so changing it must surely involve updating indexes. But there might be subtleties about what is actually reindexed and what is not.
I’ll forward your question though.

Cheers

From: <general-boun...@developer.marklogic.com> on behalf of Andreas Hubmer <andreas.hub...@ebcont.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Thursday, July 20, 2017 at 4:42 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 format

Thanks, Geert.

In the release notes I've found the following statement (https://docs.marklogic.com/guide/relnotes/chap3#id_45632):

    Storing the axes times in metadata enables MarkLogic to update the axes timestamps without changing the documents and invoking reindexing.

To me it seems that the metadata is connected to the fragment but stored somehow differently. Do you know any more details?

Cheers,
Andreas

2017-07-20 16:35 GMT+02:00 Geert Josten <geert.jos...@marklogic.com>:

Hi Andreas,

I tried to look for a nice Guide section, but couldn’t find one. But there isn’t too much to say about it actually.

It starts with adding metadata to a doc using http://docs.marklogic.com/xdmp:document-set-metadata. It takes a map:map, and non-string values will be converted to quoted strings. It effectively lives inside the same document fragment as the document's contents, but it is neither included nor embedded when you pull up the contents with, for instance, fn:doc.

You can also search on it using so-called metadata fields. That is a new, third type of field. You can create them with the Admin UI, or for instance with admin functions. The Temporal guide spends a few words on it: http://docs.marklogic.com/guide/temporal/temporal-quick-start#id_50302. Very useful for storing temporal properties, but you can use it for other purposes too.
In search constraints you just refer to the field by name, like any other field. You can range index metadata fields too, like other fields, and even index them as dateTime and such, but you cannot store a fragment of XML inside one and index on a sub-element of that. It will simply get stored as quoted XML, and it will full-text search that instead.

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Andreas Hubmer <andreas.hub...@ebcont.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Thursday, July 20, 2017 at 11:53 AM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 format

Hi Geert,

    MarkLogic 9 also allows storing simple key/value pairs in hidden document metadata, which is more efficient than document properties

I am interested in that new feature. Is there an explanation somewhere of how it works (regarding reindexing, ...)?

Thanks,
Andreas

2017-07-20 11:33 GMT+02:00 Geert Josten <geert.jos...@marklogic.com>:

Hi Pavan,

If you need to store both the binary itself, and the meta info + textual contents, there are two general approaches:

- put meta info and textual contents in document properties
- store them separately as normal documents, with a reference to the database URI of the actual binary

MarkLogic 9 also allows storing simple key/value pairs in hidden document metadata, which is more efficient than document properties or separate docs, but it is probably too limited for this use case. You can store transcripts of videos including timestamps as XML, which would work for both the two-doc and the doc-prop approach.
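As a rough illustration of the transcript idea above, here is a minimal sketch in plain Python (not MarkLogic code). It assumes a hypothetical <transcript> document whose <segment> elements carry a start attribute and whose binary-uri attribute points at the video. The element names, attribute names, URI, and the find_spoken helper are all made up for illustration; inside MarkLogic the same lookup would presumably be done with a search over the stored XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical transcript layout: element names, attribute names and the URI
# are assumptions for this sketch, not a schema defined by MarkLogic.
TRANSCRIPT_XML = """
<transcript binary-uri="/videos/intro.mp4">
  <segment start="00:00:05">Welcome to the product overview</segment>
  <segment start="00:00:12">First we ingest the binary file</segment>
  <segment start="00:00:27">Then we search the transcript text</segment>
</transcript>
"""


def find_spoken(transcript_xml, word):
    """Return (start, text) for every segment whose text contains the word."""
    root = ET.fromstring(transcript_xml.strip())
    needle = word.lower()
    hits = []
    for segment in root.iter("segment"):
        text = segment.text or ""
        # Case-insensitive match against whitespace-separated tokens.
        if needle in text.lower().split():
            hits.append((segment.get("start"), text))
    return hits


if __name__ == "__main__":
    for start, text in find_spoken(TRANSCRIPT_XML, "binary"):
        print(start, text)
```

Keeping the timestamp as an attribute on each segment means a hit on a spoken word directly yields the time at which it was spoken.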
Document properties allow storing complete XML fragments, and are associated with the same database URI as the actual document (in this case the binary data). They are included in indexing automatically. You just need to indicate that you would like to include properties fragments in searching and faceting.

There are out-of-the-box CPF pipelines for Document Filtering. There is one that saves the result in doc properties, and one that saves the result in a separate doc. It should be possible to enable those via the Admin UI.

Kind regards,
Geert

From: GUPTA Pavan <pavan.gu...@soprasteria.com>
Date: Thursday, July 20, 2017 at 11:07 AM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>, Geert Josten <geert.jos...@marklogic.com>
Subject: RE: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 format

Hello Geert,

Thanks for the information. I would also like to know how I can store the content (meaning the spoken words) of a video and find the time when it was spoken, as we load the content of any document file in metadata. Is there any CPF I need to apply, or can you suggest some library?

Thanks In Advance!

Regards,
Pavan

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert Josten
Sent: Thursday, July 20, 2017 2:27 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 format

Hi Pavan,

You can apply xdmp:document-filter on many binary formats, including mp3 and mp4. It will extract meta information like file size and content mime type, and for instance document properties from office documents, and exif tags from images. It will also attempt to extract actual text, but that will only work if such text is inside the file in a machine-readable form. E.g.
text contained inside images or video streams will not be captured. This includes images embedded in office docs, image PDFs, and also captions and subtitles on images and videos. You would need an OCR kind of solution for that.

Kind regards,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of GUPTA Pavan <pavan.gu...@soprasteria.com>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Thursday, July 20, 2017 at 9:19 AM
To: "general@developer.marklogic.com" <general@developer.marklogic.com>
Subject: [MarkLogic Dev General] Binary Document Ingestion in MP4 and MP3 format

Hi Team,

I am trying to ingest .mp4 and .mp3 files and make them searchable. I have read that these files are considered binary files. I have also seen how to make binary files searchable, and I have done this for .doc, .ppt, .pdf etc. files, but could not do it for .mp4 or .mp3.

Actually I want to make the files searchable. Can you please direct me how to achieve this and tell me if I need to enable or set up any content processing framework for the same.

Thanks In Advance!

Regards,
Pavan

_______________________________________________
General mailing list
General@developer.marklogic.com
Manage your subscription at: http://developer.marklogic.com/mailman/listinfo/general