Hi Danny,
Thanks for the information.

I understand that using enrich/entity-highlight, I can enrich the XML 
content with the entity types already defined in MarkLogic.
I would like to know how I can add more entity types, together with the 
values for those new types, or customize the existing entity types.
For example, suppose my project is about newspapers. I need metadata about 
commodity (e.g. gas, electricity), subject area, geography, news sources, etc. 
So I need to add my own entity type definitions to MarkLogic. It 
would be great if you could let me know how I can do that.

Secondly, does MarkLogic have any plugin that takes DITA-format 
XML as input and returns DITA-format XML as output with 
enriched metadata?
I ask because I may have XML that conforms to DITA. The entity-highlight 
and enrich functions add tags to the content, and these new tags may not 
conform to the DITA XSD. So I was wondering whether there is any built-in 
function or plugin for enriching DITA XML.

I did not understand the concept of pipelines for third-party technologies 
like Temis. I guess they also invoke the same enrich 
function of MarkLogic. How can they provide different functionality?

regards,
Saptarshi Das
Tata Consultancy Services
20 Ryan Ranch Road
Monterey
Monterey - 93940,California
United States
Mailto: [email protected]
Website: http://www.tcs.com
____________________________________________
Experience certainty.   IT Services
                        Business Solutions
                        Outsourcing
____________________________________________



[email protected]
Sent by: [email protected]
07/06/2009 02:44 PM
Please respond to: [email protected]
To: [email protected]
Subject: General Digest, Vol 61, Issue 10

Message: 2
Date: Mon, 6 Jul 2009 11:52:23 -0700
From: Danny Sokolsky <[email protected]>
Subject: RE: [MarkLogic Dev General] Enrichment of content
To: General Mark Logic Developer Discussion
                 <[email protected]>
Message-ID:
 <[email protected]>
Content-Type: text/plain; charset="utf-8"

Hi Saptarshi,

If the XML from entity:enrich does not suit your needs, it sounds like you 
will need to use cts:entity-highlight to define the XML based on your own 
taxonomy:

http://developer.marklogic.com/pubs/4.1/apidocs/SearchBuiltins.html#cts:entity-highlight


You can use cts:entity-highlight to write a function that transforms the 
entity markup and returns whatever XML you need.
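
To make the idea concrete, here is a minimal sketch in Python rather than 
XQuery. It is not MarkLogic code and does not call cts:entity-highlight; it 
only illustrates the general pattern of walking the text of a document and 
replacing each recognized term with markup of your own choosing. The names 
TAXONOMY, make_entity and enrich are hypothetical, and the terms in the 
taxonomy are made-up sample values.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical taxonomy: surface term -> entity type (sample values only).
TAXONOMY = {
    "gas": "commodity",
    "electricity": "commodity",
    "Norway": "geography",
}

# One case-sensitive, whole-word pattern covering every taxonomy term.
PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(t) for t in sorted(TAXONOMY, key=len, reverse=True))
    + r")\b"
)


def make_entity(term):
    """Build the markup for one match; change this to emit other XML."""
    el = ET.Element(TAXONOMY[term])
    el.text = term
    return el


def split_text(text):
    """Return (leading plain text, entity elements carrying the text after them)."""
    if not text:
        return text, []
    parts = PATTERN.split(text)  # [plain, term, plain, term, ..., plain]
    entities = []
    for term, following in zip(parts[1::2], parts[2::2]):
        el = make_entity(term)
        el.tail = following
        entities.append(el)
    return parts[0], entities


def enrich(elem):
    """Recursively wrap taxonomy terms found in the text nodes under elem."""
    children = list(elem)
    for child in children:
        enrich(child)
    lead, new_children = split_text(elem.text)
    for child in children:
        new_children.append(child)
        tail_lead, tail_entities = split_text(child.tail)
        child.tail = tail_lead
        new_children.extend(tail_entities)
    elem.text = lead
    elem[:] = new_children
    return elem


doc = ET.fromstring("<p>Prices for gas rose in <b>Norway</b> today.</p>")
print(ET.tostring(enrich(doc), encoding="unicode"))
```

In this sketch, make_entity is the only place that decides what the output 
markup looks like, so that is the function you would change to emit element 
names from your own taxonomy.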

You can also see chapter 9 (~p109) of the Search Developer's Guide:

http://developer.marklogic.com/pubs/4.1/books/search-dev-guide.pdf

It outlines how entity enrichment works with MarkLogic Server.

I am not sure I understand your question about schema and DITA.  Perhaps 
if you gave a specific example of what you are trying to do and what you 
are having trouble doing, we might be able to help you find a solution.

The sample pipelines that use third-party technologies (such as Temis) are 
designed to show integration with these other technologies.  Entity 
extraction technologies are often very specialized to particular types of 
content, and MarkLogic can work with a wide array of different 
technologies.

Hope that helps,
-Danny

From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
Sent: Thursday, July 02, 2009 10:27 PM
To: [email protected]
Subject: [MarkLogic Dev General] Enrichment of content



Hi,

In my project I will be using MarkLogic, and we have a requirement for 
content enrichment. I have the content and a defined taxonomy structure. 
I want to enrich the content using that taxonomy, with inline metadata 
tagging of the content. Here are my questions:

1) From the enrich module API, I understand that the enrich function 
adds metadata to the given XML. It seems to me that the taxonomy 
structure and values on which the tagging is based are managed by 
MarkLogic.

In my project, I have my own taxonomy definition for the marked-up 
elements, and I would like to use it for enriching the content. 
How can I add it to MarkLogic?



2) Secondly, I have noticed that if the XML has a schema defined and 
that schema does not allow child elements, MarkLogic does not enrich that 
node. That is fine. But if I send DITA-formatted XML, can I get 
DITA-formatted XML back as output with the enriched content?

It would be very helpful if you could give an example on this topic. I 
would also like to explore this topic further, so any additional 
resources would be great.



3) I have also seen that MarkLogic has partnered with Temis Luxid for 
content enrichment. I could not understand what MarkLogic provides 
and what Temis adds on top of MarkLogic. Any help in this regard 
would be great.

Thanks in advance.

regards,

Saptarshi Das



------------------------------

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general



