Hi Stephane,

It must be that your documents are themselves in XML, right?
extract-path normally grabs trees from the persisted document, and so
the nodes extracted from an XML document will be XML.

I wonder whether you can add '/text()' to the end of your extract-path
expressions in order to force them into something that can be serialized
within JSON.
That would erase the key names of course.
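
A rough sketch of that rewrite, in Python, applied to the two paths quoted in the original message. It is untested against a server, and whether extract-path accepts a trailing text() step is an assumption here. The helper name add_text_step is made up for illustration.

```python
import xml.etree.ElementTree as ET

SEARCH_NS = "http://marklogic.com/appservices/search"

# The two extract-path entries from the original message; elided ones are left out.
OPTIONS = """<options xmlns="http://marklogic.com/appservices/search">
    <extract-document-data selected="include">
        <extract-path>/language-version/language-version-canonical-model/title</extract-path>
        <extract-path>/language-version/language-version-canonical-model/language</extract-path>
    </extract-document-data>
</options>"""


def add_text_step(options_xml: str) -> str:
    """Append '/text()' to every extract-path that does not already end with it."""
    # Keep the search namespace as the default namespace when serializing.
    ET.register_namespace("", SEARCH_NS)
    root = ET.fromstring(options_xml)
    for path in root.iter(f"{{{SEARCH_NS}}}extract-path"):
        expr = (path.text or "").strip()
        if expr and not expr.endswith("/text()"):
            path.text = expr + "/text()"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_text_step(OPTIONS))
```

The rewritten options would then be installed in place of the current ones. As noted, the values come back without their element names.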

An alternate approach would be to use bulk search (from a client API)
and use an output transform to render each search result into JSON.
(Possible, but I can see why that would not be an appealing solution.)
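
The same per-result rendering could also be done on the client once the response has arrived. This is not the server-side output transform itself, only a minimal Python sketch of the idea. It assumes the "extracted" object has the shape shown in the original message, and it ignores attributes and nested elements. The function names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def fragment_to_pair(fragment: str):
    """Parse one serialized element into (name, text value).

    Attributes and child elements are ignored, which is a simplification.
    """
    elem = ET.fromstring(fragment)
    return elem.tag, (elem.text or "")


def extracted_to_dict(extracted: dict) -> dict:
    """Collapse extracted["content"] into a dict; repeated names become lists."""
    out = {}
    for item in extracted.get("content", []):
        if not isinstance(item, str):
            continue  # skip anything that is not a serialized fragment
        name, value = fragment_to_pair(item)
        if name in out:
            if not isinstance(out[name], list):
                out[name] = [out[name]]
            out[name].append(value)
        else:
            out[name] = value
    return out


if __name__ == "__main__":
    sample = {
        "kind": "element",
        "content": [
            "<language>En</language>",
            "<title>ZINC OXIDE DOSSIERANNEX 5</title>",
            "<subject label_en=\"media\">media</subject>",
            "<subject label_en=\"fish\">fish</subject>",
        ],
    }
    print(json.dumps(extracted_to_dict(sample), indent=2))
```

Repeated elements such as subject are gathered into a list so that no value is silently overwritten.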

If your documents were JSON, I *think* you'd get the results you are expecting.

Charles Greer
Lead Engineer
MarkLogic Corporation

________________________________
From: general-boun...@developer.marklogic.com 
[general-boun...@developer.marklogic.com] on behalf of stephane.va...@oecd.org 
[stephane.va...@oecd.org]
Sent: Thursday, June 23, 2016 2:19 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] format:json && extract-document-data

Hi,

I am trying to include some document data in my search results, using the
following query options:

<options xmlns="http://marklogic.com/appservices/search">
    <extract-document-data selected="include">
          <extract-path>/language-version/language-version-canonical-model/title</extract-path>
          <extract-path>/language-version/language-version-canonical-model/language</extract-path>
(…)
    </extract-document-data>
</options>

Unfortunately, when I ask for JSON format (using the header Accept:
application/json), the extracted elements come back as “stringified” XML
instead of being converted into JSON as I would have expected:

{
  "snippet-format": "snippet",
  "total": 564,
  "start": 1,
  "page-length": 10,
  "selected": "include",
  "results": [
    {
      "index": 1,
      "uri": "ENV/CHEM/NANO(2015)22/ANN5/2",
      "path": "fn:doc(\"ENV/CHEM/NANO(2015)22/ANN5/2\")",
(…)
      "extracted": {
        "kind": "element",
        "content": [
          "<language>En</language>",
          "<title>ZINC OXIDE DOSSIERANNEX 5</title>",
          "<reference>ENV/CHEM/NANO(2015)22/ANN5</reference>",
          "<classification>2</classification>",
          "<modificationDate>2015-04-16T00:00:00.000+02:00</modificationDate>",
          "<subject label_en=\"media\">media</subject>",
          "<subject label_en=\"fish\">fish</subject>",
(…)
        ]
      }
    },

Am I doing anything wrong? Are there any configuration options I could tweak to
force the conversion of XML to JSON?

Cheers,
Stéphane Varin
_______________________________________________
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general
