I’m not sure if you figured this out yet but you are likely running into a 
filtered vs. unfiltered issue now. The Java search client runs queries 
unfiltered by default as it is a best practice to be able to resolve queries 
from the indexes without filtering.

Since you are using container queries, though, you will need positions turned 
on for your indexes so that the server can resolve the nested structure without 
filtering.

You will want to turn on “word positions”, “element word positions” and 
“element value positions” to support resolving these types of queries 
unfiltered. For details, see this knowledgebase article:
https://help.marklogic.com/knowledgebase/article/View/245/0/queries-constrained-to-elements
as well as the “Usage Notes” for
https://docs.marklogic.com/cts.elementQuery
and
https://docs.marklogic.com/cts:json-property-scope-query
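
To illustrate why the two modes can disagree, here is a rough sketch in plain 
JavaScript. It is a toy model, not how MarkLogic actually stores or resolves its 
indexes: the function names are made up for illustration, and the "unfiltered" 
function simply assumes that, without positions, the server can only tell that 
each term occurs somewhere under "stages". The documents mirror the test data 
from your first message.

```javascript
// Toy model only: this is NOT MarkLogic's index implementation.
// It rebuilds the 20 test documents from this thread in memory.
function makeDocs() {
  const docs = [];
  for (let i = 0; i < 10; i++) {
    docs.push({ uri: "/a" + i + ".json", project: { stages: [
      { stageId: 9999, status: "CURRENT" },
      { stageId: 9999, status: "CLOSED" } ] } });
    docs.push({ uri: "/b" + i + ".json", project: { stages: [
      { stageId: 9998, status: "CURRENT" },
      { stageId: 9999, status: "CLOSED" } ] } });
  }
  return docs;
}

// Assumed model of "unfiltered, no positions": a document qualifies when each
// term occurs somewhere under "stages", no matter which entry holds it.
function unfilteredWithoutPositions(doc) {
  const stages = doc.project.stages;
  return stages.some(s => s.status === "CURRENT") &&
         stages.some(s => s.stageId === 9999);
}

// Assumed model of "filtered": a document qualifies only when one single
// entry of "stages" satisfies both conditions.
function filtered(doc) {
  return doc.project.stages.some(
    s => s.status === "CURRENT" && s.stageId === 9999);
}

const docs = makeDocs();
const unfilteredCount = docs.filter(unfilteredWithoutPositions).length;
const filteredCount = docs.filter(filtered).length;
console.log(unfilteredCount, filteredCount); // 20 10
```

In this model the "/b" documents are the false positives: they contain both 
terms, but in different array entries.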

-James

From: <[email protected]> on behalf of APEL Holger 
<[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Tuesday, August 22, 2017 at 4:05 AM
To: MarkLogic Developer Discussion <[email protected]>
Subject: Re: [MarkLogic Dev General] Search result estimation: issue with json 
array structure

Ah yes, I got the cts.andQuery parameter wrong. But what I really want to do is 
use the Java API to query my POJOs:

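import com.marklogic.client.DatabaseClient;
import com.marklogic.client.DatabaseClientFactory;
import com.marklogic.client.DatabaseClientFactory.DigestAuthContext;
import com.marklogic.client.io.SearchHandle;
import com.marklogic.client.query.StructuredQueryBuilder;
import com.marklogic.client.query.StructuredQueryDefinition;

// Both value queries are scoped to the "stages" JSON property
StructuredQueryBuilder qb = new StructuredQueryBuilder();
StructuredQueryDefinition q = qb.containerQuery(qb.jsonProperty("stages"),
    qb.and(
        qb.value(qb.jsonProperty("status"), "CURRENT"),
        qb.value(qb.jsonProperty("stageId"), 9999)
    ));

DatabaseClient client = DatabaseClientFactory.newClient("localhost", 8082,
    new DigestAuthContext("admin", "admin"));

SearchHandle result = client.newQueryManager().search(q, new SearchHandle());
logger.info("query: {}", q.serialize());
logger.info("returned: {}", result.getTotalResults());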

The serialized query is:
<query xmlns="http://marklogic.com/appservices/search">
  <container-query>
    <json-property>stages</json-property>
    <and-query>
      <value-query type="string">
        <json-property>status</json-property>
        <text>CURRENT</text>
      </value-query>
      <value-query type="number">
        <json-property>stageId</json-property>
        <text>9999</text>
      </value-query>
    </and-query>
  </container-query>
</query>

And totalResults: 20

From: [email protected] 
[mailto:[email protected]] On Behalf Of James Kerr
Sent: 2017-08-19 05:58
To: MarkLogic Developer Discussion <[email protected]>
Subject: Re: [MarkLogic Dev General] Search result estimation: issue with json 
array structure

The function signature for cts.andQuery accepts an array. You are using () 
instead of [] around your sub-queries. This should work:

fn.count(
  cts.search(
      cts.jsonPropertyScopeQuery("stages",
         cts.andQuery([
            cts.jsonPropertyValueQuery("status", "CURRENT"),
            cts.jsonPropertyValueQuery("stageId", 9999)
         ])
      )
  , 'filtered')
);
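
As for why the parentheses fail silently rather than raising an error: in 
JavaScript, (a, b) is the comma operator, which evaluates both operands and 
yields only the last one. The sketch below shows this with a hypothetical 
stand-in named andQuery (not the real cts.andQuery) that only records what it 
was given; wrapping a lone value in an array is an assumption of the stand-in, 
not a statement about the real function.

```javascript
// Hypothetical stand-in for cts.andQuery: it only records the sub-queries
// it receives. The real function builds a query object instead.
function andQuery(queries) {
  const list = Array.isArray(queries) ? queries : [queries];
  return { subQueryCount: list.length, subQueries: list };
}

// Plain objects standing in for the two cts.jsonPropertyValueQuery results.
const statusQuery = { property: "status", value: "CURRENT" };
const stageQuery = { property: "stageId", value: 9999 };

// (a, b) is the comma operator: a is evaluated and discarded, b is the result.
const withParens = andQuery((statusQuery, stageQuery));

// [a, b] is an array literal, so both sub-queries are passed along.
const withBrackets = andQuery([statusQuery, stageQuery]);

console.log(withParens.subQueryCount);          // 1
console.log(withParens.subQueries[0].property); // stageId
console.log(withBrackets.subQueryCount);        // 2
```

So with parentheses only the stageId condition would reach the and-query, and 
every test document has a stage with stageId 9999.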


From: <[email protected]> on behalf of APEL Holger <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Friday, August 18, 2017 at 5:05 AM
To: MarkLogic Developer Discussion <[email protected]>
Subject: [MarkLogic Dev General] Search result estimation: issue with json 
array structure

Hello community,

I stumbled over a use case where the total result count of a query is wrong.

Here is our set of test data:

declareUpdate();
// Insert 10 "/a" documents (one stage is both 9999 and CURRENT)
// and 10 "/b" documents (9999 and CURRENT sit in different stages).
for (var i = 0; i < 10; i++) {
  xdmp.documentInsert(
    "/a" + i + ".json",
    {
      "project": {
        "stages": [{"stageId": 9999, "status": "CURRENT"},
                   {"stageId": 9999, "status": "CLOSED"}]
      }
    }
  );
  xdmp.documentInsert(
    "/b" + i + ".json",
    {
      "project": {
        "stages": [{"stageId": 9998, "status": "CURRENT"},
                   {"stageId": 9999, "status": "CLOSED"}]
      }
    }
  );
}

fn.count(
  cts.search(
      cts.jsonPropertyScopeQuery("stages",
         cts.andQuery(
            (cts.jsonPropertyValueQuery("status", "CURRENT"),
             cts.jsonPropertyValueQuery("stageId", 9999))
         )
      )
  , 'filtered')
);

This returns 20, but in XQuery

fn:count(
  cts:search(/,
      cts:json-property-scope-query("stages",
         cts:and-query(
            (cts:json-property-value-query("status", "CURRENT"),
             cts:json-property-value-query("stageId", 9999))
         )
     )
  )
)

correctly returns 10

I guess cts.search uses the search:* module behind the scenes, because 
search:resolve gives me the same result for an equivalent query. So it seems 
the problem is result estimation. I know search:resolve uses xdmp:estimate and 
xdmp:remainder, and replacing fn:count with xdmp:estimate

xdmp:estimate(
  cts:search(/,
      cts:json-property-scope-query("stages",
         cts:and-query(
            (cts:json-property-value-query("status", "CURRENT"),
             cts:json-property-value-query("stageId", 9999))
         )
     )
  )
)

also gives me 20.

In our use case the data set is rather small so the wrong estimates are very 
noticeable and not acceptable.
So my question: is there a way to get the right count?

· By tuning some indexes
· Using additional query-options
· Changing our query or even data model if there is no other way

Any hint is welcome

holger apel
software manager | information technology and electronic services | iso central 
secretariat<http://www.iso.org/iso/contact_iso>




_______________________________________________
General mailing list
[email protected]
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general
