Hi, Will:

The Search API and its dynamic parser were introduced in MarkLogic 4.0 (or maybe 
4.1).  A lot has changed since then.

MarkLogic 8 made it possible for MarkLogic customers to take advantage of the 
JavaScript ecosystem. Part of the motivation for taking on the work of building 
V8 into the server was to give customers access to a broad set of tools and 
resources that no single company could possibly provide in isolation. 

That's especially true for customers who are starting new projects.

In particular, generating parsers for textual grammars is a general problem 
(not specific to MarkLogic). If tools that solve the problem exist in the 
ecosystem, MarkLogic isn't adding value by creating equivalents. Instead, we 
add value by working on the tools and capabilities that are missing.

It makes sense that "glue" examples showing how to use parser generators with 
MarkLogic would help bridge the gap; I'll raise that suggestion internally. 

I realize the explanation above might not be convincing, but I hope it's at 
least useful in understanding the rationale for suggesting JavaScript parser 
generators.


Erik Hennum

________________________________________
From: [email protected] 
[[email protected]] on behalf of Will Thompson 
[[email protected]]
Sent: Wednesday, February 01, 2017 12:34 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Custom search grammar

Hi Oleksii,

If it seems odd to you that MarkLogic continues to pressure you *not* to use 
the API they built, you are not alone.

Erik,

I don't think it's reasonable to suggest to customers only needing to extend or 
replace a small portion of Search API that they rewrite significant parts of 
its functionality using 3rd-party tools (some of which haven't been updated for 
years) instead of using the clearly documented extensibility hooks of the 
ML-provided API.

If MarkLogic wants to push customers with more complex Search API needs in a 
different direction, that's fine, but it would be a lot more palatable if ML 
actually did some of the legwork upfront - provide code examples, blogs, 
documentation, etc. - to demonstrate how that should be done correctly. At a 
minimum, it's confusing to be told not to use the provided tools and 
frustrating that the alternatives suggested require a lot more work and 
uncertainty. Ideally, if ML doesn't want customers using parts of the Search 
API, they should just build a replacement that they are willing to endorse 
(maybe even using one of the JS parsers you recommend).

-Will


> On Feb 1, 2017, at 11:58 AM, Erik Hennum <[email protected]> wrote:
>
> Hi, Oleksii:
>
> To be clear, we discourage use of custom grammars.
>
> Besides the JavaScript parser generators that I mentioned previously, you 
> might also consider the XQuery approach demonstrated in:
>
>    https://github.com/mblakele/xqysp
>
> These approaches will support more flexible and performant parsers than the 
> dynamic grammar of the Search API.
>
> If you have a requirement that can be addressed only by a custom grammar and 
> not by one of these approaches, please open a support ticket.
>
>
> Erik Hennum
>
> ________________________________________
> From: [email protected] 
> [[email protected]] on behalf of Oleksii Segeda 
> [[email protected]]
> Sent: Wednesday, February 01, 2017 7:28 AM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Custom search grammar
>
> Hi Erik,
>
> Did you figure out how to extend the grammar?
>
> Regards,
> Oleksii Segeda
> IT Analyst
> Information and Technology Solutions
>
>
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Oleksii Segeda
> Sent: Monday, January 30, 2017 3:09 PM
> To: MarkLogic Developer Discussion <[email protected]>
> Subject: Re: [MarkLogic Dev General] Custom search grammar
>
> Hi Erik,
>
> Yes, that is the desired behavior.
>
> Ideally, I would like to avoid custom constraints, simply because search 
> grammar looks cleaner in the search box. In addition, some of our users are 
> already familiar with simple search operators like AND and OR, so BOOST won't 
> look alien to them.
>
> I guess postprocessing could be used, as you suggested; however, I'm interested 
> in a custom search grammar because I may need to extend it more in the future.
>
> Thank you,
> Oleksii Segeda
> IT Analyst
> Information and Technology Solutions
>
>
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Erik Hennum
> Sent: Monday, January 30, 2017 2:42 PM
> To: MarkLogic Developer Discussion <[email protected]>
> Subject: Re: [MarkLogic Dev General] Custom search grammar
>
> Hi, Oleksii:
>
> Thanks for providing more detail.
>
> Just to confirm, is it clear that, in a boost query, the right-hand
> term is optional?  Documents with only the left-hand term will still
> appear in the results though with less relevance than documents
> that have both terms.
>
> By contrast, AND-related terms are both required and both
> contribute to relevance.
>
> Anyway, to increase weight, one approach would be to define a tag
> for a quoted phrase and pass the phrase to a Search API custom
> constraint or to cts:parse() with a binding to a query generator function:
>
>    http://docs.marklogic.com/guide/search-dev/cts_query#id_13456
>
> The custom code could then tokenize the phrase and combine the
> terms with a boost-query or and-query, adding appropriate weight.
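>
> A rough, untested sketch of this first approach follows. The binding name
> "boost", the function local:boost-phrase, the whitespace tokenization, and
> the weight 10.0 are all assumptions made for illustration; the generator
> signature is assumed to be the one described for cts:parse() bindings in the
> guide linked above. The first word of the phrase becomes the matching query
> and the remaining words become weighted boosting queries:
>
> ```xquery
> xquery version "1.0-ml";
>
> (: Hypothetical query generator bound to the tag "boost". :)
> declare function local:boost-phrase(
>   $operator as xs:string,
>   $values as xs:anyAtomicType*,
>   $options as xs:string*
> ) as cts:query?
> {
>   (: Split the phrase on whitespace; a real tokenizer may be preferable. :)
>   let $words :=
>     for $v in $values
>     return fn:tokenize(fn:normalize-space(fn:string($v)), " ")[. ne ""]
>   return
>     if (fn:empty($words)) then ()
>     else if (fn:count($words) eq 1) then cts:word-query($words)
>     else
>       cts:boost-query(
>         (: required term :)
>         cts:word-query($words[1]),
>         (: optional terms that raise relevance when present :)
>         cts:or-query(
>           for $w in fn:subsequence($words, 2)
>           return cts:word-query($w, (), 10.0)))
> };
>
> let $bindings := map:map()
> let $_ := map:put($bindings, "boost", local:boost-phrase#3)
> return cts:search(fn:doc(), cts:parse('boost:"trade water"', $bindings))
> ```
>
> With this sketch the user would type boost:"trade water" rather than
> `trade BOOST water`, which is the trade-off of using a tag instead of a
> custom grammar.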
>
> Another approach would be to do postprocessing of the query tree
> returned by cts:parse() or search:parse() to replace the default
> boost-query or and-query with a query that has more weight.
>
> In either approach, you would then search on the query.
>
> I mention cts:parse() because it parses query text more quickly
> than search:parse().
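>
> A minimal sketch of the postprocessing approach, assuming that cts:parse()
> returns a cts:boost-query for the BOOST operator and that the terms are plain
> word queries. The local: function names and the weight 10.0 are hypothetical,
> and only and-queries are descended into; other query types are returned
> unchanged:
>
> ```xquery
> xquery version "1.0-ml";
>
> (: Hypothetical helper: rebuild a word query with a new weight. :)
> declare function local:reweight(
>   $q as cts:query,
>   $weight as xs:double
> ) as cts:query
> {
>   typeswitch ($q)
>   case cts:word-query return
>     cts:word-query(
>       cts:word-query-text($q),
>       cts:word-query-options($q),
>       $weight)
>   default return $q
> };
>
> (: Walk the parsed tree and add weight to each boosting query. :)
> declare function local:add-boost-weight(
>   $q as cts:query,
>   $weight as xs:double
> ) as cts:query
> {
>   typeswitch ($q)
>   case cts:boost-query return
>     cts:boost-query(
>       local:add-boost-weight(cts:boost-query-matching-query($q), $weight),
>       local:reweight(cts:boost-query-boosting-query($q), $weight))
>   case cts:and-query return
>     cts:and-query(
>       for $sub in cts:and-query-queries($q)
>       return local:add-boost-weight($sub, $weight),
>       cts:and-query-options($q))
>   default return $q
> };
>
> let $query := local:add-boost-weight(cts:parse("trade BOOST water"), 10.0)
> return cts:search(fn:doc(), $query)
> ```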
>
>
> Hoping that helps,
>
> Erik Hennum
>
> ________________________________________
> From: [email protected] 
> [[email protected]] on behalf of Oleksii Segeda 
> [[email protected]]
> Sent: Monday, January 30, 2017 10:55 AM
> To: [email protected]
> Subject: Re: [MarkLogic Dev General] Custom search grammar
>
> Hi Erik,
>
> I'm trying to boost some parts of a search query. For example, if a user types 
> `trade BOOST water`, I want documents with the word "water" to be higher in 
> the results.
> cts:boost-query seems to be a perfect fit, but the default BOOST doesn't let 
> you specify weights.
>
> My ultimate goal is to convert `trade BOOST water` to something like this:
>
>    cts:boost-query(cts:word-query("trade"), cts:word-query("water", (), 10.0))
>
> Regards,
> Oleksii Segeda
> IT Analyst
> Information and Technology Solutions
>
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of 
> [email protected]
> Sent: Monday, January 30, 2017 1:08 PM
> To: [email protected]
> Subject: General Digest, Vol 151, Issue 42
>
>
> Today's Topics:
>
>   1. Custom search grammar (Oleksii Segeda)
>   2. Re: Custom search grammar (Erik Hennum)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 30 Jan 2017 16:51:26 +0000
> From: Oleksii Segeda <[email protected]>
> Subject: [MarkLogic Dev General] Custom search grammar
> To: "[email protected]"
>        <[email protected]>
> Message-ID:
>        
> <bn1pr0101mb0769b9cdcd5e7697ace8381bcb...@bn1pr0101mb0769.prod.exchangelabs.com>
>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi there,
>
> I'm trying to declare a custom search grammar. I declared a custom function 
> via search options, which is supposed to parse the "BOOST" keyword:
>
> <joiner strength="2" apply="custom-boost" 
> ns="http://worldbankgroup.org/search/grammar" at="/lib/grammar-boost.xqy" 
> tokenize="word">BOOST</joiner>
>
> I declared this function and just copied the existing implementation from the 
> impl:joiner-boost function in /MarkLogic/appservices/search/search-impl.xqy:
>
> declare function grammar:custom-boost($ps as map:map, $left as element()?,
>     $opts as element()?) as schema-element(cts:query) {
>    let $symbol := impl:symbol-lookup($ps)
>    let $_ := tdop:advance($ps)
>    let $expr1 := tdop:expression($ps, $symbol/@strength)
>    return
>        if (empty($left))
>        then ($left, impl:msg($ps,
>            <cts:annotation warning="SEARCH-IGNOREDQTEXT:[{string($symbol)} {$expr1}]: expected two arguments"/>))
>        else
>            element { xs:QName($symbol/@element) } {
>                attribute qtextjoin {concat($symbol/string())},
>                attribute strength {$symbol/@strength},
>                attribute qtextgroup {
>                    impl:opts($ps)/opt:grammar/opt:starter[@apply eq "grouping"]/(string(), @delimiter/string()) },
>                for $opt in $symbol/@options/tokenize(normalize-space(.), "\s")
>                return <cts:option>{$opt}</cts:option>,
>                element cts:matching-query {
>                    attribute qtextref { "schema-element(cts:query)" },
>                    $left },
>                element cts:boosting-query {
>                    attribute qtextref { "schema-element(cts:query)" },
>                    $expr1 }
>            }
> };
>
> Unfortunately this doesn't work, because for some reason impl:symbol-lookup 
> returns an empty sequence.
> Any ideas what went wrong here?
>
>
> Oleksii Segeda
>
> IT Analyst
>
> Information and Technology Solutions
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 30 Jan 2017 18:07:41 +0000
> From: Erik Hennum <[email protected]>
> Subject: Re: [MarkLogic Dev General] Custom search grammar
> To: MarkLogic Developer Discussion <[email protected]>
> Message-ID:
>        <dfdf2fd50bf5aa42adaf93ff2e3ca1850bd7d...@exchg10-be01.marklogic.com>
> Content-Type: text/plain; charset="windows-1252"
>
> Hi, Oleksii:
>
> Can you explain what you are trying to accomplish?
>
> There may be better ways of doing the same thing than creating a custom 
> grammar, which is really a tool of last resort.
>
> For instance, a custom constraint can map a term to a custom query.
>
> For other cases, it's often useful to do postprocessing on the generated 
> query.
>
> If a custom grammar really is unavoidable, in many cases a special-purpose 
> third-party parsing tool may provide a faster and more flexible alternative 
> to the limited custom grammar in the Search API.
>
> For instance, the Jison.js and Peg.js parsers work with server-side 
> JavaScript. (A nearly.js parser is also available, though I've heard no 
> reports about it yet.)
>
>
> Hoping that helps,
>
>
> Erik Hennum
>
>
>
> ------------------------------
>
> _______________________________________________
> General mailing list
> [email protected]
> Manage your subscription at:
> http://developer.marklogic.com/mailman/listinfo/general
>
>
> End of General Digest, Vol 151, Issue 42
> ****************************************
