Hi Jack,

In general, it should always be possible to get ft:mark working somehow;
it’s just difficult to give general advice on how to do it ;)

If you like, you can provide us with a stripped-down version of your code
that you can’t get to work.
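
A stripped-down version of the setup described below might look roughly
like the following sketch. It is only an illustration under assumptions: the
module namespace, the '/search' path, the 'q' parameter and the page:*
function names are hypothetical, and the search function is reduced to a
single ft:mark call taken from the variant quoted further down.

```xquery
(: Hypothetical library module; namespace, path and names are assumptions. :)
module namespace page = 'http://example.org/page';

(: Reduced search function, modelled on the variant quoted in this thread. :)
declare function page:search(
  $database  as xs:string,
  $query     as xs:string
) {
  let $country := ft:search($database, $query)/ancestor::country
  let $search := function($node) { $node/text() contains text { $query } }
  return ft:mark($country[.//name[$search(.)]])
};

(: Hypothetical RESTXQ endpoint that calls the search function
   through a named function reference, as described in the thread. :)
declare
  %rest:GET
  %rest:path('/search')
  %rest:query-param('q', '{$q}')
function page:endpoint(
  $q  as xs:string
) {
  let $f := page:search#2
  return <results>{ $f('factbook', $q)//mark }</results>
};
```

Comparing the number of <mark> elements returned by this endpoint with the
number returned by calling the search function directly would show whether
the metadata is lost in the containing function.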

Best,
Christian


On Wed, Mar 6, 2024 at 5:25 AM Jack Steyn <steynj...@gmail.com> wrote:

> Hi Christian,
>
> Thank you very much for your explanation and variant example.
>
> In my use case, the local:search function is itself being called (as a
> named function reference) from within another function that is the endpoint
> of a RESTXQ API. This containing function handles a number of things e.g.
> pagination, deduplication, transformation of XML into HTML.
>
> Even when I rewrite this local:search to match your variant example,
> incorrect results are still returned. But when I then add %rest:GET
> annotations to turn the local:search function into its own endpoint, the
> correct results are returned only when I use that endpoint directly.
>
> Thus I assume the containing function again makes things too complicated
> for metadata to be propagated.
>
> Does that sound plausible to you? And can you suggest any simple ways
> around it? I'm afraid applying %basex:inline hasn't helped.
>
> Very best,
>
> Jack
>
> On Fri, 1 Mar 2024, 8:12 pm Christian Grün, <christian.gr...@gmail.com>
> wrote:
>
>> Hi Jack,
>>
>> > When you say you can't reproduce it, do you mean you get 14 results
>> > from running this script?
>>
>> Yes, that’s what I meant.
>>
>> The following explanation is quite technical and specific. You are
>> welcome to focus on the examples.
>>
>> Your updated example was helpful, and I noticed that several issues
>> combine to produce the unexpected results. The core challenge is that
>> ft:mark and ft:extract only yield expected results if the internally
>> collected full-text metadata is not lost at some stage during the internal
>> processing, and this can happen in many places that are hidden from the
>> writer of the query.
>>
>> In your specific example, the full-text information gets lost because the
>> local:search function is too complex to be inlined by the compiler
>> (inlining enables further optimizations that eventually allow metadata
>> propagation). You can tackle this by forcing the compiler to inline your
>> function:
>>
>>   declare %basex:inline function local:search(...)
>>
>> Using '(ethnicgroups, languages)' instead of 'name() = (...)' is another
>> practical tip; it helps the optimizer detect at compile time that
>> metadata will be available at runtime. Another solution is to use
>> 'local-name()' instead of 'name()' (local-name does not depend on
>> namespaces that may occur in a database, which also affects the way
>> full-text queries are evaluated).
>>
>> Here’s a variant that should work:
>>
>> declare function local:search(
>>   $database  as xs:string,
>>   $query     as xs:string
>> ) {
>>   let $country := ft:search($database, $query)/ancestor::country
>>   let $search := function($node) { $node/text() contains text { $query } }
>>   return (
>>     ft:mark($country[.//name[$search(.)]]),
>>     ft:mark($country[.//city[$search(.)]]),
>>     ft:mark($country[.//(ethnicgroups, languages)[$search(.)]])
>>   )
>> };
>> local:search('factbook', 'German')
>>
>> …or…
>>
>>   let $search := function($nodes) { $nodes[text() contains text { $query }] }
>>   return (ft:mark($country[$search(.//name)]), ...
>>
>> From today’s perspective, we would certainly design ft:mark and
>> ft:extract so that the results are always correct. The consequence,
>> however, would be a much more restricted syntax.
>>
>> Hope this helps,
>> Christian
>>
>>
>> On Thu, Feb 29, 2024 at 12:13 AM Jack Steyn <steynj...@gmail.com> wrote:
>>
>>> Hi Christian,
>>>
>>> When I run your script, I do get 14 elements.
>>>
>>> When I run the following script I just get 12.
>>>
>>> <commands>
>>>   <set option='ftindex'>true</set>
>>>   <create-db name='factbook'>https://files.basex.org/xml/factbook.xml</create-db>
>>>   <xquery><![CDATA[
>>> declare function local:search(
>>>     $database as xs:string,
>>>     $query as xs:string
>>> ) {
>>>     let $country-search := ft:search($database, $query)/ancestor::country
>>>     let $city-search := ft:search($database, $query)
>>>         /ancestor::city/ancestor::country
>>>     let $other-search := ft:search($database, $query)
>>>         /parent::*[name() = ('ethnicgroups', 'languages')]/ancestor::country
>>>     let $country-mark := $country-search
>>>         [.//name[text() contains text { $query }]] => ft:mark()
>>>     let $city-mark := $city-search
>>>         [.//city[text() contains text { $query }]] => ft:mark()
>>>     let $other-mark := $other-search
>>>         [.//*[name() = ('ethnicgroups', 'languages')]
>>>             [text() contains text { $query }]] => ft:mark()
>>>     return (
>>>         $country-mark,
>>>         $city-mark,
>>>         $other-mark
>>>     )
>>> };
>>>
>>> local:search('factbook', 'German')//mark
>>>   ]]></xquery>
>>> </commands>
>>>
>>> When you say you can't reproduce it, do you mean you get 14 results from
>>> running this script?
>>>
>>> Cheers,
>>>
>>> Jack
>>>
>>> On Thu, 29 Feb 2024, 1:02 am Christian Grün, <christian.gr...@gmail.com>
>>> wrote:
>>>
>>>> Hi Jack,
>>>>
>>>> Thanks for your observation.
>>>>
>>>>
>>>>> The first result of this query is the entry for Austria. I would
>>>>> expect both of the instances of the word 'German' in that entry to be
>>>>> surrounded by <mark> tags. However only the first instance is.
>>>>>
>>>>
>>>> I couldn’t reproduce this yet. Here’s a command script that returns 14
>>>> <mark>German</mark> elements:
>>>>
>>>> <commands>
>>>>   <set option='ftindex'>true</set>
>>>>   <create-db name='factbook'>https://files.basex.org/xml/factbook.xml</create-db>
>>>>   <xquery><![CDATA[
>>>> let $groups := ('ethnicgroups', 'languages')
>>>> let $database := 'factbook'
>>>> let $query := 'German'
>>>>
>>>> let $search := ft:search($database, $query)/parent::*
>>>>   [name() = $groups]/ancestor::country
>>>> let $marked := ft:mark(
>>>>   $search[.//*[name() = $groups][text() contains text { $query }]]
>>>> )
>>>> return $marked//*[text() = 'German']
>>>>   ]]></xquery>
>>>> </commands>
>>>>
>>>> Could you check if you get the same result?
>>>>
>>>> Thanks in advance
>>>> Christian
>>>>
>>>>
