On 31.03.2020 18:32, Ben Engbers wrote:
Hi,

For (my personal) clarity, I have split up the original function in two
parts:

declare function local:step_one($nodes as node()*) as array(*)*
{
   let $text := for $node in $nodes
      return $node/text() =>
      tokenize() => distinct-values()
   let $idf := $text   =>
      tidyTM:wordCount_arr()
   return $idf
};

In local:step_one(), I first create a sequence of the distinct tokens
for each $node. All of these sequences are concatenated in $text.
I then call wordCount_arr to count the occurrences of each word in $text:

declare function tidyTM:wordCount_arr(
   $Words as xs:string*)
   as array(*) {
for $w in $Words
   let $f := $w
   group by $f
   order by count($w) descending
return ([$f, count($w)])
} ;

I would say that tidyTM:wordCount_arr returns a sequence of arrays, but I
am not certain whether I have specified the correct return type.

Reading the code, I agree that the function returns a sequence of
arrays. That makes me wonder why you don't get an error here similar to
the one further down, given that the declared return type is
  array(*)
and not
  array(*)*
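
As a minimal sketch of what I mean (not tested against your module), here
is the counting function with the return type declared as a sequence of
arrays. I have put it in the local: namespace only so that the snippet
runs on its own, and the three input strings are made up:

```xquery
xquery version "3.1";

(: Same body as tidyTM:wordCount_arr, but declared to return
   zero or more arrays, i.e. array(*)* instead of array(*). :)
declare function local:wordCount_arr($Words as xs:string*) as array(*)*
{
  for $w in $Words
  let $f := $w
  group by $f
  order by count($w) descending
  (: one [word, count] array per distinct word :)
  return [$f, count($w)]
};

(: made-up sample input :)
local:wordCount_arr(("a", "b", "a"))
```

For that sample input I would expect the two arrays ["a", 2] and ["b", 1].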

Calling local:step_one(tidyTM:remove_Stopwords($nodes, "Stp", $Stoppers))
returns:
["probleem", 703]
["opgelost.", 248]
....

I had hoped that calling the following local:wordFreq_idf would add the
idf to each array, but instead I get an error:

declare function local:wordFreq_idf($nodes as node()*)  as array(*)
{
   let $count := count($nodes)
   let $idf := local:step_one($nodes)
   let $result := for-each( $idf,
     function($z) {array:append ($z, math:log($count div $z(2) ) ) } )
   return $result
};
[XPTY0004] Cannot promote (array(xs:anyAtomicType))+ to array(*): $idf
:= ([ "probleem", 703 ], [ "opgelost.", 248 ], ...).

The message tries to tell you that the declared return type
  array(*)
is a single array, while the function returns a (non-empty) sequence of
arrays, so using
  declare function local:wordFreq_idf($nodes as node()*)  as array(*)*
would remove that error.

To append the third value to each array, I think you want

  let $result := $idf ! array:append(., math:log($count div .(2) ))
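
Putting both changes together, a self-contained sketch could look like
the following. The function name local:add_idf, the two sample arrays
and the document count of 1000 are assumptions of mine that stand in for
your local:step_one() result and count($nodes); only the return type and
the use of the simple map operator "!" are the actual suggestion:

```xquery
xquery version "3.1";

(: The math: and array: prefixes are predeclared in XQuery 3.1. :)

(: Hypothetical helper: takes the [word, count] arrays and the number
   of documents, and returns each array with a third member appended. :)
declare function local:add_idf(
  $idf   as array(*)*,
  $count as xs:integer
) as array(*)*
{
  (: "." is the current array, .(2) is its second member (the count) :)
  $idf ! array:append(., math:log($count div .(2)))
};

(: made-up stand-ins for local:step_one($nodes) and count($nodes) :)
local:add_idf((["probleem", 703], ["opgelost.", 248]), 1000)
```

Each resulting array should then have three members: the word, its count,
and the natural logarithm of $count divided by that count.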
