Hi,

In text mining, the 'idf' (inverse document frequency) is defined as idf(term) = ln(ndocuments / ndocuments containing term). I am working on a function that should return this idf.

This function:

    declare function local:wordFreq_idf($nodes as node()*) as array(*) {
      let $count := count($nodes)
      let $text :=
        for $node in $nodes
        return $node/text() => tokenize() => distinct-values()
      let $idf := $text => tidyTM:wordCount_arr()
      return $idf
    };

returns:

    ["probleem", 703]
    ["opgelost.", 248]
    ["dictu", 235]
    ["opgelost", 217]
    ["medewerker", 193]
    ...

For "probleem", the idf should be calculated as ln($count/703). Since there are 1780 nodes, this would result in 0.929011751.

I tried to extend the 'let $idf' line with:

    => array:for-each(function($idf) {array:append($idf, math:log($count div $idf[2]) )})

which should result in ["probleem", 703, 0.929011751], but no matter what I do, every time I get this error:

    [XPTY0004] Cannot promote (array(xs:anyAtomicType))+ to array(*): ([ "probleem", 703 ], [ "opgelost.", 248 ], ...).

Is it possible to apply array:for-each on an array of arrays?

Ben
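
For reference, the type in the error message, (array(xs:anyAtomicType))+, describes a sequence of arrays rather than a single array of arrays, and array:for-each accepts only one array as its first argument. The following is a minimal sketch of the two shapes, not a definitive fix: the variable names and the two sample entries are hypothetical stand-ins for the output of tidyTM:wordCount_arr(), whose actual implementation is not shown here. It also uses $entry(2) for array lookup, because a predicate such as $idf[2] filters the sequence by position instead of selecting the second array member.

```xquery
xquery version "3.1";

(: Declared explicitly for portability; some processors predeclare these. :)
declare namespace array = "http://www.w3.org/2005/xpath-functions/array";
declare namespace math  = "http://www.w3.org/2005/xpath-functions/math";

(: Hypothetical sample data in the shape reported above:
   a SEQUENCE of [term, document-count] arrays. :)
let $count  := 1780
let $counts := (["probleem", 703], ["opgelost.", 248])

(: Sequence of arrays: map over the items with the simple map operator. :)
let $from-sequence :=
  $counts ! array:append(., math:log($count div .(2)))

(: Array of arrays: wrap the sequence in one outer array first,
   then array:for-each can visit each inner array. :)
let $from-array :=
  array:for-each(
    array { $counts },
    function($entry) {
      array:append($entry, math:log($count div $entry(2)))
    }
  )

return ($from-sequence, $from-array)
```

Under these assumptions, the first entry of either result carries ln(1780 div 703), roughly 0.929, as its third member. The first form returns a sequence of arrays; the second returns a single outer array, which is the shape a declared return type of array(*) would expect.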