Hi,
In text mining, the 'idf' or inverse document frequency is defined as
idf(term) = ln(ndocuments / ndocuments containing term). I am working on a
function that should return this idf.
This function:
declare function local:wordFreq_idf($nodes as node()*) as array(*) {
let $count := count($nodes
Hi Ben -
I'm on mobile, please excuse any typos.
Maybe
`return array { $idf }`
is closer?
Untested, apologies!
Best,
Bridger
On Mon, Mar 30, 2020, 5:16 PM Ben Engbers wrote:
> Hi,
>
> In text mining, the 'idf' or inverse document frequency is defined as
> idf(term)=ln(ndocuments / ndocuments c
On Mon, Mar 30, 2020 at 11:16:23PM +0200, Ben Engbers scripsit:
[snip]
> For "probleem", the idf should be calculated as ln($count/703). Since
> there are 1780 nodes this would result in 0.929011751.
> I tried to extend the 'let $idf' line with:
>=> array:for-each(function($idf) {array:appen
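The arithmetic quoted above can be checked on its own. A minimal sketch, assuming an XQuery 3.1 processor; the function name `local:idf` and its parameter names are hypothetical and do not come from the thread:

```xquery
(: declared explicitly so the sketch does not rely on a predeclared prefix :)
declare namespace math = "http://www.w3.org/2005/xpath-functions/math";

(: idf(term) = ln(ndocuments / ndocuments containing term) :)
declare function local:idf(
  $n-docs      as xs:integer,   (: total number of documents (nodes) :)
  $n-with-term as xs:integer    (: number of documents containing the term :)
) as xs:double {
  (: math:log is the natural logarithm :)
  math:log($n-docs div $n-with-term)
};

(: the figures from the message: 1780 nodes, 703 of them containing "probleem" :)
local:idf(1780, 703)
```

With these inputs the result is approximately 0.929, in line with the 0.929011751 stated in the message.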
On 31-03-2020 at 01:18, Graydon wrote:
> On Mon, Mar 30, 2020 at 11:16:23PM +0200, Ben Engbers scripsit:
> [snip]
>> For "probleem", the idf should be calculated as ln($count/703). Since
>> there are 1780 nodes this would result in 0.929011751.
>> I tried to extend the 'let $idf' line with:
>>
On 30.03.2020 at 23:16, Ben Engbers wrote:
Hi,
In text mining, the 'idf' or inverse document frequency is defined as
idf(term) = ln(ndocuments / ndocuments containing term). I am working on a
function that should return this idf.
This function:
declare function local:wordFreq_idf($nodes as node(
On Tue, Mar 31, 2020 at 04:21:52PM +0200, Ben Engbers scripsit:
> On 31-03-2020 at 01:18, Graydon wrote:
> > On Mon, Mar 30, 2020 at 11:16:23PM +0200, Ben Engbers scripsit:
> > [snip]
> >> For "probleem", the idf should be calculated as ln($count/703). Since
> >> there are 1780 nodes this would re
Hi,
> => means "take the thing on the left and substitute it for the first
> parameter of the function on the right, so
I thought it meant "The first parameter on the right will be substituted
with the thing on the left"?
> ('weasels') => replace('weasels','mustelids') works
>
> ('weasels','bad
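For reference, a minimal sketch of how the arrow operator behaves in XQuery 3.1: the expression on the left of `=>` is passed as the first argument of the function call on the right, so both readings quoted above describe the same thing. The variable name `$word` and the sample string 'stoats' are illustrative only and are not from the thread.

```xquery
let $word := 'weasels'
return (
  (: same as replace($word, 'weasels', 'mustelids') :)
  $word => replace('weasels', 'mustelids'),

  (: arrows chain left to right: string-length(upper-case($word)) :)
  $word => upper-case() => string-length(),

  (: a sequence on the left is passed whole as the first argument:
     count(('weasels', 'stoats')) :)
  ('weasels', 'stoats') => count()
)
```

This returns the sequence 'mustelids', 7, 2. Note that the third case only works because `count` accepts a sequence as its first parameter; a function that expects a single item there would raise a type error.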
Hi,
For (my personal) clarity, I have split up the original function into two
parts:
declare function local:step_one($nodes as node()*) as array(*)*
{
  let $text := for $node in $nodes
               return $node/text() =>
                 tokenize() => distinct-values()
  let $idf := $text =>
    tidyTM:wordCount_a
On 31.03.2020 18:32, Ben Engbers wrote:
Hi,
For (my personal) clarity, I have split up the original function into two
parts:
declare function local:step_one($nodes as node()*) as array(*)*
{
  let $text := for $node in $nodes
               return $node/text() =>
                 tokenize() => distinct-values()
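One possible way to go from tokenized nodes to an idf value per term, offered as a sketch under stated assumptions rather than as the function discussed in this thread. It assumes XQuery 3.1 (one-argument `tokenize`); `local:idf-map`, the sample `<d>` elements, and the use of `string($node)` instead of `$node/text()` are all choices made for this illustration.

```xquery
declare namespace math = "http://www.w3.org/2005/xpath-functions/math";
declare namespace map  = "http://www.w3.org/2005/xpath-functions/map";

(: hypothetical helper: maps each term to ln(#nodes / #nodes containing the term) :)
declare function local:idf-map($nodes as node()*) as map(xs:string, xs:double) {
  let $n := count($nodes)
  (: distinct tokens per node, so a term is counted at most once per node :)
  let $terms :=
    for $node in $nodes
    return distinct-values(tokenize(string($node)))
  return map:merge(
    for $term in distinct-values($terms)
    return map:entry($term, math:log($n div count($terms[. = $term])))
  )
};

(: made-up sample data: "de" occurs in 2 of the 3 nodes :)
let $docs := (<d>de kat</d>, <d>de hond</d>, <d>het probleem</d>)
return local:idf-map($docs)?de   (: ln(3 div 2), about 0.405 :)
```

Returning a single map keeps the term and its idf together; if an array per term is needed instead, the `map:entry` call could be swapped for an array constructor.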