Count co-occurrences

Andrej Kastrin Fri, 22 Jun 2007 10:04:50 -0700

Dear all,

I wrote a simple SQL query to count co-occurrences between words, but it runs very slowly on large datasets. So, it's time to do it with Perl. I need just a short tip to start out: which structure should I use to count all possible co-occurrences between letters (e.g. A, B and C) within each document number? My dataset looks like the following:


1 A
1 B
1 C
1 B
2 A
2 A
2 B
2 C
etc. till doc. number 100.000

The result file should then be similar to:

A B 4 ### 2 co-occurrences under doc. number 1 + 2 co-occurrences under doc. number 2
A C 3 ### 1 co-occurrence under doc. number 1 + 2 co-occurrences under doc. number 2
B C 3 ### 2 co-occurrences under doc. number 1 + 1 co-occurrence under doc. number 2
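
One way to read the expected counts above: within each document, a pair's co-occurrences are the product of the two items' frequencies (doc. number 1 has A once and B twice, so A B contributes 1 x 2 = 2). Below is a minimal Perl sketch under that assumption, using a hash of hashes keyed by document number and item. The sub name count_cooccurrences and the in-memory sample list are hypothetical, chosen only for illustration; real input would be read from a file.

```perl
use strict;
use warnings;

# Hypothetical helper: takes lines of the form "doc item" and returns a
# hash reference mapping "X Y" (items in sorted order) to the total count.
sub count_cooccurrences {
    my @lines = @_;

    # $freq{$doc}{$item} = how often $item occurs in document $doc
    my %freq;
    for my $line (@lines) {
        my ($doc, $item) = split ' ', $line;
        next unless defined $item;    # skip blank or malformed lines
        $freq{$doc}{$item}++;
    }

    # Assumption: co-occurrences of a pair within one document are the
    # product of the two items' frequencies in that document.
    my %pairs;
    for my $doc (keys %freq) {
        my @items = sort keys %{ $freq{$doc} };
        for my $i (0 .. $#items - 1) {
            for my $j ($i + 1 .. $#items) {
                my ($x, $y) = @items[$i, $j];
                $pairs{"$x $y"} += $freq{$doc}{$x} * $freq{$doc}{$y};
            }
        }
    }
    return \%pairs;
}

# The sample data from the message above.
my @sample = ('1 A', '1 B', '1 C', '1 B', '2 A', '2 A', '2 B', '2 C');
my $pairs  = count_cooccurrences(@sample);
print "$_ $pairs->{$_}\n" for sort keys %$pairs;
```

For the sample data this prints A B 4, A C 3 and B C 3, matching the result lines above. If the input is sorted by document number, the per-document hash could probably be processed and emptied each time the number changes, so that only one document is held in memory at a time.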


Thanks in advance for any pointers.

Best, Andrej