We are pleased to announce the release of nameconflate v0.15. You can find it here:

http://www.d.umn.edu/~tpederse/tools.html

This is the program we use to create pseudo words (or conflated names) in either English GigaWord text or plain text. It has been very helpful in recent name discrimination papers, and has created nearly all of our evaluation data for the name discrimination work Anagha has been doing.

This version has three new options and two bug fixes, as described below.

New Options:

1. '--linecontext' : one line per context for plain text
2. '--instnum NUM' : set starting instance number (default: 1)
3. '--lang LANG' : set language (default: english)

Bugs:

1. If, say, windowSize=25, one of the target words is America, and the text is:

   "Welcome to Bank of America, the nation's leading financial institution"

   then we would expect the context to be:

   "Welcome to Bank of <head>A_X</head>, the nation's leading financial institution"

   but what we were getting was:

   "Welcome to Bank of <head>A_X</head>, the nation's leading financial "

   i.e. the last word was not included even though it was within the windowSize limit. The problem was with one of the regexes I was using. This problem is now fixed.

2. A stale variable value was being used in a specific situation.

Please let us know if you have any questions or comments.

Enjoy,
Ted and Anagha
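
For readers who want to see the behavior described in bug 1 in concrete terms, here is a minimal Python sketch of window-based context extraction. It is not nameconflate's own code. It assumes the window is counted in whitespace-separated tokens on each side of the target word, which the announcement does not state, and the function name extract_context and its parameters are hypothetical, chosen only for illustration.

```python
import re


def extract_context(text, target, pseudo, window):
    """Return up to `window` whitespace-separated tokens on each side of the
    first occurrence of `target`, with the target replaced by the pseudo word
    wrapped in <head> tags. Returns None if the target does not occur.

    Assumption: the window is measured in tokens, not characters.
    """
    tokens = text.split()
    pattern = re.compile(r"\b" + re.escape(target) + r"\b")
    for i, tok in enumerate(tokens):
        if pattern.search(tok):
            # Replace only the target itself so attached punctuation
            # (such as the comma in "America,") is preserved.
            tagged = pattern.sub(lambda m: "<head>" + pseudo + "</head>", tok, count=1)
            left = tokens[max(0, i - window):i]
            # Slicing past the end of the list is safe in Python, so the
            # final token is kept whenever it falls inside the window.
            right = tokens[i + 1:i + 1 + window]
            return " ".join(left + [tagged] + right)
    return None


text = ("Welcome to Bank of America, the nation's leading "
        "financial institution")
print(extract_context(text, "America", "A_X", 25))
```

With a window of 25 the whole sentence fits, so the final word "institution" is part of the returned context, which is the expected result given in the bug description.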
