Pei Chen created CTAKES-310:
-------------------------------
Summary: Dictionary lookup permutations sort issue
Key: CTAKES-310
URL: https://issues.apache.org/jira/browse/CTAKES-310
Project: cTAKES
Issue Type: Bug
Components: ctakes-dictionary-lookup
Affects Versions: 3.2.0
Reporter: Pei Chen
Priority: Minor
Fix For: 3.2.1
Hi All,
While reviewing how permutations are used, I noticed that we sort each
permutation list before building the string used for the concept lookup.
Because the sort is done in place, it also reorders the object that is
stored in the parent list.
I've made a few changes, and now it appears I can discover some
additional concepts based upon the permutations.
Let me know what you think of the following changes.
Thanks,
Kim
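
To illustrate the problem described above, here is a minimal sketch. It uses a hypothetical two-entry cache (buildCache) in place of the real iv_permCacheMap, and the ArrayList copy constructor as a stand-in for the clone cast in the patch. It only shows that an in-place Collections.sort rewrites the cached permutation, while sorting a copy leaves it intact.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PermutationSortDemo {

    // Hypothetical stand-in for the cached permutations of two token indices.
    static List<List<Integer>> buildCache() {
        List<List<Integer>> cache = new ArrayList<>();
        cache.add(new ArrayList<>(Arrays.asList(1, 2)));
        cache.add(new ArrayList<>(Arrays.asList(2, 1)));
        return cache;
    }

    // Old behavior: the sort reorders the list object held by the cache.
    static List<List<Integer>> sortInPlace(List<List<Integer>> cache) {
        for (List<Integer> permutation : cache) {
            Collections.sort(permutation);
        }
        return cache;
    }

    // Patched behavior: only a copy is sorted, so the cached order survives.
    static List<List<Integer>> sortCopies(List<List<Integer>> cache) {
        for (List<Integer> permutation : cache) {
            List<Integer> sorted = new ArrayList<>(permutation);
            Collections.sort(sorted);
            // 'sorted' would be used for offset calculation only.
        }
        return cache;
    }

    public static void main(String[] args) {
        System.out.println(sortInPlace(buildCache())); // [[1, 2], [1, 2]]
        System.out.println(sortCopies(buildCache()));  // [[1, 2], [2, 1]]
    }
}
```

With the in-place sort, the permutation [2, 1] collapses into [1, 2], so any lookup string built from it afterwards can no longer differ from the first entry.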
=== modified file 'ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java'
--- ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java	2014-07-31 22:00:48 +0000
+++ ctakes-dictionary-lookup/src/main/java/org/apache/ctakes/dictionary/lookup/algorithms/FirstTokenPermutationImpl.java	2014-09-04 18:39:59 +0000
@@ -210,11 +210,12 @@
final List<List<Integer>> permutationList = iv_permCacheMap.get( permutationIndex );
for ( List<Integer> permutations : permutationList ) {
// Moved sort and offset calculation from inner (per MetaDataHit) iteration 2-21-2013 spf
- Collections.sort( permutations );
+ List<Integer> permutationsSorted = (List) ((ArrayList)permutations).clone();
+ Collections.sort( permutationsSorted );
int startOffset = firstWordStartOffset;
int endOffset = firstWordEndOffset;
- if ( !permutations.isEmpty() ) {
- int firstIdx = permutations.get( 0 );
+ if ( !permutationsSorted.isEmpty() ) {
+ int firstIdx = permutationsSorted.get( 0 );
if ( firstIdx <= firstTokenIndex ) {
firstIdx--;
}
@@ -222,7 +223,7 @@
if ( firstToken.getStartOffset() < firstWordStartOffset ) {
startOffset = firstToken.getStartOffset();
}
- int lastIdx = permutations.get( permutations.size() - 1 );
+ int lastIdx = permutationsSorted.get( permutationsSorted.size() - 1 );
if ( lastIdx <= firstTokenIndex ) {
lastIdx--;
}
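
In the hunks shown, the sorted copy is read only for its first and last elements, that is, the smallest and largest token index in the permutation. Assuming nothing outside these hunks relies on the full sorted order, the same two values could be obtained without copying or sorting. The sketch below is an illustration under that assumption, not part of the patch; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PermutationBounds {

    // Smallest token index in the permutation; the list is not reordered.
    // Throws NoSuchElementException on an empty list, so the existing
    // isEmpty() guard would still be needed by the caller.
    static int firstIndex(List<Integer> permutation) {
        return Collections.min(permutation);
    }

    // Largest token index in the permutation; the list is not reordered.
    static int lastIndex(List<Integer> permutation) {
        return Collections.max(permutation);
    }

    public static void main(String[] args) {
        List<Integer> permutation = Arrays.asList(3, 1, 2);
        System.out.println(firstIndex(permutation)); // 1
        System.out.println(lastIndex(permutation));  // 3
        System.out.println(permutation);             // [3, 1, 2]
    }
}
```

Either approach keeps the cached permutation in its original order, which is the property the patch is after.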