Revision: 18502
          http://sourceforge.net/p/gate/code/18502
Author:   ian_roberts
Date:     2014-12-18 16:11:35 +0000 (Thu, 18 Dec 2014)
Log Message:
-----------
Long-standing bug in RepositioningInfo

Suppose you have a repositioning info like

- original offset: 5
- original length: 4
- extracted offset: 1
- extracted length: 1

- original offset: 9
- original length: 3
- extracted offset: 2
- extracted length: 3

(i.e. one that maps a sequence of characters in the original content to a
*shorter* sequence of characters in the extracted content) and you ask for the
extracted offset corresponding to original offset 9.  Previously you would get
5 instead of 2.  The problem was that any position within the span of the first
repositioning record (5-9) was mapped to that record's extracted start offset
plus the distance of the position from the record's *original* start offset,
even if that gave a result greater than extractedOffset+extractedLength.  I've
now capped the result so that cases like this will never return a value outside
the *extracted* span of the relevant repositioning record (any original
position greater than originalOffset+extractedLength is mapped to
extractedOffset+extractedLength).
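
The arithmetic above can be checked with a small stand-alone sketch.  This is
not the GATE source: the class RepositioningSketch, its Span record and all
method names are hypothetical, and the lookup rule (first record whose original
span ends at or after the requested position, -1 when nothing matches) is an
assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for gate.corpora.RepositioningInfo; not the GATE source.
final class RepositioningSketch {

  // One record: a span of the original text and the span it became after extraction.
  static final class Span {
    final long origPos, origLen, extPos, extLen;

    Span(long origPos, long origLen, long extPos, long extLen) {
      this.origPos = origPos;
      this.origLen = origLen;
      this.extPos = extPos;
      this.extLen = extLen;
    }
  }

  private final List<Span> spans = new ArrayList<>();

  void add(long origPos, long origLen, long extPos, long extLen) {
    spans.add(new Span(origPos, origLen, extPos, extLen));
  }

  // Behaviour described for revisions before 18502: no upper bound on the result.
  long extractedPosUncapped(long absPos) {
    return map(absPos, false);
  }

  // Behaviour described for 18502: never go beyond the record's extracted span.
  long extractedPos(long absPos) {
    return map(absPos, true);
  }

  private long map(long absPos, boolean cap) {
    for (Span s : spans) {
      // Assumption: the first record whose original span ends at or after absPos decides.
      if (absPos <= s.origPos + s.origLen) {
        if (absPos < s.origPos) {
          return -1; // assumption: a position in a gap before the record has no mapping
        }
        // extracted start offset plus the distance from the original start offset
        long result = s.extPos + (absPos - s.origPos);
        if (cap) {
          result = Math.min(result, s.extPos + s.extLen);
        }
        return result;
      }
    }
    return -1; // assumption: positions past the last record are reported as -1
  }

  public static void main(String[] args) {
    RepositioningSketch info = new RepositioningSketch();
    info.add(5, 4, 1, 1); // first record from the log message
    info.add(9, 3, 2, 3); // second record from the log message
    System.out.println(info.extractedPosUncapped(9)); // prints 5 (1 + 9 - 5)
    System.out.println(info.extractedPos(9));         // prints 2 (capped at 1 + 1)
  }
}
```

With the two records from the example, the uncapped lookup for original offset
9 gives 5, while the capped lookup gives 2, the end of the first record's
extracted span.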

Modified Paths:
--------------
    gate/trunk/src/main/gate/corpora/RepositioningInfo.java

Modified: gate/trunk/src/main/gate/corpora/RepositioningInfo.java
===================================================================
--- gate/trunk/src/main/gate/corpora/RepositioningInfo.java     2014-12-18 02:19:59 UTC (rev 18501)
+++ gate/trunk/src/main/gate/corpora/RepositioningInfo.java     2014-12-18 16:11:35 UTC (rev 18502)
@@ -132,6 +132,10 @@
           else {
             // current position + offset in this PositionInfo record
             result = currPI.getCurrentPosition() + absPos - origPos;
+            // but don't go beyond the extracted length
+            if(result > currPI.getCurrentPosition() + currPI.getCurrentLength()) {
+              result = currPI.getCurrentPosition() + currPI.getCurrentLength();
+            }
           } // if
           found = true;
           break;
