All MatchData objects resulting from an invocation of String#scan are updated 
with the current match
----------------------------------------------------------------------------------------------------

                 Key: JRUBY-6141
                 URL: https://jira.codehaus.org/browse/JRUBY-6141
             Project: JRuby
          Issue Type: Bug
          Components: Standard Library
    Affects Versions: JRuby 1.6.4
            Reporter: Jeff Pace
            Assignee: Thomas E Enebo
         Attachments: rubyregexp.patch, test_string_scan.rb

Each iteration of String#scan should result in a unique MatchData returned by 
Regexp.last_match, with each MatchData object reflecting the results of its 
own match. Thus if String#scan produces two matches, the MatchData returned by 
Regexp.last_match after the first match should hold different values than the 
one returned after the second match.

However, in the current implementation each MatchData resulting from an 
invocation of String#scan has the values of the most recent match. For example, 
from the attached unit test:

{noformat}
  def test_scan
    firstmatch = nil

    str = "testing"
    re = Regexp.new('(t[^t]*)')
    
    str.scan(re) do |match|
      if firstmatch.nil?
        firstmatch = Regexp.last_match

        assert_equal "tes", firstmatch[0]
      else
        secondmatch = Regexp.last_match
        assert_equal "ting", secondmatch[0]

        # not the same object
        assert firstmatch.object_id != secondmatch.object_id
        
        # should still be the value of the first match
        assert_equal "tes", firstmatch[0]
      end
    end
  end
{noformat}

Although they are different objects (per the object_id assertion), firstmatch 
and secondmatch have the same results, so the assertion:

{noformat}
        assert_equal "tes", firstmatch[0]
{noformat}

will fail, with firstmatch[0] equaling "ting", the results for the second 
match. (Note that this test succeeds with Ruby 1.8 and 1.9.)
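
For a quick check outside a test framework, the same behavior can be observed 
with a short standalone script. This is a minimal sketch and is not part of the 
attached test; the helper name collect_scan_matches is made up for illustration.

```ruby
# Collect the MatchData that Regexp.last_match returns on each
# iteration of String#scan. (Helper name is hypothetical.)
def collect_scan_matches(str, re)
  matches = []
  str.scan(re) { matches << Regexp.last_match }
  matches
end

matches = collect_scan_matches("testing", /(t[^t]*)/)

# Inspect the matches only after the scan has finished, so any
# later mutation of an earlier MatchData becomes visible.
puts matches.map { |m| m[0] }.inspect
```

With Ruby 1.8 and 1.9 the two entries are "tes" and "ting". On an affected 
JRuby, going by the description above, both entries would be expected to show 
the second match.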

The reason for this behavior is that during String#scan, a MatchData is created 
for each match. The MatchData object has an attribute "regs" (org.joni.Region), 
which refers to where the pattern matched in the string.

The issue is that when String#scan creates a MatchData object for each pattern 
match, every one of those MatchData objects refers to the same Region instance. 
Each subsequent match updates that Region in place, so every MatchData sharing 
a reference to it reports the values of the latest match, as seen in 
MatchData#to_s and MatchData#captures.

The solution is to clone the Region object for each newly-created MatchData, as 
in the attached patch.
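
The aliasing and the effect of cloning can be illustrated with a small toy 
model. This is only a sketch under stated assumptions: ToyRegion, ToyMatch and 
toy_scan are hypothetical stand-ins written for this example, not JRuby's 
classes and not the code in the attached patch.

```ruby
# Hypothetical stand-in for org.joni.Region: a mutable start/stop pair.
ToyRegion = Struct.new(:start, :stop)

# Hypothetical stand-in for MatchData: it reads its text through the
# region it was given rather than storing the matched string itself.
class ToyMatch
  def initialize(str, region)
    @str = str
    @region = region
  end

  def to_s
    @str[@region.start...@region.stop]
  end
end

# Walk a list of [start, stop] spans the way a scan loop would,
# updating a single region object in place for each match.
def toy_scan(str, spans, clone_region)
  region = ToyRegion.new(0, 0)
  spans.map do |start, stop|
    region.start = start
    region.stop = stop
    # Without the copy, every ToyMatch aliases the same region.
    ToyMatch.new(str, clone_region ? region.dup : region)
  end
end

spans = [[0, 3], [3, 7]] # "tes" and "ting" within "testing"
shared = toy_scan("testing", spans, false).map(&:to_s)
cloned = toy_scan("testing", spans, true).map(&:to_s)

puts shared.inspect # both entries follow the last update
puts cloned.inspect # each entry keeps its own span
```

In this model the shared variant yields "ting" twice, mirroring the failing 
assertion, while the cloned variant yields "tes" and "ting".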

