[ 
https://issues.apache.org/jira/browse/SOLR-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12468490
 ] 

Yonik Seeley commented on SOLR-122:
-----------------------------------

OK, check this out... my second Ruby coding attempt ever.  The first was the 
6-line program here: http://wiki.apache.org/solr/SolRuby

At first I thought maybe the speed difference was due to gsub scanning the 
string 3 times.  Then I started fooling around with it and realized the 
slowdown must be because the pattern is being "compiled" on every evaluation 
(just a guess).  I also wrote a single-pass version that's a little faster yet.

I didn't test the XML versions since I don't have libxml (and I'm not even sure 
how to get or install it... I'm obviously not a Ruby person).  *But* since these 
versions are about 10 times faster than the original string concat versions, I 
assume they will be perhaps 5 times faster than libxml.  That's assuming the 
code is actually doing what it's supposed to and I didn't make some horrible 
mistake.

                                          user     system      total        real
string concatenation:                 6.812000   0.171000   6.983000 (  7.172000)
string substitution:                  6.922000   0.141000   7.063000 (  7.250000)
string concatenation2:                1.047000   0.000000   1.047000 (  1.078000)
string substitution2:                 0.953000   0.000000   0.953000 (  0.969000)
catenation w/ single pass escape:     0.734000   0.000000   0.734000 (  0.750000)
substitution w/ single pass escape:   0.657000   0.000000   0.657000 (  0.656000)

require "benchmark"

#TESTS = 1_000_000
TESTS = 100_000

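# Escape &, < and > in a single pass over the string: one gsub with a
# character class, and a block that picks the replacement for each match.
def escape(text)
  text.gsub(/([&<>])/) { |ch|
    case ch
    when '&' then '&amp;'
    when '<' then '&lt;'
    when '>' then '&gt;'
    end
  }
end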


Benchmark.bmbm do |results|
  results.report("string concatenation:") do
    TESTS.times do
      x = "<blah>"
      x << "woot".gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
      x << "</blah>"
    end
  end
  
  results.report("string substitution:") do
    TESTS.times do
      x = "<blah>#{"woot".gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")}</blah>"
    end
  end

  results.report("string concatenation2:") do
    TESTS.times do
      x = "<blah>"
      x << "woot".gsub(/&/, '&amp;').gsub(/</, '&lt;').gsub(/>/, '&gt;')
      x << "</blah>"
    end
  end

  results.report("string substitution2:") do
    TESTS.times do
      x = "<blah>#{"woot".gsub(/&/, '&amp;').gsub(/</, '&lt;').gsub(/>/, '&gt;')}</blah>"
    end
  end

  results.report("catenation w/ single pass escape:") do
    TESTS.times do
      x = "<blah>"
      x << escape("woot")
      x << "</blah>"
    end
  end

  results.report("substitution w/ single pass escape:") do
    TESTS.times do
      x = "<blah>#{escape('woot')}</blah>"
    end
  end


end
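
The benchmark only ever escapes "woot", which contains none of the special 
characters, so it says nothing about whether the single-pass escape produces 
the same output as the chained gsub calls. Below is a minimal sketch of a 
sanity check for that. The names escape_chained, escape_hash, ESCAPES and 
SAMPLES are made up for this illustration and are not part of the patch; 
escape_hash additionally assumes Ruby 1.9 or later, where gsub accepts a hash 
as the replacement.

```ruby
# Single-pass escape, as in the benchmark above.
def escape(text)
  text.gsub(/[&<>]/) { |ch|
    case ch
    when '&' then '&amp;'
    when '<' then '&lt;'
    when '>' then '&gt;'
    end
  }
end

# Three-pass reference version. '&' has to be replaced first, otherwise
# the ampersands introduced by '&lt;' and '&gt;' would be escaped again.
def escape_chained(text)
  text.gsub(/&/, '&amp;').gsub(/</, '&lt;').gsub(/>/, '&gt;')
end

# Hypothetical alternative (assumes Ruby 1.9+): pass a hash to gsub and let
# it look up the replacement for each matched character.
ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }.freeze

def escape_hash(text)
  text.gsub(/[&<>]/, ESCAPES)
end

# Inputs that actually exercise the escaping, plus the benchmark's "woot".
SAMPLES = ["woot", "a & b", "<blah>", "x < y && y > z", "&amp;", ""]

SAMPLES.each do |s|
  expected = escape_chained(s)
  raise "escape mismatch for #{s.inspect}" unless escape(s) == expected
  raise "escape_hash mismatch for #{s.inspect}" unless escape_hash(s) == expected
end

puts escape("x < y && y > z")   # prints: x &lt; y &amp;&amp; y &gt; z
```

If all three versions agree on these samples, the speedup is at least not 
coming from skipped work. The timings could still shift on inputs that 
contain many special characters, since the block (or hash lookup) then runs 
once per match.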



> Add optional support for Ruby-libxml2 (vs. REXML)
> -------------------------------------------------
>
>                 Key: SOLR-122
>                 URL: https://issues.apache.org/jira/browse/SOLR-122
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - ruby - flare
>            Reporter: Coda Hale
>         Attachments: libxml.rb, libxml.rb
>
>
> This file adds drop-in support for the ruby-libxml2, which is a wrapper for 
> the libxml2 library, which is an order of magnitude or so faster than REXML.
> This depends on my SOLR-121 patch for multi-document adds, since the behavior 
> of Solr::Request::AddDocument#to_s is different.
> Requiring this makes some tests fail, but for trivial reasons: some tests are 
> directly tied to REXML, others fail due to interelement whitespace added by 
> libxml2 (which you can't disable via the Ruby interface). Functionally, it's 
> identical, and passes all functional tests.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
