Please quote when replying. It is very hard to follow the discussion if
you don't.
David Kahn wrote:
> Actually I and my client care how fast, even if it means more work and
> tests to hedge accuracy.
And by the time you do that extra work for correctness, you will have
developed a system equivalent to REXML or Nokogiri, and likely with
similar or worse performance. You're fighting a losing battle here.
> I did try Nokogiri - which I liked getting to know, but it also plods
> in at ~ 150 seconds which is just unacceptable for someone waiting at a
> browser.
Waiting at a browser? Let me get this straight -- your app is trying to
process a 65MB file in real time? That's insane. Do some of the
processing in advance, or tell the user that he can expect a 2-minute
wait (which is absolutely reasonable for that much data).
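To make "processing in advance" concrete, here is a rough sketch, not a
drop-in solution. It assumes the file is shaped the way your XPath implies
(records/record with last and first children); the sample data, the id
element, and the names build_name_index, find_records and NAME_INDEX are
all made up for illustration. REXML is used only because it ships with
Ruby; the same idea applies with Nokogiri.

```ruby
require 'rexml/document'

# Hypothetical sample shaped like the XPath in the quoted code
# (records/record with <last> and <first> children). The <id> element
# is invented so that the matches can be told apart.
SAMPLE_XML = <<~XML
  <records>
    <record><last>Smith</last><first>Ann</first><id>1</id></record>
    <record><last>Smith</last><first>Bob</first><id>2</id></record>
    <record><last>Smith</last><first>Ann</first><id>3</id></record>
  </records>
XML

# Parse once, ahead of any request, and group records by [last, first].
def build_name_index(xml)
  index = Hash.new { |hash, key| hash[key] = [] }
  doc = REXML::Document.new(xml)
  doc.elements.each('records/record') do |record|
    last = record.elements['last']&.text
    first = record.elements['first']&.text
    next unless last && first # skip records missing either name
    index[[last, first]] << record
  end
  index
end

# Use fetch so that a miss returns [] without adding a key to the index.
def find_records(index, last_name, first_name)
  index.fetch([last_name, first_name], [])
end

# Built once (for example at boot or in a background job), not per request.
NAME_INDEX = build_name_index(SAMPLE_XML)

matches = find_records(NAME_INDEX, 'Smith', 'Ann')
puts matches.map { |record| record.elements['id'].text }.inspect
```

The slow parse still happens, but only once and away from the browser;
each request is then a single hash lookup. Whether the whole index fits
comfortably in memory for a 65MB file is something you would have to
measure.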
> That's what I was trying to get at with my original post and should
> have provided more data, i.e. am I wasting time with unrealistic
> expectations for any XML parser in this endeavor.
>
> Unless anyone can point out a more efficient search (code and example
> xml below), it seems practical in absence of other ideas, to go the way
> of regex at least to triangulate the data before throwing it to an xml
> parser to get the details or put the data into a db (which I am trying
> to avoid).
Why are you trying to avoid putting the data into a DB? Databases are
designed for quick searches through lots of data -- in other words,
exactly what you are doing. XML really is not. (You could try eXistDB,
though.)
>
> Below, the second line is what takes forever, understandably.
> gsa_epls_xml_doc = Nokogiri::HTML(doc_xml)
> @gsa_epls_xml_doc.xpath("//records/record[last='#{last_name}' and first='#{first_name}']").each do |possible_match_record| ...
I'm assuming gsa is Google Search Appliance. Can't it do the searching
itself and give you back only the records you need?
Best,
--
Marnen Laibow-Koser
http://www.marnen.org
[email protected]