On Sep 26, 2007, at 1:30 PM, Jacques Distler wrote:
> On Sep 15, 9:18 pm, Shawn <[EMAIL PROTECTED]> wrote:
>> The data structure (list) used to keep unread characters has O(n)  
>> cost
>> for insertion to/deletion from the head of the list, so frequent
>> unget() (which happens when parsing a long sequence of number
>> entities) can result in poor performance.  Changing the data  
>> structure
>> to deque, a queue object with O(1) insertion/deletion cost at the
>> head, improves the performance in general and makes char()/unget()
>> scales better with long input size.
>
> I imagine there's a similar O(n) issue with the shift/unshift methods
> of Ruby's Array class.
>
> Is there a Ruby Deque class floating around somewhere, which would
> allow a similar speedup of inputstream.rb?

I'll look into it. FWIW, there's several other significantly bad  
things in input_stream.rb related to performance, which I should be  
tackling soon.*

-ryan

* soon, because we're starting to use the ruby port of html5lib (in  
limited ways) in production at Technorati.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"html5lib-discuss" group.
 To post to this group, send email to [email protected]
 To unsubscribe from this group, send email to [EMAIL PROTECTED]
 For more options, visit this group at 
http://groups.google.com/group/html5lib-discuss?hl=en-GB
-~----------~----~----~----~------~----~------~--~---

Reply via email to