Thanks Stefan,

both lxml and threading work perfectly.
One small problem: "with_tail" was not recognized as a valid keyword.
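
For anyone hitting the same "with_tail" error, here is one possible workaround, offered as a sketch rather than a tested fix for the real page. It assumes the element's text_content() method from lxml.html is good enough, since that returns the text of the element and its children without the tail. SAMPLE and extract_translation are made-up names for illustration, and the markup is a stand-in for the real result page:

```python
import lxml.html

# Hypothetical page fragment standing in for the real result page.
SAMPLE = ('<html><body><div style="padding:0.6em;">hallo wereld</div>'
          ' trailing text</body></html>')

def extract_translation(html_text):
    """Return the text of the first matching div, or None if there is none."""
    root = lxml.html.fromstring(html_text)
    for div in root.iter('div'):
        style = div.get('style')
        if style is not None and 'padding:0.6em;' in style:
            # text_content() covers the element and its children only,
            # so the tail text after </div> is not included.
            return div.text_content()
    return None

print(extract_translation(SAMPLE))
```

This avoids the keyword altogether, at the cost of going through the element API instead of tostring().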

cheers,
Stef

Stefan Behnel wrote:
Stef Mientki <stef.mientki <at> gmail.com> writes:
Although it works functionally,
it can take lots of time waiting for the translation.

What I basically do is, after selecting a new string to be translated:

    kwds = { 'trtext' : line_to_be_translated, 'lp' :'en_nl'}
    soup = BeautifulSoup (urlopen(url, urlencode ( kwds ) ) )
    translation= soup.find ( 'div', style='padding:0.6em;' ).string
    self.Editor_Babel.SetLabel ( translation )

You should give lxml.html a try.

http://codespeak.net/lxml/

It can parse directly from HTTP URLs (no need to go through urlopen), and it frees the GIL while parsing. That makes it efficient to create a little thread that does nothing more than parse the web site, as in (untested):

  import threading
  from urllib.parse import urlencode  # Python 2: from urllib import urlencode

  import lxml.html

  def read_bablefish(text, lang, result):
      # BABLEFISH_URL must be defined elsewhere
      url = BABLEFISH_URL + '?' + urlencode({'trtext':text, 'lp':lang})
      page = lxml.html.parse(url)
      for div in page.iter('div'):
          style = div.get('style')
          if style is not None and 'padding:0.6em;' in style:
              # with_tail=False leaves out the text following </div>
              result.append(
                  lxml.html.tostring(div, method="text", with_tail=False))

  result = []
  thread = threading.Thread(target=read_bablefish,
                            args=("...", "en_nl", result))
  thread.start()
  while thread.is_alive():
      pass  # ... do other stuff
  if result:
      print(result[0])
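
The same thread-and-poll pattern can be tried without any network access. The following is only a sketch under stated assumptions: slow_translate is a hypothetical stand-in for the fetch-and-parse step, run_in_background is a made-up helper name, and the loop uses join() with a timeout so that it waits briefly on each pass instead of spinning:

```python
import threading
import time

def slow_translate(text, lang, result):
    # Hypothetical stand-in for the network fetch and parse.
    time.sleep(0.2)
    result.append("%s [%s]" % (text, lang))

def run_in_background(text, lang, poll_interval=0.05):
    """Run the worker in a thread and poll until it has finished."""
    result = []
    thread = threading.Thread(target=slow_translate,
                              args=(text, lang, result))
    thread.start()
    ticks = 0
    while thread.is_alive():
        # Wait at most poll_interval seconds, then fall through so
        # the caller can do other stuff between checks.
        thread.join(poll_interval)
        ticks += 1
    return (result[0] if result else None), ticks

translation, ticks = run_in_background("hello", "en_nl")
print(translation)
```

In a GUI program the polling loop would normally be replaced by a timer or idle handler, so that the event loop keeps running while the thread works.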

Stefan


--
http://mail.python.org/mailman/listinfo/python-list
