Hi,

I believe this one
<https://github.com/KL-7/sonata/commit/d69faac4f2e205751d2f039b91bef2efaf2a3c15#sonata/lyricwiki.py>
will do the trick.

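In case the link goes stale, here is the general idea in a few lines. This is only a sketch and not the code from the commit: it assumes the wrong `She%27S` in the URL quoted below comes from Python's `str.title()`, which also capitalizes the letter right after an apostrophe. The names `capitalize_words` and `lyrics_url` are made up for illustration, and the snippet is written for Python 3, where `urllib.parse.quote` stands in for the `urllib.quote` call that sonata uses.

```python
from urllib.parse import quote


def capitalize_words(text):
    """Upper-case only the first character of each space-separated word.

    Unlike str.title(), this leaves the rest of the word untouched, so
    "she's" becomes "She's" rather than "She'S".
    """
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))


def lyrics_url(artist, title):
    """Build the edit-page URL for an artist/title pair (hypothetical helper)."""
    # Artist and title are quoted separately so the ':' between them stays literal.
    page = "%s:%s" % (quote(capitalize_words(artist)),
                      quote(capitalize_words(title)))
    return "http://lyrics.wikia.com/index.php?title=%s&action=edit" % page


if __name__ == "__main__":
    print("she's leaving".title())            # the problematic form
    print(capitalize_words("she's leaving"))  # the form the wiki page uses
    print(lyrics_url("Orchestral Manoeuvres In The Dark", "She's Leaving"))
```

Splitting on spaces only is a deliberate simplification: hyphenated words and existing capitals (say, "AC/DC") pass through unchanged, which may or may not match how every page on the wiki is named.
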
Lazaros, I suggest you look through the commits in multani's
<https://github.com/multani/sonata> and my <https://github.com/KL-7/sonata>
repos. You might find some useful fixes there.

And I wonder how things are going with sonata. I don't see much activity
from Paco's team on codingteam. Did they move the repo somewhere else, or
did they finally decide against resurrecting the sonata project?

On Wed, Apr 27, 2011 at 2:39 PM, Lazaros Koromilas <[email protected]> wrote:

> On Tue, Apr 26, 2011 at 7:35 PM, KL-7 <[email protected]> wrote:
> > Hi,
> >
> > Something really strange happened with their markup lately, so thank you for
> > sharing your patch. Though, I used this one for myself. It avoids splitting
> > string with the tag that contains simultaneously escaped '<' and unescaped
> > '>' as it looks a bit unstable for me.
>
> Yep! This seems much better, thanks ;)
> Another thing that needs attention is the capitalization rules, for instance
> for the song "Orchestral Manoeuvres In The Dark - She's Leaving" sonata
> will get the page:
>
> http://lyrics.wikia.com/index.php?title=Orchestral%20Manoeuvres%20In%20The%20Dark:She%27S%20Leaving&action=edit
> whereas the logical would be:
>
> http://lyrics.wikia.com/index.php?title=Orchestral%20Manoeuvres%20In%20The%20Dark:She%27s%20Leaving&action=edit
> where the lyrics actually exist.
>
> > On Tue, Apr 26, 2011 at 7:08 PM, Lazaros Koromilas <[email protected]>
> > wrote:
> >>
> >> Lyrics fetching was not functional lately,
> >> some changes in the markup apparently.
> >> This patch fixes it.
> >>
> >> diff --git a/sonata/lyricwiki.py b/sonata/lyricwiki.py
> >> index 0cd07fc..d4055c1 100644
> >> --- a/sonata/lyricwiki.py
> >> +++ b/sonata/lyricwiki.py
> >> @@ -42,10 +42,9 @@ class LyricWiki(object):
> >>  if content[:len(redir_tag)].lower() == redir_tag:
> >>  addr = "http://lyricwiki.org/index.php?title=%s&action=edit" % urllib.quote(content.split("[[")[1].split("]]")[0])
> >>  lyricpage = urllib.urlopen(addr).read()
> >> - content = re.split("<textarea[^>]*>", lyricpage)[1].split("</textarea>")[0]
> >> - content = content.strip()
> >> - lyrics = content.split("<lyrics>")[1].split("</lyrics>")[0]
> >> - if lyrics.strip() != "<!-- PUT LYRICS HERE (and delete this entire line) -->":
> >> + lyrics = re.split("<lyrics>", lyricpage)[1].split("</lyrics>")[0]
> >> + if lyrics.find("PUT LYRICS HERE") == -1:
> >> + lyrics = lyrics.strip()
> >>  lyrics = misc.unescape_html(lyrics)
> >>  lyrics = misc.wiki_to_html(lyrics)
> >>  lyrics = lyrics.decode("utf-8")
> >> _______________________________________________
> >> Sonata-users mailing list
> >> [email protected]
> >> https://lists.berlios.de/mailman/listinfo/sonata-users
> >
> > --
> > Kirill

--
Kirill
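
P.S. For anyone reading the thread without the repos at hand, the extraction step discussed above can be sketched roughly as follows. This is an illustration under assumptions, not the code from either patch: it assumes the edit page may carry the `lyrics` tag with its brackets either raw or HTML-escaped, in any mix, and it uses the standard library's `html.unescape` (Python 3) in place of sonata's `misc.unescape_html`. The name `extract_lyrics` is hypothetical.

```python
import html
import re

# Text that marks an empty page on the wiki, as mentioned in the patch.
PLACEHOLDER = "PUT LYRICS HERE"

# Accept '<' or '&lt;' and '>' or '&gt;' around the tag name, so a half-escaped
# tag such as '&lt;lyrics>' is matched in one step instead of via split().
_LYRICS_RE = re.compile(
    r"(?:<|&lt;)lyrics(?:>|&gt;)(.*?)(?:<|&lt;)/lyrics(?:>|&gt;)",
    re.DOTALL | re.IGNORECASE,
)


def extract_lyrics(page):
    """Return the lyrics found in an edit page, or None if there are none."""
    match = _LYRICS_RE.search(page)
    if match is None:
        return None  # no lyrics tag at all
    lyrics = match.group(1)
    if PLACEHOLDER in lyrics:
        return None  # page exists but nobody has filled it in
    return html.unescape(lyrics).strip()


if __name__ == "__main__":
    sample = "<textarea>&lt;lyrics>\nLine one\nLine &amp; two\n&lt;/lyrics&gt;</textarea>"
    print(extract_lyrics(sample))
```

Returning None for both the missing tag and the placeholder keeps the caller simple; the indexing in the original one-liner would instead raise an IndexError when the tag is absent.
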
