On Sat, Aug 22, 2015 at 10:18:19PM +0200, Andreas Rottmann wrote: > [email protected] writes: > > > Hello list, I need some help. > > > > I'm following a Computer Science course material in Python and then > > try to implement the examples and exercises in Guile Scheme. > > > > Currently I'm working on a small program to get a text document from > > the Web and do some string operations with the content, but my > > implementation in Guile takes about 5-7 minutes to finish, while the > > Python version takes 6-8 seconds. Also, the printed results are > > different, but I'm more concerned about the time issue right now. > > > > Could you point me to possible mistakes I'm making here? > > > > Guile Scheme program: > > https://gist.github.com/umanomata/99f103ed686acd523d9e > > > > Python program: > > https://gist.github.com/umanomata/1dbb3beb2ce19f09fcee > > > There's a difference in algorithmic complexity here: in Scheme, you use > SRFI-1's "delete-duplicates", which is noted to be O(n^2) complexity in > the SRFI-1 specification[0]. In Python, you use set(), which is probably > implemented using a hash table, yielding roughly O(1) complexity. Since > your input set is large, it would be better to feed the read words into > a hash table, instead of using a list and SRFI-1 "delete-duplicates". > > [0] http://srfi.schemers.org/srfi-1/srfi-1.html#delete-duplicates
Of course, that leaves us with the question why code that was explicitly moved from scheme to the c-level "... so that using srfi-1 wouldn't have performance penalties" uses such a problematic algorithm. Cheers, RalfD > Regards, Rotty > -- > Andreas Rottmann -- <http://rotty.xx.vu/> >
