A big speedup could already be gained by switching from the old XML parser 
that was included with Python to one of the more modern variants available 
as a Python library (lxml I think, but I don't remember exactly). Something 
along the lines of the sketch below.
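
For illustration only, not the current importer code: a minimal sketch of 
streaming a deck file with lxml instead of loading the whole document at 
once. The <item>, <Q> and <A> element names here are assumptions about the 
deck format, not the real schema.

    from lxml import etree

    def iter_cards(path):
        # Stream <item> elements one at a time instead of building the
        # whole document tree in memory.
        for _, elem in etree.iterparse(path, tag="item"):
            yield elem.findtext("Q"), elem.findtext("A")
            # Clear the parsed element so memory use stays low even for
            # very large decks.
            elem.clear()

    for question, answer in iter_cards("deck.xml"):
        pass  # hand each card to the importer as it arrives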

Patches welcome!

Peter

On Tuesday 09 December 2008 08:14:58 duncan wrote:
> On Dec 8, 1:40 am, Peter Bienstman <[EMAIL PROTECTED]> wrote:
> > Wait and it should finish eventually. The current algorithm is not very
> > efficient.
> >
> > Peter
> >
> > On Monday 08 December 2008 01:40:03 MacB wrote:
> > > Yes, I had the same problem, with a deck of 7500.
> > >
> > > It took ages to load the deck in to Mnemosyne.
> > >
> > > And the reverse, when I tried to delete the entire 7500 deck, it took
> > > even longer.
> > >
> > > MacB
> > >
> > > On Dec 8, 1:35 pm, "Oisín Mac Fhearaí" <[EMAIL PROTECTED]> wrote:
> > > > 2008/12/7 Netta J. <[EMAIL PROTECTED]>:
> > > > > (I have searched the discussions, but wasn't able to find what I'm
> > > > > looking for. I apologize if I have overlooked it.)
> > > > >
> > > > > I downloaded a deck of cards (10000+ I believe) and have tried to
> > > > > import it into Mnemosyne, but when I click OK for it to load, the
> > > > > screen freezes, causing me to close the program and attempt to start
> > > > > again. The file size is about 6.0 MB. I'm not sure if it is the
> > > > > file size or something I'm not doing correctly. Is there a way to
> > > > > correct this problem?
> > > >
> > > > Have you tried just waiting longer for it to complete the import?
> > > > 10000 cards is quite a lot.
>
> I'll take a look at this. If it is normal for imports to stall on such
> small input, I think the algorithm used must be O(n^2) or worse. 10000
> cards is not a lot. Importing 10000 independent items should be
> bottlenecked by the time it takes to read them from disk, i.e. it
> should take well under a second and stay fast even for datasets many
> times that size. Without even looking at the code I am almost sure
> that the import algorithm, as it stands, must perform a scan of all
> previously imported items for each imported item... that would explain
> the O(n^2) behaviour (a sketch of the difference follows after the
> quoted message). If that's not the case, something similar must be
> going on. At any rate the import _should_ be linear in the number of
> items, but it seems to be quadratic or worse, maybe even exponential
> given your report.
>
> Never let it be said that I lit a candle when I could have cursed the
> darkness; I revel in infamy. But in this case I will have a look at
> the code, as I revel in correct algorithms as much as I revel in the
> dark arts... It ought to be possible to import items in time linear
> in the number of items to be imported. I'm all for taking certain
> shortcuts, but this is not the place to explain why Python's
> deficiencies make things like O(n^2) algorithms inevitable when naive
> users program in Python.
> 
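
For reference, a toy sketch of the difference described above, with 
hypothetical names rather than Mnemosyne's actual importer: checking each 
new card against a plain list of already-imported ids is a linear scan per 
card, so the whole import is O(n^2); keeping the ids in a set makes each 
check constant time and the import linear.

    from collections import namedtuple

    Card = namedtuple("Card", ["id", "question", "answer"])

    def import_with_list(cards):
        seen = []                      # membership test scans the whole list
        for card in cards:
            if card.id not in seen:    # O(n) per card -> O(n^2) overall
                seen.append(card.id)
        return seen

    def import_with_set(cards):
        seen = set()                   # membership test is O(1) on average
        for card in cards:
            if card.id not in seen:    # O(1) per card -> O(n) overall
                seen.add(card.id)
        return seen

    cards = [Card(i, "question", "answer") for i in range(10000)]
    assert set(import_with_list(cards)) == import_with_set(cards)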
-- 
------------------------------------------------
Peter Bienstman
Ghent University, Dept. of Information Technology 
Sint-Pietersnieuwstraat 41, B-9000 Gent, Belgium
tel: +32 9 264 34 46, fax: +32 9 264 35 93
WWW: http://photonics.intec.UGent.be
email: [EMAIL PROTECTED]
------------------------------------------------
