Hi, John:

Sorry! The pseudo code I wrote is not correct, and it's slow. I will come back to this tonight.
With Regards,
Qiancong,Mo

From: [email protected]
Date: 2014-06-11 23:47
To: django-users
Subject: Re: Massive import in Django database

Hi, John:

I think your code is right, except that "Doc.object" should be "Doc.objects".
The following pseudo code may be faster than what you wrote:

    doc_map = {}
    for each xml:
        extract from the xml data -> mydoc_code, mydoc_text, myRelated_doc_codes
        doc = Doc.objects.create(doc_code=mydoc_code, doc_text=mydoc_text)
        doc_map[mydoc_code] = (doc, myRelated_doc_codes)

    for (doc, rcodes) in doc_map.values():
        for rcode in rcodes:
            # doc_map values are (doc, codes) tuples, so take the Doc itself
            doc.related_doc.add(doc_map[rcode][0])
        doc.save()

I have checked it and it is okay. The objects are cached in doc_map, so there is no need to query the database again for each related doc, which should speed things up.

With Regards.
[email protected]

From: John Carlo
Date: 2014-06-11 21:14
To: django-users
Subject: Massive import in Django database

Hello everybody,

I fell in love with Django two years ago and have been using it for my work projects. In the past I found very useful information in this group, so a big thank you, guys!

I have a small question. I have to import about 1.000.000 XML documents into a Django database (SQLite for local development, MySQL on the server).

The model class is the following:

    class Doc(models.Model):
        doc_code = models.CharField(max_length=20, unique=True, primary_key=True, db_index=True)
        doc_text = models.TextField(null=True, blank=True)
        related_doc = models.ManyToManyField('self', null=True, blank=True, db_index=True)

From what I know, bulk insertion is not possible because I have a ManyToManyField relation. So I have this simple loop (in pseudo code):

    for each xml:
        extract from the xml data -> mydoc_code, mydoc_text, myRelated_doc_codes
        myDoc = Doc.object.get_or_create(doc_code=mydoc_code)[0]
        myDoc.doc_text = mydoc_text
        for reldoc_code in myRelated_doc_codes:
            myRelDoc = Doc.object.get_or_create(doc_code=reldoc_code)[0]
            myDoc.related_doc.add(myRelDoc)
        myDoc.save()

Am I doing it right?
Do you have any suggestions or recommendations? I fear that, since I have 1.000.000 docs to import, it will take a loooot of time, especially during the get_or_create calls.

Thank you in advance, everybody!

John

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/5b88deaf-d806-4a64-9e8d-528d95599c80%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
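One possible way to avoid the per-document get_or_create and add() calls discussed in this thread is a two-pass bulk insert: first insert all Doc rows with bulk_create, then insert the many-to-many rows directly through the relation's auto-created join model. The sketch below is not from the thread and rests on several assumptions: the Doc table starts empty, all parsed records fit in memory, the relation keeps Django's default symmetrical behaviour for 'self', and the join model exposes from_doc and to_doc fields (the names Django generates for a self-referential relation on a model called Doc). The function names plan_import and bulk_import and the batch size of 1000 are hypothetical.

```python
def plan_import(parsed_docs):
    """Split parsed XML records into document rows and link pairs.

    parsed_docs: iterable of (doc_code, doc_text, related_codes) tuples.
    Returns (texts, links). texts maps every doc_code to its text, with None
    for documents that are only referenced by others. links is a set of
    (from_code, to_code) pairs.
    """
    texts = {}
    links = set()
    for code, text, related_codes in parsed_docs:
        texts[code] = text
        for rcode in related_codes:
            # Referenced-only documents get a stub row, as get_or_create did.
            texts.setdefault(rcode, None)
            # A symmetrical self-relation stores one row per direction.
            links.add((code, rcode))
            links.add((rcode, code))
    return texts, links


def bulk_import(Doc, parsed_docs, batch_size=1000):
    """Insert documents and their relations in batches (assumes empty tables)."""
    from django.db import transaction  # imported lazily; requires Django

    texts, links = plan_import(parsed_docs)
    Through = Doc.related_doc.through  # auto-created join model
    with transaction.atomic():
        # Pass 1: one INSERT per batch instead of one query per document.
        Doc.objects.bulk_create(
            [Doc(doc_code=c, doc_text=t) for c, t in texts.items()],
            batch_size=batch_size,
        )
        # Pass 2: write the relation rows directly, skipping add().
        Through.objects.bulk_create(
            [Through(from_doc_id=a, to_doc_id=b) for a, b in sorted(links)],
            batch_size=batch_size,
        )
```

With this layout, bulk_import(Doc, records) would be called once with the parsed records. For a data set too large for memory, the same idea could be applied per chunk, as long as every Doc row exists before the join rows that point at it are inserted.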

