Thanks Fabrice, I am making good progress following your advice. Do you have any heuristics for distributing data for performant searches and subsetting? Am I better off with lots of small files or a few large files in a collection?
> ---- Original Message ----
> From: [email protected]
> To: [email protected], [email protected]
> Subject: RE: [basex-talk] handling large files: is there a streaming solution?
> Date: Mon, 11 Feb 2013 14:38:54 +0000
>
>> Dear Peter,
>>
>> Did you try to create a collection with the files (CREATE command)?
>> You should start that way; I don't see the point in using the file:
>> module for import.
>> I think that once in the database, file size does not matter (until
>> you reach millions of files in the collection and do a lot of
>> document-related operations (list, etc.)).
>>
>> ---- Original Message ----
>> From: [email protected] [mailto:[email protected]] On behalf of [email protected]
>> Sent: Monday, 11 February 2013 15:33
>> To: [email protected]
>> Subject: [basex-talk] handling large files: is there a streaming solution?
>>
>> Hello List,
>> I want to do a join with some large (300-400 MB) XML files and would
>> appreciate guidance on the optimal strategy.
>> At present these files are on the filesystem and not in a database.
>>
>> Is there any equivalent to the Zorba streaming xml:parse()?
>>
>> Would loading the files into a database directly be the right approach,
>> or is it better to split them into smaller files?
>>
>> Is the file: module a suitable route through which to import the files?
>>
>> Thanks for your help
>>
>> Peter
>>
>> _______________________________________________
>> BaseX-Talk mailing list
>> [email protected]
>> https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
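For anyone following the thread, here is a minimal sketch of the collection approach Fabrice suggests: create a database from a directory of files with the CREATE command, then query it via fn:collection(). The database name, paths, and element names below are hypothetical, not from the original messages.

```
(: In the BaseX client or via `basex -c "..."` :)
CREATE DB mydb /path/to/xml/files

(: Then, in XQuery, join across documents in the collection.
   Element and attribute names are illustrative only. :)
for $a in collection('mydb')//record,
    $b in collection('mydb')//entry[@id = $a/@ref]
return <match ref="{$a/@ref}"/>
```

Once the documents are in the database, BaseX evaluates such queries against its internal storage (with index support where applicable) rather than re-parsing the source files, which is why loading beats repeated file: module access for this kind of workload.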

