Hello Hans-Juergen,

> So my understanding is that the messages are inserted as child elements into
> this root element - and the end result is one document with one root element
> and millions of child elements representing the individual messages, yes?

Yes, that is correct. I have one root element at the beginning and insert the
incoming items as child nodes of the root.

> Therefore you do not have to come up with URIs, as there is only one single
> document. A monster document, but I conclude from your approach that this is
> no problem, and not worse (or even better) than having a million individual,
> small documents. Is it correct - would you recommend to store the messages in
> one single document?

In my use case, tweets have unique id attributes, so I don't need any URIs to
identify them. It would probably help if you described your further querying
process, so that it is easier to understand what you want to do.

> If the loading process cannot run concurrently with queries - would there be
> any way how one could periodically "shift" packages of messages into a
> "read only" database? Or perhaps better the other way around, let the server
> periodically interrupt its loading activity, close the database, rename it,
> open and initialize a new base and then continue to load? Or is there
> presently simply no solution available?

That's exactly what I do after each hour: I rename the current database using
the current date_hour and create a new database for the next incoming items.
Shifting is not really an alternative, because it would probably take too long
to insert the items into a second database and delete them from the "main"
database.

Kind regards,
Andreas

On 03.07.2012 at 23:58, Hans-Juergen Rennau wrote:

> Hello Andreas,
>
> thank you very much for this information! Indeed, the use cases are
> similar.
>
> I try to understand how exactly you stored the messages. The Wiki says: "the
> initial database just contained a root node <tweets/>". So my understanding
> is that the messages are inserted as child elements into this root element -
> and the end result is one document with one root element and millions of
> child elements representing the individual messages, yes? Therefore you do
> not have to come up with URIs, as there is only one single document. A
> monster document, but I conclude from your approach that this is no problem,
> and not worse (or even better) than having a million individual, small
> documents. Is it correct - would you recommend to store the messages in one
> single document?
>
> If the loading process cannot run concurrently with queries - would there be
> any way how one could periodically "shift" packages of messages into a
> "read only" database? Or perhaps better the other way around, let the server
> periodically interrupt its loading activity, close the database, rename it,
> open and initialize a new base and then continue to load? Or is there
> presently simply no solution available?
>
> Kind regards,
> Hans-Juergen
>
> From: Andreas Weiler <andreas.wei...@uni-konstanz.de>
> To: Hans-Juergen Rennau <hren...@yahoo.de>
> CC: "basex-talk@mailman.uni-konstanz.de" <basex-talk@mailman.uni-konstanz.de>
> Sent: 15:51 Tuesday, 3 July 2012
> Subject: Re: [basex-talk] BaseX as a log msg store?
>
> Hello Hans-Juergen,
>
> here are some details about my use case, which is similar to yours.
> I'm using BaseX to insert the live public Twitter Stream into databases (see
> Wiki Entry [1]).
>
> One Twitter message is around 4 kB in size, and I'm able to insert about 2000
> of them per second using single XQuery Update inserts. So that would probably
> work out for you, too.
> If you use bulk inserts, like caching the items in an item list and running
> one XQuery Update for all of them, the insert rate would increase further.
>
>> thus made available for querying
>
> This could be a bigger problem, because as long as you are writing items into
> the database (which will never stop in your use case), the readers are
> blocked. And while one of your readers is running, the writers are blocked.
>
> Hope this helps,
> Andreas
>
> [1] http://docs.basex.org/wiki/Twitter
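
The hourly rotation described in this thread (close the live database, rename
it to a date_hour name, create a fresh one with an empty root) could be
sketched roughly as follows. This is only an illustration, not the actual
script used: the function name rotation_commands, the database name "tweets"
and the exact date_hour pattern are assumptions, and the command strings
(CLOSE, ALTER DB, CREATE DB) should be checked against the command reference of
the BaseX version in use. The sketch only builds the command strings; sending
them to a server is left out.

```python
from datetime import datetime


def rotation_commands(now, live_db="tweets", root="<tweets/>"):
    """Build the command strings for one hourly rotation (hypothetical helper).

    now     -- timestamp used to derive the archive name
    live_db -- assumed name of the database that receives new items
    root    -- empty root element for the fresh database
    """
    # Archive name in an assumed date_hour form, e.g. tweets_20120703_23.
    archive = f"{live_db}_{now:%Y%m%d_%H}"
    return [
        "CLOSE",                          # release the live database
        f"ALTER DB {live_db} {archive}",  # rename it to the archive name
        f"CREATE DB {live_db} {root}",    # start a new, empty live database
    ]


if __name__ == "__main__":
    for command in rotation_commands(datetime(2012, 7, 3, 23, 58)):
        print(command)
    # CLOSE
    # ALTER DB tweets tweets_20120703_23
    # CREATE DB tweets <tweets/>
```

Renaming is cheap compared with copying items into a second database, which
is the reason given above for preferring rotation over "shifting".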
_______________________________________________
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk