2009/2/26 Scott Zhang <[email protected]>:
> Thanks, Jan.
> I am using the 0.8.1 beta installer downloaded from the CouchDB wiki.
>
> ---------------------------------------------------
> The question is what you need your system to look like eventually. If this is
> an initial data-import and after that you get mostly read requests, the
> longer insertion time will amortize over time.
> ---------------------------------------------------
> Yes. I am trying to transfer the keyword-index database from SqlServer to
> another database to balance the pressure on the SqlServer database. Then I
> will do searches using the keyword-index database. So the first import
> process is very important to me. After the initial data-import process, I
> can slowly add new keywords in.
>
> My candidates are Mnesia (preferred at first), CouchDB, PostgreSQL, MySQL.
>
Maybe a Berkeley DB-like database is a good option. Tokyo Tyrant:
http://tokyocabinet.sourceforge.net/benchmark.pdf

> Mnesia's insert performance is good. But there is a weird issue with
> Http_client in the Erlang Windows release. I reported that bug on the Erlang
> mailing list but got no response. I finally had to give up on Mnesia.
>
> CouchDB is my second try, but the problem is as I showed in the mail.
>
> Now I am working with PostgreSQL. At least I see it works as well as
> SqlServer.
>
>
> I will check back on CouchDB after your 1.0 release. But as I see it, for
> users playing with huge numbers of records, I will see when I see CouchDB
> 1.0. I will be playing with 1 billion records. So speed is the most
> important thing I care about.
>

If you want to do a very heavy migration, it is better to develop a script
that generates the db file directly. The network layers consume a lot of
time. Developing an application that writes bytes directly to disk can save
you a lot of time. I don't know how to do profiling in Erlang to estimate
the time saved.

Regards,

Javi

PS: Excuse me for my bad English

> Cheers.
> Thanks for your hard work.
>
>
> Regards.
> Scott
>
>
> On Thu, Feb 26, 2009 at 6:04 PM, Jan Lehnardt <[email protected]> wrote:
>
>> Hi Scott,
>>
>> thanks for your feedback. As a general note, you can't expect any magic
>> from CouchDB. It is bound by the same constraints all other programmes
>> are. To get the most out of CouchDB or SqlServer or MySQL, you need
>> to understand how it works.
>>
>>
>> On 26 Feb 2009, at 05:30, Scott Zhang wrote:
>>
>> Hi. Thanks for replying.
>>> But what is a database for if it is slow? Every database has the feature
>>> to make clusters to improve speed and capacity (don't mention "access"
>>> things).
>>>
>>
>> The point of CouchDB is allowing high numbers of concurrent requests. This
>> gives you more throughput for a single machine but not necessarily faster
>> single query execution speed.
>>
>>
>> I was expecting CouchDB to be as fast as SqlServer or MySQL. At least I
>>> know Mnesia is much faster than SqlServer. But Mnesia always throws
>>> harmless "overload" messages.
>>>
>>
>> CouchDB is not nearly as old as either of them. Did you really expect
>> software in alpha stages to be faster than fine-tuned systems that have
>> been used in production for a decade or longer?
>>
>>
>> I will try bulk insert now. But to be fair, I was inserting into SqlServer
>>> one insert at a time.
>>>
>>
>> Insert speed can be improved in numerous ways:
>>
>> - Use sequential descending document ids on insert.
>> - Use bulk insert.
>> - Bypass the HTTP API and insert native Erlang terms and skip JSON
>>   conversion.
>>
>> The question is what you need your system to look like eventually. If this
>> is an initial data-import and after that you get mostly read requests, the
>> longer insertion time will amortize over time.
>>
>> What version is the Windows binary you are using? If it is still 0.8, you
>> should try trunk (which most likely means switching to some UNIXy system).
>>
>> Cheers
>> Jan
>> --
>>
>>> Regards.
>>>
>>>
>>> On Thu, Feb 26, 2009 at 12:18 PM, Jens Alfke <[email protected]> wrote:
>>>
>>>> On Feb 25, 2009, at 8:02 PM, Scott Zhang wrote:
>>>>
>>>> But the performance is as bad as I can imagine. After running for
>>>>> several minutes, I had only inserted 120K records. I saw the speed was
>>>>> ~20 records each second.
>>>>>
>>>> Use the bulk-insert API to improve speed. The way you're doing it, every
>>>> record being added is a separate transaction, which requires a separate
>>>> HTTP request and flushing the file.
>>>>
>>>> (I'm a CouchDB newbie, but I don't think the point of CouchDB is speed.
>>>> What's exciting about it is the flexibility and the ability to build
>>>> distributed systems. If you're looking for a traditional database with
>>>> speed, have you tried MySQL?)
>>>>
>>>> --Jens
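
For reference, the bulk-insert advice quoted above could be sketched roughly
as follows. This is only a minimal illustration, not a tuned importer: it
assumes a CouchDB server reachable over HTTP, and the base URL, the database
name "keywords", the document fields and the batch size of 1000 are all
hypothetical. The helper names (batches, bulk_payload, bulk_insert) are made
up for this sketch. It posts documents in batches to the database's
_bulk_docs endpoint, so one HTTP request carries many records instead of one.

```python
import json
import urllib.request
from itertools import islice


def batches(docs, size):
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def bulk_payload(batch):
    """Encode one batch as a JSON body of the form {"docs": [...]}."""
    return json.dumps({"docs": batch}).encode("utf-8")


def bulk_insert(base_url, db, docs, size=1000):
    """POST docs to <base_url>/<db>/_bulk_docs in batches.

    Returns the number of documents sent. Per-document results in the
    response body are read but not inspected in this sketch.
    """
    sent = 0
    for batch in batches(docs, size):
        req = urllib.request.Request(
            f"{base_url}/{db}/_bulk_docs",
            data=bulk_payload(batch),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            resp.read()
        sent += len(batch)
    return sent


if __name__ == "__main__":
    # Hypothetical server, database and document shape; adjust to your setup.
    rows = ({"keyword": f"kw{i}"} for i in range(5000))
    print(bulk_insert("http://localhost:5984", "keywords", rows))
```

Passing a generator keeps memory use flat during a large import, since only
one batch is held at a time. The best batch size depends on document size and
hardware, so it is worth measuring a few values rather than trusting 1000.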
