Re: Kernel SCM saga.. (bk license?)

2005-04-12 Thread Ricky Beam
On Tue, 12 Apr 2005, Kedar Sovani wrote: >I was wondering if working on git, is in anyway, in violation of the >Bitkeeper license, which states that you cannot work on any other SCM >(SCM-like?) tool for "x" amount of time after using Bitkeeper ? Technically, yes, it is. However, as BitMover has

Re: Kernel SCM saga..

2005-04-12 Thread Pavel Machek
Hi! > > It's possible to generate another object with the same hash, but: > > Yeah - the real check is that the modified object has to > compile and do something useful for someone (the cracker > if no one else). > > Just getting a random bucket of bits substituted for a > real kernel source fil

Re: Kernel SCM saga.. (bk license?)

2005-04-12 Thread Catalin Marinas
Kedar Sovani <[EMAIL PROTECTED]> wrote: > I was wondering if working on git, is in anyway, in violation of the > Bitkeeper license, which states that you cannot work on any other SCM > (SCM-like?) tool for "x" amount of time after using Bitkeeper ? That's valid for the new BK license only which pr

Re: Kernel SCM saga.. (bk license?)

2005-04-12 Thread Kedar Sovani
I was wondering if working on git, is in anyway, in violation of the Bitkeeper license, which states that you cannot work on any other SCM (SCM-like?) tool for "x" amount of time after using Bitkeeper ? Kedar. On Apr 8, 2005 10:12 AM, Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > On Thu, 7

Re: Kernel SCM saga..

2005-04-10 Thread Jan Hudec
On Mon, Apr 11, 2005 at 04:56:06 +0200, Marcin Dalecki wrote: > > On 2005-04-11, at 04:26, Miles Bader wrote: > > >Marcin Dalecki <[EMAIL PROTECTED]> writes: > >>Better don't waste your time with looking at Arch. Stick with patches > >>you maintain by hand combined with some scripts containing a

Re: Kernel SCM saga..

2005-04-10 Thread Marcin Dalecki
On 2005-04-11, at 04:26, Miles Bader wrote: Marcin Dalecki <[EMAIL PROTECTED]> writes: Better don't waste your time with looking at Arch. Stick with patches you maintain by hand combined with some scripts containing a list of apply commands and you should be still more productive then when using Ar

Re: Kernel SCM saga..

2005-04-10 Thread Miles Bader
Marcin Dalecki <[EMAIL PROTECTED]> writes: > Better don't waste your time with looking at Arch. Stick with patches > you maintain by hand combined with some scripts containing a list of > apply commands and you should be still more productive then when using > Arch. Arch has its problems, but plea

Re: Kernel SCM saga..

2005-04-10 Thread Christian Parpart
On Monday 11 April 2005 12:33 am, you wrote: [..] > Well, I followed some of the instructions to mirror the kernel tree on > svn.clkao.org/linux/cvs, and although it took around 12 hours to import > 28232 versions, I seem to have a mirror of it on my own subversion > server now. I think the svn

Re: Kernel SCM saga..

2005-04-10 Thread Troy Benjegerdes
On Thu, Apr 07, 2005 at 02:29:24PM -0400, Daniel Phillips wrote: > On Thursday 07 April 2005 14:13, Dmitry Yusupov wrote: > > On Thu, 2005-04-07 at 13:54 -0400, Daniel Phillips wrote: > > > Three years ago, there was no fully working open source distributed scm > > > code base to use as a starting

Re: Kernel SCM saga..

2005-04-10 Thread Paul Jackson
Ingo wrote: > not the compression of every file separately. ok -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401 - To unsubscribe from this list: send the lin

Re: Kernel SCM saga..

2005-04-10 Thread Matthias Andree
Andrea Arcangeli schrieb am 2005-04-09: > On Fri, Apr 08, 2005 at 05:12:49PM -0700, Linus Torvalds wrote: > > really designed for something like a offline http grabber, in that you can > > just grab files purely by filename (and verify that you got them right by > > running sha1sum on the result

Re: Kernel SCM saga..

2005-04-10 Thread Paul Jackson
> It's possible to generate another object with the same hash, but: Yeah - the real check is that the modified object has to compile and do something useful for someone (the cracker if no one else). Just getting a random bucket of bits substituted for a real kernel source file isn't going to get

Re: Kernel SCM saga..

2005-04-10 Thread Ingo Molnar
* Paul Jackson <[EMAIL PROTECTED]> wrote: > Ingo wrote: > > With default gzip it's 3.3 seconds though, > > and that still compresses it down to 57 MB. > > Interesting. I'm surprised how much a bunch of separate, modest sized > files can be compressed. sorry, what i measured was in essence the

Re: Kernel SCM saga..

2005-04-10 Thread Paul Jackson
Ingo wrote: > With default gzip it's 3.3 seconds though, > and that still compresses it down to 57 MB. Interesting. I'm surprised how much a bunch of separate, modest sized files can be compressed. I'm unclear what matters most here. Space on disk certainly isn't much of an issue. Even with An

Re: Kernel SCM saga..

2005-04-10 Thread Bill Davidsen
ly) - it's still non-trivial in terms of computation needed > > Message-ID: <[EMAIL PROTECTED]> > From: Linus Torvalds <[EMAIL PROTECTED]> > Subject: Re: Kernel SCM saga.. > Date: Sat, 9 Apr 2005 09:16:22 -0700 (PDT) > > ... > >

Re: Kernel SCM saga..

2005-04-10 Thread David Roundy
On Sun, Apr 10, 2005 at 11:24:07AM +0200, Giuseppe Bilotta wrote: > On Sat, 9 Apr 2005 12:17:58 -0400, David Roundy wrote: > > > I've recently made some improvements recently which will reduce the > > memory use > > Does this include check for redundancy? ;) Yeah, the only catch is that if the r

Re: Kernel SCM saga..

2005-04-10 Thread Ingo Molnar
* Paul Jackson <[EMAIL PROTECTED]> wrote: > These 16817 files consume: > > 224 MBytes uncompressed and > 95 MBytes compressed > > (using zlib's minigzip, on a 4 KB page reiserfs.) that's a 42.4% compressed size. Using a (much) more CPU-intense compression method (bzip -9), the co

Re: Kernel SCM saga..

2005-04-10 Thread Ingo Molnar
* David S. Miller <[EMAIL PROTECTED]> wrote: > On Fri, 8 Apr 2005 22:45:18 -0700 (PDT) > Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > Also, I don't want people editing repository files by hand. Sure, the > > sha1 catches it, but still... I'd rather force the low-level ops to use > > the pr

Re: Kernel SCM saga..

2005-04-10 Thread Junio C Hamano
to do so... Message-ID: <[EMAIL PROTECTED]> From: Linus Torvalds <[EMAIL PROTECTED]> Subject: Re: Kernel SCM saga.. Date: Sat, 9 Apr 2005 09:16:22 -0700 (PDT) ... Linus (*) yeah, yeah, I know about the current theoretical case, and I don't c

Re: Kernel SCM saga..

2005-04-10 Thread Giuseppe Bilotta
On Sat, 9 Apr 2005 12:17:58 -0400, David Roundy wrote: > I've recently made some improvements > recently which will reduce the memory use Does this include check for redundancy? ;) -- Giuseppe "Oblomov" Bilotta Hic manebimus optime

Re: Kernel SCM saga..

2005-04-10 Thread David Lang
On Sat, 9 Apr 2005, Linus Torvalds wrote: The biggest irritation I have with the "tree" format I chose is actually not the name (which is trivial), it's the <sha1> part. Almost everything else keeps the <sha1> in the ASCII hexadecimal representation, and I should have done that here too. Why? Not because it's

Re: Kernel SCM saga..

2005-04-09 Thread Albert Cahalan
Linus Torvalds writes: > NOTE! I detest the centralized SCM model, but if push comes to shove, > and we just _can't_ get a reasonable parallell merge thing going in > the short timeframe (ie month or two), I'll use something like SVN > on a trusted site with just a few committers, and at least try

Re: Kernel SCM saga..

2005-04-09 Thread Paul Jackson
Linus wrote: > Almost everything > else keeps the <sha1> in the ASCII hexadecimal representation, and I > should have done that here too. Why? Not because it's a - hey, the > binary representation is certainly denser and equivalent Since the size of ASCII sha1's is only about 18% larger than the size

Re: Kernel SCM saga..

2005-04-09 Thread Paul Jackson
Chris wrote: > How many is alot? Are we talking 100k, 1m, 10m? I pulled some numbers out of my bk tree for Linux. I have 16817 source files. They average 12.2 bitkeeper changes per file (counting the number of changes visible from doing 'bk sccslog' on each of the 16817 files). These 16817 fi

Re: Re: Re: Kernel SCM saga..

2005-04-09 Thread Phillip Lougher
On Apr 10, 2005 2:42 AM, Petr Baudis <[EMAIL PROTECTED]> wrote: > Dear diary, on Sun, Apr 10, 2005 at 03:01:12AM CEST, I got a letter > where Phillip Lougher <[EMAIL PROTECTED]> told me that... > > On Apr 9, 2005 3:53 AM, Petr Baudis <[EMAIL PROTECTED]> wrote: > > > > > FWIW, I made few small fix

Re: Re: Re: Kernel SCM saga..

2005-04-09 Thread Petr Baudis
Dear diary, on Sun, Apr 10, 2005 at 03:01:12AM CEST, I got a letter where Phillip Lougher <[EMAIL PROTECTED]> told me that... > On Apr 9, 2005 3:53 AM, Petr Baudis <[EMAIL PROTECTED]> wrote: > > > FWIW, I made few small fixes (to prevent some trivial usage errors to > > cause cache corruption) a

Re: Re: Kernel SCM saga..

2005-04-09 Thread Phillip Lougher
On Apr 9, 2005 3:53 AM, Petr Baudis <[EMAIL PROTECTED]> wrote: > FWIW, I made few small fixes (to prevent some trivial usage errors to > cause cache corruption) and added scripts gitcommit.sh, gitadd.sh and > gitlog.sh - heavily inspired by what already went through the mailing > list. Everythin

Re: Kernel SCM saga..

2005-04-09 Thread Paul Jackson
David wrote: > recovery is more difficult when you corrupt some > file in your repository. Agreed. I too have recovered RCS and SCCS files by hand editing. Linus wrote: > I don't want people editing repository files by hand. Tyrant !;) From Wikipedia: A tyrant is a usurper of rightful

Re: Kernel SCM saga..

2005-04-09 Thread Chris Wedgwood
On Sat, Apr 09, 2005 at 04:13:51PM -0700, Linus Torvalds wrote: > > I understand the arguments for compression, but I hate it for one > > simple reason: recovery is more difficult when you corrupt some > > file in your repository. I've had this too. Magic binary blobs are horrible here for data

Re: Kernel SCM saga..

2005-04-09 Thread Tupshin Harper
Roman Zippel wrote: It seems you exported the complete parent information and this is exactly the "nitty-gritty" I was "whining" about and which is not available via bkcvs or bkweb and it's the most crucial information to make the bk data useful outside of bk. Larry was previously very clear abo

Re: Kernel SCM saga..

2005-04-09 Thread Linus Torvalds
On Sat, 9 Apr 2005, David S. Miller wrote: > > I understand the arguments for compression, but I hate it for one > simple reason: recovery is more difficult when you corrupt some > file in your repository. Trust me, the way git does things, you'll have so much redundancy that you'll have to re

Re: Kernel SCM saga..

2005-04-09 Thread David S. Miller
On Fri, 8 Apr 2005 22:45:18 -0700 (PDT) Linus Torvalds <[EMAIL PROTECTED]> wrote: > Also, I don't want people editing repository files by hand. Sure, the > sha1 catches it, but still... I'd rather force the low-level ops to use > the proper helper routines. Which is why it's a raw zlib compress

Re: Kernel SCM saga..

2005-04-09 Thread Florian Weimer
* David Lang: >> Databases supporting replication are called high end. You forgot >> the cats dance around the network this issue involves. > > And Postgres (which is Free in all senses of the word) is high end by this > definition. I'm not aware of *any* DBMS, commercial or not, which can perfo

Re: Kernel SCM saga..

2005-04-09 Thread Ray Lee
On Sat, 2005-04-09 at 19:40 +0200, Roman Zippel wrote: > On Sat, 9 Apr 2005, Eric D. Mudama wrote: > > > For example bk does something like this: > > > > > > A1 -> A2 -> A3 -> BM > > > \-> B1 -> B2 --^ > > > > > > and instead of creating the merge changeset, one could merge them

Re: Kernel SCM saga..

2005-04-09 Thread Marcin Dalecki
On 2005-04-09, at 17:42, Paul Jackson wrote: Marcin wrote: But what will impress you are either the price tag the DB comes with or the hardware it runs on :-) The payroll for the staffing to care and feed for these babies is often impressive as well. Please don't forget the bill from the electric p

[PATCH] Re: Kernel SCM saga..

2005-04-09 Thread Petr Baudis
Dear diary, on Sat, Apr 09, 2005 at 09:08:59AM CEST, I got a letter where "Randy.Dunlap" <[EMAIL PROTECTED]> told me that... > On Sat, 9 Apr 2005 04:53:57 +0200 Petr Baudis wrote: ..snip.. > | FWIW, I made few small fixes (to prevent some trivial usage errors to > | cause cache corruption) and ad

Re: Kernel SCM saga..

2005-04-09 Thread Roman Zippel
Hi, On Sat, 9 Apr 2005, Eric D. Mudama wrote: > > For example bk does something like this: > > > > A1 -> A2 -> A3 -> BM > > \-> B1 -> B2 --^ > > > > and instead of creating the merge changeset, one could merge them like > > this: > > > > A1 -> A2 -> A3 -> B1 -> B2 > >

Re: Kernel SCM saga..

2005-04-09 Thread Paul Jackson
> (b) while I depend on the fact that if the SHA of an object matches, the > objects are the same, I generally try to avoid the reverse > dependency. It might be a valid point that you want to leave the door open to using a different (than SHA1) digest. (So this means you going to st

Re: Kernel SCM saga..

2005-04-09 Thread Paul Jackson
Linus wrote: > In "git", you usually care about > the old contents too. True - in your case, you probably want the old contents so might as well dig them out as soon as it becomes convenient to have them. I was objecting to your claim that you _had_ to dig out the old contents to determine i

Re: Kernel SCM saga..

2005-04-09 Thread Roman Zippel
Hi, On Fri, 8 Apr 2005, Linus Torvalds wrote: > Yes. Per-file history is expensive in git, because of the way it is > indexed. Things are indexed by tree and by changeset, and there are no > per-file indexes. > > You could create per-file _caches_ (*) on top of git if you wanted to make > it

Re: Kernel SCM saga..

2005-04-09 Thread Paul Jackson
Linus wrote: > (you need to remember to escape '%' > too when you do that ;). No - don't have to. Not if I don't mind giving fools that embed newlines in paths second class service. In my case, if I create a file named "foo\nbar", then backup and restore it, I end up with a restored file named

Re: Kernel SCM saga..

2005-04-09 Thread Eric D. Mudama
On Apr 8, 2005 4:52 PM, Roman Zippel <[EMAIL PROTECTED]> wrote: > The problem is you pay a price for this. There must be a reason developers > were adding another GB of memory just to run BK. > Preserving the complete merge history does indeed make repeated merges > simpler, but it builds up comple

Re: Kernel SCM saga..

2005-04-09 Thread Roman Zippel
Hi, On Fri, 8 Apr 2005, Linus Torvalds wrote: > Also, I suspect that BKCVS actually bothers to get more details out of a > BK tree than I cared about. People have pestered Larry about it, so BKCVS > exports a lot of the nitty-gritty (per-file comments etc) that just > doesn't actually _matter_, b

Re: Kernel SCM saga..

2005-04-09 Thread Linus Torvalds
On Sat, 9 Apr 2005, Paul Jackson wrote: > > > in order to avoid having to worry about special characters > > they are NUL-terminated) > > Would this be a possible alternative - newline terminated (convert any > newlines embedded in filenames to the 3 chars '%0A', and leave it as an > exercise to

Re: Kernel SCM saga..

2005-04-09 Thread David Roundy
On Thu, Apr 07, 2005 at 12:30:18PM +0200, Matthias Andree wrote: > On Thu, 07 Apr 2005, Sergei Organov wrote: > > darcs? > > Close. Some things: > > 1. It's rather slow and quite CPU consuming and certainly I/O consuming >at times - I keep, to try it out, l

Re: Kernel SCM saga..

2005-04-09 Thread Paul Jackson
Linus wrote: > If you want to have spaces > and newlines in your pathname, go wild. So long as there is only one pathname in a record, you don't need nul-terminators to be allow spaces in the name. The rest of the record is well known, so the pathname is just whatever is left after chomping off

Re: Kernel SCM saga..

2005-04-09 Thread Linus Torvalds
On Sat, 9 Apr 2005, Paul Jackson wrote: > > I must be missing something here ... > > If the stat shows a possible change, then you shouldn't have to open the > original version to determine if it really changed - just compute the > SHA1 of the new file, and see if that changed from the original
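The check Paul Jackson describes in the quoted text (use stat to spot a possible change, then hash only the new file) could look roughly like the sketch below. This is only an illustration under stated assumptions: `IndexEntry`, `snapshot` and `has_changed` are hypothetical names, the hash is a plain SHA-1 over the file bytes, and none of it reflects git's actual index format.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass
class IndexEntry:
    """Hypothetical cached record for one file (not git's real index layout)."""
    mtime_ns: int
    size: int
    inode: int
    sha1: str


def sha1_of_file(path: str) -> str:
    """Return the hex SHA-1 of the file's raw bytes."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(path: str) -> IndexEntry:
    """Record stat data and content hash for later comparison."""
    st = os.stat(path)
    return IndexEntry(st.st_mtime_ns, st.st_size, st.st_ino, sha1_of_file(path))


def has_changed(path: str, entry: IndexEntry) -> bool:
    """Cheap stat comparison first; hash the new file only if stat differs."""
    st = os.stat(path)
    if (st.st_mtime_ns, st.st_size, st.st_ino) == (
        entry.mtime_ns, entry.size, entry.inode
    ):
        return False  # stat data matches, so the file is assumed unchanged
    # Stat differs: compare the new hash with the recorded one,
    # without ever reading the old contents.
    return sha1_of_file(path) != entry.sha1
```

Note that this only answers whether the file changed; producing a diff would still need the old contents, which is the case Linus raises elsewhere in the thread.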

Re: Kernel SCM saga..

2005-04-09 Thread Paul Jackson
> in order to avoid having to worry about special characters > they are NUL-terminated) Would this be a possible alternative - newline terminated (convert any newlines embedded in filenames to the 3 chars '%0A', and leave it as an exercise to the reader to de-convert them.) Line formatted ASCII f
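As a rough illustration of the newline-terminated alternative proposed here, the sketch below escapes embedded newlines as '%0A'. It also escapes '%' itself (as '%25'), which Linus points out elsewhere in the thread is needed for the encoding to stay reversible. The function names and the choice of '%25' are assumptions made for this example, not anything taken from git.

```python
def escape_path(path: str) -> str:
    """Escape '%' first, then newlines, so the result is reversible."""
    return path.replace("%", "%25").replace("\n", "%0A")


def unescape_path(escaped: str) -> str:
    """Undo escape_path: restore newlines first, then literal '%'."""
    return escaped.replace("%0A", "\n").replace("%25", "%")


def write_records(paths):
    """Join escaped paths into one newline-terminated text blob."""
    return "".join(escape_path(p) + "\n" for p in paths)


def read_records(blob: str):
    """Split a newline-terminated blob back into the original paths."""
    return [unescape_path(line) for line in blob.split("\n")[:-1]]
```

Skipping the '%' escape, as Paul suggests in a later reply, keeps the format simpler at the cost of mangling the rare path that already contains the three characters '%0A'.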

Re: Kernel SCM saga..

2005-04-09 Thread Paul Jackson
Marcin wrote: > But what will impress you are either the price tag the > DB comes with or > the hardware it runs on :-) The payroll for the staffing to care and feed for these babies is often impressive as well. -- I won't rest till it's the best ... Programm

Re: Kernel SCM saga..

2005-04-09 Thread Paul Jackson
Linus wrote: > then git will open have exactly _one_ > file (no searching, no messing around), which contains absolutely nothing > except for the compressed (and SHA1-signed) old contents of the file. It > obviously _has_ to do that, because in order to know whether you've > changed it, it need

Re: Kernel SCM saga..

2005-04-09 Thread Paul Jackson
Linus wrote: > you need to reuse the same inode/dev numbers > (again - I didn't worry about portability, and filesystems where those > aren't stable are a "don't do that then") On filesystems that don't have a stable inode number, I use the md5sum of the full (relative to mount point) pathname as

Re: Kernel SCM saga..

2005-04-09 Thread Samium Gromoff
It seems that Tom Lord, the primary architect behind GNU Arch has recently published an open letter to Linus Torvalds. Because no open letter to Linus would be really open without an accompanying reference post on lkml, here it is: http://lists.seyza.com/pipermail/gnu-arch-dev/2005-April/001001.h

Re: Kernel SCM saga..

2005-04-09 Thread Samium Gromoff
Ok, this was literally screaming for a rebuttal! :-) > Arch isn't a sound example of software design. Quite contrary to the >

Re: Kernel SCM saga..

2005-04-09 Thread Neil Brown
On Saturday April 9, [EMAIL PROTECTED] wrote: > On Sat, Apr 09, 2005 at 05:47:08PM +1000, Neil Brown wrote: > > On Saturday April 9, [EMAIL PROTECTED] wrote: > > > > > > I've just checked, it takes 5.7s to compare 2.4.29{,-hf3} over NFS (13300 > > > files each) and 1.3s once the trees are cached l

Re: Kernel SCM saga..

2005-04-09 Thread Jan Hudec
On Sat, Apr 09, 2005 at 03:01:29 +0200, Marcin Dalecki wrote: > > On 2005-04-07, at 09:44, Jan Hudec wrote: > > > >I have looked at most systems currently available. I would suggest > >following for closer look on: > > > >1) GNU Arch/Bazaar. They use the same archive format, simple, have the > >

Re: Kernel SCM saga..

2005-04-09 Thread Willy Tarreau
On Sat, Apr 09, 2005 at 05:47:08PM +1000, Neil Brown wrote: > On Saturday April 9, [EMAIL PROTECTED] wrote: > > > > I've just checked, it takes 5.7s to compare 2.4.29{,-hf3} over NFS (13300 > > files each) and 1.3s once the trees are cached locally. This is without > > comparing file contents, jus

Re: Kernel SCM saga..

2005-04-09 Thread Neil Brown
On Saturday April 9, [EMAIL PROTECTED] wrote: > > I've just checked, it takes 5.7s to compare 2.4.29{,-hf3} over NFS (13300 > files each) and 1.3s once the trees are cached locally. This is without > comparing file contents, just meta-data. And it takes 19.33s to compare > the file's md5 sums once

Re: Kernel SCM saga..

2005-04-09 Thread Willy Tarreau
On Fri, Apr 08, 2005 at 11:56:09AM -0700, Chris Wedgwood wrote: > On Fri, Apr 08, 2005 at 11:47:10AM -0700, Linus Torvalds wrote: > > > Don't use NFS for development. It sucks for BK too. > > Some times NFS is unavoidable. > > In the best case (see previous email wrt to only stat'ing the parent

Re: Kernel SCM saga..

2005-04-09 Thread Willy Tarreau
On Fri, Apr 08, 2005 at 12:03:49PM -0700, Linus Torvalds wrote: > And if you do actively malicious things in your own directory, you get > what you deserve. It's actually _hard_ to try to fool git into believing a > file hasn't changed: you need to not only replace it with the exact same > file l

Re: Kernel SCM saga..

2005-04-09 Thread Randy.Dunlap
On Sat, 9 Apr 2005 04:53:57 +0200 Petr Baudis wrote: | Hello, | | Dear diary, on Fri, Apr 08, 2005 at 05:50:21PM CEST, I got a letter | where Linus Torvalds <[EMAIL PROTECTED]> told me that... | > | > | > On Fri, 8 Apr 2005 [EMAIL PROTECTED] wrote: | > > | > > Here's a partial solution. It

Re: Kernel SCM saga..

2005-04-08 Thread Linus Torvalds
On Sat, 9 Apr 2005, Andrea Arcangeli wrote: > > I'm not entirely convinced wget is going to be an efficient way to > synchronize and fetch your tree I don't think it's efficient per se, but I think it's important that people can just "pass the files along". Ie it's a huge benefit if any every

Re: Kernel SCM saga..

2005-04-08 Thread Walter Landry
Linus Torvalds wrote: > Which is why I'd love to hear from people who have actually used > various SCM's with the kernel. There's bound to be people who have > already tried. At the end of my Codecon talk, there is a performance comparison of a number of different distributed SCM's with the kernel

Re: Kernel SCM saga..

2005-04-08 Thread Andrea Arcangeli
On Fri, Apr 08, 2005 at 11:08:58PM -0400, Brian Gerst wrote: > It's my understanding that the files don't change. Only new ones are > created for each revision. I said diff between the trees, not diff between files ;). When you fetch the new changes with rsync, it'll compress better and in turn

Re: Kernel SCM saga..

2005-04-08 Thread Brian Gerst
Andrea Arcangeli wrote: On Fri, Apr 08, 2005 at 05:12:49PM -0700, Linus Torvalds wrote: really designed for something like a offline http grabber, in that you can just grab files purely by filename (and verify that you got them right by running sha1sum on the resulting local copy). So think "wget

Re: Re: Kernel SCM saga..

2005-04-08 Thread Petr Baudis
Hello, Dear diary, on Fri, Apr 08, 2005 at 05:50:21PM CEST, I got a letter where Linus Torvalds <[EMAIL PROTECTED]> told me that... > > > On Fri, 8 Apr 2005 [EMAIL PROTECTED] wrote: > > > > Here's a partial solution. It does depend on a modified version of > > cat-file that behaves like cat.

Re: Kernel SCM saga..

2005-04-08 Thread Andrea Arcangeli
On Fri, Apr 08, 2005 at 07:38:30PM -0400, Daniel Phillips wrote: > For the immediate future, all we need is something than can _losslessly_ > capture the new metadata that's being generated. That buys time to bring one > of the promising open source candidates up to full speed. Agreed.

Re: Kernel SCM saga..

2005-04-08 Thread David Lang
On Sat, 9 Apr 2005, Andrea Arcangeli wrote: On Fri, Apr 08, 2005 at 05:12:49PM -0700, Linus Torvalds wrote: really designed for something like a offline http grabber, in that you can just grab files purely by filename (and verify that you got them right by running sha1sum on the resulting local cop

Re: Kernel SCM saga..

2005-04-08 Thread Andrea Arcangeli
On Fri, Apr 08, 2005 at 05:12:49PM -0700, Linus Torvalds wrote: > really designed for something like a offline http grabber, in that you can > just grab files purely by filename (and verify that you got them right by > running sha1sum on the resulting local copy). So think "wget". I'm not entire
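The "grab by filename, then verify with sha1sum" idea quoted above can be sketched as follows. The assumption here is that a fetched file is named by the hex SHA-1 of its bytes as stored; `verify_fetched` is a hypothetical helper, and the fetching itself (wget, rsync, or anything else) is left out.

```python
import hashlib


def sha1_hex(path: str) -> str:
    """Hex SHA-1 of a file's bytes, comparable to sha1sum output."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fetched(path: str, expected_name: str) -> bool:
    """True if the local copy hashes to the name it was requested under."""
    return sha1_hex(path) == expected_name.lower()
```

Because the name doubles as the checksum, a copy can be checked no matter which mirror or person it came from.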

Re: Kernel SCM saga..

2005-04-08 Thread David Lang
On Sat, 9 Apr 2005, Marcin Dalecki wrote: On 2005-04-08, at 20:28, Jon Smirl wrote: On Apr 8, 2005 2:14 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote: How do you replicate your database incrementally? I've given you enough clues to do it for "git" in probably five lines of perl. Efficient data

Re: Kernel SCM saga..

2005-04-08 Thread Tupshin Harper
Roman Zippel wrote: Please show me how you would do a binary search with arch. I don't really like the arch model, it's far too restrictive and it's jumping through hoops to get to an acceptable speed. What I expect from a SCM is that it maintains both a version index of the directory structure

Re: Kernel SCM saga..

2005-04-08 Thread Marcin Dalecki
On 2005-04-09, at 03:09, Chris Wedgwood wrote: On Sat, Apr 09, 2005 at 03:00:44AM +0200, Marcin Dalecki wrote: Yes it sucks less for this purpose. See subversion as reference. Whatever solution people come up with, ideally it should be tolerant to minor amounts of corruption (so I can recover the r

Re: Kernel SCM saga..

2005-04-08 Thread Marcin Dalecki
On 2005-04-08, at 20:28, Jon Smirl wrote: On Apr 8, 2005 2:14 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote: How do you replicate your database incrementally? I've given you enough clues to do it for "git" in probably five lines of perl. Efficient database replication is achieved by copying t

Re: Kernel SCM saga..

2005-04-08 Thread Chris Wedgwood
On Sat, Apr 09, 2005 at 03:00:44AM +0200, Marcin Dalecki wrote: > Yes it sucks less for this purpose. See subversion as reference. Whatever solution people come up with, ideally it should be tolerant to minor amounts of corruption (so I can recover the rest of my data if need be) and it should al

Re: Kernel SCM saga..

2005-04-08 Thread Marcin Dalecki
On 2005-04-08, at 20:14, Linus Torvalds wrote: On Fri, 8 Apr 2005, Matthias-Christian Ott wrote: Ok, but if you want to search for information in such big text files it slow, because you do linear search No I don't. I don't search for _anything_. I have my own content-addressable filesystem, and

Re: Kernel SCM saga..

2005-04-08 Thread Marcin Dalecki
On 2005-04-08, at 19:14, Linus Torvalds wrote: You do that with an sql database, and I'll be impressed. It's possible. But what will impress you are either the price tag the DB comes with or the hardware it runs on :-)

Re: Kernel SCM saga..

2005-04-08 Thread Marcin Dalecki
On 2005-04-06, at 23:13, [EMAIL PROTECTED] wrote: Linus Torvalds wrote: PS. Don't bother telling me about subversion. If you must, start reading up on "monotone". That seems to be the most viable alternative, but don't pester the developers so much that they don't get any work done. They are alr

Re: Kernel SCM saga..

2005-04-08 Thread Marcin Dalecki
On 2005-04-07, at 09:44, Jan Hudec wrote: I have looked at most systems currently available. I would suggest following for closer look on: 1) GNU Arch/Bazaar. They use the same archive format, simple, have the concepts right. It may need some scripts or add ons. When Bazaar-NG is ready, it wi

Re: Kernel SCM saga..

2005-04-08 Thread Marcin Dalecki
On 2005-04-08, at 18:15, Matthias-Christian Ott wrote: Linus Torvalds wrote: SQL Databases like SQLite aren't slow. But maybe a Berkeley Database v.4 is a better solution. Yes it sucks less for this purpose. See subversion as reference.

Re: Kernel SCM saga..

2005-04-08 Thread Roman Zippel
Hi, On Fri, 8 Apr 2005, Tupshin Harper wrote: > > A1 -> A2 -> A3 -> B1 -> B2 > > > > This results in a simpler repository, which is more scalable and which is > > easier for users to work with (e.g. binary bug search). > > The disadvantage would be it will cause more minor conflicts, when ch

Re: Kernel SCM saga..

2005-04-08 Thread Linus Torvalds
On Fri, 8 Apr 2005, Linus Torvalds wrote: > > Also note that the above algorithm really works for _any_ two commit > points (apart for the two first steps, which are obviously all about > finding the parent tree when you want to diff against a predecessor). Btw, if you want to try this, you

Re: Kernel SCM saga..

2005-04-08 Thread Linus Torvalds
On Fri, 8 Apr 2005, Andrea Arcangeli wrote: > > We'd need a regenerated coherent copy of BKCVS to pipe into those SCM to > evaluate how well they scale. Yes, that makes most sense, I believe. Especially as BKCVS does the linearization that makes other SCM's _able_ to take the data in the first

Re: Kernel SCM saga..

2005-04-08 Thread Tupshin Harper
Roman Zippel wrote: Preserving the complete merge history does indeed make repeated merges simpler, but it builds up complex meta data, which has to be managed forever. I doubt that this is really an advantage in the long term. I expect that we were better off serializing changesets in the main

Re: Kernel SCM saga..

2005-04-08 Thread Daniel Phillips
On Friday 08 April 2005 04:38, Andrea Arcangeli wrote: > On Thu, Apr 07, 2005 at 11:41:29PM -0700, Linus Torvalds wrote: > The huge number of changesets is the crucial point, there are good > distributed SCM already but they are apparently not efficient enough at > handling 60k changesets. > > We'd

Re: Kernel SCM saga..

2005-04-08 Thread Linus Torvalds
On Fri, 8 Apr 2005, Rajesh Venkatasubramanian wrote: > > Although directory changes are tracked using change-sets, there > seems to be no easy way to answer "give me the diff corresponding to > the commit (change-set) object ". That will be really helpful to > review the changes. Actually, it

Re: Kernel SCM saga..

2005-04-08 Thread Roman Zippel
Hi, On Thu, 7 Apr 2005, Linus Torvalds wrote: > I really disliked that in BitKeeper too originally. I argued with Larry > about it, but Larry (correctly, I believe) argued that efficient and > reliable distribution really requires the concept of "history is > immutable". It makes replication much

Re: Kernel SCM saga..

2005-04-08 Thread Rajesh Venkatasubramanian
Linus wrote: It looks like an operation like "show me the history of mm/memory.c" will be pretty expensive using git. Yes. Per-file history is expensive in git, because of the way it is indexed. Things are indexed by tree and by changeset, and there are no per-file indexes. Although directory ch

Re: Kernel SCM saga..

2005-04-08 Thread Daniel Phillips
On Friday 08 April 2005 13:24, Jon Masters wrote: > On Apr 7, 2005 6:54 PM, Daniel Phillips <[EMAIL PROTECTED]> wrote: > > So I propose that everybody who is interested, pick one of the above > > projects and join it, to help get it to the point of being able to > > losslessly import the version gr

Re: Kernel SCM saga..

2005-04-08 Thread Linus Torvalds
On Fri, 8 Apr 2005 [EMAIL PROTECTED] wrote: > > It looks like an operation like "show me the history of mm/memory.c" will > be pretty expensive using git. Yes. Per-file history is expensive in git, because of the way it is indexed. Things are indexed by tree and by changeset, and there are no

Re: Kernel SCM saga..

2005-04-08 Thread Luck
It looks like an operation like "show me the history of mm/memory.c" will be pretty expensive using git. I'd need to look at the current tree, and then trace backwards through all 60,000 changesets to see which ones had actual changes to this file. Could you expand the tuple in the tree object to
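To make the cost described here concrete, the sketch below walks a changeset chain backwards and reports which changesets altered a given path. It is a toy model under loud assumptions: `Commit` is a hypothetical in-memory record with a single parent and a flat path-to-hash mapping, whereas real history has merges and nested trees.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Commit:
    """Hypothetical changeset: a name, a flat tree, and one parent."""
    name: str
    tree: Dict[str, str]  # path -> content hash
    parent: Optional["Commit"] = None


def file_history(head: Commit, path: str) -> List[str]:
    """Return names of changesets where `path` differs from its parent."""
    changed = []
    commit: Optional[Commit] = head
    while commit is not None:
        parent_tree = commit.parent.tree if commit.parent else {}
        # With no per-file index, every changeset has to be visited
        # just to compare one entry against its parent.
        if commit.tree.get(path) != parent_tree.get(path):
            changed.append(commit.name)
        commit = commit.parent
    return changed
```

The loop visits every changeset regardless of how rarely the file changed, which is the expense being pointed out.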

Re: Uncached stat performace [ Was: Re: Kernel SCM saga.. ]

2005-04-08 Thread Chris Wedgwood
On Fri, Apr 08, 2005 at 10:11:51PM +0200, Ragnar Kjørstad wrote: > It does, so why isn't there a way to do this without the disgusting > hack? (Your words, not mine :) ) inode sorting probably a good guess for a number of filesystems, you can map the blocks used to do better still (somewhat fs sp

Uncached stat performace [ Was: Re: Kernel SCM saga.. ]

2005-04-08 Thread Ragnar Kjørstad
On Fri, Apr 08, 2005 at 12:39:26PM -0700, Linus Torvalds wrote: > One of the reasons I do inode numbers in the "index" file (apart from > checking that the inode hasn't changed) is in fact that "stat()" is damn > slow if it causes seeks. Since your stat loop is entirely > > You can optimize you

Re: Kernel SCM saga..

2005-04-08 Thread Chris Wedgwood
On Fri, Apr 08, 2005 at 09:38:09PM +0200, Florian Weimer wrote: > Does sorting by inode number make a difference? It almost certainly would. But I can sort more intelligently than that even (all the world isn't ext2/3).

Re: Kernel SCM saga..

2005-04-08 Thread Matthias-Christian Ott
Linus Torvalds wrote: On Fri, 8 Apr 2005, Matthias-Christian Ott wrote: But as mentioned you need to _open_ each file (It doesn't matter if it's cached (this speeds up only reading it) -- you need a _slow_ system call and _very slow_ hardware access anyway). Nope. System calls aren't slow

Re: Kernel SCM saga..

2005-04-08 Thread Linus Torvalds
On Fri, 8 Apr 2005, Chris Wedgwood wrote: > > > It doesn't matter so much for the cached case, but it _does_ matter > > for the uncached one. > > Doing the minimal stat cold-cache here is about 6s for local disk. > I'm somewhat surprised it's that bad actually. One of the reasons I do inode nu
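The idea of ordering stat() calls by inode number to reduce seeks on a cold cache could be tried with something like the sketch below. It assumes a single flat directory and relies on Python's `os.scandir`, whose entries expose the inode number from the directory listing; whether the ordering actually helps depends on the filesystem, as other replies in the thread point out.

```python
import os


def stat_in_inode_order(directory: str):
    """Stat every entry of `directory`, visiting entries by inode number."""
    with os.scandir(directory) as entries:
        # inode() normally comes from the directory listing itself,
        # so sorting does not need a stat() call per entry.
        ordered = sorted(entries, key=lambda entry: entry.inode())
    return [(entry.name, os.lstat(entry.path)) for entry in ordered]
```

Timing this against a plain name-ordered loop on an uncached tree is the simplest way to see whether the reordering pays off on a given filesystem.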

Re: Kernel SCM saga..

2005-04-08 Thread Florian Weimer
* Chris Wedgwood: >> It doesn't matter so much for the cached case, but it _does_ matter >> for the uncached one. > > Doing the minimal stat cold-cache here is about 6s for local disk. Does sorting by inode number make a difference?

Re: Kernel SCM saga..

2005-04-08 Thread Linus Torvalds
On Fri, 8 Apr 2005, Matthias-Christian Ott wrote: > > But as mentioned you need to _open_ each file (It doesn't matter if it's > cached (this speeds up only reading it) -- you need a _slow_ system call > and _very slow_ hardware access anyway). Nope. System calls aren't slow. What crappy OS ar

Re: Kernel SCM saga..

2005-04-08 Thread Chris Wedgwood
On Fri, Apr 08, 2005 at 12:03:49PM -0700, Linus Torvalds wrote: > Yes, doing the stat just on the directory (on leaf directories only, of > course, but nlink==2 does say that on most filesystems) is indeed a huge > potential speedup. Here I measure about 6ms for cache --- essentially below the no

Re: Kernel SCM saga..

2005-04-08 Thread Matthias-Christian Ott
Linus Torvalds wrote: On Fri, 8 Apr 2005, Matthias-Christian Ott wrote: Ok, but if you want to search for information in such big text files it slow, because you do linear search No I don't. I don't search for _anything_. I have my own content-addressable filesystem, and I guarantee you t

Re: Kernel SCM saga..

2005-04-08 Thread Linus Torvalds
On Fri, 8 Apr 2005, Chris Wedgwood wrote: > > Actually, I could probably make this *much* still faster with a > caveat. Given that my editor when I write a file will write a > temporary file and rename it, for files in directories where nlink==2 > I can check chat first and skip the stat of the

Re: Kernel SCM saga..

2005-04-08 Thread Florian Weimer
* Jon Smirl: > On Apr 8, 2005 2:14 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote: >>How do you replicate your database incrementally? I've given you enough >>clues to do it for "git" in probably five lines of perl. > > Efficient database replication is achieved by copying the transaction >
