RE: Medium sized binaries, lots of commits and performance

2005-02-10 Thread Jesper Vad Kristensen
Paul and Doug,

Thanks a lot for the advice (Larry too, but I replied to him elsewhere).
You've given me plenty to go on. I believe it's up to me now to figure
out what's most appropriate.

Regards,

Jesper Vad Kristensen
Aarhus, Denmark


___
Info-cvs mailing list
Info-cvs@gnu.org
http://lists.gnu.org/mailman/listinfo/info-cvs


RE: Medium sized binaries, lots of commits and performance

2005-02-10 Thread Jesper Vad Kristensen
Larry Jones wrote:

 I and the rest of us out here work with Oracle Forms and that means
 binary source code.

Are you sure there isn't a way to store them as text or to convert them
to text?  Source control systems are popular enough that there almost
certainly is.  Storing them in text form rather than in binary 
is by far the best solution to your potential problem.

I'm absolutely sure there is such a way :)

The Oracle Form Builder tool supports this innately. I'm a bit hesitant
to go this way, however, because it complicates most people's lives here
(having to always do explicit converting and all). Also, we pull source
code directly from CVS (using a perl script) and compile our
releases/launches from it - but your suggestion may of course make it
far more expedient to pull everything as ascii files, convert to
binaries, and then compile as usual.

Your advice may well be implemented - if I don't successfully follow any
of the other very fine suggestions by the people on this list.

BTW, if I have a binary of 2.8 MB it converts to text format as 6.4 MB.
Although the ascii is bigger I'm assuming CVS will be able to handle it
more efficiently (smaller deltas).

Thanks a lot for the advice!

Regards,

Jesper Vad Kristensen
Aarhus, Denmark




Re: Medium sized binaries, lots of commits and performance

2005-02-09 Thread Doug Lee
You first asked (or at least seemed to want to know :-) ) why
performance on a large binary CVS file goes way down when you update
from a branch instead of from HEAD.  Answer:  CVS stores the trunk
such that getting the HEAD revision is simply a matter of retrieving
a copy of it from the CVS file.  To get a branched revision, however,
requires the retrieval of the first version in the branch, then all
the deltas from then to the revision you want, going forward through
branch revisions.  I would therefore regard this performance hit
as a natural consequence of your use of CVS for binary source code,
unfortunately.  For a binary file, as you know, a delta can be a
considerable percentage of the original file size.
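
A rough back-of-the-envelope check of that last point, using only the
figures quoted in the original message below (a 3.5 MB file, about 70
commits, a 200 MB filename,v). This is a sketch, not a measurement: it
assumes one revision is stored in full and every other revision is a
delta, and it ignores RCS overhead such as character escaping and log
text.

```python
# Figures quoted in the thread; everything else here is an assumption.
rcs_file_mb = 200.0     # size of filename,v on the server
working_file_mb = 3.5   # size of one checked-out revision
revisions = 70          # approximate number of commits

# Assume the trunk head is stored intact and the rest are deltas.
delta_count = revisions - 1
avg_delta_mb = (rcs_file_mb - working_file_mb) / delta_count
delta_ratio = avg_delta_mb / working_file_mb

print(f"average delta: {avg_delta_mb:.2f} MB ({delta_ratio:.0%} of the file)")
```

Under those assumptions each delta is most of the size of the file
itself, which is why walking a chain of them is expensive.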

Your second question was how to remove old revisions in order to
improve performance.  I don't have a CVS manual URL handy like most
participants on this list seem to have, but check out the cvs admin
command.  It can indeed permanently delete revisions and ranges of
them.  You could, for example, delete all the revisions from the start
of a branch until two or so revisions behind its current state, so as
to speed up retrieval of revisions on that branch.
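
As a sketch of that suggestion: the snippet below only builds and prints
a command line, it does not run anything. It assumes the -o (outdate)
option of cvs admin with an inclusive rev1:rev2 range; the branch number
1.1.2, the tip number 5 and the helper name outdate_range are
hypothetical. Check the cvs admin documentation and back up the ,v file
before running anything like this, since the deletion is permanent.

```python
def outdate_range(branch, tip, keep=2):
    """Range covering the branch revisions older than the newest `keep`.

    `branch` is a branch number such as "1.1.2" and `tip` is the last
    component of the newest revision on it (5 for 1.1.2.5).
    """
    last_deleted = tip - keep
    if last_deleted < 1:
        raise ValueError("nothing old enough to delete")
    return f"{branch}.1:{branch}.{last_deleted}"


# Hypothetical branch 1.1.2 whose newest revision is 1.1.2.5.
rev_range = outdate_range("1.1.2", tip=5, keep=2)
command = ["cvs", "admin", f"-o{rev_range}", "filename"]
print(" ".join(command))
```

With these made-up numbers the two newest branch revisions are kept and
the older ones fall inside the range.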

Good luck.

On Wed, Feb 09, 2005 at 04:37:14PM +0100, Jesper Vad Kristensen wrote:
Hi folks,

I've searched the net and mail archives for some help or workaround to
my problem, but most binary issues tend to deal with the impossibility
of diff/merge or whether very large files can be stuffed into CVS.

I and the rest of us out here work with Oracle Forms and that means
binary source code. At first I was very suspicious of moving to CVS
because our source code is binary, but as it turns out I and
everyone else have become extremely happy with CVS. We can't merge,
granted, but with our external diff application we reap enormous
benefits from using CVS. Even branching is manageable.

But here's the problem, especially with our largest 3.5 MB file that's
been committed approx. 70 times. When doing a

cvs update -r HEAD filename

things work real fast (5 seconds). But if we do a

cvs update -r branch version filename

performance drops from 5 seconds to a minute and a half. I can imagine
something ugly happening with the filename,v file on the cvs server
which is 200 MB large.

The performance isn't killing us right now, but in maybe 6 months to a
year, who knows how bad it may have gotten?

So the question is if there are any administrative tools one can use to
compress/rationalize/index the file so branch access becomes faster? Is
there a way to permanently erase stuff older than 6 months?

And if not: opinions about my ideas below would be great. My ideas so
far:

MOVE variant: I wouldn't _like_ to lose the history of the application,
but it might be acceptable if performance degrades too much. I figure I
could move the filename,v file on the cvsroot repository (to a backup
folder), then delete from client and add a fresh one and the 1-2 active
branches - but can any history be kept if you do this? Will the old
history be in the backup folder?

MIGRATE: An alternative would be to create a new folder (while keeping
the old one) and simply migrate _all_ 85 files to the new folder (grab
HEAD, add all in HEAD to new folder, grab endpoints on branches, add all
branches as I best can).

Regards,

Jesper Vad Kristensen
Aarhus, Denmark



-- 
Doug Lee           [EMAIL PROTECTED]   http://www.dlee.org
Bartimaeus Group   [EMAIL PROTECTED]   http://www.bartsite.com
It is difficult to produce a television documentary that is both
incisive and probing when every twelve minutes one is interrupted by
dancing rabbits singing about toilet paper.  --Rod Serling




Re: Medium sized binaries, lots of commits and performance

2005-02-09 Thread Larry Jones
Doug Lee writes:
 
 To get a branched revision, however,
 requires the retrieval of the first version in the branch, then all
 the deltas from then to the revision you want, going forward through
 branch revisions.

It's even worse than that since retrieval of the first version in the
branch requires retrieving the head of the trunk (the only revision
that's stored intact), then all the deltas from there to the branch
point, going backwards through trunk revisions.

So, for example, if you branched at revision 1.1, the head of the trunk
is now 1.5, and the head of the branch is 1.1.2.5, CVS does the
following:

 1) Retrieve revision 1.5
 2) Retrieve the 1.4 - 1.5 delta and apply it backwards to
    recreate revision 1.4
 3) Retrieve the 1.3 - 1.4 delta and apply it backwards to
    recreate revision 1.3
 4) Retrieve the 1.2 - 1.3 delta and apply it backwards to
    recreate revision 1.2
 5) Retrieve the 1.1 - 1.2 delta and apply it backwards to
    recreate revision 1.1
 6) Retrieve the 1.1 - 1.1.2.1 delta and apply it to
    recreate revision 1.1.2.1
 7) Retrieve the 1.1.2.1 - 1.1.2.2 delta and apply it to
    recreate revision 1.1.2.2
 8) Retrieve the 1.1.2.2 - 1.1.2.3 delta and apply it to
    recreate revision 1.1.2.3
 9) Retrieve the 1.1.2.3 - 1.1.2.4 delta and apply it to
    recreate revision 1.1.2.4
10) Retrieve the 1.1.2.4 - 1.1.2.5 delta and apply it to
    recreate revision 1.1.2.5
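
The walk above can be sketched in a few lines. This is only a model of
the ordering described in the steps, not CVS code: the function name
reconstruction_path is made up, and it handles only trunk revisions and
branches taken directly off the trunk.

```python
def reconstruction_path(trunk_head, target):
    """Revisions materialised, in order, to check out `target`.

    Starts from the trunk head (stored intact), walks backwards along
    the trunk to the branch point, then forwards along the branch.
    """
    head = [int(p) for p in trunk_head.split(".")]
    tgt = [int(p) for p in target.split(".")]
    if len(head) != 2 or len(tgt) not in (2, 4):
        raise ValueError("sketch handles trunk and first-level branches only")
    major, stop = tgt[0], tgt[1]
    if major != head[0] or stop > head[1]:
        raise ValueError("target is not reachable from this trunk head")

    # Backwards along the trunk: head, head-1, ..., branch point.
    path = [f"{major}.{n}" for n in range(head[1], stop - 1, -1)]

    # Forwards along the branch: .1, .2, ..., requested revision.
    if len(tgt) == 4:
        branch = f"{tgt[0]}.{tgt[1]}.{tgt[2]}"
        path += [f"{branch}.{n}" for n in range(1, tgt[3] + 1)]
    return path


path = reconstruction_path("1.5", "1.1.2.5")
print(len(path) - 1, "deltas applied:", " ".join(path))
```

For the example in this message the path has ten entries, one per step
in the list, so nine deltas are applied. Asking for the trunk head
itself gives a path of one entry and no deltas at all.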

-Larry Jones

Moms and reason are like oil and water. -- Calvin




Re: Medium sized binaries, lots of commits and performance

2005-02-09 Thread Larry Jones
Jesper Vad Kristensen writes:
 
 I and the rest of us out here work with Oracle Forms and that means
 binary source code.

Are you sure there isn't a way to store them as text or to convert them
to text?  Source control systems are popular enough that there almost
certainly is.  Storing them in text form rather than in binary is by far
the best solution to your potential problem.

-Larry Jones

I've never seen a sled catch fire before. -- Hobbes




Re: Medium sized binaries, lots of commits and performance

2005-02-09 Thread Paul Sander
Larry gave a great description of why you're seeing your performance 
degrade over time.  As you can see, the more versions sit between the 
head and the version you want, the longer it takes to construct the 
version you want.

I can think of two effective and usable ways to combat the problem, 
plus one marginal one.  All of them essentially move the version you 
want closer to the head.

The first method is your MIGRATE method, which is a time-honored 
technique with CVS.

The second, which I believe was mentioned, is to reduce the number of 
revisions by obsoleting those that are no longer needed.  This is the 
marginal technique because history is lost, and the nature of the 
differences may not buy you anything.

The third method is to spawn new branches off the head and merge the 
latest versions of your existing branches onto the new branches, then 
convert your process to use the new branches instead.   This must be 
repeated periodically to keep a cap on response time.
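
To illustrate why the third method helps, here is a small count of the
deltas that have to be applied, reusing the revision numbers from
Larry's example (branch taken at 1.1, trunk head at 1.5, branch tip at
1.1.2.5). The helper name deltas_to_apply and the single merge commit on
the new branch are assumptions made for the sketch; it counts deltas
only and says nothing about how large each one is.

```python
def deltas_to_apply(trunk_head, branch_point, branch_revisions):
    """Deltas applied to reach the tip of a branch off the trunk.

    trunk_head and branch_point are the last components of the trunk
    revisions (5 for 1.5); branch_revisions is how many revisions exist
    on the branch.
    """
    backward = trunk_head - branch_point   # trunk head back to branch point
    forward = branch_revisions             # branch point out to the tip
    return backward + forward


old_branch = deltas_to_apply(trunk_head=5, branch_point=1, branch_revisions=5)
# New branch spawned off the head, holding one merged revision.
new_branch = deltas_to_apply(trunk_head=5, branch_point=5, branch_revisions=1)

print(f"old branch: {old_branch} deltas, new branch: {new_branch} delta")
```

The count grows again as the trunk and the new branch both move on,
which is why the re-branching has to be repeated from time to time.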

--
Paul Sander   | To do two things at once is to do neither
[EMAIL PROTECTED] | Publilius Syrus, Roman philosopher, 100 B.C.



Re: Medium sized binaries, lots of commits and performance

2005-02-09 Thread Paul Sander
On Feb 9, 2005, at 8:53 AM, [EMAIL PROTECTED] wrote:

Jesper Vad Kristensen writes:

 I and the rest of us out here work with Oracle Forms and that means
 binary source code.

Are you sure there isn't a way to store them as text or to convert them
to text?  Source control systems are popular enough that there almost
certainly is.  Storing them in text form rather than in binary is by far
the best solution to your potential problem.

Jesper also wrote:

 We can't merge,
 granted, but with our external diff application we reap enormous
 benefits from using CVS. Even branching is manageable.

This appears to be a case where adding support for external
datatype-specific diff and merge tools would be useful.

--
Paul Sander       | Let's stick to the new mistakes and get rid of the old
[EMAIL PROTECTED] | ones -- William Brown

