Re: Purging old Revisions
Hello Matthew Bluhm,

on Wednesday, 28 November 2012 at 23:57 you wrote:

> The repository was started 5 1/2 years ago.

Which version is your repo, and which Subversion version does your server run? Did you just upgrade your repo in place, or did you do a full dump and load cycle with a current version of Subversion? Newer features like representation sharing could reduce the size of your repo a lot while preserving the full history and your working copies.

> I agree that the whole idea is to keep the history, but oldest transactions provide the least value for me.

If you drop revisions, your working copies become useless and you have to check out everything again. Just compare that time with whatever costs your current, possibly larger, repo produces. You can easily compare the benefits of all approaches by copying your current repo, then creating one new repo in the current format with dump/load and another one containing only the revisions you want to keep.

Best regards,

Thorsten Schöning

-- 
Thorsten Schöning, E-Mail: thorsten.schoen...@am-soft.de
AM-SoFT IT-Systeme, http://www.AM-SoFT.de/
Re: Is there a way to dump the checksums from a svn repo?
On Thu, Nov 29, 2012 at 1:59 AM, Thorsten Schöning tschoen...@am-soft.de wrote:

> Hello olli hauer,
>
> on Wednesday, 28 November 2012 at 22:45 you wrote:
>
>> Someone hacks one of the additional mirrors, modifies a revision and adjusts the checksum (as described in many places on how to fix a corrupt repo) so it looks OK even with svnadmin verify.
>
> Sounds interesting, but if the mirrors that are not under your full control have already been hacked, how can you trust the checksums that svnadmin produces locally there? You can't, as you can't trust the mirror in any way. svnadmin could be manipulated too, so you would need to get the data into a trusted environment again and check it from there.

For things where the file representation is the same, I just use an 'rsync -nv' against a known-good copy to verify integrity and it runs pretty quickly. But the copy built by svnsync doesn't necessarily get stored the same way, does it?

-- 
Les Mikesell
lesmikes...@gmail.com
Re: Purging old Revisions
On Wed, Nov 28, 2012 at 5:57 PM, Matthew Bluhm matthew.bl...@bluhm.biz wrote:

> There were some binary files that were included as part of the projects that weren't necessary, so it's tons of seemingly small mistakes. The repository was started 5 1/2 years ago. Revision #6,000 was about 2 years ago. I have never used any history older than 24 months. Even though 10 GB doesn't seem big, about 2/3 of it is a waste, so it's wasting time and money. I agree that the whole idea is to keep the history, but oldest transactions provide the least value for me.

This is an easier problem to solve than getting rid of old revisions. Just create a dump file of your repository and pipe it through svndumpfilter, using its exclude subcommand to remove any paths you want to drop from your repository. Then load the end result into a new repository. This will allow you to strip out the large binary files and recover disk space without needing to drop revisions. Simply removing the first 6000 revisions is not going to accomplish what you want.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/
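The dump, filter, and load steps suggested in the message above can be sketched as a small helper that only assembles the command lines, so they can be reviewed before anything touches a repository. This is a minimal sketch, not a tested migration script: the repository names and the excluded path are hypothetical, and the target repository is assumed to have been created beforehand with svnadmin create.

```python
import shlex


def build_filter_pipeline(old_repo, new_repo, excluded_paths):
    """Return the dump, filter, and load commands as argv lists."""
    if not excluded_paths:
        raise ValueError("give at least one path prefix to exclude")
    dump = ["svnadmin", "dump", old_repo]
    # svndumpfilter takes "exclude" as a subcommand, followed by path prefixes.
    dump_filter = ["svndumpfilter", "exclude", *excluded_paths]
    load = ["svnadmin", "load", new_repo]
    return [dump, dump_filter, load]


def as_shell(commands):
    """Render the argv lists as one shell pipeline string for review."""
    return " | ".join(shlex.join(command) for command in commands)


if __name__ == "__main__":
    # Hypothetical repository names and path, for illustration only.
    print(as_shell(build_filter_pipeline("old-repo", "new-repo", ["trunk/bin"])))
```

Printing the pipeline rather than running it is deliberate: filtering rewrites history into a new repository, so it seems safer to inspect the exact command first and run it by hand against a copy.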
Re: Is there a way to dump the checksums from a svn repo?
Les Mikesell wrote on Thu, Nov 29, 2012 at 09:59:47 -0600:

> On Thu, Nov 29, 2012 at 1:59 AM, Thorsten Schöning tschoen...@am-soft.de wrote:
>
>> Hello olli hauer,
>>
>> on Wednesday, 28 November 2012 at 22:45 you wrote:
>>
>>> Someone hacks one of the additional mirrors, modifies a revision and adjusts the checksum (as described in many places on how to fix a corrupt repo) so it looks OK even with svnadmin verify.
>>
>> Sounds interesting, but if the mirrors that are not under your full control have already been hacked, how can you trust the checksums that svnadmin produces locally there? You can't, as you can't trust the mirror in any way. svnadmin could be manipulated too, so you would need to get the data into a trusted environment again and check it from there.
>
> For things where the file representation is the same, I just use an 'rsync -nv' against a known-good copy to verify integrity and it runs pretty quickly. But the copy built by svnsync doesn't necessarily get stored the same way, does it?

I think in 1.8/fsfs it will be byte-for-byte identical (except rep-cache.db, but you can remove that file without consequences). There was a dev@ thread by philipm about this not too long ago.
Re: Is there a way to dump the checksums from a svn repo?
olli hauer oha...@gmx.de writes:

> Is there a way to dump the checksums from a svn repo?
>
> What I'm doing at the moment on masters and slaves is
>
>   $ svnadmin verify
>
> and
>
>   $ sqlite $repo/db/rep-cache.db "select hash,revision from rep_cache"
>
> and then additionally comparing the sqlite output from master and slaves.
>
> Since rep-cache is not used during read requests, it would be nice to have, for example, a parameter for svnadmin verify to output the checksums so they can be compared between master and slaves.
>
> Is there a way, for example via the Python/Perl API?
>
> Thanks for every answer and code snippet ...

I did it in C but I suppose you might be able to use the Python bindings. I did

  svn_fs_open()
  svn_fs_revision_root(N)
  svn_repos_replay2(N-1)

which drove an editor from rN-1 to rN, and the editor did nothing except extract the checksum from the close_file callback.

-- 
Certified Supported Apache Subversion Downloads:
http://www.wandisco.com/subversion/download
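The manual comparison of rep-cache output between master and slave described in the question can be sketched with Python's standard sqlite3 module. This is only an illustration of that workflow: the table and column names (rep_cache, hash, revision) are taken from the query quoted above, the function names are made up here, and, as the thread points out, rep-cache.db is not consulted for reads, so matching rows say nothing definite about the revision files themselves.

```python
import sqlite3
from pathlib import Path


def load_rep_cache(db_path):
    """Return the set of (hash, revision) rows from one rep-cache.db."""
    # Open read-only so that inspecting a live repository cannot modify it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("SELECT hash, revision FROM rep_cache").fetchall()
    finally:
        conn.close()
    return set(rows)


def compare_rep_caches(master_db, mirror_db):
    """Return (rows only on the master, rows only on the mirror), both sorted."""
    master = load_rep_cache(master_db)
    mirror = load_rep_cache(mirror_db)
    return sorted(master - mirror), sorted(mirror - master)
```

A call such as compare_rep_caches("master/db/rep-cache.db", "mirror/db/rep-cache.db") (hypothetical paths) returns two empty lists when both caches hold the same rows.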
Re: Is there a way to dump the checksums from a svn repo?
Daniel Shahaf d...@daniel.shahaf.name writes:

> Les Mikesell wrote on Thu, Nov 29, 2012 at 09:59:47 -0600:
>> But the copy built by svnsync doesn't necessarily get stored the same way, does it?
>
> I think in 1.8/fsfs it will be byte-for-byte identical (except rep-cache.db, but you can remove that file without consequences). There was a dev@ thread by philipm about this not too long ago.

No, an svnsync mirror is usually not identical to the master. It does contain the same versioned data but the representation of that data is different. For example, every failed commit on the master will bump the fsfs sequence number and that will cause the node-revision-ids to be different.
Re: Is there a way to dump the checksums from a svn repo?
Philip Martin wrote on Thu, Nov 29, 2012 at 18:26:04 +:

> Daniel Shahaf d...@daniel.shahaf.name writes:
>
>> Les Mikesell wrote on Thu, Nov 29, 2012 at 09:59:47 -0600:
>>> But the copy built by svnsync doesn't necessarily get stored the same way, does it?
>>
>> I think in 1.8/fsfs it will be byte-for-byte identical (except rep-cache.db, but you can remove that file without consequences). There was a dev@ thread by philipm about this not too long ago.
>
> No, an svnsync mirror is usually not identical to the master. It does contain the same versioned data but the representation of that data is different. For example, every failed commit on the master will bump the fsfs sequence number and that will cause the node-revision-ids to be different.

Node-revision-ids in revisions don't embed transaction ids... For example, the noderev header (yes, header, not just id) of /subversion/trunk/notes is identical between svn.us and svn.eu.
Re: Is there a way to dump the checksums from a svn repo?
Philip Martin philip.mar...@wandisco.com writes:

>   mkdir zz
>   echo foo > zz/f
>   echo bar > zz/g
>   echo zigzig > zz/F
>   echo zagzag > zz/G
>   svnadmin create repo
>   svn mkdir -mm file://`pwd`/repo/A

Oops! That should be import, not mkdir:

  svn import -mm zz file://`pwd`/repo/A

>   svnadmin create repo2
>   svnsync init file://`pwd`/repo2 file://`pwd`/repo
>   svnsync sync file://`pwd`/repo2
Re: Is there a way to dump the checksums from a svn repo?
Philip Martin wrote on Thu, Nov 29, 2012 at 19:13:11 +:

> Daniel Shahaf d...@daniel.shahaf.name writes:
>
>> Philip Martin wrote on Thu, Nov 29, 2012 at 18:26:04 +:
>>> Daniel Shahaf d...@daniel.shahaf.name writes:
>>>> Les Mikesell wrote on Thu, Nov 29, 2012 at 09:59:47 -0600:
>>>>> But the copy built by svnsync doesn't necessarily get stored the same way, does it?
>>>>
>>>> I think in 1.8/fsfs it will be byte-for-byte identical (except rep-cache.db, but you can remove that file without consequences). There was a dev@ thread by philipm about this not too long ago.
>>>
>>> No, an svnsync mirror is usually not identical to the master. It does contain the same versioned data but the representation of that data is different. For example, every failed commit on the master will bump the fsfs sequence number and that will cause the node-revision-ids to be different.
>>
>> Node-revision-ids in revisions don't embed transaction ids... For example, the noderev header (yes, header, not just id) of /subversion/trunk/notes is identical between svn.us and svn.eu.
>
> OK. But the sequence number differences do show up in other places:
>
>   svnadmin create repo
>   svn mkdir -mm file://`pwd`/repo/A    # r1
>   svn mkdir -mm file://`pwd`/repo/A    # fail
>   svn mkdir -mm file://`pwd`/repo/A/B  # r2
>   svnadmin create repo2
>   svnadmin dump repo | svnadmin load repo2
>   diff repo/db/revs/0/2 repo2/db/revs/0/2
>   37c37
>   < _1.0.t1-2 add-dir false false /A/B
>   ---
>   > _1.0.t1-1 add-dir false false /A/B

Well, that answers the question: revision files are not byte-for-byte identical.

I wonder, though, if we should be rewriting these to use the revfile noderev ids? If not to avoid _* ids in revfiles, then to make the revfiles deterministic by using the (stable) revfile noderev ids, for the reasons given in your linked thread.

> Further, node-revision-ids can vary for other reasons. Representations in the revision files are in whatever order the client sends representations to the server. There are no defined orders for clients to use so it is quite likely that commits to the master and the mirror will use different orders:
>
> That affects the offsets in the text: lines, often changing the line length, which in turn affects the position of the subsequent nodes, and the position of the nodes affects the node-revision-ids.

Yes, that's exactly what your thread 87mx2hw607@stat.home.lan was about. I thought in the end that patch got committed?
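The single diff shown in the message above can be extended to every revision file with a short comparison script. This is a rough sketch under stated assumptions: both repositories are reachable on the local filesystem and keep their revision files under db/revs as in the example, and the function name is invented for illustration. Going by the thread, a non-empty result between a master and an svnsync mirror is expected and does not by itself indicate corruption.

```python
import filecmp
from pathlib import Path


def differing_rev_files(repo_a, repo_b):
    """List revision files under db/revs that differ or are missing in repo_b."""
    revs_a = Path(repo_a) / "db" / "revs"
    revs_b = Path(repo_b) / "db" / "revs"
    differing = []
    for file_a in sorted(p for p in revs_a.rglob("*") if p.is_file()):
        relative = file_a.relative_to(revs_a)
        file_b = revs_b / relative
        # shallow=False forces a byte-for-byte comparison instead of
        # trusting file size and modification time.
        if not file_b.is_file() or not filecmp.cmp(file_a, file_b, shallow=False):
            differing.append(relative.as_posix())
    return differing
```

For the two repositories built in the example, differing_rev_files("repo", "repo2") would be expected to include "0/2", since that is the file the quoted diff reports as different.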
Re: Is there a way to dump the checksums from a svn repo?
Daniel Shahaf d...@daniel.shahaf.name writes:

>> Further, node-revision-ids can vary for other reasons. Representations in the revision files are in whatever order the client sends representations to the server. There are no defined orders for clients to use so it is quite likely that commits to the master and the mirror will use different orders:
>>
>> That affects the offsets in the text: lines, often changing the line length, which in turn affects the position of the subsequent nodes, and the position of the nodes affects the node-revision-ids.
>
> Yes, that's exactly what your thread 87mx2hw607@stat.home.lan was about. I thought in the end that patch got committed?

That was committed but it's not quite the same problem. That thread was about revision file differences caused by the server itself. When comparing commits on a master and slave there can also be differences caused by the client.
Re: Is there a way to dump the checksums from a svn repo?
On 2012-11-29 20:13, Philip Martin wrote:

> Daniel Shahaf d...@daniel.shahaf.name writes:
>
>> Philip Martin wrote on Thu, Nov 29, 2012 at 18:26:04 +:
>>> Daniel Shahaf d...@daniel.shahaf.name writes:
>>>> Les Mikesell wrote on Thu, Nov 29, 2012 at 09:59:47 -0600:
>>>>> But the copy built by svnsync doesn't necessarily get stored the same way, does it?
>>>>
>>>> I think in 1.8/fsfs it will be byte-for-byte identical (except rep-cache.db, but you can remove that file without consequences). There was a dev@ thread by philipm about this not too long ago.
>>>
>>> No, an svnsync mirror is usually not identical to the master. It does contain the same versioned data but the representation of that data is different. For example, every failed commit on the master will bump the fsfs sequence number and that will cause the node-revision-ids to be different.
>>
>> Node-revision-ids in revisions don't embed transaction ids... For example, the noderev header (yes, header, not just id) of /subversion/trunk/notes is identical between svn.us and svn.eu.
>
> OK. But the sequence number differences do show up in other places:
>
>   svnadmin create repo
>   svn mkdir -mm file://`pwd`/repo/A    # r1
>   svn mkdir -mm file://`pwd`/repo/A    # fail
>   svn mkdir -mm file://`pwd`/repo/A/B  # r2
>   svnadmin create repo2
>   svnadmin dump repo | svnadmin load repo2
>   diff repo/db/revs/0/2 repo2/db/revs/0/2
>   37c37
>   < _1.0.t1-2 add-dir false false /A/B
>   ---
>   > _1.0.t1-1 add-dir false false /A/B
>
> Further, node-revision-ids can vary for other reasons. Representations in the revision files are in whatever order the client sends representations to the server. There are no defined orders for clients to use so it is quite likely that commits to the master and the mirror will use different orders:
>
>   mkdir zz
>   echo foo > zz/f
>   echo bar > zz/g
>   echo zigzig > zz/F
>   echo zagzag > zz/G
>   svnadmin create repo
>   svn mkdir -mm file://`pwd`/repo/A
>   svnadmin create repo2
>   svnsync init file://`pwd`/repo2 file://`pwd`/repo
>   svnsync sync file://`pwd`/repo2
>
> I see orders:
>
>   repo/db/revs/0/1: foo, zigzig, zagzag, bar
>   repo2/db/revs/0/1: zigzig, zagzag, foo, bar
>
> That affects the offsets in the text: lines, often changing the line length, which in turn affects the position of the subsequent nodes, and the position of the nodes affects the node-revision-ids.

That's what I also see with svnsync, especially for revisions with a lot of files in the initial commit (master and mirror run the same OS and are installed with exactly the same packages, no matter whether I sync over svn or http(s)).
Re: Is there a way to dump the checksums from a svn repo?
On 2012-11-29 19:24, Philip Martin wrote:

> olli hauer oha...@gmx.de writes:
>
>> Is there a way to dump the checksums from a svn repo?
>>
>> What I'm doing at the moment on masters and slaves is
>>
>>   $ svnadmin verify
>>
>> and
>>
>>   $ sqlite $repo/db/rep-cache.db "select hash,revision from rep_cache"
>>
>> and then additionally comparing the sqlite output from master and slaves.
>>
>> Since rep-cache is not used during read requests, it would be nice to have, for example, a parameter for svnadmin verify to output the checksums so they can be compared between master and slaves.
>>
>> Is there a way, for example via the Python/Perl API?
>>
>> Thanks for every answer and code snippet ...
>
> I did it in C but I suppose you might be able to use the Python bindings. I did
>
>   svn_fs_open()
>   svn_fs_revision_root(N)
>   svn_repos_replay2(N-1)
>
> which drove an editor from rN-1 to rN, and the editor did nothing except extract the checksum from the close_file callback.

Thanks for the hint, I will do some tests with your proposed snippet.
Problems with configuration of SVN ( error code 500 )
Hello everybody.

I followed the Arch Linux wiki guide for setting up an SVN repository using Apache and SSL. I'm almost certain that I understand all the steps and that I followed them correctly. Here is the guide: https://wiki.archlinux.org/index.php/Subversion_Setup.

When I try to connect using

  svn co https://192.168.0.21/svn/myrepo

or

  links https://192.168.0.21/svn/myrepo

the HTTP authentication asks me for my account name and password. After I submit the form (no matter whether the name and password are correct), I get this message:

  Server sent unexpected return value (500 Internal Server Error) in response to OPTIONS request for https://192.168.0.21/svn/myrepo

I checked httpd/errors_log, and every time I try to connect, Apache logs:

  [Thu Nov 29 22:19:45 2012] [error] [client 192.168.0.21] (13)Permission denied: Failed to load the AuthzSVNAccessFile: Can't open file '/home/svn/.svn-policy-file': Permission denied
  [Thu Nov 29 22:19:57 2012] [error] [client 192.168.0.21] (13)Permission denied: Could not open password file: /home/svn/.svn-auth-file

But here's my ls -la on /home/svn:

  -rwxrwxrwx 1 http http 40 Nov 29 16:02 .svn-auth-file
  -rwxrwxrwx 1 http http 43 Nov 29 17:58 .svn-policy-file

I don't have any idea what causes the problem; until now I have used SVN only as a client. Any help is appreciated, thanks.
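The ls output above shows only the permissions of the two files, while opening a file also requires search (execute) permission on every directory leading to it. The thread does not confirm that this is the cause here, so the following is just a diagnostic sketch: it lists the mode of each ancestor directory of a path. The function name is made up, and the check of the "others" execute bit is a simplification that applies only when the web server user is neither the owner nor in the group of the directory.

```python
import os
import stat
from pathlib import Path


def describe_ancestors(path):
    """Return (directory, mode string, others-may-traverse) tuples, root first."""
    target = Path(os.path.abspath(path))
    report = []
    # Path.parents lists the nearest directory first; reverse it so the
    # report reads from the filesystem root down to the file's directory.
    for directory in reversed(target.parents):
        mode = directory.stat().st_mode
        report.append((str(directory), stat.filemode(mode), bool(mode & stat.S_IXOTH)))
    return report


if __name__ == "__main__":
    # Path taken from the error log quoted above.
    policy_file = "/home/svn/.svn-policy-file"
    if os.path.isdir(os.path.dirname(policy_file)):
        for directory, mode, traversable in describe_ancestors(policy_file):
            print(mode, directory, "" if traversable else "<- not traversable by others")
```

If /home or /home/svn turned out to have a mode such as drwx------ and were owned by another user, that would be one plausible explanation for a "Permission denied" on files that are themselves world-readable.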
RE: SVN Tag / Branch question
I am sorry to revisit this again. Yes, we have cases where files in the same path (i.e. trunk/docs) are intended for different releases. If, as you say, we are doing something wrong, what's a better way to handle this?

In /trunk/docs:

  Release_doc_1.5
  Release_doc_1.6
  Release_doc_1.7

> Do you have different files in the same path that apply to different releases? If so, I think you are doing something wrong.

-----Original Message-----
From: Bob Archer [mailto:bob.arc...@amsi.com]
Sent: Tuesday, October 30, 2012 1:41 PM
To: Ahmed, Omair (GE Oil Gas); users@subversion.apache.org
Subject: RE: SVN Tag / Branch question

> You are correct in making the statement below. However, what's confusing is that when I copied the Docs directory from /trunk to /tags/release-1.6, the directory included files from the previous release also. Basically, I was expecting to see just the new files. I am trying to understand how that happened and how to prevent it.

I think perhaps you have a misunderstanding of how Subversion revisions work. A revision contains ALL of the files in the path, no matter which previous rev they were last changed in.

Do you have different files in the same path that apply to different releases? If so, I think you are doing something wrong. For example, you shouldn't have readme_v1.txt and then make a readme_v2.txt for a new release. You should just modify the readme.txt file accordingly and let svn keep track of which rev of that file goes with which release of your product.

You should go and review Chapters 1 and 2 of the documentation: http://svnbook.red-bean.com/en/1.7/svn-book.html

BOb

>> Also, if you released your product from a certain svn revision, aren't ALL the files in that revision part of that release version?

-----Original Message-----
From: Bob Archer [mailto:bob.arc...@amsi.com]
Sent: Tuesday, October 30, 2012 11:36 AM
To: Ahmed, Omair (GE Oil Gas); users@subversion.apache.org
Subject: RE: SVN Tag / Branch question

> Hello,
>
> We did our first release in SVN today. I used the copy command (shown below) to copy from /trunk to /tag. Since not everything in /trunk was needed for this release, I had to specify the directories which were needed.
>
> Q1. Is this the normal/correct way of doing things? For the new release, just the Docs, MKVIE and Screens dirs were needed. The others were not.

Not sure what you mean by not needed. However, you don't save anything by not just copying trunk to tag. Since svn uses cheap copies, copying the full trunk folder doesn't take any more space than copying certain folders.

Also, if you released your product from a certain svn revision, aren't ALL the files in that revision part of that release version?

> Our repo structure is as follows:
>
>   svn list https://X.X.com/svn/muxbopcs_svn/trunk/MUX
>   Control/
>   Docs/
>   MKVIE/
>   Screens/
>   sem_modbus/
>
> Q2. Are we better off using release branches instead of copying to /tags?

To svn a copy is a copy. Tags and branches are semantic names. In general a tag isn't ever committed to. But this is only by convention.

> Q3. Sometime down the line, if I had to re-create a view of Release 1.6, do I just base my workspace on what's in /tags/release-1.6? Or is there another/better way of re-creating a prior release?

I would copy the tag to a branch and work from the branch.

> Q4. I was also expecting /tags to contain just the new files for Release 1.6. However, that wouldn't be the case, right? I have a feeling I am confusing myself over nothing.

Basically, all a copy is, is a pointer to the location that it copied. So the state of the path you copy to includes everything from the source path. But, once again, it is a cheap copy, so no files are really copied.

BOb
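The advice above (copy all of trunk to a tag, and copy the tag to a branch when a release needs further work) can be sketched as two command builders. This is an illustration only: the repository URL is a placeholder, the release-1.6 naming follows the question, and the branches/ directory and the "-maint" suffix are assumptions based on common convention rather than anything stated in the thread.

```python
def tag_command(repo_url, release):
    """Build the 'svn copy' command that tags the whole trunk as a release."""
    base = repo_url.rstrip("/")
    return [
        "svn", "copy",
        f"{base}/trunk",
        f"{base}/tags/release-{release}",
        "-m", f"Tag release {release}",
    ]


def branch_from_tag_command(repo_url, release):
    """Build the 'svn copy' command that starts a work branch from a tag."""
    base = repo_url.rstrip("/")
    return [
        "svn", "copy",
        f"{base}/tags/release-{release}",
        # Assumed layout: a conventional top-level branches/ directory.
        f"{base}/branches/release-{release}-maint",
        "-m", f"Branch from tag release-{release}",
    ]


if __name__ == "__main__":
    # Placeholder URL, for illustration only.
    print(" ".join(tag_command("https://example.invalid/svn/repo", "1.6")))
```

Because both commands copy between repository URLs, they create the cheap server-side copies Bob describes, and the tag ends up containing every file that trunk contained at that revision.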
Re: Is there a way to dump the checksums from a svn repo?
Philip Martin wrote on Thu, Nov 29, 2012 at 18:24:38 +:

> olli hauer oha...@gmx.de writes:
>
>> Is there a way to dump the checksums from a svn repo?
>>
>> What I'm doing at the moment on masters and slaves is
>>
>>   $ svnadmin verify
>>
>> and
>>
>>   $ sqlite $repo/db/rep-cache.db "select hash,revision from rep_cache"
>>
>> and then additionally comparing the sqlite output from master and slaves.
>>
>> Since rep-cache is not used during read requests, it would be nice to have, for example, a parameter for svnadmin verify to output the checksums so they can be compared between master and slaves.
>>
>> Is there a way, for example via the Python/Perl API?
>>
>> Thanks for every answer and code snippet ...
>
> I did it in C but I suppose you might be able to use the Python bindings. I did
>
>   svn_fs_open()
>   svn_fs_revision_root(N)
>   svn_repos_replay2(N-1)
>
> which drove an editor from rN-1 to rN, and the editor did nothing except extract the checksum from the close_file callback.

This will only give you the precalculated checksum stored as a metadata attribute within the backend; it's not going to checksum the file on the fly to compute the actual checksum.