Free Online Subversion Training - All About Subversion Hook Scripts
svn.wandisco.com/hookscripts Hook scripts are server-side executables tied to various Subversion events. During this course, attendees will learn how to use hook scripts for: * Email notification of an event * Commit validation * Automatic backup of changes * Integration with issue trackers and other external systems. * Specific access control scenarios * And much more Broadcast Details Wed, May 19, 2010 9:00 am PDT
Re: svn commit: r943445 - in /subversion/trunk/subversion: svn/cl.h svn/log-cmd.c svn/main.c tests/cmdline/getopt_tests_data/svn_help_log_switch_stdout
On Wed, May 12, 2010 at 07:48:58AM -0400, Mark Phippard wrote: On Wed, May 12, 2010 at 7:24 AM, Stefan Sperling s...@elego.de wrote: On Wed, May 12, 2010 at 10:59:28AM -, s...@apache.org wrote: Author: stsp Date: Wed May 12 10:59:27 2010 New Revision: 943445 URL: http://svn.apache.org/viewvc?rev=943445view=rev Log: Add a new --show-diff option to svn log, fixing part of issue #2909. + SVN_ERR(svn_client_diff5(diff_options, + lb-target_url, + start_revision, + lb-target_url, + end_revision, + NULL, + svn_depth_infinity, + FALSE, /* ignore ancestry */ + TRUE, /* no diff deleted */ + FALSE, /* show copies as adds */ + FALSE, /* ignore content type */ BTW, I deliberately hard-coded the above options in the initial implementation. We could decide to leave it like this, given that svn log is not the primary tool to produce variants of diffs. I set these options with the goal of getting the shortest possible diff shown, with the exception of ignore_ancestry. I don't want to pee in the punch bowl, but why are we doing this? Because it has proven to be a popular feature in other systems. It's very convenient if you want to catch up with the latest N commits made and skim the diffs. Especially if there aren't any commit list archives (many, many shops do not have those). And I have missed it in svn sometimes, too. Also, Daniel N. and Bert have said they will use it. Especially if we cannot implement it sufficiently? What do mean by sufficiently? We can certainly implement it sufficiently. You ask svn log to show diffs and it shows them. What's not sufficient about that? If you mean efficient, Bert pointed out that it probably won't be possible at all to implement it without opening a second RA session. I guess we could have the server send a diff to the previous revision along with every log item. But that would require a lot more changes. How fast it is depends on the connection to the server, as with so many things in svn. I suggest you try it out and see for yourself. svn log --diff -l 10 http://svn.apache.org/repos/asf/subversion And it's completely optional. The extra overhead isn't there if you don't use the option. In terms of the implementation, how are you handling the log output in XML when the user asks for diffs? We error out with an informative error message: $ svn log --diff --xml svn: Try 'svn help' for more info svn: 'show-diff' option is not supported in XML mode $ Stefan
Re: svn commit: r943445 - in /subversion/trunk/subversion: svn/cl.h svn/log-cmd.c svn/main.c tests/cmdline/getopt_tests_data/svn_help_log_switch_stdout
On Wed, May 12, 2010 at 8:00 AM, Stefan Sperling s...@elego.de wrote: I don't want to pee in the punch bowl, but why are we doing this? Because it has proven to be a popular feature in other systems. It's very convenient if you want to catch up with the latest N commits made and skim the diffs. Especially if there aren't any commit list archives (many, many shops do not have those). And I have missed it in svn sometimes, too. Also, Daniel N. and Bert have said they will use it. It would also be easy to script and that has always been our position on adding options like this. Especially if we cannot implement the option internally more efficiently than the script. Especially if we cannot implement it sufficiently? What do mean by sufficiently? We can certainly implement it sufficiently. You ask svn log to show diffs and it shows them. What's not sufficient about that? Sorry, I meant efficiently as you guessed. -- Thanks Mark Phippard http://markphip.blogspot.com/
RE: svn commit: r943445 - in /subversion/trunk/subversion: svn/cl.h svn/log-cmd.c svn/main.c tests/cmdline/getopt_tests_data/svn_help_log_switch_stdout
-Original Message- From: s...@apache.org [mailto:s...@apache.org] Sent: woensdag 12 mei 2010 12:59 To: comm...@subversion.apache.org Subject: svn commit: r943445 - in /subversion/trunk/subversion: svn/cl.h svn/log-cmd.c svn/main.c tests/cmdline/getopt_tests_data/svn_help_log_switch_stdout Author: stsp Date: Wed May 12 10:59:27 2010 New Revision: 943445 URL: http://svn.apache.org/viewvc?rev=943445view=rev Log: Add a new --show-diff option to svn log, fixing part of issue #2909. svn log --show-diff makes the svn log command provide inline diff output. This is a fairly naive implementation, done entirely within the CLI client. A separate RA session is opened to retrieve the diff. It behaves more or less like a series of svn log; svn diff invocations. This means first and foremost that it is a bit slow. But it works. Thanks for doing this in svn instead of changing the APIs for a CLI client specific feature. :) Bert
Re: svn commit: r943460 - /subversion/trunk/subversion/libsvn_wc/update_editor.c
On Wed, May 12, 2010 at 07:58, julianf...@apache.org wrote: ... +++ subversion/trunk/subversion/libsvn_wc/update_editor.c Wed May 12 11:58:16 2010 @@ -4717,27 +4717,38 @@ close_file(void *file_baton, SVN_ERR(svn_checksum_parse_hex(expected_md5_checksum, svn_checksum_md5, expected_md5_digest, pool)); - /* Was this an add-with-history, with no apply_textdelta? */ - if (fb-added_with_history ! fb-received_textdelta) + /* Retrieve the new text-base file's path and checksums. If it was an + * add-with-history, with no apply_textdelta, then that means the text-base + * of the copied file, else the new text-base created by apply_textdelta(), + * if any. */ Style nit: all but one the comments in close_file() do not use the leading-asterisk format for comments (the other is one you added :-P ). + if (fb-received_textdelta) { - SVN_ERR_ASSERT(! fb-new_text_base_tmp_abspath - fb-copied_text_base_abspath); - + new_text_base_md5_checksum = fb-new_text_base_md5_checksum; + new_text_base_sha1_checksum = fb-new_text_base_sha1_checksum; + new_text_base_abspath = fb-new_text_base_tmp_abspath; + SVN_ERR_ASSERT(new_text_base_md5_checksum + new_text_base_sha1_checksum + new_text_base_abspath); Style nit: pls move the operators down to the beginning of the next line. It is easier to read how the terms are joined when the operator leads, rather than follows (see the ASSERT that you removed a few lines above). + } + else if (fb-added_with_history) + { + SVN_ERR_ASSERT(! fb-new_text_base_tmp_abspath); new_text_base_md5_checksum = fb-copied_text_base_md5_checksum; new_text_base_sha1_checksum = fb-copied_text_base_sha1_checksum; new_text_base_abspath = fb-copied_text_base_abspath; - SVN_ERR_ASSERT(svn_dirent_is_absolute(new_text_base_abspath)); + SVN_ERR_ASSERT(new_text_base_md5_checksum + new_text_base_sha1_checksum + new_text_base_abspath); Same. } else { - /* Pull the actual checksum from the file_baton, computed during - the application of a text delta. */ - new_text_base_md5_checksum = fb-new_text_base_md5_checksum; - new_text_base_sha1_checksum = fb-new_text_base_sha1_checksum; - new_text_base_abspath = fb-new_text_base_tmp_abspath; + SVN_ERR_ASSERT(! fb-new_text_base_tmp_abspath + ! fb-copied_text_base_abspath); Good :-) ... Cheers, -g
WC-NG: Perhaps it's time to write a new summary of where we stand at the moment?
Hi! Greg [1] identified the following items in the beginning of February. * remove entry_t usage within libsvn_wc/client * remove access_t usage withing libsvn_wc/client * move props into wc.db * move to single/root wc.db * switch to new pristine file mgmt I have the feeling, we've made some progress. Are there any new identifiable items for wc-ng completion? The WC-NG meta issue [2] contains only one dependency, #3586. Perhaps we could add some more to make it easier to track the progress? Cheers, Daniel (Hope I'm not too forward asking these questions) [1] http://svn.haxx.se/dev/archive-2010-02/0286.shtml [2] http://subversion.tigris.org/issues/show_bug.cgi?id=3357
RE: Possible security problem with svnsync?
I have a repository that is partially mirrored, using svnsync and mod_authz_svn [1]. I just realised that the administrator of the mirror server can bypass the authz rules I've set up on the master server. All he has to do is change the svn:sync-from-url property on the mirror repository to be a file:// URL to the source repository, rather than a http:// one. The correct file:// URL is probably guessable. Well, this has nothing to do with svnsync then does it? If you expose the repository file system then yes anyone can access it bypassing the server. Even with svn.exe it can be done. you should use FS/Network permission so that your repositories are only available via your server (http or svn protocols). BOb
RE: Possible security problem with svnsync?
Hi, Bob Archer wrote: Jon Foster wrote: I have a repository that is partially mirrored, using svnsync and mod_authz_svn [1]. I just realised that the administrator of the mirror server can bypass the authz rules I've set up on the master server. All he has to do is change the svn:sync-from-url property on the mirror repository to be a file:// URL to the source repository, rather than a http:// one. The correct file:// URL is probably guessable. Well, this has nothing to do with svnsync then does it? If you expose the repository file system then yes anyone can access it bypassing the server. Even with svn.exe it can be done. you should use FS/Network permission so that your repositories are only available via your server (http or svn protocols). I'm not exposing the repository file system to users, and I'm not giving shell access to users. The only way a user can access this server is via Apache. However, svnsync is started by the post-commit hook. (This is the recommended svnsync setup, as far as I can tell). This means that svnsync is running on the server, as the apache user, which gives it a lot of permissions - including the ability to directly access the repository files. The problem is that svnsync trusts the mirror server to give it the correct source URL. Kind regards, Jon Foster ** This email and its attachments may be confidential and are intended solely for the use of the individual to whom it is addressed. Any views or opinions expressed are solely those of the author and do not necessarily represent those of Cabot Communications Ltd. If you are not the intended recipient of this email and its attachments, you must take no action based upon them, nor must you copy or show them to anyone. Cabot Communications Limited Verona House, Filwood Road, Bristol BS16 3RY, UK +44 (0) 1179584232 Co. Registered in England number 02817269 Please contact the sender if you believe you have received this email in error. ** __ This email has been scanned by the MessageLabs Email Security System. For more information please visit http://www.messagelabs.com/email __
Re: Possible security problem with svnsync?
Peter Samuelson wrote: [Jon Foster] All he has to do is change the svn:sync-from-url property on the mirror repository to be a file:// URL to the source repository, rather than a http:// one. The correct file:// URL is probably guessable. I'd never thought of this as as security problem, but I _do_ think it's a suboptimal design where a svnsync setup stores state on the mirrored repository which is relative not to the mirror, but to whoever is running svnsync. Please can we change svnsync sync to allow both the source and target URLs to be specified? That rather simple measure would block this attack. Since svnsync is usually invoked from a script, typing the extra URL isn't a problem. Yes, this sounds like a good design anyway, aside from the security question. I'm coding right now along these lines. -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand signature.asc Description: OpenPGP digital signature
Re: svn commit: r943574 - in /subversion/trunk/subversion/libsvn_wc: update_editor.c wc_db.c wc_db.h
On Wed, May 12, 2010 at 12:41, julianf...@apache.org wrote: ... +++ subversion/trunk/subversion/libsvn_wc/wc_db.c Wed May 12 16:41:18 2010 @@ -3767,6 +3767,49 @@ svn_wc__db_temp_op_remove_entry(svn_wc__ } +svn_error_t * +svn_wc__db_temp_op_remove_working(svn_wc__db_t *db, + const char *local_abspath, + apr_pool_t *scratch_pool) +{ + svn_wc__db_pdh_t *pdh; + svn_sqlite__stmt_t *stmt; + svn_sqlite__db_t *sdb; + apr_int64_t wc_id; + const char *local_relpath; + + SVN_ERR_ASSERT(svn_dirent_is_absolute(local_abspath)); + + SVN_ERR(parse_local_abspath(pdh, local_relpath, db, local_abspath, + svn_sqlite__mode_readwrite, + scratch_pool, scratch_pool)); + VERIFY_USABLE_PDH(pdh); + + flush_entries(pdh); + + /* Check if we should remove it from the parent db instead */ Instead? It should be removing it from both places. ... + sdb = pdh-wcroot-sdb; + wc_id = pdh-wcroot-wc_id; Precedent is to just spell it out in the usage, rather than have local variables for this. ... Cheers, -g
Re: What are the plans for svn_wc_status3_t?
On Wed, May 12, 2010 at 3:10 PM, Daniel Näslund dan...@longitudo.com wrote: Hi! I set out to remove the entry field in svn_wc_status3_t. The reasons was i) svn_wc_entry_t is deprecated and ii) fetching all the info contained in the entry field is slow and we want 'svn status' to be fast! So far this has been done: -- * The revision information has been moved out to its own fields. The idea is that those fields should be clearly defined, contrary to the entry ones. * entry-changelist information is fetched with a node func in the caller. Or you can just read it with _read_info() with all other information * Added a field, status-conflicted. The caller is supposed to find out what conflict it is. * Added a field status-versioned. The caller can use this one instead of the presence of status-entry for determining if a path is versioned. What are the plans for... --- * status-text_status. Should it be replaced with a boolean indicating if there are text modifications? * status-prop_status. Should it be replaced with a boolean indicating if there are props modifications? I would keep these working, or replace them with a similar wc-ng status, but keep the status if we also see a conflict or other high priority status. (In status2_t you loose status_modified if you have a conflict) You call status to see if a node is locally modified, so you need this info in one format or another * status-copied. It could be a bit expensive to calculate but only returning status_added and have the API user determine if it's copied feels like a leaky abstraction to me. My take is that all API users wants to know if the node is copied. * status-switched. Everyone probably wants to know that too. We need the url's to determine if the node is switched which brings me to... * status-url. It is not used in svn/status{,-cmd}.c. How many other callers use it? If we keep status-switched, we still have to fetch the url though. If we have it for free we can just pass it to users. (I know it is used by many clients). You need the url of a node and its parent to determine if a node is switched, so I would keep both in the structure. (In AnkhSVN I'm really not that interested in switched.. I just want to know if a node is commit/diff/revertable.) You should also add a node kind to the status structure. (Status without node kind requires looking at disk). And probably depth too as you often need that to process the result. (And you have both available anyway) Bert * svn_wc_status_kind. Should it be revved to map to svn_wc__db_status_kind_t in some other way. We've already made the svn_wc_status_conflicted alternative obsolute (atleast for svn_wc_status3_t. I haven't checked the other uses of svn_wc_status_kind). Cheers, Daniel
Re: WC-NG: Perhaps it's time to write a new summary of where we stand at the moment?
On Wed, May 12, 2010 at 09:33, Daniel Näslund dan...@longitudo.com wrote: Hi! Greg [1] identified the following items in the beginning of February. * remove entry_t usage within libsvn_wc/client This is very close in the client, but there is still a good chunk of work in libsvn_wc. * remove access_t usage withing libsvn_wc/client This looks to be done. libsvn_client has no more uses/references. There are still some references in libsvn_wc, but that appears to be for backwards compatibility only. Hyrum was doing some work to clarify access_t uses in libsvn_wc. Not sure where he is with that. * move props into wc.db Should happen within a week. * move to single/root wc.db No plans yet. We probably need to remove entry_t uses before tackling this. * switch to new pristine file mgmt Julian is working on this. I believe it is getting close for new working copies, but we do not have a migration/upgrade story yet. I have the feeling, we've made some progress. Are there any new identifiable items for wc-ng completion? We have work items which perform database work inside *and* perform file operations. These need to be turned inside-out to perform the db operation and queue file updates within one sqlite txn. There are a number of places where we perform several db operations outside of a sqlite txn, which means failure could leave us in a corrupted state. See svn_wc_add_repos_file4 for an example (note: other than that, the function is now at our wc-ng ideal design). The WC-NG meta issue [2] contains only one dependency, #3586. Perhaps we could add some more to make it easier to track the progress? Meh. I don't use the tracker as a to-do list. The set of work is too fluid for something as rigid as that. Cheers, Daniel (Hope I'm not too forward asking these questions) It's always fine! [1] http://svn.haxx.se/dev/archive-2010-02/0286.shtml [2] http://subversion.tigris.org/issues/show_bug.cgi?id=3357 Cheers, -g
Re: WC-NG: Perhaps it's time to write a new summary of where we stand at the moment?
On Wed, May 12, 2010 at 12:14 PM, Greg Stein gst...@gmail.com wrote: On Wed, May 12, 2010 at 09:33, Daniel Näslund dan...@longitudo.com wrote: Hi! Greg [1] identified the following items in the beginning of February. * remove entry_t usage within libsvn_wc/client This is very close in the client, but there is still a good chunk of work in libsvn_wc. * remove access_t usage withing libsvn_wc/client This looks to be done. libsvn_client has no more uses/references. There are still some references in libsvn_wc, but that appears to be for backwards compatibility only. Hyrum was doing some work to clarify access_t uses in libsvn_wc. Not sure where he is with that. I started working in it, then spent a week at a conference, and then had my laptop motherboard get toasted. I'm hoping to have the hardware back by week's end, and then dive back in next week. -Hyrum
Re: Optimizing back-end: saving cycles or saving I/O (was: Re: [PATCH v2] Saving a few cycles, part 3/3)
On Tue, May 11, 2010 at 6:16 PM, Johan Corveleyn jcor...@gmail.com wrote: On Tue, May 11, 2010 at 1:56 PM, Stefan Sperling s...@elego.de wrote: On Tue, May 11, 2010 at 07:43:33AM -0400, Mark Phippard wrote: On Tue, May 11, 2010 at 7:27 AM, Stefan Sperling s...@elego.de wrote: On Tue, May 11, 2010 at 01:36:26AM +0200, Johan Corveleyn wrote: As I understand your set of patches, you're mainly focusing on saving cpu cycles, and not on avoiding I/O where possible (unless I'm missing something). Maybe some of the low- or high-level algorithms in the back-end can be reworked a bit to reduce the amount of I/O? Or maybe some clever caching can avoid some file accesses? In general, I think trying to work around I/O slowness by loading stuff into RAM (caching) is a bad idea. You're just taking away memory from the OS buffer cache if you do this. A good buffer cache in the OS should make open/close/seek fast. (So don't run a windows server if you can avoid it.) The only point where it's worth thinking about optimizing I/O access is when you get to clustered, distributed storage, because at that point every I/O request translated into a network packet. You had me until that last part. I think we should ALWAYS be thinking about optimizing I/O. I have little doubt that is where the biggest performance bottlenecks live (other than network of course). I agree that making a big cache is probably not the best way to go, but I think we should always be looking for optimizations where we avoid repeated open/closes that are not necessary. That's true. Avoiding repeated open/close of the same file is a good optimisation. Even with a good buffer cache it will make a difference. So s/The only point/One point/ :) Yes, some form of caching may or may not be a good approach, but the main point is that ideally, for a certain client request, every interesting rev file should be opened and read exactly once. Currently this is definitely not the case (for svn log it's closer to 10 opens/closes and 5 times the amount of bytes of every rev file involved; with packed revs it's even worse because of the extra lookup of the rev offset in the pack manifest file). Maybe the ideal situation is currently impossible because of the higher level algorithms (retrieving the data in a certain way). So with some clever caching I really meant read the rev file (or the interesting parts of it) exactly once, and keep that in memory for those 10 other accesses you need (which follow very shortly), then forget about it. Not some sort of LRU cache or something. But I really don't know whether this is a good idea (it may be difficult to determine when you don't need it anymore, ...). Just guessing ... In my book, I/O is almost always one of the slowest parts, even if the data is on a local 15k rpm or even SSD disk. Since SVN with FSFS does so much I/O with potentially thousands of little files for a single client request, I think it could pay off big time to try to reduce it as much as possible. I remember trying to do some caching of open file handles back when working on the packing code, but it turned out that multiple layers of the FSFS stack were using the same revision file for different purposes *simultaneously*. By attempting to reuse the open file handles, one function would seek to a different part of the file, while a function higher up in the stack understandably didn't expect the file pointer to have changed. It made things...confusing. There may be other ways of caching this information, which would be great. But my experience with wc-ng leads me to ask if we can use an existing piece of tech, rather than having to implement Yet Another Cache. (Caching is a well known problem with many good solutions, let's not reinvent the wheel.) This conversation about I/O also reminds me of Jon Trowbridge's comments about rewriting libsvn_fs_bigtable: the FS layer pretends that all I/O is essentially free, which it is obviously not. To make matters worse, the disparity between disk speeds and CPU speeds has increased in the last 10 years, making this assumption even less valid. -Hyrum
First SVN performance data
Hi there, as I promised, I'm going to conduct some in-depth analysis and comprehensive SVN performance testing. That is very time-consuming process. However, it seems that many people have incorrect or outdated ideas about the current state of affairs. To add a bit more substance to the discussion, I like to present some preliminary data and not to wait until I collected all data I intend to. Side note: Maybe, these numbers make it clearer why my patches should be committed after review. Bottom line: * SVN servers tend to be CPU-limited (we already observed that problem @ our company with SVN 1.4) * packed repositories are ~20% faster than non-packed, non-sharded * optimal file cache size is roughly /trunk size (plus branch diffs, but that is yet to be quantified) * cold I/O from a low-latency source takes 2 .. 3 times as long as from cached data * a fully patched 1.7 server is twice as fast as 1.6.9 Export has been chosen to eliminate problems with client-side w/c performance. Please note that all measurements were taken in a true client/server setup. You can achieve similar performance in low-latency broad-band networks. -- Stefan^2. Test system --- 2x XEON 5550 (8 cores total), 2.66GHz, hyper-threading disabled, turbo boost disabled 24GB RAM 4x128GB cheapo ssd on RAID-0 controller w/ 256 MB cache LINUX 2.6.28-18-generic SMP 64 bits repositories on ext4 export to tempfs Repository mirrors used --- tsvn : mirror of tortoisesvn.tigris.org repository, non-sharded, non-packed SVN 1.5 format export of /tr...@head tsvn_packed: mirror of tortoisesvn.tigris.org repository, packed SVN 1.6 format export of /tr...@head apache.org : mirror of apache.org repository, packed SVN 1.6 format export of /subversion/tr...@head kde.org: mirror of kde.org repository, packed SVN 1.6 format export of /tr...@head criterion tsvn tsvn_packed apache.org kde.org repo size 5.3G 4.8G 30.4G58.4G revisions 18,937 18,937943,322 1,125,690 export items 3,840 3,840 1,941399,419 export size 81M81M 38M 8.4G Export using the old server --- ~/subversion-1.6.9/subversion/svnserve/svnserve -d -T -r /mnt/archive/svnroot/ time svn-1.6.9 export --ignore-externals -q svn://localhost/$TOEXPORT /dev/shm/t 1st run tsvn tsvn_packed apache.org kde.org real 0m10.242s 0m9.774s 0m7.473s 20m4.354s user 0m2.280s 0m2.372s 0m1.104s 2m39.870s sys 0m0.428s 0m0.424s 0m0.236s 0m35.590s 2nd run tsvn tsvn_packed apache.org kde.org real 0m5.494s 0m4.907s 0m3.045s 11m4.798s user 0m2.416s 0m2.484s 0m1.136s 2m45.506s sys 0m0.352s 0m0.328s 0m0.216s 0m37.718s Export using the new server + patches, wire-compression off --- ~/subversion-1.7.patched/subversion/svnserve/svnserve -d -T -r /mnt/archive/svnroot/ time svn-1.6.9 export --ignore-externals -q svn://localhost/$TOEXPORT /dev/shm/t 1st run tsvn tsvn_packed apache.org kde.org real 0m6.453s 0m5.817s 0m6.119s 13m22.418s user 0m1.572s 0m1.680s 0m0.784s 1m42.710s sys 0m0.544s 0m0.572s 0m0.252s 0m40.351s 2nd run tsvn tsvn_packed apache.org kde.org real 0m2.520s 0m2.092s 0m1.792s 5m45.944s user 0m1.360s 0m1.228s 0m0.796s 1m38.662s sys 0m0.400s 0m0.392s 0m0.224s 0m36.386s Resource usage -- These numbers are for the patched server criterion tsvn tsvn_packed apache.org kde.org cache usage 97M106M 61M 10.0G Typical CPU load during 1st KDE export (100% == 1 core) ~70 .. 100% CPU load (svnserve) ~20 .. 40% CPU load (svn) ~10 .. 20% sys Typical CPU load during 2nd KDE export (100% == 1 core) ~100% CPU load (svnserve) ~20 .. 70% CPU load (svn) ~15 .. 30% sys
Commiting, tree conflicts and keep-local
The only thing keeping me from finally removing svn_wc_entry_t from harvest_committables is keep_local. There was some discussion on IRC a few days ago http://colabti.org/irclogger/irclogger_log/svn-dev?date=2010-05-06#l44 but it really resolve things. It includes pointers to Neels patch to make keep-local remove tree conflicts. Tree conflicts generally prevent a commit, however if the conflict is inside a directory that has been deleted with svn rm --keep-local then the commit will be allowed. So rm -rf repo wc svnadmin create repo svn mkdir -mm file://`pwd`/repo/A svn mkdir -mm file://`pwd`/repo/A/B svn co -r1 file://`pwd`/repo wc svn cp file://`pwd`/repo/A wc/A/B svn up wc # creates tree conflict for A/B svn ci -mm wc # fails because of tree conflict svn rm wc/foo # fails because of tree conflict svn rm --keep-local wc/A # succeeds svn ci -mm wc # succeeds When A is rm'd the tree conflict is sort of still present, it shows up in status, but it's all a bit dodgy. After the commit the unversioned directory A/B still contains a .svn directory. If I revert, rather than commit, then wc/A/B becomes unversioned. In wc-metadata.sql there is a comment about working_node.keep_local being temporary. My immediate problem is how to replace entry-keep_local in harvest_committables. Two options are: 1. XFAIL tree_conflicts 17. 2. Add svn_wc__db_temp_get_keep_local. 3. Use Neels' patch to make rm --keep-local/--force remove tree conflicts. Any other ideas? -- Philip
Re: Possible security problem with svnsync?
C. Michael Pilato wrote: Peter Samuelson wrote: [Jon Foster] All he has to do is change the svn:sync-from-url property on the mirror repository to be a file:// URL to the source repository, rather than a http:// one. The correct file:// URL is probably guessable. I'd never thought of this as as security problem, but I _do_ think it's a suboptimal design where a svnsync setup stores state on the mirrored repository which is relative not to the mirror, but to whoever is running svnsync. Please can we change svnsync sync to allow both the source and target URLs to be specified? That rather simple measure would block this attack. Since svnsync is usually invoked from a script, typing the extra URL isn't a problem. Yes, this sounds like a good design anyway, aside from the security question. I'm coding right now along these lines. By the way, I'm tracking this is issue #3637[1]. The proposed solution has been committed to trunk. [1] http://subversion.tigris.org/issues/show_bug.cgi?id=3637 -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand signature.asc Description: OpenPGP digital signature
Re: Commiting, tree conflicts and keep-local
Temp function. It can be used in entries.c, too. It'll only last until single-db. Not blasting conflict info is good, in case of reverts. On May 12, 2010 2:17 PM, Philip Martin philip.mar...@wandisco.com wrote: The only thing keeping me from finally removing svn_wc_entry_t from harvest_committables is keep_local. There was some discussion on IRC a few days ago http://colabti.org/irclogger/irclogger_log/svn-dev?date=2010-05-06#l44 but it really resolve things. It includes pointers to Neels patch to make keep-local remove tree conflicts. Tree conflicts generally prevent a commit, however if the conflict is inside a directory that has been deleted with svn rm --keep-local then the commit will be allowed. So rm -rf repo wc svnadmin create repo svn mkdir -mm file://`pwd`/repo/A svn mkdir -mm file://`pwd`/repo/A/B svn co -r1 file://`pwd`/repo wc svn cp file://`pwd`/repo/A wc/A/B svn up wc # creates tree conflict for A/B svn ci -mm wc # fails because of tree conflict svn rm wc/foo # fails because of tree conflict svn rm --keep-local wc/A # succeeds svn ci -mm wc # succeeds When A is rm'd the tree conflict is sort of still present, it shows up in status, but it's all a bit dodgy. After the commit the unversioned directory A/B still contains a .svn directory. If I revert, rather than commit, then wc/A/B becomes unversioned. In wc-metadata.sql there is a comment about working_node.keep_local being temporary. My immediate problem is how to replace entry-keep_local in harvest_committables. Two options are: 1. XFAIL tree_conflicts 17. 2. Add svn_wc__db_temp_get_keep_local. 3. Use Neels' patch to make rm --keep-local/--force remove tree conflicts. Any other ideas? -- Philip
Re: [PATCH] readable error messages on non-ASCII systems
ooops, I meant to cc: dev at subversion. Thanks, Daniel. I'm more used to dev at httpd where there is some magic involving replyto: Here is what I see if I run dbx against svnversion with a stop set in svn_cmdline_fputs: [2] stopped in svn_cmdline_fputs at line 288 in file cmdline.c ($t1) 288 svn_cmdline_fputs(const char *string, FILE* stream, apr_pool_t *pool) (dbx64) where svn_cmdline_fputs(string = svn: Valid UTF-8 data.(hex: 2e 61 4b).followed by invalid UTF-8 sequence.(hex: a2 a5 95 61)., stream = 0xD6C84F8, pool = 0xE4E4040), line 288 in cmdline.c svn_cmdline_fprintf(stream = 0xD6C84F8, pool = 0xE4E4040, fmt = %s%s., ...), line 284 in cmdline.c print_error.$b20, line 332 in error.c print_error(err = 0xE4E4078, stream = 0xD6C84F8, prefix = svn: ), line 332 in error.c svn_handle_error2.$b25.$b29, line 405 in error.c svn_handle_error2.$b25, line 405 in error.c svn_handle_error2(err = 0xE4E4078, stream = 0xD6C84F8, fatal = 0, prefix = svn: ), line 405 in error.c main.$b17.$b18, line 216 in main.c main.$b17, line 216 in main.c main(argc = 1, argv = 0xDF76878), line 216 in main.c Any strings that are printable with z/OS dbx where or p are in the native EBCDIC encoding. If I stop in svn_handle_error2 and print the err structure, I see: (dbx64) p *err (apr_err = 121, message = Valid UTF-8 data.(hex: 2e 61 4b).followed by invalid UTF-8 sequence.(hex: a2 a5 95 61), child = 0x0, pool = 0xE4E4040, file = ./subversion/libsvn_subr/utf.c, line = 632) ...so we have native strings here which never become UTF-8. I could patch print_error() to do the UTF-8 conversion prior to calling svn_cmdline_fputs(), but the back-to-back conversions seem silly. Maybe it would be better to define svn_cmdline_fputs_native_cstring() or some such and call that from print_error() and any other caller that passes native strings. Greg On Wed, May 12, 2010 at 10:42 AM, Greg Ames ames.g...@gmail.com wrote: On Wed, May 12, 2010 at 12:59 AM, Daniel Shahaf d...@daniel.shahaf.namewrote: Greg Ames wrote on Tue, 11 May 2010 at 19:36 -0400: The error messages are in the native code page to start with, so running them through a UTF-8 - native conversion doesn't do anything helpful. ... Index: subversion/libsvn_subr/cmdline.c === --- subversion/libsvn_subr/cmdline.c(revision 943316) +++ subversion/libsvn_subr/cmdline.c(working copy) @@ -318,24 +318,15 @@ svn_error_t * svn_cmdline_fputs(const char *string, FILE* stream, apr_pool_t *pool) { - svn_error_t *err; - const char *out; + /* string is native. do not try to convert from UTF-8 */ The doc string of this function (see subversion/include/svn_cmdline.h) specifically promises that it'll do conversion from UTF-8. ok, but a) that's not appropriate for error messages b) it's not enforced. We cannot make it unconditionally do the opposite. I have done exactly that with good results (Perhaps with suitable #ifdef's we could do it; or perhaps your problem can be fixed elsewhere (e.g., the error-printing code).) The SVN_ERR() macro and supporting functions produce native strings, not UTF-8, and they are widely used. Is your issue only with the encoding of error messages? This patch addresses only the encoding of error messages. There are a few other places where there is confusion about the encoding of input or literals. Or with the the encoding of all svn output? I think it's a great idea to have svn metadata and text files in the repository in UTF-8 to promote universal access. But error messages are local and shouldn't be munged much or Bad Things can happen. Yes, someone could inject code after SVN_ERR() to convert all the literal strings and characters in error messages throughout subversion to UTF-8. But what's the point of doing that then converting it back to native to write to stderr? And what are the odds of picking up 100% of the literal strings and characters and doing exactly one UTF-8 conversion on all of them prior to calling svn_cmdline_fputs()? Simplicity is good, especially in error situations, and it saves a few cycles on non-UTF-8 systems. Greg
Re: [PATCH] readable error messages on non-ASCII systems
You should look at this now deleted branch for Subversion on EBCDIC systems. http://svn.apache.org/viewvc/subversion/branches/ebcdic/?pathrev=876004 It was written for AS/400, not z/OS but examining the diff of this branch and what it was synched with will help get an idea on where to look. The branch was deleted in r876005. So be sure to use a peg-revision of 876004 when looking at the repository. On Wed, May 12, 2010 at 2:59 PM, Greg Ames ames.g...@gmail.com wrote: ooops, I meant to cc: dev at subversion. Thanks, Daniel. I'm more used to dev at httpd where there is some magic involving replyto: Here is what I see if I run dbx against svnversion with a stop set in svn_cmdline_fputs: [2] stopped in svn_cmdline_fputs at line 288 in file cmdline.c ($t1) 288 svn_cmdline_fputs(const char *string, FILE* stream, apr_pool_t *pool) (dbx64) where svn_cmdline_fputs(string = svn: Valid UTF-8 data.(hex: 2e 61 4b).followed by invalid UTF-8 sequence.(hex: a2 a5 95 61)., stream = 0xD6C84F8, pool = 0xE4E4040), line 288 in cmdline.c svn_cmdline_fprintf(stream = 0xD6C84F8, pool = 0xE4E4040, fmt = %s%s., ...), line 284 in cmdline.c print_error.$b20, line 332 in error.c print_error(err = 0xE4E4078, stream = 0xD6C84F8, prefix = svn: ), line 332 in error.c svn_handle_error2.$b25.$b29, line 405 in error.c svn_handle_error2.$b25, line 405 in error.c svn_handle_error2(err = 0xE4E4078, stream = 0xD6C84F8, fatal = 0, prefix = svn: ), line 405 in error.c main.$b17.$b18, line 216 in main.c main.$b17, line 216 in main.c main(argc = 1, argv = 0xDF76878), line 216 in main.c Any strings that are printable with z/OS dbx where or p are in the native EBCDIC encoding. If I stop in svn_handle_error2 and print the err structure, I see: (dbx64) p *err (apr_err = 121, message = Valid UTF-8 data.(hex: 2e 61 4b).followed by invalid UTF-8 sequence.(hex: a2 a5 95 61), child = 0x0, pool = 0xE4E4040, file = ./subversion/libsvn_subr/utf.c, line = 632) ...so we have native strings here which never become UTF-8. I could patch print_error() to do the UTF-8 conversion prior to calling svn_cmdline_fputs(), but the back-to-back conversions seem silly. Maybe it would be better to define svn_cmdline_fputs_native_cstring() or some such and call that from print_error() and any other caller that passes native strings. Greg On Wed, May 12, 2010 at 10:42 AM, Greg Ames ames.g...@gmail.com wrote: On Wed, May 12, 2010 at 12:59 AM, Daniel Shahaf d...@daniel.shahaf.namewrote: Greg Ames wrote on Tue, 11 May 2010 at 19:36 -0400: The error messages are in the native code page to start with, so running them through a UTF-8 - native conversion doesn't do anything helpful. ... Index: subversion/libsvn_subr/cmdline.c === --- subversion/libsvn_subr/cmdline.c (revision 943316) +++ subversion/libsvn_subr/cmdline.c (working copy) @@ -318,24 +318,15 @@ svn_error_t * svn_cmdline_fputs(const char *string, FILE* stream, apr_pool_t *pool) { - svn_error_t *err; - const char *out; + /* string is native. do not try to convert from UTF-8 */ The doc string of this function (see subversion/include/svn_cmdline.h) specifically promises that it'll do conversion from UTF-8. ok, but a) that's not appropriate for error messages b) it's not enforced. We cannot make it unconditionally do the opposite. I have done exactly that with good results (Perhaps with suitable #ifdef's we could do it; or perhaps your problem can be fixed elsewhere (e.g., the error-printing code).) The SVN_ERR() macro and supporting functions produce native strings, not UTF-8, and they are widely used. Is your issue only with the encoding of error messages? This patch addresses only the encoding of error messages. There are a few other places where there is confusion about the encoding of input or literals. Or with the the encoding of all svn output? I think it's a great idea to have svn metadata and text files in the repository in UTF-8 to promote universal access. But error messages are local and shouldn't be munged much or Bad Things can happen. Yes, someone could inject code after SVN_ERR() to convert all the literal strings and characters in error messages throughout subversion to UTF-8. But what's the point of doing that then converting it back to native to write to stderr? And what are the odds of picking up 100% of the literal strings and characters and doing exactly one UTF-8 conversion on all of them prior to calling svn_cmdline_fputs()? Simplicity is good, especially in error situations, and it saves a few cycles on non-UTF-8 systems. Greg -- Thanks Mark Phippard http://markphip.blogspot.com/
Re: Optimizing back-end: saving cycles or saving I/O (was: Re: [PATCH v2] Saving a few cycles, part 3/3)
On Wed, 2010-05-12 at 13:44 -0400, Hyrum K. Wright wrote: There may be other ways of caching this information, which would be great. Maybe. Caches add complexity, and too many layers of caching (disk, OS, application) can actually reduce performance in some scenarios. I would suggest getting a better understanding of why this operation is slow. Why is svn log opening each rev file ten times? Is this intrinsically necessary? Going straight to optimizing the overused low-level operations can provide a noticeable performance benefit, but fixing the inefficient high-level algorithms is how you can turn minutes into microseconds.
Re: [PATCH] readable error messages on non-ASCII systems
On Wed, May 12, 2010 at 3:09 PM, Mark Phippard markp...@gmail.com wrote: You should look at this now deleted branch for Subversion on EBCDIC systems. http://svn.apache.org/viewvc/subversion/branches/ebcdic/?pathrev=876004 It was written for AS/400, not z/OS but examining the diff of this branch and what it was synched with will help get an idea on where to look. will do, thanks! Greg
Re: svn commit: r943609 - in /subversion/trunk/subversion: libsvn_wc/props.c tests/cmdline/special_tests.py
Daniel Shahaf wrote on Wed, 12 May 2010 at 23:19 +0300: It only affects symlinks being added on Windows And even then, what exactly is the effect? What code exactly is going to bother noticing that the property's value is not *, given that we promise it to always be * whenever the property is present? :-) Daniel (I've always wondered why symlinks are represented as metadata = { 'svn:special' : '*' } data = link $target rather than as metadata = { 'svn:special' : 'link' } data = $target .)
Re: svn commit: r943609 - in /subversion/trunk/subversion: libsvn_wc/props.c tests/cmdline/special_tests.py
On Wed, May 12, 2010 at 16:34, Daniel Shahaf d...@daniel.shahaf.name wrote: Daniel Shahaf wrote on Wed, 12 May 2010 at 23:19 +0300: It only affects symlinks being added on Windows And even then, what exactly is the effect? What code exactly is going to bother noticing that the property's value is not *, given that we promise it to always be * whenever the property is present? :-) Daniel (I've always wondered why symlinks are represented as metadata = { 'svn:special' : '*' } data = link $target rather than as metadata = { 'svn:special' : 'link' } data = $target Don't get me started. It should have been svn:symlink = $target
Re: keep_local in harvest_committables
On 2010-05-12 16:46, Philip Martin wrote: I saw you talking about keep-local on IRC a couple of days ago, are you working on keep-local in harvest_committables? Yes, indeed I am ... was ... will. I've been kept busy by real life issues, but that's about to pass. Thanks for reminding me, btw. The bottom line is: simply remove that madly misguided entry-keep_local conditional and always call bail_on_tree_conflicted_children(). The rest of the patch is mending the test suite to be sensible about it. Will do. For the record: We (mostly Julian and me) discussed that tree-conflicts should not be ignored at commit time, no matter how they came about. ~Neels signature.asc Description: OpenPGP digital signature
Re: keep_local in harvest_committables
Neels J Hofmeyr ne...@elego.de writes: The bottom line is: simply remove that madly misguided entry-keep_local conditional and always call bail_on_tree_conflicted_children(). The rest of the patch is mending the test suite to be sensible about it. Will do. See r943722 for a slightly different approach. If you want to change it further that's fine by me. -- Philip
Re: keep_local in harvest_committables
On 2010-05-13 00:25, Philip Martin wrote: Neels J Hofmeyr ne...@elego.de writes: The bottom line is: simply remove that madly misguided entry-keep_local conditional and always call bail_on_tree_conflicted_children(). The rest of the patch is mending the test suite to be sensible about it. Will do. See r943722 for a slightly different approach. If you want to change it further that's fine by me. * neels feels bad for causing work by not reading dev@ for a few days Should we keep that temp func for the time being? ~Neels signature.asc Description: OpenPGP digital signature
Re: keep_local in harvest_committables
Neels J Hofmeyr ne...@elego.de writes: On 2010-05-13 00:25, Philip Martin wrote: Neels J Hofmeyr ne...@elego.de writes: The bottom line is: simply remove that madly misguided entry-keep_local conditional and always call bail_on_tree_conflicted_children(). The rest of the patch is mending the test suite to be sensible about it. Will do. See r943722 for a slightly different approach. If you want to change it further that's fine by me. * neels feels bad for causing work by not reading dev@ for a few days Don't worry, the time consuming bit was the url/entry stuff not keep_local. Should we keep that temp func for the time being? I don't mind. The whole 'svn rm --keep-local' with tree conflicts is a bit suspect. After a commit the unversioned directory still contains a .svn directory. Reverting instead of committing leads to unversioned directories that should be versioned. See my other thread. -- Philip
[PATCH] readable error messages on non-ASCII systems, v2
This patch also produces readable error messages on z/OS by defining two new svn_cmdline functions which take native string arguments, then calling svn_cmdline_fprintf_asis() from print_error(), only in the path where I can see that native strings are being used. Greg Index: subversion/libsvn_subr/cmdline.c === --- subversion/libsvn_subr/cmdline.c(revision 943729) +++ subversion/libsvn_subr/cmdline.c(working copy) @@ -316,6 +316,19 @@ } svn_error_t * +svn_cmdline_fprintf_asis(FILE *stream, apr_pool_t *pool, const char *fmt, ...) +{ + const char *message; + va_list ap; + + va_start(ap, fmt); + message = apr_pvsprintf(pool, fmt, ap); + va_end(ap); + + return svn_cmdline_fputs_asis(message, stream, pool); +} + +svn_error_t * svn_cmdline_fputs(const char *string, FILE* stream, apr_pool_t *pool) { svn_error_t *err; @@ -348,6 +361,28 @@ } svn_error_t * +svn_cmdline_fputs_asis(const char *string, FILE* stream, apr_pool_t *pool) +{ + + /* On POSIX systems, errno will be set on an error in fputs, but this might + not be the case on other platforms. We reset errno and only + use it if it was set by the below fputs call. Else, we just return + a generic error. */ + errno = 0; + + if (fputs(string, stream) == EOF) +{ + if (errno) +return svn_error_wrap_apr(errno, _(Write error)); + else +return svn_error_create + (SVN_ERR_IO_WRITE_ERROR, NULL, NULL); +} + + return SVN_NO_ERROR; +} + +svn_error_t * svn_cmdline_fflush(FILE *stream) { /* See comment in svn_cmdline_fputs about use of errno and stdio. */ Index: subversion/libsvn_subr/error.c === --- subversion/libsvn_subr/error.c(revision 943729) +++ subversion/libsvn_subr/error.c(working copy) @@ -415,8 +415,8 @@ /* Only print the same APR error string once. */ else if (err-message) { - svn_error_clear(svn_cmdline_fprintf(stream, err-pool, %s%s\n, - prefix, err-message)); + svn_error_clear(svn_cmdline_fprintf_asis(stream, err-pool, %s%s\n, + prefix, err-message)); } else { Index: subversion/include/svn_cmdline.h === --- subversion/include/svn_cmdline.h(revision 943729) +++ subversion/include/svn_cmdline.h(working copy) @@ -128,6 +128,28 @@ FILE *stream, apr_pool_t *pool); +/** Write to the stdio @a stream, using a printf-like format string @a fmt, + * passed through apr_pvsprintf(). All string arguments are in the native + * encoding; no codepage conversion is done. Use @a pool for + * temporary allocation. + * + * @since New in ?? + */ +svn_error_t *svn_cmdline_fprintf_asis(FILE *stream, + apr_pool_t *pool, + const char *fmt, + ...) + __attribute__((format(printf, 3, 4))); + +/** Output the @a string to the stdio @a stream without changing the encoding. + * Use @a pool for temporary allocation. + * + * @since New in ?? + */ +svn_error_t *svn_cmdline_fputs_asis(const char *string, +FILE *stream, +apr_pool_t *pool); + /** Flush output buffers of the stdio @a stream, returning an error if that * fails. This is just a wrapper for the standard fflush() function for * consistent error handling.