Re: [PATCH] -cM-N command line syntax
Strictly from the view of the non-developer and user of SVN; I have only ever used the : for specifying a range. But I do see the sanity in having the input match the output too. If the ability to support both is indeed trivial - and if both cases are expected by users - then that would certainly seem to be the most worthwhile choice. Gavin Beau Baumanis On 03/10/2010, at 12:34 PM, Peter Samuelson wrote: [Peter Samuelson] Is this, then, a worthwhile feature addition? I don't want to add syntax that nobody else wants. In particular, this patch highlights the existing inconsistency of : vs. - as range operators. Hmmm, in tests/cmdline/log_tests.py XFail(log_chanage_range), I see Lieven thought -c might someday take ranges with : instead of -. But since the output uses -, I'd argue the cut-and-paste utility of sticking with that outweighs the consistency of :. Unless perhaps we should support both, which would be trivial. Peter
RE: svn commit: r1003924 - /subversion/trunk/subversion/svnrdump/dump_editor.c
-Original Message- From: artag...@apache.org [mailto:artag...@apache.org] Sent: zondag 3 oktober 2010 9:05 To: comm...@subversion.apache.org Subject: svn commit: r1003924 - /subversion/trunk/subversion/svnrdump/dump_editor.c Author: artagnon Date: Sun Oct 3 07:05:05 2010 New Revision: 1003924 URL: http://svn.apache.org/viewvc?rev=1003924view=rev Log: svnrdump: dump_editor: Use just one temporary file for all textdelta application by truncating and re-using it in each iteration. This has significant performance benefits. * subversion/svnrdump/dump_editor.c (dump_edit_baton): Correct the comment about the description of pool, add a new delta_file variable and note in a comment that delta_file and delta_abspath should be allocated in the per-edit-session pool, not in the per-revision pool like the other variables. (get_dump_editor): Open the delta file initializing `eb-delta_file` and `eb-delta_abspath`, allocating them in the per-edit-session pool. Mark delta_file as a del_on_close, so that it's automatically cleaned up when the editor is done. All streams in callbacks that map to it should set disown = TRUE. (close_file): Instead of opening the delta_file, simply seek to the beginning. During cleanup, don't close or remove the file; just truncate it to get it ready for the next textdelta application. Also, while getting the file stat, don't re-open the file -- use apr_file_info_get on the open file instead. (apply_textdelta): Don't create a unique file. Simply map a stream to the delta file to perform the textdelta application. Disown the stream and close it explicitly, making sure not to close the delta file itself. Patch by: David Barr david.b...@cordelta.com Helped by: artagnon Looks good; nice optimization :) @@ -846,7 +853,14 @@ get_dump_editor(const svn_delta_editor_t /* Create a special per-revision pool */ eb-pool = svn_pool_create(pool); - + + /* Open a unique temporary file for all textdelta applications in + this edit session. The file is automatically closed and cleaned + up when the edit session is done. */ + SVN_ERR(svn_io_open_uniquely_named((eb-delta_file), (eb- delta_abspath), + NULL, svnrdump, NULL, + svn_io_file_del_on_close, pool, pool)); (This code is probably copied from the old code; but anyway:) Do you really need a 'named' file here? The named file algorithm of svn_io_open_uniquely_named() is very slow compared to the 'just get me a temporary file' implemented by svn_io_open_unique_file3(). Especially when you need more than a few tempfiles with the same basename. (It tries the generated name itself, then it tries a 1 suffix, a 2 suffix, etc. Until it finds a name that doesn't exist) Bert
Re: svn commit: r1003227 - /subversion/site/publish/docs/community-guide/releasing.part.html
On Fri, Oct 01, 2010 at 01:41:42AM +0200, Daniel Shahaf wrote: s...@apache.org wrote on Thu, Sep 30, 2010 at 20:05:54 -: +++ subversion/site/publish/docs/community-guide/releasing.part.html Thu Sep 30 20:05:53 2010 @@ -1096,7 +1096,7 @@ manager.)/p -div class=h3 id=releasing-not +div class=h2 id=releasing-not h2How emnot/em to make a Subversion release What is the effect of this change? I don't see any reference in the CSS, etc, to an h2 or h3 class? r1003224 changed the heading background color from blue to green for the releasing-not ssection. But the level of indentation was still the same as for blue subsections. The effect of this change was to align the heading of the releasing-not section with all the other green section headings.
Re: svn commit: r1003924 - /subversion/trunk/subversion/svnrdump/dump_editor.c
Hi Bert, Bert Huijben writes: Do you really need a 'named' file here? The named file algorithm of svn_io_open_uniquely_named() is very slow compared to the 'just get me a temporary file' implemented by svn_io_open_unique_file3(). Especially when you need more than a few tempfiles with the same basename. (It tries the generated name itself, then it tries a 1 suffix, a 2 suffix, etc. Until it finds a name that doesn't exist) Ah, I see. The named temporary made debugging easier initially- I could easily see if the file was being removed or not, but I suppose it doesn't matter anymore. Fixed in r1003958. Thanks for the suggestion. -- Ram
Re: [Merge request] Merge r985477 from performance branch
Hi Julian, Julian Foad writes: Looks good to me. I wondered if it is safe in a long-running Subversion process, like TortoiseSvn or a Linux equivalent. It seems to me that it won't really matter much in practice. If someone changes their umask and finds that Subversion carries on creating files with the 'old' permissions that were in effect when Subversion was started... that's fine, as far as I'm concerned. Can I get an explicit +1 to commit this? I just want to get as many of Stefan's changes merged in quickly so that there's enough time before the 1.7 release to test them. Technical detail: How do I merge? I want to make some modifications to this patch before committing, but I want to preserve authorship. -- Ram
Re: [Merge request] Merge r985477 from performance branch
On Sun, Oct 03, 2010 at 07:02:01PM +0530, Ramkumar Ramachandra wrote: Hi Julian, Julian Foad writes: Looks good to me. I wondered if it is safe in a long-running Subversion process, like TortoiseSvn or a Linux equivalent. It seems to me that it won't really matter much in practice. If someone changes their umask and finds that Subversion carries on creating files with the 'old' permissions that were in effect when Subversion was started... that's fine, as far as I'm concerned. Can I get an explicit +1 to commit this? I just want to get as many of Stefan's changes merged in quickly so that there's enough time before the 1.7 release to test them. Technical detail: How do I merge? cd svn-trunk-working-copy svn merge -c r985477 ^/subversion/branches/performance # edit if necessary svn commit # Take note of committed revision, e.g. rN. # I'd recommend doing the following to avoid a cyclic (aka reflective) merge. # The commit just made should not bounce back to the performance branch # when someone runs svn merge ^/subversion/trunk on that branch cd svn-performance-branch-working-copy svn merge --record-only -rN ^/subversion/trunk svn commit -m Block rN from being merged into the performance branch to avoid a cyclic merge If you don't understand why the record-only merge is necessary, see http://mail-archives.apache.org/mod_mbox/subversion-dev/201004.mbox/%3c20100406152724.gm19...@noel.stsp.name%3e I want to make some modifications to this patch before committing What kinds of modifications do you want to make? You'll need a +1 for those, too. Please mail the diff which shows the results of the merge plus your modifications. but I want to preserve authorship. r985477 lists stefan2 as author. There is no concept of authorship in svn beyond that. You'll be listed as author of the merge commit, and your log message should say which change you're merging (the mergeinfo will also say it, but it doesn't hurt to mention it in the commit message anyway). Thanks, Stefan
Re: [Merge request] Merge r985477 from performance branch
Hi Stefan, Stefan Sperling writes: Can I get an explicit +1 to commit this? I just want to get as many of Stefan's changes merged in quickly so that there's enough time before the 1.7 release to test them. Technical detail: How do I merge? cd svn-trunk-working-copy svn merge -c r985477 ^/subversion/branches/performance # edit if necessary svn commit # Take note of committed revision, e.g. rN. # I'd recommend doing the following to avoid a cyclic (aka reflective) merge. # The commit just made should not bounce back to the performance branch # when someone runs svn merge ^/subversion/trunk on that branch cd svn-performance-branch-working-copy svn merge --record-only -rN ^/subversion/trunk svn commit -m Block rN from being merged into the performance branch to avoid a cyclic merge If you don't understand why the record-only merge is necessary, see http://mail-archives.apache.org/mod_mbox/subversion-dev/201004.mbox/%3c20100406152724.gm19...@noel.stsp.name%3e Thanks for the elaborate explanation! :) I want to make some modifications to this patch before committing What kinds of modifications do you want to make? You'll need a +1 for those, too. Please mail the diff which shows the results of the merge plus your modifications. Just minor indentation/ comment changes (some mentioned in my review). Nothing functional. but I want to preserve authorship. r985477 lists stefan2 as author. There is no concept of authorship in svn beyond that. You'll be listed as author of the merge commit, and your log message should say which change you're merging (the mergeinfo will also say it, but it doesn't hurt to mention it in the commit message anyway). Ok, so when the build on trunk breaks, I'll be blamed, not stefan2- I think I can live with that. I'll also need to merge 5~10 more commits: will do shortly. My objective is to get whatever I think I can evaluate into trunk as quickly as possible so that they're well-tested and shipped with Subversion 1.7. I'm inexperienced, and will only attempt to evaluate a subset of patches that directly affect svnrdump: I would request someone else to pick up the other patches. -- Ram
[PATCH] extend svn_subst_translate_string() to record whether re-encoding and/or line ending translation were performed (v. 2)
On Fri, Oct 1, 2010 at 3:57 AM, Julian Foad julian.f...@wandisco.com wrote: Adds a public API function, svn_subst_translate_string2(), an extension of svn_subst_translate_string(), that has two, additional output parameters for determining whether re-encoding and/or line ending translation were performed. If we're going to add this functionality to _translate_string(), shouldn't we also add it to the other variants - _detranslate_string, _translate_cstring2, _stream_translated, _copy_and_translate4? I see you're adding the infrastructure into the (new) bodies of _translate_cstring2 and _stream_translated, just not (yey) exposing it in their APIs. I'm not sure. The set of 'translation' functions is already messy and it's not clear to me how useful this new functionality will be. I originally began working on svn_subst_translate_string2() because I could not find a combination of public API functions that would allow me to determine whether the line endings changed when a string was both re-encoded to UTF-8 and normalized to LF line endings. The most applicable, svn_subst_translate_string(), performs both translations without indicating whether it re-encoded the string or normalized a line ending. An alternative to extending svn_subst_translate_string() is to add a public API function that does the re-encoding and another that normalizes line endings. I think, however, that extending svn_subst_translate_string() is better because though the current implementation of svn_subst_translate_string() re-encodes, then normalizes line endings, the single API function allows for the possibility of a future improvement in efficiency as a result of performing both translations simultaneously. Attached are a revised patch and log message. [[[ Adds a public API function, svn_subst_translate_string2(), an extension of svn_subst_translate_string(), that has two, additional output parameters for determining whether re-encoding and/or line ending translation were performed. As discussed at: http://thread.gmane.org/gmane.comp.version-control.subversion.devel/122550 http://thread.gmane.org/gmane.comp.version-control.subversion.devel/123020 * subversion/include/svn_subst.h (svn_subst_translate_string2): New function. (svn_subst_translate_string): Deprecated in favor of svn_subst_translate_string2(). * subversion/libsvn_subr/subst.c (translate_newline): Added the new parameter TRANSLATED_EOL, the value at which is set to TRUE if the newline string in EOL_STR is different than the newline string in NEWLINE_BUF. (translation_baton): Added the TRANSLATED_EOL field. (create_translation_baton): Added a new parameter TRANSLATED_EOL that is passed to the resulting translation_baton. (translate_chunk): Modified the three calls to translate_newline() to pass the TRANSLATED_EOL pointer from the translation baton. (stream_translated): New static function. Its implementation is the old implementation of svn_subst_stream_translated(), but accepting another parameter, TRANSLATED_EOL, that is passed to the in/out translation batons that it creates. (svn_subst_stream_translated): Now a wrapper for stream_translated(). (translate_cstring): New static function. Its implementation is the old implementation of svn_subst_translate_cstring2(), but modified to accept another parameter, TRANSLATED_EOL, that is passed to stream_translated(). (svn_subst_translate_cstring2): Now a wrapper for translate_cstring(). (svn_subst_translate_string2): New function. The task of recording whether it translated a line ending is delegated to translate_cstring(). ]]]Index: subversion/include/svn_subst.h === --- subversion/include/svn_subst.h (revision 1004022) +++ subversion/include/svn_subst.h (working copy) @@ -588,16 +588,41 @@ svn_subst_stream_detranslated(svn_stream_t **strea /* EOL conversion and character encodings */ -/** Translate the data in @a value (assumed to be in encoded in charset +/** @deprecated Provided for backward compatibility with the 1.6 API. Callers + * should use svn_subst_translate_string2(). + * + * Translate the data in @a value (assumed to be encoded in charset * @a encoding) to UTF8 and LF line-endings. If @a encoding is @c NULL, * then assume that @a value is in the system-default language encoding. * Return the translated data in @a *new_value, allocated in @a pool. */ +SVN_DEPRECATED svn_error_t *svn_subst_translate_string(svn_string_t **new_value, const svn_string_t *value, const char *encoding, apr_pool_t *pool); +/** Translate the data in @a value (assumed to be encoded in charset + * @a encoding) to UTF8 and LF line-endings. If @a encoding is @c NULL, + * then assume that @a value is in the system-default character
Re: svn commit: r1004003 - /subversion/site/publish/index.html
On Sun, Oct 3, 2010 at 13:12, s...@apache.org wrote: Author: stsp Date: Sun Oct 3 17:12:19 2010 New Revision: 1004003 URL: http://svn.apache.org/viewvc?rev=1004003view=rev Log: * site/publish/index.html: Fix 1.6.13 release date, here, too. (Why are news items maintained in two separate places?) The intent is for index.html to have just two or three recent items, and news.html to be the full archive. Cheers, -g
Re: svn commit: r1004003 - /subversion/site/publish/index.html
On Sun, Oct 03, 2010 at 05:03:25PM -0400, Greg Stein wrote: On Sun, Oct 3, 2010 at 13:12, s...@apache.org wrote: Author: stsp Date: Sun Oct 3 17:12:19 2010 New Revision: 1004003 URL: http://svn.apache.org/viewvc?rev=1004003view=rev Log: * site/publish/index.html: Fix 1.6.13 release date, here, too. (Why are news items maintained in two separate places?) The intent is for index.html to have just two or three recent items, and news.html to be the full archive. And there's no html foo that can automate this for us? But, well, it's not a big deal.
Re: svn commit: r1004003 - /subversion/site/publish/index.html
On Sun, Oct 3, 2010 at 5:14 PM, Stefan Sperling s...@elego.de wrote: On Sun, Oct 03, 2010 at 05:03:25PM -0400, Greg Stein wrote: On Sun, Oct 3, 2010 at 13:12, s...@apache.org wrote: Author: stsp Date: Sun Oct 3 17:12:19 2010 New Revision: 1004003 URL: http://svn.apache.org/viewvc?rev=1004003view=rev Log: * site/publish/index.html: Fix 1.6.13 release date, here, too. (Why are news items maintained in two separate places?) The intent is for index.html to have just two or three recent items, and news.html to be the full archive. And there's no html foo that can automate this for us? But, well, it's not a big deal. If we provided an RSS feed for our news items, which would be a good idea but yet more work, than we could use something like this to easily pull the most recent news items into the homepage: http://feed2js.org/ Maybe we could just make a handcrafted RSS feed for our news items, and then use this JavaScript to populate both the home page and the news page? Then we would only have a single news source to maintain, the RSS feed. Just a thought. -- Thanks Mark Phippard http://markphip.blogspot.com/
Re: svn commit: r1002151 - in /subversion/trunk/subversion: libsvn_auth_kwallet/kwallet.cpp svn/main.c svnsync/main.c
On 10/3/10 2:45 PM, Arfrever Frehtes Taifersar Arahesis wrote: 2010-09-29 14:29:26 Stefan Sperling napisał(a): On Tue, Sep 28, 2010 at 11:21:26AM -0700, Blair Zajac wrote: On 9/28/2010 6:07 AM, s...@apache.org wrote: Author: stsp Date: Tue Sep 28 13:07:23 2010 New Revision: 1002151 URL: http://svn.apache.org/viewvc?rev=1002151view=rev Log: Remove some goo introduced in r878078 and follow-ups, which was related to the Linux-specific code which has been removed in r1002144. - INITIALIZE_APPLICATION + QCoreApplication *app; + if (! qApp) +{ + int argc = 1; + app = new QCoreApplication(argc, (char *[1]) {(char *) svn}); +} Out of curiosity, what does this do? No idea. I didn't understand this either. I moved this code back out of the INITIALIZE_APPLICATION macro anyway. The code was this way before the INITIALIZE_APPLICATION macro was introduced. QCoreApplication *app isn't static, so does this do some setup logic that we don't need to keep track of? That's what it looks like to me, too. I suppose Arfrever will definitely know what this is doing. This code initializes application to avoid segmentation fault. Should it initialize the application everytime through the function, since app is not static, it won't store the result of the new QCoreApplication? Blair
Re: svn commit: r1004003 - /subversion/site/publish/index.html
On Sun, Oct 3, 2010 at 5:58 PM, Mark Phippard markp...@gmail.com wrote: If we provided an RSS feed for our news items, which would be a good idea but yet more work, than we could use something like this to easily pull the most recent news items into the homepage: http://feed2js.org/ Maybe we could just make a handcrafted RSS feed for our news items, and then use this JavaScript to populate both the home page and the news page? Then we would only have a single news source to maintain, the RSS feed. Actually an RSS feed that did not have items to link to would be kind of pointless. -- Thanks Mark Phippard http://markphip.blogspot.com/
Re: Some tips on profiling
On 01.10.2010 15:56, Ramkumar Ramachandra wrote: Hi Stefan, Stefan Fuhrmann writes: You obviously don't have APR threads enabled. Thanks for finding this. Fixed in r1003430. I enabled it, but there's still some issue: subversion/svnadmin/main.c:1892: undefined reference to `svn_fs_get_cache_config' It builds here. Did you run autogen.sh before ./configure? The server caches almost everything. So, the first (cold) run demonstrates the worst-case, the second run (hot) the best case. Real world performance depends on server memory and load. Ah. Thanks for the clarification. For the merge part: please review the integrate-membuffer branch (only 3 patches). You may also review and merge any of the patches listed in /branches/performance/STATUS. integrate-cache-membuffer, you mean? Thanks! Your emails contains references exactly to the patches I'm looking for :) For the MD5 stuff, try r986459 (performance branch). It should eliminate 1 of the 3 MD5 calculations. Why doesn't STATUS document this and everything else consistently? Because there is no simple mapping rev-feature / improvement. In particular, there are a number of intermediate steps that were replaced by new code later on. There is no point in reviewing these earlier revisions but the newer ones can't be reviewed and merged on their own. Hence the integration branch for the first major change. As soon as a larger number of patches got reviewed and merged, I will prepare further changes for integration. So far, nobody had free cycles to do the reviews. As for the temp files, there are some improvements listed in /branches/performance/STATUS. These would also reduce your system CPU time. I had the chance to check them out and apply them just now. They work as expected. I'll submit some (uneducated) patch reviews to the list and request a merge. I don't have access to the areas your patches touch. I really appreciate that. It would be great if someone had the time to review the 3 commits to the membuffer cache integration branch. The review should not require too much context knowledge. An in-depth review will take a full day or so (like for an average sized C++ class). -- Stefan^2.
Re: [WIP PATCH] Make svn_diff_diff skip identical prefix and suffix to make diff and blame faster
On Sun, Oct 3, 2010 at 1:46 AM, Johan Corveleyn jcor...@gmail.com wrote: Hi, Here is a second iteration of the patch. It now passes make check. Differences from the previous version are: - Support for \r eol-style (\n and \r\n was already ok). - The number of prefix_lines is now passed to svn_diff__lcs, so it can use that value to set the position offset of the EOF marker correctly, in case one of both files has become empty after skipping the prefix. This fixes the crashes in blame_tests.py 2 and 7. The patch is pretty big, so please let me know if I should split it up to make it more reviewable (I could easily split it up between the prefix-finding and the suffix-finding, at the cost of having overview over the entire algorithm). Still to do: - Think about why results are sometimes different (because of eliminated suffix, the LCS can sometimes be slightly different), and what can be done about it. Hm, this problem has made me reconsider whether my patch is a good solution (more details below). I'm thinking of other ways of implementing this kind of optimization, so for now I'm pulling this patch. Please don't review it yet, as I might go for a radically different approach (sorry for any time spent that may be lost). The details: The problem happens if there are two (or more) non-matching parts with an identical part in between (so the prefix scanning stops early, doesn't meet the suffix in one of the files), and the suffix scanning goes too far because the end of the last change is identical to the original at that point. For example (best viewed with fixed-width font): [[[ original | modified --- This is line 1 | This is line 1 This is line 2 | This line is *changed* This is line 3 | This is line 3 ... (more identical lines) | ... existing_function() | existing_function() {| { // do stuff| // do stuff return SVN_NO_ERROR; | return SVN_NO_ERROR; }| } | This is the end of the file | new_function() -| { | // do other stuff | return SVN_NO_ERROR; | } | | This is the end of the file |- ]]] The following identical suffix will be eliminated: [[[ return SVN_NO_ERROR; } This is the end of the file ]]] which means the LCS algorithm cannot do better than to say that the following lines are new: [[[ + return SVN_NO_ERROR; +} + +new_function() +{ + // do other stuff ]]] Not quite what we want/expect. If the suffix is not eliminated, the LCS algorithm gives the correct/expected result. Also if the change in the beginning of the file isn't there, the result is correct (because the prefix scanning goes first, and eliminates the identical stuff from the start (I'll leave it as an exercise to the reader to see that the result is better)). Interestingly, GNU diff also has this problem. It mitigates it a bit by keeping a number of identical prefix/suffix lines (at least the number of context lines, or a higher number if supplied by --horizon-lines=NUM). This cannot completely solve the problem though: for any given number of horizon-lines one can always come up with an example that doesn't give the correct result. So I need to find a way to not eliminate the identical suffix, but to mark those lines as identical and then include them in the position_list that's going to be used by the LCS algorithm. I'd like to do the same for the prefix. This means I need to approach this problem on a different level. To be continued ... Cheers, -- Johan