Re: [PATCH 0/2] allow git-svn fetching to work using serf
(cc-ing users@ as requested by danielsh) David Rothenberger wrote: > On 7/6/2013 5:28 PM, Jonathan Nieder wrote: >> Is there a simple explanation of why violating the depth-first >> constraint would lead to multiple blob (i.e., file, not directory) >> deltas being opened in a row without an intervening close? > > I believe serf is doing the following for a number of files in parallel: > 1. open_file > 2. apply_textdelta > 3. change_file_prop, change_file_prop, ... > 4. close_file Ah, that makes more sense. It is not about traversal order but about processing multiple non-directory files in parallel, and step (3) potentially involving a large number of property changes means that it can make sense not to take a lock. Perhaps the reference documentation could warn about this? On the git-svn side, it looks like we have enough information to make a more complete commit message or in-code comment so the reason for multiple git_blob tempfiles is not forgotten. Thanks for your patient explanations. Jonathan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] allow git-svn fetching to work using serf
(cc-ing subversion's users@ list for advice) Kyle McKay wrote: > On Jul 6, 2013, at 18:37, Jonathan Nieder wrote: >> Kyle McKay wrote: >>> Begin forwarded message: [2] http://subversion.tigris.org/issues/show_bug.cgi?id=2932 >> >> Ah, thanks for the context. >> >> It's still not clear to me how we know that ra_serf driving the editor >> in a non depth-first manner is the problem here. Has that explanation >> been confirmed somehow? [...] > Since ra_serf makes multiple connections to the server (hard-coded > to 4 prior to svn 1.8, defaults to 4 in svn 1.8 but can be set to > between 1 and 8) it makes sense there would be multiple active calls > to apply_textdelta if processing is done as results are received on > the multiple connections. Ah, that's worrisome. Do I understand you correctly that to work with ra_serf in skelta mode, callers need to make their apply_textdelta callback thread-safe? Or do you just mean that the traversal order is based on the order in which results are received? That would be fine, as long as after each apply_textdelta call, close_file is called before the next apply_textdelta. Jonathan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] allow git-svn fetching to work using serf
On 7/6/2013 5:28 PM, Jonathan Nieder wrote: > David Rothenberger wrote: >> On 7/5/2013 8:41 PM, Kyle McKay wrote: > >>> Daniel Shahaf has suggested also setting >>> "servers:global:http-bulk-updates=on". >> >> I have a patch that does this, but since turning on bulk updates has >> a possible performance penalty, I prefer your approach. > > I assume that's because http-bulk-updates defeats caching. If so, > makes sense. I believe that "bulk updates" means that serf makes one request for a lot of information and receives it all in one big response. In "skelta" mode, serf makes a single request for a single piece of information. The serf authors feel this can lead to improved overall throughput because they can pipeline these requests and have multiple connections open at the same time. The downside, though, is that serf will do multiple open_file calls in parallel as it descends down sibling directories. > It's still not clear to me how we know that ra_serf driving the editor > in a non depth-first manner is the problem here. Has that explanation > been confirmed somehow? I did do a trace of "git svn fetch" and observed this non-depth-first traversal. It certainly causes the failure we've observed. > Is there a simple explanation of why violating the depth-first > constraint would lead to multiple blob (i.e., file, not directory) > deltas being opened in a row without an intervening close? I believe serf is doing the following for a number of files in parallel: 1. open_file 2. apply_textdelta 3. change_file_prop, change_file_prop, ... 4. close_file -- David Rothenberger daver...@acm.org Nusbaum's Rule: The more pretentious the corporate name, the smaller the organization. (For instance, the Murphy Center for the Codification of Human and Organizational Law, contrasted to IBM, GM, and AT&T.) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] allow git-svn fetching to work using serf
On Jul 6, 2013, at 18:37, Jonathan Nieder wrote: Kyle McKay wrote: On Jul 6, 2013, at 17:28, Jonathan Nieder wrote: David Rothenberger wrote: On 7/5/2013 8:41 PM, Kyle McKay wrote: Daniel Shahaf has suggested also setting "servers:global:http-bulk-updates=on". I have a patch that does this, but since turning on bulk updates has a possible performance penalty, I prefer your approach. I assume that's because http-bulk-updates defeats caching. If so, makes sense. Please forgive my ignorance: is there a bug filed about ra_serf's misbehavior here? Is it eventually going to be fixed and this is just a workaround, or is the growth in temp file use something we'd live with permanently? [...] Begin forwarded message: [...] [2] http://subversion.tigris.org/issues/show_bug.cgi?id=2932 Ah, thanks for the context. It's still not clear to me how we know that ra_serf driving the editor in a non depth-first manner is the problem here. Has that explanation been confirmed somehow? For example, does the workaround mentioned by danielsh work? Does using ra_neon instead of ra_serf avoid trouble as well? Is there a simple explanation of why violating the depth-first constraint would lead to multiple blob (i.e., file, not directory) deltas being opened in a row without an intervening close? Using ra_neon seams to eliminate the problem. Using ra_neon has always been the default until svn 1.8 which drops ra_neon support entirely and always uses ra_serf for https?: urls. The workaround mentioned by danielsh won't work if the server has configured SVNAllowBulkUpdates Off because that will force use of skelta mode no matter what the client does. However, since ra_neon only ever has a single connection to the server it probably doesn't matter. Since ra_serf makes multiple connections to the server (hard-coded to 4 prior to svn 1.8, defaults to 4 in svn 1.8 but can be set to between 1 and 8) it makes sense there would be multiple active calls to apply_textdelta if processing is done as results are received on the multiple connections. Kyle -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] allow git-svn fetching to work using serf
Kyle McKay wrote: > On Jul 6, 2013, at 17:28, Jonathan Nieder wrote: >> David Rothenberger wrote: >>> On 7/5/2013 8:41 PM, Kyle McKay wrote: Daniel Shahaf has suggested also setting "servers:global:http-bulk-updates=on". >>> >>> I have a patch that does this, but since turning on bulk updates has >>> a possible performance penalty, I prefer your approach. >> >> I assume that's because http-bulk-updates defeats caching. If so, >> makes sense. >> >> Please forgive my ignorance: is there a bug filed about ra_serf's >> misbehavior here? Is it eventually going to be fixed and this is >> just a workaround, or is the growth in temp file use something we'd >> live with permanently? [...] > > Begin forwarded message: [...] >> [2] http://subversion.tigris.org/issues/show_bug.cgi?id=2932 Ah, thanks for the context. It's still not clear to me how we know that ra_serf driving the editor in a non depth-first manner is the problem here. Has that explanation been confirmed somehow? For example, does the workaround mentioned by danielsh work? Does using ra_neon instead of ra_serf avoid trouble as well? Is there a simple explanation of why violating the depth-first constraint would lead to multiple blob (i.e., file, not directory) deltas being opened in a row without an intervening close? Jonathan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] allow git-svn fetching to work using serf
On Jul 6, 2013, at 17:28, Jonathan Nieder wrote: David Rothenberger wrote: On 7/5/2013 8:41 PM, Kyle McKay wrote: Daniel Shahaf has suggested also setting "servers:global:http-bulk-updates=on". I have a patch that does this, but since turning on bulk updates has a possible performance penalty, I prefer your approach. I assume that's because http-bulk-updates defeats caching. If so, makes sense. Please forgive my ignorance: is there a bug filed about ra_serf's misbehavior here? Is it eventually going to be fixed and this is just a workaround, or is the growth in temp file use something we'd live with permanently? Apparently it will not be fixed: Begin forwarded message: From: David Rothenberger Date: July 5, 2013 16:14:12 PDT To: git@vger.kernel.org Subject: Re: git-svn "Temp file with moniker 'svn_delta' already in use" and skelta mode I traced git-svn and discovered that the error is due to a known problem in the SVN APIs. ra_serf does not drive the delta editor in a depth-first manner as required by the API [1]. Instead, the calls come in this order: 1. open_root 2. open_directory 3. add_file 4. apply_textdelta 5. add_file 6. apply_textdelta This is a known issue [2] and one that the Subversion folks have elected not to fix [3]. [1] http://subversion.apache.org/docs/api/latest/structsvn__delta__editor__t.html#details [2] http://subversion.tigris.org/issues/show_bug.cgi?id=2932 [3] http://subversion.tigris.org/issues/show_bug.cgi?id=3831 The summary of [3] which is marked RESOLVED,FIXED is "Add errata / release note noise around ra_serf's editor drive violations". Kyle -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] allow git-svn fetching to work using serf
David Rothenberger wrote: > On 7/5/2013 8:41 PM, Kyle McKay wrote: >> Daniel Shahaf has suggested also setting >> "servers:global:http-bulk-updates=on". > > I have a patch that does this, but since turning on bulk updates has > a possible performance penalty, I prefer your approach. I assume that's because http-bulk-updates defeats caching. If so, makes sense. Please forgive my ignorance: is there a bug filed about ra_serf's misbehavior here? Is it eventually going to be fixed and this is just a workaround, or is the growth in temp file use something we'd live with permanently? Thanks, Jonathan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] allow git-svn fetching to work using serf
On 7/5/2013 8:41 PM, Kyle McKay wrote: > This patch allows git-svn to fetch successfully using the > serf library when given an https?: url to fetch from. Thanks, Kyle. I confirm this is working for my problem cases as well. > Daniel Shahaf has suggested also setting > "servers:global:http-bulk-updates=on". I have a patch that does this, but since turning on bulk updates has a possible performance penalty, I prefer your approach. Please let me know if anyone wants to see the patch. -- David Rothenberger daver...@acm.org What to do in case of an alien attack: 1) Hide beneath the seat of your plane and look away. 2) Avoid eye contact. 3) If there are no eyes, avoid all contact. -- The Firesign Theatre, _Everything you know is Wrong_ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] allow git-svn fetching to work using serf
This patch allows git-svn to fetch successfully using the serf library when given an https?: url to fetch from. Unfortunately some svn servers do not seem to be configured well for use with the serf library. This can cause fetching to take longer compared to the neon library or actually cause timeouts during the fetch. When timeouts occur git-svn can be safely restarted to fetch more revisions. A new temp_is_locked function has been added to Git.pm to facilitate using the minimal number of temp files possible when using serf. The problem that occurs when running git-svn fetch using the serf library is that the previously used temp file is not always unlocked before the next temp file needs to be used. To work around this problem, a new temp name is used if the temp name that would otherwise be chosen is currently locked. This patch may not be all that is required, but at least it is a starting point. Daniel Shahaf has suggested also setting "servers:global:http-bulk- updates=on". Kyle J. McKay (2): Git.pm: add new temp_is_locked function git-svn: allow git-svn fetching to work using serf perl/Git.pm | 17 - perl/Git/SVN/Fetcher.pm | 6 -- 2 files changed, 20 insertions(+), 3 deletions(-) -- 1.8.3 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html