Re: svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Philip Martin
Philip Martin  writes:

> I've committed and nominated for 1.10.  However I have found another
> caching problem:
>
> https://issues.apache.org/jira/browse/SVN-4727
>
> Loading my copy of the original Collab Subversion repository (40515
> revisions) fails when the cache is large:
>
> svnadmin: E160004: Corrupt node-revision '9-965551.0-857792.r988076/12'
> svnadmin: E160004: Reading one svndiff window read beyond the end of
> the representation

I think I've found the problem.  The call to auto_read_diff_version()
that I added for the previous problem is in the loop in cache_windows().
If there are more than one window chunks, and the first is found in the
cache, then when auto_read_diff_version() is called on subsequent chunks
it will reset some of the rep_state_t.

-- 
Philip


Re: svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Philip Martin
Philip Martin  writes:

> With a small cache the load completes and verification works, but
> verfication fails with a large cache:
>
> * Error verifying revision 11826.
> svnadmin: E160004: Corrupt node-revision '5-385.0.r8127/80'
> svnadmin: E160004: Reading one svndiff window read beyond the end of
> the representation

Repeated runs show the same error but on different revisions.  Here is
the debug error stack (note different revision):

* Error verifying revision 4965.
../src/subversion/svnadmin/svnadmin.c:2254,
../src/subversion/libsvn_repos/dump.c:2532,
../src/subversion/libsvn_repos/dump.c:2425,
../src/subversion/svnadmin/svnadmin.c:1028,
../src/subversion/libsvn_repos/dump.c:2369,
../src/subversion/libsvn_fs/fs-loader.c:754,
../src/subversion/libsvn_fs_fs/tree.c:4815,
../src/subversion/libsvn_fs_fs/tree.c:4749,
../src/subversion/libsvn_fs_fs/tree.c:4749,
../src/subversion/libsvn_fs_fs/tree.c:4749,
../src/subversion/libsvn_fs_fs/tree.c:4749,
../src/subversion/libsvn_fs_fs/tree.c:4748,
../src/subversion/libsvn_fs_fs/dag.c:200,
../src/subversion/libsvn_fs_fs/dag.c:167,
../src/subversion/libsvn_fs_fs/cached_data.c:521: (apr_err=SVN_ERR_FS_CORRUPT)
svnadmin: E160004: Corrupt node-revision '1m-3485.0.r4965/4'
../src/subversion/libsvn_fs_fs/cached_data.c:482,
../src/subversion/libsvn_fs_fs/cached_data.c:3749,
../src/subversion/libsvn_fs_fs/cached_data.c:3477,
../src/subversion/libsvn_fs_fs/cached_data.c:3414,
../src/subversion/libsvn_fs_fs/cached_data.c:3343: (apr_err=SVN_ERR_FS_CORRUPT)
svnadmin: E160004: Reading one svndiff window read beyond the end of the 
representation


-- 
Philip


Re: svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Philip Martin
Branko Čibej  writes:

> On 15.03.2018 20:38, Philip Martin wrote:
>> Philip Martin  writes:
>>
>>> I think the raw
>>> window cache may have to be modified to include the svndiff version.
>> Experimental patch:
>
> Looks approximately correct to me. Hard-coding version 1 would have been
> wrong before LZ4, too, since version 0 does exist (although it's not
> often used these days).

I've committed and nominated for 1.10.  However I have found another
caching problem:

https://issues.apache.org/jira/browse/SVN-4727

Loading my copy of the original Collab Subversion repository (40515
revisions) fails when the cache is large:

svnadmin: E160004: Corrupt node-revision '9-965551.0-857792.r988076/12'
svnadmin: E160004: Reading one svndiff window read beyond the end of the 
representation

With a small cache the load completes and verification works, but
verfication fails with a large cache:

* Error verifying revision 11826.
svnadmin: E160004: Corrupt node-revision '5-385.0.r8127/80'
svnadmin: E160004: Reading one svndiff window read beyond the end of the 
representation

-- 
Philip


Re: [PATCH] Hackathon project: Dumping viewspec

2018-03-15 Thread Julian Foad

Johan Corveleyn wrote:

On Sun, Nov 26, 2017, Stefan  wrote:

On 25/11/2017, Stefan Sperling wrote:

On Fri, Nov 24, 2017, Bert Huijben wrote:

At the Aachen hackathon I promised to write some code to spit out the sparse
definition of a working copy, or in other words some initial dumb viewspec
output.
$ svn switch --list \SharpSvn\trunk


Has a new 'svn viewspec' been subcommand considered?
'switch --list' reminds me of our 'switch --relocate' mistake
from the past ;)


Indeed it was. FWIW I agree there are good arguments for a new viewspec
subcommand.

The alternative would be to use "svn list --generate-viewspec" and "svn
switch/checkout --use-viewspec < viewspecfile" or something like this.
The obvious downside would be that one subcommand would be used to
generate the viewspec while another one would be used to apply it. I
think Bert brought up other arguments against adding it to "svn list".


I prefer adding the "export the viewspec info from this WC" to "svn
info", because that's what we already use to obtain info from a
working copy (including depth and working revision). Perhaps "svn info
-R --viewspec". Let's say this would generate some structured
information in a well defined syntax.


I have committed Bert's patch, with the command modified to be "svn info 
--viewspec", in http://svn.apache.org/r1826864


Let's see where it leads.

First I'd like to 'clean up' the formatting a bit to make it easier to 
read, and add some tests.



The exported info can then be used as optional input for several commands:

 svn checkout $URL --apply-viewspec vspecfile.txt
 svn update . --apply-viewspec vspecfile.txt
 svn switch . --apply-viewspec vspecfile.txt   (perhaps the
viewspec contains switched subtrees, which necessitates the 'switch'
command to execute)


We need to look at using the viewspec as input next. Semantically 
speaking, these sorts of things:


* a way to check out a new WC to match the spec;
* a way to modify an existing WC to match the spec;
* a way to modify/checkout a WC of a *different* branch, to match the 
spec except for its URLs (maybe switch URLs pointing inside this 
'branch' or WC get adjusted as if they are relative, and other switch 
URLs stay absolute?);


- Julian


Re: svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Branko Čibej
On 15.03.2018 20:38, Philip Martin wrote:
> Philip Martin  writes:
>
>> I think the raw
>> window cache may have to be modified to include the svndiff version.
> Experimental patch:

Looks approximately correct to me. Hard-coding version 1 would have been
wrong before LZ4, too, since version 0 does exist (although it's not
often used these days).

-- Brane



Re: svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Philip Martin
Philip Martin  writes:

> I think the raw
> window cache may have to be modified to include the svndiff version.

Experimental patch:

Index: subversion/libsvn_fs_fs/cached_data.c
===
--- subversion/libsvn_fs_fs/cached_data.c   (revision 1826834)
+++ subversion/libsvn_fs_fs/cached_data.c   (working copy)
@@ -1268,7 +1268,7 @@ parse_raw_window(void **out,
   stream = svn_stream_from_string(_window, result_pool);
 
   /* parse it */
-  SVN_ERR(svn_txdelta_read_svndiff_window(>window, stream, 1,
+  SVN_ERR(svn_txdelta_read_svndiff_window(>window, stream, window->ver,
   result_pool));
 
   /* complete the window and return it */
@@ -3212,7 +3212,7 @@ init_rep_state(rep_state_t *rs,
   rs->start = entry->offset + rs->header_size;
   rs->current = rep_header->type == svn_fs_fs__rep_plain ? 0 : 4;
   rs->size = entry->size - rep_header->header_size - 7;
-  rs->ver = 1;
+  rs->ver = -1;
   rs->chunk_index = 0;
   rs->raw_window_cache = ffd->raw_window_cache;
   rs->window_cache = ffd->txdelta_window_cache;
@@ -3310,6 +3310,8 @@ cache_windows(svn_fs_t *fs,
   apr_size_t window_len;
   char *buf;
 
+  auto_read_diff_version(rs, iterpool);
+
   /* navigate to the current window */
   SVN_ERR(rs_aligned_seek(rs, NULL, start_offset, iterpool));
   SVN_ERR(svn_txdelta__read_raw_window_len(_len,
@@ -3330,6 +3332,7 @@ cache_windows(svn_fs_t *fs,
   window.end_offset = rs->current;
   window.window.len = window_len;
   window.window.data = buf;
+  window.ver = rs->ver;
 
   /* cache the window now */
   SVN_ERR(svn_cache__set(rs->raw_window_cache, , ,
Index: subversion/libsvn_fs_fs/temp_serializer.h
===
--- subversion/libsvn_fs_fs/temp_serializer.h   (revision 1826834)
+++ subversion/libsvn_fs_fs/temp_serializer.h   (working copy)
@@ -60,6 +60,9 @@ typedef struct
 
   /* the offset within the representation right after reading the window */
   apr_off_t end_offset;
+
+  /* svndiff version */
+  int ver;
 } svn_fs_fs__raw_cached_window_t;
 
-- 
Philip


Re: svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Philip Martin
Branko Čibej  writes:

> The svndiff version is embedded in the first 4 bytes of the window data
> and should be parsed from there.

When we store the data in the raw window cache the svndiff version isn't
part of the stored data.  When we subsequently retreive the data from
the cache the svndiff version is no longer available.  I think the raw
window cache may have to be modified to include the svndiff version.

-- 
Philip


Re: svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Philip Martin
Julian Foad  writes:

> Philip Martin wrote:
>> cached_data.c:parse_raw_window() where the svndiff version is hard-coded
>> to 1 in the call to svn_txdelta_read_svndiff_window.  Changing that to 2
>> allows the regression test to pass, the question is where should the
>> correct value be obtained?
>
> The simple answer appears to be 'rs->ver' (where 'rs' needs to be
> passed in via the function's baton).

It's not that simple.  When we cache the data rs->ver is set to 1 by
init_rep_state().  When we retrieve the data rs->ver is set to -1 by
create_rep_state().  I don't know why these functions initialize rs
differently.  Neither appears to be the value we want.

-- 
Philip


Re: svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Julian Foad

Philip Martin wrote:

cached_data.c:parse_raw_window() where the svndiff version is hard-coded
to 1 in the call to svn_txdelta_read_svndiff_window.  Changing that to 2
allows the regression test to pass, the question is where should the
correct value be obtained?


The simple answer appears to be 'rs->ver' (where 'rs' needs to be passed 
in via the function's baton).


But does that mean this code isn't being tested by any of our manual or 
buildbot testing in the last 6 months?


If so, we need to resolve that somehow before I am happy to release it.

Extending our testing is one option. I would much prefer to apply the 
principle of testable design: "if this code only runs sometimes, change 
it so it runs always". Can we do that?


- Julian


Re: svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Branko Čibej
On 15.03.2018 18:50, Philip Martin wrote:
> Philip Martin  writes:
>
>> Evgeny Kotkov  writes:
>>
>>> Philip Martin  writes:
>>>
 That works as expected, but vary the cache size of the load process and
 it fails.  The load succeeds with -M64 and smaller but fails with -M65
 and larger:
>>> [...]
>>>
>>> Maybe this behavior could be related to the cache size threshold in svnadmin
>>> that enables the block read feature in fsfs (currently set to 64 MB, as per
>>> svnadmin.c:BLOCK_READ_CACHE_THRESHOLD).
>> Sounds plausible.
> It causes different code to run, in particular the window cache is
> enabled.  I'm not familiar with this code but the problem seems to be in
> cached_data.c:parse_raw_window() where the svndiff version is hard-coded
> to 1 in the call to svn_txdelta_read_svndiff_window.  Changing that to 2
> allows the regression test to pass, the question is where should the
> correct value be obtained?

The svndiff version is embedded in the first 4 bytes of the window data
and should be parsed from there.

-- Brane



Re: svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Philip Martin
Philip Martin  writes:

> Evgeny Kotkov  writes:
>
>> Philip Martin  writes:
>>
>>> That works as expected, but vary the cache size of the load process and
>>> it fails.  The load succeeds with -M64 and smaller but fails with -M65
>>> and larger:
>>
>> [...]
>>
>> Maybe this behavior could be related to the cache size threshold in svnadmin
>> that enables the block read feature in fsfs (currently set to 64 MB, as per
>> svnadmin.c:BLOCK_READ_CACHE_THRESHOLD).
>
> Sounds plausible.

It causes different code to run, in particular the window cache is
enabled.  I'm not familiar with this code but the problem seems to be in
cached_data.c:parse_raw_window() where the svndiff version is hard-coded
to 1 in the call to svn_txdelta_read_svndiff_window.  Changing that to 2
allows the regression test to pass, the question is where should the
correct value be obtained?

-- 
Philip


Re: svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Philip Martin
Evgeny Kotkov  writes:

> Philip Martin  writes:
>
>> That works as expected, but vary the cache size of the load process and
>> it fails.  The load succeeds with -M64 and smaller but fails with -M65
>> and larger:
>
> [...]
>
> Maybe this behavior could be related to the cache size threshold in svnadmin
> that enables the block read feature in fsfs (currently set to 64 MB, as per
> svnadmin.c:BLOCK_READ_CACHE_THRESHOLD).

Sounds plausible.

Here is a simpler reproduction:

svnadmin create repo
svn -mm mkdir file://`pwd`/repo/subversion ^/subversion/trunk 
^/subversion/branches
svnadmin dump -r1 -M100 repo > /dev/null

-- 
Philip


Re: svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Evgeny Kotkov
Philip Martin  writes:

> That works as expected, but vary the cache size of the load process and
> it fails.  The load succeeds with -M64 and smaller but fails with -M65
> and larger:

[...]

Maybe this behavior could be related to the cache size threshold in svnadmin
that enables the block read feature in fsfs (currently set to 64 MB, as per
svnadmin.c:BLOCK_READ_CACHE_THRESHOLD).


Regards,
Evgeny Kotkov


Re: svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Philip Martin
Philip Martin  writes:

> $ svnadmin dump -q repo | svnadmin load -q -M1000 repo2
> svnadmin: E140001: zlib (uncompress): corrupt data: Decompression of
> svndiff data failed

This is now issue https://issues.apache.org/jira/browse/SVN-4725

-- 
Philip


Re: New build requirements for 1.10?

2018-03-15 Thread Bert Huijben
I currently just use the internal builds for my binaries and don't see a
reason to upgrade them separately. Utf8proc is only used in 'svn' at this
time.

I still build SQLite myself, but just passing the amalgamation works for
others. I don't think there is even support to use external LZ4 and
utf8proc on Windows yet.

Bert

On Tue, Mar 13, 2018 at 3:20 PM, Branko Čibej  wrote:

> On 13.03.2018 14:27, Mark Phippard wrote:
> > Starting to put together updated binaries for 1.10 release.  Can
> > someone summarize any new build requirements, particularly for Windows?
> >
> > I see the reference to LZ4.  Should we use the "internal"
> > implementation or are we expected to build and deliver a liblz4 for
> > Windows?
> >
> > Anything else?  I recall reading something about utf8proc but not sure
> > what that is or what it is used for.  FWIW, our 1.9 build process
> > works for 1.10 but obviously if there are new, but optional, features
> > or libraries added then we probably are not building them yet.
>
> We keep internal copies of both utf8proc and LZ4. I believe the default
> Windows build uses these copies (svn --version --verbose will tell you).
>
> Only LZ4 is a new dependency, we've been using utf8proc for years.
>
> -- Brane
>
>


svnadmin load error with 1.10 that seems to be fsfs cache related

2018-03-15 Thread Philip Martin
Julian Foad  writes:

> Julian Foad wrote on 2018-02-28:
>> I'm happy to announce the release of Apache Subversion 1.10.0-rc1.
>
> That was 2 weeks ago and we need an RC2.

svnadmin create repo
svn -mm mkdir file://`pwd`/repo/subversion
svn -mm mkdir file://`pwd`/repo/subversion/trunk ^/subversion/branches
svn -mm mkdir file://`pwd`/repo/subversion/trunk/src
svnadmin create repo2
svnadmin dump -q repo | svnadmin load -q repo2

That works as expected, but vary the cache size of the load process and
it fails.  The load succeeds with -M64 and smaller but fails with -M65
and larger:

$ svnadmin dump -q repo | svnadmin load -q -M1000 repo2
svnadmin: warning: apr_err=SVN_ERR_STREAM_MALFORMED_DATA
svnadmin: warning: W140001: zlib (uncompress): corrupt data: Decompression of 
svndiff data failed
../src-1.10/subversion/libsvn_repos/load.c:794,
../src-1.10/subversion/libsvn_repos/load-fs-vtable.c:1118,
../src-1.10/subversion/libsvn_fs/fs-loader.c:885,
../src-1.10/subversion/libsvn_fs_fs/tree.c:2333,
../src-1.10/subversion/libsvn_fs_fs/transaction.c:3983,
../src-1.10/subversion/libsvn_fs_fs/fs_fs.c:367,
../src-1.10/subversion/libsvn_fs_fs/fs_fs.c:239,
../src-1.10/subversion/libsvn_fs_fs/fs_fs.c:227,
../src-1.10/subversion/libsvn_fs_fs/transaction.c:3812,
../src-1.10/subversion/libsvn_fs_fs/transaction.c:3185,
../src-1.10/subversion/libsvn_fs_fs/transaction.c:3203,
../src-1.10/subversion/libsvn_fs_fs/transaction.c:2964,
../src-1.10/subversion/libsvn_fs_fs/transaction.c:2767,
../src-1.10/subversion/libsvn_fs_fs/transaction.c:653,
../src-1.10/subversion/libsvn_fs_fs/transaction.c:634,
../src-1.10/subversion/libsvn_subr/stream.c:221,
../src-1.10/subversion/libsvn_fs_fs/transaction.c:2736,
../src-1.10/subversion/libsvn_subr/stream.c:221,
../src-1.10/subversion/libsvn_delta/text_delta.c:524,
../src-1.10/subversion/libsvn_subr/stream.c:194,
../src-1.10/subversion/libsvn_fs_fs/cached_data.c:2148,
../src-1.10/subversion/libsvn_fs_fs/cached_data.c:1885,
../src-1.10/subversion/libsvn_fs_fs/cached_data.c:1704,
../src-1.10/subversion/libsvn_fs_fs/cached_data.c:1564,
../src-1.10/subversion/libsvn_fs_fs/cached_data.c:1323,
../src-1.10/subversion/libsvn_subr/cache-membuffer.c:3070,
../src-1.10/subversion/libsvn_subr/cache-membuffer.c:2608,
../src-1.10/subversion/libsvn_fs_fs/cached_data.c:1272,
../src-1.10/subversion/libsvn_delta/svndiff.c:590,
../src-1.10/subversion/libsvn_subr/compress_zlib.c:166,
../src-1.10/subversion/libsvn_subr/io.c:3699: 
(apr_err=SVN_ERR_STREAM_MALFORMED_DATA)
svnadmin: E140001: zlib (uncompress): corrupt data: Decompression of svndiff 
data failed


-- 
Philip


Bumping the help text API [was: svn commit: r1826328]

2018-03-15 Thread Julian Foad

Daniel Shahaf wrote:

julianf...@apache.org wrote:

[...]

+/** The maximum number of paragraphs of help text a subcommand can have. */


Missing "@since New in 1.11" annotation.


+#define SVN_OPT_MAX_PARAGRAPHS 50


Done.


@@ -77,6 +80,37 @@ typedef svn_error_t *(svn_opt_subcommand
  /** One element of a subcommand dispatch table.
   *
+ * @since New in 1.11.
+ */
+typedef struct svn_opt_subcommand_desc3_t
+{

...

+} svn_opt_subcommand_desc3_t;
+
+
+/** One element of a subcommand dispatch table.
+ *
   * @since New in 1.4.
   */


Mark as deprecated?  (Also other symbols in this commit)


  typedef struct svn_opt_subcommand_desc2_t


We should deprecate these, yes.

Done in r1826800.

It is a bit ugly. Most of the code is identical but the structure data 
type is different. Lots of code duplication (introduced in r1826328).


Is there a better way?

If we were to move the new help text 'paragraphs' array to the end of 
the struct leaving the old single string 'help' field in place (with 
some definition of how they are combined or ignored), so that the 
beginning of the new struct is identical to the whole of the old struct, 
then we could cast one to the other and share some code.


I am not sure if such deduplication is worth the effort at this point. 
The deprecated code can just sit there.


Also I am noticing some ugliness in the general svn_opt help API. For 
example, many command-line programs print their version info by calling


  svn_opt_print_help4(NULL, "svnversion", TRUE, quiet, FALSE,
  NULL, NULL, NULL, NULL, NULL, NULL, pool);

/**
 * Central dispatcher function for various kinds of help message.
 * Prints one of:
 *   * subcommand-specific help (svn_opt_subcommand_help)
 *   * generic help (svn_opt_print_generic_help)
 *   * version info
 *   * simple usage complaint: "Type '@a pgm_name help' for usage."
...*/

and so this had to be bumped to _help5() in 7 places. Dedicated API 
functions would be better than using a 'central dispatcher' I think, for 
the 'version info' case. Well, we already have one (and _help5 simply 
forwards to it):


  svn_opt__print_version_info(...)

- Julian