Re: Triage recovery of damaged Subversion repo

2022-11-28 Thread Michael K
Daniel, thanks for your reply! It is greatly appreciated.

"Yes, rev files have quite a bit of internal structure: reps, node-rev
headers, changed-paths, P2L/L2P, final line. These are generally easy
to parse out of surrounding contexts (revprop files use counted-length
strings, reps have their header and "ENDREP" trailer, L2P-INDEX and
P2L-INDEX know their own length and have ASCII before and after them,
and everything else is ASCII in specific formats)."
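
For illustration, here is a minimal Python sketch of reading the counted-length strings mentioned above. It assumes a non-packed revprop file made of "K <len>" / "V <len>" records terminated by an "END" line. The function name parse_hash_dump and the sample bytes are hypothetical and not taken from the damaged repo.

```python
def parse_hash_dump(data: bytes) -> dict:
    """Parse 'K <len>\\n<key>\\nV <len>\\n<value>\\n ... END\\n' into a dict.

    Assumption: the input is a non-packed revprop file in this layout.
    """
    props = {}
    pos = 0
    while True:
        line_start = pos
        eol = data.index(b"\n", pos)
        line = data[pos:eol]
        pos = eol + 1
        if line == b"END":
            return props
        if not line.startswith(b"K "):
            raise ValueError(f"expected 'K <len>' at offset {line_start}")
        key_len = int(line[2:])
        key = data[pos:pos + key_len]
        pos += key_len + 1  # skip the key and the newline after it

        line_start = pos
        eol = data.index(b"\n", pos)
        line = data[pos:eol]
        pos = eol + 1
        if not line.startswith(b"V "):
            raise ValueError(f"expected 'V <len>' at offset {line_start}")
        value_len = int(line[2:])
        # Values are kept as bytes because they are length-counted, not
        # line-delimited, and may contain newlines.
        props[key.decode("utf-8")] = data[pos:pos + value_len]
        pos += value_len + 1  # skip the value and the newline after it


sample = b"K 10\nsvn:author\nV 5\nalice\nK 7\nsvn:log\nV 3\nfix\nEND\n"
print(parse_hash_dump(sample))
```

Because every key and value carries its own length, the parser never has to guess where a value ends, which is what makes these files easy to pick out of surrounding bytes.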

I have frequently looked over the documentation at
https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure.


I can definitely recognize the border between reps when I see an "ENDREP",
0A (newline), and then a "DELTA SVN". But some files have significant data
between "ENDREP" and "DELTA SVN", and I don't yet understand what is going
on there or how I would split those if needed.
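
As a rough way to look at those in-between bytes, here is a heuristic Python sketch that lists whatever sits between each "ENDREP" trailer and the next rep header. It assumes rep headers of the form "PLAIN", "DELTA", or "DELTA <rev> <item> <length>" on a line of their own. The function name gaps_after_endrep is hypothetical. Since the quoted list of rev file contents also includes node-rev headers and changed-paths, a non-empty gap may simply be one of those records rather than damage. The header pattern can also match by accident inside binary rep data, so results need checking by eye.

```python
import re

# Assumption: a rep header is "PLAIN\n", "DELTA\n",
# or "DELTA <rev> <item> <length>\n".
REP_HEADER = re.compile(rb"(?:PLAIN|DELTA(?: \d+ \d+ \d+)?)\n")
REP_TRAILER = b"ENDREP\n"


def gaps_after_endrep(data: bytes):
    """Yield (offset, gap_bytes) for the bytes between each ENDREP trailer
    and the next rep header (or the end of the data if none follows)."""
    pos = 0
    while True:
        end = data.find(REP_TRAILER, pos)
        if end == -1:
            return
        start = end + len(REP_TRAILER)
        header = REP_HEADER.search(data, start)
        stop = header.start() if header else len(data)
        yield start, data[start:stop]
        pos = start


# Made-up bytes for demonstration only, not a valid rev file.
sample = (b"DELTA\nSVN\x01abcENDREP\n"
          b"id: 0.0.r1/4\ntype: file\n\n"
          b"DELTA 1 4 10\nSVN\x01xyzENDREP\n")
for offset, gap in gaps_after_endrep(sample):
    print(offset, gap)
```

An empty gap means the next rep starts immediately after the trailer. A gap that is readable ASCII is worth comparing against the node-rev format in the structure document.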

"Similarly, it should be easy to recognize where the appended cryptogram
and padding start, since the part from L2P-INDEX to the last line is
distinctive and self-checksummed."

Yes, I have been able to remove the trailing bit that the ransomware added
at the end of files. I wrote a program that processes all the files, and it
handles that fine.

"I don't know by heart what elements will be serialized into the first
4KB of a rev file in logical addressing mode."

I'll mention that a great many of these rev files are smaller than 4 KB, so
none of their original data survives.

"Why would you need to /manually/ create a rev file with original data?
You can use 'svn commit' to create rev files (on top of the old, good
backup). I'd have thought you'd focus on trying to extract data from
the partially-corrupted rev files (e.g., reconstruct the fulltexts of
reps where it's possible to do so)."

The old, good backup goes through rev 88214, while the original repo goes
through rev 241130, a difference of 152916 revisions. None of those
revisions work. I assume that is because of the ~4 KB of damage at the
beginning of each file: as far as I know, nothing can read those files
automatically, and Subversion will not show any data or verify any
revisions once it hits the ransomware-affected files.

So I have been investigating a process to create rev files that include the
remaining original data from the revs, so that they are functional within
Subversion. If I can find a process to do that, then I can write a program
that will execute that over all the ransomware-affected original revisions
88215 thru 241130. I am comfortable writing programs that process raw data
from files in different ways. The plan would be to process all those files,
output new files (completely outside of Subversion), and then access that
within Subversion to check it. If something didn't work, I would rework the
program and run it again.
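
A minimal Python sketch of how such a batch run might locate its input files. It assumes an unpacked, sharded FSFS layout with 1000 revisions per shard, which should be checked against the "layout sharded" line in the repo's db/format file. Packed shards are stored differently and are not handled here. The function names and the repository path are hypothetical.

```python
import os


def rev_file_path(repo_root: str, rev: int, shard_size: int = 1000) -> str:
    """Path of an unpacked rev file: db/revs/<rev // shard_size>/<rev>."""
    shard = rev // shard_size
    return os.path.join(repo_root, "db", "revs", str(shard), str(rev))


def affected_rev_files(repo_root: str, first: int = 88215, last: int = 241130,
                       shard_size: int = 1000):
    """Yield (rev, path, exists) for every revision in [first, last]."""
    for rev in range(first, last + 1):
        path = rev_file_path(repo_root, rev, shard_size)
        yield rev, path, os.path.exists(path)


# Hypothetical repository location, for demonstration only.
print(rev_file_path("/path/to/damaged-repo", 88215))
missing = sum(1 for _, _, exists in
              affected_rev_files("/path/to/damaged-repo", 88215, 88220)
              if not exists)
print("missing in sample range:", missing)
```

Writing the regenerated files to a separate output directory with the same shard structure keeps the damaged originals untouched between reruns.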

I just started with an "empty" revision so that I would first know I can
satisfy Subversion's minimum requirements for a revision (revprop and rev
files). Someone related to the original project gave me an example repo for
this purpose: they created a revision with a single change, used a dump
filter to filter out the contents of that revision, and made a new repo
from the result. I then looked at that revision's file in a hex editor.
I've been looking at lots of files in a hex editor here.

Now, I am completely new to Subversion as of this project, so if there is a
better way to do this using Subversion, I'm certainly open to that! If I
were to do an "svn commit", how would I include original data from the
damaged repo?

My thought was these rev files include "reps" units, and those units are
how I would include the original data in newly created rev files.

"(e.g., reconstruct the fulltexts of reps where it's possible to do so)"

Hmm, I don't know what "the fulltexts" means.

I am also learning about SVNKit at the same time. I was actually looking
there to try to figure out the two hashes in the footer of the rev files,
but it might also be useful for this process.

"[note: this means it's possible for rN+M of a file to be recoverable even
if rN's rep is lost].
...
In principle, you can even dive down this rabbit hole of abstractions to
recover data from the surviving tail ends of partially-overwritten reps."

That is intriguing to know. But as for diving down rabbit holes, I likely
won't want to do that if it requires manual work per revision, or if it
requires a lot of coding work with very little to gain. At this point I
would love to get something that works and also contains some original data
so that I know this is feasible. Then if I can improve from there, great.

"The rev files you get by default have bells and whistles turned on. For
instance, they use DELTA and self-DELTA reps even though it's a lot
easier to fabricate a PLAIN rep, and you can use PLAIN anywhere you can
use DELTA."

"For this reason, I'd recommend to try to create a 1.1-era rev file
first. Pass «--compatible-version=1.1 --fs-type=fsfs» to «svna

Re: Questions about setting up Subversion Server

2022-11-28 Thread Felix Natter

hello Daniel,

many thanks for your answers.

Best Regards,

Felix

On 23.11.22 22:21, Daniel Shahaf wrote:

Felix Natter wrote on Tue, Nov 15, 2022 at 09:03:49 +0100:

- In my svnserve wrapper I call this:

exec /usr/bin/svnserve "$@" -r /repos

-> Does the (Linux-) system automatically call this with "-t"?


The -t is added by the client when it invokes the tunnel command
(ssh(1)).  This is documented in the [tunnels] section of the default
~/.subversion/config file and in Chapter 6 of the book (just search for
"-t").


- If I use ssh auth (svn+ssh://), do I have to change the options in
  /svnserve.conf, like anon-access=none? (I just want access based on file
  system permissions, and no anon access)


If you only ever run svnserve with -t, then the value of anon-access
shouldn't matter.

I'd probably still set it to «none» just in case.


- I am migrating from a 3rd party server which used a different directory
  structure and system users (no LDAP), but also svn+ssh://.
  Can I keep all the history, even if some old users on the 3rd party server
  are not available in the new LDAP auth? (Of course, the user names of
  existing users are kept.)
  In other words: does an SVN server store the user information (history) as
  strings or as something else that is linked to accounts?


As strings, in the svn:author revprop.


- I am using strict permissions for two (LDAP) groups in the base
  directories for the repos (/repos/students, /repos/employees). The Howto I
  used [1] uses "umask 002" in the wrapper. I think for my case (groups)
  "007" is sufficient, right?
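
To make the difference between the two umask values concrete, here is a small Python sketch of the standard umask arithmetic (requested mode AND NOT umask). The function name effective_mode is hypothetical, and the sketch only shows the resulting permission bits. It does not say which value is right for a given setup.

```python
def effective_mode(requested: int, umask: int) -> int:
    """Permission bits a new file or directory gets under a given umask."""
    return requested & ~umask & 0o777


for umask in (0o002, 0o007):
    file_mode = effective_mode(0o666, umask)  # typical request for files
    dir_mode = effective_mode(0o777, umask)   # typical request for directories
    print(f"umask {umask:03o}: files {file_mode:03o}, dirs {dir_mode:03o}")
```

With 002, new files stay readable by users outside the group. With 007, they get no access for "other" at all, while owner and group keep read and write.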

[1] https://www.startupcto.com/server-tech/subversion/setting-up-svn

Please CC: me, as I am not subscribed.

Many Thanks and Best Regards!

Felix

--

*SIDACT GmbH
Simulation Data Analysis and
Compression Technologies
*
*Felix Natter*
/Software Developer /

Auguststraße 29
53229 Bonn
Germany

Phone    :   +49 228 5348 0430
Direct   :   +49 228 4097 7118
Email    :felix.nat...@sidact.com
Web  :http://www.sidact.com/
