[ moving to dev@ ]
Following up on a discussion on the users list about the lack of a way
to easily find the rev number in which a file was deleted...
Already referred to issue #3627 (FS API support for oldest-to-youngest
history traversal) and FS-NG, as mentioned on the roadmap. But the
discussion continued about why this is so hard right now, and whether
there are alternative approaches. See below...
On Mon, Nov 29, 2010 at 3:51 AM, Daniel Shahaf d...@daniel.shahaf.name wrote:
Johan Corveleyn wrote on Sun, Nov 28, 2010 at 21:20:28 +0100:
On Sun, Nov 28, 2010 at 6:35 PM, Daniel Shahaf d...@daniel.shahaf.name
wrote:
Stefan Sperling wrote on Sun, Nov 28, 2010 at 16:48:30 +0100:
The real problem is that we want to be able to answer these questions
very fast, and some design aspects work against this. For instance,
FSFS by design does not allow modifying old revisions. So where do
we store the copy-to information for a given p...@n?
copy-to information is immutable (never changes once created), so we
could add another hierarchy (parallel to revs/ and revprops/) in which
to store that information. Any 'cp f...@n bar' operation would need to
create/append a file in that hierarchy.
Open question: how to organize $new_hierarchy/16/16384/** to make it
efficiently appendable and queryable (and for what queries? Iterating
over all copied-to places is one).
Makes sense?
I'm not sure. But there is another alternative: while we wait for
FS-NG (or another solution like you propose), one could implement the
slow algorithm within the current design.
Are you advocating to implement it in the core (as an svn_fs_* API) or
as a third-party script? The latter is certainly fine, but regarding
the former I don't see the point of adding an API that cannot be
implemented efficiently at this time.
Why not in the core? "We can't do this quickly, so we don't do it" is
not a very strong argument against having this very useful
functionality, IMHO.
Having it in the core is vastly more useful for people like me (and my
colleagues): works on Windows, regardless of whether or not one has
perl/python installed, no need to distribute an additional script,
guaranteed to be available everywhere an svn client is installed, ...
It's actually quite similar to the way blame is implemented
currently: we don't really have the design (line-based information) to
do this quickly, but we calculate it from the other information that
we have available (in a way that could also be done by a script on the
client: diffing every interesting revision against the next,
remembering the lines that were added/removed in every step). Can you
imagine not having blame in svn core just because we can't do it
quickly? Ok, blame may be a more important use case than finding the
rev number where a file was deleted, but still ...
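To illustrate the comparison, here is a rough Python sketch of that blame-style calculation: diff each interesting revision against the next and remember which revision introduced each surviving line. It is not how svn implements blame; naive_blame and its input format (a list of (revision, lines) pairs, oldest first) are made up for this example, and the diffing is done with Python's difflib.

```python
import difflib


def naive_blame(revisions):
    """Attribute each line of the youngest revision to a revision number.

    `revisions` is a hypothetical input format: a list of
    (rev_number, list_of_lines) pairs, ordered oldest to youngest.
    Returns a list of (rev_number, line) pairs for the youngest content.
    """
    annotated = []   # (rev, line) pairs for the previously processed revision
    prev_lines = []  # plain lines of the previously processed revision
    for rev, lines in revisions:
        matcher = difflib.SequenceMatcher(a=prev_lines, b=lines, autojunk=False)
        new_annotated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines keep the revision they were attributed to.
                new_annotated.extend(annotated[i1:i2])
            else:
                # Inserted or replaced lines are attributed to this revision;
                # for a pure deletion the range j1:j2 is empty.
                new_annotated.extend((rev, line) for line in lines[j1:j2])
        annotated = new_annotated
        prev_lines = lines
    return annotated
```

The cost grows with the number of revisions that have to be diffed, which is the sense in which this is computed from other available information rather than looked up.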
So I still think it's definitely worth it to have this in the core and
offer an API, and implement it slowly now because that's the only way
we can do it (besides, I don't think it will be *that* slow). And
optimize it later when we have FS-NG, or another way to retrieve
this info quickly...
However, all of that doesn't change the fact that someone
still needs to implement it, and I must admit I don't have the cycles
for that currently :-(.
Cheers,
Johan
Just automating what a
user (or script) currently does when looking for this information,
i.e. a binary search.
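As a rough sketch of what such a binary search could look like as a client-side Python script (not a proposed svn API): find_deletion_rev and exists_via_svn are hypothetical names. The search assumes the path exists at a known revision, is gone at HEAD, and was deleted only once in between. The svn-based check assumes that `svn info -r REV URL@PEG` fails once the path can no longer be traced from PEG to REV.

```python
import subprocess
from typing import Callable


def find_deletion_rev(exists_at: Callable[[int], bool],
                      known_present: int, head: int) -> int:
    """Return the first revision in (known_present, head] where the path is gone.

    Assumes a single deletion between known_present and head
    (no delete/re-add cycles), otherwise the result is just one
    of the revisions where the path disappears.
    """
    if not exists_at(known_present):
        raise ValueError("path does not exist at known_present")
    if exists_at(head):
        raise ValueError("path still exists at head")
    lo, hi = known_present, head  # invariant: present at lo, missing at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if exists_at(mid):
            lo = mid
        else:
            hi = mid
    return hi


def exists_via_svn(url: str, peg: int) -> Callable[[int], bool]:
    """Build an existence check that shells out to the svn client.

    Assumption: a non-zero exit status means the path is missing at
    that revision. Any other failure (network, authentication) would
    be misread as "missing", so this is only a sketch.
    """
    def check(rev: int) -> bool:
        result = subprocess.run(
            ["svn", "info", "-r", str(rev), f"{url}@{peg}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0
    return check
```

Each probe is one round trip to the repository, so the search needs roughly log2(head - known_present) of them.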
Of course it would be slow, but it would certainly already provide
value. At the very least, it saves users a lot of time searching FAQs
and list archives, wondering why this doesn't work, understanding the
design limitations, and then finally implementing their own script or
doing a one-time manual search.
Then, when FS-NG arrives, or someone comes up with a way to index this
information, it can be optimized.
I don't know if there would be fundamental problems with that, apart
from the fact that someone still needs to implement it of course ...
Cheers,
--
Johan