Re: Google Summer of Code 2013 (GSoC13)

2013-02-25 Thread Florian Achleitner
[corrected David Barr's address]
On Monday 18 February 2013 12:42:39 Jeff King wrote:
> And I do not want to blame the students here (some of whom are on the cc
> list). They are certainly under no obligation to stick around after
> GSoC ends, and I know they have many demands on their time. But I am
> also thinking about what Git wants to get out of GSoC (and to my mind,
> the most important thing is contributors).

Just a little comment from another student:
Last year I worked on the 'remote helper for svn'. My official mentor was David
Barr, but I interacted most with Jonathan Nieder.

From my point of view I wouldn't say the project was a failure. It was harder
than I originally thought, yes. That happens.
But we have a remote helper in master now (remote-testsvn), although it's far
from complete and its development is quite stalled.

About sticking around:
As you can see I read the list (I was not on CC), but not very regularly, I 
admit. Anyways, I'd respond to mails in CC or on IRC.

During the summer I believe I learned git's development process quite well. I
rerolled my main patch series 8 times until the 19th of September, which is well
beyond the GSoC deadline. I tried to get it finished before concentrating on my
studies again.

If I were to continue contributing now, it would be on a completely new topic
(like branch mapping) and would take a lot of time that I don't have during the
year, when I have to push my studies forward.
For a student, one aspect of GSoC is also quite important: it is a cool and
demanding summer job during the holidays, but it has to ramp down when the new
semester starts.

Anyways, I think GSoC is a great idea and I enjoyed contributing to git a lot;
I would immediately do it again. Keep it goin'!
Thanks.

Florian
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] remote-testsvn: fix unitialized variable

2012-12-15 Thread Florian Achleitner
On Friday 14 December 2012 17:11:44 Jeff King wrote:

> [...]
> We can fix it by returning "-1" when no note is found (so on
> a zero return, we always found a valid value).

Good fix. Parsing of the note now always fails if the note doesn't contain the 
expected string, as it should.
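
The behaviour described above (a zero return only when a valid value was found) can be sketched outside of git as follows. This is a minimal standalone sketch, not the code in remote-testsvn.c: the names parse_rev_note_sketch and struct rev_note_sketch are made up for illustration, strncmp() stands in for git's prefixcmp(), and the number is parsed in base 10 only.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the patch's struct rev_note. */
struct rev_note_sketch {
	uint32_t rev_nr;
};

/*
 * Scan msg line by line for "Revision-number: <n>".
 * Returns 0 and fills res only if a valid number was found;
 * returns -1 if the key is missing or its value is unusable.
 */
int parse_rev_note_sketch(const char *msg, struct rev_note_sketch *res)
{
	static const char key[] = "Revision-number: ";
	const size_t keylen = sizeof(key) - 1;

	assert(msg != NULL && res != NULL);
	while (*msg) {
		const char *eol = strchr(msg, '\n');
		size_t len = eol ? (size_t)(eol - msg) : strlen(msg);

		if (len >= keylen && !strncmp(msg, key, keylen)) {
			const char *value = msg + keylen;
			char *end;
			long long i;

			/* require a digit so strtoll cannot skip into the next line */
			if (!isdigit((unsigned char)*value))
				return -1;
			errno = 0;
			i = strtoll(value, &end, 10);
			if (errno || i < 0 || i > UINT32_MAX)
				return -1;
			res->rev_nr = (uint32_t)i;
			return 0;
		}
		if (!eol)
			break;
		msg = eol + 1;
	}
	return -1; /* didn't find it */
}
```

With this shape a caller no longer has to pre-initialize the result: any note without a usable "Revision-number" line is reported as an error.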

> 
> Signed-off-by: Jeff King 
> ---
> I think this is the right fix, but I am not too familiar with this code,
> so I might be missing a case where a missing "Revision-number" should
> provide some sentinel value (like "0") instead of returning an error. In
> fact, of the two callsites, one already does such a zero-initialization.
> 
>  remote-testsvn.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/remote-testsvn.c b/remote-testsvn.c
> index 51fba05..5ddf11c 100644
> --- a/remote-testsvn.c
> +++ b/remote-testsvn.c
> @@ -90,10 +90,12 @@ static int parse_rev_note(const char *msg, struct rev_note *res)
>   if (end == value || i < 0 || i > UINT32_MAX)
>   return -1;
>   res->rev_nr = i;
> + return 0;
>   }
>   msg += len + 1;
>   }
> - return 0;
> + /* didn't find it */
> + return -1;
>  }
> 
>  static int note2mark_cb(const unsigned char *object_sha1,


Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-30 Thread Florian Achleitner
Sorry for reacting so late, I didn't read the list carefully in the last weeks 
and my gmail filter somehow didn't trigger on that.

On Tuesday 02 October 2012 16:20:22 Junio C Hamano wrote:
> * fa/remote-svn (2012-09-19) 16 commits
>  - Add a test script for remote-svn
>  - remote-svn: add marks-file regeneration
>  - Add a svnrdump-simulator replaying a dump file for testing
>  - remote-svn: add incremental import
>  - remote-svn: Activate import/export-marks for fast-import
>  - Create a note for every imported commit containing svn metadata
>  - vcs-svn: add fast_export_note to create notes
>  - Allow reading svn dumps from files via file:// urls
>  - remote-svn, vcs-svn: Enable fetching to private refs
>  - When debug==1, start fast-import with "--stats" instead of "--quiet"
>  - Add documentation for the 'bidi-import' capability of remote-helpers
>  - Connect fast-import to the remote-helper via pipe, adding 'bidi-import' capability
>  - Add argv_array_detach and argv_array_free_detached
>  - Add svndump_init_fd to allow reading dumps from arbitrary FDs
>  - Add git-remote-testsvn to Makefile
>  - Implement a remote helper for svn in C
>  (this branch is used by fa/vcs-svn.)
> 
>  A GSoC project.
>  Waiting for comments from mentors and stakeholders.

From my point of view, this is rather complete. It got eight review cycles on
the list.
Note that the remote helper can only fetch; pushing is not possible at all.

> 
> 
> * fa/vcs-svn (2012-09-19) 4 commits
>  - vcs-svn: remove repo_tree
>  - vcs-svn/svndump: rewrite handle_node(), begin|end_revision()
>  - vcs-svn/svndump: restructure node_ctx, rev_ctx handling
>  - svndump: move struct definitions to .h
>  (this branch uses fa/remote-svn.)
> 
>  A GSoC project.
>  Waiting for comments from mentors and stakeholders.

This is the result of what I did when I wanted to start implementing branch 
detection. I found that the existing code is not suitable and restructured it.

The main goal is to separate svn revision parsing from git commit creation:
to create a commit, you need to know on which branch to create it, but to
find out which branch is the right one, you first need to read the complete
svn revision to see which dirs are changed and how.
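
To illustrate why the whole revision has to be read first, here is a rough standalone sketch of a branch decision made over all nodes of one revision. It is not part of the series: the struct, the function name, and the rule itself (the first path under "branches/<name>/" wins, everything else goes to "master") are assumptions based on the conventional svn layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, minimal per-node record; the real node_ctx holds far more. */
struct node_sketch {
	const char *path;
};

/*
 * Pick a branch name for one revision by looking at all of its node paths.
 * Assumed rule: the first path under "branches/<name>" selects <name>;
 * otherwise the revision goes to "master".
 * Returns 0 on success, -1 if out is too small.
 */
int pick_branch_sketch(const struct node_sketch *nodes, size_t n,
		       char *out, size_t outsz)
{
	static const char prefix[] = "branches/";
	const size_t plen = sizeof(prefix) - 1;
	size_t i;

	assert(nodes != NULL && out != NULL && outsz > 0);
	for (i = 0; i < n; i++) {
		const char *p = nodes[i].path;

		if (!strncmp(p, prefix, plen) && p[plen] != '\0' && p[plen] != '/') {
			size_t len = strcspn(p + plen, "/");

			if (len >= outsz)
				return -1;
			memcpy(out, p + plen, len);
			out[len] = '\0';
			return 0;
		}
	}
	if (outsz < sizeof("master"))
		return -1;
	memcpy(out, "master", sizeof("master"));
	return 0;
}
```

The point of the sketch is only the calling order: the decision needs the complete node list, so the commit cannot be started when the first line of the revision is seen.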

It is rather invasive and it doesn't make sense without using it later on.
So I'm not surprised that you may not like it.
Anyways it passes all existing tests (that doesn't mean it's good of course 
;))

Florian


[PATCH v8 13/16] remote-svn: add incremental import

2012-09-19 Thread Florian Achleitner
Search for a note attached to the ref to update and read its
'Revision-number:' line. Start import from the next svn revision.

If there is no next revision in the svn repo, svnrdump terminates with
a message on stderr and a non-zero return value. This looks a little
weird, but there is no other way to know whether there is a new
revision in the svn repo.

On the start of an incremental import, the parent of the first commit
in the fast-import stream is set to the branch name to update. All
following commits specify their parent by a mark number. Previous mark
files are currently not reused.
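
The parent selection described above can be sketched as a small formatting helper. This is an illustration only, not the code in vcs-svn/fast_export.c; the function name and parameters are hypothetical. The 'from <ref>^0' and 'from :<mark>' forms are the ones documented for git fast-import.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Write the fast-import "from" line for a commit into buf.
 * first_commit: non-zero for the first commit of this import run.
 * branch: ref being updated (used only for the first commit).
 * prev_mark: mark of the preceding commit (used for all later commits).
 * Returns the snprintf() result.
 */
int format_from_line(char *buf, size_t size, int first_commit,
		     const char *branch, uint32_t prev_mark)
{
	assert(buf != NULL && branch != NULL);
	if (first_commit)
		/* "^0" lets fast-import continue a branch from its current tip */
		return snprintf(buf, size, "from %s^0\n", branch);
	return snprintf(buf, size, "from :%" PRIu32 "\n", prev_mark);
}
```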

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
diff:
- style
- improve error detection while reading notes
- strtol instead of atol
- separate strbufs in main

 contrib/svn-fe/svn-fe.c |3 +-
 remote-testsvn.c|   79 ---
 test-svn-fe.c   |2 +-
 vcs-svn/fast_export.c   |   10 --
 vcs-svn/fast_export.h   |6 ++--
 vcs-svn/svndump.c   |   10 +++---
 vcs-svn/svndump.h   |2 +-
 7 files changed, 95 insertions(+), 17 deletions(-)

diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.c
index c796cc0..f363505 100644
--- a/contrib/svn-fe/svn-fe.c
+++ b/contrib/svn-fe/svn-fe.c
@@ -10,7 +10,8 @@ int main(int argc, char **argv)
 {
if (svndump_init(NULL))
return 1;
-   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master");
+   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master",
+   "refs/notes/svn/revs");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/remote-testsvn.c b/remote-testsvn.c
index 45eba9f..b741f6d 100644
--- a/remote-testsvn.c
+++ b/remote-testsvn.c
@@ -12,7 +12,8 @@ static const char *url;
 static int dump_from_file;
 static const char *private_ref;
 static const char *remote_ref = "refs/heads/master";
-static const char *marksfilename;
+static const char *marksfilename, *notes_ref;
+struct rev_note { unsigned int rev_nr; };
 
 static int cmd_capabilities(const char *line);
 static int cmd_import(const char *line);
@@ -48,14 +49,79 @@ static void terminate_batch(void)
fflush(stdout);
 }
 
+/* NOTE: 'ref' refers to a git reference, while 'rev' refers to a svn revision. */
+static char *read_ref_note(const unsigned char sha1[20])
+{
+   const unsigned char *note_sha1;
+   char *msg = NULL;
+   unsigned long msglen;
+   enum object_type type;
+
+   init_notes(NULL, notes_ref, NULL, 0);
+   if (!(note_sha1 = get_note(NULL, sha1)))
+   return NULL;/* note tree not found */
+   if (!(msg = read_sha1_file(note_sha1, &type, &msglen)))
+   error("Empty notes tree. %s", notes_ref);
+   else if (!msglen || type != OBJ_BLOB) {
+   error("Note contains unusable content. "
+   "Is something else using this notes tree? %s", notes_ref);
+   free(msg);
+   msg = NULL;
+   }
+   free_notes(NULL);
+   return msg;
+}
+
+static int parse_rev_note(const char *msg, struct rev_note *res)
+{
+   const char *key, *value, *end;
+   size_t len;
+
+   while (*msg) {
+   end = strchr(msg, '\n');
+   len = end ? end - msg : strlen(msg);
+
+   key = "Revision-number: ";
+   if (!prefixcmp(msg, key)) {
+   long i;
+   char *end;
+   value = msg + strlen(key);
+   i = strtol(value, &end, 0);
+   if (end == value || i < 0 || i > UINT32_MAX)
+   return -1;
+   res->rev_nr = i;
+   }
+   msg += len + 1;
+   }
+   return 0;
+}
+
 static int cmd_import(const char *line)
 {
int code;
int dumpin_fd;
-   unsigned int startrev = 0;
+   char *note_msg;
+   unsigned char head_sha1[20];
+   unsigned int startrev;
struct argv_array svndump_argv = ARGV_ARRAY_INIT;
struct child_process svndump_proc;
 
+   if (read_ref(private_ref, head_sha1))
+   startrev = 0;
+   else {
+   note_msg = read_ref_note(head_sha1);
+   if(note_msg == NULL) {
+   warning("No note found for %s.", private_ref);
+   startrev = 0;
+   } else {
+   struct rev_note note = { 0 };
+   if (parse_rev_note(note_msg, &note))
+   die("Revision number couldn't be parsed from note.");
+   startrev = note.rev_nr + 1;
+   free(note_msg);
+   }
+   }
+
if (dump_from_file)

[PATCH v8 11/16] Create a note for every imported commit containing svn metadata

2012-09-19 Thread Florian Achleitner
To provide metadata from svn dumps for further processing, e.g.
branch detection, attach a note to each imported commit that stores
additional information.  The notes are currently hard-coded in
refs/notes/svn/revs.  Currently the following lines from the svn dump
are directly accumulated in the note. This can be refined as needed.

 - "Revision-number"
 - "Node-path"
 - "Node-kind"
 - "Node-action"
 - "Node-copyfrom-path"
 - "Node-copyfrom-rev"
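
As a rough illustration of the selection above, the following standalone sketch decides whether a dump header line would be copied into the note. The key list is taken from this message; the function name is hypothetical, and the real patch appends the lines to a strbuf inside svndump_read() rather than going through a filter like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Header keys the commit message lists as being copied into the note. */
static const char *const note_keys[] = {
	"Revision-number", "Node-path", "Node-kind", "Node-action",
	"Node-copyfrom-path", "Node-copyfrom-rev",
};

/* Return 1 if a dump header line "Key: value" belongs in the note, else 0. */
int is_note_line(const char *line)
{
	size_t i;

	assert(line != NULL);
	for (i = 0; i < sizeof(note_keys) / sizeof(note_keys[0]); i++) {
		size_t klen = strlen(note_keys[i]);

		/* the key must be followed directly by ':' to count as a match */
		if (!strncmp(line, note_keys[i], klen) && line[klen] == ':')
			return 1;
	}
	return 0;
}
```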

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 vcs-svn/fast_export.c |   14 --
 vcs-svn/fast_export.h |2 ++
 vcs-svn/svndump.c |   21 +++--
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 1ecae4b..df51c59 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -3,8 +3,7 @@
  * See LICENSE for details.
  */
 
-#include "git-compat-util.h"
-#include "strbuf.h"
+#include "cache.h"
 #include "quote.h"
 #include "fast_export.h"
 #include "repo_tree.h"
@@ -68,6 +67,17 @@ void fast_export_modify(const char *path, uint32_t mode, const char *dataref)
putchar('\n');
 }
 
+void fast_export_begin_note(uint32_t revision, const char *author,
+   const char *log, unsigned long timestamp)
+{
+   size_t loglen = strlen(log);
+   printf("commit refs/notes/svn/revs\n");
+   printf("committer %s <%s@%s> %ld +0000\n", author, author, "local", timestamp);
+   printf("data %"PRIuMAX"\n", (uintmax_t)loglen);
+   fwrite(log, loglen, 1, stdout);
+   fputc('\n', stdout);
+}
+
 void fast_export_note(const char *committish, const char *dataref)
 {
printf("N %s %s\n", dataref, committish);
diff --git a/vcs-svn/fast_export.h b/vcs-svn/fast_export.h
index 9b32f1e..c2f6f11 100644
--- a/vcs-svn/fast_export.h
+++ b/vcs-svn/fast_export.h
@@ -10,6 +10,8 @@ void fast_export_deinit(void);
 void fast_export_delete(const char *path);
 void fast_export_modify(const char *path, uint32_t mode, const char *dataref);
 void fast_export_note(const char *committish, const char *dataref);
+void fast_export_begin_note(uint32_t revision, const char *author,
+   const char *log, unsigned long timestamp);
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log, const char *uuid,
const char *url, unsigned long timestamp, const char *local_ref);
diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index c8a5b7e..7ec1a5b 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -48,7 +48,7 @@ static struct {
 static struct {
uint32_t revision;
unsigned long timestamp;
-   struct strbuf log, author;
+   struct strbuf log, author, note;
 } rev_ctx;
 
 static struct {
@@ -77,6 +77,7 @@ static void reset_rev_ctx(uint32_t revision)
rev_ctx.timestamp = 0;
strbuf_reset(&rev_ctx.log);
strbuf_reset(&rev_ctx.author);
+   strbuf_reset(&rev_ctx.note);
 }
 
 static void reset_dump_ctx(const char *url)
@@ -310,8 +311,15 @@ static void begin_revision(const char *remote_ref)
 
 static void end_revision(void)
 {
-   if (rev_ctx.revision)
+   struct strbuf mark = STRBUF_INIT;
+   if (rev_ctx.revision) {
fast_export_end_commit(rev_ctx.revision);
+   fast_export_begin_note(rev_ctx.revision, "remote-svn",
+   "Note created by remote-svn.", rev_ctx.timestamp);
+   strbuf_addf(&mark, ":%"PRIu32, rev_ctx.revision);
+   fast_export_note(mark.buf, "inline");
+   fast_export_buf_to_data(&rev_ctx.note);
+   }
 }
 
 void svndump_read(const char *url, const char *local_ref)
@@ -358,6 +366,7 @@ void svndump_read(const char *url, const char *local_ref)
end_revision();
active_ctx = REV_CTX;
reset_rev_ctx(atoi(val));
+   strbuf_addf(&rev_ctx.note, "%s\n", t);
break;
case sizeof("Node-path"):
if (constcmp(t, "Node-"))
@@ -369,10 +378,12 @@ void svndump_read(const char *url, const char *local_ref)
begin_revision(local_ref);
active_ctx = NODE_CTX;
reset_node_ctx(val);
+   strbuf_addf(&rev_ctx.note, "%s\n", t);
break;
}
if (constcmp(t + strlen("Node-"), "kind"))
continue;
+  

[PATCH v8 09/16] Allow reading svn dumps from files via file:// urls

2012-09-19 Thread Florian Achleitner
For testing as well as for importing large, already available dumps,
it's useful to bypass svnrdump and replay the svndump from a file
directly.

Add support for file:// urls in the remote url, e.g.

  svn::file:///path/to/dump

When the remote helper finds a URL starting with file:// it tries to
open that file instead of invoking svnrdump.
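
A minimal standalone sketch of that check, with a hypothetical function name. The patch itself uses git's prefixcmp() and url_decode(); this sketch uses plain strncmp() and does no percent-decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * If url starts with "file://", return a pointer to the path that follows;
 * otherwise return NULL, meaning the caller should invoke svnrdump.
 * No percent-decoding is done in this sketch.
 */
const char *dump_file_path(const char *url)
{
	static const char scheme[] = "file://";

	assert(url != NULL);
	if (strncmp(url, scheme, sizeof(scheme) - 1))
		return NULL;
	return url + sizeof(scheme) - 1;
}
```

For the example URL above, the path handed to open() would be /path/to/dump.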

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
diff:
- style
- separate strbufs in main.

 remote-testsvn.c |   52 +---
 1 file changed, 33 insertions(+), 19 deletions(-)

diff --git a/remote-testsvn.c b/remote-testsvn.c
index c30ffcd..67466a9 100644
--- a/remote-testsvn.c
+++ b/remote-testsvn.c
@@ -9,6 +9,7 @@
 #include "argv-array.h"
 
 static const char *url;
+static int dump_from_file;
 static const char *private_ref;
 static const char *remote_ref = "refs/heads/master";
 
@@ -54,29 +55,36 @@ static int cmd_import(const char *line)
struct argv_array svndump_argv = ARGV_ARRAY_INIT;
struct child_process svndump_proc;
 
-   memset(&svndump_proc, 0, sizeof(struct child_process));
-   svndump_proc.out = -1;
-   argv_array_push(&svndump_argv, "svnrdump");
-   argv_array_push(&svndump_argv, "dump");
-   argv_array_push(&svndump_argv, url);
-   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
-   svndump_proc.argv = svndump_argv.argv;
-
-   code = start_command(&svndump_proc);
-   if (code)
-   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
-   dumpin_fd = svndump_proc.out;
-
+   if (dump_from_file) {
+   dumpin_fd = open(url, O_RDONLY);
+   if(dumpin_fd < 0)
+   die_errno("Couldn't open svn dump file %s.", url);
+   } else {
+   memset(&svndump_proc, 0, sizeof(struct child_process));
+   svndump_proc.out = -1;
+   argv_array_push(&svndump_argv, "svnrdump");
+   argv_array_push(&svndump_argv, "dump");
+   argv_array_push(&svndump_argv, url);
+   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
+   svndump_proc.argv = svndump_argv.argv;
+
+   code = start_command(&svndump_proc);
+   if (code)
+   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
+   dumpin_fd = svndump_proc.out;
+   }
svndump_init_fd(dumpin_fd, STDIN_FILENO);
svndump_read(url, private_ref);
svndump_deinit();
svndump_reset();
 
close(dumpin_fd);
-   code = finish_command(&svndump_proc);
-   if (code)
-   warning("%s, returned %d", svndump_proc.argv[0], code);
-   argv_array_clear(&svndump_argv);
+   if (!dump_from_file) {
+   code = finish_command(&svndump_proc);
+   if (code)
+   warning("%s, returned %d", svndump_proc.argv[0], code);
+   argv_array_clear(&svndump_argv);
+   }
 
return 0;
 }
@@ -151,8 +159,14 @@ int main(int argc, const char **argv)
remote = remote_get(argv[1]);
url_in = (argc == 3) ? argv[2] : remote->url[0];
 
-   end_url_with_slash(&url_sb, url_in);
-   url = url_sb.buf;
+   if (!prefixcmp(url_in, "file://")) {
+   dump_from_file = 1;
+   url = url_decode(url_in + sizeof("file://")-1);
+   } else {
+   dump_from_file = 0;
+   end_url_with_slash(&url_sb, url_in);
+   url = url_sb.buf;
+   }
 
strbuf_addf(&private_ref_sb, "refs/svn/%s/master", remote->name);
private_ref = private_ref_sb.buf;
-- 
1.7.9.5



[PATCH v8 08/16] remote-svn, vcs-svn: Enable fetching to private refs

2012-09-19 Thread Florian Achleitner
The ref updated by the fast-import stream is hard-coded.  When
fetching from a remote, the remote-helper should update refs in a
private namespace, i.e. a private subdir of refs/.  This namespace is
defined by the 'refspec' capability that the remote-helper advertises
in reply to the 'capabilities' command.

Extend svndump and fast-export to allow passing the target ref.
Update svn-fe to be compatible.
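
For illustration, the advertised refspec line could be built like this. It is a sketch with a hypothetical helper name, assuming the private ref has the form refs/svn/<remote name>/master as set up in remote-testsvn.c. The real helper simply printf()s the line, followed by the blank line that ends the capabilities list.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format the 'refspec' capability line mapping remote_ref to the private
 * ref refs/svn/<remote_name>/master.
 * Returns the line length, or -1 on output error or truncation.
 */
int format_refspec(char *buf, size_t size, const char *remote_ref,
		   const char *remote_name)
{
	int n;

	assert(buf != NULL && remote_ref != NULL && remote_name != NULL);
	n = snprintf(buf, size, "refspec %s:refs/svn/%s/master\n",
		     remote_ref, remote_name);
	if (n < 0 || (size_t)n >= size)
		return -1; /* output error or truncated */
	return n;
}
```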

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
diff:
- remove glitch in function declaration.

 contrib/svn-fe/svn-fe.c |2 +-
 test-svn-fe.c   |2 +-
 vcs-svn/fast_export.c   |4 ++--
 vcs-svn/fast_export.h   |2 +-
 vcs-svn/svndump.c   |   12 ++--
 vcs-svn/svndump.h   |2 +-
 6 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.c
index 35db24f..c796cc0 100644
--- a/contrib/svn-fe/svn-fe.c
+++ b/contrib/svn-fe/svn-fe.c
@@ -10,7 +10,7 @@ int main(int argc, char **argv)
 {
if (svndump_init(NULL))
return 1;
-   svndump_read((argc > 1) ? argv[1] : NULL);
+   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/test-svn-fe.c b/test-svn-fe.c
index 83633a2..cb0d80f 100644
--- a/test-svn-fe.c
+++ b/test-svn-fe.c
@@ -40,7 +40,7 @@ int main(int argc, char *argv[])
if (argc == 2) {
if (svndump_init(argv[1]))
return 1;
-   svndump_read(NULL);
+   svndump_read(NULL, "refs/heads/master");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 1f04697..11f8f94 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -72,7 +72,7 @@ static char gitsvnline[MAX_GITSVN_LINE_LEN];
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log,
const char *uuid, const char *url,
-   unsigned long timestamp)
+   unsigned long timestamp, const char *local_ref)
 {
static const struct strbuf empty = STRBUF_INIT;
if (!log)
@@ -84,7 +84,7 @@ void fast_export_begin_commit(uint32_t revision, const char *author,
} else {
*gitsvnline = '\0';
}
-   printf("commit refs/heads/master\n");
+   printf("commit %s\n", local_ref);
printf("mark :%"PRIu32"\n", revision);
printf("committer %s <%s@%s> %ld +0000\n",
   *author ? author : "nobody",
diff --git a/vcs-svn/fast_export.h b/vcs-svn/fast_export.h
index 8823aca..17eb13b 100644
--- a/vcs-svn/fast_export.h
+++ b/vcs-svn/fast_export.h
@@ -11,7 +11,7 @@ void fast_export_delete(const char *path);
 void fast_export_modify(const char *path, uint32_t mode, const char *dataref);
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log, const char *uuid,
-   const char *url, unsigned long timestamp);
+   const char *url, unsigned long timestamp, const char *local_ref);
 void fast_export_end_commit(uint32_t revision);
 void fast_export_data(uint32_t mode, off_t len, struct line_buffer *input);
 void fast_export_blob_delta(uint32_t mode,
diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index d81a078..c8a5b7e 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -299,13 +299,13 @@ static void handle_node(void)
node_ctx.text_length, &input);
 }
 
-static void begin_revision(void)
+static void begin_revision(const char *remote_ref)
 {
if (!rev_ctx.revision)  /* revision 0 gets no git commit. */
return;
fast_export_begin_commit(rev_ctx.revision, rev_ctx.author.buf,
&rev_ctx.log, dump_ctx.uuid.buf, dump_ctx.url.buf,
-   rev_ctx.timestamp);
+   rev_ctx.timestamp, remote_ref);
 }
 
 static void end_revision(void)
@@ -314,7 +314,7 @@ static void end_revision(void)
fast_export_end_commit(rev_ctx.revision);
 }
 
-void svndump_read(const char *url)
+void svndump_read(const char *url, const char *local_ref)
 {
char *val;
char *t;
@@ -353,7 +353,7 @@ void svndump_read(const char *url)
if (active_ctx == NODE_CTX)
handle_node();
if (active_ctx == REV_CTX)
-   begin_revision();
+   begin_revision(local_ref);
if (active_ctx != DUMP_CTX)
end_revision();
activ

[PATCH v8 06/16] Add documentation for the 'bidi-import' capability of remote-helpers

2012-09-19 Thread Florian Achleitner
Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
no diff

 Documentation/git-remote-helpers.txt |   21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-remote-helpers.txt b/Documentation/git-remote-helpers.txt
index f5836e4..5ce4cda 100644
--- a/Documentation/git-remote-helpers.txt
+++ b/Documentation/git-remote-helpers.txt
@@ -98,6 +98,20 @@ advertised with this capability must cover all refs reported by
 the list command.  If no 'refspec' capability is advertised,
 there is an implied `refspec *:*`.
 
+'bidi-import'::
+   The fast-import commands 'cat-blob' and 'ls' can be used by remote-helpers
+   to retrieve information about blobs and trees that already exist in
+   fast-import's memory. This requires a channel from fast-import to the
+   remote-helper.
+   If it is advertised in addition to "import", git establishes a pipe from
+   fast-import to the remote-helper's stdin.
+   It follows that git and fast-import are both connected to the
+   remote-helper's stdin. Because git can send multiple commands to
+   the remote-helper it is required that helpers that use 'bidi-import'
+   buffer all 'import' commands of a batch before sending data to fast-import.
+   This is to prevent mixing commands and fast-import responses on the
+   helper's stdin.
+
 Capabilities for Pushing
 
 'connect'::
@@ -286,7 +300,12 @@ terminated with a blank line. For each batch of 'import', the remote
 helper should produce a fast-import stream terminated by a 'done'
 command.
 +
-Supported if the helper has the "import" capability.
+Note that if the 'bidi-import' capability is used the complete batch
+sequence has to be buffered before starting to send data to fast-import
+to prevent mixing of commands and fast-import responses on the helper's
+stdin.
++
+Supported if the helper has the 'import' capability.
 
 'connect' ::
Connects to given service. Standard input and standard output
-- 
1.7.9.5



[PATCH v8 01/16] Implement a remote helper for svn in C

2012-09-19 Thread Florian Achleitner
Enable basic fetching from subversion repositories. When processing
remote URLs starting with testsvn::, git invokes this remote-helper.
It starts svnrdump to extract revisions from the subversion repository
in the 'dump file format', and converts them to a git-fast-import stream
using the functions of vcs-svn/.

Imported refs are created in a private namespace at
refs/svn/<remote-name>/master.  The revision history is imported
linearly (no branch detection) and completely, i.e. from revision 0 to
HEAD.

The 'bidi-import' capability is used. The remote-helper expects data
from fast-import on its stdin. It buffers a batch of 'import' command
lines in a string_list before starting to process them.
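
The buffering described above can be sketched without git's string_list. Everything here is hypothetical (fixed-size array, names, callback signature); it only shows the rule that no line is processed until the blank line that ends the batch has been read.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BATCH_MAX 16 /* arbitrary cap for this sketch */

struct batch_sketch {
	char *lines[BATCH_MAX];
	size_t nr;
};

/* Example handler: counts the lines it is given. */
void count_line(const char *line, void *data)
{
	(void)line;
	++*(int *)data;
}

/*
 * Feed one input line (without trailing newline) to the batch.
 * Non-empty lines are only buffered. An empty line ends the batch: every
 * buffered line is handed to fn, the buffer is cleared, and the number of
 * lines processed is returned. Returns 0 while buffering, -1 on overflow
 * or allocation failure.
 */
int batch_feed(struct batch_sketch *b, const char *line,
	       void (*fn)(const char *line, void *data), void *data)
{
	size_t i, done;

	assert(b != NULL && line != NULL && fn != NULL);
	if (*line) {
		char *copy;

		if (b->nr == BATCH_MAX)
			return -1;
		copy = malloc(strlen(line) + 1);
		if (!copy)
			return -1;
		strcpy(copy, line);
		b->lines[b->nr++] = copy;
		return 0;
	}
	for (i = 0; i < b->nr; i++) {
		fn(b->lines[i], data);
		free(b->lines[i]);
	}
	done = b->nr;
	b->nr = 0;
	return (int)done;
}
```

Delaying the handler calls this way is what keeps git's commands and fast-import's responses from interleaving on the helper's stdin.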

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
diff:
- style
- use separate strbufs instead of sharing one in main.

 remote-testsvn.c |  176 ++
 1 file changed, 176 insertions(+)
 create mode 100644 remote-testsvn.c

diff --git a/remote-testsvn.c b/remote-testsvn.c
new file mode 100644
index 0000000..c30ffcd
--- /dev/null
+++ b/remote-testsvn.c
@@ -0,0 +1,176 @@
+#include "cache.h"
+#include "remote.h"
+#include "strbuf.h"
+#include "url.h"
+#include "exec_cmd.h"
+#include "run-command.h"
+#include "vcs-svn/svndump.h"
+#include "notes.h"
+#include "argv-array.h"
+
+static const char *url;
+static const char *private_ref;
+static const char *remote_ref = "refs/heads/master";
+
+static int cmd_capabilities(const char *line);
+static int cmd_import(const char *line);
+static int cmd_list(const char *line);
+
+typedef int (*input_command_handler)(const char *);
+struct input_command_entry {
+   const char *name;
+   input_command_handler fn;
+   unsigned char batchable;   /* whether the command starts or is part of a batch */
+};
+
+static const struct input_command_entry input_command_list[] = {
+   { "capabilities", cmd_capabilities, 0 },
+   { "import", cmd_import, 1 },
+   { "list", cmd_list, 0 },
+   { NULL, NULL }
+};
+
+static int cmd_capabilities(const char *line)
+{
+   printf("import\n");
+   printf("bidi-import\n");
+   printf("refspec %s:%s\n\n", remote_ref, private_ref);
+   fflush(stdout);
+   return 0;
+}
+
+static void terminate_batch(void)
+{
+   /* terminate a current batch's fast-import stream */
+   printf("done\n");
+   fflush(stdout);
+}
+
+static int cmd_import(const char *line)
+{
+   int code;
+   int dumpin_fd;
+   unsigned int startrev = 0;
+   struct argv_array svndump_argv = ARGV_ARRAY_INIT;
+   struct child_process svndump_proc;
+
+   memset(&svndump_proc, 0, sizeof(struct child_process));
+   svndump_proc.out = -1;
+   argv_array_push(&svndump_argv, "svnrdump");
+   argv_array_push(&svndump_argv, "dump");
+   argv_array_push(&svndump_argv, url);
+   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
+   svndump_proc.argv = svndump_argv.argv;
+
+   code = start_command(&svndump_proc);
+   if (code)
+   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
+   dumpin_fd = svndump_proc.out;
+
+   svndump_init_fd(dumpin_fd, STDIN_FILENO);
+   svndump_read(url, private_ref);
+   svndump_deinit();
+   svndump_reset();
+
+   close(dumpin_fd);
+   code = finish_command(&svndump_proc);
+   if (code)
+   warning("%s, returned %d", svndump_proc.argv[0], code);
+   argv_array_clear(&svndump_argv);
+
+   return 0;
+}
+
+static int cmd_list(const char *line)
+{
+   printf("? %s\n\n", remote_ref);
+   fflush(stdout);
+   return 0;
+}
+
+static int do_command(struct strbuf *line)
+{
+   const struct input_command_entry *p = input_command_list;
+   static struct string_list batchlines = STRING_LIST_INIT_DUP;
+   static const struct input_command_entry *batch_cmd;
+   /*
+* commands can be grouped together in a batch.
+* Batches are ended by \n. If no batch is active the program ends.
+* During a batch all lines are buffered and passed to the handler function
+* when the batch is terminated.
+*/
+   if (line->len == 0) {
+   if (batch_cmd) {
+   struct string_list_item *item;
+   for_each_string_list_item(item, &batchlines)
+   batch_cmd->fn(item->string);
+   terminate_batch();
+   batch_cmd = NULL;
+   string_list_clear(&batchlines, 0);
+   return 0;   /* end of the batch, continue reading 
other com

[RFC v2 3/4] vcs-svn/svndump: rewrite handle_node(), begin|end_revision()

2012-08-28 Thread Florian Achleitner
Split the decision of what to do and actually doing it in
handle_node() to allow for detection of branches from svn nodes.
Split it into handle_node() and apply_node().

svn dumps are structured in revisions, which contain multiple nodes.
Nodes represent operations on data. Currently the function
handle_node() strongly mixes the interpretation of the node data with
the output of processed data to fast-import.

In a fast-import stream, a commit command must name, at its very
beginning, the branch to which the new commit is added.

We want to detect branches in svn. This can only be done by analyzing
node operations, like copyfrom. This conflicts with the current
implementation, where at the beginning of each new revision in the svn
dump, a new commit on a hard-coded git branch is created, before even
reading the first node.

To allow analyzing the nodes before deciding on which branch the
commit will be placed, store the node metadata of one complete
revision, and create a commit from it, when it ends.

Each node can have file data appended. It's desirable to not store the
actual file data, as it is unbounded.  fast-import has a 'blob'
command that allows writing blobs, independent of commits. Use this
feature instead of sending data inline and send the actual file data
immediately when it is read in.

Use marks to reference a blob later. fast-import's marks are currently
used for marking commits, where the mark number corresponds to exactly
one svn revision.
Store the marks for blobs in the upper half of the marks number space
where the MSB is 1.
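
The split of the mark number space can be sketched as follows. The helper names are hypothetical; the patch itself keeps a running counter initialised to a value with only the top bit set.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Highest bit of uintmax_t: set for blob marks, clear for commit marks. */
#define BLOB_MARK_BIT ((uintmax_t)1 << (sizeof(uintmax_t) * CHAR_BIT - 1))

/* Mark for the n-th blob; n must fit in the lower half of the space. */
uintmax_t blob_mark(uintmax_t n)
{
	assert((n & BLOB_MARK_BIT) == 0);
	return n | BLOB_MARK_BIT;
}

/* Commit marks stay equal to the svn revision number. */
uintmax_t commit_mark(uint32_t rev)
{
	return rev;
}

/* Non-zero if mark lies in the upper (blob) half of the space. */
int is_blob_mark(uintmax_t mark)
{
	return (mark & BLOB_MARK_BIT) != 0;
}
```

Because commit marks are 32-bit revision numbers, they can never have the top bit of a wider uintmax_t set, so the two kinds of marks cannot collide on platforms where uintmax_t is wider than 32 bits.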

Change handle_node() to interpret the node data, store it in a
node_ctx, send blobs to fast-import, and append the new node_ctx to
the list of node_ctx.  Do this until the end of a revision.

Just clear the list of node_ctx in begin_revision().

At end_revision() all node metadata is available in the node_ctx list.
Future branch detectors can decide what branches are to be changed.
Then, call apply_node() for each of them to actually create a commit
and change/add/delete files according to the node_ctx using the
already added blobs.

This can also be used to create commits if the node metadata does not
come from a svndump, but is stored in e.g. notes, for later branch
detection.

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 vcs-svn/svndump.c |  167 ++---
 1 file changed, 109 insertions(+), 58 deletions(-)

diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index 385523a..eb97e8e 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -48,7 +48,6 @@ static struct node_ctx_t *node_list, *node_list_tail;
 static struct node_ctx_t *new_node_ctx(char *fname)
 {
struct node_ctx_t *node = xmalloc(sizeof(struct node_ctx_t));
-   trace_printf("new_node_ctx %p\n", node);
node->type = 0;
node->action = NODEACT_UNKNOWN;
node->prop_length = -1;
@@ -67,7 +66,6 @@ static struct node_ctx_t *new_node_ctx(char *fname)
 
 static void free_node_ctx(struct node_ctx_t *node)
 {
-   trace_printf("free_node_ctx %p\n", node);
strbuf_release(&node->src);
strbuf_release(&node->dst);
free((char*)node->dataref);
@@ -77,7 +75,6 @@ static void free_node_ctx(struct node_ctx_t *node)
 static void free_node_list(void)
 {
struct node_ctx_t *p = node_list, *n;
-   trace_printf("free_node_list head %p tail %p\n", node_list, node_list_tail);
while (p) {
n = p->next;
free_node_ctx(p);
@@ -88,7 +85,6 @@ static void free_node_list(void)
 
 static void append_node_list(struct node_ctx_t *n)
 {
-   trace_printf("append_node_list %p head %p tail %p\n", n, node_list, 
node_list_tail);
if (!node_list)
node_list = node_list_tail = n;
else {
@@ -246,23 +242,10 @@ static void handle_node(struct node_ctx_t *node)
static const char *const empty_blob = "::empty::";
const char *old_data = NULL;
uint32_t old_mode = REPO_MODE_BLB;
+   struct strbuf sb = STRBUF_INIT;
+   static uintmax_t blobmark = (uintmax_t) 1UL << (bitsizeof(uintmax_t) - 
1);
+
 
-   if (node->action == NODEACT_DELETE) {
-   if (have_text || have_props || node->srcRev)
-   die("invalid dump: deletion node has "
-   "copyfrom info, text, or properties");
-   repo_delete(node->dst.buf);
-   return;
-   }
-   if (node->action == NODEACT_REPLACE) {
-   repo_delete(node->dst.buf);
-   node->action = NODEACT_ADD;
-   }
-   if (node->srcRev) {
-   repo_copy(node->srcRev, node->src.buf, node->dst.buf);
-   if (node->action == NODEACT_ADD)
-   node->action = NODEACT_CHANGE

Re: [PATCH v7 00/16] GSOC remote-svn

2012-08-28 Thread Florian Achleitner
On Tuesday 28 August 2012 10:49:34 Florian Achleitner wrote:
> Reroll includes fixups by Ramsay. Thanks!
> Diff:
> [..]
> - improve compatibility of integer types.
> [..]

This line is wrong in this series. Just delete it. Sorry.


[PATCH v7 08/16] remote-svn, vcs-svn: Enable fetching to private refs

2012-08-28 Thread Florian Achleitner
The reference to update by the fast-import stream is hard-coded.  When
fetching from a remote the remote-helper shall update refs in a
private namespace, i.e. a private subdir of refs/.  This namespace is
defined by the 'refspec' capability, that the remote-helper advertises
as a reply to the 'capabilities' command.

Extend svndump and fast-export to allow passing the target ref.
Update svn-fe to be compatible.
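
How the target ref flows from the helper into the stream can be sketched as below. The format strings mirror the ones visible in this series ("refs/svn/%s/master", "refspec %s:%s", "commit %s\n"); the function names and the fixed-size buffers are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the private ref for a remote, e.g. "refs/svn/origin/master". */
static int format_private_ref(char *buf, size_t len, const char *remote_name)
{
	assert(buf != NULL && remote_name != NULL && strlen(remote_name) > 0);
	return snprintf(buf, len, "refs/svn/%s/master", remote_name);
}

/* Build the reply line for the 'refspec' capability. */
static int format_refspec(char *buf, size_t len,
			  const char *remote_ref, const char *private_ref)
{
	return snprintf(buf, len, "refspec %s:%s", remote_ref, private_ref);
}

/* Build the fast-import commit header for an arbitrary target ref. */
static int format_commit_header(char *buf, size_t len, const char *target_ref)
{
	return snprintf(buf, len, "commit %s\n", target_ref);
}
```

svn-fe keeps its old behaviour by passing "refs/heads/master" as the target ref, while the remote helper passes its private ref.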

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 contrib/svn-fe/svn-fe.c |2 +-
 test-svn-fe.c   |2 +-
 vcs-svn/fast_export.c   |4 ++--
 vcs-svn/fast_export.h   |2 +-
 vcs-svn/svndump.c   |   14 +++---
 vcs-svn/svndump.h   |2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.c
index 35db24f..c796cc0 100644
--- a/contrib/svn-fe/svn-fe.c
+++ b/contrib/svn-fe/svn-fe.c
@@ -10,7 +10,7 @@ int main(int argc, char **argv)
 {
if (svndump_init(NULL))
return 1;
-   svndump_read((argc > 1) ? argv[1] : NULL);
+   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/test-svn-fe.c b/test-svn-fe.c
index 83633a2..cb0d80f 100644
--- a/test-svn-fe.c
+++ b/test-svn-fe.c
@@ -40,7 +40,7 @@ int main(int argc, char *argv[])
if (argc == 2) {
if (svndump_init(argv[1]))
return 1;
-   svndump_read(NULL);
+   svndump_read(NULL, "refs/heads/master");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 1f04697..11f8f94 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -72,7 +72,7 @@ static char gitsvnline[MAX_GITSVN_LINE_LEN];
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log,
const char *uuid, const char *url,
-   unsigned long timestamp)
+   unsigned long timestamp, const char *local_ref)
 {
static const struct strbuf empty = STRBUF_INIT;
if (!log)
@@ -84,7 +84,7 @@ void fast_export_begin_commit(uint32_t revision, const char 
*author,
} else {
*gitsvnline = '\0';
}
-   printf("commit refs/heads/master\n");
+   printf("commit %s\n", local_ref);
printf("mark :%"PRIu32"\n", revision);
printf("committer %s <%s@%s> %ld +0000\n",
   *author ? author : "nobody",
diff --git a/vcs-svn/fast_export.h b/vcs-svn/fast_export.h
index 8823aca..17eb13b 100644
--- a/vcs-svn/fast_export.h
+++ b/vcs-svn/fast_export.h
@@ -11,7 +11,7 @@ void fast_export_delete(const char *path);
 void fast_export_modify(const char *path, uint32_t mode, const char *dataref);
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log, const char *uuid,
-   const char *url, unsigned long timestamp);
+   const char *url, unsigned long timestamp, const char 
*local_ref);
 void fast_export_end_commit(uint32_t revision);
 void fast_export_data(uint32_t mode, off_t len, struct line_buffer *input);
 void fast_export_blob_delta(uint32_t mode,
diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index d81a078..288bb42 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -299,22 +299,22 @@ static void handle_node(void)
node_ctx.text_length, &input);
 }
 
-static void begin_revision(void)
+static void begin_revision(const char *remote_ref)
 {
if (!rev_ctx.revision)  /* revision 0 gets no git commit. */
return;
fast_export_begin_commit(rev_ctx.revision, rev_ctx.author.buf,
&rev_ctx.log, dump_ctx.uuid.buf, dump_ctx.url.buf,
-   rev_ctx.timestamp);
+   rev_ctx.timestamp, remote_ref);
 }
 
-static void end_revision(void)
+static void end_revision()
 {
if (rev_ctx.revision)
fast_export_end_commit(rev_ctx.revision);
 }
 
-void svndump_read(const char *url)
+void svndump_read(const char *url, const char *local_ref)
 {
char *val;
char *t;
@@ -353,7 +353,7 @@ void svndump_read(const char *url)
if (active_ctx == NODE_CTX)
handle_node();
if (active_ctx == REV_CTX)
-   begin_revision();
+   begin_revision(local_ref);
if (active_ctx != DUMP_CTX)
end_revision();
active_ctx = REV_CTX;
@@ -366,7 +366,7 @@ void svndump_rea

[PATCH v7 13/16] remote-svn: add incremental import

2012-08-28 Thread Florian Achleitner
Search for a note attached to the ref to update and read its
'Revision-number:'-line. Start import from the next svn revision.

If there is no next revision in the svn repo, svnrdump terminates with
a message on stderr and a non-zero return value. This looks a little
weird, but there is no other way to know whether there is a new
revision in the svn repo.

On the start of an incremental import, the parent of the first commit
in the fast-import stream is set to the branch name to update. All
following commits specify their parent by a mark number. Previous mark
files are currently not reused.
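
The lookup of the start revision can be sketched in standalone C as below. This is not the code from the patch: parse_revision_number() is a hypothetical name, it uses strtoul() with an explicit range check instead of atol(), and it stops cleanly at a last line without a trailing newline.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scan a note body line by line for "Revision-number: N".
 * Returns 0 and stores N in *rev on success, -1 if the line is
 * missing or N is not a valid 32-bit revision number.
 */
static int parse_revision_number(const char *msg, uint32_t *rev)
{
	static const char key[] = "Revision-number: ";
	const size_t keylen = sizeof(key) - 1;

	assert(msg != NULL && rev != NULL);
	while (*msg) {
		const char *end = strchr(msg, '\n');
		size_t len = end ? (size_t)(end - msg) : strlen(msg);

		if (len > keylen && !strncmp(msg, key, keylen)) {
			const char *value = msg + keylen;
			char *tail;
			unsigned long v;

			/* strtoul() would accept a sign, so insist on a digit. */
			if (*value < '0' || *value > '9')
				return -1;
			errno = 0;
			v = strtoul(value, &tail, 10);
			if (errno || v > UINT32_MAX)
				return -1;
			if (*tail != '\n' && *tail != '\0')
				return -1;
			*rev = (uint32_t)v;
			return 0;
		}
		if (!end)
			break;	/* last line had no trailing newline */
		msg = end + 1;
	}
	return -1;
}
```

An incremental import would then start at the stored revision plus one, and fall back to revision 0 when the function returns -1.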

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 contrib/svn-fe/svn-fe.c |3 ++-
 remote-testsvn.c|   67 ---
 test-svn-fe.c   |2 +-
 vcs-svn/fast_export.c   |   10 +--
 vcs-svn/fast_export.h   |6 ++---
 vcs-svn/svndump.c   |   10 +++
 vcs-svn/svndump.h   |2 +-
 7 files changed, 84 insertions(+), 16 deletions(-)

diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.c
index c796cc0..f363505 100644
--- a/contrib/svn-fe/svn-fe.c
+++ b/contrib/svn-fe/svn-fe.c
@@ -10,7 +10,8 @@ int main(int argc, char **argv)
 {
if (svndump_init(NULL))
return 1;
-   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master");
+   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master",
+   "refs/notes/svn/revs");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/remote-testsvn.c b/remote-testsvn.c
index b6e7968..e90d221 100644
--- a/remote-testsvn.c
+++ b/remote-testsvn.c
@@ -12,7 +12,8 @@ static const char *url;
 static int dump_from_file;
 static const char *private_ref;
 static const char *remote_ref = "refs/heads/master";
-static const char *marksfilename;
+static const char *marksfilename, *notes_ref;
+struct rev_note { unsigned int rev_nr; };
 
 static int cmd_capabilities(const char *line);
 static int cmd_import(const char *line);
@@ -47,14 +48,70 @@ static void terminate_batch(void)
fflush(stdout);
 }
 
+/* NOTE: 'ref' refers to a git reference, while 'rev' refers to a svn 
revision. */
+static char *read_ref_note(const unsigned char sha1[20]) {
+   const unsigned char *note_sha1;
+   char *msg = NULL;
+   unsigned long msglen;
+   enum object_type type;
+   init_notes(NULL, notes_ref, NULL, 0);
+   if( (note_sha1 = get_note(NULL, sha1)) == NULL ||
+   !(msg = read_sha1_file(note_sha1, &type, &msglen)) ||
+   !msglen || type != OBJ_BLOB) {
+   free(msg);
+   return NULL;
+   }
+   free_notes(NULL);
+   return msg;
+}
+
+static int parse_rev_note(const char *msg, struct rev_note *res) {
+   const char *key, *value, *end;
+   size_t len;
+   while(*msg) {
+   end = strchr(msg, '\n');
+   len = end ? end - msg : strlen(msg);
+
+   key = "Revision-number: ";
+   if(!prefixcmp(msg, key)) {
+   long i;
+   value = msg + strlen(key);
+   i = atol(value);
+   if(i < 0 || i > UINT32_MAX)
+   return 1;
+   res->rev_nr = i;
+   }
+   msg += len + 1;
+   }
+   return 0;
+}
+
 static int cmd_import(const char *line)
 {
int code;
int dumpin_fd;
-   unsigned int startrev = 0;
+   char *note_msg;
+   unsigned char head_sha1[20];
+   unsigned int startrev;
struct argv_array svndump_argv = ARGV_ARRAY_INIT;
struct child_process svndump_proc;
 
+   if(read_ref(private_ref, head_sha1))
+   startrev = 0;
+   else {
+   note_msg = read_ref_note(head_sha1);
+   if(note_msg == NULL) {
+   warning("No note found for %s.", private_ref);
+   startrev = 0;
+   }
+   else {
+   struct rev_note note = { 0 };
+   parse_rev_note(note_msg, &note);
+   startrev = note.rev_nr + 1;
+   free(note_msg);
+   }
+   }
+
if (dump_from_file) {
dumpin_fd = open(url, O_RDONLY);
if(dumpin_fd < 0) {
@@ -80,7 +137,7 @@ static int cmd_import(const char *line)
"feature export-marks=%s\n", marksfilename, 
marksfilename);
 
svndump_init_fd(dumpin_fd, STDIN_FILENO);
-   svndump_read(url, private_ref);
+   svndump_read(url, private_ref, notes_ref);
svndump_deinit();
svndump_reset();
 
@@ -177,6 +234,9 @@ int main(int argc, const char **argv)
strbuf_addf(&buf, &qu

[PATCH v7 09/16] Allow reading svn dumps from files via file:// urls

2012-08-28 Thread Florian Achleitner
For testing as well as for importing large, already available dumps,
it's useful to bypass svnrdump and replay the svndump from a file
directly.

Add support for file:// urls in the remote url, e.g.

  svn::file:///path/to/dump

When the remote helper finds a URL starting with file:// it tries to
open that file instead of invoking svnrdump.
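
The URL check amounts to a prefix test, sketched here in standalone C. dump_file_path() is a hypothetical name, and unlike the patch (which runs the remainder through url_decode()) this sketch does no percent-decoding.

```c
#include <assert.h>
#include <string.h>

/*
 * Return the path part of a file:// URL, or NULL if the URL uses
 * another scheme and should be handed to svnrdump instead.
 */
static const char *dump_file_path(const char *url)
{
	static const char prefix[] = "file://";
	const size_t n = sizeof(prefix) - 1;

	assert(url != NULL);
	if (strncmp(url, prefix, n))
		return NULL;
	return url + n;
}
```

For "file:///path/to/dump" the result is "/path/to/dump", which can be passed to open() directly.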

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 remote-testsvn.c |   55 +++---
 1 file changed, 36 insertions(+), 19 deletions(-)

diff --git a/remote-testsvn.c b/remote-testsvn.c
index ebe803b..2b9d151 100644
--- a/remote-testsvn.c
+++ b/remote-testsvn.c
@@ -9,6 +9,7 @@
 #include "argv-array.h"
 
 static const char *url;
+static int dump_from_file;
 static const char *private_ref;
 static const char *remote_ref = "refs/heads/master";
 
@@ -53,29 +54,38 @@ static int cmd_import(const char *line)
struct argv_array svndump_argv = ARGV_ARRAY_INIT;
struct child_process svndump_proc;
 
-   memset(&svndump_proc, 0, sizeof(struct child_process));
-   svndump_proc.out = -1;
-   argv_array_push(&svndump_argv, "svnrdump");
-   argv_array_push(&svndump_argv, "dump");
-   argv_array_push(&svndump_argv, url);
-   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
-   svndump_proc.argv = svndump_argv.argv;
-
-   code = start_command(&svndump_proc);
-   if (code)
-   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
-   dumpin_fd = svndump_proc.out;
-
+   if (dump_from_file) {
+   dumpin_fd = open(url, O_RDONLY);
+   if(dumpin_fd < 0) {
+   die_errno("Couldn't open svn dump file %s.", url);
+   }
+   }
+   else {
+   memset(&svndump_proc, 0, sizeof(struct child_process));
+   svndump_proc.out = -1;
+   argv_array_push(&svndump_argv, "svnrdump");
+   argv_array_push(&svndump_argv, "dump");
+   argv_array_push(&svndump_argv, url);
+   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
+   svndump_proc.argv = svndump_argv.argv;
+
+   code = start_command(&svndump_proc);
+   if (code)
+   die("Unable to start %s, code %d", 
svndump_proc.argv[0], code);
+   dumpin_fd = svndump_proc.out;
+   }
svndump_init_fd(dumpin_fd, STDIN_FILENO);
svndump_read(url, private_ref);
svndump_deinit();
svndump_reset();
 
close(dumpin_fd);
-   code = finish_command(&svndump_proc);
-   if (code)
-   warning("%s, returned %d", svndump_proc.argv[0], code);
-   argv_array_clear(&svndump_argv);
+   if(!dump_from_file) {
+   code = finish_command(&svndump_proc);
+   if (code)
+   warning("%s, returned %d", svndump_proc.argv[0], code);
+   argv_array_clear(&svndump_argv);
+   }
 
return 0;
 }
@@ -149,8 +159,15 @@ int main(int argc, const char **argv)
remote = remote_get(argv[1]);
url_in = (argc == 3) ? argv[2] : remote->url[0];
 
-   end_url_with_slash(&buf, url_in);
-   url = strbuf_detach(&buf, NULL);
+   if (!prefixcmp(url_in, "file://")) {
+   dump_from_file = 1;
+   url = url_decode(url_in + sizeof("file://")-1);
+   }
+   else {
+   dump_from_file = 0;
+   end_url_with_slash(&buf, url_in);
+   url = strbuf_detach(&buf, NULL);
+   }
 
strbuf_addf(&buf, "refs/svn/%s/master", remote->name);
private_ref = strbuf_detach(&buf, NULL);
-- 
1.7.9.5



[PATCH v7 06/16] Add documentation for the 'bidi-import' capability of remote-helpers

2012-08-28 Thread Florian Achleitner
Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 Documentation/git-remote-helpers.txt |   21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-remote-helpers.txt 
b/Documentation/git-remote-helpers.txt
index f5836e4..5ce4cda 100644
--- a/Documentation/git-remote-helpers.txt
+++ b/Documentation/git-remote-helpers.txt
@@ -98,6 +98,20 @@ advertised with this capability must cover all refs reported 
by
 the list command.  If no 'refspec' capability is advertised,
 there is an implied `refspec *:*`.
 
+'bidi-import'::
+   The fast-import commands 'cat-blob' and 'ls' can be used by 
remote-helpers
+   to retrieve information about blobs and trees that already exist in
+   fast-import's memory. This requires a channel from fast-import to the
+   remote-helper.
+   If it is advertised in addition to "import", git establishes a pipe from
+   fast-import to the remote-helper's stdin.
+   It follows that git and fast-import are both connected to the
+   remote-helper's stdin. Because git can send multiple commands to
+   the remote-helper it is required that helpers that use 'bidi-import'
+   buffer all 'import' commands of a batch before sending data to 
fast-import.
+   This is to prevent mixing commands and fast-import responses on the
+   helper's stdin.
+
 Capabilities for Pushing
 
 'connect'::
@@ -286,7 +300,12 @@ terminated with a blank line. For each batch of 'import', 
the remote
 helper should produce a fast-import stream terminated by a 'done'
 command.
 +
-Supported if the helper has the "import" capability.
+Note that if the 'bidi-import' capability is used the complete batch
+sequence has to be buffered before starting to send data to fast-import
+to prevent mixing of commands and fast-import responses on the helper's
+stdin.
++
+Supported if the helper has the 'import' capability.
 
 'connect' ::
Connects to given service. Standard input and standard output
-- 
1.7.9.5



[PATCH v7 01/16] Implement a remote helper for svn in C

2012-08-28 Thread Florian Achleitner
Enable basic fetching from subversion repositories. When processing
remote URLs starting with testsvn::, git invokes this remote-helper.
It starts svnrdump to extract revisions from the subversion repository
in the 'dump file format', and converts them to a git-fast-import stream
using the functions of vcs-svn/.

Imported refs are created in a private namespace at
refs/svn/<remote-name>/master.  The revision history is imported
linearly (no branch detection) and completely, i.e. from revision 0 to
HEAD.

The 'bidi-import' capability is used. The remote-helper expects data
from fast-import on its stdin. It buffers a batch of 'import' command
lines in a string_list before starting to process them.
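
The batching rule can be sketched without git's string_list as below. The fixed-size buffer, the status codes and the function names are assumptions made to keep the example self-contained; the helper itself keeps the lines in a string_list and calls the command handler for each one when the batch ends.

```c
#include <assert.h>
#include <string.h>

#define MAX_BATCH_LINES 16
#define MAX_LINE_LEN 256

enum batch_status {
	BATCH_ERROR = -1,	/* line too long or batch full */
	BATCH_BUFFERED = 0,	/* line stored, batch still open */
	BATCH_READY = 1,	/* blank line closed a non-empty batch */
	BATCH_NONE = 2		/* blank line while no batch was open */
};

struct batch {
	char lines[MAX_BATCH_LINES][MAX_LINE_LEN];
	size_t nr;
};

/* Feed one command line (without its newline) into the batch. */
static enum batch_status batch_feed(struct batch *b, const char *line)
{
	size_t len;

	assert(b != NULL && line != NULL);
	len = strlen(line);
	if (len == 0)
		return b->nr ? BATCH_READY : BATCH_NONE;
	if (b->nr == MAX_BATCH_LINES || len >= MAX_LINE_LEN)
		return BATCH_ERROR;
	memcpy(b->lines[b->nr++], line, len + 1);
	return BATCH_BUFFERED;
}

/* Forget the buffered lines once they have been processed. */
static void batch_reset(struct batch *b)
{
	b->nr = 0;
}
```

Only after BATCH_READY would the helper start writing to fast-import, so that replies from fast-import cannot interleave with further 'import' lines on the helper's stdin.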

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 remote-testsvn.c |  174 ++
 1 file changed, 174 insertions(+)
 create mode 100644 remote-testsvn.c

diff --git a/remote-testsvn.c b/remote-testsvn.c
new file mode 100644
index 0000000..ebe803b
--- /dev/null
+++ b/remote-testsvn.c
@@ -0,0 +1,174 @@
+#include "cache.h"
+#include "remote.h"
+#include "strbuf.h"
+#include "url.h"
+#include "exec_cmd.h"
+#include "run-command.h"
+#include "vcs-svn/svndump.h"
+#include "notes.h"
+#include "argv-array.h"
+
+static const char *url;
+static const char *private_ref;
+static const char *remote_ref = "refs/heads/master";
+
+static int cmd_capabilities(const char *line);
+static int cmd_import(const char *line);
+static int cmd_list(const char *line);
+
+typedef int (*input_command_handler)(const char *);
+struct input_command_entry {
+   const char *name;
+   input_command_handler fn;
+   unsigned char batchable;/* whether the command starts or is 
part of a batch */
+};
+
+static const struct input_command_entry input_command_list[] = {
+   { "capabilities", cmd_capabilities, 0 },
+   { "import", cmd_import, 1 },
+   { "list", cmd_list, 0 },
+   { NULL, NULL }
+};
+
+static int cmd_capabilities(const char *line) {
+   printf("import\n");
+   printf("bidi-import\n");
+   printf("refspec %s:%s\n\n", remote_ref, private_ref);
+   fflush(stdout);
+   return 0;
+}
+
+static void terminate_batch(void)
+{
+   /* terminate a current batch's fast-import stream */
+   printf("done\n");
+   fflush(stdout);
+}
+
+static int cmd_import(const char *line)
+{
+   int code;
+   int dumpin_fd;
+   unsigned int startrev = 0;
+   struct argv_array svndump_argv = ARGV_ARRAY_INIT;
+   struct child_process svndump_proc;
+
+   memset(&svndump_proc, 0, sizeof(struct child_process));
+   svndump_proc.out = -1;
+   argv_array_push(&svndump_argv, "svnrdump");
+   argv_array_push(&svndump_argv, "dump");
+   argv_array_push(&svndump_argv, url);
+   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
+   svndump_proc.argv = svndump_argv.argv;
+
+   code = start_command(&svndump_proc);
+   if (code)
+   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
+   dumpin_fd = svndump_proc.out;
+
+   svndump_init_fd(dumpin_fd, STDIN_FILENO);
+   svndump_read(url, private_ref);
+   svndump_deinit();
+   svndump_reset();
+
+   close(dumpin_fd);
+   code = finish_command(&svndump_proc);
+   if (code)
+   warning("%s, returned %d", svndump_proc.argv[0], code);
+   argv_array_clear(&svndump_argv);
+
+   return 0;
+}
+
+static int cmd_list(const char *line)
+{
+   printf("? %s\n\n", remote_ref);
+   fflush(stdout);
+   return 0;
+}
+
+static int do_command(struct strbuf *line)
+{
+   const struct input_command_entry *p = input_command_list;
+   static struct string_list batchlines = STRING_LIST_INIT_DUP;
+   static const struct input_command_entry *batch_cmd;
+   /*
+* commands can be grouped together in a batch.
+* Batches are ended by \n. If no batch is active the program ends.
+* During a batch all lines are buffered and passed to the handler 
function
+* when the batch is terminated.
+*/
+   if (line->len == 0) {
+   if (batch_cmd) {
+   struct string_list_item *item;
+   for_each_string_list_item(item, &batchlines)
+   batch_cmd->fn(item->string);
+   terminate_batch();
+   batch_cmd = NULL;
+   string_list_clear(&batchlines, 0);
+   return 0;   /* end of the batch, continue reading 
other commands. */
+   }
+   return 1;   /* end 

[PATCH v7 00/16] GSOC remote-svn

2012-08-28 Thread Florian Achleitner
Reroll includes fixups by Ramsay. Thanks!
Diff:
- Add missing dependency to rule in Makefile.
- improve compatibility of integer types.
- t9020-*.sh: remove excess slash in urls that makes python on windows 
  interpret it as a network path.
- t9020-*.sh: skip if python isn't available.
- replace getline() in remote-testsvn.c. There are platforms that don't provide
  this function.

[PATCH v7 01/16] Implement a remote helper for svn in C
[PATCH v7 02/16] Add git-remote-testsvn to Makefile
[PATCH v7 03/16] Add svndump_init_fd to allow reading dumps from
[PATCH v7 04/16] Add argv_array_detach and argv_array_free_detached
[PATCH v7 05/16] Connect fast-import to the remote-helper via pipe,
[PATCH v7 06/16] Add documentation for the 'bidi-import' capability
[PATCH v7 07/16] When debug==1, start fast-import with "--stats"
[PATCH v7 08/16] remote-svn, vcs-svn: Enable fetching to private
[PATCH v7 09/16] Allow reading svn dumps from files via file:// urls
[PATCH v7 10/16] vcs-svn: add fast_export_note to create notes
[PATCH v7 11/16] Create a note for every imported commit containing
[PATCH v7 12/16] remote-svn: Activate import/export-marks for
[PATCH v7 13/16] remote-svn: add incremental import
[PATCH v7 14/16] Add a svnrdump-simulator replaying a dump file for
[PATCH v7 15/16] remote-svn: add marks-file regeneration
[PATCH v7 16/16] Add a test script for remote-svn


Re: [PATCH] vcs-svn: Fix 'fa/remote-svn' and 'fa/vcs-svn' in pu

2012-08-27 Thread Florian Achleitner
Hi!

Thanks for your fixups. I'm currently integrating them in a new series.
On what platform did you find these problems?
I tried to reproduce them on 64-bit Linux. Anyway, the fixes look very reasonable.

Florian

On Thursday 23 August 2012 18:55:39 Ramsay Jones wrote:
> Signed-off-by: Ramsay Jones 
> ---
> 
> Hi Florian,
> 
> The build on pu is currently broken:
> 
> CC remote-testsvn.o
> LINK git-remote-testsvn
> cc: vcs-svn/lib.a: No such file or directory
> make: *** [git-remote-testsvn] Error 1
> 
> This is caused by a dependency missing from the git-remote-testsvn
> link rule. The addition of the $(VCSSVN_LIB) dependency, which should
> be squashed into commit ea1f4afb ("Add git-remote-testsvn to Makefile",
> 20-08-2012), fixes the build.
> 
> However, this leads to a failure of test t9020.5 and (not unrelated)
> compiler warnings:
> 
> CC vcs-svn/svndump.o
> vcs-svn/svndump.c: In function ‘handle_node’:
> vcs-svn/svndump.c:246: warning: left shift count >= width of type
> vcs-svn/svndump.c:345: warning: format ‘%lu’ expects type ‘long \
> unsigned int’, but argument 3 has type ‘uintmax_t’
> 
> The fix for the shift count warning is to cast the lhs of the shift
> expression to uintmax_t. The format warning is fixed by using the
> PRIuMAX format macro. These fixes should be squashed into commit
> 78d9d4138 ("vcs-svn/svndump: rewrite handle_node(), begin|end_revision()",
> 20-08-2012).
> 
> HTH
> 
> ATB,
> Ramsay Jones
> 
>  Makefile  | 2 +-
>  vcs-svn/svndump.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index 9cede84..761ae05 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2356,7 +2356,7 @@ git-http-push$X: revision.o http.o http-push.o
> GIT-LDFLAGS $(GITLIBS) $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@
> $(ALL_LDFLAGS) $(filter %.o,$^) \ $(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
> 
> -git-remote-testsvn$X: remote-testsvn.o GIT-LDFLAGS $(GITLIBS)
> +git-remote-testsvn$X: remote-testsvn.o GIT-LDFLAGS $(GITLIBS) $(VCSSVN_LIB)
> $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^)
> $(LIBS) \ $(VCSSVN_LIB)
> 
> diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
> index 28ce2aa..eb97e8e 100644
> --- a/vcs-svn/svndump.c
> +++ b/vcs-svn/svndump.c
> @@ -243,7 +243,7 @@ static void handle_node(struct node_ctx_t *node)
>   const char *old_data = NULL;
>   uint32_t old_mode = REPO_MODE_BLB;
>   struct strbuf sb = STRBUF_INIT;
> - static uintmax_t blobmark = 1UL << (bitsizeof(uintmax_t) - 1);
> + static uintmax_t blobmark = (uintmax_t) 1UL << (bitsizeof(uintmax_t) - 
> 1);
> 
> 
>   if (have_text && type == REPO_MODE_DIR)
> @@ -342,7 +342,7 @@ static void handle_node(struct node_ctx_t *node)
>   node->text_length, &input);
>   }
> 
> - strbuf_addf(&sb, ":%lu", blobmark);
> + strbuf_addf(&sb, ":%"PRIuMAX, blobmark);
>   node->dataref = sb.buf;
>   }
>   }


Re: t9020 broken on pu ?

2012-08-26 Thread Florian Achleitner
On Sunday 26 August 2012 21:32:58 Torsten Bögershausen wrote:
> > The reason is that contrib/svn-fe, where remote-svn is in,  is not yet
> > built automatically by the toplevel makefile, so the remote helper can't
> > be found. If you build it manually it should work.
> > Working on it ..
> 
> Hi Florian,
> 
> the compilation as such is started, but gives problems on Mac OS X:
> 
> CC remote-testsvn.o
> remote-testsvn.c: In function ‘check_or_regenerate_marks’:
> remote-testsvn.c:142: warning: implicit declaration of function ‘getline’
> CC vcs-svn/line_buffer.o
> CC vcs-svn/sliding_window.o
> CC vcs-svn/fast_export.o
> CC vcs-svn/svndiff.o
> CC vcs-svn/svndump.o
> AR vcs-svn/lib.a
> LINK git-remote-testsvn
> Undefined symbols:
>   "_getline", referenced from:
>   _cmd_import in remote-testsvn.o
>  (maybe you meant: _strbuf_getline)
> ld: symbol(s) not found
> collect2: ld returned 1 exit status
> make: *** [git-remote-testsvn] Error 1

Seems you also don't have getline on Mac OS X. Others already reported that 
this function may not be available on some platforms. Will be replaced in the 
next reroll.
Thanks for your reviews!
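
For reference, one portable way to read whole lines without getline() is sketched below, using only ISO C. This is an illustration and not necessarily the replacement used in the reroll (inside git, strbuf_getline() is the obvious candidate); read_line() is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read one line from 'in' into the growable buffer *buf (capacity *cap).
 * The trailing newline is dropped. Returns the line length, or -1 on
 * end of file or allocation failure. The caller frees *buf.
 */
static long read_line(char **buf, size_t *cap, FILE *in)
{
	size_t len = 0;
	int c;

	assert(buf != NULL && cap != NULL && in != NULL);
	while ((c = fgetc(in)) != EOF) {
		if (len + 2 > *cap) {
			/* keep room for this character and the final NUL */
			size_t ncap = *cap ? *cap * 2 : 64;
			char *p = realloc(*buf, ncap);

			if (!p)
				return -1;
			*buf = p;
			*cap = ncap;
		}
		if (c == '\n') {
			(*buf)[len] = '\0';
			return (long)len;
		}
		(*buf)[len++] = (char)c;
	}
	if (!len)
		return -1;
	(*buf)[len] = '\0';	/* last line without a trailing newline */
	return (long)len;
}
```

The buffer is reused across calls, so a loop over a marks file allocates only when a line is longer than any seen before.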

I'm still hesitating to send a new version out, as long as new fixups come in 
continuously.

-- 
Florian


[PATCH v6 13/16] remote-svn: add incremental import

2012-08-22 Thread Florian Achleitner
Search for a note attached to the ref to update and read its
'Revision-number:'-line. Start import from the next svn revision.

If there is no next revision in the svn repo, svnrdump terminates with
a message on stderr and a non-zero return value. This looks a little
weird, but there is no other way to know whether there is a new
revision in the svn repo.

On the start of an incremental import, the parent of the first commit
in the fast-import stream is set to the branch name to update. All
following commits specify their parent by a mark number. Previous mark
files are currently not reused.

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 contrib/svn-fe/svn-fe.c |3 ++-
 remote-testsvn.c|   67 ---
 test-svn-fe.c   |2 +-
 vcs-svn/fast_export.c   |   10 +--
 vcs-svn/fast_export.h   |6 ++---
 vcs-svn/svndump.c   |   10 +++
 vcs-svn/svndump.h   |2 +-
 7 files changed, 84 insertions(+), 16 deletions(-)

diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.c
index c796cc0..f363505 100644
--- a/contrib/svn-fe/svn-fe.c
+++ b/contrib/svn-fe/svn-fe.c
@@ -10,7 +10,8 @@ int main(int argc, char **argv)
 {
if (svndump_init(NULL))
return 1;
-   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master");
+   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master",
+   "refs/notes/svn/revs");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/remote-testsvn.c b/remote-testsvn.c
index b6e7968..e90d221 100644
--- a/remote-testsvn.c
+++ b/remote-testsvn.c
@@ -12,7 +12,8 @@ static const char *url;
 static int dump_from_file;
 static const char *private_ref;
 static const char *remote_ref = "refs/heads/master";
-static const char *marksfilename;
+static const char *marksfilename, *notes_ref;
+struct rev_note { unsigned int rev_nr; };
 
 static int cmd_capabilities(const char *line);
 static int cmd_import(const char *line);
@@ -47,14 +48,70 @@ static void terminate_batch(void)
fflush(stdout);
 }
 
+/* NOTE: 'ref' refers to a git reference, while 'rev' refers to a svn 
revision. */
+static char *read_ref_note(const unsigned char sha1[20]) {
+   const unsigned char *note_sha1;
+   char *msg = NULL;
+   unsigned long msglen;
+   enum object_type type;
+   init_notes(NULL, notes_ref, NULL, 0);
+   if( (note_sha1 = get_note(NULL, sha1)) == NULL ||
+   !(msg = read_sha1_file(note_sha1, &type, &msglen)) ||
+   !msglen || type != OBJ_BLOB) {
+   free(msg);
+   return NULL;
+   }
+   free_notes(NULL);
+   return msg;
+}
+
+static int parse_rev_note(const char *msg, struct rev_note *res) {
+   const char *key, *value, *end;
+   size_t len;
+   while(*msg) {
+   end = strchr(msg, '\n');
+   len = end ? end - msg : strlen(msg);
+
+   key = "Revision-number: ";
+   if(!prefixcmp(msg, key)) {
+   long i;
+   value = msg + strlen(key);
+   i = atol(value);
+   if(i < 0 || i > UINT32_MAX)
+   return 1;
+   res->rev_nr = i;
+   }
+   msg += len + 1;
+   }
+   return 0;
+}
+
 static int cmd_import(const char *line)
 {
int code;
int dumpin_fd;
-   unsigned int startrev = 0;
+   char *note_msg;
+   unsigned char head_sha1[20];
+   unsigned int startrev;
struct argv_array svndump_argv = ARGV_ARRAY_INIT;
struct child_process svndump_proc;
 
+   if(read_ref(private_ref, head_sha1))
+   startrev = 0;
+   else {
+   note_msg = read_ref_note(head_sha1);
+   if(note_msg == NULL) {
+   warning("No note found for %s.", private_ref);
+   startrev = 0;
+   }
+   else {
+   struct rev_note note = { 0 };
+   parse_rev_note(note_msg, &note);
+   startrev = note.rev_nr + 1;
+   free(note_msg);
+   }
+   }
+
if (dump_from_file) {
dumpin_fd = open(url, O_RDONLY);
if(dumpin_fd < 0) {
@@ -80,7 +137,7 @@ static int cmd_import(const char *line)
"feature export-marks=%s\n", marksfilename, 
marksfilename);
 
svndump_init_fd(dumpin_fd, STDIN_FILENO);
-   svndump_read(url, private_ref);
+   svndump_read(url, private_ref, notes_ref);
svndump_deinit();
svndump_reset();
 
@@ -177,6 +234,9 @@ int main(int argc, const char **argv)
strbuf_addf(&buf, &qu

[PATCH v6 06/16] Add documentation for the 'bidi-import' capability of remote-helpers

2012-08-22 Thread Florian Achleitner
Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 Documentation/git-remote-helpers.txt |   21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-remote-helpers.txt 
b/Documentation/git-remote-helpers.txt
index f5836e4..5ce4cda 100644
--- a/Documentation/git-remote-helpers.txt
+++ b/Documentation/git-remote-helpers.txt
@@ -98,6 +98,20 @@ advertised with this capability must cover all refs reported 
by
 the list command.  If no 'refspec' capability is advertised,
 there is an implied `refspec *:*`.
 
+'bidi-import'::
+   The fast-import commands 'cat-blob' and 'ls' can be used by 
remote-helpers
+   to retrieve information about blobs and trees that already exist in
+   fast-import's memory. This requires a channel from fast-import to the
+   remote-helper.
+   If it is advertised in addition to "import", git establishes a pipe from
+   fast-import to the remote-helper's stdin.
+   It follows that git and fast-import are both connected to the
+   remote-helper's stdin. Because git can send multiple commands to
+   the remote-helper it is required that helpers that use 'bidi-import'
+   buffer all 'import' commands of a batch before sending data to 
fast-import.
+   This is to prevent mixing commands and fast-import responses on the
+   helper's stdin.
+
 Capabilities for Pushing
 
 'connect'::
@@ -286,7 +300,12 @@ terminated with a blank line. For each batch of 'import', 
the remote
 helper should produce a fast-import stream terminated by a 'done'
 command.
 +
-Supported if the helper has the "import" capability.
+Note that if the 'bidi-import' capability is used the complete batch
+sequence has to be buffered before starting to send data to fast-import
+to prevent mixing of commands and fast-import responses on the helper's
+stdin.
++
+Supported if the helper has the 'import' capability.
 
 'connect' ::
Connects to given service. Standard input and standard output
-- 
1.7.9.5
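
Since fast-import's replies and git's commands both arrive on the helper's stdin, a helper has to know what kind of line it is reading. The following is only a sketch, assuming the 'cat-blob' reply header has the form "<sha1> SP blob SP <size> LF" as I understand it from the fast-import documentation. parse_cat_blob_header() is a hypothetical name and not part of this series.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse a cat-blob reply header of the assumed form "<sha1> blob <size>".
 * Returns 0 and fills in sha1/size on success, -1 if the line does not
 * look like such a header (for example because it is a helper command).
 */
static int parse_cat_blob_header(const char *line, char sha1[41],
				 unsigned long *size)
{
	size_t hexlen = strspn(line, "0123456789abcdef");
	const char *num;
	char *end;

	if (hexlen != 40 || strncmp(line + hexlen, " blob ", 6) != 0)
		return -1;
	memcpy(sha1, line, 40);
	sha1[40] = '\0';

	num = line + hexlen + 6;
	*size = strtoul(num, &end, 10);
	/* require at least one digit, followed by newline or end of string */
	if (end == num || (*end != '\n' && *end != '\0'))
		return -1;
	return 0;
}
```

A line such as "import refs/heads/master" is rejected by this check, which is the kind of mix-up the buffering requirement is meant to avoid in the first place.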



[PATCH v6 11/16] Create a note for every imported commit containing svn metadata

2012-08-22 Thread Florian Achleitner
To provide metadata from svn dumps for further processing, e.g.
branch detection, attach a note to each imported commit that stores
additional information.  The notes are currently hard-coded in
refs/notes/svn/revs.  Currently the following lines from the svn dump
are directly accumulated in the note. This can be refined as needed.

 - "Revision-number"
 - "Node-path"
 - "Node-kind"
 - "Node-action"
 - "Node-copyfrom-path"
 - "Node-copyfrom-rev"
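
For illustration, the stream that attaches such a note to the commit marked :<revision> might look like what the sketch below formats into a buffer. This is not the code of the patch: format_note_commit() is a hypothetical helper, and the "+0000" timezone offset is an assumption on my part. The committer name, the log text, and the "N inline" form follow the diff below.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format a fast-import notes commit that attaches `note` to the commit
 * marked :<revision>.  Returns the number of bytes written (excluding the
 * terminating NUL), or -1 if the buffer is too small.
 */
static int format_note_commit(char *out, size_t outlen, unsigned revision,
			      const char *note, long timestamp)
{
	static const char log[] = "Note created by remote-svn.";
	int n = snprintf(out, outlen,
			 "commit refs/notes/svn/revs\n"
			 /* "+0000" is an assumed timezone offset */
			 "committer remote-svn <remote-svn@local> %ld +0000\n"
			 "data %zu\n%s\n"
			 "N inline :%u\n"
			 "data %zu\n%s\n",
			 timestamp, strlen(log), log,
			 revision, strlen(note), note);

	return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

The note text itself would be the accumulated "Revision-number" and "Node-*" lines, passed in as one string.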

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 vcs-svn/fast_export.c |   14 --
 vcs-svn/fast_export.h |2 ++
 vcs-svn/svndump.c |   21 +++--
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 1ecae4b..df51c59 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -3,8 +3,7 @@
  * See LICENSE for details.
  */
 
-#include "git-compat-util.h"
-#include "strbuf.h"
+#include "cache.h"
 #include "quote.h"
 #include "fast_export.h"
 #include "repo_tree.h"
@@ -68,6 +67,17 @@ void fast_export_modify(const char *path, uint32_t mode, const char *dataref)
putchar('\n');
 }
 
+void fast_export_begin_note(uint32_t revision, const char *author,
+   const char *log, unsigned long timestamp)
+{
+   size_t loglen = strlen(log);
+   printf("commit refs/notes/svn/revs\n");
+   printf("committer %s <%s@%s> %ld +\n", author, author, "local", timestamp);
+   printf("data %"PRIuMAX"\n", (uintmax_t)loglen);
+   fwrite(log, loglen, 1, stdout);
+   fputc('\n', stdout);
+}
+
 void fast_export_note(const char *committish, const char *dataref)
 {
printf("N %s %s\n", dataref, committish);
diff --git a/vcs-svn/fast_export.h b/vcs-svn/fast_export.h
index 9b32f1e..c2f6f11 100644
--- a/vcs-svn/fast_export.h
+++ b/vcs-svn/fast_export.h
@@ -10,6 +10,8 @@ void fast_export_deinit(void);
 void fast_export_delete(const char *path);
 void fast_export_modify(const char *path, uint32_t mode, const char *dataref);
 void fast_export_note(const char *committish, const char *dataref);
+void fast_export_begin_note(uint32_t revision, const char *author,
+   const char *log, unsigned long timestamp);
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log, const char *uuid,
const char *url, unsigned long timestamp, const char *local_ref);
diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index 288bb42..cd65b51 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -48,7 +48,7 @@ static struct {
 static struct {
uint32_t revision;
unsigned long timestamp;
-   struct strbuf log, author;
+   struct strbuf log, author, note;
 } rev_ctx;
 
 static struct {
@@ -77,6 +77,7 @@ static void reset_rev_ctx(uint32_t revision)
rev_ctx.timestamp = 0;
strbuf_reset(&rev_ctx.log);
strbuf_reset(&rev_ctx.author);
+   strbuf_reset(&rev_ctx.note);
 }
 
 static void reset_dump_ctx(const char *url)
@@ -310,8 +311,15 @@ static void begin_revision(const char *remote_ref)
 
 static void end_revision()
 {
-   if (rev_ctx.revision)
+   struct strbuf mark = STRBUF_INIT;
+   if (rev_ctx.revision) {
fast_export_end_commit(rev_ctx.revision);
+   fast_export_begin_note(rev_ctx.revision, "remote-svn",
+   "Note created by remote-svn.", rev_ctx.timestamp);
+   strbuf_addf(&mark, ":%"PRIu32, rev_ctx.revision);
+   fast_export_note(mark.buf, "inline");
+   fast_export_buf_to_data(&rev_ctx.note);
+   }
 }
 
 void svndump_read(const char *url, const char *local_ref)
@@ -358,6 +366,7 @@ void svndump_read(const char *url, const char *local_ref)
end_revision();
active_ctx = REV_CTX;
reset_rev_ctx(atoi(val));
+   strbuf_addf(&rev_ctx.note, "%s\n", t);
break;
case sizeof("Node-path"):
if (constcmp(t, "Node-"))
@@ -369,10 +378,12 @@ void svndump_read(const char *url, const char *local_ref)
begin_revision(local_ref);
active_ctx = NODE_CTX;
reset_node_ctx(val);
+   strbuf_addf(&rev_ctx.note, "%s\n", t);
break;
}
if (constcmp(t + strlen("Node-"), "kind"))
continue;
+ 

[PATCH v6 09/16] Allow reading svn dumps from files via file:// urls

2012-08-22 Thread Florian Achleitner
For testing as well as for importing large, already available dumps,
it's useful to bypass svnrdump and replay the svndump from a file
directly.

Add support for file:// urls in the remote url, e.g.

  svn::file:///path/to/dump

When the remote helper finds a URL starting with file://, it tries to
open that file instead of invoking svnrdump.
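
The prefix check can be sketched as below. dump_file_path() is a hypothetical name, and percent-decoding (which the patch does with url_decode()) is left out to keep the example short.

```c
#include <string.h>

/*
 * Return the local path if `url` starts with "file://", or NULL otherwise.
 * The returned pointer points into `url`; nothing is allocated.
 */
static const char *dump_file_path(const char *url)
{
	static const char prefix[] = "file://";
	size_t len = sizeof(prefix) - 1;

	return strncmp(url, prefix, len) == 0 ? url + len : NULL;
}
```

With "file:///path/to/dump" this yields "/path/to/dump". Any other URL gives NULL, in which case the helper would fall back to running svnrdump.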

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 remote-testsvn.c |   55 +++---
 1 file changed, 36 insertions(+), 19 deletions(-)

diff --git a/remote-testsvn.c b/remote-testsvn.c
index ebe803b..2b9d151 100644
--- a/remote-testsvn.c
+++ b/remote-testsvn.c
@@ -9,6 +9,7 @@
 #include "argv-array.h"
 
 static const char *url;
+static int dump_from_file;
 static const char *private_ref;
 static const char *remote_ref = "refs/heads/master";
 
@@ -53,29 +54,38 @@ static int cmd_import(const char *line)
struct argv_array svndump_argv = ARGV_ARRAY_INIT;
struct child_process svndump_proc;
 
-   memset(&svndump_proc, 0, sizeof(struct child_process));
-   svndump_proc.out = -1;
-   argv_array_push(&svndump_argv, "svnrdump");
-   argv_array_push(&svndump_argv, "dump");
-   argv_array_push(&svndump_argv, url);
-   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
-   svndump_proc.argv = svndump_argv.argv;
-
-   code = start_command(&svndump_proc);
-   if (code)
-   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
-   dumpin_fd = svndump_proc.out;
-
+   if (dump_from_file) {
+   dumpin_fd = open(url, O_RDONLY);
+   if(dumpin_fd < 0) {
+   die_errno("Couldn't open svn dump file %s.", url);
+   }
+   }
+   else {
+   memset(&svndump_proc, 0, sizeof(struct child_process));
+   svndump_proc.out = -1;
+   argv_array_push(&svndump_argv, "svnrdump");
+   argv_array_push(&svndump_argv, "dump");
+   argv_array_push(&svndump_argv, url);
+   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
+   svndump_proc.argv = svndump_argv.argv;
+
+   code = start_command(&svndump_proc);
+   if (code)
+   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
+   dumpin_fd = svndump_proc.out;
+   }
svndump_init_fd(dumpin_fd, STDIN_FILENO);
svndump_read(url, private_ref);
svndump_deinit();
svndump_reset();
 
close(dumpin_fd);
-   code = finish_command(&svndump_proc);
-   if (code)
-   warning("%s, returned %d", svndump_proc.argv[0], code);
-   argv_array_clear(&svndump_argv);
+   if(!dump_from_file) {
+   code = finish_command(&svndump_proc);
+   if (code)
+   warning("%s, returned %d", svndump_proc.argv[0], code);
+   argv_array_clear(&svndump_argv);
+   }
 
return 0;
 }
@@ -149,8 +159,15 @@ int main(int argc, const char **argv)
remote = remote_get(argv[1]);
url_in = (argc == 3) ? argv[2] : remote->url[0];
 
-   end_url_with_slash(&buf, url_in);
-   url = strbuf_detach(&buf, NULL);
+   if (!prefixcmp(url_in, "file://")) {
+   dump_from_file = 1;
+   url = url_decode(url_in + sizeof("file://")-1);
+   }
+   else {
+   dump_from_file = 0;
+   end_url_with_slash(&buf, url_in);
+   url = strbuf_detach(&buf, NULL);
+   }
 
strbuf_addf(&buf, "refs/svn/%s/master", remote->name);
private_ref = strbuf_detach(&buf, NULL);
-- 
1.7.9.5



[PATCH v6 08/16] remote-svn, vcs-svn: Enable fetching to private refs

2012-08-22 Thread Florian Achleitner
The ref updated by the fast-import stream is hard-coded.  When
fetching from a remote, the remote-helper shall update refs in a
private namespace, i.e. a private subdir of refs/.  This namespace is
defined by the 'refspec' capability, which the remote-helper advertises
in reply to the 'capabilities' command.

Extend svndump and fast-export to allow passing the target ref.
Update svn-fe to be compatible.
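
As a rough sketch of how the private namespace ends up in the advertised refspec, the function below formats the capability line for a given remote name. format_refspec() is a hypothetical helper. The ref layout "refs/svn/<name>/master" is taken from remote-testsvn.c elsewhere in this series.

```c
#include <stdio.h>

/*
 * Format the 'refspec' capability line that maps the remote's master
 * branch into a private namespace for `remote_name`.
 * Returns the number of bytes written, or -1 if the buffer is too small.
 */
static int format_refspec(char *out, size_t outlen, const char *remote_name)
{
	int n = snprintf(out, outlen,
			 "refspec refs/heads/master:refs/svn/%s/master\n",
			 remote_name);

	return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

The right-hand side of that refspec is the ref that has to be passed down to svndump_read() as the target of the generated commits.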

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 contrib/svn-fe/svn-fe.c |2 +-
 test-svn-fe.c   |2 +-
 vcs-svn/fast_export.c   |4 ++--
 vcs-svn/fast_export.h   |2 +-
 vcs-svn/svndump.c   |   14 +++---
 vcs-svn/svndump.h   |2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.c
index 35db24f..c796cc0 100644
--- a/contrib/svn-fe/svn-fe.c
+++ b/contrib/svn-fe/svn-fe.c
@@ -10,7 +10,7 @@ int main(int argc, char **argv)
 {
if (svndump_init(NULL))
return 1;
-   svndump_read((argc > 1) ? argv[1] : NULL);
+   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/test-svn-fe.c b/test-svn-fe.c
index 83633a2..cb0d80f 100644
--- a/test-svn-fe.c
+++ b/test-svn-fe.c
@@ -40,7 +40,7 @@ int main(int argc, char *argv[])
if (argc == 2) {
if (svndump_init(argv[1]))
return 1;
-   svndump_read(NULL);
+   svndump_read(NULL, "refs/heads/master");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 1f04697..11f8f94 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -72,7 +72,7 @@ static char gitsvnline[MAX_GITSVN_LINE_LEN];
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log,
const char *uuid, const char *url,
-   unsigned long timestamp)
+   unsigned long timestamp, const char *local_ref)
 {
static const struct strbuf empty = STRBUF_INIT;
if (!log)
@@ -84,7 +84,7 @@ void fast_export_begin_commit(uint32_t revision, const char *author,
} else {
*gitsvnline = '\0';
}
-   printf("commit refs/heads/master\n");
+   printf("commit %s\n", local_ref);
printf("mark :%"PRIu32"\n", revision);
printf("committer %s <%s@%s> %ld +\n",
   *author ? author : "nobody",
diff --git a/vcs-svn/fast_export.h b/vcs-svn/fast_export.h
index 8823aca..17eb13b 100644
--- a/vcs-svn/fast_export.h
+++ b/vcs-svn/fast_export.h
@@ -11,7 +11,7 @@ void fast_export_delete(const char *path);
 void fast_export_modify(const char *path, uint32_t mode, const char *dataref);
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log, const char *uuid,
-   const char *url, unsigned long timestamp);
+   const char *url, unsigned long timestamp, const char *local_ref);
 void fast_export_end_commit(uint32_t revision);
 void fast_export_data(uint32_t mode, off_t len, struct line_buffer *input);
 void fast_export_blob_delta(uint32_t mode,
diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index d81a078..288bb42 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -299,22 +299,22 @@ static void handle_node(void)
node_ctx.text_length, &input);
 }
 
-static void begin_revision(void)
+static void begin_revision(const char *remote_ref)
 {
if (!rev_ctx.revision)  /* revision 0 gets no git commit. */
return;
fast_export_begin_commit(rev_ctx.revision, rev_ctx.author.buf,
&rev_ctx.log, dump_ctx.uuid.buf, dump_ctx.url.buf,
-   rev_ctx.timestamp);
+   rev_ctx.timestamp, remote_ref);
 }
 
-static void end_revision(void)
+static void end_revision()
 {
if (rev_ctx.revision)
fast_export_end_commit(rev_ctx.revision);
 }
 
-void svndump_read(const char *url)
+void svndump_read(const char *url, const char *local_ref)
 {
char *val;
char *t;
@@ -353,7 +353,7 @@ void svndump_read(const char *url)
if (active_ctx == NODE_CTX)
handle_node();
if (active_ctx == REV_CTX)
-   begin_revision();
+   begin_revision(local_ref);
if (active_ctx != DUMP_CTX)
end_revision();
active_ctx = REV_CTX;
@@ -366,7 +366,7 @@ void svndump_rea

[PATCH v6 01/16] Implement a remote helper for svn in C

2012-08-22 Thread Florian Achleitner
Enable basic fetching from subversion repositories. When processing
remote URLs starting with testsvn::, git invokes this remote-helper.
It starts svnrdump to extract revisions from the subversion repository
in the 'dump file format', and converts them to a git-fast-import stream
using the functions of vcs-svn/.

Imported refs are created in a private namespace at
refs/svn//master.  The revision history is imported
linearly (no branch detection) and completely, i.e. from revision 0 to
HEAD.

The 'bidi-import' capability is used. The remote-helper expects data
from fast-import on its stdin. It buffers a batch of 'import' command
lines in a string_list before starting to process them.
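
The buffering idea can be sketched without git's string_list as follows. This is a simplified stand-in, not the code of the patch: struct batch, batch_feed() and count_line() are hypothetical names, the buffer has a fixed capacity, and an empty line outside a batch is simply ignored here (the real helper ends the program in that case).

```c
#include <stdlib.h>
#include <string.h>

#define MAX_BATCH 64

struct batch {
	char *lines[MAX_BATCH];
	size_t nr;
};

/*
 * Feed one input line (without trailing newline) to the batch.
 * Non-empty lines are only buffered.  An empty line ends the batch:
 * every buffered line is passed to `handler` and the buffer is cleared.
 * Returns the number of lines handled at a batch end, 0 while buffering,
 * and -1 if the buffer is full or memory runs out.
 */
static int batch_feed(struct batch *b, const char *line,
		      void (*handler)(const char *line, void *data),
		      void *data)
{
	size_t i, handled;

	if (*line) {
		char *copy;

		if (b->nr == MAX_BATCH)
			return -1;
		copy = malloc(strlen(line) + 1);
		if (!copy)
			return -1;
		strcpy(copy, line);
		b->lines[b->nr++] = copy;
		return 0;
	}

	handled = b->nr;
	for (i = 0; i < b->nr; i++) {
		handler(b->lines[i], data);
		free(b->lines[i]);
	}
	b->nr = 0;
	return (int)handled;
}

/* Example handler: counts the lines it is given. */
static void count_line(const char *line, void *data)
{
	(void)line;
	++*(int *)data;
}
```

Nothing is handed to the handler until the terminating blank line has been seen, so no output reaches fast-import while git may still be writing commands.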

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 remote-testsvn.c |  174 ++
 1 file changed, 174 insertions(+)
 create mode 100644 remote-testsvn.c

diff --git a/remote-testsvn.c b/remote-testsvn.c
new file mode 100644
index 000..ebe803b
--- /dev/null
+++ b/remote-testsvn.c
@@ -0,0 +1,174 @@
+#include "cache.h"
+#include "remote.h"
+#include "strbuf.h"
+#include "url.h"
+#include "exec_cmd.h"
+#include "run-command.h"
+#include "vcs-svn/svndump.h"
+#include "notes.h"
+#include "argv-array.h"
+
+static const char *url;
+static const char *private_ref;
+static const char *remote_ref = "refs/heads/master";
+
+static int cmd_capabilities(const char *line);
+static int cmd_import(const char *line);
+static int cmd_list(const char *line);
+
+typedef int (*input_command_handler)(const char *);
+struct input_command_entry {
+   const char *name;
+   input_command_handler fn;
+   unsigned char batchable;   /* whether the command starts or is part of a batch */
+};
+
+static const struct input_command_entry input_command_list[] = {
+   { "capabilities", cmd_capabilities, 0 },
+   { "import", cmd_import, 1 },
+   { "list", cmd_list, 0 },
+   { NULL, NULL }
+};
+
+static int cmd_capabilities(const char *line) {
+   printf("import\n");
+   printf("bidi-import\n");
+   printf("refspec %s:%s\n\n", remote_ref, private_ref);
+   fflush(stdout);
+   return 0;
+}
+
+static void terminate_batch(void)
+{
+   /* terminate a current batch's fast-import stream */
+   printf("done\n");
+   fflush(stdout);
+}
+
+static int cmd_import(const char *line)
+{
+   int code;
+   int dumpin_fd;
+   unsigned int startrev = 0;
+   struct argv_array svndump_argv = ARGV_ARRAY_INIT;
+   struct child_process svndump_proc;
+
+   memset(&svndump_proc, 0, sizeof(struct child_process));
+   svndump_proc.out = -1;
+   argv_array_push(&svndump_argv, "svnrdump");
+   argv_array_push(&svndump_argv, "dump");
+   argv_array_push(&svndump_argv, url);
+   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
+   svndump_proc.argv = svndump_argv.argv;
+
+   code = start_command(&svndump_proc);
+   if (code)
+   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
+   dumpin_fd = svndump_proc.out;
+
+   svndump_init_fd(dumpin_fd, STDIN_FILENO);
+   svndump_read(url, private_ref);
+   svndump_deinit();
+   svndump_reset();
+
+   close(dumpin_fd);
+   code = finish_command(&svndump_proc);
+   if (code)
+   warning("%s, returned %d", svndump_proc.argv[0], code);
+   argv_array_clear(&svndump_argv);
+
+   return 0;
+}
+
+static int cmd_list(const char *line)
+{
+   printf("? %s\n\n", remote_ref);
+   fflush(stdout);
+   return 0;
+}
+
+static int do_command(struct strbuf *line)
+{
+   const struct input_command_entry *p = input_command_list;
+   static struct string_list batchlines = STRING_LIST_INIT_DUP;
+   static const struct input_command_entry *batch_cmd;
+   /*
+* commands can be grouped together in a batch.
+* Batches are ended by \n. If no batch is active the program ends.
+* During a batch all lines are buffered and passed to the handler function
+* when the batch is terminated.
+*/
+   if (line->len == 0) {
+   if (batch_cmd) {
+   struct string_list_item *item;
+   for_each_string_list_item(item, &batchlines)
+   batch_cmd->fn(item->string);
+   terminate_batch();
+   batch_cmd = NULL;
+   string_list_clear(&batchlines, 0);
+   return 0;   /* end of the batch, continue reading other commands. */
+   }
+   return 1;   /* end 

[PATCH v6 01/16] GSOC remote-svn

2012-08-22 Thread Florian Achleitner
Another improved series with fixups by Junio, and a little by me.
Diff:
- fix inconsistent indent in Documentation/git-remote-helpers.txt
- remove trailing newline in Makefile
- fix argument list and usage of regenerate_marks(void) in remote-svn.c


[PATCH v6 01/16] Implement a remote helper for svn in C
[PATCH v6 02/16] Add git-remote-testsvn to Makefile
[PATCH v6 03/16] Add svndump_init_fd to allow reading dumps from
[PATCH v6 04/16] Add argv_array_detach and argv_array_free_detached
[PATCH v6 05/16] Connect fast-import to the remote-helper via pipe,
[PATCH v6 06/16] Add documentation for the 'bidi-import' capability
[PATCH v6 07/16] When debug==1, start fast-import with "--stats"
[PATCH v6 08/16] remote-svn, vcs-svn: Enable fetching to private
[PATCH v6 09/16] Allow reading svn dumps from files via file:// urls
[PATCH v6 10/16] vcs-svn: add fast_export_note to create notes
[PATCH v6 11/16] Create a note for every imported commit containing
[PATCH v6 12/16] remote-svn: Activate import/export-marks for
[PATCH v6 13/16] remote-svn: add incremental import
[PATCH v6 14/16] Add a svnrdump-simulator replaying a dump file for
[PATCH v6 15/16] remote-svn: add marks-file regeneration
[PATCH v6 16/16] Add a test script for remote-svn


Re: [PATCH v5 15/16] remote-svn: add marks-file regeneration

2012-08-21 Thread Florian Achleitner
On Monday 20 August 2012 16:20:27 Junio C Hamano wrote:
> Junio C Hamano  writes:
> > I think you meant something like:
> > 
> >   init_notes(NULL, notes_ref, NULL, 0);
> >
> > marksfile = fopen(marksfilename, "r");
> > if (!marksfile) {
> >   regenerate_marks(marksfilename);
> > marksfile = fopen(marksfilename, "r");

Btw, this FILE* is never closed in your fixed-up version in fa/remote-svn.

> > if (!marksfile)
> >
> >   die("cannot read marks file!");
> >   } else {
> >
> >   ...
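
A minimal sketch of the open, regenerate, reopen sequence with the missing fclose() added. count_marks() and regenerate_marks_stub() are hypothetical stand-ins for illustration only; the real code reads the marks rather than counting lines.

```c
#include <stdio.h>

/* Hypothetical stand-in for regenerate_marks(): writes a one-line file. */
static void regenerate_marks_stub(const char *path)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return;
	fputs(":1 0000000000000000000000000000000000000000\n", f);
	fclose(f);
}

/*
 * Count the lines of the marks file, regenerating it first if it cannot
 * be opened.  Returns -1 if the file is still unreadable afterwards.
 */
static int count_marks(const char *path)
{
	FILE *marksfile = fopen(path, "r");
	int c, lines = 0;

	if (!marksfile) {
		regenerate_marks_stub(path);
		marksfile = fopen(path, "r");
		if (!marksfile)
			return -1;
	}
	while ((c = fgetc(marksfile)) != EOF)
		if (c == '\n')
			lines++;
	fclose(marksfile);	/* the close that was missing */
	return lines;
}
```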



Re: [PATCH v5 15/16] remote-svn: add marks-file regeneration

2012-08-21 Thread Florian Achleitner
On Monday 20 August 2012 16:20:27 Junio C Hamano wrote:
> Junio C Hamano  writes:
> > I think you meant something like:
> > init_notes(NULL, notes_ref, NULL, 0);
> > 
> > marksfile = fopen(marksfilename, "r");
> > if (!marksfile) {
> > 
> > regenerate_marks(marksfilename);
> > 
> > marksfile = fopen(marksfilename, "r");
> > if (!marksfile)
> > 
> > die("cannot read marks file!");
> > 
> > } else {
> > 
> > ...
> > 
> > Also there is another call to regenerate_marks() without any
> > argument.  Has this even been compile-tested?

Yes, it compiled and it works (it is tested by t9020), but the compiler didn't 
complain because I left out void, so any argument list was accepted. I need to 
get used to that C feature.

> 
> I've made regenerate_marks() to take (void) parameter list, as
> marksfilename is a file scope static and visible to everybody, and
> applied something like the above and queued the result in 'pu'.

That's exactly how I meant it. Thanks for your fixups!
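
For the record, a small sketch of the difference. regenerate_marks_sketch() is a made-up name. As far as I understand, C standards before C23 treat an empty parameter list as "arguments unspecified", so only the (void) form gives the compiler a prototype to check calls against.

```c
/* Prototype: the compiler knows this function takes no arguments. */
static int regenerate_marks_sketch(void);

static int regenerate_marks_sketch(void)
{
	return 0;
}

/*
 * With the prototype above, a call such as
 *
 *     regenerate_marks_sketch("marks");
 *
 * is rejected at compile time.  Had the function been declared with
 * empty parentheses, a compiler following a C standard older than C23
 * would have accepted the call without complaint.
 */
```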


[RFC 3/4] vcs-svn/svndump: rewrite handle_node(), begin|end_revision()

2012-08-20 Thread Florian Achleitner
Separate deciding what to do from actually doing it in handle_node(),
to allow for detection of branches from svn nodes. The work is split
between handle_node() and apply_node().

svn dumps are structured in revisions, which contain multiple nodes.
Nodes represent operations on data. Currently the function
handle_node() strongly mixes the interpretation of the node data with
the output of processed data to fast-import.

In a fast-import stream, each commit starts by naming the branch to
which the new commit is added.

We want to detect branches in svn. This can only be done by analyzing
node operations, like copyfrom. This conflicts with the current
implementation, where at the beginning of each new revision in the svn
dump, a new commit on a hard-coded git branch is created, before even
reading the first node.

To allow analyzing the nodes before deciding on which branch the
commit will be placed, store the node metadata of one complete
revision, and create a commit from it, when it ends.

Each node can have file data appended. It's desirable to not store the
actual file data, as it is unbounded.  fast-import has a 'blob'
command that allows writing blobs, independent of commits. Use this
feature instead of sending data inline and send the actual file data
immediately when it is read in.

Use marks to reference a blob later. fast-import's marks are currently
used for marking commits, where the mark number corresponds to exactly
one svn revision.
Store the marks for blobs in the upper half of the marks number space
where the MSB is 1.
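
The split of the mark number space could be sketched like this. The names are hypothetical. Unlike the diff below, the sketch shifts a uintmax_t rather than 1UL, which is my own choice to keep the shift well defined where unsigned long is narrower than uintmax_t.

```c
#include <limits.h>
#include <stdint.h>

/* Highest bit of uintmax_t: set for blob marks, clear for commit marks. */
#define BLOB_MARK_BIT ((uintmax_t)1 << (sizeof(uintmax_t) * CHAR_BIT - 1))

/* Mark for the n-th blob: lives in the upper half of the number space. */
static uintmax_t blob_mark(uintmax_t n)
{
	return n | BLOB_MARK_BIT;
}

/* Mark for a commit: simply the svn revision number. */
static uintmax_t commit_mark(uint32_t revision)
{
	return revision;
}

static int is_blob_mark(uintmax_t mark)
{
	return (mark & BLOB_MARK_BIT) != 0;
}
```

Because revision numbers are 32-bit, commit marks never reach the upper half, so the two kinds of marks cannot collide.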

Change handle_node() to interpret the node data, store it in a
node_ctx, send blobs to fast-import, and append the new node_ctx to
the list of node_ctx.  Do this until the end of a revision.

Just clear the list of node_ctx in begin_revision().

At end_revision() all node metadata is available in the node_ctx list.
Future branch detectors can decide what branches are to be changed.
Then, call apply_node() for each of them to actually create a commit
and change/add/delete files according to the node_ctx using the
already added blobs.

This can also be used to create commits if the node metadata does not
come from a svndump, but is stored in e.g. notes, for later branch
detection.
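
The collect-then-apply flow can be sketched with a plain singly linked list that keeps a tail pointer. This is a simplified illustration, not the code of the patch: struct node carries only an integer action, and node_list_append(), node_list_flush() and sum_actions() are hypothetical names.

```c
#include <stdlib.h>

struct node {
	int action;
	struct node *next;
};

struct node_list {
	struct node *head, *tail;
};

/* Append a node for one svn node record; returns -1 if out of memory. */
static int node_list_append(struct node_list *l, int action)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->action = action;
	n->next = NULL;
	if (!l->head)
		l->head = l->tail = n;
	else {
		l->tail->next = n;
		l->tail = n;
	}
	return 0;
}

/*
 * At the end of a revision: apply every collected node in order, free it,
 * and leave the list empty.  Returns the number of nodes applied.
 */
static int node_list_flush(struct node_list *l,
			   void (*apply)(const struct node *n, void *data),
			   void *data)
{
	struct node *p = l->head, *next;
	int applied = 0;

	while (p) {
		next = p->next;
		apply(p, data);
		free(p);
		applied++;
		p = next;
	}
	l->head = l->tail = NULL;
	return applied;
}

/* Example apply callback: sums up the action values it sees. */
static void sum_actions(const struct node *n, void *data)
{
	*(int *)data += n->action;
}
```

A branch detector would sit between the last append and the flush, looking at the whole list before any commit is written.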

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 vcs-svn/svndump.c |  167 ++---
 1 file changed, 109 insertions(+), 58 deletions(-)

diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index 385523a..28ce2aa 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -48,7 +48,6 @@ static struct node_ctx_t *node_list, *node_list_tail;
 static struct node_ctx_t *new_node_ctx(char *fname)
 {
struct node_ctx_t *node = xmalloc(sizeof(struct node_ctx_t));
-   trace_printf("new_node_ctx %p\n", node);
node->type = 0;
node->action = NODEACT_UNKNOWN;
node->prop_length = -1;
@@ -67,7 +66,6 @@ static struct node_ctx_t *new_node_ctx(char *fname)
 
 static void free_node_ctx(struct node_ctx_t *node)
 {
-   trace_printf("free_node_ctx %p\n", node);
strbuf_release(&node->src);
strbuf_release(&node->dst);
free((char*)node->dataref);
@@ -77,7 +75,6 @@ static void free_node_ctx(struct node_ctx_t *node)
 static void free_node_list(void)
 {
struct node_ctx_t *p = node_list, *n;
-   trace_printf("free_node_list head %p tail %p\n", node_list, node_list_tail);
while (p) {
n = p->next;
free_node_ctx(p);
@@ -88,7 +85,6 @@ static void free_node_list(void)
 
 static void append_node_list(struct node_ctx_t *n)
 {
-   trace_printf("append_node_list %p head %p tail %p\n", n, node_list, node_list_tail);
if (!node_list)
node_list = node_list_tail = n;
else {
@@ -246,23 +242,10 @@ static void handle_node(struct node_ctx_t *node)
static const char *const empty_blob = "::empty::";
const char *old_data = NULL;
uint32_t old_mode = REPO_MODE_BLB;
+   struct strbuf sb = STRBUF_INIT;
+   static uintmax_t blobmark = 1UL << (bitsizeof(uintmax_t) - 1);
+
 
-   if (node->action == NODEACT_DELETE) {
-   if (have_text || have_props || node->srcRev)
-   die("invalid dump: deletion node has "
-   "copyfrom info, text, or properties");
-   repo_delete(node->dst.buf);
-   return;
-   }
-   if (node->action == NODEACT_REPLACE) {
-   repo_delete(node->dst.buf);
-   node->action = NODEACT_ADD;
-   }
-   if (node->srcRev) {
-   repo_copy(node->srcRev, node->src.buf, node->dst.buf);
-   if (node->action == NODEACT_ADD)
-   node->action = NODEACT_CHANGE;
-  

[PATCH v5 11/16] Create a note for every imported commit containing svn metadata

2012-08-20 Thread Florian Achleitner
To provide metadata from svn dumps for further processing, e.g.
branch detection, attach a note to each imported commit that stores
additional information.  The notes are currently hard-coded in
refs/notes/svn/revs.  Currently the following lines from the svn dump
are directly accumulated in the note. This can be refined as needed.

 - "Revision-number"
 - "Node-path"
 - "Node-kind"
 - "Node-action"
 - "Node-copyfrom-path"
 - "Node-copyfrom-rev"

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 vcs-svn/fast_export.c |   14 --
 vcs-svn/fast_export.h |2 ++
 vcs-svn/svndump.c |   21 +++--
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 1ecae4b..df51c59 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -3,8 +3,7 @@
  * See LICENSE for details.
  */
 
-#include "git-compat-util.h"
-#include "strbuf.h"
+#include "cache.h"
 #include "quote.h"
 #include "fast_export.h"
 #include "repo_tree.h"
@@ -68,6 +67,17 @@ void fast_export_modify(const char *path, uint32_t mode, const char *dataref)
putchar('\n');
 }
 
+void fast_export_begin_note(uint32_t revision, const char *author,
+   const char *log, unsigned long timestamp)
+{
+   size_t loglen = strlen(log);
+   printf("commit refs/notes/svn/revs\n");
+   printf("committer %s <%s@%s> %ld +\n", author, author, "local", timestamp);
+   printf("data %"PRIuMAX"\n", (uintmax_t)loglen);
+   fwrite(log, loglen, 1, stdout);
+   fputc('\n', stdout);
+}
+
 void fast_export_note(const char *committish, const char *dataref)
 {
printf("N %s %s\n", dataref, committish);
diff --git a/vcs-svn/fast_export.h b/vcs-svn/fast_export.h
index 9b32f1e..c2f6f11 100644
--- a/vcs-svn/fast_export.h
+++ b/vcs-svn/fast_export.h
@@ -10,6 +10,8 @@ void fast_export_deinit(void);
 void fast_export_delete(const char *path);
 void fast_export_modify(const char *path, uint32_t mode, const char *dataref);
 void fast_export_note(const char *committish, const char *dataref);
+void fast_export_begin_note(uint32_t revision, const char *author,
+   const char *log, unsigned long timestamp);
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log, const char *uuid,
const char *url, unsigned long timestamp, const char *local_ref);
diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index 288bb42..cd65b51 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -48,7 +48,7 @@ static struct {
 static struct {
uint32_t revision;
unsigned long timestamp;
-   struct strbuf log, author;
+   struct strbuf log, author, note;
 } rev_ctx;
 
 static struct {
@@ -77,6 +77,7 @@ static void reset_rev_ctx(uint32_t revision)
rev_ctx.timestamp = 0;
strbuf_reset(&rev_ctx.log);
strbuf_reset(&rev_ctx.author);
+   strbuf_reset(&rev_ctx.note);
 }
 
 static void reset_dump_ctx(const char *url)
@@ -310,8 +311,15 @@ static void begin_revision(const char *remote_ref)
 
 static void end_revision()
 {
-   if (rev_ctx.revision)
+   struct strbuf mark = STRBUF_INIT;
+   if (rev_ctx.revision) {
fast_export_end_commit(rev_ctx.revision);
+   fast_export_begin_note(rev_ctx.revision, "remote-svn",
+   "Note created by remote-svn.", rev_ctx.timestamp);
+   strbuf_addf(&mark, ":%"PRIu32, rev_ctx.revision);
+   fast_export_note(mark.buf, "inline");
+   fast_export_buf_to_data(&rev_ctx.note);
+   }
 }
 
 void svndump_read(const char *url, const char *local_ref)
@@ -358,6 +366,7 @@ void svndump_read(const char *url, const char *local_ref)
end_revision();
active_ctx = REV_CTX;
reset_rev_ctx(atoi(val));
+   strbuf_addf(&rev_ctx.note, "%s\n", t);
break;
case sizeof("Node-path"):
if (constcmp(t, "Node-"))
@@ -369,10 +378,12 @@ void svndump_read(const char *url, const char *local_ref)
begin_revision(local_ref);
active_ctx = NODE_CTX;
reset_node_ctx(val);
+   strbuf_addf(&rev_ctx.note, "%s\n", t);
break;
}
if (constcmp(t + strlen("Node-"), "kind"))
continue;
+ 

[PATCH v5 09/16] Allow reading svn dumps from files via file:// urls

2012-08-20 Thread Florian Achleitner
For testing as well as for importing large, already available dumps,
it's useful to bypass svnrdump and replay the svndump from a file
directly.

Add support for file:// urls in the remote url, e.g.

  svn::file:///path/to/dump

When the remote helper finds a URL starting with file://, it tries to
open that file instead of invoking svnrdump.

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 remote-testsvn.c |   55 +++---
 1 file changed, 36 insertions(+), 19 deletions(-)

diff --git a/remote-testsvn.c b/remote-testsvn.c
index ebe803b..2b9d151 100644
--- a/remote-testsvn.c
+++ b/remote-testsvn.c
@@ -9,6 +9,7 @@
 #include "argv-array.h"
 
 static const char *url;
+static int dump_from_file;
 static const char *private_ref;
 static const char *remote_ref = "refs/heads/master";
 
@@ -53,29 +54,38 @@ static int cmd_import(const char *line)
struct argv_array svndump_argv = ARGV_ARRAY_INIT;
struct child_process svndump_proc;
 
-   memset(&svndump_proc, 0, sizeof(struct child_process));
-   svndump_proc.out = -1;
-   argv_array_push(&svndump_argv, "svnrdump");
-   argv_array_push(&svndump_argv, "dump");
-   argv_array_push(&svndump_argv, url);
-   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
-   svndump_proc.argv = svndump_argv.argv;
-
-   code = start_command(&svndump_proc);
-   if (code)
-   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
-   dumpin_fd = svndump_proc.out;
-
+   if (dump_from_file) {
+   dumpin_fd = open(url, O_RDONLY);
+   if(dumpin_fd < 0) {
+   die_errno("Couldn't open svn dump file %s.", url);
+   }
+   }
+   else {
+   memset(&svndump_proc, 0, sizeof(struct child_process));
+   svndump_proc.out = -1;
+   argv_array_push(&svndump_argv, "svnrdump");
+   argv_array_push(&svndump_argv, "dump");
+   argv_array_push(&svndump_argv, url);
+   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
+   svndump_proc.argv = svndump_argv.argv;
+
+   code = start_command(&svndump_proc);
+   if (code)
+   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
+   dumpin_fd = svndump_proc.out;
+   }
svndump_init_fd(dumpin_fd, STDIN_FILENO);
svndump_read(url, private_ref);
svndump_deinit();
svndump_reset();
 
close(dumpin_fd);
-   code = finish_command(&svndump_proc);
-   if (code)
-   warning("%s, returned %d", svndump_proc.argv[0], code);
-   argv_array_clear(&svndump_argv);
+   if(!dump_from_file) {
+   code = finish_command(&svndump_proc);
+   if (code)
+   warning("%s, returned %d", svndump_proc.argv[0], code);
+   argv_array_clear(&svndump_argv);
+   }
 
return 0;
 }
@@ -149,8 +159,15 @@ int main(int argc, const char **argv)
remote = remote_get(argv[1]);
url_in = (argc == 3) ? argv[2] : remote->url[0];
 
-   end_url_with_slash(&buf, url_in);
-   url = strbuf_detach(&buf, NULL);
+   if (!prefixcmp(url_in, "file://")) {
+   dump_from_file = 1;
+   url = url_decode(url_in + sizeof("file://")-1);
+   }
+   else {
+   dump_from_file = 0;
+   end_url_with_slash(&buf, url_in);
+   url = strbuf_detach(&buf, NULL);
+   }
 
strbuf_addf(&buf, "refs/svn/%s/master", remote->name);
private_ref = strbuf_detach(&buf, NULL);
-- 
1.7.9.5



[PATCH v5 08/16] remote-svn, vcs-svn: Enable fetching to private refs

2012-08-20 Thread Florian Achleitner
The ref updated by the fast-import stream is hard-coded.  When
fetching from a remote, the remote-helper shall update refs in a
private namespace, i.e. a private subdir of refs/.  This namespace is
defined by the 'refspec' capability, which the remote-helper advertises
in reply to the 'capabilities' command.

Extend svndump and fast-export to allow passing the target ref.
Update svn-fe to be compatible.

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 contrib/svn-fe/svn-fe.c |2 +-
 test-svn-fe.c   |2 +-
 vcs-svn/fast_export.c   |4 ++--
 vcs-svn/fast_export.h   |2 +-
 vcs-svn/svndump.c   |   14 +++---
 vcs-svn/svndump.h   |2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.c
index 35db24f..c796cc0 100644
--- a/contrib/svn-fe/svn-fe.c
+++ b/contrib/svn-fe/svn-fe.c
@@ -10,7 +10,7 @@ int main(int argc, char **argv)
 {
if (svndump_init(NULL))
return 1;
-   svndump_read((argc > 1) ? argv[1] : NULL);
+   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/test-svn-fe.c b/test-svn-fe.c
index 83633a2..cb0d80f 100644
--- a/test-svn-fe.c
+++ b/test-svn-fe.c
@@ -40,7 +40,7 @@ int main(int argc, char *argv[])
if (argc == 2) {
if (svndump_init(argv[1]))
return 1;
-   svndump_read(NULL);
+   svndump_read(NULL, "refs/heads/master");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 1f04697..11f8f94 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -72,7 +72,7 @@ static char gitsvnline[MAX_GITSVN_LINE_LEN];
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log,
const char *uuid, const char *url,
-   unsigned long timestamp)
+   unsigned long timestamp, const char *local_ref)
 {
static const struct strbuf empty = STRBUF_INIT;
if (!log)
@@ -84,7 +84,7 @@ void fast_export_begin_commit(uint32_t revision, const char *author,
} else {
*gitsvnline = '\0';
}
-   printf("commit refs/heads/master\n");
+   printf("commit %s\n", local_ref);
printf("mark :%"PRIu32"\n", revision);
printf("committer %s <%s@%s> %ld +0000\n",
   *author ? author : "nobody",
diff --git a/vcs-svn/fast_export.h b/vcs-svn/fast_export.h
index 8823aca..17eb13b 100644
--- a/vcs-svn/fast_export.h
+++ b/vcs-svn/fast_export.h
@@ -11,7 +11,7 @@ void fast_export_delete(const char *path);
 void fast_export_modify(const char *path, uint32_t mode, const char *dataref);
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log, const char *uuid,
-   const char *url, unsigned long timestamp);
+   const char *url, unsigned long timestamp, const char *local_ref);
 void fast_export_end_commit(uint32_t revision);
 void fast_export_data(uint32_t mode, off_t len, struct line_buffer *input);
 void fast_export_blob_delta(uint32_t mode,
diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index d81a078..288bb42 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -299,22 +299,22 @@ static void handle_node(void)
node_ctx.text_length, &input);
 }
 
-static void begin_revision(void)
+static void begin_revision(const char *remote_ref)
 {
if (!rev_ctx.revision)  /* revision 0 gets no git commit. */
return;
fast_export_begin_commit(rev_ctx.revision, rev_ctx.author.buf,
&rev_ctx.log, dump_ctx.uuid.buf, dump_ctx.url.buf,
-   rev_ctx.timestamp);
+   rev_ctx.timestamp, remote_ref);
 }
 
-static void end_revision(void)
+static void end_revision()
 {
if (rev_ctx.revision)
fast_export_end_commit(rev_ctx.revision);
 }
 
-void svndump_read(const char *url)
+void svndump_read(const char *url, const char *local_ref)
 {
char *val;
char *t;
@@ -353,7 +353,7 @@ void svndump_read(const char *url)
if (active_ctx == NODE_CTX)
handle_node();
if (active_ctx == REV_CTX)
-   begin_revision();
+   begin_revision(local_ref);
if (active_ctx != DUMP_CTX)
end_revision();
active_ctx = REV_CTX;
@@ -366,7 +366,7 @@ void svndump_rea

[PATCH v5 13/16] remote-svn: add incremental import

2012-08-20 Thread Florian Achleitner
Search for a note attached to the ref to update and read its
'Revision-number:' line. Start the import from the next svn revision.

If there is no next revision in the svn repo, svnrdump terminates with
a message on stderr and a non-zero return value. This looks a little
weird, but there is no other way to know whether there is a new
revision in the svn repo.

On the start of an incremental import, the parent of the first commit
in the fast-import stream is set to the branch name to update. All
following commits specify their parent by a mark number. Previous mark
files are currently not reused.

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 contrib/svn-fe/svn-fe.c |3 ++-
 remote-testsvn.c|   67 ---
 test-svn-fe.c   |2 +-
 vcs-svn/fast_export.c   |   10 +--
 vcs-svn/fast_export.h   |6 ++---
 vcs-svn/svndump.c   |   10 +++
 vcs-svn/svndump.h   |2 +-
 7 files changed, 84 insertions(+), 16 deletions(-)

diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.c
index c796cc0..f363505 100644
--- a/contrib/svn-fe/svn-fe.c
+++ b/contrib/svn-fe/svn-fe.c
@@ -10,7 +10,8 @@ int main(int argc, char **argv)
 {
if (svndump_init(NULL))
return 1;
-   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master");
+   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master",
+   "refs/notes/svn/revs");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/remote-testsvn.c b/remote-testsvn.c
index b6e7968..e90d221 100644
--- a/remote-testsvn.c
+++ b/remote-testsvn.c
@@ -12,7 +12,8 @@ static const char *url;
 static int dump_from_file;
 static const char *private_ref;
 static const char *remote_ref = "refs/heads/master";
-static const char *marksfilename;
+static const char *marksfilename, *notes_ref;
+struct rev_note { unsigned int rev_nr; };
 
 static int cmd_capabilities(const char *line);
 static int cmd_import(const char *line);
@@ -47,14 +48,70 @@ static void terminate_batch(void)
fflush(stdout);
 }
 
+/* NOTE: 'ref' refers to a git reference, while 'rev' refers to a svn revision. */
+static char *read_ref_note(const unsigned char sha1[20]) {
+   const unsigned char *note_sha1;
+   char *msg = NULL;
+   unsigned long msglen;
+   enum object_type type;
+   init_notes(NULL, notes_ref, NULL, 0);
+   if( (note_sha1 = get_note(NULL, sha1)) == NULL ||
+   !(msg = read_sha1_file(note_sha1, &type, &msglen)) ||
+   !msglen || type != OBJ_BLOB) {
+   free(msg);
+   return NULL;
+   }
+   free_notes(NULL);
+   return msg;
+}
+
+static int parse_rev_note(const char *msg, struct rev_note *res) {
+   const char *key, *value, *end;
+   size_t len;
+   while(*msg) {
+   end = strchr(msg, '\n');
+   len = end ? end - msg : strlen(msg);
+
+   key = "Revision-number: ";
+   if(!prefixcmp(msg, key)) {
+   long i;
+   value = msg + strlen(key);
+   i = atol(value);
+   if(i < 0 || i > UINT32_MAX)
+   return 1;
+   res->rev_nr = i;
+   }
+   msg += len + 1;
+   }
+   return 0;
+}
+
 static int cmd_import(const char *line)
 {
int code;
int dumpin_fd;
-   unsigned int startrev = 0;
+   char *note_msg;
+   unsigned char head_sha1[20];
+   unsigned int startrev;
struct argv_array svndump_argv = ARGV_ARRAY_INIT;
struct child_process svndump_proc;
 
+   if(read_ref(private_ref, head_sha1))
+   startrev = 0;
+   else {
+   note_msg = read_ref_note(head_sha1);
+   if(note_msg == NULL) {
+   warning("No note found for %s.", private_ref);
+   startrev = 0;
+   }
+   else {
+   struct rev_note note = { 0 };
+   parse_rev_note(note_msg, &note);
+   startrev = note.rev_nr + 1;
+   free(note_msg);
+   }
+   }
+
if (dump_from_file) {
dumpin_fd = open(url, O_RDONLY);
if(dumpin_fd < 0) {
@@ -80,7 +137,7 @@ static int cmd_import(const char *line)
"feature export-marks=%s\n", marksfilename, 
marksfilename);
 
svndump_init_fd(dumpin_fd, STDIN_FILENO);
-   svndump_read(url, private_ref);
+   svndump_read(url, private_ref, notes_ref);
svndump_deinit();
svndump_reset();
 
@@ -177,6 +234,9 @@ int main(int argc, const char **argv)
strbuf_addf(&buf, &qu

[PATCH v5 06/16] Add documentation for the 'bidi-import' capability of remote-helpers

2012-08-20 Thread Florian Achleitner
Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 Documentation/git-remote-helpers.txt |   21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-remote-helpers.txt b/Documentation/git-remote-helpers.txt
index f5836e4..5faa48e 100644
--- a/Documentation/git-remote-helpers.txt
+++ b/Documentation/git-remote-helpers.txt
@@ -98,6 +98,20 @@ advertised with this capability must cover all refs reported by
 the list command.  If no 'refspec' capability is advertised,
 there is an implied `refspec *:*`.
 
+'bidi-import'::
+   The fast-import commands 'cat-blob' and 'ls' can be used by remote-helpers
+   to retrieve information about blobs and trees that already exist in
+   fast-import's memory. This requires a channel from fast-import to the
+   remote-helper.
+   If it is advertised in addition to "import", git establishes a pipe from
+   fast-import to the remote-helper's stdin.
+   It follows that git and fast-import are both connected to the
+   remote-helper's stdin. Because git can send multiple commands to
+   the remote-helper it is required that helpers that use 'bidi-import'
+   buffer all 'import' commands of a batch before sending data to fast-import.
+   This is to prevent mixing commands and fast-import responses on the
+   helper's stdin.
+
 Capabilities for Pushing
 
 'connect'::
@@ -286,7 +300,12 @@ terminated with a blank line. For each batch of 'import', the remote
 helper should produce a fast-import stream terminated by a 'done'
 command.
 +
-Supported if the helper has the "import" capability.
+Note that if the 'bidi-import' capability is used the complete batch
+sequence has to be buffered before starting to send data to fast-import
+to prevent mixing of commands and fast-import responses on the helper's
+stdin.
++
+Supported if the helper has the 'import' capability.
 
 'connect' ::
Connects to given service. Standard input and standard output
-- 
1.7.9.5



[PATCH v5 01/16] Implement a remote helper for svn in C

2012-08-20 Thread Florian Achleitner
Enable basic fetching from subversion repositories. When processing
remote URLs starting with testsvn::, git invokes this remote-helper.
It starts svnrdump to extract revisions from the subversion repository
in the 'dump file format', and converts them to a git-fast-import stream
using the functions of vcs-svn/.

Imported refs are created in a private namespace at
refs/svn//master.  The revision history is imported
linearly (no branch detection) and completely, i.e. from revision 0 to
HEAD.

The 'bidi-import' capability is used. The remote-helper expects data
from fast-import on its stdin. It buffers a batch of 'import' command
lines in a string_list before starting to process them.

Signed-off-by: Florian Achleitner 
Signed-off-by: Junio C Hamano 
---
 remote-testsvn.c |  174 ++
 1 file changed, 174 insertions(+)
 create mode 100644 remote-testsvn.c

diff --git a/remote-testsvn.c b/remote-testsvn.c
new file mode 100644
index 0000000..ebe803b
--- /dev/null
+++ b/remote-testsvn.c
@@ -0,0 +1,174 @@
+#include "cache.h"
+#include "remote.h"
+#include "strbuf.h"
+#include "url.h"
+#include "exec_cmd.h"
+#include "run-command.h"
+#include "vcs-svn/svndump.h"
+#include "notes.h"
+#include "argv-array.h"
+
+static const char *url;
+static const char *private_ref;
+static const char *remote_ref = "refs/heads/master";
+
+static int cmd_capabilities(const char *line);
+static int cmd_import(const char *line);
+static int cmd_list(const char *line);
+
+typedef int (*input_command_handler)(const char *);
+struct input_command_entry {
+   const char *name;
+   input_command_handler fn;
+   unsigned char batchable;/* whether the command starts or is part of a batch */
+};
+
+static const struct input_command_entry input_command_list[] = {
+   { "capabilities", cmd_capabilities, 0 },
+   { "import", cmd_import, 1 },
+   { "list", cmd_list, 0 },
+   { NULL, NULL }
+};
+
+static int cmd_capabilities(const char *line) {
+   printf("import\n");
+   printf("bidi-import\n");
+   printf("refspec %s:%s\n\n", remote_ref, private_ref);
+   fflush(stdout);
+   return 0;
+}
+
+static void terminate_batch(void)
+{
+   /* terminate a current batch's fast-import stream */
+   printf("done\n");
+   fflush(stdout);
+}
+
+static int cmd_import(const char *line)
+{
+   int code;
+   int dumpin_fd;
+   unsigned int startrev = 0;
+   struct argv_array svndump_argv = ARGV_ARRAY_INIT;
+   struct child_process svndump_proc;
+
+   memset(&svndump_proc, 0, sizeof(struct child_process));
+   svndump_proc.out = -1;
+   argv_array_push(&svndump_argv, "svnrdump");
+   argv_array_push(&svndump_argv, "dump");
+   argv_array_push(&svndump_argv, url);
+   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
+   svndump_proc.argv = svndump_argv.argv;
+
+   code = start_command(&svndump_proc);
+   if (code)
+   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
+   dumpin_fd = svndump_proc.out;
+
+   svndump_init_fd(dumpin_fd, STDIN_FILENO);
+   svndump_read(url, private_ref);
+   svndump_deinit();
+   svndump_reset();
+
+   close(dumpin_fd);
+   code = finish_command(&svndump_proc);
+   if (code)
+   warning("%s, returned %d", svndump_proc.argv[0], code);
+   argv_array_clear(&svndump_argv);
+
+   return 0;
+}
+
+static int cmd_list(const char *line)
+{
+   printf("? %s\n\n", remote_ref);
+   fflush(stdout);
+   return 0;
+}
+
+static int do_command(struct strbuf *line)
+{
+   const struct input_command_entry *p = input_command_list;
+   static struct string_list batchlines = STRING_LIST_INIT_DUP;
+   static const struct input_command_entry *batch_cmd;
+   /*
+* commands can be grouped together in a batch.
+* Batches are ended by \n. If no batch is active the program ends.
+* During a batch all lines are buffered and passed to the handler function
+* when the batch is terminated.
+*/
+   if (line->len == 0) {
+   if (batch_cmd) {
+   struct string_list_item *item;
+   for_each_string_list_item(item, &batchlines)
+   batch_cmd->fn(item->string);
+   terminate_batch();
+   batch_cmd = NULL;
+   string_list_clear(&batchlines, 0);
+   return 0;   /* end of the batch, continue reading other commands. */
+   }
+   return 1;   /* end 

[PATCH v5 00/16] GSOC remote-svn

2012-08-20 Thread Florian Achleitner
New version with these changes:

- includes fixups and changes by Junio from fa/remote-svn
- move contrib/svn-fe/remote-svn.c to remote-testsvn.c (in toplevel)
- add it to the toplevel Makefile
  (needed to copy the linker rule, is there a nicer way?)
- check for prerequisite in test script (probably not needed, 
  because it's built automatically now)

 [PATCH v5 01/16] Implement a remote helper for svn in C
 [PATCH v5 02/16] Add git-remote-testsvn to Makefile.
 [PATCH v5 03/16] Add svndump_init_fd to allow reading dumps from
 [PATCH v5 04/16] Add argv_array_detach and argv_array_free_detached
 [PATCH v5 05/16] Connect fast-import to the remote-helper via pipe,
 [PATCH v5 06/16] Add documentation for the 'bidi-import' capability
 [PATCH v5 07/16] When debug==1, start fast-import with "--stats"
 [PATCH v5 08/16] remote-svn, vcs-svn: Enable fetching to private
 [PATCH v5 09/16] Allow reading svn dumps from files via file:// urls
 [PATCH v5 10/16] vcs-svn: add fast_export_note to create notes
 [PATCH v5 11/16] Create a note for every imported commit containing
 [PATCH v5 12/16] remote-svn: Activate import/export-marks for
 [PATCH v5 13/16] remote-svn: add incremental import
 [PATCH v5 14/16] Add a svnrdump-simulator replaying a dump file for
 [PATCH v5 15/16] remote-svn: add marks-file regeneration
 [PATCH v5 16/16] Add a test script for remote-svn


Re: t9020 broken on pu ?

2012-08-20 Thread Florian Achleitner
On Monday 20 August 2012 22:56:35 Torsten Bögershausen wrote:
> t9020 from pu doesn't work for me (neither linux nor Mac OS)
> 
> I haven't been able to find out more than this:
> 
> Initialized empty Git repository in /home/tb/projects/git/git.pu/t/trash
> directory.t9020-remote-svn/.git/
> expecting success:
>  init_git &&
>  git fetch svnsim &&
>  test_cmp .git/refs/svn/svnsim/master
> .git/refs/remotes/svnsim/master  &&
>  cp .git/refs/remotes/svnsim/master master.good
> 
> Initialized empty Git repository in /home/tb/projects/git/git.pu/t/trash
> directory.t9020-remote-svn/.git/
> fatal: Unable to find remote helper for 'svn'
> not ok - 1 simple fetch

The reason is that contrib/svn-fe, where remote-svn lives, is not yet built 
automatically by the toplevel Makefile, so the remote helper can't be found.
If you build it manually it should work.
Working on it ..




Re: [RFC 1/5] GSOC: prepare svndump for branch detection

2012-08-20 Thread Florian Achleitner
On Monday 20 August 2012 09:45:30 Jonathan Nieder wrote:
> Florian Achleitner wrote:
> > Currently, the mark number is equal to the svn revision number the commit
> > corresponds to. I didn't want to break that, but not mandatory. We could
> > also split the mark namespace by reserving one or more of the most
> > significant bits as a type specifier.
> > I'll develop a marks-based version ..
> 
> Have we already exhausted possibilities that don't involve changing
> vcs-svn/ code quite so much?  One possibility mentioned before was to
> post-process the stream that svn-fe produces, which seemed appealing
> from a debuggability point of view.
> 

Do you mean like another program in the pipe, that translates the fast-import 
stream produced by svn-fe into another fast-import stream?
svnrdump | svn-fe | svnbranchdetect | git-fast-import ?

My two previous ideas were meant like this:
1. Import everything into git and detect branches on the stuff in git, or
2. detect branches as it imports.

Both require creating commits. So the idea behind these patches is to split 
the creation of commits from the creation of data, so that the data can be 
sent immediately as it comes in from svnrdump, which saves memory by not 
buffering it. 

The commits are created later: either all linear, with a later split into 
branches that requires creating commits but not data, or as branched commits 
right away. The latter requires inspecting all node data before starting a 
commit.

Anyways it's just an idea..

> Curious,
> Jonathan

Hope that helps,
Florian


Re: [PATCH/RFC v4 01/16] GSOC remote-svn

2012-08-20 Thread Florian Achleitner
On Saturday 18 August 2012 13:13:47 Junio C Hamano wrote:
> That indicates that one necessary patch to add logic to Makefile to
> go and build that subdirectory, at least before running the test,
> but possibly as part of the "all" target, is missing, isn't it?
> 
> Or you can add, at the beginning of your tests files that require
> the contrib bit, to have something like
> 
> if test -e "$GIT_BUILD_DIR/remote-svn"
> then
> test_set_prereq REMOTE_SVN
> fi
> 
> and protect your tests with the prerequisite, e.g.
> 
> test_expect_success REMOTE_SVN 'test svn:// URL' '
> ...
> '
> 
> without changing the top-level Makefile.

Which version would you prefer? Currently nothing in contrib/ is built by the 
toplevel Makefile..


Re: [RFC 1/5] GSOC: prepare svndump for branch detection

2012-08-20 Thread Florian Achleitner
On Sunday 19 August 2012 23:57:23 Junio C Hamano wrote:
> Florian Achleitner  writes:
> >> This change makes me uncomfortable.
> >> We are doubling up on hashing with fast-import.
> >> This introduces git-specific logic into vcs-svn.
> 
> IIUC, vcs-svn/fast-export is meant to produce a stream in the
> fast-import format, and that format is meant to be VCS agnostic,
> it would need a careful thinking to add anything Git specific to
> it.  If you make other people's importers unable to read from you
> because you tell them the contents of blob in Git's terms, that is
> not very good.

Good point.

> 
> > You have two choices of referencing that blobs later, by using a mark, or
> > by giving their sha1. Marks are already used for marking commits, and
> > there is only one "mark namespace". So I couldn't use marks to reference
> > the blobs in a nice way. This allows for referencing them by their sha1.
> 
> Surely you can, by using even and odd numbers (or modulo 4 if you
> may later want to mark trees and tags as well, but I doubt that is
> needed), no?

Currently, the mark number is equal to the svn revision number the commit 
corresponds to. I didn't want to break that, but it is not mandatory. We could 
also split the mark namespace by reserving one or more of the most significant 
bits as a type specifier. 
I'll develop a marks-based version ..
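
To make the bit-splitting idea concrete, here is a minimal sketch. It is not
code from the series: the names MARK_TYPE_BLOB, commit_mark(), blob_mark(),
blob_index() and mark_is_blob() are made up for illustration, and it assumes
32-bit marks with only the top bit reserved as the type specifier, so that
commit marks still equal the svn revision number.

```c
#include <stdint.h>

/* Hypothetical layout: the most significant bit of a 32-bit mark is the type. */
#define MARK_TYPE_BLOB  (UINT32_C(1) << 31)
#define MARK_VALUE_MASK (~MARK_TYPE_BLOB)

/*
 * Commit marks keep the svn revision number unchanged.
 * Revisions of 2^31 or more would not fit and are truncated here.
 */
static uint32_t commit_mark(uint32_t revision)
{
	return revision & MARK_VALUE_MASK;
}

/* Blob marks come from a separate counter and carry the type bit. */
static uint32_t blob_mark(uint32_t blob_nr)
{
	return (blob_nr & MARK_VALUE_MASK) | MARK_TYPE_BLOB;
}

/* Recover the blob counter value from a blob mark. */
static uint32_t blob_index(uint32_t mark)
{
	return mark & MARK_VALUE_MASK;
}

/* Non-zero if the mark belongs to the blob half of the namespace. */
static int mark_is_blob(uint32_t mark)
{
	return (mark & MARK_TYPE_BLOB) != 0;
}
```

With this layout commit_mark(42) is still 42, while blob_mark(42) is a
different number, so both can coexist in fast-import's single mark namespace.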




Re: [RFC 1/5] GSOC: prepare svndump for branch detection

2012-08-19 Thread Florian Achleitner
On Sunday 19 August 2012 04:37:35 David Michael Barr wrote:
> On Sat, Aug 18, 2012 at 6:40 AM, Florian Achleitner
> 
>  wrote:
> > Hi!
> > 
> > This patch series should prepare vcs-svn/svndump.* for branch
> > detection. When starting with this feature I found that the existing
> > functions are not yet appropriate for that.
> > This series rewrites the node handling part of svndump.c; it is very
> > invasive. The logic in handle_node is not simple, I hope that I
> > understood every case the existing code tries to address.
> > At least it doesn't break an existing testcase.
> > 
> > The series applies on top of:
> > [PATCH/RFC v4 16/16] Add a test script for remote-svn.
> > I could also rebase it onto master if you think it makes sense.
> > 
> > Florian
> > 
> >  [RFC 1/5] vcs-svn: Add sha1 calculaton to fast_export and
> 
> This change makes me uncomfortable.
> We are doubling up on hashing with fast-import.
> This introduces git-specific logic into vcs-svn.

You might need to read the rest of the series to see why I did this.
Short version: For fast-import, I separated sending data from the commits; it 
is sent using the 'blob' command.
You have two choices for referencing those blobs later: by using a mark, or by 
giving their sha1. Marks are already used for marking commits, and there is 
only one "mark namespace". So I couldn't use marks to reference the blobs in 
a nice way. This allows for referencing them by their sha1.
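
For readers not familiar with the stream format, the two ways of referencing
a blob look roughly like the sketch below. This is only an illustration of
the fast-import 'blob' and filemodify ('M') commands, not code from the
series; emit_blob() and emit_modify() are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>

/* Write a standalone 'blob' command; a mark of 0 means "emit no mark line". */
static void emit_blob(FILE *out, unsigned long mark, const char *data, size_t len)
{
	fputs("blob\n", out);
	if (mark)
		fprintf(out, "mark :%lu\n", mark);
	fprintf(out, "data %lu\n", (unsigned long)len);
	fwrite(data, 1, len, out);
	fputc('\n', out);	/* optional LF after the raw data */
}

/*
 * Write a filemodify line inside a commit.
 * dataref is either ":<mark>" or the 40-character hex sha1 of the blob.
 */
static void emit_modify(FILE *out, unsigned mode, const char *dataref, const char *path)
{
	fprintf(out, "M %06o %s %s\n", mode, dataref, path);
}
```

A blob written without a mark can only be referenced later by its sha1,
which is why the sender has to compute the hash itself in that case.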

> 
> >  [RFC 2/5] svndump: move struct definitions to .h.
> >  [RFC 3/5] vcs-svn/svndump: restructure node_ctx, rev_ctx handling
> >  [RFC 4/5] vcs-svn/svndump: rewrite handle_node(),
> >  [RFC 5/5] vcs-svn: remove repo_tree
> 
> I haven't read the rest of the series yet but I expect
> it is less controversial than the first patch.

Hm.. I'm not sure ;)
> 
> --
> David Michael Barr

Florian 


Re: [PATCH/RFC v4 01/16] GSOC remote-svn

2012-08-19 Thread Florian Achleitner
On Saturday 18 August 2012 23:35:38 Junio C Hamano wrote:
> Junio C Hamano  writes:
> [..]
> Just to show how, here is what I did just now.
> [..] 
> Thanks.

Thanks for your guidance!
I'll base a new version on your fixups.

Florian


Re: [PATCH] fast_export.c: Fix a compiler warning

2012-08-19 Thread Florian Achleitner
On Sunday 19 August 2012 16:29:02 Ramsay Jones wrote:
> In particular, gcc complains thus:
> 
> CC vcs-svn/fast_export.o
> vcs-svn/fast_export.c: In function 'fast_export_begin_note':
> vcs-svn/fast_export.c:77: warning: long long unsigned int format, \
> different type arg (arg 2)
> 
> In order to fix the warning, we cast the second size_t argument in
> the call to printf to uintmax_t.
> 
> Signed-off-by: Ramsay Jones 
> ---
> 
> Hi Florian,
> 
> If you need to re-roll your patches in the 'fa/remote-svn' branch, could
> you please squash this fix into them. [This was implemented on top of
> commit 2ce959ba, but you will probably want to make the equivalent change
> to commit d319a37c ("Create a note for every imported commit containing
> svn metadata", 17-08-2012) instead. Note that, because of the context
> lines in the patch, it won't apply as-is.]

Ok, I'll add it to the next version. This warning only occurs when building 
for 32bit, thus I never saw it. There is a printf length modifier that 
specifies the platform's size_t integer type: "z". 
Probably we should use it instead? I don't know how widely supported it is.
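
For comparison, the two variants side by side, as a small sketch rather than
the actual fast_export.c code. The function names are invented for the
example. The cast works wherever PRIuMAX is available; "%zu" was only
standardized in C99, so it is an assumption that every C library git is
built against implements it.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Variant from the proposed fix: widen size_t explicitly to uintmax_t. */
static int format_data_header_cast(char *buf, size_t buflen, size_t loglen)
{
	return snprintf(buf, buflen, "data %" PRIuMAX "\n", (uintmax_t)loglen);
}

/* C99 alternative: the 'z' length modifier matches size_t directly. */
static int format_data_header_z(char *buf, size_t buflen, size_t loglen)
{
	return snprintf(buf, buflen, "data %zu\n", loglen);
}
```

Both produce the same text on a conforming C99 library; the cast variant
avoids depending on "z" being implemented.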
> 
> Thanks!
> 
> ATB,
> Ramsay Jones

Thanks,
Florian

> 
>  vcs-svn/fast_export.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
> index c780d32..dd09c7d 100644
> --- a/vcs-svn/fast_export.c
> +++ b/vcs-svn/fast_export.c
> @@ -74,7 +74,7 @@ void fast_export_begin_note(uint32_t revision, const char
> *author, size_t loglen = strlen(log);
>   printf("commit %s\n", note_ref);
>   printf("committer %s <%s@%s> %ld +\n", author, author, "local",
> timestamp); - printf("data %"PRIuMAX"\n", loglen);
> + printf("data %"PRIuMAX"\n", (uintmax_t) loglen);
>   fwrite(log, loglen, 1, stdout);
>   if (firstnote) {
>   if (revision > 1)


Re: [PATCH/RFC v4 01/16] GSOC remote-svn

2012-08-18 Thread Florian Achleitner
On Friday 17 August 2012 21:16:59 Junio C Hamano wrote:
> Comments from mentors and people interested in remote helpers?
> 
> I did minimum line wrapping, typofix and small compilation fixes
> and queued these on 'pu'; I think I saw one commit whose message
> I didn't quite get what it was trying to say, and another that was
> missing S-o-b (I left them untouched).

Should I provide a better version? I found the commit that I forgot to sign 
off, but I'm not sure which message you mean.

> 
> The result merged to 'pu' seems to fail 9020, by the way.

That's because contrib/svn-fe isn't built automatically if you call make in 
the toplevel dir. 
It dies with "fatal: Unable to find remote helper for 'svn'", because the 
helper is not built. We currently need to run make in contrib/svn-fe 
separately.
That's a bit awkward.

Just checked how it works for svn-fe. It has a separate test program 
(test-svn-fe.c) which is in the toplevel dir and built there, while svn-fe 
itself is handled the same way as remote-svn. 

Don't know what to do about that.

> 
> Thanks.


[RFC 4/5] vcs-svn/svndump: rewrite handle_node(), begin|end_revision()

2012-08-17 Thread Florian Achleitner
Split the decision of what to do and actually doing it in
handle_node() to allow for detection of branches from svn nodes.
Split it into handle_node() and apply_node().

svn dumps are structured in revisions, which contain multiple nodes.
Nodes represent operations on data. Currently the function
handle_node() strongly mixes the interpretation of the node data
with the output of processed data to fast-import.

In a fast-import stream, a commit command has to name, right at its
beginning, the branch to which the new commit is added.

We want to detect branches in svn. This can only be done by analyzing
node operations, like copyfrom. This conflicts with the current
implementation, where at the beginning of each new revision in the svn
dump, a new commit on a hard-coded git branch is created, before even
reading the first node.

To allow analyzing the nodes before deciding on which branch the commit
will be placed, store the node metadata of one complete revision, and
create a commit from it, when it ends.

Each node can have file data appended. It's desirable to not store the
actual file data, as it is unbounded.
fast-import has a 'blob' command that allows writing blobs, independent
of commits. Use this feature instead of sending data inline and send
the actual file data immediately when it is read in.

Use the previously added SHA1 calculation feature of fast_export_data
and fast_export_blob_delta to retrieve the SHA1 of the written blob
and reference it later. fast-import's marks can not be used for that,
because they are already used for marking commits, where the mark
number corresponds to exactly one svn revision.

Change handle_node() to interpret the node data, store it in a node_ctx,
send blobs to fast-import, and append the new node_ctx to the list of
node_ctx.
Do this until the end of a revision.

Just clear the list of node_ctx in begin_revision().

At end_revision() all node metadata is available in the node_ctx list.
Future branch detectors can decide what branches are to be changed.
Then, call apply_node() for each of them to actually create a commit
and change/add/delete files according to the node_ctx using the already
added blobs.

This can also be used to create commits if the node metadata does not
come from a svndump, but is stored in e.g. notes, for later branch
detection.

Signed-off-by: Florian Achleitner 
---
 vcs-svn/svndump.c |  165 ++---
 1 file changed, 107 insertions(+), 58 deletions(-)

diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index 2fca9f8..6feedd9 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -48,7 +48,6 @@ static struct node_ctx_t *node_list, *node_list_tail;
 static struct node_ctx_t *new_node_ctx(char *fname)
 {
struct node_ctx_t *node = xmalloc(sizeof(struct node_ctx_t));
-   trace_printf("new_node_ctx %p\n", node);
node->type = 0;
node->action = NODEACT_UNKNOWN;
node->prop_length = -1;
@@ -67,7 +66,6 @@ static struct node_ctx_t *new_node_ctx(char *fname)
 
 static void free_node_ctx(struct node_ctx_t *node)
 {
-   trace_printf("free_node_ctx %p\n", node);
strbuf_release(&node->src);
strbuf_release(&node->dst);
free((char*)node->dataref);
@@ -77,7 +75,6 @@ static void free_node_ctx(struct node_ctx_t *node)
 static void free_node_list()
 {
struct node_ctx_t *p = node_list, *n;
-   trace_printf("free_node_list head %p tail %p\n", node_list, node_list_tail);
while (p) {
n = p->next;
free_node_ctx(p);
@@ -88,7 +85,6 @@ static void free_node_list()
 
 static void append_node_list(struct node_ctx_t *n)
 {
-   trace_printf("append_node_list %p head %p tail %p\n", n, node_list, node_list_tail);
if (!node_list)
node_list = node_list_tail = n;
else {
@@ -246,23 +242,10 @@ static void handle_node(struct node_ctx_t *node)
static const char *const empty_blob = "::empty::";
const char *old_data = NULL;
uint32_t old_mode = REPO_MODE_BLB;
+   unsigned char data_sha1[20];
+   struct strbuf sb = STRBUF_INIT;
+
 
-   if (node->action == NODEACT_DELETE) {
-   if (have_text || have_props || node->srcRev)
-   die("invalid dump: deletion node has "
-   "copyfrom info, text, or properties");
-   repo_delete(node->dst.buf);
-   return;
-   }
-   if (node->action == NODEACT_REPLACE) {
-   repo_delete(node->dst.buf);
-   node->action = NODEACT_ADD;
-   }
-   if (node->srcRev) {
-   repo_copy(node->srcRev, node->src.buf, node->dst.buf);
-   if (node->action == NODEACT_ADD)
-   node->action = NODEACT_CHANGE;
-   }
if 

[RFC 3/5] vcs-svn/svndump: restructure node_ctx, rev_ctx handling

2012-08-17 Thread Florian Achleitner
As a preparation for handling branches in svndumps, make rev_ctx
and node_ctx more flexible.

Add the object to work on to the arguments of reset_*_ctx() and to
handle_node() to allow for multiple *_ctx objects.

Convert the static global node_ctx to a linked list of such objects
to be able to accumulate all Node data of a revision in memory
before processing it.
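
For illustration only, the accumulate-then-process idea can be sketched in a few lines. This is Python rather than the patch's C, and the names NodeCtx, NodeList and collect_revision are made up for this sketch; they only loosely mirror new_node_ctx(), append_node_list() and free_node_list(), and the field set is a small subset of node_ctx_t.

```python
class NodeCtx:
    """One node record of a revision (illustrative subset of node_ctx_t)."""

    def __init__(self, dst=""):
        self.action = "unknown"
        self.dst = dst
        self.next = None


class NodeList:
    """Singly linked list with head and tail pointers, so append is O(1)."""

    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, node):
        if self.head is None:
            self.head = self.tail = node
        else:
            self.tail.next = node
            self.tail = node

    def __iter__(self):
        p = self.head
        while p is not None:
            yield p
            p = p.next

    def clear(self):
        # Python's garbage collector frees the nodes; the C code walks
        # the list and frees each node explicitly.
        self.head = self.tail = None


def collect_revision(paths):
    """Accumulate all nodes of one revision before anything is processed."""
    nodes = NodeList()
    for path in paths:
        nodes.append(NodeCtx(path))
    return nodes
```

Keeping a tail pointer next to the head is what lets every Node of a revision be appended in dump order without walking the list each time.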

Signed-off-by: Florian Achleitner 
---
 vcs-svn/svndump.c |  207 +++--
 vcs-svn/svndump.h |2 +
 2 files changed, 124 insertions(+), 85 deletions(-)

diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index 296be8c..2fca9f8 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -38,42 +38,81 @@
 
 static struct line_buffer input = LINE_BUFFER_INIT;
 
-static struct node_ctx_t node_ctx;
+static struct node_ctx_t *node_ctx;
 static struct rev_ctx_t rev_ctx;
 static struct dump_ctx_t dump_ctx;
+static const char *current_ref;
 
+static struct node_ctx_t *node_list, *node_list_tail;
 
-static void reset_node_ctx(char *fname)
+static struct node_ctx_t *new_node_ctx(char *fname)
 {
-   node_ctx.type = 0;
-   node_ctx.action = NODEACT_UNKNOWN;
-   node_ctx.prop_length = -1;
-   node_ctx.text_length = -1;
-   strbuf_reset(&node_ctx.src);
-   node_ctx.srcRev = 0;
-   strbuf_reset(&node_ctx.dst);
+   struct node_ctx_t *node = xmalloc(sizeof(struct node_ctx_t));
+   trace_printf("new_node_ctx %p\n", node);
+   node->type = 0;
+   node->action = NODEACT_UNKNOWN;
+   node->prop_length = -1;
+   node->text_length = -1;
+   strbuf_init(&node->src, 4096);
+   node->srcRev = 0;
+   strbuf_init(&node->dst, 4096);
if (fname)
-   strbuf_addstr(&node_ctx.dst, fname);
-   node_ctx.text_delta = 0;
-   node_ctx.prop_delta = 0;
+   strbuf_addstr(&node->dst, fname);
+   node->text_delta = 0;
+   node->prop_delta = 0;
+   node->dataref = NULL;
+   node->next = NULL;
+   return node;
 }
 
-static void reset_rev_ctx(uint32_t revision)
+static void free_node_ctx(struct node_ctx_t *node)
 {
-   rev_ctx.revision = revision;
-   rev_ctx.timestamp = 0;
-   strbuf_reset(&rev_ctx.log);
-   strbuf_reset(&rev_ctx.author);
-   strbuf_reset(&rev_ctx.note);
+   trace_printf("free_node_ctx %p\n", node);
+   strbuf_release(&node->src);
+   strbuf_release(&node->dst);
+   free((char*)node->dataref);
+   free(node);
 }
 
-static void reset_dump_ctx(const char *url)
+static void free_node_list()
 {
-   strbuf_reset(&dump_ctx.url);
+   struct node_ctx_t *p = node_list, *n;
+   trace_printf("free_node_list head %p tail %p\n", node_list, 
node_list_tail);
+   while (p) {
+   n = p->next;
+   free_node_ctx(p);
+   p = n;
+   }
+   node_list = node_list_tail = NULL;
+}
+
+static void append_node_list(struct node_ctx_t *n)
+{
+   trace_printf("append_node_list %p head %p tail %p\n", n, node_list, 
node_list_tail);
+   if (!node_list)
+   node_list = node_list_tail = n;
+   else {
+   node_list_tail->next = n;
+   node_list_tail = n;
+   }
+}
+
+static void reset_rev_ctx(struct rev_ctx_t *rev, uint32_t revision)
+{
+   rev->revision = revision;
+   rev->timestamp = 0;
+   strbuf_reset(&rev->log);
+   strbuf_reset(&rev->author);
+   strbuf_reset(&rev->note);
+}
+
+static void reset_dump_ctx(struct dump_ctx_t *dump, const char *url)
+{
+   strbuf_reset(&dump->url);
if (url)
-   strbuf_addstr(&dump_ctx.url, url);
-   dump_ctx.version = 1;
-   strbuf_reset(&dump_ctx.uuid);
+   strbuf_addstr(&dump->url, url);
+   dump->version = 1;
+   strbuf_reset(&dump->uuid);
 }
 
 static void handle_property(const struct strbuf *key_buf,
@@ -121,11 +160,11 @@ static void handle_property(const struct strbuf *key_buf,
die("invalid dump: sets type twice");
}
if (!val) {
-   node_ctx.type = REPO_MODE_BLB;
+   node_ctx->type = REPO_MODE_BLB;
return;
}
*type_set = 1;
-   node_ctx.type = keylen == strlen("svn:executable") ?
+   node_ctx->type = keylen == strlen("svn:executable") ?
REPO_MODE_EXE :
REPO_MODE_LNK;
}
@@ -193,11 +232,11 @@ static void read_props(void)
}
 }
 
-static void handle_node(void)
+static void handle_node(struct node_ctx_t *node)
 {
-   const uint32_t type = node_ctx.type;
-   const int have_props = nod

[PATCH/RFC v4 13/16] remote-svn: add incremental import.

2012-08-17 Thread Florian Achleitner
Search for a note attached to the ref to update and read its
'Revision-number:' line. Start the import from the next svn revision.

If there is no next revision in the svn repo, svnrdump terminates
with a message on stderr and a non-zero return value. This looks a
little weird, but there is no other way to know whether there is
a new revision in the svn repo.

On the start of an incremental import, the parent of the first commit
in the fast-import stream is set to the branch name to update. All
following commits specify their parent by a mark number. Previous
mark files are currently not reused.
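
As a rough sketch of the start-revision logic described above (Python for brevity, not the C of this patch; start_revision is a name invented here):

```python
def parse_rev_note(msg):
    """Return the last 'Revision-number: ' value found in msg, or None."""
    key = "Revision-number: "
    rev = None
    for line in msg.splitlines():
        if line.startswith(key):
            value = line[len(key):].strip()
            if not value.isdigit():
                raise ValueError("malformed revision line: %r" % line)
            rev = int(value)
    return rev


def start_revision(note_msg):
    """Pick the first svn revision to request for an import.

    note_msg is the note text attached to the ref's tip, or None when the
    ref or its note does not exist yet (full import from revision 0).
    """
    if note_msg is None:
        return 0
    rev = parse_rev_note(note_msg)
    # Assumption of this sketch: a note without a revision line is treated
    # like a missing note. The C code below would continue from revision 1
    # in that case, because rev_nr stays at its initial value of 0.
    return 0 if rev is None else rev + 1
```

For example, a note containing "Revision-number: 41" makes the next import start at revision 42.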

Signed-off-by: Florian Achleitner 
---
 contrib/svn-fe/remote-svn.c |   67 +--
 contrib/svn-fe/svn-fe.c |3 +-
 test-svn-fe.c   |2 +-
 vcs-svn/fast_export.c   |   10 +--
 vcs-svn/fast_export.h   |6 ++--
 vcs-svn/svndump.c   |   10 +++
 vcs-svn/svndump.h   |2 +-
 7 files changed, 84 insertions(+), 16 deletions(-)

diff --git a/contrib/svn-fe/remote-svn.c b/contrib/svn-fe/remote-svn.c
index 0643a4c..b385682 100644
--- a/contrib/svn-fe/remote-svn.c
+++ b/contrib/svn-fe/remote-svn.c
@@ -12,7 +12,8 @@ static const char *url;
 static int dump_from_file;
 static const char *private_ref;
 static const char *remote_ref = "refs/heads/master";
-static const char *marksfilename;
+static const char *marksfilename, *notes_ref;
+struct rev_note { unsigned int rev_nr; };
 
 static int cmd_capabilities(const char *line);
 static int cmd_import(const char *line);
@@ -47,14 +48,70 @@ static void terminate_batch(void)
fflush(stdout);
 }
 
+/* NOTE: 'ref' refers to a git reference, while 'rev' refers to a svn revision. */
+static char *read_ref_note(const unsigned char sha1[20]) {
+   const unsigned char *note_sha1;
+   char *msg = NULL;
+   unsigned long msglen;
+   enum object_type type;
+   init_notes(NULL, notes_ref, NULL, 0);
+   if( (note_sha1 = get_note(NULL, sha1)) == NULL ||
+   !(msg = read_sha1_file(note_sha1, &type, &msglen)) ||
+   !msglen || type != OBJ_BLOB) {
+   free(msg);
+   return NULL;
+   }
+   free_notes(NULL);
+   return msg;
+}
+
+static int parse_rev_note(const char *msg, struct rev_note *res) {
+   const char *key, *value, *end;
+   size_t len;
+   while(*msg) {
+   end = strchr(msg, '\n');
+   len = end ? end - msg : strlen(msg);
+
+   key = "Revision-number: ";
+   if(!prefixcmp(msg, key)) {
+   long i;
+   value = msg + strlen(key);
+   i = atol(value);
+   if(i < 0 || i > UINT32_MAX)
+   return 1;
+   res->rev_nr = i;
+   }
+   msg += len + 1;
+   }
+   return 0;
+}
+
 static int cmd_import(const char *line)
 {
int code;
int dumpin_fd;
-   unsigned int startrev = 0;
+   char *note_msg;
+   unsigned char head_sha1[20];
+   unsigned int startrev;
struct argv_array svndump_argv = ARGV_ARRAY_INIT;
struct child_process svndump_proc;
 
+   if(read_ref(private_ref, head_sha1))
+   startrev = 0;
+   else {
+   note_msg = read_ref_note(head_sha1);
+   if(note_msg == NULL) {
+   warning("No note found for %s.", private_ref);
+   startrev = 0;
+   }
+   else {
+   struct rev_note note = { 0 };
+   parse_rev_note(note_msg, &note);
+   startrev = note.rev_nr + 1;
+   free(note_msg);
+   }
+   }
+
if (dump_from_file) {
dumpin_fd = open(url, O_RDONLY);
if(dumpin_fd < 0) {
@@ -80,7 +137,7 @@ static int cmd_import(const char *line)
"feature export-marks=%s\n", marksfilename, 
marksfilename);
 
svndump_init_fd(dumpin_fd, STDIN_FILENO);
-   svndump_read(url, private_ref);
+   svndump_read(url, private_ref, notes_ref);
svndump_deinit();
svndump_reset();
 
@@ -177,6 +234,9 @@ int main(int argc, const char **argv)
strbuf_addf(&buf, "refs/svn/%s/master", remote->name);
private_ref = strbuf_detach(&buf, NULL);
 
+   strbuf_addf(&buf, "refs/notes/%s/revs", remote->name);
+   notes_ref = strbuf_detach(&buf, NULL);
+
strbuf_addf(&buf, "%s/info/fast-import/remote-svn/%s.marks",
get_git_dir(), remote->name);
marksfilename = strbuf_detach(&buf, NULL);
@@ -196,6 +256,7 @@ int main(int argc, const char **argv)
strbuf_release(&buf);

[PATCH/RFC v4 14/16] Add a svnrdump-simulator replaying a dump file for testing.

2012-08-17 Thread Florian Achleitner
To ease testing without depending on a reachable svn server, this
compact python script mimics parts of svnrdump's behaviour.
It requires the remote url to start with sim://.
Start and end revisions are evaluated.
If the requested revision doesn't exist, as is the case with an
incremental import when no new commit was added, it returns 1
(like svnrdump).
To allow using the same dump file for simulating multiple
incremental imports, the highest revision can be limited by setting
the environment variable SVNRMAX to that value. This simulates the
situation where higher revs don't exist yet.
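
The revision selection can be pictured as a small state machine over the dump's lines. The following is a sketch in modern Python that works on a list of lines instead of a file; select_revisions is a name chosen for this illustration, and the logic is modelled on writedump() in the script below, not a replacement for it.

```python
def select_revisions(lines, lower, upper="HEAD"):
    """Copy the dump header plus the revisions from lower up to upper.

    Returns (selected_lines, wrote_rev). As in the script, output stops
    when the record of revision 'upper' is reached, and wrote_rev is False
    when revision 'lower' was never found (the helper then exits with 1).
    """
    state = "header"
    out = []
    wrote_rev = False
    for line in lines:
        if state == "header" and line.startswith("Revision-number: "):
            state = "prefix"          # header is over, skip older revisions
        if state == "prefix" and line == "Revision-number: %s\n" % lower:
            state = "selection"       # first requested revision found
        if (upper != "HEAD" and state == "selection"
                and line == "Revision-number: %s\n" % upper):
            break
        if state in ("header", "selection"):
            if state == "selection":
                wrote_rev = True
            out.append(line)
    return out, wrote_rev
```

With a dump holding revisions 0 to 2, asking for a lower bound of 3 yields only the header and wrote_rev == False, which is the "nothing new" case of an incremental import.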

Signed-off-by: Florian Achleitner 
---
 contrib/svn-fe/svnrdump_sim.py |   53 
 1 file changed, 53 insertions(+)
 create mode 100755 contrib/svn-fe/svnrdump_sim.py

diff --git a/contrib/svn-fe/svnrdump_sim.py b/contrib/svn-fe/svnrdump_sim.py
new file mode 100755
index 0000000..ab4ccf1
--- /dev/null
+++ b/contrib/svn-fe/svnrdump_sim.py
@@ -0,0 +1,53 @@
+#!/usr/bin/python
+"""
+Simulates svnrdump by replaying an existing dump from a file, taking care
+of the specified revision range.
+To simulate incremental imports the environment variable SVNRMAX can be set
+to the highest revision that should be available.
+"""
+import sys, os
+
+
+def getrevlimit():
+   var = 'SVNRMAX'
+   if os.environ.has_key(var):
+   return os.environ[var]
+   return None
+   
+def writedump(url, lower, upper):
+   if url.startswith('sim://'):
+   filename = url[6:]
+   if filename[-1] == '/': filename = filename[:-1] #remove 
terminating slash
+   else:
+   raise ValueError('sim:// url required')
+   f = open(filename, 'r');
+   state = 'header'
+   wroterev = False
+   while(True):
+   l = f.readline()
+   if l == '': break
+   if state == 'header' and l.startswith('Revision-number: '):
+   state = 'prefix'
+   if state == 'prefix' and l == 'Revision-number: %s\n' % lower:
+   state = 'selection'
+   if not upper == 'HEAD' and state == 'selection' and l == 
'Revision-number: %s\n' % upper:
+   break;
+
+   if state == 'header' or state == 'selection':
+   if state == 'selection': wroterev = True
+   sys.stdout.write(l)
+   return wroterev
+
+if __name__ == "__main__":
+   if not (len(sys.argv) in (3, 4, 5)):
+   print "usage: %s dump URL -rLOWER:UPPER" % sys.argv[0]
+   sys.exit(1)
+   if not sys.argv[1] == 'dump': raise NotImplementedError('only "dump" is supported.')
+   url = sys.argv[2]
+   r = ['0', 'HEAD']
+   if len(sys.argv) == 4 and sys.argv[3][0:2] == '-r':
+   r = sys.argv[3][2:].lstrip().split(':')
+   if not getrevlimit() is None: r[1] = getrevlimit()
+   if writedump(url, r[0], r[1]): ret = 0
+   else: ret = 1
+   sys.exit(ret)
-- 
1.7.9.5



[PATCH/RFC v4 11/16] Create a note for every imported commit containing svn metadata.

2012-08-17 Thread Florian Achleitner
To provide metadata from svn dumps for further processing, e.g.
branch detection, attach a note to each imported commit that
stores additional information.
The notes are currently hard-coded in refs/notes/svn/revs.
Currently the following lines from the svn dump are directly
accumulated in the note. This can be refined as needed, of course.
- "Revision-number"
- "Node-path"
- "Node-kind"
- "Node-action"
- "Node-copyfrom-path"
- "Node-copyfrom-rev"
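
To make the note contents concrete, here is a small sketch (Python, with the made-up names NOTE_KEYS and build_note) that filters dump header lines down to the keys listed above. The patch itself accumulates the lines while parsing in svndump_read(); this only shows the resulting text.

```python
# The six dump headers that are copied into the note, as listed above.
NOTE_KEYS = (
    "Revision-number",
    "Node-path",
    "Node-kind",
    "Node-action",
    "Node-copyfrom-path",
    "Node-copyfrom-rev",
)


def build_note(dump_header_lines):
    """Return the note text: every listed header line, one per line."""
    kept = []
    for line in dump_header_lines:
        key = line.split(":", 1)[0]
        if key in NOTE_KEYS:
            kept.append(line.rstrip("\n") + "\n")
    return "".join(kept)
```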

Signed-off-by: Florian Achleitner 
---
 vcs-svn/fast_export.c |   14 --
 vcs-svn/fast_export.h |2 ++
 vcs-svn/svndump.c |   21 +++--
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 1ecae4b..a84fa17 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -3,8 +3,7 @@
  * See LICENSE for details.
  */
 
-#include "git-compat-util.h"
-#include "strbuf.h"
+#include "cache.h"
 #include "quote.h"
 #include "fast_export.h"
 #include "repo_tree.h"
@@ -68,6 +67,17 @@ void fast_export_modify(const char *path, uint32_t mode, 
const char *dataref)
putchar('\n');
 }
 
+void fast_export_begin_note(uint32_t revision, const char *author,
+   const char *log, unsigned long timestamp)
+{
+   size_t loglen = strlen(log);
+   printf("commit refs/notes/svn/revs\n");
+   printf("committer %s <%s@%s> %ld +0000\n", author, author, "local", timestamp);
+   printf("data %"PRIuMAX"\n", loglen);
+   fwrite(log, loglen, 1, stdout);
+   fputc('\n', stdout);
+}
+
 void fast_export_note(const char *committish, const char *dataref)
 {
printf("N %s %s\n", dataref, committish);
diff --git a/vcs-svn/fast_export.h b/vcs-svn/fast_export.h
index 9b32f1e..c2f6f11 100644
--- a/vcs-svn/fast_export.h
+++ b/vcs-svn/fast_export.h
@@ -10,6 +10,8 @@ void fast_export_deinit(void);
 void fast_export_delete(const char *path);
 void fast_export_modify(const char *path, uint32_t mode, const char *dataref);
 void fast_export_note(const char *committish, const char *dataref);
+void fast_export_begin_note(uint32_t revision, const char *author,
+   const char *log, unsigned long timestamp);
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log, const char *uuid,
const char *url, unsigned long timestamp, const char 
*local_ref);
diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index 288bb42..cd65b51 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -48,7 +48,7 @@ static struct {
 static struct {
uint32_t revision;
unsigned long timestamp;
-   struct strbuf log, author;
+   struct strbuf log, author, note;
 } rev_ctx;
 
 static struct {
@@ -77,6 +77,7 @@ static void reset_rev_ctx(uint32_t revision)
rev_ctx.timestamp = 0;
strbuf_reset(&rev_ctx.log);
strbuf_reset(&rev_ctx.author);
+   strbuf_reset(&rev_ctx.note);
 }
 
 static void reset_dump_ctx(const char *url)
@@ -310,8 +311,15 @@ static void begin_revision(const char *remote_ref)
 
 static void end_revision()
 {
-   if (rev_ctx.revision)
+   struct strbuf mark = STRBUF_INIT;
+   if (rev_ctx.revision) {
fast_export_end_commit(rev_ctx.revision);
+   fast_export_begin_note(rev_ctx.revision, "remote-svn",
+   "Note created by remote-svn.", 
rev_ctx.timestamp);
+   strbuf_addf(&mark, ":%"PRIu32, rev_ctx.revision);
+   fast_export_note(mark.buf, "inline");
+   fast_export_buf_to_data(&rev_ctx.note);
+   }
 }
 
 void svndump_read(const char *url, const char *local_ref)
@@ -358,6 +366,7 @@ void svndump_read(const char *url, const char *local_ref)
end_revision();
active_ctx = REV_CTX;
reset_rev_ctx(atoi(val));
+   strbuf_addf(&rev_ctx.note, "%s\n", t);
break;
case sizeof("Node-path"):
if (constcmp(t, "Node-"))
@@ -369,10 +378,12 @@ void svndump_read(const char *url, const char *local_ref)
begin_revision(local_ref);
active_ctx = NODE_CTX;
reset_node_ctx(val);
+   strbuf_addf(&rev_ctx.note, "%s\n", t);
break;
}
if (constcmp(t + strlen("Node-"), "kind"))
continue;
+   strbuf_addf(&rev_ctx.note, &

[PATCH/RFC v4 09/16] Allow reading svn dumps from files via file:// urls.

2012-08-17 Thread Florian Achleitner
For testing as well as for importing large, already
available dumps, it's useful to bypass svnrdump and
replay the svndump from a file directly.

Add support for file:// urls in the remote url.
e.g. svn::file:///path/to/dump
When the remote helper finds an url starting with
file:// it tries to open that file instead of invoking svnrdump.
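
A sketch of the URL handling, in Python for brevity. dump_source is a hypothetical name, and urllib's unquote stands in for git's url_decode(), which is an assumption about equivalent behaviour for simple paths.

```python
from urllib.parse import unquote


def dump_source(url_in):
    """Return (dump_from_file, url) for a remote url.

    file:// urls name a dump file to read directly; the prefix is stripped
    and percent-escapes are decoded. Any other url is passed to svnrdump
    and is normalised to end with a slash.
    """
    prefix = "file://"
    if url_in.startswith(prefix):
        return True, unquote(url_in[len(prefix):])
    url = url_in if url_in.endswith("/") else url_in + "/"
    return False, url
```

So a remote configured as svn::file:///path/to/dump ends up opening /path/to/dump, while every other url still goes through svnrdump.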

Signed-off-by: Florian Achleitner 
---
 contrib/svn-fe/remote-svn.c |   55 ---
 1 file changed, 36 insertions(+), 19 deletions(-)

diff --git a/contrib/svn-fe/remote-svn.c b/contrib/svn-fe/remote-svn.c
index b853d54..80a089a 100644
--- a/contrib/svn-fe/remote-svn.c
+++ b/contrib/svn-fe/remote-svn.c
@@ -9,6 +9,7 @@
 #include "argv-array.h"
 
 static const char *url;
+static int dump_from_file;
 static const char *private_ref;
 static const char *remote_ref = "refs/heads/master";
 
@@ -53,29 +54,38 @@ static int cmd_import(const char *line)
struct argv_array svndump_argv = ARGV_ARRAY_INIT;
struct child_process svndump_proc;
 
-   memset(&svndump_proc, 0, sizeof(struct child_process));
-   svndump_proc.out = -1;
-   argv_array_push(&svndump_argv, "svnrdump");
-   argv_array_push(&svndump_argv, "dump");
-   argv_array_push(&svndump_argv, url);
-   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
-   svndump_proc.argv = svndump_argv.argv;
-
-   code = start_command(&svndump_proc);
-   if (code)
-   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
-   dumpin_fd = svndump_proc.out;
-
+   if (dump_from_file) {
+   dumpin_fd = open(url, O_RDONLY);
+   if(dumpin_fd < 0) {
+   die_errno("Couldn't open svn dump file %s.", url);
+   }
+   }
+   else {
+   memset(&svndump_proc, 0, sizeof(struct child_process));
+   svndump_proc.out = -1;
+   argv_array_push(&svndump_argv, "svnrdump");
+   argv_array_push(&svndump_argv, "dump");
+   argv_array_push(&svndump_argv, url);
+   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
+   svndump_proc.argv = svndump_argv.argv;
+
+   code = start_command(&svndump_proc);
+   if (code)
+   die("Unable to start %s, code %d", 
svndump_proc.argv[0], code);
+   dumpin_fd = svndump_proc.out;
+   }
svndump_init_fd(dumpin_fd, STDIN_FILENO);
svndump_read(url, private_ref);
svndump_deinit();
svndump_reset();
 
close(dumpin_fd);
-   code = finish_command(&svndump_proc);
-   if (code)
-   warning("%s, returned %d", svndump_proc.argv[0], code);
-   argv_array_clear(&svndump_argv);
+   if(!dump_from_file) {
+   code = finish_command(&svndump_proc);
+   if (code)
+   warning("%s, returned %d", svndump_proc.argv[0], code);
+   argv_array_clear(&svndump_argv);
+   }
 
return 0;
 }
@@ -149,8 +159,15 @@ int main(int argc, const char **argv)
remote = remote_get(argv[1]);
url_in = (argc == 3) ? argv[2] : remote->url[0];
 
-   end_url_with_slash(&buf, url_in);
-   url = strbuf_detach(&buf, NULL);
+   if (!prefixcmp(url_in, "file://")) {
+   dump_from_file = 1;
+   url = url_decode(url_in + sizeof("file://")-1);
+   }
+   else {
+   dump_from_file = 0;
+   end_url_with_slash(&buf, url_in);
+   url = strbuf_detach(&buf, NULL);
+   }
 
strbuf_addf(&buf, "refs/svn/%s/master", remote->name);
private_ref = strbuf_detach(&buf, NULL);
-- 
1.7.9.5



[PATCH/RFC v4 08/16] remote-svn, vcs-svn: Enable fetching to private refs.

2012-08-17 Thread Florian Achleitner
The reference to update by the fast-import stream is hard-coded.
When fetching from a remote the remote-helper shall update refs
in a private namespace, i.e. a private subdir of refs/.
This namespace is defined by the 'refspec' capability, that the
remote-helper advertises as a reply to the 'capabilities' command.

Extend svndump and fast-export to allow passing the target ref.
Update svn-fe to be compatible.
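
For illustration, the reply that advertises the private namespace could be produced like this. It is a Python sketch with an invented function name; the ref layout refs/svn/NAME/master and the three capability lines are taken from remote-svn.c in this series.

```python
def capabilities_reply(remote_name, remote_ref="refs/heads/master"):
    """Build the helper's answer to the 'capabilities' command.

    The refspec line maps the remote's ref to a private ref, so that the
    fast-import stream never writes to refs/heads/ directly.
    """
    private_ref = "refs/svn/%s/master" % remote_name
    return "import\nbidi-import\nrefspec %s:%s\n\n" % (remote_ref, private_ref)
```

The trailing blank line terminates the capability list.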

Signed-off-by: Florian Achleitner 
---
 contrib/svn-fe/svn-fe.c |2 +-
 test-svn-fe.c   |2 +-
 vcs-svn/fast_export.c   |4 ++--
 vcs-svn/fast_export.h   |2 +-
 vcs-svn/svndump.c   |   14 +++---
 vcs-svn/svndump.h   |2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.c
index 35db24f..c796cc0 100644
--- a/contrib/svn-fe/svn-fe.c
+++ b/contrib/svn-fe/svn-fe.c
@@ -10,7 +10,7 @@ int main(int argc, char **argv)
 {
if (svndump_init(NULL))
return 1;
-   svndump_read((argc > 1) ? argv[1] : NULL);
+   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/test-svn-fe.c b/test-svn-fe.c
index 83633a2..cb0d80f 100644
--- a/test-svn-fe.c
+++ b/test-svn-fe.c
@@ -40,7 +40,7 @@ int main(int argc, char *argv[])
if (argc == 2) {
if (svndump_init(argv[1]))
return 1;
-   svndump_read(NULL);
+   svndump_read(NULL, "refs/heads/master");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 1f04697..11f8f94 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -72,7 +72,7 @@ static char gitsvnline[MAX_GITSVN_LINE_LEN];
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log,
const char *uuid, const char *url,
-   unsigned long timestamp)
+   unsigned long timestamp, const char *local_ref)
 {
static const struct strbuf empty = STRBUF_INIT;
if (!log)
@@ -84,7 +84,7 @@ void fast_export_begin_commit(uint32_t revision, const char 
*author,
} else {
*gitsvnline = '\0';
}
-   printf("commit refs/heads/master\n");
+   printf("commit %s\n", local_ref);
printf("mark :%"PRIu32"\n", revision);
printf("committer %s <%s@%s> %ld +0000\n",
   *author ? author : "nobody",
diff --git a/vcs-svn/fast_export.h b/vcs-svn/fast_export.h
index 8823aca..17eb13b 100644
--- a/vcs-svn/fast_export.h
+++ b/vcs-svn/fast_export.h
@@ -11,7 +11,7 @@ void fast_export_delete(const char *path);
 void fast_export_modify(const char *path, uint32_t mode, const char *dataref);
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log, const char *uuid,
-   const char *url, unsigned long timestamp);
+   const char *url, unsigned long timestamp, const char 
*local_ref);
 void fast_export_end_commit(uint32_t revision);
 void fast_export_data(uint32_t mode, off_t len, struct line_buffer *input);
 void fast_export_blob_delta(uint32_t mode,
diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index d81a078..288bb42 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -299,22 +299,22 @@ static void handle_node(void)
node_ctx.text_length, &input);
 }
 
-static void begin_revision(void)
+static void begin_revision(const char *remote_ref)
 {
if (!rev_ctx.revision)  /* revision 0 gets no git commit. */
return;
fast_export_begin_commit(rev_ctx.revision, rev_ctx.author.buf,
&rev_ctx.log, dump_ctx.uuid.buf, dump_ctx.url.buf,
-   rev_ctx.timestamp);
+   rev_ctx.timestamp, remote_ref);
 }
 
-static void end_revision(void)
+static void end_revision()
 {
if (rev_ctx.revision)
fast_export_end_commit(rev_ctx.revision);
 }
 
-void svndump_read(const char *url)
+void svndump_read(const char *url, const char *local_ref)
 {
char *val;
char *t;
@@ -353,7 +353,7 @@ void svndump_read(const char *url)
if (active_ctx == NODE_CTX)
handle_node();
if (active_ctx == REV_CTX)
-   begin_revision();
+   begin_revision(local_ref);
if (active_ctx != DUMP_CTX)
end_revision();
active_ctx = REV_CTX;
@@ -366,7 +366,7 @@ void svndump_read(const char *url)
if (active

[PATCH/RFC v4 06/16] Add documentation for the 'bidi-import' capability of remote-helpers.

2012-08-17 Thread Florian Achleitner
Signed-off-by: Florian Achleitner 
---
 Documentation/git-remote-helpers.txt |   21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-remote-helpers.txt 
b/Documentation/git-remote-helpers.txt
index f5836e4..5faa48e 100644
--- a/Documentation/git-remote-helpers.txt
+++ b/Documentation/git-remote-helpers.txt
@@ -98,6 +98,20 @@ advertised with this capability must cover all refs reported 
by
 the list command.  If no 'refspec' capability is advertised,
 there is an implied `refspec *:*`.
 
+'bidi-import'::
+   The fast-import commands 'cat-blob' and 'ls' can be used by 
remote-helpers
+to retrieve information about blobs and trees that already exist in
+fast-import's memory. This requires a channel from fast-import to the
+remote-helper.
+If it is advertised in addition to "import", git establishes a pipe from
+   fast-import to the remote-helper's stdin.
+   It follows that git and fast-import are both connected to the
+   remote-helper's stdin. Because git can send multiple commands to
+   the remote-helper it is required that helpers that use 'bidi-import'
+   buffer all 'import' commands of a batch before sending data to 
fast-import.
+This is to prevent mixing commands and fast-import responses on the
+helper's stdin.
+
 Capabilities for Pushing
 
 'connect'::
@@ -286,7 +300,12 @@ terminated with a blank line. For each batch of 'import', 
the remote
 helper should produce a fast-import stream terminated by a 'done'
 command.
 +
-Supported if the helper has the "import" capability.
+Note that if the 'bidi-import' capability is used the complete batch
+sequence has to be buffered before starting to send data to fast-import
+to prevent mixing of commands and fast-import responses on the helper's
+stdin.
++
+Supported if the helper has the 'import' capability.
 
 'connect' ::
Connects to given service. Standard input and standard output
-- 
1.7.9.5



[PATCH/RFC v4 03/16] Add svndump_init_fd to allow reading dumps from arbitrary FDs.

2012-08-17 Thread Florian Achleitner
The existing function only allows reading from a filename or
from stdin. Allow passing of a FD and an additional FD for
the back report pipe. This allows us to retrieve the name of
the pipe in the caller.

Signed-off-by: Florian Achleitner 
---
 vcs-svn/svndump.c |   22 ++
 vcs-svn/svndump.h |1 +
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index 2b168ae..d81a078 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -468,11 +468,9 @@ void svndump_read(const char *url)
end_revision();
 }
 
-int svndump_init(const char *filename)
+static void init(int report_fd)
 {
-   if (buffer_init(&input, filename))
-   return error("cannot open %s: %s", filename, strerror(errno));
-   fast_export_init(REPORT_FILENO);
+   fast_export_init(report_fd);
strbuf_init(&dump_ctx.uuid, 4096);
strbuf_init(&dump_ctx.url, 4096);
strbuf_init(&rev_ctx.log, 4096);
@@ -482,6 +480,22 @@ int svndump_init(const char *filename)
reset_dump_ctx(NULL);
reset_rev_ctx(0);
reset_node_ctx(NULL);
+   return;
+}
+
+int svndump_init(const char *filename)
+{
+   if (buffer_init(&input, filename))
+   return error("cannot open %s: %s", filename ? filename : 
"NULL", strerror(errno));
+   init(REPORT_FILENO);
+   return 0;
+}
+
+int svndump_init_fd(int in_fd, int back_fd)
+{
+   if(buffer_fdinit(&input, xdup(in_fd)))
+   return error("cannot open fd %d: %s", in_fd, strerror(errno));
+   init(xdup(back_fd));
return 0;
 }
 
diff --git a/vcs-svn/svndump.h b/vcs-svn/svndump.h
index df9ceb0..acb5b47 100644
--- a/vcs-svn/svndump.h
+++ b/vcs-svn/svndump.h
@@ -2,6 +2,7 @@
 #define SVNDUMP_H_
 
 int svndump_init(const char *filename);
+int svndump_init_fd(int in_fd, int back_fd);
 void svndump_read(const char *url);
 void svndump_deinit(void);
 void svndump_reset(void);
-- 
1.7.9.5



[PATCH/RFC v4 01/16] Implement a remote helper for svn in C.

2012-08-17 Thread Florian Achleitner
Enable basic fetching from subversion repositories. When processing
remote URLs starting with svn::, git invokes this remote-helper.
It starts svnrdump to extract revisions from the subversion repository
in the 'dump file format', and converts them to a git-fast-import stream
using the functions of vcs-svn/.

Imported refs are created in a private namespace at
refs/svn/<remote-name>/master.
The revision history is imported linearly (no branch detection) and
completely, i.e. from revision 0 to HEAD.

The 'bidi-import' capability is used. The remote-helper expects data
from fast-import on its stdin. It buffers a batch of 'import' command
lines in a string_list before starting to process them.
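
The batching can be sketched as follows. This is Python with invented names (run_helper, handlers, batchable), written under the assumption that the do_command() logic in the patch behaves as its comment describes; it is not a transcription of the C code.

```python
def run_helper(input_lines, handlers, batchable):
    """Process helper commands, buffering batchable ones until a blank line.

    handlers maps a command name to a function taking the full line.
    batchable is the set of command names that start or extend a batch.
    Returns the list of outputs in the order they would be written.
    """
    out = []
    batch_cmd = None
    batch = []
    for raw in input_lines:
        line = raw.rstrip("\n")
        if line == "":
            if batch_cmd is None:
                break                      # blank line outside a batch: exit
            for item in batch:             # batch complete: now process it
                out.append(handlers[batch_cmd](item))
            out.append("done")             # terminates the fast-import stream
            batch_cmd, batch = None, []
            continue
        name = line.split(" ", 1)[0]
        if name not in handlers:
            raise ValueError("unknown command: %s" % line)
        if batch_cmd is not None:
            if name != batch_cmd:
                raise ValueError("unexpected command inside a batch: %s" % line)
            batch.append(line)
        elif name in batchable:
            batch_cmd = name
            batch.append(line)
        else:
            out.append(handlers[name](line))
    return out
```

Nothing is handed to the import handler until the blank line arrives, which is the point: once the helper starts talking to fast-import, its stdin carries fast-import's responses and must no longer carry commands from git.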

Signed-off-by: Florian Achleitner 
---
 contrib/svn-fe/remote-svn.c |  174 +++
 1 file changed, 174 insertions(+)
 create mode 100644 contrib/svn-fe/remote-svn.c

diff --git a/contrib/svn-fe/remote-svn.c b/contrib/svn-fe/remote-svn.c
new file mode 100644
index 0000000..b853d54
--- /dev/null
+++ b/contrib/svn-fe/remote-svn.c
@@ -0,0 +1,174 @@
+#include "cache.h"
+#include "remote.h"
+#include "strbuf.h"
+#include "url.h"
+#include "exec_cmd.h"
+#include "run-command.h"
+#include "svndump.h"
+#include "notes.h"
+#include "argv-array.h"
+
+static const char *url;
+static const char *private_ref;
+static const char *remote_ref = "refs/heads/master";
+
+static int cmd_capabilities(const char *line);
+static int cmd_import(const char *line);
+static int cmd_list(const char *line);
+
+typedef int (*input_command_handler)(const char *);
+struct input_command_entry {
+   const char *name;
+   input_command_handler fn;
+   unsigned char batchable;/* whether the command starts or is 
part of a batch */
+};
+
+static const struct input_command_entry input_command_list[] = {
+   { "capabilities", cmd_capabilities, 0 },
+   { "import", cmd_import, 1 },
+   { "list", cmd_list, 0 },
+   { NULL, NULL }
+};
+
+static int cmd_capabilities(const char *line) {
+   printf("import\n");
+   printf("bidi-import\n");
+   printf("refspec %s:%s\n\n", remote_ref, private_ref);
+   fflush(stdout);
+   return 0;
+}
+
+static void terminate_batch(void)
+{
+   /* terminate a current batch's fast-import stream */
+   printf("done\n");
+   fflush(stdout);
+}
+
+static int cmd_import(const char *line)
+{
+   int code;
+   int dumpin_fd;
+   unsigned int startrev = 0;
+   struct argv_array svndump_argv = ARGV_ARRAY_INIT;
+   struct child_process svndump_proc;
+
+   memset(&svndump_proc, 0, sizeof(struct child_process));
+   svndump_proc.out = -1;
+   argv_array_push(&svndump_argv, "svnrdump");
+   argv_array_push(&svndump_argv, "dump");
+   argv_array_push(&svndump_argv, url);
+   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
+   svndump_proc.argv = svndump_argv.argv;
+
+   code = start_command(&svndump_proc);
+   if (code)
+   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
+   dumpin_fd = svndump_proc.out;
+
+   svndump_init_fd(dumpin_fd, STDIN_FILENO);
+   svndump_read(url, private_ref);
+   svndump_deinit();
+   svndump_reset();
+
+   close(dumpin_fd);
+   code = finish_command(&svndump_proc);
+   if (code)
+   warning("%s, returned %d", svndump_proc.argv[0], code);
+   argv_array_clear(&svndump_argv);
+
+   return 0;
+}
+
+static int cmd_list(const char *line)
+{
+   printf("? %s\n\n", remote_ref);
+   fflush(stdout);
+   return 0;
+}
+
+static int do_command(struct strbuf *line)
+{
+   const struct input_command_entry *p = input_command_list;
+   static struct string_list batchlines = STRING_LIST_INIT_DUP;
+   static const struct input_command_entry *batch_cmd;
+   /*
+* commands can be grouped together in a batch.
+* Batches are ended by \n. If no batch is active the program ends.
+* During a batch all lines are buffered and passed to the handler 
function
+* when the batch is terminated.
+*/
+   if (line->len == 0) {
+   if (batch_cmd) {
+   struct string_list_item *item;
+   for_each_string_list_item(item, &batchlines)
+   batch_cmd->fn(item->string);
+   terminate_batch();
+   batch_cmd = NULL;
+   string_list_clear(&batchlines, 0);
+   return 0;   /* end of the batch, continue reading 
other commands. */
+   }
+   return 1;   /* end 

[PATCH/RFC v4 01/16] GSOC remote-svn

2012-08-17 Thread Florian Achleitner
Hi!

Thanks for the reviews!
This series contains the following improvements.
I decided to summarize them here, sorted by topic instead of
attaching them to the patches.

all:
- remove all merge garbage and debugging legacy (hopefully).
- reviews: style
- reorder patches

remote-svn:
- review: refactor (fct -> fn), style
- fix command batch detection
- setup_git_dir, make non-gentle.
- die on unknown commands, instead of warning

contrib/svn-fe/Makefile, symlink:
- remove symlink-creating commit
- create symlink after linking remote-svn in Makefile
- the makefile is meant as a temporary solution. Therefore, it chooses
  openssl/sha1.h unconditionally.

argv_array:
- new patch: add argv_array_detach and argv_array_free_detached

transport-helper:
- use those new functions
- free argv always after finish_command.
- remove the patch that unconditionally activated --export-marks and
  --import-marks on the fast-import command line.

remote-svn:
- instead use fast-import's 'feature' command to activate marks import/export.
- use a more descriptive path for storing marks files.
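
To illustrate the 'feature' approach mentioned in the last list: instead of passing options on the fast-import command line, the helper can write feature lines into the stream it produces. A minimal sketch, assuming the caller has already chosen a marks path; format_marks_features and its parameters are hypothetical names, only the two feature strings are fast-import's.

```c
#include <stdio.h>

/*
 * Format the two fast-import 'feature' lines that enable marks
 * import and export for the given file.
 * marks_path is chosen by the caller (hypothetical in this sketch).
 * Returns 0 on success, -1 if buf is too small.
 */
int format_marks_features(char *buf, size_t len, const char *marks_path)
{
	int n = snprintf(buf, len,
			 "feature import-marks-if-exists=%s\n"
			 "feature export-marks=%s\n",
			 marks_path, marks_path);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The helper would normally print these lines before any other command of the fast-import stream, so that only helpers that actually want marks ask for them.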

Florian

 [PATCH/RFC v4 01/16] Implement a remote helper for svn in C.
 [PATCH/RFC v4 02/16] Integrate remote-svn into svn-fe/Makefile.
 [PATCH/RFC v4 03/16] Add svndump_init_fd to allow reading dumps from
 [PATCH/RFC v4 04/16] Add argv_array_detach and
 [PATCH/RFC v4 05/16] Connect fast-import to the remote-helper via
 [PATCH/RFC v4 06/16] Add documentation for the 'bidi-import'
 [PATCH/RFC v4 07/16] When debug==1, start fast-import with "--stats"
 [PATCH/RFC v4 08/16] remote-svn, vcs-svn: Enable fetching to private
 [PATCH/RFC v4 09/16] Allow reading svn dumps from files via file://
 [PATCH/RFC v4 10/16] vcs-svn: add fast_export_note to create notes
 [PATCH/RFC v4 11/16] Create a note for every imported commit
 [PATCH/RFC v4 12/16] remote-svn: Activate import/export-marks for
 [PATCH/RFC v4 13/16] remote-svn: add incremental import.
 [PATCH/RFC v4 14/16] Add a svnrdump-simulator replaying a dump file
 [PATCH/RFC v4 15/16] remote-svn: add marks-file regeneration.
 [PATCH/RFC v4 16/16] Add a test script for remote-svn.


Re: [PATCH/RFC v3 14/16] transport-helper: add import|export-marks to fast-import command line.

2012-08-15 Thread Florian Achleitner
On Wednesday 15 August 2012 22:20:45 Florian Achleitner wrote:
> On Wednesday 15 August 2012 12:52:43 Junio C Hamano wrote:
> > Florian Achleitner  writes:
> > > fast-import internally uses marks that refer to an object via its sha1.
> > > Those marks are created during import to find previously created
> > > objects.
> > > At exit the accumulated marks can be exported to a file and reloaded at
> > > startup, so that the previous marks are available.
> > > Add command line options to the fast-import command line to enable this.
> > > The mark files are stored in info/fast-import/marks/.
> > > 
> > > Signed-off-by: Florian Achleitner 
> > > ---
> > > 
> > >  transport-helper.c |3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/transport-helper.c b/transport-helper.c
> > > index 7fb52d4..47db055 100644
> > > --- a/transport-helper.c
> > > +++ b/transport-helper.c
> > > @@ -387,6 +387,9 @@ static int get_importer(struct transport *transport,
> > > struct child_process *fasti>
> > > 
> > >   fastimport->in = helper->out;
> > >   argv_array_push(&argv, "fast-import");
> > >   argv_array_push(&argv, debug ? "--stats" : "--quiet");
> > > 
> > > + argv_array_push(&argv, "--relative-marks");
> > > + argv_array_pushf(&argv, "--import-marks-if-exists=marks/%s", transport->remote->name);
> > > + argv_array_pushf(&argv, "--export-marks=marks/%s", transport->remote->name);
> > 
> > Is this something we want to do unconditionally?
> 
> Good question. It doesn't hurt, but it maybe . We could add another
> capability for remote-helpers, that tells us if it needs marks. What do you
> think?

Btw, for fast-export, there is already such a capability. It specifies a 
filename, in addition.


Re: [PATCH/RFC v3 14/16] transport-helper: add import|export-marks to fast-import command line.

2012-08-15 Thread Florian Achleitner
On Wednesday 15 August 2012 12:52:43 Junio C Hamano wrote:
> Florian Achleitner  writes:
> > fast-import internally uses marks that refer to an object via its sha1.
> > Those marks are created during import to find previously created objects.
> > At exit the accumulated marks can be exported to a file and reloaded at
> > startup, so that the previous marks are available.
> > Add command line options to the fast-import command line to enable this.
> > The mark files are stored in info/fast-import/marks/.
> > 
> > Signed-off-by: Florian Achleitner 
> > ---
> > 
> >  transport-helper.c |3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/transport-helper.c b/transport-helper.c
> > index 7fb52d4..47db055 100644
> > --- a/transport-helper.c
> > +++ b/transport-helper.c
> > @@ -387,6 +387,9 @@ static int get_importer(struct transport *transport,
> > struct child_process *fasti> 
> > fastimport->in = helper->out;
> > argv_array_push(&argv, "fast-import");
> > argv_array_push(&argv, debug ? "--stats" : "--quiet");
> > 
> > +   argv_array_push(&argv, "--relative-marks");
> > +   argv_array_pushf(&argv, "--import-marks-if-exists=marks/%s", transport->remote->name);
> > +   argv_array_pushf(&argv, "--export-marks=marks/%s", transport->remote->name);
> Is this something we want to do unconditionally?

Good question. It doesn't hurt, but it maybe . We could add another capability
for remote-helpers, that tells us if it needs marks. What do you think?
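
A rough sketch of what making this conditional could look like, independent of git's argv_array: the marks options are only added when a flag says the helper needs them. The needs_marks flag stands in for the capability proposed above and is hypothetical, as are build_importer_argv and OPT_LEN; the option strings are the ones from the patch.

```c
#include <stdio.h>

#define OPT_LEN 128	/* hypothetical buffer size for one option */

/*
 * Fill argv[] (at least 6 slots) with a fast-import command line.
 * opts[] provides storage for the two formatted marks options.
 * needs_marks models the proposed capability; it is hypothetical.
 * Returns the number of arguments, excluding the terminating NULL.
 */
int build_importer_argv(const char *argv[], char opts[2][OPT_LEN],
			const char *remote_name, int debug, int needs_marks)
{
	int n = 0;

	argv[n++] = "fast-import";
	argv[n++] = debug ? "--stats" : "--quiet";
	if (needs_marks) {
		argv[n++] = "--relative-marks";
		snprintf(opts[0], OPT_LEN,
			 "--import-marks-if-exists=marks/%s", remote_name);
		argv[n++] = opts[0];
		snprintf(opts[1], OPT_LEN,
			 "--export-marks=marks/%s", remote_name);
		argv[n++] = opts[1];
	}
	argv[n] = NULL;
	return n;
}
```

With needs_marks set to 0 the command line is the same as before the patch, so helpers that do not use marks are unaffected.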


Re: [PATCH/RFC v3 10/16] Create a note for every imported commit containing svn metadata.

2012-08-15 Thread Florian Achleitner
On Wednesday 15 August 2012 12:49:04 Junio C Hamano wrote:
> Florian Achleitner  writes:
> > To provide metadata from svn dumps for further processing, e.g.
> > branch detection, attach a note to each imported commit that
> > stores additional information.
> > The notes are currently hard-coded in refs/notes/svn/revs.
> > Currently the following lines from the svn dump are directly
> > accumulated in the note. This can be refined on purpose, of course.
> > - "Revision-number"
> > - "Node-path"
> > - "Node-kind"
> > - "Node-action"
> > - "Node-copyfrom-path"
> > - "Node-copyfrom-rev"
> > 
> > Signed-off-by: Florian Achleitner 
> > ---
> > 
> >  vcs-svn/fast_export.c |   13 +
> >  vcs-svn/fast_export.h |2 ++
> >  vcs-svn/svndump.c |   21 +++--
> >  3 files changed, 34 insertions(+), 2 deletions(-)
> > 
> > diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
> > index 1ecae4b..796dd1a 100644
> > --- a/vcs-svn/fast_export.c
> > +++ b/vcs-svn/fast_export.c
> > @@ -12,6 +12,7 @@
> > 
> >  #include "svndiff.h"
> >  #include "sliding_window.h"
> >  #include "line_buffer.h"
> > 
> > +#include "cache.h"
> 
> Shouldn't it be near the beginning?  Also if you include "cache.h",
> it probably makes git-compat-util and strbuf redundant.

Ack.

> 
> >  #define MAX_GITSVN_LINE_LEN 4096
> > 
> > @@ -68,6 +69,18 @@ void fast_export_modify(const char *path, uint32_t mode, const char *dataref)
> > putchar('\n');
> >  
> >  }
> > 
> > +void fast_export_begin_note(uint32_t revision, const char *author,
> > +   const char *log, unsigned long timestamp)
> > +{
> > +   timestamp = 1341914616;
> 
> The magic number needs some comment.
> 
> > +   size_t loglen = strlen(log);
> 
> decl-after-statement.  I am starting to suspect that the assignment
> is a leftover from an earlier debugging effort, though.

Oh yes sorry. Leftover from a previous experiment.
Thx for your reviews Junio, I got too blind to see this.


Re: [PATCH/RFC v3 04/16] Connect fast-import to the remote-helper via pipe, adding 'bidi-import' capability.

2012-08-15 Thread Florian Achleitner
On Tuesday 14 August 2012 13:40:20 Junio C Hamano wrote:
> Florian Achleitner  writes:
> > The fast-import commands 'cat-blob' and 'ls' can be used by remote-helpers
> > to retrieve information about blobs and trees that already exist in
> > fast-import's memory. This requires a channel from fast-import to the
> > remote-helper.
> > remote-helpers that use this features shall advertise the new
> > 'bidi-import'
> 
> s/this fea/these fea/
> 
> > capability so signal that they require the communication channel.
> 
> s/so sig/to sig/, I think.
> 
> > When forking fast-import in transport-helper.c connect it to a dup of
> > the remote-helper's stdin-pipe. The additional file descriptor is passed
> > to fast-import via it's command line (--cat-blob-fd).
> 
> s/via it's/via its/;
> 
> > It follows that git and fast-import are connected to the remote-helpers's
> > stdin.
> > Because git can send multiple commands to the remote-helper on it's stdin,
> > it is required that helpers that advertise 'bidi-import' buffer all input
> > commands until the batch of 'import' commands is ended by a newline
> > before sending data to fast-import.
> > This is to prevent mixing commands and fast-import responses on the
> > helper's stdin.
> 
> Please have a blank line each between paragraphs; a solid block of
> text is very hard to follow.
> 
> > Signed-off-by: Florian Achleitner 
> > ---
> > 
> >  transport-helper.c |   45 -
> >  1 file changed, 32 insertions(+), 13 deletions(-)
> > 
> > diff --git a/transport-helper.c b/transport-helper.c
> > index cfe0988..257274b 100644
> > --- a/transport-helper.c
> > +++ b/transport-helper.c
> > @@ -10,6 +10,7 @@
> > 
> >  #include "string-list.h"
> >  #include "thread-utils.h"
> >  #include "sigchain.h"
> > 
> > +#include "argv-array.h"
> > 
> >  static int debug;
> > 
> > @@ -19,6 +20,7 @@ struct helper_data {
> > 
> > FILE *out;
> > unsigned fetch : 1,
> > 
> > import : 1,
> > 
> > +   bidi_import : 1,
> > 
> > export : 1,
> > option : 1,
> > push : 1,
> > 
> > @@ -101,6 +103,7 @@ static void do_take_over(struct transport *transport)
> > 
> >  static struct child_process *get_helper(struct transport *transport)
> >  {
> >  
> > struct helper_data *data = transport->data;
> > 
> > +   struct argv_array argv = ARGV_ARRAY_INIT;
> > 
> > struct strbuf buf = STRBUF_INIT;
> > struct child_process *helper;
> > const char **refspecs = NULL;
> > 
> > @@ -122,11 +125,10 @@ static struct child_process *get_helper(struct transport *transport)
> > helper->in = -1;
> > helper->out = -1;
> > helper->err = 0;
> > 
> > -   helper->argv = xcalloc(4, sizeof(*helper->argv));
> > -   strbuf_addf(&buf, "git-remote-%s", data->name);
> > -   helper->argv[0] = strbuf_detach(&buf, NULL);
> > -   helper->argv[1] = transport->remote->name;
> > -   helper->argv[2] = remove_ext_force(transport->url);
> > +   argv_array_pushf(&argv, "git-remote-%s", data->name);
> > +   argv_array_push(&argv, transport->remote->name);
> > +   argv_array_push(&argv, remove_ext_force(transport->url));
> > +   helper->argv = argv.argv;
> 
> Much nicer than before thanks to argv_array ;-)
> 
> > helper->git_cmd = 0;
> > helper->silent_exec_failure = 1;
> > 
> > @@ -141,6 +143,8 @@ static struct child_process *get_helper(struct transport *transport)
> > data->helper = helper;
> > data->no_disconnect_req = 0;
> > 
> > +   free((void*) helper_env[1]);
> 
> What is this free() for???

Sorry, legacy from previous versions, will be deleted.
> 
> > +   argv_array_clear(&argv);
> 
> See below.
> 
> > /*
> > 
> >  * Open the output as FILE* so strbuf_getline() can be used.
> > 
> > @@ -178,6 +182,8 @@ static struct child_process *get_helper(struct transport *transport)
> > data->push = 1;
> > 
> > else if (!strcmp(capname, "import"))
> > 
> > data->import = 1;
&g

Re: [PATCH/RFC v3 01/16] Implement a remote helper for svn in C.

2012-08-15 Thread Florian Achleitner
On Tuesday 14 August 2012 13:07:32 Junio C Hamano wrote:
> Florian Achleitner  writes:
> > Enable basic fetching from subversion repositories. When processing remote
> > URLs starting with svn::, git invokes this remote-helper.
> > It starts svnrdump to extract revisions from the subversion repository in
> > the 'dump file format', and converts them to a git-fast-import stream
> > using the functions of vcs-svn/.
> 
> (nit) the above is a bit too wide, isn't it?
> 
> > Imported refs are created in a private namespace at
> > refs/svn/ (nit) missing closing '>'?
> 
> > The revision history is imported linearly (no branch detection) and
> > completely, i.e. from revision 0 to HEAD.
> > 
> > The 'bidi-import' capability is used. The remote-helper expects data from
> > fast-import on its stdin. It buffers a batch of 'import' command lines
> > in a string_list before starting to process them.
> > 
> > Signed-off-by: Florian Achleitner 
> > ---
> > diff:
> > - incorporate review
> > - remove redundant strbuf_init
> > - add 'bidi-import' to capabilities
> > - buffer all lines of a command batch in string_list
> > 
> >  contrib/svn-fe/remote-svn.c |  183 +++
> >  1 file changed, 183 insertions(+)
> >  create mode 100644 contrib/svn-fe/remote-svn.c
> > 
> > diff --git a/contrib/svn-fe/remote-svn.c b/contrib/svn-fe/remote-svn.c
> > new file mode 100644
> > index 000..ce59344
> > --- /dev/null
> > +++ b/contrib/svn-fe/remote-svn.c
> > @@ -0,0 +1,183 @@
> > +
> 
> Remove.
> 
> > +#include "cache.h"
> > +#include "remote.h"
> > +#include "strbuf.h"
> > +#include "url.h"
> > +#include "exec_cmd.h"
> > +#include "run-command.h"
> > +#include "svndump.h"
> > +#include "notes.h"
> > +#include "argv-array.h"
> > +
> > +static const char *url;
> > +static const char *private_ref;
> > +static const char *remote_ref = "refs/heads/master";
> 
> Just wondering; is this name "master" (or "refs/heads/" for that
> matter) significant in any way when talking to a subversion remote?

No, it isn't. But it has to specify something in the list command.

> 
> > +static int cmd_capabilities(const char *line);
> > +static int cmd_import(const char *line);
> > +static int cmd_list(const char *line);
> > +
> > +typedef int (*input_command_handler)(const char *);
> > +struct input_command_entry {
> > +   const char *name;
> > +   input_command_handler fct;
> > +   unsigned char batchable;    /* whether the command starts or is part of a batch */
> > +};
> > +
> > +static const struct input_command_entry input_command_list[] = {
> > +   { "capabilities", cmd_capabilities, 0 },
> 
> One level too deeply indented?
> 
> > +   { "import", cmd_import, 1 },
> > +   { "list", cmd_list, 0 },
> > +   { NULL, NULL }
> > +};
> > +
> > +static int cmd_capabilities(const char *line) {
> > +   printf("import\n");
> > +   printf("bidi-import\n");
> > +   printf("refspec %s:%s\n\n", remote_ref, private_ref);
> > +   fflush(stdout);
> > +   return 0;
> > +}
> > +
> > +static void terminate_batch(void)
> > +{
> > +   /* terminate a current batch's fast-import stream */
> > +   printf("done\n");
> 
> Likewise.
> 
> > +   fflush(stdout);
> > +}
> > +
> > +static int cmd_import(const char *line)
> > +{
> > +   int code;
> > +   int dumpin_fd;
> > +   unsigned int startrev = 0;
> > +   struct argv_array svndump_argv = ARGV_ARRAY_INIT;
> > +   struct child_process svndump_proc;
> > +
> > +   memset(&svndump_proc, 0, sizeof (struct child_process));
> 
> Please lose SP between sizeof and '('.
> 
> > +   svndump_proc.out = -1;
> > +   argv_array_push(&svndump_argv, "svnrdump");
> > +   argv_array_push(&svndump_argv, "dump");
> > +   argv_array_push(&svndump_argv, url);
> > +   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
> > +   svndump_proc.argv = svndump_argv.argv;
> 
> (just me making a mental note) We read from "svnrdump", which would
> read (if it ever does) from the same stdin as ours and 

Re: [PATCH/RFC v3 16/16] Add a test script for remote-svn.

2012-08-15 Thread Florian Achleitner
Forget this patch! It contains some unwanted content. Something went wrong
while rebasing.

On Tuesday 14 August 2012 21:13:18 Florian Achleitner wrote:
> Use svnrdump_sim.py to emulate svnrdump without an svn server.
> Tests fetching, incremental fetching, fetching from file://,
> and the regeneration of fast-import's marks file.
> 
> Signed-off-by: Florian Achleitner 
> ---
>  t/t9020-remote-svn.sh |   69 +
>  transport-helper.c    |   15 ++-
>  2 files changed, 77 insertions(+), 7 deletions(-)
>  create mode 100755 t/t9020-remote-svn.sh
> 
> diff --git a/t/t9020-remote-svn.sh b/t/t9020-remote-svn.sh
> new file mode 100755
> index 000..a0c6a21
> --- /dev/null
> +++ b/t/t9020-remote-svn.sh
> @@ -0,0 +1,69 @@
> +#!/bin/sh
> +
> +test_description='tests remote-svn'
> +
> +. ./test-lib.sh
> +
> +# We override svnrdump by placing a symlink to the svnrdump-emulator in .
> +export PATH="$HOME:$PATH"
> +ln -sf $GIT_BUILD_DIR/contrib/svn-fe/svnrdump_sim.py "$HOME/svnrdump"
> +
> +init_git () {
> + rm -fr .git &&
> + git init &&
> + #git remote add svnsim svn::sim:///$TEST_DIRECTORY/t9020/example.svnrdump
> + # let's reuse an existing dump file!?
> + git remote add svnsim svn::sim:///$TEST_DIRECTORY/t9154/svn.dump &&
> + git remote add svnfile svn::file:///$TEST_DIRECTORY/t9154/svn.dump
> +}
> +
> +test_debug '
> + git --version
> + which git
> + which svnrdump
> +'
> +
> +test_expect_success 'simple fetch' '
> + init_git &&
> + git fetch svnsim &&
> + test_cmp .git/refs/svn/svnsim/master .git/refs/remotes/svnsim/master  &&
> + cp .git/refs/remotes/svnsim/master master.good
> +'
> +
> +test_debug '
> + cat .git/refs/svn/svnsim/master
> + cat .git/refs/remotes/svnsim/master
> +'
> +
> +test_expect_success 'repeated fetch, nothing shall change' '
> + git fetch svnsim &&
> + test_cmp master.good .git/refs/remotes/svnsim/master
> +'
> +
> +test_expect_success 'fetch from a file:// url gives the same result' '
> + git fetch svnfile
> +'
> +
> +test_expect_failure 'the sha1 differ because the git-svn-id line in the commit msg contains the url' '
> + test_cmp .git/refs/remotes/svnfile/master .git/refs/remotes/svnsim/master
> +'
> +
> +test_expect_success 'mark-file regeneration' '
> + mv .git/info/fast-import/marks/svnsim .git/info/fast-import/marks/svnsim.old &&
> + git fetch svnsim &&
> + test_cmp .git/info/fast-import/marks/svnsim.old .git/info/fast-import/marks/svnsim
> +'
> +
> +test_expect_success 'incremental imports must lead to the same head' '
> + export SVNRMAX=3 &&
> + init_git &&
> + git fetch svnsim &&
> + test_cmp .git/refs/svn/svnsim/master .git/refs/remotes/svnsim/master  &&
> + unset SVNRMAX &&
> + git fetch svnsim &&
> + test_cmp master.good .git/refs/remotes/svnsim/master
> +'
> +
> +test_debug 'git branch -a'
> +
> +test_done
> diff --git a/transport-helper.c b/transport-helper.c
> index 47db055..a363f2c 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -17,6 +17,7 @@ static int debug;
>  struct helper_data {
>   const char *name;
>   struct child_process *helper;
> + struct argv_array argv;
>   FILE *out;
>   unsigned fetch : 1,
>   import : 1,
> @@ -103,7 +104,6 @@ static void do_take_over(struct transport *transport)
>  static struct child_process *get_helper(struct transport *transport)
>  {
>   struct helper_data *data = transport->data;
> - struct argv_array argv = ARGV_ARRAY_INIT;
>   struct strbuf buf = STRBUF_INIT;
>   struct child_process *helper;
>   const char **refspecs = NULL;
> @@ -125,10 +125,11 @@ static struct child_process *get_helper(struct transport *transport)
>   helper->in = -1;
>   helper->out = -1;
>   helper->err = 0;
> - argv_array_pushf(&argv, "git-remote-%s", data->name);
> - argv_array_push(&argv, transport->remote->name);
> - argv_array_push(&argv, remove_ext_force(transport->url));
> - helper->argv = argv.argv;
> + argv_array_init(&data->argv);
> + argv_array_pushf(&data->argv, "git-remote-%s", data->name);
> + argv_array_push(&data->argv, transport->remot

Re: [PATCH/RFC v3 07/16] Add a symlink 'git-remote-svn' in base dir.

2012-08-15 Thread Florian Achleitner
On Tuesday 14 August 2012 13:46:43 Junio C Hamano wrote:
> Florian Achleitner  writes:
> > Allow execution of git-remote-svn even if the binary
> > currently is located in contrib/svn-fe/.
> > 
> > Signed-off-by: Florian Achleitner 
> > ---
> > 
> >  git-remote-svn |1 +
> >  1 file changed, 1 insertion(+)
> >  create mode 12 git-remote-svn
> > 
> > diff --git a/git-remote-svn b/git-remote-svn
> > new file mode 12
> > index 000..d3b1c07
> > --- /dev/null
> > +++ b/git-remote-svn
> > @@ -0,0 +1 @@
> > +contrib/svn-fe/remote-svn
> > \ No newline at end of file
> 
> Please scratch my previous comment.  I thought you were adding an
> entry to .gitignore or something.
> 
> I'd rather not to see such a symbolic link that points at a build
> product in the source tree.  Making a symlink from the toplevel
> Makefile _after_ we built it in contrib/svn-fe/ (and removing it
> upon "make clean") is OK, though.

As with the makefile in contrib/svn-fe, this is just a hack. The toplevel 
Makefile doesn't seem to build contrib/* at all. I always need to call make 
explicitly in these subdirs.



Re: [PATCH/RFC v3 02/16] Integrate remote-svn into svn-fe/Makefile.

2012-08-15 Thread Florian Achleitner
On Tuesday 14 August 2012 13:14:12 Junio C Hamano wrote:
> Florian Achleitner  writes:
> > Requires some sha.h to be used and the libraries
> > to be linked, this is currently hardcoded.
> > 
> > Signed-off-by: Florian Achleitner 
> > ---
> > 
> >  contrib/svn-fe/Makefile |   16 ++--
> >  1 file changed, 10 insertions(+), 6 deletions(-)
> > 
> > diff --git a/contrib/svn-fe/Makefile b/contrib/svn-fe/Makefile
> > index 360d8da..8f0eec2 100644
> > --- a/contrib/svn-fe/Makefile
> > +++ b/contrib/svn-fe/Makefile
> > @@ -1,14 +1,14 @@
> > -all:: svn-fe$X
> > +all:: svn-fe$X remote-svn$X
> > 
> >  CC = gcc
> >  RM = rm -f
> >  MV = mv
> > 
> > -CFLAGS = -g -O2 -Wall
> > +CFLAGS = -g -O2 -Wall -DSHA1_HEADER='' -Wdeclaration-after-statement
> >  LDFLAGS =
> >  ALL_CFLAGS = $(CFLAGS)
> >  ALL_LDFLAGS = $(LDFLAGS)
> > 
> > -EXTLIBS =
> > +EXTLIBS = -lssl -lcrypto -lpthread ../../xdiff/lib.a
> 
> I haven't looked carefully, but didn't we have to do a bit more
> elaborate when linking with ssl/crypto in our main Makefile to be
> portable across various vintages of OpenSSL libraries?
> 
> Does contrib/svn-fe/ already depend on OpenSSL by the way?  It needs
> to be documented somewhere in the same directory.
> 
> If one builds the main Git binary with NO_OPENSSL, can this still be
> built and linked?
> 
> What does this use xdiff/lib.a for?
> 
> The above are just mental notes; I didn't read the later patches in
> the series that may already address these issues.

As for the Makefile, I have to say that this is just a hack to make it work.
I'm not sure how to integrate it correctly into git's Makefile hierarchy.
The OpenSSL header and xdiff/lib.a are here because the build doesn't work
otherwise. I need to dig into that to find out why. Any tips on how to do it
right?
 
> >  GIT_LIB = ../../libgit.a
> >  VCSSVN_LIB = ../../vcs-svn/lib.a
> > 
> > @@ -37,8 +37,12 @@ svn-fe$X: svn-fe.o $(VCSSVN_LIB) $(GIT_LIB)
> > 
> > $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ svn-fe.o \
> > 
> > $(ALL_LDFLAGS) $(LIBS)
> > 
> > -svn-fe.o: svn-fe.c ../../vcs-svn/svndump.h
> > -   $(QUIET_CC)$(CC) -I../../vcs-svn -o $*.o -c $(ALL_CFLAGS) $<
> > +remote-svn$X: remote-svn.o $(VCSSVN_LIB) $(GIT_LIB)
> > +   $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ remote-svn.o \
> > +   $(ALL_LDFLAGS) $(LIBS)
> > +
> > +%.o: %.c ../../vcs-svn/svndump.h
> > +   $(QUIET_CC)$(CC) -I../../vcs-svn -I../../ -o $*.o -c $(ALL_CFLAGS) $<
> > 
> >  svn-fe.html: svn-fe.txt
> >  
> > $(QUIET_SUBDIR0)../../Documentation $(QUIET_SUBDIR1) \
> > 
> > @@ -58,6 +62,6 @@ svn-fe.1: svn-fe.txt
> > 
> > $(QUIET_SUBDIR0)../.. $(QUIET_SUBDIR1) libgit.a
> >  
> >  clean:
> > -   $(RM) svn-fe$X svn-fe.o svn-fe.html svn-fe.xml svn-fe.1
> > +   $(RM) svn-fe$X svn-fe.o svn-fe.html svn-fe.xml svn-fe.1 remote-svn.o
> > 
> >  .PHONY: all clean FORCE


[PATCH/RFC v3 16/16] Add a test script for remote-svn.

2012-08-14 Thread Florian Achleitner
Use svnrdump_sim.py to emulate svnrdump without an svn server.
Tests fetching, incremental fetching, fetching from file://,
and the regeneration of fast-import's marks file.

Signed-off-by: Florian Achleitner 
---
 t/t9020-remote-svn.sh |   69 +
 transport-helper.c|   15 ++-
 2 files changed, 77 insertions(+), 7 deletions(-)
 create mode 100755 t/t9020-remote-svn.sh

diff --git a/t/t9020-remote-svn.sh b/t/t9020-remote-svn.sh
new file mode 100755
index 000..a0c6a21
--- /dev/null
+++ b/t/t9020-remote-svn.sh
@@ -0,0 +1,69 @@
+#!/bin/sh
+
+test_description='tests remote-svn'
+
+. ./test-lib.sh
+
+# We override svnrdump by placing a symlink to the svnrdump-emulator in .
+export PATH="$HOME:$PATH"
+ln -sf $GIT_BUILD_DIR/contrib/svn-fe/svnrdump_sim.py "$HOME/svnrdump"
+
+init_git () {
+   rm -fr .git &&
+   git init &&
+   #git remote add svnsim svn::sim:///$TEST_DIRECTORY/t9020/example.svnrdump
+   # let's reuse an existing dump file!?
+   git remote add svnsim svn::sim:///$TEST_DIRECTORY/t9154/svn.dump &&
+   git remote add svnfile svn::file:///$TEST_DIRECTORY/t9154/svn.dump
+}
+
+test_debug '
+   git --version
+   which git
+   which svnrdump
+'
+
+test_expect_success 'simple fetch' '
+   init_git &&
+   git fetch svnsim &&
+   test_cmp .git/refs/svn/svnsim/master .git/refs/remotes/svnsim/master  &&
+   cp .git/refs/remotes/svnsim/master master.good
+'
+
+test_debug '
+   cat .git/refs/svn/svnsim/master
+   cat .git/refs/remotes/svnsim/master
+'
+
+test_expect_success 'repeated fetch, nothing shall change' '
+   git fetch svnsim &&
+   test_cmp master.good .git/refs/remotes/svnsim/master
+'
+
+test_expect_success 'fetch from a file:// url gives the same result' '
+   git fetch svnfile 
+'
+
+test_expect_failure 'the sha1 differ because the git-svn-id line in the commit msg contains the url' '
+   test_cmp .git/refs/remotes/svnfile/master .git/refs/remotes/svnsim/master
+'
+
+test_expect_success 'mark-file regeneration' '
+   mv .git/info/fast-import/marks/svnsim .git/info/fast-import/marks/svnsim.old &&
+   git fetch svnsim &&
+   test_cmp .git/info/fast-import/marks/svnsim.old .git/info/fast-import/marks/svnsim
+'
+
+test_expect_success 'incremental imports must lead to the same head' '
+   export SVNRMAX=3 &&
+   init_git &&
+   git fetch svnsim &&
+   test_cmp .git/refs/svn/svnsim/master .git/refs/remotes/svnsim/master  &&
+   unset SVNRMAX &&
+   git fetch svnsim &&
+   test_cmp master.good .git/refs/remotes/svnsim/master
+'
+
+test_debug 'git branch -a' 
+
+test_done
diff --git a/transport-helper.c b/transport-helper.c
index 47db055..a363f2c 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -17,6 +17,7 @@ static int debug;
 struct helper_data {
const char *name;
struct child_process *helper;
+   struct argv_array argv;
FILE *out;
unsigned fetch : 1,
import : 1,
@@ -103,7 +104,6 @@ static void do_take_over(struct transport *transport)
 static struct child_process *get_helper(struct transport *transport)
 {
struct helper_data *data = transport->data;
-   struct argv_array argv = ARGV_ARRAY_INIT;
struct strbuf buf = STRBUF_INIT;
struct child_process *helper;
const char **refspecs = NULL;
@@ -125,10 +125,11 @@ static struct child_process *get_helper(struct transport *transport)
helper->in = -1;
helper->out = -1;
helper->err = 0;
-   argv_array_pushf(&argv, "git-remote-%s", data->name);
-   argv_array_push(&argv, transport->remote->name);
-   argv_array_push(&argv, remove_ext_force(transport->url));
-   helper->argv = argv.argv;
+   argv_array_init(&data->argv);
+   argv_array_pushf(&data->argv, "git-remote-%s", data->name);
+   argv_array_push(&data->argv, transport->remote->name);
+   argv_array_push(&data->argv, remove_ext_force(transport->url));
+   helper->argv = data->argv.argv;
helper->git_cmd = 0;
helper->silent_exec_failure = 1;
 
@@ -143,8 +144,6 @@ static struct child_process *get_helper(struct transport *transport)
 
data->helper = helper;
data->no_disconnect_req = 0;
-   free((void*) helper_env[1]);
-   argv_array_clear(&argv);
 
/*
 * Open the output as FILE* so strbuf_getline() can be used.
@@ -247,6 +246,8 @@ static int disconnect_helper(struct transport *transport)
 

[PATCH/RFC v3 15/16] remote-svn: add marks-file regeneration.

2012-08-14 Thread Florian Achleitner
fast-import marks files are stored outside the object database. They are
therefore not fetched and can be lost in other ways.
Marks provide an svn revision --> git sha1 mapping, while the notes that are
attached to each commit when it is imported provide a git sha1 --> svn
revision mapping.

If the marks file is not available or not plausible, regenerate it by walking
through the notes tree.
The plausibility check tests whether the highest revision in the marks file
matches the revision of the top ref. It doesn't ensure that the marks file is
completely correct. That could only be done with an effort equal to
unconditional regeneration.

Signed-off-by: Florian Achleitner 
---
 contrib/svn-fe/remote-svn.c |   69 ++-
 1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/contrib/svn-fe/remote-svn.c b/contrib/svn-fe/remote-svn.c
index d659a0e..94e5196 100644
--- a/contrib/svn-fe/remote-svn.c
+++ b/contrib/svn-fe/remote-svn.c
@@ -13,7 +13,7 @@ static const char *url;
 static int dump_from_file;
 static const char *private_ref;
 static const char *remote_ref = "refs/heads/master";
-static const char *notes_ref;
+static const char *notes_ref, *marksfilename;
 struct rev_note { unsigned int rev_nr; };
 
 static int cmd_capabilities(const char *line);
@@ -87,6 +87,68 @@ static int parse_rev_note(const char *msg, struct rev_note *res) {
return 0;
 }
 
+static int note2mark_cb(const unsigned char *object_sha1,
+   const unsigned char *note_sha1, char *note_path,
+   void *cb_data) {
+   FILE *file = (FILE *)cb_data;
+   char *msg;
+   unsigned long msglen;
+   enum object_type type;
+   struct rev_note note;
+   if (!(msg = read_sha1_file(note_sha1, &type, &msglen)) ||
+   !msglen || type != OBJ_BLOB) {
+   free(msg);
+   return 1;
+   }
+   if (parse_rev_note(msg, &note))
+   return 2;
+   if (fprintf(file, ":%d %s\n", note.rev_nr, sha1_to_hex(object_sha1)) < 1)
+   return 3;
+   return 0;
+}
+
+static void regenerate_marks() {
+   int ret;
+   FILE *marksfile;
+   marksfile = fopen(marksfilename, "w+");
+   if (!marksfile)
+   die_errno("Couldn't create mark file %s.", marksfilename);
+   ret = for_each_note(NULL, 0, note2mark_cb, marksfile);
+   if (ret)
+   die("Regeneration of marks failed, returned %d.", ret);
+   fclose(marksfile);
+}
+
+static void check_or_regenerate_marks(int latestrev) {
+   FILE *marksfile;
+   char *line = NULL;
+   size_t linelen = 0;
+   struct strbuf sb = STRBUF_INIT;
+   int found = 0;
+
+   if (latestrev < 1)
+   return;
+
+   init_notes(NULL, notes_ref, NULL, 0);
+   marksfile = fopen(marksfilename, "r");
+   if (!marksfile)
+   regenerate_marks();
+   else {
+   strbuf_addf(&sb, ":%d ", latestrev);
+   while (getline(&line, &linelen, marksfile) != -1) {
+   if (!prefixcmp(line, sb.buf)) {
+   found++;
+   break;
+   }
+   }
+   fclose(marksfile);
+   if (!found)
+   regenerate_marks();
+   }
+   free_notes(NULL);
+   strbuf_release(&sb);
+}
+
 static int cmd_import(const char *line)
 {
int code;
@@ -112,6 +174,7 @@ static int cmd_import(const char *line)
free(note_msg);
}
}
+   check_or_regenerate_marks(startrev - 1);
 
if(dump_from_file) {
dumpin_fd = open(url, O_RDONLY);
@@ -238,6 +301,9 @@ int main(int argc, const char **argv)
strbuf_addf(&buf, "refs/notes/%s/revs", remote->name);
notes_ref = strbuf_detach(&buf, NULL);
 
+   strbuf_addf(&buf, "%s/info/fast-import/marks/%s", get_git_dir(), remote->name);
+   marksfilename = strbuf_detach(&buf, NULL);
+
while(1) {
if (strbuf_getline(&buf, stdin, '\n') == EOF) {
if (ferror(stdin))
@@ -254,5 +320,6 @@ int main(int argc, const char **argv)
free((void*)url);
free((void*)private_ref);
free((void*)notes_ref);
+   free((void*)marksfilename);
return 0;
 }
-- 
1.7.9.5



[PATCH/RFC v3 13/16] Add a svnrdump-simulator replaying a dump file for testing.

2012-08-14 Thread Florian Achleitner
To ease testing without depending on a reachable svn server, this
compact python script mimics parts of svnrdump's behaviour.
It requires the remote url to start with sim://.
Start and end revisions are evaluated.
If the requested revision doesn't exist, as is the case with
incremental imports when no new commit was added, it returns 1
(like svnrdump).
To allow using the same dump file for simulating multiple
incremental imports, the highest revision can be limited by setting
the environment variable SVNRMAX to that value. This simulates the
situation where higher revs don't exist yet.

Signed-off-by: Florian Achleitner 
---
 contrib/svn-fe/svnrdump_sim.py |   53 
 1 file changed, 53 insertions(+)
 create mode 100755 contrib/svn-fe/svnrdump_sim.py

diff --git a/contrib/svn-fe/svnrdump_sim.py b/contrib/svn-fe/svnrdump_sim.py
new file mode 100755
index 0000000..ab4ccf1
--- /dev/null
+++ b/contrib/svn-fe/svnrdump_sim.py
@@ -0,0 +1,53 @@
+#!/usr/bin/python
+"""
+Simulates svnrdump by replaying an existing dump from a file, taking care
+of the specified revision range.
+To simulate incremental imports the environment variable SVNRMAX can be set
+to the highest revision that should be available.
+"""
+import sys, os
+
+
+def getrevlimit():
+    var = 'SVNRMAX'
+    if var in os.environ:
+        return os.environ[var]
+    return None
+
+def writedump(url, lower, upper):
+    if url.startswith('sim://'):
+        filename = url[6:]
+        if filename[-1] == '/': filename = filename[:-1] #remove terminating slash
+    else:
+        raise ValueError('sim:// url required')
+    f = open(filename, 'r')
+    state = 'header'
+    wroterev = False
+    while(True):
+        l = f.readline()
+        if l == '': break
+        if state == 'header' and l.startswith('Revision-number: '):
+            state = 'prefix'
+        if state == 'prefix' and l == 'Revision-number: %s\n' % lower:
+            state = 'selection'
+        if not upper == 'HEAD' and state == 'selection' and l == 'Revision-number: %s\n' % upper:
+            break
+
+        if state == 'header' or state == 'selection':
+            if state == 'selection': wroterev = True
+            sys.stdout.write(l)
+    return wroterev
+
+if __name__ == "__main__":
+    if not (len(sys.argv) in (3, 4, 5)):
+        print("usage: %s dump URL -rLOWER:UPPER" % sys.argv[0])
+        sys.exit(1)
+    if not sys.argv[1] == 'dump': raise NotImplementedError('only "dump" is supported.')
+    url = sys.argv[2]
+    r = ['0', 'HEAD']
+    if len(sys.argv) == 4 and sys.argv[3][0:2] == '-r':
+        r = sys.argv[3][2:].lstrip().split(':')
+    if not getrevlimit() is None: r[1] = getrevlimit()
+    if writedump(url, r[0], r[1]): ret = 0
+    else: ret = 1
+    sys.exit(ret)
-- 
1.7.9.5



[PATCH/RFC v3 12/16] remote-svn: add incremental import.

2012-08-14 Thread Florian Achleitner
Search for a note attached to the ref to update and read its
'Revision-number:' line. Start the import from the next svn revision.

If there is no next revision in the svn repo, svnrdump terminates
with a message on stderr and a non-zero return value. This looks a
little weird, but there is no other way to know whether there is
a new revision in the svn repo.

On the start of an incremental import, the parent of the first commit
in the fast-import stream is set to the branch name to update. All
following commits specify their parent by a mark number. Previous
mark files are currently not reused.
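
The start-revision lookup described above can be sketched in Python as follows. This is only an illustration of the logic, not the C code from the patch: the function names are made up for the example, and falling back to revision 0 when the note has no 'Revision-number:' line is a choice of this sketch.

```python
def parse_rev_note(msg):
    """Return the last 'Revision-number:' value found in a note, or None."""
    key = "Revision-number: "
    rev = None
    for line in msg.splitlines():
        if line.startswith(key):
            value = int(line[len(key):])
            # Reject values that do not fit an unsigned 32-bit revision number.
            if value < 0 or value > 0xFFFFFFFF:
                raise ValueError("revision out of range: %d" % value)
            rev = value
    return rev


def next_start_rev(note_msg):
    """Revision to start the next import from; 0 if no usable note exists."""
    if note_msg is None:
        return 0
    rev = parse_rev_note(note_msg)
    return 0 if rev is None else rev + 1
```

A note recording revision 7 therefore makes the next import start at revision 8, while a missing note restarts from revision 0.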

Signed-off-by: Florian Achleitner 
---
 contrib/svn-fe/remote-svn.c |   66 +--
 contrib/svn-fe/svn-fe.c |3 +-
 test-svn-fe.c   |2 +-
 vcs-svn/fast_export.c   |   16 ---
 vcs-svn/fast_export.h   |6 ++--
 vcs-svn/svndump.c   |   12 
 vcs-svn/svndump.h   |2 +-
 7 files changed, 89 insertions(+), 18 deletions(-)

diff --git a/contrib/svn-fe/remote-svn.c b/contrib/svn-fe/remote-svn.c
index df1babc..d659a0e 100644
--- a/contrib/svn-fe/remote-svn.c
+++ b/contrib/svn-fe/remote-svn.c
@@ -13,6 +13,8 @@ static const char *url;
 static int dump_from_file;
 static const char *private_ref;
 static const char *remote_ref = "refs/heads/master";
+static const char *notes_ref;
+struct rev_note { unsigned int rev_nr; };
 
 static int cmd_capabilities(const char *line);
 static int cmd_import(const char *line);
@@ -47,14 +49,70 @@ static void terminate_batch(void)
fflush(stdout);
 }
 
+/* NOTE: 'ref' refers to a git reference, while 'rev' refers to a svn revision. */
+static char *read_ref_note(const unsigned char sha1[20]) {
+   const unsigned char *note_sha1;
+   char *msg = NULL;
+   unsigned long msglen;
+   enum object_type type;
+   init_notes(NULL, notes_ref, NULL, 0);
+   if( (note_sha1 = get_note(NULL, sha1)) == NULL ||
+   !(msg = read_sha1_file(note_sha1, &type, &msglen)) ||
+   !msglen || type != OBJ_BLOB) {
+   free(msg);
+   return NULL;
+   }
+   free_notes(NULL);
+   return msg;
+}
+
+static int parse_rev_note(const char *msg, struct rev_note *res) {
+   const char *key, *value, *end;
+   size_t len;
+   while(*msg) {
+   end = strchr(msg, '\n');
+   len = end ? end - msg : strlen(msg);
+
+   key = "Revision-number: ";
+   if(!prefixcmp(msg, key)) {
+   long i;
+   value = msg + strlen(key);
+   i = atol(value);
+   if(i < 0 || i > UINT32_MAX)
+   return 1;
+   res->rev_nr = i;
+   }
+   msg += len + 1;
+   }
+   return 0;
+}
+
 static int cmd_import(const char *line)
 {
int code;
int dumpin_fd;
-   unsigned int startrev = 0;
+   char *note_msg;
+   unsigned char head_sha1[20];
+   unsigned int startrev;
struct argv_array svndump_argv = ARGV_ARRAY_INIT;
struct child_process svndump_proc;
 
+   if(read_ref(private_ref, head_sha1))
+   startrev = 0;
+   else {
+   note_msg = read_ref_note(head_sha1);
+   if(note_msg == NULL) {
+   warning("No note found for %s.", private_ref);
+   startrev = 0;
+   }
+   else {
+   struct rev_note note = { 0 };
+   parse_rev_note(note_msg, &note);
+   startrev = note.rev_nr + 1;
+   free(note_msg);
+   }
+   }
+
if(dump_from_file) {
dumpin_fd = open(url, O_RDONLY);
if(dumpin_fd < 0) {
@@ -77,7 +135,7 @@ static int cmd_import(const char *line)
 
}
svndump_init_fd(dumpin_fd, STDIN_FILENO);
-   svndump_read(url, private_ref);
+   svndump_read(url, private_ref, notes_ref);
svndump_deinit();
svndump_reset();
 
@@ -177,6 +235,9 @@ int main(int argc, const char **argv)
strbuf_addf(&buf, "refs/svn/%s/master", remote->name);
private_ref = strbuf_detach(&buf, NULL);
 
+   strbuf_addf(&buf, "refs/notes/%s/revs", remote->name);
+   notes_ref = strbuf_detach(&buf, NULL);
+
while(1) {
if (strbuf_getline(&buf, stdin, '\n') == EOF) {
if (ferror(stdin))
@@ -192,5 +253,6 @@ int main(int argc, const char **argv)
strbuf_release(&buf);
free((void*)url);
free((void*)private_ref);
+   free((void*)notes_ref);
return 0;
 }
diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.

[PATCH/RFC v3 10/16] Create a note for every imported commit containing svn metadata.

2012-08-14 Thread Florian Achleitner
To provide metadata from svn dumps for further processing, e.g.
branch detection, attach a note to each imported commit that
stores additional information.
The notes ref is currently hard-coded to refs/notes/svn/revs.
For now, the following lines from the svn dump are accumulated
verbatim in the note. This can be refined later as needed.
- "Revision-number"
- "Node-path"
- "Node-kind"
- "Node-action"
- "Node-copyfrom-path"
- "Node-copyfrom-rev"
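
As a rough illustration, the fast-import commands emitted for one such note might be assembled like this. It is a Python sketch rather than the patch's C code; the function name and default arguments are assumptions for the example, and the stream layout (a commit on the notes ref followed by an inline 'N' notemodify command) follows what the patch below prints.

```python
def note_commit_stream(revision, note_text, author="remote-svn",
                       log="Note created by remote-svn.", timestamp=0,
                       notes_ref="refs/notes/svn/revs"):
    """Build the fast-import text that attaches note_text to mark :revision."""
    log_len = len(log.encode("utf-8"))
    note_len = len(note_text.encode("utf-8"))
    return "".join([
        "commit %s\n" % notes_ref,
        "committer %s <%s@local> %d +0000\n" % (author, author, timestamp),
        "data %d\n" % log_len,        # commit message of the notes commit
        log + "\n",
        "N inline :%d\n" % revision,  # annotate the commit marked :revision
        "data %d\n" % note_len,       # note body follows inline
        note_text + "\n",
    ])
```

Calling `note_commit_stream(5, "Revision-number: 5\n")` yields a stream whose note is attached to the commit that was imported with mark `:5`.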

Signed-off-by: Florian Achleitner 
---
 vcs-svn/fast_export.c |   13 +
 vcs-svn/fast_export.h |2 ++
 vcs-svn/svndump.c |   21 +++--
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 1ecae4b..796dd1a 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -12,6 +12,7 @@
 #include "svndiff.h"
 #include "sliding_window.h"
 #include "line_buffer.h"
+#include "cache.h"
 
 #define MAX_GITSVN_LINE_LEN 4096
 
@@ -68,6 +69,18 @@ void fast_export_modify(const char *path, uint32_t mode, const char *dataref)
putchar('\n');
 }
 
+void fast_export_begin_note(uint32_t revision, const char *author,
+   const char *log, unsigned long timestamp)
+{
+   timestamp = 1341914616;
+   size_t loglen = strlen(log);
+   printf("commit refs/notes/svn/revs\n");
+   printf("committer %s <%s@%s> %ld +0000\n", author, author, "local", timestamp);
+   printf("data %"PRIuMAX"\n", loglen);
+   fwrite(log, loglen, 1, stdout);
+   fputc('\n', stdout);
+}
+
 void fast_export_note(const char *committish, const char *dataref)
 {
printf("N %s %s\n", dataref, committish);
diff --git a/vcs-svn/fast_export.h b/vcs-svn/fast_export.h
index 9b32f1e..c2f6f11 100644
--- a/vcs-svn/fast_export.h
+++ b/vcs-svn/fast_export.h
@@ -10,6 +10,8 @@ void fast_export_deinit(void);
 void fast_export_delete(const char *path);
 void fast_export_modify(const char *path, uint32_t mode, const char *dataref);
 void fast_export_note(const char *committish, const char *dataref);
+void fast_export_begin_note(uint32_t revision, const char *author,
+   const char *log, unsigned long timestamp);
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log, const char *uuid,
const char *url, unsigned long timestamp, const char *local_ref);
diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index 288bb42..cd65b51 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -48,7 +48,7 @@ static struct {
 static struct {
uint32_t revision;
unsigned long timestamp;
-   struct strbuf log, author;
+   struct strbuf log, author, note;
 } rev_ctx;
 
 static struct {
@@ -77,6 +77,7 @@ static void reset_rev_ctx(uint32_t revision)
rev_ctx.timestamp = 0;
strbuf_reset(&rev_ctx.log);
strbuf_reset(&rev_ctx.author);
+   strbuf_reset(&rev_ctx.note);
 }
 
 static void reset_dump_ctx(const char *url)
@@ -310,8 +311,15 @@ static void begin_revision(const char *remote_ref)
 
 static void end_revision()
 {
-   if (rev_ctx.revision)
+   struct strbuf mark = STRBUF_INIT;
+   if (rev_ctx.revision) {
fast_export_end_commit(rev_ctx.revision);
+   fast_export_begin_note(rev_ctx.revision, "remote-svn",
+   "Note created by remote-svn.", rev_ctx.timestamp);
+   strbuf_addf(&mark, ":%"PRIu32, rev_ctx.revision);
+   fast_export_note(mark.buf, "inline");
+   fast_export_buf_to_data(&rev_ctx.note);
+   }
 }
 
 void svndump_read(const char *url, const char *local_ref)
@@ -358,6 +366,7 @@ void svndump_read(const char *url, const char *local_ref)
end_revision();
active_ctx = REV_CTX;
reset_rev_ctx(atoi(val));
+   strbuf_addf(&rev_ctx.note, "%s\n", t);
break;
case sizeof("Node-path"):
if (constcmp(t, "Node-"))
@@ -369,10 +378,12 @@ void svndump_read(const char *url, const char *local_ref)
begin_revision(local_ref);
active_ctx = NODE_CTX;
reset_node_ctx(val);
+   strbuf_addf(&rev_ctx.note, "%s\n", t);
break;
}
if (constcmp(t + strlen("Node-"), "kind"))
continue;
+   strbuf_addf(&rev_ctx.note, "%s\n", t);
  

[PATCH/RFC v3 08/16] Allow reading svn dumps from files via file:// urls.

2012-08-14 Thread Florian Achleitner
For testing as well as for importing large, already
available dumps, it's useful to bypass svnrdump and
replay the svndump from a file directly.

Add support for file:// urls in the remote url.
e.g. svn::file:///path/to/dump
When the remote helper finds an url starting with
file:// it tries to open that file instead of invoking svnrdump.
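
The URL handling can be pictured with a small Python sketch. The function name is hypothetical; it mirrors the C logic in the diff below (strip the `file://` prefix and URL-decode the rest, otherwise make sure the URL ends with a slash) and assumes git has already stripped the `svn::` prefix before invoking the helper.

```python
from urllib.parse import unquote


def classify_remote_url(url_in):
    """Return (dump_from_file, url) for the address passed to the helper."""
    prefix = "file://"
    if url_in.startswith(prefix):
        # Replay a dump file directly instead of invoking svnrdump.
        return True, unquote(url_in[len(prefix):])
    # Otherwise keep the URL for svnrdump, terminated with a slash.
    url = url_in if url_in.endswith("/") else url_in + "/"
    return False, url
```

With this sketch, `svn::file:///path/to/dump` ends up opening `/path/to/dump`, while any other address is handed to svnrdump.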

Signed-off-by: Florian Achleitner 
---
 contrib/svn-fe/remote-svn.c |   59 ++-
 1 file changed, 36 insertions(+), 23 deletions(-)

diff --git a/contrib/svn-fe/remote-svn.c b/contrib/svn-fe/remote-svn.c
index ce59344..df1babc 100644
--- a/contrib/svn-fe/remote-svn.c
+++ b/contrib/svn-fe/remote-svn.c
@@ -10,6 +10,7 @@
 #include "argv-array.h"
 
 static const char *url;
+static int dump_from_file;
 static const char *private_ref;
 static const char *remote_ref = "refs/heads/master";
 
@@ -54,34 +55,39 @@ static int cmd_import(const char *line)
struct argv_array svndump_argv = ARGV_ARRAY_INIT;
struct child_process svndump_proc;
 
-   memset(&svndump_proc, 0, sizeof (struct child_process));
-   svndump_proc.out = -1;
-   argv_array_push(&svndump_argv, "svnrdump");
-   argv_array_push(&svndump_argv, "dump");
-   argv_array_push(&svndump_argv, url);
-   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
-   svndump_proc.argv = svndump_argv.argv;
-
-   code = start_command(&svndump_proc);
-   if (code)
-   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
-   dumpin_fd = svndump_proc.out;
-
-   code = start_command(&svndump_proc);
-   if (code)
-   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
-   dumpin_fd = svndump_proc.out;
+   if(dump_from_file) {
+   dumpin_fd = open(url, O_RDONLY);
+   if(dumpin_fd < 0) {
+   die_errno("Couldn't open svn dump file %s.", url);
+   }
+   }
+   else {
+   memset(&svndump_proc, 0, sizeof (struct child_process));
+   svndump_proc.out = -1;
+   argv_array_push(&svndump_argv, "svnrdump");
+   argv_array_push(&svndump_argv, "dump");
+   argv_array_push(&svndump_argv, url);
+   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
+   svndump_proc.argv = svndump_argv.argv;
+
+   code = start_command(&svndump_proc);
+   if (code)
+   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
+   dumpin_fd = svndump_proc.out;
 
+   }
svndump_init_fd(dumpin_fd, STDIN_FILENO);
svndump_read(url, private_ref);
svndump_deinit();
svndump_reset();
 
close(dumpin_fd);
-   code = finish_command(&svndump_proc);
-   if (code)
-   warning("%s, returned %d", svndump_proc.argv[0], code);
-   argv_array_clear(&svndump_argv);
+   if(!dump_from_file) {
+   code = finish_command(&svndump_proc);
+   if (code)
+   warning("%s, returned %d", svndump_proc.argv[0], code);
+   argv_array_clear(&svndump_argv);
+   }
 
return 0;
 }
@@ -158,8 +164,15 @@ int main(int argc, const char **argv)
if (argc == 3)
url_in = argv[2];
 
-   end_url_with_slash(&buf, url_in);
-   url = strbuf_detach(&buf, NULL);
+   if (!prefixcmp(url_in, "file://")) {
+   dump_from_file = 1;
+   url = url_decode(url_in + sizeof("file://")-1);
+   }
+   else {
+   dump_from_file = 0;
+   end_url_with_slash(&buf, url_in);
+   url = strbuf_detach(&buf, NULL);
+   }
 
strbuf_addf(&buf, "refs/svn/%s/master", remote->name);
private_ref = strbuf_detach(&buf, NULL);
-- 
1.7.9.5



[PATCH/RFC v3 06/16] remote-svn, vcs-svn: Enable fetching to private refs.

2012-08-14 Thread Florian Achleitner
The reference to update by the fast-import stream is hard-coded.
When fetching from a remote the remote-helper shall update refs
in a private namespace, i.e. a private subdir of refs/.
This namespace is defined by the 'refspec' capability, which the
remote-helper advertises as a reply to the 'capabilities' command.

Extend svndump and fast-export to allow passing the target ref.
Update svn-fe to be compatible.
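
To illustrate how a refspec selects the private target ref, here is a simplified Python sketch. It is not git's refspec parser: the function name is made up, a leading '+' is not handled, only a trailing '*' wildcard is supported, and the remote name 'origin' used below is just an example.

```python
def map_ref(refspec, ref):
    """Map ref through a 'src:dst' refspec; return None if it does not match."""
    src, dst = refspec.split(":", 1)
    if src.endswith("*") and dst.endswith("*"):
        stem = src[:-1]
        if ref.startswith(stem):
            # Carry the part matched by the wildcard over to the destination.
            return dst[:-1] + ref[len(stem):]
        return None
    return dst if ref == src else None
```

For example, `map_ref("refs/heads/master:refs/svn/origin/master", "refs/heads/master")` gives the private ref `refs/svn/origin/master`, which is the ref the fast-import stream should then update.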

Signed-off-by: Florian Achleitner 
---
- fix hard-coded ref in test-svn-fe.c. Broke a testcase.

 contrib/svn-fe/svn-fe.c |2 +-
 test-svn-fe.c   |2 +-
 vcs-svn/fast_export.c   |4 ++--
 vcs-svn/fast_export.h   |2 +-
 vcs-svn/svndump.c   |   14 +++---
 vcs-svn/svndump.h   |2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.c
index 35db24f..c796cc0 100644
--- a/contrib/svn-fe/svn-fe.c
+++ b/contrib/svn-fe/svn-fe.c
@@ -10,7 +10,7 @@ int main(int argc, char **argv)
 {
if (svndump_init(NULL))
return 1;
-   svndump_read((argc > 1) ? argv[1] : NULL);
+   svndump_read((argc > 1) ? argv[1] : NULL, "refs/heads/master");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/test-svn-fe.c b/test-svn-fe.c
index 83633a2..cb0d80f 100644
--- a/test-svn-fe.c
+++ b/test-svn-fe.c
@@ -40,7 +40,7 @@ int main(int argc, char *argv[])
if (argc == 2) {
if (svndump_init(argv[1]))
return 1;
-   svndump_read(NULL);
+   svndump_read(NULL, "refs/heads/master");
svndump_deinit();
svndump_reset();
return 0;
diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 1f04697..11f8f94 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -72,7 +72,7 @@ static char gitsvnline[MAX_GITSVN_LINE_LEN];
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log,
const char *uuid, const char *url,
-   unsigned long timestamp)
+   unsigned long timestamp, const char *local_ref)
 {
static const struct strbuf empty = STRBUF_INIT;
if (!log)
@@ -84,7 +84,7 @@ void fast_export_begin_commit(uint32_t revision, const char *author,
} else {
*gitsvnline = '\0';
}
-   printf("commit refs/heads/master\n");
+   printf("commit %s\n", local_ref);
printf("mark :%"PRIu32"\n", revision);
printf("committer %s <%s@%s> %ld +0000\n",
   *author ? author : "nobody",
diff --git a/vcs-svn/fast_export.h b/vcs-svn/fast_export.h
index 8823aca..17eb13b 100644
--- a/vcs-svn/fast_export.h
+++ b/vcs-svn/fast_export.h
@@ -11,7 +11,7 @@ void fast_export_delete(const char *path);
 void fast_export_modify(const char *path, uint32_t mode, const char *dataref);
 void fast_export_begin_commit(uint32_t revision, const char *author,
const struct strbuf *log, const char *uuid,
-   const char *url, unsigned long timestamp);
+   const char *url, unsigned long timestamp, const char *local_ref);
 void fast_export_end_commit(uint32_t revision);
 void fast_export_data(uint32_t mode, off_t len, struct line_buffer *input);
 void fast_export_blob_delta(uint32_t mode,
diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index d81a078..288bb42 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -299,22 +299,22 @@ static void handle_node(void)
node_ctx.text_length, &input);
 }
 
-static void begin_revision(void)
+static void begin_revision(const char *remote_ref)
 {
if (!rev_ctx.revision)  /* revision 0 gets no git commit. */
return;
fast_export_begin_commit(rev_ctx.revision, rev_ctx.author.buf,
&rev_ctx.log, dump_ctx.uuid.buf, dump_ctx.url.buf,
-   rev_ctx.timestamp);
+   rev_ctx.timestamp, remote_ref);
 }
 
-static void end_revision(void)
+static void end_revision()
 {
if (rev_ctx.revision)
fast_export_end_commit(rev_ctx.revision);
 }
 
-void svndump_read(const char *url)
+void svndump_read(const char *url, const char *local_ref)
 {
char *val;
char *t;
@@ -353,7 +353,7 @@ void svndump_read(const char *url)
if (active_ctx == NODE_CTX)
handle_node();
if (active_ctx == REV_CTX)
-   begin_revision();
+   begin_revision(local_ref);
if (active_ctx != DUMP_CTX)
end_revision();
active_ctx = REV_CTX;
@@ -366,7 +366,7 @@ void svndump_rea

[PATCH/RFC v3 05/16] Add documentation for the 'bidi-import' capability of remote-helpers.

2012-08-14 Thread Florian Achleitner
Signed-off-by: Florian Achleitner 
---
 Documentation/git-remote-helpers.txt |   21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-remote-helpers.txt b/Documentation/git-remote-helpers.txt
index f5836e4..5faa48e 100644
--- a/Documentation/git-remote-helpers.txt
+++ b/Documentation/git-remote-helpers.txt
@@ -98,6 +98,20 @@ advertised with this capability must cover all refs reported by
 the list command.  If no 'refspec' capability is advertised,
 there is an implied `refspec *:*`.
 
+'bidi-import'::
+	The fast-import commands 'cat-blob' and 'ls' can be used by remote-helpers
+	to retrieve information about blobs and trees that already exist in
+	fast-import's memory. This requires a channel from fast-import to the
+	remote-helper.
+	If it is advertised in addition to "import", git establishes a pipe from
+	fast-import to the remote-helper's stdin.
+	It follows that git and fast-import are both connected to the
+	remote-helper's stdin. Because git can send multiple commands to
+	the remote-helper it is required that helpers that use 'bidi-import'
+	buffer all 'import' commands of a batch before sending data to fast-import.
+	This is to prevent mixing commands and fast-import responses on the
+	helper's stdin.
+
 Capabilities for Pushing
 
 'connect'::
@@ -286,7 +300,12 @@ terminated with a blank line. For each batch of 'import', the remote
 helper should produce a fast-import stream terminated by a 'done'
 command.
 +
-Supported if the helper has the "import" capability.
+Note that if the 'bidi-import' capability is used the complete batch
+sequence has to be buffered before starting to send data to fast-import
+to prevent mixing of commands and fast-import responses on the helper's
+stdin.
++
+Supported if the helper has the 'import' capability.
 
 'connect' ::
Connects to given service. Standard input and standard output
-- 
1.7.9.5
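
The buffering requirement documented in this patch can be sketched in Python as follows. This is an illustrative helper, not part of the series: the function name is made up, and it only shows the idea that a helper reads a whole batch up to the terminating blank line before it starts writing to fast-import, so that later input on stdin can only be fast-import responses.

```python
import io


def read_batches(stream):
    """Yield one list of buffered command lines per batch.

    A blank line ends the current batch; a blank line while no batch is
    open ends the command stream.
    """
    batch = []
    for raw in stream:
        line = raw.rstrip("\n")
        if line == "":
            if not batch:
                return          # end of command stream
            yield batch         # whole batch is buffered, safe to process
            batch = []
        else:
            batch.append(line)


commands = io.StringIO("import refs/heads/master\nimport refs/heads/topic\n\n\n")
batches = list(read_batches(commands))
```

Only after a batch has been yielded would the helper start emitting its fast-import stream for the buffered 'import' lines.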



[PATCH/RFC v3 03/16] Add svndump_init_fd to allow reading dumps from arbitrary FDs.

2012-08-14 Thread Florian Achleitner
The existing function only allows reading from a filename or
from stdin. Allow passing an input FD and an additional FD for
the back-report pipe. This allows us to retrieve the name of
the pipe in the caller.
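
The reason for duplicating the input descriptor (noted in the changelog below) can be shown with a short Python sketch. The function name is hypothetical, and Python's `os.dup` stands in for the `xdup` call used in the patch: the reader closes its own duplicate when it is done, while the caller's descriptor stays valid.

```python
import os


def read_all_from_dup(fd):
    """Read everything from a duplicate of fd, closing only the duplicate."""
    dup_fd = os.dup(fd)
    with os.fdopen(dup_fd, "rb") as f:  # closing f closes dup_fd, not fd
        return f.read()


read_end, write_end = os.pipe()
os.write(write_end, b"SVN-fs-dump-format-version: 3\n")
os.close(write_end)
data = read_all_from_dup(read_end)
os.fstat(read_end)  # would raise OSError if read_end had been closed
os.close(read_end)
```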

Also fix the bug where a NULL filename could be passed to the error message.

Signed-off-by: Florian Achleitner 
---
- dup input file descriptor, because buffer_deinit closes the fd.
 vcs-svn/svndump.c |   22 ++
 vcs-svn/svndump.h |1 +
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
index 2b168ae..d81a078 100644
--- a/vcs-svn/svndump.c
+++ b/vcs-svn/svndump.c
@@ -468,11 +468,9 @@ void svndump_read(const char *url)
end_revision();
 }
 
-int svndump_init(const char *filename)
+static void init(int report_fd)
 {
-   if (buffer_init(&input, filename))
-   return error("cannot open %s: %s", filename, strerror(errno));
-   fast_export_init(REPORT_FILENO);
+   fast_export_init(report_fd);
strbuf_init(&dump_ctx.uuid, 4096);
strbuf_init(&dump_ctx.url, 4096);
strbuf_init(&rev_ctx.log, 4096);
@@ -482,6 +480,22 @@ int svndump_init(const char *filename)
reset_dump_ctx(NULL);
reset_rev_ctx(0);
reset_node_ctx(NULL);
+   return;
+}
+
+int svndump_init(const char *filename)
+{
+   if (buffer_init(&input, filename))
+   return error("cannot open %s: %s", filename ? filename : "NULL", strerror(errno));
+   init(REPORT_FILENO);
+   return 0;
+}
+
+int svndump_init_fd(int in_fd, int back_fd)
+{
+   if(buffer_fdinit(&input, xdup(in_fd)))
+   return error("cannot open fd %d: %s", in_fd, strerror(errno));
+   init(xdup(back_fd));
return 0;
 }
 
diff --git a/vcs-svn/svndump.h b/vcs-svn/svndump.h
index df9ceb0..acb5b47 100644
--- a/vcs-svn/svndump.h
+++ b/vcs-svn/svndump.h
@@ -2,6 +2,7 @@
 #define SVNDUMP_H_
 
 int svndump_init(const char *filename);
+int svndump_init_fd(int in_fd, int back_fd);
 void svndump_read(const char *url);
 void svndump_deinit(void);
 void svndump_reset(void);
-- 
1.7.9.5



[PATCH/RFC v3 01/16] Implement a remote helper for svn in C.

2012-08-14 Thread Florian Achleitner
Enable basic fetching from subversion repositories. When processing remote URLs
starting with svn::, git invokes this remote-helper.
It starts svnrdump to extract revisions from the subversion repository in the
'dump file format', and converts them to a git-fast-import stream using
the functions of vcs-svn/.

Imported refs are created in a private namespace at 
refs/svn/
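
For orientation, the helper's replies and the svnrdump invocation from the diff below can be mirrored in a few lines of Python. This is only a sketch of what the C code prints and executes; the function names are invented for the example, and the ref and URL values used with them are placeholders.

```python
def capabilities_reply(remote_ref, private_ref):
    """Text the helper prints in response to the 'capabilities' command."""
    return "import\nbidi-import\nrefspec %s:%s\n\n" % (remote_ref, private_ref)


def list_reply(remote_ref):
    """Text printed for 'list': the ref value is not known in advance ('?')."""
    return "? %s\n\n" % remote_ref


def svnrdump_argv(url, startrev=0):
    """Argument vector used to dump revisions startrev..HEAD of url."""
    return ["svnrdump", "dump", url, "-r%d:HEAD" % startrev]
```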
---
diff:
- incorporate review
- remove redundant strbuf_init
- add 'bidi-import' to capabilities
- buffer all lines of a command batch in string_list

 contrib/svn-fe/remote-svn.c |  183 +++
 1 file changed, 183 insertions(+)
 create mode 100644 contrib/svn-fe/remote-svn.c

diff --git a/contrib/svn-fe/remote-svn.c b/contrib/svn-fe/remote-svn.c
new file mode 100644
index 0000000..ce59344
--- /dev/null
+++ b/contrib/svn-fe/remote-svn.c
@@ -0,0 +1,183 @@
+
+#include "cache.h"
+#include "remote.h"
+#include "strbuf.h"
+#include "url.h"
+#include "exec_cmd.h"
+#include "run-command.h"
+#include "svndump.h"
+#include "notes.h"
+#include "argv-array.h"
+
+static const char *url;
+static const char *private_ref;
+static const char *remote_ref = "refs/heads/master";
+
+static int cmd_capabilities(const char *line);
+static int cmd_import(const char *line);
+static int cmd_list(const char *line);
+
+typedef int (*input_command_handler)(const char *);
+struct input_command_entry {
+   const char *name;
+   input_command_handler fct;
+   unsigned char batchable;/* whether the command starts or is part of a batch */
+};
+
+static const struct input_command_entry input_command_list[] = {
+   { "capabilities", cmd_capabilities, 0 },
+   { "import", cmd_import, 1 },
+   { "list", cmd_list, 0 },
+   { NULL, NULL }
+};
+
+static int cmd_capabilities(const char *line) {
+   printf("import\n");
+   printf("bidi-import\n");
+   printf("refspec %s:%s\n\n", remote_ref, private_ref);
+   fflush(stdout);
+   return 0;
+}
+
+static void terminate_batch(void)
+{
+   /* terminate a current batch's fast-import stream */
+   printf("done\n");
+   fflush(stdout);
+}
+
+static int cmd_import(const char *line)
+{
+   int code;
+   int dumpin_fd;
+   unsigned int startrev = 0;
+   struct argv_array svndump_argv = ARGV_ARRAY_INIT;
+   struct child_process svndump_proc;
+
+   memset(&svndump_proc, 0, sizeof (struct child_process));
+   svndump_proc.out = -1;
+   argv_array_push(&svndump_argv, "svnrdump");
+   argv_array_push(&svndump_argv, "dump");
+   argv_array_push(&svndump_argv, url);
+   argv_array_pushf(&svndump_argv, "-r%u:HEAD", startrev);
+   svndump_proc.argv = svndump_argv.argv;
+
+   code = start_command(&svndump_proc);
+   if (code)
+   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
+   dumpin_fd = svndump_proc.out;
+
+   code = start_command(&svndump_proc);
+   if (code)
+   die("Unable to start %s, code %d", svndump_proc.argv[0], code);
+   dumpin_fd = svndump_proc.out;
+
+   svndump_init_fd(dumpin_fd, STDIN_FILENO);
+   svndump_read(url, private_ref);
+   svndump_deinit();
+   svndump_reset();
+
+   close(dumpin_fd);
+   code = finish_command(&svndump_proc);
+   if (code)
+   warning("%s, returned %d", svndump_proc.argv[0], code);
+   argv_array_clear(&svndump_argv);
+
+   return 0;
+}
+
+static int cmd_list(const char *line)
+{
+   printf("? %s\n\n", remote_ref);
+   fflush(stdout);
+   return 0;
+}
+
+static int do_command(struct strbuf *line)
+{
+   const struct input_command_entry *p = input_command_list;
+   static struct string_list batchlines = STRING_LIST_INIT_DUP;
+   static const struct input_command_entry *batch_cmd;
+   /*
+* commands can be grouped together in a batch.
+* Batches are ended by \n. If no batch is active the program ends.
+* During a batch all lines are buffered and passed to the handler function
+* when the batch is terminated.
+*/
+   if (line->len == 0) {
+   if (batch_cmd) {
+   struct string_list_item *item;
+   for_each_string_list_item(item, &batchlines)
+   batch_cmd->fct(item->string);
+   terminate_batch();
+   batch_cmd = NULL;
+   string_list_clear(&batchlines, 0);
+   return 0;   /* end of the batch, continue reading other commands. */
+   }
+   return 1;   /* end of command stream, quit */
+   }
+   if (batch_cmd) {
+   if (strcmp(batch_cmd->name, line->buf))
+   die("Active %s batch interrupted by %s", batch_cmd->name, line->buf);
+   /* buffer batch lines */
+   string_list_appen

[PATCH/RFC v3 00/16] GSOC remote-svn

2012-08-14 Thread Florian Achleitner
Hi.

Version 3 of this series adds the 'bidi-import' capability, as suggested
by Jonathan.
Diff details are attached to the patches.
04 and 05 are completely new.

[PATCH/RFC v3 01/16] Implement a remote helper for svn in C.
[PATCH/RFC v3 02/16] Integrate remote-svn into svn-fe/Makefile.
[PATCH/RFC v3 03/16] Add svndump_init_fd to allow reading dumps from
[PATCH/RFC v3 04/16] Connect fast-import to the remote-helper via
[PATCH/RFC v3 05/16] Add documentation for the 'bidi-import'
[PATCH/RFC v3 06/16] remote-svn, vcs-svn: Enable fetching to private
[PATCH/RFC v3 07/16] Add a symlink 'git-remote-svn' in base dir.
[PATCH/RFC v3 08/16] Allow reading svn dumps from files via file://
[PATCH/RFC v3 09/16] vcs-svn: add fast_export_note to create notes
[PATCH/RFC v3 10/16] Create a note for every imported commit
[PATCH/RFC v3 11/16] When debug==1, start fast-import with "--stats"
[PATCH/RFC v3 12/16] remote-svn: add incremental import.
[PATCH/RFC v3 13/16] Add a svnrdump-simulator replaying a dump file
[PATCH/RFC v3 14/16] transport-helper: add import|export-marks to
[PATCH/RFC v3 15/16] remote-svn: add marks-file regeneration.
[PATCH/RFC v3 16/16] Add a test script for remote-svn.


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-12 Thread Florian Achleitner
On Sunday 12 August 2012 09:12:58 Jonathan Nieder wrote:
> Hi again,
> 
> Florian Achleitner wrote:
> > back to the pipe-topic.
> 
> Ok, thanks.
> 
> [...]
> 
> >> The way it's supposed to work is that in a bidi-import, the remote
> >> helper reads in the entire list of refs to be imported and only once
> >> the newline indicating that that list is over arrives starts writing
> >> its fast-import stream.
> 
> [...]
> 
> > This would require all existing remote helpers that use 'import' to be
> > ported to the new concept, right? Probably there is no other..
> 
> You mean all existing remote helpers that use 'bidi-import', right?
> There are none.

Ok, it would not affect the existing import command.
> 
> [...]
> 
> > I still don't believe that sharing the input pipe of the remote helper is
> > worth the hazzle.
> > It still requires an additional pipe to be setup, the one from fast-import
> > to the remote-helper, sharing one FD at the remote helper.
> 
> If I understand correctly, you misunderstood how sharing the input
> pipe works.  Have you tried it?

Yes, I wrote a test program; sharing works, that's not the problem.

> 
> It does not involve setting up an additional pipe.  Standard input for
> the remote helper is already a pipe.  That pipe is what allows
> transport-helper.c to communicate with the remote helper.  Letting
> fast-import share that pipe involves passing that file descriptor to
> git fast-import.  No additional pipe() calls.
> 
> Do you mean that it would be too much work to implement?  This
> explanation just doesn't make sense to me, given that the version
> using pipe() *already* *exists* and is *tested*.

Yes, that was the first version I wrote, and it is what remote-svn-alpha uses.

> 
> I get the feeling I am missing something very basic.  I would welcome
> input from others that shows what I am missing.
> 

This is how I see it; probably it's all wrong:
I thought the main problem is that we don't want processes to have *more than
three pipes attached*, i.e. stdout, stdin, stderr, because existing APIs don't
allow it.
When we share stdin of the remote helper, we achieve this goal for this one
process, but fast-import still has an additional pipe:
stdout  --> shell;
stderr --> shell;
stdin <-- remote-helper;
additional_pipe --> remote-helper.

That's what I wanted to say: we still have more than three pipes on
fast-import.
And we need to pass that fourth file descriptor by inheritance and its
number as a command line argument.
So if we make the remote-helper have only three pipes by double-using stdin,
but fast-import still has four pipes, what problem does it solve?

Using fifos would remove the requirement to inherit more than three pipes. 
That's my point.
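
The inheritance step described above (hand the child an extra descriptor and tell it the number on the command line) can be sketched as follows. This is only an illustration under assumptions: the helper name prepare_backchannel_arg is hypothetical and not part of git; only the --cat-blob-fd option itself comes from the thread.

```c
#define _POSIX_C_SOURCE 200809L	/* expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Make `fd` survive exec() and format the fast-import option naming it.
 * On success `arg` holds e.g. "--cat-blob-fd=5" and 0 is returned;
 * -1 is returned on error. (Hypothetical helper, not git code.)
 */
static int prepare_backchannel_arg(int fd, char *arg, size_t size)
{
	int flags, n;

	assert(arg != NULL);
	flags = fcntl(fd, F_GETFD);
	if (flags < 0)
		return -1;
	/* Clear close-on-exec so the child process inherits the descriptor. */
	if (fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) < 0)
		return -1;
	n = snprintf(arg, size, "--cat-blob-fd=%d", fd);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The caller would create the pipe before forking, call this on the write end, and append the resulting string to the child's argument list. This is exactly the POSIX-specific part that the portability concern is about.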

[..]
> 
> Meanwhile it would:
> 
>  - be 100% functionally equivalent to the solution where fast-import
>    writes directly to the remote helper's standard input.  Two programs
>    can have the same pipe open for writing at the same time for a few
>    seconds and that is *perfectly fine*.  On Unix and on Windows.
> 
>    On Windows the only complication with the pipe()-based version is that
>    we haven't wired up the low-level logic to pass file descriptors other
>    than stdin, stdout, stderr to child processes; and if I have understood
>    earlier messages correctly, the operating system *does* have a
>    concept of that and this is just a todo item in msys
>    implementation.

I dug into MSDN and it seems it's not a problem at all on the Windows API
layer. Pipe handles can be inherited. [1]
Once the low-level logic supports passing more than 3 fds, it will work for
fast-import as well as the remote-helper.

> 
>  - be more complicated than the code that already exists for this
>    stuff.
> 
> So while I presented this as a compromise, I don't see the point.
> 
> Is your goal portability, a dislike of the interface, some
> implementation detail I have missed, or something else?  Could you
> explain the problem as concisely but clearly as possible (perhaps
> using an example) so that others like Sverre, Peff, or David can help
> think through it and to explain it in a way that dim people like me
> understand what's going on?

It all started as a portability-only discussion. On Linux, my first version would
have worked. It created an additional pipe before forking using pipe(). It runs
great, and it works the same way as remote-svn-alpha.sh.

I wouldn't have started to produce something else or start a discussion on my 
own. But I was told, it's not good because of portability. This is the root of 
this endless story. (you already know the thread, I think). Since weeks nobo

Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-12 Thread Florian Achleitner
Hi,

back to the pipe-topic.

On Wednesday 01 August 2012 12:42:48 Jonathan Nieder wrote:
> Hi again,
> 
> Florian Achleitner wrote:
> > When the first line arrives at the remote-helper, it starts importing one
> > line at a time, leaving the remaining lines in the pipe.
> > For importing it requires the data from fast-import, which would be mixed
> > with import lines or queued at the end of them.
> 
> Oh, good catch.
> 
> The way it's supposed to work is that in a bidi-import, the remote
> helper reads in the entire list of refs to be imported and only once
> the newline indicating that that list is over arrives starts writing
> its fast-import stream.  We could make this more obvious by not
> spawning fast-import until immediately before writing that newline.
> 
> This needs to be clearly documented in the git-remote-helpers(1) page
> if the bidi-import command is introduced.
> 
> If a remote helper writes commands for fast-import before that newline
> comes, that is a bug in the remote helper, plain and simple.  It might
> be fun to diagnose this problem:

This would require all existing remote helpers that use 'import' to be ported 
to the new concept, right? Probably there is no other..

> 
>   static void pipe_drained_or_die(int fd, const char *msg)
>   {
>   	char buf[1];
>   	int flags = fcntl(fd, F_GETFL);
>   	if (flags < 0)
>   		die_errno("cannot get pipe flags");
>   	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK))
>   		die_errno("cannot set up non-blocking pipe read");
>   	if (read(fd, buf, 1) > 0)
>   		die("%s", msg);
>   	if (fcntl(fd, F_SETFL, flags))
>   		die_errno("cannot restore pipe flags");
>   }
>   ...
> 
>   for (i = 0; i < nr_heads; i++) {
>   	write "import %s\n", to_fetch[i]->name;
>   }
> 
>   if (getenv("GIT_REMOTE_HELPERS_SLOW_SANITY_CHECK"))
>   	sleep(1);
> 
>   pipe_drained_or_die("unexpected output from remote helper before fast-import launch");
> 
>   if (get_importer(transport, &fastimport))
>   	die("couldn't run fast-import");
>   write_constant(data->helper->in, "\n");

I still don't believe that sharing the input pipe of the remote helper is
worth the hassle.
It still requires an additional pipe to be set up, the one from fast-import to
the remote-helper, sharing one FD at the remote helper.
It still requires more than just stdin, stdout, stderr.

I would suggest using a fifo. It can be opened independently, after forking,
and Windows has named pipes with similar semantics, so I think this
could be easily ported.
I would suggest the following changes:
- add a capability to the remote helper 'bidi-import', or 'bidi-pipe'. This 
signals that the remote helper requires data from fast-import.

- add a command 'bidi-import', or 'bidi-pipe', that tells the remote helper
which filename the fifo is at, so that it can open it and read it when it
handles 'import' commands.

- transport-helper.c creates the fifo on demand, i.e. on seeing the capability, 
in the gitdir or in /tmp.

- fast-import gets the name of the fifo as a command-line argument. The 
alternative would be to add a command, but that's not allowed, because it 
changes the stream semantics.
Another alternative would be to use the existing --cat-blob-fd argument. But
that requires opening the fifo before execing fast-import and makes us
dependent on the POSIX model of forking and inheriting file descriptors, while
opening a fifo in fast-import would not.
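
The "create the fifo on demand" step in the list above might look roughly like this. It is a sketch only: the function names, the 0600 mode, and the idea of passing directory and file name separately are assumptions for illustration, not what transport-helper.c does.

```c
#define _POSIX_C_SOURCE 200809L	/* expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a fifo named "<dir>/<name>", accessible only by the owner, and
 * store its path in `path`. Returns 0 on success, -1 on error (for
 * example when the name is already taken or the path does not fit).
 */
static int create_backchannel_fifo(const char *dir, const char *name,
				   char *path, size_t size)
{
	int n;

	assert(dir != NULL && name != NULL && path != NULL);
	n = snprintf(path, size, "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= size)
		return -1;
	return mkfifo(path, 0600);
}

/* Remove the fifo again once the import is finished. */
static int remove_backchannel_fifo(const char *path)
{
	return unlink(path);
}
```

The resulting path is what would be handed to the remote helper and to fast-import, each of which opens it on its own after being started, so no descriptor has to be inherited.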


Re: GSOC remote-svn: branch detection

2012-08-07 Thread Florian Achleitner
On Saturday 04 August 2012 23:53:58 Ramkumar Ramachandra wrote:
> Hi,
> 
> Florian Achleitner wrote:
> > 1. Import linearly and split later:
> I think this approach will be a lot less messy if you can cleanly
> separate the fetching component from the mapper.  Currently, svndump
> re-creates the layout of the SVN repository.  And the series you
> posted last week contains a patch that attaches a note with SVN
> metadata to each commit.  Do you have thoughts on how the mapping will
> take place?

The mapping itself is currently a black box for me; its internals could be
rather complex. It could get a function like is_branch_start, that is called
with a node ctx and tells if this is likely to be the start of a branch. The
detected branches are stored and upcoming changes in the associated
directories are mapped to a commit on a branch.
The detection of branch starts and the list of existing branches can be taken 
from whatever logic we want. So that's approx. the idea.
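
As a rough illustration of what such an is_branch_start predicate could look like, here is a minimal sketch. The struct node_info is a hypothetical, reduced stand-in for the node context in svndump.c, and the rule used (a directory copied to a direct child of "branches/") is just one simple heuristic assumed for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the node context kept in svndump.c. */
struct node_info {
	const char *path;		/* Node-path */
	const char *copyfrom_path;	/* Node-copyfrom-path, or NULL */
	int is_dir;			/* non-zero for Node-kind: dir */
};

/*
 * Return 1 if the node looks like the creation of a branch: a directory
 * copied to a direct child of "branches/", e.g. "branches/topic".
 * Return 0 otherwise.
 */
static int is_branch_start(const struct node_info *node)
{
	static const char prefix[] = "branches/";
	const char *name;

	assert(node != NULL && node->path != NULL);
	if (!node->is_dir || !node->copyfrom_path)
		return 0;
	if (strncmp(node->path, prefix, strlen(prefix)) != 0)
		return 0;
	name = node->path + strlen(prefix);
	/* Require a non-empty name with no further path component. */
	return *name != '\0' && strchr(name, '/') == NULL;
}
```

Because the predicate only looks at one node, the surrounding logic can be swapped for something smarter later without touching the callers.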

Currently I'm working on more basic preparations. I want to split the creation 
of commits and the creation of blobs in svndump.c.
This is necessary because fast-import requires a branch name as an argument to
the 'commit' command, and currently a 'commit' command is started when a new
revision is encountered in the svndump.
But to decide on which branch the commit should go, or even if it will be more 
than one commit, it is necessary to read all the nodes first.
To prevent buffering the node content, I want to replace the inline data format 
(currently used) by 'blob' commands.
While parsing the dump, every node change creates a blob command to feed the 
data immediately into fast-import while the node metadata (struct node_ctx) is 
stored at least until the revision ends. Then the blobs can be put on a linear 
master tree and other branch trees. The node metadata could also be read from 
notes, if remapping branches.
That's not so easy to do, because the current implementation mixes tree
operations and blob operations heavily, and relies on only one global
node_ctx.

> 
> Ram

Flo


GSOC remote-svn: branch detection

2012-08-03 Thread Florian Achleitner
Hi!

I'm playing around in vcs-svn/ to start a framework for detecting and
processing branches in svndumps. So I wanted to let you know about my ideas.

Two approaches:
1. Import linearly and split later:
One idea is to import from svn linearly, i.e. one revision on top of its
predecessor, like now, and detect and split branches afterwards. The svn 
metadata is stored in git notes, so the required information would be 
available.
+ allows recovery, because the linear history is always here.
+ it's easier to peek around in the git history than in the svn dump during 
import to do the branch detection.
- requires creation of new commits in the branch detection stage.
- this results in double commits and awkward history, linear vs. branched.

2. Split during import:
Detect branches as they are created while reading the svn dump and identify to 
which branch a following node belongs.
First step is to restructure svndump.c to be able to buffer one complete 
revision for inspection before starting to write a commit to fast import.
Probably it's possible to feed the blobs to fast import directly and only 
buffer node data and defer commit creation, but not the data.
Currently, at the beginning of a new revision on the svn side, a new commit is
created on top of a constant ref. When we support branches, we don't know the
ref, i.e. the branch(es) the revision changes, before reading all the
'Node-*' lines.
+ feels more 'right'
- requires revision buffering

Generally:
Detect branches as they are created by 'Node-copyfrom*' to some commonly used 
branch directories, like branches/. More complex branch detection can be 
implemented later, of course.
Store detected branches permanently (necessary for incremental fetches), and 
assign every file modification to one of those branches, if possible. Else 
assign them to, hm .. 
If a revision modifies more than one branch, create multiple commits.
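
Assigning a modified path to one of the stored branches could be sketched as a simple prefix lookup. The function name and the flat array of branch directories are assumptions made for this example; what to do with paths that match no branch is left open, as in the text above, so the sketch just returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the entry of `dirs` that contains `path` (the directory itself
 * or anything below it), or NULL if the path belongs to no known branch.
 * `dirs` holds `nr` branch directories such as "trunk" or
 * "branches/topic". (Hypothetical helper, not part of vcs-svn.)
 */
static const char *branch_for_path(const char **dirs, size_t nr,
				   const char *path)
{
	size_t i;

	assert(path != NULL);
	for (i = 0; i < nr; i++) {
		size_t len = strlen(dirs[i]);

		/* Match "dir" itself or a path starting with "dir/". */
		if (!strncmp(path, dirs[i], len) &&
		    (path[len] == '\0' || path[len] == '/'))
			return dirs[i];
	}
	return NULL;
}
```

A revision whose nodes map to more than one distinct return value would then be the case that needs multiple commits.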

Thanks for your comments and ideas! 

--
Florian


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-08-01 Thread Florian Achleitner
On Tuesday 31 July 2012 15:43:57 Jonathan Nieder wrote:
> Florian Achleitner wrote:
> > I haven't tried that yet, nor do I remember anything where I've already
> > seen two processes writing to the same pipe.
> 
> It's a perfectly normal and well supported thing to do.

I played around with a little test program. It generally works.
I'm still not convinced that this doesn't cause more problems than it can 
solve.
The standard defines that write calls to pipe fds are atomic, i.e. data is not 
interleaved with data from other processes, if the data is less than PIPE_BUF 
[1].
We would need some kind of locking/synchronization to make it work for sure, 
while I believe it will work most of the time.
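
To make the PIPE_BUF rule concrete, here is a minimal sketch of a writer that only ever issues writes small enough to be atomic. The function name and the choice to reject longer lines with EMSGSIZE are assumptions for this example; it does not solve the ordering problem between the two writers, only the interleaving within a single line.

```c
#define _POSIX_C_SOURCE 200809L	/* expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <string.h>
#include <unistd.h>

/*
 * Write one complete line to a pipe in a single write() call.
 * POSIX only guarantees that the bytes are not interleaved with data
 * from other writers when the request is at most PIPE_BUF bytes, so
 * longer lines are rejected with EMSGSIZE instead of being split.
 * Returns 0 on success, -1 on error.
 */
static int write_line_atomic(int fd, const char *line)
{
	size_t len;
	ssize_t n;

	assert(line != NULL);
	len = strlen(line);
	if (len > PIPE_BUF) {
		errno = EMSGSIZE;
		return -1;
	}
	do {
		n = write(fd, line, len);
	} while (n < 0 && errno == EINTR);
	return (n == (ssize_t)len) ? 0 : -1;
}
```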

Currently it runs like this:
transport-helper.c writes one or more 'import ' lines, we don't know in 
advance how many and how long they are. Then it waits for fast-import to 
finish.

When the first line arrives at the remote-helper, it starts importing one line 
at a time, leaving the remaining lines in the pipe.
For importing it requires the data from fast-import, which would be mixed with 
import lines or queued at the end of them.

[1] 
http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-31 Thread Florian Achleitner
On Monday 30 July 2012 11:55:02 Jonathan Nieder wrote:
> Florian Achleitner wrote:
> > Hm .. that would mean, that both fast-import and git (transport-helper)
> > would write to the remote-helper's stdin, right?
> 
> Yes, first git writes the list of refs to import, and then fast-import
> writes feedback during the import.  Is that a problem?

I haven't tried that yet, nor do I remember anything where I've already seen 
two processes writing to the same pipe.
At least it sounds cumbersome to me. Processes' lifetimes overlap, so buffering 
and flushing could mix data.
We have to use it for both purposes interchangeably because there can be more
than one import command to the remote-helper, of course.

Will try that in test-program..


Re: [RFC v2 01/16] Implement a remote helper for svn in C.

2012-07-31 Thread Florian Achleitner
On Monday 30 July 2012 09:28:27 Junio C Hamano wrote:
> Florian Achleitner  writes:
> > Enables basic fetching from subversion repositories. When processing
> > Remote URLs starting with svn::, git invokes this remote-helper.
> > It starts svnrdump to extract revisions from the subversion repository in
> > the 'dump file format', and converts them to a git-fast-import stream
> > using the functions of vcs-svn/.
> > 
> > Imported refs are created in a private namespace at
> > refs/svn/ > (no branch detection) and completely, i.e. from revision 0 to HEAD.
> > 
> > Signed-off-by: Florian Achleitner 
> > ---
> > 
> >  contrib/svn-fe/remote-svn.c |  190 +++
> >  1 file changed, 190 insertions(+)
> >  create mode 100644 contrib/svn-fe/remote-svn.c
> > 
> > diff --git a/contrib/svn-fe/remote-svn.c b/contrib/svn-fe/remote-svn.c
> > new file mode 100644
> > index 000..d5c2df8
> > --- /dev/null
> > +++ b/contrib/svn-fe/remote-svn.c
> > @@ -0,0 +1,190 @@
> > +
> > +#include "cache.h"
> > +#include "remote.h"
> > +#include "strbuf.h"
> > +#include "url.h"
> > +#include "exec_cmd.h"
> > +#include "run-command.h"
> > +#include "svndump.h"
> > +#include "argv-array.h"
> > +
> > +static const char *url;
> > +static const char *private_ref;
> > +static const char *remote_ref = "refs/heads/master";
> > +
> > +int cmd_capabilities(struct strbuf *line);
> > +int cmd_import(struct strbuf *line);
> > +int cmd_list(struct strbuf *line);
> 
> How many of these and other symbols are necessary to be visible
> outside this file?

Will check and make them static.

> 
> > +typedef int (*input_command_handler)(struct strbuf *);
> > +struct input_command_entry {
> > +   const char *name;
> > +   input_command_handler fct;
> > +   unsigned char batchable;    /* whether the command starts or is part of a batch */
> > +};
> > +
> > +static const struct input_command_entry input_command_list[] = {
> > +   { "capabilities", cmd_capabilities, 0 },
> > +   { "import", cmd_import, 1 },
> > +   { "list", cmd_list, 0 },
> > +   { NULL, NULL }
> > +};
> > +
> > +int cmd_capabilities(struct strbuf *line)
> > +{
> > +   printf("import\n");
> > +   printf("refspec %s:%s\n\n", remote_ref, private_ref);
> > +   fflush(stdout);
> > +   return 0;
> > +}
> > +
> > +static void terminate_batch() {
> > +   /* terminate a current batch's fast-import stream */
> 
> Style:
> 
>   static void terminate_batch(void)
>   {
>   /* terminate ...
> 

Ok. Opening braces in new lines, right? But inside functions it's ok to have 
them on the same line?

> > +   printf("done\n");
> > +   fflush(stdout);
> > +}
> > +
> > +int cmd_import(struct strbuf *line)
> > +{
> > +   int code, report_fd;
> > +   char *back_pipe_env;
> > +   int dumpin_fd;
> > +   unsigned int startrev = 0;
> > +   struct argv_array svndump_argv = ARGV_ARRAY_INIT;
> > +   struct child_process svndump_proc;
> > +
> > +   /*
> > +    * When the remote-helper is invoked by transport-helper.c it passes the
> > +    * filename of this pipe in the env-var.
> > +    */
> 
> s/ it passes/, &/;
> 
> > +   back_pipe_env = getenv("GIT_REPORT_FIFO");
> 
> Can we name "back pipe", "report fifo" and "report fd" more
> consistently and descriptively?
> 
> What kind of "REPORT" are we talking about here?  Is it to carry the
> contents of

This topic (pipe vs. fifo) is still under discussion with Jonathan. I called it 
REPORT, because that was the name of it in vcs-svn. That will change.

> 
> > +   if (!back_pipe_env) {
> > +           die("Cannot get cat-blob-pipe from environment! GIT_REPORT_FIFO has to"
> > +                   "be set by the caller.");
> > +   }
> 
> Style: unnecessary {} block around a simple statement.  It is OK to
> have such a block early in a series if you add more statements to it
> in later steps, but that does not seem to be the case for this patch
> series.

ack.

> 
> > +   /*
> > +* Opening a fifo for reading usually blocks until a writer has opened
> > it too. +* Openin

Re: [RFC v2 11/16] Add explanatory comment for transport-helpers refs mapping.

2012-07-30 Thread Florian Achleitner
On Monday 30 July 2012 14:15:53 Jonathan Nieder wrote:
> Junio C Hamano wrote:
> > Jonathan Nieder  writes:
> >>> + /*
> >>> +  * If the remote helper advertised the "refspec" capability,
> >>> +  * it will have the written result of the import to the refs
> > 
> > perhaps s/will have the written result of/would have written result of/?
> 
> That would sound like 'If the remote helper advertised the "refspec"
> capability, it would have written the result of the import to the
> refs, but it didn't, so...', so I think "will" is the right tense.
> But 'will have the written' is awkward.  How about:

Yes, that's clearly a typing error of mine; 'the' is to be deleted.

> 
>* The fast-import stream of a remote helper advertising the
>* "refspec" capability writes to the refs named after the right
>* hand side of the first refspec matching each ref we were
>* fetching.
>*
>* (If no "refspec" capability is specified, for historical
>* reasons the default is *:*.)
>*
>* Store the result in to_fetch[i].old_sha1. [...]


[RFC v2 11/16] Add explanatory comment for transport-helpers refs mapping.

2012-07-30 Thread Florian Achleitner
transport-helpers can advertise the 'refspec' capability,
if not a default refspec *:* is assumed. This explains
the post-processing of refs after fetching with fast-import.

Signed-off-by: Florian Achleitner 
---
 transport-helper.c |   15 +++
 1 file changed, 15 insertions(+)

diff --git a/transport-helper.c b/transport-helper.c
index d6daad5..e10fd6b 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -478,6 +478,21 @@ static int fetch_with_import(struct transport *transport,
 
argv_array_clear(&importer_argv);
 
+   /*
+* If the remote helper advertised the "refspec" capability,
+* it will have the written result of the import to the refs
+* named on the right hand side of the first refspec matching
+* each ref we were fetching.
+*
+* (If no "refspec" capability was specified, for historical
+* reasons we default to *:*.)
+*
+* Store the result in to_fetch[i].old_sha1.  Callers such
+* as "git fetch" can use the value to write feedback to the
+* terminal, populate FETCH_HEAD, and determine what new value
+* should be written to peer_ref if the update is a
+* fast-forward or this is a forced update.
+*/
for (i = 0; i < nr_heads; i++) {
char *private;
posn = to_fetch[i];
-- 
1.7.9.5



Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-30 Thread Florian Achleitner
On Monday 30 July 2012 03:29:52 Jonathan Nieder wrote:
> > Generally I like your preferred solution.
> > I think there's one problem:
> > The pipe needs to be created before the fork, so that the fd can be
> > inherited. 
> The relevant pipe already exists at that point: the remote helper's
> stdin.
> 
> In other words, it could work like this (just like the existing demo
> code, except adding a conditional based on the "capabilities"
> response):
> 
> 0. transport-helper.c invokes the remote helper.  This requires
>a pipe used to send commands to the remote helper
>(helper->in) and a pipe used to receive responses from the
>remote helper (helper->out)
> 
> 1. transport-helper.c sends the "capabilities" command to decide
>what to do.  The remote helper replies that it would like
>some feedback from fast-import.
> 
> 2. transport-helper.c forks and execs git fast-import with input
>redirected from helper->out and the cat-blob fd redirected
>to helper->in

fast-import writes to the helper's stdin..

> 3. transport-helper.c tells the remote helper to start the
>import

transport-helper writes commands to the helper's stdin.

> 
> 4. wait for fast-import to exit

Hm .. that would mean, that both fast-import and git (transport-helper) would 
write to the remote-helper's stdin, right?


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-30 Thread Florian Achleitner
On Thursday 26 July 2012 10:29:51 Junio C Hamano wrote:
> Of course, if the dispatch loop has to be rewritten so that a
> central dispatcher decides what to call, individual input handlers
> do not need to say NOT_HANDLED nor TERMINATE, as the central
> dispatcher should keep track of the overall state of the system, and
> the usual "0 on success, negative on error" may be sufficient.
> 
> One thing I wondered was how an input "capability" (or "list")
> should be handled after "import" was issued (hence batch_active
> becomes true).  The dispatcher loop in the patch based on
> NOT_HANDLED convention will happily call cmd_capabilities(), which
> does not have any notion of the batch_active state (because it is a
> function scope static inside cmd_import()), and will say "Ah, that
> is mine, and let me do my thing."  If we want to diagnose such an
> input stream as an error, the dispatch loop needs to become aware of
> the overall state of the system _anyway_, so that may be an argument
> against the NOT_HANDLED based dispatch system the patch series uses.

That's a good point. The current implementation allows other commands to 
appear during import batches. This shouldn't be possible according to the 
protocol, I think. But it doesn't do harm. Solving it will require a global
state and go towards a global dispatcher.

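A central dispatcher that tracks the batch state could be sketched like this. All names here (dispatch, enum helper_state, the reduced command table) are hypothetical and only loosely mirror the patch; the sketch just shows how a global state lets the loop reject 'capabilities' or 'list' while an import batch is open.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum helper_state { STATE_IDLE, STATE_IN_BATCH };

/* Hypothetical command table entry; the handler pointer is omitted. */
struct command {
	const char *name;
	int batchable;	/* may appear inside an import batch */
};

static const struct command commands[] = {
	{ "capabilities", 0 },
	{ "import", 1 },
	{ "list", 0 },
};

/*
 * Decide what to do with one input line, given the current state.
 * Returns 0 if the line is accepted (updating *state), -1 if the
 * command is unknown or not allowed while an import batch is open.
 * An empty line terminates a batch.
 */
static int dispatch(enum helper_state *state, const char *line)
{
	size_t i;

	assert(state != NULL && line != NULL);
	if (*line == '\0') {
		*state = STATE_IDLE;
		return 0;
	}
	for (i = 0; i < sizeof(commands) / sizeof(commands[0]); i++) {
		size_t len = strlen(commands[i].name);

		/* The command name must be followed by a space or the end. */
		if (strncmp(line, commands[i].name, len) ||
		    (line[len] != '\0' && line[len] != ' '))
			continue;
		if (*state == STATE_IN_BATCH && !commands[i].batchable)
			return -1;
		if (commands[i].batchable)
			*state = STATE_IN_BATCH;
		return 0;
	}
	return -1;
}
```

With the state owned by the dispatcher, the individual handlers no longer need NOT_HANDLED or TERMINATE return values and can fall back to the usual "0 on success, negative on error".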



Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-30 Thread Florian Achleitner
On Saturday 28 July 2012 02:00:31 Jonathan Nieder wrote:
> Thanks for explaining.  Now we've discussed a few different approaches,
> none of which is perfect.
> 
> a. use --cat-blob-fd, no FIFO
> 
>Doing this unconditionally would break platforms that don't support
>--cat-blob-fd=(descriptor >2), like Windows, so we'd have to:
> 
>* Make it conditional --- only do it if (1) we are not on Windows and
>  (2) the remote helper requests backflow by advertising the
>  import-bidi capability.
> 
>* Let the remote helper know what's going on by using
>  "import-bidi" instead of "import" in the command stream to
>  initiate the import.

Generally I like your preferred solution.
I think there's one problem:
The pipe needs to be created before the fork, so that the fd can be inherited. 
There is no way of creating it if the remote-helper advertises a capability, 
because it is already forked then. This would work with fifos, though.

We could:
- add a capability: bidi-import. 
- make transport-helper create a fifo if the helper advertises it.
- add a command for remote-helpers, like 'bidi-import <path>', that makes
the remote helper open the fifo at <path> and use it.
- fast-import is forked after the helper, so we already know if there will
be a back-pipe. If yes, open it in transport-helper and pass the fd as the
command line argument --cat-blob-fd.

--> fast-import wouldn't need to be changed, but we'd use a fifo, and we get 
rid of the env-vars.
(I guess it could work on Windows too).

What do you think?

> 
> b. use envvars to pass around FIFO path
> 
>This complicates the fast-import interface and makes debugging hard.
>It would be nice to avoid this if we can, but in case we can't, it's
>nice to have the option available.
> 
> c. transport-helper.c uses FIFO behind the scenes.
> 
>Like (a), except it would require a fast-import tweak (boo) and
>would work on Windows (yea)
> 
> d. use --cat-blob-fd with FIFO
> 
>Early scripted remote-svn prototypes did this to fulfill "fetch"
>requests.
> 
>It has no advantage over "use --cat-blob-fd, no FIFO" except being
>easier to implement as a shell script.  I'm listing this just for
>comparison; since (a) looks better in every way, I don't see any
>reason to pursue this one.
> 
> Since avoiding deadlocks with bidirectional communication is always a
> little subtle, it would be nice for this to be implemented once in
> transport-helper.c rather than each remote helper author having to
> reimplement it again.  As a result, my knee-jerk ranking is a > c >
> b > d.
> 
> Sane?
> Jonathan


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-27 Thread Florian Achleitner
On Thursday 26 July 2012 09:54:26 Jonathan Nieder wrote:
> 
> Since the svn remote helper relies on this, it seems worth working on,
> yeah.  As for how to spend your time (and whether to beg someone else
> to work on it instead :)): I'm not sure what's on your plate or where
> you are with respect to the original plan for the summer at the
> moment, so it would be hard for me to give useful advice about how to
> balance things.

Btw, the pipe version did already exist before I started, it was added with 
the cat-blob command and already used by Dmitry's remote-svn-alpha.
I didn't search for design discussions in the past ..

> 
> What did you think of the suggestion of adding a new bidi-import
> capability and command to the remote helper protocol?  I think this
> would be clean and avoid causing a regression on Windows, but it's
> easily possible I am missing something fundamental.

I don't have much overview over this topic besides the part I'm working on, 
like other users of fast-import. 
The bidi-import capability/command would have the advantage, that we don't 
have to bother with the pipe/fifo at all, if the remote-helper doesn't use it.

When I implemented the two variants I had the idea to pass it to the 'option' 
command, that fast-import already has. Anyways, specifying cat-blob-fd is not 
allowed via the 'option' command (see Documentation and 85c62395).
It wouldn't make too much sense, because the file descriptor must be set up by 
the parent.

But for the fifo, it probably would. The backward channel is only used by the
commands 'cat-blob' and 'ls' of fast-import. If a remote helper wants to use
them, it could make fast-import open the pipe by sending an 'option'
command with the fifo filename; otherwise it defaults to stdout (like now) and
is rather useless.
This would take the fifo setup out of transport-helper. The remote-helper would 
have to create it, if it needs it.

Apropos stdout. That leads to another idea. You already suggested that it
would be easiest to only use FDs 0..2. Currently stdout and stderr of
fast-import go to the shell. We could connect stdout to the remote-helper and
would not need the additional channel at all.
(Probably there's a good reason why they haven't done that ..)
Maybe this requires many changes to fast-import and breaks existing frontends.

--
Florian


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Florian Achleitner
On Thursday 26 July 2012 04:08:42 Jonathan Nieder wrote:
> Florian Achleitner wrote:

> > On Monday 02 July 2012 06:07:41 Jonathan Nieder wrote:
> [...]
> 
> >>> +
> >>> +static inline void printd(const char* fmt, ...)
> 
> [...]
> 
> >> Why not use trace_printf and avoid the complication?
> > 
> > Hm.. I tried. It wasn't exactly what I wanted. When I use trace_printf,
> > it's activated together with all other traces. I can use trace_vprintf
> > and specify a key, but I would always have to print the header "rhsvn
> > debug: " and the key by hand. So I could replace vfprintf in this
> > function by trace_vprintf to do that. But then there's not much
> > simplification. (?)
> 
> Hmm.  There's no trace_printf_with_key() but that's presumably because
> no one has needed it.  If it existed, you could use
> 
>   #define printd(msg) trace_printf_with_key("GIT_TRACE_REMOTE_SVN", "%s",
> msg)
> 
> But now that I check, I don't see how the current printd() calls would
> be useful to other people.  Why announce these moments and not others?
> They're just temporary debugging cruft, right?
> 
> For that, plain trace_printf() works great.

Yes, it's for debugging only, I could just delete it all. It's inspired by 
transport-helper.c. The env var GIT_TRANSPORT_HELPER_DEBUG enables it. While 
transport-helper has a lot of if (debug) fprintf(..), I encapsulated it in 
printd.
So I should kick printd out?

> >>> +
> >>> + printf("import\n");
> >>> + printf("\n");
> >>> + fflush(stdout);
> >>> + return SUCCESS;
> >>> +}
> >> 
> >> Why the multiple printf?  Is the flush needed?
> > 
> > Excess printf gone.
> > Flush is needed. Otherwise it doesn't flush and the other end waits
> > forever.
> Ah, fast-import is ready, remote helper is ready, no one initiates
> pumping of data between them.  Maybe the purpose of the flush would
> be more obvious if it were moved to the caller.

Actually this goes to the git parent process (not fast-import), waiting for a
reply to the command. I think I have to call flush on this side of the pipe. 
Can you flush it from the reader? This wouldn't have the desired effect, it 
drops buffered data.
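
The reason the flush has to happen on the writing side is that stdio keeps the reply in the writer's own buffer when stdout is a pipe. A minimal sketch, assuming a hypothetical helper named send_reply that wraps the write and the flush together:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Send a reply to the process on the other end of `out` and push it
 * through stdio's buffer. Returns 0 on success, -1 on error.
 */
static int send_reply(FILE *out, const char *reply)
{
	assert(out != NULL && reply != NULL);
	if (fputs(reply, out) == EOF)
		return -1;
	/*
	 * A stream attached to a pipe is fully buffered, so without the
	 * flush the reply stays in this process and the reader keeps
	 * waiting.
	 */
	return fflush(out) == EOF ? -1 : 0;
}
```

The capabilities handler would then reduce to a single call such as send_reply(stdout, "import\n\n"), which also addresses the earlier question about the multiple printf calls.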

> [...]
> 
> >>> + /* opening a fifo for reading usually blocks until a writer has opened
> >>> +  * it too. Therefore, we open with RDWR.
> >>> +  */
> >>> + report_fd = open(back_pipe_env, O_RDWR);
> >>> + if(report_fd < 0) {
> >>> + 	die("Unable to open fast-import back-pipe! %s", strerror(errno));
> >>> + }
> >>> + }
> >> 
> >> Is this necessary?  Why shouldn't we fork the writer first and wait
> >> for it here?
> > 
> > Yes, necessary.
> 
> Oh, dear.  I hope not.  E.g., Cygwin doesn't support opening fifos
> RDWR (out of scope for the gsoc project, but still).

I believe it can be solved using RDONLY and WRONLY too. Probably we solve it 
by not using the fifo at all.
Currently the blocking comes from the fact that fast-import doesn't parse 
its command line at startup. Instead it reads an input line first and decides 
whether to parse the argv after reading that first line or at the end of 
the input. (I don't know why.)
remote-svn opens the pipe before sending the first command to fast-import and 
blocks on the open, while fast-import waits for input --> deadlock.
With remote-svn: RDWR, fast-import: WRONLY, this works.

Other scenario: Nothing to import, remote-svn only sends 'done' and closes the 
pipe again. After fast-import reads the first line it parses its command line 
and tries to open the fifo, which is already closed on the other side --> 
blocks.
This is solved by using RDWR on both sides.

If we change the points where the pipes are opened and closed, this could be 
circumvented.
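
The open semantics behind these two scenarios can be tried out with the Python 3 sketch below. It assumes a Unix system with os.mkfifo; fifo_open_demo and the file name "backpipe" are hypothetical. Note that POSIX leaves O_RDWR on a FIFO undefined; the behaviour shown (the open returns at once because the process counts as its own writer) is what Linux provides, which matches the portability concern raised in the review.

```python
import errno
import os
import tempfile


def fifo_open_demo():
    """Show how a FIFO open behaves with and without a peer (Unix only)."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "backpipe")  # hypothetical name
        os.mkfifo(path)

        # Write-only and non-blocking while no reader exists: the open
        # fails with ENXIO. A blocking open would simply hang here.
        try:
            fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
            os.close(fd)
            results["wronly_without_reader"] = "opened"
        except OSError as err:
            results["wronly_without_reader"] = errno.errorcode[err.errno]

        # Read-write: returns immediately on Linux, no peer required.
        rdwr = os.open(path, os.O_RDWR)

        # Now a reader exists, so the write-only open succeeds.
        writer = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
        os.write(writer, b"done\n")
        results["read_back"] = os.read(rdwr, 100)

        os.close(writer)
        os.close(rdwr)
    return results
```

The sketch uses O_NONBLOCK only so that it cannot hang; the deadlock in the thread comes from the blocking variants of the same opens.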

> 
> [...]
> 
> > E.g. if there's nothing to import, the helper sends only 'done' to
> > fast-import and quits.
> 
> Won't the writer open the pipe and wait for us to open our end before
> doing that?
> 
> [...]
> 
> >>> +
> >>> + code = start_command(&svndump_proc);
> >>> + if(code)
> >>> + die("Unable to start %s, code %d", svndump_proc.argv[0], code);
> >> 
> >> start_command() is supposed to have printed a message already when it
> >> fails, unless errno == ENOENT and silent_exec_failure was set.
> > 
> > Yes, but it doesn't die, right?
> 
> You

Re: [RFC 14/16] transport-helper: add import|export-marks to fast-import command line.

2012-07-26 Thread Florian Achleitner

I just found that for fast-export something similar was added to 
transport-helper in a515ebe9, by adding a capability that is advertised by the 
remote helper. That would probably be a nicer way to do this.
Btw, these added capabilities are not mentioned in the docs.

On Thursday 26 July 2012 09:32:35 Florian Achleitner wrote:
> fast-import internally uses marks that refer to an object via its sha1.
> Those marks are created during import to find previously created objects.
> At exit the accumulated marks can be exported to a file and reloaded at
> startup, so that the previous marks are available.
> Add command line options to the fast-import command line to enable this.
> The mark files are stored in info/fast-import/marks/.
> 
> Signed-off-by: Florian Achleitner 
> ---
>  transport-helper.c |3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/transport-helper.c b/transport-helper.c
> index e10fd6b..74f9608 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -394,6 +394,9 @@ static int get_importer(struct transport *transport, struct child_process *fasti
>  	argv_array_push(argv, "fast-import");
>  	argv_array_push(argv, debug ? "--stats" : "--quiet");
>  	argv_array_pushf(argv, "--cat-blob-pipe=%s", data->report_fifo);
> +	argv_array_push(argv, "--relative-marks");
> +	argv_array_pushf(argv, "--import-marks-if-exists=marks/%s", transport->remote->name);
> +	argv_array_pushf(argv, "--export-marks=marks/%s", transport->remote->name);
>  	fastimport->argv = argv->argv;
>  	fastimport->git_cmd = 1;
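
As a rough illustration of what the quoted hunk assembles, the resulting fast-import command line can be written out as a list. This is a Python 3 sketch with a hypothetical function name; --cat-blob-pipe is the option proposed earlier in this patch series and should not be assumed to exist in a released git, while the three marks options are the ones the hunk adds.

```python
def fast_import_args(remote_name, report_fifo, debug=False):
    """Build the fast-import argument list as the quoted hunk does.

    fast_import_args is a hypothetical name used for this sketch only.
    With --relative-marks, the marks paths are taken relative to
    info/fast-import/ inside the repository, as the commit message says.
    """
    return [
        "fast-import",
        "--stats" if debug else "--quiet",
        "--cat-blob-pipe=%s" % report_fifo,   # option from this series
        "--relative-marks",
        "--import-marks-if-exists=marks/%s" % remote_name,
        "--export-marks=marks/%s" % remote_name,
    ]
```

Using one marks file per remote name keeps the marks of different remotes from overwriting each other.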


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Florian Achleitner
On Thursday 26 July 2012 06:40:39 Jonathan Nieder wrote:
> Steven Michalske wrote:
> > On Jul 2, 2012, at 4:07 AM, Jonathan Nieder  wrote:
> >> [...]
> >> 
> >>> diff: Use fifo instead of pipe: Retrieve the name of the pipe from env
> >>> and open it for svndump.
> >> 
> >> I'd prefer to avoid this if possible, since it means having to decide
> >> where the pipe goes on the filesystem.  Can you summarize the
> >> discussion in the commit message so future readers understand why
> >> we're doing it?
> > 
> > Crazy thought here but would a socket not be a bad choice here?
> 
> Not crazy --- it was already mentioned.  It could probably allow using
> --cat-blob-fd even on the platforms that don't inherit file
> descriptors >2, though it would take some tweaking.  Though I still
> think the way forward is to keep using plain pipes internally for now
> and to make the bidirectional communication optional, since it
> wouldn't close any doors to whatever is most convenient on each
> platform.  Hopefully I'll hear more from Florian about this in time.

Would you like to see a new pipe patch?

> 
> > Imagine being able to ssh tunnel into the SVN server and run the helper
> > with filesystem access to the SVN repo.
> 
> We're talking about what communicates between the SVN dump parser and the
> version control system-specific backend (git fast-import) that reads
> the converted result, so that particular socket wouldn't help much.
>

Yes .. the network part is already handled quite well by svnrdump.
 



Re: [RFC 01/16] Implement a remote helper for svn in C.

2012-07-26 Thread Florian Achleitner
On Thursday 26 July 2012 03:14:43 Jonathan Nieder wrote:
> Florian Achleitner wrote:
> > Yes, I incorporated your review in the new version, as far as applicable.
> > But I didn't send you an answer on the detailed points.
> > I will send an answer to the previous review ..
> 
> Thanks.  Now that I check, I see that you did make lots of important
> changes and probably lost the one I noticed just now in the noise.
> 
> Another way to keep reviewers happy is to describe what changed since
> the last revision under the triple-dash for each patch when sending
> out a new set of patches.  That way, they can see that there was
> progress and there is less frustration when one specific change didn't
> make it.
> 
> See http://thread.gmane.org/gmane.comp.version-control.git/176203
> for example.

Yeah, that makes sense.
In this reroll, I really changed a lot, order and scope of patches is very 
different. Many haven't hit the list yet. I wanted to write a new more useful 
history.
The first patch, this one, consists of many enhancement commits I made after 
each other, finally integrated into one.



Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-26 Thread Florian Achleitner
Hi!

Most of this review went into the new version.. 
For the remaining points, some comments follow.

On Monday 02 July 2012 06:07:41 Jonathan Nieder wrote:
> Hi,
> 
> Florian Achleitner wrote:

> 
> > --- /dev/null
> > +++ b/contrib/svn-fe/remote-svn.c
> > @@ -0,0 +1,207 @@
> > +
> > +#include 
> > +#include 
> > +#include 
> 
> git-compat-util.h (or some header that includes it) must be the first
> header included so the appropriate feature test macros can be defined.
> See Documentation/CodingGuidelines for more on that.

check.

> 
> > +#include "cache.h"
> > +#include "remote.h"
> > +#include "strbuf.h"
> > +#include "url.h"
> > +#include "exec_cmd.h"
> > +#include "run-command.h"
> > +#include "svndump.h"
> > +
> > +static int debug = 0;
> 
> Small nit: please drop the redundant "= 0" here.  Or:

check.

> > +
> > +static inline void printd(const char* fmt, ...)
> > +{
> > +   if(debug) {
> > +   va_list vargs;
> > +   va_start(vargs, fmt);
> > +   fprintf(stderr, "rhsvn debug: ");
> > +   vfprintf(stderr, fmt, vargs);
> > +   fprintf(stderr, "\n");
> > +   va_end(vargs);
> > +   }
> > +}
> 
> Why not use trace_printf and avoid the complication?

Hm.. I tried. It wasn't exactly what I wanted. When I use trace_printf, it's 
activated together with all other traces. I can use trace_vprintf and specify 
a key, but I would always have to print the header "rhsvn debug: " and the key 
by hand. So I could replace vfprintf in this function by trace_vprintf to do 
that. But then there's not much simplification. (?)


> > +
> > +enum cmd_result cmd_capabilities(struct strbuf* line);
> > +enum cmd_result cmd_import(struct strbuf* line);
> > +enum cmd_result cmd_list(struct strbuf* line);
> 
> What's a cmd_result?  '*' sticks to variable name.
> 
> > +
> > +enum cmd_result { SUCCESS, NOT_HANDLED, ERROR };
> 
> Oh, that's what a cmd_result is. :)  Why not define the type before
> using it to avoid keeping the reader in suspense?
> 
> What does each result represent?  If this is a convention like
> 
>  1: handled
>  0: not handled
>  -1: error, callee takes care of printing the error message
> 
> then please document it in a comment near the caller so the reader can
> understand what is happening without too much confusion.  Given such a
> comment, does the enum add clarity?

Hm.. the enum now has SUCCESS, NOT_HANDLED, TERMINATE.
It gives the numbers a name, that's it.

> 
> > +typedef enum cmd_result (*command)(struct strbuf*);
> 
> When I first read this, I wonder what is being commanded.  Are these
> commands passed on the remote helper's standard input, commands passed
> on its output, or commands run at some point in the process?  What is
> the effect and return value of associated function?  Does the function
> always return some success/failure value, or does it sometimes exit?
> 
> Maybe a more specific type name would be clearer?

I renamed it to input_command_handler. Unfortunately the remote-helper spec 
calls what is sent to the helper a 'command'.

> 
> [...]
> 
> > +
> > +const command command_list[] = {
> > +   cmd_capabilities, cmd_import, cmd_list, NULL
> > +};
> 
> First association is to functions like cmd_fetch() which implement git
> subcommands.  So I thought these were going to implement subcommands
> like "git remote-svn capabilities", "git remote-svn import" and would
> use the same cmd_foo(argc, argv, prefix) calling convention that git
> subcommands do.  Maybe a different naming convention could avoid
> confusion.

Ok.. same as above, they are kind of commands. Of course I can change the 
names. For me it's not too confusing, because I don't know the git subcommands 
convention very well. You can choose a name.
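
To make the convention under review concrete, here is a minimal Python 3 sketch of a handler table in the same spirit: each handler either declines a line (NOT_HANDLED) or deals with it, and the dispatcher tries them in order. The result names follow the enum mentioned above; CmdResult, handle_capabilities, handle_end and dispatch are hypothetical names, and treating a blank line as the end of the session is an assumption of this sketch.

```python
from enum import Enum


class CmdResult(Enum):
    SUCCESS = 0
    NOT_HANDLED = 1
    TERMINATE = 2


def handle_capabilities(line, out):
    """Answer the 'capabilities' input command, decline anything else."""
    if line != "capabilities":
        return CmdResult.NOT_HANDLED
    out.write("import\n\n")
    out.flush()
    return CmdResult.SUCCESS


def handle_end(line, out):
    """Assumption of this sketch: a blank line ends the session."""
    if line != "":
        return CmdResult.NOT_HANDLED
    return CmdResult.TERMINATE


# The order matters only if two handlers could claim the same line.
HANDLERS = [handle_capabilities, handle_end]


def dispatch(line, out):
    """Try each handler in turn until one does not decline the line."""
    for handler in HANDLERS:
        result = handler(line, out)
        if result is not CmdResult.NOT_HANDLED:
            return result
    return CmdResult.NOT_HANDLED
```

Naming the handlers handle_* rather than cmd_* is one way to avoid the association with git's cmd_foo(argc, argv, prefix) subcommand functions that the review points out.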

> 
> [...]
> 
> > +enum cmd_result cmd_capabilities(struct strbuf* line)
> > +{
> > +   if(strcmp(line->buf, "capabilities"))
> > +   return NOT_HANDLED;
> 
> Style: missing SP after keyword.
> 
> > +
> > +   printf("import\n");
> > +   printf("\n");
> > +   fflush(stdout);
> > +   return SUCCESS;
> > +}
> 
> Why the multiple printf?  Is the flush needed?

Excess printf gone.
Flush is needed. Otherwise it doesn't flush and the other end waits forever.
Don't know exactly why. Some pipe-buffer ..

> > +
> > +   /* opening

Re: [RFC 01/16] Implement a remote helper for svn in C.

2012-07-26 Thread Florian Achleitner
On Thursday 26 July 2012 02:46:07 Jonathan Nieder wrote:
> Hi,
> 
> Florian Achleitner wrote:
> > --- /dev/null
> > +++ b/contrib/svn-fe/remote-svn.c
> > @@ -0,0 +1,219 @@
> > +
> > +#include "cache.h"
> > +#include "remote.h"
> > +#include "strbuf.h"
> > +#include "url.h"
> > +#include "exec_cmd.h"
> > +#include "run-command.h"
> > +#include "svndump.h"
> > +#include "argv-array.h"
> > +
> > +static int debug;
> > +
> > +static inline void printd(const char *fmt, ...)
> 
> I remember reviewing this before, and mentioning that this could be
> replaced with trace_printf() and that would simplify some code and
> improve the functionality.  I think I also remember giving some other
> suggestions, but I don't have it in front of me so I can't be sure
> (should have more time this weekend).
> 
> Did you look over that review?  Did you have any questions about it,
> or was it just full of bad ideas, or something else?
> 
> It's silly and vain of me, but I'm not motivated by the idea of
> spending more time looking over this without anything coming of it.
> (Rejecting suggestions is fine, but sending feedback when doing so is
> important because otherwise reviewers get demotivated.)

Yes, I incorporated your review in the new version, as far as applicable. But 
I didn't send you an answer on the detailed points. 
I will send an answer to the previous review ..

> 
> Hope that helps,
> Jonathan


[RFC 11/16] Add explanatory comment for transport-helpers refs mapping.

2012-07-26 Thread Florian Achleitner
Transport helpers can advertise the 'refspec' capability;
if they don't, a default refspec of *:* is assumed. This explains
the post-processing of refs after fetching with fast-import.

Signed-off-by: Florian Achleitner 
---
 transport-helper.c |   15 +++
 1 file changed, 15 insertions(+)

diff --git a/transport-helper.c b/transport-helper.c
index d6daad5..e10fd6b 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -478,6 +478,21 @@ static int fetch_with_import(struct transport *transport,
 
 	argv_array_clear(&importer_argv);
 
+	/*
+	 * If the remote helper advertised the "refspec" capability,
+	 * it will have written the result of the import to the refs
+	 * named on the right hand side of the first refspec matching
+	 * each ref we were fetching.
+	 *
+	 * (If no "refspec" capability was specified, for historical
+	 * reasons we default to *:*.)
+	 *
+	 * Store the result in to_fetch[i].old_sha1.  Callers such
+	 * as "git fetch" can use the value to write feedback to the
+	 * terminal, populate FETCH_HEAD, and determine what new value
+	 * should be written to peer_ref if the update is a
+	 * fast-forward or this is a forced update.
+	 */
 	for (i = 0; i < nr_heads; i++) {
 		char *private;
 		posn = to_fetch[i];
-- 
1.7.9.5



[RFC 00/16] GSOC remote-svn, rewritten patch series

2012-07-26 Thread Florian Achleitner
Hi!

I decided to completely rewrite my commit history: I split, dropped, squashed, 
and reordered commits, and finally rebased it all onto the current master.
I hope this removes a lot of my personal confusion and makes the patches 
more useful and understandable.
I think the remote helper does what it should now, except creating branches.
Several patches depend on each other, but some are purely optional and there are
working intermediate states.
I'll add some comments in the table of contents below.


[RFC 01/16] Implement a remote helper for svn in C.
[RFC 02/16] Integrate remote-svn into svn-fe/Makefile.
[RFC 03/16] Add svndump_init_fd to allow reading dumps from
[RFC 04/16] Add cat-blob report fifo from fast-import to #this one is still in discussion
[RFC 05/16] remote-svn, vcs-svn: Enable fetching to private refs.
[RFC 06/16] Add a symlink 'git-remote-svn' in base dir.
# basic functionality is available from here.
# additional features follow
[RFC 07/16] Allow reading svn dumps from files via file:// urls.
[RFC 08/16] vcs-svn: add fast_export_note to create notes
[RFC 09/16] Create a note for every imported commit containing svn
[RFC 10/16] When debug==1, start fast-import with "--stats" instead #optional
[RFC 11/16] Add explanatory comment for transport-helpers refs #optional
[RFC 12/16] remote-svn: add incremental import.
[RFC 13/16] Add a svnrdump-simulator replaying a dump file for
[RFC 14/16] transport-helper: add import|export-marks to fast-import
[RFC 15/16] remote-svn: add marks-file regeneration.
[RFC 16/16] Add a test script for remote-svn.


--
Florian


Re: [PATCH] Add a svnrdump-simulator replaying a dump file for testing.

2012-07-24 Thread Florian Achleitner
On Tuesday 24 July 2012 14:50:49 Jonathan Nieder wrote:
> > It is unclear how this is different from giving the ceiling by
> > specifying it as the "END" in -rSTART:END command line.  Is this
> > feature really needed?
> 
> I think the idea is that you put this script (or a symlink to it) on
> your $PATH with higher precedence than svnrdump and run a command
> that expected to be able to use svnrdump.  Then instead of going to
> the network, the command you run magically uses your test data
> instead.
> 
> If the command you are testing wanted to run "svnrdump" without the
> upper endpoint set, we need to handle that request, either by emitting
> all the revs we have, or by stopping somewhere.  The revlimit feature
> provides the "stopping somewhere" behavior which is not strictly
> needed but is presumably very useful when testing incremental fetch.

Exactly, the purpose is to transparently replace svnrdump.
Callers of svnrdump usually will specify -rSTART:HEAD, because they want to 
fetch everything they don't yet have.
This feature allows limiting HEAD, which makes it possible to simulate 
incremental fetches using the same dump file.
For me it proved very useful.
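
How the revision limit interacts with the -rSTART:END argument can be sketched as follows. This is a Python 3 sketch, not the posted script; effective_range is a hypothetical name, the limit is passed in as a parameter instead of being read from SVNRMAX, and requiring a colon in the argument is a simplification made here.

```python
def effective_range(rev_arg=None, revlimit=None):
    """Return (lower, upper) as strings for a simulated svnrdump call.

    rev_arg is the raw "-rSTART:END" argument or None; revlimit plays
    the role of the SVNRMAX environment variable. As in the posted
    script, a set limit replaces whatever upper end the caller asked for.
    """
    lower, upper = "0", "HEAD"
    if rev_arg is not None:
        if not rev_arg.startswith("-r"):
            raise ValueError("expected -rSTART:END")
        lower, sep, upper = rev_arg[2:].strip().partition(":")
        if not sep:
            # Simplification of this sketch: a single revision is rejected.
            raise ValueError("expected -rSTART:END")
    if revlimit is not None:
        upper = str(revlimit)
    return lower, upper
```

A caller that always asks for -rSTART:HEAD can thus be fed the same dump file in several steps by raising the limit between runs.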

> Florian, do you mind if I make the revlimit feature a separate patch
> when applying this?

No problem.

> 
> Anyway, it looks good and reasonable to me, so will apply.
> 
> Thanks.
> Jonathan

--
Florian


[PATCH] Add a svnrdump-simulator replaying a dump file for testing.

2012-07-23 Thread Florian Achleitner
To ease testing without depending on a reachable svn server, this
compact python script mimics parts of svnrdump's behaviour.
It requires the remote url to start with sim://.
A trailing slash at the end of the url is stripped.
The url specifies the path of the svn dump file (as created by
svnrdump). Selectable parts of it, or the whole file, are written
to stdout.

Start and end revisions can be specified on the command line
(-rSTART:END, like for svnrdump).
Only revisions from START up to, but excluding, END are replayed from
the dumpfile specified by the url. END can also be HEAD.

If the start revision specified on the command line doesn't exist
in the dump file, it returns 1.
This emulates the behaviour of svnrdump when START>HEAD, i.e. the
requested start revision doesn't exist on the server.

To allow using the same dump file for simulating multiple
incremental imports the highest visible revision can be limited by
setting the environment variable SVNRMAX to that value. This
effectively limits HEAD to simulate the situation where higher
revs don't exist yet.

Signed-off-by: Florian Achleitner 
---
 contrib/svn-fe/svnrdump_sim.py |   53 
 1 file changed, 53 insertions(+)
 create mode 100755 contrib/svn-fe/svnrdump_sim.py

diff --git a/contrib/svn-fe/svnrdump_sim.py b/contrib/svn-fe/svnrdump_sim.py
new file mode 100755
index 000..4701d76
--- /dev/null
+++ b/contrib/svn-fe/svnrdump_sim.py
@@ -0,0 +1,53 @@
+#!/usr/bin/python
+"""
+Simulates svnrdump by replaying an existing dump from a file, taking care
+of the specified revision range.
+To simulate incremental imports the environment variable SVNRMAX can be set
+to the highest revision that should be available.
+"""
+import sys, os
+
+
+def getrevlimit():
+	var = 'SVNRMAX'
+	if os.environ.has_key(var):
+		return os.environ[var]
+	return None
+
+def writedump(url, lower, upper):
+	if url.startswith('sim://'):
+		filename = url[6:]
+		if filename[-1] == '/': filename = filename[:-1] #remove terminating slash
+	else:
+		raise ValueError('sim:// url required')
+	f = open(filename, 'r');
+	state = 'header'
+	wroterev = False
+	while(True):
+		l = f.readline()
+		if l == '': break
+		if state == 'header' and l.startswith('Revision-number: '):
+			state = 'prefix'
+		if state == 'prefix' and l == 'Revision-number: %s\n' % lower:
+			state = 'selection'
+		if not upper == 'HEAD' and state == 'selection' and l == 'Revision-number: %s\n' % upper:
+			break;
+
+		if state == 'header' or state == 'selection':
+			if state == 'selection': wroterev = True
+			sys.stdout.write(l)
+	return wroterev
+
+if __name__ == "__main__":
+	if not (len(sys.argv) in (3, 4, 5)):
+		print "usage: %s dump URL -rLOWER:UPPER" % sys.argv[0]
+		sys.exit(1)
+	if not sys.argv[1] == 'dump': raise NotImplementedError('only "dump" is supported.')
+	url = sys.argv[2]
+	r = ['0', 'HEAD']
+	if len(sys.argv) == 4 and sys.argv[3][0:2] == '-r':
+		r = sys.argv[3][2:].lstrip().split(':')
+	if not getrevlimit() is None: r[1] = getrevlimit()
+	if writedump(url, r[0], r[1]): ret = 0
+	else: ret = 1
+	sys.exit(ret)
\ No newline at end of file
-- 
1.7.9.5



Re: [PATCH] Add a svnrdump-simulator replaying a dump file for testing.

2012-07-23 Thread Florian Achleitner
On Monday 23 July 2012 18:24:40 Matthieu Moy wrote:
> You also have whitespace damages (i.e. line wrapping introduced by your
> mailer). Using git-send-email avoids this kind of problem (there are
> also some advices for some mailers in Documentation/SubmittingPatches).

Damn. That's usually no problem with kmail either, if the config is right.
I've already used git-send-email several times.
But for replying to threads and adding several Cc: addresses it's a little 
cumbersome.
How do you do that in a nice way?

--
Florian


Re: [PATCH] Add a svnrdump-simulator replaying a dump file for testing.

2012-07-23 Thread Florian Achleitner
To ease testing without depending on a reachable svn server, this
compact python script mimics parts of svnrdump's behaviour.
It requires the remote url to start with sim://.
Start and end revisions are evaluated.
If the requested revision doesn't exist, as is the case with
incremental imports when no new commit was added, it returns 1
(like svnrdump).
To allow using the same dump file for simulating multiple
incremental imports the highest revision can be limited by setting
the environment variable SVNRMAX to that value. This simulates the
situation where higher revs don't exist yet.

Signed-off-by: Florian Achleitner 
---

I had to fix the missing sign-off anyways..

 contrib/svn-fe/svnrdump_sim.py |   53 

 1 file changed, 53 insertions(+)
 create mode 100755 contrib/svn-fe/svnrdump_sim.py

diff --git a/contrib/svn-fe/svnrdump_sim.py b/contrib/svn-fe/svnrdump_sim.py
new file mode 100755
index 000..4701d76
--- /dev/null
+++ b/contrib/svn-fe/svnrdump_sim.py
@@ -0,0 +1,53 @@
+#!/usr/bin/python
+"""
+Simulates svnrdump by replaying an existing dump from a file, taking care
+of the specified revision range.
+To simulate incremental imports the environment variable SVNRMAX can be set
+to the highest revision that should be available.
+"""
+import sys, os
+
+
+def getrevlimit():
+	var = 'SVNRMAX'
+	if os.environ.has_key(var):
+		return os.environ[var]
+	return None
+
+def writedump(url, lower, upper):
+	if url.startswith('sim://'):
+		filename = url[6:]
+		if filename[-1] == '/': filename = filename[:-1] #remove terminating slash
+	else:
+		raise ValueError('sim:// url required')
+	f = open(filename, 'r');
+	state = 'header'
+	wroterev = False
+	while(True):
+		l = f.readline()
+		if l == '': break
+		if state == 'header' and l.startswith('Revision-number: '):
+			state = 'prefix'
+		if state == 'prefix' and l == 'Revision-number: %s\n' % lower:
+			state = 'selection'
+		if not upper == 'HEAD' and state == 'selection' and l == 'Revision-number: %s\n' % upper:
+			break;
+
+		if state == 'header' or state == 'selection':
+			if state == 'selection': wroterev = True
+			sys.stdout.write(l)
+	return wroterev
+
+if __name__ == "__main__":
+	if not (len(sys.argv) in (3, 4, 5)):
+		print "usage: %s dump URL -rLOWER:UPPER" % sys.argv[0]
+		sys.exit(1)
+	if not sys.argv[1] == 'dump': raise NotImplementedError('only "dump" is supported.')
+	url = sys.argv[2]
+	r = ['0', 'HEAD']
+	if len(sys.argv) == 4 and sys.argv[3][0:2] == '-r':
+		r = sys.argv[3][2:].lstrip().split(':')
+	if not getrevlimit() is None: r[1] = getrevlimit()
+	if writedump(url, r[0], r[1]): ret = 0
+	else: ret = 1
+	sys.exit(ret)
\ No newline at end of file
-- 
1.7.9.5



Re: [PATCH] Add a svnrdump-simulator replaying a dump file for testing.

2012-07-23 Thread Florian Achleitner
On Monday 23 July 2012 07:59:21 Jonathan Nieder wrote:
> Florian Achleitner wrote:
> > To ease testing without depending on a reachable svn server, this
> > compact python script mimics parts of svnrdumps behaviour.
> 
> Thanks.  Mind if I forge your sign-off?

Ups. No problem, anyways I've added it locally, so here's the new version ..

