If we were to use GIT for this it wouldn't be in that way; each app
server would write to the central repo in order to keep multiple
clients synchronized while each is making changes and wanting to see
the changes of others.
Perhaps it would be better to discuss this for SVN and not for GIT at
all, and just keep it simpler.
As for the revision number: the whole point would be to have revision
control and do things like allow changes on other servers, or other
content records, while a specific revision is deployed to production
(just as is the case for development efforts).
The initial point that I mentioned was to provide a central repository
that is a good alternative to a database, and that may involve
branching and merging for more advanced users, but that wouldn't be
the main point.
-David
On Jul 1, 2009, at 5:08 PM, Adam Heath wrote:
David E Jones wrote:
This is an interesting overview and while I'm not sure why I hadn't
thought along these lines before, at least it's through my thick skull
now...
I asked Adam about how this would deploy on multiple servers with the
stuff in the filesystem versus the database, and I think what you've
written Ean is the answer.
Why not treat a source repo (either plain SVN or something more exotic
like GIT) like the database? Each app server would read from and write
to the source repo just like it would a database record. If SVN or GIT
support 2-phase commits we could probably even do write operations in
a transaction that includes connections to both data stores.
For performance reasons you'd want to cache content from the source repo
just like you would content from a relational database. If it's really
too terribly slow even doing that (ie reading directly from the repo and
caching) you could cache it locally in the app server's file system,
though it would probably be best to never write directly to the local
filesystem and you'd want some sort of timeout or other logic to
invalidate the file system cache just like you'd do with the in memory
cache (actually UtilCache supports this sort of thing, though not with
straight files in the filesystem, just a sort of mini-database for local
filesystem caching of data).
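The timeout-based invalidation described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not how UtilCache works: the class name RepoContentCache, the repoLoader function (standing in for whatever actually reads from SVN or GIT), and the injected clock are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical read-through cache: entries expire after a fixed time-to-live. */
class RepoContentCache {
    /** One cached value plus the time it was loaded from the repo. */
    private static final class Entry {
        final String content;
        final long loadedAtMillis;

        Entry(String content, long loadedAtMillis) {
            this.content = content;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    // Assumption: fetches content for a repo location; the real call would hit SVN/GIT.
    private final Function<String, String> repoLoader;
    private final long ttlMillis;
    // Injected clock so expiry can be exercised without sleeping.
    private final LongSupplier clock;

    RepoContentCache(Function<String, String> repoLoader, long ttlMillis, LongSupplier clock) {
        this.repoLoader = repoLoader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns cached content, reloading from the repo when missing or older than the TTL. */
    String get(String location) {
        long now = clock.getAsLong();
        Entry entry = entries.get(location);
        if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
            entry = new Entry(repoLoader.apply(location), now);
            entries.put(location, entry);
        }
        return entry.content;
    }

    /** Drops one entry, e.g. after this server writes a change to the repo. */
    void invalidate(String location) {
        entries.remove(location);
    }
}
```

In production the clock would simply be System::currentTimeMillis. Writes would go to the repo first and then call invalidate, so the local copy is never the thing being written to.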
Anyway, is this something you guys have considered for WebSlinger?
I've got a commons-vfs filesystem implementation that uses git
plumbing to store content. Every single mutation causes a new 'tree'
hash to be created in git. It uses jgit to do this. However, we
don't currently use it; it was more of a quick test. One major
problem with jgit is that it reads the entire file into memory, which
will not work with large files.
I have not tested whether this interoperates with other git porcelain.
However, all that is moot. GIT is not a shared-write system. Each
instance is completely local. You have your own copy of the repo, per
install. You mutate it however you like. Then either you push to another
machine/repo, or the other machine pulls from you. This could be made
to work, doing some kind of anonymous ssh pulse thing, but it'd be a
heavy system integration, which OFBiz tends not to do.
For the OFBiz Content side of things you could pretty easily have a
DataResourceType for data in a source repo (ie instead of LOCAL_FILE
something like REPOSITORY_FILE). On the DataResource entity the
objectInfo field would have the URL/location of the resource (ie like
the SVN/HTTP URL), and we could add a field like "revisionNumber" to
specify which revision we want or null to get the head revision (I was
thinking we could use the existing ContentRevision/Item entities for
this, but looking at them it seems they wouldn't work so well and are
really meant for a revision control built on top of the Content and
DataResource entities, and not one that would describe revision
information pointed to by them). The "revisionNumber" could also go on
the Content entity so that we could have multiple Content records with
different revision numbers pointing to the same DataResource records and
reduce how many DataResource records we would require. That would also
better fit how Content and DataResource are meant to work together, but
on the other hand might be somewhat confusing.
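To make the second variant concrete, here is a rough sketch of "revisionNumber" living on Content while several Content records share one DataResource. These are plain illustrative classes, not the real OFBiz entities: RepoDataResource, RepoContent, the example URL, and the "HEAD"/"r42" labels are all assumptions, and REPOSITORY_FILE is only the type name proposed above.

```java
/** Hypothetical stand-in for a DataResource record that points into a source repo. */
final class RepoDataResource {
    final String dataResourceTypeId; // e.g. the proposed "REPOSITORY_FILE"
    final String objectInfo;         // URL/location of the resource in the repo

    RepoDataResource(String dataResourceTypeId, String objectInfo) {
        this.dataResourceTypeId = dataResourceTypeId;
        this.objectInfo = objectInfo;
    }
}

/** Hypothetical stand-in for a Content record carrying the revision to use. */
final class RepoContent {
    final String contentId;
    final RepoDataResource dataResource;
    final Long revisionNumber; // null means "use the head revision"

    RepoContent(String contentId, RepoDataResource dataResource, Long revisionNumber) {
        this.contentId = contentId;
        this.dataResource = dataResource;
        this.revisionNumber = revisionNumber;
    }

    /** Human-readable revision: "HEAD" when no revision is pinned, else "r<number>". */
    String revisionLabel() {
        return revisionNumber == null ? "HEAD" : "r" + revisionNumber;
    }
}
```

A production Content record could then pin a revision, for example new RepoContent("PROD_PAGE", resource, 42L), while an editing record for the same resource passes null and follows head. This only fits a system with linear revision numbers such as SVN.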
No, no, you can't use a revisionNumber. They don't exist.
Distributed systems change that completely.
Thoughts anyone?
Oh, one more thing... I know there are some Java libraries for SVN, and
there probably are some for GIT... has anyone played with these?
I've looked at the documentation for svn/java; I've actually used
jgit (however, it's been a few years).