On Oct 22, 2009, at 3:24 AM, Michael Gliwinski wrote:

>
> On Tuesday 06 October 2009 01:00:39 Luke Kanies wrote:
>> On Oct 5, 2009, at 8:56 AM, Michael Gliwinski wrote:
>>> On Monday 28 September 2009 09:16:37 sam wrote:
>>>> On Sep 28, 3:02 pm, Luke Kanies <[email protected]> wrote:
>>>>> On Sep 27, 2009, at 5:41 PM, sam wrote:
>>>>>> Hello,
>>>>>> I am thinking of wiring filebucket to save to a git repo.
>>>>>> It would allow diffs and history; is that something worthwhile
>>>>>> for inclusion?
>>>>>
>>>>> I think it's a great idea.  In fact, I've written a prototype of it:
>>>>>
>>>>> http://gist.github.com/77811
>>>>>
>>>>> It's just a thin executable, without the Puppet integration, and it's
>>>>> all execs rather than library calls.  It also doesn't do any of the
>>>>> history, which you'd obviously want -- it's just the blobs, with no
>>>>> branches or anything.
>>>>>
>>>>> I'd love to have this supported.  How were you thinking of doing it?
>>>>
>>>> The filebucket would store the replaced files in a git repo on the
>>>> local host, using rubygem-git, and commit at the end of a puppet run.
>>>> A file would be placed into $GITROOT/$FULLPATH of the original file,
>>>> no symlinks:
>>>> filebucket { main: path => git://$gitpath }
>>>>
>>>> A centralized git server I suppose is nice, keeping a branch per
>>>> server (all lost on server renames). Would the best way be to keep a
>>>> git clone -l per server, then pull, add the file, then push back to
>>>> the branch?  That sounds like a bottleneck. If there is interest I
>>>> would prefer to keep it as a stage 2.
>>>>
>>>> The history/diffs would be something a person would run on the git
>>>> repo themselves; I don't have good ideas for integrating that part
>>>> into puppet or its usability. It's so much easier to use git to find
>>>> the rev you want, and you would need to add the file back to the
>>>> puppetmaster manually anyway, as the restored file would have been a
>>>> puppet template or a file resource (tidy aside).
>>>>
>>>> Did you expect more, or have a more thought-out idea?
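
(As a rough illustration of that local flow with rubygem-git -- just a
sketch, and the paths and commit message below are placeholders:)

    # Sketch only: copy replaced files into $GITROOT/$FULLPATH and make one
    # commit at the end of the run.  GITROOT here is an invented location.
    require 'git'        # rubygem-git
    require 'fileutils'

    GITROOT = '/var/lib/puppet/filebucket-git'

    def bucket_file(fullpath)
      dest = File.join(GITROOT, fullpath)
      FileUtils.mkdir_p(File.dirname(dest))
      FileUtils.cp(fullpath, dest)
    end

    repo = File.directory?(File.join(GITROOT, '.git')) ?
             Git.open(GITROOT) : Git.init(GITROOT)
    bucket_file('/etc/ssh/sshd_config')
    repo.add('.')                            # stage everything under GITROOT
    repo.commit("puppet run #{Time.now}")    # single commit per run

The push-back-to-a-central-branch step would sit on top of this, which is
where the bottleneck worry above comes in.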
>>>
>>> This is a very good idea in general.  I would, however, urge you to
>>> consider having a thin "glue" layer between the client code and the
>>> actual VCS (in this case git).  In other words, regardless of whether
>>> you call commands or use a library to talk to git, do it through an
>>> abstraction layer.  This way it would be possible for people to use
>>> different VC systems.
>>>
>>> In my company we use bazaar, for example, and I know some people on
>>> the list use subversion; it would be better to allow them to use
>>> their VCS of choice.
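
(Presumably something along these lines -- only a sketch, and the class
and method names here are invented:)

    # A thin glue layer: the filebucket code talks only to this interface,
    # and each VCS gets its own backend class behind it.
    class BucketBackend
      def save(path, content)
        raise NotImplementedError
      end

      def restore(checksum)
        raise NotImplementedError
      end
    end

    class GitBackend < BucketBackend
      def save(path, content)
        # exec or library calls into git go here
      end

      def restore(checksum)
        # ...
      end
    end

A bazaar or subversion backend would just be another BucketBackend
subclass, so the rest of the filebucket code never cares which VCS is
underneath.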
>>
>> I've thought about this a bit, and if it's possible I'd like to do so,
>> but I'm not sure it is actually possible.
>>
>> The problem is that git actually works really well as a content-
>> addressable file system -- you can provide content and get a checksum
>> back, and provide a checksum and get content back.  While other VCSes
>> do well at the basic interactions, git's design as a CAF first and VCS
>> second gives it additional functionality here.
>>
>> I'd like to be proven wrong, though.
>
> Sorry for the late response, was away for a while.
>
> Does it really matter in that case, though?  If the point is to store
> changed files in a VCS so that you get access to the history of changes,
> etc., it seems that humans would be the direct users of that feature.
> And humans probably wouldn't like using content-based addressing
> directly, so git's CAF seems like an implementation detail.
>
> I don't know enough about the implementation of git, so correct me if
> I'm wrong, but content-based addressing is mainly useful for storage
> optimization anyway, is it not?

Truthfully, I'm not exactly sure how much of a difference there is  
between directly treating git like a CAF and just exec'ing out to the  
various commands we'd have to run.

My goal is to avoid having to check out the current revision of the
repository anywhere, and instead to rely entirely on direct API-like
access to the repository -- creating blobs, storing them, and retrieving
them can all be done pretty easily without a checked-out working tree,
and I think the branch handling can all be done similarly.
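
Concretely, I mean something like this -- still only a sketch that execs
git, with made-up paths; commits and per-host branches would be layered
on top with commit-tree and update-ref:

    # Store and retrieve content as raw git blobs, with no working tree.
    # GIT_DIR is assumed to point at a bare repo, e.g. one created with
    # `git init --bare /var/lib/puppet/bucket.git` (path is hypothetical).
    require 'open3'

    GIT_DIR = '/var/lib/puppet/bucket.git'

    # Write content into the object database and get its checksum back.
    def save_blob(content)
      sha, status = Open3.capture2({'GIT_DIR' => GIT_DIR},
                                   'git', 'hash-object', '-w', '--stdin',
                                   :stdin_data => content)
      raise 'git hash-object failed' unless status.success?
      sha.strip
    end

    # Give it a checksum and get the original content back.
    def restore_blob(sha)
      out, status = Open3.capture2({'GIT_DIR' => GIT_DIR},
                                   'git', 'cat-file', 'blob', sha)
      raise "no such blob #{sha}" unless status.success?
      out
    end

    sha = save_blob(File.read('/etc/motd'))
    print restore_blob(sha)   # round-trips the original bytes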

The benefit of this is that you get considerable space savings -- 10x
or so from file compression, and probably more in the end because of
the lack of duplication.

But in the end, I'm interested in trying all of it; I just think git  
will end up being a more powerful solution overall.

-- 
Chase after truth like hell and you'll free yourself, even though
you never touch its coat-tails. -- Clarence Darrow
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com

