Re: [Gluster-users] Turning GlusterFS into something else (was Re: how well will this work)

2012-12-31 Thread Fred van Zwieten
An alternative to undelete is snapshotting. This also gives protection
against "logical data corruption" such as virus infections, data changes
caused by bugs, or simple human error. I am a huge proponent of solving
as many problems as possible with a single solution. And, afaik,
snapshotting is being worked on, or is at least on the roadmap.

Cheers,
Fred


Kind regards,

Fred van Zwieten
Enterprise Open Source Services Consultant
(absent on Fridays)

VX Company IT Services B.V.
T (035) 539 09 50 mobile (06) 41 68 28 48
F (035) 539 09 08
E fvzwie...@vxcompany.com
I www.vxcompany.com


On Mon, Dec 31, 2012 at 4:01 PM, Vijay Bellur  wrote:

> On 12/30/2012 09:42 PM, Stephan von Krawczynski wrote:
> > If I delete
> > something on a disk that is far from being full it is just plain dumb to
> > really erase this data from the disk. It won't help anyone. It will only hurt
> > you if you deleted it accidentally. Read my lips: free disk space is wasted
> > space, just like free mem is wasted mem.
> > And _that_ is the true reason for undelete. It won't hurt anybody, and will
> > help some. And since it is the true goal of a fs to organise data on a drive
> > it is most obvious that "undelete" (you may call it lazy-delete) is a very
> > basic fs feature and _not_ an add-on patched onto it.
>
> Have you explored xlators/features/trash in the source tree? Does that
> fit your requirements? If it does, cleaning up the code in the trash
> translator and exposing undelete (via the trash xlator) as a tunable
> through the gluster volume set interface is not complex.
>
> -Vijay
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>

Re: [Gluster-users] Turning GlusterFS into something else (was Re: how well will this work)

2012-12-31 Thread Vijay Bellur

On 12/30/2012 09:42 PM, Stephan von Krawczynski wrote:

> If I delete
> something on a disk that is far from being full it is just plain dumb to
> really erase this data from the disk. It won't help anyone. It will only hurt
> you if you deleted it accidentally. Read my lips: free disk space is wasted
> space, just like free mem is wasted mem.
> And _that_ is the true reason for undelete. It won't hurt anybody, and will
> help some. And since it is the true goal of a fs to organise data on a drive
> it is most obvious that "undelete" (you may call it lazy-delete) is a very
> basic fs feature and _not_ an add-on patched onto it.


Have you explored xlators/features/trash in the source tree? Does that
fit your requirements? If it does, cleaning up the code in the trash
translator and exposing undelete (via the trash xlator) as a tunable
through the gluster volume set interface is not complex.


-Vijay





Re: [Gluster-users] Turning GlusterFS into something else (was Re: how well will this work)

2012-12-31 Thread 符永涛
I agree with
> 3) Implement true undelete feature. Make delete a move to a deleted-files
> area.
If some people want it and others don't, we can write a configurable
translator to accomplish this and disable it by default.
Some other distributed file systems, like MooseFS, also provide such a
feature: deleted files are kept in the trash bin for a configured
amount of time before they are really deleted.
We have this requirement because some data is just too important; we
can't afford to have it deleted accidentally.
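
A minimal Python sketch of the retention idea described above, not GlusterFS
or MooseFS code: "delete" becomes a move into a trash directory, and a
separate purge step really removes entries once a configured retention time
has passed. The class name TrashBin, the ".trash" directory name and the
timestamp-prefix naming scheme are assumptions made up for this illustration.

```python
import os
import shutil
import tempfile
import time


class TrashBin:
    """Hypothetical trash area: "deleted" files are kept for a retention period."""

    def __init__(self, trash_dir, retention_seconds):
        self.trash_dir = trash_dir
        self.retention_seconds = retention_seconds
        os.makedirs(trash_dir, exist_ok=True)

    def delete(self, path, now=None):
        """Move path into the trash instead of unlinking it; return its new location."""
        now = time.time() if now is None else now
        # The deletion time is stored as a filename prefix, so expiry needs no
        # separate index. Two same-named files deleted within the same second
        # would collide; a real implementation needs a unique suffix.
        target = os.path.join(self.trash_dir, f"{int(now)}_{os.path.basename(path)}")
        shutil.move(path, target)
        return target

    def purge_expired(self, now=None):
        """Really remove every trashed file whose retention period has passed."""
        now = time.time() if now is None else now
        removed = []
        for name in sorted(os.listdir(self.trash_dir)):
            deleted_at = int(name.split("_", 1)[0])
            if now - deleted_at >= self.retention_seconds:
                os.remove(os.path.join(self.trash_dir, name))
                removed.append(name)
        return removed


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    victim = os.path.join(root, "report.txt")
    with open(victim, "w") as f:
        f.write("important data\n")
    trash = TrashBin(os.path.join(root, ".trash"), retention_seconds=24 * 3600)
    kept_at = trash.delete(victim)   # "delete" is only a move
    shutil.move(kept_at, victim)     # undelete: move the file back
    print(os.path.exists(victim))
```

Keeping the purge step separate from the delete step is what makes the
feature easy to switch off: with the trash disabled, delete falls back to a
plain unlink and nothing retains a list of removed files.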

2012/12/31, Whit Blauvelt :
> On Sun, Dec 30, 2012 at 05:12:04PM +0100, Stephan von Krawczynski wrote:
>
>> If I delete
>> something on a disk that is far from being full it is just plain dumb to
>> really erase this data from the disk. It won't help anyone. It will only hurt
>> you if you deleted it accidentally. Read my lips: free disk space is wasted
>> space, just like free mem is wasted mem.
>> And _that_ is the true reason for undelete. It won't hurt anybody, and will
>> help some. And since it is the true goal of a fs to organise data on a drive
>> it is most obvious that "undelete" (you may call it lazy-delete) is a very
>> basic fs feature and _not_ an add-on patched onto it.
>
> Stephan,
>
> It's good to have a strong debater here like yourself. But you overlooked
> Jeff's citing "compliance reasons." I don't know what sort of data you deal
> in. But if it's anything financial, at all, there is serious jeopardy if
> deleted files aren't really deleted. Much of it has both regulatory and
> contractual requirements, plus potential legal liability.
>
> Yeah, I know parts of deleted files still often linger on the disk anyway.
> But maintaining an index to those files, which would be what your request
> would require, would put many of us in violation of these requirements in a
> way that that simply does not. If a system is compromised, it's going to be
> far easier for the compromiser to find deleted data if there's an available
> index to it. It's far more work, and a far more obvious intrusion, if they
> have to go sector-by-sector through the storage.
>
> Best,
> Whit


-- 
符永涛

Re: [Gluster-users] Turning GlusterFS into something else (was Re: how well will this work)

2012-12-31 Thread Fred van Zwieten
Stephan,

Why don't you simply file a feature request? Something like a volume-wide
switch:

gluster volume set VOLNAME delete-policy=VALUE

where VALUE is one of:

"wastebin": this is what you want. There must be another setting somewhere
that specifies how much free space must be maintained, i.e. when the wastebin
will be emptied FIFO.
"index": this is the default and current behaviour.
"highwater": this is where the blocks that held the file's parts are
overwritten with zeroes or whatever.
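
To make the three proposed values concrete, here is a small Python sketch of
how they could behave on a local directory. It is only an illustration under
stated assumptions: delete-policy is the option proposed in this mail, not an
existing gluster setting, and the names DeletePolicy, delete and empty_fifo as
well as the sequence-number naming are invented for the example.

```python
import os
import shutil
import tempfile
from enum import Enum


class DeletePolicy(Enum):
    WASTEBIN = "wastebin"    # move to a wastebin that is emptied oldest-first
    INDEX = "index"          # remove the directory entry only (plain unlink)
    HIGHWATER = "highwater"  # overwrite the contents with zeroes, then unlink


def _next_seq(wastebin_dir):
    # A monotonically increasing prefix keeps entries in deletion order.
    seqs = [int(name.split("_", 1)[0]) for name in os.listdir(wastebin_dir)]
    return max(seqs, default=-1) + 1


def delete(path, policy, wastebin_dir=None):
    """Delete path according to the chosen (hypothetical) policy."""
    if policy is DeletePolicy.WASTEBIN:
        if wastebin_dir is None:
            raise ValueError("the wastebin policy needs a wastebin directory")
        os.makedirs(wastebin_dir, exist_ok=True)
        name = f"{_next_seq(wastebin_dir):012d}_{os.path.basename(path)}"
        shutil.move(path, os.path.join(wastebin_dir, name))
    elif policy is DeletePolicy.HIGHWATER:
        # Overwriting in place does not guarantee physical erasure on
        # copy-on-write or journaling filesystems, or on SSDs.
        remaining = os.path.getsize(path)
        with open(path, "r+b") as f:
            while remaining > 0:
                chunk = min(remaining, 1 << 20)
                f.write(b"\0" * chunk)
                remaining -= chunk
            f.flush()
            os.fsync(f.fileno())
        os.remove(path)
    else:
        os.remove(path)


def empty_fifo(wastebin_dir, min_free_bytes, free_bytes=None):
    """Remove the oldest wastebin entries until min_free_bytes are available."""
    if free_bytes is None:
        free_bytes = shutil.disk_usage(wastebin_dir).free
    removed = []
    for name in sorted(os.listdir(wastebin_dir)):
        if free_bytes >= min_free_bytes:
            break
        entry = os.path.join(wastebin_dir, name)
        # File size only approximates the space actually given back.
        free_bytes += os.path.getsize(entry)
        os.remove(entry)
        removed.append(name)
    return removed


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    sample = os.path.join(root, "old.log")
    with open(sample, "wb") as f:
        f.write(b"x" * 4096)
    delete(sample, DeletePolicy.WASTEBIN, os.path.join(root, "wastebin"))
    print(os.listdir(os.path.join(root, "wastebin")))
```

The free_bytes parameter exists only so the emptying logic can be exercised
without filling a disk; left unset, the sketch asks the filesystem for its
current free space.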

Better yet, as you're a C programmer, start contributing.

Cheers,
Fred

On Mon, Dec 31, 2012 at 2:31 PM, Whit Blauvelt wrote:

> On Sun, Dec 30, 2012 at 05:12:04PM +0100, Stephan von Krawczynski wrote:
>
> > If I delete
> > something on a disk that is far from being full it is just plain dumb to
> > really erase this data from the disk. It won't help anyone. It will only hurt
> > you if you deleted it accidentally. Read my lips: free disk space is wasted
> > space, just like free mem is wasted mem.
> > And _that_ is the true reason for undelete. It won't hurt anybody, and will
> > help some. And since it is the true goal of a fs to organise data on a drive
> > it is most obvious that "undelete" (you may call it lazy-delete) is a very
> > basic fs feature and _not_ an add-on patched onto it.
>
> Stephan,
>
> It's good to have a strong debater here like yourself. But you overlooked
> Jeff's citing "compliance reasons." I don't know what sort of data you deal
> in. But if it's anything financial, at all, there is serious jeopardy if
> deleted files aren't really deleted. Much of it has both regulatory and
> contractual requirements, plus potential legal liability.
>
> Yeah, I know parts of deleted files still often linger on the disk anyway.
> But maintaining an index to those files, which would be what your request
> would require, would put many of us in violation of these requirements in a
> way that that simply does not. If a system is compromised, it's going to be
> far easier for the compromiser to find deleted data if there's an available
> index to it. It's far more work, and a far more obvious intrusion, if they
> have to go sector-by-sector through the storage.
>
> Best,
> Whit

Re: [Gluster-users] Turning GlusterFS into something else (was Re: how well will this work)

2012-12-31 Thread Whit Blauvelt
On Sun, Dec 30, 2012 at 05:12:04PM +0100, Stephan von Krawczynski wrote:

> If I delete
> something on a disk that is far from being full it is just plain dumb to
> really erase this data from the disk. It won't help anyone. It will only hurt
> you if you deleted it accidentally. Read my lips: free disk space is wasted
> space, just like free mem is wasted mem.
> And _that_ is the true reason for undelete. It won't hurt anybody, and will
> help some. And since it is the true goal of a fs to organise data on a drive
> it is most obvious that "undelete" (you may call it lazy-delete) is a very
> basic fs feature and _not_ an add-on patched onto it.

Stephan,

It's good to have a strong debater here like yourself. But you overlooked
Jeff's citing "compliance reasons." I don't know what sort of data you deal
in. But if it's anything financial, at all, there is serious jeopardy if
deleted files aren't really deleted. Much of it has both regulatory and
contractual requirements, plus potential legal liability. 

Yeah, I know parts of deleted files still often linger on the disk anyway.
But maintaining an index to those files, which would be what your request
would require, would put many of us in violation of these requirements in a
way that that simply does not. If a system is compromised, it's going to be
far easier for the compromiser to find deleted data if there's an available
index to it. It's far more work, and a far more obvious intrusion, if they
have to go sector-by-sector through the storage. 

Best,
Whit


Re: [Gluster-users] Turning GlusterFS into something else (was Re: how well will this work)

2012-12-30 Thread Stephan von Krawczynski
On Sun, 30 Dec 2012 12:29:53 -0800
Joe Julian  wrote:

> Here's where you're getting labeled as a Troll. You have a tendency to do
> this on just about every mailing list except LKML (not sure why they get 
> your love over others, but to each their own).

There is one basic difference between LKML and almost every other
"project" where you have probably seen me posting. The kernel project has _one_
head who has proven that he makes real _management_ decisions in his project.
Sometimes they look rude, sometimes they are a bit late, very often they are
just-in-time or even early. And if you read the archives you will probably
notice one or two occasions where I requested a _decision_ on fundamental
strategies. You probably remember me being laughed at when I suggested making
CPUs hot-pluggable years ago. Nobody thought of the implications back then.
Nowadays CPU hotplug is in every ARM-driven multicore Android phone. I am not
Jesus. Only sometimes I can read the writing on the wall a bit earlier than
others do, that's all.

> You come in, spout some 
> diatribe claiming how you know better than everybody else to the point 
> of being told that "this is the last post I'm going to make on this 
> subject". You don't work with the developers, you antagonize them. I 
> still don't see the features you're asking for on the wiki, nor in bugzilla.
> 
> You obviously have some knowledge of C judging by your analysis of 
> issues in LKML and patch offers relating to the same. Why not offer your 
> abilities in a constructive way by using the tools we make publicly 
> available?

From writing lots of lines of code in C and quite a bunch of other languages
for about the last 30 years, I can tell you that the biggest effect of the
things I did came not from released code but from exactly this kind of
discussion. One of the fundamental problems in open source is that quite a few
good projects die because nobody has the guts to say that the basic direction
needs correction. I know that most people do not want to hear that; nevertheless
someone has to stand up and say "this is sh*t" if it really is. And if nobody
else does, I do. At the end of the day most people may hate me for that, but
if the project got better, I don't give a damn. I am no team player, I believe
in "one man, one vision".

-- 
Regards,
Stephan




Re: [Gluster-users] Turning GlusterFS into something else (was Re: how well will this work)

2012-12-30 Thread Joe Julian


On 12/30/2012 08:12 AM, Stephan von Krawczynski wrote:

> On Sun, 30 Dec 2012 10:13:52 -0500
> Jeff Darcy wrote:
>
> > On 12/27/12 3:36 PM, Stephan von Krawczynski wrote:
> > > And the same goes for glusterfs. It _could_ be the greatest fs on earth, but
> > > only if you accept:
> > >
> > > 1) Throw away all non-linux code. Because this war is over since long.
> >
> > Sorry, but we do have non-Linux users already and won't abandon them.  We
> > wouldn't save all that much time even if we did, so it just doesn't make sense.
>
> Jeff, really, if you argue, please state your argument openly. You don't want
> this point because its next logical step would be my point 2), the kernel
> implementation. As long as you hold up dead boxes like Oracle-owned Solaris
> you have a good point in not doing 2). Success needs focus. If you try to
> be everybody's darling you may well end up being dropped by everybody because
> you are not good enough.
>
> > > 2) Make a kernel based client/server implementation. Because it is the only
> > > way to acceptable performance.
> >
> > That's an easy thing to state, but a bit harder to prove.
>
> Come on, how old are you? Can you remember userspace-nfs? In case you cannot:
> it had just about the same problems glusterfs has today, and guess why it is
> gone...

That's not proof. Moreover, it is far enough in the past to invalidate it
as an example. Nothing's the same today as it was back then.

> > [a lot of bad examples deleted]
>
> Really, you cannot prove you are right by naming some examples that are even
> more horrible.

... he says after pointing to userspace-nfs as an example.

> > > 3) Implement true undelete feature. Make delete a move to a deleted-files area.
> >
> > Some people want that, some people do not.
>
> Haha! A good argument for a config parameter :-) - I would have suggested that
> anyway.
>
> > Some are even precluded from using
> > it e.g. for compliance reasons.  It's hardly a must-have feature.  In any case,
> > it already exists - called "landfill" I believe, though I'm not sure of its
> > support status or configurability via the command line.  If it didn't exist, it
> > would still be easy to create - which wouldn't be the case at all if we
> > followed your advice to put this in the kernel.
>
> Now I wonder how you argue about this. Let me bring in some analogy you will
> probably hate. Linux MM uses free memory as cache for just about anything
> imaginable. This drives W*indows users crazy when they use Android. They
> always try to install the latest "kill-all-not-needed-apps" tool so that they
> can read a big number in the free space statistics. They do not understand
> that free memory is in fact wasted memory. And the same thing goes for disk
> space. If I delete
> something on a disk that is far from being full it is just plain dumb to
> really erase this data from the disk. It won't help anyone. It will only hurt
> you if you deleted it accidentally. Read my lips: free disk space is wasted
> space, just like free mem is wasted mem.
> And _that_ is the true reason for undelete. It won't hurt anybody, and will
> help some. And since it is the true goal of a fs to organise data on a drive
> it is most obvious that "undelete" (you may call it lazy-delete) is a very
> basic fs feature and _not_ an add-on patched onto it.

It's such a very basic fs feature that it's been around since minix...
oh, wait, no it hasn't... only now are some filesystems that use
copy-on-write starting to become stable. Like Jeff said, it's already in
there (nobody was using it so it hasn't gotten a lot of attention in the
last couple of iterations).

> > If it's a priority for you and
> > existing facilities do not suffice, then I suggest adding a feature page on the
> > wiki and/or an enhancement-request bug report, so that we can incorporate that
> > feedback into our planning process.  Thank you for your help making GlusterFS
> > better.
>
> [politics end]
> Jeff, this is really not a technical question we are talking about. It's more a
> question of a management decision. If Red Hat wants a truly successful
> glusterfs someone has to decide to follow my steps. If the stuff was only
> bought because it looked interesting and no one else should use its true
> potential, well then go ahead.

Here's where you're getting labeled as a Troll. You have a tendency to do
this on just about every mailing list except LKML (not sure why they get
your love over others, but to each their own). You come in, spout some
diatribe claiming how you know better than everybody else to the point
of being told that "this is the last post I'm going to make on this
subject". You don't work with the developers, you antagonize them. I
still don't see the features you're asking for on the wiki, nor in bugzilla.

You obviously have some knowledge of C judging by your analysis of
issues in LKML and patch offers relating to the same. Why not offer your
abilities in a constructive way by using the tools we make publicly
available?


Re: [Gluster-users] Turning GlusterFS into something else (was Re: how well will this work)

2012-12-30 Thread Stephan von Krawczynski
On Sun, 30 Dec 2012 10:13:52 -0500
Jeff Darcy  wrote:

> On 12/27/12 3:36 PM, Stephan von Krawczynski wrote:
> > And the same goes for glusterfs. It _could_ be the greatest fs on earth, but
> > only if you accept:
> > 
> > 1) Throw away all non-linux code. Because this war is over since long.
> 
> Sorry, but we do have non-Linux users already and won't abandon them.  We
> wouldn't save all that much time even if we did, so it just doesn't make sense.

Jeff, really, if you argue, please state your argument openly. You don't want
this point because its next logical step would be my point 2), the kernel
implementation. As long as you hold up dead boxes like Oracle-owned Solaris
you have a good point in not doing 2). Success needs focus. If you try to
be everybody's darling you may well end up being dropped by everybody because
you are not good enough.
 
> > 2) Make a kernel based client/server implementation. Because it is the only
> > way to acceptable performance.
> 
> That's an easy thing to state, but a bit harder to prove.

Come on, how old are you? Can you remember userspace-nfs? In case you cannot:
it had just about the same problems glusterfs has today, and guess why it is
gone...

> [a lot of bad examples deleted]

Really, you cannot prove you are right by naming some examples that are even
more horrible. 

> > 3) Implement true undelete feature. Make delete a move to a deleted-files 
> > area.
> 
> Some people want that, some people do not.

Haha! A good argument for a config parameter :-) - I would have suggested that
anyway.

> Some are even precluded from using
> it e.g. for compliance reasons.  It's hardly a must-have feature.  In any case,
> it already exists - called "landfill" I believe, though I'm not sure of its
> support status or configurability via the command line.  If it didn't exist, it
> would still be easy to create - which wouldn't be the case at all if we
> followed your advice to put this in the kernel.

Now I wonder how you argue about this. Let me bring in some analogy you will
probably hate. Linux MM uses free memory as cache for just about anything
imaginable. This drives W*indows users crazy when they use Android. They
always try to install the latest "kill-all-not-needed-apps" tool so that they
can read a big number in the free space statistics. They do not understand
that free memory is in fact wasted memory. And the same thing goes for disk
space. If I delete
something on a disk that is far from being full it is just plain dumb to
really erase this data from the disk. It won't help anyone. It will only hurt
you if you deleted it accidentally. Read my lips: free disk space is wasted
space, just like free mem is wasted mem.
And _that_ is the true reason for undelete. It won't hurt anybody, and will
help some. And since it is the true goal of a fs to organise data on a drive
it is most obvious that "undelete" (you may call it lazy-delete) is a very
basic fs feature and _not_ an add-on patched onto it.

> If it's a priority for you and
> existing facilities do not suffice, then I suggest adding a feature page on the
> wiki and/or an enhancement-request bug report, so that we can incorporate that
> feedback into our planning process.  Thank you for your help making GlusterFS
> better.

[politics end] 
Jeff, this is really not a technical question we are talking about. It's more a
question of a management decision. If Red Hat wants a truly successful
glusterfs someone has to decide to follow my steps. If the stuff was only
bought because it looked interesting and no one else should use its true
potential, well then go ahead.

-- 
Regards,
Stephan



[Gluster-users] Turning GlusterFS into something else (was Re: how well will this work)

2012-12-30 Thread Jeff Darcy
On 12/27/12 3:36 PM, Stephan von Krawczynski wrote:
> And the same goes for glusterfs. It _could_ be the greatest fs on earth, but
> only if you accept:
> 
> 1) Throw away all non-linux code. Because this war is over since long.

Sorry, but we do have non-Linux users already and won't abandon them.  We
wouldn't save all that much time even if we did, so it just doesn't make sense.

> 2) Make a kernel based client/server implementation. Because it is the only
> way to acceptable performance.

That's an easy thing to state, but a bit harder to prove.  Even Ceph, which
makes a big deal of having a kernel client, has a user-space server.  HDFS is
way out in Java-land, as are many non-filesystem (e.g. object/NoSQL) data
stores, and people seem OK with that.  PLFS is even using FUSE, and the people
who run some of the biggest systems on the planet have reported significant
improvements over fully-in-kernel Lustre for demanding real-world workloads.

Thus, I don't think the case for putting things in the kernel is fully made.
We'd be giving up too much in terms of flexibility and development velocity, and
for what?  Why do you think Ceph is taking so long to mature?  Are you
volunteering to implement complex new features such as multi-tenancy or
deduplication in the kernel?  I'm not, and I've been a kernel developer for
over twenty years.  A single task-specific translator can often provide greater
gains than putting everything in the kernel, for far less effort.  It's hard
enough to get people to think that way when the code's out in user space (even
in Python); in the kernel it simply wouldn't happen.  That would put us in a
me-too race with all the other distributed filesystems, instead of using
modularity and open source to let people create the filesystems that they each
need.  That's our advantage, and we intend to keep it.

Would a full in-kernel implementation help with latency, for certain workloads
that aren't already using the qemu interface (which reduces it still further)?
 Yes.  Would it help with bandwidth/scalability across many clients and
servers?  Not really.  Would it require extreme sacrifices in just about every
other area to address one need that's already well served elsewhere?
Absolutely.  It's fine that you want something else, but GlusterFS is not going
to be that.  Sorry.  If you want some help evaluating alternatives, e.g. with
tips for how to evaluate their performance or correctness, please let me know
(off list) and I'll do what I can.

> 3) Implement true undelete feature. Make delete a move to a deleted-files 
> area.

Some people want that, some people do not.  Some are even precluded from using
it e.g. for compliance reasons.  It's hardly a must-have feature.  In any case,
it already exists - called "landfill" I believe, though I'm not sure of its
support status or configurability via the command line.  If it didn't exist, it
would still be easy to create - which wouldn't be the case at all if we
followed your advice to put this in the kernel.  If it's a priority for you and
existing facilities do not suffice, then I suggest adding a feature page on the
wiki and/or an enhancement-request bug report, so that we can incorporate that
feedback into our planning process.  Thank you for your help making GlusterFS
better.