Re: [Gluster-devel] Languages (was Re: Proposal for GlusterD-2.0)

2014-09-07 Thread Krishnan Parthasarathi
[Apologies up front for questionable posting etiquette]

Two characteristics of a language (tool chain) are important to me, especially
when you spend a good part of your time debugging failures/bugs.

- Analysing core files.
- Ability to reason about space consumption. This becomes important in
  the case of garbage collected languages.

I have written a few toy programs in Go and have been following the language
lately. Some of its features, like channels and goroutines, catch my attention
as we are aspiring to build reactive and scalable services. Its lack of
type inference and inheritance worries me a little. But I shouldn't be
complaining when our default choice has been C thus far ;)

~KP

- Original Message -
> 
> Digging deeper into Go, I see there is a fascinating discussion in the
> language communities comparing Go with C++.
> 
> Go has no..
> - classes (no inheritance), though it has interfaces (sets of methods) which
> remind me of things like gluster's struct xlator_fops {}
> - polymorphism
> - pointer arithmetic
> - generic programming
> - etc.
> 
> Here is a comparison of C++ with Go from Rob Pike himself (a Go author).
> 
> http://commandcenter.blogspot.com/2012/06/less-is-exponentially-more.html
> 
> And here are a few counter arguments.
> 
> http://lambda-the-ultimate.org/node/4554
> 
> Preference for Go seems to come down to how deeply you prefer the C++ object
> oriented way of doing things (as Pike calls it, the "type-centric" focus on
> classes). If that's your cup of tea, you may find Go a letdown or step
> backwards. Pike implies that coders invest a lot of time to master those
> techniques and are reluctant to ditch those skills.
> 
> But if you are a C or python programmer, you may see Go as a way to have your
> cake (modern stripped down language with lists, maps, packages, interfaces,
> no #includes) and eat it too (it compiles to binary, no VM).
> 
> As gluster is not beholden in any way to legacy C++, Go seems like a great
> fit. I'm looking forward to giving it a spin :)
> 
> - Original Message -
> > From: "Dan Lambright" 
> > To: "Jeff Darcy" 
> > Cc: "Justin Clift" , "Gluster Devel"
> > 
> > Sent: Friday, September 5, 2014 5:32:05 PM
> > Subject: Re: [Gluster-devel] Languages (was Re: Proposal for GlusterD-2.0)
> > 
> > One reason to use C++ could be to build components that we wish to share
> > with Ceph (not that I know of any at this time). Also, C++11 has improved
> > the language.
> > But the more I hear about it, the more interesting Go sounds..
> > 
> > - Original Message -
> > > From: "Jeff Darcy" 
> > > To: "Justin Clift" 
> > > Cc: "Gluster Devel" 
> > > Sent: Friday, September 5, 2014 11:44:35 AM
> > > Subject: [Gluster-devel] Languages (was Re: Proposal for GlusterD-2.0)
> > > 
> > > > Does this mean we'll need to learn Go as well as C and Python?
> > > 
> > > As KP points out, the fact that consul is written in Go doesn't mean our
> > > code needs to be ... unless we need to contribute code upstream e.g. to
> > > add new features.  Ditto for etcd also being written in Go, ZooKeeper
> > > being written in Java, and so on.  It's probably more of an issue that
> > > these all require integration into our build/test environments.  At
> > > least Go, unlike Java, doesn't require any new *run time* support.
> > > Python kind of sits in between - it does require runtime support, but
> > > it's much less resource-intensive and onerous than Java (no GC-tuning
> > > hell).  Between that and the fact that it's almost always present
> > > already, it just doesn't seem to provoke the same kind of allergic
> > > reaction that Java does.
> > > 
> > > However, this is as good a time as any to think about what languages
> > > we're going to use for the project going forward.  While there are many
> > > good reasons for our I/O path to remain in Plain Old C (yes I'm
> > > deliberately avoiding the C++ issue), many of those reasons apply only
> > > weakly to other parts of the code - not only management code, but also
> > > "offline" processes like self heal and rebalancing.  Some people might
> > > already be aware that I've used Python for the reconciliation component
> > > of NSR, for example, and that version is in almost every way better than
> > > the C version it replaces.  When we need to interface with code written
> > > in other languages, or even interact with communities where other
> > > languages are spoken more fluently than C, it's pretty natural to
> > > consider using those languages ourselves.  Let's look at some of the
> > > alternatives.
> > > 
> > >  * C++
> > >    Code is highly compatible with C, programming styles and idioms less
> > >    so.  Not prominent in most areas we care about.
> > > 
> > >  * Java
> > >    The "old standard" for a lot of distributed systems - e.g.  the
> > >    entire Hadoop universe, Cassandra, etc.  Also a great burden as
> > >    discussed previously.
> > > 
> > >  * Go
> > >Definitely the "

Re: [Gluster-devel] [Gluster-users] Proposal for GlusterD-2.0

2014-09-07 Thread Krishnan Parthasarathi


- Original Message -
> > As part of the first phase, we aim to delegate the distributed
> > configuration store. We are exploring consul [1] as a replacement for the
> > existing distributed configuration store (sum total of
> > /var/lib/glusterd/* across all nodes). Consul provides a distributed
> > configuration store which is consistent and partition tolerant. By moving
> > all Gluster related configuration information into consul we could avoid
> > split-brain situations.
> 
> Overall, I like the idea.  But I think you knew that.  ;)

Thanks. I am glad you like it :-)
> 
> Is the idea to run consul on all nodes as we do with glusterd, or to run
> it only on a few nodes (similar to Ceph's mon cluster) and then use them
> to coordinate membership etc. for the rest?

It is not set in stone, but I think we should have consul running only on
a subset of the nodes in the cluster, similar to Ceph's mon cluster approach.

~KP

> ___
> Gluster-users mailing list
> gluster-us...@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Brick replace

2014-09-07 Thread Emmanuel Dreyfus
Emmanuel Dreyfus  wrote:

> I am trying to get tests/basic/pump.t to pass on NetBSD, but after a few
> experiments, it seems the brick replace functionality is just broken.

I found the problem(s).

First, there is a hardcoded /bin/umount path in glusterd, which works on
Linux but not on other systems. Easy to fix: I introduce _PATH_UMOUNT,
defined per-OS in compat.h.

Second, the feature seems to really rely on lazy unmount, a Linux-only
feature. On NetBSD, I have to unmount the maintenance client without -l
(lazy), and it fails with EBUSY. I can just skip the unmount operation and
proceed, which lets brick-replace work. With that change, NetBSD passes
tests/basic/pump.t.
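
To make the two portability points concrete, here is a minimal Python sketch
of the idea (the real fix lives in glusterd's C code and compat.h). The table,
the function name and the non-Linux paths are assumptions for illustration
only; the sketch just shows a per-OS umount path plus a lazy flag that is
added on Linux alone.

```python
import sys

# Hypothetical per-OS table, playing the role of _PATH_UMOUNT in compat.h.
# The paths are assumptions for illustration; verify them on the target OS.
UMOUNT_PATHS = {
    "linux": "/bin/umount",
    "netbsd": "/sbin/umount",
}
FALLBACK_UMOUNT = "/sbin/umount"  # assumed default for other systems


def umount_command(mountpoint, platform=None):
    """Build the argv for unmounting a (maintenance) client mount point."""
    platform = sys.platform if platform is None else platform
    argv = [FALLBACK_UMOUNT]
    for prefix, path in UMOUNT_PATHS.items():
        # sys.platform may carry a version suffix (e.g. "netbsd6").
        if platform.startswith(prefix):
            argv = [path]
            break
    if platform.startswith("linux"):
        argv.append("-l")  # lazy unmount is only available on Linux
    argv.append(mountpoint)
    return argv


print(umount_command("/mnt/maintenance", "linux"))
print(umount_command("/mnt/maintenance", "netbsd6"))
```

On a non-Linux system the resulting command has no lazy flag, so the caller
still has to cope with EBUSY, for example by deferring the unmount as
described in this message.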

I will try to see whether the maintenance client can be cleaned up after
commit succeeds: that also makes sense, and it has a better chance of
succeeding because brick-replace activity should be over by then.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: [Gluster-devel] [Gluster-users] Proposal for GlusterD-2.0

2014-09-07 Thread Krishnan Parthasarathi


> Bulk of current GlusterD code deals with keeping the configuration of the
> cluster and the volumes in it consistent and available across the nodes. The
> current algorithm is not scalable (N^2 in no. of nodes) and doesn't prevent
> split-brain of configuration. This is the problem area we are targeting for
> the first phase.
> 
> As part of the first phase, we aim to delegate the distributed configuration
> store. We are exploring consul [1] as a replacement for the existing
> distributed configuration store (sum total of /var/lib/glusterd/* across all
> nodes). Consul provides a distributed configuration store which is consistent
> and partition tolerant. By moving all Gluster related configuration
> information into consul we could avoid split-brain situations.
> Did you get a chance to go over the following questions while making the
> decision? If yes could you please share the info.
> What are the consistency guarantees for changing the configuration in case of
> network partitions?
> specifically when there are 2 nodes and 1 of them is not reachable?
> consistency guarantees when there are more than 2 nodes?
> What are the consistency guarantees for reading configuration in case of
> network partitions?

Consul uses the Raft [1] distributed consensus algorithm internally to maintain
consistency. The Raft consensus algorithm is proven to be correct. I will be
going through the workings of the algorithm soon, and I will share my answers
to the above questions after that. Thanks for the questions; it is important
for users to understand the behaviour of a system, especially under failure.
I am considering adding a FAQ section to this proposal, where questions like
the above would go, once it gets accepted and makes it to the feature page.
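
As a rough sketch of the majority rule that Raft relies on (not a statement
about consul's exact behaviour or its read modes), a change is committed only
once a majority of the servers has stored it. The function names below are
made up for illustration.

```python
def quorum(cluster_size):
    """Smallest majority of a cluster with the given number of servers."""
    return cluster_size // 2 + 1


def can_commit(cluster_size, reachable):
    """True if a partition that can reach `reachable` servers (counting
    itself) still holds a majority and can therefore commit changes."""
    return reachable >= quorum(cluster_size)


# Cluster size, quorum, and how many server failures can be tolerated.
for n in (2, 3, 5):
    print(n, quorum(n), n - quorum(n))
```

Under this rule a 2-server cluster needs both servers, so with 1 of 2
unreachable no configuration change can be committed; a 3-server cluster
tolerates one failure and a 5-server cluster tolerates two.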

[1] - https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf

~KP



Re: [Gluster-devel] [Gluster-users] glusterfs replica volume self heal dir very slow!!why?

2014-09-07 Thread Paul Robert Marino
It's the small files. There is an overhead on any operation on lots of small
files; this is not unique to Gluster. Also, are you using XFS as the underlying
filesystem? If you are not, that would play a big part: ext has issues with
performance when dealing with small files due to its over-reliance on inodes
for read operations.

I would advise you to experiment with incrementing the background self-heal
count a little at a time until you achieve optimal performance. Unfortunately I
can't give you hard numbers, because this kind of tuning is more of an art than
a science: the number of possible variables that come into play depends on your
hardware configuration and exact use case.

-- 
Sent from my HP Pre3

On Sep 5, 2014 4:16 AM, justgluste...@gmail.com wrote:
Hi all,

I did the following test. I created a glusterfs replica volume (replica count
2) with two server nodes (server A and server B), then mounted the volume on a
client node. Then I shut down the network of server A and, on the client node,
copied a directory containing a lot of small files (2.9 GB in total). When the
copy finished, I brought the network of server A back up, and the glusterfs
self-heal daemon started healing the directory from server B to server A. In
the end, the self-heal daemon took 40 minutes to heal the directory. That's
too slow! Why?

I found the following options related to self-heal:

   cluster.self-heal-window-size
   cluster.self-heal-readdir-size
   cluster.background-self-heal-count

Can modifying the above options improve the performance of healing the
directory? If so, please suggest reasonable values for them.

Thanks!
justgluste...@gmail.com
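
The "a little at a time" tuning suggested in the reply could be scripted
roughly as follows. This is only a sketch: the volume name and the numeric
range are placeholders rather than recommended values, and the helper name is
made up. It only builds the `gluster volume set` command lines; running them
and timing a heal after each step is left to the operator.

```python
def heal_count_commands(volume, start, stop, step):
    """Yield `gluster volume set` argv lists for increasing values of
    cluster.background-self-heal-count (inclusive of `stop`)."""
    for count in range(start, stop + 1, step):
        yield ["gluster", "volume", "set", volume,
               "cluster.background-self-heal-count", str(count)]


# Placeholder volume name and range, for illustration only.
for argv in heal_count_commands("testvol", 16, 32, 8):
    print(" ".join(argv))
```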


Re: [Gluster-devel] Languages (was Re: Proposal for GlusterD-2.0)

2014-09-07 Thread Dan Lambright

Digging deeper into Go, I see there is a fascinating discussion in the language 
communities comparing Go with C++.

Go has no..
- classes (no inheritance), though it has interfaces (sets of methods) which 
remind me of things like gluster's struct xlator_fops {}
- polymorphism 
- pointer arithmetic
- generic programming
- etc. 

Here is a comparison of C++ with Go from Rob Pike himself (a Go author). 

http://commandcenter.blogspot.com/2012/06/less-is-exponentially-more.html 

And here are a few counter arguments.

http://lambda-the-ultimate.org/node/4554

Preference for Go seems to come down to how deeply you prefer the C++ object 
oriented way of doing things (as Pike calls it, the "type-centric" focus on 
classes). If that's your cup of tea, you may find Go a letdown or step
backwards. Pike implies that coders invest a lot of time to master those 
techniques and are reluctant to ditch those skills.

But if you are a C or python programmer, you may see Go as a way to have your 
cake (modern stripped down language with lists, maps, packages, interfaces, no 
#includes) and eat it too (it compiles to binary, no VM).

As gluster is not beholden in any way to legacy C++, Go seems like a great fit. 
I'm looking forward to giving it a spin :)

- Original Message -
> From: "Dan Lambright" 
> To: "Jeff Darcy" 
> Cc: "Justin Clift" , "Gluster Devel" 
> 
> Sent: Friday, September 5, 2014 5:32:05 PM
> Subject: Re: [Gluster-devel] Languages (was Re: Proposal for GlusterD-2.0)
> 
> One reason to use C++ could be to build components that we wish to share with
> Ceph (not that I know of any at this time). Also, C++11 has improved the
> language.
> But the more I hear about it, the more interesting Go sounds..
> 
> - Original Message -
> > From: "Jeff Darcy" 
> > To: "Justin Clift" 
> > Cc: "Gluster Devel" 
> > Sent: Friday, September 5, 2014 11:44:35 AM
> > Subject: [Gluster-devel] Languages (was Re: Proposal for GlusterD-2.0)
> > 
> > > Does this mean we'll need to learn Go as well as C and Python?
> > 
> > As KP points out, the fact that consul is written in Go doesn't mean our
> > code needs to be ... unless we need to contribute code upstream e.g. to
> > add new features.  Ditto for etcd also being written in Go, ZooKeeper
> > being written in Java, and so on.  It's probably more of an issue that
> > these all require integration into our build/test environments.  At
> > least Go, unlike Java, doesn't require any new *run time* support.
> > Python kind of sits in between - it does require runtime support, but
> > it's much less resource-intensive and onerous than Java (no GC-tuning
> > hell).  Between that and the fact that it's almost always present
> > already, it just doesn't seem to provoke the same kind of allergic
> > reaction that Java does.
> > 
> > However, this is as good a time as any to think about what languages
> > we're going to use for the project going forward.  While there are many
> > good reasons for our I/O path to remain in Plain Old C (yes I'm
> > deliberately avoiding the C++ issue), many of those reasons apply only
> > weakly to other parts of the code - not only management code, but also
> > "offline" processes like self heal and rebalancing.  Some people might
> > already be aware that I've used Python for the reconciliation component
> > of NSR, for example, and that version is in almost every way better than
> > the C version it replaces.  When we need to interface with code written
> > in other languages, or even interact with communities where other
> > languages are spoken more fluently than C, it's pretty natural to
> > consider using those languages ourselves.  Let's look at some of the
> > alternatives.
> > 
> >  * C++
> >    Code is highly compatible with C, programming styles and idioms less
> >    so.  Not prominent in most areas we care about.
> > 
> >  * Java
> >    The "old standard" for a lot of distributed systems - e.g.  the
> >    entire Hadoop universe, Cassandra, etc.  Also a great burden as
> >    discussed previously.
> > 
> >  * Go
> >    Definitely the "up and comer" in distributed systems, for which it
> >    was (partly) designed.  Easy for C programmers to pick up, and also
> >    popular among (former?) Python folks.  Light on resources and
> >    dependencies.
> > 
> >  * JavaScript
> >    Ubiquitous.  Common in HTTP-ish "microservice" situations, but not so
> >    much in true distributed systems.
> > 
> >  * Ruby
> >    Much like JavaScript as far as we're concerned, but less ubiquitous.
> > 
> >  * Erlang
> >    Functional, designed for highly reliable distributed systems,
> >    significant use in related areas (e.g. Riak).
> > 
> > Obviously, there are many more, but issues of compatibility and talent
> > availability weigh heavier for most than for Erlang (which barely made
> > the list as it is despite its strengths).  Of these, the ones without
> > serious drawbacks are JavaScript and Go.  As popular as JS is in other
> > specialties

Re: [Gluster-devel] Another transaction is in progress

2014-09-07 Thread Emmanuel Dreyfus
Atin Mukherjee  wrote:

> > I still have the message after a glusterd restart. I will try with
> > release-3.6
> Can you restart the originator glusterd node as well?

After restarting all glusterfsd and glusterd, it recovers.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: [Gluster-devel] Another transaction is in progress

2014-09-07 Thread Atin Mukherjee


On 09/07/2014 12:43 PM, Emmanuel Dreyfus wrote:
> Atin Mukherjee  wrote:
> 
>> I suggest you check the glusterd log file on node
>> 078015de-2186-4bd7-a4d1-017e39c16dd3; if you can't find the reason
>> why glusterd did not release the lock, the workaround will be to restart
>> glusterd there.
> 
> I still have the message after a glusterd restart. I will try with
> release-3.6
Can you restart the originator glusterd node as well?

~Atin
> 


Re: [Gluster-devel] Another transaction is in progress

2014-09-07 Thread Emmanuel Dreyfus
Atin Mukherjee  wrote:

> I suggest you check the glusterd log file on node
> 078015de-2186-4bd7-a4d1-017e39c16dd3; if you can't find the reason
> why glusterd did not release the lock, the workaround will be to restart
> glusterd there.

I still have the message after a glusterd restart. I will try with
release-3.6

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org