Re: [Gluster-users] [Gluster-devel] Freenode takeover and GlusterFS IRC channels

2021-06-07 Thread Stephan von Krawczynski
And why don't you just do the right thing, drop this semi-closed-source
stuff and use XMPP, about the only free/GPLed messenger service?
It has:
- no central provider
- no central servers
- server software free for anyone to install
- lots of free xmpp services around
- encrypted or unencrypted, at the user's choice

-> https://xmpp.org/

You use email, why?

--
Regards,
Stephan


On Mon, 7 Jun 2021 21:41:11 +0530
Amar Tumballi  wrote:

> We (at least many developers and some users) actively use Slack at
> https://gluster.slack.com
> 
> While I agree that it's not a free/open alternative to IRC, it does get
> many questions answered, and it also keeps project-related communication
> flowing.
> 
> Regards,
> Amar
> 
> 
> On Mon, 7 Jun, 2021, 9:27 pm Jordan Erickson, <
> jerick...@logicalnetworking.net> wrote:  
> 
> > I'm relatively new to the community but I would vote for having a point
> > of presence on libera.chat, or OFTC as some other F/OSS projects are
> > moving there as an alternative. I use IRC daily for supporting my own
> > projects as well as related projects such as GlusterFS. Personally I
> > hadn't heard of Matrix until the whole Freenode fiasco happened, so I
> > would imagine others may be in the same boat. Anyway, just my $0.02 :)
> >
> >
> > Cheers,
> > Jordan Erickson
> >
> >
> > On 6/7/21 5:51 AM, Anoop C S wrote:  
> > > Hi all,
> > >
> > > I hope many of us are aware of the recent changes that happened at
> > > the Freenode IRC network (in case you are not, feel free to look into
> > > the details in the various resignation letters from long-time former
> > > Freenode staff, starting with [1]). In light of this takeover
> > > situation, many open source communities have moved over to its
> > > replacement, i.e. libera.chat [2].
> > >
> > > Now I would like to open this up to the GlusterFS community to think about
> > > moving forward with our current IRC channels (#gluster, #gluster-dev and
> > > #gluster-meeting) on Freenode. How important are those channels for the
> > > GlusterFS project? How about moving over to libera.chat in case we
> > > stick to IRC communication?
> > >
> > > Let's discuss and conclude on the way forward..
> > >
> > > Note: the Matrix[3] platform is also an option nowadays and we do have a
> > > Gluster room (#gluster:matrix.org) there! welcome..welcome :-)
> > >
> > > Regards,
> > > Anoop C S
> > >
> > >
> > > [1] https://fuchsnet.ch/freenode-resign-letter.txt
> > > [2] https://libera.chat/
> > > [3] https://matrix.org/
> > >
> > > 
> > >
> > >
> > >
> >
> > --
> > Jordan Erickson (PGP: 0x78DD41CB)
> > Logical Networking Solutions, 707-636-5678
> > 
> >
> >
> >




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Replica 3 scale out and ZFS bricks

2020-09-17 Thread Stephan von Krawczynski
And Joe is the only man on the planet who thinks that NFS is fast because it
fell from heaven, and that it did not get better when it was moved from
userspace into the kernel. Of course that move was only done because someone
had lots of spare time to waste ...
How long will it take until it is accepted that none of the people programming
glusterfs have the skills to do it right, and that this is the simple truth of
why this project is lost? Ask yourself why Red Hat dumped it.


On Thu, 17 Sep 2020 04:18:20 -0700
Joe Julian  wrote:

> He's a troll that has wasted 10 years trying to push his unfounded belief
> that moving to an in-kernel driver would give significantly more performance.
> 
> On September 17, 2020 3:21:01 AM PDT, Alexander Iliev
>  wrote:
> >On 9/17/20 3:37 AM, Stephan von Krawczynski wrote:
> >> Nevertheless you will break performance anyway by deploying user-space
> >> crawling-slow glusterfs... outcome of 10 wasted years of development in the
> >> wrong direction.
> >
> >Genuinely asking - what would you recommend instead of GlusterFS for a 
> >highly available, horizontally scalable storage system?
> >
> >Best regards,
> >--
> >alexander iliev




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Replica 3 scale out and ZFS bricks

2020-09-17 Thread Stephan von Krawczynski
On Thu, 17 Sep 2020 12:21:01 +0200
Alexander Iliev  wrote:

> On 9/17/20 3:37 AM, Stephan von Krawczynski wrote:
> > Nevertheless you will break performance anyway by deploying user-space
> > crawling-slow glusterfs... outcome of 10 wasted years of development in the
> > wrong direction.  
> 
> Genuinely asking - what would you recommend instead of GlusterFS for a 
> highly available, horizontally scalable storage system?

I was a glusterfs user for years, waiting for significant performance
improvements. But they never arrived. Instead the software went from a fs
driver to a userspace collection of tools emulating a fs, with completely
bogus configs and without the slightest path to a working and overall usable
setup.
IOW it was developed into a dead end.
And honestly I would be the first to deploy it again if it came back to what
it was: a network fs exporting a linux fs with no additional bs. The day this
changed into something that needs every file copied over onto glusterfs to
work "properly" was the first day of its death.
The original idea was great, the implementation is useless.
And yes, there is a lack of an HA network fs (one that really is a fs, not
something like ceph).
This is why I feel even more sorry about the whole affair. It could have
been a big hit. But it failed miserably.
My last trust is in Matt Dillon and Hammer2. Yes, this is a long-term belief
...

--
Regards
Stephan
 
> Best regards,
> --
> alexander iliev





Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] State of Gluster project

2020-06-23 Thread Stephan von Krawczynski
On Wed, 17 Jun 2020 00:06:33 +0300
Mahdi Adnan  wrote:

> [gluster going down ]

I have been following this project for quite some years now, probably longer
than most of the people on the list nowadays. The project started with the
brilliant idea of making a fs on top of classical fs's distributed over
several pieces of hardware, with no need to re-copy data when entering or
leaving the gluster. (I thought) it started as a proof-of-concept fs on fuse
with the intention of moving into kernel space as soon as possible to get the
performance that a fs should have.
After five years of waiting (and using) I declared the project dead for our
use (about five years ago) because it evolved more and more into bloatware.
And I do think that Red Hat finally understood that (too), and what you
mentioned is just the outcome of that.
I really hate the path this project took, because for me it was visible from
the very start that it was a dead end. After all those years it is bloatware
on fuse. And I feel very sorry for the brilliant idea that it once was.
_FS IN USERSPACE IS SH*T_  - understand that.

-- 
Regards,
Stephan





Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] State of Gluster project

2020-06-18 Thread Stephan von Krawczynski
On Thu, 18 Jun 2020 13:27:19 -0400
Alvin Starr  wrote:

> >  [me]
> This is an amazingly unreasonable comment.
> First off ALL distributed file systems are slower than non-distributed 
> file systems.

Obviously you fail to understand my point: the design of glusterfs implies
that it can be as fast as a non-distributed fs. If you have not understood
this by now you should stay another 10 years on this list.
As glusterfs should only read from a single node, and write concurrently to
all nodes, it need only be slower than a non-distributed fs if your network is
not designed according to the needed paths. Glusterfs being slow is not a
matter of design but of implementation.

> Second ALL network file systems are slower than local hardware.

Uh, really no comment.
 
> Kernel inclusion does not make for a radically faster implementation.
> I have worked with Kernel included NFS and user space NFS 
> implementations and the performance differences have not been all that 
> amazingly radical.

Probably you are not talking about Linux-based NFS experience. The last time
we used userspace NFS on Linux (very long ago) it was slow _and_ amazingly
buggy. Many thanks to Neil for making kernel NFS what it is today.
 
> If you're so convinced that a kernel-included file system is the answer 
> you are free to implement a solution.

Well, people were _paid_ for years and came up with this mess. And you
want me to implement it for free, in what time? If you gave me the money that
was paid out during the last decade you could be sure the solution would be a
lot better, easier to configure, and thought through.

> I am sure the project maintainers would love to have someone come along 
> and improve the code.

Yes, we perfectly agree on that.

-- 
Regards,
Stephan




Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] State of Gluster project

2020-06-18 Thread Stephan von Krawczynski
On Thu, 18 Jun 2020 07:40:36 -0700
Joe Julian  wrote:

> You're still here and still hurt about that? It was never intended to be in
> kernel. It was always intended to run in userspace. After all these years I
> thought you'd be over that by now.

Top Poster ;-)

And in fact, that is not true. The clear message to me once was: we are not
able to make a kernel version.
Which I understood as: we do not have the knowledge to do that.
Since that was quite some time before Red Hat stepped in, there was still hope
that some day someone capable might come along ...
Since 2009, when I joined the list, there has not been a single month without
complaints about gluster being slow. I wonder if you could accept, after 11
years and with the project now near death, that I was right from the very
first day.

-- 
Regards,
Stephan





Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] State of Gluster project

2020-06-18 Thread Stephan von Krawczynski
On Thu, 18 Jun 2020 13:06:51 +0400
Dmitry Melekhov  wrote:

> 18.06.2020 12:54, Stephan von Krawczynski wrote:
> >
> > _FS IN USERSPACE IS SH*T_  - understand that.
> >  
> 
> we use qemu and it uses gfapi... :-)

And exactly this kind of "insight" is the basis of my criticism. gfapi is
_userspace_ on the client (granted, without fuse), but it does not at all
address the basic glusterfs problem: the need to go through _userspace_ on the
_server_.
Simply look at the docs here and understand where the work should have been
done:

https://www.humblec.com/libgfapi-interface-glusterfs/

On the server you still have to go from the kernel-space network stack to
userspace glusterfs and back to the kernel-space underlying fs. So gfapi only
eliminates one of two major problems. Comparing performance to NFS on ZFS
shows the flaw.
If it were implemented as it should be, you would see almost _no_ difference,
because you would be able to split the two network paths to the gluster
servers (for a setup with two) across different switches and network cards on
the client. So reading should be just as fast, and for writing you can
calculate a (very) small loss.
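To make that concrete, a minimal sketch of the split-path idea (interface
names, addresses and the two-server layout are invented examples, not taken
from any real setup):

  # client NIC 1 -> switch A -> gluster server 1 (first replica)
  ip addr add 192.168.10.2/24 dev eth1
  # client NIC 2 -> switch B -> gluster server 2 (second replica)
  ip addr add 192.168.20.2/24 dev eth2

Each replica is then reached over its own wire, so a read can use the full
bandwidth of one link while a replicated write pushes both copies over both
links in parallel instead of sharing one.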

-- 
Regards,
Stephan





Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] FW: Performance with Gluster+Fuse is 60x slower then Gluster+NFS ?

2016-02-18 Thread Stephan von Krawczynski
On Thu, 18 Feb 2016 10:14:59 +1000
Dan Mons  wrote:

> Without knowing the details, I'm putting my money on cache.
> 
> Choosing how to mount Gluster is workload dependent.  If you're doing
> a lot of small files with single threaded writes, I suggest NFS.  Your
> client's nfscache will dramatically improve performance from the
> end-user's point of view.
> 
> If you're doing heavy multi-threaded reads and writes, and you have
> very good bandwidth from your client (e.g.: 10GbE) FUSE+GlusterFS is
> better, as it allows your client to talk to all Gluster nodes.
> [...]

Dan, forgive my jumping in on a matter which is obvious to everyone who has
used glusterfs for years: fuse+glusterfs is simply sh*t in terms of
performance. There is absolutely nobody whose setup wouldn't be at least two
(to several hundred) times faster using simple NFS. So Stefan's numbers are no
surprise.
I really cannot believe you are trying to argue for fuse. It is completely
clear that fuse is only used because of the inability to write a kernel-space
driver (and this was said years ago by the people who originally wrote the
whole lot). You can probably find this answer to my question in the archives
of this (or the devel) list years back. And because of this I pretty much
stopped writing here; I mean, you cannot blame someone for not being skilled
enough to produce the right code in a GPL situation.
The basic concept is good, the implementation is just a mess. And that's it.

Regards,
Stephan

 
> If you are using FUSE+GlusterFS, on the gluster nodes themselves,
> experiment with the "performance.write-behind-window-size" and
> "performance.cache-size" options.  Note that these will affect the
> cache used by the clients, so don't set them so high as to exhaust the
> RAM of any client connecting (or, for low-memory clients, use NFS
> instead).
> 
> Gluster ships with conservative defaults for cache, which is a good
> thing.  It's up to the user to tweak for their optimal needs.
> 
> There's no right or wrong answer here.  Experiment with NFS and
> various cache allocations with FUSE+GlusterFS, and see how you go.
> And again, consider your workloads, and whether or not they're taking
> full advantage of the FUSE client's ability to deal with highly
> parallel workloads.
> 
> -Dan
> 
> Dan Mons - VFX Sysadmin
> Cutting Edge
> http://cuttingedge.com.au
> 
> 
> On 18 February 2016 at 08:56, Stefan Jakobs  wrote:
> > Van Renterghem Stijn:
> >> Interval2
> >> Block Size:  1b+  16b+  
> >> 32b+
> >> No. of Reads:0 0
> >>  0 No. of Writes:  34225
> >>575
> >>
> >>Block Size: 64b+ 128b+
> >> 256b+ No. of Reads:0 0
> >>0 No. of Writes:  143   898
> >>  118
> >>
> >>Block Size:512b+1024b+
> >> 2048b+ No. of Reads:1 4
> >>11 No. of Writes:   82 0
> >> 0
> >>
> >>Block Size:   4096b+8192b+
> >> 16384b+ No. of Reads:   1131
> >> 39 No. of Writes:0 0
> >>  0
> >>
> >>Block Size:  32768b+   65536b+
> >> 131072b+ No. of Reads:   59   148
> >> 555 No. of Writes:0 0
> >>   0
> >>
> >> %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls          Fop
> >> ---------   -----------   -----------   -----------   ------------          ---
> >>      0.00       0.00 us       0.00 us       0.00 us              1       FORGET
> >>      0.00       0.00 us       0.00 us       0.00 us            201      RELEASE
> >>      0.00       0.00 us       0.00 us       0.00 us          54549   RELEASEDIR
> >>      0.00      47.00 us      47.00 us      47.00 us              1  REMOVEXATTR
> >>      0.00      94.00 us      74.00 us     114.00 us              2      XATTROP
> >>      0.00     191.00 us     191.00 us     191.00 us              1     TRUNCATE
> >>      0.00      53.50 us      35.00 us      74.00 us              4       STATFS
> >>      0.00      79.67 us      70.00 us      91.00 us              3       RENAME
> >>      0.00      37.33 us      27.00 us      68.00 us             15      INODELK
> >>      0.00     190.67 us     116.00 us     252.00 us              3       UNLINK
> >>      0.00      28.83 us       8.00 us      99.00 us             30      ENTRYLK
> >>      0.00     146.33 us     117.00 us     188.00 us              6       CREATE
> >>      0.00      37.63 us      12.00 us      73.00 us             84      READDIR
> >>      0.00      23.75 us       8.00 us      75.00 us            198        FLUSH
> >>      0.00      65.33 us      42.00 us     141.00 us            204         OPEN
> >>      0.01      45.78 us      11.00 us 

Re: [Gluster-users] 40 gig ethernet

2013-06-15 Thread Stephan von Krawczynski
On Fri, 14 Jun 2013 14:35:26 -0700
Bryan Whitehead dri...@megahappy.net wrote:

 GigE is slower. Here is ping from same boxes but using the 1GigE cards:
 
 [root@node0.cloud ~]# ping -c 10 10.100.0.11
 PING 10.100.0.11 (10.100.0.11) 56(84) bytes of data.
 64 bytes from 10.100.0.11: icmp_seq=1 ttl=64 time=0.628 ms
 64 bytes from 10.100.0.11: icmp_seq=2 ttl=64 time=0.283 ms
 64 bytes from 10.100.0.11: icmp_seq=3 ttl=64 time=0.307 ms
 64 bytes from 10.100.0.11: icmp_seq=4 ttl=64 time=0.275 ms
 64 bytes from 10.100.0.11: icmp_seq=5 ttl=64 time=0.313 ms
 64 bytes from 10.100.0.11: icmp_seq=6 ttl=64 time=0.278 ms
 64 bytes from 10.100.0.11: icmp_seq=7 ttl=64 time=0.309 ms
 64 bytes from 10.100.0.11: icmp_seq=8 ttl=64 time=0.197 ms
 64 bytes from 10.100.0.11: icmp_seq=9 ttl=64 time=0.267 ms
 64 bytes from 10.100.0.11: icmp_seq=10 ttl=64 time=0.187 ms
 
 --- 10.100.0.11 ping statistics ---
 10 packets transmitted, 10 received, 0% packet loss, time 9000ms
 rtt min/avg/max/mdev = 0.187/0.304/0.628/0.116 ms
 
 Note: The Infiniband interfaces have a constant load of traffic from
 glusterfs. The Nic cards comparatively have very little traffic.

Uh, you should throw away your GigE switch. Example:

# ping 192.168.83.1
PING 192.168.83.1 (192.168.83.1) 56(84) bytes of data.
64 bytes from 192.168.83.1: icmp_seq=1 ttl=64 time=0.310 ms
64 bytes from 192.168.83.1: icmp_seq=2 ttl=64 time=0.199 ms
64 bytes from 192.168.83.1: icmp_seq=3 ttl=64 time=0.119 ms
64 bytes from 192.168.83.1: icmp_seq=4 ttl=64 time=0.115 ms
64 bytes from 192.168.83.1: icmp_seq=5 ttl=64 time=0.099 ms
64 bytes from 192.168.83.1: icmp_seq=6 ttl=64 time=0.082 ms
64 bytes from 192.168.83.1: icmp_seq=7 ttl=64 time=0.091 ms
64 bytes from 192.168.83.1: icmp_seq=8 ttl=64 time=0.096 ms
64 bytes from 192.168.83.1: icmp_seq=9 ttl=64 time=0.097 ms
64 bytes from 192.168.83.1: icmp_seq=10 ttl=64 time=0.095 ms
64 bytes from 192.168.83.1: icmp_seq=11 ttl=64 time=0.097 ms
64 bytes from 192.168.83.1: icmp_seq=12 ttl=64 time=0.102 ms
64 bytes from 192.168.83.1: icmp_seq=13 ttl=64 time=0.103 ms
64 bytes from 192.168.83.1: icmp_seq=14 ttl=64 time=0.108 ms
64 bytes from 192.168.83.1: icmp_seq=15 ttl=64 time=0.098 ms
64 bytes from 192.168.83.1: icmp_seq=16 ttl=64 time=0.093 ms
64 bytes from 192.168.83.1: icmp_seq=17 ttl=64 time=0.099 ms
64 bytes from 192.168.83.1: icmp_seq=18 ttl=64 time=0.102 ms
64 bytes from 192.168.83.1: icmp_seq=19 ttl=64 time=0.092 ms
64 bytes from 192.168.83.1: icmp_seq=20 ttl=64 time=0.111 ms
64 bytes from 192.168.83.1: icmp_seq=21 ttl=64 time=0.112 ms
64 bytes from 192.168.83.1: icmp_seq=22 ttl=64 time=0.099 ms
64 bytes from 192.168.83.1: icmp_seq=23 ttl=64 time=0.092 ms
64 bytes from 192.168.83.1: icmp_seq=24 ttl=64 time=0.102 ms
64 bytes from 192.168.83.1: icmp_seq=25 ttl=64 time=0.108 ms
^C
--- 192.168.83.1 ping statistics ---
25 packets transmitted, 25 received, 0% packet loss, time 23999ms
rtt min/avg/max/mdev = 0.082/0.112/0.310/0.047 ms

That is _loaded_.

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] 40 gig ethernet

2013-06-14 Thread Stephan von Krawczynski
On Fri, 14 Jun 2013 12:13:53 -0700
Bryan Whitehead dri...@megahappy.net wrote:

 I'm using 40G Infiniband with IPoIB for gluster. Here are some ping
 times (from host 172.16.1.10):
 
 [root@node0.cloud ~]# ping -c 10 172.16.1.11
 PING 172.16.1.11 (172.16.1.11) 56(84) bytes of data.
 64 bytes from 172.16.1.11: icmp_seq=1 ttl=64 time=0.093 ms
 64 bytes from 172.16.1.11: icmp_seq=2 ttl=64 time=0.113 ms
 64 bytes from 172.16.1.11: icmp_seq=3 ttl=64 time=0.163 ms
 64 bytes from 172.16.1.11: icmp_seq=4 ttl=64 time=0.125 ms
 64 bytes from 172.16.1.11: icmp_seq=5 ttl=64 time=0.125 ms
 64 bytes from 172.16.1.11: icmp_seq=6 ttl=64 time=0.125 ms
 64 bytes from 172.16.1.11: icmp_seq=7 ttl=64 time=0.198 ms
 64 bytes from 172.16.1.11: icmp_seq=8 ttl=64 time=0.171 ms
 64 bytes from 172.16.1.11: icmp_seq=9 ttl=64 time=0.194 ms
 64 bytes from 172.16.1.11: icmp_seq=10 ttl=64 time=0.115 ms
 
 --- 172.16.1.11 ping statistics ---
 10 packets transmitted, 10 received, 0% packet loss, time 8999ms
 rtt min/avg/max/mdev = 0.093/0.142/0.198/0.035 ms

What you would like to say is that there is no significant difference compared
to GigE, right?
Anyone got a ping between two kvm-qemu virtio-net cards at hand?

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Rebalancing with no new bricks.

2013-06-12 Thread Stephan von Krawczynski
On Wed, 12 Jun 2013 09:04:30 -0400
Jeff Darcy jda...@redhat.com wrote:

 [...]
 that need to be moved, it shouldn't be too hard to combine this little
 bag of tricks into a solution that meets your needs.  Just let me know
 if you'd like me to assist.

The true question is indeed: why does he need tricks at all to get something
that is obvious to humans: a way of distributing files over the glusterfs so
that "full" really means that all bricks are full.
It cannot be the right way to design software (for humans) so that they have
to adapt to the software. Instead the software should be able to adapt to the
users' needs and situation. It is very obvious today that bricks can be of
different sizes.

In fact I always thought it would be a big advantage of glusterfs to be able
to use what's already there and make more out of it (just as linux did from
the first day on).
Which means for me:
1) It must be easy to deploy on an already filled fileserver, i.e. no need to
copy data over onto the new glusterfs (soft migration).
2) Whatever the layout of the bricks, glusterfs must be able to follow the
obvious rule: if there is space left, then use it.
3) The data must remain accessible even if glusterfs is no longer used on the
bricks - without copying back.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Rebalancing with no new bricks.

2013-06-12 Thread Stephan von Krawczynski
On Wed, 12 Jun 2013 09:57:15 -0400
Jeff Darcy jda...@redhat.com wrote:

 On 06/12/2013 09:46 AM, Stephan von Krawczynski wrote:
  The true question is indeed: why does he need tricks at all to come to
  something obvious for humans: a way of distributing files over the glusterfs
  so that full means really all bricks are full.
 
 That's my view too.  I keep trying.
 
  3) The data must be left accessible even if glusterfs is not used on the
  bricks any longer - without copying back.
 
 This part is already true in practically all cases.  You can ignore the
 .glusterfs directory and extra xattrs, or nuke them, and you have a perfectly
 normal file/directory structure that's usable as-is.  The exceptions are if you
 use striping or erasure coding, but (like RAID) those are fundamentally ways of
 slicing and dicing data across storage units so some reassembly would be
 necessary.

I only tried to list _the_ major advantages glusterfs could/should have over
almost all other competitors.
Of special importance are 1) and 3), because they allow everyone to go and
try it without having to fiddle around with tons of data.
2) is convenience, something a good piece of software should deliver :-)
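Just to make point 3) concrete, a rough sketch of what "nuking" the gluster
metadata on a brick looks like (the path is invented and the xattr names are
from memory, so check them against your version before running anything):

  # on the brick, after the volume has been stopped
  rm -rf /export/brick1/.glusterfs
  find /export/brick1 -exec setfattr -x trusted.gfid {} \;
  setfattr -x trusted.glusterfs.volume-id /export/brick1

After that the brick is again a plain directory tree, usable without glusterfs.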

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Fwd: [Gluster-devel] glusterfs-3.3.2qa1 released

2013-04-13 Thread Stephan von Krawczynski
On Sat, 13 Apr 2013 23:47:21 +0530
Vijay Bellur vbel...@redhat.com wrote:

 On 04/13/2013 08:11 PM, Stephan von Krawczynski wrote:
  On Sat, 13 Apr 2013 10:37:23 -0400 (EDT)
  John Walker jowal...@redhat.com wrote:
 
  Try the new qa build for 3.3.2. We're hopeful that this will solve some 
  lingering problems out there.
 
  The ext4, too?
 
 Not in this qa release. A subsequent qa release will have the ext4 fix.
 
 -Vijay

Great news, thank you.

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Performance for KVM images (qcow)

2013-04-09 Thread Stephan von Krawczynski
On Tue, 09 Apr 2013 03:13:10 -0700
Robert Hajime Lanning lann...@lanning.cc wrote:

 On 04/09/13 01:17, Eyal Marantenboim wrote:
  Hi Bryan,
 
  We have 1G nics on all our servers.
  Do you think that changing our design to distribute-replicate will
  improve the performance?
 
  Anything in the gluster performance settings that you think I should change?
 
 With GlusterFS, almost all the processing is in the client side.  This 
 includes replication.  So, when you have replica 4, the client will be 
 duplicating all transactions 4 times, synchronously.  Your 1G ethernet 
 just became 256M.

Let me drop in that nobody in their right mind does it this way. Obviously one
would give the client more physical network cards, ideally as many as there
are replicas, and do the subnetting accordingly.
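For the numbers above, a back-of-the-envelope example only: with replica 4
over a single 1 GbE link every application write is sent to all four bricks
synchronously, so the usable write bandwidth is roughly a quarter of the link,
which is where the "1G ethernet just became 256M" figure comes from. With one
NIC per replica the four copies travel over four separate links, and (in
principle) the full link speed is available to the application again.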

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Slow read performance

2013-03-08 Thread Stephan von Krawczynski
I really do wonder why this bug in _glusterfs_ has not been fixed. It really
makes no sense to do an implementation that breaks on the most used fs on
Linux. And just as you said: don't wait for btrfs, it will never be
production-ready. And xfs is no solution, it is just a bad work-around.


On Fri, 8 Mar 2013 10:43:41 -0800
Bryan Whitehead dri...@megahappy.net wrote:

 Here are some details about ext4 changes in the kernel screwing up
 glusterfs:
 http://www.gluster.org/2012/08/glusterfs-bit-by-ext4-structure-change/
 https://bugzilla.redhat.com/show_bug.cgi?id=838784
 
 I thought I read there was a work-around in recent versions of gluster but
 I think it came at a cost somewhere. I'm not sure since I've been using xfs
 since the 1.x days of gluster and only see random ext3/4 problems bubble up
 on these maillinglist. In general, ext4 was just a stopgap for the wait on
 btrfs getting flushed out. That said, I don't see ext4 going away for a
 long long time. :-/
 
 NOTE: I don't even know if this is your problem. You might try updating 2
 bricks that are replica pairs to use xfs then do some performance tests on
 files living on them to confirm. Example, you have 20 some servers/bricks.
 If hostD and hostE are replica pairs for some subset of files, shutdown
 glusterd on HostD, change fs to xfs, fire glusterd back up - let it resync
 and recover all the files, do the same on hostE (once hostD is good), then
 see if there is a read speed improvement for files living on those two host
 pairs.
 
 
  On Fri, Mar 8, 2013 at 6:40 AM, Thomas Wakefield tw...@cola.iges.org wrote:
 
  I am still confused how ext4 is suddenly slow to read when it's behind
  Gluster, but plenty fast stand alone reading?
 
  And it writes really fast from both the server and client.
 
 
 
  On Mar 8, 2013, at 4:07 AM, Jon Tegner teg...@renget.se wrote:
 
  We had issues with ext4 about a bit less than a year ago, at that time I
  upgraded the servers to CentOS-6.2. But that gave us large problems (more
  than slow reads). Since I didn't want to reformat the disks at that time
  (and switch to XFS) I went back to CentOS-5.5 (which we had used before).
  On some link (think it was
  https://bugzilla.redhat.com/show_bug.cgi?id=713546 but can't seem to
  reach that now) it was stated that the ext4-issue was present even on later
  versions of CentOS-5 (I _think_ 5.8 was affected).
 
  Are there hope that the ext4-issue will be solved in later
  kernels/versions of gluster? If not, it seems one is eventually forced to
  switch to XFS.
 
  Regards,
 
  /jon
 
  On Mar 8, 2013 03:27 Thomas Wakefield tw...@iges.org wrote:
 
 
  inode size is 256.
 
 
 
 
  Pretty stuck with these settings and ext4. I missed the memo that
  Gluster started to prefer xfs, back in the 2.x days xfs was not the
  preferred filesystem. At this point it's a 340TB filesystem with 160TB
  used. I just added more space, and was doing some followup testing and
  wasn't impressed with the results. But I am sure I was happier before
  with the performance.
 
 
 
 
  Still running CentOS 5.8
 
 
 
 
  Anything else I could look at?
 
 
 
 
  Thanks, Tom
 
 
 
 
 
  On Mar 7, 2013, at 5:04 PM, Bryan Whitehead dri...@megahappy.net
  wrote:
 
 
   I'm sure you know, but xfs is the recommended filesystem for
   glusterfs. Ext4 has a number of issues. (Particularly on
   CentOS/Redhat6).
  
  
  
   The default inode size for ext4 (and xfs) is small for the number of
   extended attributes glusterfs uses. This causes a minor hit in
    performance on xfs if the extended attributes grow more than 265 (xfs
   default size). In xfs, this is fixed by setting the size of an inode
   to 512. How big the impact is on ext4 is something I don't know
   offhand. But looking at a couple of boxes I have it looks like some
   ext4 filesystems have 128 inode size and some have 256 inode size
   (both of which are too small for glusterfs). The performance hit is
    every time extended attributes need to be read several inodes need to be
   seeked and found.
  
  
  
   run dumpe2fs -h blockdevice | grep size on your ext4 mountpoints.
  
  
  
  
   If it is not too much of a bother - I'd try xfs as your filesystem for
   the bricks
  
  
  
  
   mkfs.xfs -i size=512 blockdevice
  
  
  
  
   Please see this for more detailed info:
  
   https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Storage/2.0/ht
   ml-single/Administration_Guide/index.html#chap-User_Guide-Setting_Volu
   mes
  
  
  
  
  
   On Thu, Mar 7, 2013 at 12:08 PM, Thomas Wakefield
   tw...@cola.iges.org wrote:
  
Everything is built as ext4, no options other than
lazy_itable_init=1 when I built the filesystems.
   
   
   
   
Server mount example:
   
LABEL=disk2a /storage/disk2a ext4defaults 0 0
   
   
   
   
Client mount:
   
fs-disk2:/shared /shared glusterfs defaults 0 0
   
   
   
   
Remember, the slow reads are only from gluster clients, the disks
are really fast when I am 

Re: [Gluster-users] NFS availability

2013-01-31 Thread Stephan von Krawczynski
On Wed, 30 Jan 2013 20:44:52 -0800
harry mangalam harry.manga...@uci.edu wrote:

 On Thursday, January 31, 2013 11:28:04 AM glusterzhxue wrote:
  Hi all,
  As is known to us all, gluster provides NFS mount. However, if the mount
  point fails, clients will lose connection to Gluster. While if we use
  gluster native client, this fail will have no effect on clients. For
  example:
  mount -t glusterfs host1:/vol1  /mnt
  
  If host1 goes down for some reason, client works still, it has no sense
  about the failure(suppose we have multiple gluster servers).  
 
 The client will still fail (in most cases) since host1 (if I follow you) is 
 part of the gluster groupset. Certainly if it's a distributed-only, maybe not 
 if it's a dist/repl gluster.  But if host1 goes down, the client will not be 
 able to find a gluster vol to mount.

For sure it will not fail if replication is used. 
 
  However, if
  we use the following:
  mount -t nfs -o vers=3   host1:/vol1 /mnt
 
  If host1 failed, client will lose connection to gluster servers.
 
 If the client was mounting the glusterfs via a re-export from an intermediate 
 host, you might be able to failover to another intermediate NFS server, but 
 if 
 it was a gluster host, it would fail due to the reasons above.

kernel-nfs _may_ fail over from server A to server B if B takes over the
original server's IP and some requirements are met.
You don't need an intermediate (re-exporting) server for this.
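Purely as a sketch of what "B takes over the IP" means in practice (addresses,
interface and export names are invented examples, and the hard parts -
identical export state, matching fsid and lock recovery - are not shown):

  # on server B, once A is considered dead
  ip addr add 192.168.1.10/24 dev eth0    # the service IP the clients mounted from
  arping -c 3 -U -I eth0 192.168.1.10     # gratuitous ARP so switches learn the move
  exportfs -ra                            # re-export the same paths as before

Clients that mounted 192.168.1.10:/vol1 simply keep retrying against the same
address and carry on without a remount.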

-- 
Regards,
Stephan 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] NFS availability

2013-01-31 Thread Stephan von Krawczynski
On Thu, 31 Jan 2013 12:47:30 +
Brian Candler b.cand...@pobox.com wrote:

 On Thu, Jan 31, 2013 at 09:18:26AM +0100, Stephan von Krawczynski wrote:
   The client will still fail (in most cases) since host1 (if I follow you) 
   is 
   part of the gluster groupset. Certainly if it's a distributed-only, maybe 
   not 
   if it's a dist/repl gluster.  But if host1 goes down, the client will not 
   be 
   able to find a gluster vol to mount.
  
  For sure it will not fail if replication is used. 
 
 Aside: it will *fail* if the client reboots, and /etc/fstab has
 server1:/volname, and server1 is the one which failed.

Well, this is exactly the reason we generally refuse to fetch the volfile from
the server. The whole idea is obvious nonsense for exactly the reason you
described.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] NFS availability

2013-01-31 Thread Stephan von Krawczynski
On Thu, 31 Jan 2013 09:07:50 -0800
Joe Julian j...@julianfamily.org wrote:

 On 01/31/2013 08:38 AM, Stephan von Krawczynski wrote:
  On Thu, 31 Jan 2013 12:47:30 +
  Brian Candler b.cand...@pobox.com wrote:
 
  On Thu, Jan 31, 2013 at 09:18:26AM +0100, Stephan von Krawczynski wrote:
  The client will still fail (in most cases) since host1 (if I follow you) 
  is
  part of the gluster groupset. Certainly if it's a distributed-only, 
  maybe not
  if it's a dist/repl gluster.  But if host1 goes down, the client will 
  not be
  able to find a gluster vol to mount.
  For sure it will not fail if replication is used.
  Aside: it will *fail* if the client reboots, and /etc/fstab has
  server1:/volname, and server1 is the one which failed.
  Well, this is exactly the reason we generally deny to fetch the volfile from
  the server. This whole idea is obvious nonsense for exactly the reason you
  described.
 
 That doesn't lend me much confidence in your expertise with regard to 
 your other recommendations, Stephan.
 
 There are two good ways to make this work even if a server is down:
 
   * Round robin DNS. A hostname (ie. glusterfs.domain.dom) with multiple
 A records that point to all your servers. Using that hostname in
 fstab will allow the client to roll over to the additional servers
 in the event the first one it gets is not available (ie.
 glusterfs.domain.dom:myvol /mnt/myvol glusterfs defaults 0 0).

You don't want to use DNS in an environment where security is your first rule.
If your DNS drops dead, your setup is dead. Not very promising ...
The basic goal of glusterfs has been to secure data by replicating it.
Data distribution is really not interesting for us. Now you say: go and
replicate your data for security, but use DNS to secure your setup.
???
You really seem to like domino setups. DNS dead = everything dead.

   * The mount option backupvolfile-server. An fstab entry like
 server1:myvol /mnt/myvol glusterfs backupvolfile-server=server2 0
 0 will allow the mount command to try server2 if server1 does not
 mount successfully.

And how many backup servers do you want to name in your fstab? In fact you
have to name all your servers, because otherwise there will always be at least
one situation where you are busted.
 
 This whole idea is obvious experience and forethought, not nonsense. By 
 having a management service that provides configuration, on-the-fly 
 configuration changes are possible. If one denies to fetch the volfile 
 one cripples their cluster's flexibility.

I don't know what kind of setups you run. In our environment we don't want
to fiddle around with fs configs. We want them to work as expected even if
other parts of the total setup fall apart. Flexibility in our world means you
can build widely different types of configurations. It does not mean we switch
the running configs every day just because gluster is so flexible.

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] NFS availability

2013-01-31 Thread Stephan von Krawczynski
On Thu, 31 Jan 2013 14:17:32 -0500
Jeff Darcy jda...@redhat.com wrote:

 There is *always* at least one situation, however unlikely, where you're 
 busted.  Designing reliable systems is always about probabilities.  If 
 none of the solutions mentioned so far suffice for you, there are still 
 others that don't involve sacrificing the advantages of dynamic 
 configuration.  If your network is so FUBAR that you have trouble 
 reaching any server to fetch a volfile, then it probably wouldn't do you 
 any good to have one locally because you wouldn't be able to reach those 
 servers for I/O anyway.  You'd be just asking for split brain and other 
 problems.  Redesigning the mount is likely to yield less benefit than 
 redesigning the network that's susceptible to such failures.

You are asking in the wrong direction. The simple question is: is there any
dynamic configuration as safe as a local config file?
If your local fs is dead, then you are really dead. But if it is alive, you
have a config. And that's about it. You need no working DNS, no poisoned cache
and no special server that must not fail.
Everything with less security is unacceptable. There is no probability; either
you are a dead client or a working one.
And if you really want to include the network in the question: I would expect
the gluster client-server and server-server protocols to treat network
failure as a default case. It wouldn't be useful to release a network
filesystem that drops dead in case of network errors. If there is some chance
to survive, it should be able to do so and keep working.
Most common network errors are not a matter of design, but of dead iron.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] NFS availability

2013-01-31 Thread Stephan von Krawczynski
On Thu, 31 Jan 2013 16:00:38 -0500
Jeff Darcy jda...@redhat.com wrote:

  Most common network errors are not a matter of design, but of dead
  iron.
 
 It's usually both - a design that is insufficiently tolerant of
 component failure, plus a combination of component failures that exceeds
 that tolerance.  You seem to have a very high standard for filesystems
 continuing to maintain 100% functionality - and I suppose 100%
 performance as well - if there's any possibility whatsoever that they
 could do so.  Why don't you apply that same standard to the part of the
 system that you're responsible for designing?  Running any distributed
 system on top of a deficient network infrastructure will lead only to
 disappointment.

I am sorry, but glusterfs is part of that design, and therefore part of your
criticism, too.
Everyone who has worked long enough with networks of all sizes and components
can tell you that in the end you want a design for a file service that works
as long as possible. This means it should survive even if there is only one
client, one server and one network path left.
At least that is what is expected from glusterfs. Unfortunately you sometimes
get disappointed. We saw just about everything happen when switching off all
but one reliable network path, including network hangs and server hangs (on
the last remaining one) (read the list for examples by others).
On the other end of the story, clients see servers go offline if you increase
the non-gluster traffic on the network. The main (but not the only) reason is
the very low default ping timeout (read the list for examples by others).
All these observed effects show clearly that no one ever tested this to the
extent I would have when writing this kind of software. After all, this is a
piece of software whose main purpose is surviving dead servers and networks.
It is not a question of design, because on paper everything looks promising.

Sometimes your arguments make me believe you want glusterfs to work like a
Ford car. A lot of technical gameplay built in, but the idea that a car should
be a good car in the first place got lost somewhere along the way. Quite a lot
of the features built in lately have the quality of an mp3 player in your
Ford. Nice to have, but it does not help you much when you are doing 200 and a
rabbit crosses the road.
And this is why I am requesting the equivalent of a BMW.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Self healing metadata info

2013-01-25 Thread Stephan von Krawczynski
Hi Patric,

your paper clearly shows you are infected by the fs-programmer virus :-)
No one else would put tags/gfids/inode numbers of a file into a logfile
instead of the full, true filename, simply because looking at the logfile
days/months/years later you know exactly nothing about the files affected by
e.g. a self-heal. Can you explain why a fs cannot give the user/admin the name
of the file it is currently fiddling with in the logfile, instead of a cryptic
number?

For completeness, in the split-brain case I would probably add a
gluster volume heal repvol prefer <brick> <filename>
command which prefers the file's copy on <brick> and triggers the self-heal
for that file.
In addition you could allow
gluster volume heal repvol prefer <brick>
(without a filename) to generally prefer files on <brick> and trigger
self-heal for all affected files. There are cases where admins do not care
about the actual copy but more about the accessibility of the file per se.
Everything around self-heal/split-brain is easy if you are dealing with 5
affected files. But dealing with 5000 files instead shows you that no admin
can realistically look at every single file. So he should be able to choose
some general option like gluster volume heal repvol prefer <tag>,
where <tag> can be:
<brickname> (as above)
length, always choose the longest file
date, always choose the latest file date
delete, simply remove all affected files
name-one ...
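To make the intended use concrete, a sketch only (this syntax is my proposal
and exists in no gluster release; volume, brick and file names are invented):

gluster volume heal repvol prefer server1:/export/brick1 /data/reports/q4.ods
gluster volume heal repvol prefer server1:/export/brick1
gluster volume heal repvol prefer date

The first form resolves a single file by taking the copy on the named brick,
the second prefers that brick for every affected file, and the third always
takes whichever copy carries the newest file date.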


Regards,
Stephan



On Fri, 25 Jan 2013 10:11:07 +0100
Patric Uebele pueb...@redhat.com wrote:

 Hi JPro,
 
 perhaps the attached doc does explain it a bit.
 
 Best regards,
 
 Patric
 
 On Fri, 2013-01-25 at 01:26 -0500, Java Pro wrote:
  Hi,
  
  
  If a brick is down and comes back up later, how does Glusterfs know
  which files in this brick need to be 'self-healed'?
  
  
  Since the metadata of whether to 'heal' is stored as an xattr in a
  replica on other bricks. Does Glusterfs scan these files on the other
  bricks to see if one is accusing its replica and therefore need to
  heal its replica? 
  
  
  In short, does Glusterfs keep a record of writes to a brick when a
  brick is down and apply these writes to the brick when its backup?
  
  
  
  
  Thanks,
  JPro
   
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://supercolony.gluster.org/mailman/listinfo/gluster-users
 
 -- 
 Patric Uebele 
 Solution Architect Storage
 
 Red Hat GmbH 
 Technopark II, Haus C 
 Werner-von-Siemens-Ring 14 
 85630 Grasbrunn 
 Germany 
 
 Office:+49 89 205071-162 
 Cell:  +49 172 669 14 99 
 mailto:patric.ueb...@redhat.com 
 
 gpg keyid: 48E64CC1
 gpg fingerprint: C63E 6320 A03B 4410 D208  4EE7 12FC D0E6 48E6 4CC1
 
  
 Reg. Adresse: Red Hat GmbH, Werner-von-Siemens-Ring 14, 85630 Grasbrunn 
 Handelsregister: Amtsgericht Muenchen HRB 153243 
 Geschaeftsfuehrer: Mark Hegarty, Charlie Peters, Michael Cunningham,
 Charles Cachera

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Meta

2013-01-22 Thread Stephan von Krawczynski
On Tue, 22 Jan 2013 09:05:56 -0500
Whit Blauvelt whit.glus...@transpect.com wrote:

 On Tue, Jan 22, 2013 at 08:37:03AM -0500, F. Ozbek wrote:
 [...]
 We've not only got freedom of speech. We've got freedom of guns. Still,
 walking into the meeting with your gun drawn will get you viewed as rude or
 worse. We're supposed to be data pros here, not cowboys. So, data please.
 
 Best,
 Whit

Whit, just for the sake of it.

Jeff's method of discussion is to lengthen every idea/opinion/fact into an
academic epos. This is why sometimes you simply don't have the time to argue
with him, especially if you are not paid but spending your own spare time.
Additionally, one very basic fact should be accepted: people are at different
levels of experience on this _user_ list.
Some have tested the software for years and experienced its gaps and dead
ends. Some have not. Quite a few of the pro-arguers do not accept experience
as long as you do not hard-prove it in a lengthy article that starts with a
definition of the alphabet used.
It is not really helpful to hit everyone who writes two sentences with "data
please". Quite some data can be found if you really care.
But even the long pdf someone posted lately with comparison data has
significant gaps in its presentation.
I would love to see some acceptance of the major problems the software
currently has, because without acceptance there is no path to a true solution.
Again, the design is impressive, only the implementation does not keep up.
Don't trust my words, look at the comparisons and judge for yourself.

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs performance issues

2013-01-08 Thread Stephan von Krawczynski
On Tue, 08 Jan 2013 07:04:48 -0500
Jeff Darcy jda...@redhat.com wrote:

 Timestamps are totally unreliable as a conflict resolution mechanism.  Even if
 one were to accept the dependency on time synchronization, there's still the
 possibility of drift as yet uncorrected by the synchronization protocol.  The
 change logs used by self heal are the *only* viable solution here.  If you 
 want
 to participate constructively, we could have a discussion about how those
 change logs should be set and checked, and whether a brick should be allowed 
 to
 respond to requests for a file between coming up and completion of at least 
 one
 self-heal check (Mario's example would be a good one to follow), but insisting
 on even less reliable methods isn't going to help.

Nobody besides you is talking about timestamps. I would simply choose an
increasing stamp, incremented by every write-touch of the file.
A trivial comparison then assures you choose the latest copy of the file.
There is really no time involved at all, and therefore no time
synchronisation issues.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs performance issues

2013-01-08 Thread Stephan von Krawczynski
On Mon, 07 Jan 2013 20:21:25 -0800
Joe Julian j...@julianfamily.org wrote:

  I don't know the answer. I know that they want this problem to be
  solved, but right now the best solution is hardware. The lower the
  latency, the less of a problem you'll have.
  The only solution is correct programming, no matter what the below hardware
  looks like. The only outcome of good or bad hardware is how _fast_ the
  _correct_ answer reaches the fs client.
 Yes, if you can control the programming of your application, that would 
 be a better solution. Unfortunately most of us use pre-packaged software 
 like apache, php, etc. Since most of us don't have the chance to use the 
 correct programming solution, then you're going to need to decrease 
 latency if your going to open thousands of fd's for every operation and 
 are unsatisfied with the results.

I am _not_ talking about the application software. I am talking about the
fact that everybody using glusterfs has seen glusterfs choosing the _wrong_
(i.e. old) version of a file from a brick just coming back from a down state
into the replicated unit.
In fact I have already seen just about every possibility you can think of
when accessing files, be it a simple ls or writing or reading a file.
I verified files being absent when opened although shown in ls. I saw outdated
file content although the timestamp in ls was up to date. I saw file content
being new although ls showed an outdated file date _and_ length.
Please don't tell me the fs has no inherent confusion about the various states
of different bricks.
I don't claim this happens with every file, I'm just saying it does happen.
Am I the only one with this kind of experience?

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs performance issues

2013-01-08 Thread Stephan von Krawczynski
On Tue, 08 Jan 2013 07:54:05 -0500
Jeff Darcy jda...@redhat.com wrote:

 On 1/8/13 7:11 AM, Stephan von Krawczynski wrote:
  Nobody besides you is talking about timestamps. I would simply choose an
  increasing stamp, increased by every write-touch of the file.
  In a trivial comparison this assures you choose the latest copy of the file.
  There is really no time needed at all, and therefore no time synchronisation
  issues.
 
 When you dismiss change logs and then say latest without elaboration then
 it's not unreasonable to assume you mean timestamps.  Perhaps you should try 
 to
 write more clearly.
 
 Versions are certainly an improvement over timestamps, but they're not as
 simple as you say either - and I've actually used versioning in a functional
 replication translator[1] so I'm not just idly speculating about work other
 people might do.  If two replicas are both at (integer) version X but are
 partitioned from one another, then writes to both could result in two copies
 each with version X+1 but with different data.

This can only happen with broken versioning. Obviously one would use (very
rough explanation) at least a two-shot concept: you increase the version by
one when starting the file modification and again by one when the
modification completes without error.
You end up knowing that version numbers 1,3,5,... are intermediate/incomplete
versions and 2,4,6,... are files with completed operations.
Now you can tell at any time, in any stat comparison, which file is truly
current and which one is in an intermediate state. If you want, you can even
await the completion of an ongoing modification before returning a result to
the requesting app. Yes, this would result in inherent locking.
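A tiny walk-through (numbers invented for illustration): a file sits at
version 6 on both bricks. A write starts and both copies go to 7; the write
completes and both go to 8. If brick B dies in the middle it stays at 7 while
brick A reaches 8. On comparison 8 wins, and the odd number on B additionally
marks its copy as incomplete, so the heal direction is unambiguous without any
clock or timestamp.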

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs performance issues

2013-01-08 Thread Stephan von Krawczynski
On Tue, 8 Jan 2013 08:01:16 -0500
Whit Blauvelt whit.glus...@transpect.com wrote:

 On Tue, Jan 08, 2013 at 01:11:24PM +0100, Stephan von Krawczynski wrote:
 
  Nobody besides you is talking about timestamps. I would simply choose an
  increasing stamp, increased by every write-touch of the file.
  In a trivial comparison this assures you choose the latest copy of the file.
  There is really no time needed at all, and therefore no time synchronisation
  issues.
 
 So rather than the POSIX attribute of a time stamp, which is I'm pretty sure
 what we all thought you were talking about, you're asking for a new
 xattribute? And you want that to be simply iterative? Okay, so in a
 split-brain, a file gets touched 5 times on one side, and actually written
 to just once, not touched at all, on the other. Then the system's brought
 back together. Your trivial comparison will choose the wrong file version.

What a dead-end argument. _Nothing_ will save you in case of a split-brain.
Let's clarify: a split-brain is a situation where your replication unit is
torn into its bricks and these are used independently of each other. There is
no way at all to rejoin such a situation with regard to equal files being
written to. You cannot blame versioning for another, really bad conceptual
problem. Since there is no automated solution to a split-brain, you can either
decide to give the user access to two different file versions, neither of
which keeps the original file name (to prevent irritation), or live with lost
data by choosing one of the available file versions.
Let's spell it out this way: either you want maximum availability and accept a
split-brain in the worst case (indeed very acceptable for read-only data), or
you prevent split-brain and accept downtime for _some_ of the clients by
choosing which brick is the master in this special case.
 
 That's the thing about complex systems. Trivial solutions are usually both
 simple and wrong. Some work most of the time, but there are corner cases. As
 we see with Gluster even complex solutions tend to have corner cases; but at
 least in complex solutions the corners can be whittled down.

Can they? I'd rather say if it is non-trivial it is broken most of the time.
Ask btrfs for confirmation.
 
 Regards,
 Whit

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs performance issues

2013-01-08 Thread Stephan von Krawczynski
On Tue, 08 Jan 2013 07:55:41 -0500
Jeff Darcy jda...@redhat.com wrote:

 On 1/8/13 7:35 AM, Stephan von Krawczynski wrote:
  In fact I already saw just about every possibility you can think of when
  accessing files, be it a simple ls or writing or reading a file.
 
 Would you mind citing the bug IDs for the problems you found?

Yes, I mind.
The problem with this kind of bug is that you cannot describe how to reproduce
it. Which makes such reports pretty useless as bug reports.
They can therefore only contain the information that such situations are seen,
but not much else. And I and others have said that continuously over the years
on the lists.
Take 4 physical boxes and check out some damage situations (switch bricks off
and on); you will see the described problems within a day.
You only need bonnie and ls to find out.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs performance issues

2013-01-08 Thread Stephan von Krawczynski
On Tue, 8 Jan 2013 09:25:05 -0500
Whit Blauvelt whit.glus...@transpect.com wrote:

 On Tue, Jan 08, 2013 at 02:42:49PM +0100, Stephan von Krawczynski wrote:
 
  What an dead-end argument. _Nothing_ will save you in case of a split-brain.
 
 So then, to your mind, there's _nothing_ Gluster can do to heal after a
 split brain? Some non-trivial portion of the error scenarios discussed in
 this thread result from a momentary or longer split-brain situation. I'm
 using split-brain in the broad sense of any situation where two sides of a
 replicated system are out-of-touch for some period and thus get out-of-sync.
 Isn't that exactly what we're discussing, how to heal from that? Sure, you
 can have instances of specific files beyond algorithmic treatment. But
 aren't we discussing how to ensure that the largest possible portion of the
 set of files amenable to algorithmic treatment are so-handled?

Really, about the only thing regarding split-brain (in our common sense) that
is important is that you are notified that it happened at all.
I would never recommend to joe-average-user/admin a setup that allows a real
split-brain instead of tearing down every brick besides the one configured as
master for the split-brain situation. There is no good way to avoid a lot of
nasty problems. It is not really sufficient to tell people that _most_ of the
split-brain cases can be healed. If not all, then it is better to tear down or
at least switch all but one brick to read-only.
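A hedged sketch of that policy - one designated write master, and anything that loses sight of it goes read-only. The brick names and the reachability set are illustrative, not GlusterFS options:

MASTER = "brick1"   # hypothetical designated write master

def mode_for(brick, reachable):
    """reachable: the set of bricks this node can currently see (from some heartbeat)."""
    if brick == MASTER or MASTER in reachable:
        return "read-write"
    # cut off from the master: refuse writes so the replicas cannot diverge
    return "read-only"

# during a split where brick2 only sees itself:
assert mode_for("brick2", {"brick2"}) == "read-only"
assert mode_for("brick1", {"brick1"}) == "read-write"   # the master keeps serving writes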
 
   That's the thing about complex systems. Trivial solutions are usually both
   simple and wrong. Some work most of the time, but there are corner cases. 
   As
   we see with Gluster even complex solutions tend to have corner cases; but 
   at
   least in complex solutions the corners can be whittled down.
  
  Can they? I'd rather say if it is non-trivial it is broken most of the time.
  Ask btrfs for confirmation.
 
 Pointing out that a complex system can go wrong doesn't invalidate complex
 systems as a class. It's well established in ecological science that more
 complex natural systems are far more resilient than simple ones. A rich,
 complex local ecosystem has a higher rate of stability and survival than a
 simple, poorer one. That's assuming the systems are evolved and have niches
 well-fitted with organisms - that the complexity is organic, not just
 random.

That is a good example of excluded corner cases, just like the current
split-brain discussion. All I need to do to invalidate your complex natural
system is to throw a big stone at it. Ask the dinosaurs for real-life
experience after that. People really tend to think the way you do. But most
of the complexity that you think might help is in fact worthless.
If you did not solve the basic questions completely, then added complexity
won't help.

In split-brain you have to solve only one question: who is the survivor for
writes? Every other problem or question is just a drawback of this unresolved
issue.

 Computer software, hardware, and the human culture that supports them also
 form complex, evolved ecosystems. Can there be simple solutions that help
 optimize such complex systems? Sure. But to look only for simple solutions
 is to be like the proverbial drunk looking for his keys under the
 streetlight, even though he heard them drop a half-block away, because The
 light is better here. When people try to apply simple solutions to complex,
 evolved ecosystems, the law of unintended consequences is more the rule
 than the exception. Solutions that appear simple and obvious should always
 be suspect. Granted, complex, obscure ones also require scrutiny. It's just,
 the simple stuff should never get a pass.

Where's the guy who said keep it simple?
;-)
 
 Best,
 Whit

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs performance issues - meta

2013-01-08 Thread Stephan von Krawczynski
On Tue, 8 Jan 2013 11:44:15 -0500
Whit Blauvelt whit.glus...@transpect.com wrote:

 On Tue, Jan 08, 2013 at 04:49:30PM +0100, Stephan von Krawczynski wrote:
 
   Pointing out that a complex system can go wrong doesn't invalidate complex
   systems as a class. It's well established in ecological science that more
   complex natural systems are far more resilient than simple ones. A rich,
   complex local ecosystem has a higher rate of stability and survival than a
   simple, poorer one. That's assuming the systems are evolved and have 
   niches
   well-fitted with organisms - that the complexity is organic, not just
   random.
  
  That is a good example for excluded corner cases, just like the current 
  split
  brain discussion. All I need to do to your complex natural system to
  invalidate is to throw a big stone on it. Ask dinosaurs for real life
  experience after that. 
 
 Throw a big enough stone and anything can be totally crushed. The question
 is one of resilience when the stone is less than totally crushing. The
 ecosystem the big stone was thrown at which included the dinosaurs survived,
 because in its complexity it also included little mammals - which themselves
 were more complex organisms than the dinosaurs. Not that some simpler
 organisms didn't make it through the extinction event too. Plenty did. The
 chicken I ate for dinner is a descendant of feathered dinosaurs.
 
 Take two local ecosystems, one more complex than the other. Throw in some
 big disturbance, the same size of disruption in each. On average, the
 complex local ecosystem is more likely to survive and bounce back, while the
 simple one is more likely to go into terminal decline. This is field data,
 not mere conjecture. Your argument here could be that technological systems
 don't obey the same laws as ecosystems. But work in complexity theory shows
 that the right sorts of complexity produce greater stability across a broad
 range of systems, not just biological ones. 
 
 Free, open source software's particular advantage is that it advances in a
 more evolutionary manner than closed software, since there is evolutionary
 pressure from many directions on each part of it, at every scale.
 Evolutionary pressure produces complexity, the _right sort_ of complexity.
 That's why Linux systems are more complex, and at the same time more stable
 and manageable, than Windows systems. 
 
 Simplicity does not have the advantage. Even when smashing things with
 rocks, the more complex thing is more likely to survive the assault, if it
 has the right sort of complexity.

Listen, I don't really want to lengthen the discussion about complexity issues
in ecosystems. But let me point out that the fundamental flaw in your example,
as you turn it now, is that a natural ecosystem has no _goal_ of existence,
whereas programmed code should at least have _some_.
Which means you can take it as a negative example of why something does not
work, but you cannot use it as a positive example of why something should
work. Glusterfs(d) has a clearly stated goal of being; an ecosystem has not.
So the mere fact that _something_ survived a crashing ecosystem does not prove
that equally complex code does something useful after an equally complex
crash. In fact it most certainly does not.
The contrary is true. You should strip down complexity to the lowest possible
level to make the code more obvious and therefore more debuggable and readable
to a larger number of people. That will have a positive effect on its
stability. But if you throw in more and more code for fragile corner cases
instead of drawing a clear line between what clearly works and what clearly
fails, you will not end up at the desired state where everything works
reliably.
This path is as wrong as it was to release a complete fileserver installation
image back in the old days of glusterfs.
In the end, everything boils down to the question of where efforts are best
invested in order to make the project more successful. And it's really not
that hard to find out what the biggest show stopper is: simply count the
articles on the list dealing with performance issues and strange effects of
not-synced files during _normal_ operation. There is not much left.
Read the fs comparison between NFS, Samba, Ceph and GlusterFS in a German
Linux magazine lately? Guess who came last...

 Best,
 Whit

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs performance issues

2013-01-07 Thread Stephan von Krawczynski
On Mon, 07 Jan 2013 13:19:49 -0800
Joe Julian j...@julianfamily.org wrote:

 You have a replicated filesystem, brick1 and brick2.
 Brick 2 goes down and you edit a 4k file, appending data to it.
 That change, and the fact that there is a pending change, is stored on 
 brick1.
 Brick2 returns to service.
 Your app wants to append to the file again. It calls stat on the file. 
 Brick2 answers first stating that the file is 4k long. Your app seeks to 
 4k and writes. Now the data you wrote before is gone.

Forgive my ignorance, but it is obvious that this implementation of a stat on
a replicating fs is shit. Of course a stat should await _all_ returning local
stats, choose the stat of the _latest_ file version, and note that the file
needs self-heal.
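A minimal sketch of the append race described above and of the behaviour asked for here, assuming a pure append workload so that the larger reported size is the newer version; the StatReply shape and the brick names are illustrative, not GlusterFS internals:

from dataclasses import dataclass

@dataclass
class StatReply:
    brick: str
    size: int   # the file length this replica currently believes in

def first_answer(replies):
    # what is criticised above: trust whichever brick answers first
    return replies[0]

def latest_answer(replies):
    # what is asked for: wait for all replicas, take the most advanced view,
    # and remember that any lagging replica needs self-heal
    best = max(replies, key=lambda r: r.size)
    needs_heal = any(r.size != best.size for r in replies)
    return best, needs_heal

# brick2 was down during the first append and still reports the old 4 KiB size
replies = [StatReply("brick2", 4096), StatReply("brick1", 8192)]
stale = first_answer(replies)           # app seeks to 4096 and clobbers the new data
fresh, heal = latest_answer(replies)    # app seeks to 8192, brick2 is flagged for heal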
 
 This is one of the processes by which stale stat data can cause data 
 loss. That's why each lookup() (which precedes the stat) causes a 
 self-heal check and why it's a problem that hasn't been resolved in the 
 last two years.

Self-heal is no answer to this question. The only valid answer is choosing the
_latest_ file version, no matter whether self-heal is necessary or not.
 
 I don't know the answer. I know that they want this problem to be 
 solved, but right now the best solution is hardware. The lower the 
 latency, the less of a problem you'll have.

The only solution is correct programming, no matter what the underlying
hardware looks like. The only outcome of good or bad hardware is how _fast_
the _correct_ answer reaches the fs client.

Your description is a satire, is it not?

 
 On 01/07/2013 12:59 PM, Dennis Jacobfeuerborn wrote:
  On 01/07/2013 06:11 PM, Jeff Darcy wrote:
  On 01/07/2013 12:03 PM, Dennis Jacobfeuerborn wrote:
  The gm convert processes make almost no progress even though on a 
  regular
  filesystem each call takes only a fraction of a second.
  Can you run gm_convert under strace?  That will give us a more accurate
  idea of what kind of I/O it's generating.  I recommend both -t and -T to
  get timing information as well.  Also, it never hurts to file a bug so
  we can track/prioritize/etc.  Thanks.
 
  https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
  Thanks for the strace hint. As it turned out the gm convert call was issued
  on the filename with a [0] appended which apparently led gm to stat() all
  (!) files in the directory.
 
  While this particular problem isn't really a glusterfs problem is there a
  way to improve the stat() performance in general?
 
  Regards,
 Dennis
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://supercolony.gluster.org/mailman/listinfo/gluster-users
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
 


-- 
MfG,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Turning GlusterFS into something else (was Re: how well will this work)

2012-12-30 Thread Stephan von Krawczynski
On Sun, 30 Dec 2012 10:13:52 -0500
Jeff Darcy jda...@redhat.com wrote:

 On 12/27/12 3:36 PM, Stephan von Krawczynski wrote:
  And the same goes for glusterfs. It _could_ be the greatest fs on earth, but
  only if you accept:
  
  1) Throw away all non-linux code. Because this war is over since long.
 
 Sorry, but we do have non-Linux users already and won't abandon them.  We
 wouldn't save all that much time even if we did, so it just doesn't make 
 sense.

Jeff, really, if you argue, please state your argument openly. You don't want
this point because its next logical step would be my point 2), the kernel
implementation. As long as you hold up dead boxes like Oracle-owned Solaris
you have a good reason for not doing 2). Success needs focus. If you try to
be everybody's darling you may well end up being dropped by everybody because
you are not good enough.
 
  2) Make a kernel based client/server implementation. Because it is the only
  way to acceptable performance.
 
 That's an easy thing to state, but a bit harder to prove.

Come on, how old are you? Can you remember userspace NFS? In case you cannot:
it had just about the same problems glusterfs has today, and guess why it is
gone...

 [a lot of bad examples deleted]

Really, you cannot prove you are right by naming some examples that are even
more horrible. 

  3) Implement true undelete feature. Make delete a move to a deleted-files 
  area.
 
 Some people want that, some people do not.

Haha! A good argument for a config parameter :-) - I would have suggested that
anyway.

  Some are even precluded from using
 it e.g. for compliance reasons.  It's hardly a must-have feature.  In any 
 case,
 it already exists - called landfill I believe, though I'm not sure of its
 support status or configurability via the command line.  If it didn't exist, 
 it
 would still be easy to create - which wouldn't be the case at all if we
 followed your advice to put this in the kernel.

Now I wonder how you argue about this. Let me bring in an analogy you will
probably hate. The Linux MM uses free memory to cache just about anything you
can think of. This drives W*indows users on Android crazy. They always install
the latest kill-all-unneeded-apps tool so they can read a big number in the
free-memory statistics. They do not understand that free memory is in fact
wasted memory. And the same thing goes for disk space. If I delete something
on a disk that is far from full, it is just plain dumb to really erase this
data from the disk. It won't help anyone. It will only hurt you if you deleted
it accidentally. Read my lips: free disk space is wasted space, just like free
mem is wasted mem.
And _that_ is the true reason for undelete. It won't hurt anybody, and it will
help some. And since it is the true goal of a fs to organise data on a drive,
it is most obvious that undelete (you may call it lazy-delete) is a very basic
fs feature and _not_ an add-on patched onto it.
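A minimal sketch of such a lazy-delete, assuming a hypothetical per-brick trash directory; this is illustrative only and is not GlusterFS's own landfill/trash handling:

import os
import shutil
import time

TRASH = "/bricks/brick1/.deleted"   # hypothetical trash area on the same filesystem

def lazy_delete(path):
    # instead of unlinking, keep the data around while disk space is plentiful
    os.makedirs(TRASH, exist_ok=True)
    target = os.path.join(TRASH, "%d-%s" % (int(time.time()), os.path.basename(path)))
    shutil.move(path, target)       # on the same filesystem this is a cheap rename
    return target                   # handle needed for a later undelete

def undelete(trashed, original):
    # restoring is just the reverse rename
    shutil.move(trashed, original)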

  If it's a priority for you and
 existing facilities do not suffice, then I suggest adding a feature page on 
 the
 wiki and/or an enhancement-request bug report, so that we can incorporate that
 feedback into our planning process.  Thank you for your help making GlusterFS
 better.

[politics end] 
Jeff, this is really not a technical question we are talking about. It's more
a question of a management decision. If Red Hat wants a truly successful
glusterfs, someone has to decide to follow my steps. If the stuff was only
bought because it looked interesting and no one else should use its true
potential, well, then go ahead.

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Turning GlusterFS into something else (was Re: how well will this work)

2012-12-30 Thread Stephan von Krawczynski
On Sun, 30 Dec 2012 12:29:53 -0800
Joe Julian j...@julianfamily.org wrote:

 Here's where you're getting labeled as a Troll. You have a tendency to do 
 this on just about every mailing list except LKML (not sure why they get 
 your love over others, but to each their own).

There is one basic difference between LKML and almost every other project you
probably saw me posting to. The kernel project has _one_ head who has proven
to make real _management_ decisions in his project. Sometimes they look rude,
sometimes they are a bit late, very often they are just in time or even early.
And if you read the archives you will probably notice one or two times where I
requested a _decision_ on fundamental strategies. Probably you remember me
being laughed at when I suggested making CPUs hot-pluggable years ago. Nobody
thought of the implications back then. Nowadays CPU hotplug is in every
ARM-driven multicore Android phone. I am not Jesus. Only sometimes I can read
the writing on the wall a bit earlier than others do, that's all.

 You come in, spout some 
 diatribe claiming how you know better than everybody else to the point 
 of being told that this is the last post I'm going to make on this 
 subject. You don't work with the developers, you antagonize them. I 
 still don't see the features you're asking for on the wiki, nor in bugzilla.
 
 You obviously have some knowledge of C judging by your analysis of 
 issues in LKML and patch offers relating to the same. Why not offer your 
 abilities in a constructive way by using the tools we make publicly 
 available?

From writing lots of lines of code in C and quite a bunch of other languages
over roughly the last 30 years, I can tell you that the biggest effect of
things I did is not based on released code but on exactly this kind of
discussion. One of the fundamental problems in open source is that quite a few
good projects die because nobody has the guts to say that the basic direction
needs correction. I know that most people do not want to hear that;
nevertheless someone has to stand up and say this is sh*t if it really is. And
if nobody else does, I do. At the end of the day most people may hate me for
that, but if the project got better, I don't give a damn. I am no team player;
I believe in one man, one vision.

-- 
Regards,
Stephan


___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] how well will this work

2012-12-27 Thread Stephan von Krawczynski
On Wed, 26 Dec 2012 22:04:09 -0800
Joe Julian j...@julianfamily.org wrote:

 It would probably be better to ask this with end-goal questions instead 
 of with a unspecified critical feature list and performance problems.
 
 6 months ago, for myself and quite an extensive (and often impressive) 
 list of users there were no missing critical features nor was there any 
 problems with performance. That's not to say that they did not meet your 
 design specifications, but without those specs you're the only one who 
 could evaluate that.

Well, then the list of users does obviously not contain me ;-)
The damn thing will only become impressive if a native kernel client module is
done. FUSE is really a pain.
And read my lips: the NFS implementation has general load/performance problems.
Don't be surprised if it jumps into your face.
Why on earth do they think linux has NFS as kernel implementation?
-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] how well will this work

2012-12-27 Thread Stephan von Krawczynski
Dear JM,

unfortunately one has to tell openly that the whole concept that is tried here
is simply wrong. The problem is not the next-bug-to-fix. The problem is the
client strategy in user space. It is broken by design. You can either believe
this or go ahead ignoring it and never really get a good and stable setup.
Really, the whole we-close-our-eyes-and-hope-it-will-turn-out-well strategy
looks just like btrfs. Read the archives, I told them years ago it will not
work out in our life time. And today, still they have no ready-for-production
fs, and believe me: it never will be there.
And the same goes for glusterfs. It _could_ be the greatest fs on earth, but
only if you accept:

1) Throw away all non-linux code. Because this war is over since long.
2) Make a kernel based client/server implementation. Because it is the only
way to acceptable performance.
3) Implement true undelete feature. Make delete a move to a deleted-files area.

These are the minimal steps to take for a real success, everything else is
just beating the dead horse. 

Regards,
Stephan



On Thu, 27 Dec 2012 10:03:10 -0500 (EST)
John Mark Walker johnm...@redhat.com wrote:

 Look, fuse its issues that we all know about. Either it works for you or it 
 doesn't. If fuse bothers you that much, look into libgfapi. 
 
 Re: NFS - I'm trying to help track this down. Please either add your comment 
 to an existing bug or create a new ticket. 
 
 Either way, ranting won't solve your problem or inspire anyone to fix it. 
 
 -JM
 
 
 Stephan von Krawczynski sk...@ithnet.com wrote:
 
 On Wed, 26 Dec 2012 22:04:09 -0800
 Joe Julian j...@julianfamily.org wrote:
 
  It would probably be better to ask this with end-goal questions instead 
  of with a unspecified critical feature list and performance problems.
  
  6 months ago, for myself and quite an extensive (and often impressive) 
  list of users there were no missing critical features nor was there any 
  problems with performance. That's not to say that they did not meet your 
  design specifications, but without those specs you're the only one who 
  could evaluate that.
 
 Well, then the list of users does obviously not contain me ;-)
 The damn thing will only become impressive if a native kernel client module is
 done. FUSE is really a pain.
 And read my lips: the NFS implementation has general load/performance 
 problems.
 Don't be surprised if it jumps into your face.
 Why on earth do they think linux has NFS as kernel implementation?
 -- 
 Regards,
 Stephan
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
 


___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] how well will this work

2012-12-27 Thread Stephan von Krawczynski
On Thu, 27 Dec 2012 13:24:55 -0800
Dan Cyr d...@truenorthmanagement.com wrote:

 I also don't think this is a rant. I, as well, have been following this list 
 for a few years, and have been waiting for GlusterFS to stabilize for VM 
 deployment. I hope this discussion helps the devs understand areas that 
 people are waiting for.
 
 We have 2 SAN servers with Infiniband connections to a Blade Center. I would 
 like all the KVM VMs hosted on the SAN with the ability to add more SAN 
 servers in the future. - Currently Gluster allows this via NFS but I've read 
 about performance issues. - So, right now, after 2 years of not deploying 
 this gear (and running the VM images on each blade), I am looking for an 
 expandable solution for the backend storage so I stop manually babying this 
 network and install OpenNebula so I'm not the only person in our office who 
 can manage our VM infrastructure.
 
 This does fit into the OP's question because I would love to see GlusterFS 
 work like this.
 
 Miles - As it is right now, GlusterFS is not what you want for backend VM storage.
   Question: "how well will this work"
   Answer: "horribly"
 
 Dan
 
 
 From: gluster-users-boun...@gluster.org 
 [mailto:gluster-users-boun...@gluster.org] On Behalf Of John Mark Walker
 Sent: Thursday, December 27, 2012 12:39 PM
 To: Stephan von Krawczynski
 Cc: gluster-users@gluster.org
 Subject: Re: [Gluster-users] how well will this work
 
 
 Stephan,
 
 I'm going to make this as simple as possible. Every message to this list 
 should follow these rules:
 
 1. be helpful
 2. be constructive
 3. be respectful
 
 I will not tolerate ranting that serves no purpose. If your message doesn't 
 follow any of the rules above, then you shouldn't be posting it.
 
 This is your 2nd warning.
 
 -JM


Hola JM,

are you aware that your above message has arrived at my side neither through
the list nor through personal mail?
Does this mean I was deleted from the list by you?

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Client-side GlusterFS

2012-12-27 Thread Stephan von Krawczynski
I have followed both lists quite some time longer than you or Red Hat have
been here. The basic idea of the project is good; the implementation idea is
mostly wrong. Quite a few users follow the lists and are still hoping for
better times. Very few have told you so now - again.
If you are not acting politically here, then ask yourself why userspace NFS on
Linux went dead years ago, long before you re-implemented it in glusterfs.
I really don't want to teach you things everyone who knows the past should
have understood. I am only a reminder. Don't revive the dinosaurs; they are
extinct for good reasons. The future will not get better if you follow a path
only because it's easy. I honor the original decision to use userspace,
because it made implementation a lot easier and the goal was to show that the
whole thing is possible at all. But years have gone by, and the time to become
production ready arrived some time ago. And production readiness needs kernel
modules.
I made my point. Let's see how things look in a year or two. I will remind you
- again.

Regards,
Stephan


On Thu, 27 Dec 2012 18:33:10 -0500 (EST)
John Mark Walker johnm...@redhat.com wrote:

 If you feel that our strategy on the client side is broken, while I respect 
 that opinion, its kind of a pointless discussion. We made the architectural 
 decisions we made understanding the tradeoffs as they were - which have been 
 enumerated on this list numerous times.
 
 In any case, if you want to have an architectural discussion or debate, 
 that's better directed towards gluster-devel, and that's a discussion we 
 welcome. However, this list is gluster-users, which as the name implies, is 
 about users of the software as it exists today, warts and all. 
 
 Feel free to use the wiki to develop any thoughts you may have regarding 
 ideal architectures. Even better if you can round up developers to implement 
 said architecture.
 
 -JM
 
 
 Stephan von Krawczynski sk...@ithnet.com wrote:
 
 Dear JM,
 
 unfortunately one has to tell openly that the whole concept that is tried here
 is simply wrong. The problem is not the next-bug-to-fix. The problem is the
 client strategy in user space. It is broken by design. You can either believe
 this or go ahead ignoring it and never really get a good and stable setup.
 Really, the whole we-close-our-eyes-and-hope-it-will-turn-out-well strategy
 looks just like btrfs. Read the archives, I told them years ago it will not
 work out in our life time. And today, still they have no ready-for-production
 fs, and believe me: it never will be there.
 And the same goes for glusterfs. It _could_ be the greatest fs on earth, but
 only if you accept:
 
 1) Throw away all non-linux code. Because this war is over since long.
 2) Make a kernel based client/server implementation. Because it is the only
 way to acceptable performance.
 3) Implement true undelete feature. Make delete a move to a deleted-files 
 area.
 
 These are the minimal steps to take for a real success, everything else is
 just beating the dead horse. 
 
 Regards,
 Stephan
 
 
 
 On Thu, 27 Dec 2012 10:03:10 -0500 (EST)
 John Mark Walker johnm...@redhat.com wrote:
 
  Look, fuse its issues that we all know about. Either it works for you or it 
  doesn't. If fuse bothers you that much, look into libgfapi. 
  
  Re: NFS - I'm trying to help track this down. Please either add your 
  comment to an existing bug or create a new ticket. 
  
  Either way, ranting won't solve your problem or inspire anyone to fix it. 
  
  -JM
  
  
  Stephan von Krawczynski sk...@ithnet.com wrote:
  
  On Wed, 26 Dec 2012 22:04:09 -0800
  Joe Julian j...@julianfamily.org wrote:
  
   It would probably be better to ask this with end-goal questions instead 
   of with a unspecified critical feature list and performance problems.
   
   6 months ago, for myself and quite an extensive (and often impressive) 
   list of users there were no missing critical features nor was there any 
   problems with performance. That's not to say that they did not meet your 
   design specifications, but without those specs you're the only one who 
   could evaluate that.
  
  Well, then the list of users does obviously not contain me ;-)
  The damn thing will only become impressive if a native kernel client module 
  is
  done. FUSE is really a pain.
  And read my lips: the NFS implementation has general load/performance 
  problems.
  Don't be surprised if it jumps into your face.
  Why on earth do they think linux has NFS as kernel implementation?
  -- 
  Regards,
  Stephan
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://supercolony.gluster.org/mailman/listinfo/gluster-users
  
 
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
 

___
Gluster-users mailing list
Gluster-users

Re: [Gluster-users] Meta-discussion

2012-12-27 Thread Stephan von Krawczynski
Sorry, JM, forgive my ignorance, but what you say simply does not match up.
First you say:

In general, I don't recommend any distributed filesystems for VM images, but
I can also see that this is the wave of the future. 

Which means you do not believe at all in one major goal of this fs. Huh?
And then:

I am sorry that you haven't been able to deploy glusterfs in production.
Discussing how and why glusterfs works - or doesn't work - for particular use
cases is welcome on this list. Starting off a discussion about how the entire
approach is unworkable is kind of counter-productive and not exactly helpful
to those of us who just want to use the thing.

Now how can you expect productive input on a question where you yourself do
not believe an answer is possible at all?
I mean, you expect it to fail anyway but nevertheless want people to spend
their time? Most of us are _not_ paid for debugging glusterfs. Are you paid
for it? And you do not believe in the project anyway (you said so above)?
I am astonished ...

Regards,
Stephan




 
 Sean Fulton s...@gcnpublishing.com wrote:
 
 I didn't think his message violated any of your rules. Seems to me he 
 has some disagreements with the approach being used to develop Gluster. 
 I think you should listen to people who disagree with you.
 
  From monitoring this list for more than a year and 
 tried--unsuccessfully--to put Gluster into production use, I think there 
 are a lot of people who have problems with stability.
 
 So please, can you respond to his comments with why his suggestions are 
 invalid?
 
 sean
 
 
 On 12/27/2012 03:39 PM, John Mark Walker wrote:
 
  Stephan,
 
  I'm going to make this as simple as possible. Every message to this 
  list should follow these rules:
 
  1. be helpful
  2. be constructive
  3. be respectful
 
  I will not tolerate ranting that serves no purpose. If your message 
  doesn't follow any of the rules above, then you shouldn't be posting it.
 
  This is your 2nd warning.
 
  -JM
 
 
 
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://supercolony.gluster.org/mailman/listinfo/gluster-users
 
 -- 
 Sean Fulton
 GCN Publishing, Inc.
 Internet Design, Development and Consulting For Today's Media Companies
 http://www.gcnpublishing.com
 (203) 665-6211, x203
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users
 


___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Renaming a file in a distributed volume

2012-10-13 Thread Stephan von Krawczynski
On Sat, 13 Oct 2012 15:52:56 +0100
Brian Candler b.cand...@pobox.com wrote:

 In a distributed volume (glusterfs 3.3), files within a directory are
 assigned to a brick by a hash of their filename, correct?
 
 So what happens if you do mv foo bar? Does the file get copied to another
 brick? Is this no longer an atomic operation?
 
 Thanks,
 
 Brian.

In fact it has never been atomic.
Take a look at my corresponding bug report from back then...
You can use a small script to show it is not.
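A hedged illustration of why such a rename cannot be one atomic step when placement follows the file name; the md5-modulo placement and the four-brick layout are assumptions for the example, not GlusterFS's actual elastic-hash or linkfile logic:

import hashlib

BRICKS = ["brick-0", "brick-1", "brick-2", "brick-3"]   # hypothetical 4-brick layout

def brick_for(name):
    # place a file purely by a hash of its name (an illustration, not the real DHT)
    h = int(hashlib.md5(name.encode()).hexdigest(), 16)
    return BRICKS[h % len(BRICKS)]

old, new = brick_for("foo"), brick_for("bar")
if old != new:
    # "mv foo bar" now spans two bricks: the data (or at least a pointer to it)
    # must appear under the new hash location, so more than one operation is
    # involved and a crash in between is observable.
    print("rename crosses bricks:", old, "->", new)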

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Throughout over infiniband

2012-09-10 Thread Stephan von Krawczynski
On Mon, 10 Sep 2012 08:48:03 +0100
Brian Candler b.cand...@pobox.com wrote:

 On Sun, Sep 09, 2012 at 09:28:47PM +0100, Andrei Mikhailovsky wrote:
 While trying to figure out the cause of the bottleneck i've realised
 that the bottle neck is coming from the client side as running
 concurrent test from two clients would give me about 650mb/s per each
 client.
 
 Yes - so in workloads where you have many concurrent clients, this isn't a
 problem.  It's only a problem if you have a single client doing a lot of
 sequential operations.

That is not correct for most cases. GlusterFS always has a problem on clients
with high workloads. This obviously derives from the fact that the FS is
userspace-based. If other userspace applications eat lots of cpu your FS comes
to a crawl.

 [...]
 Have you tried doing exactly the same test but over NFS? I didn't see that
 in your posting (you only mentioned NFS in the context of KVM)

And as I said above, NFS (the kernel version) has no problem at all in these
scenarios.
And it does not have the GlusterFS problems with multiple concurrent FS
actions on the same client either. Nor is there a problem with maximum
bandwidth.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] XFS and MD RAID

2012-09-10 Thread Stephan von Krawczynski
On Mon, 10 Sep 2012 09:39:18 +0100
Brian Candler b.cand...@pobox.com wrote:

 On Mon, Sep 10, 2012 at 09:29:25AM +0800, Jack Wang wrote:
  below patch should fix your bug.
 
 Thank you Jack - that was a very quick response! I'm building a new kernel
 with this patch now and will report back.
 
 However, I think the existence of this bug suggests that Linux with software
 RAID is unsuitable for production use.  There has obviously been no testing
 of basic critical functionality like hot-plugging drives, and serious
 regressions are introduced into supposedly stable kernels.

Brian, please re-think this. What you call a stable kernel (Ubuntu 3.2.0-30)
is indeed very old.
If you want to check an MD RAID you should really use a stock kernel from
kernel.org (probably 3.4.10).
_That_ is the latest stable kernel.
 
 So I'm now on the lookout for a 24-port SATA RAID controller with good Linux
 support. What are my options?
 
 Googling I have found:
 
 * 3ware 9650SE-24
 * Areca ARC-1280ML
 * LSI MegaRAID 9280-24i (newer SAS/SATA)
 * Areca ARC-1882ix-24 (newer SAS/SATA)

I can tell you that I just had to throw away Areca because it had exactly the
problem you don't like: drives going offline for no good reason.
I went back to MD with the very same drives in the very same box, online,
using the onboard SATA (6 ports), which works flawlessly.
My impression is that Areca has trouble with new big drives of 2 TB and above.
The 1 TB drives worked ok.
I have some 3ware too, but have not checked them with 2 TB drives so far.
I must say I would probably drop them anyway, simply because current
processors are faster with MD. I just built a box with a Xeon E3-1280v2 and an
MD RAID of 4x2 TB drives, and I am impressed by the performance.

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Throughout over infiniband

2012-09-10 Thread Stephan von Krawczynski
On Mon, 10 Sep 2012 09:44:26 +0100
Brian Candler b.cand...@pobox.com wrote:

 On Mon, Sep 10, 2012 at 10:03:14AM +0200, Stephan von Krawczynski wrote:
   Yes - so in workloads where you have many concurrent clients, this isn't a
   problem.  It's only a problem if you have a single client doing a lot of
   sequential operations.
  
  That is not correct for most cases. GlusterFS always has a problem on 
  clients
  with high workloads. This obviously derives from the fact that the FS is
  userspace-based. If other userspace applications eat lots of cpu your FS 
  comes
  to a crawl.
 
 It's only obvious if your application is CPU-bound, rather than I/O-bound.

I think one can drop the 5% market share that uses storage only for storing
_big_ files from client boxes with zero load. This is about the only case
where GlusterFS works ok, if you don't mind the throughput problem of FUSE at
high rates.
If you have small files you are busted, if you have workload on the clients
you are busted, and if you have lots of concurrent FS action on the client you
are busted. Which leaves you with test cases nowhere near real life.
I replaced NFS servers with glusterfs and I know what goes on in these setups
afterwards. If you're lucky you reach something like 1/3 of the NFS
performance.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Throughout over infiniband

2012-09-10 Thread Stephan von Krawczynski
On Mon, 10 Sep 2012 08:06:51 -0400
Whit Blauvelt whit.glus...@transpect.com wrote:

 On Mon, Sep 10, 2012 at 11:13:11AM +0200, Stephan von Krawczynski wrote:
  [...] 
  If you're lucky you reach something like 1/3 of the NFS
  performance.
 [Gluster NFS Client]
 Whit

There is a reason why one would switch from NFS to GlusterFS, and mostly it is
redundancy. If you start using an NFS-type client you cut yourself off from the
complete solution. As said elsewhere, you can just as well export GlusterFS via
the kernel NFS server. But honestly, it is a patch. It would be far better if
things were done right: a native glusterfs client in kernel space.
And remember, generally there should be no big difference between NFS and
GlusterFS with bricks spread over several networks - if it is done the way it
should be, without userspace.

-- 
MfG,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Throughout over infiniband

2012-09-09 Thread Stephan von Krawczynski
Ok, now you can see why I am talking about dropping the long-gone Unix
versions (BSD/Solaris/name one) and concentrating on a Linux kernel module for
glusterfs without the FUSE overhead. It is the _only_ way to make this project
a really successful one. Everything happening now is just a project pre-test
environment. And saying that openly is the reason why quite a few people
dislike my comments...

Please stop riding dead horses, guys.
-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Ownership changed to root

2012-08-28 Thread Stephan von Krawczynski
On Mon, 27 Aug 2012 18:43:27 +0100
Brian Candler b.cand...@pobox.com wrote:

 On Mon, Aug 27, 2012 at 03:08:21PM +0200, Stephan von Krawczynski wrote:
  The gluster version is 2.X and cannot be changed.
 
 Ah, that's the important bit. If you have a way to replicate the problem
 with current code it will be easier to get someone to look at it.

Again, let me note two things:
- the current code has a lot more (other) problems than the 2.X tree; that is
why we won't use it.
- if one has to look at the code to find out the basic problem, he is not the
target audience for our question.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Ownership changed to root

2012-08-28 Thread Stephan von Krawczynski
On Tue, 28 Aug 2012 09:21:57 +0100
Brian Candler b.cand...@pobox.com wrote:

 On Tue, Aug 28, 2012 at 10:01:16AM +0200, Stephan von Krawczynski wrote:
  Again, let me note two things:
  - the current code has a lot more (other) problems than the 2.X tree, that 
  is
  why we won't use that.
  - if one has to look at the code to find out the basic problem he is not the
  target person of our question.
 
 To which I would suggest that if such a fundamental problem were known
 about, it would have been fixed long ago.
 
 IMO your best bet is to raise a bug report in bugzilla.

It is obvious I cannot do that because the only answer will be to update to a
current version and re-file the report if the problem still persists.

I am well aware though that the problem is quite fundamental for a fs...

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] FeedBack Requested : Changes to CLI output of 'peer status'

2012-08-28 Thread Stephan von Krawczynski
Top posting and kidding is a bit exaggerated for one posting ...

You are not seriously talking about 80 char terminals for an output that is
commonly used by scripts and stuff like nagios, are you?


On Tue, 28 Aug 2012 08:46:22 -0400 (EDT)
Pranith Kumar Karampuri pkara...@redhat.com wrote:

 hi Amar,
  This is the format we considered initially but we did not go with this 
 because it may exceed 80 chars and wrap over for small terminals if we want 
 to add more fields in future.
 
 Pranith.
 - Original Message -
 From: Amar Tumballi ama...@redhat.com
 To: Gluster Devel gluster-de...@nongnu.org, gluster-users 
 gluster-users@gluster.org
 Sent: Tuesday, August 28, 2012 4:36:07 PM
 Subject: [Gluster-users] FeedBack Requested : Changes to CLI output of 'peer  
 status'
 
 Hi,
 
 Wanted to check if any one is using gluster CLI output of 'peer status' 
 in their scripts/programs? If yes, let me know. If not, we are trying to 
 make it more script friendly.
 
 For example the current output would look something like:
 
 -
 Hostname: 10.70.36.7
 Uuid: c7283ee7-0e8d-4cb8-8552-a63ab05deaa7
 State: Peer in Cluster (Connected)
 
 Hostname: 10.70.36.6
 Uuid: 5a2fdeb3-e63e-4e56-aebe-8b68a5abfcef
 State: Peer in Cluster (Connected)
 
 -
 
 New changes would make it look like :
 
 ---
 UUID  Hostname   Status
 c7283ee7-0e8d-4cb8-8552-a63ab05deaa7  10.70.36.7 Connected
 5a2fdeb3-e63e-4e56-aebe-8b68a5abfcef  10.70.36.6 Connected
 
 ---
 
 If anyone has better format, or want more information, let us know now. 
 I would keep timeout for this mail as 3 more working days, and without 
 any response, we will go ahead with the change.
 
 Regards,
 Amar
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 
 ___
 Gluster-devel mailing list
 gluster-de...@nongnu.org
 https://lists.nongnu.org/mailman/listinfo/gluster-devel
 


-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] FeedBack Requested : Changes to CLI output of 'peer status'

2012-08-28 Thread Stephan von Krawczynski
Ok, maybe I didn't explain the true nature in detail:
The number of fields and the formatting are all the same to me; nobody wants
to read the output. Instead it is read by scripts most of the time. So the
only valid question is the field delimiter, simply to make the output as easy
as possible to parse for scripts. There is no human in front of a terminal who
really likes to read this output all day long. Does that make the point clear?
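For what it's worth, a sketch of how a monitoring script might consume the tabular format proposed in this thread; the column layout is taken from the example quoted below, not from any released CLI:

def parse_peer_status(text):
    """Parse the proposed tabular 'peer status' output into a list of dicts."""
    # drop blank lines and the dashed separator lines
    rows = [l for l in text.splitlines() if l.strip() and set(l.strip()) != {"-"}]
    header, *entries = rows
    peers = []
    for line in entries:
        uuid, hostname, status = line.split(None, 2)
        peers.append({"uuid": uuid, "hostname": hostname, "status": status})
    return peers

sample = """\
UUID                                  Hostname   Status
c7283ee7-0e8d-4cb8-8552-a63ab05deaa7  10.70.36.7 Connected
5a2fdeb3-e63e-4e56-aebe-8b68a5abfcef  10.70.36.6 Connected
"""
assert [p["hostname"] for p in parse_peer_status(sample)] == ["10.70.36.7", "10.70.36.6"]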



On Tue, 28 Aug 2012 09:57:13 -0400 (EDT)
Pranith Kumar Karampuri pkara...@redhat.com wrote:

 No. Output formats in that way generally start out nice but as you start 
 adding more fields, formatting them becomes difficult IMO.
 
 Pranith
 - Original Message -
 From: Stephan von Krawczynski sk...@ithnet.com
 To: Pranith Kumar Karampuri pkara...@redhat.com
 Cc: gluster-users gluster-users@gluster.org, Gluster Devel 
 gluster-de...@nongnu.org
 Sent: Tuesday, August 28, 2012 7:01:57 PM
 Subject: Re: [Gluster-devel] [Gluster-users] FeedBack Requested : Changes to 
 CLI output of 'peer status'
 
 Top posting and kidding is a bit exaggerated for one posting ...
 
 You are not seriously talking about 80 char terminals for an output that is
 commonly used by scripts and stuff like nagios, are you?
 
 
 On Tue, 28 Aug 2012 08:46:22 -0400 (EDT)
 Pranith Kumar Karampuri pkara...@redhat.com wrote:
 
  hi Amar,
   This is the format we considered initially but we did not go with this 
  because it may exceed 80 chars and wrap over for small terminals if we want 
  to add more fields in future.
  
  Pranith.
  - Original Message -
  From: Amar Tumballi ama...@redhat.com
  To: Gluster Devel gluster-de...@nongnu.org, gluster-users 
  gluster-users@gluster.org
  Sent: Tuesday, August 28, 2012 4:36:07 PM
  Subject: [Gluster-users] FeedBack Requested : Changes to CLI output of 
  'peerstatus'
  
  Hi,
  
  Wanted to check if any one is using gluster CLI output of 'peer status' 
  in their scripts/programs? If yes, let me know. If not, we are trying to 
  make it more script friendly.
  
  For example the current output would look something like:
  
  -
  Hostname: 10.70.36.7
  Uuid: c7283ee7-0e8d-4cb8-8552-a63ab05deaa7
  State: Peer in Cluster (Connected)
  
  Hostname: 10.70.36.6
  Uuid: 5a2fdeb3-e63e-4e56-aebe-8b68a5abfcef
  State: Peer in Cluster (Connected)
  
  -
  
  New changes would make it look like :
  
  ---
  UUID  Hostname   Status
  c7283ee7-0e8d-4cb8-8552-a63ab05deaa7  10.70.36.7 Connected
  5a2fdeb3-e63e-4e56-aebe-8b68a5abfcef  10.70.36.6 Connected
  
  ---
  
  If anyone has better format, or want more information, let us know now. 
  I would keep timeout for this mail as 3 more working days, and without 
  any response, we will go ahead with the change.
  
  Regards,
  Amar
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
  
  ___
  Gluster-devel mailing list
  gluster-de...@nongnu.org
  https://lists.nongnu.org/mailman/listinfo/gluster-devel
  
 
 
 -- 
 Regards,
 Stephan
 
 ___
 Gluster-devel mailing list
 gluster-de...@nongnu.org
 https://lists.nongnu.org/mailman/listinfo/gluster-devel
 


-- 
MfG,
Stephan von Krawczynski


--
ith Kommunikationstechnik GmbH

Lieferanschrift  : Reiterstrasse 24, D-94447 Plattling
Telefon  : +49 9931 9188 0
Fax  : +49 9931 9188 44
Geschaeftsfuehrer: Stephan von Krawczynski
Registergericht  : Deggendorf HRB 1625
--

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Ownership changed to root

2012-08-27 Thread Stephan von Krawczynski
On Sun, 26 Aug 2012 20:01:20 +0100
Brian Candler b.cand...@pobox.com wrote:

 On Sun, Aug 26, 2012 at 03:50:16PM +0200, Stephan von Krawczynski wrote:
  I'd like to point you to [Gluster-devel] Specific bug question dated few
  days ago, where I describe a trivial situation when owner changes on a brick
  can occur, asking if someone can point me to a patch for that.
 
 I guess this is
 http://lists.gnu.org/archive/html/gluster-devel/2012-08/msg00130.html
 ?
 
 This could be helpful but as far as I can see a lot of important information
 is missing: e.g.  what glusterfs version you are using, what operating
 system and kernel version, what underlying filesystem is used for the
 bricks.  Is the volume mounted on a separate client machine, or on one of
 the brick servers?  gluster volume info would be useful too.

In fact I gave the pieces of information that seemed really important to me;
apparently they were unclear. The setup has two independent hardware bricks and
one client (on separate hardware). It is an all-Linux setup with ext4 on the
bricks. The kernel versions are really of no use because I tested quite a few
and the behaviour is always the same.
The problem has to do with the load on the client, which is about the only
sure thing I can say.
The gluster version is 2.X and cannot be changed. AFAIK the glusterfsd
versions are not downward compatible to a point where one can build a setup
with one brick on 2.X and the other on 3.X, which is - if true - a general
design flaw among others.
I did in fact not intend to enter a big discussion about the point. I thought
there must be at least one person who knows the code to an extent where my
question can be answered immediately with one sentence. All you have to know
is how it may be possible that a mv command overruns a former one that should
in fact have already completed its job, because it exited successfully.

 Regards,
 
 Brian.
 


-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Ownership changed to root

2012-08-26 Thread Stephan von Krawczynski
On Sun, 26 Aug 2012 08:53:33 +0100
Brian Candler b.cand...@pobox.com wrote:

 On Fri, Aug 24, 2012 at 07:45:35PM -0600, Joe Topjian wrote:
 This removed mdadm and LVM out of the equation and the problem went
 away. I then tried with just LVM and still did not see this problem.
  
 Unfortunately I don't have enough hardware at the moment to create
 another RAID1 mirror, so I can't single that out. I will try when I get
 a chance -- unless anyone else knows if it would cause a problem? Or
 maybe it is the mdamd+LVM combination?
 
 This sounds extremely unlikely. mdadm and LVM both work at the block device
 layer - reading and writing 512-byte blocks. They have no understanding of
 filesystems and no understanding of user IDs.
 
 I suspect there were other differences between the tests. For example, did
 you do one with an ext4 filesystem and one with xfs? Or did you have a
 failed drive in your RAID1, which meant that some writes were timing out?
 
 FWIW, I've also seen the files owned by root occasionally in testing, but
 wasn't able to pin down the cause.
 
 Regards,
 
 Brian.

Hello,

I'd like to point you to [Gluster-devel] Specific bug question, dated a few
days ago, where I describe a trivial situation in which owner changes on a
brick can occur, asking if someone can point me to a patch for that.
If you have no replication setup (like mine) in which the other brick may help
you around the owner change, then you may see a real change on your fs.
I don't know if your bug is the same, but its nature and cause may be the same.
I have not received any answers on the topic so far.

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] 3.2.2 Performance Issue

2011-08-11 Thread Stephan von Krawczynski
On Wed, 10 Aug 2011 12:08:39 -0700
Mohit Anchlia mohitanch...@gmail.com wrote:

 Did you run dd tests on all your servers? Could it be one of the disk is 
 slower?
 
 On Wed, Aug 10, 2011 at 10:51 AM, Joey McDonald j...@scare.org wrote:
  Hi Joe, thanks for your response!
 
 
  An order of magnitude slower with replication. What's going on I wonder?
  Thanks for any suggestions.
 
  You are dealing with contention for Gigabit bandwidth.  Replication will
  do that, and will be pronounced over 1GbE.  Much less of an issue over 
  10GbE
  or Infiniband.

If that were GBit contention, you could check it by spreading your boxes
over different switches. That should prevent a contention problem.
Unfortunately I can tell you it did not help on our side, so we doubt the
explanation.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] 3.2.2 Performance Issue

2011-08-11 Thread Stephan von Krawczynski
On Thu, 11 Aug 2011 09:13:53 -0400
Joe Landman land...@scalableinformatics.com wrote:

 On 08/11/2011 09:11 AM, Burnash, James wrote:
  Cogently put and helpful, Joe. Thanks. I'm filing this under good
  answers to frequently asked technical questions. You have a number
  of spots in that archive already :-)
 
 Thanks :)

Unfortunately he failed to understand my point. Obviously I was not talking
about simply _supplying_ more switches; I talked about _spreading_ the network
over several switches. This means you take a client that has at least two GBit
ports and connect your two gluster servers (bricks) to one each. Obviously you
can do the same with a bigger number of bricks; it only depends on the number
of interfaces your client has. This means contention is not possible when
accessing several bricks at the same time in a replication setup.

But as told before, the problem of bad performance did not go away for us.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster 3.2.0 - totally broken?

2011-05-22 Thread Stephan von Krawczynski
On Sat, 21 May 2011 13:27:38 +0200
Tomasz Chmielewski man...@wpkg.org wrote:

 If you found a bug, and even more, it's repeatable for you, please file 
 a bug report and describe the way to reproduce it.

Ha, very sorry that the project is not an easy ride for a dev. Creating
reproducible setups for software spread over 3 or more boxes is a pretty
complex thing to do. And even if something is reproducible on my side, that
does not mean it is with _other_ hardware and the same setup on the devs' side.
Drop the idea that this can be debugged with the same strategy you use to debug
hello world. I stopped looking at the bugs long ago because the software does
not give you a chance to even find out when a problem started. If you want to
see something where you can find out for yourself what is going on, look at
netfilter. There you have tables and output in /proc about ongoing NATs and
open connections (the connection tracker).
In glusterfs you have exactly nothing, and if you stop the replication setup
at some point you need to ls terabytes of data to find the not-synced files.
This is complete nonsense and not worth looking at.

If you need input, how about reading udo?

I already mentioned the bugs that seem to describe the same problems. I
really do not think that creating new ones describing the same problems
would help. Maybe the old ones should be reopened. These bugs mentioned in: 
http://gluster.org/pipermail/gluster-users/2011-May/007619.html are
basically the same.

Currently I really do not know how to describe/analyze the problem further.


?

 Initiating flame discussions is not really a good development model.

I did not start the topic, but I can well imagine the feelings of the first
poster. I was in the same situation more than a year ago and had to find out
that nobody cares to improve the fundamental strategy. And that people still
find out the same - months later - is the really bad news.
I have no doubt that we will read the same topics with a new version number in
a year.

 -- 
 Tomasz Chmielewski
 http://wpkg.org

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster 3.2.0 - totally broken?

2011-05-21 Thread Stephan von Krawczynski
On Fri, 20 May 2011 17:01:22 +0200
Tomasz Chmielewski man...@wpkg.org wrote:

 On 20.05.2011 15:51, Stephan von Krawczynski wrote:
 
  most of them are just an outcome of not being able to find a working i.e. 
  best
  solution for a problem. cache-timeout? thread-count? quick-read?
  stat-prefetch? Gimme a break. Being a fs I'd even say all the cache-size 
  paras
  are bogus. When did you last tune the ext4 cache size or timeout? Don't come
  up with ext4 being kernel vs. userspace fs.  It was their decision to make 
  it
  userspace, so don't blame me. As a fs with networking it has to take the
  comparison with nfs - as most interested users come from nfs.
 
 Ever heard of fsc (FS-Cache),

To my knowledge there is no persistent (disk-based) caching in glusterfs at
all ...

 acreg*, acdir*, actimeo options for NFS?

... and those are options dealing only with caching of file/dir attributes.
You are talking about completely different things here. If you want to argue
about that, you should probably _request_ these types of options in addition
to the already existing ones.
 
 Yes, they are related to cache, and oh, NFS is kernelspace. And yes, 
 there are tunable timeout options for NFS as well.

The only reasonable configurable timeout in nfs is the rpc timeout.

 As of timeout options with ext4, or any other local filesystem - if you 
 ever used iSCSI, you would also discover that it's recommended to set 
 reasonable timeout options there as well, depending on your network 
 infrastructure and usage/maintenance patterns. Incidentally, iSCSI is 
 also kernelspace.

And is it incidentally as slow as glusterfs in the same environment? No?
And did you ever manage to hard-freeze your boxes with it? To show duplicate
files? To be unable to open existing files? Wrong file dates? Wrong UIDs/GIDs?
Shall I continue to name problems we saw through all tested versions of
glusterfs? I don't, because I dropped the idea that it would be helpful at all.
If you want to share helpful information, tell us how you would
default-configure glusterfs so that it performs equally to NFS in most cases.
If you can't, what is your point then?

 -- 
 Tomasz Chmielewski
 http://wpkg.org

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster 3.2.0 - totally broken?

2011-05-20 Thread Stephan von Krawczynski
On Wed, 18 May 2011 13:16:59 -0700
Anand Babu Periasamy a...@gluster.com wrote:

 GlusterFS is completely free. The same versions released to the community are
 used for commercial deployments too. Their issues get higher priority
 though. Code related to other proprietary software such as VMWare, AWS,
 RightScale is kept proprietary.
 
 We acknowledge that we have done a poor job when it comes to  managing
 community, documentation and bug tracking. While we improved a lot since 2.x
 versions, I agree we are not there yet. We hired a lot of engineers to
 specifically focus on testing and bug fixes recently.  QA team is
 growing steadily. Lab size has been doubled. New QA lead is joining us next
 month. QA team will have closer interaction with the community moving
 forward. We also appointed Dave Garnett from HP as VP product manager and
 Vidya Sakar from Sun/Oracle as Engineering manager.
 
 We fully understand the importance of community. Paid vs Non-paid should not
 matter when it comes to quality of software. Intangible contributions from
 the community are equally valuable to the success of GlusterFS project.  We
 have appointed John Mark Walker as community manager. We launched
 community.gluster.org site recently. Starting next month, we will have
 regular community sessions. Problems raised by the community will also get
 prioritized.
 
 We are redoing the documentation completely. New system will be based on Red
 Hat's Publican. Documentation team too will closely work with the community.
 
 *Criticisms are taken positively. So please don't hesitate.*
 Thanks!
 -ab

Sorry, this clearly shows the problem: understanding.
It really does not help you a lot to hire a big number of people; you do not
fail in terms of business relations. Your problem is the _code_. You need a
filesystem expert. A _real_ one, not just _some_ one. Like, let's say, Daniel
Phillips, Theodore "Ted" Ts'o or the like.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster 3.2.0 - totally broken?

2011-05-20 Thread Stephan von Krawczynski
On Fri, 20 May 2011 08:35:35 -0400
Jeff Darcy jda...@redhat.com wrote:

 On 05/20/2011 05:15 AM, Stephan von Krawczynski wrote:
  Sorry, this clearly shows the problem: understanding. It really does
  not help you a lot to hire a big number of people, you do not fail in
  terms of business relation. Your problem is the _code_. You need a 
  filesystem expert. A _real_ one, not _some_ one. Like lets say
  Daniel Phillips, Theodore Ted Ts'o or the like.
 
 I know both Daniel and Ted professionally. As a member of the largest
 Linux filesystem group in the world, I am also privileged to work with
 many other world-class filesystem experts. I also know the Gluster folks
 quite well, and I can assure you that they have all the filesystem
 expertise they need. They also have a *second* kind of expertise -
 distributed systems - which is even more critical to this work and which
 the vast majority of filesystem developers lack.  What Gluster needs is
 not more filesystem experts but more *other kinds* of experts as well as
 non-experts and resources.  The actions AB has mentioned are IMO exactly
 those Gluster should be taking, and should be appreciated as such by any
 knowledgeable observer.
 
 Your flames are not only counter-productive but factually incorrect as
 well. Please, if only for the sake of your own reputation, try to do
 better.

Forgive my ignorance, Jeff, but it is obvious to anyone who has used glusterfs
for months or years that the guys have a serious software design issue. If you
look at the tuning options configurable in glusterfs you should notice that
most of them are just an outcome of not being able to find a working, i.e. best,
solution for a problem. cache-timeout? thread-count? quick-read?
stat-prefetch? Gimme a break. For a fs I'd even say all the cache-size
parameters are bogus. When did you last tune the ext4 cache size or timeout?
Don't come up with ext4 being a kernel vs. userspace fs. It was their decision
to make it userspace, so don't blame me. As a fs with networking it has to
stand the comparison with NFS - as most interested users come from NFS. The
first thing they experience is that glusterfs is really slow compared to their
old setups with NFS. And the cause is _not_ replication per se. And as long as
they cannot match NFS performance my argument stands: they have a problem, be
it inferior by design or by coding.
As you can see, I am not talking at all about things that I count as basics in
a replicating fs. I mean, really, I cannot express my feelings about the lack
of information for the admin around replication. It's pretty much as if a wheel
of your car just fell off and you cannot find out which one. Would you trust
that car?
Let me clearly state this: the idea is quite brilliant, but the coding is at
the stage of a design study and could have been far better if they had only
concentrated on the basics. If you want to build a house you don't buy the TV
set first...
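
(For readers who have not seen them: the options in question are ordinary
volfile translator settings; a representative snippet in the same volfile
syntax as the configuration examples further down this archive, with
placeholder volume names:

  volume iocache
      type performance/io-cache
      option cache-size 64MB
      option cache-timeout 1
      subvolumes readahead
  end-volume

exactly the kind of knob a local fs never asks you to set.)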

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster 3.2.0 - totally broken?

2011-05-18 Thread Stephan von Krawczynski
On Wed, 18 May 2011 14:45:19 +0200
Udo Waechter udo.waech...@uni-osnabrueck.de wrote:

 Hi there,
 after reporting some trouble with group access permissions, 
 http://gluster.org/pipermail/gluster-users/2011-May/007619.html (which 
 still persist, btw.)
 
 things get worse and worse with each day.
 [...]
 Currently our only option seems to be to go away from glusterfs to some 
 other filesystem which would be a bitter decission.
 
 Thanks for any help,
 udo.

Hello Udo,

unfortunately I can only confirm your problems. The last known-to-work version
we see is 2.0.9. Everything beyond that is just bogus.
3.X did not solve a single issue but brought quite a lot of new ones instead.
The project only gained featurism but did not solve the very basic problems.
Up to the current day there is no way to see a list of not-synced files in a
replication setup, which is ridiculous. I have hoped ever since 2.0.9 that
someone would fork the project and really attack the basics. IOW: good idea,
pretty bad implementation, no will to listen or learn.

Regards,
Stephan



 
 -- 
 Institute of Cognitive Science - System Administration Team
   Albrechtstrasse 28 - 49076 Osnabrueck - Germany
Tel: +49-541-969-3362 - Fax: +49-541-969-3361
  https://doc.ikw.uni-osnabrueck.de
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Seeking Feedback on Gluster Development Priorities/Roadmap

2011-03-08 Thread Stephan von Krawczynski
How about the _basics_ of such a fs? Create an answer to the still unresolved
question: which files are currently not in sync?
From the very first day of glusterfs there has been no answer to this
fundamental question for the user. There is no way to monitor the real state
of a replicating fs up to the current day.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] very bad performance on small files

2011-01-16 Thread Stephan von Krawczynski
On Sun, 16 Jan 2011 02:45:50 +0530
Anand Avati anand.av...@gmail.com wrote:

 In any case, comparing local disk performance to network disk performance
 is never right and is always misleading.
 
 Avati

This statement is fundamentally broken.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] ReiserFS problems

2011-01-05 Thread Stephan von Krawczynski
On Sun, 2 Jan 2011 12:18:08 +0100
nurdin david duchn...@free.fr wrote:

 Hello,
 
 When i launch the server glusterFS on a reiserFS partition i got this error :
 
 [2011-01-02 12:17:20.269951] C [posix.c:4313:init] posix: Extended attribute 
 not supported, exiting.
 [2011-01-02 12:17:20.269973] E [xlator.c:909:xlator_init] posix: 
 Initialization of volume 'posix' failed, review your volfile again
 
 
 
 And strace is :
 
 stat64("/data/export", {st_mode=S_IFDIR|0755, st_size=48, ...}) = 0
 lsetxattr("/data/export", "trusted.glusterfs.test", "working", 8, 0) = -1
 EDQUOT (Disk quota exceeded)
 gettimeofday({1293965111, 13398}, NULL) = 0
 
 
 I turn ON Xattr on the partition mount : 
 
 /dev/mapper/pve-data on /var/lib/vz type reiserfs (rw,attrs,acl,user_xattr)
 
 
 Have u got an idea ? 
 Thanks

Yes. Don't use reiserfs. Even if you manage to get it working (which _is_ 
possible) you will find out that its performance regarding xattrs is pretty 
bad. Take this advice: use ext3 for this case.
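
For what it's worth, a quick way to check whether a backend directory accepts
the trusted xattrs glusterfsd tests at startup (run as root; /data/export is
simply the path from the strace above, and setfattr/getfattr come from the
attr package):

  setfattr -n trusted.glusterfs.test -v working /data/export && echo "xattrs OK"
  getfattr -d -m trusted. /data/export              # list trusted.* attributes
  setfattr -x trusted.glusterfs.test /data/export   # remove the test attribute again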

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS replica question

2010-12-01 Thread Stephan von Krawczynski
Which is a regression compared to 2.X btw...

On Wed, 1 Dec 2010 02:40:53 -0600 (CST)
Raghavendra Bhat raghavendrab...@gluster.com wrote:

 
 If you create a volume with only one brick and then add one more brick to
 the volume, the volume will be of distribute type and not replicate. If the
 replica feature is needed, then a replicate volume itself should be created,
 and to create a replicate volume a minimum of 2 bricks is needed.
 
 
 - Original Message -
 From: Raghavendra G raghaven...@gluster.com
 To: raveenpl ravee...@gmail.com
 Cc: gluster-users@gluster.org
 Sent: Wednesday, December 1, 2010 12:52:03 PM
 Subject: Re: [Gluster-users] GlusterFS replica question
 
 Yes, it is possible in 3.1.x without downtime.
 
 - Original Message -
 From: raveenpl ravee...@gmail.com
 To: gluster-users@gluster.org
 Sent: Sunday, November 28, 2010 2:54:13 AM
 Subject: [Gluster-users] GlusterFS replica question
 
 Hi,
 
 For small lab environment I want to use GlusterFS with only ONE node.
 
 After some time I would like to add the second node as the redundant
 node (replica).
 
 Is it possible in GlusterFS 3.1 without downtime?
 
 Cheers
 PK
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 


-- 
MfG,
Stephan von Krawczynski


--
ith Kommunikationstechnik GmbH

Lieferanschrift  : Reiterstrasse 24, D-94447 Plattling
Telefon  : +49 9931 9188 0
Fax  : +49 9931 9188 44
Geschaeftsfuehrer: Stephan von Krawczynski
Registergericht  : Deggendorf HRB 1625
--

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster client 32bit

2010-11-17 Thread Stephan von Krawczynski
On Tue, 16 Nov 2010 16:54:07 -0800
Craig Carl cr...@gluster.com wrote:


 
 Stephan -
 Based on your feedback, and from other members of the community we have 
 opened discussions internally around adding support for a 32-bit client. 
 We have not made a decision at this point, and I can't make any 
 guarantees but I will do my best to get it added to the next version of 
 the product (3.1.2, (3.1.1 is feature locked)).
 On the sync question you brought up that is only an issue in the rare 
 case of split brain (if I understand the scenario you've brought up). 
 Split brain is a difficult problem with no answer right now. Gluster 3.1 
 added much more aggressive locking to reduce the possibility of split 
 brain. The process you described as ...the deamons are talking with 
 each other about whatever... will also reduce the likelihood of split 
 brain by eliminating the possibility that client or server vol files are 
 not the same across the entire cluster, the cause of a vast majority of 
 split brain issues with Gluster.
 Auto heal is slow, we have some processes along the lines you are 
 thinking, please let me know if these address some of your ideas around 
 stat -
 
 #cd <gluster mount>
 #find ./ -type f -exec stat /<backend device>'{}' \; this will heal only
 the files on that device.
 
 If you know when you had a failure you want to recover from this is even 
 faster -
 
 #cd <gluster mount>
 #find ./ -type f -mmin <minutes since failure + some extra> -exec stat
 /<backend device>'{}' \; this will heal only the files on that device
 changed x or more minutes ago.
 
 
 Thanks,
 
 Craig

Hello Craig,

let me repeat a very old suggestion (in fact I believe it was before your time
at gluster). I suggested creating a module (for the server) that does only one
thing: maintain a special file in such a way that a filename (with path) is
added to it when the server sets acls meaning the file is currently not in
sync, and removed from the list again when the acls set on the file mean it is
in sync. Let's say this special file is named /.glusterfs-<server-ip> (in the
root of the mounted glusterfs). That would allow you to see _all_ files on
_all_ servers that are not in sync from the client's view. All you had to do
for healing is stat only these file lists and you are done. You could simply
drop the auto-healing, because you could as well do a cronjob for that now; as
there is no find involved, the whole method uses virtually no resources on the
servers and clients.
You have full control, you know which files on which servers are out of sync.
This solves all possible questions around replication.
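
Purely as an illustration of that idea (the list file and mount path are
hypothetical, nothing like this exists in glusterfs today): healing would then
be a trivial cronjob that stats exactly the listed paths through the mount,
e.g.

  #!/bin/sh
  # walk every per-server list of not-in-sync files and stat each entry
  # through the glusterfs mount, which is what triggers self-heal
  MOUNT=/mnt/gluster
  for list in "$MOUNT"/.glusterfs-*; do
      [ -f "$list" ] || continue
      while IFS= read -r path; do
          stat "$MOUNT/$path" > /dev/null
      done < "$list"
  done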

Regards,
Stephan


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster client 32bit

2010-11-16 Thread Stephan von Krawczynski
On Tue, 16 Nov 2010 08:51:17 -0800
Jeff Anderson-Lee jo...@eecs.berkeley.edu wrote:

 On 11/16/2010 05:36 AM, Stefano Baronio wrote:
  Hi MArtin,
 the XenServer Dom0 is 32bit whilst the hypervisor is 64 bit.
  You need to know it when you install third part sw on the host.
  http://forums.citrix.com/thread.jspa?threadID=269924&tstart=0
 
  So I need the 32bit compiled version to be able to mount glusterfs directly
  from the XenServer host.
 
 The built-in NFS module is typically as fast or faster than using the 
 fuse wrapper on the client side.  So the best way to support 32-bit 
 clients is likely via NFS.

NFS is really something completely different. And - what is also ignored - the
infrastructure usage is completely different when using NFS. NFS does not
replicate at the client side, which means that the data paths explicitly built
for client replication are useless for NFS. Using the NFS translator leads to
server-to-server replication. For that case a data path exclusively used for
this server traffic would be best (because it cannot interfere with 64-bit
client replication).
So if you happen to upgrade a 2.0.9 setup with 64-bit servers and 64- as well
as 32-bit clients, you have to redesign the network for best performance _and_
glusterfsd on the servers has to use the shortest data path for the NFS data
replication (which I don't know whether it can do at all).
In other words: whereas the setup in 2.0.9 was clear and simple, the very same
usage case in 3.X is a _mess_.
Obviously nobody really thought about that - unbelievable to me, as it is
really obvious. But I have become accustomed to that situation because up to
the current day there is no solution for another most obvious problem: which
files are not in sync in a replication setup? There is no trivial answer to
this question, which I already brought up in the early 2.X development phase...
How can you sell someone a storage platform if you're unable to answer such an
essential question? Really, nobody needed auto-healing. All you need is the
answer to this question and then to stat exactly this file list at a time _of
your choice_.
The good thing about 2.0.X was that you as an admin had quite full control
over things. In 3.X you have exactly nothing; the daemons are talking with
each other about whatever and hopefully things work out. That is not a setup I
want to be the admin of.

Regards,
Stephan



  Cheers
  Stefano
 
 
  2010/11/16 Deadpan110deadpan...@gmail.com
 
 
  My home testing environment I also use XenServer (again, Citrix - with
  a Centos minimalistic core OS) - even though the Dom0 is 64bit, in any
  Xen setup (maybe even for other virtuali[s\z]ation solutions),
  performance is better using 32bit VM's (DomU).
 
  My production environment comprises of Xen virtual machines (not
  XenServer, but still Xen), scattered around a remote datacenter.
 
  I too will be sharing my experiences as GlusterFS offers exactly what
  I need and would like to deploy.
 
  Martin
 
 
 
  On 16 November 2010 20:39, Stefano Baroniostefano.baro...@gmail.com
  wrote:
   
   From my point of view, 64 bit on server side is easy to handle but the
  client side can have different needs and limitations.
  For example, we are using XenServer from Citrix, the Dom0 is taken from a
  CentOS 5 distro and it is 32bit. I cannot change that, because is a
 
  Citrix
   
  design choice and there might be lots of these situations around.
  Sorry but I can't code any patches..
  Anyway, I will share what our experience will be with 32bit client.
 
  Cheers
  Stefano
 
 
  2010/11/16 Bernard Libern...@vanhpc.org
 
 
  Hi Christian:
 
  On Tue, Nov 16, 2010 at 1:34 AM, Christian Fischer
  christian.fisc...@easterngraphics.com  wrote:
 
   
  No statement from the developers about usability of glusterfs client
 
  on
   
  32bit
   
  systems. But this was probably discussed in earlier threads.
 
  I believe the official comment is that Gluster is not going to support
  32-bit systems.  However, it doesn't mean that the community cannot
  support it.  If we find bugs and can code up patches, we should still
  file a bug and submit the patches and hopefully they will be checked
  into the official repository.
 
  Cheers,
 
  Bernard
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 
   
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 
 
 
   
 
 
 
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 
 

___
Gluster-users mailing list
Gluster-users@gluster.org

[Gluster-users] GlusterFS on mailservers

2010-11-15 Thread Stephan von Krawczynski
Hi all,

I just read this one on the dovecot web:
---
FUSE / GlusterFS

FUSE caches dentries and file attributes internally. If you're using multiple
GlusterFS clients to access the same mailboxes, you're going to have problems.
Worst of these problems can be avoided by using NFS cache flushes, which just
happen to work with FUSE as well:

mail_nfs_index = yes
mail_nfs_storage = yes

These probably don't work perfectly. 


Can someone comment on that? Does anybody use glusterfs as storage for
mailboxes/mail folders?

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS on mailservers

2010-11-15 Thread Stephan von Krawczynski
On Mon, 15 Nov 2010 06:25:23 -0800
Craig Carl cr...@gluster.com wrote:

 On 11/15/2010 04:57 AM, Stephan von Krawczynski wrote:
  Hi all,
 
  I just read this one on the dovecot web:
  ---
  FUSE / GlusterFS
 
  FUSE caches dentries and file attributes internally. If you're using 
  multiple
  GlusterFS clients to access the same mailboxes, you're going to have 
  problems.
  Worst of these problems can be avoided by using NFS cache flushes, which 
  just
  happen to work with FUSE as well:
 
  mail_nfs_index = yes
  mail_nfs_storage = yes
 
  These probably don't work perfectly.
  
 
  Can someone comment on that? Does anybody use glusterfs as a storage for
  mailboxes/mailfolders ?
 
 Stephan -
 Dovecot has been a challenge in the past. We don't specifically test 
 with it here, if you are interested in using it with Gluster I would 
 suggest testing with 3.1.1, and always keep the index files local, that 
 makes a big difference.
 
 Thanks,
 
 Craig

Well, Craig, I cannot follow your advice as these are 32-bit clients, and AFAIK
you said 3.1.1 is not expected to be used in such an environment.
Really, quite a lot of interesting setups for glusterfs revolve around mail
servers; I judge it a major deficiency if the fs cannot be used for such
purposes. You cannot expect people to vote for glusterfs if there are other
options that have no problems with such a standard setup. I mean, is there
anything more obvious than mail servers for such a fs?
Honestly, I get the impression that you're heading away from mainstream fs
usage towards very special environments and usage patterns.
I feel very sorry about that because 2.X looked very promising. But I did not
find a single setup where 3.X could be used at all.

 --
 Craig Carl
 Senior Systems Engineer
 Gluster

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS on mailservers

2010-11-15 Thread Stephan von Krawczynski
On Mon, 15 Nov 2010 10:18:28 -0500
Joe Landman land...@scalableinformatics.com wrote:

 On 11/15/2010 09:47 AM, Stephan von Krawczynski wrote:
 
  Stephan -
   Dovecot has been a challenge in the past. We don't specifically test
  with it here, if you are interested in using it with Gluster I would
  suggest testing with 3.1.1, and always keep the index files local, that
  makes a big difference.
 
  Thanks,
 
  Craig
 
  Well, Craig, I cannot follow your advice as these are 32 bit clients and 
  AFAIK
  you said 3.1.1 is not expected to be used in such an environment.
  Really quite a lot of interesting setups for glusterfs turn around mail
  servers, I judge it to be a major deficiency if the fs cannot be used for 
  such
 
 Quick interjection here:  We have some customers using Dovecot on our 
 storage units with GlusterFS 3.0.x.  There are some issues, usually 
 interactions between dovecot and fuse/glusterfs.  Nothing that can't be 
 worked around.

Well, a work-around is not the same as just working. Do you really think that
it is no sign of a problem if you need a work-around for a pretty standard
usage request?

  We are seeing strong/growing interest from our customer 
 base in this use case.

Well, that means I am right, no?
 
 Craig's advice is spot on.
 
  purposes. You cannot expect voting for glusterfs if there are other options
  that have no problems with such a standard setup. I mean is there something
  more obvious than mailservers for such a fs?
 
 Hmmm ... apart from NFS (which isn't a cluster file system), which has a 
 number of its own issues, which other cluster file system are you 
 referring to, that don't have these sorts of issues?  Small file and 
 small record performance on any sort of cluster file system is very 
 hard.  You have to get it right first, and then work on the performance 
 side later.

I am not talking about performance currently (though that is arguable), I am
talking about the sheer basic usage. Probably a lot of potential users come
from NFS setups and want to make them redundant. And none of them has ever
heard of an fs problem with 32-bit clients (just as an example) ...
So this is an obvious problem.
Dovecot has been a challenge in the past - well, and how does the fs
currently cope with this challenge?
I am no supporter of the idea that fs tuning should be necessary just to make
something work at all. For faster performance let there be tuning options, but
for general support of a certain environment? I mean, did you ever tune
fat, ntfs, extX or the like just to make email work? And don't argue about them
not being network-related: the simple truth is that this product is only a big
hit if it is as easy to deploy as a local fs. That should be the primary goal.
 
  Honestly, I got the impression that you're heading away from the mainstream 
  fs
  usage to very special environments and usage patterns.
  I feel very sorry about that because 2.X looked very promising. But I did 
  not
  find a single setup where 3.X could be used at all.
 
 While I respect your opinion, I do disagree with it. In our opinion 
 3.1.x has gotten better than 3.0.x, which was a huge step up from 2.0.x.

2.0.x was something like a filesystem; 3.X is obviously heading towards being
a storage platform. That makes a big difference. And I'd say it did not really
get better in general, comparing apples to apples. glusterfs 2.0.x is a lot
closer to a usable filesystem (let's say on Linux boxes) than glusterfs 3.X is
to NetApp or EMC storage platforms. There is nothing comparable to glusterfs
2.0.X on its kind of boxes, whereas one cannot really choose glusterfs storage
in comparison to NetApp. I mean you're trying to enter the wrong league,
because the big players will just crush you.
 
 Regards,
 
 Joe
 
 -- 
 Joseph Landman, Ph.D
 Founder and CEO
 Scalable Informatics, Inc.
 email: land...@scalableinformatics.com
 web  : http://scalableinformatics.com
 http://scalableinformatics.com/jackrabbit
 phone: +1 734 786 8423 x121
 fax  : +1 866 888 3112
 cell : +1 734 612 4615
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS on mailservers

2010-11-15 Thread Stephan von Krawczynski
On Mon, 15 Nov 2010 12:17:48 -0800
Craig Carl cr...@gluster.com wrote:

  Please don't think we are 
 not working hard to meet your expectations.

Really, Craig, I am not expecting _anything_ for _me_ from glusterfs.
I only feel very sorry for an interesting project that offered a great vision
but chose featurism over completely solving the basic requirements of a fs, not
to mention the trivial expectations concerning a replication setup - which
should have been a true strength.

 At a higher level Gluster is changing, and I think improving based 
 on feedback from the community, our paid subscribers and the storage 
 industry as a whole. Designing and writing a file system that is used on 
 thousands of servers in less than 3 years was, and is incredibly 
 challenging, and expensive. Contrast Gluster with another excellent file 
 system project, brtfs, which also has paid engineering resources and is 
 still very experimental [1].

I really don't want to talk about btrfs here, because its problems are
unrelated to glusterfs problems.

 Our community asked for a couple of things from Gluster 3.1;

Well, honestly, whatever the community asked for, you managed to create the
first project I have seen in more than a decade that is not able to upgrade
its older versions because trivial deployment setups have simply been
_dropped_. I cannot remember ever seeing something like this before. That is
really outstanding.

 Thanks,
 
 Craig
 
 --
 Craig Carl
 Senior Systems Engineer
 Gluster

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] upgrading from 2.0.9 to 3.1, any gotchas?

2010-11-13 Thread Stephan von Krawczynski
On Fri, 12 Nov 2010 18:26:11 -0800
Liam Slusser lslus...@gmail.com wrote:

 Hey Gluster Users,
 
 Been awhile since i've posted here.  I'm looking to upgrade our 150tb
 10 brick cluster from 2.0.9 to 3.1.  Is there any gotcha's that i
 should be aware of?  Anybody run into any problems?  Any suggestions
 or hints would be most helpful.  I hoping the new Gluster will be a
 bit more forgiving on split brain issues and an increase in
 performance is always welcome.
 
 thanks,
 liam

You will lose your 32-bit clients if you have any...

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster client 32bit

2010-11-12 Thread Stephan von Krawczynski
I can tell you that 3.1 does not compile under 32 bit on my box - I tried it
recently.
Honestly I find it a bit strange not to support 32-bit clients, as there are
lots of them - and 2.0.9 did work on 32 bit. Which means you cannot upgrade
such setups.

Regards,
Stephan


On Sat, 13 Nov 2010 01:17:05 +1030
Deadpan110 deadpan...@gmail.com wrote:

 It should work... but it is very unsupported by the devs...
 
 USE AT YOUR OWN RISK...
 
 I successfully used glusterfs 3.1.0 for a while on Ubuntu Lucid 32bit
 - the only problems i encountered are a few of the ones recently
 discussed in this mailing list for 64bit.
 
 I will be implementing it again soon - I hope!
 
 Martin
 
 On 13 November 2010 00:54, Christian Fischer
 christian.fisc...@easterngraphics.com wrote:
  On Friday 12 November 2010 11:29:52 Bernard Li wrote:
  Hi Stefano:
 
  On Fri, Nov 12, 2010 at 2:18 AM, Stefano Baronio
 
  stefano.baro...@gmail.com wrote:
     is there a way to have a 32bit Glusterfs client?
 
  You can definitely build it yourself, but it is not officially
  supported by Gluster.  They recommend you use GlusterFS on 64-bit
  architecture servers.
 
  Someone knows the reason for it?
  Are problems to expect on 32bit architecture?
 
 
  Cheers,
 
  Bernard
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Some client problems with TCP-only NFS in Gluster 3.1

2010-10-22 Thread Stephan von Krawczynski
On Fri, 22 Oct 2010 04:46:44 -0500 (CDT)
Craig Carl cr...@gluster.com wrote:

 [Resending due to incomplete response] 
 
 Brent, 
 Thanks for your feedback . To mount with a Solaris client use - 
 ` mount -o proto=tcp,vers=3 nfs://SERVER-ADDR:38467/EXPORT MNT-POINT` 
 
 As for UDP access, we want to force users to use TCP. Everything about Gluster
 is designed to be fast; as NFS over UDP approaches line speed it becomes
 increasingly inefficient [1], and we want to avoid that.
 
 I have updated our documentation to reflect the required tcp option and 
 Solaris instructions. 
 
 [1] http://nfs.sourceforge.net/#faq_b10 

Sorry to jump in at this point. If you read the FAQ you may have noticed
that the problem only hits very ancient boxes with kernels below 2.4.20 (!).

On the contrary, you get a real problem with NFS over TCP if you are
experiencing even very minor packet loss. Your TCP-based server comes to a
crawl in such a scenario. In fact we completely dropped the idea of NFS over
TCP exactly for that reason. We never experienced any performance problem with
NFS over UDP.
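
For reference, the choice is just a mount option on the client (server and
export names here are placeholders):

  # NFSv3 over UDP instead of TCP
  mount -t nfs -o vers=3,udp,rsize=32768,wsize=32768 server:/export /mnt/nfs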

Regards,
Stephan

 
 
 Thanks again, 
 
 Craig 
 
 -- 
 Craig Carl 
 Senior Systems Engineer 
 Gluster 
 
 
 From: Brent A Nelson br...@phys.ufl.edu 
 To: gluster-users@gluster.org 
 Sent: Thursday, October 21, 2010 8:18:02 AM 
 Subject: [Gluster-users] Some client problems with TCP-only NFS in Gluster 
 3.1 
 
 I see that the built-in NFS support registers mountd in portmap only with 
 tcp and not udp. While this makes sense for a TCP-only NFS 
 implementation, it does cause problems for some clients: 
 
 Ubuntu 10.04 and 7.04 mount just fine. 
 
 Ubuntu 8.04 gives requested NFS version or transport protocol is not 
 supported, unless you specify -o mountproto=tcp as a mount option, in 
 which case it works just fine. 
 
 Solaris 2.6 & 7 both give RPC: Program not registered. Solaris 
 apparently doesn't support the mountproto=tcp option, so there doesn't 
 seem to be any way for Solaris clients to mount. 
 
 There may be other clients that assume mountd will be contactable via 
 udp, even though they (otherwise) happily support TCP NFS... 
 
 Thanks, 
 
 Brent 
 ___ 
 Gluster-users mailing list 
 Gluster-users@gluster.org 
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users 


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Some client problems with TCP-only NFS in Gluster 3.1

2010-10-22 Thread Stephan von Krawczynski
On Fri, 22 Oct 2010 15:18:09 +0200
Beat Rubischon b...@0x1b.ch wrote:

 Hi Stephan!
 
 Quoting sk...@ithnet.com (22.10.10 15:05):
 
  We never experienced any performance problem with NFS over UDP.
 
 Be careful when using NFSoUDP on recent networking hardware. It's simply too
 fast for the primitive reassembly algorithm in UDP. You will get silent data
 corruption.
 
 SuSE warns about this fact quite some years in their nfs manpage. You'll
 find a lot of copies when Googleing the title Using NFS over UDP on
 high-speed links such as Gigabit can cause silent data corruption.
 
 Beat

Hi Beat,

you are talking about the problem of the IP identification field being only
16 bits, right?
We experienced this scenario to be far less severe than TCP busted by packet
loss. In fact we were not able to run NFS over TCP for more than 2 days
without a complete service breakdown, whereas UDP has been running for several
years now without us seeing the corruption issue.
We were very astonished by the bad TCP performance, but we had to accept it
as a fact.
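
For illustration, the usual back-of-the-envelope reasoning behind that warning
(rough numbers, not measurements): with wsize=32768 each NFS-over-UDP request
is one IP datagram of roughly 22 fragments; at about 100 MB/s a client sends on
the order of 3000 such datagrams per second, so the 16-bit IP ID space (65536
values) wraps in about 20 seconds, which is inside the default 30-second
fragment reassembly timeout - leaving a window in which a stale fragment can be
reassembled into the wrong datagram without the UDP checksum necessarily
catching it.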

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Configuration suggestions (aka poor/slow performance on new hardware)

2010-03-26 Thread Stephan von Krawczynski
Can you check how things look like when using ext3 instead of xfs?

On Fri, 26 Mar 2010 18:04:07 +0100
Ramiro Magallanes lis...@sabueso.org wrote:

   Hello there!
 
 I'm working on a 6-node cluster, with new SuperMicro hardware.
 The cluster has to store millions of JPGs (about 200k-4MB) and small
 text files.
 
 Each node is :
 
   -Single Xeon(R) CPU E5405  @ 2.00GHz (4 cores)
   -4 GB RAM
   -64 bits Distro-based (Debian Lenny)
   -3ware 9650 SATA-II RAID, with 1 logical drive in RAID 5 mode, the unit
 built from 3 SATA hard disks of 2TB (WDC) with 64MB of cache each.
   -XFS filesystem on each logical unit.
 
 When I run the genfiles.sh test on each node locally (on the raid-5
 unit), I have the following results:
 
   -3143 files created in 60 seconds.
 
 and if I comment out the sync line in the script:
 
   -8947 files created in 60 seconds.
 
 Now, with Gluster mounted (22TB), I run the test and the results are:
 
   -1370 files created in 60 seconds.
 
 Now, I'm running the cluster with the standard distributed configuration,
 and I have made a significant number of changes in the test process, but
 I obtain the same number of written files all the time.
 Never more than 1400 files created, and 170 Mbit/s of network load (top).
 
 The switching layer is gigabit (obviously), and no resources are being
 heavily used; all is normal.
 
 I'm using the 3.0.3 version of Gluster.
 
 Here is my configuration file (only the last part of the file):
 
 ##
 volume distribute
 type cluster/distribute
 subvolumes 172.17.15.1-1 172.17.15.2-1 172.17.15.3-1
 172.17.15.4-1 172.17.15.5-1 172.17.15.6-1
 end-volume
 
 volume writebehind
 type performance/write-behind
option cache-size 1MB
 option flush-behind on
 subvolumes distribute
 end-volume
 
 volume readahead
 type performance/read-ahead
 option page-count 4
 subvolumes writebehind
 end-volume
 
 volume iocache
 type performance/io-cache
 option cache-size `grep 'MemTotal' /proc/meminfo  | awk '{print
 $2 * 0.2 / 1024}' | cut -f1 -d.`MB
 
 option cache-timeout 1
 subvolumes readahead
 end-volume
 
 volume iothreads
 type performance/io-threads
 option thread-count 32 # default is 16
 subvolumes distribute
 end-volume
 
 volume quickread
 type performance/quick-read
 option cache-timeout 1
 option max-file-size 128kB
 subvolumes iocache
 end-volume
 
 volume statprefetch
 type performance/stat-prefetch
 subvolumes quickread
 end-volume
 ##
 
 Any idea or suggestion to make the performance go up?
 Thanks everyone!
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Setup for production - which one would you choose?

2010-03-25 Thread Stephan von Krawczynski
In fact, the background for my post is very trivial: glusterfs is really in a
development stage. So there is a real difference between using 2.0.9, 3.0.2 or
3.0.3. In fact it might be the difference between go and no-go in your very
special setup. That's why I judge the comparison to other RPM questions as not
valid. This is not fetchmail, where you can use almost any RPM flying around.
And I did not tell anyone to compile their whole setup by hand. I am talking
about glusterfs and using its latest version in favor of using some available
RPM that does not contain the latest version.
--
Regards,
Stephan


On Wed, 24 Mar 2010 23:19:30 +0100
Steve stev...@gmx.net wrote:

 
  Original-Nachricht 
  Datum: Wed, 24 Mar 2010 23:01:55 +0100
  Von: Oliver Hoffmann o...@dom.de
  An: gluster-users@gluster.org
  Betreff: Re: [Gluster-users] Setup for production - which one would you 
  choose?
 
  Yep, thanx.
  
  @Stephan: It is not a matter of knowing how use tar and make, but if you 
  have a bunch of servers than you want to do an apt-get update/upgrade 
  once in a while without compiling this piece of software on that server 
  and another one on another server, etc.
 
 Not only that. On an RPM system (aka Red Hat, SuSE, Mandriva, etc.) where you
 have a support contract, installing packages that are not made by the vendor
 voids support. So there is a good reason to use vendor pre-built RPMs.
 
 A bunch of years ago I helped a big vendor virtualize the biggest
 Linux installation in northern Europe for one of their customers. There were
 over a thousand Red Hat Enterprise Servers installed in total. The customer
 followed ITIL Release To Production. Now you could jump up and down about a
 new release of application XYZ and argue that you could install it from a
 self-made RPM. The customer does not care. Installing self-made RPMs = no
 support from Red Hat. Now if your business depends on running systems and
 every second of downtime can cost you hundreds of € then you don't think
 twice about installing from source. You just don't do it. It's that easy. Just
 compare the potential problem (aka downtime, loss of money, loss of trust from
 customers, etc.) to the potential benefit of a self-made RPM and you will
 quickly realize that it is a no-go.
 
 Stephan is probably a small shop doing all his stuff by hand. But there are
 situations where this handcrafted approach is just not the way to go.
 
 
   It is hard to fully understand what you just wrote.  If you are
   suggesting that someone else's personal preferences (or company
   objectives) are incorrect or misguided simply because they don't match
   your own I'm trying to understand how your last post pertains to the
   user forum for Gluster?  There are plenty of reasons to prefer packages
   over source installations but that academic conversation is also not
   appropriate for this list.
  
   Cheers,
   Benjamin
  
  
  
   -Original Message-
   From: gluster-users-boun...@gluster.org
   [mailto:gluster-users-boun...@gluster.org] On Behalf Of Stephan von
   Krawczynski
   Sent: Wednesday, March 24, 2010 4:37 PM
   To: Ian Rogers
   Cc: gluster-users@gluster.org
   Subject: Re: [Gluster-users] Setup for production - which one would you
   choose?
  
   Ok, guys, honestly: it is allowed to learn (RMS fought for your right to
   do so)
   :-)
   Really rarely in the open source universe you will find a piece of
   software
   that is as easy to compile and run as glusterfs. All you have to know
   yourself
   is how to use tar. Then enter the source directory and do ./configure ;
   make ;
   make install What exactly is difficult to do? Why would you install
   _some_
   rpm that is outdated anyways (be it 2.0.9 or 3.0.2)?
   Please don't tell you configure and drive LAMP but can't build
   glusterfs.
   The docs for 5 apache config options are longer than the whole
   glusterfs-source...
  
   --
   Regards,
   Stephan
  
   PS: yes, I know it's the user-list. 
  
  
  
   On Wed, 24 Mar 2010 17:14:32 +
   Ian Rogers ian.rog...@contactclean.com wrote:
  
 
   I've just done part one of a writeup of my EC2 gluster LAMP
   
   installation 
 
   at 
  
   
   http://www.sirgroane.net/2010/03/distributed-file-system-on-amazon-ec2/ 
 
   - may or may not be useful to you :-)
  
   Ian
  
   On 24/03/2010 17:09, Oliver Hoffmann wrote:
   
   Yes, that's an idea. Thanx. That will be important for all the
 
   debian
 
   clients, mostly lenny.
  
   I think waiting and testing a month is quite ok though.
  
  
 
   To have glusterfs 3.0.3 on ubuntu 9.10 you can also just install
   
   the
 
   debian package for gluster 3.0.3 with dpkg -i.
  
   http://packages.debian.org/source/sid/glusterfs
  
   But then 10.04 is only a month away, so depends how much of a rush
   your in!
  
  
  
   On Wednesday 24 Mar 2010 16:45:40 Oliver Hoffmann wrote:

   
   Haha

Re: [Gluster-users] gluster local vs local = gluster x4 slower

2010-03-24 Thread Stephan von Krawczynski
Hi Jeremy,

have you tried to reproduce this with all performance options disabled? They
are possibly not a good idea on a local system.
What local fs do you use?


--
Regards,
Stephan


On Tue, 23 Mar 2010 19:11:28 -0500
Jeremy Enos je...@ncsa.uiuc.edu wrote:

 Stephan is correct- I primarily did this test to show a demonstrable 
 overhead example that I'm trying to eliminate.  It's pronounced enough 
 that it can be seen on a single disk / single node configuration, which 
 is good in a way (so anyone can easily repro).
 
 My distributed/clustered solution would be ideal if it were fast enough 
 for small block i/o as well as large block- I was hoping that single 
 node systems would achieve that, hence the single node test.  Because 
 the single node test performed poorly, I eventually reduced down to 
 single disk to see if it could still be seen, and it clearly can be.  
 Perhaps it's something in my configuration?  I've pasted my config files 
 below.
 thx-
 
  Jeremy
 
 ##glusterfsd.vol##
 volume posix
type storage/posix
option directory /export
 end-volume
 
 volume locks
type features/locks
subvolumes posix
 end-volume
 
 volume disk
type performance/io-threads
option thread-count 4
subvolumes locks
 end-volume
 
 volume server-ib
type protocol/server
option transport-type ib-verbs/server
option auth.addr.disk.allow *
subvolumes disk
 end-volume
 
 volume server-tcp
type protocol/server
option transport-type tcp/server
option auth.addr.disk.allow *
subvolumes disk
 end-volume
 
 ##ghome.vol##
 
 #---IB remotes--
 volume ghome
type protocol/client
option transport-type ib-verbs/client
 #  option transport-type tcp/client
option remote-host acfs
option remote-subvolume raid
 end-volume
 
 #Performance Options---
 
 volume readahead
type performance/read-ahead
option page-count 4   # 2 is default option
option force-atime-update off # default is off
subvolumes ghome
 end-volume
 
 volume writebehind
type performance/write-behind
option cache-size 1MB
subvolumes readahead
 end-volume
 
 volume cache
type performance/io-cache
option cache-size 1GB
subvolumes writebehind
 end-volume
 
 ##END##
 
 
 
 On 3/23/2010 6:02 AM, Stephan von Krawczynski wrote:
  On Tue, 23 Mar 2010 02:59:35 -0600 (CST)
  Tejas N. Bhisete...@gluster.com  wrote:
 
 
  Out of curiosity, if you want to do stuff only on one machine,
  why do you want to use a distributed, multi node, clustered,
  file system ?
   
  Because what he does is a very good way to show the overhead produced only 
  by
  glusterfs and nothing else (i.e. no network involved).
  A pretty relevant test scenario I would say.
 
  --
  Regards,
  Stephan
 
 
 
  Am I missing something here ?
 
  Regards,
  Tejas.
 
  - Original Message -
  From: Jeremy Enosje...@ncsa.uiuc.edu
  To: gluster-users@gluster.org
  Sent: Tuesday, March 23, 2010 2:07:06 PM GMT +05:30 Chennai, Kolkata, 
  Mumbai, New Delhi
  Subject: [Gluster-users] gluster local vs local = gluster x4 slower
 
  This test is pretty easy to replicate anywhere- only takes 1 disk, one
  machine, one tarball.  Untarring to local disk directly vs thru gluster
  is about 4.5x faster.  At first I thought this may be due to a slow host
  (Opteron 2.4ghz).  But it's not- same configuration, on a much faster
  machine (dual 3.33ghz Xeon) yields the performance below.
 
  THIS TEST WAS TO A LOCAL DISK THRU GLUSTER
  [r...@ac33 jenos]# time tar xzf
  /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
 
  real0m41.290s
  user0m14.246s
  sys 0m2.957s
 
  THIS TEST WAS TO A LOCAL DISK (BYPASS GLUSTER)
  [r...@ac33 jenos]# cd /export/jenos/
  [r...@ac33 jenos]# time tar xzf
  /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
 
  real0m8.983s
  user0m6.857s
  sys 0m1.844s
 
  THESE ARE TEST FILE DETAILS
  [r...@ac33 jenos]# tar tzvf
  /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz  |wc -l
  109
  [r...@ac33 jenos]# ls -l
  /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
  -rw-r--r-- 1 jenos ac 804385203 2010-02-07 06:32
  /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
  [r...@ac33 jenos]#
 
  These are the relevant performance options I'm using in my .vol file:
 
  #Performance Options---
 
  volume readahead
  type performance/read-ahead
  option page-count 4   # 2 is default option
  option force-atime-update off # default is off
  subvolumes ghome
  end-volume
 
  volume writebehind
  type performance/write-behind
  option cache-size 1MB
  subvolumes readahead
  end-volume
 
  volume cache
  type performance/io-cache
  option cache-size 1GB
  subvolumes writebehind
  end

Re: [Gluster-users] gluster local vs local = gluster x4 slower

2010-03-23 Thread Stephan von Krawczynski
On Tue, 23 Mar 2010 02:59:35 -0600 (CST)
Tejas N. Bhise te...@gluster.com wrote:

 Out of curiosity, if you want to do stuff only on one machine, 
 why do you want to use a distributed, multi node, clustered, 
 file system ?

Because what he does is a very good way to show the overhead produced only by
glusterfs and nothing else (i.e. no network involved).
A pretty relevant test scenario I would say.

--
Regards,
Stephan


 
 Am I missing something here ?
 
 Regards,
 Tejas.
 
 - Original Message -
 From: Jeremy Enos je...@ncsa.uiuc.edu
 To: gluster-users@gluster.org
 Sent: Tuesday, March 23, 2010 2:07:06 PM GMT +05:30 Chennai, Kolkata, Mumbai, 
 New Delhi
 Subject: [Gluster-users] gluster local vs local = gluster x4 slower
 
 This test is pretty easy to replicate anywhere- only takes 1 disk, one 
 machine, one tarball.  Untarring to local disk directly vs thru gluster 
 is about 4.5x faster.  At first I thought this may be due to a slow host 
 (Opteron 2.4ghz).  But it's not- same configuration, on a much faster 
 machine (dual 3.33ghz Xeon) yields the performance below.
 
 THIS TEST WAS TO A LOCAL DISK THRU GLUSTER
 [r...@ac33 jenos]# time tar xzf 
 /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
 
 real0m41.290s
 user0m14.246s
 sys 0m2.957s
 
 THIS TEST WAS TO A LOCAL DISK (BYPASS GLUSTER)
 [r...@ac33 jenos]# cd /export/jenos/
 [r...@ac33 jenos]# time tar xzf 
 /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
 
 real0m8.983s
 user0m6.857s
 sys 0m1.844s
 
 THESE ARE TEST FILE DETAILS
 [r...@ac33 jenos]# tar tzvf 
 /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz  |wc -l
 109
 [r...@ac33 jenos]# ls -l 
 /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
 -rw-r--r-- 1 jenos ac 804385203 2010-02-07 06:32 
 /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
 [r...@ac33 jenos]#
 
 These are the relevant performance options I'm using in my .vol file:
 
 #Performance Options---
 
 volume readahead
type performance/read-ahead
option page-count 4   # 2 is default option
option force-atime-update off # default is off
subvolumes ghome
 end-volume
 
 volume writebehind
type performance/write-behind
option cache-size 1MB
subvolumes readahead
 end-volume
 
 volume cache
type performance/io-cache
option cache-size 1GB
subvolumes writebehind
 end-volume
 
 What can I do to improve gluster's performance?
 
  Jeremy
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] How to re-sync

2010-03-07 Thread Stephan von Krawczynski
I love top-post ;-)

Generally, you are right. But in real life you cannot rely on this
smartness. We tried exactly this point and had to find out that the clients
do not always select the correct file version (i.e. the latest) automatically.
Our idea in the test case was to bring down a node, update its kernel and
revive it - just as you would do in the real world for a kernel update.
We found out that some files were taken from the downed node afterwards and
the new contents on the other node were in fact overwritten.
This does not happen generally, of course. But it does happen. We could only
stop this behaviour by setting favorite-child. But that does not really help
a lot, since we want to take down all nodes some other day.
This is in fact one of our show-stoppers.
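
For reference, the option mentioned above sits in the client-side replicate
volume definition; a minimal sketch with placeholder volume names (2.0.x/3.0.x
volfile syntax):

  volume afr
      type cluster/replicate
      # always prefer remote1's copy when the copies disagree
      option favorite-child remote1
      subvolumes remote1 remote2
  end-volume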


On Sun, 7 Mar 2010 01:33:14 -0800
Liam Slusser lslus...@gmail.com wrote:

 Assuming you used raid1 (distribute), you DO bring up the new machine
 and start gluster.  On one of your gluster mounts you run a ls -alR
 and it will resync the new node.  The gluster clients are smart enough
 to get the files from the first node.
 
 liam
 
 On Sat, Mar 6, 2010 at 11:48 PM, Chad ccolu...@hotmail.com wrote:
  Ok, so assuming you have N glusterfsd servers (say 2 cause it does not
  really matter).
  Now one of the servers dies.
  You repair the machine and bring it back up.
 
  I think 2 things:
  1. You should not start glusterfsd on boot (you need to sync the HD first)
  2. When it is up how do you re-sync it?
 
  Do you rsync the underlying mount points?
  If it is a busy gluster cluster it will be getting new files all the time.
  So how do you sync and bring it back up safely so that clients don't connect
  to an incomplete server?
 
  ^C
  ___
  Gluster-users mailing list
  Gluster-users@gluster.org
  http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
 


-- 
MfG,
Stephan von Krawczynski


--
ith Kommunikationstechnik GmbH

Lieferanschrift  : Reiterstrasse 24, D-94447 Plattling
Telefon  : +49 9931 9188 0
Fax  : +49 9931 9188 44
Geschaeftsfuehrer: Stephan von Krawczynski
Registergericht  : Deggendorf HRB 1625
--

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Migrate from an NFS storage to GlusterFS

2010-02-16 Thread Stephan von Krawczynski
On Tue, 16 Feb 2010 17:31:00 +0530
Vikas Gorur vi...@gluster.com wrote:

 Olivier Le Cam wrote:
  Thanks Vikas. BTW, might it be possible to have the same volume 
  exported both by regular-NFS and GlusterFS at the same time in order 
  to migrate my clients smoothly? Is there any risks to get GlusterFS 
  confused and/or the ext3 volume damaged?
 That would be quite risky. If you have both GlusterFS clients and NFS 
 clients operating on
 the same files or directories there are chances of race conditions which 
 might lead
 to lost files, GlusterFS getting confused, NFS getting confused etc. I 
 wouldn't recommend it.

But isn't that a setup that every average user would expect to work? You can
share data between NFS and a local (nfs-server) user, too. Is your file
locking racy? Did you break atomic operations?

Remember that long discussion about softly migrating data by just exporting
already existing data via glusterfs without copying? This point is very
similar. It is a common understanding in modern fs that multiple users of the
same file should be managed by record- and/or file-locking. A network-based fs
on top of some other fs should behave just as if it were some average local
user - then your data should (and must) be safe.

 Vikas

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Bonded Gigabit

2010-01-05 Thread Stephan von Krawczynski
On Tue, 05 Jan 2010 21:39:58 +0530
Vikas Gorur vi...@gluster.com wrote:

 Adrian Revill wrote:
  That sounds OK
 
  So if I have a client on server A and I write a file on server A, 
  would the file be copied to server B, C and D all at the same time, or 
  will the file be first copped to server B then coied to C and D in turn?
 
 It will be written to all servers simultaneously.
 
 Vikas

Forgive my ignorance, but I doubt that.
Simultaneously would mean that you have parallel network paths to all
servers; only then would your client be able to copy the data at almost the
same time. If the network path to your servers is in fact bottlenecked at the
single client network card, then you might notice what I did, too: your servers
look like they are processing the data serially and not in parallel. The first
one shows its hd blinking, then the second one, and so on. I already noticed
that with two servers and simple bonnie tests.

So again: are you really sure about simultaneous? I'd say pushing large
chunks of data per server through a single network path cannot be called that
way.

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Strange server locks isuess with 2.0.7 - updating

2009-11-21 Thread Stephan von Krawczynski
The problem we experienced was occasional packet loss (not high, only very
occasional). You will see that in almost every LAN. If your ping packet is
lost and you have configured a low value, a brick will be marked offline quite
quickly even though there is no real problem. The bigger the timeout, the
better the chance that a following ping packet will make it through and reset
the wait time.
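
For illustration, the value is set per protocol/client volume in the client
volfile, e.g. (host and subvolume names as in the configuration quoted
further below):

  volume remote1
      type protocol/client
      option transport-type tcp/client
      option remote-host 192.168.2.184
      option ping-timeout 120
      option remote-subvolume locks
  end-volume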



On Fri, 20 Nov 2009 14:18:46 +0100
Marek m...@kis.p.lodz.pl wrote:

 Why do you suggest ping-timeout with such a high value?
 When some brick gets in trouble, the mounted fs on the client side is unusable
 (I/O is locked)
 and has to wait 120 sec. for the timeout to release the fs.
 Locked client I/O for 120 sec. is not acceptable.
 
 
 regards,
 
 Stephan von Krawczynski wrote:
  Try setting your ping-timeout way higher, since we use 120 we have almost no
  issues in regular use. Nevertheless we do believe every problem will come 
  back
  when some brick(s) die...
  
  
  On Tue, 10 Nov 2009 14:59:07 +0100
  Marek Blaszkowski m...@kis.p.lodz.pl wrote:
  
  OK,
  here goes some more details, on a bad servers (with strange lockups) we 
  got
  problems with open/move files. We are unable to open,move or just ls files
  (file utils just hangs )
 
 
  Marek wrote:
  Hello,
  we're testing a simple configuration of glusterfs 2.0.7 with 4 servers 
  and 1 client (2+2 bricks each replicated with
  a distribute translator, configs below).
   During our tests (client-side copying/moving of a lot of small files on 
  glusterfs mounted FS) we got a strange
  lockups on two of servers (bricks).
  I was unable to login (via ssh) to server, on already started terminal 
  sessions I couldn't spawn a top
  process (it just hangs), vmstats exists with floating point exception. 
  Other fileutils commands behaves normal.
  There were no dmesg kernel messages (first guess was a kernel oops or
  other kernel related problems).
  This server never had any CPU/memory problems under high loads before.
  Problems start when we
  run glusterfsd on this server. We had to hard-reset the malfunctioning server
  (reboot doesn't work).
  After a couple of hours of testing another server disconnected from a client
  (according to the client debug log).
  The scenario was the same:
  1. unable to login to the server; the connection was established but sshd on
  the server side hung/timed out after entering the user password
  2. on previously established terminal sessions we were unable to run top or
  the vmstat utility (vmstat exits with a
  floating point exception). Copying/moving files was OK. Load was 0.00,
  0.00, 0.00
 
 
  What could be wrong? These servers never had problems before (simple
  terminal/proxy servers). The strange locking looks
  like it is related to kernel VM structures (why do top/vmstat behave oddly??)
  or other kernel related problems.
 
  Server remote1 details: Linux version 2.6.26-1-686 (Debian 
  2.6.26-13lenny2) (da...@debian.org)
  (gcc version 4.1.3 20080704 (prerelease) (Debian 4.1.2-25)) #1 SMP Fri 
  Mar 13 18:08:45 UTC 2009
  running debian 5.0
 
  Server remote2 details: Linux version 2.6.22-14-server (bui...@palmer) 
  (gcc version 4.1.3 20070929
  (prerelease) (Ubuntu 4.1.2-16ubuntu2)) #1 SMP Sun Oct 14 23:34:23 GMT 2007
  running ubuntu
  both run glusterfsd:
   /usr/local/sbin/glusterfsd -p /var/run/glusterfsd.pid -f 
  /usr/local/etc/glusterfs/glusterfs-server.vol
 
 
  Note that both servers run different OS versions and got similar
  lockup problems, never having had problems
  before (without glusterfsd).
 
 
  Server gluster config file (the same on 4 servers):
  -cut here
  volume brick
  type storage/posix
  option directory /var/gluster
  end-volume
 
  volume locks
  type features/posix-locks
  subvolumes brick
  end-volume
 
  volume server
  type protocol/server
  option transport-type tcp/server
  option auth.ip.locks.allow *
  option auth.ip.brick-ns.allow *
  subvolumes locks
  end-volume
  -cut here---
 
  client gluster config below (please note remote1 and remote4 got the
  problems mentioned above); the gluster client was
  started with the command:
  glusterfs --log-file=/var/log/gluster-client -f 
  /usr/local/etc/glusterfs/glusterfs-client.vol /var/glustertest
 
 
  -client config-cut here---
  volume remote1
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.2.184
  option ping-timeout 5
  option remote-subvolume locks
  end-volume
 
  volume remote2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.2.195
  option ping-timeout 5
  option remote-subvolume locks
  end-volume
 
  volume remote3
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.2.145
  option ping-timeout 5
  option remote-subvolume locks
  end-volume
 
  volume remote4
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.2.193
  option ping-timeout 5

Re: [Gluster-users] Gluster in HTTP cluster

2009-11-11 Thread Stephan von Krawczynski
On Tue, 28 Jul 2009 16:31:44 -0500
Brian Koloszyc br...@creativemerch.com wrote:

 Hi,
 
 I am in the process of building out a sandbox glusterFS environment in 
 Amazon's EC2 cloud.  I have successfully configured the NFS clone, but I'm 
 looking to transition over to gluster in order to get away from NFS in the 
 first place.
 
 Our desired configuration would be to have x number of web slaves, each 
 having a local attached device for storage, with replication enabled between 
 all 4 attached devices in order to keep dynamically generated content in sync.
 
 Can someone point me in the direction of the correct config for this?
 
 I've read over this:
 http://www.gluster.org/docs/index.php/Translators
 
 I'm a bit confused.  Is it even possible to have the client always read/write 
 to the local disk?   Or will each client round robin between gluster server 
 storage?  My concern is that we want optimal read/write times (nfs is too 
 slow), and we are worried that the tcp connection times will be as slow as 
 nfs.

I'd be surprised if you manage to get even nfs performance. We never managed
that in a real-world situation.
 
 Thanks,
 
 --Brian.
 

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Strange server locks isuess with 2.0.7 - updating

2009-11-10 Thread Stephan von Krawczynski


-- 
MfG,
Stephan von Krawczynski


--
ith Kommunikationstechnik GmbH

Lieferanschrift  : Reiterstrasse 24, D-94447 Plattling
Telefon  : +49 9931 9188 0
Fax  : +49 9931 9188 44
Geschaeftsfuehrer: Stephan von Krawczynski
Registergericht  : Deggendorf HRB 1625
--

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Rsync

2009-10-06 Thread Stephan von Krawczynski
Remember, the gluster team does not like my way of data feeding. If your setup
blows up, don't blame them (or me :-)
I can only tell you what I am doing: simply move (or copy) the initial data to
the primary server of the replication setup and then start glusterfsd for
exporting.
You will notice that the data gets replicated as soon as stat activity is going
on (a first ls or the like). If you already exported the data via nfs before,
you probably only need to set up glusterfs on the very same box and use it as
the primary server. Then there is no data copying at all.
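
A minimal sketch of that procedure, reusing the example paths and commands that
appear elsewhere in this archive (the source directory is a placeholder):

# on the primary server: put the data directly into the exported directory,
# then start the export
cp -a /data/existing/. /var/gluster/
/usr/local/sbin/glusterfsd -p /var/run/glusterfsd.pid -f /usr/local/etc/glusterfs/glusterfs-server.vol

# on the client: mount and walk the tree once so replication kicks in
glusterfs -f /usr/local/etc/glusterfs/glusterfs-client.vol /var/glustertest
ls -lR /var/glustertest > /dev/null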

After months of experiments I can say that glusterfs runs pretty stably on
_low_ performance setups. But you have to do one thing: lengthen the
ping-timeout (something like option ping-timeout 120).
If you do not do that you will lose some of your server(s) at some point and
that will turn your glusterfs setup into a mess.
If your environment is ok, it works. If your environment fails it will fail,
too, sooner or later. In other words: it exports data, but it does not fulfill
the promise of keeping your setup alive during failures - at this stage.
My advice for the team is to stop whatever they may be working on, take four
physical boxes (2 servers, 2 clients), run a lot of bonnies and unplug/re-plug
the servers non-deterministically. You can find all kinds of weirdness this way.

Regards,
Stephan


On Mon, 5 Oct 2009 16:49:53 +0100
Hiren Joshi j...@moonfruit.com wrote:

 My users are more pitchfork, less shooting.
 
 I don't understand what you're saying: should I have copied all the files
 over locally, without using gluster, before attempting an rsync?
 
  -Original Message-
  From: Stephan von Krawczynski [mailto:sk...@ithnet.com] 
  Sent: 05 October 2009 14:13
  To: Hiren Joshi
  Cc: Pavan Vilas Sondur; gluster-users@gluster.org
  Subject: Re: [Gluster-users] Rsync
  
  It would be nice to remember my thread about _not_ copying 
  data initially to
  gluster via the mountpoint. And one major reason for _local_ 
  feed was: speed. 
  Obviously a lot of cases are merely impossible because of the 
  pure waiting
  time. If you had a live setup people would have already shot you...
  This is why I talked about a feature and not an accepted bug 
  behaviour.
  
  Regards,
  Stephan
  
  
  On Mon, 5 Oct 2009 11:00:36 +0100
  Hiren Joshi j...@moonfruit.com wrote:
  
   Just a quick update: The rsync is *still* not finished. 
   
-Original Message-
From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of 
  Hiren Joshi
Sent: 01 October 2009 16:50
To: Pavan Vilas Sondur
Cc: gluster-users@gluster.org
Subject: Re: [Gluster-users] Rsync

Thanks!

 I'm keeping a close eye on the "is glusterfs DHT really
 distributed?" thread =)

I tried nodelay on and unhashd no. I tarred about 400G to 
  the share in
about 17 hours (~6MB/s?) and am running an rsync now. 
  Will post the
results when it's done.

 -Original Message-
 From: Pavan Vilas Sondur [mailto:pa...@gluster.com] 
 Sent: 01 October 2009 09:00
 To: Hiren Joshi
 Cc: gluster-users@gluster.org
 Subject: Re: Rsync
 
 Hi,
 We're looking into the problem on similar setups and
  working on it.
 Meanwhile can you let us know if performance increases if you 
 use this option:
 
 'option transport.socket.nodelay on' in each of your
 protocol/client and protocol/server volumes.
 
 Pavan
 
 On 28/09/09 11:25 +0100, Hiren Joshi wrote:
  Another update:
  It took 1240 minutes (over 20 hours) to complete on 
  the simplified
  system (without mirroring). What else can I do to debug?
  
   -Original Message-
   From: gluster-users-boun...@gluster.org 
   [mailto:gluster-users-boun...@gluster.org] On Behalf Of 
 Hiren Joshi
   Sent: 24 September 2009 13:05
   To: Pavan Vilas Sondur
   Cc: gluster-users@gluster.org
   Subject: Re: [Gluster-users] Rsync
   

   
-Original Message-
From: Pavan Vilas Sondur [mailto:pa...@gluster.com] 
Sent: 24 September 2009 12:42
To: Hiren Joshi
Cc: gluster-users@gluster.org
Subject: Re: Rsync

Can you let us know the following:

 * What is the exact directory structure?
   /abc/def/ghi/jkl/[1-4]
   now abc, def, ghi and jkl are one of a thousand dirs.
   
 * How many files are there in each individual 
  directory and 
of what size?
   Each of the [1-4] dirs has about 100 files in, all 
  under 1MB.
   
 * It looks like each server process has 6 export 
directories. Can you run one server process each 
  for a single 
export directory and check if the rsync speeds up?
   I had no idea you could do that. How? Would I need to 
 create 6 config
   files and start gluster

Re: [Gluster-users] Rsync

2009-10-05 Thread Stephan von Krawczynski
It would be nice to remember my thread about _not_ copying data initially to
gluster via the mountpoint. And one major reason for the _local_ feed was speed.
Obviously a lot of cases are simply impossible because of the sheer waiting
time. If you had a live setup, people would have already shot you...
This is why I talked about a feature and not about accepted buggy behaviour.

Regards,
Stephan


On Mon, 5 Oct 2009 11:00:36 +0100
Hiren Joshi j...@moonfruit.com wrote:

 Just a quick update: The rsync is *still* not finished. 
 
  -Original Message-
  From: gluster-users-boun...@gluster.org 
  [mailto:gluster-users-boun...@gluster.org] On Behalf Of Hiren Joshi
  Sent: 01 October 2009 16:50
  To: Pavan Vilas Sondur
  Cc: gluster-users@gluster.org
  Subject: Re: [Gluster-users] Rsync
  
  Thanks!
  
  I'm keeping a close eye on the "is glusterfs DHT really distributed?"
  thread =)
  
  I tried nodelay on and unhashd no. I tarred about 400G to the share in
  about 17 hours (~6MB/s?) and am running an rsync now. Will post the
  results when it's done.
  
   -Original Message-
   From: Pavan Vilas Sondur [mailto:pa...@gluster.com] 
   Sent: 01 October 2009 09:00
   To: Hiren Joshi
   Cc: gluster-users@gluster.org
   Subject: Re: Rsync
   
   Hi,
   We're looking into the problem on similar setups and working on it. 
   Meanwhile can you let us know if performance increases if you 
   use this option:
   
   'option transport.socket.nodelay on' in each of your
   protocol/client and protocol/server volumes.
   
   Pavan
   
   On 28/09/09 11:25 +0100, Hiren Joshi wrote:
Another update:
It took 1240 minutes (over 20 hours) to complete on the simplified
system (without mirroring). What else can I do to debug?

 -Original Message-
 From: gluster-users-boun...@gluster.org 
 [mailto:gluster-users-boun...@gluster.org] On Behalf Of 
   Hiren Joshi
 Sent: 24 September 2009 13:05
 To: Pavan Vilas Sondur
 Cc: gluster-users@gluster.org
 Subject: Re: [Gluster-users] Rsync
 
  
 
  -Original Message-
  From: Pavan Vilas Sondur [mailto:pa...@gluster.com] 
  Sent: 24 September 2009 12:42
  To: Hiren Joshi
  Cc: gluster-users@gluster.org
  Subject: Re: Rsync
  
  Can you let us know the following:
  
   * What is the exact directory structure?
 /abc/def/ghi/jkl/[1-4]
 now abc, def, ghi and jkl are one of a thousand dirs.
 
   * How many files are there in each individual directory and 
  of what size?
 Each of the [1-4] dirs has about 100 files in, all under 1MB.
 
   * It looks like each server process has 6 export 
  directories. Can you run one server process each for a single 
  export directory and check if the rsync speeds up?
 I had no idea you could do that. How? Would I need to 
   create 6 config
 files and start gluster:
 
 /usr/sbin/glusterfsd -f /etc/glusterfs/export1.vol or similar?
 
 I'll give this a go
 
   * Also, do you have any benchmarks with a similar setup on 
 say, NFS?
 NFS will create the dir tree in about 20 minutes then start 
 copying the
 files over, it takes about 2-3 hours.
 
  
  Pavan
  
  On 24/09/09 12:13 +0100, Hiren Joshi wrote:
   It's been running for over 24 hours now.
   Network traffic is nominal, top shows about 200-400% cpu 
 (7 cores so
   it's not too bad).
   About 14G of memory used (the rest is being used as 
   disk cache).
   
   Thoughts?
   
   
   
   snip
   
   An update, after running the rsync for a day, 
   I killed it 
  and remounted
   all the disks (the underlying filesystem, not the 
 gluster) 
  with noatime,
   the rsync completed in about 600 minutes. I'm now 
 going to 
  try one level
   up (about 1,000,000,000 dirs).
   
-Original Message-
From: Pavan Vilas Sondur 
  [mailto:pa...@gluster.com] 
Sent: 23 September 2009 07:55
To: Hiren Joshi
Cc: gluster-users@gluster.org
Subject: Re: Rsync

Hi Hiren,
What glusterfs version are you using? Can you 
 send us the 
volfiles and the log files.

Pavan

On 22/09/09 16:01 +0100, Hiren Joshi wrote:
 I forgot to mention, the mount is mounted with 
  direct-io, would this
 make a difference? 
 
  -Original Message-
  From: gluster-users-boun...@gluster.org 
  [mailto:gluster-users-boun...@gluster.org] On 
  Behalf Of 
Hiren Joshi
  Sent: 22 September 2009 11:40
  To: gluster-users@gluster.org
  Subject: [Gluster-users] Rsync
  
  Hello all,

Re: [Gluster-users] The continuing story ...

2009-09-18 Thread Stephan von Krawczynski
On Fri, 18 Sep 2009 10:35:22 +0200
Peter Gervai grin...@gmail.com wrote:

 Funny thread we have.
 
 Just a sidenote on the last week's part about "userspace cannot lock up
 the system": blocking resource waits / I/O waits can stall _all_ disk
 access, and try to imagine what you can do with a system without disk
 access. Obviously, you cannot log in, cannot start new programs,
 cannot load dynamic libraries. Yet the system pings, and your already
 logged in shells may function more or less, especially if you have a
 statically linked one (like sash).
 
 As a bitter sidenote: google for 'xtreemfs', may be interesting if you
 only need a shared redundant access with extreme network fault
 tolerance. (And yes, it can stall the system, too. :-))

I would not want to use it for exactly this reason (from the docs):

-
XtreemFS implements an object-based file system architecture (Fig. 2.1). The
name of this architecture comes from the fact that an object-based file system
splits file content into a series of fixed-size objects and stores them on its
storage servers. In contrast to block-based file systems, the size of such an
object can vary from file to file.

The metadata of a file (such as the file name or file size) is stored separate
from the file content on a Metadata server. This metadata server organizes
file system metadata as a set of volumes, each of which implements a separate
file system namespace in form of a directory tree. 
-

That's exactly what we don't want. We want a disk layout that is accessible
even if glusterfs (or call it the network fs) has a bad day and doesn't want
to start.

 Another sidenote: I tend to see FUSE as a low-speed toy nowadays. It
 doesn't seem to be able to handle any serious I/O load.

Really, I can't judge. I haven't opened (this) Pandora's box up to now ...

 -- 
  byte-byte,
 grin

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] booster

2009-09-14 Thread Stephan von Krawczynski
On Mon, 14 Sep 2009 15:03:05 +0530
Shehjar Tikoo shehj...@gluster.com wrote:

 Stephan von Krawczynski wrote:
  On Mon, 14 Sep 2009 11:40:03 +0530 Shehjar Tikoo
  shehj...@gluster.com wrote:
  
  We only tried to run some bash scripts with preloaded
  booster...
  
  Do you mean the scripts contained commands with LD_PRELOADed 
  booster? Or were you trying to run bash with LD_PRELOADed
  booster?
  
  The second scenario will not work at this point.
  
  Thanks -Shehjar
  
  Oh, that's bad news. We tried to PRELOAD booster in front of bash
  (implicit, called the bash-script with LD_PRELOADED). Is this a
  general problem or a not-yet-implemented feature?
  
 
 A general problem, I'd say. The last time, i.e. when we revamped
 booster, we tried running with bash but there was some clash with bash
 internals.
 
 We haven't done anything special to fix the problem since then because:
 
 1. it requires changes deep inside GlusterFS; and
 
 2. running bash wasn't a very useful scenario when the LD_PRELOAD
 variable can be added for the bash environment as a whole. For example,
 if you just do export LD_PRELOAD=blah on the command line, you can
 actually have every program started from that shell use booster.
 
 -Shehjar

Well, how about other interpreters like sh, csh, perl, python, php, you name it?
There are tons of perl applications out there; we use some, too.
Is the problem only linked to bash?

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] booster

2009-09-14 Thread Stephan von Krawczynski

 2. running bash wasn't a very useful scenario when the LD_PRELOAD
 variable can be added for the bash environment as a whole. For example,
 if you just do export LD_PRELOAD=blah on the command line, you can
 actually have every program started from that shell use booster.
 
 -Shehjar

There is a problem with that: if your bash environment calls just one other
bash script, that one will fail as well. Another problem can be script-replaced
binaries. If you replace some classical binary with a shell script for
additional parameters (or any other thinkable reason) this general export
approach will fail, too.
Or let's say your favourite email client calls some script to mark spam...
There are a lot of pitfalls on this ground ...
-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] The continuing story ...

2009-09-10 Thread Stephan von Krawczynski
On Wed, 09 Sep 2009 19:43:15 -0400
Mark Mielke m...@mark.mielke.cc wrote:

 
  On Wed, 9 Sep 2009 23:17:07 +0530
  Anand Avatiav...@gluster.com  wrote:
 
 
  Please reply back to this thread only after you have a response from
  the appropriate kernel developer indicating that the cause of this
  lockup is because of a misbehaving userspace application. After that,
  let us give you the benefit of doubt that the misbehaving userspace
  process is glusterfsd and then continue any further debugging. It is
  not that we do not want to help you, but we really are pointing you to
  the right place where your problem can actually get fixed. You have
  all the necessary input they need.
   
  This is the kind of statement that often drives listeners to think about a
  project fork...
 
 
 
 Only if backed up. Has the trace been shown to the linux developers? 
 What do they think?
 
 If the linux developers come back with "this is totally a userspace 
 program - go away", then yes, it can lead to people thinking about a 
 project fork. But if the linux developers come back with "crap - yes, 
 this is a kernel problem", then I think you might owe Anand an apology 
 for pushing him... :-)
 
 In this case, there is too many unknowns - but I agree with Anand's 
 logic 100%. Gluster should not be able to cause a CPU lock up. It should 
 be impossible. If it is not impossible - it means a kernel bug, and the 
 best place to have this addressed is the kernel devel list, or, if you 
 have purchased a subscription from a company such as RedHat, than this 
 belongs as a ticket open with RedHat.

You know, I am really bothered by the way the maintainers have been acting since
I started reading this list. There is a lot of ideology going on ("can't be, is
impossible for userspace" etc.) and very little real debugging.
This application is not the only one in the world. People use heavily file- and
network-active applications like firefox, apache, shell scripts, you name it, on
their boxes. None of them leads to the effects seen when you play with glusterfs.
If you really think it is a logical way of debugging to go out and simply say
"userspace can't do that" while the rest of the application world does not
show up with dead ends like those seen on this list, how can I change your mind?
I hardly believe I can. I can only tell you what I would do: I would try to
document _first_ that my piece of code really does behave well. But as you may
have noticed there is no real way to provide this information. And that is
indeed part of the problem.
Wouldn't it be a nice step if you could debug the goings-on of a
glusterfs server on the client by simply reading an exported file (something
like a server-dependent meta-debug file) that outputs something like strace
does? Something that enables you to say: ok, here you can see what the
application did, and there you can see what the kernel made of it. As we have
noticed, a server logfile is not sufficient.
Is ideology really proof of anything in today's world? Do you really think
it is possible to understand the complete world by seeing half of it and having
the other half painted by ideology? What is wrong with _proving_ that one is not
guilty? With acting defensively?

It is important to understand that this application is a kind of core
technology for data storage. This means people want to be sure that their
setup does not explode just because they made a kernel update or some other
change where their experience tells them it should have no influence on the
glusterfs service. You want to be sure, just like you are when using nfs. It
simply works (even though it lives in kernel space!).
Now, answer for yourself whether you think glusterfs is as stable as nfs on the
same box.

 Cheers,
 mark

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] The continuing story ...

2009-09-10 Thread Stephan von Krawczynski

  Only if backed up. Has the trace been shown to the linux developers? 
  What do they think?

Maybe we should just ask questions about the source before bothering others...

From 2.0.6 /transport/socket/src/socket.c line 867 ff:

new_trans = CALLOC (1, sizeof (*new_trans));
new_trans->xl = this->xl;
new_trans->fini = this->fini;

memcpy (new_trans->peerinfo.sockaddr, new_sockaddr,
addrlen);
new_trans->peerinfo.sockaddr_len = addrlen;

new_trans->myinfo.sockaddr_len =
sizeof (new_trans->myinfo.sockaddr);

ret = getsockname (new_sock,
   SA (new_trans->myinfo.sockaddr),
   new_trans->myinfo.sockaddr_len);

CALLOC from libglusterfs/src/mem-pool.h:
#define CALLOC(cnt,size) calloc(cnt,size)

man calloc:
RETURN VALUE
   For calloc() and malloc(), the value returned is a pointer to the 
allocated memory, which is suitably aligned for any
   kind of variable, or NULL if the request fails.


Did I understand the source? What about calloc returning NULL?
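
Just to make the question concrete, here is a minimal standalone sketch (plain
C, not glusterfs code) of the kind of NULL check being asked for:

#include <stdio.h>
#include <stdlib.h>

/* toy stand-in for the transport structure above */
struct trans { int fd; };

int main (void)
{
        /* same calloc pattern as above, but the NULL case is handled
           instead of being dereferenced blindly */
        struct trans *new_trans = calloc (1, sizeof (*new_trans));
        if (new_trans == NULL) {
                fprintf (stderr, "out of memory, refusing the new connection\n");
                return 1;
        }
        new_trans->fd = -1;
        free (new_trans);
        return 0;
}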

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] The continuing story ...

2009-09-10 Thread Stephan von Krawczynski
On Thu, 10 Sep 2009 21:20:04 +0530
Krishna Srinivas kris...@gluster.com wrote:

 Now, failing to check for NULL pointer here is a bug which we will fix
 in future releases (blame it on our laziness for not doing the check
 already!) Thanks for pointing it out.

Really, this was only _one_ quick example; there are numerous others in your
code. Look at all the CALLOC/MALLOC calls. Most of them are not safe.

Look at your documentation. It is quite a mess. There seems to be no intention
to document which version supports which options, and they are pretty different.
It would have been easy if you had started that from the very first release and
just copied all the docs to a new tree, deleting dead options and adding the
new ones, linking the docs to version numbers. This would allow people to find
out what is really a valid option.

I will keep posting every single case that looks bogus, even without
understanding a single bit of the semantics.

 Talking about an analogy: in a car, assume that the engine is glusterfs
 and the tyres the kernel. If you get flat tyres and the car doesn't move
 you can't blame the engine!

Boy, you really entered cloud number nine. To bring your example down to reality
I'd rather suggest the kernel being the engine and glusterfs being the rear-view
mirror. The car can live without one; nice to have one, though.

 Thanks
 Krishna

And today's example of coding is in
glusterfs-2.0.6/transport/socket/src/name.c.

# grep -n UNIX_PATH_MAX name.c
95:if (!path || strlen (path) > UNIX_PATH_MAX) {
281:if (strlen (connect_path) > UNIX_PATH_MAX) {
284:strlen (connect_path), UNIX_PATH_MAX);
321:#ifndef UNIX_PATH_MAX
322:#define UNIX_PATH_MAX 108
323:#endif
325:if (strlen (listen_path) > UNIX_PATH_MAX) {
329:strlen (listen_path), UNIX_PATH_MAX);

Now what does that mean? UNIX_PATH_MAX is used at lines 95, 281 and 284, and only
at line 321 does it occur to the programmer that it may be undefined? Ah well,
things get more interesting:

libglusterfs/src/compat.h:#define UNIX_PATH_MAX 108
libglusterfs/src/compat.h:#define UNIX_PATH_MAX 104
libglusterfs/src/compat.h:#define UNIX_PATH_MAX 104
libglusterfs/src/compat.h:#define UNIX_PATH_MAX 108
libglusterfs/src/transport.h:   char identifier[UNIX_PATH_MAX];

Ok, if you define it depending on the OS, how can it be an absolute 108 in
socket/src/name.c (and elsewhere)?
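
One way to avoid the hard-coded 108 entirely would be to take the limit from the
structure itself; a small standalone illustration (plain C, not a patch):

#include <stdio.h>
#include <sys/un.h>

int main (void)
{
        struct sockaddr_un addr;
        /* sizeof addr.sun_path is 108 on Linux and 104 on the BSDs, which is
           exactly the pair of values hidden behind the compat.h defines */
        printf ("sun_path capacity on this OS: %zu\n", sizeof (addr.sun_path));
        return 0;
}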

Remember, no semantics analyzed, just reading ... may as well be bs from me.

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] The continuing story ...

2009-09-08 Thread Stephan von Krawczynski
On Tue, 8 Sep 2009 10:13:17 +1000 (EST)
Jeff Evans je...@tricab.com wrote:

  - server was ping'able
  - glusterfsd was disconnected by the client because of missing
  ping-pong - no login possible
  - no fs action (no lights on the hd-stack)
  - no screen (was blank, stayed blank)
 
 This is very similar to what I have seen many times (even back on
 1.3), and have also commented on the list.
 
 It seems that we have quite a few ACK's on this, or similar problems.
 
 The only thing different in my scenario, is that the console doesn't
 stay blank. When attempting to login I get the last login message, and
 nothing more, no prompt ever. Also, I can see that other processes are
 still listening on sockets etc.. so it seems like the kernel just
 can't grab new FD's.
 
 I too found the hang happens more easily if a downed node from a
 replicate pair re-joins after some time.
 
 Following suggestions that this is all kernel related, I have just
 moved up to RHEL 5.4 in the hope that the new kernel will
 help.
 
 This fix stood out as potentially related for me:
 https://bugzilla.redhat.com/show_bug.cgi?id=44543

This is an ext3 fix; it is unlikely that we are running into a similar effect on
reiserfs3, as the two are really very different in internals and coding.
 
 We also have a broadcom network card, which had reports of hangs under
 load, the kernel has a patch for that too.

We used tg3 in this setup, but the load was not very high (below 10 MBit on a
1000MBit link). 

 If I still run into the hangs, I'll try xfs.

I doubt that this can be a real solution. My guess is that glusterfsd runs
into some race condition where it locks itself up completely.
It is not funny to debug something like this on a production setup. Best would
be to have debugging output sent from the server's glusterfsd directly to a
client that saves the logs. I would not count on syslog in this case; if it
survives, one could use a serial console for syslog output, though.
 
 Thanks, Jeff.

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] The continuing story ...

2009-09-08 Thread Stephan von Krawczynski
On Tue, 8 Sep 2009 03:23:37 -0700
Anand Avati anand.av...@gmail.com wrote:

  I doubt that this can be a real solution. My guess is that glusterfsd runs
  into some race condition where it locks itself up completely.
  It is not funny to debug something the like on a production setup. Best 
  would
  be to have debugging output sent from the servers' glusterfsd directly to a
  client to save the logs. I would not count on syslog in this case, if it
  survives one could use a serial console for syslog output though.
 
 Does the system which is locking up have a fuse mountpoint? or is it a
 pure glusterfsd export server without a glusterfs mountpoint?
 
 Avati

The system acts as a pure server for both glusterfs and nfs. It has neither fuse
nor nfs client mount points.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] The continuing story ...

2009-09-08 Thread Stephan von Krawczynski
On Tue, 8 Sep 2009 05:37:09 -0700
Anand Avati av...@gluster.com wrote:

   I doubt that this can be a real solution. My guess is that glusterfsd 
   runs
   into some race condition where it locks itself up completely.
   It is not funny to debug something the like on a production setup. Best 
   would
   be to have debugging output sent from the servers' glusterfsd directly 
   to a
   client to save the logs. I would not count on syslog in this case, if it
   survives one could use a serial console for syslog output though.
 
 I'm going to iterate through this yet again at the risk of frustrating
 you. glusterfsd (on the server side) is yet another process running
 only system calls. If glusterfsd has a race condition and locks itself
 up, then it locks _only its own process_ up. What you are having is a
 frozen system. There is no way glusterfsd can lock up your system
 through just VFS system calls, even if it wanted to, intentionally. It
 is a pure user space process and has no power to lock up the system.
 The worst glusterfsd can do to your system is deadlock its own process
 resulting in a glusterfs fuse mountpoint hang, or segfault and result
 in a core dump.
 
 Please consult system/kernel programmers you trust. Or ask on the
 kernel-devel mailing list. The system freeze you are facing is not
 something which can be caused by _any_ user space application.

Please read carefully what I said about the system condition. The fact that I
can ping the box means that the kernel is not messed up, i.e. this is no
freeze. But the fact that I can neither log in nor use any other user-space
software to get hands on the box only means that an application was able to
mess up userspace to such an extent that every other application gets few to no
timeslices, or that some system resource is eaten up to an extent that others
are simply locked out. That does not sound impossible to me, as it is just like
a local DoS attack, which is possible. Maybe one only needs some messed-up
pointers to create such a situation. What really bothers me more is the fact
that you continuously refuse to see what several people on the list described.
It is not our intention to waste someone's time; we try to give as much
information as possible to go out and find the problem. Unfortunately we
cannot do that job ourselves, because we don't have the background knowledge
about your code.
Since it is all userspace, maybe it would be helpful to have a version that
just outputs logs to serial, so that we can trace where it went before things
blew up. Maybe we can watch it cycling somewhere...

Do you really deny that a local DoS attack is generally possible? 
-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] The continuing story ...

2009-09-07 Thread Stephan von Krawczynski
Hello all,

last week we saw our first try to enable something like a real-world
environment on glusterfs fail.
Nevertheless we managed to get a working combination of _one_ server and _one_
client (using a replicate setup with a missing second server).
This setup worked for about 4 days, so yesterday we tried to enable the second
server. Within minutes the first one crashed. Well, really we do not know
whether it crashed in the true sense; the situation looked like this:
- server was ping'able
- glusterfsd was disconnected by the client because of missing ping-pong
- no login possible
- no fs action (no lights on the hd-stack)
- no screen (was blank, stayed blank)

This could also be a user-space hang or the cpu busy/looping. We don't know.
The really interesting part is that the server worked for days running single,
but as soon as dual-server fs action (obviously in combination with
self-healing) started, it did not survive 10 minutes.
Of course the second server went on, but we had to stop the whole thing
because the data was not completely healed, so it made no sense to go on with
old copies.
This was glusterfs 2.0.6 with a minimal server setup (storage/posix,
features/locks, performance/io-threads) on a linux kernel 2.6.25.2.
Is there someone out there who has experienced something like this?
Any ideas?

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] NFS replacement

2009-09-02 Thread Stephan von Krawczynski
On Tue, 01 Sep 2009 11:33:38 +0530
Shehjar Tikoo shehj...@gluster.com wrote:

 Stephan von Krawczynski wrote:
  On Mon, 31 Aug 2009 19:48:46 +0530 Shehjar Tikoo 
  shehj...@gluster.com wrote:
  
  Stephan von Krawczynski wrote:
  Hello all,
  
  after playing around for some weeks we decided to make some real
   world tests with glusterfs. Therefore we took a nfs-client and 
  mounted the very same data with glusterfs. The client does some 
  logfile processing every 5 minutes and needs around 3,5 mins 
  runtime in a nfs setup. We found out that it makes no sense to 
  try this setup with gluster replicate as long as we do not have 
  the same performance in a single server setup with glusterfs. So
   now we have one server mounted (halfway replicate) and would
  like to tune performance. Does anyone have experience with some 
  simple replacement like that? We had to find out that almost all
   performance options have exactly zero effect. The only thing
  that seems to make at least some difference is read-ahead on the
   server. We end up with around 4,5 - 5,5 minutes runtime of the 
  scripts, which is on the edge as we need something quite below 5
   minutes (just like nfs was). Our goal is to maximise performance
   in this setup and then try a real replication setup with two 
  servers. The load itselfs looks like around 100 scripts starting
   at one time and processing their data.
  
  Any ideas?
  
  What nfs server are you using? The in-kernel one?
  
  Yes.
  
  You could try the unfs3booster server, which is the original unfs3 
  with our modifications for bug fixes and slight performance 
  improvements. It should give better performance in certain cases 
  since it avoids the FUSE bottleneck on the server.
  
  For more info, do take a look at this page: 
  http://www.gluster.org/docs/index.php/Unfs3boosterConfiguration
  
  When using unfs3booster, please use GlusterFS release 2.0.6 since 
  that has the required changes to make booster work with NFS.
  
  I read the docs, but I don't understand the advantage. Why should we
   use nfs as kind of a transport layer to an underlying glusterfs 
  server, when we can easily export the service (i.e. glusterfs) 
  itself. Remember, we don't want nfs on the client any longer, but a 
  replicate setup with two servers (though we do not use it right now,
   but nevertheless it stays our primary goal).
 
 Ok. My answer was simply under the impression that moving to NFS
 was the motive. unfs3booster-over-gluster is a better solution as
 opposed to having kernel-nfs-over-gluster because of the avoidance of
 the FUSE layer completely.

Sorry. To make that clear again: I don't want to use NFS unless absolutely
necessary. I would be happy to use a complete glusterfs environment without
any patches and glue to nfs, cifs or the like.
  It sounds obvious to me
  that a nfs-over-gluster must be slower than a pure kernel-nfs. On the
   other hand glusterfs per se may even have some advantages on the 
  network side, iff performance tuning (and of course the options 
  themselves) is well designed. The first thing we noticed is that load
   dropped dramatically both on server and client when not using 
  kernel-nfs. Client dropped from around 20 to around 4. Server dropped
   from around 10 to around 5. Since all boxes are pretty much 
  dedicated to their respective jobs a lot of caching is going on 
  anyways.
 Thanks, that is useful information.
 
 So I
  would not expect nfs to have advantages only because it is 
  kernel-driven. And the current numbers (loss of around 30% in 
  performance) show that nfs performance is not completely out of 
  reach.
 That is true, we do have setups performing as well and in some
 cases better than kernel NFS despite the replication overhead. It
 is a matter of testing and arriving at a config that works for your
 setup.
 
 
  
  What advantages would you expect from using unfs3booster at all?
  
 To begin with, unfs3booster must be compared against kernel nfsd and not
 against a GlusterFS-only config. So when comparing with kernel-nfsd, one
 should understand that knfsd involves the FUSE layer, kernel's VFS and
 network layer, all of which have their advantages and also
 disadvantages, especially FUSE when using with the kernel nfsd. Those
 bottlenecks with FUSE+knfsd interaction are well documented elsewhere.
 
 unfs3booster enables you to avoid the FUSE layer, the VFS, etc and talk
 directly to the network and through that, to the GlusterFS server. In
 our measurements, we found that we could perform better than kernel
 nfs-over-gluster by avoiding FUSE and using our own caching(io-cache),
 buffering(write-behind, read-ahead) and request scheduling(io-threads).
 
  Another thing we really did not understand is the _negative_ effect 
  of adding iothreads on client or server. Our nfs setup needs around 
  90 nfs kernel threads to run smoothly. Every number greater than 8 
  iothreads reduces the performance

[Gluster-users] NFS replacement

2009-08-31 Thread Stephan von Krawczynski
Hello all,

after playing around for some weeks we decided to do some real-world tests
with glusterfs. Therefore we took an nfs client and mounted the very same data
with glusterfs. The client does some logfile processing every 5 minutes and
needs around 3.5 minutes of runtime in an nfs setup.
We found out that it makes no sense to try this setup with gluster replicate
as long as we do not have the same performance in a single server setup with
glusterfs. So now we have one server mounted (halfway replicate) and would
like to tune performance.
Does anyone have experience with some simple replacement like that? We had to
find out that almost all performance options have exactly zero effect. The
only thing that seems to make at least some difference is read-ahead on the
server. We end up with around 4.5 - 5.5 minutes of script runtime, which
is on the edge, as we need something well below 5 minutes (just like nfs was).
Our goal is to maximise performance in this setup and then try a real
replication setup with two servers.
The load itself looks like around 100 scripts starting at the same time and
processing their data.

Any ideas?

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] NFS replacement

2009-08-31 Thread Stephan von Krawczynski
On Mon, 31 Aug 2009 19:48:46 +0530
Shehjar Tikoo shehj...@gluster.com wrote:

 Stephan von Krawczynski wrote:
  Hello all,
  
  after playing around for some weeks we decided to make some real world tests
  with glusterfs. Therefore we took a nfs-client and mounted the very same 
  data
  with glusterfs. The client does some logfile processing every 5 minutes and
  needs around 3,5 mins runtime in a nfs setup.
  We found out that it makes no sense to try this setup with gluster replicate
  as long as we do not have the same performance in a single server setup with
  glusterfs. So now we have one server mounted (halfway replicate) and would
  like to tune performance.
  Does anyone have experience with some simple replacement like that? We had 
  to
  find out that almost all performance options have exactly zero effect. The
  only thing that seems to make at least some difference is read-ahead on the
  server. We end up with around 4,5 - 5,5 minutes runtime of the scripts, 
  which
  is on the edge as we need something quite below 5 minutes (just like nfs 
  was).
  Our goal is to maximise performance in this setup and then try a real
  replication setup with two servers.
  The load itselfs looks like around 100 scripts starting at one time and
  processing their data.
  
  Any ideas?
  
 What nfs server are you using? The in-kernel one?

Yes.

 You could try the unfs3booster server, which is the original unfs3
 with our modifications for bug fixes and slight performance
 improvements. It should give better performance in certain cases
 since it avoids the FUSE bottleneck on the server.
 
 For more info, do take a look at this page:
 http://www.gluster.org/docs/index.php/Unfs3boosterConfiguration
 
 When using unfs3booster, please use GlusterFS release 2.0.6 since
 that has the required changes to make booster work with NFS.

I read the docs, but I don't understand the advantage. Why should we use nfs
as a kind of transport layer to an underlying glusterfs server, when we can
easily export the service (i.e. glusterfs) itself? Remember, we don't want nfs
on the client any longer, but a replicate setup with two servers (we do not use
it right now, but nevertheless it stays our primary goal).
It sounds obvious to me that nfs-over-gluster must be slower than pure
kernel-nfs. On the other hand glusterfs per se may even have some advantages
on the network side, iff performance tuning (and of course the options
themselves) is well designed.
The first thing we noticed is that load dropped dramatically both on server
and client when not using kernel-nfs. Client dropped from around 20 to around
4. Server dropped from around 10 to around 5.
Since all boxes are pretty much dedicated to their respective jobs a lot of
caching is going on anyways. So I would not expect nfs to have advantages only
because it is kernel-driven. And the current numbers (loss of around 30% in
performance) show that nfs performance is not completely out of reach.

What advantages would you expect from using unfs3booster at all?

Another thing we really did not understand is the _negative_ effect of adding
iothreads on client or server. Our nfs setup needs around 90 nfs kernel
threads to run smoothly. Every number greater than 8 iothreads reduces the
performance of glusterfs measurably.

 -Shehjar
-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] NFS replacement, rest stopped

2009-08-31 Thread Stephan von Krawczynski
Hello all,

as told earlier, we tried to replace an nfs server/client combination in a
semi-production environment with a trivial one-server gluster setup. We
thought at first that this pretty simple setup would allow some more testing.
Unfortunately we have to stop those tests because it turns out that the client
system has trouble with networking as soon as we start glusterfs.
The client has three network cards: the first is for internet use, the second
for the connection to the glusterfs server, the third for collecting data from
several other boxes.
It turned out that the third interface ran into trouble soon after we started
to work with glusterfs. We could not ping several hosts on the same lan, or
packet delay was very high (up to 20 s).
The effects were pretty weird and looked like a bad interface card. But after
switching back to kernel-nfs everything went back to normal.
It really looks like the glusterfs client has some problems, too. It looks like
buffer reuse or memory thrashing or a pointer mixup or the like.
Interestingly, no problems were visible on the interface where the glusterfs
traffic was happening; I have no idea how something like this happens.
Anyway, the story sounds like someone will tell me it is the kernel networking
that has troubles, just like reiserfs has troubles, or ext3 :-(
To give you an idea what the ugly things look like:

Aug 31 08:20:16 heather kernel: [ cut here ]
Aug 31 08:20:16 heather kernel: WARNING: at net/ipv4/tcp.c:1405 
tcp_recvmsg+0x1c7/0x7b6()
Aug 31 08:20:16 heather kernel: Hardware name: empty
Aug 31 08:20:16 heather kernel: Modules linked in: nfs lockd nfs_acl sunrpc 
fuse loop i2c_i801 e100 i2c_core e1000e
Aug 31 08:20:16 heather kernel: Pid: 31500, comm: netcat Not tainted 2.6.30.5 #1
Aug 31 08:20:16 heather kernel: Call Trace:
Aug 31 08:20:16 heather kernel:  [80431497] ? tcp_recvmsg+0x1c7/0x7b6
Aug 31 08:20:16 heather kernel:  [80431497] ? tcp_recvmsg+0x1c7/0x7b6
Aug 31 08:20:16 heather kernel:  [8023282d] ? 
warn_slowpath_common+0x77/0xa3
Aug 31 08:20:16 heather kernel:  [80431497] ? tcp_recvmsg+0x1c7/0x7b6
Aug 31 08:20:16 heather kernel:  [80401340] ? 
sock_common_recvmsg+0x30/0x45
Aug 31 08:20:16 heather kernel:  [8029b3d8] ? 
mnt_drop_write+0x25/0x12e
Aug 31 08:20:16 heather kernel:  [803fee67] ? 
sock_aio_read+0x109/0x11d
Aug 31 08:20:16 heather kernel:  [80287131] ? do_sync_read+0xce/0x113
Aug 31 08:20:16 heather kernel:  [80244348] ? 
autoremove_wake_function+0x0/0x2e
Aug 31 08:20:16 heather kernel:  [80293243] ? 
poll_select_copy_remaining+0xd0/0xf3
Aug 31 08:20:16 heather kernel:  [80287b83] ? vfs_read+0xbd/0x133
Aug 31 08:20:16 heather kernel:  [80287cb5] ? sys_read+0x45/0x6e
Aug 31 08:20:16 heather kernel:  [8020ae6b] ? 
system_call_fastpath+0x16/0x1b
Aug 31 08:20:16 heather kernel: ---[ end trace 31e61d5bab6e7cc0 ]---

Hopefully you are not going to tell me that netcat has problems, are you?
Hopefully we can agree on the fact that there are nasty things going on inside
this code and that someone with a better brain and more kernel knowledge than
me should give it a very close look.

-- 
Regards,
Stephan

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Known Issues : Replicate will only self-heal if the files exist on the first subvolume. Server A -> B works, Server A <- B does not work.

2009-08-30 Thread Stephan von Krawczynski
On Sat, 29 Aug 2009 03:46:04 +0200
supp...@citytoo.com supp...@citytoo.com wrote:

 Hello,
 
  Known Issues : Replicate will only self-heal if the files exist on the first 
 subvolume. Server A -> B works, Server A <- B does not work.
 
 When will this problem be fixed? It is very important.
 
 Ben
 
 Cordialement

Hi Ben,

really, don't push too hard in this direction, because this is easily solvable
by running find on server B and stat'ing the file list on server A. You may
call that inconvenient, but at least there is a trivial solution.
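
For the record, the same idea run from a client mountpoint can be as simple as
walking the tree once; a rough sketch (the mountpoint path is an assumption):

find /mnt/glusterfs -print0 | xargs -0 stat > /dev/null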

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Replication not working on server hang

2009-08-28 Thread Stephan von Krawczynski

 [...]
 Glusterfs log only shows lines like this ones:
 
 [2009-08-28 09:19:28] E [client-protocol.c:292:call_bail] data2: bailing 
 out frame LOOKUP(32) frame sent = 2009-08-28 08:49:18. frame-timeout = 1800
 [2009-08-28 09:23:38] E [client-protocol.c:292:call_bail] data2: bailing 
 out frame LOOKUP(32) frame sent = 2009-08-28 08:53:28. frame-timeout = 1800
 
 Once server2 has been rebooted all gluster fs become available
 again on all clients and the hanged df and ls processes terminate,
 but difficult to understand why a replicated share that must survive
 to failure on one server does not.

You are suffering from the problem we talked about a few days ago on the list.
If your local fs somehow produces a deadlock on one server, glusterfs is
currently unable to cope with the situation and just _waits_ for things to
come. This deadlocks your clients, too, without any need.
Your experience backs my criticism of the handling of these situations.

-- 
Regards,
Stephan
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] 2.0.6

2009-08-24 Thread Stephan von Krawczynski
Hello Avati,

back to our original problem of all-hanging glusterfs servers and clients.
Today we got another hang with the same look and feel, but this time we got
something in the logs; please read it and tell us how to proceed further.
The configuration is as before. I am sending the whole log since boot; the crash
is visible at the end. We did the same testing as before, running two bonnies on
two clients.

Linux version 2.6.30.5 (r...@linux-tnpx) (gcc version 4.3.2 [gcc-4_3-branch 
revision 141291] (SUSE Linux) ) #1 SMP Tue Aug 18 12:06:06 CEST 2009
Command line: root=/dev/sda3 resume=/dev/sda1 splash=silent console=ttyS0,9600 
console=tty0
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  Centaur CentaurHauls
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009dc00 (usable)
 BIOS-e820: 0009dc00 - 000a (reserved)
 BIOS-e820: 000ca000 - 000cc000 (reserved)
 BIOS-e820: 000e4000 - 0010 (reserved)
 BIOS-e820: 0010 - d7e8 (usable)
 BIOS-e820: d7e8 - d7e8a000 (ACPI data)
 BIOS-e820: d7e8a000 - d7f0 (ACPI NVS)
 BIOS-e820: d7f0 - d800 (reserved)
 BIOS-e820: e000 - f000 (reserved)
 BIOS-e820: fec0 - fec1 (reserved)
 BIOS-e820: fee0 - fee01000 (reserved)
 BIOS-e820: ff00 - 0001 (reserved)
 BIOS-e820: 0001 - 00012800 (usable)
DMI present.
Phoenix BIOS detected: BIOS may corrupt low RAM, working around it.
last_pfn = 0x128000 max_arch_pfn = 0x1
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
last_pfn = 0xd7e80 max_arch_pfn = 0x1
init_memory_mapping: -d7e8
init_memory_mapping: 0001-00012800
ACPI: RSDP 000f6390 00014 (v00 PTLTD )
ACPI: RSDT d7e822bb 0003C (v01 PTLTDRSDT   0604  LTP )
ACPI: FACP d7e89e54 00074 (v01 INTEL   0604 PTL  0003)
ACPI: DSDT d7e83b29 0632B (v01  INTEL MUKLTEO2 0604 MSFT 010E)
ACPI: FACS d7e8afc0 00040
ACPI: MCFG d7e89ec8 0003C (v01 PTLTDMCFG   0604  LTP )
ACPI: APIC d7e89f04 00084 (v01 PTLTD APIC   0604  LTP )
ACPI: BOOT d7e89f88 00028 (v01 PTLTD  $SBFTBL$ 0604  LTP 0001)
ACPI: SPCR d7e89fb0 00050 (v01 PTLTD  $UCRTBL$ 0604 PTL  0001)
ACPI: SSDT d7e822f7 013EC (v01  PmRefCpuPm 3000 INTL 20050228)
(7 early reservations) == bootmem [00 - 012800]
  #0 [00 - 001000]   BIOS data page == [00 - 001000]
  #1 [006000 - 008000]   TRAMPOLINE == [006000 - 008000]
  #2 [20 - 6e5778]TEXT DATA BSS == [20 - 6e5778]
  #3 [09dc00 - 10]BIOS reserved == [09dc00 - 10]
  #4 [6e6000 - 6e6174]  BRK == [6e6000 - 6e6174]
  #5 [01 - 014000]  PGTABLE == [01 - 014000]
  #6 [014000 - 015000]  PGTABLE == [014000 - 015000]
found SMP MP-table at [880f63c0] f63c0
Zone PFN ranges:
  DMA  0x0010 - 0x1000
  DMA320x1000 - 0x0010
  Normal   0x0010 - 0x00128000
Movable zone start PFN for each node
early_node_map[3] active PFN ranges
0: 0x0010 - 0x009d
0: 0x0100 - 0x000d7e80
0: 0x0010 - 0x00128000
ACPI: PM-Timer IO Port: 0x1008
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
ACPI: IOAPIC (id[0x04] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 4, version 0, address 0xfec0, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Using ACPI (MADT) for SMP configuration information
SMP: Allowing 4 CPUs, 0 hotplug CPUs
PM: Registered nosave memory: 0009d000 - 0009e000
PM: Registered nosave memory: 0009e000 - 000a
PM: Registered nosave memory: 000a - 000ca000
PM: Registered nosave memory: 000ca000 - 000cc000
PM: Registered nosave memory: 000cc000 - 000e4000
PM: Registered nosave memory: 000e4000 - 0010
PM: Registered nosave memory: d7e8 - d7e8a000
PM: Registered nosave memory: d7e8a000 - d7f0
PM: Registered nosave memory: d7f0 - d800
PM: Registered nosave memory: d800 - e000
PM: Registered nosave memory: e000 - 
