Re: [Gluster-users] Very slow directory listing and high CPU usage on replicated volume

2012-11-06 Thread Ivan Dimitrov
That's a very good point. I've been evaluating glusterfs since version
1.0 and refused to use it for one reason only: the split-brain problem.
With version 3.3 I finally switched to glusterfs, but after a few months
of production use I'm thinking of going back to separate servers with
big RAIDs.


/home/freecloud# time echo * |wc -w
87926

real    16m42.242s
user    0m0.384s
sys     0m0.072s
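
A quick way to separate the raw readdir() cost from the per-entry stat()
cost is to compare an unsorted listing with a long listing (a rough sketch
on the same directory; I have not measured these here):

cd /home/freecloud
time ls -f . | wc -l       # readdir only: no sorting, no stat() per entry
time ls -l . > /dev/null   # readdir plus a stat()/lookup on every entry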

I just don't get it: until version 3.3, why would I need OpenStack,
QEMU support, etc., when one simple reboot could lose part of
my data?
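
At least 3.3 lets you ask the self-heal daemon what it thinks is broken
(a sketch; VOLUME is a placeholder for the volume name):

gluster volume heal VOLUME info              # entries currently pending heal
gluster volume heal VOLUME info split-brain  # entries flagged as split-brain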


On 11/6/12 11:35 AM, Fernando Frediani (Qube) wrote:

Joe,

I don't think we have to accept this, as it is not an acceptable thing. I have
seen countless people complaining about this problem for a while, and it seems no
improvements have been made.
The ramdisk idea, although it might help, looks more like chewing gum. I have
seen other distributed filesystems that don't suffer from the same problem,
so why does Gluster have to?




Re: [Gluster-users] rdma transport on 3.3?

2012-10-22 Thread Ivan Dimitrov
I have the same experience. All communication goes through ethernet. I
think the documentation should be changed to "NOT SUPPORTED AT ALL!",
because with my broken English I had figured that there was merely no
commercial support for rdma, but that the code was there.
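
One way to confirm where the traffic actually goes is to watch the
interface byte counters while a copy runs over the mount (interface names
here are only an example):

# read the counters, wait, read again; the counter that grows is the link in use
cat /sys/class/net/eth0/statistics/rx_bytes /sys/class/net/ib0/statistics/rx_bytes
sleep 10
cat /sys/class/net/eth0/statistics/rx_bytes /sys/class/net/ib0/statistics/rx_bytes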


On 10/19/12 9:00 PM, Bartek Krawczyk wrote:
The funny thing is that I was able to mount using transport rdma on 3.3.0,
but there wasn't any speed difference. I'm not sure whether there is any
difference in 3.3.1. Regards,




Re: [Gluster-users] rdma transport on 3.3?

2012-10-19 Thread Ivan Dimitrov
I have an existing volume configured to use GbE and just got two
Infiniband cards. How can I reconfigure the peers to use IPoIB?
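
One approach I have seen suggested (an assumption on my part, not something
confirmed in this thread): if the peers were probed by hostname, point those
hostnames at the IPoIB addresses on every node and restart glusterd.

# on every peer and client; hostnames and addresses are hypothetical
echo "192.168.100.1 peer1" >> /etc/hosts
echo "192.168.100.2 peer2" >> /etc/hosts
service glusterd restart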


On 10/19/12 2:48 PM, Bartek Krawczyk wrote:

Due to the lack of rdma support in 3.3.x we decided to stick with plain IPoIB.





Re: [Gluster-users] Gluster 3.3.0 on CentOS 6 - GigabitEthernet vs InfiniBand

2012-10-18 Thread Ivan Dimitrov


On 10/18/12 10:48 AM, Bartek Krawczyk wrote:

On 18 October 2012 08:44, Ling Ho  wrote:

If your volume is created with both tcp and rdma, my experience is rdma does
not work under 3.3.0 and it will always fall back to tcp.
I just converted from GbE to Infiniband and spent the entire last week
cursing about this. Please, devs: make sure we can set the transport
type between peers!





Re: [Gluster-users] Change transport type on volume from tcp to rdma

2012-10-11 Thread Ivan Dimitrov

http://community.gluster.org/q/how-to-change-transport-type-on-active-volume---glusterfs-3-3/



On 10/11/12 4:41 PM, John Mark Walker wrote:

Cool - can you add this to http://community.gluster.org/ ?

-JM



- Original Message -

What I did was:

gluster volume stop VOLUME
gluster volume delete VOLUME

On each peer on each brick I did:
setfattr -x trusted.glusterfs.volume-id /mnt/brick1
setfattr -x trusted.gfid /mnt/brick1
setfattr -x trusted.glusterfs.volume-id /mnt/brick2
setfattr -x trusted.gfid /mnt/brick2
rm -r /mnt/brick1/.glusterfs/
rm -r /mnt/brick2/.glusterfs/

gluster volume create VOLUME replica 2 transport rdma,tcp peer1:brick1
peer2:brick1 peer1:brick2 peer2:brick2

Now I was able to mount with -o transport=rdma where I have Infiniband
cards and -o transport=tcp where I have only ethernet

Best Regards
Ivan Dimitrov

On 10/10/12 4:59 PM, Ivan Dimitrov wrote:

So there is no manual way to change the transport right now?
I need the transport between peers to be rdma and the transport
between clients/peers to be tcp.

Regards
Ivan Dimitrov


On 10/10/12 3:57 PM, Amar Tumballi wrote:

On 10/10/2012 04:47 PM, Ivan Dimitrov wrote:

Hello

I have two peers setup and working with x2 bricks each. They have been
working via tcp for the last 4-5 months.
I just got two Infiniband cards and put them on the peers. I want to
change the transport type to rdma instead of tcp but I don't see an easy
way to do this.
Can you please help me with proper instructions.


Hi Ivan,

You are asking for a feature which just got merged upstream
(http://review.gluster.org/4008). It will make it into the 3.4.0 release;
until then the functionality you are asking for will not be available.

Regards,
Amar




Re: [Gluster-users] Change transport type on volume from tcp to rdma

2012-10-11 Thread Ivan Dimitrov

What I did was:

gluster volume stop VOLUME
gluster volume delete VOLUME

On each peer on each brick I did:
setfattr -x trusted.glusterfs.volume-id /mnt/brick1
setfattr -x trusted.gfid /mnt/brick1
setfattr -x trusted.glusterfs.volume-id /mnt/brick2
setfattr -x trusted.gfid /mnt/brick2
rm -r /mnt/brick1/.glusterfs/
rm -r /mnt/brick2/.glusterfs/

gluster volume create VOLUME replica 2 transport rdma,tcp peer1:brick1 
peer2:brick1 peer1:brick2 peer2:brick2


Now I was able to mount with -o transport=rdma where I have Infiniband 
cards and -o transport=tcp where I have only ethernet
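
Spelled out, the two mounts look roughly like this (hostname, volume name
and mount point are placeholders):

mount -t glusterfs -o transport=rdma peer1:/VOLUME /mnt/volume   # client with an Infiniband card
mount -t glusterfs -o transport=tcp  peer1:/VOLUME /mnt/volume   # ethernet-only client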


Best Regards
Ivan Dimitrov

On 10/10/12 4:59 PM, Ivan Dimitrov wrote:

So there is no manual way to change the transport right now?
I need the transport between peers to be rdma and the transport 
between clients/peers to be tcp.


Regards
Ivan Dimitrov


On 10/10/12 3:57 PM, Amar Tumballi wrote:

On 10/10/2012 04:47 PM, Ivan Dimitrov wrote:

Hello

I have two peers setup and working with x2 bricks each. They have been
working via tcp for the last 4-5 months.
I just got two Infiniband cards and put them on the peers. I want to
change the transport type to rdma instead of tcp but I don't see an easy
way to do this.
Can you please help me with proper instructions.



Hi Ivan,

You are asking for a feature which just got merged upstream
(http://review.gluster.org/4008). It will make it into the 3.4.0 release;
until then the functionality you are asking for will not be available.


Regards,
Amar





Re: [Gluster-users] Change transport type on volume from tcp to rdma

2012-10-10 Thread Ivan Dimitrov

So there is no manual way to change the transport right now?
I need the transport between peers to be rdma and the transport between 
clients/peers to be tcp.


Regards
Ivan Dimitrov


On 10/10/12 3:57 PM, Amar Tumballi wrote:

On 10/10/2012 04:47 PM, Ivan Dimitrov wrote:

Hello

I have two peers setup and working with x2 bricks each. They have been
working via tcp for the last 4-5 months.
I just got two Infiniband cards and put them on the peers. I want to
change the transport type to rdma instead of tcp but I don't see an easy
way to do this.
Can you please help me with proper instructions.



Hi Ivan,

You are asking for a feature which just got merged upstream
(http://review.gluster.org/4008). It will make it into the 3.4.0 release;
until then the functionality you are asking for will not be available.


Regards,
Amar





[Gluster-users] Change transport type on volume from tcp to rdma

2012-10-10 Thread Ivan Dimitrov

Hello

I have two peers setup and working with x2 bricks each. They have been 
working via tcp for the last 4-5 months.
I just got two Infiniband cards and put them on the peers. I want to
change the transport type to rdma instead of tcp but I don't see an easy 
way to do this.

Can you please help me with proper instructions.

Best Regards
Ivan Dimitrov


Re: [Gluster-users] Gluster and maildir

2012-10-03 Thread Ivan Dimitrov

I agree on the fewer bricks. Also see if you can use Infiniband.

Best Regards
Ivan

On 10/2/12 11:29 PM, Robert Hajime Lanning wrote:

On 10/02/12 13:01, ja...@combatyoga.net wrote:

Basically, I'm trying to figure out if Gluster will perform better with
more storage nodes in the storage block or if I would be better off
consolidating the storage to a few of the systems and freeing up the
resources for the email services on the remaining systems.  I've had
mixed results testing this in a KVM virtual environment, however it's
getting down to the time where I need to make some decisions on ordering
hardware.  I do know that RAID1 and RAID5 do not compare apples to
apples for performance; I'm looking for thoughts from the community as
to which way you would set it up.


With maildir, I believe that fewer bricks would perform better. The 
maildir format tends to be readdir() heavy.  Since Gluster does not 
have a master index of directory entries, it has to hit every brick in 
the volume.
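
A rough way to see that cost (paths are hypothetical, and with distribute a
single brick only holds part of the entries, so this is just an illustration):

/usr/bin/time -f "%e seconds" ls -f /mnt/gluster/user/Maildir/cur | wc -l    # through the mount
/usr/bin/time -f "%e seconds" ls -f /export/brick1/user/Maildir/cur | wc -l  # directly on one brick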






Re: [Gluster-users] Gluster speed sooo slow

2012-08-13 Thread Ivan Dimitrov
I run a low-traffic free hosting service and I converted some x,000 users to
glusterfs a few months ago. I'm not impressed at all and would probably
not convert any more users. It works OK for now, but that is with only 88GB
of a 2TB volume in use, so it's kind of pointless for now... :(
I'm researching a way to convert my paid hosting users, but I can't
find any system suitable for the job.


Fernando, what gluster structure are you talking about?

Best Regards
Ivan Dimitrov

On 8/13/12 2:16 PM, Fernando Frediani (Qube) wrote:

I heard from a large ISP (from talking to someone who works there) that they were
trying to use GlusterFS for Maildir and had a hell of a time because of the many
small files, with customers complaining all the time.
Latency is acceptable on a networked filesystem, but the results people are
reporting are beyond any latency problem; they are due to the way Gluster is
structured, and that was already confirmed by some people on this list, so changes
are indeed needed in the code. Even on a Gigabit network the round trip isn't that
much really (not more than a quarter of a ms), so it shouldn't be a big deal.
Yes, FUSE might also contribute to lower performance, but the real performance
problems are in the architecture of the filesystem.
One thing that is new to Gluster and that in my opinion could help performance is
Distributed-Striped volumes, but those don't yet work for all environments.
So as it stands: for multimedia or archive files, fine; for other usages I
wouldn't bet my chips and would rather test thoroughly first.
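
For reference, a distributed-striped volume is created by passing a stripe
count smaller than the brick count, something like this (hosts and bricks
are hypothetical):

gluster volume create stripevol stripe 2 transport tcp \
    server1:/export/brick1 server2:/export/brick1 \
    server1:/export/brick2 server2:/export/brick2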

-Original Message-
From: Brian Candler [mailto:b.cand...@pobox.com]
Sent: 13 August 2012 11:00
To: Fernando Frediani (Qube)
Cc: 'Ivan Dimitrov'; 'gluster-users@gluster.org'
Subject: Re: [Gluster-users] Gluster speed sooo slow

On Mon, Aug 13, 2012 at 09:40:49AM +, Fernando Frediani (Qube) wrote:

I think Gluster as it stands now and current level of development is
more for Multimedia and Archival files, not for small files nor for
running Virtual Machines. It requires still a fair amount of
development which hopefully RedHat will put in place.

I know a large ISP is using gluster successfully for Maildir storage - or at 
least was a couple of years ago when I last spoke to them about it - which 
means very large numbers of small files.

I think you need to be clear on the difference between throughput and latency.

Any networked filesystem is going to have latency, and gluster maybe suffers 
more than most because of the FUSE layer at the client.  This will show as poor 
throughput if a single client is sequentially reading or writing lots of small 
files, because it has to wait a round trip for each request.

However, if you have multiple clients accessing at the same time, you can still have high 
total throughput.  This is because the "wasted" time between requests from one 
client is used to service other clients.

If gluster were to do aggressive client-side caching then it might be able to 
make responses appear faster to a single client, but this would be at the risk 
of data loss (e.g.  responding that a file has been committed to disk, when in 
fact it hasn't).  But this would make no difference to total throughput with 
multiple clients, which depends on the available bandwidth into the disk drives 
and across the network.

So it all depends on your overall usage pattern. Only make your judgement based 
on a single-threaded benchmark if that's what your usage pattern is really 
going to be like: i.e.  are you really going to have a single user accessing 
the filesystem, and their application reads or writes one file after the other 
rather than multiple files concurrently.
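
To make that concrete, a single client can approximate the multi-client case
by splitting the copy into several concurrent streams (a sketch reusing the
paths from earlier in the thread):

cd /root/speedtest/random-files
# 8 parallel cp invocations, 500 files per invocation
find . -maxdepth 1 -type f -print0 | xargs -0 -n 500 -P 8 cp -t /home/gltvolume/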

Regards,

Brian.





Re: [Gluster-users] Gluster speed sooo slow

2012-08-13 Thread Ivan Dimitrov
There is a big difference between working with small files (around 16kb)
and big files (2mb). Performance is much better with big files, which is
too bad for me ;(


On 8/11/12 2:15 AM, Gandalf Corvotempesta wrote:

What do you mean by "small files"? 16k? 160k? 16MB?
Do you know any workaround or any other software for this?

Me too, I'm trying to create clustered storage for many
small files.

2012/8/10 Philip Poten <philip.po...@gmail.com>


Hi Ivan,

that's because Gluster has really bad "many small files" performance
due to its architecture.

On all stat() calls (which rsync is doing plenty of), all replicas are
being checked for integrity.

regards,
Philip
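
To see how stat()-heavy such a run is, you can let strace summarise the
syscalls of a small rsync (paths reused from the message below; the summary
is printed on stderr):

strace -c -f rsync -a /root/speedtest/random-files/ /home/gltvolume/ 2>&1 | tail -n 40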

2012/8/10 Ivan Dimitrov <dob...@amln.net>:
> So I stopped a node to check the BIOS and after it went up, the rebalance
> kicked in. I was looking for those kind of speeds on a normal write. The
> rebalance is much faster than my rsync/cp.
>
> https://dl.dropbox.com/u/282332/Screen%20Shot%202012-08-10%20at%202.04.09%20PM.png
>
> Best Regards
> Ivan Dimitrov
>
>
> On 8/10/12 1:23 PM, Ivan Dimitrov wrote:
>>
>> Hello
>> What am I doing wrong?!?
>>
>> I have a test setup with 4 identical servers with 2 disks each in
>> distribute-replicate 2. All servers are connected to a GB switch.
>>
>> I am experiencing really slow speeds at anything I do. Slow write, slow
>> read, not to mention random write/reads.
>>
>> Here is an example:
>> random-files is a directory with 32768 files with average size 16kb.
>> [root@gltclient]:~# rsync -a /root/speedtest/random-files/
>> /home/gltvolume/
>> ^^ This will take more than 3 hours.
>>
>> On any of the servers if I do "iostat" the disks are not loaded at all:
>>
>> https://dl.dropbox.com/u/282332/Screen%20Shot%202012-08-10%20at%201.08.54%20PM.png
>>
>> This is similar result for all servers.
>>
>> Here is an example of simple "ls" command on the content.
>> [root@gltclient]:~# unalias ls
>> [root@gltclient]:~# /usr/bin/time -f "%e seconds" ls /home/gltvolume/ | wc -l
>> 2.81 seconds
>> 5393
>>
>> almost 3 seconds to display 5000 files?!?! When they are 32,000, the ls
>> will take around 35-45 seconds.
>>
>> This directory is on local disk:
>> [root@gltclient]:~# /usr/bin/time -f "%e seconds" ls
>> /root/speedtest/random-files/ | wc -l
>> 1.45 seconds
>> 32768
>>
>> [root@gltclient]:~# /usr/bin/time -f "%e seconds" cat /home/gltvolume/* >/dev/null
>> 190.50 seconds
>>
>> [root@gltclient]:~# /usr/bin/time -f "%e seconds" du -sh /home/gltvolume/
>> 126M    /home/gltvolume/
>> 75.23 seconds
>>
>>
>> Here is the volume information.
>>
>> [root@glt1]:~# gluster volume info
>>
>> Volume Name: gltvolume
>> Type: Distributed-Replicate
>> Volume ID: 16edd852-8d23-41da-924d-710b753bb374
>> Status: Started
>> Number of Bricks: 4 x 2 = 8
>> Transport-type: tcp
>> Bricks:
>> Brick1: 1.1.74.246:/home/sda3
>> Brick2: glt2.network.net:/home/sda3
>> Brick3: 1.1.74.246:/home/sdb1
>> Brick4: glt2.network.net:/home/sdb1
>> Brick5: glt3.network.net:/home/sda3
>> Brick6: gltclient.network.net:/home/sda3
>> Brick7: glt3.network.net:/home/sdb1
>> Brick8: gltclient.network.net:/home/sdb1
>> Options Reconfigured:
>> performance.io-thread-count: 32
>> performance.cache-size: 256MB
>> cluster.self-heal-daemon: on
>>
>>
>> [root@glt1]:~# gluster volume status all detail
>> Status of volume: gltvolume
>>
>>

--
>> Brick: Brick 1.1.74.246:/home/sda3
>> Port : 24009
>> Online   : Y
>> Pid  : 1479
>> File System  : ext4
>> Device   : /dev/sda3
>> Mount Options: rw,noatime
>> Inode Size   : 256
>> Disk Space Free  : 179.3GB
>> Total Disk Spac

Re: [Gluster-users] Gluster speed sooo slow

2012-08-10 Thread Ivan Dimitrov
So I stopped a node to check the BIOS and after it went up, the 
rebalance kicked in. I was looking for those kind of speeds on a normal 
write. The rebalance is much faster than my rsync/cp.


https://dl.dropbox.com/u/282332/Screen%20Shot%202012-08-10%20at%202.04.09%20PM.png

Best Regards
Ivan Dimitrov

On 8/10/12 1:23 PM, Ivan Dimitrov wrote:

Hello
What am I doing wrong?!?

I have a test setup with 4 identical servers with 2 disks each in 
distribute-replicate 2. All servers are connected to a GB switch.


I am experiencing really slow speeds at anything I do. Slow write, 
slow read, not to mention random write/reads.


Here is an example:
random-files is a directory with 32768 files with average size 16kb.
[root@gltclient]:~# rsync -a /root/speedtest/random-files/ 
/home/gltvolume/

^^ This will take more than 3 hours.

On any of the servers if I do "iostat" the disks are not loaded at all:
https://dl.dropbox.com/u/282332/Screen%20Shot%202012-08-10%20at%201.08.54%20PM.png 



This is similar result for all servers.

Here is an example of simple "ls" command on the content.
[root@gltclient]:~# unalias ls
[root@gltclient]:~# /usr/bin/time -f "%e seconds" ls /home/gltvolume/ 
| wc -l

2.81 seconds
5393

almost 3 seconds to display 5000 files?!?! When they are 32,000, the 
ls will take around 35-45 seconds.


This directory is on local disk:
[root@gltclient]:~# /usr/bin/time -f "%e seconds" ls 
/root/speedtest/random-files/ | wc -l

1.45 seconds
32768

[root@gltclient]:~# /usr/bin/time -f "%e seconds" cat 
/home/gltvolume/* >/dev/null

190.50 seconds

[root@gltclient]:~# /usr/bin/time -f "%e seconds" du -sh /home/gltvolume/
126M    /home/gltvolume/
75.23 seconds


Here is the volume information.

[root@glt1]:~# gluster volume info

Volume Name: gltvolume
Type: Distributed-Replicate
Volume ID: 16edd852-8d23-41da-924d-710b753bb374
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: 1.1.74.246:/home/sda3
Brick2: glt2.network.net:/home/sda3
Brick3: 1.1.74.246:/home/sdb1
Brick4: glt2.network.net:/home/sdb1
Brick5: glt3.network.net:/home/sda3
Brick6: gltclient.network.net:/home/sda3
Brick7: glt3.network.net:/home/sdb1
Brick8: gltclient.network.net:/home/sdb1
Options Reconfigured:
performance.io-thread-count: 32
performance.cache-size: 256MB
cluster.self-heal-daemon: on


[root@glt1]:~# gluster volume status all detail
Status of volume: gltvolume
-- 


Brick: Brick 1.1.74.246:/home/sda3
Port : 24009
Online   : Y
Pid  : 1479
File System  : ext4
Device   : /dev/sda3
Mount Options: rw,noatime
Inode Size   : 256
Disk Space Free  : 179.3GB
Total Disk Space : 179.7GB
Inode Count  : 11968512
Free Inodes  : 11901550
-- 


Brick: Brick glt2.network.net:/home/sda3
Port : 24009
Online   : Y
Pid  : 1589
File System  : ext4
Device   : /dev/sda3
Mount Options: rw,noatime
Inode Size   : 256
Disk Space Free  : 179.3GB
Total Disk Space : 179.7GB
Inode Count  : 11968512
Free Inodes  : 11901550
-- 


Brick: Brick 1.1.74.246:/home/sdb1
Port : 24010
Online   : Y
Pid  : 1485
File System  : ext4
Device   : /dev/sdb1
Mount Options: rw,noatime
Inode Size   : 256
Disk Space Free  : 228.8GB
Total Disk Space : 229.2GB
Inode Count  : 15269888
Free Inodes  : 15202933
-- 


Brick: Brick glt2.network.net:/home/sdb1
Port : 24010
Online   : Y
Pid  : 1595
File System  : ext4
Device   : /dev/sdb1
Mount Options: rw,noatime
Inode Size   : 256
Disk Space Free  : 228.8GB
Total Disk Space : 229.2GB
Inode Count  : 15269888
Free Inodes  : 15202933
-- 


Brick: Brick glt3.network.net:/home/sda3
Port : 24009
Online   : Y
Pid  : 28963
File System  : ext4
Device   : /dev/sda3
Mount Options: rw,noatime
Inode Size   : 256
Disk Space Free  : 179.3GB
Total Disk Space : 179.7GB
Inode Count  : 11968512
Free Inodes  : 11906058
-- 


Brick: Brick gltclient.network.net:/home/sda3
Port : 24009
Online   

[Gluster-users] Gluster speed sooo slow

2012-08-10 Thread Ivan Dimitrov

Hello
What am I doing wrong?!?

I have a test setup with 4 identical servers with 2 disks each in 
distribute-replicate 2. All servers are connected to a GB switch.


I am experiencing really slow speeds at anything I do. Slow write, slow 
read, not to mention random write/reads.


Here is an example:
random-files is a directory with 32768 files with average size 16kb.
[root@gltclient]:~# rsync -a /root/speedtest/random-files/ /home/gltvolume/
^^ This will take more than 3 hours.

On any of the servers if I do "iostat" the disks are not loaded at all:
https://dl.dropbox.com/u/282332/Screen%20Shot%202012-08-10%20at%201.08.54%20PM.png

This is similar result for all servers.

Here is an example of simple "ls" command on the content.
[root@gltclient]:~# unalias ls
[root@gltclient]:~# /usr/bin/time -f "%e seconds" ls /home/gltvolume/ | 
wc -l

2.81 seconds
5393

almost 3 seconds to display 5000 files?!?! When they are 32,000, the ls 
will take around 35-45 seconds.


This directory is on local disk:
[root@gltclient]:~# /usr/bin/time -f "%e seconds" ls 
/root/speedtest/random-files/ | wc -l

1.45 seconds
32768

[root@gltclient]:~# /usr/bin/time -f "%e seconds" cat /home/gltvolume/* 
>/dev/null

190.50 seconds

[root@gltclient]:~# /usr/bin/time -f "%e seconds" du -sh /home/gltvolume/
126M    /home/gltvolume/
75.23 seconds


Here is the volume information.

[root@glt1]:~# gluster volume info

Volume Name: gltvolume
Type: Distributed-Replicate
Volume ID: 16edd852-8d23-41da-924d-710b753bb374
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: 1.1.74.246:/home/sda3
Brick2: glt2.network.net:/home/sda3
Brick3: 1.1.74.246:/home/sdb1
Brick4: glt2.network.net:/home/sdb1
Brick5: glt3.network.net:/home/sda3
Brick6: gltclient.network.net:/home/sda3
Brick7: glt3.network.net:/home/sdb1
Brick8: gltclient.network.net:/home/sdb1
Options Reconfigured:
performance.io-thread-count: 32
performance.cache-size: 256MB
cluster.self-heal-daemon: on


[root@glt1]:~# gluster volume status all detail
Status of volume: gltvolume
--
Brick: Brick 1.1.74.246:/home/sda3
Port : 24009
Online   : Y
Pid  : 1479
File System  : ext4
Device   : /dev/sda3
Mount Options: rw,noatime
Inode Size   : 256
Disk Space Free  : 179.3GB
Total Disk Space : 179.7GB
Inode Count  : 11968512
Free Inodes  : 11901550
--
Brick: Brick glt2.network.net:/home/sda3
Port : 24009
Online   : Y
Pid  : 1589
File System  : ext4
Device   : /dev/sda3
Mount Options: rw,noatime
Inode Size   : 256
Disk Space Free  : 179.3GB
Total Disk Space : 179.7GB
Inode Count  : 11968512
Free Inodes  : 11901550
--
Brick: Brick 1.1.74.246:/home/sdb1
Port : 24010
Online   : Y
Pid  : 1485
File System  : ext4
Device   : /dev/sdb1
Mount Options: rw,noatime
Inode Size   : 256
Disk Space Free  : 228.8GB
Total Disk Space : 229.2GB
Inode Count  : 15269888
Free Inodes  : 15202933
--
Brick: Brick glt2.network.net:/home/sdb1
Port : 24010
Online   : Y
Pid  : 1595
File System  : ext4
Device   : /dev/sdb1
Mount Options: rw,noatime
Inode Size   : 256
Disk Space Free  : 228.8GB
Total Disk Space : 229.2GB
Inode Count  : 15269888
Free Inodes  : 15202933
--
Brick: Brick glt3.network.net:/home/sda3
Port : 24009
Online   : Y
Pid  : 28963
File System  : ext4
Device   : /dev/sda3
Mount Options: rw,noatime
Inode Size   : 256
Disk Space Free  : 179.3GB
Total Disk Space : 179.7GB
Inode Count  : 11968512
Free Inodes  : 11906058
--
Brick: Brick gltclient.network.net:/home/sda3
Port : 24009
Online   : Y
Pid  : 3145
File System  : ext4
Device   : /dev/sda3
Mount Options: rw,noatime
Inode Size   : 256
Disk Space Free  : 179.3GB
Total Disk Space : 179.7GB
Inode Count  : 11968512
Free Inodes  : 11906058
--
Brick: Brick glt3.network.net:/home/sdb1
Port   

[Gluster-users] Transport endpoint is not connected

2012-07-12 Thread Ivan Dimitrov

Hi group,
I've been in production with gluster for the last 2 weeks, with no problems
until today.
As of today I'm getting the "Transport endpoint is not connected" problem
on the client, maybe once every hour.

df: `/services/users/6': Transport endpoint is not connected
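
Not a fix for the underlying crash, but a lazy unmount plus a fresh mount
should bring the mount point back until it happens again (same paths as
further down in this mail):

umount -l /services/users/6
mount -t glusterfs rscloud1.domain.net:/freecloud /services/users/6/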

Here is my setup:
I have 1 Client and 2 Servers with 2 Disks each for bricks. Glusterfs 
3.3 compiled from source.


# gluster volume info

Volume Name: freecloud
Type: Distributed-Replicate
Volume ID: 1cf4804f-12aa-4cd1-a892-cec69fc2cf22
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: XX.25.137.252:/mnt/35be42b4-afb3-48a2-8b3c-17a422fd1e15
Brick2: YY.40.3.216:/mnt/7ee4f117-8aee-4cae-b08c-5e441b703886
Brick3: XX.25.137.252:/mnt/9ee7c816-085d-4c5c-9276-fd3dadac6c72
Brick4: YY.40.3.216:/mnt/311399bc-4d55-445d-8480-286c56cf493e
Options Reconfigured:
cluster.self-heal-daemon: on
performance.cache-size: 256MB
performance.io-thread-count: 32
features.quota: on

Quota is ON but not used
-

# gluster volume status all detail
Status of volume: freecloud
--
Brick: Brick 
XX.25.137.252:/mnt/35be42b4-afb3-48a2-8b3c-17a422fd1e15

Port : 24009
Online   : Y
Pid  : 29221
File System  : xfs
Device   : /dev/sdd1
Mount Options: rw
Inode Size   : 256
Disk Space Free  : 659.7GB
Total Disk Space : 698.3GB
Inode Count  : 732571968
Free Inodes  : 730418928
--
Brick: Brick 
YY.40.3.216:/mnt/7ee4f117-8aee-4cae-b08c-5e441b703886

Port : 24009
Online   : Y
Pid  : 15496
File System  : xfs
Device   : /dev/sdc1
Mount Options: rw
Inode Size   : 256
Disk Space Free  : 659.7GB
Total Disk Space : 698.3GB
Inode Count  : 732571968
Free Inodes  : 730410396
--
Brick: Brick 
XX.25.137.252:/mnt/9ee7c816-085d-4c5c-9276-fd3dadac6c72

Port : 24010
Online   : Y
Pid  : 29227
File System  : xfs
Device   : /dev/sdc1
Mount Options: rw
Inode Size   : 256
Disk Space Free  : 659.9GB
Total Disk Space : 698.3GB
Inode Count  : 732571968
Free Inodes  : 730417864
--
Brick: Brick 
YY.40.3.216:/mnt/311399bc-4d55-445d-8480-286c56cf493e

Port : 24010
Online   : Y
Pid  : 15502
File System  : xfs
Device   : /dev/sdb1
Mount Options: rw
Inode Size   : 256
Disk Space Free  : 659.9GB
Total Disk Space : 698.3GB
Inode Count  : 732571968
Free Inodes  : 730409337


On server1 I mount the volume and start copying files to it; server1 is
used as storage.


209.25.137.252:freecloud  1.4T   78G  1.3T   6% 
/home/freecloud


One thing to mention is that I have a large list of subdirectories in 
the main directory and the list keeps getting bigger.

client1# ls | wc -l
42424

---
I have one client server that mounts glusterfs and uses the files 
directly as the files are for low traffic web sites. On the client, 
there is no gluster daemon, just the mount.


client1# mount -t glusterfs rscloud1.domain.net:/freecloud 
/services/users/6/


This all worked fine for the last 2-3 weeks. Here is a log from the 
crash client1:/var/log/glusterfs/services-users-6-.log


pending frames:
frame : type(1) op(RENAME)
frame : type(1) op(RENAME)
frame : type(1) op(RENAME)
frame : type(1) op(RENAME)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2012-07-12 14:51:01
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.3.0
/lib/x86_64-linux-gnu/libc.so.6(+0x32480)[0x7f1e0e9f0480]
/services/glusterfs//lib/libglusterfs.so.0(uuid_unpack+0x0)[0x7f1e0f79d760]
/services/glusterfs//lib/libglusterfs.so.0(+0x4c526)[0x7f1e0f79d526]
/services/glusterfs//lib/libglusterfs.so.0(uuid_utoa+0x26)[0x7f1e0f77ca66]
/services/glusterfs//lib/glusterfs/3.3.0/xlator/features/quota.so(quota_rename_cbk+0x308)[0x7f1e09b940c8]
/services/glusterfs//lib/glusterfs/3.3.0/xlator/cluster/distribute.so(dht_rename_unlink_cbk+0x454)[0x7f1e09dad264]
/services/glusterfs//lib/glusterfs/3.3.0/xlator/cluster/replicate.so(afr_unlink_unwind+0xf7)[0x7f1e09ff23c7]
/services/glusterfs//lib/glusterfs/3.3.0/xlator/cluster/replicate.so(afr_unlink_wind_cbk+0xb6)[0x7f1e09ff43d6]
/services/glusterfs//lib/glusterfs/3.3.0/xlator/protocol/cli