Re: [Gluster-users] Direct I/O access performance with GLFS2 rc8

2009-04-23 Thread Anand Avati
It is unfair to expect high throughput for I/O with O_DIRECT on a
network filesystem. The fact that a loopback interface is in the picture
can bias expectations, but in reality, if you compare GlusterFS direct
I/O performance with the direct I/O performance of any other network
filesystem, you will find that the differences are not large. Any
trickery a filesystem might pull to improve direct I/O performance would
break O_DIRECT semantics.

Along those lines, there _is_ a trick you can do with GlusterFS. I
presume your application opens files with O_DIRECT to keep write data
from filling up the page cache; if all you need is a guarantee that the
data has been committed to disk, you can simply call fsync(). You can
then make a small change in write-behind.c to let it do background
writes even on files which are opened/created in O_DIRECT mode. This
will not eat up the page cache (it would not even if the file were
opened without O_DIRECT), and the O_DIRECT flag itself ensures that the
data hits the disk on the server side. The effect of write-behind is
only to pipeline write calls, and write-behind by default ensures that
all writes have reached the server side before close() on the file
descriptor returns (unless you turn on the flush-behind option).
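
For reference, here is a minimal sketch of how the write-behind translator
is usually stacked on top of a client volume in a GlusterFS 2.0 client
volfile. The volume names, host name and remote subvolume below are made-up
placeholders, and option names other than flush-behind may differ between
releases, so treat it as a starting point rather than a verified
configuration:

volume remote1
  type protocol/client
  option transport-type tcp
  # "storage1" and "brick1" are placeholder names for the server and its exported brick
  option remote-host storage1
  option remote-subvolume brick1
end-volume

volume writebehind
  type performance/write-behind
  # with flush-behind on, close() no longer waits for pending background writes
  option flush-behind on
  subvolumes remote1
end-volume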

Hope that might help.

Avati


 I did a simple tryout of GLFS 2.0rc8 on CentOS Linux with 2x dual-core Xeon
 and 4GB RAM.
 My benchmark is high-load, single-stream access on a loopback glfs mount
 on a single server with a high-performance FC RAID.
 The target volume is formatted with XFS. Local benchmark results are as
 follows (the benchmark tool is xdd).

 Buf I/O
 READ = about 660MB/s
 WRITE = about 480MB/s

 Direct I/O
 4MB block read = about 540MB/s
 4MB block write = about 350MB/s

 The results for GLFS loopback mount volume are as follows.

 Buf I/O
 READ = about 460MB/s
 WRITE = about 330MB/s

 Direct I/O
 4MB block read = about 160MB/s
 4MB block write = about 100MB/s

 Buffered I/O through GLFS gives good results with small block sizes,
 but access with large block sizes slows down.
 Direct I/O performance is poor regardless of block size.
 Please see the attached text file for detailed results.

 I want to use glfs with professional video applications on IB networks.
 These applications use storage for large uncompressed image sequences
 and/or large uncompressed movie files (up to 2K/4K resolution).
 Block size control and direct I/O performance are therefore important.
 Please advise me on options/configurations to improve performance,
 and on how block size affects GLFS performance.

 My best regards,

 hideo




[Gluster-users] Gluster on a Mail Server

2009-04-23 Thread Andrew Burkett
I have gluster mount two volumes on start-up of a mail server.  However,
every so often gluster claims ports 993 and 995, which are the default
ports for POP3S and IMAPS.  I can sort of work around it by starting dovecot
(my POP3S and IMAPS server) before gluster, so that it reserves the ports
before gluster can, but then dovecot doesn't start up properly (I think it's
because it starts too early in the boot process).  Is there any way to
blacklist ports that gluster claims locally, since it really shouldn't
attach itself to commonly used ports?

Thanks,
Andrew


Re: [Gluster-users] connection requirements

2009-04-23 Thread Harald Stürzebecher
Hello!

2009/4/21 randall rand...@ciparo.nl:
 dear all,

 second post ;)

 another question here: in most examples I noticed the Infiniband or 10
 GigE recommendation. Does this really do any good for the individual server
 connection?

IIRC, another recommendation is a RAID-6 array with 8-12 disks. I'd
expect 400-600MB/s on linear read with hardware like that. Since a 1GigE
link tops out at roughly 125MB/s of payload, it would clearly be a
bottleneck there, so 10 GigE or Infiniband might be a reasonable
recommendation.

IIRC, some posts on the mailing list point out that the performance of
GlusterFS (and most or all other distributed filesystems) is limited
by connection latency (e.g. 'replicate' has to ask each of the servers
if it has a newer version of a file and wait for the answer). On
100Mbit/s it takes longer to push a packet (of equal size) through the
wire than it takes at 1GBit/s or even 10GBit/s.

Maybe someone with access to a test setup with GigE, 10GigE and/or
Infiniband could benchmark this so that others might have a base line
of what to expect?

 another assumption on my side was that 1 individual server can never
 saturate a full gigabit link due to the disk throughput limitation, (maybe a
 few 100 MBps at most)

AFAIK, a modern disk (>= 1TB SATA) can deliver more than 100MB/s
locally on linear read:

# hdparm -t /dev/sdc

/dev/sdc:
 Timing buffered disk reads:  306 MB in  3.00 seconds = 101.84 MB/sec
# fdisk -l /dev/sdc

Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes


Even a server with just two of these disks in a 'dht' or 'unify'
configuration might be able to saturate a 1GigE network link under
specific conditions.
Some Solid State Disks are even rated at 250MB/s for read.
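
As a rough illustration of that two-disk case, a client-side volfile could
aggregate two bricks exported by one server with the 'distribute' (dht)
translator along these lines. The host and brick names are placeholders and
the exact option spellings may differ between GlusterFS releases, so this is
only a sketch, not a tested configuration:

volume disk1
  type protocol/client
  option transport-type tcp
  # "fileserver", "brick-a" and "brick-b" are placeholder names
  option remote-host fileserver
  option remote-subvolume brick-a
end-volume

volume disk2
  type protocol/client
  option transport-type tcp
  option remote-host fileserver
  option remote-subvolume brick-b
end-volume

volume distribute0
  type cluster/distribute
  subvolumes disk1 disk2
end-volume

Whether that actually saturates GigE still depends on the access pattern;
the linear-read numbers above only hold for sequential I/O.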

 so if every server has a 1 Gigabit connection to a switch which in turn has
 a 10 GigE uplink it would not be a bottleneck (as long as there are not too
 many servers sharing the 10 GigE uplink)

 correct?

I'm not sure about that, but after looking at prices I'd do a lot of
testing to confirm that the 1GigE network adapter really is the bottleneck
before buying 10GigE or Infiniband hardware. On a cost/value comparison,
GigE might win on small systems with only one or two disks.


Harald Stürzebecher



Re: [Gluster-users] connection requirements

2009-04-23 Thread Sean Davis
2009/4/23 Harald Stürzebecher hara...@cs.tu-berlin.de

 I'm not sure about that, but after looking at prices I'd do a lot of
 testing to confirm that the 1GigE network adapter really is the bottleneck
 before buying 10GigE or Infiniband hardware. On a cost/value comparison,
 GigE might win on small systems with only one or two disks.


And, while it may be a pain, I think it is possible to aggregate multiple
GigE connections.

Sean