Re: [Gluster-users] Slow performance on samba with small files

2017-02-10 Thread Anoop C S
On Thu, 2017-02-09 at 16:18 +, Gary Lloyd wrote:
> Was just reading the small file section of the 3.9 release notes:
> 
> http://blog.gluster.org/2016/11/announcing-gluster-3-9/
> 
> Setting these options does seem to increase transfer speeds on small files
> by quite a lot:
>   # gluster volume set <VOLNAME> features.cache-invalidation on
>   # gluster volume set <VOLNAME> features.cache-invalidation-timeout 600
>   # gluster volume set <VOLNAME> performance.stat-prefetch on   # This one seemed to have the biggest impact on small-file performance for me
>   # gluster volume set <VOLNAME> performance.cache-invalidation on
>   # gluster volume set <VOLNAME> performance.md-cache-timeout 600
> 
> Setting
>   # gluster volume set <VOLNAME> performance.cache-samba-metadata on   # Only for SMB access
> results in my client repeatedly losing the state of the server: the shares
> often disappear or become inaccessible, and I can only get them back if I log
> off and back on to the machine. This is with the distro Samba 4.4.4.
> 

This needs to be analyzed further.
We would need more information: the volume configuration, the glusterfs client
logs from Samba (/var/log/samba/glusterfs-..log), and your smb.conf.

> Has anyone here had the same issue? Does the version of Samba need to be
> newer to support the feature?
> 

To my knowledge, the version of Samba should not be a problem here. At least, I
am not aware of any such issues with v4.4.6.


Re: [Gluster-users] Slow performance on samba with small files

2017-02-09 Thread Gary Lloyd
Was just reading the small file section of the 3.9 release notes:

http://blog.gluster.org/2016/11/announcing-gluster-3-9/

Setting these options does seem to increase transfer speeds on small files
by quite a lot:

  # gluster volume set <VOLNAME> features.cache-invalidation on
  # gluster volume set <VOLNAME> features.cache-invalidation-timeout 600
  # gluster volume set <VOLNAME> performance.stat-prefetch on   # This one seemed to have the biggest impact on small-file performance for me
  # gluster volume set <VOLNAME> performance.cache-invalidation on
  # gluster volume set <VOLNAME> performance.md-cache-timeout 600
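
For anyone who wants to script this, here is a minimal Python sketch that builds the same five commands for a given volume. The option names and values are the ones listed above; the volume name "myvol", the helper names and the dry-run default are assumptions for illustration only, and nothing is executed unless dry_run is set to False.

```python
import shlex
import subprocess
from typing import List

# Option names and values as quoted in the message above.
SMALL_FILE_OPTIONS = [
    ("features.cache-invalidation", "on"),
    ("features.cache-invalidation-timeout", "600"),
    ("performance.stat-prefetch", "on"),
    ("performance.cache-invalidation", "on"),
    ("performance.md-cache-timeout", "600"),
]


def build_commands(volume: str) -> List[List[str]]:
    """Return one 'gluster volume set' argument list per option."""
    return [
        ["gluster", "volume", "set", volume, key, value]
        for key, value in SMALL_FILE_OPTIONS
    ]


def apply_options(volume: str, dry_run: bool = True) -> None:
    """Print the commands, or run them when dry_run is False."""
    for cmd in build_commands(volume):
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "myvol" is a hypothetical volume name; substitute your own.
    apply_options("myvol")
```

Printing first and running later makes it easy to review exactly what would be changed on the volume.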


Setting

  # gluster volume set <VOLNAME> performance.cache-samba-metadata on   # Only for SMB access

results in my client repeatedly losing the state of the server: the shares
often disappear or become inaccessible, and I can only get them back if I log
off and back on to the machine. This is with the distro Samba 4.4.4.

Has anyone here had the same issue? Does the version of Samba need to be
newer to support the feature?

Thanks

*Gary Lloyd*

I.T. Systems:Keele University
Finance & IT Directorate
Keele:Staffs:IC1 Building:ST5 5NB:UK
+44 1782 733063



Re: [Gluster-users] Slow performance on samba with small files

2017-02-08 Thread Дмитрий Глушенок
For _every_ file copied, Samba performs readdir() to get all entries of the 
destination folder. The list is then searched for the filename (to prevent name 
collisions, as SMB shares are not case sensitive). The more files in the 
folder, the longer readdir() takes. It is a lot worse for Gluster, because the 
contents of a single folder are distributed among many servers, and Gluster has 
to join many directory listings (requested over the network) to form one and 
return it to the caller.

Rsync does not perform readdir(); it just checks for file existence with 
stat(), IIRC. And as modern Gluster versions are set by default to check for a 
file only at its destination (when the volume is balanced), the check is 
relatively fast.
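
The difference between the two lookup styles can be sketched roughly as follows. This is only an illustration in Python against a local filesystem, not Samba's or rsync's actual code, and the function names are made up. The first function has to read the whole listing to rule out a name that differs only in case, while the second asks about one exact name.

```python
import os
import tempfile


def exists_case_insensitive(folder: str, name: str) -> bool:
    """Scan the full directory listing for a case-insensitive match."""
    wanted = name.lower()
    # os.listdir() reads every entry, so the cost grows with folder size.
    return any(entry.lower() == wanted for entry in os.listdir(folder))


def exists_exact(folder: str, name: str) -> bool:
    """Check a single exact name with one stat()-style call."""
    return os.path.exists(os.path.join(folder, name))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        for i in range(1000):
            open(os.path.join(folder, f"file{i}.txt"), "w").close()
        print(exists_case_insensitive(folder, "FILE42.TXT"))
        print(exists_exact(folder, "file42.txt"))
```

On a distributed volume the listing in the first function would additionally have to be gathered from several servers, which is where the extra cost described above comes from.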

You can hack Samba to skip such checks if your goal is to make copying less 
slow (provided you are sure the files you are copying do not exist at the 
destination). But try running 'ls -l' on a _not_ cached folder with thousands 
of files: it will take tens of seconds. This is time your users will waste 
browsing shares.
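
To get a feel for this on your own share, a small timing sketch like the one below may help. It approximates what 'ls -l' does (list the folder, then stat each entry) and is a rough stand-in, not a benchmark tool. The function name is hypothetical, and the folder has to be uncached for the numbers to mean much.

```python
import os
import time
from typing import Tuple


def timed_long_listing(folder: str) -> Tuple[int, float]:
    """List a folder and stat every entry; return (entry count, seconds)."""
    start = time.perf_counter()
    count = 0
    with os.scandir(folder) as entries:
        for entry in entries:
            # follow_symlinks=False corresponds to the lstat() used by 'ls -l'.
            entry.stat(follow_symlinks=False)
            count += 1
    return count, time.perf_counter() - start


if __name__ == "__main__":
    # "." is a placeholder; point this at a folder on the mounted share.
    entries, seconds = timed_long_listing(".")
    print(f"{entries} entries listed and stat'ed in {seconds:.2f} s")
```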

--
Dmitry Glushenok
Jet Infosystems

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Slow performance on samba with small files

2017-02-08 Thread Gary Lloyd
Thanks for the reply

I've just done a bit more testing. If I use rsync from a gluster client to
copy the same files to the mount point it only takes a couple of minutes.
For some reason it's very slow on samba though (version 4.4.4).

I have tried various samba tweaks / settings and have yet to get acceptable
write speed on small files.


*Gary Lloyd*

I.T. Systems:Keele University
Finance & IT Directorate
Keele:Staffs:IC1 Building:ST5 5NB:UK
+44 1782 733063



Re: [Gluster-users] Slow performance on samba with small files

2017-02-08 Thread Дмитрий Глушенок
Hi,

There are a number of tweaks/hacks to make it better, but IMHO overall 
performance with small files is still unacceptable for folders with 
thousands of entries.

If your shares are not too large to be placed on a single filesystem and you 
still want to use Gluster, it is possible to run a VM on top of Gluster. Inside 
that VM you can create a ZFS/NTFS filesystem to be shared.

--
Dmitry Glushenok
Jet Infosystems


[Gluster-users] Slow performance on samba with small files

2017-02-08 Thread Gary Lloyd
Hi

I am currently testing Gluster 3.9 replicated/distributed on CentOS 7.3 with
Samba/CTDB.
I have been able to get it all up and running, but writing small files is
really slow.

If I copy large files from Gluster-backed Samba I get almost wire speed (we
only have 1Gb at the moment). I get around half that speed if I copy large
files to the Gluster-backed Samba system, which I am guessing is due to it
being replicated (this is acceptable).

Small-file write performance seems really poor for us though.
As an example, I have an Eclipse IDE workspace folder that is 6MB in size
and has around 6000 files in it. A lot of these files are <1k in size.

If I copy this up to Gluster-backed Samba it takes almost one hour to get
there.
With our basic Samba deployment it only takes about 5 minutes.

Both systems reside on the same disks/SAN.
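
Put per file, the numbers above work out roughly as follows. This is just back-of-the-envelope arithmetic in Python, taking "almost one hour" as 60 minutes and 6MB as 6 MiB, both of which are rounding assumptions.

```python
FILES = 6000
TOTAL_BYTES = 6 * 1024 * 1024          # "6MB", assumed to mean 6 MiB
GLUSTER_SAMBA_SECONDS = 60 * 60        # "almost one hour", rounded up
PLAIN_SAMBA_SECONDS = 5 * 60           # "about 5 minutes"

# Average time spent on each file in the two setups.
gluster_per_file = GLUSTER_SAMBA_SECONDS / FILES   # 0.6 s per file
plain_per_file = PLAIN_SAMBA_SECONDS / FILES       # 0.05 s per file

# How many times slower the Gluster-backed share is for this workload.
slowdown = GLUSTER_SAMBA_SECONDS / PLAIN_SAMBA_SECONDS   # 12x

# Average file size, to show how little data each round trip carries.
avg_file_bytes = TOTAL_BYTES / FILES

if __name__ == "__main__":
    print(f"Gluster-backed Samba: {gluster_per_file:.2f} s per file")
    print(f"Plain Samba:          {plain_per_file:.2f} s per file")
    print(f"Slowdown:             {slowdown:.0f}x")
    print(f"Average file size:    {avg_file_bytes:.0f} bytes")
```

With files of about 1 KB each, the time is dominated by per-file overhead rather than by data transfer.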


I was hoping that we would be able to move away from using a proprietary
SAN to house our network shares and use gluster instead.

Does anyone have any suggestions for anything I could tweak to make it
better?

Many Thanks


*Gary Lloyd*

I.T. Systems:Keele University
Finance & IT Directorate
Keele:Staffs:IC1 Building:ST5 5NB:UK
