Re: [Gluster-users] URGENT: Update issues from 3.6.6 to 3.10.2 Accessing files via samba come up with permission denied

2017-06-02 Thread Raghavendra Talur
On 03-Jun-2017 3:27 AM, "Diego Remolina"  wrote:

Hi everyone,

Is there anything else we could do to check on this problem and try to
fix it? The issue is definitely related to either the samba vfs
gluster plugin or gluster itself. I am not sure how to pin it down
further.


I don't think it is the vfs plugin, because you haven't updated the samba
packages and the vfs plugin is the same as before.

It is either gfapi in Gluster or one of the lower xlators.

Changing to fuse based shares might make the problem go away and can be
used as a workaround.



I went ahead and created a new share in the samba server which is on a
local filesystem where the OS is installed, not part of gluster:

# mount | grep home
]# ls -ld /home
drwxr-xr-x. 14 root root 4096 Jun  2 17:21 /home

This shows it is not mounted from anywhere, so I created a folder
and shared it:

]# ls -ld /home/rvtsharetest/
drwxrwx---. 3 dijuremo Staff 95 Jun  2 17:31 /home/rvtsharetest/

]# tail -7 /etc/samba/smb.conf
[rvtsharetest]
   path=/home/rvtsharetest
   browseable = yes
   write list = @Staff,dijuremo
   guest ok = yes
   create mask = 664
   directory mask = 775

When accessing Revit files in this new share the problem is *not* observed.

When accessing Revit files on any of the samba shares that use the vfs
gluster plugin and are stored in the gluster volume, we see the
problems.

Further analysis of the issue is even more disconcerting. As you may
remember, I found a workaround: renaming the file and then renaming it
back to the original, after which things work. Here is where it gets
more interesting: this seems to be workstation dependent, not user
dependent.

1. User1 logs into PC1 and tries to access the file and gets the error
message from Revit "ACCESS DENIED".

2. User1 uses Windows Explorer to go to the file location (we tried
doing this from the server itself on the command line and it did not
change anything, i.e. su - User1 and then the mv command). User1
renames the file, Revit.rvt -> Revit2.rvt, and clicks away for the
rename to take place. User1 then immediately renames the file back to
the original, Revit2.rvt -> Revit.rvt.

3. User1 opens the file and everything works properly.

4. User2 logs into PC1 and tries to open the file. The file opens and
works properly.

5. User2 logs into PC2 and tries to open the file, the problem comes
up. User2 uses the rename trick and this fixes the problem on PC2.
Even if User1 now comes to use PC2, User1 will have no problems with
the file.

6. User1 now goes to PC3, where nobody has used the file rename trick,
and experiences the problem. The only solution is to play the rename
trick again on PC3.

So it seems you have to play the rename trick at least once per
workstation, and that "fixes" the issue for any user who logs into that
workstation.

What other suggestions do you have? What debugging can I do next? I am
planning, once everyone leaves the office today, to change the share
to bypass the vfs gluster plugin and access the files directly from the
fuse mount, i.e.:

# mount | grep export
10.0.1.7:/export on /export type fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,allow_other,max_read=131072)

Then set samba to share without using the vfs gluster plugin as follows:

[Projects]
   path = /export/projects
   browseable = yes
   write list = @Staff,root,@Admin,@Managers
   writeable = yes
   guest ok = no
   create mask = 660
   directory mask = 770

This test will determine if the issue is the samba vfs gluster plugin
or if it is the fact that the file is stored in the gluster volume.
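
For comparison, a vfs_glusterfs-backed share of the same data would
normally look something like the sketch below. The share name, path and
log file location here are illustrative, not taken from the actual
smb.conf; the glusterfs: parameters are the standard vfs_glusterfs ones
and should be checked against the installed Samba version:

[ProjectsVFS]
   path = /projects
   vfs objects = glusterfs
   glusterfs:volume = export
   glusterfs:logfile = /var/log/samba/glusterfs-export.%M.log
   glusterfs:loglevel = 7
   kernel share modes = no
   browseable = yes
   write list = @Staff,root,@Admin,@Managers
   writeable = yes
   guest ok = no
   create mask = 660
   directory mask = 770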

Any other thoughts?


I suspect this has to do with locking and such.
Can you create a new volume and share it through the vfs plugin?
Set the samba log level to 5 and, with gluster volume set, set the
client-log-level and server-log-level to TRACE.
Copy one rvt file.
Perform an open; this will likely fail.
Perform the rename.
Perform an open again; this should pass.

Send us the logs.

I know it is troublesome, but detailed logs are the only thing that would
help us analyze the issue.
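
As a concrete sketch of the above (the volume and share names here are
placeholders, and the option names should be verified with 'gluster
volume set help' on your version):

# raise the gluster client and brick log levels for the test volume
gluster volume set rvttest diagnostics.client-log-level TRACE
gluster volume set rvttest diagnostics.brick-log-level TRACE

# in smb.conf, raise the samba log level for the test share
[rvtvfstest]
   log level = 5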

Talur


Diego

On Wed, May 31, 2017 at 12:59 PM, Diego Remolina  wrote:
> The vfs gluster plugin for samba is linked against libglusterfs.so.0
> which is provided by glusterfs-libs-3.10.2-1.el7.x86_64, please see
> below.
>
> # ldd /usr/lib64/samba/vfs/glusterfs.so | grep glus
>libglusterfs.so.0 => /lib64/libglusterfs.so.0 (0x7f61da858000)
>
> ]# yum provides /lib64/libglusterfs.so.0
> Loaded plugins: fastestmirror
> Loading mirror speeds from cached hostfile
> * base: centos.vwtonline.net
> * extras: mirror.cs.vt.edu
> * updates: centosv.centos.org
> glusterfs-libs-3.10.2-1.el7.x86_64 : GlusterFS common libraries
> Repo: @centos-gluster310
> Matched from:
> Filename: /lib64/libglusterfs.so.0
>
>
>
> On Wed, May 31, 2017 at 12:39 PM, Diego Remolina  wrote:
>> Samba is running in the same machine as glusterd. The machines were
>> rebooted after the upgrades and samba has been restarted a few times.
>>
>> # rpm -qa | grep gluster
>> 

Re: [Gluster-users] URGENT: Update issues from 3.6.6 to 3.10.2 Accessing files via samba come up with permission denied

2017-06-02 Thread Diego Remolina
Hi everyone,

Is there anything else we could do to check on this problem and try to
fix it? The issue is definitely related to either the samba vfs
gluster plugin or gluster itself. I am not sure how to pin it down
further.

I went ahead and created a new share in the samba server which is on a
local filesystem where the OS is installed, not part of gluster:

# mount | grep home
]# ls -ld /home
drwxr-xr-x. 14 root root 4096 Jun  2 17:21 /home

This shows it is not mounted from anywhere, so I created a folder
and shared it:

]# ls -ld /home/rvtsharetest/
drwxrwx---. 3 dijuremo Staff 95 Jun  2 17:31 /home/rvtsharetest/

]# tail -7 /etc/samba/smb.conf
[rvtsharetest]
   path=/home/rvtsharetest
   browseable = yes
   write list = @Staff,dijuremo
   guest ok = yes
   create mask = 664
   directory mask = 775

When accessing Revit files in this new share the problem is *not* observed.

When accessing Revit files on any of the samba shares that use the vfs
gluster plugin and are stored in the gluster volume, we see the
problems.

Further analysis of the issue is even more disconcerting. As you may
remember, I found a workaround: renaming the file and then renaming it
back to the original, after which things work. Here is where it gets
more interesting: this seems to be workstation dependent, not user
dependent.

1. User1 logs into PC1 and tries to access the file and gets the error
message from Revit "ACCESS DENIED".

2. User1 uses Windows Explorer to go to the file location (we tried
doing this from the server itself on the command line and it did not
change anything, i.e. su - User1 and then the mv command). User1
renames the file, Revit.rvt -> Revit2.rvt, and clicks away for the
rename to take place. User1 then immediately renames the file back to
the original, Revit2.rvt -> Revit.rvt.

3. User1 opens the file and everything works properly.

4. User2 logs into PC1 and tries to open the file. The file opens and
works properly.

5. User2 logs into PC2 and tries to open the file, the problem comes
up. User2 uses the rename trick and this fixes the problem on PC2.
Even if User1 now comes to use PC2, User1 will have no problems with
the file.

6. User1 now goes to PC3, where nobody has used the file rename trick,
and experiences the problem. The only solution is to play the rename
trick again on PC3.

So it seems you have to play the rename trick at least once per
workstation, and that "fixes" the issue for any user who logs into that
workstation.

What other suggestions do you have? What debugging can I do next? I am
planning, once everyone leaves the office today, to change the share
to bypass the vfs gluster plugin and access the files directly from the
fuse mount, i.e.:

# mount | grep export
10.0.1.7:/export on /export type fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,allow_other,max_read=131072)

Then set samba to share without using the vfs gluster plugin as follows:

[Projects]
   path = /export/projects
   browseable = yes
   write list = @Staff,root,@Admin,@Managers
   writeable = yes
   guest ok = no
   create mask = 660
   directory mask = 770

This test will determine if the issue is the samba vfs gluster plugin
or if it is the fact that the file is stored in the gluster volume.

Any other thoughts?

Diego

On Wed, May 31, 2017 at 12:59 PM, Diego Remolina  wrote:
> The vfs gluster plugin for samba is linked against libglusterfs.so.0
> which is provided by glusterfs-libs-3.10.2-1.el7.x86_64, please see
> below.
>
> # ldd /usr/lib64/samba/vfs/glusterfs.so | grep glus
>libglusterfs.so.0 => /lib64/libglusterfs.so.0 (0x7f61da858000)
>
> ]# yum provides /lib64/libglusterfs.so.0
> Loaded plugins: fastestmirror
> Loading mirror speeds from cached hostfile
> * base: centos.vwtonline.net
> * extras: mirror.cs.vt.edu
> * updates: centosv.centos.org
> glusterfs-libs-3.10.2-1.el7.x86_64 : GlusterFS common libraries
> Repo: @centos-gluster310
> Matched from:
> Filename: /lib64/libglusterfs.so.0
>
>
>
> On Wed, May 31, 2017 at 12:39 PM, Diego Remolina  wrote:
>> Samba is running in the same machine as glusterd. The machines were
>> rebooted after the upgrades and samba has been restarted a few times.
>>
>> # rpm -qa | grep gluster
>> glusterfs-client-xlators-3.10.2-1.el7.x86_64
>> glusterfs-server-3.10.2-1.el7.x86_64
>> glusterfs-api-3.10.2-1.el7.x86_64
>> glusterfs-3.10.2-1.el7.x86_64
>> glusterfs-cli-3.10.2-1.el7.x86_64
>> centos-release-gluster310-1.0-1.el7.centos.noarch
>> samba-vfs-glusterfs-4.4.4-14.el7_3.x86_64
>> glusterfs-fuse-3.10.2-1.el7.x86_64
>> glusterfs-libs-3.10.2-1.el7.x86_64
>> glusterfs-rdma-3.10.2-1.el7.x86_64
>>
>> # rpm -qa | grep samba
>> samba-common-libs-4.4.4-14.el7_3.x86_64
>> samba-common-tools-4.4.4-14.el7_3.x86_64
>> samba-libs-4.4.4-14.el7_3.x86_64
>> samba-4.4.4-14.el7_3.x86_64
>> samba-client-libs-4.4.4-14.el7_3.x86_64
>> samba-vfs-glusterfs-4.4.4-14.el7_3.x86_64
>> samba-common-4.4.4-14.el7_3.noarch
>>
>> # cat /etc/redhat-release
>> CentOS Linux release 

Re: [Gluster-users] Heal operation detail of EC volumes

2017-06-02 Thread Xavier Hernandez

Hi Serkan,

On Thursday, June 01, 2017 21:31 CEST, Serkan Çoban  
wrote:
>Is it possible that this matches your observations ?

Yes, that matches what I see. So 19 files are being healed in parallel by
19 SHD processes. I thought only one file was being healed at a time.
Then what is the meaning of the disperse.shd-max-threads parameter? If I
set it to 2, will each SHD thread heal two files at the same time?

Each SHD normally heals a single file at a time. However, there is an SHD
on each node, so all of them are trying to process dirty files. If one
picks a file to heal, the other SHDs will skip that one and try another.

disperse.shd-max-threads indicates how many heals each SHD can perform
simultaneously. Setting a value of 2 would mean that each SHD could heal 2
files at once, up to 40 using 20 nodes.

>How many IOPS can handle your bricks ?

Bricks are 7200RPM NL-SAS disks, 70-80 random IOPS max. But the write
pattern seems sequential: 30-40MB bulk writes every 4-5 seconds.
This is what iostat shows.

This is probably caused by some write-back policy on the file system that
accumulates multiple writes, optimizing disk access. That way the apparent
1000 IOPS can be handled by a single disk with 70-80 real IOPS, by making
each IO operation bigger.

>Do you have a test environment where we could check all this ?

Not currently, but I will have one in 4-5 weeks. New servers are arriving;
I will add this test to my notes.

> There's a feature to allow to configure the self-heal block size to optimize
> these cases. The feature is available on 3.11.

I did not see this in the 3.11 release notes; what parameter name should I
look for?

The new option is named 'self-heal-window-size'. It represents the size of
each heal operation as the number of 128KB blocks to use. The default value
is 1. To use blocks of 1MB, this parameter should be set to 8.
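
As a quick illustration, the two tunables discussed above would be set
roughly like this (the volume name is a placeholder; confirm the exact
option names with 'gluster volume set help' on your release):

# allow each SHD to heal up to 2 files concurrently
gluster volume set myvol disperse.shd-max-threads 2

# heal in 8 x 128KB = 1MB blocks (option added in 3.11)
gluster volume set myvol disperse.self-heal-window-size 8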

Xavi
On Thu, Jun 1, 2017 at 10:30 AM, Xavier Hernandez  wrote:
> Hi Serkan,
>
> On 30/05/17 10:22, Serkan Çoban wrote:
>>
>> Ok I understand that heal operation takes place on server side. In
>> this case I should see X KB
>> out network traffic from 16 servers and 16X KB input traffic to the
>> failed brick server right? So that process will get 16 chunks
>> recalculate our chunk and write it to disk.
>
>
> That should be the normal operation for a single heal.
>
>> The problem is I am not seeing such kind of traffic on servers. In my
>> configuration (16+4 EC) I see 20 servers are all have 7-8MB outbound
>> traffic and none of them has more than 10MB incoming traffic.
>> Only heal operation is happening on cluster right now, no client/other
>> traffic. I see constant 7-8MB write to healing brick disk. So where is
>> the missing traffic?
>
>
> Not sure about your configuration, but probably you are seeing the result of
> having the SHD of each server doing heals. That would explain the network
> traffic you have.
>
> Suppose that all SHD but the one on the damaged brick are working. In this
> case 19 servers will pick 16 fragments each. This gives 19 * 16 = 304
> fragments to be requested. EC balances the reads among all available
> servers, and there's a chance (1/19) that a fragment is local to the server
> asking it. So we'll need a total of 304 - 304 / 19 = 288 network requests,
> 288 / 19 = 15.2 sent by each server.
>
> If we have a total of 288 requests, it means that each server will answer
> 288 / 19 = 15.2 requests. The net effect of all this is that each healthy
> server is sending 15.2*X bytes of data and each server is receiving 15.2*X
> bytes of data.
>
> Now we need to account for the writes to the damaged brick. We have 19
> simultaneous heals. This means that the damaged brick will receive 19*X
> bytes of data, and each healthy server will send X additional bytes of data.
>
> So:
>
> A healthy server receives 15.2*X bytes of data
> A healthy server sends 16.2*X bytes of data
> A damaged server receives 19*X bytes of data
> A damaged server sends few bytes of data (communication and synchronization
> overhead basically)
>
> As you can see, in this configuration each server has almost the same amount
> of inbound and outbound traffic. Only big difference is the damaged brick,
> that should receive a little more of traffic, but it should send much less.
>
> Is it possible that this matches your observations ?
>
> There's one more thing to consider here, and it's the apparent low
> throughput of self-heal. One possible thing to check is the small size and
> random behavior of the requests.
>
> Assuming that each request has a size of ~128 / 16 = 8KB, at a rate of ~8
> MB/s the servers are processing ~1000 IOPS. Since requests are going to 19
> different files, even if each file is accessed sequentially, the real effect
> will be like random access (some read-ahead on the filesystem can improve
> reads a bit, but writes won't benefit so much).
>
> How many IOPS can handle your bricks ?
>
> Do you have a test environment where we could 

[Gluster-users] libgfapi with encryption?

2017-06-02 Thread Ziemowit Pierzycki
Hi,

I created an encrypted volume which appears to be working fine with
FUSE but the volume is supposed to store VM images (master key in
place).  I noticed some references in libgfapi source code to
encryption so I decided to try it out.  While attempting to create an
image:

# qemu-img create -f qcow2 gluster://gluster01/virt0/testing.img 30G
Formatting 'gluster://dalpinfglt01/virt0/testing.img', fmt=qcow2
size=32212254720 encryption=off cluster_size=65536 lazy_refcounts=off
refcount_bits=16
[2017-06-02 18:30:37.489831] E [mem-pool.c:579:mem_put]
(-->/lib64/libglusterfs.so.0(syncop_lookup+0x4e5) [0x7f2abbe66d35]
-->/lib64/libglusterfs.so.0(+0x59f02) [0x7f2abbe62f02]
-->/lib64/libglusterfs.so.0(mem_put+0x190) [0x7f2abbe54a60] )
0-mem-pool: mem-pool ptr is NULL
[2017-06-02 18:30:37.490715] E [mem-pool.c:579:mem_put]
(-->/lib64/libglusterfs.so.0(syncop_lookup+0x4e5) [0x7f2abbe66d35]
-->/lib64/libglusterfs.so.0(+0x59f02) [0x7f2abbe62f02]
-->/lib64/libglusterfs.so.0(mem_put+0x190) [0x7f2abbe54a60] )
0-mem-pool: mem-pool ptr is NULL
[2017-06-02 18:30:37.492255] E [mem-pool.c:579:mem_put]
(-->/lib64/libglusterfs.so.0(syncop_lookup+0x4e5) [0x7f2abbe66d35]
-->/lib64/libglusterfs.so.0(+0x59f02) [0x7f2abbe62f02]
-->/lib64/libglusterfs.so.0(mem_put+0x190) [0x7f2abbe54a60] )
0-mem-pool: mem-pool ptr is NULL


Am I to understand that libgfapi doesn't support encryption at rest?
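
For context, the at-rest encryption translator is normally enabled per
volume roughly as follows. The key path is a placeholder and the option
names come from the 3.x crypt xlator documentation, so verify them
against the installed release:

# enable the crypt xlator and point it at the master key
gluster volume set virt0 encryption on
gluster volume set virt0 encryption.master-key /etc/gluster/virt0-master.key

# optional tuning, shown with commonly documented values
gluster volume set virt0 encryption.data-key-size 256
gluster volume set virt0 encryption.block-size 4096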


Re: [Gluster-users] File locking...

2017-06-02 Thread Joe Julian
Yes, the fuse client is fully POSIX.
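
A quick way to sanity-check advisory locking from the shell on a fuse
mount is sketched below; the mount path is only an example. Note that
flock(1) exercises flock(2)-style locks, so POSIX byte-range (fcntl)
locks would need a small test program instead:

# create a test file on the gluster fuse mount and hold an exclusive lock
touch /mnt/glustervol/locktest
flock /mnt/glustervol/locktest -c 'sleep 30' &

# from another shell (or another client of the same volume), a non-blocking
# attempt should fail with exit status 1 while the first lock is held
flock -n /mnt/glustervol/locktest -c true; echo $?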

On June 2, 2017 5:12:34 AM PDT, Krist van Besien  wrote:
>Hi all,
>
>A few questions.
>
>- Is POSIX locking enabled when using the native client? I would assume
>yes.
>- What other settings/tuneables exist when it comes to file locking?
>
>Krist
>
>
>-- 
>Vriendelijke Groet |  Best Regards | Freundliche Grüße | Cordialement
>--
>Krist van Besien | Senior Architect | Red Hat EMEA Cloud Practice |
>RHCE |
>RHCSA Open Stack
>@: kr...@redhat.com | M: +41-79-5936260

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

[Gluster-users] File locking...

2017-06-02 Thread Krist van Besien
Hi all,

A few questions.

- Is POSIX locking enabled when using the native client? I would assume yes.
- What other settings/tuneables exist when it comes to file locking?

Krist


-- 
Vriendelijke Groet |  Best Regards | Freundliche Grüße | Cordialement
--
Krist van Besien | Senior Architect | Red Hat EMEA Cloud Practice | RHCE |
RHCSA Open Stack
@: kr...@redhat.com | M: +41-79-5936260