Re: [Gluster-users] How understand some code execute client side or server side?

2017-03-09 Thread Mohammed Rafi K C


On 03/10/2017 10:47 AM, Tahereh Fattahi wrote:
> Thank you very much, it is very helpful.
> I can also see the client graph in /var/log/glusterfs/mnt-glusterfs.log
> when I mount the file system.

Yes, you are in the right place. The FUSE mount process logs the graph if
the log level is INFO.
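
For example, something like this should print it (a sketch only; the exact log
file name depends on your mount point):

grep -A 100 'Final graph' /var/log/glusterfs/mnt-glusterfs.log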

> I think there is a tree structure among the xlators (I had seen something
> in the code like a child and parent for each xlator), so only some of them
> are the point of connection to the server. I think the xlator with type
> protocol/client is responsible for sending requests to and getting
> responses from the server.

> Am I correct?

Indeed, you are a quick learner. The translator with type protocol/client
is the last node in the client graph; it connects to the protocol/server
translator loaded on the server. protocol/server is the starting node of the
server graph.


Regards
Rafi KC

>
> On Thu, Mar 9, 2017 at 8:38 PM, Mohammed Rafi K C  > wrote:
>
> GlusterFS has mainly four daemons, i.e. glusterfs (generally the client
> process), glusterfsd (generally the brick process), glusterd
> (the management daemon) and gluster (the CLI).
>
> Except for the CLI (cli/src), all of them are basically the same binary
> symlinked to different names. So what makes them different is their
> graphs, i.e. each daemon loads a graph and does its job based on that
> graph.
>
>
> The nodes of each graph are called xlators. To figure out which
> xlators are loaded in the client-side graph, you can look at a client
> graph, e.g.
> /var/lib/glusterd/vols/<volname>/trusted-<volname>.<transport>-fuse.vol
>
> Once you have figured out the xlators in the client graph and their types,
> you can go to the source code under xlators/<type>/<name>.
>
>
> Please note that if an xlator is loaded in the client graph, it doesn't
> mean that it will only run on the client side. The same xlator can
> also run on the server if we load a graph with that xlator in it.
>
>
> Let me know if this does not help you understand.
>
>
> Regards
>
> Rafi KC
>
>
> So the glusterd and cli code always runs on the servers.
>
> On 03/09/2017 08:28 PM, Tahereh Fattahi wrote:
>> Hi
>> Is there any way to tell whether some code runs on the client
>> side or the server side (from the source code and its directories)?
>> Is it possible for some code to execute on both the client and server side?
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org 
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>> 
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] How understand some code execute client side or server side?

2017-03-09 Thread Tahereh Fattahi
Thank you very much, it is very helpful.
I can also see the client graph in /var/log/glusterfs/mnt-glusterfs.log when
I mount the file system.
I think there is a tree structure among the xlators (I had seen something in
the code like a child and parent for each xlator), so only some of them are
the point of connection to the server. I think the xlator with type
protocol/client is responsible for sending requests to and getting responses
from the server. Am I correct?

On Thu, Mar 9, 2017 at 8:38 PM, Mohammed Rafi K C 
wrote:

> GlusterFS has mainly four daemons, i.e. glusterfs (generally the client
> process), glusterfsd (generally the brick process), glusterd (the management
> daemon) and gluster (the CLI).
>
> Except for the CLI (cli/src), all of them are basically the same binary
> symlinked to different names. So what makes them different is their graphs,
> i.e. each daemon loads a graph and does its job based on that graph.
>
>
> The nodes of each graph are called xlators. To figure out which xlators are
> loaded in the client-side graph, you can look at a client graph, e.g.
> /var/lib/glusterd/vols/<volname>/trusted-<volname>.<transport>-fuse.vol
>
> Once you have figured out the xlators in the client graph and their types,
> you can go to the source code under xlators/<type>/<name>.
>
>
> Please note that if an xlator is loaded in the client graph, it doesn't mean
> that it will only run on the client side. The same xlator can also run on the
> server if we load a graph with that xlator in it.
>
>
> Let me know if this does not help you understand.
>
>
> Regards
>
> Rafi KC
>
>
> So the glusterd and cli code always runs on the servers.
> On 03/09/2017 08:28 PM, Tahereh Fattahi wrote:
>
> Hi
> Is there any way to tell whether some code runs on the client side or the
> server side (from the source code and its directories)?
> Is it possible for some code to execute on both the client and server side?
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Sharding?

2017-03-09 Thread Laura Bailey
Hi folks,

This chapter on sharding and how to configure it went into the RHGS 3.1
Administration Guide some time ago:
https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Administration_Guide/chap-Managing_Sharding.html

If there's anything in here that isn't clear, let me know so I can fix it.

It doesn't seem to show up if you search on the customer portal; I'll get
in touch with JP Sherman and see what we can do about that.

Cheers,
Laura B

On Fri, Mar 10, 2017 at 2:17 AM, Vijay Bellur  wrote:

>
>
> On Thu, Mar 9, 2017 at 11:10 AM, Kevin Lemonnier 
> wrote:
>
>> > I've seen the term sharding pop up on the list a number of times but I
>> > haven't found any documentation or explanation of what it is. Would
>> someone
>> > please enlighten me?
>>
>> It's a way to split the files you put on the volume. With a shard size of
>> 64 MB
>> for example, the biggest file on the volume will be 64 MB. It's
>> transparent
>> when accessing the files though, you can still of course write your 2 TB
>> file
>> and access it as usual.
>>
>> It's useful for things like healing (only the shard being healed is
>> locked,
>> and you have a lot less data to transfer) and for things like hosting a
>> single
>> huge file that would be bigger than one of your replicas.
>>
>> We use it for VM disks, as it decreases heal times a lot.
>>
>>
>
> Some more details on sharding can be found at [1].
>
> Regards,
> Vijay
>
> [1] http://blog.gluster.org/2015/12/introducing-shard-translator/
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>



-- 
Laura Bailey
Senior Technical Writer
Customer Content Services BNE
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Disperse mkdir fails

2017-03-09 Thread Ankireddypalle Reddy
Xavi,
Thanks for checking this.
1) mkdir returns errno 5, EIO.
2) The specified directory is the parent directory under which all
the data in the gluster volume is stored. Currently around 160 TB of 262 TB
is consumed.
3) It is extremely difficult to list the exact sequence of FOPs
that would have been issued to the directory. The storage is heavily used and
a lot of subdirectories are present inside this directory.

   Are you looking for the extended attributes of this directory from
all the bricks in the volume? There are about 60 bricks.
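
If so, here is roughly how I would collect them on each node (a sketch only;
the brick root paths below are placeholders - the real ones are listed by
'gluster volume info StoragePool'):

# run on every node, once per local brick root
for b in /bricks/brick*; do
    echo "== $b =="
    getfattr -d -m . -e hex "$b/Folder_07.11.2016_23.02/CV_MAGNETIC"
done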

Thanks and Regards,
Ram

-Original Message-
From: Xavier Hernandez [mailto:xhernan...@datalab.es] 
Sent: Thursday, March 09, 2017 11:15 AM
To: Ankireddypalle Reddy; Gluster Devel (gluster-de...@gluster.org); 
gluster-users@gluster.org
Subject: Re: [Gluster-users] Disperse mkdir fails

Hi Ram,

On 09/03/17 16:52, Ankireddypalle Reddy wrote:
> Attachment (1): info.txt (3.35 KB)
>
> Hi,
>
> I have a disperse gluster volume  with 6 servers. 262TB of 
> usable capacity.  Gluster version is 3.7.19.
>
> glusterfs1, glusterf2 and glusterfs3 nodes were initially used 
> for creating the volume. Nodes glusterf4, glusterfs5 and glusterfs6 
> were later added to the volume.
>
>
>
> Directory creation failed on a directory called 
> /ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC.
>
> # file: ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC
>
> glusterfs.gfid.string="e8e51015-616f-4f04-b9d2-92f46eb5cfc7"
>
>
>
> The gluster mount log contains a lot of the following errors:
>
> [2017-03-09 15:32:36.773937] W [MSGID: 122056] 
> [ec-combine.c:875:ec_combine_check] 0-StoragePool-disperse-7:
> Mismatching xdata in answers of 'LOOKUP' for
> e8e51015-616f-4f04-b9d2-92f46eb5cfc7
>
>
>
> The directory seems to be out of sync between nodes 
> glusterfs1,
> glusterfs2 and glusterfs3. Each has a different version.
>
>
>
>  trusted.ec.version=0x000839f83a4d
>
>  trusted.ec.version=0x00082ea400083a4b
>
>  trusted.ec.version=0x00083a7600083a7b
>
>
>
>  Self-heal does not seem to be healing this directory.
>

This is very similar to what happened the other time. Once more than 1 brick is 
damaged, self-heal cannot do anything to heal it on a 2+1 configuration.

What error does the mkdir request return?

Does the directory you are trying to create already exist on some brick?

Can you show all the remaining extended attributes of the directory?

It would also be useful to have the directory contents on each brick (an 'ls 
-l'). In this case, include the name of the directory you are trying to create.

Can you explain in detail the sequence of operations done on that directory since
the last time you successfully created a new subdirectory, including any
metadata changes?

Xavi

>
>
> Thanks and Regards,
>
> Ram
>
> ***Legal Disclaimer***
> "This communication may contain confidential and privileged material 
> for the sole use of the intended recipient. Any unauthorized review, 
> use or distribution by others is strictly prohibited. If you have 
> received the message by mistake, please advise the sender by reply 
> email and delete the message. Thank you."
> **
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>

***Legal Disclaimer***
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or distribution
by others is strictly prohibited. If you have received the message by mistake,
please advise the sender by reply email and delete the message. Thank you."
**

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] How understand some code execute client side or server side?

2017-03-09 Thread Mohammed Rafi K C
GlusterFS has mainly four daemons, i.e. glusterfs (generally the client
process), glusterfsd (generally the brick process), glusterd (the management
daemon) and gluster (the CLI).

Except for the CLI (cli/src), all of them are basically the same binary
symlinked to different names. So what makes them different is their graphs,
i.e. each daemon loads a graph and does its job based on that graph.


The nodes of each graph are called xlators. To figure out which xlators are
loaded in the client-side graph, you can look at a client graph, e.g.
/var/lib/glusterd/vols/<volname>/trusted-<volname>.<transport>-fuse.vol

Once you have figured out the xlators in the client graph and their types,
you can go to the source code under xlators/<type>/<name>.
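
For example, something like this lists the translator types in the client
graph (a sketch only; the volume name "gv0" and the exact volfile name are
assumptions, check what is present under /var/lib/glusterd/vols/<volname>/):

grep -E '^[[:space:]]*type ' /var/lib/glusterd/vols/gv0/trusted-gv0.tcp-fuse.vol
# Each type such as "protocol/client" usually maps to a source directory
# xlators/<type>/src, e.g. xlators/protocol/client/src (a few differ, e.g.
# cluster/replicate lives in xlators/cluster/afr).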


Please note that if an xlator is loaded in the client graph, it doesn't mean
that it will only run on the client side. The same xlator can also run on the
server if we load a graph with that xlator in it.


Let me know if this does not help you understand.


Regards

Rafi KC


So the glusterd and cli code always runs on the servers.

On 03/09/2017 08:28 PM, Tahereh Fattahi wrote:
> Hi
> Is there any way to tell whether some code runs on the client side
> or the server side (from the source code and its directories)?
> Is it possible for some code to execute on both the client and server side?
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Sharding?

2017-03-09 Thread Vijay Bellur
On Thu, Mar 9, 2017 at 11:10 AM, Kevin Lemonnier 
wrote:

> > I've seen the term sharding pop up on the list a number of times but I
> > haven't found any documentation or explanation of what it is. Would
> someone
> > please enlighten me?
>
> It's a way to split the files you put on the volume. With a shard size of
> 64 MB
> for example, the biggest file on the volume will be 64 MB. It's transparent
> when accessing the files though, you can still of course write your 2 TB
> file
> and access it as usual.
>
> It's useful for things like healing (only the shard being healed is
> locked,
> and you have a lot less data to transfer) and for things like hosting a
> single
> huge file that would be bigger than one of your replicas.
>
> We use it for VM disks, as it decreases heal times a lot.
>
>

Some more details on sharding can be found at [1].

Regards,
Vijay

[1] http://blog.gluster.org/2015/12/introducing-shard-translator/
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Disperse mkdir fails

2017-03-09 Thread Xavier Hernandez

Hi Ram,

On 09/03/17 16:52, Ankireddypalle Reddy wrote:

Attachment (1): info.txt (3.35 KB)

Hi,

I have a disperse gluster volume  with 6 servers. 262TB of
usable capacity.  Gluster version is 3.7.19.

glusterfs1, glusterf2 and glusterfs3 nodes were initially used
for creating the volume. Nodes glusterf4, glusterfs5 and glusterfs6 were
later added to the volume.



Directory creation failed on a directory called
/ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC.

# file: ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC

glusterfs.gfid.string="e8e51015-616f-4f04-b9d2-92f46eb5cfc7"



The gluster mount log contains a lot of the following errors:

[2017-03-09 15:32:36.773937] W [MSGID: 122056]
[ec-combine.c:875:ec_combine_check] 0-StoragePool-disperse-7:
Mismatching xdata in answers of 'LOOKUP' for
e8e51015-616f-4f04-b9d2-92f46eb5cfc7



The directory seems to be out of sync between nodes glusterfs1,
glusterfs2 and glusterfs3. Each has a different version.



 trusted.ec.version=0x000839f83a4d

 trusted.ec.version=0x00082ea400083a4b

 trusted.ec.version=0x00083a7600083a7b



 Self-heal does not seem to be healing this directory.



This is very similar to what happened the other time. Once more than 1 
brick is damaged, self-heal cannot do anything to heal it on a 2+1 
configuration.


What error does the mkdir request return?

Does the directory you are trying to create already exist on some brick ?

Can you show all the remaining extended attributes of the directory ?

It would also be useful to have the directory contents on each brick (an 
'ls -l'). In this case, include the name of the directory you are trying 
to create.


Can you explain in detail the sequence of operations done on that directory
since the last time you successfully created a new subdirectory, including
any metadata changes?


Xavi




Thanks and Regards,

Ram

***Legal Disclaimer***
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or
distribution
by others is strictly prohibited. If you have received the message by
mistake,
please advise the sender by reply email and delete the message. Thank you."
**


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Sharding?

2017-03-09 Thread Kevin Lemonnier
> I've seen the term sharding pop up on the list a number of times but I
> haven't found any documentation or explanation of what it is. Would someone
> please enlighten me?

It's a way to split the files you put on the volume. With a shard size of 64 MB
for example, the biggest file on the volume will be 64 MB. It's transparent
when accessing the files though, you can still of course write your 2 TB file
and access it as usual.

It's useful for things like healing (only the shard being healed is locked,
and you have a lot less data to transfer) and for things like hosting a single
huge file that would be bigger than one of your replicas.

We use it for VM disks, as it decreases heal times a lot.
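
If you want to try it, turning it on looks roughly like this (the volume name
"myvol" and the 64 MB size are only examples; note that sharding applies only
to files created after it is enabled):

gluster volume set myvol features.shard on
gluster volume set myvol features.shard-block-size 64MB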

-- 
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Disperse mkdir fails

2017-03-09 Thread Ankireddypalle Reddy
Attachment (1): info.txt (3.35 KB)

Hi,
I have a disperse gluster volume  with 6 servers. 262TB of usable 
capacity.  Gluster version is 3.7.19.
The glusterfs1, glusterfs2 and glusterfs3 nodes were initially used for
creating the volume. Nodes glusterfs4, glusterfs5 and glusterfs6 were later
added to the volume.

Directory creation failed on a directory called 
/ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC.
# file: ws/glus/Folder_07.11.2016_23.02/CV_MAGNETIC
glusterfs.gfid.string="e8e51015-616f-4f04-b9d2-92f46eb5cfc7"

The gluster mount log contains a lot of the following errors:
[2017-03-09 15:32:36.773937] W [MSGID: 122056] 
[ec-combine.c:875:ec_combine_check] 0-StoragePool-disperse-7: Mismatching xdata 
in answers of 'LOOKUP' for e8e51015-616f-4f04-b9d2-92f46eb5cfc7

The directory seems to be out of sync between nodes glusterfs1, 
glusterfs2 and glusterfs3. Each has a different version.

 trusted.ec.version=0x000839f83a4d
 trusted.ec.version=0x00082ea400083a4b
 trusted.ec.version=0x00083a7600083a7b

 Self-heal does not seem to be healing this directory.

Thanks and Regards,
Ram
***Legal Disclaimer***
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or distribution
by others is strictly prohibited. If you have received the message by mistake,
please advise the sender by reply email and delete the message. Thank you."
**
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Sharding?

2017-03-09 Thread Jake Davis
I've seen the term sharding pop up on the list a number of times but I
haven't found any documentation or explanation of what it is. Would someone
please enlighten me?

Many Thanks,
-Jake
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] How understand some code execute client side or server side?

2017-03-09 Thread Tahereh Fattahi
Hi
Is there any way to tell whether some code runs on the client side or the
server side (from the source code and its directories)?
Is it possible for some code to execute on both the client and server side?
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Is it possible to have more than one volume?

2017-03-09 Thread Kaushal M
On Thu, Mar 9, 2017 at 6:39 PM, Tahereh Fattahi  wrote:
> Thank you
> So one distributed file system has one volume with many servers and many
> bricks, is that correct?

Each volume is an individual distributed file-system. It can be
composed of many bricks, which can be spread over many servers.

>
> On Thu, Mar 9, 2017 at 4:26 PM, Kaushal M  wrote:
>>
>> On Thu, Mar 9, 2017 at 6:15 PM, Tahereh Fattahi 
>> wrote:
>> > Hi
>> > Is it possible to have more than one volume? (I know the difference between
>> > a brick and a volume, and in this question I mean volume.)
>> > If yes, how should these volumes be linked to each other?
>>
>> You can have more than one volume in your GlusterFS pool. The volumes
>> are completely independent of each other, there is no linking between
>> volumes.
>>
>> >
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org
>> > http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Is it possible to have more than one volume?

2017-03-09 Thread Tahereh Fattahi
Thank you
So one distributed file system has one volume with many servers and many
bricks, is that correct?

On Thu, Mar 9, 2017 at 4:26 PM, Kaushal M  wrote:

> On Thu, Mar 9, 2017 at 6:15 PM, Tahereh Fattahi 
> wrote:
> > Hi
> > Is it possible to have more than one volume? (I know the difference between
> > a brick and a volume, and in this question I mean volume.)
> > If yes, how should these volumes be linked to each other?
>
> You can have more than one volume in your GlusterFS pool. The volumes
> are completely independent of each other, there is no linking between
> volumes.
>
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Is it possible to have more than one volume?

2017-03-09 Thread Kaushal M
On Thu, Mar 9, 2017 at 6:15 PM, Tahereh Fattahi  wrote:
> Hi
> Is it possible to have more than one volume? (I know the difference between
> a brick and a volume, and in this question I mean volume.)
> If yes, how should these volumes be linked to each other?

You can have more than one volume in your GlusterFS pool. The volumes
are completely independent of each other, there is no linking between
volumes.
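
For example, two completely independent volumes in the same pool could be
created like this (the hostnames and brick paths here are made up, purely for
illustration):

gluster volume create vol1 replica 2 server1:/bricks/vol1 server2:/bricks/vol1
gluster volume create vol2 server1:/bricks/vol2 server2:/bricks/vol2
gluster volume start vol1
gluster volume start vol2
# each volume gets its own graph and is mounted on its own, e.g.
# mount -t glusterfs server1:/vol1 /mnt/vol1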

>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Is it possible to have more than one volume?

2017-03-09 Thread Tahereh Fattahi
Hi
Is it possible to have more than one volume? (I know the difference between
a brick and a volume, and in this question I mean volume.)
If yes, how should these volumes be linked to each other?
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] RE : Frequent connect and disconnect messages flooded in logs

2017-03-09 Thread Mohammed Rafi K C
I'm sorry that you had to downgrade. We will work on it and hopefully
will see you soon on 3.8 ;).


Just one question: does your workload include a lot of deletes, of either
files or directories? We just want to see if the delayed deletes (janitor
thread) are causing any issue.


Regards

Rafi KC


On 03/09/2017 01:53 PM, Amar Tumballi wrote:
>
> - Original Message -
>> From: "Micha Ober" 
>>
>> ​Just to let you know: I have reverted back to glusterfs 3.4.2 and everything
>> is working again. No more disconnects, no more errors in the kernel log. So
>> there *has* to be some kind of regression in the newer versions​. Sadly, I
>> guess, it will be hard to find.
>>
> Thanks for the update Micha. This helps to corner the issue a little at least.
>
> Regards,
> Amar
>
>
>> 2016-12-20 13:31 GMT+01:00 Micha Ober < mich...@gmail.com > :
>>
>>
>>
>> Hi Rafi,
>>
>> here are the log files:
>>
>> NFS: http://paste.ubuntu.com/23658653/
>> Brick: http://paste.ubuntu.com/23658656/
>>
>> The brick log is of the brick which has caused the last disconnect at
>> 2016-12-20 06:46:36 (0-gv0-client-7).
>>
>> For completeness, here is also dmesg output:
>> http://paste.ubuntu.com/23658691/
>>
>> Regards,
>> Micha
>>
>> 2016-12-19 7:28 GMT+01:00 Mohammed Rafi K C < rkavu...@redhat.com > :
>>
>>
>>
>>
>>
>> Hi Micha,
>>
>> Sorry for the late reply. I was busy with some other things.
>>
>> If you have still the setup available Can you enable TRACE log level [1],[2]
>> and see if you could find any log entries when the network start
>> disconnecting. Basically I'm trying to find out any disconnection had
>> occurred other than ping timer expire issue.
>>
>>
>>
>>
>>
>>
>>
>> [1] : gluster volume set <volname> diagnostics.brick-log-level TRACE
>>
>> [2] : gluster volume set <volname> diagnostics.client-log-level TRACE
>>
>>
>>
>>
>>
>> Regards
>>
>> Rafi KC
>>
>> On 12/08/2016 07:59 PM, Atin Mukherjee wrote:
>>
>>
>>
>>
>>
>> On Thu, Dec 8, 2016 at 4:37 PM, Micha Ober < mich...@gmail.com > wrote:
>>
>>
>>
>> Hi Rafi,
>>
>> thank you for your support. It is greatly appreciated.
>>
>> Just some more thoughts from my side:
>>
>> There have been no reports from other users in *this* thread until now, but I
>> have found at least one user with a very similar problem in an older thread:
>>
>> https://www.gluster.org/pipermail/gluster-users/2014-November/019637.html
>>
>> He is also reporting disconnects with no apparent reason, although his setup
>> is a bit more complicated, also involving a firewall. In our setup, all
>> servers/clients are connected via 1 GbE with no firewall or anything that
>> might block/throttle traffic. Also, we are using exactly the same software
>> versions on all nodes.
>>
>>
>> I can also find some reports in the bugtracker when searching for
>> "rpc_client_ping_timer_expired" and "rpc_clnt_ping_timer_expired" (looks
>> like spelling changed during versions).
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1096729
>>
>> Just FYI, this is a different issue, here GlusterD fails to handle the volume
>> of incoming requests on time since MT-epoll is not enabled here.
>>
>>
>>
>>
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1370683
>>
>> But both reports involve large traffic/load on the bricks/disks, which is not
>> the case for our setup.
>> To give a ballpark figure: Over three days, 30 GiB were written. And the data
>> was not written at once, but continuously over the whole time.
>>
>>
>> Just to be sure, I have checked the logfiles of one of the other clusters
>> right now, which are sitting in the same building, in the same rack, even on
>> the same switch, running the same jobs, but with glusterfs 3.4.2 and I can
>> see no disconnects in the logfiles. So I can definitely rule out our
>> infrastructure as problem.
>>
>> Regards,
>> Micha
>>
>>
>>
>> Am 07.12.2016 um 18:08 schrieb Mohammed Rafi K C:
>>
>>
>>
>>
>> Hi Micha,
>>
>> This is great. I will provide you one debug build which has two fixes which I
>> possible suspect for a frequent disconnect issue, though I don't have much
>> data to validate my theory. So I will take one more day to dig in to that.
>>
>> Thanks for your support, and opensource++
>>
>> Regards
>>
>> Rafi KC
>> On 12/07/2016 05:02 AM, Micha Ober wrote:
>>
>>
>>
>> Hi,
>>
>> thank you for your answer and even more for the question!
>> Until now, I was using FUSE. Today I changed all mounts to NFS using the same
>> 3.7.17 version.
>>
>> But: The problem is still the same. Now, the NFS logfile contains lines like
>> these:
>>
>> [2016-12-06 15:12:29.006325] C
>> [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired] 0-gv0-client-7: server
>> X.X.18.62:49153 has not responded in the last 42 seconds, disconnecting.
>>
>> Interestingly enough, the IP address X.X.18.62 is the same machine! As I
>> wrote earlier, each node serves both as a server and a client, as each node
>> contributes bricks to the volume. Every server is connecting to itself via
>> its hostname. For example, the fstab on the 

Re: [Gluster-users] Deleting huge file from glusterfs hangs the cluster for a while

2017-03-09 Thread Krutika Dhananjay
Unfortunately you'll need to delete those shards manually from the bricks.
I am assuming you know how to identify shards that belong to a particular
image.
Since the VM is deleted, no IO will be happening on those remaining shards.

You would need to identify the shards, find all hard links associated with
every shard,
and delete the shards and their hard links from the backend.
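
On each brick that could look roughly like the following (a sketch only; the
brick root and the GFID are placeholders, and scanning .glusterfs once per
shard is slow but simple):

BRICK=/bricks/brick1                              # placeholder brick root
GFID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx         # GFID of the deleted image
for shard in "$BRICK"/.shard/"$GFID".*; do
    ino=$(stat -c %i "$shard")
    # remove the hard link kept under .glusterfs (same inode), then the shard
    find "$BRICK/.glusterfs" -inum "$ino" -delete
    rm -f "$shard"
done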

Do you mind raising a bug for this issue? I'll send a patch to move the
deletion of the shards
to the background.

https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS

-Krutika

On Thu, Mar 9, 2017 at 12:29 AM, Georgi Mirchev  wrote:

>
> On 03/08/2017 at 03:37 PM, Krutika Dhananjay wrote:
>
> Thanks for your feedback.
>
> May I know what was the shard-block-size?
>
> The shard size is 4 MB.
>
> One way to fix this would be to make shard translator delete only the base
> file (0th shard) in the IO path and move
> the deletion of the rest of the shards to background. I'll work on this.
>
> Is there a manual way?
>
>
> -Krutika
>
> On Fri, Mar 3, 2017 at 10:35 PM, GEORGI MIRCHEV  wrote:
>
>> Hi,
>>
>> I have deleted two large files (around 1 TB each) via gluster client
>> (mounted
>> on /mnt folder). I used a simple rm command, e.g "rm /mnt/hugefile". This
>> resulted in a hang of the cluster (no I/O could be done, the VM hung). After
>> a
>> few minutes my ssh connection to the gluster node got disconnected - I
>> had to
>> reconnect, which was very strange, probably some kind of timeout. Nothing
>> in
>> dmesg so it's probably the ssh that terminated the connection.
>>
>> After that the cluster works, everything seems fine, the file is gone in
>> the
>> client but the space is not reclaimed.
>>
>> The deleted file is also gone from bricks, but the shards are still there
>> and
>> use up all the space.
>>
>> I need to reclaim the space. How do I delete the shards / other metadata
>> for a
>> file that no longer exists?
>>
>>
>> Versions:
>> glusterfs-server-3.8.9-1.el7.x86_64
>> glusterfs-client-xlators-3.8.9-1.el7.x86_64
>> glusterfs-geo-replication-3.8.9-1.el7.x86_64
>> glusterfs-3.8.9-1.el7.x86_64
>> glusterfs-fuse-3.8.9-1.el7.x86_64
>> vdsm-gluster-4.19.4-1.el7.centos.noarch
>> glusterfs-cli-3.8.9-1.el7.x86_64
>> glusterfs-libs-3.8.9-1.el7.x86_64
>> glusterfs-api-3.8.9-1.el7.x86_64
>>
>> --
>> Georgi Mirchev
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] RE : Frequent connect and disconnect messages flooded in logs

2017-03-09 Thread Amar Tumballi


- Original Message -
> From: "Micha Ober" 
> 
> ​Just to let you know: I have reverted back to glusterfs 3.4.2 and everything
> is working again. No more disconnects, no more errors in the kernel log. So
> there *has* to be some kind of regression in the newer versions​. Sadly, I
> guess, it will be hard to find.
> 

Thanks for the update Micha. This helps to corner the issue a little at least.

Regards,
Amar


> 2016-12-20 13:31 GMT+01:00 Micha Ober < mich...@gmail.com > :
> 
> 
> 
> Hi Rafi,
> 
> here are the log files:
> 
> NFS: http://paste.ubuntu.com/23658653/
> Brick: http://paste.ubuntu.com/23658656/
> 
> The brick log is of the brick which has caused the last disconnect at
> 2016-12-20 06:46:36 (0-gv0-client-7).
> 
> For completeness, here is also dmesg output:
> http://paste.ubuntu.com/23658691/
> 
> Regards,
> Micha
> 
> 2016-12-19 7:28 GMT+01:00 Mohammed Rafi K C < rkavu...@redhat.com > :
> 
> 
> 
> 
> 
> Hi Micha,
> 
> Sorry for the late reply. I was busy with some other things.
> 
> If you have still the setup available Can you enable TRACE log level [1],[2]
> and see if you could find any log entries when the network start
> disconnecting. Basically I'm trying to find out any disconnection had
> occurred other than ping timer expire issue.
> 
> 
> 
> 
> 
> 
> 
> [1] : gluster volume set <volname> diagnostics.brick-log-level TRACE
> 
> [2] : gluster volume set <volname> diagnostics.client-log-level TRACE
> 
> 
> 
> 
> 
> Regards
> 
> Rafi KC
> 
> On 12/08/2016 07:59 PM, Atin Mukherjee wrote:
> 
> 
> 
> 
> 
> On Thu, Dec 8, 2016 at 4:37 PM, Micha Ober < mich...@gmail.com > wrote:
> 
> 
> 
> Hi Rafi,
> 
> thank you for your support. It is greatly appreciated.
> 
> Just some more thoughts from my side:
> 
> There have been no reports from other users in *this* thread until now, but I
> have found at least one user with a very similar problem in an older thread:
> 
> https://www.gluster.org/pipermail/gluster-users/2014-November/019637.html
> 
> He is also reporting disconnects with no apparent reason, although his setup
> is a bit more complicated, also involving a firewall. In our setup, all
> servers/clients are connected via 1 GbE with no firewall or anything that
> might block/throttle traffic. Also, we are using exactly the same software
> versions on all nodes.
> 
> 
> I can also find some reports in the bugtracker when searching for
> "rpc_client_ping_timer_expired" and "rpc_clnt_ping_timer_expired" (looks
> like spelling changed during versions).
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1096729
> 
> Just FYI, this is a different issue, here GlusterD fails to handle the volume
> of incoming requests on time since MT-epoll is not enabled here.
> 
> 
> 
> 
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1370683
> 
> But both reports involve large traffic/load on the bricks/disks, which is not
> the case for our setup.
> To give a ballpark figure: Over three days, 30 GiB were written. And the data
> was not written at once, but continuously over the whole time.
> 
> 
> Just to be sure, I have checked the logfiles of one of the other clusters
> right now, which are sitting in the same building, in the same rack, even on
> the same switch, running the same jobs, but with glusterfs 3.4.2 and I can
> see no disconnects in the logfiles. So I can definitely rule out our
> infrastructure as problem.
> 
> Regards,
> Micha
> 
> 
> 
> Am 07.12.2016 um 18:08 schrieb Mohammed Rafi K C:
> 
> 
> 
> 
> Hi Micha,
> 
> This is great. I will provide you one debug build which has two fixes which I
> possible suspect for a frequent disconnect issue, though I don't have much
> data to validate my theory. So I will take one more day to dig in to that.
> 
> Thanks for your support, and opensource++
> 
> Regards
> 
> Rafi KC
> On 12/07/2016 05:02 AM, Micha Ober wrote:
> 
> 
> 
> Hi,
> 
> thank you for your answer and even more for the question!
> Until now, I was using FUSE. Today I changed all mounts to NFS using the same
> 3.7.17 version.
> 
> But: The problem is still the same. Now, the NFS logfile contains lines like
> these:
> 
> [2016-12-06 15:12:29.006325] C
> [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired] 0-gv0-client-7: server
> X.X.18.62:49153 has not responded in the last 42 seconds, disconnecting.
> 
> Interestingly enough, the IP address X.X.18.62 is the same machine! As I
> wrote earlier, each node serves both as a server and a client, as each node
> contributes bricks to the volume. Every server is connecting to itself via
> its hostname. For example, the fstab on the node "giant2" looks like:
> 
> #giant2:/gv0 /shared_data glusterfs defaults,noauto 0 0
> #giant2:/gv2 /shared_slurm glusterfs defaults,noauto 0 0
> 
> giant2:/gv0 /shared_data nfs defaults,_netdev,vers=3 0 0
> giant2:/gv2 /shared_slurm nfs defaults,_netdev,vers=3 0 0
> 
> So I understand the disconnects even less.
> 
> I don't know if it's possible to create a dummy cluster which exposes the
> same behaviour, because the