Re: [Gluster-devel] Review-request: Readdirp (ls -l) Performance Improvement

2020-05-27 Thread RAFI KC

Results for a single ls on a directory with 10k directories inside (16x3 volume):

Configuration              Plain volume    Parallel-readdir    Proposed Solution
Single Dir ls (Seconds)    -               135                 32.744

This is a 321% improvement.

Regards
Rafi KC

On 27/05/20 11:22 am, RAFI KC wrote:


Hi All,

I have been working on a POC to improve readdirp performance. At the 
end of the experiment, the results are promising: overall there is a 
104% improvement for a full filesystem crawl compared to the existing 
solution. Here are the short test numbers. The tests were carried out 
on a 16x3 setup with 1.5 million dentries (both files and directories). 
The system also contains some empty directories. *In these results the 
proposed solution is 287% faster than the plain volume and 104% faster 
than the parallel-readdir based solution.*


Configuration               Plain volume    Parallel-readdir    Proposed Solution
FS Crawl Time (Seconds)     16497.523       8717.872            4261.401

In short, the basic idea behind the proposal is efficient management of 
the readdir buffer in gluster, along with prefetching dentries for an 
intelligent switch-over to the next buffer. The detailed problem 
description, design description and results are available in the doc: 
https://docs.google.com/document/d/10z4T5Sd_-wCFrmDrzyQtlWOGLang1_g17wO8VUxSiJ8/edit
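For illustration, a minimal standalone sketch of the double-buffer plus
prefetch idea (this only illustrates the general technique; it is not the
actual patch, which is in the reviews linked below):

#include <dirent.h>

/* Two dentry buffers: one being drained, one being prefetched, so the
 * switch-over to the next buffer does not have to wait for the backend. */
struct readdir_buf {
    struct dirent entries[1024];
    int count;  /* number of valid prefetched entries */
    int next;   /* next entry to hand out */
};

static struct readdir_buf bufs[2];
static int active; /* index of the buffer currently being consumed */

/* Fill 'buf' from the backend; in gluster this would be a readdirp call. */
static void
prefetch(DIR *dir, struct readdir_buf *buf)
{
    struct dirent *d;
    buf->count = buf->next = 0;
    while (buf->count < 1024 && (d = readdir(dir)) != NULL)
        buf->entries[buf->count++] = *d;
}

struct dirent *
next_entry(DIR *dir)
{
    struct readdir_buf *cur = &bufs[active];

    /* When the active buffer is half drained, fill the standby buffer
     * (asynchronously in the real design) so the switch is instant. */
    if (cur->count && cur->next == cur->count / 2 && bufs[1 - active].count == 0)
        prefetch(dir, &bufs[1 - active]);

    if (cur->next == cur->count) {  /* active buffer drained */
        cur->count = cur->next = 0; /* recycle it for the next prefetch */
        active = 1 - active;        /* switch over to the prefetched buffer */
        cur = &bufs[active];
        if (cur->count == 0)
            prefetch(dir, cur);     /* cold start or prefetch fallback */
    }
    return (cur->next < cur->count) ? &cur->entries[cur->next++] : NULL;
}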




https://review.gluster.org/24469

https://review.gluster.org/24470


Regards

Rafi KC


___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel




[Gluster-devel] Review-request: Readdirp (ls -l) Performance Improvement

2020-05-26 Thread RAFI KC

Hi All,

I have been working on a POC to improve readdirp performance. At the 
end of the experiment, the results are promising: overall there is a 
104% improvement for a full filesystem crawl compared to the existing 
solution. Here are the short test numbers. The tests were carried out 
on a 16x3 setup with 1.5 million dentries (both files and directories). 
The system also contains some empty directories. *In these results the 
proposed solution is 287% faster than the plain volume and 104% faster 
than the parallel-readdir based solution.*


Configuration               Plain volume    Parallel-readdir    Proposed Solution
FS Crawl Time (Seconds)     16497.523       8717.872            4261.401

In short, the basic idea behind the proposal is efficient management of 
the readdir buffer in gluster, along with prefetching dentries for an 
intelligent switch-over to the next buffer. The detailed problem 
description, design description and results are available in the doc: 
https://docs.google.com/document/d/10z4T5Sd_-wCFrmDrzyQtlWOGLang1_g17wO8VUxSiJ8/edit




https://review.gluster.org/24469

https://review.gluster.org/24470


Regards

Rafi KC

___

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



[Gluster-devel] Minutes of Gluster Community Meeting (APAC) 14th April 2020

2020-04-14 Thread RAFI KC


# Gluster Community Meeting - 14-04-2020



### Previous Meeting minutes:

- http://github.com/gluster/community
- Recording of this meeting-

### Date/Time: Check the [community 
calendar](https://calendar.google.com/calendar/b/1?cid=dmViajVibDBrbnNiOWQwY205ZWg5cGJsaTRAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ)


### Bridge
* APAC/EMEA friendly hours
  - Tuesday 14-04-2020, 02:30PM IST
  - Bridge: https://bluejeans.com/441850968


---

### Attendance
Name (#gluster-dev alias) - company
* Rafi (#rafi) - Red Hat
* Hari Gowtham (#hgowtham) - Red Hat
* Sheetal Pamecha (#spamecha) - Red Hat
* Aravinda (Kadalu.IO)
* Amar Tumballi (Kadalu.IO)
* Rinku Kothiya - (#rinku) - Red Hat
* Sunny(#sunny) - Red Hat
* Ravi (@itisravi) - Red Hat
* Sunil Kumar - Red Hat
* Sanju - Red Hat
* Deepshikha - Red Hat
* Barak - Red Hat

### User stories
* https://lists.gluster.org/pipermail/gluster-users/2020-April/038026.html
* https://twitter.com/gluster/status/1247990943602495488

### Community

* Project metrics:


Metrics                                 Value
Coverity                                61
Clang Scan                              58
Test coverage                           71.1
New Issues in last 14 days (master)     28
Gluster User Queries in last 14 days    79
Total Github issues                     738




* Any release updates?

    - Release-5.13: tagging is done; Sheetal and Shwetha are looking 
into building packages. Once that is done, we will announce the release.

    - 5.13 will be the last release in branch 5.

    - Unlike the previous process, the same issue is currently being used 
for both release 7 and release 5, so the issue has been tagged with the 
appropriate release while merging.

https://review.gluster.org/#/c/glusterfs/+/24231/
https://review.gluster.org/#/c/glusterfs/+/24233/
Discussion about how we are going to backport the patches:

Xavi clarified earlier that we can use the same issue.
Rinku proposed using labels to indicate that an issue is used for various 
branches as well.


Template for tracker :
https://review.gluster.org/#/c/glusterfs/+/24285/

Sanju: make changes in the template to ask the reporter to get the labels 
right from the maintainer at least; it would be better if we can give 
others the permission to assign labels.
Amar: the reporter needs write permission in the repo.
Deepshikha: created a team of developers; this team will have 
permissions to add labels and other necessary values.



* Blocker issues across the project?
https://github.com/gluster/glusterfs/issues/884 has been merged in master.

* Reduce number of release cycles - announcement to the users ML [Hari]
* Reach out to Amar and take the release-8 work forward to packaging.
* Tentative date for release 8 - end of April, start branching
* https://www.gluster.org/release-schedule/ - update the release schedule 
on gluster.org [Rinku]. Hari will let him know about the other issues on 
the website.


* Notable threads from the mailing list

    - glfs_fallocate() looks completely broken on disperse volumes with 
sharding enabled. tests/bugs/shard/zero-flag.t failed when changing the 
volume type to disperse.
    - Impressive boot times for big clusters: NFS, Image Objects, and 
Sharding

Rafi will ping the dev offline.
Ravi has a fix for the split-brain issue reported by a user; he needs to 
reach out to the reporter to see if it fixed the actual issue.



### Conferences / Meetups

COVID-19 impact is likely for events!

* Storage Developer Conference

Important dates:
CFP Closed : No (May 15, 2020)
Schedule Announcement :
Event Open for Registration : Yes (early bird registration till 8/24/2020)
Last Date of Registration :
Event dates: September 21-24, 2020
Venue: Santa Clara, CA

Talks related to gluster:

* Devconf.us

Important dates:
CFP Closed : 27th April
Schedule Announcement : July 3, 2020
Event Open for Registration : July 3 2020
Last Date of Registration :
Event dates: September 23rd to 25th 2020
Venue: Framingham, MA.

* https://sambaxp.org/

Important dates:
CFP Closed : Closed
Event Open for Registration : Open
Event dates: 26th - 28th of May 2020
Venue: Online

Talks related to gluster:



### GlusterFS - v8.0 and beyond
* Have the release dates updated. Hari, Rinku and Amar will discuss what 
is left for release 8, get the dates right, and take release-8 to a 
conclusion.


### Developer focus

* Any design specs to discuss?
Any effort done on the perf side of gluster? gluster vs other storage? 
any design change? [Sunil]
* [Ravi] I am working on a patch that adds io_uring functionality to 
posix xlator. Will create a github issue and share details once ready.

* While backporting, please do add the release-branch label.

Re: [Gluster-devel] WORM-Xlator: How to get filepath in worm_create_cbk?

2020-02-05 Thread RAFI KC

Hi David,

As I said earlier the inode is not linked with itable, so similar to 
inode_path, inode_parent also won't work. We need to remember the parent 
inode during the worm_create. May be we can store it in the frame->local 
by creating a struct(if we have more than 2 elements to remember), or 
simply store it in the frame->local = inode_ref(loc->parent). Please 
make sure to unref/free the local.
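A minimal sketch of what that could look like (fop signatures as in 
worm.c; error handling and the rest of the fops are omitted, and the 
cbk fragment is only illustrative):

int32_t
worm_create(call_frame_t *frame, xlator_t *this, loc_t *loc, int32_t flags,
            mode_t mode, mode_t umask, fd_t *fd, dict_t *xdata)
{
    /* loc->parent is already linked in the itable, so keep a ref for the cbk */
    frame->local = inode_ref(loc->parent);

    STACK_WIND(frame, worm_create_cbk, FIRST_CHILD(this),
               FIRST_CHILD(this)->fops->create, loc, flags, mode, umask, fd,
               xdata);
    return 0;
}

/* inside worm_create_cbk */
inode_t *parent = frame->local;
char *dirpath = NULL;

if (parent) {
    /* path of the parent directory; enough to decide whether the file
     * lives inside the special folder */
    inode_path(parent, NULL, &dirpath);
    /* ... skip setting the WORM xattrs if dirpath matches the folder ... */
    GF_FREE(dirpath);
    inode_unref(parent);
    frame->local = NULL; /* avoid freeing the inode as 'local' on frame destroy */
}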



Regards

Rafi KC

On 2/5/20 7:18 PM, David Spisla wrote:

Hello Rafi,
I understand. I first tried the way with the parent inode. I did it 
this way:


inode_t *parent_inode = NULL;
char *filepath = NULL;
parent_inode = inode_parent(inode, NULL, NULL); // also with fd->inode 
it didn't work

if (!parent_inode) {
  gf_log(this->name, GF_LOG_ERROR, "Can't get parent inode!");
}
inode_path(parent_inode, NULL, &filepath);
if (!filepath) {
   gf_log(this->name, GF_LOG_ERROR, "Can't get filepath!");
}

But it didn't work. See brick log:
[2020-02-05 13:39:56.408915] E [worm.c:489:worm_create_cbk] 
0-repo2-worm: Can't get parent inode!
[2020-02-05 13:39:56.408941] E [worm.c:495:worm_create_cbk] 
0-repo2-worm: Can't get filepath!


What could be wrong? If this way promises no success, I will try out the 
other approach you suggested.


Regards
David Spisla

On Wed, Feb 5, 2020 at 12:00 PM RAFI KC <rkavu...@redhat.com> wrote:



On 2/5/20 4:15 PM, David Spisla wrote:

Hello Amar,
I do the following in worm_create_cbk:

char *filepath = NULL;
inode_path(inode, NULL, &filepath);
if (!filepath) {
    gf_log(this->name, GF_LOG_ERROR, "Can't get filepath!");
}

Unfortunately I got this in the brick log:
[2020-02-05 10:09:41.880522] E [inode.c:1498:__inode_path]
(-->/usr/lib64/glusterfs/5.11/xlator/features/worm.so(+0xb129)
[0x7f4657df7129] -->/usr/lib64/libglusterfs.so.0(inode_path+0x31)
[0x7f4664e44961] -->/us
r/lib64/libglusterfs.so.0(__inode_path+0x38b) [0x7f4664e448bb] )
0-: Assertion failed: 0
[2020-02-05 10:09:41.880580] W [inode.c:1500:__inode_path]
(-->/usr/lib64/glusterfs/5.11/xlator/features/worm.so(+0xb129)
[0x7f4657df7129] -->/usr/lib64/libglusterfs.so.0(inode_path+0x31)
[0x7f4664e44961] -->/us
r/lib64/libglusterfs.so.0(__inode_path+0x3d3) [0x7f4664e44903] )
0-repo2-worm: invalid inode [Invalid argument]
[2020-02-05 10:09:41.880594] E [worm.c:488:worm_create_cbk]
0-repo2-worm: Can't get filepath!

The inode I use seems to be invalid, because inode_path()
returns with an error. The same happens with fd->inode. Is there a way to
validate the inode before passing it to the function?


This inode hasn't been linked to the inode table yet (creation is still
in progress); that only happens at server4_post_create in the server
xlator, which is the last xlator in the cbk path. That is why the
inode_path call fails. Why don't you use the parent inode to create
the path? I believe the parent inode will work for you.


If all the files and folders in the special directory follow the same
property, an alternative approach is to use an inode type to
distinguish this special directory and the dentries in it, something
similar to snapview-client, which uses a virtual inode to distinguish
the .snap folder.


Regards

Rafi KC





Regards
David



On Tue, Feb 4, 2020 at 5:57 PM Amar Tumballi <a...@kadalu.io> wrote:



On Tue, Feb 4, 2020 at 7:16 PM David Spisla <spisl...@gmail.com> wrote:

Dear Gluster Community,
in worm_create_cbk a file gets the xattr
"trusted.worm_file" and "trusted.start_time" if
worm-file-level is enabled. Now I want to exclude some
files in a special folder from the WORM function.
Therefore I want to check in worm_create_cbk if the file
is in this folder or not. But I don't find a parameter
where the filepath is stored. So my alternative solution was to
check it in worm_create (via loc->path) and store a boolean value
in frame->local. This boolean value will be used in
worm_create_cbk later. But it's not my favourite solution.

Do you know how to get the filepath in the cbk function?


As per FS guidelines, inside the filesystem we need to handle
inodes or parent-inode + basename. If you are looking
at building 'path' info in create_cbk, then I recommend using
'inode_path()' to build the path as per the latest inode table
information.

-Amar


-- 
https://kadalu.io

Container Storage made easy!

___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968



Re: [Gluster-devel] WORM-Xlator: How to get filepath in worm_create_cbk?

2020-02-05 Thread RAFI KC


On 2/5/20 4:15 PM, David Spisla wrote:

Hello Amar,
I do the following in worm_create_cbk:

char *filepath = NULL;
inode_path(inode, NULL, &filepath);
if (!filepath) {
    gf_log(this->name, GF_LOG_ERROR, "Can't get filepath!");
}

Unfortunately I got this in the brick log:
[2020-02-05 10:09:41.880522] E [inode.c:1498:__inode_path] 
(-->/usr/lib64/glusterfs/5.11/xlator/features/worm.so(+0xb129) 
[0x7f4657df7129] -->/usr/lib64/libglusterfs.so.0(inode_path+0x31) 
[0x7f4664e44961] -->/us
r/lib64/libglusterfs.so.0(__inode_path+0x38b) [0x7f4664e448bb] ) 0-: 
Assertion failed: 0
[2020-02-05 10:09:41.880580] W [inode.c:1500:__inode_path] 
(-->/usr/lib64/glusterfs/5.11/xlator/features/worm.so(+0xb129) 
[0x7f4657df7129] -->/usr/lib64/libglusterfs.so.0(inode_path+0x31) 
[0x7f4664e44961] -->/us
r/lib64/libglusterfs.so.0(__inode_path+0x3d3) [0x7f4664e44903] ) 
0-repo2-worm: invalid inode [Invalid argument]
[2020-02-05 10:09:41.880594] E [worm.c:488:worm_create_cbk] 
0-repo2-worm: Can't get filepath!


The inode I use seems to be invalid, because inode_path() returns 
with an error. The same happens with fd->inode. Is there a way to validate 
the inode before passing it to the function?


This inode hasn't been linked to the inode table yet (creation is still in 
progress); that only happens at server4_post_create in the server 
xlator, which is the last xlator in the cbk path. That is why the 
inode_path call fails. Why don't you use the parent inode to create 
the path? I believe the parent inode will work for you.



If all the files and folders in the special directory follow the same 
property, an alternative approach is to use an inode type to distinguish 
this special directory and the dentries in it, something similar to 
snapview-client, which uses a virtual inode to distinguish the .snap folder.



Regards

Rafi KC





Regards
David



On Tue, Feb 4, 2020 at 5:57 PM Amar Tumballi <a...@kadalu.io> wrote:




On Tue, Feb 4, 2020 at 7:16 PM David Spisla <spisl...@gmail.com> wrote:

Dear Gluster Community,
in worm_create_cbk a file gets the xattr "trusted.worm_file"
and "trusted.start_time" if worm-file-level is enabled. Now I
want to exclude some files in a special folder from the WORM
function. Therefore I want to check in worm_create_cbk if the
file is in this folder or not. But I don't find a parameter
where the filepath is stored. So my alternative solution was
to check it in worm_create (via loc->path) and store a boolean
value in frame->local. This boolean value will be used in
worm_create_cbk later. But it's not my favourite solution.

Do you know how to get the filepath in the cbk function?


As per FS guidelines, inside the filesystem we need to handle
inodes or parent-inode + basename. If you are looking at building
'path' info in create_cbk, then I recommend using 'inode_path()'
to build the path as per the latest inode table information.

-Amar


-- 
https://kadalu.io

Container Storage made easy!

___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Introduce GD_OP_VERSION_5_10 and increase GD_OP_VERSION_MAX to this version

2020-01-29 Thread RAFI KC


On 1/29/20 2:28 PM, David Spisla wrote:

Dear Gluster Devels,
I am using gluster 5.10 and want to introduce a new volume option. 
Therefore I want to set a proper GD_OP_VERSION for it. In the gluster 
5.10 source code there is no macro defined for 51000.


Currently, GD_OP_VERSION_MAX is set to 50400. I would do 
something like this:


1. Change in libglusterfs/src/globals.h (line 47):
#define GD_OP_VERSION_MAX                    \
    GD_OP_VERSION_5_10
2. Add a line to the same header file:
#define GD_OP_VERSION_5_10 51000 /* Op-version for GlusterFS 5.10 */

Do you think this is fine?


This should be fine. Also, define the new volume set command with the 
newly introduced op-version.
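For illustration, the option's entry in glusterd's volume-option table 
(glusterd-volume-set.c) would then carry the new op-version; the key and 
xlator names below are placeholders, and the exact set of fields should 
be matched against the existing entries in your tree:

{
    .key = "features.my-new-option",   /* hypothetical option name */
    .voltype = "features/worm",        /* xlator that consumes the option */
    .value = "off",
    .op_version = GD_OP_VERSION_5_10,  /* the newly defined op-version */
    .flags = VOLOPT_FLAG_CLIENT_OPT,
},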




3. libglusterfs/src/common-utils.c (line 2036):
On the other side, there is an if-branch which uses GD_OP_VERSION_5_4, 
which is currently the GD_OP_VERSION_MAX. Why is it used here, and 
should I also increase it to GD_OP_VERSION_5_10?


In Gluster 5.4, the checksum calculation was changed to a new method. 
Hence, in a heterogeneous cluster (not every node is on the same version, 
which usually happens during an upgrade), peers would go to the rejected 
state, as part of the cluster would compute the checksum with the new 
method and the rest with the old one. This check prevents that.



Keep the op-version as GD_OP_VERSION_5_4 for this check, because if you 
have a heterogeneous cluster where the op-version is 5.4 or higher, it is 
completely fine to do the checksum calculation with the new method.


Regards

Rafi KC




Regards
David Spisla


___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Geo-rep start after snapshot restore makes the geo-rep faulty

2019-09-23 Thread RAFI KC


On 9/23/19 4:13 PM, Shwetha Acharya wrote:

Hi All,
I am planning to work on this 
 bugzilla issue.
Here, when we restore the snapshots and start the geo-replication 
session, we see that geo-replication goes faulty. This is mainly 
because the brick path of the original session and of the session after 
the snapshot restore will be different. There is a proposed workaround 
for this issue, according to which we replace the old brick path with 
the new brick path inside the index file HTIME.xx, which basically 
solves the issue.


I have some doubts regarding the same.
We have been going with the workaround for a long time. Are there any 
limitations stopping us from implementing a proper solution for this, 
which I am currently unaware of?
Is it important to have paths inside the index file? Can we eliminate 
the paths inside them?

Are there any concerns from the snapshot side?


Can you please explain how we are planning to replace the path in the 
index file? Have we finalized the method? The problem here is that any 
time-consuming operation within the glusterd transaction could be 
difficult.


Rafi


Are there any other general concerns regarding the same?

Regards,
Shwetha
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] ./tests/bugs/snapshot/bug-1399598-uss-with-ssl.t generating core very often

2019-05-18 Thread RAFI KC
All of these links have a common backtrace, suggesting a crash in the 
socket layer within the SSL code path.


The backtrace is:

Thread 1 (Thread 0x7f9cfbfff700 (LWP 31373)):
#0  0x7f9d14d65400 in ssl3_free_digest_list () from /lib64/libssl.so.10
No symbol table info available.
#1  0x7f9d14d65586 in ssl3_digest_cached_records () from 
/lib64/libssl.so.10

No symbol table info available.
#2  0x7f9d14d5f91d in ssl3_send_client_verify () from 
/lib64/libssl.so.10

No symbol table info available.
#3  0x7f9d14d61be7 in ssl3_connect () from /lib64/libssl.so.10
No symbol table info available.
#4  0x7f9d14fb3585 in ssl_complete_connection (this=0x7f9ce802e980) 
at 
/home/jenkins/root/workspace/centos7-regression/rpc/rpc-transport/socket/src/socket.c:482

    ret = -1
    cname = 0x0
    r = -1
    ssl_error = -1
    priv = 0x7f9ce802efc0
    __FUNCTION__ = "ssl_complete_connection"
#5  0x7f9d14fbb596 in ssl_handle_client_connection_attempt 
(this=0x7f9ce802e980) at 
/home/jenkins/root/workspace/centos7-regression/rpc/rpc-transport/socket/src/socket.c:2809

    priv = 0x7f9ce802efc0
    ctx = 0x7f9d08001170
    idx = 1
    ret = -1
    fd = 16
    __FUNCTION__ = "ssl_handle_client_connection_attempt"
#6  0x7f9d14fbb8b3 in socket_complete_connection 
(this=0x7f9ce802e980) at 
/home/jenkins/root/workspace/centos7-regression/rpc/rpc-transport/socket/src/socket.c:2908

    priv = 0x7f9ce802efc0
    ctx = 0x7f9d08001170
    idx = 1
    gen = 4
    ret = -1
    fd = 16
#7  0x7f9d14fbbc16 in socket_event_handler (fd=16, idx=1, gen=4, 
data=0x7f9ce802e980, poll_in=0, poll_out=4, poll_err=0, 
event_thread_died=0 '\000') at 
/home/jenkins/root/workspace/centos7-regression/rpc/rpc-transport/socket/src/socket.c:2970
#8  0x7f9d20c896c1 in event_dispatch_epoll_handler 
(event_pool=0x7f9d08034960, event=0x7f9cfbffe140) at 
/home/jenkins/root/workspace/centos7-regression/libglusterfs/src/event-epoll.c:648
#9  0x7f9d20c89bda in event_dispatch_epoll_worker 
(data=0x7f9cf4000b60) at 
/home/jenkins/root/workspace/centos7-regression/libglusterfs/src/event-epoll.c:761

#10 0x7f9d1fa39dd5 in start_thread () from /lib64/libpthread.so.0


Mohith,

Do you have any idea what is going on with ssl?

Regards

Rafi KC

On 5/16/19 8:01 PM, Sanju Rakonde wrote:

Thank you for the quick responses.

I missed pasting the links here. You can find the core file in the 
following links.

https://build.gluster.org/job/centos7-regression/6035/consoleFull
https://build.gluster.org/job/centos7-regression/6055/consoleFull
https://build.gluster.org/job/centos7-regression/6045/consoleFull

On Thu, May 16, 2019 at 7:49 PM RAFI KC <rkavu...@redhat.com> wrote:


Currently I'm looking into one of the priority issues; in parallel
I will also look into this.

Sanju,

Do you have a link to the core file?


Regards

Rafi KC

On 5/16/19 7:28 PM, FNU Raghavendra Manjunath wrote:


I am working on the other uss issue, i.e. the occasional failure of
uss.t due to delays in the brick-mux regression. Rafi, can you
please look into this?

Regards,
Raghavendra

On Thu, May 16, 2019 at 9:48 AM Sanju Rakonde <srako...@redhat.com> wrote:

In most of the regression jobs
./tests/bugs/snapshot/bug-1399598-uss-with-ssl.t is dumping
core, hence the regression is failing for many patches.

Rafi/Raghavendra, can you please look into this issue?

-- 
Thanks,

Sanju




--
Thanks,
Sanju
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] ./tests/bugs/snapshot/bug-1399598-uss-with-ssl.t generating core very often

2019-05-16 Thread RAFI KC
Currently I'm looking into one of the priority issues; in parallel I will 
also look into this.


Sanju,

Do you have a link to the core file?


Regards

Rafi KC

On 5/16/19 7:28 PM, FNU Raghavendra Manjunath wrote:


I am working on the other uss issue, i.e. the occasional failure of uss.t 
due to delays in the brick-mux regression. Rafi, can you please look 
into this?


Regards,
Raghavendra

On Thu, May 16, 2019 at 9:48 AM Sanju Rakonde <srako...@redhat.com> wrote:


In most of the regression jobs
./tests/bugs/snapshot/bug-1399598-uss-with-ssl.t is dumping core,
hence the regression is failing for many patches.

Rafi/Raghavendra, can you please look into this issue?

-- 
Thanks,

Sanju

___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] [Gluster-users] Replica 3 - how to replace failed node (peer)

2019-04-10 Thread RAFI KC
Reset-brick is another way of replacing a brick. It is usually helpful 
when you want to replace the brick with the same name. You can find the 
documentation here: 
https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command.


In your case, I think you can use this to replace the brick: initiate a 
reset-brick start, then replace your failed disk and create a new brick 
with the same name. Once you have a healthy disk and brick, you can 
commit the reset-brick.
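Roughly, the command sequence would look like the following (a sketch 
based on the reset-brick documentation linked above; substitute your 
actual volume name for <volname>):

gluster volume reset-brick <volname> node2.san:/tank/gluster/gv0imagestore/brick1 start
# replace the failed disks, rebuild the raid array and recreate the
# empty brick directory at the same path on node2
gluster volume reset-brick <volname> node2.san:/tank/gluster/gv0imagestore/brick1 \
    node2.san:/tank/gluster/gv0imagestore/brick1 commit force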



Let us know if you have any questions.


Rafi KC


On 4/10/19 3:39 PM, David Spisla wrote:

Hello Martin,

look here:
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/pdf/administration_guide/Red_Hat_Gluster_Storage-3.4-Administration_Guide-en-US.pdf
on page 324. There is a manual on how to replace a brick in case of a 
hardware failure.


Regards
David Spisla

On Wed, Apr 10, 2019 at 11:42 AM Martin Toth <snowmai...@gmail.com> wrote:


Hi all,

I am running replica 3 gluster with 3 bricks. One of my servers
failed - all disks are showing errors and raid is in fault state.

Type: Replicate
Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 <— this brick
is down
Brick3: node3.san:/tank/gluster/gv0imagestore/brick1

So one of my bricks has totally failed (node2). It went down and
all its data is lost (failed raid on node2). Now I am running only
two bricks on 2 servers out of 3.
This is a really critical problem for us; we could lose all data. I
want to add new disks to node2, create a new raid array on them and
try to replace the failed brick on this node.

What is the procedure for replacing Brick2 on node2? Can someone
advise? I can't find anything relevant in the documentation.

Thanks in advance,
Martin
___
Gluster-users mailing list
gluster-us...@gluster.org <mailto:gluster-us...@gluster.org>
https://lists.gluster.org/mailman/listinfo/gluster-users


___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] shd multiplexing patch has introduced coverity defects

2019-04-05 Thread RAFI KC
This patch will try to address the reported issues: 
https://review.gluster.org/#/c/glusterfs/+/22514/



Regards

Rafi KC

On 4/5/19 10:35 AM, Rafi Kavungal Chundattu Parambil wrote:

Yes. I will work on this.

Rafi KC

- Original Message -
From: "Atin Mukherjee" 
To: "Rafi Kavungal Chundattu Parambil" 
Cc: "Gluster Devel" 
Sent: Thursday, April 4, 2019 11:47:59 AM
Subject: shd multiplexing patch has introduced coverity defects

Based on yesterday's coverity scan report, 6 defects are introduced because
of the shd multiplexing patch. Could you address them, Rafi?
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] Not able to start glusterd

2019-03-06 Thread RAFI KC

Hi Abhishek,

Good to know that you have resolved your problem. Do you think any more 
information needs to be added to the upgrade doc for a smoother upgrade 
flow? It would be great to see a PR to the repo 
https://github.com/gluster/glusterdocs/tree/master/docs/Upgrade-Guide 
updating the doc with that information.



Regards

Rafi KC

On 3/6/19 2:03 PM, ABHISHEK PALIWAL wrote:

Hi Sanju,

Thanks for the response.

I have resolved the issue. Actually, I had updated from 3.7.6 to 5.0; in 
the new version RPC comes from libtirpc, but I forgot to enable 
"--with-libtirpc" in the configuration.


After enabling it, I was able to start glusterd.

Regards,
Abhishek

On Wed, Mar 6, 2019 at 12:58 PM Sanju Rakonde <mailto:srako...@redhat.com>> wrote:


Abhishek,

We need the below information to investigate this issue.
1. gluster --version
2. Please run glusterd in gdb, so that we can capture the
backtrace. I see some rpc errors in the log, but a backtrace will be
more helpful.
    To run glusterd in gdb, start glusterd under gdb (i.e.
"gdb glusterd", and then give the command "run -N"). When you see a
segmentation fault, please capture the backtrace and
paste it here.

On Wed, Mar 6, 2019 at 10:07 AM ABHISHEK PALIWAL
mailto:abhishpali...@gmail.com>> wrote:

Hi Team,

I am facing the issue where at the time of starting the
glusterd segmentation fault is reported.

Below are the logs

root@128:/usr/sbin# ./glusterd  --debug
[1970-01-01 15:19:43.940386] I [MSGID: 100030]
[glusterfsd.c:2691:main] 0-./glusterd: Started running
./glusterd version 5.0 (args: ./glusterd --debug)
[1970-01-01 15:19:43.940855] D
[logging.c:1833:__gf_log_inject_timer_event] 0-logging-infra:
Starting timer now. Timeout = 120, current buf size = 5
[1970-01-01 15:19:43.941736] D [MSGID: 0]
[glusterfsd.c:747:get_volfp] 0-glusterfsd: loading volume file
/etc/glusterfs/glusterd.vol
[1970-01-01 15:19:43.945796] D [MSGID: 101097]
[xlator.c:341:xlator_dynload_newway] 0-xlator:
dlsym(xlator_api) on
/usr/lib64/glusterfs/5.0/xlator/mgmt/glusterd.so: undefined
symbol: xlator_api. Fall back to old symbols
[1970-01-01 15:19:43.946279] I [MSGID: 106478]
[glusterd.c:1435:init] 0-management: Maximum allowed open file
descriptors set to 65536
[1970-01-01 15:19:43.946419] I [MSGID: 106479]
[glusterd.c:1491:init] 0-management: Using /var/lib/glusterd
as working directory
[1970-01-01 15:19:43.946515] I [MSGID: 106479]
[glusterd.c:1497:init] 0-management: Using /var/run/gluster as
pid file working directory
[1970-01-01 15:19:43.946968] D [MSGID: 0]
[glusterd.c:458:glusterd_rpcsvc_options_build] 0-glusterd:
listen-backlog value: 10
[1970-01-01 15:19:43.947139] D [rpcsvc.c:2607:rpcsvc_init]
0-rpc-service: RPC service inited.
[1970-01-01 15:19:43.947241] D
[rpcsvc.c:2146:rpcsvc_program_register] 0-rpc-service: New
program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0
[1970-01-01 15:19:43.947379] D
[rpc-transport.c:269:rpc_transport_load] 0-rpc-transport:
attempt to load file
/usr/lib64/glusterfs/5.0/rpc-transport/socket.so
[1970-01-01 15:19:43.955198] D [socket.c:4464:socket_init]
0-socket.management: Configued transport.tcp-user-timeout=0
[1970-01-01 15:19:43.955316] D [socket.c:4482:socket_init]
0-socket.management: Reconfigued transport.keepalivecnt=9
[1970-01-01 15:19:43.955415] D
[socket.c:4167:ssl_setup_connection_params]
0-socket.management: SSL support on the I/O path is NOT enabled
[1970-01-01 15:19:43.955504] D
[socket.c:4170:ssl_setup_connection_params]
0-socket.management: SSL support for glusterd is NOT enabled
[1970-01-01 15:19:43.955612] D
[name.c:572:server_fill_address_family] 0-socket.management:
option address-family not specified, defaulting to inet6
[1970-01-01 15:19:43.955928] D
[rpc-transport.c:269:rpc_transport_load] 0-rpc-transport:
attempt to load file
/usr/lib64/glusterfs/5.0/rpc-transport/rdma.so
[1970-01-01 15:19:43.956079] E
[rpc-transport.c:273:rpc_transport_load] 0-rpc-transport:
/usr/lib64/glusterfs/5.0/rpc-transport/rdma.so: cannot open
shared object file: No such file or directory
[1970-01-01 15:19:43.956177] W
[rpc-transport.c:277:rpc_transport_load] 0-rpc-transport:
volume 'rdma.management': transport-type 'rdma' is not valid
or not found on this machine
[1970-01-01 15:19:43.956270] W
[rpcsvc.c:1789:rpcsvc_create_listener] 0-rpc-service: cannot
create listener, initing the transport failed
[1970-01-0

Re: [Gluster-devel] Release 6: Kick off!

2019-01-23 Thread RAFI KC

There are three patches that I'm working on for Gluster-6.

[1] : https://review.gluster.org/#/c/glusterfs/+/22075/

[2] : https://review.gluster.org/#/c/glusterfs/+/21333/

[3] : https://review.gluster.org/#/c/glusterfs/+/21720/


Regards

Rafi KC

On 1/19/19 1:51 AM, Shyam Ranganathan wrote:

On 12/6/18 9:34 AM, Shyam Ranganathan wrote:

On 11/6/18 11:34 AM, Shyam Ranganathan wrote:

## Schedule

We have decided to postpone release-6 by a month, to accommodate for
late enhancements and the drive towards getting what is required for the
GCS project [1] done in core glusterfs.

This puts the (modified) schedule for Release-6 as below,

Working backwards on the schedule, here's what we have:
- Announcement: Week of Mar 4th, 2019
- GA tagging: Mar-01-2019
- RC1: On demand before GA
- RC0: Feb-04-2019
- Late features cut-off: Week of Jan-21st, 2018
- Branching (feature cutoff date): Jan-14-2018
   (~45 days prior to branching)

We are slightly past the branching date, I would like to branch early
next week, so please respond with a list of patches that need to be part
of the release and are still pending a merge, will help address review
focus on the same and also help track it down and branch the release.

Thanks, Shyam
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Implementing multiplexing for self heal client.

2019-01-07 Thread RAFI KC
I have completed the patches and pushed them for review. Please feel free 
to raise your review concerns/suggestions.



https://review.gluster.org/#/c/glusterfs/+/21868

https://review.gluster.org/#/c/glusterfs/+/21907

https://review.gluster.org/#/c/glusterfs/+/21960

https://review.gluster.org/#/c/glusterfs/+/21989/



Regards

Rafi KC


On 12/24/18 3:58 PM, RAFI KC wrote:


On 12/21/18 6:56 PM, Sankarshan Mukhopadhyay wrote:

On Fri, Dec 21, 2018 at 6:30 PM RAFI KC  wrote:

Hi All,

What is the problem?
As of now, the self-heal client runs as one daemon per node; this means
even if there are multiple volumes, there will only be one self-heal
daemon. So for each configuration change in the cluster to take effect,
the self-heal daemon has to be reconfigured, but it doesn't have the
ability to reconfigure dynamically. This means that when you have a lot
of volumes in the cluster, every management operation that involves
configuration changes, like volume start/stop, add/remove brick etc.,
will result in a self-heal daemon restart. If such operations are
executed often, this not only slows down self-heal for a volume, but
also grows the self-heal logs substantially.

What is the value of the number of volumes when you write "lot of
volumes"? 1000 volumes, more etc


Yes, more than 1000 volumes. It also depends on how often you execute 
glusterd management operations (mentioned above). Each time the self-heal 
daemon is restarted, it prints the entire graph. These graph traces in 
the log contribute the majority of its size.







How to fix it?

We are planning to follow a procedure of attaching/detaching graphs
dynamically, similar to brick multiplexing. The detailed steps are
as below:




1) First step is to make shd per volume daemon, to generate/reconfigure
volfiles per volume basis .

    1.1) This will help to attach the volfiles easily to existing 
shd daemon


    1.2) This will help to send notification to shd daemon as each
volinfo keeps the daemon object

    1.3) reconfiguring a particular subvolume is easier as we can check
the topology better

    1.4) With this change the volfiles will be moved to workdir/vols/
directory.

2) Writing new rpc requests like attach/detach_client_graph function to
support clients attach/detach

    2.1) Also functions like graph reconfigure, mgmt_getspec_cbk has to
be modified

3) Safely detaching a subvolume when there are pending frames to 
unwind.


    3.1) We can mark the client disconnected and make all the frames to
unwind with ENOTCONN

    3.2) We can wait all the i/o to unwind until the new updated subvol
attaches

4) Handle scenarios like glusterd restart, node reboot, etc



At the moment we are not planning to limit the number of heal subvolumes
per process, because with the current approach as well, heal for every
volume was being done from a single process, and we have not heard any
major complaints about this.

Is the plan to not ever limit or, have a throttle set to a default
high(er) value? How would system resources be impacted if the proposed
design is implemented?


The plan is to implement it in a way that can support more than one 
multiplexed self-heal daemon. The throttling function as of now 
returns the same process to multiplex, but it can be easily modified 
to create a new process.


This multiplexing logic won't utilize any additional resources beyond 
what it currently does.



Rafi KC



___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Implementing multiplexing for self heal client.

2018-12-24 Thread RAFI KC



On 12/21/18 6:56 PM, Sankarshan Mukhopadhyay wrote:

On Fri, Dec 21, 2018 at 6:30 PM RAFI KC  wrote:

Hi All,

What is the problem?
As of now, the self-heal client runs as one daemon per node; this means
even if there are multiple volumes, there will only be one self-heal
daemon. So for each configuration change in the cluster to take effect,
the self-heal daemon has to be reconfigured, but it doesn't have the
ability to reconfigure dynamically. This means that when you have a lot
of volumes in the cluster, every management operation that involves
configuration changes, like volume start/stop, add/remove brick etc.,
will result in a self-heal daemon restart. If such operations are
executed often, this not only slows down self-heal for a volume, but
also grows the self-heal logs substantially.

What is the value of the number of volumes when you write "lot of
volumes"? 1000 volumes, more etc


Yes, more than 1000 volumes. It also depends on how often you execute 
glusterd management operations (mentioned above). Each time the self-heal 
daemon is restarted, it prints the entire graph. These graph traces in 
the log contribute the majority of its size.







How to fix it?

We are planning to follow a procedure of attaching/detaching graphs
dynamically, similar to brick multiplexing. The detailed steps are
as below:




1) First step is to make shd per volume daemon, to generate/reconfigure
volfiles per volume basis .

1.1) This will help to attach the volfiles easily to existing shd daemon

1.2) This will help to send notification to shd daemon as each
volinfo keeps the daemon object

1.3) reconfiguring a particular subvolume is easier as we can check
the topology better

1.4) With this change the volfiles will be moved to workdir/vols/
directory.

2) Writing new rpc requests like attach/detach_client_graph function to
support clients attach/detach

2.1) Also functions like graph reconfigure, mgmt_getspec_cbk has to
be modified

3) Safely detaching a subvolume when there are pending frames to unwind.

3.1) We can mark the client disconnected and make all the frames to
unwind with ENOTCONN

3.2) We can wait all the i/o to unwind until the new updated subvol
attaches

4) Handle scenarios like glusterd restart, node reboot, etc



At the moment we are not planning to limit the number of heal subvolumes
per process, because with the current approach as well, heal for every
volume was being done from a single process, and we have not heard any
major complaints about this.

Is the plan to not ever limit or, have a throttle set to a default
high(er) value? How would system resources be impacted if the proposed
design is implemented?


The plan is to implement it in a way that can support more than one 
multiplexed self-heal daemon. The throttling function as of now returns 
the same process to multiplex, but it can be easily modified to create a 
new process.


This multiplexing logic won't utilize any additional resources beyond 
what it currently does.



Rafi KC



___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Implementing multiplexing for self heal client.

2018-12-21 Thread RAFI KC

Hi All,

What is the problem?
As of now, the self-heal client runs as one daemon per node; this means 
even if there are multiple volumes, there will only be one self-heal 
daemon. So for each configuration change in the cluster to take effect, 
the self-heal daemon has to be reconfigured, but it doesn't have the 
ability to reconfigure dynamically. This means that when you have a lot 
of volumes in the cluster, every management operation that involves 
configuration changes, like volume start/stop, add/remove brick etc., 
will result in a self-heal daemon restart. If such operations are 
executed often, this not only slows down self-heal for a volume, but 
also grows the self-heal logs substantially.



How to fix it?

We are planning to follow a procedure of attaching/detaching graphs 
dynamically, similar to brick multiplexing. The detailed steps are 
as below:





1) First step is to make shd per volume daemon, to generate/reconfigure 
volfiles per volume basis .


  1.1) This will help to attach the volfiles easily to existing shd daemon

  1.2) This will help to send notification to shd daemon as each 
volinfo keeps the daemon object


  1.3) reconfiguring a particular subvolume is easier as we can check 
the topology better


  1.4) With this change the volfiles will be moved to workdir/vols/ 
directory.


2) Writing new rpc requests like attach/detach_client_graph function to 
support clients attach/detach


  2.1) Also functions like graph reconfigure, mgmt_getspec_cbk has to 
be modified


3) Safely detaching a subvolume when there are pending frames to unwind.

  3.1) We can mark the client disconnected and make all the frames to 
unwind with ENOTCONN


  3.2) We can wait all the i/o to unwind until the new updated subvol 
attaches


4) Handle scenarios like glusterd restart, node reboot, etc



At the moment we are not planning to limit the number of heal subvolumes 
per process, because with the current approach as well, heal for every 
volume was being done from a single process, and we have not heard any 
major complaints about this.



Regards

Rafi KC

___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Gluster Community Bug Triage meeting

2015-12-08 Thread Rafi KC (via Doodle)
Hi there,

Rafi KC (rkavu...@redhat.com) invites you to participate in the Doodle
poll "Gluster Community Bug Triage meeting."

This meeting is scheduled for anyone who is interested in learning
more about, or assisting with, the triaging of bugs for the GlusterFS
product. This meeting usually happens every Tuesday at 12:00 UTC.

Since a lot of suggestions came in to change the current time (12:00 UTC,
5:30 PM IST), here is a poll to choose the most convenient time slot
from the available options. If you want to suggest a time, please
feel free to do so.
More info:
http://gluster.readthedocs.org/en/latest/Contributors-Guide/Bug-Triage

Participate now
https://doodle.com/poll/tsywtwfngfk4ssr8?tmail=poll_invitecontact_participant_invitation_with_message=pollbtn

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel