Re: [Gluster-devel] Troubleshooting and Diagnostic tools for Gluster

2016-01-27 Thread Raghavendra Bhat
I have a script written to analyze the log messages of a gluster process.

It scans the log file and identifies the log messages with ERROR
and WARNING levels.
It lists the functions (with either ERROR or WARNING logs) and their
percentage of occurrence.

It also lists the MSGIDs for ERROR and WARNING logs and their percentage of
occurrence.

A sample output of the script:

[root@hal9000 ~]# ./log_analyzer.sh /var/log/glusterfs/mnt-glusterfs.log
Number Percentage Function
7 0.49 __socket_rwv
4 0.28 mgmt_getspec_cbk
4 0.28 gf_timer_call_after
3 0.21 rpc_clnt_reconfig
2 0.14 fuse_thread_proc
2 0.14 fini
2 0.14 cleanup_and_exit
1 0.07 _ios_dump_thread
1 0.07 fuse_init
1 0.07 fuse_graph_setup

= Error Functions 

7 0.49 __socket_rwv
2 0.14 cleanup_and_exit

Number Percentage MSGID
958 67.99 109066
424 30.09 109036
3 0.21 114057
3 0.21 114047
3 0.21 114046
3 0.21 114035
3 0.21 114020
3 0.21 114018
3 0.21 108031
2 0.14 101190
1 0.07 7962
1 0.07 108006
1 0.07 108005
1 0.07 108001
1 0.07 100030

= Error MSGIDs 

1 0.07 108006
1 0.07 108001

It can be found here:

https://github.com/raghavendrabhat/threaded-io/blob/master/log_analyzer.sh

Do you think it can be added to the repo?

Regards,
Raghavendra
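
For anyone curious about the approach, a rough sketch of the same idea in Python
(illustrative only; the real script is the shell version linked above). It assumes
the standard gluster log line format, e.g.
`[2016-01-27 ...] W [MSGID: 101095] [xlator.c:189:xlator_dynload] 0-xlator: ...`:

import re
import sys
from collections import Counter

# Matches the standard gluster log prefix:
# [timestamp] <level> [MSGID: <id>] [<file>:<line>:<function>] ...
LINE_RE = re.compile(
    r"\[(?P<time>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*(?P<msgid>\d+)\]\s+)?"
    r"\[(?P<src>[^\]:]+):(?P<lineno>\d+):(?P<func>[^\]]+)\]")

def analyze(path):
    funcs, msgids, total = Counter(), Counter(), 0
    with open(path) as logfile:
        for line in logfile:
            m = LINE_RE.search(line)
            if not m:
                continue
            total += 1
            if m.group("level") in ("E", "W"):   # ERROR or WARNING messages
                funcs[m.group("func")] += 1
                if m.group("msgid"):
                    msgids[m.group("msgid")] += 1
    if not total:
        return
    print("Number Percentage Function")
    for func, count in funcs.most_common():
        print("%d %.2f %s" % (count, 100.0 * count / total, func))
    print("\nNumber Percentage MSGID")
    for msgid, count in msgids.most_common():
        print("%d %.2f %s" % (count, 100.0 * count / total, msgid))

if __name__ == "__main__":
    analyze(sys.argv[1])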

On Wed, Jan 27, 2016 at 3:44 AM, Aravinda  wrote:

> Hi,
>
> I am happy to share the `glustertool` project, which is an
> infrastructure for adding more tools for Gluster.
>
> https://github.com/aravindavk/glustertool
>
> The following tools are available with the initial release (`glustertool
> TOOLNAME [ARGS..]`)
>
> 1. gfid - To get the GFID of a given path (Mount or Backend)
> 2. changelogparser - To parse the Gluster Changelog
> 3. xtime - To get Xtime from brick backend
> 4. stime - To get Stime from brick backend
> 5. volmark - To get Volmark details from Gluster mount
>
> rpm/deb packages are not yet available; install this using `sudo
> python setup.py install`
>
> Once installed, run `glustertool list` to see the list of available tools.
> `glustertool doc TOOLNAME` shows documentation about the tool and
> `glustertool TOOLNAME --help` shows the usage of the tool.
>
> More tools can be added to this collection easily using the `newtool`
> utility available in this repo.
>
> # ./newtool TOOLNAME
>
> Read more about adding tools here
> https://github.com/aravindavk/glustertool/blob/master/CONTRIBUTING.md
>
> You can create an issue in github requesting more tools for Gluster
> https://github.com/aravindavk/glustertool/issues
>
> Comments & Suggestions Welcome
>
> regards
> Aravinda
>
> On 10/23/2015 11:42 PM, Vijay Bellur wrote:
>
>> On Friday 23 October 2015 04:16 PM, Aravinda wrote:
>>
>>> Hi Gluster developers,
>>>
>>> In this mail I am proposing troubleshooting documentation and
>>> Gluster Tools infrastructure.
>>>
>>> Tool to search in documentation
>>> ===
>>> We recently added message IDs to each error message in Gluster. Some
>>> of the error messages are self-explanatory. But some error messages
>>> require manual intervention to fix the issue. How about identifying
>>> the error messages which require more explanation and creating
>>> documentation for them? Even though information about some
>>> errors is available in the documentation, it is very difficult to search and
>>> relate it to the error message. It would be very useful if we created a
>>> tool which looks up the documentation and tells us exactly what to do.
>>>
>>> For example (illustrative purpose only):
>>> glusterdoc --explain GEOREP0003
>>>
>>>  SSH configuration issue. This error is seen when Pem keys from all
>>>  master nodes are not distributed properly to the Slave
>>>  nodes. Use the Geo-replication create command with the force option to
>>>  redistribute the keys. If the issue still persists, look for any errors
>>>  while running hook scripts in the Glusterd log file.
>>>
>>>
>>> Note: Inspired by the rustc --explain command
>>> https://twitter.com/jaredforsyth/status/626960244707606528
>>>
>>> If we don't know the message ID, we can still search the
>>> available documentation, for example:
>>>
>>>  glusterdoc --search 
>>>
>>> These commands can be consumed programmatically; for example,
>>> `--json` will return the output in JSON format. This enables UI
>>> developers to automatically show help messages when they display
>>> errors.
>>>
>>> Gluster Tools infrastructure
>>> 
>>> Are our Gluster log files sufficient for root-causing issues? Is
>>> that error caused by misconfiguration? Geo-replication status is
>>> showing Faulty. Where do we find the reason for Faulty?
>>>
>>> Sac (surs AT redhat.com) mentioned that he is working on gdeploy, and many
>>> developers
>>> are using their own tools. How about providing a common infrastructure (say
>>> gtool/glustertool) to host all these tools?
>>>
>>>
>> Would this be a repository with individual tools being git submodules or
>> something similar? Is there also a plan to bundle the set of tools into a
>> binary 
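
For illustration, here is a minimal Python sketch of the `glusterdoc --explain` /
`--json` interface proposed above. It is hypothetical (no such tool exists yet),
and the explanation table below is only a placeholder built from the GEOREP0003
example in the mail:

import argparse
import json

# Placeholder: a real tool would build this table from the troubleshooting docs.
EXPLANATIONS = {
    "GEOREP0003": ("SSH configuration issue. Pem keys from the master nodes were "
                   "not distributed properly to the Slave nodes. Use the "
                   "Geo-replication create command with the force option to "
                   "redistribute the keys."),
}

def main():
    parser = argparse.ArgumentParser(prog="glusterdoc")
    parser.add_argument("--explain", metavar="MSGID", required=True)
    parser.add_argument("--json", action="store_true",
                        help="emit machine-readable output for UIs")
    args = parser.parse_args()
    text = EXPLANATIONS.get(args.explain,
                            "No documentation found for %s" % args.explain)
    if args.json:
        print(json.dumps({"msgid": args.explain, "explanation": text}))
    else:
        print(text)

if __name__ == "__main__":
    main()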

Re: [Gluster-devel] Throttling xlator on the bricks

2016-01-27 Thread Raghavendra Bhat
There is already a patch submitted for moving the TBF part to libglusterfs. It
is under review.
http://review.gluster.org/#/c/12413/


Regards,
Raghavendra
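
For reference, below is a minimal sketch (plain Python, purely illustrative; not
the gluster TBF code in bit-rot-tbf.c or the patch above) of the token-bucket
scheme described in the quoted mail: a filler thread tops up the bucket at a
steady rate, the main path queues FOPs that lack tokens, and a dequeue thread
winds whatever has become serviceable after a cond-broadcast. A per-client
bucket would simply be one such object per connected client.

import threading
import time
from collections import deque

class TokenBucket:
    def __init__(self, fill_rate, capacity):
        self.fill_rate = fill_rate      # tokens added per second
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.tokens = capacity
        self.queue = deque()            # (cost, fop) pairs waiting for tokens
        self.cond = threading.Condition()
        threading.Thread(target=self._filler, daemon=True).start()
        threading.Thread(target=self._dequeue, daemon=True).start()

    def submit(self, cost, fop):
        # Main path: run the FOP immediately if enough tokens are available,
        # otherwise queue it for the dequeue thread.
        with self.cond:
            if not self.queue and self.tokens >= cost:
                self.tokens -= cost
            else:
                self.queue.append((cost, fop))
                return
        fop()

    def _filler(self):
        # Token filler thread: tops up the bucket at a steady rate and wakes
        # up the dequeue thread, which may now be able to service queued FOPs.
        while True:
            time.sleep(1)
            with self.cond:
                self.tokens = min(self.capacity, self.tokens + self.fill_rate)
                self.cond.notify_all()

    def _dequeue(self):
        # Dequeue thread: process ("wind") every queued FOP that now has the
        # required number of tokens, in arrival order.
        while True:
            with self.cond:
                self.cond.wait()
                ready = []
                while self.queue and self.tokens >= self.queue[0][0]:
                    cost, fop = self.queue.popleft()
                    self.tokens -= cost
                    ready.append(fop)
            for fop in ready:
                fop()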

On Mon, Jan 25, 2016 at 2:26 AM, Venky Shankar  wrote:

> On Mon, Jan 25, 2016 at 11:06:26AM +0530, Ravishankar N wrote:
> > Hi,
> >
> > We are planning to introduce a throttling xlator on the server (brick)
> > process to regulate FOPS. The main motivation is to solve complaints
> about
> > AFR selfheal taking too much of CPU resources. (due to too many fops for
> > entry
> > self-heal, rchecksums for data self-heal etc.)
> >
> > The throttling is achieved using the Token Bucket Filter algorithm (TBF).
> > TBF
> > is already used by bitrot's bitd signer (which is a client process) in
> > gluster to regulate the CPU intensive check-sum calculation. By putting
> the
> > logic on the brick side, multiple clients- selfheal, bitrot, rebalance or
> > even the mounts themselves can avail the benefits of throttling.
>
>   [Providing current TBF implementation link for completeness]
>
>
> https://github.com/gluster/glusterfs/blob/master/xlators/features/bit-rot/src/bitd/bit-rot-tbf.c
>
> Also, it would be beneficial to have the core TBF implementation as part of
> libglusterfs so as to be consumable by the server side xlator component to
> throttle dispatched FOPs and for daemons to throttle anything that's
> outside
> "brick" boundary (such as cpu, etc..).
>
> >
> > The TBF algorithm in a nutshell is as follows: There is a bucket which is
> > filled
> > at a steady (configurable) rate with tokens. Each FOP will need a fixed
> > amount
> > of tokens to be processed. If the bucket has that many tokens, the FOP is
> > allowed and that many tokens are removed from the bucket. If not, the
> FOP is
> > queued until the bucket is filled.
> >
> > The xlator will need to reside above io-threads and can have different
> > buckets,
> > one per client. There has to be a communication mechanism between the
> client
> > and
> > the brick (IPC?) to tell what FOPS need to be regulated from it, and the
> no.
> > of
> > tokens needed, etc. These need to be reconfigurable via appropriate
> > mechanisms.
> > Each bucket will have a token filler thread which will fill the tokens in
> > it.
> > The main thread will enqueue heals in a list in the bucket if there
> aren't
> > enough tokens. Once the token filler detects some FOPS can be serviced,
> it
> > will
> > send a cond-broadcast to a dequeue thread which will process (stack wind)
> > all
> > the FOPS that have the required no. of tokens from all buckets.
> >
> > This is just a high level abstraction: requesting feedback on any aspect
> of
> > this feature. What kind of mechanism is best between the client/bricks
> for
> > tuning various parameters? What other requirements do you foresee?
> >
> > Thanks,
> > Ravi
>

Re: [Gluster-devel] distributed files/directories and [cm]time updates

2016-01-26 Thread Raghavendra Bhat
Hi Xavier,

There is a patch sent for review which implements the metadata cache in the
posix layer. What the change does is this:

Whenever there is a fresh lookup on an object (file/directory/symlink), the
posix xlator saves the stat attributes of that object in its cache.
As of now, whenever there is a fop on an object, posix tries to build the HANDLE
of the object by looking into the gfid-based backend (i.e. the .glusterfs
directory) and doing a stat to check if the gfid exists. The patch makes
changes to posix to check its own cache first and return if it can find
the attributes. If not, it then looks into the actual gfid backend.

But as of now, there is no cache invalidation. Whenever there is a
setattr() fop to change the attributes of an object, the new stat info is
saved in the cache once the fop is successful on disk.

The patch can be found here: http://review.gluster.org/#/c/12157/

Regards,
Raghavendra
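
To make the idea concrete, a minimal sketch in plain Python (illustrative only,
not the actual posix xlator code from the patch) of that caching behaviour:

import os

class PosixStatCache:
    """Cache stat attributes per gfid: fill on fresh lookup, serve later fops
    from the cache, refresh on a successful setattr (no other invalidation,
    mirroring the current limitation described above)."""

    def __init__(self, gfid_backend_dir):
        self.backend = gfid_backend_dir   # stands in for the .glusterfs directory
        self.cache = {}                   # gfid -> os.stat_result

    def lookup(self, gfid):
        # Fresh lookup: stat the gfid backend and remember the attributes.
        st = os.stat(os.path.join(self.backend, gfid))
        self.cache[gfid] = st
        return st

    def stat_for_fop(self, gfid):
        # Later fops: answer from the cache if possible, otherwise fall back
        # to the actual gfid backend (and populate the cache on the way).
        st = self.cache.get(gfid)
        if st is not None:
            return st
        return self.lookup(gfid)

    def setattr(self, gfid, mode):
        # Attribute change: apply it on disk first, then refresh the cache.
        path = os.path.join(self.backend, gfid)
        os.chmod(path, mode)
        self.cache[gfid] = os.stat(path)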

On Tue, Jan 26, 2016 at 2:51 AM, Xavier Hernandez 
wrote:

> Hi Pranith,
>
> On 26/01/16 03:47, Pranith Kumar Karampuri wrote:
>
>> hi,
>>Traditionally gluster has been using ctime/mtime of the
>> files/dirs on the bricks as stat output. Problem we are seeing with this
>> approach is that, software which depends on it gets confused when there
>> are differences in these times. Tar especially gives "file changed as we
>> read it" whenever it detects ctime differences when stat is served from
>> different bricks. The way we have been trying to solve it is to serve
>> the stat structures from same brick in afr, max-time in dht. But it
>> doesn't avoid the problem completely. Because there is no way to change
>> ctime at the moment(lutimes() only allows mtime, atime), there is little
>> we can do to make sure ctimes match after self-heals/xattr
>> updates/rebalance. I am wondering if anyone of you solved these problems
>> before, if yes how did you go about doing it? It seems like applications
>> which depend on this for backups get confused the same way. The only way
>> out I see it is to bring ctime to an xattr, but that will need more iops
>> and gluster has to keep updating it on quite a few fops.
>>
>
> I did think about this when I was writing ec at the beginning. The idea
> was that the point in time at which each fop is executed would be controlled by
> the client by adding a special xattr to each regular fop. Of course this
> would require support inside the storage/posix xlator. At that time, adding
> the needed support to other xlators seemed too complex for me, so I decided
> to do something similar to afr.
>
> Anyway, the idea was like this: for example, when a write fop needs to be
> sent, dht/afr/ec sets the current time in a special xattr, for example
> 'glusterfs.time'. It can be done in a way that if the time is already set
> by a higher xlator, it's not modified. This way DHT could set the time in
> fops involving multiple afr subvolumes. For other fops, would be afr who
> sets the time. It could also be set directly by the top most xlator (fuse),
> but that time could be incorrect because lower xlators could delay the fop
> execution and reorder it. This would need more thinking.
>
> That xattr will be received by storage/posix. This xlator will determine
> what times need to be modified and will change them. In the case of a
> write, it can decide to modify mtime and, maybe, atime. For a mkdir or
> create, it will set the times of the new file/directory and also the mtime
> of the parent directory. It depends on the specific fop being processed.
>
> mtime, atime and ctime (or even others) could be saved in a special posix
> xattr instead of relying on the file system attributes that cannot be
> modified (at least for ctime).
>
> This solution doesn't require extra fops, so it seems quite clean to me.
> The additional I/O needed in posix could be minimized by implementing a
> metadata cache in storage/posix that would read all metadata on lookup and
> update it on disk only at regular intervals and/or on invalidation. All
> fops would read/write into the cache. This would even reduce the number of
> I/O we are currently doing for each fop.
>
> Xavi
>

[Gluster-devel] glusterfs-3.6.7 released

2015-12-02 Thread Raghavendra Bhat
Hi,

glusterfs-3.6.7 has been released and the packages for RHEL/Fedora/Centos
can be found here:
http://download.gluster.org/pub/gluster/glusterfs/3.6/LATEST/

Requesting people running 3.6.x to please try it out and let us know if
there are any issues.

This release supposedly fixes the bugs listed below since 3.6.6 was made
available. Thanks to all who submitted patches, reviewed the changes.

1283690 - core dump in protocol/client:client_submit_request
1283144 - glusterfs does not register with rpcbind on restart
1277823 - [upgrade] After upgrade from 3.5 to 3.6, probing a new 3.6
node is moving the peer to rejected state
1277822 - glusterd: probing a new node(>=3.6) from 3.5 cluster is
moving the peer to rejected state

Regards,
Raghavendra Bhat

[Gluster-devel] netbsd failures in 3.6 release

2015-11-04 Thread Raghavendra Bhat
Hi,

We have been observing netbsd failures in the 3.6 branch for a few months, and I
have been merging patches by ignoring the netbsd failures. The last few 3.6
releases were made without considering the netbsd failures. IIRC there was a
discussion about it back when the netbsd tests started failing, and it was
agreed that we would ignore the 3.6 netbsd errors. I am not sure if it was
discussed over IRC or as part of some patch (on gerrit).

Emmanuel? Do you recollect any discussions about it?

But I think it would be better to discuss it here and see what can be
done. Please provide feedback.


Regards,
Raghavendra Bhat

[Gluster-devel] REMINDER: Weekly gluster community meeting to start in 30 minutes

2015-10-14 Thread Raghavendra Bhat
Hi All,

In 30 minutes from now we will have the regular weekly Gluster
Community meeting.

Meeting details:
- location: #gluster-meeting on Freenode IRC
- date: every Wednesday
- time: 12:00 UTC, 14:00 CEST, 17:30 IST
(in your terminal, run: date -d "12:00 UTC")
- agenda: https://public.pad.fsfe.org/p/gluster-community-meetings

Currently the following items are listed:
* Roll Call
* Status of last week's action items
* Gluster 3.7
* Gluster 3.8
* Gluster 3.6
* Gluster 3.5
* Gluster 4.0
* Open Floor
- bring your own topic!

The last topic has space for additions. If you have a suitable topic to
discuss, please add it to the agenda.


Regards,
Raghavendra Bhat

[Gluster-devel] REMINDER: Weekly gluster community meeting to start in 30 minutes

2015-09-30 Thread Raghavendra Bhat
Hi All,

In 30 minutes from now we will have the regular weekly Gluster
Community meeting.

Meeting details:
- location: #gluster-meeting on Freenode IRC
- date: every Wednesday
- time: 12:00 UTC, 14:00 CEST, 17:30 IST
(in your terminal, run: date -d "12:00 UTC")
- agenda: https://public.pad.fsfe.org/p/gluster-community-meetings

Currently the following items are listed:
* Roll Call
* Status of last week's action items
* Gluster 3.7
* Gluster 3.8
* Gluster 3.6
* Gluster 3.5
* Gluster 4.0
* Open Floor
- bring your own topic!

The last topic has space for additions. If you have a suitable topic to
discuss, please add it to the agenda.


Regards,
Raghavendra Bhat

Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2015-09-28 Thread Raghavendra Bhat
Hi Oleksandr,

You are right. The description should have described it as the limit on the
number of inodes in the lru list of the inode cache. I have sent a patch
for that.
http://review.gluster.org/#/c/12242/

Regards,
Raghavendra Bhat


On Thu, Sep 24, 2015 at 1:44 PM, Oleksandr Natalenko <
oleksa...@natalenko.name> wrote:

> I've checked the statedump of the volume in question and haven't found lots of
> iobufs as mentioned in that bug report.
>
> However, I've noticed that there are lots of LRU records like this:
>
> ===
> [conn.1.bound_xl./bricks/r6sdLV07_vd0_mail/mail.lru.1]
> gfid=c4b29310-a19d-451b-8dd1-b3ac2d86b595
> nlookup=1
> fd-count=0
> ref=0
> ia_type=1
> ===
>
> In fact, there are 16383 of them. I've checked "gluster volume set help"
> in order to find something LRU-related and have found this:
>
> ===
> Option: network.inode-lru-limit
> Default Value: 16384
> Description: Specifies the maximum megabytes of memory to be used in the
> inode cache.
> ===
>
> Is there an error in the description stating "maximum megabytes of memory"?
> Shouldn't it mean "maximum number of LRU records"? If not, is it true
> that the inode cache could grow up to 16 GiB for a client, and one must lower
> the network.inode-lru-limit value?
>
> Another thought: we've enabled write-behind, and the default
> write-behind-window-size value is 1 MiB. So, one may conclude that with
> lots of small files written, write-behind buffer could grow up to
> inode-lru-limit×write-behind-window-size=16 GiB? Who could explain that to
> me?
>
> 24.09.2015 10:42, Gabi C wrote:
>
>> oh, my bad...
>> could be this one?
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1126831 [2]
>> Anyway, on ovirt+gluster w I experienced similar behavior...
>>

[Gluster-devel] glusterfs 3.6.6 released

2015-09-24 Thread Raghavendra Bhat
Hi,

glusterfs-3.6.6 has been released and the packages for RHEL/Fedora/Centos
can be found here.
http://download.gluster.org/pub/gluster/glusterfs/3.6/LATEST/

Requesting people running 3.6.x to please try it out and let us know if
there are any issues.

This release supposedly fixes the bugs listed below since 3.6.5 was made
available. Thanks to all who submitted patches, reviewed the changes.

1259578 - [3.6.x] quota usage gets miscalculated when loc->gfid is NULL
1247972 - quota/marker: lk_owner is null while acquiring inodelk in rename
operation
1252072 - POSIX ACLs as used by a FUSE mount can not use more than 32 groups
1256245 - AFR: gluster v restart force or brick process restart doesn't
heal the files
1258069 - gNFSd: NFS mount fails with "Remote I/O error"
1173437 - [RFE] changes needed in snapshot info command's xml output.

Regards,
Raghavendra Bhat

Re: [Gluster-devel] [Gluster-users] [posix-compliance] unlink and access to file through open fd

2015-09-04 Thread Raghavendra Bhat

On 09/04/2015 12:43 PM, Raghavendra Gowdappa wrote:

All,

Posix allows access to a file through open fds even if the name associated with the file
is deleted. While this works for glusterfs in most of the cases, there are
some corner cases where we fail.

1. Reboot of brick:
===

With the reboot of a brick, the fd is lost. unlink would've deleted both the gfid and
path links to the file and we would lose the file. As a solution, perhaps we
should create a hardlink to the file (say in .glusterfs) which gets deleted
only when the last fd is closed?

2. Graph switch:
=

The issue is captured in bz 1259995 [1]. Pasting the content from bz verbatim:
Consider following sequence of operations:
1. fd = open ("/mnt/glusterfs/file");
2. unlink ("/mnt/glusterfs/file");
3. Do a graph-switch, lets say by adding a new brick to volume.
4. migration of the fd to the new graph fails. This is because as part of migration we
do a lookup and open. But the lookup fails as the file is already deleted, and hence
migration fails and the fd is marked bad.

In fact this test case is already present in our regression tests, though the
test only checks whether the fd is marked as bad. But the expectation behind filing
this bug is that migration should succeed. This is possible since there is an
fd opened on the brick through the old graph, and hence it can be duped using the dup syscall.

Of course the solution outlined here doesn't cover the case where the file is not
present on the brick at all. For e.g., a new brick was added to the replica set and that
new brick doesn't contain the file. Now, since the file is deleted, how does
replica heal that file to the new brick, etc.

But at least this can be solved for those cases where the file was present on a
brick and an fd was already opened.


Du,

For this 2nd example (where the file is opened, unlinked and a graph
switch happens), there was a patch submitted long back.


http://review.gluster.org/#/c/5428/

Regards,
Raghavendra Bhat
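
(As an aside, the POSIX behaviour being discussed is easy to demonstrate outside
gluster; the snippet below is plain Python against a local filesystem, not a
glusterfs mount:)

import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "file")
with open(path, "w") as f:
    f.write("hello")

fd = os.open(path, os.O_RDWR)   # open the file and keep the fd
os.unlink(path)                 # remove the only name; the inode stays alive

print(os.read(fd, 5))           # b'hello' -- reads through the fd still work
os.write(fd, b" world")         # and so do writes
os.close(fd)                    # only now is the data actually freed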


3. Open-behind and unlink from a different client:
==

While open-behind handles unlink from the same client (through which the open was
performed), if the unlink and open are done from two different clients, the file is
lost. I cannot think of any good solution for this.

I wanted to know whether these problems are real enough to channel our efforts 
to fix these issues. Comments are welcome in terms of solutions or other 
possible scenarios which can lead to this issue.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1259995

regards,
Raghavendra.


[Gluster-devel] glusterfs-3.6.5 released

2015-08-26 Thread Raghavendra Bhat


Hi,

glusterfs-3.6.5 has been released and the packages for RHEL/Fedora/Centos can 
be found here.
http://download.gluster.org/pub/gluster/glusterfs/3.6/LATEST/

The Ubuntu packages can be found here:
https://launchpad.net/~gluster/+archive/ubuntu/glusterfs-3.6

Requesting people running 3.6.x to please try it out and let us know if there 
are any issues.

This release supposedly fixes the bugs listed below since 3.6.4 was made 
available. Thanks to all who submitted patches, reviewed the changes.

1247959 - Statfs is hung because of frame loss in quota
1247970 - huge mem leak in posix xattrop
1234096 - rmtab file is a bottleneck when lot of clients are accessing a volume 
through NFS
1254421 - glusterd fails to get the inode size for a brick
1247964 - Disperse volume: Huge memory leak of glusterfsd process
1218732 - gluster snapshot status --xml gives back unexpected non xml output
1250836 - [upgrade] After upgrade from 3.5 to 3.6 onwards version, bumping up 
op-version failed
1244117 - unix domain sockets on Gluster/NFS are created as fifo/pipe
1243700 - GlusterD crashes when management encryption is enabled
1235601 - tar on a glusterfs mount displays file changed as we read it even 
though the file was not changed

Regards,
Raghavendra Bhat




Re: [Gluster-devel] [release-3.6] compile error: 'GF_REPLACE_OP_START' undeclared

2015-08-18 Thread Raghavendra Bhat

On 08/18/2015 12:39 PM, Avra Sengupta wrote:

+ Adding Raghavendra Bhat.

When is the next GA planned on this branch? And can we take patches in 
this branch while this is being investigated.


Regards,
Avra



I am planning to make the release by the end of this week. I can accept 
the patches if they fix some critical bug. But it would be better if 
the issue being investigated is fixed.


Regards,
Raghavendra Bhat


On 08/18/2015 12:07 PM, Avra Sengupta wrote:
Still hitting this on freebsd and netbsd smoke runs on the release-3.6 
branch. Are we merging patches on the release-3.6 branch for now even 
with these failures? I have two such patches that need to be merged.


Regards,
Avra

On 07/06/2015 02:32 PM, Niels de Vos wrote:

On Mon, Jul 06, 2015 at 02:19:07PM +0530, Raghavendra Bhat wrote:

On 07/06/2015 01:39 PM, Niels de Vos wrote:

On Mon, Jul 06, 2015 at 12:09:28PM +0530, Raghavendra Bhat wrote:

On 07/06/2015 09:52 AM, Kaushal M wrote:

I checked on NetBSD-7.0_BETA and FreeBSD-10.1. I couldn't reproduce
this. I'll try on NetBSD-6 next.

~kaushal
I think it has to be included before 3.6.4 is made G.A. I can 
wait till the
fix for this issue is merged before making 3.6.4. Does it sound 
ok? Or

should I go ahead with 3.6.4 and make a quick 3.6.5 with this fix?

I only care about getting http://review.gluster.org/11335 merged :-)

This is a patch I promised to take into release-3.5. It would be 
nicer

to have this change included in the release-3.6 branch before I merge
the 3.5 backport. At the moment, 3.5.5 is waiting on this patch. 
But I
do not think you really need to delay 3.6.4 off for that one. It 
should
be fine if it lands in 3.6.5. (The compile error looks more like a 
3.6.4

blocker.)

Niels

Niels,

The patch you mentioned has received the acks and also has passed 
the linux

regression tests. But it seem to have failed netbsd regression tests.

Yes, at least the smoke tests on NetBSD and FreeBSD fail with the
compile error mentioned in the subject of this email :)

Thanks,
Niels



Regards,
Raghavendra Bhat


Regards,
Raghavendra Bhat

On Mon, Jul 6, 2015 at 8:38 AM, Kaushal M kshlms...@gmail.com 
wrote:
Krutika hit this last week, and let us (GlusterD maintiners) 
know of
it. I volunteered to look into this, but couldn't find time. 
I'll do

it now.

~kaushal

On Sun, Jul 5, 2015 at 10:43 PM, Atin Mukherjee
atin.mukherje...@gmail.com wrote:
I remember Krutika reporting it few days back. So it seems 
like its not

fixed yet. If there is no taker I will send a patch tomorrow.

-Atin
Sent from one plus one

On Jul 5, 2015 9:58 PM, Niels de Vos nde...@redhat.com wrote:

Hi,

it seems that the current release-3.6 branch does not compile on
FreedBSD and NetBSD (not sure why it compiles on CentOS-6). 
These errors

are thrown:

   --- glusterd_la-glusterd-op-sm.lo ---
 CC   glusterd_la-glusterd-op-sm.lo

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c: 


In function 'glusterd_op_start_rb_timer':

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:3685:19: 

error: 'GF_REPLACE_OP_START' undeclared (first use in this 
function)


/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:3685:19: 

note: each undeclared identifier is reported only once for 
each function it

appears in

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c: 


In function 'glusterd_bricks_select_status_volume':

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:5800:34: 


warning: unused variable 'snapd'
   *** [glusterd_la-glusterd-op-sm.lo] Error code 1


Could someone send a (pointer to the) backport that addresses 
this?


Thanks,
Niels


On Sun, Jul 05, 2015 at 08:59:32AM -0700, Gluster Build 
System (Code

Review) wrote:

Gluster Build System has posted comments on this change.

Change subject: nfs: make it possible to disable 
nfs.mount-rmtab
.. 




Patch Set 1: -Verified

Build Failed

http://build.gluster.org/job/compare-bug-version-and-git-branch/9953/ 
:

SUCCESS

http://build.gluster.org/job/freebsd-smoke/8551/ : FAILURE

http://build.gluster.org/job/smoke/19820/ : SUCCESS

http://build.gluster.org/job/netbsd6-smoke/7808/ : FAILURE

--
To view, visit http://review.gluster.org/11335
To unsubscribe, visit http://review.gluster.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40c4d8d754932f86fb2b1b2588843390464c773d
Gerrit-PatchSet: 1
Gerrit-Project: glusterfs
Gerrit-Branch: release-3.6
Gerrit-Owner: Niels de Vos nde...@redhat.com
Gerrit-Reviewer: Gluster Build System 
jenk...@build.gluster.com

Gerrit-Reviewer: Kaleb KEITHLEY kkeit...@redhat.com
Gerrit-Reviewer: NetBSD Build System 
jenk...@build.gluster.org

Gerrit-Reviewer: Niels de Vos nde...@redhat.com
Gerrit-Reviewer: Raghavendra Bhat rab...@redhat.com
Gerrit-Reviewer: jiffin tony Thottan

Re: [Gluster-devel] v3.6.3 doesn't respect default ACLs?

2015-08-10 Thread Raghavendra Bhat

On 08/10/2015 09:56 PM, Niels de Vos wrote:

On Wed, Jul 29, 2015 at 04:00:48PM +0530, Raghavendra Bhat wrote:

On 07/27/2015 08:30 PM, Glomski, Patrick wrote:

I built a patched version of 3.6.4 and the problem does seem to be fixed
on a test server/client when I mounted with those flags (acl,
resolve-gids, and gid-timeout). Seeing as it was a test system, I can't
really provide anything meaningful as to the performance hit seen without
the gid-timeout option. Thank you for implementing it so quickly, though!

Is there any chance of getting this fix incorporated in the upcoming 3.6.5
release?

Patrick

I am planning to include this fix in 3.6.5. This fix is still under review.
Once it is accepted in master, it can be backported to the release-3.6 branch. I
will wait till then and make 3.6.5.

I don't think there is a tracker bug for 3.6.5 yet? Or at least I could
not find it by an alias.

https://bugzilla.redhat.com/show_bug.cgi?id=1252072 is used to get the
backport in release-3.6.x, please review and merge :-)

Thanks,
Niels


This is the 3.6.5 tracker bug. I will merge the patch once the regression 
tests have passed.


https://bugzilla.redhat.com/show_bug.cgi?id=1250544.

Regards,
Raghavendra Bhat


Regards,
Raghavendra Bhat



On Thu, Jul 23, 2015 at 6:27 PM, Niels de Vos nde...@redhat.com
mailto:nde...@redhat.com wrote:

On Tue, Jul 21, 2015 at 10:30:04PM +0200, Niels de Vos wrote:
 On Wed, Jul 08, 2015 at 03:20:41PM -0400, Glomski, Patrick wrote:
  Gluster devs,
 
  I'm running gluster v3.6.3 (both server and client side). Since my
  application requires more than 32 groups, I don't mount with
ACLs on the
  client. If I mount with ACLs between the bricks and set a
default ACL on
  the server, I think I'm right in stating that the server
should respect
  that ACL whenever a new file or folder is made.

 I would expect that the ACL gets inherited on the brick. When a new
 file is created without the default ACL, things seem to be
wrong. You
 mention that creating the file directly on the brick has the correct
 ACL, so there must be some Gluster component interfering.

 You reminded me on IRC about this email, and that helped a lot.
Its very
 easy to get distracted when trying to investigate things from the
 mailinglists.

 I had a brief look, and I think we could reach a solution. An
ugly patch
 for initial testing is ready. Well... it compiles. I'll try to
run some
 basic tests tomorrow and see if it improves things and does not
crash
 immediately.

 The change can be found here:
 http://review.gluster.org/11732

 It basically adds a resolve-gids mount option for the FUSE client.
 This causes the fuse daemon to call getgrouplist() and retrieve
all the
 groups for the UID that accesses the mountpoint. Without this
option,
 the behavior is not changed, and /proc/$PID/status is used to
get up to
 32 groups (the $PID is the process that accesses the mountpoint).

 You probably want to also mount with gid-timeout=N where N is
seconds
 that the group cache is valid. In the current master branch this
is set
 to 300 seconds (like the sssd default), but if the groups of a user
 rarely change, this value can be increased. Previous versions had a
 lower timeout which could cause resolving the groups on almost each
 network packet that arrives (HUGE performance impact).

 When using this option, you may also need to enable
server.manage-gids.
 This option allows using more than ~93 groups on the bricks. The
network
 packets can only contain ~93 groups, when server.manage-gids is
enabled,
 the groups are not sent in the network packets, but are resolved
on the
 bricks with getgrouplist().

The patch linked above had been tested, corrected and updated. The
change works for me on a test-system.

A backport that you should be able to include in a package for 3.6 can
be found here: http://termbin.com/f3cj
Let me know if you are not familiar with rebuilding patched packages,
and I can build a test-version for you tomorrow.

On glusterfs-3.6, you will want to pass a gid-timeout mount option
too.
The option enables caching of the resolved groups that the uid belongs
too, if caching is not enabled (or expires quickly), you will probably
notice a performance hit. Newer versions of GlusterFS set the
timeout to
300 seconds (like the default timeout sssd uses).

Please test and let me know if this fixes your use case.

Thanks,
Niels



 Cheers,
 Niels

  Maybe an example is in order:
 
  We first set up a test directory with setgid bit so that our new
  subdirectories inherit the group.
  [root@gfs01a hpc_shared]# mkdir test; cd test; chown
pglomski.users .;
  chmod 2770 .; getfacl

[Gluster-devel] release schedule for glusterfs

2015-08-05 Thread Raghavendra Bhat


Hi,

In the previous community meeting it was discussed that we should come up with a 
schedule for glusterfs releases. It was discussed that each of the 
supported release branches (3.5, 3.6 and 3.7) will make a new release 
every month.


The previous releases of them happened at below dates.

glusterfs-3.5.5 - 9th July
glusterfs-3.6.4 - 13th July
glusterfs-3.7.3 - 29th July.

Is it ok to slightly align those dates? I.e., on the 10th of every month the 3.5 
based release would happen (in general the oldest supported and most 
stable release branch). On the 20th of every month the 3.6 based release would 
happen (in general, the release branch which is being stabilized). And 
on the 30th of every month the 3.7 based release would happen (in general, the 
latest release branch).


Please provide feedback. Once a schedule is finalized we can put that 
information on gluster.org.


Regards,
Raghavendra Bhat



Re: [Gluster-devel] release schedule for glusterfs

2015-08-05 Thread Raghavendra Bhat

On 08/05/2015 05:57 PM, Humble Devassy Chirammal wrote:

Hi Ragavendra,

This LGTM. However, is there any guideline on:

How many beta releases are made for each minor release? And the gap 
between these releases?


--Humble



I am not sure about the beta releases. As per my understanding there are 
no beta releases happening in the release-3.5 branch, nor in the latest 
release-3.7 branch. I was doing beta releases for the release-3.6 branch. 
But I am also thinking of moving away from that and making 3.6.5 directly 
(and also future release-3.6 releases).


Regards,
Raghavendra Bhat



On Wed, Aug 5, 2015 at 5:12 PM, Raghavendra Bhat rab...@redhat.com 
mailto:rab...@redhat.com wrote:



Hi,

In previous community meeting it was discussed to come up with a
schedule for glusterfs releases. It was discussed that each of the
supported release branches (3.5, 3.6 and 3.7) will make a new
release every month.

The previous releases of them happened at below dates.

glusterfs-3.5.5 - 9th July
glusterfs-3.6.4 - 13th July
glusterfs-3.7.3 - 29th July.

Is it ok to slightly align those dates? i.e. on 10th of every
month 3.5 based release would happen (in general the oldest
supported and most stable release branch). On 20th of every month
3.6 based release would happen (In general, the release branch
which is being stabilized). And on 30th of every month 3.7 based
release would happen (in general, the latest release branch).

Please provide feedback. Once a schedule is finalized we can put
that information in gluster.org http://gluster.org.

Regards,
Raghavendra Bhat



Re: [Gluster-devel] v3.6.3 doesn't respect default ACLs?

2015-07-29 Thread Raghavendra Bhat

On 07/27/2015 08:30 PM, Glomski, Patrick wrote:
I built a patched version of 3.6.4 and the problem does seem to be 
fixed on a test server/client when I mounted with those flags (acl, 
resolve-gids, and gid-timeout). Seeing as it was a test system, I 
can't really provide anything meaningful as to the performance hit 
seen without the gid-timeout option. Thank you for implementing it so 
quickly, though!


Is there any chance of getting this fix incorporated in the upcoming 
3.6.5 release?


Patrick


I am planning to include this fix in 3.6.5. This fix is still under 
review. Once it is accepted in master, it can be backported to the 
release-3.6 branch. I will wait till then and make 3.6.5.


Regards,
Raghavendra Bhat
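
For background, the resolve-gids option in Niels's patch (quoted below) makes the
fuse daemon call getgrouplist() instead of reading at most 32 groups from
/proc/$PID/status. A small, purely illustrative Python snippet showing that
difference on a Linux box (not the FUSE client code):

import os
import pwd

def groups_from_proc(pid="self"):
    # What the client uses today: the "Groups:" line of /proc/<pid>/status,
    # which exposes only a limited number of supplementary groups (up to 32).
    with open("/proc/%s/status" % pid) as status:
        for line in status:
            if line.startswith("Groups:"):
                return [int(g) for g in line.split()[1:]]
    return []

def groups_from_getgrouplist(uid):
    # What resolve-gids does conceptually: resolve the full group list for
    # the user via getgrouplist(), so more than 32 groups are seen.
    pw = pwd.getpwuid(uid)
    return os.getgrouplist(pw.pw_name, pw.pw_gid)

print("proc:        ", groups_from_proc())
print("getgrouplist:", groups_from_getgrouplist(os.getuid()))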




On Thu, Jul 23, 2015 at 6:27 PM, Niels de Vos nde...@redhat.com 
mailto:nde...@redhat.com wrote:


On Tue, Jul 21, 2015 at 10:30:04PM +0200, Niels de Vos wrote:
 On Wed, Jul 08, 2015 at 03:20:41PM -0400, Glomski, Patrick wrote:
  Gluster devs,
 
  I'm running gluster v3.6.3 (both server and client side). Since my
  application requires more than 32 groups, I don't mount with
ACLs on the
  client. If I mount with ACLs between the bricks and set a
default ACL on
  the server, I think I'm right in stating that the server
should respect
  that ACL whenever a new file or folder is made.

 I would expect that the ACL gets inherited on the brick. When a new
 file is created without the default ACL, things seem to be
wrong. You
 mention that creating the file directly on the brick has the correct
 ACL, so there must be some Gluster component interfering.

 You reminded me on IRC about this email, and that helped a lot.
Its very
 easy to get distracted when trying to investigate things from the
 mailinglists.

 I had a brief look, and I think we could reach a solution. An
ugly patch
 for initial testing is ready. Well... it compiles. I'll try to
run some
 basic tests tomorrow and see if it improves things and does not
crash
 immediately.

 The change can be found here:
 http://review.gluster.org/11732

 It basically adds a resolve-gids mount option for the FUSE client.
 This causes the fuse daemon to call getgrouplist() and retrieve
all the
 groups for the UID that accesses the mountpoint. Without this
option,
 the behavior is not changed, and /proc/$PID/status is used to
get up to
 32 groups (the $PID is the process that accesses the mountpoint).

 You probably want to also mount with gid-timeout=N where N is
seconds
 that the group cache is valid. In the current master branch this
is set
 to 300 seconds (like the sssd default), but if the groups of a user
 rarely change, this value can be increased. Previous versions had a
 lower timeout which could cause resolving the groups on almost each
 network packet that arrives (HUGE performance impact).

 When using this option, you may also need to enable
server.manage-gids.
 This option allows using more than ~93 groups on the bricks. The
network
 packets can only contain ~93 groups, when server.manage-gids is
enabled,
 the groups are not sent in the network packets, but are resolved
on the
 bricks with getgrouplist().

The patch linked above had been tested, corrected and updated. The
change works for me on a test-system.

A backport that you should be able to include in a package for 3.6 can
be found here: http://termbin.com/f3cj
Let me know if you are not familiar with rebuilding patched packages,
and I can build a test-version for you tomorrow.

On glusterfs-3.6, you will want to pass a gid-timeout mount option
too.
The option enables caching of the resolved groups that the uid belongs
too, if caching is not enabled (or expires quickly), you will probably
notice a performance hit. Newer versions of GlusterFS set the
timeout to
300 seconds (like the default timeout sssd uses).

Please test and let me know if this fixes your use case.

Thanks,
Niels



 Cheers,
 Niels

  Maybe an example is in order:
 
  We first set up a test directory with setgid bit so that our new
  subdirectories inherit the group.
  [root@gfs01a hpc_shared]# mkdir test; cd test; chown
pglomski.users .;
  chmod 2770 .; getfacl .
  # file: .
  # owner: pglomski
  # group: users
  # flags: -s-
  user::rwx
  group::rwx
  other::---
 
  New subdirectories share the group, but the umask leads to
them being group
  read-only.
  [root@gfs01a test]# mkdir a; getfacl a
  # file: a
  # owner: root
  # group: users
  # flags: -s-
  user::rwx
  group::r-x
  other::r-x
 
  Setting default ACLs on the server allows group write to new
directories
  made

Re: [Gluster-devel] gluster vol start is failing when glusterfs is compiled with debug enable .

2015-07-22 Thread Raghavendra Bhat

On 07/22/2015 09:50 AM, Atin Mukherjee wrote:


On 07/22/2015 12:50 AM, Anand Nekkunti wrote:

Hi All
gluster vol start is failing when glusterfs is compiled with debug
enabled.
Link: :https://bugzilla.redhat.com/show_bug.cgi?id=1245331

*brick start is failing with the following error:*
2015-07-21 19:01:59.408729] I [MSGID: 100030] [glusterfsd.c:2296:main]
0-/usr/local/sbin/glusterfsd: Started running /usr/local/sbin/glusterfsd
version 3.8dev (args: /usr/local/sbin/glusterfsd -s 192.168.0.4
--volfile-id VOL.192.168.0.4.tmp-BRICK1 -p
/var/lib/glusterd/vols/VOL/run/192.168.0.4-tmp-BRICK1.pid -S
/var/run/gluster/0a4faf3d8d782840484629176ecf307a.socket --brick-name
/tmp/BRICK1 -l /var/log/glusterfs/bricks/tmp-BRICK1.log --xlator-option
*-posix.glusterd-uuid=4ec09b0c-6043-40f0-bc1a-5cc312d49a78 --brick-port
49152 --xlator-option VOL-server.listen-port=49152)
[2015-07-21 19:02:00.075574] I [MSGID: 101190]
[event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 1
[2015-07-21 19:02:00.078905] W [MSGID: 101095]
[xlator.c:189:xlator_dynload] 0-xlator: /usr/local/lib/libgfdb.so.0:
undefined symbol: gf_sql_str2sync_t
[2015-07-21 19:02:00.078947] E [MSGID: 101002] [graph.y:211:volume_type]
0-parser: Volume 'VOL-changetimerecorder', line 16: type
'features/changetimerecorder' is not valid or not found on this machine
[2015-07-21 19:02:00.079020] E [MSGID: 101019] [graph.y:319:volume_end]
0-parser: type not specified for volume VOL-changetimerecorder
[2015-07-21 19:02:00.079150] E [MSGID: 100026]
[glusterfsd.c:2151:glusterfs_process_volfp] 0-: failed to construct the
graph
[2015-07-21 19:02:00.079399] W [glusterfsd.c:1214:cleanup_and_exit]
(--/usr/local/sbin/glusterfsd(mgmt_getspec_cbk+0x343) [0x40df64]
--/usr/local/sbin/glusterfsd(glusterfs_process_volfp+0x1a2) [0x409b58]
--/usr/local/sbin/glusterfsd(cleanup_and_exit+0x77) [0x407a6f] ) 0-:
received signum (0), shutting down

I am not able to hit this though.


This seems to be the case of inline functions being considered as 
undefined symbols. There has been a discussion about it on the mailing 
list.


https://www.gluster.org/pipermail/gluster-devel/2015-June/045942.html

Regards,
Raghavendra Bhat



Thanks & Regards
Anand.N





Re: [Gluster-devel] on patch #11553

2015-07-08 Thread Raghavendra Bhat

On 07/07/2015 12:30 PM, Raghavendra G wrote:

+ vijay mallikarjuna for quotad has similar concerns

+ Raghavendra Bhat for snapd might've similar concerns.


Snapd also uses protocol/server at the top of the graph. So the fix for 
protocol/server should be good enough.


Regards,
Raghavendra Bhat



On Tue, Jul 7, 2015 at 12:02 PM, Raghavendra Gowdappa 
rgowd...@redhat.com mailto:rgowd...@redhat.com wrote:


+gluster-devel

- Original Message -
 From: Raghavendra Gowdappa rgowd...@redhat.com
mailto:rgowd...@redhat.com
 To: Krishnan Parthasarathi kpart...@redhat.com
mailto:kpart...@redhat.com
 Cc: Nithya Balachandran nbala...@redhat.com
mailto:nbala...@redhat.com, Anoop C S achir...@redhat.com
mailto:achir...@redhat.com
 Sent: Tuesday, 7 July, 2015 11:32:01 AM
 Subject: on patch #11553

 KP,

 Though the crash because of lack of init while fops are in
progress is
 solved, concerns addressed by [1] are still valid. Basically
what we need to
 guarantee is when it is safe to wind fops through a
particular subvol
 of protocol/server. So, if some xlators are doing things in
events like
 CHILD_UP (like trash), server_setvolume should wait for CHILD_UP
on a
 particular subvol before accepting a client. So, [1] is
necessary but
 following changes need to be made:

 1. protocol/server _can_ have multiple subvol as children. In
that case we
 should track whether the exported subvol has received CHILD_UP
and only
 after a successful CHILD_UP on that subvol connections to that
subvol can be
 accepted.
 2. It is valid (though not a common thing on brick process) that
some subvols
 can be up and some might be down. So, child readiness should be
localised to
 that subvol instead of tracking readiness at protocol/server level.

 So, please revive [1] and send it with corrections and I'll
merge it.

 [1] http://review.gluster.org/11553

 regards,
 Raghavendra.




--
Raghavendra G




Re: [Gluster-devel] healing of bad objects (marked by scrubber)

2015-07-08 Thread Raghavendra Bhat

Adding the correct gluster-devel id.

Regards,
Raghavendra Bhat

On 07/08/2015 11:38 AM, Raghavendra Bhat wrote:


Hi,

In the bit-rot feature, the scrubber marks corrupted objects (objects whose 
data has gone bad) as bad objects (via an extended attribute). If the 
volume is a replicate volume and an object in one of the replicas goes 
bad, the client is still able to see the data via the good 
copy present in the other replica. But as of now, self-heal does 
not heal the bad objects. So the method to heal the bad object is to 
remove the bad object directly from the backend and let self-heal take 
care of healing it from the good copy.


The above method has a problem. The bit-rot-stub xlator sitting in the 
brick graph remembers an object as bad in its inode context (either 
when the object was being marked bad by the scrubber, or during the first 
lookup of the object if it was already marked bad). Bit-rot-stub uses 
that info to block any read/write operations on such bad objects. So 
it also blocks any kind of operation attempted by self-heal to 
correct the object (the object was deleted directly in the backend, so 
the in-memory inode will still be present and considered valid).


There are 2 methods that I think can solve the issue.

1) In server_lookup_cbk, if the lookup of an object fails due to 
ENOENT *AND* the lookup is a revalidate lookup, then forget the 
inode associated with that object (not just unlinking the dentry, but 
forgetting the inode as well, iff there are no more dentries associated 
with the inode). At least this way the inode would be forgotten, and 
later when self-heal wants to correct the object, it has to create a 
new object (the object was removed directly from the backend), which 
has to happen with the creation of a new in-memory inode, and 
read/write operations by the self-heal daemon will not be blocked.

I have sent a patch for review for the above method:
http://review.gluster.org/#/c/11489/

OR

2) Do not block write operations coming on the bad object if the 
operation is coming from self-heal; allow it to completely heal the 
file and, once healing is done, remove the bad-object information from 
the inode context.
The requests coming from the self-heal daemon can be identified by checking 
their pid (it has a -ve pid). But if the self-heal is happening from 
the glusterfs client itself, I am not sure whether self-heal happens 
with a -ve pid for the frame or the same pid as that of the frame of 
the original fop which triggered the self-heal. Pranith, can you 
clarify this?


Please provide feedback.

Regards,
Raghavendra Bhat
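
To illustrate option 1 above, here is a small sketch in plain Python (not
gluster's inode table code; the names and structure are made up purely for
illustration) of forgetting a cached inode when a revalidate lookup hits
ENOENT, so that a later heal starts from a fresh inode that is not marked bad:

import errno

class Inode:
    def __init__(self, gfid):
        self.gfid = gfid
        self.bad = False          # set when the scrubber marks the object bad

class InodeTable:
    def __init__(self):
        self.by_gfid = {}         # stands in for the in-memory inode table

    def lookup(self, gfid, exists_on_disk):
        cached = self.by_gfid.get(gfid)
        if cached is not None:
            # Revalidate lookup: the object was removed directly from the
            # backend, so forget the stale inode (and its "bad" state) instead
            # of keeping it around to block the subsequent heal.
            if not exists_on_disk:
                del self.by_gfid[gfid]
                raise OSError(errno.ENOENT, "object gone; stale inode forgotten")
            return cached
        if not exists_on_disk:
            raise OSError(errno.ENOENT, "no such object")
        # Fresh lookup: a brand new in-memory inode, not marked bad, so
        # read/write operations from the self-heal daemon are not blocked.
        inode = Inode(gfid)
        self.by_gfid[gfid] = inode
        return inode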




Re: [Gluster-devel] [release-3.6] compile error: 'GF_REPLACE_OP_START' undeclared

2015-07-06 Thread Raghavendra Bhat

On 07/06/2015 01:39 PM, Niels de Vos wrote:

On Mon, Jul 06, 2015 at 12:09:28PM +0530, Raghavendra Bhat wrote:

On 07/06/2015 09:52 AM, Kaushal M wrote:

I checked on NetBSD-7.0_BETA and FreeBSD-10.1. I couldn't reproduce
this. I'll try on NetBSD-6 next.

~kaushal

I think it has to be included before 3.6.4 is made G.A. I can wait till the
fix for this issue is merged before making 3.6.4. Does it sound ok? Or
should I go ahead with 3.6.4 and make a quick 3.6.5 with this fix?

I only care about getting http://review.gluster.org/11335 merged :-)

This is a patch I promised to take into release-3.5. It would be nicer
to have this change included in the release-3.6 branch before I merge
the 3.5 backport. At the moment, 3.5.5 is waiting on this patch. But I
do not think you really need to delay 3.6.4 off for that one. It should
be fine if it lands in 3.6.5. (The compile error looks more like a 3.6.4
blocker.)

Niels


Niels,

The patch you mentioned has received the acks and also has passed the 
linux regression tests. But it seems to have failed the netbsd regression tests.


Regards,
Raghavendra Bhat


Regards,
Raghavendra Bhat


On Mon, Jul 6, 2015 at 8:38 AM, Kaushal M kshlms...@gmail.com wrote:

Krutika hit this last week, and let us (GlusterD maintiners) know of
it. I volunteered to look into this, but couldn't find time. I'll do
it now.

~kaushal

On Sun, Jul 5, 2015 at 10:43 PM, Atin Mukherjee
atin.mukherje...@gmail.com wrote:

I remember Krutika reporting it few days back. So it seems like its not
fixed yet. If there is no taker I will send a patch tomorrow.

-Atin
Sent from one plus one

On Jul 5, 2015 9:58 PM, Niels de Vos nde...@redhat.com wrote:

Hi,

it seems that the current release-3.6 branch does not compile on
FreeBSD and NetBSD (not sure why it compiles on CentOS-6). These errors
are thrown:

   --- glusterd_la-glusterd-op-sm.lo ---
 CC   glusterd_la-glusterd-op-sm.lo

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:
In function 'glusterd_op_start_rb_timer':

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:3685:19:
error: 'GF_REPLACE_OP_START' undeclared (first use in this function)

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:3685:19:
note: each undeclared identifier is reported only once for each function it
appears in

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:
In function 'glusterd_bricks_select_status_volume':

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:5800:34:
warning: unused variable 'snapd'
   *** [glusterd_la-glusterd-op-sm.lo] Error code 1


Could someone send a (pointer to the) backport that addresses this?

Thanks,
Niels


On Sun, Jul 05, 2015 at 08:59:32AM -0700, Gluster Build System (Code
Review) wrote:

Gluster Build System has posted comments on this change.

Change subject: nfs: make it possible to disable nfs.mount-rmtab
..


Patch Set 1: -Verified

Build Failed

http://build.gluster.org/job/compare-bug-version-and-git-branch/9953/ :
SUCCESS

http://build.gluster.org/job/freebsd-smoke/8551/ : FAILURE

http://build.gluster.org/job/smoke/19820/ : SUCCESS

http://build.gluster.org/job/netbsd6-smoke/7808/ : FAILURE

--
To view, visit http://review.gluster.org/11335
To unsubscribe, visit http://review.gluster.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40c4d8d754932f86fb2b1b2588843390464c773d
Gerrit-PatchSet: 1
Gerrit-Project: glusterfs
Gerrit-Branch: release-3.6
Gerrit-Owner: Niels de Vos nde...@redhat.com
Gerrit-Reviewer: Gluster Build System jenk...@build.gluster.com
Gerrit-Reviewer: Kaleb KEITHLEY kkeit...@redhat.com
Gerrit-Reviewer: NetBSD Build System jenk...@build.gluster.org
Gerrit-Reviewer: Niels de Vos nde...@redhat.com
Gerrit-Reviewer: Raghavendra Bhat rab...@redhat.com
Gerrit-Reviewer: jiffin tony Thottan jthot...@redhat.com
Gerrit-HasComments: No



Re: [Gluster-devel] [release-3.6] compile error: 'GF_REPLACE_OP_START' undeclared

2015-07-06 Thread Raghavendra Bhat

On 07/06/2015 09:52 AM, Kaushal M wrote:

I checked on NetBSD-7.0_BETA and FreeBSD-10.1. I couldn't reproduce
this. I'll try on NetBSD-6 next.

~kaushal


I think it has to be included before 3.6.4 is made G.A. I can wait till 
the fix for this issue is merged before making 3.6.4. Does it sound ok? 
Or should I go ahead with 3.6.4 and make a quick 3.6.5 with this fix?


Regards,
Raghavendra Bhat



On Mon, Jul 6, 2015 at 8:38 AM, Kaushal M kshlms...@gmail.com wrote:

Krutika hit this last week, and let us (GlusterD maintiners) know of
it. I volunteered to look into this, but couldn't find time. I'll do
it now.

~kaushal

On Sun, Jul 5, 2015 at 10:43 PM, Atin Mukherjee
atin.mukherje...@gmail.com wrote:

I remember Krutika reporting it few days back. So it seems like its not
fixed yet. If there is no taker I will send a patch tomorrow.

-Atin
Sent from one plus one

On Jul 5, 2015 9:58 PM, Niels de Vos nde...@redhat.com wrote:

Hi,

it seems that the current release-3.6 branch does not compile on
FreeBSD and NetBSD (not sure why it compiles on CentOS-6). These errors
are thrown:

   --- glusterd_la-glusterd-op-sm.lo ---
 CC   glusterd_la-glusterd-op-sm.lo

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:
In function 'glusterd_op_start_rb_timer':

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:3685:19:
error: 'GF_REPLACE_OP_START' undeclared (first use in this function)

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:3685:19:
note: each undeclared identifier is reported only once for each function it
appears in

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:
In function 'glusterd_bricks_select_status_volume':

/home/jenkins/root/workspace/netbsd6-smoke/xlators/mgmt/glusterd/src/glusterd-op-sm.c:5800:34:
warning: unused variable 'snapd'
   *** [glusterd_la-glusterd-op-sm.lo] Error code 1


Could someone send a (pointer to the) backport that addresses this?

Thanks,
Niels


On Sun, Jul 05, 2015 at 08:59:32AM -0700, Gluster Build System (Code
Review) wrote:

Gluster Build System has posted comments on this change.

Change subject: nfs: make it possible to disable nfs.mount-rmtab
..


Patch Set 1: -Verified

Build Failed

http://build.gluster.org/job/compare-bug-version-and-git-branch/9953/ :
SUCCESS

http://build.gluster.org/job/freebsd-smoke/8551/ : FAILURE

http://build.gluster.org/job/smoke/19820/ : SUCCESS

http://build.gluster.org/job/netbsd6-smoke/7808/ : FAILURE

--
To view, visit http://review.gluster.org/11335
To unsubscribe, visit http://review.gluster.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40c4d8d754932f86fb2b1b2588843390464c773d
Gerrit-PatchSet: 1
Gerrit-Project: glusterfs
Gerrit-Branch: release-3.6
Gerrit-Owner: Niels de Vos nde...@redhat.com
Gerrit-Reviewer: Gluster Build System jenk...@build.gluster.com
Gerrit-Reviewer: Kaleb KEITHLEY kkeit...@redhat.com
Gerrit-Reviewer: NetBSD Build System jenk...@build.gluster.org
Gerrit-Reviewer: Niels de Vos nde...@redhat.com
Gerrit-Reviewer: Raghavendra Bhat rab...@redhat.com
Gerrit-Reviewer: jiffin tony Thottan jthot...@redhat.com
Gerrit-HasComments: No



Re: [Gluster-devel] tests/bugs/snapshot/bug-1109889.t - snapd crash

2015-07-03 Thread Raghavendra Bhat

On 07/03/2015 03:37 PM, Atin Mukherjee wrote:

http://build.gluster.org/job/rackspace-regression-2GB-triggered/11898/consoleFull
has caused a crash in snapd with the following bt:


This seems to have crashed in server_setvolume (i.e. before the graph 
could be properly made available for I/O; the snapview-server xlator is yet 
to come into the picture). Still, I will try to reproduce it on my 
local setup and see what might be causing this.



Regards,
Raghavendra Bhat



#0  0x7f11e2ed3ded in gf_client_put (client=0x0, detached=0x0)
 at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/client_t.c:294
#1  0x7f11d4eeac96 in server_setvolume (req=0x7f11c000195c)
 at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/protocol/server/src/server-handshake.c:710
#2  0x7f11e2c1e05c in rpcsvc_handle_rpc_call (svc=0x7f11d001b160,
trans=0x7f11cac0, msg=0x7f11c0001810)
 at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpcsvc.c:698
#3  0x7f11e2c1e3cf in rpcsvc_notify (trans=0x7f11cac0,
mydata=0x7f11d001b160, event=RPC_TRANSPORT_MSG_RECEIVED,
 data=0x7f11c0001810) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpcsvc.c:792
#4  0x7f11e2c23ad7 in rpc_transport_notify (this=0x7f11cac0,
event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f11c0001810)
 at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-transport.c:538
#5  0x7f11d841787b in socket_event_poll_in (this=0x7f11cac0)
 at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2285
#6  0x7f11d8417dd1 in socket_event_handler (fd=13, idx=3,
data=0x7f11cac0, poll_in=1, poll_out=0, poll_err=0)
 at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2398
#7  0x7f11e2ed79ec in event_dispatch_epoll_handler
(event_pool=0x13bb040, event=0x7f11d4eb9e70)
 at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:570
#8  0x7f11e2ed7dda in event_dispatch_epoll_worker (data=0x7f11d000dc10)
 at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:673
#9  0x7f11e213e9d1 in start_thread () from ./lib64/libpthread.so.0
#10 0x7f11e1aa88fd in clone () from ./lib64/libc.so.6



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] glusterfs-3.6.4beta2 released

2015-07-02 Thread Raghavendra Bhat

Hi,

glusterfs-3.6.4beta2 has been released and the packages for 
RHEL/Fedora/CentOS can be found here.

http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.6.4beta2/

Requesting people running 3.6.x to please try it out and let us know if 
there are any issues.


This release supposedly fixes the bugs listed below since 3.6.4beta1 was 
made available. Thanks to all who submitted patches and reviewed the changes.


1230242 - `ls' on a directory which has files with mismatching gfid's 
does not list anything

1230259 -  Honour afr self-heal volume set options from clients
1122290 - Issues reported by Cppcheck static analysis tool
1227670 - wait for sometime before accessing the activated snapshot
1225745 - [AFR-V2] - afr_final_errno() should treat op_ret  0 also as 
success

1223891 - readdirp return 64bits inodes even if enable-ino32 is set
1206429 - Maintaining local transaction peer list in op-sm framework
1217419 - DHT:Quota:- brick process crashed after deleting .glusterfs 
from backend

1225072 - OpenSSL multi-threading changes break build in RHEL5 (3.6.4beta1)
1215419 - Autogenerated files delivered in tarball
1224624 - cli: Excessive logging
1217423 - glusterfsd crashed after directory was removed from the mount 
point, while self-heal and rebalance  were running on 
the volume



Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] xattr creation failure in posix_lookup

2015-06-29 Thread Raghavendra Bhat


Hi,

In posix_lookup, it allocates a dict for storing the values of the 
extended attributes and other hint keys set into the xdata of call path 
(i.e. wind path) by higher xlators (such as quick-read, bit-rot-stub etc).


But if the creation of the new dict fails, then a NULL dict is returned in 
the callback path. There might be many xlators for which the key-value 
information present in the dict is very important for making 
certain decisions (Ex: bit-rot-stub tries to fetch an extended 
attribute which tells whether the object is bad or not. If the key 
is present in the dict, it means the object is bad and the xlator updates 
the same in the inode context. Later, when any read/modify 
operation comes on that object, the fop is failed instead of being allowed to 
continue).


Now suppose the dict creation fails in posix_lookup: posix simply 
proceeds with the lookup operation, and if the other stat operations 
succeed, lookup will return success with a NULL dict.


        if (xdata && (op_ret == 0)) {
                xattr = posix_xattr_fill (this, real_path, loc, NULL, -1,
                                          xdata, &buf);
        }

The above piece of code in posix_lookup creates a new dict called 
@xattr. The return value of posix_xattr_fill is not checked.
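
For illustration, here is a minimal sketch of how the return value could be 
checked if we decide to fail the lookup on dict allocation failure (this is 
not the current code; it assumes posix_lookup's usual op_errno variable and 
its out: error label):

        if (xdata && (op_ret == 0)) {
                xattr = posix_xattr_fill (this, real_path, loc, NULL, -1,
                                          xdata, &buf);
                if (!xattr) {
                        /* dict allocation failed: fail the lookup instead
                           of unwinding success with a NULL dict */
                        op_ret   = -1;
                        op_errno = ENOMEM;
                        goto out;
                }
        }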


So in this case, as per the bit-rot-stub example mentioned above, there 
is a possibility that the object being looked up is a bad object (marked 
by the scrubber). Since the lookup succeeded but the bad-object xattr 
is not obtained in the callback (the dict itself being NULL), the bit-rot-stub 
xlator does not mark that object as bad and might allow further 
read/write requests, thus allowing bad data to be served.


There might be other xlators as well dependent upon the xattrs being 
returned in lookup.


Should we fail lookup if the dict creation fails?

Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] bad file access (bit-rot + AFR)

2015-06-29 Thread Raghavendra Bhat

On 06/27/2015 03:28 PM, Venky Shankar wrote:



On 06/27/2015 02:32 PM, Raghavendra Bhat wrote:

Hi,

There is a patch that is submitted for review to deny access to 
objects which are marked as bad by scrubber (i.e. the data of the 
object might have been corrupted in the backend).


http://review.gluster.org/#/c/11126/10
http://review.gluster.org/#/c/11389/4

The above  2 patch sets solve the problem of denying access to the 
bad objects (they have passed regression and received a +1 from 
venky). But in our testing we found that there is a race window 
(depending upon the scrubber frequency the race window can be larger) 
where there is a possibility of self-heal daemon healing the contents 
of the bad file before scrubber can mark it as bad.


I am not sure whether, when the data truly gets corrupted in the 
backend, there is a chance of hitting this issue. But in our testing, 
to simulate backend corruption we modify the contents of the file 
directly in the backend. Now in this case, before the scrubber can 
mark the object as bad, the self-heal daemon kicks in and heals the 
contents of the bad file onto the good copy. Or, before the scrubber 
marks the file as bad, if the client accesses it, AFR finds that there 
is a mismatch in metadata (since we modified the contents of the file 
in the backend) and does data and metadata self-healing, thus copying 
the contents of the bad copy to the good copy. And from then onwards 
the clients accessing that object always get bad data.


I understand from Ravi (ranaraya@) that AFR-v2 would choose the 
biggest file as the source, provided that the afr xattrs are clean 
(AFR-v1 would give back EIO). If a file is modified directly on the 
brick but the size is left unchanged, contents can be served from 
either copy. For self-heal to detect anomalies, there needs to be 
verification (checksum/signature) at each stage of its operation. But 
this might be too heavy on the I/O side. We could still cache mtime 
[but update on client I/O] after pre-check, but this still would not 
catch bit flips (unless a filesystem scrub is done).


Thoughts?



Yes. Even if one wants to verify just before healing the file, the time 
taken to verify the checksum might be large if the file size is large. 
It might affect self-heal performance.


Regards,
Raghavendra Bhat



Pranith, do you have any solution for this? Venky and I are trying to 
come up with a solution for this.


But does this issue block the above patches in any way? (Those 2 
patches are still needed to deny access to objects once they are 
marked as bad by the scrubber.)



Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] bad file access (bit-rot + AFR)

2015-06-27 Thread Raghavendra Bhat

Hi,

There is a patch that is submitted for review to deny access to objects 
which are marked as bad by scrubber (i.e. the data of the object might 
have been corrupted in the backend).


http://review.gluster.org/#/c/11126/10
http://review.gluster.org/#/c/11389/4

The above  2 patch sets solve the problem of denying access to the bad 
objects (they have passed regression and received a +1 from venky). But 
in our testing we found that there is a race window (depending upon the 
scrubber frequency the race window can be larger) where there is a 
possibility of self-heal daemon healing the contents of the bad file 
before scrubber can mark it as bad.


I am not sure whether, when the data truly gets corrupted in the backend, 
there is a chance of hitting this issue. But in our testing, to simulate 
backend corruption we modify the contents of the file directly in the backend. 
Now in this case, before the scrubber can mark the object as bad, the 
self-heal daemon kicks in and heals the contents of the bad file onto the 
good copy. Or, before the scrubber marks the file as bad, if the client 
accesses it, AFR finds that there is a mismatch in metadata (since we 
modified the contents of the file in the backend) and does data and 
metadata self-healing, thus copying the contents of the bad copy to the good 
copy. And from then onwards the clients accessing that object always get 
bad data.


Pranith, do you have any solution for this? Venky and I are trying to 
come up with a solution for this.


But does this issue block the above patches in any way? (Those 2 patches 
are still needed to deny access to objects once they are marked as bad 
by the scrubber.)



Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] spurious failure with test-case ./tests/basic/tier/tier.t

2015-06-26 Thread Raghavendra Bhat

On 06/26/2015 04:00 PM, Ravishankar N wrote:



On 06/26/2015 03:57 PM, Vijaikumar M wrote:

Hi

Upstream regression failure with test-case ./tests/basic/tier/tier.t

My patch #11315 regression failed twice with 
test-case ./tests/basic/tier/tier.t. Is anyone seeing this issue with 
other patches?




Yes, one of my patches failed today too: 
http://build.gluster.org/job/rackspace-regression-2GB-triggered/11461/consoleFull


-Ravi


I too have faced failures in tier.t a couple of times.

Regards,
Raghavendra Bhat

http://build.gluster.org/job/rackspace-regression-2GB-triggered/11396/consoleFull 

http://build.gluster.org/job/rackspace-regression-2GB-triggered/11456/consoleFull 




Thanks,
Vijay

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Valgrind + glusterfs

2015-06-24 Thread Raghavendra Bhat

On 06/25/2015 09:57 AM, Pranith Kumar Karampuri wrote:

hi,
   Does anyone know why glusterfs hangs with valgrind?

Pranith


Yes. I have faced it too. It used to work before, but recently it's not 
working: glusterfs hangs when run with valgrind.

Not sure why it is hanging.


Regards,
Raghavendra Bhat


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Bad file access in bit-rot-detection

2015-06-08 Thread Raghavendra Bhat


Hi,

As part of the bit-rot detection feature, a file whose data has changed 
due to some backend errors is marked as a bad file by the scrubber (it sets 
an extended attribute indicating it's a bad file). Now, access to the 
bad file has to be denied (to prevent wrong data from being served).


In the bit-rot-stub xlator (the xlator which does object versioning and 
sends notifications to BitD upon object modification) the check for 
whether the file is bad or not can be done in lookup: if the xattr 
is set, then the object can be marked as bad within its inode context as 
well. But the problem is, what if the object was not marked as bad at the 
time of lookup and was marked bad later? Now when a fop such as open, 
readv or writev comes, the fop should not be allowed. If it is a fuse 
client from which the file is being accessed, then it is probably ok to 
rely only on lookups (to check if it is bad or not), as fuse sends lookups 
before sending fops. But for NFS, once the lookup is done and the filehandle 
is available, further lookups are not sent. In that case relying only on 
lookup to check if it is a bad file or not is not sufficient.


The below 3 solutions in the bit-rot-stub xlator seem to address the above issue.

1) Whenever a fop such as open, readv or writev comes, check in the 
inode context whether it is a bad file or not. If not, then send a getxattr of 
the bad-file xattr on that file. If it is present, then set the bad-file 
attribute in the inode context and fail the fop.
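
A minimal sketch of the fast-path part of this method (the helper name and 
the meaning of the context value are hypothetical; only inode_ctx_get(), 
gf_boolean_t, _gf_true and _gf_false are from libglusterfs):

        /* Returns true when the inode context already says the object is
           bad; when it returns false the caller still has to issue the
           getxattr described above. */
        static gf_boolean_t
        br_stub_known_bad (xlator_t *this, inode_t *inode)
        {
                uint64_t ctx_val = 0;

                if (inode_ctx_get (inode, this, &ctx_val) == 0 && ctx_val)
                        return _gf_true;

                return _gf_false;
        }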


But for the above operation, a getxattr call has to be sent downwards for 
almost every open, readv or writev. If the file is already identified as bad, 
then the getxattr might not be necessary, but for good files the extra getxattr 
might affect performance.


OR

2) Set a key in xdata whenever open, readv, or writev comes (in 
bit-rot-stub xlator) and send it downwards. The posix xlator can look 
into the xdata and if the key for bad file identification is present, 
then it can do getxattr as part of open or readv or writev itself and 
send the response back in xdata itself.


Not sure whether the above method is ok or not as it overloads open, 
readv and writev. Apart from that, the getxattr disk operation is still 
done.


OR

3) Once the file is identified as bad, the scrubber marks it as bad (via 
setxattr) by sending a call to the bit-rot-stub xlator. The bit-rot-stub 
xlator marks the file as bad in the inode context once it receives the 
notification from the scrubber that the file is bad. This saves those getxattr 
calls being made from other fops (either in the bit-rot-stub xlator or the 
posix xlator).


But the tricky part is: what if the inode gets forgotten or the brick restarts? 
I think in that case checking in the lookup call is sufficient (for 
both inode forgets and brick restarts, a lookup will definitely come if 
there is an access to that file).


Please provide feedback on the above 3 methods. If there are any other 
solutions which might solve this issue, they are welcome.


Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] spurious regression status

2015-05-08 Thread Raghavendra Bhat

On Wednesday 06 May 2015 10:53 PM, Vijay Bellur wrote:

On 05/06/2015 06:52 AM, Pranith Kumar Karampuri wrote:

hi,
   Please backport the patches that fix spurious regressions to 3.7
as well. This is the status of regressions now:

  * ./tests/bugs/quota/bug-1035576.t (Wstat: 0 Tests: 24 Failed: 2)

  * Failed tests:  20-21

  * 
http://build.gluster.org/job/rackspace-regression-2GB-triggered/8329/consoleFull



  * ./tests/bugs/snapshot/bug-1112559.t: 1 new core files

  * 
http://build.gluster.org/job/rackspace-regression-2GB-triggered/8308/consoleFull


  * One more occurrence -

  * Failed tests:  9, 11

  * 
http://build.gluster.org/job/rackspace-regression-2GB-triggered/8430/consoleFull



Rafi - this seems to be a test unit contributed by you. Can you please 
look into this one?



  * ./tests/geo-rep/georep-rsync-changelog.t (Wstat: 256 Tests: 3 
Failed: 0)


  * Non-zero exit status: 1

  * 
http://build.gluster.org/job/rackspace-regression-2GB-triggered/8168/console





Aravinda/Kotresh - any update on this? If we do not intend enabling 
geo-replication tests in regression runs for now, this should go off 
the list.




  * ./tests/basic/quota-anon-fd-nfs.t (failed-test: 21)

  * Happens in: master
(http://build.gluster.org/job/rackspace-regression-2GB-triggered/8147/consoleFull)
http://build.gluster.org/job/rackspace-regression-2GB-triggered/8147/consoleFull%29

  * Being investigated by: ?



Sachin - does this happen anymore or should we move it off the list?




  * tests/features/glupy.t

  * nuked tests 7153, 7167, 7169, 7173, 7212



Emmanuel's investigation should help us here. Thanks!



  * tests/basic/volume-snapshot-clone.t

  * http://review.gluster.org/#/c/10053/

  * Came back on April 9

  * 
http://build.gluster.org/job/rackspace-regression-2GB-triggered/6658/




Rafi - does this happen anymore? If fixed due to subsequent commits, 
we should look at dropping this test from is_bad_test() in run-tests.sh.




  * tests/basic/uss.t

  * https://bugzilla.redhat.com/show_bug.cgi?id=1209286

  * http://review.gluster.org/#/c/10143/

  * Came back on April 9

  * 
http://build.gluster.org/job/rackspace-regression-2GB-triggered/6660/


  * ./tests/bugs/glusterfs/bug-867253.t (Wstat: 0 Tests: 9 Failed: 1)

  * Failed test:  8



Raghu - does this happen anymore? If fixed due to subsequent commits, 
we should look at dropping this test from is_bad_test() in run-tests.sh.


-Vijay


I tried to reproduce the issue and it did not happen in my setup. So I 
am planning to get a slave machine and test it there.


Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] spurious regression status

2015-05-08 Thread Raghavendra Bhat

On Thursday 07 May 2015 10:50 AM, Sachin Pandit wrote:


- Original Message -

From: Vijay Bellur vbel...@redhat.com
To: Pranith Kumar Karampuri pkara...@redhat.com, Gluster Devel 
gluster-devel@gluster.org, Rafi Kavungal
Chundattu Parambil rkavu...@redhat.com, Aravinda avish...@redhat.com, Sachin 
Pandit span...@redhat.com,
Raghavendra Bhat rab...@redhat.com, Kotresh Hiremath Ravishankar 
khire...@redhat.com
Sent: Wednesday, May 6, 2015 10:53:01 PM
Subject: Re: [Gluster-devel] spurious regression status

On 05/06/2015 06:52 AM, Pranith Kumar Karampuri wrote:

hi,
Please backport the patches that fix spurious regressions to 3.7
as well. This is the status of regressions now:

   * ./tests/bugs/quota/bug-1035576.t (Wstat: 0 Tests: 24 Failed: 2)

   * Failed tests:  20-21

   *
   
http://build.gluster.org/job/rackspace-regression-2GB-triggered/8329/consoleFull


   * ./tests/bugs/snapshot/bug-1112559.t: 1 new core files

   *
   
http://build.gluster.org/job/rackspace-regression-2GB-triggered/8308/consoleFull

   * One more occurrence -

   * Failed tests:  9, 11

   *
   
http://build.gluster.org/job/rackspace-regression-2GB-triggered/8430/consoleFull


Rafi - this seems to be a test unit contributed by you. Can you please
look into this one?



   * ./tests/geo-rep/georep-rsync-changelog.t (Wstat: 256 Tests: 3 Failed:
   0)

   * Non-zero exit status: 1

   *
   http://build.gluster.org/job/rackspace-regression-2GB-triggered/8168/console



Aravinda/Kotresh - any update on this? If we do not intend enabling
geo-replication tests in regression runs for now, this should go off the
list.


   * ./tests/basic/quota-anon-fd-nfs.t (failed-test: 21)

   * Happens in: master
 
(http://build.gluster.org/job/rackspace-regression-2GB-triggered/8147/consoleFull)
 
http://build.gluster.org/job/rackspace-regression-2GB-triggered/8147/consoleFull%29

   * Being investigated by: ?


Sachin - does this happen anymore or should we move it off the list?

The quota-anon-fd.t failure is consistent on NetBSD, whereas on Linux,
apart from the test failure mentioned in the etherpad, I did not see this
failure again in the regression runs. However, I remember Pranith
talking about hitting this issue again.



   * tests/features/glupy.t

   * nuked tests 7153, 7167, 7169, 7173, 7212


Emmanuel's investigation should help us here. Thanks!


   * tests/basic/volume-snapshot-clone.t

   * http://review.gluster.org/#/c/10053/

   * Came back on April 9

   * http://build.gluster.org/job/rackspace-regression-2GB-triggered/6658/


Rafi - does this happen anymore? If fixed due to subsequent commits, we
should look at dropping this test from is_bad_test() in run-tests.sh.


   * tests/basic/uss.t

   * https://bugzilla.redhat.com/show_bug.cgi?id=1209286

   * http://review.gluster.org/#/c/10143/

   * Came back on April 9

   * http://build.gluster.org/job/rackspace-regression-2GB-triggered/6660/

   * ./tests/bugs/glusterfs/bug-867253.t (Wstat: 0 Tests: 9 Failed: 1)

   * Failed test:  8


Raghu - does this happen anymore? If fixed due to subsequent commits, we
should look at dropping this test from is_bad_test() in run-tests.sh.

-Vijay



As per the Jenkins output, uss.t is failing in this test case:

TEST stat $M0/.history/snap6/aaa

And it's failing with the below error:

stat: cannot stat `/mnt/glusterfs/0/.history/snap6/aaa': No such file or 
directory


It's a bit strange, as before doing this check the file is created on the 
mount point and then the snapshot is taken. I am not sure whether it's 
not able to reach the file itself or its parent directory (which 
represents the snapshot of the volume, i.e. in this case 
/mnt/glusterfs/0/.history/snap6).


So I have sent a patch to check for the parent directory (i.e. stat on 
it). It will help us get more information.

http://review.gluster.org/10671

Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-02 Thread Raghavendra Bhat

On Thursday 02 April 2015 01:00 PM, Pranith Kumar Karampuri wrote:


On 04/02/2015 12:27 AM, Raghavendra Talur wrote:



On Wed, Apr 1, 2015 at 10:34 PM, Justin Clift jus...@gluster.org 
mailto:jus...@gluster.org wrote:


On 1 Apr 2015, at 10:57, Emmanuel Dreyfus m...@netbsd.org
mailto:m...@netbsd.org wrote:
 Hi

 crypt.t was recently broken in NetBSD regression. The glusterfs
returns
 a node with file type invalid to FUSE, and that breaks the test.

 After running a git bisect, I found the offending commit after
which
 this behavior appeared:
8a2e2b88fc21dc7879f838d18cd0413dd88023b7
mem-pool: invalidate memory on GF_FREE to aid debugging

 This means the bug has always been there, but this debugging aid
 caused it to be reliable.

Sounds like that commit is a good win then. :)

Harsha/Pranith/Lala, your names are on the git blame for crypt.c...
any ideas? :)


I found one issue: local is not allocated using GF_CALLOC 
with a mem-type.

This is a patch which *might* fix it.

diff --git a/xlators/encryption/crypt/src/crypt-mem-types.h 
b/xlators/encryption/crypt/src/crypt-mem-types.h

index 2eab921..c417b67 100644
--- a/xlators/encryption/crypt/src/crypt-mem-types.h
+++ b/xlators/encryption/crypt/src/crypt-mem-types.h
@@ -24,6 +24,7 @@ enum gf_crypt_mem_types_ {
gf_crypt_mt_key,
gf_crypt_mt_iovec,
gf_crypt_mt_char,
+gf_crypt_mt_local,
gf_crypt_mt_end,
 };
diff --git a/xlators/encryption/crypt/src/crypt.c 
b/xlators/encryption/crypt/src/crypt.c

index ae8cdb2..63c0977 100644
--- a/xlators/encryption/crypt/src/crypt.c
+++ b/xlators/encryption/crypt/src/crypt.c
@@ -48,7 +48,7 @@ static crypt_local_t 
*crypt_alloc_local(call_frame_t *frame, xlator_t *this,

 {
crypt_local_t *local = NULL;
-   local = mem_get0(this->local_pool);
+local = GF_CALLOC (sizeof (*local), 1, gf_crypt_mt_local);
local was using memory from the pool earlier (i.e. with mem_get0()), 
which seems ok to me. Changing it this way will introduce memory 
allocation in the fop I/O path, which is why xlators generally use the 
mem-pool approach.


Pranith


I think the crypt xlator should do a mem_put of local after doing 
STACK_UNWIND, like other xlators which also use mem_get for local (such 
as AFR). I suspect that crypt not doing mem_put might be the reason for 
the bug mentioned.


Regards,
Raghavendra Bhat


        if (!local) {
                gf_log (this->name, GF_LOG_ERROR, "out of memory");
                return NULL;


Niels should be able to tell whether this is a sufficient fix or not.

Thanks,
Raghavendra Talur

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift
http://twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org mailto:Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel




--
*Raghavendra Talur *





___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] crypt xlator bug

2015-04-02 Thread Raghavendra Bhat

On Thursday 02 April 2015 05:50 PM, Jeff Darcy wrote:

I think the crypt xlator should do a mem_put of local after doing STACK_UNWIND,
like other xlators which also use mem_get for local (such as AFR). I
suspect that crypt not doing mem_put might be the reason for the bug
mentioned.

My understanding was that mem_put should be called automatically from
FRAME_DESTROY, which is itself called from STACK_DESTROY when the fop
completes (e.g. at FUSE or GFAPI).  On the other hand, I see that AFR
and others call mem_put themselves, without zeroing the local pointer.
In my (possibly no longer relevant) experience, freeing local myself
without zeroing the pointer would lead to a double free, and I don't
see why that's not the case here.  What am I missing?


As per my understanding, the xlators which get local via mem_get should 
be doing the below things in the callback function just before unwinding:


1) save the frame->local pointer (i.e. local = frame->local);
2) STACK_UNWIND
3) mem_put (local)

After STACK_UNWIND and before mem_put, any reference to an fd or inode or 
dict that might be present in the local should be unrefed (also any 
allocated resources that are present in local should be freed). So 
mem_put is done last. To avoid a double free in FRAME_DESTROY, 
frame->local is set to NULL before doing STACK_UNWIND.
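
A minimal sketch of that pattern for an open callback (my_local_t and its 
fd field are illustrative, not taken from crypt.c):

        typedef struct {
                fd_t *fd;              /* example of a ref held in local */
        } my_local_t;

        int32_t
        xl_open_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                     int32_t op_ret, int32_t op_errno, fd_t *fd, dict_t *xdata)
        {
                my_local_t *local = frame->local;    /* 1) save the pointer */

                frame->local = NULL;    /* so FRAME_DESTROY will not try to
                                           free it again */

                /* 2) unwind */
                STACK_UNWIND_STRICT (open, frame, op_ret, op_errno, fd, xdata);

                /* 3) drop refs held in local, then return it to the pool */
                if (local) {
                        if (local->fd)
                                fd_unref (local->fd);
                        mem_put (local);
                }

                return 0;
        }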


I suspect that not doing one of the above three operations (maybe either the 
1st or the 3rd) in the crypt xlator might be the reason for the bug.


Regards,
Raghavendra Bhat


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] glusterfs-3.6.3beta2 released

2015-04-01 Thread Raghavendra Bhat

Hi

glusterfs-3.6.3beta2 has been released and can be found here.
http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.6.3beta2/

This beta release supposedly fixes the bugs listed below since 
3.6.3beta1 was made available. Thanks to all who submitted the patches and 
reviewed the changes.



1187526 - Disperse volume mounted through NFS doesn't list any 
files/directories
1188471 - When the volume is in stopped state/all the bricks are down 
mount of the volume hangs
1201484 - glusterfs-3.6.2 fails to build on Ubuntu Precise: 
'RDMA_OPTION_ID_REUSEADDR' undeclared

1202212 - Performance enhancement for RDMA
1189023 - Directories not visible anymore after add-brick, new brick 
dirs not part of old bricks

1202673 - Perf: readdirp in replicated volumes causes performance degrade
1203081 - Entries in indices/xattrop directory not removed appropriately
1203648 - Quota: Build ancestry in the lookup
1199936 - readv on /var/run/6b8f1f2526c6af8a87f1bb611ae5a86f.socket 
failed when NFS is disabled

1200297 - cli crashes when listing quota limits with xml output
1201622 - Convert quota size from n-to-h order before using it
1194141 - AFR : failure in self-heald.t
1201624 - Spurious failure of tests/bugs/quota/bug-1038598.t
1194306 - Do not count files which did not need index heal in the first 
place as successfully healed
1200258 - Quota: features.quota-deem-statfs is on even after disabling 
quota.

1165938 - Fix regression test spurious failures
1197598 - NFS logs are filled with system.posix_acl_access messages
1199577 - mount.glusterfs uses /dev/stderr and fails if the device does 
not exist

1197598 - NFS logs are filled with system.posix_acl_access messages
1188066 - logging improvements in marker translator
1191537 - With afrv2 + ext4, lookups on directories with large offsets 
could result in duplicate/missing entries

1165129 - libgfapi: use versioned symbols in libgfapi.so for compatibility
1179136 - glusterd: Gluster rebalance status returns failure
1176756 - glusterd: remote locking failure when multiple synctask 
transactions are run
1188064 - log files get flooded when removexattr() can't find a 
specified key or value

1165938 - Fix regression test spurious failures
1192522 - index heal doesn't continue crawl on self-heal failure
1193970 - Fix spurious ssl-authz.t regression failure (backport)


Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [PATCH ANNOUNCE] BitRot : Object signing

2015-02-19 Thread Raghavendra Bhat


Hi,

These are the patches.

http://review.gluster.org/#/c/9705/
http://review.gluster.org/#/c/9706/
http://review.gluster.org/#/c/9707/
http://review.gluster.org/#/c/9708/
http://review.gluster.org/#/c/9709/
http://review.gluster.org/#/c/9710/
http://review.gluster.org/#/c/9711/
http://review.gluster.org/#/c/9712/

Regards,
Raghavendra Bhat

On Thursday 19 February 2015 07:34 PM, Venky Shankar wrote:

Hi folks,

Listed below is the initial patchset for the upcoming bitrot detection 
feature targeted for GlusterFS 3.7. As of now, this set of patches 
implements object signing. Raghavendra (rabhat@) and I are working 
on pending items (scrubber, etc..) and would be sending those patches 
shortly. Since this is the initial patch set, it might be prone to 
bugs (as we speak rabhat@ is chasing a memory leak :-)).


There is an upcoming event on Google+ Hangout regarding bitrot on 
Tuesday, 24th March. The hangout session would cover implementation 
details (algorithm, flow, etc..) and would be beneficial for anyone 
from code reviewers, users or generally interested parties. Please 
plan to attend if possible: http://goo.gl/dap9rF


As usual, comments/suggestions are more than welcome.

Thanks,
Venky (overclk on #freenode)



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] glusterfs-3.6.3beta1 released

2015-02-13 Thread Raghavendra Bhat

Hi

glusterfs-3.6.3beta1 has been released and can be found here.
http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.6.3beta1/

This beta release supposedly fixes the bugs listed below since 3.6.2 was 
made available. Thanks to all who submitted the patches and reviewed the 
changes.


1138897 - NetBSD port
1184527 - Some newly created folders have root ownership although 
created by unprivileged user

1181977 - gluster vol clear
1159471 - rename operation leads to core dump
1173528 - Change in volume heal info command output
1186119 - tar on a gluster directory gives message file changed as we 
read it even though no updates to file in progress

1183716 - Force replace
1178590 - Enable quota(default) leads to heal directory's xattr failed.
1182490 - Internal ec xattrs are allowed to be modified
1187547 - self-heal-algorithm with option full doesn't heal sparse 
files correctly
1174170 - Glusterfs outputs a lot of warnings and errors when quota is 
enabled
1186119 - tar on a gluster directory gives message file changed as we 
read it even though no updates to file in progress


Regards,
Raghavendra Bhat

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] glusterfs-3.6.2beta2

2015-01-16 Thread Raghavendra Bhat


Hi

glusterfs-3.6.2beta2 has been released and can be found here.
http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.6.2beta2/


This beta release supposedly fixes the bugs listed below since 3.6.2beta1 was 
made available. Thanks to all who submitted the patches and reviewed the 
changes.



 1180404 - nfs server restarts when a snapshot is deactivated
 1180411 - CIFS:[USS]: glusterfsd OOM killed when 255 snapshots were 
browsed at CIFS mount and Control+C is issued
 1180070 - [AFR] getfattr on fuse mount gives error : Software caused 
connection abort

 1175753 - [readdir-ahead]: indicate EOF for readdirp
 1175752 - [USS]: On a successful lookup, snapd logs are filled with 
Warnings dict OR key (entry-point) is NULL

 1175749 - glusterfs client crashed while migrating the fds
 1179658 - Add brick fails if parent dir of new brick and existing 
brick is same and volume was accessed using libgfapi and smb.
 1146524 - glusterfs.spec.in - synch minor diffs with fedora dist-git 
glusterfs.spec
 1175744 - [USS]: Unable to access .snaps after snapshot restore after 
directories were deleted and recreated
 1175742 - [USS]: browsing .snaps directory with CIFS fails with 
Invalid argument
 1175739 - [USS]: Non root user who has no access to a directory, from 
NFS mount, is able to access the files under .snaps under that directory
 1175758 - [USS] : Rebalance process tries to connect to snapd and in 
case when snapd crashes it might affect rebalance process
 1175765 - USS]: When snapd is crashed gluster volume stop/delete 
operation fails making the cluster in inconsistent state

 1173528 - Change in volume heal info command output
 1166515 - [Tracker] RDMA support in glusterfs
 1166505 - mount fails for nfs protocol in rdma volumes
 1138385 - [DHT:REBALANCE]: Rebalance failures are seen with error 
message  remote operation failed: File exists

 1177418 - entry self-heal in 3.5 and 3.6 are not compatible
 1170954 - Fix mutex problems reported by coverity scan
 1177899 - nfs: ls shows Permission denied with root-squash
 1175738 - [USS]: data unavailability for a period of time when USS is 
enabled/disabled
 1175736 - [USS]:After deactivating a snapshot trying to access the 
remaining activated snapshots from NFS mount gives 'Invalid argument' error

 1175735 - [USS]: snapd process is not killed once the glusterd comes back
 1175733 - [USS]: If the snap name is same as snap-directory than cd to 
virtual snap directory fails
 1175756 - [USS] : Snapd crashed while trying to access the snapshots 
under .snaps directory
 1175755 - SNAPSHOT[USS]:gluster volume set for uss doesnot check any 
boundaries
 1175732 - [SNAPSHOT]: nouuid is appended for every snapshoted brick 
which causes duplication if the original brick has already nouuid
 1175730 - [USS]: creating file/directories under .snaps shows wrong 
error message
 1175754 - [SNAPSHOT]: before the snap is marked to be deleted if the 
node goes down than the snaps are propagated on other nodes and glusterd 
hungs

 1159484 - ls -alR can not heal the disperse volume
 1138897 - NetBSD port
 1175728 - [USS]: All uss related logs are reported under 
/var/log/glusterfs, it makes sense to move it into subfolder

 1170548 - [USS] : don't display the snapshots which are not activated
 1170921 - [SNAPSHOT]: snapshot should be deactivated by default when 
created
 1175694 - [SNAPSHOT]: snapshoted volume is read only but it shows rw 
attributes in mount

 1161885 - Possible file corruption on dispersed volumes
 1170959 - EC_MAX_NODES is defined incorrectly
 1175645 - [USS]: Typo error in the description for USS under gluster 
volume set help

 1171259 - mount.glusterfs does not understand -n option

Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] handling statfs call in USS

2015-01-05 Thread Raghavendra Bhat

On Monday 29 December 2014 01:19 PM, RAGHAVENDRA TALUR wrote:

On Sun, Dec 28, 2014 at 5:03 PM, Vijay Bellur vbel...@redhat.com wrote:

On 12/24/2014 02:30 PM, Raghavendra Bhat wrote:


Hi,

I have a doubt. In user serviceable snapshots as of now statfs call is
not implemented. There are 2 ways how statfs can be handled.

1) Whenever snapview-client xlator gets statfs call on a path that
belongs to snapshot world, it can send the
statfs call to the main volume itself, with the path and the inode being
set to the root of the main volume.

OR

2) It can redirect the call to the snapshot world (the snapshot daemon
which talks to all the snapshots of that particular volume) and send
back the reply that it has obtained.


Each entry in .snaps can be thought of as a specially mounted read-only
filesystem and doing a statfs in such a filesystem should generate
statistics associated with that. So approach 2. seems more appropriate.

I agree with Vijay here. Treating each entry in .snaps as a specially mounted
read-only filesystem will be required to send proper error codes to Samba.


Yeah, makes sense. But one challenge is: if someone does statfs on the .snaps 
directory itself, then what should be done? Because .snaps is a virtual 
directory. I can think of 2 ways:
1) Make the snapview-server xlator return 0s when it receives statfs on 
.snaps, so that the o/p is similar to the one obtained when statfs 
is done on /proc.

OR, if the above o/p is not right,
2) If statfs comes on .snaps itself, then wind the call to the regular 
volume. Anything beyond .snaps will be sent to the snapshot world.
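
A minimal sketch of option 1, assuming it were implemented in 
snapview-server's statfs fop (the function name svs_statfs is illustrative):

        int32_t
        svs_statfs (call_frame_t *frame, xlator_t *this, loc_t *loc,
                    dict_t *xdata)
        {
                struct statvfs buf = {0, };

                /* .snaps itself is virtual: unwind zeroed statistics,
                   similar to what a statfs on /proc reports */
                STACK_UNWIND_STRICT (statfs, frame, 0, 0, &buf, xdata);
                return 0;
        }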


Regards,
Raghavendra Bhat


-Vijay

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel





___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] 3.6.2beta1

2014-12-25 Thread Raghavendra Bhat


Hi,

glusterfs-3.6.2beta1 has been released and the rpms can be found here.


Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] 3.6.2beta1

2014-12-25 Thread Raghavendra Bhat

On Friday 26 December 2014 12:22 PM, Raghavendra Bhat wrote:


Hi,

glusterfs-3.6.2beta1 has been released and the rpms can be found here.


Regards,
Raghavendra Bhat
___
Gluster-users mailing list
gluster-us...@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Oops. Sorry. Missed the link

 http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.6.2beta1/


Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] handling statfs call in USS

2014-12-24 Thread Raghavendra Bhat


Hi,

I have a doubt. In user serviceable snapshots, as of now, the statfs call is 
not implemented. There are 2 ways in which statfs can be handled.


1) Whenever snapview-client xlator gets statfs call on a path that 
belongs to snapshot world, it can send the
statfs call to the main volume itself, with the path and the inode being 
set to the root of the main volume.


OR

2) It can redirect the call to the snapshot world (the snapshot daemon 
which talks to all the snapshots of that particular volume) and send 
back the reply that it has obtained.
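
If approach 2 is chosen, the decision in snapview-client could look roughly 
like this (a sketch only: svc_statfs(), the VIRTUAL_INODE marker and the way 
the two subvolumes are picked are assumptions, not the actual snapview-client 
code):

        #define VIRTUAL_INODE 2  /* hypothetical marker kept in inode ctx */

        int32_t
        svc_statfs (call_frame_t *frame, xlator_t *this, loc_t *loc,
                    dict_t *xdata)
        {
                uint64_t  inode_type = 0;
                xlator_t *subvol     = FIRST_CHILD (this); /* regular volume */

                if (loc->inode &&
                    inode_ctx_get (loc->inode, this, &inode_type) == 0 &&
                    inode_type == VIRTUAL_INODE)
                        subvol = this->children->next->xlator; /* snap daemon */

                STACK_WIND (frame, default_statfs_cbk, subvol,
                            subvol->fops->statfs, loc, xdata);
                return 0;
        }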


Please provide feedback.

Regards,
Raghavendra Bhat

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] explicit lookup of inods linked via readdirp

2014-12-23 Thread Raghavendra Bhat

On Thursday 18 December 2014 12:58 PM, Raghavendra Gowdappa wrote:


- Original Message -

From: Raghavendra Bhat rab...@redhat.com
To: Gluster Devel gluster-devel@gluster.org
Cc: Anand Avati aav...@redhat.com
Sent: Thursday, December 18, 2014 12:31:41 PM
Subject: [Gluster-devel] explicit lookup of inods linked via readdirp


Hi,

In fuse I saw that, as part of resolving an inode, an explicit lookup is
done on it if the inode is found to be linked via readdirp (at the time
of linking in readdirp, fuse sets a flag in the inode context). This is
done because many xlators such as afr depend upon the lookup call for many
things such as healing.

Yes. But the lookup is a nameless lookup and hence is not sufficient. 
Some of the functionalities that get affected AFAIK are:
1. dht cannot create/heal directories and their layouts.
2. afr cannot identify gfid mismatch of a file across its subvolumes, since to 
identify a gfid mismatch we need a name.

From what I heard, afr relies on crawls done by the self-heal daemon for 
named lookups. But dht is the worst hit in terms of maintaining directory structure 
on newly added bricks (this problem is slightly different, since we don't hit 
it because of the nameless lookup after readdirp; instead it is because of the lack 
of a named lookup on the file after a graph switch. Nevertheless I am clubbing 
both together because a named lookup would've solved the issue). I've a feeling that 
different components have built their own way of handling what is essentially the 
same issue. It's better we devise a single comprehensive solution.


But that logic is not there in gfapi. I am thinking of introducing that
mechanism in gfapi as well, where as part of resolve it checks if the
inode is linked from readdirp. And if so it will do an explicit lookup
on that inode.

As you've mentioned, a lookup gives afr a chance to heal the file. So, it's 
needed in gfapi too. However, you have to speak to the afr folks to discuss whether 
a nameless lookup is sufficient.


As per my understanding, this change in gfapi creates the same chances as 
fuse. When I tried with fuse, where I had a file that needed to be 
healed, doing ls and cat on the file actually triggered a self-heal on it. So 
even with gfapi, the change creates the same chances of healing as fuse.


Regards,
Raghavendra Bhat



NOTE: It can be done in NFS server as well.

Dht in an NFS setup is also hit because of the lack of named lookups, resulting in 
non-healing of directories on newly added bricks.


Please provide feedback.

Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] patches for 3.6.2

2014-12-22 Thread Raghavendra Bhat

On Tuesday 23 December 2014 11:09 AM, Atin Mukherjee wrote:

Can you please take in http://review.gluster.org/#/c/9328/ for 3.6.2?

~Atin

On 12/19/2014 02:05 PM, Raghavendra Bhat wrote:

Hi,

glusterfs-3.6.2beta1 has been released. I am planning to make 3.6.2
before end of this year. If there are some patches that has to go in for
3.6.2, please send them by EOD 23-12-2014 (i.e. coming Tuesday) so that
I can make a 3.6.2 release sooner.

As of now, these are the bugs in new or assigned state.
https://bugzilla.redhat.com/buglist.cgi?bug_status=NEWbug_status=ASSIGNEDclassification=Communityf1=blockedlist_id=3106878o1=substringproduct=GlusterFSquery_format=advancedv1=1163723



Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Sure. Will do it.

Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] patches for 3.6.2

2014-12-19 Thread Raghavendra Bhat


Hi,

glusterfs-3.6.2beta1 has been released. I am planning to make 3.6.2 
before end of this year. If there are some patches that has to go in for 
3.6.2, please send them by EOD 23-12-2014 (i.e. coming Tuesday) so that 
I can make a 3.6.2 release sooner.


As of now, these are the bugs in new or assigned state.
https://bugzilla.redhat.com/buglist.cgi?bug_status=NEWbug_status=ASSIGNEDclassification=Communityf1=blockedlist_id=3106878o1=substringproduct=GlusterFSquery_format=advancedv1=1163723


Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] 3.6.1 issue

2014-12-18 Thread Raghavendra Bhat

On Tuesday 16 December 2014 10:59 PM, David F. Robinson wrote:
Gluster 3.6.1 seems to be having an issue creating symbolic links.  To 
reproduce this issue, I downloaded the file 
dakota-6.1-public.src_.tar.gz from

https://dakota.sandia.gov/download.html
# gunzip dakota-6.1-public.src_.tar.gz
# tar -xf dakota-6.1-public.src_.tar
# cd dakota-6.1.0.src/examples/script_interfaces/TankExamples/DakotaList
# ls -al
### Results from my old storage system (non gluster)
corvidpost5:TankExamples/DakotaList ls -al
total 12
drwxr-x--- 2 dfrobins users  112 Dec 16 12:12 ./
drwxr-x--- 6 dfrobins users  117 Dec 16 12:12 ../
lrwxrwxrwx 1 dfrobins users   25 Dec 16 12:12 EvalTank.py -> ../tank_model/EvalTank.py*
lrwxrwxrwx 1 dfrobins users   24 Dec 16 12:12 FEMTank.py -> ../tank_model/FEMTank.py*

-rwx--x--- 1 dfrobins users  734 Nov  7 11:05 RunTank.sh*
-rw--- 1 dfrobins users 1432 Nov  7 11:05 dakota_PandL_list.in
-rw--- 1 dfrobins users 1860 Nov  7 11:05 dakota_Ponly_list.in
### Results from gluster (broken links that have no permissions)
corvidpost5:TankExamples/DakotaList ls -al
total 5
drwxr-x--- 2 dfrobins users  166 Dec 12 08:43 ./
drwxr-x--- 6 dfrobins users  445 Dec 12 08:43 ../
---------- 1 dfrobins users    0 Dec 12 08:43 EvalTank.py
---------- 1 dfrobins users    0 Dec 12 08:43 FEMTank.py
-rwx--x--- 1 dfrobins users  734 Nov  7 11:05 RunTank.sh*
-rw--- 1 dfrobins users 1432 Nov  7 11:05 dakota_PandL_list.in
-rw--- 1 dfrobins users 1860 Nov  7 11:05 dakota_Ponly_list.in
===
David F. Robinson, Ph.D.
President - Corvid Technologies
704.799.6944 x101 [office]
704.252.1310 [cell]
704.799.7974 [fax]
david.robin...@corvidtec.com mailto:david.robin...@corvidtec.com
http://www.corvidtechnologies.com


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Hi David,

Can you please provide the log files? You can find them in 
/var/log/glusterfs.


Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] telldir/seekdir portability fixes

2014-12-17 Thread Raghavendra Bhat

On Wednesday 17 December 2014 02:21 PM, Emmanuel Dreyfus wrote:

Hello

Any chance http://review.gluster.org/9071 gets merged (and
http://review.gluster.org/9084 for release-3.6)? It has been waiting for
review for more than a month now.
I tried to push the above patch, but it failed with a merge conflict. Can 
you please rebase and resend it?


Regards,
Raghavendra Bhat


This is the remaining of a fix that has been partially done in
http://review.gluster.org/8933, and that one has been operating without
a hitch for a while.

Without the fix, self heal breaks on NetBSD if it needs to iterate on a
directory (that is: content is more than 128k). That is a big roadblock.



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] explicit lookup of inods linked via readdirp

2014-12-17 Thread Raghavendra Bhat


Hi,

In fuse I saw that, as part of resolving an inode, an explicit lookup is 
done on it if the inode is found to be linked via readdirp (at the time 
of linking in readdirp, fuse sets a flag in the inode context). This is 
done because many xlators such as afr depend upon the lookup call for many 
things such as healing.


But that logic is not there in gfapi. I am thinking of introducing that 
mechanism in gfapi as well, where as part of resolve it checks whether the 
inode was linked via readdirp, and if so it will do an explicit lookup 
on that inode.
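
A rough sketch of the mechanism (the flag value and the helper names are 
made up; it assumes the usual inode_ctx_set()/inode_ctx_get() helpers from 
libglusterfs):

        #define LINKED_VIA_READDIRP 1

        /* called at the place where readdirp links the inode */
        static void
        mark_readdirp_linked (xlator_t *this, inode_t *inode)
        {
                uint64_t val = LINKED_VIA_READDIRP;

                inode_ctx_set (inode, this, &val);
        }

        /* called from resolve: if true, send one explicit lookup on this
           inode before using it, and clear the flag afterwards */
        static gf_boolean_t
        needs_explicit_lookup (xlator_t *this, inode_t *inode)
        {
                uint64_t val = 0;

                if (inode_ctx_get (inode, this, &val) == 0 &&
                    val == LINKED_VIA_READDIRP)
                        return _gf_true;

                return _gf_false;
        }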


NOTE: It can be done in NFS server as well.

Please provide feedback.

Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] snapshot restore and USS

2014-12-01 Thread Raghavendra Bhat

On Monday 01 December 2014 04:51 PM, Raghavendra G wrote:



On Fri, Nov 28, 2014 at 6:48 PM, RAGHAVENDRA TALUR 
raghavendra.ta...@gmail.com mailto:raghavendra.ta...@gmail.com wrote:


On Thu, Nov 27, 2014 at 2:59 PM, Raghavendra Bhat
rab...@redhat.com mailto:rab...@redhat.com wrote:
 Hi,

 With USS to access snapshots, we depend on last snapshot of the
volume (or
 the latest snapshot) to resolve some issues.
 Ex:
 Say there is a directory called dir within the root of the
volume and USS
 is enabled. Now when .snaps is accessed from dir (i.e.
/dir/.snaps), first
 a lookup is sent on /dir which snapview-client xlator passes
onto the normal
 graph till posix xlator of the brick. Next the lookup comes on
/dir/.snaps.
 snapview-client xlator now redirects this call to the snap
daemon (since
 .snaps is a virtual directory to access the snapshots). The
lookup comes to
 snap daemon with parent gfid set to the gfid of /dir and the
basename
 being set to .snaps. Snap daemon will first try to resolve the
parent gfid
 by trying to find the inode for that gfid. But since that gfid
was not
 looked up before in the snap daemon, it will not be able to find
the inode.
 So now to resolve it, snap daemon depends upon the latest
snapshot. i.e. it
 tries to look up the gfid of /dir in the latest snapshot and if
it can get
 the gfid, then lookup on /dir/.snaps is also successful.

From the user point of view, I would like to be able to enter into the
.snaps anywhere.
To be able to do that, we can turn the dependency upside down, instead
of listing all
snaps in the .snaps dir, lets just show whatever snapshots had
that dir.


Currently readdir in snapview-server is listing _all_ the snapshots. 
However, if you try to do ls on a snapshot which doesn't contain this 
directory (say dir/.snaps/snap3), I think it returns ESTALE/ENOENT. 
So, to get what you've explained above, readdir(p) should filter out 
those snapshots which don't contain this directory (to do that, it 
has to look up dir on each of the snapshots).


Raghavendra Bhat explained the problem and also a possible solution to 
me in person. There are some pieces missing in the problem description 
as explained in the mail (but not in the discussion we had). The 
problem explained here occurs when you restore a snapshot (say snap3) 
where the directory got created but was deleted before the next snapshot. 
So, the directory doesn't exist in snap2 and snap4, but exists only in snap3. 
Now, when you restore snap3, ls on dir/.snaps should show nothing. 
Now, what should the result of lookup (gfid-of-dir, .snaps) be?


1. we can blindly return a virtual inode, assuming at least 
one snapshot contains dir. If fops come on specific snapshots (e.g., 
dir/.snaps/snap4), they'll fail with ENOENT anyway (since dir is not 
present on any snaps).
2. we can choose to return ENOENT if we figure out that dir is not 
present on any snaps.


The problem we are trying to solve here is how to achieve 2. One 
simple solution is to look up gfid-of-dir on all the snapshots, 
and if every lookup fails with ENOENT, we can return ENOENT. The other 
solution is to look up just in the snapshots before and after (if both are 
present, otherwise just in the latest snapshot). If both fail, then we can 
be sure that no snapshot contains that directory.


Rabhat, Correct me if I've missed out anything :).




If a readdir on .snaps entered from a non-root directory has to show the 
list of only those snapshots where the directory (or rather the gfid of the 
directory) is present, then the way to achieve it will be a bit costly.


When readdir comes on .snaps entered from a non-root directory (say ls 
/dir/.snaps), the following operations have to be performed:
1) In an array we have the names of all the snapshots. So, do a nameless 
lookup on the gfid of /dir in all the snapshots.
2) Based on which snapshots have returned success for the above lookup, build 
a new array or list of snapshots.

3) Then send the above new list as the readdir entries.

But the above operation is costlier, because just to serve one readdir 
request we have to make a lookup on each snapshot (if there are 256 
snapshots, then we have to make 256 lookup calls over the network).
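
Roughly, the filtering step boils down to a loop like the one below, using 
the gfapi handle API (the snap_fs array and the function name are made up 
for the sketch; one call per snapshot is exactly the network cost described 
above):

        #include <glusterfs/api/glfs.h>
        #include <glusterfs/api/glfs-handles.h>

        /* A nameless lookup of the gfid of /dir on every snapshot:
           glfs_h_create_from_handle() succeeds only if that snapshot still
           has an object with this gfid. */
        int
        count_snaps_containing_dir (struct glfs **snap_fs, int snap_count,
                                    unsigned char *gfid /* 16 bytes */)
        {
                struct glfs_object *obj = NULL;
                struct stat         st;
                int                 i, present = 0;

                for (i = 0; i < snap_count; i++) {
                        if (!snap_fs[i])
                                continue;   /* snapshot not inited yet */

                        obj = glfs_h_create_from_handle (snap_fs[i], gfid,
                                                         GFAPI_HANDLE_LENGTH,
                                                         &st);
                        if (obj) {
                                present++;
                                glfs_h_close (obj);
                        }
                }

                return present;
        }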


One more thing is resource usage. As of now a snapshot is initialized 
(i.e. via gfapi a connection is established with the corresponding 
snapshot volume, which is equivalent to a mounted volume) only when that 
snapshot is accessed (from a fops point of view, a lookup comes on the 
snapshot entry, say ls /dir/.snaps/snap1). Now, to serve the readdir, all 
the snapshots will be accessed and all the snapshots get initialized. 
This means there can be 256 instances of gfapi connections, with each 
instance having its own inode table and other resources. After the readdir, 
if a snapshot is not accessed, so many resources

[Gluster-devel] snapshot restore and USS

2014-11-27 Thread Raghavendra Bhat

Hi,

With USS to access snapshots, we depend on the last snapshot of the volume 
(or the latest snapshot) to resolve some issues.

Ex:
Say there is a directory called dir within the root of the volume and 
USS is enabled. Now when .snaps is accessed from dir (i.e. 
/dir/.snaps), first a lookup is sent on /dir which snapview-client 
xlator passes onto the normal graph till posix xlator of the brick. Next 
the lookup comes on /dir/.snaps. snapview-client xlator now redirects 
this call to the snap daemon (since .snaps is a virtual directory to 
access the snapshots). The lookup comes to snap daemon with parent gfid 
set to the gfid of /dir and the basename being set to .snaps. Snap 
daemon will first try to resolve the parent gfid by trying to find the 
inode for that gfid. But since that gfid was not looked up before in the 
snap daemon, it will not be able to find the inode. So now to resolve 
it, snap daemon depends upon the latest snapshot. i.e. it tries to look 
up the gfid of /dir in the latest snapshot and if it can get the gfid, 
then lookup on /dir/.snaps is also successful.


But, there can be some confusion in the case of snapshot restore. Say 
there are 5 snapshots (snap1, snap2, snap3, snap4, snap5) for a volume 
vol. Now say the volume is restored to snap3. If there was a directory 
called
/a at the time of taking snap3 and was later removed, then after 
snapshot restore accessing .snaps from that directory (in fact all the 
directories which were present while taking snap3) might cause problems. 
Because now the original volume is nothing but snap3, and when the snap daemon 
gets the lookup on /a/.snaps, it tries to find the gfid of /a 
in the latest snapshot (which is snap5); if /a was removed after 
taking snap3, then the lookup of /a in snap5 fails and thus the lookup 
of /a/.snaps will also fail.



Possible Solution:
One possible solution that can be helpful in this case is: whenever 
glusterd sends the list of snapshots to the snap daemon after a 
snapshot restore, send the list in such a way that the snapshot 
previous to the restored snapshot is sent as the latest snapshot (in the 
example above, since snap3 is restored, glusterd should send snap2 as 
the latest snapshot to the snap daemon).


But the above solution also has a problem. If there are only 2
snapshots (snap1, snap2) and the volume is restored to the first
snapshot (snap1), there is no previous snapshot to look at. Glusterd
will then send only one name in the list, which is snap2, but snap2 is in
a more recent state than the restored volume.


A patch has been submitted for review to handle this
(http://review.gluster.org/#/c/9094/).
Because of the above confusion, in that patch snapd tries to consult
the snapshots adjacent to the restored snapshot to resolve the gfids.
As per the 5 snapshots example, it tries to look at snap2 and snap4
(i.e. look into snap2 first; if that fails, then look into snap4). If there
is no previous snapshot, it looks at the next snapshot (as in the 2 snapshots
example). If there is no next snapshot, it looks at the previous snapshot.


Please provide feedback about how this issue can be handled.

Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] documentation on inode and dentry management

2014-09-23 Thread Raghavendra Bhat


Hi,

I have sent a patch to add the info on how glusterfs manages inodes and 
dentries.

http://review.gluster.org/#/c/8815/

Please review it and provide feedback to improve it.


Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] inode linking in GlusterFS NFS server

2014-07-07 Thread Raghavendra Bhat


Hi,

As per my understanding the nfs server is not doing inode linking in the
readdirp callback. Because of this there might be some errors while
dealing with virtual inodes (or gfids). As of now the meta, gfid-access and
snapview-server (used for user serviceable snapshots) xlators make use
of virtual inodes with random gfids. The situation is this:


Say the user serviceable snapshots feature has been enabled and there are 2
snapshots (snap1 and snap2). Let /mnt/nfs be the nfs mount. Now the
snapshots can be accessed by entering the .snaps directory. If the snap1
directory is entered and *ls -l* is done (i.e. cd
/mnt/nfs/.snaps/snap1 and then ls -l), the readdirp fop is sent to
the snapview-server xlator (which is part of a daemon running for the
volume), which talks to the corresponding snapshot volume and gets the
dentry list. Before unwinding it would have generated random gfids for
those dentries.


Now the nfs server, upon getting the readdirp reply, will associate the gfid
with the filehandle created for the entry. But without linking the inode, it
sends the readdirp reply back to the nfs client. The next time the nfs
client performs some operation on one of those filehandles, the nfs server
tries to resolve it by finding the inode for the gfid present in the
filehandle. But since the inode was not linked in readdirp, the inode_find
operation fails and it tries to do a hard resolution by sending a
lookup operation on that gfid to the normal main graph. (The information
on whether the call should be sent to the main graph or to snapview-server
would be present in the inode context. But here the lookup has come on a
gfid with a newly created inode where the context is not there yet. So
the call would be sent to the main graph itself.) But since the gfid is
a randomly generated virtual gfid (not present on disk), the lookup
operation fails with an error.


As per my understanding this can happen with any xlator that deals with 
virtual inodes (by generating random gfids).


I can think of these 2 methods to handle this:
1)  do inode linking for readdirp also in the nfs server
2)  if a lookup operation fails, the snapview-client xlator (which actually
redirects the fops destined for the snapshot world to snapview-server by
looking into the inode context) should check whether the failed lookup is a
nameless lookup. If so, AND the gfid of the inode is NULL AND the lookup has
come from the main graph, then instead of unwinding the lookup with failure,
send it to snapview-server, which might be able to find the inode for the
gfid (as the gfid was generated by snapview-server itself, it should be able
to find the inode for that gfid unless it has been purged from the inode
table).



Please let me know if I have missed anything. Please provide feedback.

Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] inode linking in GlusterFS NFS server

2014-07-07 Thread Raghavendra Bhat

On Tuesday 08 July 2014 01:21 AM, Anand Avati wrote:
On Mon, Jul 7, 2014 at 12:48 PM, Raghavendra Bhat rab...@redhat.com wrote:



Hi,

As per my understanding nfs server is not doing inode linking in
readdirp callback. Because of this there might be some errors
while dealing with virtual inodes (or gfids). As of now meta,
gfid-access and snapview-server (used for user serviceable
snapshots) xlators makes use of virtual inodes with random gfids.
The situation is this:

Say User serviceable snapshot feature has been enabled and there
are 2 snapshots (snap1 and snap2). Let /mnt/nfs be the nfs
mount. Now the snapshots can be accessed by entering .snaps
directory.  Now if snap1 directory is entered and *ls -l* is done
(i.e. cd /mnt/nfs/.snaps/snap1 and then ls -l),  the readdirp
fop is sent to the snapview-server xlator (which is part of a
daemon running for the volume), which talks to the corresponding
snapshot volume and gets the dentry list. Before unwinding it
would have generated random gfids for those dentries.

Now nfs server upon getting readdirp reply, will associate the
gfid with the filehandle created for the entry. But without
linking the inode, it would send the readdirp reply back to nfs
client. Now next time when nfs client makes some operation on one
of those filehandles, nfs server tries to resolve it by finding
the inode for the gfid present in the filehandle. But since the
inode was not linked in readdirp, inode_find operation fails and
it tries to do a hard resolution by sending the lookup operation
on that gfid to the normal main graph. (The information on whether
the call should be sent to main graph or snapview-server would be
present in the inode context. But here the lookup has come on a
gfid with a newly created inode where the context is not there
yet. So the call would be sent to the main graph itself). But
since the gfid is a randomly generated virtual gfid (not present
on disk), the lookup operation fails giving error.

As per my understanding this can happen with any xlator that deals
with virtual inodes (by generating random gfids).

I can think of these 2 methods to handle this:
1)  do inode linking for readdirp also in nfs server
2)  If lookup operation fails, snapview-client xlator (which
actually redirects the fops on snapshot world to snapview-server
by looking into the inode context) should check if the failed
lookup is a nameless lookup. If so, AND the gfid of the inode is
NULL AND lookup has come from main graph, then instead of
unwinding the lookup with failure, send it to snapview-server
which might be able to find the inode for the gfid (as the gfid
was generated by itself, it should be able to find the inode for
that gfid unless and until it has been purged from the inode table).


Please let me know if I have missed anything. Please provide feedback.



That's right. NFS server should be linking readdirp_cbk inodes just 
like FUSE or protocol/server. It has been OK without virtual gfids 
thus far.


I made the changes to link inodes in readdirp_cbk in the nfs server. It seems
to work fine. Do we need the second change as well? (i.e. the change in the
snapview-client to redirect fresh nameless lookups to
snapview-server). With the nfs server linking the inodes in readdirp, I
think the second change might not be needed.
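
The change is conceptually along these lines; a minimal sketch, assuming a
helper that walks the dirent list in the nfs readdirp callback (the names
here are illustrative, not the actual nfs3 code):

#include "inode.h"
#include "gf-dirent.h"

/* Link each entry's inode with its parent so that a later gfid-based
 * resolution can find it via inode_find(). */
static void
nfs_link_readdirp_entries (inode_t *parent, gf_dirent_t *entries)
{
        gf_dirent_t *entry  = NULL;
        inode_t     *linked = NULL;

        list_for_each_entry (entry, &entries->list, list) {
                if (!entry->inode)
                        continue;

                /* inode_link returns the inode that ended up in the table
                 * (it may differ from entry->inode if one already existed). */
                linked = inode_link (entry->inode, parent, entry->d_name,
                                     &entry->d_stat);
                if (!linked)
                        continue;

                inode_lookup (linked);  /* keep it findable for later resolution */
                inode_unref (linked);
        }
}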


Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] spurious failure (bug-1112559.t)

2014-07-04 Thread Raghavendra Bhat


Hi,

I think the regression test bug-1112559.t is causing some spurious
failures. I see some regression jobs failing due to it.



Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] Need clarification regarding the force option for snapshot delete.

2014-07-01 Thread Raghavendra Bhat

On Friday 27 June 2014 10:47 AM, Raghavendra Talur wrote:

Inline.

- Original Message -
From: Atin Mukherjee amukh...@redhat.com
To: Sachin Pandit span...@redhat.com, Gluster Devel 
gluster-devel@gluster.org, gluster-us...@gluster.org
Sent: Thursday, June 26, 2014 3:30:31 PM
Subject: Re: [Gluster-devel] Need clarification regarding the force option 
for snapshot delete.



On 06/26/2014 01:58 PM, Sachin Pandit wrote:

Hi all,

We had some concern regarding the snapshot delete force option,
That is the reason why we thought of getting advice from everyone out here.

Currently when we give gluster snapshot delete snapname, it gives a
notification saying that the mentioned snapshot will be deleted: Do you
still want to continue (y/n)?
As soon as the user presses y it will delete the snapshot.

Our new proposal is: when a user issues the snapshot delete command without
force, the user should be given a notification saying to use the force option
to delete the snap.

In that case gluster snapshot delete snapname becomes useless apart
from throwing a notification. If we can ensure snapshot delete all works
only with the force option, then we can keep gluster snapshot delete
volname working as it does now.

~Atin

Agree with Atin here; asking the user to execute the same command with force
appended is not right.



When the snapshot delete command is issued with the force option, the user
should be given a notification saying: Mentioned snapshot will be deleted, do
you still want to continue (y/n)?

The reason we thought of bringing this up is that we have planned to introduce
a command gluster snapshot delete all, which deletes all the snapshots in a
system, and gluster snapshot delete volume volname, which deletes all the
snapshots in the mentioned volume. If the user accidentally issues one of the
above commands and presses y, then they might lose some or all of the
snapshots present in the volume/system (thinking it will ask for confirmation
for each delete).

It would be good to have this feature of asking for y for every delete.
When force is used we don't ask for confirmation for each, similar to rm -f.

If that is not feasible as of now, is something like this better?

Case 1 : Single snap
[root@snapshot-24 glusterfs]# gluster snapshot delete snap1
Deleting snap will erase all the information about the snap.
Do you still want to continue? (y/n) y
[root@snapshot-24 glusterfs]#

Case 2: Delete all system snaps
[root@snapshot-24 glusterfs]# gluster snapshot delete all
Deleting N snaps stored on the system
Do you still want to continue? (y/n) y
[root@snapshot-24 glusterfs]#

Case 3: Delete all volume snaps
[root@snapshot-24 glusterfs]# gluster snapshot delete volume volname
Deleting N snaps for the volume volname
Do you still want to continue? (y/n) y
[root@snapshot-24 glusterfs]#

The idea here being that if the warnings for the different commands are
different, then users may pause for a moment to read and check the message.
We can even list the snaps to be deleted, even if we don't ask for
confirmation for each.

Raghavendra Talur


Agree with Raghavendra Talur. It would be better to ask the user without
requiring the force option. The above method suggested by Talur seems neat.


Regards,
Raghavendra Bhat


Do you think the notification would be enough, or do we need to introduce
a force option?

--
Current procedure:
--

[root@snapshot-24 glusterfs]# gluster snapshot delete snap1
Deleting snap will erase all the information about the snap.
Do you still want to continue? (y/n)


Proposed procedure:
---

[root@snapshot-24 glusterfs]# gluster snapshot delete snap1
Please use the force option to delete the snap.

[root@snapshot-24 glusterfs]# gluster snapshot delete snap1 force
Deleting snap will erase all the information about the snap.
Do you still want to continue? (y/n)
--

We look forward to feedback on this.

Thanks,
Sachin Pandit.




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] regarding inode-unref on root inode

2014-06-25 Thread Raghavendra Bhat

On Tuesday 24 June 2014 08:17 PM, Pranith Kumar Karampuri wrote:

Does anyone know why inode_unref is no-op for root inode?

I see the following code in inode.c

static inode_t *
__inode_unref (inode_t *inode)
{
        if (!inode)
                return NULL;

        if (__is_root_gfid (inode->gfid))
                return inode;
        ...
}


I think it is done with the intention that the root inode should *never*
get removed from the active inodes list (not even accidentally). So an
unref on the root inode is a no-op. I don't know whether there are any other
reasons.


Regards,
Raghavendra Bhat



Pranith


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] glupy test failing

2014-06-20 Thread Raghavendra Bhat


Hi,

I am seeing the glupy.t test fail in some test cases. It is failing
on my local machine as well (with latest master). Is it a genuine
failure or a spurious one?


/tests/features/glupy.t (Wstat: 0 Tests: 6 Failed: 2)

  Failed tests:  2, 6

As per the logfile of the fuse mount done in the testcase this is the 
error:



[2014-06-20 14:15:53.038826] I [MSGID: 100030] [glusterfsd.c:1998:main] 
0-glusterfs: Started running glusterfs version 3.5qa2 (args: glusterfs 
-f /d/backends/glupytest.vol /mnt/glusterfs/0)
[2014-06-20 14:15:53.059484] E [glupy.c:2382:init] 0-vol-glupy: Python 
import failed
[2014-06-20 14:15:53.059575] E [xlator.c:425:xlator_init] 0-vol-glupy: 
Initialization of volume 'vol-glupy' failed, review your volfile again
[2014-06-20 14:15:53.059587] E [graph.c:322:glusterfs_graph_init] 
0-vol-glupy: initializing translator failed
[2014-06-20 14:15:53.059595] E [graph.c:525:glusterfs_graph_activate] 
0-graph: init failed
[2014-06-20 14:15:53.060045] W [glusterfsd.c:1182:cleanup_and_exit] (-- 
0-: received signum (0), shutting down
[2014-06-20 14:15:53.060090] I [fuse-bridge.c:5561:fini] 0-fuse: 
Unmounting '/mnt/glusterfs/0'.
[2014-06-20 14:19:01.867378] I [MSGID: 100030] [glusterfsd.c:1998:main] 
0-glusterfs: Started running glusterfs version 3.5qa2 (args: glusterfs 
-f /d/backends/glupytest.vol /mnt/glusterfs/0)
[2014-06-20 14:19:01.897158] E [glupy.c:2382:init] 0-vol-glupy: Python 
import failed
[2014-06-20 14:19:01.897241] E [xlator.c:425:xlator_init] 0-vol-glupy: 
Initialization of volume 'vol-glupy' failed, review your volfile again
[2014-06-20 14:19:01.897252] E [graph.c:322:glusterfs_graph_init] 
0-vol-glupy: initializing translator failed
[2014-06-20 14:19:01.897260] E [graph.c:525:glusterfs_graph_activate] 
0-graph: init failed
[2014-06-20 14:19:01.897635] W [glusterfsd.c:1182:cleanup_and_exit] (-- 
0-: received signum (0), shutting down
[2014-06-20 14:19:01.897677] I [fuse-bridge.c:5561:fini] 0-fuse: 
Unmounting '/mnt/glusterfs/0'.



Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] autodelete in snapshots

2014-06-04 Thread Raghavendra Bhat

On Wednesday 04 June 2014 11:23 AM, Rajesh Joseph wrote:


- Original Message -

From: M S Vishwanath Bhat msvb...@gmail.com
To: Rajesh Joseph rjos...@redhat.com
Cc: Vijay Bellur vbel...@redhat.com, Seema Naik sen...@redhat.com, Gluster 
Devel
gluster-devel@gluster.org
Sent: Tuesday, June 3, 2014 5:55:27 PM
Subject: Re: [Gluster-devel] autodelete in snapshots

On 3 June 2014 15:21, Rajesh Joseph rjos...@redhat.com wrote:



- Original Message -
From: M S Vishwanath Bhat msvb...@gmail.com
To: Vijay Bellur vbel...@redhat.com
Cc: Seema Naik sen...@redhat.com, Gluster Devel 
gluster-devel@gluster.org
Sent: Tuesday, June 3, 2014 1:02:08 AM
Subject: Re: [Gluster-devel] autodelete in snapshots




On 2 June 2014 20:22, Vijay Bellur  vbel...@redhat.com  wrote:



On 04/23/2014 05:50 AM, Vijay Bellur wrote:


On 04/20/2014 11:42 PM, Lalatendu Mohanty wrote:


On 04/16/2014 11:39 AM, Avra Sengupta wrote:


The whole purpose of introducing the soft-limit is, that at any point
of time the number of
snaps should not exceed the hard limit. If we trigger auto-delete on
hitting hard-limit, then
the purpose itself is lost, because at that point we would be taking a
snap, making the limit
hard-limit + 1, and then triggering auto-delete, which violates the
sanctity of the hard-limit.
Also what happens when we are at hard-limit + 1, and another snap is
issued, while auto-delete
is yet to process the first delete. At that point we end up at
hard-limit + 1. Also what happens
if for a particular snap the auto-delete fails.

We should see the hard-limit, as something set by the admin keeping in
mind the resource consumption
and at no-point should we cross this limit, come what may. If we hit
this limit, the create command
should fail asking the user to delete snaps using the snapshot
delete command.

The two options Raghavendra mentioned are applicable for the
soft-limit only, in which cases on
hitting the soft-limit

1. Trigger auto-delete

or

2. Log a warning-message, for the user saying the number of snaps is
exceeding the snap-limit and
display the number of available snaps

Now which of these should happen also depends on the user, because the
auto-delete option
is configurable.

So if the auto-delete option is set as true, auto-delete should be
triggered and the above message
should also be logged.

But if the option is set as false, only the message should be logged.

This is the behaviour as designed. Adding Rahul, and Seema in the
mail, to reflect upon the
behaviour as well.

Regards,
Avra

This sounds correct. However we need to make sure that the usage or
documentation around this should be good enough , so that users
understand the each of the limits correctly.


It might be better to avoid the usage of the term soft-limit.
soft-limit as used in quota and other places generally has an alerting
connotation. Something like auto-deletion-limit might be better.


I still see references to soft-limit and auto deletion seems to get
triggered upon reaching soft-limit.

Why is the ability to auto delete not configurable? It does seem pretty
nasty to go about deleting snapshots without obtaining explicit consent
from the user.

I agree with Vijay here. It's not good to delete a snap (even though it is the
oldest) without explicit consent from the user.

FYI, it took me more than 2 weeks to figure out that my snaps were getting
autodeleted after reaching the soft-limit. For all I knew I had not done
anything, yet my snap restores were failing.

I propose to remove the terms soft and hard limit. I believe there
should be a limit (just limit) after which all snapshot creates should
fail with proper error messages. And there can be a water-mark after which
the user should get warning messages. So below is my proposal.

auto-delete + snap-limit: If the snap-limit is set to n, the next snap create
(the (n+1)th) will succeed only if auto-delete is set to on/true/1, and the
oldest snap will get deleted automatically. If auto-delete is set to
off/false/0, the (n+1)th snap create will fail with a proper error message
from the gluster CLI command. But again, by default auto-delete should be off.

snap-water-mark: This should come into the picture only if auto-delete is
turned off. It should not have any meaning if auto-delete is turned ON.
Basically its usage is to warn the user that the limit is almost being reached
and it is time for the admin to decide which snaps should be deleted (or which
should be kept).

*my two cents*

-MS


The reason for having a hard-limit is to stop snapshot creation once we
reach this limit. This helps to have control over resource
consumption. Therefore if we only have this one limit (as snap-limit) then
there is no question of auto-delete. Auto-delete can only be triggered once
the count crosses the limit. Therefore we introduced the concept of a
soft-limit and a hard-limit. As the name suggests, once the hard-limit is
reached no more snaps will be created.
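
To summarize the semantics being discussed, a hedged sketch of the
create-time decision (names and structure here are illustrative, not
glusterd code):

typedef struct {
        int snap_count;    /* existing snapshots of the volume */
        int soft_limit;
        int hard_limit;
        int auto_delete;   /* 0 = off (default), 1 = on */
} snap_policy_t;

/* Returns 0 if the create may proceed, -1 if it must be rejected. */
static int
snap_create_precheck (snap_policy_t *p, int *delete_oldest, int *warn)
{
        *delete_oldest = 0;
        *warn          = 0;

        if (p->snap_count >= p->hard_limit)
                return -1;                   /* never cross the hard-limit */

        if (p->snap_count >= p->soft_limit) {
                if (p->auto_delete)
                        *delete_oldest = 1;  /* reclaim the oldest snapshot */
                else
                        *warn = 1;           /* only log a warning */
        }

        return 0;
}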


Perhaps I could have been clearer. The auto-delete value does come into

[Gluster-devel] inode lru limit

2014-05-30 Thread Raghavendra Bhat


Hi,

Currently the lru-limit of the inode table in brick processes is 16384.
There is an option to configure it to some other value. protocol/server
uses the inode_lru_limit variable present in its private structure while
creating the inode table (whose default value is 16384).
When the option is reconfigured via a volume set operation, the
protocol/server's inode_lru_limit variable present in its private
structure is changed. But the actual lru limit of the inode table still
remains the same as before. Only when the brick is restarted does the newly
set value come into effect. Is that ok? Should we also change the inode
table's lru_limit variable as part of reconfigure? If so, then we would
probably have to remove the extra inodes present in the lru list by calling
inode_table_prune.
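
To illustrate the idea, a minimal sketch of what reconfigure could do,
assuming inode_table_prune() were callable from protocol/server (it is
currently internal to inode.c, so exposing it would be part of the change).
The function name below is hypothetical, and the inode table passed in is
the one created for the bound volume:

#include <pthread.h>
#include "inode.h"
#include "server.h"

static void
server_apply_lru_limit (server_conf_t *conf, inode_table_t *itable,
                        uint32_t new_limit)
{
        conf->inode_lru_limit = new_limit;

        if (!itable)
                return;

        pthread_mutex_lock (&itable->lock);
        {
                itable->lru_limit = new_limit;
        }
        pthread_mutex_unlock (&itable->lock);

        /* Evict the inodes beyond the new limit from the lru list. */
        inode_table_prune (itable);
}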


Please provide feedback


Regards,
Raghavendra Bhat
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel