Re: [Gluster-devel] Spurious failures because of nfs and snapshots

2014-05-22 Thread Vijay Bellur

On 05/21/2014 08:50 PM, Vijaikumar M wrote:

KP, Atin and I did some debugging and found a deadlock in glusterd.

When creating a volume snapshot, the back-end operations (taking an LVM
snapshot and starting the brick) are executed in parallel for each brick
using the synctask framework.

brick_start releases the big_lock around brick_connect and then acquires it
again. Under a race this leads to a deadlock: the main thread waits for one
of the synctask threads to finish (while still holding the big_lock), and
that synctask thread waits for the big_lock.


We are working on fixing this issue.



If this fix is going to take more time, can we please log a bug to track 
this problem and remove the affected test cases from the test unit? That way 
other valid patches will not be blocked by failures of the snapshot test unit.


We can introduce these tests again as part of the fix for the problem.

-Vijay

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Regression testing results for master branch

2014-05-22 Thread Kaushal M
It should be possible. I'll check and do the change.

~kaushal


On Thu, May 22, 2014 at 8:14 AM, Pranith Kumar Karampuri 
pkara...@redhat.com wrote:



 - Original Message -
  From: Pranith Kumar Karampuri pkara...@redhat.com
  To: Justin Clift jus...@gluster.org
  Cc: Gluster Devel gluster-devel@gluster.org
  Sent: Thursday, May 22, 2014 6:23:16 AM
  Subject: Re: [Gluster-devel] Regression testing results for master branch
 
 
 
  - Original Message -
   From: Justin Clift jus...@gluster.org
   To: Pranith Kumar Karampuri pkara...@redhat.com
   Cc: Gluster Devel gluster-devel@gluster.org
   Sent: Wednesday, May 21, 2014 11:01:36 PM
   Subject: Re: [Gluster-devel] Regression testing results for master
 branch
  
   On 21/05/2014, at 6:17 PM, Justin Clift wrote:
Hi all,
   
Kicked off 21 VM's in Rackspace earlier today, running the regression
tests
against master branch.
   
Only 3 VM's failed out of the 21 (86% PASS, 14% FAIL), with all three
being
for the same test:
   
Test Summary Report
---
./tests/bugs/bug-948686.t   (Wstat: 0 Tests: 20
Failed:
2)
 Failed tests:  13-14
Files=230, Tests=4373, 5601 wallclock secs ( 2.09 usr  1.58 sys +
 1012.66
cusr 688.80 csys = 1705.13 CPU)
Result: FAIL
  
  
   Interestingly, this one looks like a simple time based thing
   too.  The failed tests are the ones after the sleep:
  
 ...
 #modify volume config to see change in volume-sync
 TEST $CLI_1 volume set $V0 write-behind off
 #add some files to the volume to see effect of volume-heal cmd
 TEST touch $M0/{1..100};
 TEST $CLI_1 volume stop $V0;
 TEST $glusterd_3;
 sleep 3;
 TEST $CLI_3 volume start $V0;
 TEST $CLI_2 volume stop $V0;
 TEST $CLI_2 volume delete $V0;
  
   Do you already have this one on your radar?
 
  It wasn't, thanks for bringing it on my radar :-). Sent
  http://review.gluster.org/7837 to address this.
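  For illustration, a change of that kind could replace the fixed sleep with a
  bounded wait for the restarted glusterd to rejoin the cluster. The helper and
  timeout below are only a sketch and are not the actual contents of review 7837:

      # Illustrative sketch; not the actual change in review 7837.
      # Poll (for up to 20 seconds) until node 1 again sees two connected
      # peers, instead of sleeping for a fixed 3 seconds.
      function connected_peer_count {
          $CLI_1 peer status | grep -c 'Peer in Cluster (Connected)'
      }

      TEST $glusterd_3;
      EXPECT_WITHIN 20 "2" connected_peer_count;
      TEST $CLI_3 volume start $V0;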

 Kaushal,
 I made this fix on the assumption that the script is waiting for all
 glusterds to be online. I could not check the logs because glusterds spawned
 by cluster.rc do not seem to store their logs in the default location. Do you
 think we can change the script so that we also collect the logs from glusterds
 spawned by cluster.rc?

 Pranith

 
  Pranith
 
  
   + Justin
  
   --
   Open Source and Standards @ Red Hat
  
   twitter.com/realjustinclift
  
  
  ___
  Gluster-devel mailing list
  Gluster-devel@gluster.org
  http://supercolony.gluster.org/mailman/listinfo/gluster-devel
 
 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Regression testing results for master branch

2014-05-22 Thread Kaushal M
The glusterds spawned using cluster.rc store their logs at
/d/backends/N/glusterd.log. But the cleanup() function cleans
/d/backends/, so those logs are lost before we can archive them.

cluster.rc should be fixed to use a better location for the logs.

~kaushal
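
One possible shape for that fix (a sketch only, assuming glusterd's usual
--log-file option; cluster.rc's actual option handling may differ) is to start
each virtual node's glusterd with an explicit log file outside /d/backends so
that cleanup() no longer wipes it:

    # Sketch: start the N-th glusterd with its log outside /d/backends.
    # $N and other_glusterd_opts are illustrative placeholders for whatever
    # cluster.rc already computes and passes.
    logdir=${LOGDIR:-/var/log/glusterfs}
    mkdir -p "$logdir"
    glusterd --log-file="$logdir/glusterd-backend-$N.log" "${other_glusterd_opts[@]}"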


On Thu, May 22, 2014 at 11:45 AM, Kaushal M kshlms...@gmail.com wrote:

 It should be possible. I'll check and do the change.

 ~kaushal


 On Thu, May 22, 2014 at 8:14 AM, Pranith Kumar Karampuri 
 pkara...@redhat.com wrote:



 - Original Message -
  From: Pranith Kumar Karampuri pkara...@redhat.com
  To: Justin Clift jus...@gluster.org
  Cc: Gluster Devel gluster-devel@gluster.org
  Sent: Thursday, May 22, 2014 6:23:16 AM
  Subject: Re: [Gluster-devel] Regression testing results for master
 branch
 
 
 
  - Original Message -
   From: Justin Clift jus...@gluster.org
   To: Pranith Kumar Karampuri pkara...@redhat.com
   Cc: Gluster Devel gluster-devel@gluster.org
   Sent: Wednesday, May 21, 2014 11:01:36 PM
   Subject: Re: [Gluster-devel] Regression testing results for master
 branch
  
   On 21/05/2014, at 6:17 PM, Justin Clift wrote:
Hi all,
   
Kicked off 21 VM's in Rackspace earlier today, running the
 regression
tests
against master branch.
   
Only 3 VM's failed out of the 21 (86% PASS, 14% FAIL), with all
 three
being
for the same test:
   
Test Summary Report
---
./tests/bugs/bug-948686.t   (Wstat: 0 Tests: 20
Failed:
2)
 Failed tests:  13-14
Files=230, Tests=4373, 5601 wallclock secs ( 2.09 usr  1.58 sys +
 1012.66
cusr 688.80 csys = 1705.13 CPU)
Result: FAIL
  
  
   Interestingly, this one looks like a simple time based thing
   too.  The failed tests are the ones after the sleep:
  
 ...
 #modify volume config to see change in volume-sync
 TEST $CLI_1 volume set $V0 write-behind off
 #add some files to the volume to see effect of volume-heal cmd
 TEST touch $M0/{1..100};
 TEST $CLI_1 volume stop $V0;
 TEST $glusterd_3;
 sleep 3;
 TEST $CLI_3 volume start $V0;
 TEST $CLI_2 volume stop $V0;
 TEST $CLI_2 volume delete $V0;
  
   Do you already have this one on your radar?
 
  It wasn't, thanks for bringing it on my radar :-). Sent
  http://review.gluster.org/7837 to address this.

 Kaushal,
 I made this fix on the assumption that the script is waiting for all
 glusterds to be online. I could not check the logs because glusterds spawned
 by cluster.rc do not seem to store their logs in the default location. Do you
 think we can change the script so that we also collect the logs from glusterds
 spawned by cluster.rc?

 Pranith

 
  Pranith
 
  
   + Justin
  
   --
   Open Source and Standards @ Red Hat
  
   twitter.com/realjustinclift
  
  
  ___
  Gluster-devel mailing list
  Gluster-devel@gluster.org
  http://supercolony.gluster.org/mailman/listinfo/gluster-devel
 
 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Regression testing results for master branch

2014-05-22 Thread Justin Clift
somepath/glusterd-backend%N.log maybe?

On 22/05/2014, at 8:03 AM, Kaushal M wrote:
 The glusterds spawned using cluster.rc store their logs at
 /d/backends/N/glusterd.log. But the cleanup() function cleans
 /d/backends/, so those logs are lost before we can archive them.
 
 cluster.rc should be fixed to use a better location for the logs.
 
 ~kaushal

--
Open Source and Standards @ Red Hat

twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] bug-857330/normal.t failure

2014-05-22 Thread Pranith Kumar Karampuri
Kaushal,
   The rebalance status command seems to be failing sometimes. I sent a mail about 
one such spurious failure earlier today. Did you get a chance to look at the logs 
and confirm that rebalance didn't fail and that it is indeed a timeout?

Pranith
- Original Message -
 From: Kaushal M kshlms...@gmail.com
 To: Pranith Kumar Karampuri pkara...@redhat.com
 Cc: Justin Clift jus...@gluster.org, Gluster Devel 
 gluster-devel@gluster.org
 Sent: Thursday, May 22, 2014 4:40:25 PM
 Subject: Re: [Gluster-devel] bug-857330/normal.t failure
 
 The test is waiting for rebalance to finish. This is a rebalance with some
 actual data so it could have taken a long time to finish. I did set a
 pretty high timeout, but it seems like it's not enough for the new VMs.
 
 Possible options are,
 - Increase this timeout further
 - Reduce the amount of data. Currently this is 100 directories with 10
 files each of size between 10-500KB
 
 ~kaushal
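
For a sense of scale, here is a rough sketch of generating a workload like the
one described above (100 directories with 10 files each, sized 10-500KB); this
only approximates what the test does:

    # Approximate sketch of the data set described above (not the test's code).
    for d in $(seq 1 100); do
        mkdir -p $M0/dir$d
        for f in $(seq 1 10); do
            size_kb=$(( (RANDOM % 491) + 10 ))   # 10KB .. 500KB
            dd if=/dev/urandom of=$M0/dir$d/file$f bs=1k count=$size_kb 2>/dev/null
        done
    done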
 
 
 On Thu, May 22, 2014 at 3:59 PM, Pranith Kumar Karampuri 
 pkara...@redhat.com wrote:
 
  Kaushal has more context about these CCed. Keep the setup until he
  responds so that he can take a look.
 
  Pranith
  - Original Message -
   From: Justin Clift jus...@gluster.org
   To: Pranith Kumar Karampuri pkara...@redhat.com
   Cc: Gluster Devel gluster-devel@gluster.org
   Sent: Thursday, May 22, 2014 3:54:46 PM
   Subject: bug-857330/normal.t failure
  
   Hi Pranith,
  
   Ran a few VM's with your Gerrit CR 7835 applied, and in DEBUG
   mode (I think).
  
   One of the VM's had a failure in bug-857330/normal.t:
  
 Test Summary Report
 ---
 ./tests/basic/rpm.t (Wstat: 0 Tests: 0
  Failed:
 0)
   Parse errors: Bad plan.  You planned 8 tests but ran 0.
 ./tests/bugs/bug-857330/normal.t(Wstat: 0 Tests: 24
  Failed:
 1)
   Failed test:  13
 Files=230, Tests=4369, 5407 wallclock secs ( 2.13 usr  1.73 sys +
  941.82
 cusr 645.54 csys = 1591.22 CPU)
 Result: FAIL
  
   Seems to be this test:
  
 COMMAND=volume rebalance $V0 status
 PATTERN=completed
 EXPECT_WITHIN 300 $PATTERN get-task-status
  
   Is this one on your radar already?
  
   Btw, this VM is still online.  Can give you access to retrieve logs
   if useful.
  
   + Justin
  
   --
   Open Source and Standards @ Red Hat
  
   twitter.com/realjustinclift
  
  
  ___
  Gluster-devel mailing list
  Gluster-devel@gluster.org
  http://supercolony.gluster.org/mailman/listinfo/gluster-devel
 
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] Guidelines for Maintainers

2014-05-22 Thread Vijay Bellur

[Adding the right alias for gluster-devel this time around]

On 05/22/2014 05:29 PM, Vijay Bellur wrote:

Hi All,

Given the addition of new sub-maintainers & release maintainers to the
community [1], I felt the need to publish a set of guidelines so that all
categories of maintainers have an unambiguous operational state. A first cut
of one such document can be found at [2]. I would love to hear your thoughts
and feedback to make the proposal clear to everybody. We can convert this
draft into a real set of guidelines once there is consensus.

Cheers,
Vijay

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6249

[2]
http://www.gluster.org/community/documentation/index.php/Guidelines_For_Maintainers

___
Gluster-users mailing list
gluster-us...@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] bug-857330/normal.t failure

2014-05-22 Thread Kaushal M
Thanks Justin, I found the problem. The VM can be deleted now.

Turns out, there was more than enough time for the rebalance to complete.
But we hit a race, which caused a command to fail.

The particular test that failed is waiting for rebalance to finish. It does
this by running a 'gluster volume rebalance  status' command and checking
the result. The EXPECT_WITHIN function runs this command until we have a
match, the command fails, or the timeout happens.

For a rebalance status command, glusterd sends a request to the rebalance
process (as a brick_op) to get the latest stats. It had done the same in
this case as well. But while glusterd was waiting for the reply, the
rebalance completed and the process stopped itself. This closed the rpc
connection between glusterd and the rebalance process, so all pending
requests were unwound as failures, which in turn led to the command failing.

I cannot think of a way to avoid this race from within glusterd. For this
particular test, we could avoid using the 'rebalance status' command if we
directly checked the rebalance process state using its pid etc. I don't
particularly approve of this approach, as I think I used the 'rebalance
status' command for a reason. But I currently cannot recall the reason, and
if I cannot come up with it soon, I wouldn't mind changing the test to avoid
'rebalance status'.
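
A sketch of that pid-based alternative (the pid-file location and helper below
are assumptions for illustration, not the actual test code):

    # Illustrative only: wait for the rebalance daemon to exit instead of
    # polling 'rebalance status'. The pid-file layout is an assumption.
    workdir=${GLUSTERD_WORKDIR:-/var/lib/glusterd}
    function rebalance_process_running {
        pidfile=$(ls $workdir/vols/$V0/rebalance/*.pid 2>/dev/null | head -1)
        if [ -n "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
            echo "Y"
        else
            echo "N"
        fi
    }

    EXPECT_WITHIN 300 "N" rebalance_process_running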

~kaushal



On Thu, May 22, 2014 at 5:22 PM, Justin Clift jus...@gluster.org wrote:

 On 22/05/2014, at 12:32 PM, Kaushal M wrote:
  I haven't yet. But I will.
 
  Justin,
  Can I get take a peek inside the vm?

 Sure.

   IP: 23.253.57.20
   User: root
   Password: foobar123

 The stdout log from the regression test is in /tmp/regression.log.

 The GlusterFS git repo is in /root/glusterfs.  Um, you should be
 able to find everything else pretty easily.

 Btw, this is just a temp VM, so feel free to do anything you want
 with it.  When you're finished with it let me know so I can delete
 it. :)

 + Justin


  ~kaushal
 
 
  On Thu, May 22, 2014 at 4:53 PM, Pranith Kumar Karampuri 
 pkara...@redhat.com wrote:
  Kaushal,
 Rebalance status command seems to be failing sometimes. I sent a mail
 about such spurious failure earlier today. Did you get a chance to look at
 the logs and confirm that rebalance didn't fail and it is indeed a timeout?
 
  Pranith
  - Original Message -
   From: Kaushal M kshlms...@gmail.com
   To: Pranith Kumar Karampuri pkara...@redhat.com
   Cc: Justin Clift jus...@gluster.org, Gluster Devel 
 gluster-devel@gluster.org
   Sent: Thursday, May 22, 2014 4:40:25 PM
   Subject: Re: [Gluster-devel] bug-857330/normal.t failure
  
   The test is waiting for rebalance to finish. This is a rebalance with
 some
   actual data so it could have taken a long time to finish. I did set a
   pretty high timeout, but it seems like it's not enough for the new VMs.
  
   Possible options are,
   - Increase this timeout further
   - Reduce the amount of data. Currently this is 100 directories with 10
   files each of size between 10-500KB
  
   ~kaushal
  
  
   On Thu, May 22, 2014 at 3:59 PM, Pranith Kumar Karampuri 
   pkara...@redhat.com wrote:
  
Kaushal has more context about these CCed. Keep the setup until he
responds so that he can take a look.
   
Pranith
- Original Message -
 From: Justin Clift jus...@gluster.org
 To: Pranith Kumar Karampuri pkara...@redhat.com
 Cc: Gluster Devel gluster-devel@gluster.org
 Sent: Thursday, May 22, 2014 3:54:46 PM
 Subject: bug-857330/normal.t failure

 Hi Pranith,

 Ran a few VM's with your Gerrit CR 7835 applied, and in DEBUG
 mode (I think).

 One of the VM's had a failure in bug-857330/normal.t:

   Test Summary Report
   ---
   ./tests/basic/rpm.t (Wstat: 0 Tests:
 0
Failed:
   0)
 Parse errors: Bad plan.  You planned 8 tests but ran 0.
   ./tests/bugs/bug-857330/normal.t(Wstat: 0 Tests:
 24
Failed:
   1)
 Failed test:  13
   Files=230, Tests=4369, 5407 wallclock secs ( 2.13 usr  1.73 sys +
941.82
   cusr 645.54 csys = 1591.22 CPU)
   Result: FAIL

 Seems to be this test:

   COMMAND=volume rebalance $V0 status
   PATTERN=completed
   EXPECT_WITHIN 300 $PATTERN get-task-status

 Is this one on your radar already?

 Btw, this VM is still online.  Can give you access to retrieve logs
 if useful.

 + Justin

 --
 Open Source and Standards @ Red Hat

 twitter.com/realjustinclift


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
   
  
 

 --
 Open Source and Standards @ Red Hat

 twitter.com/realjustinclift


___
Gluster-devel mailing list

Re: [Gluster-devel] Changes needing review before a glusterfs-3.5.1 Beta is released

2014-05-22 Thread Niels de Vos
On Wed, May 21, 2014 at 06:40:57PM +0200, Niels de Vos wrote:
 A lot of work has been done on getting blockers resolved for the next 
 3.5 release. We're not there yet, but we're definitely getting close to
 releasing a 1st beta.
 
 Humble will follow-up with an email related to the documentation that is 
 still missing for features introduced with 3.5. We will not hold back on
 the Beta if the documentation is STILL incomplete, but it is seen as a
 major blocker for the final 3.5.1 release.
 
 The following list is based on the bugs that have been requested as 
 blockers¹:
 
 * 1089054 gf-error-codes.h is missing from source tarball
    Depends on 1038391 for getting the changes reviewed and included in 
    the master branch first:
   - http://review.gluster.org/7714
   - http://review.gluster.org/7786

These have been reviewed and merged in the master branch. Backports have 
been posted for review:
- http://review.gluster.org/7850
- http://review.gluster.org/7851

 * 1096425 i/o error when one user tries to access RHS volume over NFS
   with 100+
   Patches for 3.5 posted for review:
   - http://review.gluster.org/7829
   - http://review.gluster.org/7830

Review of the backport http://review.gluster.org/7830 is still pending.

 * 1099878 Need support for handle based Ops to fetch/modify extended
   attributes of a file
   Patch for 3.5 posted for review:
   - http://review.gluster.org/7825

Got reviewed and merged!

New addition, confirmed yesterday:

* 1081016 glusterd needs xfsprogs and e2fsprogs packages
  (don't leave zombies if required programs aren't installed)
  Needs review+merging in master: http://review.gluster.org/7361
  After approval for master, a backport for release-3.5 can be sent.

Thanks,
Niels
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster driver for Archipelago - Development process

2014-05-22 Thread Vijay Bellur

On 05/22/2014 02:10 AM, Alex Pyrgiotis wrote:

On 02/17/2014 06:22 PM, Vijay Bellur wrote:

On 02/17/2014 05:11 PM, Alex Pyrgiotis wrote:

On 02/10/2014 07:06 PM, Vijay Bellur wrote:

On 02/05/2014 04:10 PM, Alex Pyrgiotis wrote:

Hi all,

Just wondering, do we have any news on that?


Hi Alex,

I have started some work on this. Progress has been rather slow owing to
the 3.5 release cycle, amongst other things. I intend to propose this as a
feature for 3.6 and will keep you posted once we have something more to get
you going.



Hi Vijay,

That sounds good. I suppose that if it gets included in 3.6, we will see
it in this page [1], right?



Hi Alex,

Yes, that is correct.

Thanks,
Vijay



Hi Vijay,

On the planning page for 3.6 [1], I see that Archipelago is included
(great!) and that the feature freeze was due on the 21st of May. So, do we
have any news on which features will be included in 3.6, as well as more
info about the Archipelago integration?



Yes, the gfapi and related changes needed by Archipelago are planned for 
inclusion in 3.6.


The feature freeze was moved by a month after a discussion in 
yesterday's GlusterFS community meeting. I will ping you back once we 
have something tangible to get started with integration testing.


Cheers,
Vijay

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr

2014-05-22 Thread Harshavardhana
Here are the important locations in the XFS tree coming from 2.6.32 branch

STATIC int
xfs_set_acl(struct inode *inode, int type, struct posix_acl *acl)
{
        struct xfs_inode *ip = XFS_I(inode);
        unsigned char *ea_name;
        int error;

        if (S_ISLNK(inode->i_mode))    <-- I would generally think this is the issue.
                return -EOPNOTSUPP;

STATIC long
xfs_vn_fallocate(
        struct inode    *inode,
        int             mode,
        loff_t          offset,
        loff_t          len)
{
        long            error;
        loff_t          new_size = 0;
        xfs_flock64_t   bf;
        xfs_inode_t     *ip = XFS_I(inode);
        int             cmd = XFS_IOC_RESVSP;
        int             attr_flags = XFS_ATTR_NOLOCK;

        if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
                return -EOPNOTSUPP;

STATIC int
xfs_ioc_setxflags(
        xfs_inode_t     *ip,
        struct file     *filp,
        void            __user *arg)
{
        struct fsxattr  fa;
        unsigned int    flags;
        unsigned int    mask;
        int             error;

        if (copy_from_user(&flags, arg, sizeof(flags)))
                return -EFAULT;

        if (flags & ~(FS_IMMUTABLE_FL | FS_APPEND_FL | \
                      FS_NOATIME_FL | FS_NODUMP_FL | \
                      FS_SYNC_FL))
                return -EOPNOTSUPP;

Perhaps some sort of system-level ACLs are being propagated by us over
symlinks? Perhaps this is related to the same issue of following symlinks?
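
If that hypothesis holds, a repro along these lines should show it from a
shell on an XFS mount (a hedged sketch: the brick path is illustrative and the
exact errno depends on the kernel). Copy a valid POSIX-ACL xattr value from a
regular file and set it directly on a symlink without following it:

    cd /bricks/xfs0                       # illustrative XFS mount point
    touch file && ln -s file link
    setfacl -m u:nobody:r file            # gives 'file' a real ACL xattr
    aclval=$(getfattr -n system.posix_acl_access -e hex file | grep '=0x' | cut -d= -f2)
    setfattr -h -n system.posix_acl_access -v "$aclval" link
    # Per the S_ISLNK() check above, this last call would be expected to
    # fail with 'Operation not supported' (EOPNOTSUPP).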

On Sun, May 18, 2014 at 10:48 AM, Pranith Kumar Karampuri
pkara...@redhat.com wrote:
 Sent the following patch to remove the special treatment of ENOTSUP here: 
 http://review.gluster.org/7788

 Pranith
 - Original Message -
 From: Kaleb KEITHLEY kkeit...@redhat.com
 To: gluster-devel@gluster.org
 Sent: Tuesday, May 13, 2014 8:01:53 PM
 Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for  
  setxattr

 On 05/13/2014 08:00 AM, Nagaprasad Sathyanarayana wrote:
  On 05/07/2014 03:44 PM, Pranith Kumar Karampuri wrote:
 
  - Original Message -
  From: Raghavendra Gowdappa rgowd...@redhat.com
  To: Pranith Kumar Karampuri pkara...@redhat.com
  Cc: Vijay Bellur vbel...@redhat.com, gluster-devel@gluster.org,
  Anand Avati aav...@redhat.com
  Sent: Wednesday, May 7, 2014 3:42:16 PM
  Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP
  for setxattr
 
  I think with repetitive log message suppression patch being merged, we
  don't really need gf_log_occasionally (except if they are logged in
  DEBUG or
  TRACE levels).
  That definitely helps. But still, setxattr calls are not supposed to
  fail with ENOTSUP on FS where we support gluster. If there are special
  keys which fail with ENOTSUPP, we can conditionally log setxattr
  failures only when the key is something new?

 I know this is about EOPNOTSUPP (a.k.a. ENOTSUPP) returned by
 setxattr(2) for legitimate attrs.

 But I can't help but wondering if this isn't related to other bugs we've
 had with, e.g., lgetxattr(2) called on invalid xattrs?

 E.g. see https://bugzilla.redhat.com/show_bug.cgi?id=765202. We have a
 hack where xlators communicate with each other by getting (and setting?)
 invalid xattrs; the posix xlator has logic to filter out  invalid
 xattrs, but due to bugs this hasn't always worked perfectly.

 It would be interesting to know which xattrs are getting errors and on
 which fs types.

 FWIW, in a quick perusal of a fairly recent (3.14.3) kernel, in xfs
 there are only six places where EOPNOTSUPP is returned, none of them
 related to xattrs. In ext[34] EOPNOTSUPP can be returned if the
 user_xattr option is not enabled (enabled by default in ext4.) And in
 the higher level vfs xattr code there are many places where EOPNOTSUPP
 _might_ be returned, primarily only if subordinate function calls aren't
 invoked which would clear the default or return a different error.

 --

 Kaleb





 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-devel

 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-devel



-- 
Religious confuse piety with mere ritual, the virtuous confuse
regulation with outcomes
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr

2014-05-22 Thread Harshavardhana
http://review.gluster.com/#/c/7823/ - the fix here

On Thu, May 22, 2014 at 1:41 PM, Harshavardhana
har...@harshavardhana.net wrote:
 Here are the important locations in the XFS tree coming from 2.6.32 branch

 STATIC int
 xfs_set_acl(struct inode *inode, int type, struct posix_acl *acl)
 {
         struct xfs_inode *ip = XFS_I(inode);
         unsigned char *ea_name;
         int error;

         if (S_ISLNK(inode->i_mode))    <-- I would generally think this is the issue.
                 return -EOPNOTSUPP;

 STATIC long
 xfs_vn_fallocate(
         struct inode    *inode,
         int             mode,
         loff_t          offset,
         loff_t          len)
 {
         long            error;
         loff_t          new_size = 0;
         xfs_flock64_t   bf;
         xfs_inode_t     *ip = XFS_I(inode);
         int             cmd = XFS_IOC_RESVSP;
         int             attr_flags = XFS_ATTR_NOLOCK;

         if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
                 return -EOPNOTSUPP;

 STATIC int
 xfs_ioc_setxflags(
         xfs_inode_t     *ip,
         struct file     *filp,
         void            __user *arg)
 {
         struct fsxattr  fa;
         unsigned int    flags;
         unsigned int    mask;
         int             error;

         if (copy_from_user(&flags, arg, sizeof(flags)))
                 return -EFAULT;

         if (flags & ~(FS_IMMUTABLE_FL | FS_APPEND_FL | \
                       FS_NOATIME_FL | FS_NODUMP_FL | \
                       FS_SYNC_FL))
                 return -EOPNOTSUPP;

 Perhaps some sort of system-level ACLs are being propagated by us over
 symlinks? Perhaps this is related to the same issue of following symlinks?

 On Sun, May 18, 2014 at 10:48 AM, Pranith Kumar Karampuri
 pkara...@redhat.com wrote:
 Sent the following patch to remove the special treatment of ENOTSUP here: 
 http://review.gluster.org/7788

 Pranith
 - Original Message -
 From: Kaleb KEITHLEY kkeit...@redhat.com
 To: gluster-devel@gluster.org
 Sent: Tuesday, May 13, 2014 8:01:53 PM
 Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for 
   setxattr

 On 05/13/2014 08:00 AM, Nagaprasad Sathyanarayana wrote:
  On 05/07/2014 03:44 PM, Pranith Kumar Karampuri wrote:
 
  - Original Message -
  From: Raghavendra Gowdappa rgowd...@redhat.com
  To: Pranith Kumar Karampuri pkara...@redhat.com
  Cc: Vijay Bellur vbel...@redhat.com, gluster-devel@gluster.org,
  Anand Avati aav...@redhat.com
  Sent: Wednesday, May 7, 2014 3:42:16 PM
  Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP
  for setxattr
 
  I think with repetitive log message suppression patch being merged, we
  don't really need gf_log_occasionally (except if they are logged in
  DEBUG or
  TRACE levels).
  That definitely helps. But still, setxattr calls are not supposed to
  fail with ENOTSUP on FS where we support gluster. If there are special
  keys which fail with ENOTSUPP, we can conditionally log setxattr
  failures only when the key is something new?

 I know this is about EOPNOTSUPP (a.k.a. ENOTSUPP) returned by
 setxattr(2) for legitimate attrs.

 But I can't help but wondering if this isn't related to other bugs we've
 had with, e.g., lgetxattr(2) called on invalid xattrs?

 E.g. see https://bugzilla.redhat.com/show_bug.cgi?id=765202. We have a
 hack where xlators communicate with each other by getting (and setting?)
 invalid xattrs; the posix xlator has logic to filter out  invalid
 xattrs, but due to bugs this hasn't always worked perfectly.

 It would be interesting to know which xattrs are getting errors and on
 which fs types.

 FWIW, in a quick perusal of a fairly recent (3.14.3) kernel, in xfs
 there are only six places where EOPNOTSUPP is returned, none of them
 related to xattrs. In ext[34] EOPNOTSUPP can be returned if the
 user_xattr option is not enabled (enabled by default in ext4.) And in
 the higher level vfs xattr code there are many places where EOPNOTSUPP
 _might_ be returned, primarily only if subordinate function calls aren't
 invoked which would clear the default or return a different error.

 --

 Kaleb





 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-devel

 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-devel



 --
 Religious confuse piety with mere ritual, the virtuous confuse
 regulation with outcomes



-- 
Religious confuse piety with mere ritual, the virtuous confuse
regulation with outcomes
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] bug-857330/normal.t failure

2014-05-22 Thread Pranith Kumar Karampuri


- Original Message -
 From: Kaushal M kshlms...@gmail.com
 To: Justin Clift jus...@gluster.org, Gluster Devel 
 gluster-devel@gluster.org
 Sent: Thursday, May 22, 2014 6:04:29 PM
 Subject: Re: [Gluster-devel] bug-857330/normal.t failure
 
 Thanks Justin, I found the problem. The VM can be deleted now.
 
 Turns out, there was more than enough time for the rebalance to complete. But
 we hit a race, which caused a command to fail.
 
 The particular test that failed is waiting for rebalance to finish. It does
 this by doing a 'gluster volume rebalance  status' command and checking
 the result. The EXPECT_WITHIN function runs this command till we have a
 match, the command fails or the timeout happens.
 
 For a rebalance status command, glusterd sends a request to the rebalance
 process (as a brick_op) to get the latest stats. It had done the same in
 this case as well. But while glusterd was waiting for the reply, the
 rebalance completed and the process stopped itself. This closed the rpc
 connection between glusterd and the rebalance process, so all pending requests
 were unwound as failures, which in turn led to the command failing.

Do you think we can print the status of the process as 'not-responding' when 
such a thing happens, instead of failing the command?

Pranith

 
 I cannot think of a way to avoid this race from within glusterd. For this
 particular test, we could avoid using the 'rebalance status' command if we
 directly checked the rebalance process state using its pid etc. I don't
 particularly approve of this approach, as I think I used the 'rebalance
 status' command for a reason. But I currently cannot recall the reason, and
 if I cannot come up with it soon, I wouldn't mind changing the test to avoid
 rebalance status.
 
 ~kaushal
 
 
 
 On Thu, May 22, 2014 at 5:22 PM, Justin Clift  jus...@gluster.org  wrote:
 
 
 
 On 22/05/2014, at 12:32 PM, Kaushal M wrote:
  I haven't yet. But I will.
  
  Justin,
  Can I get take a peek inside the vm?
 
 Sure.
 
 IP: 23.253.57.20
 User: root
 Password: foobar123
 
 The stdout log from the regression test is in /tmp/regression.log.
 
 The GlusterFS git repo is in /root/glusterfs. Um, you should be
 able to find everything else pretty easily.
 
 Btw, this is just a temp VM, so feel free to do anything you want
 with it. When you're finished with it let me know so I can delete
 it. :)
 
 + Justin
 
 
  ~kaushal
  
  
  On Thu, May 22, 2014 at 4:53 PM, Pranith Kumar Karampuri 
  pkara...@redhat.com  wrote:
  Kaushal,
  Rebalance status command seems to be failing sometimes. I sent a mail about
  such spurious failure earlier today. Did you get a chance to look at the
  logs and confirm that rebalance didn't fail and it is indeed a timeout?
  
  Pranith
  - Original Message -
   From: Kaushal M  kshlms...@gmail.com 
   To: Pranith Kumar Karampuri  pkara...@redhat.com 
   Cc: Justin Clift  jus...@gluster.org , Gluster Devel 
   gluster-devel@gluster.org 
   Sent: Thursday, May 22, 2014 4:40:25 PM
   Subject: Re: [Gluster-devel] bug-857330/normal.t failure
   
   The test is waiting for rebalance to finish. This is a rebalance with
   some
   actual data so it could have taken a long time to finish. I did set a
   pretty high timeout, but it seems like it's not enough for the new VMs.
   
   Possible options are,
   - Increase this timeout further
   - Reduce the amount of data. Currently this is 100 directories with 10
   files each of size between 10-500KB
   
   ~kaushal
   
   
   On Thu, May 22, 2014 at 3:59 PM, Pranith Kumar Karampuri 
   pkara...@redhat.com  wrote:
   
Kaushal has more context about these CCed. Keep the setup until he
responds so that he can take a look.

Pranith
- Original Message -
 From: Justin Clift  jus...@gluster.org 
 To: Pranith Kumar Karampuri  pkara...@redhat.com 
 Cc: Gluster Devel  gluster-devel@gluster.org 
 Sent: Thursday, May 22, 2014 3:54:46 PM
 Subject: bug-857330/normal.t failure
 
 Hi Pranith,
 
 Ran a few VM's with your Gerrit CR 7835 applied, and in DEBUG
 mode (I think).
 
 One of the VM's had a failure in bug-857330/normal.t:
 
 Test Summary Report
 ---
 ./tests/basic/rpm.t (Wstat: 0 Tests: 0
Failed:
 0)
 Parse errors: Bad plan. You planned 8 tests but ran 0.
 ./tests/bugs/bug-857330/normal.t (Wstat: 0 Tests: 24
Failed:
 1)
 Failed test: 13
 Files=230, Tests=4369, 5407 wallclock secs ( 2.13 usr 1.73 sys +
941.82
 cusr 645.54 csys = 1591.22 CPU)
 Result: FAIL
 
 Seems to be this test:
 
 COMMAND=volume rebalance $V0 status
 PATTERN=completed
 EXPECT_WITHIN 300 $PATTERN get-task-status
 
 Is this one on your radar already?
 
 Btw, this VM is still online. Can give you access to retrieve logs
 if useful.
 
 + Justin
 
 --
 Open Source and Standards @ Red Hat
 
 

Re: [Gluster-devel] bug-857330/normal.t failure

2014-05-22 Thread Krishnan Parthasarathi

- Original Message -
 On 22/05/2014, at 1:34 PM, Kaushal M wrote:
  Thanks Justin, I found the problem. The VM can be deleted now.
 
 Done. :)
 
 
  Turns out, there was more than enough time for the rebalance to complete.
  But we hit a race, which caused a command to fail.
  
  The particular test that failed is waiting for rebalance to finish. It does
  this by doing a 'gluster volume rebalance  status' command and checking
  the result. The EXPECT_WITHIN function runs this command till we have a
  match, the command fails or the timeout happens.
  
  For a rebalance status command, glusterd sends a request to the rebalance
  process (as a brick_op) to get the latest stats. It had done the same in
  this case as well. But while glusterd was waiting for the reply, the
  rebalance completed and the process stopped itself. This closed the rpc
  connection between glusterd and the rebalance process, so all pending
  requests were unwound as failures, which in turn led to the command failing.
  
  I cannot think of a way to avoid this race from within glusterd. For this
  particular test, we could avoid using the 'rebalance status' command if we
  directly checked the rebalance process state using its pid etc. I don't
  particularly approve of this approach, as I think I used the 'rebalance
  status' command for a reason. But I currently cannot recall the reason,
  and if I cannot come up with it soon, I wouldn't mind changing the test to
  avoid rebalance status.
 

I think it's the rebalance daemon's life cycle that is problematic. It makes it
inconvenient, if not impossible, for glusterd to gather progress/status
deterministically. The rebalance process could instead wait for the
rebalance-commit subcommand before terminating. No other daemon managed by
glusterd has this kind of life cycle. I don't see any good reason why rebalance
should kill itself on completion of data migration.

Thoughts?

~Krish

 Hmmm, is it the kind of thing where the rebalance status command
 should retry if its connection gets closed by a just-completed
 rebalance (as happened here)?
 
 Or would that not work as well?
 
 + Justin
 
 --
 Open Source and Standards @ Red Hat
 
 twitter.com/realjustinclift
 
 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-devel
 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel