Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr
- Original Message - From: Raghavendra Gowdappa rgowd...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Vijay Bellur vbel...@redhat.com, gluster-devel@gluster.org, Anand Avati aav...@redhat.com Sent: Wednesday, May 7, 2014 3:42:16 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr I think with repetitive log message suppression patch being merged, we don't really need gf_log_occasionally (except if they are logged in DEBUG or TRACE levels). That definitely helps. But still, setxattr calls are not supposed to fail with ENOTSUP on FS where we support gluster. If there are special keys which fail with ENOTSUPP, we can conditionally log setxattr failures only when the key is something new? Pranith - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Vijay Bellur vbel...@redhat.com Cc: gluster-devel@gluster.org, Anand Avati aav...@redhat.com Sent: Wednesday, 7 May, 2014 3:12:10 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr - Original Message - From: Vijay Bellur vbel...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com, Anand Avati aav...@redhat.com Cc: gluster-devel@gluster.org Sent: Tuesday, May 6, 2014 7:16:12 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr On 05/06/2014 01:07 PM, Pranith Kumar Karampuri wrote: hi, Why is there occasional logging for ENOTSUP errno when setxattr fails? In the absence of occasional logging, the log files would be flooded with this message every time there is a setxattr() call. How to know which keys are failing setxattr with ENOTSUPP if it is not logged when the key keeps changing? Pranith -Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Need inputs for command deprecation output
- Original Message - From: Ravishankar N ravishan...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Friday, May 16, 2014 7:15:58 AM Subject: Re: [Gluster-devel] Need inputs for command deprecation output On 05/16/2014 06:25 AM, Pranith Kumar Karampuri wrote: Hi, As part of changing behaviour of 'volume heal' commands. I want the commands to show the following output. Any feedback in making them better would be awesome :-). root@pranithk-laptop - ~ 06:20:10 :) ⚡ gluster volume heal r2 info healed This command has been deprecated root@pranithk-laptop - ~ 06:20:13 :( ⚡ gluster volume heal r2 info heal-failed This command has been deprecated When a command is deprecated, it still works the way it did but gives out a warning about it not being maintained and possible alternatives to it. If I understand http://review.gluster.org/#/c/7766/ correctly, we are not supporting these commands any more, in which case the right message would be Command not supported I am wondering if we should even let the command be sent to self-heal-daemons from glusterd. How about 06:20:10 :) ⚡ gluster volume heal r2 info healed Command not supported. Instead of 06:20:10 :) ⚡ gluster volume heal r2 info healed brick: brick-1 status: Command not supported brick: brick-2 status: Command not supported Pranith -Ravi Pranith. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] spurious failures in tests/encryption/crypt.t
hi, crypt.t is failing regression builds once in a while and most of the times it is because of the failures just after the remount in the script. TEST rm -f $M0/testfile-symlink TEST rm -f $M0/testfile-link Both of these are failing with ENOTCONN. I got a chance to look at the logs. According to the brick logs, this is what I see: [2014-05-17 05:43:43.363979] E [posix.c:2272:posix_open] 0-patchy-posix: open on /d/backends/patchy1/testfile-symlink: Transport endpoint is not connected This is the very first time I saw posix failing with ENOTCONN. Do we have these bricks on some other network mounts? I wonder why it fails with ENOTCONN. I also see that it happens right after a call_bail on the mount. Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Changes to Regression script
- Original Message - From: Vijay Bellur vbel...@redhat.com To: gluster-infra gluster-in...@gluster.org Cc: gluster-devel@gluster.org Sent: Tuesday, May 13, 2014 4:13:02 PM Subject: [Gluster-devel] Changes to Regression script Hi All, Me and Kaushal have effected the following changes on regression.sh in build.gluster.org: 1. If a regression run results in a core and all tests pass, that particular run will be flagged as a failure. Previously a core that would cause test failures only would get marked as a failure. 2. Cores from a particular test run are now archived and are available at /d/archived_builds/. This will also prevent manual intervention for managing cores. 3. Logs from failed regression runs are now archived and are available at /d/logs/glusterfs-timestamp.tgz Do let us know if you have any comments on these changes. This is already proving to be useful :-). I was able to debug one of the spurious failures for crypt.t. But the only problem is I was not able copy out the logs. Had to take avati's help to get the log files. Will it be possible to give access to these files so that anyone can download them? Pranith Thanks, Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
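For reference, the post-run behaviour described above amounts to only a few lines of shell. The sketch below is illustrative, not the actual regression.sh: the archive paths, timestamp format, core-file pattern, and run-tests.sh entry point are assumptions taken from the thread.

#!/bin/bash
# Sketch: fail the run when cores exist, and archive cores and logs with a timestamp.
timestamp=$(date +%Y%m%d:%H:%M:%S)

./run-tests.sh
RET=$?

cores=$(ls /core.* 2>/dev/null)
if [ -n "$cores" ]; then
    RET=1                                                   # a core marks the run as failed even if all tests passed
    mkdir -p /d/archived_builds
    tar -czf /d/archived_builds/build-install-${timestamp}.tgz $cores
fi

if [ $RET -ne 0 ]; then
    mkdir -p /d/logs
    tar -czf /d/logs/glusterfs-logs-${timestamp}.tgz /var/log/glusterfs
fi
exit $RET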
Re: [Gluster-devel] Changes to Regression script
- Original Message - From: Vijay Bellur vbel...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: gluster-infra gluster-in...@gluster.org, gluster-devel@gluster.org Sent: Saturday, May 17, 2014 2:52:03 PM Subject: Re: [Gluster-devel] Changes to Regression script On 05/17/2014 02:10 PM, Pranith Kumar Karampuri wrote: - Original Message - From: Vijay Bellur vbel...@redhat.com To: gluster-infra gluster-in...@gluster.org Cc: gluster-devel@gluster.org Sent: Tuesday, May 13, 2014 4:13:02 PM Subject: [Gluster-devel] Changes to Regression script Hi All, Me and Kaushal have effected the following changes on regression.sh in build.gluster.org: 1. If a regression run results in a core and all tests pass, that particular run will be flagged as a failure. Previously a core that would cause test failures only would get marked as a failure. 2. Cores from a particular test run are now archived and are available at /d/archived_builds/. This will also prevent manual intervention for managing cores. 3. Logs from failed regression runs are now archived and are available at /d/logs/glusterfs-timestamp.tgz Do let us know if you have any comments on these changes. This is already proving to be useful :-). I was able to debug one of the spurious failures for crypt.t. But the only problem is I was not able copy out the logs. Had to take avati's help to get the log files. Will it be possible to give access to these files so that anyone can download them? Good to know! You can access the .tgz files from: http://build.gluster.org:443/logs/ Awesome!! Pranith -Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr
Sent the following patch to remove the special treatment of ENOTSUP here: http://review.gluster.org/7788 Pranith - Original Message - From: Kaleb KEITHLEY kkeit...@redhat.com To: gluster-devel@gluster.org Sent: Tuesday, May 13, 2014 8:01:53 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr On 05/13/2014 08:00 AM, Nagaprasad Sathyanarayana wrote: On 05/07/2014 03:44 PM, Pranith Kumar Karampuri wrote: - Original Message - From: Raghavendra Gowdappa rgowd...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Vijay Bellur vbel...@redhat.com, gluster-devel@gluster.org, Anand Avati aav...@redhat.com Sent: Wednesday, May 7, 2014 3:42:16 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr I think with repetitive log message suppression patch being merged, we don't really need gf_log_occasionally (except if they are logged in DEBUG or TRACE levels). That definitely helps. But still, setxattr calls are not supposed to fail with ENOTSUP on FS where we support gluster. If there are special keys which fail with ENOTSUPP, we can conditionally log setxattr failures only when the key is something new? I know this is about EOPNOTSUPP (a.k.a. ENOTSUPP) returned by setxattr(2) for legitimate attrs. But I can't help but wondering if this isn't related to other bugs we've had with, e.g., lgetxattr(2) called on invalid xattrs? E.g. see https://bugzilla.redhat.com/show_bug.cgi?id=765202. We have a hack where xlators communicate with each other by getting (and setting?) invalid xattrs; the posix xlator has logic to filter out invalid xattrs, but due to bugs this hasn't always worked perfectly. It would be interesting to know which xattrs are getting errors and on which fs types. FWIW, in a quick perusal of a fairly recent (3.14.3) kernel, in xfs there are only six places where EOPNOTSUPP is returned, none of them related to xattrs. In ext[34] EOPNOTSUPP can be returned if the user_xattr option is not enabled (enabled by default in ext4.) And in the higher level vfs xattr code there are many places where EOPNOTSUPP _might_ be returned, primarily only if subordinate function calls aren't invoked which would clear the default or return a different error. -- Kaleb ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
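Kaleb's ext[34] observation above can be checked from a shell without involving gluster at all. A hedged demonstration (device and mount point are placeholders):

mkdir -p /mnt/test
mount -o nouser_xattr /dev/sdb1 /mnt/test
touch /mnt/test/file
setfattr -n user.foo -v bar /mnt/test/file         # fails with "Operation not supported" when user_xattr is off
setfattr -n trusted.foo -v bar /mnt/test/file      # trusted.* (as root) is unaffected by the user_xattr option
mount -o remount,user_xattr /mnt/test
setfattr -n user.foo -v bar /mnt/test/file         # succeeds once user_xattr is enabled
getfattr -n user.foo /mnt/test/file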
Re: [Gluster-devel] Changes to Regression script
- Original Message - From: Vijay Bellur vbel...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: gluster-infra gluster-in...@gluster.org, gluster-devel@gluster.org Sent: Saturday, 17 May, 2014 2:52:03 PM Subject: Re: [Gluster-devel] Changes to Regression script On 05/17/2014 02:10 PM, Pranith Kumar Karampuri wrote: - Original Message - From: Vijay Bellur vbel...@redhat.com To: gluster-infra gluster-in...@gluster.org Cc: gluster-devel@gluster.org Sent: Tuesday, May 13, 2014 4:13:02 PM Subject: [Gluster-devel] Changes to Regression script Hi All, Me and Kaushal have effected the following changes on regression.sh in build.gluster.org: 1. If a regression run results in a core and all tests pass, that particular run will be flagged as a failure. Previously a core that would cause test failures only would get marked as a failure. 2. Cores from a particular test run are now archived and are available at /d/archived_builds/. This will also prevent manual intervention for managing cores. 3. Logs from failed regression runs are now archived and are available at /d/logs/glusterfs-timestamp.tgz Do let us know if you have any comments on these changes. This is already proving to be useful :-). I was able to debug one of the spurious failures for crypt.t. But the only problem is I was not able copy out the logs. Had to take avati's help to get the log files. Will it be possible to give access to these files so that anyone can download them? Good to know! You can access the .tgz files from: http://build.gluster.org:443/logs/ I was able to access these yesterday. But now it gives 404. Pranith -Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Spurious failures because of nfs and snapshots
- Original Message - From: Justin Clift jus...@gluster.org To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Monday, 19 May, 2014 10:26:04 AM Subject: Re: [Gluster-devel] Spurious failures because of nfs and snapshots On 16/05/2014, at 1:49 AM, Pranith Kumar Karampuri wrote: hi, In the latest build I fired for review.gluster.com/7766 (http://build.gluster.org/job/regression/4443/console) failed because of spurious failure. The script doesn't wait for nfs export to be available. I fixed that, but interestingly I found quite a few scripts with same problem. Some of the scripts are relying on 'sleep 5' which also could lead to spurious failures if the export is not available in 5 seconds. Cool. Fixing this NFS problem across all of the tests would be really welcome. That specific failed test (bug-1087198.t) is the most common one I've seen over the last few weeks, causing about half of all failures in master. Eliminating this class of regression failure would be really helpful. :) This particular class is eliminated :-). Patch was merged on Friday. Pranith + Justin -- Open Source and Standards @ Red Hat twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
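For reference, the class of fix being described replaces fixed sleeps with a poll on the NFS export. A sketch using the test framework's EXPECT_WITHIN (the helper body and timeout are illustrative; the merged patch may differ):

function is_nfs_export_available {
    # count how many times the volume shows up in the NFS export list
    showmount -e 127.0.0.1 | grep "$V0" | wc -l
}
EXPECT_WITHIN 20 "1" is_nfs_export_available        # instead of "sleep 5"
TEST mount -t nfs -o vers=3,nolock 127.0.0.1:/$V0 $N0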
Re: [Gluster-devel] Spurious failures because of nfs and snapshots
hi, Please resubmit the patches on top of http://review.gluster.com/#/c/7753 to prevent frequent regression failures. Pranith - Original Message - From: Vijaikumar M vmall...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Joseph Fernandes josfe...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Monday, May 19, 2014 2:40:47 PM Subject: Re: Spurious failures because of nfs and snapshots Brick disconnected with ping-time out: Here is the log message [2014-05-19 04:29:38.133266] I [MSGID: 100030] [glusterfsd.c:1998:main] 0-/build/install/sbin/glusterfsd: Started running /build/install/sbi n/glusterfsd version 3.5qa2 (args: /build/install/sbin/glusterfsd -s build.gluster.org --volfile-id /snaps/patchy_snap1/3f2ae3fbb4a74587b1a9 1013f07d327f.build.gluster.org.var-run-gluster-snaps-3f2ae3fbb4a74587b1a91013f07d327f-brick3 -p /var/lib/glusterd/snaps/patchy_snap1/3f2ae3f bb4a74587b1a91013f07d327f/run/build.gluster.org-var-run-gluster-snaps-3f2ae3fbb4a74587b1a91013f07d327f-brick3.pid -S /var/run/51fe50a6faf0aae006c815da946caf3a.socket --brick-name /var/run/gluster/snaps/3f2ae3fbb4a74587b1a91013f07d327f/brick3 -l /build/install/var/log/glusterfs/br icks/var-run-gluster-snaps-3f2ae3fbb4a74587b1a91013f07d327f-brick3.log --xlator-option *-posix.glusterd-uuid=494ef3cd-15fc-4c8c-8751-2d441ba 7b4b0 --brick-port 49164 --xlator-option 3f2ae3fbb4a74587b1a91013f07d327f-server.listen-port=49164) 2 [2014-05-19 04:29:38.141118] I [rpc-clnt.c:988:rpc_clnt_connection_init] 0-glusterfs: defaulting ping-timeout to 30secs 3 [2014-05-19 04:30:09.139521] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-glusterfs: server 10.3.129.13:24007 has not responded in the last 30 seconds, disconnecting. Patch 'http://review.gluster.org/#/c/7753/' will fix the problem, where ping-timer will be disabled by default for all the rpc connection except for glusterd-glusterd (set to 30sec) and client-glusterd (set to 42sec). Thanks, Vijay On Monday 19 May 2014 11:56 AM, Pranith Kumar Karampuri wrote: The latest build failure also has the same issue: Download it from here: http://build.gluster.org:443/logs/glusterfs-logs-20140518%3a22%3a27%3a31.tgz Pranith - Original Message - From: Vijaikumar M vmall...@redhat.com To: Joseph Fernandes josfe...@redhat.com Cc: Pranith Kumar Karampuri pkara...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Monday, 19 May, 2014 11:41:28 AM Subject: Re: Spurious failures because of nfs and snapshots Hi Joseph, In the log mentioned below, it say ping-time is set to default value 30sec.I think issue is different. Can you please point me to the logs where you where able to re-create the problem. Thanks, Vijay On Monday 19 May 2014 09:39 AM, Pranith Kumar Karampuri wrote: hi Vijai, Joseph, In 2 of the last 3 build failures, http://build.gluster.org/job/regression/4479/console, http://build.gluster.org/job/regression/4478/console this test(tests/bugs/bug-1090042.t) failed. Do you guys think it is better to revert this test until the fix is available? Please send a patch to revert the test case if you guys feel so. You can re-submit it along with the fix to the bug mentioned by Joseph. Pranith. 
- Original Message - From: Joseph Fernandes josfe...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Friday, 16 May, 2014 5:13:57 PM Subject: Re: Spurious failures because of nfs and snapshots Hi All, tests/bugs/bug-1090042.t : I was able to reproduce the issue i.e when this test is done in a loop for i in {1..135} ; do ./bugs/bug-1090042.t When checked the logs [2014-05-16 10:49:49.003978] I [rpc-clnt.c:973:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2014-05-16 10:49:49.004035] I [rpc-clnt.c:988:rpc_clnt_connection_init] 0-management: defaulting ping-timeout to 30secs [2014-05-16 10:49:49.004303] I [rpc-clnt.c:973:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600 [2014-05-16 10:49:49.004340] I [rpc-clnt.c:988:rpc_clnt_connection_init] 0-management: defaulting ping-timeout to 30secs The issue is with ping-timeout and is tracked under the bug https://bugzilla.redhat.com/show_bug.cgi?id=1096729 The workaround is mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1096729#c8 Regards, Joe - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Gluster Devel gluster-devel@gluster.org Cc: Joseph Fernandes josfe...@redhat.com Sent: Friday, May 16, 2014 6:19:54 AM Subject: Spurious failures because of nfs and snapshots hi, In the latest build I fired for review.gluster.com/7766 (http://build.gluster.org/job/regression/4443/console) failed
[Gluster-devel] Split-brain present and future in afr
hi,
Thanks to Vijay Bellur for helping with the re-write of the draft I sent him :-).

Present:
Split-brains of files happen in afr today due to 2 primary reasons:
1. Split-brains due to network partition or network split-brains
2. Split-brains due to servers in a replicated group being offline at different points in time without self-heal happening in the common period of time when the servers were online. For further discussion, this is referred to as split-brain over time.

To prevent the occurrence of split-brains, we have the following quorum implementations in place:
a) Client quorum - Driven by afr (client); writes are allowed when a majority of bricks in a replica group are online. Majority is by default N/2 + 1, where N is the replication factor for files in a volume.
b) Server quorum - Driven by glusterd (server); writes are allowed when a majority of peers are online. Majority by default is N/2 + 1, where N is the number of peers in a trusted storage pool.

Both a) and b) primarily safeguard against network split-brains. The protection these quorum implementations offer for split-brain over time scenarios is not very high. Let us consider how replica 3 and replica 2 can be protected against split-brains.

Replica 3: Client quorum is quite effective in this case as writes are only allowed when at least 2 of the 3 bricks that form a replica group are seen by afr/client. A recent fix for a corner-case race in client quorum (http://review.gluster.org/7600) makes it very robust. This patch is now part of master and release-3.5. We plan to backport it to release-3.4 too.

Replica 2: Majority for client quorum in a deployment with 2 bricks per replica group is 2. Hence availability becomes a problem with replica 2 when either of the bricks is offline. To provide better availability for replica-2, the first brick in a replica set is given higher weight and quorum is met as long as the first brick is online. If the first brick is offline, then quorum is lost. Let us consider the following cases with B1 and B2 forming a replicated set:

B1       B2       Quorum
Online   Online   Met
Online   Offline  Met
Offline  Online   Not Met
Offline  Offline  Not Met

Though better availability is provided by client quorum in replica 2 scenarios, it is not very optimal and hence an improvement in behavior seems desirable.

Future:
Our focus in afr going forward would be to solve three problems to provide better protection against split-brains and for resolving them:
1. Better protection for split-brain over time.
2. Policy based split-brain resolution.
3. Provide better availability with client quorum and replica 2.

For 1, implementation of outcasting logic will address the problem:
- An outcast is a copy of a file on which writes have been performed only when quorum is met.
- When a brick goes down and comes back up, the self-heal daemon will mark the affected files on the brick that just came back up as outcasts. The outcast marking can be implemented even before the brick is declared available to regular clients. Once a copy of a file is marked as needing self-heal (or as an outcast), writes from clients will not land on that copy till self-heal is completed and the outcast tag is removed.

For 2, we plan to provide commands that can heal based on user-configurable policies. Examples of policies would be:
- Pick up the largest file as the winner for resolving a self-heal
- Choose brick foo as the winner for resolving split-brains
- Pick up the file with the latest version as the winner (when versioning for files is available).
For 3, we are planning to introduce arbiter bricks that can be used to determine quorum. The arbiter bricks will be dummy bricks that host only files that will be updated from multiple clients. This will be achieved by bringing about a variable replication count for a configurable class of files within a volume. In the case of a replicated volume with one arbiter brick per replica group, certain files that are prone to split-brain will be in 3 bricks (2 data bricks + 1 arbiter brick). All other files will be present in the regular data bricks.

For example, when oVirt VM disks are hosted on a replica 2 volume, sanlock is used by oVirt for arbitration. sanlock lease files will be written by all clients and VM disks are written by only a single client at any given point of time. In this scenario, we can place the sanlock lease files on 2 data + 1 arbiter bricks. The VM disk files will only be present on the 2 data bricks. Client quorum is now determined by looking at 3 bricks instead of 2 and we have better protection when network split-brains happen.

A combination of 1. and 3. does seem
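For reference, the client and server quorum mechanisms described in the "Present" section above are controlled through volume options along these lines (option names as exposed by the gluster CLI; defaults may differ between releases):

# Client quorum (enforced by afr on the client side):
gluster volume set VOLNAME cluster.quorum-type auto       # a majority of the replica set must be up for writes
gluster volume set VOLNAME cluster.quorum-count 2         # or: a fixed brick count instead of "auto"

# Server quorum (enforced by glusterd across peers):
gluster volume set VOLNAME cluster.server-quorum-type server
gluster volume set all cluster.server-quorum-ratio 51%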
Re: [Gluster-devel] Spurious failures because of nfs and snapshots
Hey, Seems like even after this fix is merged, the regression tests are failing for the same script. You can check the logs at http://build.gluster.org:443/logs/glusterfs-logs-20140520%3a14%3a06%3a46.tgz Relevant logs: [2014-05-20 20:17:07.026045] : volume create patchy build.gluster.org:/d/backends/patchy1 build.gluster.org:/d/backends/patchy2 : SUCCESS [2014-05-20 20:17:08.030673] : volume start patchy : SUCCESS [2014-05-20 20:17:08.279148] : volume barrier patchy enable : SUCCESS [2014-05-20 20:17:08.476785] : volume barrier patchy enable : FAILED : Failed to reconfigure barrier. [2014-05-20 20:17:08.727429] : volume barrier patchy disable : SUCCESS [2014-05-20 20:17:08.926995] : volume barrier patchy disable : FAILED : Failed to reconfigure barrier. Pranith - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Gluster Devel gluster-devel@gluster.org Cc: Joseph Fernandes josfe...@redhat.com, Vijaikumar M vmall...@redhat.com Sent: Tuesday, May 20, 2014 3:41:11 PM Subject: Re: Spurious failures because of nfs and snapshots hi, Please resubmit the patches on top of http://review.gluster.com/#/c/7753 to prevent frequent regression failures. Pranith - Original Message - From: Vijaikumar M vmall...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Joseph Fernandes josfe...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Monday, May 19, 2014 2:40:47 PM Subject: Re: Spurious failures because of nfs and snapshots Brick disconnected with ping-time out: Here is the log message [2014-05-19 04:29:38.133266] I [MSGID: 100030] [glusterfsd.c:1998:main] 0-/build/install/sbin/glusterfsd: Started running /build/install/sbi n/glusterfsd version 3.5qa2 (args: /build/install/sbin/glusterfsd -s build.gluster.org --volfile-id /snaps/patchy_snap1/3f2ae3fbb4a74587b1a9 1013f07d327f.build.gluster.org.var-run-gluster-snaps-3f2ae3fbb4a74587b1a91013f07d327f-brick3 -p /var/lib/glusterd/snaps/patchy_snap1/3f2ae3f bb4a74587b1a91013f07d327f/run/build.gluster.org-var-run-gluster-snaps-3f2ae3fbb4a74587b1a91013f07d327f-brick3.pid -S /var/run/51fe50a6faf0aae006c815da946caf3a.socket --brick-name /var/run/gluster/snaps/3f2ae3fbb4a74587b1a91013f07d327f/brick3 -l /build/install/var/log/glusterfs/br icks/var-run-gluster-snaps-3f2ae3fbb4a74587b1a91013f07d327f-brick3.log --xlator-option *-posix.glusterd-uuid=494ef3cd-15fc-4c8c-8751-2d441ba 7b4b0 --brick-port 49164 --xlator-option 3f2ae3fbb4a74587b1a91013f07d327f-server.listen-port=49164) 2 [2014-05-19 04:29:38.141118] I [rpc-clnt.c:988:rpc_clnt_connection_init] 0-glusterfs: defaulting ping-timeout to 30secs 3 [2014-05-19 04:30:09.139521] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-glusterfs: server 10.3.129.13:24007 has not responded in the last 30 seconds, disconnecting. Patch 'http://review.gluster.org/#/c/7753/' will fix the problem, where ping-timer will be disabled by default for all the rpc connection except for glusterd-glusterd (set to 30sec) and client-glusterd (set to 42sec). 
Thanks, Vijay On Monday 19 May 2014 11:56 AM, Pranith Kumar Karampuri wrote: The latest build failure also has the same issue: Download it from here: http://build.gluster.org:443/logs/glusterfs-logs-20140518%3a22%3a27%3a31.tgz Pranith - Original Message - From: Vijaikumar M vmall...@redhat.com To: Joseph Fernandes josfe...@redhat.com Cc: Pranith Kumar Karampuri pkara...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Monday, 19 May, 2014 11:41:28 AM Subject: Re: Spurious failures because of nfs and snapshots Hi Joseph, In the log mentioned below, it say ping-time is set to default value 30sec.I think issue is different. Can you please point me to the logs where you where able to re-create the problem. Thanks, Vijay On Monday 19 May 2014 09:39 AM, Pranith Kumar Karampuri wrote: hi Vijai, Joseph, In 2 of the last 3 build failures, http://build.gluster.org/job/regression/4479/console, http://build.gluster.org/job/regression/4478/console this test(tests/bugs/bug-1090042.t) failed. Do you guys think it is better to revert this test until the fix is available? Please send a patch to revert the test case if you guys feel so. You can re-submit it along with the fix to the bug mentioned by Joseph. Pranith. - Original Message - From: Joseph Fernandes josfe...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Friday, 16 May, 2014 5:13:57 PM Subject: Re: Spurious failures because of nfs and snapshots Hi All, tests/bugs/bug-1090042.t : I was able to reproduce the issue i.e when this test is done in a loop for i in {1
Re: [Gluster-devel] Spurious failures because of nfs and snapshots
- Original Message - From: Atin Mukherjee amukh...@redhat.com To: gluster-devel@gluster.org, Pranith Kumar Karampuri pkara...@redhat.com Sent: Wednesday, May 21, 2014 3:39:21 PM Subject: Re: Fwd: Re: [Gluster-devel] Spurious failures because of nfs and snapshots On 05/21/2014 11:42 AM, Atin Mukherjee wrote: On 05/21/2014 10:54 AM, SATHEESARAN wrote: Guys, This is the issue pointed out by Pranith with regard to Barrier. I was reading through it. But I wanted to bring it to concern -- S Original Message Subject: Re: [Gluster-devel] Spurious failures because of nfs and snapshots Date: Tue, 20 May 2014 21:16:57 -0400 (EDT) From: Pranith Kumar Karampuri pkara...@redhat.com To:Vijaikumar M vmall...@redhat.com, Joseph Fernandes josfe...@redhat.com CC:Gluster Devel gluster-devel@gluster.org Hey, Seems like even after this fix is merged, the regression tests are failing for the same script. You can check the logs at http://build.gluster.org:443/logs/glusterfs-logs-20140520%3a14%3a06%3a46.tgz Pranith, Is this the correct link? I don't see any log having this sequence there. Also looking at the log from this mail, this is expected as per the barrier functionality, an enable request followed by another enable should always fail and the same happens for disable. Can you please confirm the link and which particular regression test is causing this issue, is it bug-1090042.t? --Atin Relevant logs: [2014-05-20 20:17:07.026045] : volume create patchy build.gluster.org:/d/backends/patchy1 build.gluster.org:/d/backends/patchy2 : SUCCESS [2014-05-20 20:17:08.030673] : volume start patchy : SUCCESS [2014-05-20 20:17:08.279148] : volume barrier patchy enable : SUCCESS [2014-05-20 20:17:08.476785] : volume barrier patchy enable : FAILED : Failed to reconfigure barrier. [2014-05-20 20:17:08.727429] : volume barrier patchy disable : SUCCESS [2014-05-20 20:17:08.926995] : volume barrier patchy disable : FAILED : Failed to reconfigure barrier. This log is for bug-1092841.t and its expected. Damn :-(. I think I screwed up the timestamps while checking Sorry about that :-(. But there are failures. Check http://build.gluster.org/job/regression/4501/consoleFull Pranith --Atin Pranith - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Gluster Devel gluster-devel@gluster.org Cc: Joseph Fernandes josfe...@redhat.com, Vijaikumar M vmall...@redhat.com Sent: Tuesday, May 20, 2014 3:41:11 PM Subject: Re: Spurious failures because of nfs and snapshots hi, Please resubmit the patches on top of http://review.gluster.com/#/c/7753 to prevent frequent regression failures. 
Pranith - Original Message - From: Vijaikumar M vmall...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Joseph Fernandes josfe...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Monday, May 19, 2014 2:40:47 PM Subject: Re: Spurious failures because of nfs and snapshots Brick disconnected with ping-time out: Here is the log message [2014-05-19 04:29:38.133266] I [MSGID: 100030] [glusterfsd.c:1998:main] 0-/build/install/sbin/glusterfsd: Started running /build/install/sbi n/glusterfsd version 3.5qa2 (args: /build/install/sbin/glusterfsd -s build.gluster.org --volfile-id /snaps/patchy_snap1/3f2ae3fbb4a74587b1a9 1013f07d327f.build.gluster.org.var-run-gluster-snaps-3f2ae3fbb4a74587b1a91013f07d327f-brick3 -p /var/lib/glusterd/snaps/patchy_snap1/3f2ae3f bb4a74587b1a91013f07d327f/run/build.gluster.org-var-run-gluster-snaps-3f2ae3fbb4a74587b1a91013f07d327f-brick3.pid -S /var/run/51fe50a6faf0aae006c815da946caf3a.socket --brick-name /var/run/gluster/snaps/3f2ae3fbb4a74587b1a91013f07d327f/brick3 -l /build/install/var/log/glusterfs/br icks/var-run-gluster-snaps-3f2ae3fbb4a74587b1a91013f07d327f-brick3.log --xlator-option *-posix.glusterd-uuid=494ef3cd-15fc-4c8c-8751-2d441ba 7b4b0 --brick-port 49164 --xlator-option 3f2ae3fbb4a74587b1a91013f07d327f-server.listen-port=49164) 2 [2014-05-19 04:29:38.141118] I [rpc-clnt.c:988:rpc_clnt_connection_init] 0-glusterfs: defaulting ping-timeout to 30secs 3 [2014-05-19 04:30:09.139521] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired] 0-glusterfs: server 10.3.129.13:24007 has not responded in the last 30 seconds, disconnecting. Patch 'http://review.gluster.org/#/c/7753/' will fix the problem, where ping-timer will be disabled by default for all the rpc connection except for glusterd-glusterd (set to 30sec) and client-glusterd (set to 42sec). Thanks, Vijay On Monday 19 May 2014 11:56 AM, Pranith Kumar Karampuri wrote: The latest build failure also has the same issue: Download it from here: http://build.gluster.org:443/logs/glusterfs-logs
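For clarity on the barrier behaviour discussed above: bug-1092841.t intentionally issues each barrier operation twice, so the second enable and the second disable are expected to fail. A sketch of what such a test asserts (the actual test may differ in detail):

TEST   $CLI volume barrier $V0 enable      # first enable succeeds
TEST ! $CLI volume barrier $V0 enable      # enabling an already-enabled barrier must fail
TEST   $CLI volume barrier $V0 disable     # first disable succeeds
TEST ! $CLI volume barrier $V0 disable     # disabling an already-disabled barrier must fail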
Re: [Gluster-devel] spurious failures in tests/encryption/crypt.t
- Original Message - From: Anand Avati av...@gluster.org To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Edward Shishkin edw...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, May 21, 2014 12:36:22 PM Subject: Re: [Gluster-devel] spurios failures in tests/encryption/crypt.t On Tue, May 20, 2014 at 10:54 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote: - Original Message - From: Anand Avati av...@gluster.org To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Edward Shishkin edw...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, May 21, 2014 10:53:54 AM Subject: Re: [Gluster-devel] spurios failures in tests/encryption/crypt.t There are a few suspicious things going on here.. On Tue, May 20, 2014 at 10:07 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote: hi, crypt.t is failing regression builds once in a while and most of the times it is because of the failures just after the remount in the script. TEST rm -f $M0/testfile-symlink TEST rm -f $M0/testfile-link Both of these are failing with ENOTCONN. I got a chance to look at the logs. According to the brick logs, this is what I see: [2014-05-17 05:43:43.363979] E [posix.c:2272:posix_open] 0-patchy-posix: open on /d/backends/patchy1/testfile-symlink: Transport endpoint is not connected posix_open() happening on a symlink? This should NEVER happen. glusterfs itself should NEVER EVER by triggering symlink resolution on the server. In this case, for whatever reason an open() is attempted on a symlink, and it is getting followed back onto gluster's own mount point (test case is creating an absolute link). So first find out: who is triggering fop-open() on a symlink. Fix the caller. http://review.gluster.org/7824 Next: add a check in posix_open() to fail with ELOOP or EINVAL if the inode is a symlink. http://review.gluster.org/7823 I think I understood what you are saying. Open call for symlink on fuse mount lead to an open call again for the target on the same fuse mount. It's not that simple. The client VFS is intelligent enough to resolve symlinks and send open() only on non-symlinks. And the test case script was doing an obvious unlink() (TEST rm -f filename), so it was not initiated by an open() attempt in the first place. My guess is that some xlator (probably crypt?) is doing an open() on an inode and that is going through unchecked in posix. It is a bug in both the caller and posix, but the onus/responsibility is on posix to disallow open() on anything but regular files (even open() on character or block devices should not happen in posix). Which lead to deadlock :). That is why we disallow opens on symlink in gluster? That's not just why open on symlink is disallowed in gluster, it is a more generic problem of following symlinks in general inside gluster. Symlink resolution must strictly happen only in the outermost VFS. Following symlinks inside the filesystem is not only an invalid operation, but can lead to all kinds of deadlocks, security holes (what if you opened a symlink which points to /etc/passwd, should it show the contents of the client machine's /etc/passwd or the server? Now what if you wrote to the file through the symlink? etc. you get the idea..) and wrong/weird/dangerous behaviors. This is not just related to following symlinks, even open()ing special devices.. e.g if you create a char device file with major/minor number of an audio device and wrote pcm data into it, should it play music on the client machine or in the server machine? etc. 
The summary is, following symlinks or opening non-regular files is VFS/client operation and are invalid operations in a filesystem context. Now only one question remains. How could it not hang everytime? Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
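For context, the failing part of crypt.t exercises exactly the pattern discussed: a symlink whose absolute target points back into the glusterfs mount, which only the client VFS may resolve. Roughly (a sketch; variable names follow the test framework, the exact commands in crypt.t may differ):

TEST glusterfs --volfile-server=$H0 --volfile-id=$V0 $M0
TEST dd if=/dev/zero of=$M0/testfile bs=1k count=1
TEST ln -s $M0/testfile $M0/testfile-symlink     # absolute target resolving back into the mount
TEST ln $M0/testfile $M0/testfile-link
# ... remount ...
TEST rm -f $M0/testfile-symlink                  # the unlink itself must never trigger a server-side open
TEST rm -f $M0/testfile-link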
[Gluster-devel] Regarding spurious failure of bug-884455.t
hi Raghavendra, The failures are happening because rebalance status command is failing. I could see the following logs in glusterd log if I start it with -L DEBUG and try to recreate the bug. Could you please take a look at it. Test Summary Report --- tests/bugs/bug-884455.t (Wstat: 0 Tests: 22 Failed: 1) Failed test: 11 [2014-05-22 01:49:29.493657] D [glusterd-op-sm.c:5961:glusterd_op_ac_send_brick_op] 0-management: Returning with 0 [2014-05-22 01:49:29.493662] D [glusterd-utils.c:8027:glusterd_sm_tr_log_transition_add] 0-management: Transitioning from 'Stage op sent' to 'Brick op sent' due to event 'GD_OP_EVENT_ALL_ACC' [2014-05-22 01:49:29.493667] D [glusterd-utils.c:8029:glusterd_sm_tr_log_transition_add] 0-management: returning 0 [2014-05-22 01:49:29.493673] D [glusterd-op-sm.c:268:glusterd_set_txn_opinfo] 0-: Successfully set opinfo for transaction ID : 3825ae51-bb80-4031-9d61- 82799fe0bc81 [2014-05-22 01:49:29.493678] D [glusterd-op-sm.c:275:glusterd_set_txn_opinfo] 0-: Returning 0 [2014-05-22 01:49:29.493820] D [glusterd-rpc-ops.c:1796:__glusterd_brick_op_cbk] 0-: transaction ID = 3825ae51-bb80-4031-9d61-82799fe0bc81 [2014-05-22 01:49:29.493829] D [glusterd-op-sm.c:6411:glusterd_op_sm_inject_event] 0-management: Enqueue event: 'GD_OP_EVENT_RCVD_RJT' [2014-05-22 01:49:29.493835] D [glusterd-op-sm.c:6489:glusterd_op_sm] 0-management: Dequeued event of type: 'GD_OP_EVENT_RCVD_RJT' Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
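One way to chase a spurious failure like this, sketched below rather than taken from the test suite, is to run the single test in a loop until it trips and then search the glusterd log for the rejection event quoted above (log path is the usual default and may differ on the build slaves):

for i in {1..50}; do
    prove -vf tests/bugs/bug-884455.t || break
done
grep -n "GD_OP_EVENT_RCVD_RJT" /var/log/glusterfs/*glusterd*.log | tail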
Re: [Gluster-devel] Regression testing results for master branch
- Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Justin Clift jus...@gluster.org Cc: Gluster Devel gluster-devel@gluster.org Sent: Thursday, May 22, 2014 6:23:16 AM Subject: Re: [Gluster-devel] Regression testing results for master branch - Original Message - From: Justin Clift jus...@gluster.org To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Wednesday, May 21, 2014 11:01:36 PM Subject: Re: [Gluster-devel] Regression testing results for master branch On 21/05/2014, at 6:17 PM, Justin Clift wrote: Hi all, Kicked off 21 VM's in Rackspace earlier today, running the regression tests against master branch. Only 3 VM's failed out of the 21 (86% PASS, 14% FAIL), with all three being for the same test: Test Summary Report --- ./tests/bugs/bug-948686.t (Wstat: 0 Tests: 20 Failed: 2) Failed tests: 13-14 Files=230, Tests=4373, 5601 wallclock secs ( 2.09 usr 1.58 sys + 1012.66 cusr 688.80 csys = 1705.13 CPU) Result: FAIL Interestingly, this one looks like a simple time based thing too. The failed tests are the ones after the sleep: ... #modify volume config to see change in volume-sync TEST $CLI_1 volume set $V0 write-behind off #add some files to the volume to see effect of volume-heal cmd TEST touch $M0/{1..100}; TEST $CLI_1 volume stop $V0; TEST $glusterd_3; sleep 3; TEST $CLI_3 volume start $V0; TEST $CLI_2 volume stop $V0; TEST $CLI_2 volume delete $V0; Do you already have this one on your radar? It wasn't, thanks for bringing it on my radar :-). Sent http://review.gluster.org/7837 to address this. Kaushal, I made this fix based on the assumption that the script seems to be waiting for all glusterds to be online. I could not check the logs because glusterds spawned by cluster.rc seem to be storing the logs not in the default location. Do you think we can make changes to the script so that we can get logs from glusterds spawned by cluster.rc as well? Pranith Pranith + Justin -- Open Source and Standards @ Red Hat twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
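The suspicious part of the quoted test is the bare "sleep 3" after restarting glusterd. The fix at http://review.gluster.org/7837 is presumably along these lines, waiting for the restarted daemon to rejoin the cluster instead of sleeping (a sketch reusing helpers the test already defines):

TEST $glusterd_3;
EXPECT_WITHIN $PROBE_TIMEOUT 2 check_peers        # wait until all glusterds see each other again
TEST $CLI_3 volume start $V0;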
[Gluster-devel] regression tests and DEBUG flags
hi, I think we should run the regression tests with DEBUG builds so that GF_ASSERTs are caught. I will work with Justin to make sure we don't see too many failures before turning it on. I also want the regression tests to catch memory-corruption (invalid read/write of deallocated memory). For that I sent the following patch http://review.gluster.com/7835 to minimize the effects of mem-pool. Please let me know your comments. Review on the patch would be nice too :-). Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
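For reference, a build of the kind being proposed can be produced as below, so that GF_ASSERT() aborts are actually triggered during regression; treat the exact invocation as a sketch rather than the agreed procedure:

./autogen.sh
./configure --enable-debug          # debug build: -g -O0, assertions stay live
make -j"$(nproc)" && make install
./run-tests.sh                      # regression run against the debug build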
Re: [Gluster-devel] bug-857330/normal.t failure
Kaushal, Rebalance status command seems to be failing sometimes. I sent a mail about such spurious failure earlier today. Did you get a chance to look at the logs and confirm that rebalance didn't fail and it is indeed a timeout? Pranith - Original Message - From: Kaushal M kshlms...@gmail.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Justin Clift jus...@gluster.org, Gluster Devel gluster-devel@gluster.org Sent: Thursday, May 22, 2014 4:40:25 PM Subject: Re: [Gluster-devel] bug-857330/normal.t failure The test is waiting for rebalance to finish. This is a rebalance with some actual data so it could have taken a long time to finish. I did set a pretty high timeout, but it seems like it's not enough for the new VMs. Possible options are, - Increase this timeout further - Reduce the amount of data. Currently this is 100 directories with 10 files each of size between 10-500KB ~kaushal On Thu, May 22, 2014 at 3:59 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote: Kaushal has more context about these CCed. Keep the setup until he responds so that he can take a look. Pranith - Original Message - From: Justin Clift jus...@gluster.org To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Thursday, May 22, 2014 3:54:46 PM Subject: bug-857330/normal.t failure Hi Pranith, Ran a few VM's with your Gerrit CR 7835 applied, and in DEBUG mode (I think). One of the VM's had a failure in bug-857330/normal.t: Test Summary Report --- ./tests/basic/rpm.t (Wstat: 0 Tests: 0 Failed: 0) Parse errors: Bad plan. You planned 8 tests but ran 0. ./tests/bugs/bug-857330/normal.t(Wstat: 0 Tests: 24 Failed: 1) Failed test: 13 Files=230, Tests=4369, 5407 wallclock secs ( 2.13 usr 1.73 sys + 941.82 cusr 645.54 csys = 1591.22 CPU) Result: FAIL Seems to be this test: COMMAND=volume rebalance $V0 status PATTERN=completed EXPECT_WITHIN 300 $PATTERN get-task-status Is this one on your radar already? Btw, this VM is still online. Can give you access to retrieve logs if useful. + Justin -- Open Source and Standards @ Red Hat twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] bug-857330/normal.t failure
- Original Message - From: Kaushal M kshlms...@gmail.com To: Justin Clift jus...@gluster.org, Gluster Devel gluster-devel@gluster.org Sent: Thursday, May 22, 2014 6:04:29 PM Subject: Re: [Gluster-devel] bug-857330/normal.t failure Thanks Justin, I found the problem. The VM can be deleted now. Turns out, there was more than enough time for the rebalance to complete. But we hit a race, which caused a command to fail. The particular test that failed is waiting for rebalance to finish. It does this by doing a 'gluster volume rebalance status' command and checking the result. The EXPECT_WITHIN function runs this command till we have a match, the command fails or the timeout happens. For a rebalance status command, glusterd sends a request to the rebalance process (as a brick_op) to get the latest stats. It had done the same in this case as well. But while glusterd was waiting for the reply, the rebalance completed and the process stopped itself. This caused the rpc connection between glusterd and rebalance proc to close. This caused the all pending requests to be unwound as failures. Which in turnlead to the command failing. Do you think we can print the status of the process as 'not-responding' when such a thing happens, instead of failing the command? Pranith I cannot think of a way to avoid this race from within glusterd. For this particular test, we could avoid using the 'rebalance status' command if we directly checked the rebalance process state using its pid etc. I don't particularly approve of this approach, as I think I used the 'rebalance status' command for a reason. But I currently cannot recall the reason, and if cannot come with it soon, I wouldn't mind changing the test to avoid rebalance status. ~kaushal On Thu, May 22, 2014 at 5:22 PM, Justin Clift jus...@gluster.org wrote: On 22/05/2014, at 12:32 PM, Kaushal M wrote: I haven't yet. But I will. Justin, Can I get take a peek inside the vm? Sure. IP: 23.253.57.20 User: root Password: foobar123 The stdout log from the regression test is in /tmp/regression.log. The GlusterFS git repo is in /root/glusterfs. Um, you should be able to find everything else pretty easily. Btw, this is just a temp VM, so feel free to do anything you want with it. When you're finished with it let me know so I can delete it. :) + Justin ~kaushal On Thu, May 22, 2014 at 4:53 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote: Kaushal, Rebalance status command seems to be failing sometimes. I sent a mail about such spurious failure earlier today. Did you get a chance to look at the logs and confirm that rebalance didn't fail and it is indeed a timeout? Pranith - Original Message - From: Kaushal M kshlms...@gmail.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Justin Clift jus...@gluster.org , Gluster Devel gluster-devel@gluster.org Sent: Thursday, May 22, 2014 4:40:25 PM Subject: Re: [Gluster-devel] bug-857330/normal.t failure The test is waiting for rebalance to finish. This is a rebalance with some actual data so it could have taken a long time to finish. I did set a pretty high timeout, but it seems like it's not enough for the new VMs. Possible options are, - Increase this timeout further - Reduce the amount of data. Currently this is 100 directories with 10 files each of size between 10-500KB ~kaushal On Thu, May 22, 2014 at 3:59 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote: Kaushal has more context about these CCed. Keep the setup until he responds so that he can take a look. 
Pranith - Original Message - From: Justin Clift jus...@gluster.org To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Thursday, May 22, 2014 3:54:46 PM Subject: bug-857330/normal.t failure Hi Pranith, Ran a few VM's with your Gerrit CR 7835 applied, and in DEBUG mode (I think). One of the VM's had a failure in bug-857330/normal.t: Test Summary Report --- ./tests/basic/rpm.t (Wstat: 0 Tests: 0 Failed: 0) Parse errors: Bad plan. You planned 8 tests but ran 0. ./tests/bugs/bug-857330/normal.t (Wstat: 0 Tests: 24 Failed: 1) Failed test: 13 Files=230, Tests=4369, 5407 wallclock secs ( 2.13 usr 1.73 sys + 941.82 cusr 645.54 csys = 1591.22 CPU) Result: FAIL Seems to be this test: COMMAND=volume rebalance $V0 status PATTERN=completed EXPECT_WITHIN 300 $PATTERN get-task-status Is this one on your radar already? Btw, this VM is still online. Can give you access to retrieve logs if useful. + Justin -- Open Source and Standards @ Red Hat
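Kaushal's alternative, checking the rebalance process directly instead of issuing "rebalance status", could look roughly like the sketch below. The helper name and the process match pattern are hypothetical, for illustration only:

function rebalance_process_running {
    # assumption: the rebalance daemon's command line contains rebalance/<volname>
    pgrep -f "rebalance/$V0" > /dev/null && echo "Y" || echo "N"
}
EXPECT_WITHIN 300 "N" rebalance_process_running    # done once the rebalance process has exited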
Re: [Gluster-devel] Split-brain present and future in afr
- Original Message - From: Jeff Darcy jda...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Tuesday, May 20, 2014 10:08:12 PM Subject: Re: [Gluster-devel] Split-brain present and future in afr 1. Better protection for split-brain over time. 2. Policy based split-brain resolution. 3. Provide better availability with client quorum and replica 2. I would add the following: (4) Quorum enforcement - any kind - on by default. For replica - 3 we can do that. For replica 2, quorum implementation at the moment is not good enough. Until we fix it correctly may be we should let it be. We can revisit that decision once we come up with better solution for replica 2. (5) Fix the problem of volumes losing quorum because unrelated nodes went down (i.e. implement volume-level quorum). (6) Better tools for users to resolve split brain themselves. Agreed. Already in plan for 3.6. For 3, we are planning to introduce arbiter bricks that can be used to determine quorum. The arbiter bricks will be dummy bricks that host only files that will be updated from multiple clients. This will be achieved by bringing about variable replication count for configurable class of files within a volume. In the case of a replicated volume with one arbiter brick per replica group, certain files that are prone to split-brain will be in 3 bricks (2 data bricks + 1 arbiter brick). All other files will be present in the regular data bricks. For example, when oVirt VM disks are hosted on a replica 2 volume, sanlock is used by oVirt for arbitration. sanloclk lease files will be written by all clients and VM disks are written by only a single client at any given point of time. In this scenario, we can place sanlock lease files on 2 data + 1 arbiter bricks. The VM disk files will only be present on the 2 data bricks. Client quorum is now determined by looking at 3 bricks instead of 2 and we have better protection when network split-brains happen. Constantly filtering requests to use either N or N+1 bricks is going to be complicated and hard to debug. Every data-structure allocation or loop based on replica count will have to be examined, and many will have to be modified. That's a *lot* of places. This also overlaps significantly with functionality that can be achieved with data classification (i.e. supporting multiple replica levels within the same volume). What use case requires that it be implemented within AFR instead of more generally and flexibly? 1) It wouldn't still bring in arbiter for replica 2. 2) That would need more bricks, more processes, more ports. Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr
Please review http://review.gluster.com/7788 submitted to remove the filtering of that error. Pranith - Original Message - From: Harshavardhana har...@harshavardhana.net To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Kaleb KEITHLEY kkeit...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Friday, May 23, 2014 2:12:02 AM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr http://review.gluster.com/#/c/7823/ - the fix here On Thu, May 22, 2014 at 1:41 PM, Harshavardhana har...@harshavardhana.net wrote: Here are the important locations in the XFS tree coming from 2.6.32 branch STATIC int xfs_set_acl(struct inode *inode, int type, struct posix_acl *acl) { struct xfs_inode *ip = XFS_I(inode); unsigned char *ea_name; int error; if (S_ISLNK(inode-i_mode)) I would generally think this is the issue. return -EOPNOTSUPP; STATIC long xfs_vn_fallocate( struct inode*inode, int mode, loff_t offset, loff_t len) { longerror; loff_t new_size = 0; xfs_flock64_t bf; xfs_inode_t *ip = XFS_I(inode); int cmd = XFS_IOC_RESVSP; int attr_flags = XFS_ATTR_NOLOCK; if (mode ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) return -EOPNOTSUPP; STATIC int xfs_ioc_setxflags( xfs_inode_t *ip, struct file *filp, void__user *arg) { struct fsxattr fa; unsigned intflags; unsigned intmask; int error; if (copy_from_user(flags, arg, sizeof(flags))) return -EFAULT; if (flags ~(FS_IMMUTABLE_FL | FS_APPEND_FL | \ FS_NOATIME_FL | FS_NODUMP_FL | \ FS_SYNC_FL)) return -EOPNOTSUPP; Perhaps some sort of system level acl's are being propagated by us over symlinks() ? - perhaps this is the related to the same issue of following symlinks? On Sun, May 18, 2014 at 10:48 AM, Pranith Kumar Karampuri pkara...@redhat.com wrote: Sent the following patch to remove the special treatment of ENOTSUP here: http://review.gluster.org/7788 Pranith - Original Message - From: Kaleb KEITHLEY kkeit...@redhat.com To: gluster-devel@gluster.org Sent: Tuesday, May 13, 2014 8:01:53 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr On 05/13/2014 08:00 AM, Nagaprasad Sathyanarayana wrote: On 05/07/2014 03:44 PM, Pranith Kumar Karampuri wrote: - Original Message - From: Raghavendra Gowdappa rgowd...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Vijay Bellur vbel...@redhat.com, gluster-devel@gluster.org, Anand Avati aav...@redhat.com Sent: Wednesday, May 7, 2014 3:42:16 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr I think with repetitive log message suppression patch being merged, we don't really need gf_log_occasionally (except if they are logged in DEBUG or TRACE levels). That definitely helps. But still, setxattr calls are not supposed to fail with ENOTSUP on FS where we support gluster. If there are special keys which fail with ENOTSUPP, we can conditionally log setxattr failures only when the key is something new? I know this is about EOPNOTSUPP (a.k.a. ENOTSUPP) returned by setxattr(2) for legitimate attrs. But I can't help but wondering if this isn't related to other bugs we've had with, e.g., lgetxattr(2) called on invalid xattrs? E.g. see https://bugzilla.redhat.com/show_bug.cgi?id=765202. We have a hack where xlators communicate with each other by getting (and setting?) invalid xattrs; the posix xlator has logic to filter out invalid xattrs, but due to bugs this hasn't always worked perfectly. It would be interesting to know which xattrs are getting errors and on which fs types. 
FWIW, in a quick perusal of a fairly recent (3.14.3) kernel, in xfs there are only six places where EOPNOTSUPP is returned, none of them related to xattrs. In ext[34] EOPNOTSUPP can be returned if the user_xattr option is not enabled (enabled by default in ext4.) And in the higher level vfs xattr code there are many places where EOPNOTSUPP _might_ be returned, primarily only if subordinate function calls aren't invoked which would clear the default or return a different error. -- Kaleb ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
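Harshavardhana's symlink hypothesis can be probed from a shell against a brick's backend directory. A hedged sketch (the path is the usual test backend; the exact errno returned depends on the xattr namespace and the kernel):

ln -s /nonexistent /d/backends/patchy1/probe-symlink
# Operate on the link itself (-h / --no-dereference), as an xlator following a symlink would:
setfattr -h -n trusted.probe -v 1 /d/backends/patchy1/probe-symlink   # trusted.* is generally accepted
setfattr -h -n user.probe -v 1 /d/backends/patchy1/probe-symlink      # user.* on a symlink is rejected by the kernel
getfattr -h -d -m . /d/backends/patchy1/probe-symlink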
Re: [Gluster-devel] Spurious failure ./tests/bugs/bug-1049834.t [16]
CC gluster-devel
Pranith

- Original Message -
From: Pranith Kumar Karampuri pkara...@redhat.com
To: Avra Sengupta aseng...@redhat.com
Sent: Wednesday, May 28, 2014 6:42:53 AM
Subject: Spurious failure ./tests/bugs/bug-1049834.t [16]

hi Avra,
Could you look into it.

Patch              == http://review.gluster.com/7889/1
Author             == Avra Sengupta aseng...@redhat.com
Build triggered by == amarts
Build-url          == http://build.gluster.org/job/regression/4586/consoleFull
Download-log-at    == http://build.gluster.org:443/logs/regression/glusterfs-logs-20140527:14:51:09.tgz
Test written by    == Author: Avra Sengupta aseng...@redhat.com

./tests/bugs/bug-1049834.t [16]

#!/bin/bash
. $(dirname $0)/../include.rc
. $(dirname $0)/../cluster.rc
. $(dirname $0)/../volume.rc
. $(dirname $0)/../snapshot.rc

cleanup;

1  TEST verify_lvm_version
2  TEST launch_cluster 2
3  TEST setup_lvm 2
4  TEST $CLI_1 peer probe $H2
5  EXPECT_WITHIN $PROBE_TIMEOUT 1 peer_count
6  TEST $CLI_1 volume create $V0 $H1:$L1 $H2:$L2
7  EXPECT 'Created' volinfo_field $V0 'Status'
8  TEST $CLI_1 volume start $V0
9  EXPECT 'Started' volinfo_field $V0 'Status'

#Setting the snap-max-hard-limit to 4
10 TEST $CLI_1 snapshot config $V0 snap-max-hard-limit 4
PID_1=$!
wait $PID_1

#Creating 3 snapshots on the volume (which is the soft-limit)
11 TEST create_n_snapshots $V0 3 $V0_snap
12 TEST snapshot_n_exists $V0 3 $V0_snap

#Creating the 4th snapshot on the volume and expecting it to be created
# but with the deletion of the oldest snapshot i.e 1st snapshot
13 TEST $CLI_1 snapshot create ${V0}_snap4 ${V0}
14 TEST snapshot_exists 1 ${V0}_snap4
15 TEST ! snapshot_exists 1 ${V0}_snap1
***16 TEST $CLI_1 snapshot delete ${V0}_snap4
17 TEST $CLI_1 snapshot create ${V0}_snap1 ${V0}
18 TEST snapshot_exists 1 ${V0}_snap1

#Deleting the 4 snaps
#TEST delete_n_snapshots $V0 4 $V0_snap
#TEST ! snapshot_n_exists $V0 4 $V0_snap

cleanup;

Pranith
___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Spurious failure in ./tests/bugs/bug-948686.t [14, 15, 16]
hi kp,
Could you look into it.

Patch              == http://review.gluster.com/7889/1
Author             == Avra Sengupta aseng...@redhat.com
Build triggered by == amarts
Build-url          == http://build.gluster.org/job/regression/4586/consoleFull
Download-log-at    == http://build.gluster.org:443/logs/regression/glusterfs-logs-20140527:14:51:09.tgz
Test written by    == Author: Krishnan Parthasarathi kpart...@redhat.com

./tests/bugs/bug-948686.t [14, 15, 16]

#!/bin/bash
. $(dirname $0)/../include.rc
. $(dirname $0)/../volume.rc
. $(dirname $0)/../cluster.rc

function check_peers {
    $CLI_1 peer status | grep 'Peer in Cluster (Connected)' | wc -l
}

cleanup;

#setup cluster and test volume
1  TEST launch_cluster 3; # start 3-node virtual cluster
2  TEST $CLI_1 peer probe $H2; # peer probe server 2 from server 1 cli
3  TEST $CLI_1 peer probe $H3; # peer probe server 3 from server 1 cli
4  EXPECT_WITHIN $PROBE_TIMEOUT 2 check_peers;
5  TEST $CLI_1 volume create $V0 replica 2 $H1:$B1/$V0 $H1:$B1/${V0}_1 $H2:$B2/$V0 $H3:$B3/$V0
6  TEST $CLI_1 volume start $V0
7  TEST glusterfs --volfile-server=$H1 --volfile-id=$V0 $M0

#kill a node
8  TEST kill_node 3

#modify volume config to see change in volume-sync
9  TEST $CLI_1 volume set $V0 write-behind off

#add some files to the volume to see effect of volume-heal cmd
10 TEST touch $M0/{1..100};
11 TEST $CLI_1 volume stop $V0;
12 TEST $glusterd_3;
13 EXPECT_WITHIN $PROBE_TIMEOUT 2 check_peers;
***14 TEST $CLI_3 volume start $V0;
***15 TEST $CLI_2 volume stop $V0;
***16 TEST $CLI_2 volume delete $V0;

cleanup;

17 TEST glusterd;
18 TEST $CLI volume create $V0 $H0:$B0/$V0
19 TEST $CLI volume start $V0
pkill glusterd;
pkill glusterfsd;
20 TEST glusterd
21 TEST $CLI volume status $V0

cleanup;
___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
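A hedged guess at how the failing steps 14-16 could be hardened: besides waiting for the peer count, also wait for the restarted glusterd on server 3 to have synced the volume configuration before operating on it. The helper below is hypothetical and only illustrates the idea; it is not necessarily what http://review.gluster.org/7837 does:

function volume_synced_on_3 {
    $CLI_3 volume info $V0 > /dev/null 2>&1 && echo "Y" || echo "N"
}
EXPECT_WITHIN $PROBE_TIMEOUT 2 check_peers;
EXPECT_WITHIN $PROBE_TIMEOUT "Y" volume_synced_on_3
TEST $CLI_3 volume start $V0;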
Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr
Vijay, Could you please merge http://review.gluster.com/7788 if there are no more concerns. Pranith - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Harshavardhana har...@harshavardhana.net Cc: Gluster Devel gluster-devel@gluster.org Sent: Monday, May 26, 2014 1:18:18 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr Please review http://review.gluster.com/7788 submitted to remove the filtering of that error. Pranith - Original Message - From: Harshavardhana har...@harshavardhana.net To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Kaleb KEITHLEY kkeit...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Friday, May 23, 2014 2:12:02 AM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr http://review.gluster.com/#/c/7823/ - the fix here On Thu, May 22, 2014 at 1:41 PM, Harshavardhana har...@harshavardhana.net wrote: Here are the important locations in the XFS tree coming from 2.6.32 branch STATIC int xfs_set_acl(struct inode *inode, int type, struct posix_acl *acl) { struct xfs_inode *ip = XFS_I(inode); unsigned char *ea_name; int error; if (S_ISLNK(inode-i_mode)) I would generally think this is the issue. return -EOPNOTSUPP; STATIC long xfs_vn_fallocate( struct inode*inode, int mode, loff_t offset, loff_t len) { longerror; loff_t new_size = 0; xfs_flock64_t bf; xfs_inode_t *ip = XFS_I(inode); int cmd = XFS_IOC_RESVSP; int attr_flags = XFS_ATTR_NOLOCK; if (mode ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) return -EOPNOTSUPP; STATIC int xfs_ioc_setxflags( xfs_inode_t *ip, struct file *filp, void__user *arg) { struct fsxattr fa; unsigned intflags; unsigned intmask; int error; if (copy_from_user(flags, arg, sizeof(flags))) return -EFAULT; if (flags ~(FS_IMMUTABLE_FL | FS_APPEND_FL | \ FS_NOATIME_FL | FS_NODUMP_FL | \ FS_SYNC_FL)) return -EOPNOTSUPP; Perhaps some sort of system level acl's are being propagated by us over symlinks() ? - perhaps this is the related to the same issue of following symlinks? On Sun, May 18, 2014 at 10:48 AM, Pranith Kumar Karampuri pkara...@redhat.com wrote: Sent the following patch to remove the special treatment of ENOTSUP here: http://review.gluster.org/7788 Pranith - Original Message - From: Kaleb KEITHLEY kkeit...@redhat.com To: gluster-devel@gluster.org Sent: Tuesday, May 13, 2014 8:01:53 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr On 05/13/2014 08:00 AM, Nagaprasad Sathyanarayana wrote: On 05/07/2014 03:44 PM, Pranith Kumar Karampuri wrote: - Original Message - From: Raghavendra Gowdappa rgowd...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Vijay Bellur vbel...@redhat.com, gluster-devel@gluster.org, Anand Avati aav...@redhat.com Sent: Wednesday, May 7, 2014 3:42:16 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr I think with repetitive log message suppression patch being merged, we don't really need gf_log_occasionally (except if they are logged in DEBUG or TRACE levels). That definitely helps. But still, setxattr calls are not supposed to fail with ENOTSUP on FS where we support gluster. If there are special keys which fail with ENOTSUPP, we can conditionally log setxattr failures only when the key is something new? I know this is about EOPNOTSUPP (a.k.a. ENOTSUPP) returned by setxattr(2) for legitimate attrs. But I can't help but wondering if this isn't related to other bugs we've had with, e.g., lgetxattr(2) called on invalid xattrs? E.g. 
see https://bugzilla.redhat.com/show_bug.cgi?id=765202. We have a hack where xlators communicate with each other by getting (and setting?) invalid xattrs; the posix xlator has logic to filter out invalid xattrs, but due to bugs this hasn't always worked perfectly. It would be interesting to know which xattrs are getting errors and on which fs types. FWIW, in a quick perusal of a fairly recent (3.14.3) kernel, in xfs there are only six places where EOPNOTSUPP is returned, none of them related to xattrs. In ext[34] EOPNOTSUPP can be returned
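A quick way to narrow down which keys fail with EOPNOTSUPP, and on which backend filesystems, is to probe a brick directly with setfattr. The sketch below is only illustrative: the brick path and the key list are assumptions, and it simply reports whichever key/target combinations the filesystem rejects with "Operation not supported" (running it on both xfs and ext4 bricks, and on a symlink as well as a regular file, would also exercise the symlink theory above):

    cd /bricks/brick1
    touch probe-file && ln -sf probe-file probe-link
    for key in user.test trusted.glusterfs.test security.selinux; do
        for target in probe-file probe-link; do
            setfattr -h -n "$key" -v probe "$target" 2>&1 |
                grep -qi 'not supported' && echo "$key on $target: EOPNOTSUPP"
        done
    done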
Re: [Gluster-devel] Spurious failure ./tests/bugs/bug-1049834.t [16]
- Original Message - From: Avra Sengupta aseng...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Wednesday, May 28, 2014 5:04:40 PM Subject: Re: [Gluster-devel] Spurious failire ./tests/bugs/bug-1049834.t [16] Pranith am looking into a priority issue for snapshot(https://bugzilla.redhat.com/show_bug.cgi?id=1098045) right now, I will get started with this spurious failure as soon as I finish it, which should be max by eod tomorrow. Thanks for the ack Avra. Pranith Regards, Avra On 05/28/2014 06:46 AM, Pranith Kumar Karampuri wrote: FYI, this test failed more than once yesterday. Same test failed both the times. Pranith - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Avra Sengupta aseng...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Wednesday, May 28, 2014 6:43:52 AM Subject: Re: [Gluster-devel] Spurious failire ./tests/bugs/bug-1049834.t [16] CC gluster-devel Pranith - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Avra Sengupta aseng...@redhat.com Sent: Wednesday, May 28, 2014 6:42:53 AM Subject: Spurious failire ./tests/bugs/bug-1049834.t [16] hi Avra, Could you look into it. Patch == http://review.gluster.com/7889/1 Author== Avra Sengupta aseng...@redhat.com Build triggered by== amarts Build-url == http://build.gluster.org/job/regression/4586/consoleFull Download-log-at == http://build.gluster.org:443/logs/regression/glusterfs-logs-20140527:14:51:09.tgz Test written by == Author: Avra Sengupta aseng...@redhat.com ./tests/bugs/bug-1049834.t [16] #!/bin/bash . $(dirname $0)/../include.rc . $(dirname $0)/../cluster.rc . $(dirname $0)/../volume.rc . $(dirname $0)/../snapshot.rc cleanup; 1 TEST verify_lvm_version 2 TEST launch_cluster 2 3 TEST setup_lvm 2 4 TEST $CLI_1 peer probe $H2 5 EXPECT_WITHIN $PROBE_TIMEOUT 1 peer_count 6 TEST $CLI_1 volume create $V0 $H1:$L1 $H2:$L2 7 EXPECT 'Created' volinfo_field $V0 'Status' 8 TEST $CLI_1 volume start $V0 9 EXPECT 'Started' volinfo_field $V0 'Status' #Setting the snap-max-hard-limit to 4 10 TEST $CLI_1 snapshot config $V0 snap-max-hard-limit 4 PID_1=$! wait $PID_1 #Creating 3 snapshots on the volume (which is the soft-limit) 11 TEST create_n_snapshots $V0 3 $V0_snap 12 TEST snapshot_n_exists $V0 3 $V0_snap #Creating the 4th snapshot on the volume and expecting it to be created # but with the deletion of the oldest snapshot i.e 1st snapshot 13 TEST $CLI_1 snapshot create ${V0}_snap4 ${V0} 14 TEST snapshot_exists 1 ${V0}_snap4 15 TEST ! snapshot_exists 1 ${V0}_snap1 ***16 TEST $CLI_1 snapshot delete ${V0}_snap4 17 TEST $CLI_1 snapshot create ${V0}_snap1 ${V0} 18 TEST snapshot_exists 1 ${V0}_snap1 #Deleting the 4 snaps #TEST delete_n_snapshots $V0 4 $V0_snap #TEST ! snapshot_n_exists $V0 4 $V0_snap cleanup; Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [wireshark] TODO features
- Original Message - From: Vikhyat Umrao vum...@redhat.com To: Niels de Vos nde...@redhat.com Cc: gluster-devel@gluster.org Sent: Wednesday, May 28, 2014 3:37:47 PM Subject: Re: [Gluster-devel] [wireshark] TODO features Hi Niels, Thanks for all your inputs and help, I have submitted a patch: https://code.wireshark.org/review/1833 I have absolutely no idea how this is supposed to work, but just wanted to ask what will the 'name' variable be if the file name is 'EMPTY' i.e. RPC_STRING_EMPTY Pranith glusterfs: show filenames in the summary for common procedures With this patch we will have filename on the summary for procedures MKDIR, CREATE and LOOKUP. Example output: 173 18.309307 192.168.100.3 - 192.168.100.4 GlusterFS 224 MKDIR V330 MKDIR Call, Filename: testdir 2606 36.767766 192.168.100.3 - 192.168.100.4 GlusterFS 376 LOOKUP V330 LOOKUP Call, Filename: 1.txt 2612 36.768242 192.168.100.3 - 192.168.100.4 GlusterFS 228 CREATE V330 CREATE Call, Filename: 1.txt That looks good :-) Pranith Thanks, Vikhyat From: Niels de Vos nde...@redhat.com To: Vikhyat Umrao vum...@redhat.com Cc: gluster-devel@gluster.org Sent: Tuesday, April 29, 2014 11:16:20 PM Subject: Re: [Gluster-devel] [wireshark] TODO features On Tue, Apr 29, 2014 at 06:25:15AM -0400, Vikhyat Umrao wrote: Hi, I am interested in TODO wireshark features for GlusterFS : I can start from below given feature for one procedure: = display the filename or filehandle on the summary for common procedures Things to get you and others prepared: 1. go to https://forge.gluster.org/wireshark/pages/Todo 2. login and edit the wiki page, add your name to the topic 3. clone the wireshark repository: $ git clone g...@forge.gluster.org:wireshark/wireshark.git (you have been added to the 'wireshark' group, so you should have push access over ssh) 4. create a new branch for your testing $ git checkout -t -b wip/master/visible-filenames upstream/master 5. make sure you have all the dependencies for compiling Wireshark (quite a lot are needed) $ ./autogen.sh $ ./configure --disable-wireshark (I tend to build only the commandline tools like 'tshark') $ make 6. you should now have a ./tshark executable that you can use for testing The changes you want to make are in epan/dissectors/packet-glusterfs.c. For example, start with adding the name of the file/dir that is passed to LOOKUP. The work to dissect the data in the network packet is done in glusterfs_gfs3_3_op_lookup_call(). It does not really matter on how that function gets executed, that is more a thing for an other task (add support for new procedures). In the NFS-dissector, you can see how this is done. Check the implementation of the dissect_nfs3_lookup_call() function in epan/dissectors/packet-nfs.c. The col_append_fstr() function achieves what you want to do. Of course, you really should share your changes! Now, 'git commit' your change with a suitable commit message and do $ git push origin wip/master/visible-filenames Your branch should now be visible under https://forge.gluster.org/wireshark/wireshark. Let me know, and I'll give it a whirl. Now you've done the filename for LOOKUP, I'm sure you can think of other things that make sense to get displayed. Do ask questions and send corrections if something is missing, or not working as explained here. This email should probably get included in the projects wiki https://forge.gluster.org/wireshark/pages/Home some where. 
Good luck, Niels ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
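For anyone trying Vikhyat's change, a rough end-to-end check with the freshly built tshark could look like the following; the interface name and the brick port range are assumptions, and the grep simply looks for the new Filename: text in the summary column shown in the example output above:

    # capture ~30 seconds of client traffic against the volume
    sudo ./tshark -i eth0 -f 'tcp portrange 24007-24050' -a duration:30 -w /tmp/gluster.pcap
    # read it back and confirm the summary now carries the filenames
    ./tshark -r /tmp/gluster.pcap | grep -E 'GlusterFS.*(LOOKUP|CREATE|MKDIR)'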
[Gluster-devel] Spurious failure in ./tests/bugs/bug-1038598.t [28]
hi Anuradha, Please look into this. Patch == http://review.gluster.com/#/c/7880/1 Author== Emmanuel Dreyfus m...@netbsd.org Build triggered by== kkeithle Build-url == http://build.gluster.org/job/regression/4603/consoleFull Download-log-at == http://build.gluster.org:443/logs/regression/glusterfs-logs-20140528:18:25:12.tgz Test written by == Author: Anuradha ata...@redhat.com ./tests/bugs/bug-1038598.t [28] 0 #!/bin/bash 1 . $(dirname $0)/../include.rc 2 . $(dirname $0)/../volume.rc 3 4 cleanup; 5 6 TEST glusterd 7 TEST pidof glusterd 8 TEST $CLI volume info; 9 10 TEST $CLI volume create $V0 replica 2 $H0:$B0/${V0}{1,2}; 11 12 function hard_limit() 13 { 14 local QUOTA_PATH=$1; 15 $CLI volume quota $V0 list $QUOTA_PATH | grep $QUOTA_PATH | awk '{print $2}' 16 } 17 18 function soft_limit() 19 { 20 local QUOTA_PATH=$1; 21 $CLI volume quota $V0 list $QUOTA_PATH | grep $QUOTA_PATH | awk '{print $3}' 22 } 23 24 function usage() 25 { 26 local QUOTA_PATH=$1; 27 $CLI volume quota $V0 list $QUOTA_PATH | grep $QUOTA_PATH | awk '{print $4}' 28 } 29 30 function sl_exceeded() 31 { 32 local QUOTA_PATH=$1; 33 $CLI volume quota $V0 list $QUOTA_PATH | grep $QUOTA_PATH | awk '{print $6}' 34 } 35 36 function hl_exceeded() 37 { 38 local QUOTA_PATH=$1; 39 $CLI volume quota $V0 list $QUOTA_PATH | grep $QUOTA_PATH | awk '{print $7}' 40 41 } 42 43 EXPECT $V0 volinfo_field $V0 'Volume Name'; 44 EXPECT 'Created' volinfo_field $V0 'Status'; 45 EXPECT '2' brick_count $V0 46 47 TEST $CLI volume start $V0; 48 EXPECT 'Started' volinfo_field $V0 'Status'; 49 50 TEST $CLI volume quota $V0 enable 51 sleep 5 52 53 TEST glusterfs -s $H0 --volfile-id $V0 $M0; 54 55 TEST mkdir -p $M0/test_dir 56 TEST $CLI volume quota $V0 limit-usage /test_dir 10MB 50 57 58 EXPECT 10.0MB hard_limit /test_dir; 59 EXPECT 50% soft_limit /test_dir; 60 61 TEST dd if=/dev/zero of=$M0/test_dir/file1.txt bs=1M count=4 62 EXPECT 4.0MB usage /test_dir; 63 EXPECT 'No' sl_exceeded /test_dir; 64 EXPECT 'No' hl_exceeded /test_dir; 65 66 TEST dd if=/dev/zero of=$M0/test_dir/file1.txt bs=1M count=6 67 EXPECT 6.0MB usage /test_dir; 68 EXPECT 'Yes' sl_exceeded /test_dir; 69 EXPECT 'No' hl_exceeded /test_dir; 70 71 #set timeout to 0 so that quota gets enforced without any lag 72 TEST $CLI volume set $V0 features.hard-timeout 0 73 TEST $CLI volume set $V0 features.soft-timeout 0 74 75 TEST ! dd if=/dev/zero of=$M0/test_dir/file1.txt bs=1M count=15 76 EXPECT 'Yes' sl_exceeded /test_dir; ***77 EXPECT 'Yes' hl_exceeded /test_dir; 78 79 cleanup; Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
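Not a root cause, but one low-risk way to make check 28 less timing-sensitive would be to poll instead of asserting immediately, since the quota accounting can lag the write slightly. A minimal sketch reusing EXPECT_WITHIN from include.rc and the helpers defined in the test above (the 20-second timeout is only illustrative):

    TEST ! dd if=/dev/zero of=$M0/test_dir/file1.txt bs=1M count=15
    EXPECT_WITHIN 20 'Yes' sl_exceeded /test_dir
    EXPECT_WITHIN 20 'Yes' hl_exceeded /test_dir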
Re: [Gluster-devel] Change in glusterfs[master]: NetBSD build fix for gettext
Done Pranith - Original Message - From: Emmanuel Dreyfus m...@netbsd.org To: Gluster Devel gluster-devel@gluster.org Sent: Thursday, May 29, 2014 1:53:12 PM Subject: Re: [Gluster-devel] Change in glusterfs[master]: NetBSD build fix for gettext http://build.gluster.org/job/regression/4603/consoleFull : FAILED Is it possible to reschedule this test? I feel like something went wrong that is not related to my change. -- Emmanuel Dreyfus http://hcpnet.free.fr/pubz m...@netbsd.org ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] regarding fsetattr
hi, When I run the following program on a fuse mount, it fails with ENOENT. When I look at the mount logs, it prints an error for setattr instead of fsetattr. Does anyone know why the fop comes in as setattr instead of fsetattr?
Log: [2014-05-29 09:33:38.658023] W [fuse-bridge.c:1056:fuse_setattr_cbk] 0-glusterfs-fuse: 2569: SETATTR() gfid:ae44dd74-ff45-42a8-886e-b4ce2373a267 => -1 (No such file or directory)
Program:
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
int main ()
{
        int ret = 0;
        int fd = open ("a.txt", O_CREAT|O_RDWR);
        if (fd < 0)
                printf ("open failed: %s\n", strerror (errno));
        ret = unlink ("a.txt");
        if (ret < 0)
                printf ("unlink failed: %s\n", strerror (errno));
        if (write (fd, "abc", 3) < 0)
                printf ("Not able to print %s\n", strerror (errno));
        ret = fchmod (fd, S_IRUSR|S_IWUSR|S_IXUSR);
        if (ret < 0)
                printf ("fchmod failed %s\n", strerror (errno));
        return 0;
}
Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] regarding fsetattr
- Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Gluster Devel gluster-devel@gluster.org Cc: Brian Foster bfos...@redhat.com Sent: Thursday, May 29, 2014 3:08:33 PM Subject: regarding fsetattr
hi, When I run the following program on a fuse mount, it fails with ENOENT. When I look at the mount logs, it prints an error for setattr instead of fsetattr. Does anyone know why the fop comes in as setattr instead of fsetattr?
Log: [2014-05-29 09:33:38.658023] W [fuse-bridge.c:1056:fuse_setattr_cbk] 0-glusterfs-fuse: 2569: SETATTR() gfid:ae44dd74-ff45-42a8-886e-b4ce2373a267 => -1 (No such file or directory)
Program:
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
int main ()
{
        int ret = 0;
        int fd = open ("a.txt", O_CREAT|O_RDWR);
        if (fd < 0)
                printf ("open failed: %s\n", strerror (errno));
        ret = unlink ("a.txt");
        if (ret < 0)
                printf ("unlink failed: %s\n", strerror (errno));
        if (write (fd, "abc", 3) < 0)
                printf ("Not able to print %s\n", strerror (errno));
        ret = fchmod (fd, S_IRUSR|S_IWUSR|S_IXUSR);
        if (ret < 0)
                printf ("fchmod failed %s\n", strerror (errno));
        return 0;
}
Based on Vijay's inputs I checked in fuse-bridge and this is what I see:
1162        if (fsi->valid & FATTR_FH &&
1163            !(fsi->valid & (FATTR_ATIME|FATTR_MTIME))) {
1164                /* We need no loc if kernel sent us an fd and
1165                 * we are not fiddling with times */
1166                state->fd = FH_TO_FD (fsi->fh);
(gdb)
1167                fuse_resolve_fd_init (state, &state->resolve, state->fd);
1168        } else {
1169                fuse_resolve_inode_init (state, &state->resolve, finh->nodeid);
1170        }
1171
(gdb) p fsi->valid
$4 = 1
(gdb) p (fsi->valid & FATTR_FH)
$5 = 0
(gdb)
fsi->valid doesn't have FATTR_FH. Who is supposed to set it? Pranith
Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] regarding fsetattr
- Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: jGluster Devel gluster-devel@gluster.org Sent: Thursday, May 29, 2014 3:37:37 PM Subject: Re: [Gluster-devel] regarding fsetattr - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: jGluster Devel gluster-devel@gluster.org Cc: Brian Foster bfos...@redhat.com Sent: Thursday, May 29, 2014 3:08:33 PM Subject: regarding fsetattr hi, When I run the following program on fuse mount it fails with ENOENT. When I look at the mount logs, it prints error for setattr instead of fsetattr. Wondering anyone knows why the fop comes as setattr instead of fsetattr. Log: [2014-05-29 09:33:38.658023] W [fuse-bridge.c:1056:fuse_setattr_cbk] 0-glusterfs-fuse: 2569: SETATTR() gfid:ae44dd74-ff45-42a8-886e-b4ce2373a267 = -1 (No such file or directory) Program: #include stdio.h #include unistd.h #include sys/types.h #include sys/stat.h #include fcntl.h #include errno.h #include string.h int main () { int ret = 0; int fd=open(a.txt, O_CREAT|O_RDWR); if (fd 0) printf (open failed: %s\n, strerror(errno)); ret = unlink(a.txt); if (ret 0) printf (unlink failed: %s\n, strerror(errno)); if (write (fd, abc, 3) 0) printf (Not able to print %s\n, strerror (errno)); ret = fchmod (fd, S_IRUSR|S_IWUSR|S_IXUSR); if (ret 0) printf (fchmod failed %s\n, strerror(errno)); return 0; } Based on vijay's inputs I checked in fuse-brige and this is what I see: 1162 if (fsi-valid FATTR_FH 1163 !(fsi-valid (FATTR_ATIME|FATTR_MTIME))) { 1164 /* We need no loc if kernel sent us an fd and 1165 * we are not fiddling with times */ 1166 state-fd = FH_TO_FD (fsi-fh); (gdb) 1167 fuse_resolve_fd_init (state, state-resolve, state-fd); 1168 } else { 1169 fuse_resolve_inode_init (state, state-resolve, finh-nodeid); 1170 } 1171 (gdb) p fsi-valid $4 = 1 (gdb) p (fsi-valid FATTR_FH) $5 = 0 (gdb) fsi-valid doesn't have FATTR_FH. Who is supposed to set it? had a discussion with brian foster on IRC. The issue is that gluster depends on client fd to be passed down to perform the operations where as setattr is sent on an inode from vfs to fuse and since gluster doesn't have any reference to inode once unlink happens, this issue is seen. I will have one more conversation with brian to find what needs to be fixed. Pranith. Pranith Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
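For anyone who wants to reproduce this while the fix is worked out, a minimal sketch; the mount point and log path are assumptions, and the C program is the one quoted above saved as fchmod-unlinked.c:

    gcc -o /tmp/fchmod-unlinked fchmod-unlinked.c
    cd /mnt/glusterfs && /tmp/fchmod-unlinked   # prints: fchmod failed No such file or directory
    grep 'SETATTR()' /var/log/glusterfs/mnt-glusterfs.log | tail -n 1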
[Gluster-devel] Initiative to increase developer participation
hi, We are taking an initiative to come up with some easy bugs that volunteers in the community can send patches for, with our help. Goals of this initiative: - Each maintainer comes up with a list of bugs that are easy to fix in their components. - All the developers who are already active in the community help the newcomers by answering their questions. - Improve developer documentation to address FAQs. - Over time, turn these newcomers into experienced glusterfs developers :-) Maintainers, could you please come up with the initial list of bugs by next Wednesday, before the community meeting? Niels, could you send out the guidelines for marking bugs as easy-fix, as well as the wiki link for backports. PS: This is not just for newcomers to the community but also for existing developers to explore other components. Please feel free to suggest and give feedback to improve this process :-). Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Spurious failure in tests/basic/bd.t [22, 23, 24, 25]
- Original Message - From: Bharata B Rao bharata@gmail.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: jGluster Devel gluster-devel@gluster.org, M. Mohan Kumar mohankuma...@gmail.com Sent: Friday, May 30, 2014 8:28:15 AM Subject: Re: [Gluster-devel] Spurious failure in tests/basic/bd.t [22, 23, 24, 25] CC'ing to the correct ID of Mohan Thanks! Pranith On Fri, May 30, 2014 at 5:45 AM, Pranith Kumar Karampuri pkara...@redhat.com wrote: hi Mohan, Could you please look into this: Patch == http://review.gluster.com/#/c/7926/1 Author== Avra Sengupta aseng...@redhat.com Build triggered by== amarts Build-url == http://build.gluster.org/job/regression/4615/consoleFull Download-log-at == http://build.gluster.org:443/logs/regression/glusterfs-logs-20140529:10:51:46.tgz Test written by == Author: M. Mohan Kumar mo...@in.ibm.com ./tests/basic/bd.t [22, 23, 24, 25] 0 #!/bin/bash 1 2 . $(dirname $0)/../include.rc 3 4 function execute() 5 { 6 cmd=$1 7 shift 8 ${cmd} $@ /dev/null 21 9 } 10 11 function bd_cleanup() 12 { 13 execute vgremove -f ${V0} 14 execute pvremove ${ld} 15 execute losetup -d ${ld} 16 execute rm ${BD_DISK} 17 cleanup 18 } 19 20 function check() 21 { 22 if [ $? -ne 0 ]; then 23 echo prerequsite $@ failed 24 bd_cleanup 25 exit 26 fi 27 } 28 29 SIZE=256 #in MB 30 31 bd_cleanup; 32 33 ## Configure environment needed for BD backend volumes 34 ## Create a file with configured size and 35 ## set it as a temporary loop device to create 36 ## physical volume VG. These are basic things needed 37 ## for testing BD xlator if anyone of these steps fail, 38 ## test script exits 39 function configure() 40 { 41 GLDIR=`$CLI system:: getwd` 42 BD_DISK=${GLDIR}/bd_disk 43 44 execute truncate -s${SIZE}M ${BD_DISK} 45 check ${BD_DISK} creation 46 47 execute losetup -f 48 check losetup 49 ld=`losetup -f` 50 51 execute losetup ${ld} ${BD_DISK} 52 check losetup ${BD_DISK} 53 execute pvcreate -f ${ld} 54 check pvcreate ${ld} 55 execute vgcreate ${V0} ${ld} 56 check vgcreate ${V0} 57 execute lvcreate --thin ${V0}/pool --size 128M 58 } 59 60 function volinfo_field() 61 { 62 local vol=$1; 63 local field=$2; 64 $CLI volume info $vol | grep ^$field: | sed 's/.*: //'; 65 } 66 67 function volume_type() 68 { 69 getfattr -n volume.type $M0/. 
--only-values --absolute-names -e text 70 } 71 72 TEST glusterd 73 TEST pidof glusterd 74 configure 75 76 TEST $CLI volume create $V0 ${H0}:/$B0/$V0?${V0} 77 EXPECT $V0 volinfo_field $V0 'Volume Name'; 78 EXPECT 'Created' volinfo_field $V0 'Status'; 79 80 ## Start volume and verify 81 TEST $CLI volume start $V0; 82 EXPECT 'Started' volinfo_field $V0 'Status' 83 84 TEST glusterfs --volfile-id=/$V0 --volfile-server=$H0 $M0 85 EXPECT '1' volume_type 86 87 ## Create posix file 88 TEST touch $M0/posix 89 90 TEST touch $M0/lv 91 gfid=`getfattr -n glusterfs.gfid.string $M0/lv --only-values --absolute-names` 92 TEST setfattr -n user.glusterfs.bd -v lv:4MB $M0/lv 93 # Check if LV is created 94 TEST stat /dev/$V0/${gfid} 95 96 ## Create filesystem 97 sleep 1 98 TEST mkfs.ext4 -qF $M0/lv 99 # Cloning 100 TEST touch $M0/lv_clone 101 gfid=`getfattr -n glusterfs.gfid.string $M0/lv_clone --only-values --absolute-names` 102 TEST setfattr -n clone -v ${gfid} $M0/lv 103 TEST stat /dev/$V0/${gfid} 104 105 sleep 1 106 ## Check mounting 107 TEST mount -o loop $M0/lv $M1 108 umount $M1 109 110 # Snapshot 111 TEST touch $M0/lv_sn 112 gfid=`getfattr -n glusterfs.gfid.string $M0/lv_sn --only-values --absolute-names` 113 TEST setfattr -n snapshot -v ${gfid} $M0/lv 114 TEST stat /dev/$V0/${gfid} 115 116 # Merge 117 sleep 1 **118 TEST setfattr -n merge -v $M0/lv_sn $M0/lv_sn **119 TEST ! stat $M0/lv_sn **120 TEST ! stat /dev/$V0/${gfid} 121 122 123 rm $M0/* -f 124 **125 TEST umount $M0 126 TEST $CLI volume stop ${V0} 127 EXPECT 'Stopped' volinfo_field $V0 'Status'; 128 TEST $CLI volume delete ${V0} 129 130 bd_cleanup Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http
Re: [Gluster-devel] Initiative to increase developer participation
CC gluster-devel Pranith - Original Message - From: HUANG Qiulan huan...@ihep.ac.cn To: Pranith Kumar Karampuri pkara...@redhat.com Sent: Friday, May 30, 2014 9:10:54 AM Subject: Re: [Gluster-devel] Initiative to increase developer participation Hi Pranith, I'm glad to participate in the Gluster developer team. To introduce myself briefly: I'm a staff member of the Computing Center, Institute of High Energy Physics, Chinese Academy of Sciences. I have deployed Gluster 3.2.7 in our computing farm with 5 servers, which provides about 315TB of storage services for physicists. For the production package, I have made many changes, such as to data distribution, optimizing the lookup request to send requests only to the hash and hash+1 bricks instead of all bricks, and so on. Recently, I developed a distributed metadata service for Gluster which is being tested. I hope you are interested in the work I have done. Thank you. Cheers, Qiulan Computing Center, the Institute of High Energy Physics, China Huang, Qiulan Tel: (+86) 10 8823 6010-105 P.O. Box 918-7 Fax: (+86) 10 8823 6839 Beijing 100049 P.R. China Email: huan...@ihep.ac.cn === - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com Sent: Friday, May 30, 2014 To: Gluster Devel gluster-devel@gluster.org, gluster-users gluster-us...@gluster.org Cc: Kaushal Madappa kmada...@redhat.com Subject: [Gluster-devel] Initiative to increase developer participation hi, We are taking an initiative to come up with some easy bugs that volunteers in the community can send patches for, with our help. Goals of this initiative: - Each maintainer comes up with a list of bugs that are easy to fix in their components. - All the developers who are already active in the community help the newcomers by answering their questions. - Improve developer documentation to address FAQs. - Over time, turn these newcomers into experienced glusterfs developers :-) Maintainers, could you please come up with the initial list of bugs by next Wednesday, before the community meeting? Niels, could you send out the guidelines for marking bugs as easy-fix, as well as the wiki link for backports. PS: This is not just for newcomers to the community but also for existing developers to explore other components. Please feel free to suggest and give feedback to improve this process :-). Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr
- Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Vijay Bellur vbel...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Wednesday, May 28, 2014 4:16:32 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr Vijay, Could you please merge http://review.gluster.com/7788 if there are no more concerns. Gentle reminder. Pranith. Pranith - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Harshavardhana har...@harshavardhana.net Cc: Gluster Devel gluster-devel@gluster.org Sent: Monday, May 26, 2014 1:18:18 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr Please review http://review.gluster.com/7788 submitted to remove the filtering of that error. Pranith - Original Message - From: Harshavardhana har...@harshavardhana.net To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Kaleb KEITHLEY kkeit...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Friday, May 23, 2014 2:12:02 AM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr http://review.gluster.com/#/c/7823/ - the fix here On Thu, May 22, 2014 at 1:41 PM, Harshavardhana har...@harshavardhana.net wrote: Here are the important locations in the XFS tree coming from 2.6.32 branch STATIC int xfs_set_acl(struct inode *inode, int type, struct posix_acl *acl) { struct xfs_inode *ip = XFS_I(inode); unsigned char *ea_name; int error; if (S_ISLNK(inode-i_mode)) I would generally think this is the issue. return -EOPNOTSUPP; STATIC long xfs_vn_fallocate( struct inode*inode, int mode, loff_t offset, loff_t len) { longerror; loff_t new_size = 0; xfs_flock64_t bf; xfs_inode_t *ip = XFS_I(inode); int cmd = XFS_IOC_RESVSP; int attr_flags = XFS_ATTR_NOLOCK; if (mode ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) return -EOPNOTSUPP; STATIC int xfs_ioc_setxflags( xfs_inode_t *ip, struct file *filp, void__user *arg) { struct fsxattr fa; unsigned intflags; unsigned intmask; int error; if (copy_from_user(flags, arg, sizeof(flags))) return -EFAULT; if (flags ~(FS_IMMUTABLE_FL | FS_APPEND_FL | \ FS_NOATIME_FL | FS_NODUMP_FL | \ FS_SYNC_FL)) return -EOPNOTSUPP; Perhaps some sort of system level acl's are being propagated by us over symlinks() ? - perhaps this is the related to the same issue of following symlinks? On Sun, May 18, 2014 at 10:48 AM, Pranith Kumar Karampuri pkara...@redhat.com wrote: Sent the following patch to remove the special treatment of ENOTSUP here: http://review.gluster.org/7788 Pranith - Original Message - From: Kaleb KEITHLEY kkeit...@redhat.com To: gluster-devel@gluster.org Sent: Tuesday, May 13, 2014 8:01:53 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr On 05/13/2014 08:00 AM, Nagaprasad Sathyanarayana wrote: On 05/07/2014 03:44 PM, Pranith Kumar Karampuri wrote: - Original Message - From: Raghavendra Gowdappa rgowd...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Vijay Bellur vbel...@redhat.com, gluster-devel@gluster.org, Anand Avati aav...@redhat.com Sent: Wednesday, May 7, 2014 3:42:16 PM Subject: Re: [Gluster-devel] regarding special treatment of ENOTSUP for setxattr I think with repetitive log message suppression patch being merged, we don't really need gf_log_occasionally (except if they are logged in DEBUG or TRACE levels). That definitely helps. But still, setxattr calls are not supposed to fail with ENOTSUP on FS where we support gluster. 
If there are special keys which fail with ENOTSUPP, we can conditionally log setxattr failures only when the key is something new? I know this is about EOPNOTSUPP (a.k.a. ENOTSUPP) returned by setxattr(2) for legitimate attrs. But I can't help but wondering if this isn't related to other bugs we've had with, e.g., lgetxattr(2) called on invalid xattrs? E.g. see https://bugzilla.redhat.com/show_bug.cgi?id=765202. We have a hack where xlators communicate
Re: [Gluster-devel] All builds are failing with BUILD ERROR
Guys its failing again with the same error: Please proceed with configuring, compiling, and installing. rm: cannot remove `/build/install/var/run/gluster/patchy': Device or resource busy + RET=1 + '[' 1 '!=' 0 ']' + VERDICT='BUILD FAILURE' Pranith On 06/02/2014 09:08 PM, Justin Clift wrote: On 02/06/2014, at 7:04 AM, Kaleb KEITHLEY wrote: snip someone cleaned the loopback devices. I deleted 500 unix domain sockets in /d/install/var/run and requeued the regressions. Interesting. The extra sockets problem is what prompted me to rewrite the cleanup function. The sockets are being created by glusterd during each test startup, but aren't removed by the existing cleanup function. (so, substantial build up over time) I'm not sure which of those two things was the solution. _Probably_ the loopback device thing. The extra sockets seem to be messy but (so far) I haven't seen them break anything. + Justin -- Open Source and Standards @ Red Hat twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
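In case it helps whoever scripts the slave cleanup, something along these lines between runs would cover both the busy install tree and the stale sockets. This is only a sketch: the loop-device step is an assumption based on the earlier cleanup Kaleb mentioned, and the paths are the ones from the errors above:

    pkill -f gluster || true
    # unmount anything still holding /build/install, deepest paths first
    grep /build/install /proc/mounts | awk '{print $2}' | sort -r | while read -r m; do
        umount -f "$m" || umount -l "$m"
    done
    # detach leftover loopback devices and remove stale unix sockets
    losetup -a | cut -d: -f1 | xargs -r -n1 losetup -d
    rm -rf /build/install/var/run/gluster/* /d/install/var/run/*.socket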
[Gluster-devel] Erasure coding doubts session
hi Xavier, Some of the developers are reading the code you submitted for erasure code. We want to know if you would be available on Friday IST so that we can have a discussion and doubt clarification session on IRC. Could you tell which time is good for you. Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] Need testers for GlusterFS 3.4.4
On 06/04/2014 01:35 AM, Ben Turner wrote: - Original Message - From: Justin Clift jus...@gluster.org To: Ben Turner btur...@redhat.com Cc: James purplei...@gmail.com, gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org Sent: Thursday, May 29, 2014 6:12:40 PM Subject: Re: [Gluster-users] [Gluster-devel] Need testers for GlusterFS 3.4.4 On 29/05/2014, at 8:04 PM, Ben Turner wrote: From: James purplei...@gmail.com Sent: Wednesday, May 28, 2014 5:21:21 PM On Wed, May 28, 2014 at 5:02 PM, Justin Clift jus...@gluster.org wrote: Hi all, Are there any Community members around who can test the GlusterFS 3.4.4 beta (rpms are available)? I've provided all the tools and how-to to do this yourself. Should probably take about ~20 min. Old example: https://ttboj.wordpress.com/2014/01/16/testing-glusterfs-during-glusterfest/ Same process should work, except base your testing on the latest vagrant article: https://ttboj.wordpress.com/2014/05/13/vagrant-on-fedora-with-libvirt-reprise/ If you haven't set it up already. I can help out here, I'll have a chance to run through some stuff this weekend. Where should I post feedback? Excellent Ben! Please send feedback to gluster-devel. :) So far so good on 3.4.4, sorry for the delay here. I had to fix my downstream test suites to run outside of RHS / downstream gluster. I did basic sanity testing on glusterfs mounts including: FSSANITY_TEST_LIST: arequal bonnie glusterfs_build compile_kernel dbench dd ffsb fileop fsx fs_mark iozone locks ltp multiple_files posix_compliance postmark read_large rpc syscallbench tiobench I am starting on NFS now, I'll have results tonight or tomorrow morning. I'll look updating the component scripts to work and run them as well. Thanks a lot for this ben. Justin, Ben, Do you think we can automate running of these scripts without a lot of human intervention? If yes, how can I help? We can use that just before making any release in future :-). Pranith -b + Justin -- Open Source and Standards @ Red Hat twitter.com/realjustinclift ___ Gluster-users mailing list gluster-us...@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
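To get the ball rolling on automating that, a rough sketch of the kind of wrapper that could run before each release; the suite names are the ones Ben lists above, while the runner script, server and volume names are placeholders:

    SERVER=server1.example.com
    VOLUME=testvol
    SUITES="arequal bonnie dbench dd fileop fsx fs_mark iozone locks ltp posix_compliance postmark"
    for s in $SUITES; do
        ./run-suite.sh "$s" --server "$SERVER" --volume "$VOLUME" > "/var/log/sanity-$s.log" 2>&1 ||
            echo "$s FAILED"
    done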
Re: [Gluster-devel] Regarding doing away with refkeeper in locks xlator
On 06/04/2014 11:37 AM, Krutika Dhananjay wrote: Hi, Recently there was a crash in locks translator (BZ 1103347, BZ 1097102) with the following backtrace: (gdb) bt #0 uuid_unpack (in=0x8 Address 0x8 out of bounds, uu=0x7fffea6c6a60) at ../../contrib/uuid/unpack.c:44 #1 0x7feeba9e19d6 in uuid_unparse_x (uu=value optimized out, out=0x2350fc0 081bbc7a-7551-44ac-85c7-aad5e2633db9, fmt=0x7feebaa08e00 %08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x) at ../../contrib/uuid/unparse.c:55 #2 0x7feeba9be837 in uuid_utoa (uuid=0x8 Address 0x8 out of bounds) at common-utils.c:2138 #3 0x7feeb06e8a58 in pl_inodelk_log_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:396 #4 pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:428 #5 0x7feeb06ddf3a in pl_client_disconnect_cbk (this=0x230d910, client=value optimized out) at posix.c:2550 #6 0x7feeba9fa2dd in gf_client_disconnect (client=0x27724a0) at client_t.c:368 #7 0x7feeab77ed48 in server_connection_cleanup (this=0x2316390, client=0x27724a0, flags=value optimized out) at server-helpers.c:354 #8 0x7feeab77ae2c in server_rpc_notify (rpc=value optimized out, xl=0x2316390, event=value optimized out, data=0x2bf51c0) at server.c:527 #9 0x7feeba775155 in rpcsvc_handle_disconnect (svc=0x2325980, trans=0x2bf51c0) at rpcsvc.c:720 #10 0x7feeba776c30 in rpcsvc_notify (trans=0x2bf51c0, mydata=value optimized out, event=value optimized out, data=0x2bf51c0) at rpcsvc.c:758 #11 0x7feeba778638 in rpc_transport_notify (this=value optimized out, event=value optimized out, data=value optimized out) at rpc-transport.c:512 #12 0x7feeb115e971 in socket_event_poll_err (fd=value optimized out, idx=value optimized out, data=0x2bf51c0, poll_in=value optimized out, poll_out=0, poll_err=0) at socket.c:1071 #13 socket_event_handler (fd=value optimized out, idx=value optimized out, data=0x2bf51c0, poll_in=value optimized out, poll_out=0, poll_err=0) at socket.c:2240 #14 0x7feeba9fc6a7 in event_dispatch_epoll_handler (event_pool=0x22e2d00) at event-epoll.c:384 #15 event_dispatch_epoll (event_pool=0x22e2d00) at event-epoll.c:445 #16 0x00407e93 in main (argc=19, argv=0x7fffea6c7f88) at glusterfsd.c:2023 (gdb) f 4 #4 pl_inodelk_client_cleanup (this=0x230d910, ctx=0x7fee700f0c60) at inodelk.c:428 428pl_inodelk_log_cleanup (l); (gdb) p l-pl_inode-refkeeper $1 = (inode_t *) 0x0 (gdb) pl_inode-refkeeper was found to be NULL even when there were some blocked inodelks in a certain domain of the inode, which when dereferenced by the epoll thread in the cleanup codepath led to a crash. On inspecting the code (for want of a consistent reproducer), three things were found: 1. The function where the crash happens (pl_inodelk_log_cleanup()), makes an attempt to resolve the inode to path as can be seen below. But the way inode_path() itself works is to first construct the path based on the given inode's ancestry and place it in the buffer provided. And if all else fails, the gfid of the inode is placed in a certain format (gfid:%s). This eliminates the need for statements from line 4 through 7 below, thereby preventing dereferencing of pl_inode-refkeeper. Now, although this change prevents the crash altogether, it still does not fix the race that led to pl_inode-refkeeper becoming NULL, and comes at the cost of printing (null) in the log message on line 9 every time pl_inode-refkeeper is found to be NULL, rendering the logged messages somewhat useless. 
code 0 pl_inode = lock-pl_inode; 1 2 inode_path (pl_inode-refkeeper, NULL, path); 3 4 if (path) 5 file = path; 6 else 7 file = uuid_utoa (pl_inode-refkeeper-gfid); 8 9 gf_log (THIS-name, GF_LOG_WARNING, 10 releasing lock on %s held by 11 {client=%p, pid=%PRId64 lk-owner=%s}, 12 file, lock-client, (uint64_t) lock-client_pid, 13 lkowner_utoa (lock-owner)); \code I think this logging code is from the days when gfid handle concept was not there. So it wasn't returning gfid:gfid-str in cases the path is not present in the dentries. I believe the else block can be deleted safely now. Pranith 2. There is at least one codepath found that can lead to this crash: Imagine an inode on which an inodelk operation is attempted by a client and is successfully granted too. Now, between the time the lock was granted and pl_update_refkeeper() was called by this thread, the client could send a DISCONNECT event, causing cleanup codepath to be executed, where the epoll thread crashes on dereferencing pl_inode-refkeeper which is STILL NULL at this point. Besides, there are still places in locks xlator where the refkeeper is NOT updated whenever the lists are modified - for instance in the cleanup codepath from a
Re: [Gluster-devel] [Gluster-users] Need testers for GlusterFS 3.4.4
On 06/04/2014 07:44 PM, Ben Turner wrote: - Original Message - From: Justin Clift jus...@gluster.org To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Ben Turner btur...@redhat.com, gluster-us...@gluster.org, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, June 4, 2014 9:35:47 AM Subject: Re: [Gluster-users] [Gluster-devel] Need testers for GlusterFS 3.4.4 On 04/06/2014, at 6:33 AM, Pranith Kumar Karampuri wrote: On 06/04/2014 01:35 AM, Ben Turner wrote: Sent: Thursday, May 29, 2014 6:12:40 PM snip FSSANITY_TEST_LIST: arequal bonnie glusterfs_build compile_kernel dbench dd ffsb fileop fsx fs_mark iozone locks ltp multiple_files posix_compliance postmark read_large rpc syscallbench tiobench I am starting on NFS now, I'll have results tonight or tomorrow morning. I'll look updating the component scripts to work and run them as well. Thanks a lot for this ben. Justin, Ben, Do you think we can automate running of these scripts without a lot of human intervention? If yes, how can I help? We can use that just before making any release in future :-). It's a decent idea. :) Do you have time to get this up and running? Yep, can do. I'll see what else I can get going as well, I'll start with the sanity tests I mentioned above and go from there. How often do we want these run? Daily? Weekly? On GIT checkin? Only on RC? How long does it take to run them? Pranith -b + Justin -- Open Source and Standards @ Red Hat twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Shall we revert quota-anon-fd.t?
hi, I see that quota-anon-fd.t is causing too many spurious failures. I think we should revert it and raise a bug so that it can be fixed and committed again along with the fix. Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
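If it is reverted, the usual flow would be roughly the following; the commit id is a placeholder, and rfc.sh is the Gerrit submission script in the glusterfs tree:

    git revert <commit-that-added-quota-anon-fd.t>
    # file a bug for the spurious failure and reference it in the commit message, then:
    ./rfc.sh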
[Gluster-devel] Please use http://build.gluster.org/job/rackspace-regression/
hi Guys, Rackspace slaves are in action now, thanks to Justin. Please use the URL in Subject to run the regressions. I already shifted some jobs to rackspace. Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] spurious regression failure in tests/bugs/bug-1104642.t
Thanks a lot for quick resolution Sachin Pranith On 06/12/2014 04:38 PM, Sachin Pandit wrote: http://review.gluster.org/#/c/8041/ is merged upstream. ~ Sachin. - Original Message - From: Sachin Pandit span...@redhat.com To: Raghavendra Talur rta...@redhat.com Cc: Pranith Kumar Karampuri pkara...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Thursday, June 12, 2014 12:58:44 PM Subject: Re: [Gluster-devel] spurious regression failure in tests/bugs/bug-1104642.t Patch link http://review.gluster.org/#/c/8041/. ~ Sachin. - Original Message - From: Raghavendra Talur rta...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Sachin Pandit span...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Thursday, June 12, 2014 10:46:14 AM Subject: Re: [Gluster-devel] spurious regression failure in tests/bugs/bug-1104642.t Sachin and I looked at the failure. Current guess is that glusterd_2 had not yet completed the handshake with glusterd_1 and hence did not know about the option set. KP suggested that instead of having a sleep before this command, we could get peer status and verify that it is 1 and then get the vol info. Although even this does not make the test fully deterministic, we will be closer to it. Sachin will send out a patch for the same. Raghavendra Talur - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Sachin Pandit span...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Thursday, June 12, 2014 9:54:03 AM Subject: Re: [Gluster-devel] spurious regression failure in tests/bugs/bug-1104642.t Check the logs to find the reason. Pranith. On 06/12/2014 09:24 AM, Sachin Pandit wrote: I am not hitting this even after running the test case in a loop. I'll update in this thread once I find out the root cause of the failure. ~ Sachin - Original Message - From: Sachin Pandit span...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Thursday, June 12, 2014 8:50:40 AM Subject: Re: [Gluster-devel] spurious regression failure in tests/bugs/bug-1104642.t I will look into this. - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Gluster Devel gluster-devel@gluster.org Cc: rta...@redhat.com, span...@redhat.com Sent: Wednesday, June 11, 2014 9:08:44 PM Subject: spurious regression failure in tests/bugs/bug-1104642.t Raghavendra/Sachin, Could one of you guys take a look at this please. pk1@localhost - ~/workspace/gerrit-repo (master) 21:04:46 :) ⚡ ~/.scripts/regression.py http://build.gluster.org/job/regression/4831/consoleFull Patch == http://review.gluster.com/#/c/7994/2 Author == Raghavendra Talur rta...@redhat.com Build triggered by == amarts Build-url == http://build.gluster.org/job/regression/4831/consoleFull Download-log-at == http://build.gluster.org:443/logs/regression/glusterfs-logs-20140611:08:39:04.tgz Test written by == Author: Sachin Pandit span...@redhat.com ./tests/bugs/bug-1104642.t [13] 0 #!/bin/bash 1 2 . $(dirname $0)/../include.rc 3 . $(dirname $0)/../volume.rc 4 . 
$(dirname $0)/../cluster.rc 5 6 7 function get_value() 8 { 9 local key=$1 10 local var=CLI_$2 11 12 eval cli_index=\$$var 13 14 $cli_index volume info | grep ^$key\ 15 | sed 's/.*: //' 16 } 17 18 cleanup 19 20 TEST launch_cluster 2 21 22 TEST $CLI_1 peer probe $H2; 23 EXPECT_WITHIN $PROBE_TIMEOUT 1 peer_count 24 25 TEST $CLI_1 volume create $V0 $H1:$B1/${V0}0 $H2:$B2/${V0}1 26 EXPECT $V0 get_value 'Volume Name' 1 27 EXPECT Created get_value 'Status' 1 28 29 TEST $CLI_1 volume start $V0 30 EXPECT Started get_value 'Status' 1 31 32 #Bring down 2nd glusterd 33 TEST kill_glusterd 2 34 35 #set the volume all options from the 1st glusterd 36 TEST $CLI_1 volume set all cluster.server-quorum-ratio 80 37 38 #Bring back the 2nd glusterd 39 TEST $glusterd_2 40 41 #Verify whether the value has been synced 42 EXPECT '80' get_value 'cluster.server-quorum-ratio' 1 ***43 EXPECT '80' get_value 'cluster.server-quorum-ratio' 2 44 45 cleanup; Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
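For reference, KP's suggestion translated into the test's own vocabulary would look roughly like this; peer_count and PROBE_TIMEOUT are the helpers already used earlier in the same script, and the sketch only narrows the race rather than removing it:

    #Bring back the 2nd glusterd
    TEST $glusterd_2
    #wait for the restarted glusterd to rejoin and finish its handshake
    EXPECT_WITHIN $PROBE_TIMEOUT 1 peer_count
    #Verify whether the value has been synced
    EXPECT '80' get_value 'cluster.server-quorum-ratio' 1
    EXPECT '80' get_value 'cluster.server-quorum-ratio' 2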
Re: [Gluster-devel] Spurious regression test failure in ./tests/bugs/bug-1101143.t
Thanks for reporting. Will take a look. Pranith On 06/12/2014 05:52 PM, Raghavendra Talur wrote: Hi Pranith, This test failed for my patch set today and seems to be a spurious failure. Here is the console output for the run. http://build.gluster.org/job/rackspace-regression/107/consoleFull Could you please have a look at it? -- Thanks! Raghavendra Talur | Red Hat Storage Developer | Bangalore |+918039245176 ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] glusterfs split-brain problem
hi, Could you let us know the exact problem you are running into? Pranith On 06/13/2014 09:27 AM, Krishnan Parthasarathi wrote: Hi, Pranith, who is the AFR maintainer, would be the best person to answer this question. CC'ing Pranith and gluster-devel. Krish - Original Message - hi Krishnan Parthasarathi, Could you tell me which glusterfs version has significant improvements for the glusterfs split-brain problem? Can you point me to the relevant links? Thank you very much! justgluste...@gmail.com ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Want more spurious regression failure alerts... ?
On 06/13/2014 06:41 PM, Justin Clift wrote: Hi Pranith, Do you want me to keep sending you spurious regression failure notification? There's a fair few of them isn't there? I am doing one run on my VM. I will get back with the ones that fail on my VM. You can also do the same on your machine. Give the output of for i in `cat problematic-ones.txt`; do echo $i $(git log $i | grep Author| tail -1); done Maybe we should make 1 BZ for the lot, and attach the logs to that BZ for later analysis? I am already using 1092850 for this. + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Want more spurious regression failure alerts... ?
On 06/15/2014 03:55 PM, Justin Clift wrote: On 15/06/2014, at 3:36 AM, Pranith Kumar Karampuri wrote: On 06/13/2014 06:41 PM, Justin Clift wrote: Hi Pranith, Do you want me to keep sending you spurious regression failure notification? There's a fair few of them isn't there? I am doing one run on my VM. I will get back with the ones that fail on my VM. You can also do the same on your machine. Cool, that should help. :) These are the spurious failures found when running the rackspace-regression-2G tests over friday and yesterday: * bug-859581.t -- SPURIOUS * 4846 - http://slave24.cloud.gluster.org/logs/glusterfs-logs-20140614:14:33:41.tgz * 6009 - http://slave20.cloud.gluster.org/logs/glusterfs-logs-20140613:20:24:58.tgz * 6652 - http://slave22.cloud.gluster.org/logs/glusterfs-logs-20140613:22:04:16.tgz * 7796 - http://slave20.cloud.gluster.org/logs/glusterfs-logs-20140614:14:22:53.tgz * 7987 - http://slave22.cloud.gluster.org/logs/glusterfs-logs-20140613:15:21:04.tgz * 7992 - http://slave10.cloud.gluster.org/logs/glusterfs-logs-20140613:20:21:15.tgz * 8014 - http://slave24.cloud.gluster.org/logs/glusterfs-logs-20140613:20:39:01.tgz * 8054 - http://slave24.cloud.gluster.org/logs/glusterfs-logs-20140613:13:15:50.tgz * 8062 - http://slave10.cloud.gluster.org/logs/glusterfs-logs-20140613:13:28:48.tgz Xavi, Please review http://review.gluster.org/8069 * mgmt_v3-locks.t -- SPURIOUS * 6483 - build.gluster.org - http://build.gluster.org/job/regression/4847/consoleFull * 6630 - http://slave22.cloud.gluster.org/logs/glusterfs-logs-20140614:15:42:39.tgz * 6946 - http://slave21.cloud.gluster.org/logs/glusterfs-logs-20140613:20:57:27.tgz * 7392 - http://slave21.cloud.gluster.org/logs/glusterfs-logs-20140613:13:57:20.tgz * 7852 - http://slave24.cloud.gluster.org/logs/glusterfs-logs-20140613:19:23:17.tgz * 8014 - http://slave24.cloud.gluster.org/logs/glusterfs-logs-20140613:20:39:01.tgz * 8015 - http://slave23.cloud.gluster.org/logs/glusterfs-logs-20140613:14:26:01.tgz * 8048 - http://slave24.cloud.gluster.org/logs/glusterfs-logs-20140613:18:13:07.tgz Avra, Could you take a look. * bug-918437-sh-mtime.t -- SPURIOUS * 6459 - http://slave21.cloud.gluster.org/logs/glusterfs-logs-20140614:18:28:43.tgz * 7493 - http://slave22.cloud.gluster.org/logs/glusterfs-logs-20140613:10:30:16.tgz * 7987 - http://slave10.cloud.gluster.org/logs/glusterfs-logs-20140613:14:23:02.tgz * 7992 - http://slave10.cloud.gluster.org/logs/glusterfs-logs-20140613:20:21:15.tgz Vijay, Could you review and merge http://review.gluster.com/8068 * fops-sanity.t -- SPURIOUS * 8014 - http://slave20.cloud.gluster.org/logs/glusterfs-logs-20140613:18:18:33.tgz * 8066 - http://slave20.cloud.gluster.org/logs/glusterfs-logs-20140614:21:35:57.tgz Still trying to figure this one out. May take a while. * bug-857330/xml.t - SPURIOUS * 7523 - logs may (?) be hard to parse due to other failure data for this CR in them * 8029 - http://slave23.cloud.gluster.org/logs/glusterfs-logs-20140613:16:46:03.tgz Kaushal, Do you want to change the regression test to expect failures in commands executed by EXPECT_WITHIN. i.e. if the command it executes fails then give different output than the one it expects. I fixed quite a few of 'heal full' based spurious failures where they wait for 'cat some-file' to give some output but by the time EXPECT_WITHIN executes 'cat' the file wouldn't even be created. I guess even normal.t will be benefited by this change? Pranith If we resolve these five, our regression testing should be a *lot* more predictable. 
:) Text file (attached to this email) has the bulk test results. Manually cut-n-pasted from browser to the text doc, so be wary of possible typos. ;) Give the output of for i in `cat problematic-ones.txt`; do echo $i $(git log $i | grep Author| tail -1); done Maybe we should make 1 BZ for the lot, and attach the logs to that BZ for later analysis? I am already using 1092850 for this. Good info. :) + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
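On the EXPECT_WITHIN point above, the shape of the change would be to make the probe report failure explicitly instead of printing nothing, so a missing file reads as "not ready yet". A minimal sketch with illustrative function and file names (HEAL_TIMEOUT is the existing framework timeout):

    function pending_heal_count() {
        cat $B0/${V0}0/pending-heal-file 2>/dev/null || echo "NOT_CREATED_YET"
    }
    EXPECT_WITHIN $HEAL_TIMEOUT "0" pending_heal_count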
[Gluster-devel] idea for reducing probability of spurious regression failures.
hi, Whenever a changed or new test file is submitted as part of a commit, we run it 5 (maybe 10?) times and expect it to succeed every time. This should decrease the probability of spurious regression failures. I am not sure if I have the bandwidth this month with the upcoming deadlines for 3.6. I guess I can pursue this change next month if no one completes it by then. Pranith. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
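A first cut of this could live in the regression job itself. A minimal sketch, assuming the job runs from the git checkout of the change under test:

    RUNS=5
    CHANGED_TESTS=$(git diff --name-only HEAD~1 -- tests/ | grep '\.t$')
    for t in $CHANGED_TESTS; do
        for i in $(seq 1 $RUNS); do
            prove -v "$t" || { echo "$t failed on run $i of $RUNS"; exit 1; }
        done
    done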
[Gluster-devel] quota tests and usage of sleep
hi, Could you remove the 'sleep' calls from the quota tests you authored, if it can be done? They are leading to spurious failures. I will be sending out a patch removing 'sleep' from the other tests. Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
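For anyone picking this up, the general shape of the change is to wrap whatever the sleep was waiting for in a small probe and poll it with EXPECT_WITHIN. A sketch using the CLI syntax from the quota tests above (the probe name and the 20-second timeout are illustrative):

    function quota_limit_set() {
        $CLI volume quota $V0 list /test_dir | grep -q /test_dir && echo "Y" || echo "N"
    }
    TEST $CLI volume quota $V0 limit-usage /test_dir 10MB 50
    EXPECT_WITHIN 20 "Y" quota_limit_set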
[Gluster-devel] tests and umount
hi, I see that most of the tests are doing umount and these may fail sometimes because of EBUSY etc. I am wondering if we should change all of them to umount -l. Let me know if you foresee any problems. Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Regression testing status report
On 06/16/2014 09:24 PM, Justin Clift wrote: On 16/06/2014, at 4:50 PM, Jeff Darcy wrote: Can't thank you enough for this :-) +100 Justin has done a lot of hard, tedious work whipping this infrastructure into better shape, and has significantly improved the project as a result. Such efforts deserve to be recognized. Justin, I owe you a beer. Written thanks to my manager are definitely welcome: :D Daniel Veillard veill...@redhat.com Just saying. ;) Done. + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Spurious failure - ./tests/bugs/bug-859581.t
On 06/18/2014 10:11 AM, Atin Mukherjee wrote: On 06/18/2014 10:04 AM, Pranith Kumar Karampuri wrote: On 06/18/2014 09:39 AM, Atin Mukherjee wrote: Pranith, The regression test mentioned in $SUBJECT failed (testcases 14 and 16). The console log can be found at http://build.gluster.org/job/rackspace-regression-2GB/227/consoleFull My initial suspicion is HEAL_TIMEOUT (set to 60 seconds): healing might not have completed within this time frame, which is why EXPECT_WITHIN fails. I am not sure on what basis this HEAL_TIMEOUT value was derived. You would probably be the better person to analyse it. Would a larger timeout value help here? I don't think it is a spurious failure. There seems to be a bug in afr-v2. I will have to fix that. If it's not a spurious failure, why is it not failing every time? It depends on which subvolume afr picks in readdir. If it reads the one with the directory it will succeed; otherwise it will fail. Pranith Pranith Cheers, Atin ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] tests and umount
On 06/16/2014 09:08 PM, Pranith Kumar Karampuri wrote: On 06/16/2014 09:00 PM, Jeff Darcy wrote: I see that most of the tests are doing umount and these may fail sometimes because of EBUSY etc. I am wondering if we should change all of them to umount -l. Let me know if you foresee any problems. I think I'd try umount -f first. Using -l too much can cause an accumulation of zombie mounts. When I'm hacking around on my own, I sometimes have to do umount -f twice but that's always sufficient. Cool, I will do some kind of EXPECT_WITHIN with umount -f, maybe 5 times, just to be on the safe side. I submitted http://review.gluster.com/8104 for one of the tests as it is failing frequently. Will do the next round later. Pranith If no one has any objections I will send out a patch tomorrow for this. Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
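A minimal sketch of the retry idea described above, under the assumption that tests keep using a plain forced unmount rather than 'umount -l'; the function name and the $M0 mount-point variable are just illustrative:

# Hypothetical helper: try a forced unmount a few times before giving up,
# instead of falling back to a lazy unmount that can leave zombie mounts.
force_umount() {
    local mnt=$1 tries=${2:-5}
    local i
    for i in $(seq 1 "$tries"); do
        umount -f "$mnt" 2>/dev/null && return 0
        sleep 1   # give EBUSY a moment to clear
    done
    return 1
}

force_umount "$M0" 5 || echo "could not unmount $M0" >&2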
Re: [Gluster-devel] Automating spurious failure status
On 06/19/2014 06:14 PM, Justin Clift wrote: On 19/06/2014, at 1:23 PM, Pranith Kumar Karampuri wrote: hi, I was told that Justin and I were given permission to mark a patch as verified+1 when the tests that failed are spurious failures. I think this process can be automated as well. I already have a script to parse the console log to identify the tests that failed (I send mails using this, yet to automate the mailing part). What we need to do now is the following: 1) Find the list of tests that are modified/added as part of the commit. 2) Parse the list of tests that failed the full regression (I already have this script). Run 'prove' on these files separately, say 5/10 times. If a particular test fails every time, it is most likely a real failure; otherwise it is a spurious failure. If a file that is added as a new test fails even a single time, let's accept the patch only after the failures are fixed. Otherwise we can give +1 on it automatically, instead of Justin or me doing it manually. Sounds good to me. :) + Justin Also send a mail to gluster-devel about the failures for each test. We might want to make that weekly or something? There are several failures every day. :/ Agreed. Pranith + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
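A rough sketch of steps 1 and 2 above, assuming the change under test is the top commit of the checkout and the console-log parsing already exists separately; paths and the run count are placeholders:

#!/bin/bash
# Re-run every test file touched by the commit under verification a few
# times, so a reviewer can tell a real failure from a spurious one.
RUNS=5
tests=$(git diff --name-only HEAD^ HEAD -- tests/ | grep '\.t$')
status=0
for t in $tests; do
    fails=0
    for i in $(seq 1 "$RUNS"); do
        prove "$t" || fails=$((fails + 1))
    done
    echo "$t: $fails/$RUNS runs failed"
    [ "$fails" -gt 0 ] && status=1
done
exit "$status"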
Re: [Gluster-devel] 3.5.1 beta 2 Sanity tests
On 06/19/2014 11:32 PM, Justin Clift wrote: On 19/06/2014, at 6:55 PM, Benjamin Turner wrote: snip I went through these a while back and removed anything that wasn't valid for GlusterFS. This test was passing on 3.4.59 when it was released; I am thinking it may have something to do with a symlink-to-the-same-directory BZ I found a while back? I don't know; I'll get it sorted tomorrow. I got this sorted: I needed to add a sleep between the file create and the link. I ran through it manually and it worked every time; it took me a few goes to realise it was a timing issue. I didn't need this on 3.4.0.59, so is there anything that needs investigating? Any ideas? :) Nope :-( Pranith + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] regarding inode-unref on root inode
Does anyone know why inode_unref is a no-op for the root inode? I see the following code in inode.c: static inode_t * __inode_unref (inode_t *inode) { if (!inode) return NULL; if (__is_root_gfid(inode->gfid)) return inode; ... } Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] regarding inode-unref on root inode
On 06/25/2014 11:52 AM, Raghavendra Bhat wrote: On Tuesday 24 June 2014 08:17 PM, Pranith Kumar Karampuri wrote: Does anyone know why inode_unref is a no-op for the root inode? I see the following code in inode.c: static inode_t * __inode_unref (inode_t *inode) { if (!inode) return NULL; if (__is_root_gfid(inode->gfid)) return inode; ... } I think it's done with the intention that the root inode should *never* ever get removed from the active inodes list (not even accidentally). So unref on the root inode is a no-op. Don't know whether there are any other reasons. Thanks, That helps. Pranith. Regards, Raghavendra Bhat Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Feature review: Improved rebalance performance
On 07/01/2014 11:15 AM, Harshavardhana wrote: Besides bandwidth limits, there also needs to be monitors on brick latency. We don't want so many queued iops that operating performance is impacted. AFAIK - rebalance and self-heal threads run in low-priority queue in io-threads by default. No, they don't. We tried doing that but based on experiences from users we disabled that in io-threads. Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Running gfid-mismatch.t on NetBSD
Yes, this is the expected behavior. Pranith On 07/03/2014 03:25 PM, Emmanuel Dreyfus wrote: Hi Running the first test on NetBSD, I get: = TEST 11 (line 22): ! find /mnt/glusterfs/0/file | xargs stat find: /mnt/glusterfs/0/file: Input/output error not ok 11 RESULT 11: 1 = Why is this failing? If I read the test correctly, we have set up a gfid mismatch, and find /mnt/glusterfs/0/file getting EIO is the expected behavior. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Running gfid-mismatch.t on NetBSD
On 07/03/2014 04:56 PM, Emmanuel Dreyfus wrote: Pranith Kumar Karampuri pkara...@redhat.com wrote: Yes this is the expected behavior. Then is the not ok 11 something I should see? Yes. See why it is succeeding instead of failing? Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] bug-822830.t fails on release-3.5 branch
On 07/04/2014 11:19 AM, Ravishankar N wrote: On 07/04/2014 11:09 AM, Pranith Kumar Karampuri wrote: Ravi, I already sent a patch for it in the morning at http://review.gluster.com/8233 Review please :-) 830665.t is identical in master where it succeeds. Looks like *match_subnet_v4() changes in master need to be backported to 3.5 as well. That is because Avati's patch where EXPECT matches reg-ex is not present on release-3.5 commit 9a34ea6a0a95154013676cabf8528b2679fb36c4 Author: Anand Avati av...@redhat.com Date: Fri Jan 24 18:30:32 2014 -0800 tests: support regex in EXPECT constructs Instead of just strings, provide the ability to specify a regex of the pattern to expect Change-Id: I6ada978197dceecc28490a2a40de73a04ab9abcd Signed-off-by: Anand Avati av...@redhat.com Reviewed-on: http://review.gluster.org/6788 Reviewed-by: Pranith Kumar Karampuri pkara...@redhat.com Tested-by: Gluster Build System jenk...@build.gluster.com Shall we backport this? Pranith Pranith On 07/04/2014 11:00 AM, Ravishankar N wrote: Hi Niels/ Santosh, tests/bugs/bug-830665.t is consistently failing on 3.5 branch: not ok 17 Got *.redhat.com instead of \*.redhat.com not ok 19 Got 192.168.10.[1-5] instead of 192.168.10.\[1-5] and seems to be introduced by http://review.gluster.org/#/c/8223/ Could you please look into it? Thanks, Ravi ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
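For context, the commit quoted above makes the first argument of EXPECT a pattern rather than a literal string. A couple of hedged, self-contained illustrations (not taken from bug-830665.t itself):

# With regex-capable EXPECT, the expectation is matched as a pattern
# against the command output.
EXPECT "^[0-9]+$" echo 42                          # passes: output is all digits
EXPECT "192\.168\.10\.[1-9]" echo 192.168.10.5     # metacharacters must be escaped when meant literally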
Re: [Gluster-devel] bug-822830.t fails on release-3.5 branch
On 07/04/2014 12:00 PM, Santosh Pradhan wrote: Thanks guys for looking into this. I am just wondering how this passed the regression before Niels could merged this in? Good part is test case needs modification not code ;) There seems to be some bug in our regression testing code. Even though the regression failed it gave the verdict as SUCCESS http://build.gluster.org/job/rackspace-regression-2GB-triggered/97/consoleFull Pranith -Santosh On 07/04/2014 11:51 AM, Ravishankar N wrote: On 07/04/2014 11:20 AM, Pranith Kumar Karampuri wrote: On 07/04/2014 11:19 AM, Ravishankar N wrote: On 07/04/2014 11:09 AM, Pranith Kumar Karampuri wrote: Ravi, I already sent a patch for it in the morning at http://review.gluster.com/8233 Review please :-) 830665.t is identical in master where it succeeds. Looks like *match_subnet_v4() changes in master need to be backported to 3.5 as well. That is because Avati's patch where EXPECT matches reg-ex is not present on release-3.5 commit 9a34ea6a0a95154013676cabf8528b2679fb36c4 Author: Anand Avati av...@redhat.com Date: Fri Jan 24 18:30:32 2014 -0800 tests: support regex in EXPECT constructs Instead of just strings, provide the ability to specify a regex of the pattern to expect Change-Id: I6ada978197dceecc28490a2a40de73a04ab9abcd Signed-off-by: Anand Avati av...@redhat.com Reviewed-on: http://review.gluster.org/6788 Reviewed-by: Pranith Kumar Karampuri pkara...@redhat.com Tested-by: Gluster Build System jenk...@build.gluster.com Shall we backport this? I think we should; reviewed http://review.gluster.org/#/c/8235/. Thanks for the fix :) Pranith Pranith On 07/04/2014 11:00 AM, Ravishankar N wrote: Hi Niels/ Santosh, tests/bugs/bug-830665.t is consistently failing on 3.5 branch: not ok 17 Got *.redhat.com instead of \*.redhat.com not ok 19 Got 192.168.10.[1-5] instead of 192.168.10.\[1-5] and seems to be introduced by http://review.gluster.org/#/c/8223/ Could you please look into it? Thanks, Ravi ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] bug-822830.t fails on release-3.5 branch
On 07/04/2014 12:06 PM, Harshavardhana wrote: There seems to be some bug in our regression testing code. Even though the regression failed, it gave the verdict as SUCCESS: http://build.gluster.org/job/rackspace-regression-2GB-triggered/97/consoleFull This was fixed by Justin Clift recently. All is well then :-) Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] bug-822830.t fails on release-3.5 branch
On 07/04/2014 12:04 PM, Harshavardhana wrote: On Thu, Jul 3, 2014 at 11:30 PM, Santosh Pradhan sprad...@redhat.com wrote: Thanks guys for looking into this. I am just wondering how this passed the regression before Niels merged it in? The good part is that the test case needs modification, not the code ;) We need a single maintainer for the test cases alone to keep them stable; races like this will keep appearing as we add more and more test cases. I don't mind maintaining them along with Justin if people are okay with it. Pranith. For example, chmod.t from posix-compliance fails once in a while and it is not really maintained by us. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] regarding inode_link/unlink
On 07/04/2014 04:28 PM, Raghavendra Gowdappa wrote: - Original Message - From: Pranith Kumar Karampuri pkara...@redhat.com To: Gluster Devel gluster-devel@gluster.org, Anand Avati av...@gluster.org, Brian Foster bfos...@redhat.com, Raghavendra Gowdappa rgowd...@redhat.com, Raghavendra Bhat rab...@redhat.com Sent: Friday, July 4, 2014 3:44:29 PM Subject: regarding inode_link/unlink hi, I have a doubt about when a particular dentry_unset thus inode_unref on parent dir happens on fuse-bridge in gluster. When a file is looked up for the first time fuse_entry_cbk does 'inode_link' with parent-gfid/bname. Whenever an unlink/rmdir/(lookup gives ENOENT) happens then corresponding inode unlink happens. The question is, will the present set of operations lead to leaks: 1) Mount 'M0' creates a file 'a' 2) Mount 'M1' of same volume deletes file 'a' M0 never touches 'a' anymore. When will inode_unlink happen for such cases? Will it lead to memory leaks? Kernel will eventually send forget (a) on M0 and that will cleanup the dentries and inode. Its equivalent to a file being looked up and never used again (deleting doesn't matter in this case). Do you know the trigger points for that? When I do 'touch a' on the mount point and leave the system like that, forget is not coming. If I do unlink on the file then forget is coming. Pranith Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] triggers for sending inode forgets
hi, I work on glusterfs and was debugging a memory leak. I need your help in figuring out whether something is done properly or not. When a file is looked up for the first time in gluster through fuse, gluster remembers the (parent-inode, basename) pair for that inode. Whenever an unlink/rmdir/(lookup returning ENOENT) happens, the corresponding (parent-inode, basename) association is forgotten. In all other cases it relies on fuse to send a forget for an inode to release these associations. I was wondering what the trigger points are for fuse to send forgets. Let's say M0 and M1 are fuse mounts of the same volume. 1) Mount 'M0' creates a file 'a' 2) Mount 'M1' deletes file 'a' M0 never touches 'a' anymore. Will a forget be sent on the inode of 'a'? If yes, when? Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] triggers for sending inode forgets
On 07/05/2014 08:17 AM, Anand Avati wrote: On Fri, Jul 4, 2014 at 7:03 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote: hi, I work on glusterfs and was debugging a memory leak. I need your help in figuring out whether something is done properly or not. When a file is looked up for the first time in gluster through fuse, gluster remembers the (parent-inode, basename) pair for that inode. Whenever an unlink/rmdir/(lookup returning ENOENT) happens, the corresponding (parent-inode, basename) association is forgotten. This is because the path resolver explicitly calls d_invalidate() on a dentry when d_revalidate() fails on it. In all other cases it relies on fuse to send a forget for an inode to release these associations. I was wondering what the trigger points are for fuse to send forgets. Let's say M0 and M1 are fuse mounts of the same volume. 1) Mount 'M0' creates a file 'a' 2) Mount 'M1' deletes file 'a' M0 never touches 'a' anymore. Will a forget be sent on the inode of 'a'? If yes, when? It really depends on when the memory manager decides to start reclaiming memory from the dcache due to memory pressure. If the system is not under memory pressure, and if the stale dentry is never encountered by the path resolver, the inode may never receive a forget. To keep a tight utilization limit on the inode/dcache, you will have to proactively fuse_notify_inval_entry on old/deleted files. Thanks for this info, Avati. I see that in fuse-bridge for glusterfs there is a setxattr interface to do that. Is that what you are referring to? Pranith Thanks ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
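For completeness, one way to observe kernel-initiated forgets without waiting for real memory pressure is to drop the dentry/inode caches. This is only a debugging aid, and the mount path below is the placeholder used earlier in the thread:

# Reproduce the M0/M1 scenario and then force the kernel to shed unused
# dentries, after which FORGETs for stale entries should reach the client.
mount -t glusterfs gluster-host:/test /mnt/gluster   # this is "M0"
touch /mnt/gluster/a                                 # step 1 on M0
# ... delete the file from another mount (M1) of the same volume ...
sync
echo 2 > /proc/sys/vm/drop_caches   # 2 = reclaim dentries and inodes
# Then inspect the client (e.g. via a statedump) to see whether the
# inode for 'a' has been released.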
[Gluster-devel] regarding spurious failure tests/bugs/bug-1112559.t
hi Joseph, The test above failed on a documentation patch, so it has got to be a spurious failure. Check http://build.gluster.org/job/rackspace-regression-2GB-triggered/150/consoleFull for more information Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Problem with smoke, regression ordering
hi Justin, If the regression run completes before the smoke test, the green tick mark gets overwritten, and a simple glance at the list of patches no longer shows that the regression succeeded. Can we do anything about it? Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] regarding message for '-1' on gerrit
hi Justin/Vijay, I always felt that '-1' saying 'I prefer you didn't submit this' is a bit harsh. Most of the time all it means is 'Needs some more changes'. Do you think we can change this message? Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] regarding message for '-1' on gerrit
On 07/06/2014 11:05 PM, Vijay Bellur wrote: On 07/06/2014 07:47 PM, Pranith Kumar Karampuri wrote: hi Justin/Vijay, I always felt '-1' saying 'I prefer you didn't submit this' is a bit harsh. Most of the times all it means is 'Need some more changes' Do you think we can change this message? The message can be changed. What would everyone like to see as appropriate messages accompanying values '-1' and '-2'? For '-1' - 'Please address the comments and Resubmit.' I am not sure about '-2' Pranith -Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] regarding message for '-1' on gerrit
On 07/07/2014 03:11 PM, Justin Clift wrote: On 07/07/2014, at 2:50 AM, Pranith Kumar Karampuri wrote: On 07/06/2014 11:05 PM, Vijay Bellur wrote: On 07/06/2014 07:47 PM, Pranith Kumar Karampuri wrote: hi Justin/Vijay, I always felt '-1' saying 'I prefer you didn't submit this' is a bit harsh. Most of the times all it means is 'Need some more changes' Do you think we can change this message? The message can be changed. What would everyone like to see as appropriate messages accompanying values '-1' and '-2'? For '-1' - 'Please address the comments and Resubmit.' That sounds good. :) I am not sure about '-2' Maybe something like? I have strong doubts about this approach (seems to reflect it's usage) Agree :-) Pranith + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] regarding spurious failure tests/bugs/bug-1112559.t
On 07/07/2014 06:18 PM, Pranith Kumar Karampuri wrote: Joseph, Any updates on this? It failed 5 regressions today. http://build.gluster.org/job/rackspace-regression-2GB/541/consoleFull http://build.gluster.org/job/rackspace-regression-2GB-triggered/175/consoleFull http://build.gluster.org/job/rackspace-regression-2GB-triggered/173/consoleFull http://build.gluster.org/job/rackspace-regression-2GB-triggered/166/consoleFull http://build.gluster.org/job/rackspace-regression-2GB-triggered/172/consoleFull One more : http://build.gluster.org/job/rackspace-regression-2GB/543/console Pranith CC some more folks who work on snapshot. Pranith On 07/05/2014 11:19 AM, Pranith Kumar Karampuri wrote: hi Joseph, The test above failed on a documentation patch, so it has got to be a spurious failure. Check http://build.gluster.org/job/rackspace-regression-2GB-triggered/150/consoleFull for more information Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Locale problem in master
Including Bala who is the author of the commit Pranith On 07/07/2014 10:18 PM, Anders Blomdell wrote: Due to the line (commit 040319d8bced2f25bf25d8f6b937901c3a40e34b): ./libglusterfs/src/logging.c:503:setlocale(LC_ALL, ); The command env -i LC_NUMERIC=sv_SE.utf8 /usr/sbin/glusterfs ... will fail due to the fact that the swedish decimal separator is not '.', but ',', i.e. _gf_string2double will fail due to strtod ('1.0', tail) will give the tail '.0'. /Anders ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] FS Sanity daily results.
On 07/06/2014 07:58 PM, Pranith Kumar Karampuri wrote: On 07/06/2014 02:53 AM, Benjamin Turner wrote: Hi all. I have been running FS sanity on daily builds(glusterfs mounts only at this point) for a few days for a few days and I have been hitting a couple of problems: final pass/fail report = Test Date: Sat Jul 5 01:53:00 EDT 2014 Total : [44] Passed: [41] Failed: [3] Abort : [0] Crash : [0] - [ PASS ] FS Sanity Setup [ PASS ] Running tests. [ PASS ] FS SANITY TEST - arequal [ PASS ] FS SANITY LOG SCAN - arequal [ PASS ] FS SANITY LOG SCAN - bonnie [ PASS ] FS SANITY TEST - glusterfs_build [ PASS ] FS SANITY LOG SCAN - glusterfs_build [ PASS ] FS SANITY TEST - compile_kernel [ PASS ] FS SANITY LOG SCAN - compile_kernel [ PASS ] FS SANITY TEST - dbench [ PASS ] FS SANITY LOG SCAN - dbench [ PASS ] FS SANITY TEST - dd [ PASS ] FS SANITY LOG SCAN - dd [ PASS ] FS SANITY TEST - ffsb [ PASS ] FS SANITY LOG SCAN - ffsb [ PASS ] FS SANITY TEST - fileop [ PASS ] FS SANITY LOG SCAN - fileop [ PASS ] FS SANITY TEST - fsx [ PASS ] FS SANITY LOG SCAN - fsx [ PASS ] FS SANITY LOG SCAN - fs_mark [ PASS ] FS SANITY TEST - iozone [ PASS ] FS SANITY LOG SCAN - iozone [ PASS ] FS SANITY TEST - locks [ PASS ] FS SANITY LOG SCAN - locks [ PASS ] FS SANITY TEST - ltp [ PASS ] FS SANITY LOG SCAN - ltp [ PASS ] FS SANITY TEST - multiple_files [ PASS ] FS SANITY LOG SCAN - multiple_files [ PASS ] FS SANITY TEST - posix_compliance [ PASS ] FS SANITY LOG SCAN - posix_compliance [ PASS ] FS SANITY TEST - postmark [ PASS ] FS SANITY LOG SCAN - postmark [ PASS ] FS SANITY TEST - read_large [ PASS ] FS SANITY LOG SCAN - read_large [ PASS ] FS SANITY TEST - rpc [ PASS ] FS SANITY LOG SCAN - rpc [ PASS ] FS SANITY TEST - syscallbench [ PASS ] FS SANITY LOG SCAN - syscallbench [ PASS ] FS SANITY TEST - tiobench [ PASS ] FS SANITY LOG SCAN - tiobench [ PASS ] FS Sanity Cleanup [ FAIL ] FS SANITY TEST - bonnie [ FAIL ] FS SANITY TEST - fs_mark [ FAIL ] /rhs-tests/beaker/rhs/auto-tests/components/sanity/fs-sanity-tests-v2 Bonnie++ is just very slow(running for 10+ hours on 1 16 GB file) and FS mark has been failing. The bonnie slowness is in re read, here is the best explanation I can find on it: https://blogs.oracle.com/roch/entry/decoding_bonnie *Rewriting...done* This gets a little interesting. It actually reads 8K, lseek back to the start of the block, overwrites the 8K with new data and loops. (see article for more.). On FS mark I am seeing: # fs_mark -d . -D 4 -t 4 -S 5 # Version 3.3, 4 thread(s) starting at Sat Jul 5 00:54:00 2014 # Sync method: POST: Reopen and fsync() each file in order after main write loop. # Directories: Time based hash between directories across 4 subdirectories with 180 seconds per subdirectory. # File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name) # Files info: size 51200 bytes, written with an IO size of 16384 bytes per write # App overhead is time in microseconds spent in the test not doing file writing related system calls. FSUse%Count SizeFiles/sec App Overhead Error in unlink of ./00/53b784e8SKZ0QS9BO7O2EG1DIFQLRDYY : No such file or directory fopen failed to open: fs_log.txt.26676 fs-mark pass # 5 failed I am working on reporting so look for a daily status report email from my jenkins server soon. How do we want to handle failures like this moving forward? Should I just open a BZ after I triage? Do you guys do a new BZ for every failure in the normal regressions tests? Yes bz would be great with all the logs. 
For spurious regressions at least I just opened one bz and fixed all the bugs reported by Justin against that one. Ben, Did you get a chance to raise the bug? Pranith Pranith -b ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] EXPECT_WITHIN output change
hi, I sent the following patch to change the output of EXPECT_WITHIN: http://review.gluster.org/8263 Patch got one +1 and regressions passed. Merge it please :-). Test: #!/bin/bash . $(dirname $0)/../include.rc EXPECT_WITHIN 10 abc echo def EXPECT_WITHIN 10 def echo def EXPECT_WITHIN 10 abc ls asjfrhg Old-style-output: === 15:10:03 :) ⚡ prove -rfv tests/basic/self-heald-test.t tests/basic/self-heald-test.t .. 1..3 not ok 1 FAILED COMMAND: abc echo def ok 2 ls: cannot access asjfrhg: No such file or directory not ok 3 FAILED COMMAND: abc ls asjfrhg Failed 2/3 subtests Test Summary Report --- tests/basic/self-heald-test.t (Wstat: 0 Tests: 3 Failed: 2) Failed tests: 1, 3 New-style-output: root@pranithk-laptop - /home/pk1/workspace/gerrit-repo (master) 15:10:21 :( ⚡ prove -rfv tests/basic/self-heald-test.t tests/basic/self-heald-test.t .. 1..3 not ok 1 Got def instead of abc FAILED COMMAND: abc echo def ok 2 ls: cannot access asjfrhg: No such file or directory not ok 3 Got instead of abc FAILED COMMAND: abc ls asjfrhg Failed 2/3 subtests Test Summary Report --- tests/basic/self-heald-test.t (Wstat: 0 Tests: 3 Failed: 2) Failed tests: 1, 3 Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] documentation about statedump
hi, I wanted to document the core data structures and debugging infrastructure in gluster. This is the first patch in that series. Please review and provide comments. I am not very familiar with the iobuf infrastructure, so please feel free to provide comments in the patch for that section as well. I can amend the document with those changes and resend the patch. http://review.gluster.org/8288 Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] regarding warnings on master
hi Harsha, Know anything about the following warnings on latest master? In file included from msg-nfs3.h:20:0, from msg-nfs3.c:22: nlm4-xdr.h:6:14: warning: extra tokens at end of #ifndef directive [enabled by default] #ifndef _NLM4-XDR_H_RPCGEN ^ nlm4-xdr.h:7:14: warning: missing whitespace after the macro name [enabled by default] #define _NLM4-XDR_H_RPCGEN Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
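The warnings come from the rpcgen-generated include guard: the macro name is derived from the file name, so nlm4-xdr.x yields the invalid macro _NLM4-XDR_H_RPCGEN (hyphens are not legal in macro names). One possible workaround, offered here only as an assumption and not the actual fix, is a post-processing step after rpcgen:

# Hypothetical post-rpcgen fixup in the build: rewrite the invalid guard.
sed -i 's/_NLM4-XDR_H_RPCGEN/_NLM4_XDR_H_RPCGEN/g' nlm4-xdr.h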
Re: [Gluster-devel] Is this a transient failure?
On 07/11/2014 07:05 PM, Justin Clift wrote: On 11/07/2014, at 11:36 AM, Anders Blomdell wrote: In http://build.gluster.org/job/rackspace-regression-2GB-triggered/297/consoleFull, I have one failure: No volumes present read failed: No data available read returning junk fd based file operation 1 failed read failed: No data available read returning junk fstat failed : No data available fd based file operation 2 failed read failed: No data available read returning junk dup fd based file operation failed [18:51:01] ./tests/basic/fops-sanity.t ... What should I do about it? Kick off a regression test manually here, and see if the same failure occurs: http://build.gluster.org/job/rackspace-regression-2GB/ If it happens again, it's not a spurious one. I believe this is a spurious one. I didn't get a chance to debug this issue. Pranith I'll send you a login for Jenkins, so you can kick off jobs manually and stuff. + Justin -- GlusterFS - http://www.gluster.org An open source, distributed file system scaling to several petabytes, and handling thousands of clients. My personal twitter: twitter.com/realjustinclift ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Change in glusterfs[master]: porting: use __builtin_ffsll() instead of ffsll()
CC gluster-devel, Anuradha who committed the test. Pranith On 07/15/2014 01:58 AM, Harshavardhana wrote: Mr Spurious is here again! Patch Set 2: Verified-1 http://build.gluster.org/job/rackspace-regression-2GB-triggered/351/consoleFull : FAILED Test Summary Report --- ./tests/bugs/bug-1038598.t (Wstat: 0 Tests: 28 Failed: 1) Failed test: 28 Files=262, Tests=7738, 5904 wallclock secs ( 4.09 usr 2.35 sys + 546.04 cusr 750.64 csys = 1303.12 CPU) Result: FAIL ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] Developer Documentation for datastructures in gluster
hi, Please respond if you volunteer to add documentation for any of the following things that are not already taken. client_t - pranith integration with statedump - pranith mempool - Pranith event-history + circ-buff - Raghavendra Bhat inode - Raghavendra Bhat call-stub fd iobuf graph xlator option-framework rbthash runner-framework stack/frame strfd timer store gid-cache (source is heavily documented) dict event-poll Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Developer Documentation for datastructures in gluster
On 07/15/2014 04:47 PM, Kaushal M wrote: What do you mean by 'option-framework'? Is it the xlator options table that we have in each xlator? Or the glusterd volume set framework (which requires the xlator options tables to function anyway)? options.c in libglusterfs Pranith On Tue, Jul 15, 2014 at 4:39 PM, Pranith Kumar Karampuri pkara...@redhat.com wrote: hi, Please respond if you guys volunteer to add documentation for any of the following things that are not already taken. client_t - pranith integration with statedump - pranith mempool - Pranith event-hostory + circ-buff - Raghavendra Bhat inode - Raghavendra Bhat call-stub fd iobuf graph xlator option-framework rbthash runner-framework stack/frame strfd timer store gid-cache(source is heavily documented) dict event-poll Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Developer Documentation for datastructures in gluster
On 07/15/2014 07:22 PM, Niels de Vos wrote: On Tue, Jul 15, 2014 at 08:45:45AM -0400, Jeff Darcy wrote: Please respond if you guys volunteer to add documentation for any of the following things that are not already taken. I think the most important thing to describe for each of these is the life cycle rules. When I've tried to teach people about translators, one of the biggest stumbling blocks has been the question of what gets freed after the fop, what gets freed after the callback, and what lives on even longer. There are different rules for dict_t, loc_t, inode_t, etc. Dict_set_*str is one of the worst offenders; even after all this time, I have to go back and re-check which variants do what when the dict itself is freed. If the only thing that comes out of this effort is greater clarity regarding what should be freed when, it will be worth it. client_t - pranith integration with statedump - pranith mempool - Pranith event-hostory + circ-buff - Raghavendra Bhat inode - Raghavendra Bhat call-stub fd iobuf graph xlator option-framework rbthash runner-framework stack/frame strfd timer store gid-cache(source is heavily documented) dict event-poll My Translator 101 series already covers xlators and call frames, so I might as well continue with those. Can you make these available in MarkDown format somewhere under the docs/ directory? Oops sorry. That is what we are going to do. Send patches :-). Pranith. Thanks, Niels ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] spurious regression failures again!
hi, We have 4 tests failing once in a while causing problems: 1) tests/bugs/bug-1087198.t - Author: Varun 2) tests/basic/mgmt_v3-locks.t - Author: Avra 3) tests/basic/fops-sanity.t - Author: Pranith Please take a look at them and post updates. Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Developer Documentation for datastructures in gluster
On 07/16/2014 11:57 AM, Kaushal M wrote: I'll take up documenting the options framework. I'd like take up graph and dict, if Jeff doesn't mind. Also, I think we should be aiming to document the complete API provided by these components instead of just the data structure. That would be more helpful to everyone IMO. Yes. Will keep that in mind while writing the documentation :-) Pranith ~kaushal On Wed, Jul 16, 2014 at 11:21 AM, Raghavendra Gowdappa rgowd...@redhat.com wrote: syncop-framework is not listed here. I would like to take that up. Also, if nobody is willing to pick up runner framework, I can handle that too. - Original Message - From: Krutika Dhananjay kdhan...@redhat.com To: Pranith Kumar Karampuri pkara...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Wednesday, July 16, 2014 10:41:28 AM Subject: Re: [Gluster-devel] Developer Documentation for datastructures in gluster Hi, I'd like to pick up timer and call-stub. -Krutika From: Pranith Kumar Karampuri pkara...@redhat.com To: Gluster Devel gluster-devel@gluster.org Sent: Tuesday, July 15, 2014 4:39:39 PM Subject: [Gluster-devel] Developer Documentation for datastructures in gluster hi, Please respond if you guys volunteer to add documentation for any of the following things that are not already taken. client_t - pranith integration with statedump - pranith mempool - Pranith event-hostory + circ-buff - Raghavendra Bhat inode - Raghavendra Bhat call-stub fd iobuf graph xlator option-framework rbthash runner-framework stack/frame strfd timer store gid-cache(source is heavily documented) dict event-poll Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] What's the impact of enabling the profiler?
On 07/18/2014 03:05 AM, Joe Julian wrote: What impact, if any, does starting profiling (gluster volume profile $vol start) have on performance? Joe, According to the code, the only extra things it does are calling gettimeofday() at the beginning and end of each FOP to calculate latency and incrementing some counters. So I guess not much? Pranith ___ Gluster-users mailing list gluster-us...@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-users ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
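For reference, the profiling workflow under discussion looks roughly like the sketch below; the volume name is a placeholder and the exact output columns vary by release:

# Enable profiling, run the workload, then inspect per-brick FOP stats.
gluster volume profile myvol start
# ... run the workload to be measured ...
gluster volume profile myvol info    # cumulative + interval stats per brick
gluster volume profile myvol stop    # turn the gettimeofday() accounting back off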
Re: [Gluster-devel] Inspiration for improving our contributor documentation
On 07/17/2014 07:25 PM, Kaushal M wrote: I came across mediawiki's developer documentation and guides when browsing. These docs felt really good to me, and easy to approach. I feel that we should take inspiration from them and start enhancing our docs. (Outright copying with modifications as necessary, could work too. But that just doesn't feel right) Any volunteers? (I'll start as soon as I finish with the developer documentation for data structures for the components I volunteered earlier) ~kaushal [0] - https://www.mediawiki.org/wiki/Developer_hub I love the idea but not sure about the implementation. i.e. considering we already started with .md pages, why not have same kind of pages as .md files in /doc of gluster? We can modify the README in our project so that people can browse all the details in github? Please let me know your thoughts. Pranith [1] - https://www.mediawiki.org/wiki/Category:New_contributors [2] - https://www.mediawiki.org/wiki/Gerrit/Code_review [3] - https://www.mediawiki.org/wiki/Gerrit [4] - https://www.mediawiki.org/wiki/Gerrit/Tutorial [5] - https://www.mediawiki.org/wiki/Gerrit/Getting_started [6] - https://www.mediawiki.org/wiki/Gerrit/Advanced_usage ... and lots more. ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume
On 07/18/2014 07:57 PM, Anders Blomdell wrote: During testing of a 3*4 gluster (from master as of yesterday), I encountered two major weirdnesses: 1. A 'rm -rf some_dir' needed several invocations to finish, each time reporting a number of lines like these: rm: cannot remove ‘a/b/c/d/e/f’: Directory not empty 2. After having successfully deleted all files from the volume, I have a single directory that is duplicated in gluster-fuse, like this: # ls -l /mnt/gluster total 24 drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/ drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/ Any idea how to debug this issue? What are the steps to recreate? We first need to find what led to this, and then probably which xlator is responsible. Pranith /Anders ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
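As a first data point for a duplicated directory entry, a common check is whether the directory carries the same trusted.gfid on every brick; mismatching gfids across subvolumes usually explain a doubled listing. The host names and brick paths below are placeholders for the 3*4 setup described above:

# Compare the gfid of the duplicated directory across all bricks.
for h in host1 host2 host3; do
    for b in /bricks/a /bricks/b /bricks/c /bricks/d; do
        printf '%s:%s/work2 -> ' "$h" "$b"
        ssh "$h" "getfattr -n trusted.gfid -e hex $b/work2 2>/dev/null" \
            | awk -F= '/trusted.gfid/ {print $2}'
    done
done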
Re: [Gluster-devel] [ovirt-users] Can we debug some truths/myths/facts about hosted-engine and gluster?
On 07/19/2014 11:25 AM, Andrew Lau wrote: On Sat, Jul 19, 2014 at 12:03 AM, Pranith Kumar Karampuri pkara...@redhat.com mailto:pkara...@redhat.com wrote: On 07/18/2014 05:43 PM, Andrew Lau wrote: On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur vbel...@redhat.com mailto:vbel...@redhat.com wrote: [Adding gluster-devel] On 07/18/2014 05:20 PM, Andrew Lau wrote: Hi all, As most of you have got hints from previous messages, hosted engine won't work on gluster . A quote from BZ1097639 Using hosted engine with Gluster backed storage is currently something we really warn against. I think this bug should be closed or re-targeted at documentation, because there is nothing we can do here. Hosted engine assumes that all writes are atomic and (immediately) available for all hosts in the cluster. Gluster violates those assumptions. I tried going through BZ1097639 but could not find much detail with respect to gluster there. A few questions around the problem: 1. Can somebody please explain in detail the scenario that causes the problem? 2. Is hosted engine performing synchronous writes to ensure that writes are durable? Also, if there is any documentation that details the hosted engine architecture that would help in enhancing our understanding of its interactions with gluster. Now my question, does this theory prevent a scenario of perhaps something like a gluster replicated volume being mounted as a glusterfs filesystem and then re-exported as the native kernel NFS share for the hosted-engine to consume? It could then be possible to chuck ctdb in there to provide a last resort failover solution. I have tried myself and suggested it to two people who are running a similar setup. Now using the native kernel NFS server for hosted-engine and they haven't reported as many issues. Curious, could anyone validate my theory on this? If we obtain more details on the use case and obtain gluster logs from the failed scenarios, we should be able to understand the problem better. That could be the first step in validating your theory or evolving further recommendations :). I'm not sure how useful this is, but Jiri Moskovcak tracked this down in an off list message. Message Quote: == We were able to track it down to this (thanks Andrew for providing the testing setup): -b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine' Traceback (most recent call last): File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, line 165, in handle response = success + self._dispatch(data) File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, line 261, in _dispatch .get_all_stats_for_service_type(**options) File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py, line 41, in get_all_stats_for_service_type d = self.get_raw_stats_for_service_type(storage_dir, service_type) File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py, line 74, in get_raw_stats_for_service_type f = os.open(path, direct_flag | os.O_RDONLY) OSError: [Errno 116] Stale file handle: '/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata' Andrew/Jiri, Would it be possible to post gluster logs of both the mount and bricks on the bz? I can take a look at it once. If I gather nothing then probably I will ask for your help in re-creating the issue. Pranith Unfortunately, I don't have the logs for that setup any more.. I'll try replicate when I get a chance. 
If I understand the comment from the BZ, I don't think it's a gluster bug per-say, more just how gluster does its replication. hi Andrew, Thanks for that. I couldn't come to any conclusions because no logs were available. It is unlikely that self-heal is involved because there were no bricks going down/up according to the bug description. Pranith It's definitely connected to the storage which leads us to the gluster, I'm not very familiar with the gluster so I need to check this with our gluster gurus. == Thanks, Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org mailto:Gluster-devel@gluster.org http
Re: [Gluster-devel] [ovirt-users] Can we debug some truths/myths/facts about hosted-engine and gluster?
On 07/21/2014 02:08 PM, Jiri Moskovcak wrote: On 07/19/2014 08:58 AM, Pranith Kumar Karampuri wrote: On 07/19/2014 11:25 AM, Andrew Lau wrote: On Sat, Jul 19, 2014 at 12:03 AM, Pranith Kumar Karampuri pkara...@redhat.com mailto:pkara...@redhat.com wrote: On 07/18/2014 05:43 PM, Andrew Lau wrote: On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur vbel...@redhat.com mailto:vbel...@redhat.com wrote: [Adding gluster-devel] On 07/18/2014 05:20 PM, Andrew Lau wrote: Hi all, As most of you have got hints from previous messages, hosted engine won't work on gluster . A quote from BZ1097639 Using hosted engine with Gluster backed storage is currently something we really warn against. I think this bug should be closed or re-targeted at documentation, because there is nothing we can do here. Hosted engine assumes that all writes are atomic and (immediately) available for all hosts in the cluster. Gluster violates those assumptions. I tried going through BZ1097639 but could not find much detail with respect to gluster there. A few questions around the problem: 1. Can somebody please explain in detail the scenario that causes the problem? 2. Is hosted engine performing synchronous writes to ensure that writes are durable? Also, if there is any documentation that details the hosted engine architecture that would help in enhancing our understanding of its interactions with gluster. Now my question, does this theory prevent a scenario of perhaps something like a gluster replicated volume being mounted as a glusterfs filesystem and then re-exported as the native kernel NFS share for the hosted-engine to consume? It could then be possible to chuck ctdb in there to provide a last resort failover solution. I have tried myself and suggested it to two people who are running a similar setup. Now using the native kernel NFS server for hosted-engine and they haven't reported as many issues. Curious, could anyone validate my theory on this? If we obtain more details on the use case and obtain gluster logs from the failed scenarios, we should be able to understand the problem better. That could be the first step in validating your theory or evolving further recommendations :). I'm not sure how useful this is, but Jiri Moskovcak tracked this down in an off list message. Message Quote: == We were able to track it down to this (thanks Andrew for providing the testing setup): -b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine' Traceback (most recent call last): File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, line 165, in handle response = success + self._dispatch(data) File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py, line 261, in _dispatch .get_all_stats_for_service_type(**options) File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py, line 41, in get_all_stats_for_service_type d = self.get_raw_stats_for_service_type(storage_dir, service_type) File /usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py, line 74, in get_raw_stats_for_service_type f = os.open(path, direct_flag | os.O_RDONLY) OSError: [Errno 116] Stale file handle: '/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata' Andrew/Jiri, Would it be possible to post gluster logs of both the mount and bricks on the bz? I can take a look at it once. If I gather nothing then probably I will ask for your help in re-creating the issue. 
Pranith Unfortunately, I don't have the logs for that setup any more.. I'll try replicate when I get a chance. If I understand the comment from the BZ, I don't think it's a gluster bug per-say, more just how gluster does its replication. hi Andrew, Thanks for that. I couldn't come to any conclusions because no logs were available. It is unlikely that self-heal is involved because there were no bricks going down/up according to the bug description. Hi, I've never had such setup, I guessed problem with gluster based on OSError: [Errno 116] Stale file handle: which happens when the file opened by application on client gets removed on the server. I'm pretty sure we (hosted-engine) don't remove that file, so I think it's some gluster magic moving the data around
Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume
On 07/21/2014 05:03 PM, Anders Blomdell wrote: On 2014-07-19 04:43, Pranith Kumar Karampuri wrote: On 07/18/2014 07:57 PM, Anders Blomdell wrote: During testing of a 3*4 gluster (from master as of yesterday), I encountered two major weirdnesses: 1. A 'rm -rf some_dir' needed several invocations to finish, each time reporting a number of lines like these: rm: cannot remove ‘a/b/c/d/e/f’: Directory not empty 2. After having successfully deleted all files from the volume, i have a single directory that is duplicated in gluster-fuse, like this: # ls -l /mnt/gluster total 24 drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/ drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/ any idea on how to debug this issue? What are the steps to recreate? We need to first find what lead to this. Then probably which xlator leads to this. Would a pcap network dump + the result from 'tar -c --xattrs /brick/a/gluster' on all the hosts before and after the following commands are run be of any help: # mount -t glusterfs gluster-host:/test /mnt/gluster # mkdir /mnt/gluster/work2 ; # ls /mnt/gluster work2 work2 Are you using ext4? Is this on latest upstream? Pranith If so, where should I send them (size is 2*12*31MB [.tar] + 220kB [pcap]) /Anders ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume
On 07/21/2014 05:17 PM, Anders Blomdell wrote: On 2014-07-21 13:36, Pranith Kumar Karampuri wrote: On 07/21/2014 05:03 PM, Anders Blomdell wrote: On 2014-07-19 04:43, Pranith Kumar Karampuri wrote: On 07/18/2014 07:57 PM, Anders Blomdell wrote: During testing of a 3*4 gluster (from master as of yesterday), I encountered two major weirdnesses: 1. A 'rm -rf some_dir' needed several invocations to finish, each time reporting a number of lines like these: rm: cannot remove ‘a/b/c/d/e/f’: Directory not empty 2. After having successfully deleted all files from the volume, i have a single directory that is duplicated in gluster-fuse, like this: # ls -l /mnt/gluster total 24 drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/ drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/ any idea on how to debug this issue? What are the steps to recreate? We need to first find what lead to this. Then probably which xlator leads to this. Would a pcap network dump + the result from 'tar -c --xattrs /brick/a/gluster' on all the hosts before and after the following commands are run be of any help: # mount -t glusterfs gluster-host:/test /mnt/gluster # mkdir /mnt/gluster/work2 ; # ls /mnt/gluster work2 work2 Are you using ext4? Yes Is this on latest upstream? kernel is 3.14.9-200.fc20.x86_64, if that is latest upstream, I don't know. gluster is from master as of end of last week If there are known issues with ext4 i could switch to something else, but during the last 15 years or so, I have had very little problems with ext2/3/4, thats the reason for choosing it. The problem is afrv2 + dht + ext4 offsets. Soumya and Xavier were working on it last I heard(CCed) Pranith /Anders ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] What's the impact of enabling the profiler?
On 07/22/2014 11:56 AM, Joe Julian wrote: On 07/21/2014 11:20 PM, Pranith Kumar Karampuri wrote: On 07/22/2014 11:39 AM, Joe Julian wrote: On 07/17/2014 07:30 PM, Pranith Kumar Karampuri wrote: On 07/18/2014 03:05 AM, Joe Julian wrote: What impact, if any, does starting profiling (gluster volume profile $vol start) have on performance? Joe, According to the code, the only extra things it does are calling gettimeofday() at the beginning and end of each FOP to calculate latency and incrementing some counters. So I guess not much? So far so good. Is the only way to clear the stats to restart the brick? I think when the feature was initially proposed we wanted two things: 1) cumulative stats 2) interval stats. Interval stats get cleared whenever 'gluster volume profile volname info' is executed (although counting starts again with the next set of fops that happen after this command execution). But there is no way to clear the cumulative stats. It would be nice if you could give some feedback about what you liked and what you think should change to make better use of it. So I am guessing there wasn't a big performance hit? Pranith No noticeable performance hit, no. I'm writing a whitepaper on best practices for OpenStack on GlusterFS, so I needed some idea of how qemu actually uses the filesystem and what the operations are, so I can look at not only the best ways to tune for that use, but how to build the systems around that. At this point, I'm just collecting data. TBH, I hadn't noticed the interval data. That should be perfect for this. I'll poll it in XML and run the numbers in a few days. Joe, Do let us know your feedback. It needs some real-world usage suggestions from users like you :-). Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
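A hedged sketch of the polling idea Joe mentions: each 'info' call resets the interval window, so sampling on a fixed period gives per-period FOP counts and latencies. The volume name and sampling period are placeholders, and the exact placement of the --xml option should be checked against the installed CLI version:

# Collect interval profile data every 60 seconds into timestamped XML files.
vol=gv-nova                      # placeholder volume name
while true; do
    ts=$(date +%Y%m%d-%H%M%S)
    gluster volume profile "$vol" info --xml > "profile-$vol-$ts.xml"
    sleep 60
done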
Re: [Gluster-devel] Developer Documentation for datastructures in gluster
Here is my first draft of mem-pool data structure for review: http://review.gluster.org/8343 Please don't laugh at the ascii art ;-). Pranith On 07/17/2014 04:10 PM, Ravishankar N wrote: On 07/15/2014 04:39 PM, Pranith Kumar Karampuri wrote: hi, Please respond if you guys volunteer to add documentation for any of the following things that are not already taken. client_t - pranith integration with statedump - pranith mempool - Pranith event-hostory + circ-buff - Raghavendra Bhat inode - Raghavendra Bhat call-stub fd iobuf graph xlator option-framework rbthash runner-framework stack/frame strfd timer store gid-cache(source is heavily documented) dict event-poll I'll take up event-poll. I have created an etherpad link with the components and volunteers thus far: https://etherpad.wikimedia.org/p/glusterdoc Feel free to update this doc with your patch details, other components etc. - Ravi Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Symlinks change date while migrating
On 07/23/2014 02:44 PM, Anders Blomdell wrote: When migrating approx 1 GB of data data by doing gluster volume add-brick test new-host1:/path/to/new/brick ... gluster volume remove-brick old-host1:/path/to/old/brick ... start ... wait for removal to finish gluster volume remove-brick old-host1:/path/to/old/brick ... commit on a 3*4 - 6*4 - 3*4 gluster [version 3.7dev-0.9.git5b8de97] approximately 40% of the symlinks change their mtime to the time they were copied. Is this expected/known or should I file a bug? hi, Seems like a dht issue. File a bug. Assign the component to dht/'distribute' for now. If it is different component, assignee of that bug can change it accordingly. Pranith /Anders ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Can anyone else shed any light on this warning?
On 07/26/2014 03:06 AM, Joe Julian wrote: How can it come about? Is this from replacing a brick days ago? Can I prevent it from happening? [2014-07-25 07:00:29.287680] W [fuse-resolve.c:546:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f17cb846444 inode-gfid:87544fde-9bad-46d8-b610-1a8c93b85113) did not complete, failing fop with EBADF (old-subvolume:gv-nova-3 new-subvolume:gv-nova-4) It's critical because it causes a segfault every time. :( Joe, This is fd migration code. When a brick layout changes (graph change) the file needs to be re-opened in the new graph. This re-open seemed to have failed. It leads to crash probably because extra unref in failure code path. Could you add brick/mount logs to the bug https://bugzilla.redhat.com/show_bug.cgi?id=1123289. What is the configuration of the volume? pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Can anyone else shed any light on this warning?
On 07/26/2014 11:06 AM, Pranith Kumar Karampuri wrote: On 07/26/2014 03:06 AM, Joe Julian wrote: How can it come about? Is this from replacing a brick days ago? Can I prevent it from happening? [2014-07-25 07:00:29.287680] W [fuse-resolve.c:546:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f17cb846444 inode-gfid:87544fde-9bad-46d8-b610-1a8c93b85113) did not complete, failing fop with EBADF (old-subvolume:gv-nova-3 new-subvolume:gv-nova-4) It's critical because it causes a segfault every time. :( Joe, This is fd migration code. When a brick layout changes (graph change) the file needs to be re-opened in the new graph. This re-open seemed to have failed. It leads to crash probably because extra unref in failure code path. Could you add brick/mount logs to the bug https://bugzilla.redhat.com/show_bug.cgi?id=1123289. What is the configuration of the volume? I checked the code, I don't see any extra unrefs as of now. Please provide the details I asked for in the bug. CC Raghavendra G, Raghavendra Bhat who know this code path a bit more. Pranith pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] How should I submit a testcase without proper solution
hi Anders, Generally, new test cases are submitted along with the fix, precisely to avoid this situation. You can either submit the fix together with the test case, or, if you are not actively working on a fix, wait until one is submitted by someone else; we can then re-trigger the regression build for this patch and it will be taken in. Pranith
On 07/29/2014 08:51 PM, Anders Blomdell wrote: Hi, I finally got around to looking into "Symlink mtime changes when rebalancing" (https://bugzilla.redhat.com/show_bug.cgi?id=1122443) and have submitted a test case (http://review.gluster.org/#/c/8383/), but it is expected to fail (since I have not managed to write a patch that addresses the problem), and hence it will be voted down by Jenkins. Is there something I should do about this? /Anders ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Monotonically increasing memory
Yes, even I saw the following leaks when I tested it a week back. You should probably take a statedump and see what datatypes are leaking. These were the leaks:

root@localhost - /usr/local/var/run/gluster 14:10:26 ? awk -f /home/pk1/mem-leaks.awk glusterdump.22412.dump.1406174043
[mount/fuse.fuse - usage-type gf_common_mt_char memusage] size=341240 num_allocs=23602 max_size=347987 max_num_allocs=23604 total_allocs=653194
[mount/fuse.fuse - usage-type gf_common_mt_mem_pool memusage] size=4335440 num_allocs=45159 max_size=7509032 max_num_allocs=77391 total_allocs=530058
[performance/quick-read.r2-quick-read - usage-type gf_common_mt_asprintf memusage] size=182526 num_allocs=30421 max_size=182526 max_num_allocs=30421 total_allocs=30421
[performance/quick-read.r2-quick-read - usage-type gf_common_mt_char memusage] size=547578 num_allocs=30421 max_size=547578 max_num_allocs=30421 total_allocs=30421
[performance/quick-read.r2-quick-read - usage-type gf_common_mt_mem_pool memusage] size=3117196 num_allocs=52999 max_size=3117368 max_num_allocs=53000 total_allocs=109484
[cluster/distribute.r2-dht - usage-type gf_common_mt_asprintf memusage] size=257304 num_allocs=82988 max_size=257304 max_num_allocs=82988 total_allocs=97309
[cluster/distribute.r2-dht - usage-type gf_common_mt_char memusage] size=2082904 num_allocs=82985 max_size=2082904 max_num_allocs=82985 total_allocs=101346
[cluster/distribute.r2-dht - usage-type gf_common_mt_mem_pool memusage] size=9958372 num_allocs=165972 max_size=9963396 max_num_allocs=165980 total_allocs=467956
[performance/quick-read.r2-quick-read - usage-type gf_common_mt_asprintf memusage] size=182526 num_allocs=30421 max_size=182526 max_num_allocs=30421 total_allocs=30421
[performance/quick-read.r2-quick-read - usage-type gf_common_mt_char memusage] size=547578 num_allocs=30421 max_size=547578 max_num_allocs=30421 total_allocs=30421
[performance/quick-read.r2-quick-read - usage-type gf_common_mt_mem_pool memusage] size=3117196 num_allocs=52999 max_size=3117368 max_num_allocs=53000 total_allocs=109484
[cluster/distribute.r2-dht - usage-type gf_common_mt_asprintf memusage] size=257304 num_allocs=82988 max_size=257304 max_num_allocs=82988 total_allocs=97309
[cluster/distribute.r2-dht - usage-type gf_common_mt_char memusage] size=2082904 num_allocs=82985 max_size=2082904 max_num_allocs=82985 total_allocs=101346
[cluster/distribute.r2-dht - usage-type gf_common_mt_mem_pool memusage] size=9958372 num_allocs=165972 max_size=9963396 max_num_allocs=165980 total_allocs=467956
root@localhost - /usr/local/var/run/gluster 14:10:28 ?

Pranith
On 08/01/2014 12:01 AM, Anders Blomdell wrote: During an rsync of 35 files, memory consumption of glusterfs rose to 12 GB (after approx 14 hours). I take it this is a bug I should try to track down? Version is 3.7dev as of Tuesday... /Anders ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
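For context on how to read the numbers above: builds with memory accounting keep per-translator, per-type counters (size, num_allocs, max_size, max_num_allocs, total_allocs) and the statedump prints them; a type whose size and num_allocs only ever grow across successive dumps is a leak suspect. Below is a rough C sketch of that style of accounting. The names, the header layout, and the dump format are invented for illustration; this is not the actual GlusterFS memory-accounting code.

    /* Illustrative per-type allocation accounting, similar in spirit to the
     * counters shown in the dump above. All names are invented. */
    #include <stdio.h>
    #include <stdlib.h>

    enum { MT_CHAR, MT_ASPRINTF, MT_MEM_POOL, MT_MAX };

    struct mem_acct_rec {
        size_t size;           /* bytes currently allocated for this type */
        size_t num_allocs;     /* live allocations for this type */
        size_t max_size;       /* high-water mark of size */
        size_t max_num_allocs; /* high-water mark of num_allocs */
        size_t total_allocs;   /* allocations ever made for this type */
    };

    static struct mem_acct_rec acct[MT_MAX];

    struct hdr { size_t size; int type; };  /* bookkeeping header per allocation */

    static void *acct_malloc(size_t size, int type)
    {
        struct hdr *h = malloc(sizeof(*h) + size);
        if (!h)
            return NULL;
        h->size = size;
        h->type = type;
        acct[type].size += size;
        acct[type].num_allocs++;
        acct[type].total_allocs++;
        if (acct[type].size > acct[type].max_size)
            acct[type].max_size = acct[type].size;
        if (acct[type].num_allocs > acct[type].max_num_allocs)
            acct[type].max_num_allocs = acct[type].num_allocs;
        return h + 1;
    }

    static void acct_free(void *ptr)
    {
        struct hdr *h = (struct hdr *)ptr - 1;
        acct[h->type].size -= h->size;
        acct[h->type].num_allocs--;
        free(h);
    }

    /* Print the counters in a statedump-like layout. */
    static void acct_dump(void)
    {
        static const char *names[MT_MAX] = {
            "mt_char", "mt_asprintf", "mt_mem_pool"
        };
        int i;
        for (i = 0; i < MT_MAX; i++)
            printf("[usage-type %s] size=%zu num_allocs=%zu max_size=%zu "
                   "max_num_allocs=%zu total_allocs=%zu\n",
                   names[i], acct[i].size, acct[i].num_allocs, acct[i].max_size,
                   acct[i].max_num_allocs, acct[i].total_allocs);
    }

Taking two statedumps some time apart and diffing counters like these per type is usually enough to narrow a leak down to a translator and an allocation type.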
[Gluster-devel] regarding mempool documentation patch
hi, If there are no more comments, could we take http://review.gluster.com/#/c/8343 in? Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
[Gluster-devel] regarding resolution for fuse/server
hi, Does anyone know why there is different code for resolution in fuse and in server? There are some behavioural differences too; for example, server asserts on the resolution types such as RESOLVE_MUST/RESOLVE_NOT, whereas fuse doesn't do any such thing. I am wondering whether there is a reason why the code is different in these two xlators. Pranith ___ Gluster-devel mailing list Gluster-devel@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-devel
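For readers unfamiliar with the resolver: the resolution types mentioned above express the precondition a fop places on the entry it is about to operate on (for example, a create expects the entry not to exist, while an open expects it to exist). The sketch below is hypothetical; the enum values are modelled loosely on the names in the mail, but this is not the server or fuse xlator code.

    /* Hypothetical sketch of resolution-type checks; not GlusterFS xlator code. */
    #include <assert.h>
    #include <stddef.h>

    typedef enum {
        RESOLVE_DFLT,   /* no particular requirement                  */
        RESOLVE_MUST,   /* the entry must already exist (e.g. open)   */
        RESOLVE_NOT,    /* the entry must NOT exist (e.g. create)     */
        RESOLVE_MAY     /* either outcome is acceptable (e.g. lookup) */
    } resolve_type_t;

    struct resolve_result {
        resolve_type_t type;
        void          *inode;     /* non-NULL if the entry was found */
        int            op_errno;  /* error from the resolution step  */
    };

    /* After resolution, enforce the fop's precondition. */
    static int resolve_check(struct resolve_result *res)
    {
        switch (res->type) {
        case RESOLVE_MUST:
            /* A server-side resolver could assert here, on the assumption
             * that the protocol client sends well-formed requests. */
            assert(res->inode != NULL || res->op_errno != 0);
            return res->inode ? 0 : -1;
        case RESOLVE_NOT:
            return res->inode ? -1 : 0;
        case RESOLVE_MAY:
        case RESOLVE_DFLT:
            return 0;
        }
        return -1;
    }

One plausible reading of the difference asked about in the mail is visible in a function like resolve_check(): a server-side resolver can afford to assert on malformed resolution requests, while a fuse resolver has to cope with whatever the kernel sends without asserting.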