Re: [Gluster-devel] Thank You!

2016-07-12 Thread Karthik Subrahmanya


- Original Message -
> From: "Niels de Vos" 
> To: "Karthik Subrahmanya" 
> Cc: "Gluster Devel" , josephau...@gmail.com, 
> "vivek sb agarwal"
> , "Vijaikumar Mallikarjuna" 
> 
> Sent: Wednesday, July 13, 2016 2:30:35 AM
> Subject: Re: [Gluster-devel] Thank You!
> 
> On Sat, Jul 09, 2016 at 11:35:13AM -0400, Karthik Subrahmanya wrote:
> > Hi all,
> > 
> > I am an intern who joined on the 11th of January 2016, and I worked on
> > the WORM/Retention feature for GlusterFS. It is released as an
> > experimental feature with GlusterFS v3.8. The blog post on the feature
> > is published on "Planet Gluster" [1] and "blog.gluster.org" [2].
> > 
> > On Monday, 11th July 2016, I am being converted to "Associate Software
> > Engineer". I would like to take this opportunity to thank all of you for
> > your valuable guidance, support and help during this period. I hope you
> > will guide me in my future work, correct me when I am wrong, and help me
> > to learn more.
> 
> Congrats! Keep up the good work on improving the feature, sending the
> awesome weekly status updates and detailed blog posts. You've set the
> bar high for yourself already ;-)
> 
> Good to know that you will stick around for a while.

Hey,

Thanks, Niels :). I have been asked to work on the "Gluster Must Fixes"
and to understand more about the various components of Gluster.
I'll try to improve the feature in my free time. Thanks for
all your support and feedback.

Regards,
Karthik
> 
> Cheers,
> Niels
> 
> 
> > Thank you all.
> > 
> > [1] http://planet.gluster.org/
> > [2]
> > https://blog.gluster.org/2016/07/worm-write-once-read-multiple-retention-and-compliance-2/
> > 
> > Regards,
> > Karthik
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [puzzle] readv operation allocate iobuf twice

2016-07-12 Thread Zhengping Zhou
I have already filed a bug with bug id 1354205, but my current patch
still has problems in my test environment; I'll check it out and post it
later.

2016-07-12 12:38 GMT+08:00 Raghavendra Gowdappa :
>
>
> - Original Message -
>> From: "Zhengping Zhou" 
>> To: gluster-devel@gluster.org
>> Sent: Tuesday, July 12, 2016 9:28:01 AM
>> Subject: [Gluster-devel] [puzzle] readv operation allocate iobuf twice
>>
>> Hi all:
>>
>> It is a puzzle to me that we allocate rsp buffers for the response
>> content in the function client3_3_readv, but these rsp parameters have
>> never been saved to struct saved_frame in the submit procedure.
>
> Good catch :). We were aware of this issue, but the fix wasn't prioritized. 
> Can you please file a bug on this? If you want to send a fix (which 
> essentially stores the rsp payload ptr in saved-frame and passes it down 
> during rpc_clnt_fill_request_info - as part of handling 
> RPC_TRANSPORT_MAP_XID_REQUEST event in rpc-clnt), please post a patch to 
> gerrit and I'll accept it. If you don't have bandwidth, one of us can send 
> out a fix too.
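
A rough sketch of that direction, for anyone picking it up (the struct and
function names below are illustrative only, not the actual rpc-clnt API):

#include <stdint.h>
#include <sys/uio.h>

/* Illustrative only: remember the caller's reply buffers in the saved
 * frame at submit time, and hand them back when the transport maps a
 * reply xid to the outstanding request. */
struct saved_frame_sketch {
        uint32_t      xid;          /* transaction id of the request */
        struct iovec *rsp_payload;  /* caller-supplied reply buffers */
        int           rsp_count;
};

/* At submit time: stash the buffers the caller passed as @rsp_payload. */
void
sketch_save_rsp (struct saved_frame_sketch *frame,
                 struct iovec *rsp_payload, int count)
{
        frame->rsp_payload = rsp_payload;
        frame->rsp_count   = count;
}

/* While handling the map-xid-to-request step: return the stashed buffers
 * so the socket layer reads the payload into them instead of allocating
 * a second iobuf. */
struct iovec *
sketch_get_rsp (struct saved_frame_sketch *frame, int *count)
{
        *count = frame->rsp_count;
        return frame->rsp_payload;
}

With something like that in place, __socket_read_accepted_successful_reply
could read the payload into the caller's buffer instead of allocating again.
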
>
> Again, thanks for the effort :).
>
> regards,
> Raghavendra
>
>> Which means
>> the iobuf will be reallocated by the transport layer in the function
>> __socket_read_accepted_successful_reply.
>> According to the comment of the function rpc_clnt_submit:
>> 1. Both @rsp_hdr and @rsp_payload are optional.
>> 2. The user of rpc_clnt_submit, if it wants the response hdr and payload
>> in its own buffers, has to populate @rsphdr and @rsp_payload.
>> 
>> The rsp_payload is optional; the transport layer will not reallocate
>> rsp buffers if it is populated. But the fact is that the readv operation
>> allocates the rsp buffer twice.
>>
>> Thanks
>> Zhengping
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Snapshot Scheduler

2016-07-12 Thread Niels de Vos
On Wed, Jul 13, 2016 at 12:37:17AM +0530, Avra Sengupta wrote:
> Thanks, Joe, for the feedback. We are aware of the issue you describe below,
> and we will try to address it by going for a more generic approach, which
> will not have platform dependencies.

I'm mostly in favour of using the standard functionality that other
components already provide. Using systemd timers when available, with cron
as a fallback, would have my preference. I'm not sure how much my opinion
counts, but I hope you'll take it into consideration. Writing a bug-free
scheduler from scratch is difficult :-)

Niels
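
For what it's worth, choosing between the two backends at runtime can be a
one-line check; a minimal sketch in C (testing for /run/systemd/system is the
documented heuristic behind systemd's sd_booted(); the rest is illustrative):

#include <stdio.h>
#include <sys/stat.h>

int
main (void)
{
        struct stat st;

        /* systemd creates /run/systemd/system/ early during boot, so its
         * presence is a reliable "are we running under systemd" check. */
        int have_systemd = (stat ("/run/systemd/system", &st) == 0 &&
                            S_ISDIR (st.st_mode));

        printf ("scheduler backend: %s\n",
                have_systemd ? "systemd.timer" : "crontab");
        return 0;
}
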


> 
> On 07/12/2016 11:59 PM, Joe Julian wrote:
> > cron isn't installed by default on Arch; rather, scheduling is done by
> > systemd timers. We might want to consider using systemd.timer for
> > systemd distros and crontab for legacy distros.
> > 
> > 
> > On 07/08/2016 03:01 AM, Avra Sengupta wrote:
> > > Hi,
> > > 
> > > Snapshots in gluster have a scheduler, which relies heavily on
> > > crontab and the shared storage. I would like people who are using this
> > > scheduler, or who plan to use it, to provide us feedback on their
> > > experience. We are looking for feedback on ease of
> > > use, complexity of features, additional feature support etc.
> > > 
> > > It will help us in deciding if we need to revamp the existing
> > > scheduler, or maybe rethink relying on crontab and re-write our
> > > own, thus providing us more flexibility. Thanks.
> > > 
> > > Regards,
> > > Avra
> > > ___
> > > Gluster-devel mailing list
> > > Gluster-devel@gluster.org
> > > http://www.gluster.org/mailman/listinfo/gluster-devel
> > 
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
> 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel


signature.asc
Description: PGP signature
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Thank You!

2016-07-12 Thread Niels de Vos
On Sat, Jul 09, 2016 at 11:35:13AM -0400, Karthik Subrahmanya wrote:
> Hi all,
> 
> I am an intern who joined on the 11th of January 2016, and I worked on
> the WORM/Retention feature for GlusterFS. It is released as an
> experimental feature with GlusterFS v3.8. The blog post on the feature
> is published on "Planet Gluster" [1] and "blog.gluster.org" [2].
> 
> On Monday, 11th July 2016, I am being converted to "Associate Software
> Engineer". I would like to take this opportunity to thank all of you for
> your valuable guidance, support and help during this period. I hope you
> will guide me in my future work, correct me when I am wrong, and help me
> to learn more.

Congrats! Keep up the good work on improving the feature, sending the
awesome weekly status updates and detailed blog posts. You've set the
bar high for yourself already ;-)

Good to know that you will stick around for a while.

Cheers,
Niels


> Thank you all.
> 
> [1] http://planet.gluster.org/
> [2] 
> https://blog.gluster.org/2016/07/worm-write-once-read-multiple-retention-and-compliance-2/
> 
> Regards,
> Karthik
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel


signature.asc
Description: PGP signature
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Snapshot Scheduler

2016-07-12 Thread Avra Sengupta
Thanks, Joe, for the feedback. We are aware of the issue you describe below,
and we will try to address it by going for a more generic approach, which
will not have platform dependencies.


On 07/12/2016 11:59 PM, Joe Julian wrote:
cron isn't installed by default on Arch; rather, scheduling is done by
systemd timers. We might want to consider using systemd.timer for
systemd distros and crontab for legacy distros.



On 07/08/2016 03:01 AM, Avra Sengupta wrote:

Hi,

Snapshots in gluster have a scheduler, which relies heavily on
crontab and the shared storage. I would like people who are using this
scheduler, or who plan to use it, to provide us feedback on their
experience. We are looking for feedback on ease of
use, complexity of features, additional feature support etc.


It will help us in deciding if we need to revamp the existing
scheduler, or maybe rethink relying on crontab and re-write our
own, thus providing us more flexibility. Thanks.


Regards,
Avra
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Snapshot Scheduler

2016-07-12 Thread Avra Sengupta
Thanks, Alastair, for the feedback. As of today we have auto-delete,
which, when enabled, deletes the oldest snapshot on exceeding the
snap-max-soft-limit. Is this what you were trying to achieve, or were
you thinking more of a policy-based approach, where, as with creation,
deletion policies can be set to target specific snapshots?
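
(A toy illustration of that auto-delete behaviour, with hard-coded numbers
standing in for the real snapshot list and the snap-max-soft-limit option:)

#include <stdio.h>

#define SNAP_MAX_SOFT_LIMIT 3

int
main (void)
{
        int snaps[8];
        int count = 0;

        for (int i = 1; i <= 5; i++) {
                snaps[count++] = i;            /* take snapshot i */
                if (count > SNAP_MAX_SOFT_LIMIT) {
                        /* soft limit exceeded: drop the oldest snapshot */
                        printf ("auto-delete: removing oldest snapshot %d\n",
                                snaps[0]);
                        for (int j = 1; j < count; j++)
                                snaps[j - 1] = snaps[j];
                        count--;
                }
        }
        printf ("%d snapshots retained\n", count);
        return 0;
}
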



On 07/12/2016 11:45 PM, Alastair Neil wrote:
I don't know if I did something wrong, but I found the location where
the scheduler wanted the shared storage problematic; as I recall it
was under /run/gluster/snaps. On CentOS 7 this failed to mount on
boot, so I hacked the scheduler to use a location under /var/lib.


I also think there needs to be a way to schedule the removal of snapshots.

-Alastair


On 8 July 2016 at 06:01, Avra Sengupta wrote:


Hi,

Snapshots in gluster have a scheduler, which relies heavily on
crontab and the shared storage. I would like people who are using this
scheduler, or who plan to use it, to provide us feedback on their
experience. We are looking for feedback on ease
of use, complexity of features, additional feature support etc.

It will help us in deciding if we need to revamp the existing
scheduler, or maybe rethink relying on crontab and re-write our
own, thus providing us more flexibility. Thanks.

Regards,
Avra
___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-devel




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Snapshot Scheduler

2016-07-12 Thread Joe Julian
cron isn't installed by default on Arch; rather, scheduling is done by
systemd timers. We might want to consider using systemd.timer for
systemd distros and crontab for legacy distros.



On 07/08/2016 03:01 AM, Avra Sengupta wrote:

Hi,

Snapshots in gluster have a scheduler, which relies heavily on
crontab and the shared storage. I would like people who are using this
scheduler, or who plan to use it, to provide us feedback on their
experience. We are looking for feedback on ease of
use, complexity of features, additional feature support etc.


It will help us in deciding if we need to revamp the existing
scheduler, or maybe rethink relying on crontab and re-write our own,
thus providing us more flexibility. Thanks.


Regards,
Avra
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Snapshot Scheduler

2016-07-12 Thread Alastair Neil
I don't know if I did something wrong, but I found the location where the
scheduler wanted the shared storage problematic; as I recall it was
under /run/gluster/snaps. On CentOS 7 this failed to mount on boot, so I
hacked the scheduler to use a location under /var/lib.

I also think there needs to be a way to schedule the removal of snapshots.

-Alastair


On 8 July 2016 at 06:01, Avra Sengupta  wrote:

> Hi,
>
> Snapshots in gluster have a scheduler, which relies heavily on crontab
> and the shared storage. I would like people who are using this scheduler,
> or who plan to use it, to provide us feedback on their experience.
> We are looking for feedback on ease of use, complexity of features,
> additional feature support etc.
>
> It will help us in deciding if we need to revamp the existing scheduler,
> or maybe rethink relying on crontab and re-write our own, thus providing
> us more flexibility. Thanks.
>
> Regards,
> Avra
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] One client can effectively hang entire gluster array

2016-07-12 Thread Glomski, Patrick
Hello, Jeff.

Thanks for responding so quickly. I'm not familiar with the codebase, so if
you don't mind me asking, how much would that list reordering slow things
down for, say, a queue of 1500 client machines? That is, roughly how long
would a client list have to be before it significantly affects latency?

I only ask because we have quite a few clients, and you explicitly call out
that the queue reordering method used may have problems for lots of clients.

Thanks again,

Patrick


On Tue, Jul 12, 2016 at 11:18 AM, Jeff Darcy  wrote:

> > > * We might be able to tweak io-threads (which already runs on the
> > > bricks and already has a global queue) to schedule requests in a
> > > fairer way across clients. Right now it executes them in the
> > > same order that they were read from the network.
> >
> > This sounds like an easier fix. We can make io-threads factor in another
> > input, i.e., the client through which the request came in (essentially
> > frame->root->client), before scheduling. That should at least make the
> > problem bearable, if not solve it completely. As to what algorithm to
> > use, I think we can consider the leaky bucket of the bit-rot
> > implementation, or dmclock. I've not really thought deeper about the
> > algorithm part. If the approach sounds ok, we can discuss more about
> > algos.
>
> I've created a patch to address the most basic part of this, in the
> simplest
> way I could think of.
>
> http://review.gluster.org/#/c/14904/
>
> It's still running through basic tests, so I don't even know if it's really
> correct yet, but it should give an idea of the conceptual direction.
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] One client can effectively hang entire gluster array

2016-07-12 Thread Jeff Darcy
> > * We might be able to tweak io-threads (which already runs on the
> > bricks and already has a global queue) to schedule requests in a
> > fairer way across clients. Right now it executes them in the
> > same order that they were read from the network.
> 
> This sounds like an easier fix. We can make io-threads factor in another
> input, i.e., the client through which the request came in (essentially
> frame->root->client), before scheduling. That should at least make the
> problem bearable, if not solve it completely. As to what algorithm to use,
> I think we can consider the leaky bucket of the bit-rot implementation, or
> dmclock. I've not really thought deeper about the algorithm part. If the
> approach sounds ok, we can discuss more about algos.

I've created a patch to address the most basic part of this, in the simplest
way I could think of.

http://review.gluster.org/#/c/14904/

It's still running through basic tests, so I don't even know if it's really
correct yet, but it should give an idea of the conceptual direction.
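
For anyone curious what that direction looks like in code, below is a small
self-contained sketch of per-client fair queuing. It is toy code only: a
fixed array of clients and integer request ids stand in for io-threads'
real queues and for frame->root->client, and it is not the patch above.

#include <stdio.h>
#include <stdlib.h>

#define MAX_CLIENTS 4   /* toy bound; error handling omitted */

typedef struct request {
        struct request *next;
        int             id;     /* stands in for the real fop call stub */
} request_t;

typedef struct {
        request_t *head;
        request_t *tail;
} fifo_t;

static fifo_t queues[MAX_CLIENTS];      /* one FIFO per client */
static int    rr_next;                  /* round-robin cursor */

/* Enqueue in the FIFO of the client that issued the request. */
static void
fair_enqueue (int client, int req_id)
{
        request_t *r = calloc (1, sizeof (*r));

        r->id = req_id;
        if (queues[client].tail)
                queues[client].tail->next = r;
        else
                queues[client].head = r;
        queues[client].tail = r;
}

/* Dequeue one request, rotating across clients so a single flooding
 * client cannot monopolize the worker threads. Returns -1 when idle. */
static int
fair_dequeue (void)
{
        for (int i = 0; i < MAX_CLIENTS; i++) {
                int        c = (rr_next + i) % MAX_CLIENTS;
                request_t *r = queues[c].head;

                if (!r)
                        continue;
                queues[c].head = r->next;
                if (!queues[c].head)
                        queues[c].tail = NULL;
                rr_next = (c + 1) % MAX_CLIENTS;

                int id = r->id;
                free (r);
                return id;
        }
        return -1;
}

int
main (void)
{
        /* client 0 floods with four requests, client 1 sends one */
        fair_enqueue (0, 100);
        fair_enqueue (0, 101);
        fair_enqueue (0, 102);
        fair_enqueue (0, 103);
        fair_enqueue (1, 200);

        int id;
        while ((id = fair_dequeue ()) != -1)
                printf ("serving request %d\n", id);  /* 100, 200, 101, ... */
        return 0;
}

Note that this toy dequeue scans every client slot, so its per-request cost
grows with the client count; a real implementation would track only the
clients that currently have pending requests.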
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] tests/basic/op_errnos.t failure

2016-07-12 Thread Nigel Babu
Please file a bug against project-infrastructure.

On Tue, Jul 12, 2016 at 6:50 PM, Raghavendra Talur 
wrote:

> Nigel/Misc,
>
> Could you please look into this?
> slave29 does not seem to have a xfs formatted backend for tests.
>
> Thanks,
> Raghavendra Talur
>
> On Tue, Jul 12, 2016 at 6:41 PM, Avra Sengupta 
> wrote:
>
>> Atin,
>>
>> I am not sure about the docker containers, but both the failures you
>> mentioned are in slave29, which, as Talur explained, is missing the
>> appropriate backend filesystem. Owing to this, op-errno.t is just the tip
>> of the iceberg, and every other test that uses lvm will fail on this
>> particular slave too.
>>
>> Talur,
>> Thanks for looking into it. It is indeed strange. I checked the
>> dmesg and the /var/log/messages in this slave and I couldn't find any
>> relevant log.
>>
>>
>> On 07/12/2016 05:29 PM, Raghavendra Talur wrote:
>>
>> I checked the machine.
>>
>> Here is the df -hT output
>> [jenkins@slave29 ~]$ cat /etc/fstab
>> # Accessible filesystems, by reference, are maintained under '/dev/disk'
>> # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
>> #
>> /dev/xvda1  /           ext3    defaults,noatime,barrier=0  1 1
>> tmpfs       /dev/shm    tmpfs   defaults                    0 0
>> devpts      /dev/pts    devpts  gid=5,mode=620              0 0
>> sysfs       /sys        sysfs   defaults                    0 0
>> proc        /proc       proc    defaults                    0 0
>> #/dev/xvdc1 none        swap    sw                          0 0
>>
>>
>> We don't see an xfs device mounted at /d, and / is of type ext3, which does
>> not support fallocate. The uptime of the machine is 73 days though. I don't
>> know how the /d xfs partition vanished.
>>
>> On Tue, Jul 12, 2016 at 4:54 PM, Atin Mukherjee < 
>> amukh...@redhat.com> wrote:
>>
>>>
>>> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22156/consoleFull
>>> - another failure
>>>
>>> On Tue, Jul 12, 2016 at 4:42 PM, Atin Mukherjee < 
>>> amukh...@redhat.com> wrote:
>>>


 On Tue, Jul 12, 2016 at 4:36 PM, Avra Sengupta < 
 aseng...@redhat.com> wrote:

> Hi Atin,
>
> Please check the testcase result in the console. It clearly states the
> reason for the failure. A quick search for 30815, as shown in the testcase,
> shows that the error that is generated is a thinp issue, and we can see
> fallocate failing and lvm not being set up properly in the environment.
>

 While this is valid for my docker containers, I am just wondering: why
 did this happen in the jenkins slave?


> Regards,
> Avra
>
> P.S Here are the logs from the console stating so.
>
> *02:50:34* [09:50:34] Running tests in file ./tests/basic/op_errnos.t
> *02:50:41* fallocate: /d/backends/patchy_snap_vhd: fallocate failed: Operation not supported
> *02:50:41* losetup: /d/backends/patchy_snap_vhd: warning: file smaller than 512 bytes, the loop device maybe be useless or invisible for system tools.
> *02:50:41*   Device /d/backends/patchy_snap_loop not found (or ignored by filtering).
> *02:50:41*   Device /d/backends/patchy_snap_loop not found (or ignored by filtering).
> *02:50:41*   Unable to add physical volume '/d/backends/patchy_snap_loop' to volume group 'patchy_snap_vg_1'.
> *02:50:41*   Volume group "patchy_snap_vg_1" not found
> *02:50:41*   Cannot process volume group patchy_snap_vg_1
> *02:50:42*   Volume group "patchy_snap_vg_1" not found
> *02:50:42*   Cannot process volume group patchy_snap_vg_1
> *02:50:42* /dev/patchy_snap_vg_1/brick_lvm: No such file or directory
> *02:50:42* Usage: mkfs.xfs
> *02:50:42* /* blocksize */          [-b log=n|size=num]
> *02:50:42* /* data subvol */        [-d agcount=n,agsize=n,file,name=xxx,size=num,
> *02:50:42*                              (sunit=value,swidth=value|su=num,sw=num),
> *02:50:42*                              sectlog=n|sectsize=num
> *02:50:42* /* inode size */         [-i log=n|perblock=n|size=num,maxpct=n,attr=0|1|2,
> *02:50:42*                              projid32bit=0|1]
> *02:50:42* /* log subvol */         [-l agnum=n,internal,size=num,logdev=xxx,version=n
> *02:50:42*                              sunit=value|su=num,sectlog=n|sectsize=num,
> *02:50:42*                              lazy-count=0|1]
> *02:50:42* /* label */              [-L label (maximum 12 characters)]
> *02:50:42* /* naming */             [-n log=n|size=num,version=2|ci]
> *02:50:42* /* prototype file */     [-p fname]
> *02:50:42* /* quiet */              [-q]
> *02:50:42* /* realtime subvol */    [-r extsize=num,size=num,rtdev=xxx]
> *02:50:42* /* sectorsize */         [-s

Re: [Gluster-devel] tests/basic/op_errnos.t failure

2016-07-12 Thread Raghavendra Talur
Nigel/Misc,

Could you please look into this?
slave29 does not seem to have a xfs formatted backend for tests.

Thanks,
Raghavendra Talur

On Tue, Jul 12, 2016 at 6:41 PM, Avra Sengupta  wrote:

> Atin,
>
> I am not sure about the docker containers, but both the failures you
> mentioned are in slave29, which, as Talur explained, is missing the
> appropriate backend filesystem. Owing to this, op-errno.t is just the tip
> of the iceberg, and every other test that uses lvm will fail on this
> particular slave too.
>
> Talur,
> Thanks for looking into it. It is indeed strange. I checked the dmesg
> and the /var/log/messages in this slave and I couldn't find any relevant
> log.
>
>
> On 07/12/2016 05:29 PM, Raghavendra Talur wrote:
>
> I checked the machine.
>
> Here is the df -hT output
> [jenkins@slave29 ~]$ cat /etc/fstab
> # Accessible filesystems, by reference, are maintained under '/dev/disk'
> # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
> #
> /dev/xvda1  /           ext3    defaults,noatime,barrier=0  1 1
> tmpfs       /dev/shm    tmpfs   defaults                    0 0
> devpts      /dev/pts    devpts  gid=5,mode=620              0 0
> sysfs       /sys        sysfs   defaults                    0 0
> proc        /proc       proc    defaults                    0 0
> #/dev/xvdc1 none        swap    sw                          0 0
>
>
> We don't see an xfs device mounted at /d, and / is of type ext3, which does
> not support fallocate. The uptime of the machine is 73 days though. I don't
> know how the /d xfs partition vanished.
>
> On Tue, Jul 12, 2016 at 4:54 PM, Atin Mukherjee < 
> amukh...@redhat.com> wrote:
>
>>
>> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22156/consoleFull
>> - another failure
>>
>> On Tue, Jul 12, 2016 at 4:42 PM, Atin Mukherjee < 
>> amukh...@redhat.com> wrote:
>>
>>>
>>>
>>> On Tue, Jul 12, 2016 at 4:36 PM, Avra Sengupta < 
>>> aseng...@redhat.com> wrote:
>>>
 Hi Atin,

 Please check the testcase result in the console. It clearly states the
 reason for the failure. A quick search for 30815, as shown in the testcase,
 shows that the error that is generated is a thinp issue, and we can see
 fallocate failing and lvm not being set up properly in the environment.

>>>
 While this is valid for my docker containers, I am just wondering: why
 did this happen in the jenkins slave?
>>>
>>>
 Regards,
 Avra

 P.S Here are the logs from the console stating so.

 *02:50:34* [09:50:34] Running tests in file ./tests/basic/op_errnos.t
 *02:50:41* fallocate: /d/backends/patchy_snap_vhd: fallocate failed: Operation not supported
 *02:50:41* losetup: /d/backends/patchy_snap_vhd: warning: file smaller than 512 bytes, the loop device maybe be useless or invisible for system tools.
 *02:50:41*   Device /d/backends/patchy_snap_loop not found (or ignored by filtering).
 *02:50:41*   Device /d/backends/patchy_snap_loop not found (or ignored by filtering).
 *02:50:41*   Unable to add physical volume '/d/backends/patchy_snap_loop' to volume group 'patchy_snap_vg_1'.
 *02:50:41*   Volume group "patchy_snap_vg_1" not found
 *02:50:41*   Cannot process volume group patchy_snap_vg_1
 *02:50:42*   Volume group "patchy_snap_vg_1" not found
 *02:50:42*   Cannot process volume group patchy_snap_vg_1
 *02:50:42* /dev/patchy_snap_vg_1/brick_lvm: No such file or directory
 *02:50:42* Usage: mkfs.xfs
 *02:50:42* /* blocksize */          [-b log=n|size=num]
 *02:50:42* /* data subvol */        [-d agcount=n,agsize=n,file,name=xxx,size=num,
 *02:50:42*                              (sunit=value,swidth=value|su=num,sw=num),
 *02:50:42*                              sectlog=n|sectsize=num
 *02:50:42* /* inode size */         [-i log=n|perblock=n|size=num,maxpct=n,attr=0|1|2,
 *02:50:42*                              projid32bit=0|1]
 *02:50:42* /* log subvol */         [-l agnum=n,internal,size=num,logdev=xxx,version=n
 *02:50:42*                              sunit=value|su=num,sectlog=n|sectsize=num,
 *02:50:42*                              lazy-count=0|1]
 *02:50:42* /* label */              [-L label (maximum 12 characters)]
 *02:50:42* /* naming */             [-n log=n|size=num,version=2|ci]
 *02:50:42* /* prototype file */     [-p fname]
 *02:50:42* /* quiet */              [-q]
 *02:50:42* /* realtime subvol */    [-r extsize=num,size=num,rtdev=xxx]
 *02:50:42* /* sectorsize */         [-s log=n|size=num]
 *02:50:42* /* version */            [-V]
 *02:50:42*  devicename
 *02:50:42*  is required unless -d name=xxx is given.
 *02:50:42*  is xxx (bytes), xxxs (sectors), xxxb (fs blocks), xxxk (xxx KiB),
 *02:50:42*   xxxm (xxx MiB), xxxg (xxx
 

Re: [Gluster-devel] tests/basic/op_errnos.t failure

2016-07-12 Thread Avra Sengupta

Atin,

I am not sure about the docker containers, but both the failures you
mentioned are in slave29, which, as Talur explained, is missing the
appropriate backend filesystem. Owing to this, op-errno.t is just the
tip of the iceberg, and every other test that uses lvm will fail on this
particular slave too.


Talur,
Thanks for looking into it. It is indeed strange. I checked the
dmesg and the /var/log/messages in this slave and I couldn't find any
relevant log.


On 07/12/2016 05:29 PM, Raghavendra Talur wrote:

I checked the machine.

Here is the df -hT output
[jenkins@slave29 ~]$ cat /etc/fstab
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/xvda1  /           ext3    defaults,noatime,barrier=0  1 1
tmpfs       /dev/shm    tmpfs   defaults                    0 0
devpts      /dev/pts    devpts  gid=5,mode=620              0 0
sysfs       /sys        sysfs   defaults                    0 0
proc        /proc       proc    defaults                    0 0
#/dev/xvdc1 none        swap    sw                          0 0


We don't see an xfs device mounted at /d, and / is of type ext3, which
does not support fallocate. The uptime of the machine is 73 days
though. I don't know how the /d xfs partition vanished.
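
(As an aside, a pre-flight check on the slaves could catch a missing xfs
backend before a regression run starts. A minimal sketch follows; /d comes
from the test setup above, 0x58465342 is the standard XFS magic from
statfs(2), and everything else is illustrative.)

#include <stdio.h>
#include <sys/vfs.h>

#define XFS_SUPER_MAGIC 0x58465342      /* "XFSB" */

int
main (void)
{
        struct statfs fs;

        if (statfs ("/d", &fs) != 0) {
                perror ("statfs /d");
                return 1;
        }
        if (fs.f_type != XFS_SUPER_MAGIC) {
                fprintf (stderr, "/d is not xfs (f_type=0x%lx); the "
                         "fallocate/mkfs.xfs based test setup will fail\n",
                         (unsigned long) fs.f_type);
                return 1;
        }
        printf ("/d is xfs, ok to run the lvm/snapshot tests\n");
        return 0;
}
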


On Tue, Jul 12, 2016 at 4:54 PM, Atin Mukherjee wrote:



https://build.gluster.org/job/rackspace-regression-2GB-triggered/22156/consoleFull
- another failure

On Tue, Jul 12, 2016 at 4:42 PM, Atin Mukherjee wrote:



On Tue, Jul 12, 2016 at 4:36 PM, Avra Sengupta wrote:

Hi Atin,

Please check the testcase result in the console. It
clearly states the reason for the failure. A quick search
for 30815, as shown in the testcase, shows that the error
that is generated is a thinp issue, and we can see
fallocate failing and lvm not being set up properly in the
environment.


While this is valid for my docker containers, I am just
wondering: why did this happen in the jenkins slave?


Regards,
Avra

P.S Here are the logs from the console stating so.

*02:50:34* [09:50:34] Running tests in file 
./tests/basic/op_errnos.t
*02:50:41* fallocate: /d/backends/patchy_snap_vhd: fallocate 
failed: Operation not supported
*02:50:41* losetup: /d/backends/patchy_snap_vhd: warning: file 
smaller than 512 bytes, the loop device maybe be useless or invisible for 
system tools.
*02:50:41*Device /d/backends/patchy_snap_loop not found (or 
ignored by filtering).
*02:50:41*Device /d/backends/patchy_snap_loop not found (or 
ignored by filtering).
*02:50:41*Unable to add physical volume 
'/d/backends/patchy_snap_loop' to volume group 'patchy_snap_vg_1'.
*02:50:41*Volume group "patchy_snap_vg_1" not found
*02:50:41*Cannot process volume group patchy_snap_vg_1
*02:50:42*Volume group "patchy_snap_vg_1" not found
*02:50:42*Cannot process volume group patchy_snap_vg_1
*02:50:42* /dev/patchy_snap_vg_1/brick_lvm: No such file or 
directory
*02:50:42* Usage: mkfs.xfs
*02:50:42* /* blocksize */  [-b log=n|size=num]
*02:50:42* /* data subvol */[-d 
agcount=n,agsize=n,file,name=xxx,size=num,
*02:50:42*  
(sunit=value,swidth=value|su=num,sw=num),
*02:50:42*  sectlog=n|sectsize=num
*02:50:42* /* inode size */ [-i 
log=n|perblock=n|size=num,maxpct=n,attr=0|1|2,
*02:50:42*  projid32bit=0|1]
*02:50:42* /* log subvol */ [-l 
agnum=n,internal,size=num,logdev=xxx,version=n
*02:50:42*  
sunit=value|su=num,sectlog=n|sectsize=num,
*02:50:42*  lazy-count=0|1]
*02:50:42* /* label */  [-L label (maximum 12 
characters)]
*02:50:42* /* naming */ [-n log=n|size=num,version=2|ci]
*02:50:42* /* prototype file */ [-p fname]
*02:50:42* /* quiet */  [-q]
*02:50:42* /* realtime subvol */[-r 
extsize=num,size=num,rtdev=xxx]
*02:50:42* /* sectorsize */ [-s log=n|size=num]
*02:50:42* /* version */[-V]
*02:50:42*  devicename
*02:50:42*  is required unless -d name=xxx is given.
*02:50:42*  is xxx (bytes), xxxs (sectors), xxxb (fs blocks), 
xxxk (xxx KiB),
*02:50:42*  

[Gluster-devel] Minutes from today's Gluster Community Bug Triage meeting (July 12 2016)

2016-07-12 Thread Soumya Koduri

Hi,

Thanks to everyone who joined the meeting. Please find the minutes of 
today's Gluster Community Bug Triage meeting at the below links.


Minutes: 
https://meetbot.fedoraproject.org/gluster-meeting/2016-07-12/gluster_bug_triage.2016-07-12-12.00.html
Minutes (text): 
https://meetbot.fedoraproject.org/gluster-meeting/2016-07-12/gluster_bug_triage.2016-07-12-12.00.txt 

Log: 
https://meetbot.fedoraproject.org/gluster-meeting/2016-07-12/gluster_bug_triage.2016-07-12-12.00.log.html



Thanks,
Soumya
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Mark all the xlator fops 'static '

2016-07-12 Thread Jiffin Tony Thottan



On 12/07/16 16:00, Kaleb KEITHLEY wrote:

On 07/12/2016 03:00 AM, Jiffin Tony Thottan wrote:


On 31/07/15 19:29, Kaleb S. KEITHLEY wrote:

On 07/30/2015 05:16 PM, Niels de Vos wrote:

On Thu, Jul 30, 2015 at 08:27:15PM +0530, Soumya Koduri wrote:

Hi,

With the applications using and loading different libraries, the
function
symbols with the same name may get resolved incorrectly depending on
the
order in which those libraries get dynamically loaded.

Recently we have seen an issue with 'snapview-client' xlator lookup
fop -
'svc_lookup' which matched with one of the routines provided by
libntirpc,
used by NFS-Ganesha. More details are in [1], [2].

Indeed, the problem seems to be caused in an execution flow like this:

1. nfs-ganesha main binary starts
2. the dynamic linker loads libntirpc (and others)
3. the dynamic linker retrieves symbols from the libntirpc (and others)
4. 'svc_lookup' is among the symbols added to a lookup table (or such)
5. during execution, ganesha loads plugins with dlopen()
6. the fsalgluster.so plugin is linked against libgfapi and gfapi gets
 loaded
7. libgfapi retrieves the .vol file and loads the xlators, including
 snapview-client
8. snapview-client provides a 'svc_lookup' symbol, same name as
 libntirpc provides, but with completely different functionality

So far so good. But I would have expected the compiler to have populated
the function pointers in snapview-client's fops table at compile time;
the dynamic loader should not have been needed to resolve
snapview-client's svc_lookup, because it was (should have been) already
resolved at compile time.

And in fact it is, but, there are semantics for global (.globl) symbols
and run-time linkage that are biting us.



Hi all,

I am hitting a similar type of collision on ganesha 2.4. In ganesha
2.4, we introduced a stackable
mdcache on top of every FSAL. The lookup (mdc_lookup) function has a
similar signature to the gluster
md-cache lookup fop. In my case ganesha always picks up mdc_lookup from
its own layer, not from the
gfapi graph. When I disabled md-cache it worked perfectly. As Soumya
suggested before, do we
need to change every xlator fop to static?


The xlator fops are already effectively static, at least in 3.8 and later.

Starting in 3.8 the xlators are linked with an export map that only
exposes init(), fini(), fops, cbks, options, notify(), mem_acct_init(),
reconfigure(), and dumpops. (A few xlators export other symbols too.)

If this is biting us in 3.7 then we need to make the mdcache fops static.

This isn't C++, so all it takes for a symbol name collision is for the
functions to have the same name, i.e. mdc_lookup() in this case. :-/


Thanks, Kaleb, for the information. I was using gluster 3.7 in my setup.
--
Jiffin


--

Kaleb



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] REMINDER: Gluster Community Bug Triage meeting at 12:00 UTC (~in 30 minutes)

2016-07-12 Thread Soumya Koduri

Hi all,

This meeting is scheduled for anyone who is interested in learning more
about, or assisting with the Bug Triage.

Meeting details:
- location: #gluster-meeting on Freenode IRC
 (https://webchat.freenode.net/?channels=gluster-meeting  )
- date: every Tuesday
- time: 12:00 UTC
 (in your terminal, run: date -d "12:00 UTC")
- agenda: https://public.pad.fsfe.org/p/gluster-bug-triage

Currently the following items are listed:
* Roll Call
* Status of last weeks action items
* Group Triage
* Open Floor

The last two topics have space for additions. If you have a suitable bug
or topic to discuss, please add it to the agenda.

Appreciate your participation.

Thanks,
Soumya
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] tests/basic/op_errnos.t failure

2016-07-12 Thread Atin Mukherjee
https://build.gluster.org/job/rackspace-regression-2GB-triggered/22156/consoleFull
- another failure

On Tue, Jul 12, 2016 at 4:42 PM, Atin Mukherjee  wrote:

>
>
> On Tue, Jul 12, 2016 at 4:36 PM, Avra Sengupta 
> wrote:
>
>> Hi Atin,
>>
>> Please check the testcase result in the console. It clearly states the
>> reason for the failure. A quick search for 30815, as shown in the testcase,
>> shows that the error that is generated is a thinp issue, and we can see
>> fallocate failing and lvm not being set up properly in the environment.
>>
>
> While this is valid for my docker containers, I am just wondering: why did
> this happen in the jenkins slave?
>
>
>> Regards,
>> Avra
>>
>> P.S Here are the logs from the console stating so.
>>
>> *02:50:34* [09:50:34] Running tests in file ./tests/basic/op_errnos.t
>> *02:50:41* fallocate: /d/backends/patchy_snap_vhd: fallocate failed: Operation not supported
>> *02:50:41* losetup: /d/backends/patchy_snap_vhd: warning: file smaller than 512 bytes, the loop device maybe be useless or invisible for system tools.
>> *02:50:41*   Device /d/backends/patchy_snap_loop not found (or ignored by filtering).
>> *02:50:41*   Device /d/backends/patchy_snap_loop not found (or ignored by filtering).
>> *02:50:41*   Unable to add physical volume '/d/backends/patchy_snap_loop' to volume group 'patchy_snap_vg_1'.
>> *02:50:41*   Volume group "patchy_snap_vg_1" not found
>> *02:50:41*   Cannot process volume group patchy_snap_vg_1
>> *02:50:42*   Volume group "patchy_snap_vg_1" not found
>> *02:50:42*   Cannot process volume group patchy_snap_vg_1
>> *02:50:42* /dev/patchy_snap_vg_1/brick_lvm: No such file or directory
>> *02:50:42* Usage: mkfs.xfs
>> *02:50:42* /* blocksize */          [-b log=n|size=num]
>> *02:50:42* /* data subvol */        [-d agcount=n,agsize=n,file,name=xxx,size=num,
>> *02:50:42*                              (sunit=value,swidth=value|su=num,sw=num),
>> *02:50:42*                              sectlog=n|sectsize=num
>> *02:50:42* /* inode size */         [-i log=n|perblock=n|size=num,maxpct=n,attr=0|1|2,
>> *02:50:42*                              projid32bit=0|1]
>> *02:50:42* /* log subvol */         [-l agnum=n,internal,size=num,logdev=xxx,version=n
>> *02:50:42*                              sunit=value|su=num,sectlog=n|sectsize=num,
>> *02:50:42*                              lazy-count=0|1]
>> *02:50:42* /* label */              [-L label (maximum 12 characters)]
>> *02:50:42* /* naming */             [-n log=n|size=num,version=2|ci]
>> *02:50:42* /* prototype file */     [-p fname]
>> *02:50:42* /* quiet */              [-q]
>> *02:50:42* /* realtime subvol */    [-r extsize=num,size=num,rtdev=xxx]
>> *02:50:42* /* sectorsize */         [-s log=n|size=num]
>> *02:50:42* /* version */            [-V]
>> *02:50:42*  devicename
>> *02:50:42*  is required unless -d name=xxx is given.
>> *02:50:42*  is xxx (bytes), xxxs (sectors), xxxb (fs blocks), xxxk (xxx KiB),
>> *02:50:42*   xxxm (xxx MiB), xxxg (xxx GiB), xxxt (xxx TiB) or xxxp (xxx PiB).
>> *02:50:42*  is xxx (512 byte blocks).
>> *02:50:42* mount: special device /dev/patchy_snap_vg_1/brick_lvm does not exist
>> *02:50:53* ./tests/basic/op_errnos.t ..
>> *02:50:53* 1..21
>> *02:50:53* ok 1, LINENUM:12
>> *02:50:53* ok 2, LINENUM:13
>> *02:50:53* ok 3, LINENUM:14
>> *02:50:53* ok 4, LINENUM:16
>> *02:50:53* ok 5, LINENUM:18
>> *02:50:53* ok 6, LINENUM:19
>> *02:50:53* ok 7, LINENUM:20
>>
>>
>>
>>
>> On 07/12/2016 03:47 PM, Atin Mukherjee wrote:
>>
>> Hi Avra,
>>
>> The above fails locally as well, along with a few regression failures I
>> observed; one of them is at [1]
>>
>> not ok 12 Got "  30807" instead of "30809", LINENUM:26
>> FAILED COMMAND: 30809 get-op_errno-xml snapshot restore snap1
>>
>> not ok 17 Got "  30815" instead of "30812", LINENUM:31
>> FAILED COMMAND: 30812 get-op_errno-xml snapshot create snap1 patchy
>> no-timestamp
>>
>> [1]
>> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22154/console
>>
>> --Atin
>>
>>
>>
>
>
> --
>
> --Atin
>



-- 

--Atin
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] tests/basic/op_errnos.t failure

2016-07-12 Thread Atin Mukherjee
On Tue, Jul 12, 2016 at 4:36 PM, Avra Sengupta  wrote:

> Hi Atin,
>
> Please check the testcase result in the console. It clearly states the
> reason for the failure. A quick search for 30815, as shown in the testcase,
> shows that the error that is generated is a thinp issue, and we can see
> fallocate failing and lvm not being set up properly in the environment.
>

While this is valid for my docker containers, I am just wondering: why did
this happen in the jenkins slave?


> Regards,
> Avra
>
> P.S Here are the logs from the console stating so.
>
> *02:50:34* [09:50:34] Running tests in file ./tests/basic/op_errnos.t
> *02:50:41* fallocate: /d/backends/patchy_snap_vhd: fallocate failed: Operation not supported
> *02:50:41* losetup: /d/backends/patchy_snap_vhd: warning: file smaller than 512 bytes, the loop device maybe be useless or invisible for system tools.
> *02:50:41*   Device /d/backends/patchy_snap_loop not found (or ignored by filtering).
> *02:50:41*   Device /d/backends/patchy_snap_loop not found (or ignored by filtering).
> *02:50:41*   Unable to add physical volume '/d/backends/patchy_snap_loop' to volume group 'patchy_snap_vg_1'.
> *02:50:41*   Volume group "patchy_snap_vg_1" not found
> *02:50:41*   Cannot process volume group patchy_snap_vg_1
> *02:50:42*   Volume group "patchy_snap_vg_1" not found
> *02:50:42*   Cannot process volume group patchy_snap_vg_1
> *02:50:42* /dev/patchy_snap_vg_1/brick_lvm: No such file or directory
> *02:50:42* Usage: mkfs.xfs
> *02:50:42* /* blocksize */          [-b log=n|size=num]
> *02:50:42* /* data subvol */        [-d agcount=n,agsize=n,file,name=xxx,size=num,
> *02:50:42*                              (sunit=value,swidth=value|su=num,sw=num),
> *02:50:42*                              sectlog=n|sectsize=num
> *02:50:42* /* inode size */         [-i log=n|perblock=n|size=num,maxpct=n,attr=0|1|2,
> *02:50:42*                              projid32bit=0|1]
> *02:50:42* /* log subvol */         [-l agnum=n,internal,size=num,logdev=xxx,version=n
> *02:50:42*                              sunit=value|su=num,sectlog=n|sectsize=num,
> *02:50:42*                              lazy-count=0|1]
> *02:50:42* /* label */              [-L label (maximum 12 characters)]
> *02:50:42* /* naming */             [-n log=n|size=num,version=2|ci]
> *02:50:42* /* prototype file */     [-p fname]
> *02:50:42* /* quiet */              [-q]
> *02:50:42* /* realtime subvol */    [-r extsize=num,size=num,rtdev=xxx]
> *02:50:42* /* sectorsize */         [-s log=n|size=num]
> *02:50:42* /* version */            [-V]
> *02:50:42*  devicename
> *02:50:42*  is required unless -d name=xxx is given.
> *02:50:42*  is xxx (bytes), xxxs (sectors), xxxb (fs blocks), xxxk (xxx KiB),
> *02:50:42*   xxxm (xxx MiB), xxxg (xxx GiB), xxxt (xxx TiB) or xxxp (xxx PiB).
> *02:50:42*  is xxx (512 byte blocks).
> *02:50:42* mount: special device /dev/patchy_snap_vg_1/brick_lvm does not exist
> *02:50:53* ./tests/basic/op_errnos.t ..
> *02:50:53* 1..21
> *02:50:53* ok 1, LINENUM:12
> *02:50:53* ok 2, LINENUM:13
> *02:50:53* ok 3, LINENUM:14
> *02:50:53* ok 4, LINENUM:16
> *02:50:53* ok 5, LINENUM:18
> *02:50:53* ok 6, LINENUM:19
> *02:50:53* ok 7, LINENUM:20
>
>
>
>
> On 07/12/2016 03:47 PM, Atin Mukherjee wrote:
>
> Hi Avra,
>
> The above fails locally as well, along with a few regression failures I
> observed; one of them is at [1]
>
> not ok 12 Got "  30807" instead of "30809", LINENUM:26
> FAILED COMMAND: 30809 get-op_errno-xml snapshot restore snap1
>
> not ok 17 Got "  30815" instead of "30812", LINENUM:31
> FAILED COMMAND: 30812 get-op_errno-xml snapshot create snap1 patchy
> no-timestamp
>
> [1]
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/22154/console
>
> --Atin
>
>
>


-- 

--Atin
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] tests/basic/op_errnos.t failure

2016-07-12 Thread Avra Sengupta

Hi Atin,

Please check the testcase result in the console. It clearly states the
reason for the failure. A quick search for 30815, as shown in the testcase,
shows that the error that is generated is a thinp issue, and we can see
fallocate failing and lvm not being set up properly in the environment.


Regards,
Avra

P.S Here are the logs from the console stating so.

*02:50:34* [09:50:34] Running tests in file ./tests/basic/op_errnos.t
*02:50:41* fallocate: /d/backends/patchy_snap_vhd: fallocate failed: Operation 
not supported
*02:50:41* losetup: /d/backends/patchy_snap_vhd: warning: file smaller than 512 
bytes, the loop device maybe be useless or invisible for system tools.
*02:50:41*Device /d/backends/patchy_snap_loop not found (or ignored by 
filtering).
*02:50:41*Device /d/backends/patchy_snap_loop not found (or ignored by 
filtering).
*02:50:41*Unable to add physical volume '/d/backends/patchy_snap_loop' to 
volume group 'patchy_snap_vg_1'.
*02:50:41*Volume group "patchy_snap_vg_1" not found
*02:50:41*Cannot process volume group patchy_snap_vg_1
*02:50:42*Volume group "patchy_snap_vg_1" not found
*02:50:42*Cannot process volume group patchy_snap_vg_1
*02:50:42* /dev/patchy_snap_vg_1/brick_lvm: No such file or directory
*02:50:42* Usage: mkfs.xfs
*02:50:42* /* blocksize */  [-b log=n|size=num]
*02:50:42* /* data subvol */[-d agcount=n,agsize=n,file,name=xxx,size=num,
*02:50:42*  (sunit=value,swidth=value|su=num,sw=num),
*02:50:42*  sectlog=n|sectsize=num
*02:50:42* /* inode size */ [-i 
log=n|perblock=n|size=num,maxpct=n,attr=0|1|2,
*02:50:42*  projid32bit=0|1]
*02:50:42* /* log subvol */ [-l 
agnum=n,internal,size=num,logdev=xxx,version=n
*02:50:42*  sunit=value|su=num,sectlog=n|sectsize=num,
*02:50:42*  lazy-count=0|1]
*02:50:42* /* label */  [-L label (maximum 12 characters)]
*02:50:42* /* naming */ [-n log=n|size=num,version=2|ci]
*02:50:42* /* prototype file */ [-p fname]
*02:50:42* /* quiet */  [-q]
*02:50:42* /* realtime subvol */[-r extsize=num,size=num,rtdev=xxx]
*02:50:42* /* sectorsize */ [-s log=n|size=num]
*02:50:42* /* version */[-V]
*02:50:42*  devicename
*02:50:42*  is required unless -d name=xxx is given.
*02:50:42*  is xxx (bytes), xxxs (sectors), xxxb (fs blocks), xxxk (xxx 
KiB),
*02:50:42*xxxm (xxx MiB), xxxg (xxx GiB), xxxt (xxx TiB) or xxxp (xxx 
PiB).
*02:50:42*  is xxx (512 byte blocks).
*02:50:42* mount: special device /dev/patchy_snap_vg_1/brick_lvm does not exist
*02:50:53* ./tests/basic/op_errnos.t ..
*02:50:53* 1..21
*02:50:53* ok 1, LINENUM:12
*02:50:53* ok 2, LINENUM:13
*02:50:53* ok 3, LINENUM:14
*02:50:53* ok 4, LINENUM:16
*02:50:53* ok 5, LINENUM:18
*02:50:53* ok 6, LINENUM:19
*02:50:53* ok 7, LINENUM:20



On 07/12/2016 03:47 PM, Atin Mukherjee wrote:

Hi Avra,

The above fails locally as well, along with a few regression failures I
observed; one of them is at [1]


not ok 12 Got "  30807" instead of "30809", LINENUM:26
FAILED COMMAND: 30809 get-op_errno-xml snapshot restore snap1

not ok 17 Got "  30815" instead of "30812", LINENUM:31
FAILED COMMAND: 30812 get-op_errno-xml snapshot create snap1 patchy 
no-timestamp


[1] 
https://build.gluster.org/job/rackspace-regression-2GB-triggered/22154/console


--Atin


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Mark all the xlator fops 'static '

2016-07-12 Thread Kaleb KEITHLEY
On 07/12/2016 03:00 AM, Jiffin Tony Thottan wrote:
> 
> 
> On 31/07/15 19:29, Kaleb S. KEITHLEY wrote:
>> On 07/30/2015 05:16 PM, Niels de Vos wrote:
>>> On Thu, Jul 30, 2015 at 08:27:15PM +0530, Soumya Koduri wrote:
 Hi,

 With the applications using and loading different libraries, the
 function
 symbols with the same name may get resolved incorrectly depending on
 the
 order in which those libraries get dynamically loaded.

 Recently we have seen an issue with 'snapview-client' xlator lookup
 fop -
 'svc_lookup' which matched with one of the routines provided by
 libntirpc,
 used by NFS-Ganesha. More details are in [1], [2].
>>> Indeed, the problem seems to be caused in an execution flow like this:
>>>
>>> 1. nfs-ganesha main binary starts
>>> 2. the dynamic linker loads libntirpc (and others)
>>> 3. the dynamic linker retrieves symbols from the libntirpc (and others)
>>> 4. 'svc_lookup' is among the symbols added to a lookup table (or such)
>>> 5. during execution, ganesha loads plugins with dlopen()
>>> 6. the fsalgluster.so plugin is linked against libgfapi and gfapi gets
>>> loaded
>>> 7. libgfapi retrieves the .vol file and loads the xlators, including
>>> snapview-client
>>> 8. snapview-client provides a 'svc_lookup' symbol, same name as
>>> libntirpc provides, but with completely different functionality
>> So far so good. But I would have expected the compiler to have populated
>> the function pointers in snapview-client's fops table at compile time;
>> the dynamic loader should not have been needed to resolve
>> snapview-client's svc_lookup, because it was (should have been) already
>> resolved at compile time.
>>
>> And in fact it is, but, there are semantics for global (.globl) symbols
>> and run-time linkage that are biting us.
>>
>>
> 
> Hi all,
> 
> I am hitting a similar type of collision on ganesha 2.4. In ganesha
> 2.4, we introduced a stackable
> mdcache on top of every FSAL. The lookup (mdc_lookup) function has a
> similar signature to the gluster
> md-cache lookup fop. In my case ganesha always picks up mdc_lookup from
> its own layer, not from the
> gfapi graph. When I disabled md-cache it worked perfectly. As Soumya
> suggested before, do we
> need to change every xlator fop to static?
>

The xlator fops are already effectively static, at least in 3.8 and later.

Starting in 3.8 the xlators are linked with an export map that only
exposes init(), fini(), fops, cbks, options, notify(), mem_acct_init(),
reconfigure(), and dumpops. (A few xlators export other symbols too.)

If this is biting us in 3.7 then we need to make the mdcache fops static.

This isn't C++, so all it takes for a symbol name collision is for the
functions to have the same name, i.e. mdc_lookup() in this case. :-/
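
To make the linkage point concrete, here is a minimal sketch; the typedefs
are simplified stand-ins, not the real libglusterfs declarations:

typedef struct call_frame call_frame_t;
typedef struct xlator     xlator_t;
typedef struct loc        loc_t;
typedef struct dict       dict_t;

typedef int (*fop_lookup_t) (call_frame_t *frame, xlator_t *this,
                             loc_t *loc, dict_t *xdata);

struct fops_sketch {
        fop_lookup_t lookup;
};

/* 'static' gives the function internal linkage: the name mdc_lookup never
 * enters the shared object's dynamic symbol table, so a same-named global
 * symbol loaded earlier (ganesha's mdc_lookup, libntirpc's svc_lookup, ...)
 * can no longer be interposed in its place. The fops table still works,
 * because it stores the function's address, not its name. */
static int
mdc_lookup (call_frame_t *frame, xlator_t *this, loc_t *loc, dict_t *xdata)
{
        (void) frame; (void) this; (void) loc; (void) xdata;
        return 0;   /* real code would wind the lookup down the graph */
}

struct fops_sketch fops = {
        .lookup = mdc_lookup,
};

The 3.8 export map achieves the same effect without touching every
definition, since only the whitelisted names remain visible to the dynamic
linker.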

--

Kaleb

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] tests/basic/op_errnos.t failure

2016-07-12 Thread Atin Mukherjee
Hi Avra,

The above fails locally as well along with few regression failures I
observed and one of them are at [1]

not ok 12 Got "  30807" instead of "30809", LINENUM:26
FAILED COMMAND: 30809 get-op_errno-xml snapshot restore snap1

not ok 17 Got "  30815" instead of "30812", LINENUM:31
FAILED COMMAND: 30812 get-op_errno-xml snapshot create snap1 patchy
no-timestamp

[1]
https://build.gluster.org/job/rackspace-regression-2GB-triggered/22154/console

--Atin
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Question on merging zfs snapshot support into the mainline glusterfs

2016-07-12 Thread Rajesh Joseph
Hi Sriram,

The interface is not yet finalized. Maybe this is the right time to
re-ignite the discussion on this.
I can create an etherpad which will explain the initial thoughts and design
ideas on the same.

Thanks & Regards,
Rajesh

On Mon, Jul 11, 2016 at 11:57 PM,  wrote:

> Hi Rajesh,
>
> Could you let us know the idea on how to go about this?
>
> Sriram
>
>
> On Wed, Jul 6, 2016, at 03:18 PM, Pranith Kumar Karampuri wrote:
>
> I believe Rajesh already has something here. May be he can post an outline
> so that we can take it from there?
>
> On Tue, Jul 5, 2016 at 10:52 PM,  wrote:
>
>
> Hi,
>
> I tried to go through the patch to find the reason behind the question
> posted, but couldn't get any concrete details about it.
>
> When going through the mail chain, there were mentions of a generic snapshot
> interface. I'd be interested in doing the changes if you could fill me in
> with some initial information. Thanks.
>
> Sriram
>
>
> On Mon, Jul 4, 2016, at 01:59 PM, B.K.Raghuram wrote:
>
> Hi Rajesh,
> I did not want to respond to the question that you'd posed on the zfs
> snapshot code (about the volume backend backup) as I am not too familiar
> with the code, and the person who coded it is not with us anymore. This
> was done in a bit of a hurry, so it could be that it was just kept for later.
>
> However, Sriram, who is cc'd on this email, has been helping us by starting
> to look at the gluster code and has expressed an interest in taking the
> zfs code changes on. So he can probably dig out an answer to your question.
> Sriram, Rajesh had a question on one of the zfs related patches - (
> https://github.com/fractalio/glusterfs/commit/39a163eca338b6da146f72f380237abd4c671db2#commitcomment-18109851
> )
>
> Sriram is also interested in contributing to the process of creating a
> generic snapshot interface in the gluster code which you and Pranith
> mentioned above. If this is ok with you all, could you fill him in on what
> your thoughts are on that and how he could get started?
> Thanks!
> -Ram
>
> On Wed, Jun 22, 2016 at 11:45 AM, Rajesh Joseph 
> wrote:
>
>
>
> On Tue, Jun 21, 2016 at 4:24 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
> hi,
>   Is there a plan to come up with an interface for snapshot
> functionality? For example, in handling different types of sockets in
> gluster, all we need to do is specify which implementation we want to use,
> and ib, network-socket, and unix-domain sockets all implement the same
> interface. The code doesn't have to assume anything about the underlying
> socket type. Do you think it is a worthwhile effort to separate the
> interface from the code which uses snapshots? I see quite a few
> if (strcmp ("zfs", fstype)) checks which could all be removed if we do
> this. Adding btrfs snapshots in the future would be a breeze as well this
> way; all we need to do is implement the snapshot interface using btrfs
> snapshot commands. I am not talking about this patch per se. I just wanted
> to seek your inputs about future plans for ease of maintaining the feature.
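
For reference, a rough sketch of the kind of interface being discussed
(hypothetical names and a toy backend table, not an agreed design):

#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Each backend (lvm-thinp, zfs, btrfs, ...) fills in the same ops table,
 * and the generic snapshot code calls through the table instead of
 * branching on strcmp ("zfs", fstype). */
struct snap_backend_ops {
        const char *fstype;
        int (*create)  (const char *brick_path, const char *snap_name);
        int (*restore) (const char *brick_path, const char *snap_name);
        int (*remove)  (const char *brick_path, const char *snap_name);
};

/* Toy backends: real ones would drive lvcreate, zfs snapshot, etc. */
static int lvm_create (const char *b, const char *s)
{ printf ("lvm snapshot of %s as %s\n", b, s); return 0; }
static int zfs_create (const char *b, const char *s)
{ printf ("zfs snapshot %s@%s\n", b, s); return 0; }

static struct snap_backend_ops backends[] = {
        { "xfs", lvm_create, NULL, NULL },   /* lvm-thinp for xfs bricks */
        { "zfs", zfs_create, NULL, NULL },
};

/* Pick the backend once, from the brick's filesystem type; everything
 * after this point is backend-agnostic. */
static struct snap_backend_ops *
snap_backend_get (const char *fstype)
{
        for (size_t i = 0; i < sizeof (backends) / sizeof (backends[0]); i++)
                if (strcmp (backends[i].fstype, fstype) == 0)
                        return &backends[i];
        return NULL;
}

int
main (void)
{
        struct snap_backend_ops *ops = snap_backend_get ("zfs");

        if (ops && ops->create)
                ops->create ("pool/brick1", "snap1");
        return 0;
}

With something like this, adding btrfs support would be one more entry in
the backend table rather than more fstype checks scattered through the code.
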
>
>
>
> As I said in my previous mail, this is planned and we will be doing it, but
> due to other priorities it has not been taken up yet.
>
>
>
>
>
> On Tue, Jun 21, 2016 at 11:46 AM, Atin Mukherjee 
> wrote:
>
>
>
> On 06/21/2016 11:41 AM, Rajesh Joseph wrote:
> What kind of locking issues do you see? If you can provide some more
> information I will be able to help you.
>
> That's related to stale lock issues on GlusterD which are there in 3.6.1
> since the fixes landed in the branch post 3.6.1. I have already provided
> the workaround/way to fix them [1]
>
> [1]
> http://www.gluster.org/pipermail/gluster-users/2016-June/thread.html#26995
>
> ~Atin
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
>
>
>
>
> --
> Pranith
>
>
>
>
>
>
>
>
>
> --
> Pranith
>
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Mark all the xlator fops 'static '

2016-07-12 Thread Jiffin Tony Thottan



On 31/07/15 19:29, Kaleb S. KEITHLEY wrote:

On 07/30/2015 05:16 PM, Niels de Vos wrote:

On Thu, Jul 30, 2015 at 08:27:15PM +0530, Soumya Koduri wrote:

Hi,

With the applications using and loading different libraries, the function
symbols with the same name may get resolved incorrectly depending on the
order in which those libraries get dynamically loaded.

Recently we have seen an issue with 'snapview-client' xlator lookup fop -
'svc_lookup' which matched with one of the routines provided by libntirpc,
used by NFS-Ganesha. More details are in [1], [2].

Indeed, the problem seems to be caused in an execution flow like this:

1. nfs-ganesha main binary starts
2. the dynamic linker loads libntirpc (and others)
3. the dynamic linker retrieves symbols from the libntirpc (and others)
4. 'svc_lookup' is among the symbols added to a lookup table (or such)
5. during execution, ganesha loads plugins with dlopen()
6. the fsalgluster.so plugin is linked against libgfapi and gfapi gets
loaded
7. libgfapi retrieves the .vol file and loads the xlators, including
snapview-client
8. snapview-client provides a 'svc_lookup' symbol, same name as
libntirpc provides, but with completely different functionality

So far so good. But I would have expected the compiler to have populated
the function pointers in snapview-client's fops table at compile time;
the dynamic loader should not have been needed to resolve
snapview-client's svc_lookup, because it was (should have been) already
resolved at compile time.

And in fact it is, but, there are semantics for global (.globl) symbols
and run-time linkage that are biting us.




Hi all,

I am hitting a similar type of collision on ganesha 2.4. In ganesha
2.4, we introduced a stackable
mdcache on top of every FSAL. The lookup (mdc_lookup) function has a
similar signature to the gluster
md-cache lookup fop. In my case ganesha always picks up mdc_lookup from
its own layer, not from the
gfapi graph. When I disabled md-cache it worked perfectly. As Soumya
suggested before, do we
need to change every xlator fop to static?

Regards,
Jiffin




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel