Re: [Gluster-devel] bad file access (bit-rot + AFR)

2015-06-29 Thread Venky Shankar
On Tue, Jun 30, 2015 at 10:21 AM, Raghavendra Bhat  wrote:
> On 06/27/2015 03:28 PM, Venky Shankar wrote:
>>
>>
>>
>> On 06/27/2015 02:32 PM, Raghavendra Bhat wrote:
>>>
>>> Hi,
>>>
>>> There is a patch that is submitted for review to deny access to objects
>>> which are marked as bad by scrubber (i.e. the data of the object might have
>>> been corrupted in the backend).
>>>
>>> http://review.gluster.org/#/c/11126/10
>>> http://review.gluster.org/#/c/11389/4
>>>
>>> The above 2 patch sets solve the problem of denying access to the bad
>>> objects (they have passed regression and received a +1 from Venky). But in
>>> our testing we found that there is a race window (which can be larger,
>>> depending upon the scrubber frequency) in which the self-heal daemon may
>>> heal the contents of the bad file before the scrubber can mark it as bad.
>>>
>>> I am not sure whether, if the data truly gets corrupted in the backend,
>>> there is a chance of hitting this issue. But in our testing, to simulate
>>> backend corruption, we modify the contents of the file directly in the
>>> backend. In this case, before the scrubber can mark the object as bad, the
>>> self-heal daemon kicks in and heals the contents of the bad file to the
>>> good copy. Or, before the scrubber marks the file as bad, if a client
>>> accesses it, AFR finds that there is a mismatch in metadata (since we
>>> modified the contents of the file in the backend) and does data and
>>> metadata self-healing, thus copying the contents of the bad copy to the
>>> good copy. From then onwards, clients accessing that object always get
>>> bad data.
>>
>>
>> I understand from Ravi (ranaraya@) that AFR-v2 would choose the "biggest"
>> file as the source, provided that afr xattrs are "clean" (AFR-v1 would give
>> back EIO). If a file is modified directly from the brick but leaves the size
>> unchanged, contents can be served from either copy. For self-heal to detect
>> anomalies, there needs to be verification (checksum/signature) at each stage
>> of its operation. But this might be too heavy on the I/O side. We could
>> still cache mtime [but update on client I/O] after pre-check, but this still
>> would not catch bit flips (unless a filesystem scrub is done).
>>
>> Thoughts?
>>
>
> Yes. Even if one wants to verify just before healing the file, the time
> taken to verify the checksum might be large if the file size is large. It
> might affect the self-heal performance.

Yes, but only when bitrot is enabled.

Probably this needs a bit more thinking.
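
To make the cost being discussed concrete, here is a minimal sketch of a
pre-heal check. Everything in it is hypothetical (struct replica,
checksum_fnv1a and pick_heal_source are made-up names, not AFR or bit-rot
code), and FNV-1a only stands in for whatever signature is actually stored.
The point is that a copy is accepted as heal source only if its contents
still match its recorded signature, which means reading the whole file:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one brick's copy of a file. */
struct replica {
        const unsigned char *data;      /* current on-disk contents */
        size_t               len;
        uint64_t             signature; /* checksum recorded at signing time */
};

/* FNV-1a, used here only as a cheap placeholder for a real signature. */
static uint64_t
checksum_fnv1a (const unsigned char *buf, size_t len)
{
        uint64_t h = 14695981039346656037ULL;
        size_t   i;

        for (i = 0; i < len; i++) {
                h ^= buf[i];
                h *= 1099511628211ULL;
        }
        return h;
}

/* Return the index of the first replica whose contents still match its
 * recorded signature, or -1 if none can be trusted as a heal source. */
static int
pick_heal_source (const struct replica *copies, int count)
{
        int i;

        assert (copies != NULL);
        for (i = 0; i < count; i++) {
                if (checksum_fnv1a (copies[i].data, copies[i].len) ==
                    copies[i].signature)
                        return i;
        }
        return -1;
}
```

The loop in checksum_fnv1a is the expensive part for large files, and it
would only need to run when bitrot is enabled.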

>
> Regards,
> Raghavendra Bhat
>
>
>>>
>>> Pranith? Do you have any solution for this? Venky and I are trying to
>>> come up with a solution for this.
>>>
>>> But does this issue block the above patches in any way? (Those 2 patches
>>> are still needed to deny access to objects once they are marked as bad by
>>> scrubber).
>>>
>>>
>>> Regards,
>>> Raghavendra Bhat
>>> ___
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] nbslave7i disabled

2015-06-29 Thread Kaushal M
nbslave7{2,4,h} were also faulty. I've rebooted them as well.

On Tue, Jun 30, 2015 at 10:45 AM, Kaushal M  wrote:
> I've disabled nbslave7i as it has failed 10 NetBSD regression runs in
> a row (7641 to 7650). [1]
>
> I've retriggered the failed jobs. I'll reboot the machine and add it
> to the pool if it works.
>
> ~kaushal
>
> [1]: http://build.gluster.org/computer/nbslave7i.cloud.gluster.org/builds


[Gluster-devel] nbslave7i disabled

2015-06-29 Thread Kaushal M
I've disabled nbslave7i as it has failed 10 NetBSD regression runs in
a row (7641 to 7650). [1]

I've retriggered the failed jobs. I'll reboot the machine and add it
to the pool if it works.

~kaushal

[1]: http://build.gluster.org/computer/nbslave7i.cloud.gluster.org/builds


Re: [Gluster-devel] bad file access (bit-rot + AFR)

2015-06-29 Thread Raghavendra Bhat

On 06/27/2015 03:28 PM, Venky Shankar wrote:
>
>
>
> On 06/27/2015 02:32 PM, Raghavendra Bhat wrote:
>>
>> Hi,
>>
>> There is a patch that is submitted for review to deny access to objects
>> which are marked as bad by scrubber (i.e. the data of the object might have
>> been corrupted in the backend).
>>
>> http://review.gluster.org/#/c/11126/10
>> http://review.gluster.org/#/c/11389/4
>>
>> The above 2 patch sets solve the problem of denying access to the bad
>> objects (they have passed regression and received a +1 from Venky). But in
>> our testing we found that there is a race window (which can be larger,
>> depending upon the scrubber frequency) in which the self-heal daemon may
>> heal the contents of the bad file before the scrubber can mark it as bad.
>>
>> I am not sure whether, if the data truly gets corrupted in the backend,
>> there is a chance of hitting this issue. But in our testing, to simulate
>> backend corruption, we modify the contents of the file directly in the
>> backend. In this case, before the scrubber can mark the object as bad, the
>> self-heal daemon kicks in and heals the contents of the bad file to the
>> good copy. Or, before the scrubber marks the file as bad, if a client
>> accesses it, AFR finds that there is a mismatch in metadata (since we
>> modified the contents of the file in the backend) and does data and
>> metadata self-healing, thus copying the contents of the bad copy to the
>> good copy. From then onwards, clients accessing that object always get
>> bad data.
>
>
> I understand from Ravi (ranaraya@) that AFR-v2 would choose the "biggest"
> file as the source, provided that afr xattrs are "clean" (AFR-v1 would give
> back EIO). If a file is modified directly from the brick but leaves the size
> unchanged, contents can be served from either copy. For self-heal to detect
> anomalies, there needs to be verification (checksum/signature) at each stage
> of its operation. But this might be too heavy on the I/O side. We could
> still cache mtime [but update on client I/O] after pre-check, but this still
> would not catch bit flips (unless a filesystem scrub is done).
>
> Thoughts?
>

Yes. Even if one wants to verify just before healing the file, the time
taken to verify the checksum might be large if the file size is large.
It might affect the self-heal performance.

Regards,
Raghavendra Bhat

>>
>> Pranith? Do you have any solution for this? Venky and I are trying to
>> come up with a solution for this.
>>
>> But does this issue block the above patches in any way? (Those 2 patches
>> are still needed to deny access to objects once they are marked as bad by
>> scrubber).
>>
>>
>> Regards,
>> Raghavendra Bhat


Re: [Gluster-devel] Gluster and GCC 5.1

2015-06-29 Thread Niels de Vos
On Mon, Jun 29, 2015 at 01:47:11PM -0400, Prashanth Pai wrote:
> Ah, I thought it was just me who was running into this.
> http://review.gluster.org/11214

Please file a decent bug with the behaviour and error messages. We need
this documented for our users that will likely hit the same problem at
one point.

Thanks,
Niels

> 
> Regards,
>  -Prashanth Pai
> 
> - Original Message -
> > From: "Joseph Fernandes" 
> > To: "Gluster Devel" 
> > Sent: Monday, June 29, 2015 5:54:28 PM
> > Subject: [Gluster-devel] Gluster and GCC 5.1
> > 
> > Hi All,
> > 
> > Recently I installed Fedora 22 on some fresh vms, which comes with gcc
> > 5.1.1-1(which can be upgraded to 5.1.1-4)
> > Observed one thing that normal "inline functions" will be undefined symbols
> > in the "so" files.
> > As a result I had trouble in start volume as "gf_sql_str2sync_t"
> > 
> > [2015-06-29 05:52:38.491378] I [MSGID: 101190]
> > [event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread with
> > index 1
> > [2015-06-29 05:52:38.499205] W [MSGID: 101095] [xlator.c:189:xlator_dynload]
> > 0-xlator: /usr/local/lib/libgfdb.so.0: undefined symbol: gf_sql_str2sync_t
> > [2015-06-29 05:52:38.499229] E [MSGID: 101002] [graph.y:211:volume_type]
> > 0-parser: Volume 'test-changetimerecorder', line 16: type
> > 'features/changetimerecorder' is not valid or not found on this machine
> > [2015-06-29 05:52:38.499262] E [MSGID: 101019] [graph.y:319:volume_end]
> > 0-parser: "type" not specified for volume test-changetimerecorder
> > [2015-06-29 05:52:38.499335] E [MSGID: 100026]
> > [glusterfsd.c:2151:glusterfs_process_volfp] 0-: failed to construct the
> > graph
> > [2015-06-29 05:52:38.499470] W [glusterfsd.c:1214:cleanup_and_exit] (--> 0-:
> > received signum (0), shutting down
> > 
> > when gf_sql_str2sync_t was made "static inline gf_sql_str2sync_t" the next
> > issue was with "changelog_dispatch_vec"
> > 
> > [2015-06-29 07:11:33.367259] I [MSGID: 101190]
> > [event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread with
> > index 1
> > [2015-06-29 07:11:33.368816] W [MSGID: 101095] [xlator.c:189:xlator_dynload]
> > 0-xlator: /usr/local/lib/glusterfs/3.8dev/xlator/features/changelog.so:
> > undefined symbol: changelog_dispatch_vec
> > [2015-06-29 07:11:33.368829] E [MSGID: 101002] [graph.y:211:volume_type]
> > 0-parser: Volume 'test-changelog', line 32: type 'features/changelog' is not
> > valid or not found on this machine
> > [2015-06-29 07:11:33.368843] E [MSGID: 101019] [graph.y:319:volume_end]
> > 0-parser: "type" not specified for volume test-changelog
> > [2015-06-29 07:11:33.368922] E [MSGID: 100026]
> > [glusterfsd.c:2151:glusterfs_process_volfp] 0-: failed to construct the
> > graph
> > [2015-06-29 07:11:33.369025] W [glusterfsd.c:1214:cleanup_and_exit] (--> 0-:
> > received signum (0), shutting down
> > 
> > and so on.
> > 
> > Looks like "inline" functions should be marked as "static inline" or "extern
> > inline" explicitly
> > Please refer https://gcc.gnu.org/gcc-5/porting_to.html
> > 
> > To recreate the issue without glusterfs, try out this sample code program on
> > fedora 22 gcc 5.1.1 or higher
> > 
> > hello.c
> > ===
> > 
> > #include <stdio.h>
> > 
> > inline void foo () {
> > printf ("hello world");
> > }
> > 
> > int main () {
> > 
> > foo ();
> > return 0;
> > }
> > 
> > # gcc hello.c
> > /tmp/ccUQ1XPp.o: In function `main':
> > hello.c:(.text+0xa): undefined reference to `foo'
> > collect2: error: ld returned 1 exit status
> > #
> > 
> > Should we change all the inline function to "static inline" or "extern
> > inline" in gluster, appropriately to their scope of use (IMHO would be a
> > right thing to do)?
> > or should we use a compiler flag to suppress this?
> > 
> > Regards,
> > Joe
> > 
> > 
> > 
> > 


Re: [Gluster-devel] Gluster and GCC 5.1

2015-06-29 Thread Niels de Vos
On Mon, Jun 29, 2015 at 02:45:50PM -0400, Kaleb S. KEITHLEY wrote:
> On 06/29/2015 02:40 PM, Anoop C S wrote:
> > 
> > 
> > On 06/29/2015 11:56 PM, Kaleb S. KEITHLEY wrote:
> >> On 06/29/2015 02:04 PM, Anoop C S wrote:
> >>>
> >>> Reading through gcc docs I could see that with gcc v5, it
> >>> defaults to C99 semantics and compiling with -fgnu89-inline
> >>> solves the above issue. I'm wondering how glusterfs compiled
> >>> successfully without providing this flag.
> > 
> >> Well, it didn't originally.
> > 
> >> Most of the bugs were fixed before Fedora 22 arrived with gcc-5.
>
> 
> I didn't say all of them had been fixed. ;-)
> 
> > 
> >> E.g. by building on Fedora Rawhide last year, reporting the bugs,
> >> and getting them fixed before Rawhide turned into Fedora 22.
> > 
> > 
> > Immediately after the Fedora 22 release, we had a similar issue
> > reported from a user with gcc v5 and was fixed through [1].
> >
> 
> Yes, there were still some, and/or some had crept (back) in, when
> Fedora22 was released.
> 
> I think we're still finding some more. Not all of the developers are
> using Fedora 22 yet. I know a few that are still on Fedora 19!
> 
> Niels and I had a brief chat a while back about adding a Fedora 22 VM to
> Jenkins to catch these. I don't think anything ever happened though.

I'm waiting for the new Jenkins slaves to become available... We should
try to not exceed the Rackspace budget too much.

Niels


Re: [Gluster-devel] Gluster and GCC 5.1

2015-06-29 Thread Niels de Vos
On Mon, Jun 29, 2015 at 03:00:26PM -0400, Kaleb S. KEITHLEY wrote:
> On 06/29/2015 02:45 PM, Kaleb S. KEITHLEY wrote:
> >>
> >>> E.g. by building on Fedora Rawhide last year, reporting the bugs,
> >>> and getting them fixed before Rawhide turned into Fedora 22.
> >>
> >>
> >> Immediately after the Fedora 22 release, we had a similar issue
> >> reported from a user with gcc v5 and was fixed through [1].
> >>
> > 
> > Yes, there were still some, and/or some had crept (back) in, when
> > Fedora22 was released.
> > 
> > I think we're still finding some more. Not all of the developers are
> > using Fedora 22 yet. I know a few that are still on Fedora 19!
> > 
> > Niels and I had a brief chat a while back about adding a Fedora 22 VM to
> > Jenkins to catch these. I don't think anything ever happened though.
> 
> Or perhaps we could just get everyone to stop using 'inline'

YES, THIS! I think there are only very few exceptions where we should
suggest to the compiler what to do. Most compilers are smart enough
to make the right decisions.

It also prevents weird corner cases, like what happens to alloca() when
a function is inline'd? If you are not sure about when you could use
inline, you should rather not use it ;-)

Niels


Re: [Gluster-devel] Gluster and GCC 5.1

2015-06-29 Thread Kaleb S. KEITHLEY
On 06/29/2015 02:45 PM, Kaleb S. KEITHLEY wrote:
>>
>>> E.g. by building on Fedora Rawhide last year, reporting the bugs,
>>> and getting them fixed before Rawhide turned into Fedora 22.
>>
>>
>> Immediately after the Fedora 22 release, we had a similar issue
>> reported from a user with gcc v5 and was fixed through [1].
>>
> 
> Yes, there were still some, and/or some had crept (back) in, when
> Fedora22 was released.
> 
> I think we're still finding some more. Not all of the developers are
> using Fedora 22 yet. I know a few that are still on Fedora 19!
> 
> Niels and I had a brief chat a while back about adding a Fedora 22 VM to
> Jenkins to catch these. I don't think anything ever happened though.

Or perhaps we could just get everyone to stop using 'inline'


-- 

Kaleb


Re: [Gluster-devel] Gluster and GCC 5.1

2015-06-29 Thread Kaleb S. KEITHLEY
On 06/29/2015 02:40 PM, Anoop C S wrote:
> 
> 
> On 06/29/2015 11:56 PM, Kaleb S. KEITHLEY wrote:
>> On 06/29/2015 02:04 PM, Anoop C S wrote:
>>>
>>> Reading through gcc docs I could see that with gcc v5, it
>>> defaults to C99 semantics and compiling with -fgnu89-inline
>>> solves the above issue. I'm wondering how glusterfs compiled
>>> successfully without providing this flag.
> 
>> Well, it didn't originally.
> 
>> Most of the bugs were fixed before Fedora 22 arrived with gcc-5.
   

I didn't say all of them had been fixed. ;-)

> 
>> E.g. by building on Fedora Rawhide last year, reporting the bugs,
>> and getting them fixed before Rawhide turned into Fedora 22.
> 
> 
> Immediately after the Fedora 22 release, we had a similar issue
> reported from a user with gcc v5 and was fixed through [1].
>

Yes, there were still some, and/or some had crept (back) in, when
Fedora22 was released.

I think we're still finding some more. Not all of the developers are
using Fedora 22 yet. I know a few that are still on Fedora 19!

Niels and I had a brief chat a while back about adding a Fedora 22 VM to
Jenkins to catch these. I don't think anything ever happened though.

-- 

Kaleb


Re: [Gluster-devel] Gluster and GCC 5.1

2015-06-29 Thread Anoop C S
On 06/29/2015 11:56 PM, Kaleb S. KEITHLEY wrote:
> On 06/29/2015 02:04 PM, Anoop C S wrote:
>> 
>> Reading through gcc docs I could see that with gcc v5, it
>> defaults to C99 semantics and compiling with -fgnu89-inline
>> solves the above issue. I'm wondering how glusterfs compiled
>> successfully without providing this flag.
> 
> Well, it didn't originally.
> 
> Most of the bugs were fixed before Fedora 22 arrived with gcc-5.
> 
> E.g. by building on Fedora Rawhide last year, reporting the bugs,
> and getting them fixed before Rawhide turned into Fedora 22.
> 

Immediately after the Fedora 22 release, we had a similar issue
reported from a user with gcc v5 and was fixed through [1].

[1] http://review.gluster.org/#/c/11004/

>> I'm investigating on other ways to use inline that behave the
>> same in the old and the new semantics.
> 
> Using a command line option to get 1989 compiler semantics in 2015
> seems like a mistake.
> 

That's correct. I just tried the -fgnu89-inline option to see whether it
fixes the issue or not. I will have to check once more by building
glusterfs on f22 with gcc v5. I will update the thread when I'm done.

--Anoop C S.


Re: [Gluster-devel] Gluster and GCC 5.1

2015-06-29 Thread Kaleb S. KEITHLEY
On 06/29/2015 02:04 PM, Anoop C S wrote:
> 
> Reading through gcc docs I could see that with gcc v5, it defaults to
> C99 semantics and compiling with -fgnu89-inline solves the above
> issue. I'm wondering how glusterfs compiled successfully without
> providing this flag. 

Well, it didn't originally.

Most of the bugs were fixed before Fedora 22 arrived with gcc-5.

E.g. by building on Fedora Rawhide last year, reporting the bugs, and
getting them fixed before Rawhide turned into Fedora 22.

> I'm investigating on other ways to use inline
> that behave the same in the old and the new semantics.

Using a command line option to get 1989 compiler semantics in 2015 seems
like a mistake.

-- 

Kaleb


Re: [Gluster-devel] Gluster and GCC 5.1

2015-06-29 Thread Anoop C S
On 06/29/2015 05:54 PM, Joseph Fernandes wrote:
> Hi All,
> 
> Recently I installed Fedora 22 on some fresh vms, which comes with
> gcc 5.1.1-1(which can be upgraded to 5.1.1-4) Observed one thing
> that normal "inline functions" will be undefined symbols in the
> "so" files. As a result I had trouble in start volume as
> "gf_sql_str2sync_t"
> 
> [2015-06-29 05:52:38.491378] I [MSGID: 101190]
> [event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started
> thread with index 1 [2015-06-29 05:52:38.499205] W [MSGID: 101095]
> [xlator.c:189:xlator_dynload] 0-xlator:
> /usr/local/lib/libgfdb.so.0: undefined symbol: gf_sql_str2sync_t 
> [2015-06-29 05:52:38.499229] E [MSGID: 101002]
> [graph.y:211:volume_type] 0-parser: Volume
> 'test-changetimerecorder', line 16: type
> 'features/changetimerecorder' is not valid or not found on this
> machine [2015-06-29 05:52:38.499262] E [MSGID: 101019]
> [graph.y:319:volume_end] 0-parser: "type" not specified for volume
> test-changetimerecorder [2015-06-29 05:52:38.499335] E [MSGID:
> 100026] [glusterfsd.c:2151:glusterfs_process_volfp] 0-: failed to
> construct the graph [2015-06-29 05:52:38.499470] W
> [glusterfsd.c:1214:cleanup_and_exit] (--> 0-: received signum (0),
> shutting down
> 
> when gf_sql_str2sync_t was made "static inline gf_sql_str2sync_t"
> the next issue was with "changelog_dispatch_vec"
> 
> [2015-06-29 07:11:33.367259] I [MSGID: 101190]
> [event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started
> thread with index 1 [2015-06-29 07:11:33.368816] W [MSGID: 101095]
> [xlator.c:189:xlator_dynload] 0-xlator:
> /usr/local/lib/glusterfs/3.8dev/xlator/features/changelog.so:
> undefined symbol: changelog_dispatch_vec [2015-06-29
> 07:11:33.368829] E [MSGID: 101002] [graph.y:211:volume_type]
> 0-parser: Volume 'test-changelog', line 32: type
> 'features/changelog' is not valid or not found on this machine 
> [2015-06-29 07:11:33.368843] E [MSGID: 101019]
> [graph.y:319:volume_end] 0-parser: "type" not specified for volume
> test-changelog [2015-06-29 07:11:33.368922] E [MSGID: 100026]
> [glusterfsd.c:2151:glusterfs_process_volfp] 0-: failed to construct
> the graph [2015-06-29 07:11:33.369025] W
> [glusterfsd.c:1214:cleanup_and_exit] (--> 0-: received signum (0),
> shutting down
> 
> and so on.
> 

I have been doing my development work on Fedora 22 for the past 2 weeks
and I haven't encountered a similar error recently. I could create, start
and mount volumes successfully. If you had rpms installed previously, can
you make sure that those are cleaned up correctly?

> Looks like "inline" functions should be marked as "static inline"
> or "extern inline" explicitly Please refer
> https://gcc.gnu.org/gcc-5/porting_to.html
> 
> To recreate the issue without glusterfs, try out this sample code
> program on fedora 22 gcc 5.1.1 or higher
> 
> hello.c
> ===
> 
> #include <stdio.h>
> 
> inline void foo () {
> printf ("hello world");
> }
> 
> int main () {
> 
> foo ();
> return 0;
> }
> 
> # gcc hello.c
> /tmp/ccUQ1XPp.o: In function `main':
> hello.c:(.text+0xa): undefined reference to `foo'
> collect2: error: ld returned 1 exit status
> #
> 

Reading through the gcc docs I could see that gcc v5 defaults to C99
inline semantics, and that compiling with -fgnu89-inline solves the above
issue. I'm wondering how glusterfs compiled successfully without
providing this flag. I'm investigating other ways to use inline
that behave the same in the old and the new semantics.
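
As a rough illustration of a form that should behave the same under both
semantics, here is a minimal sketch. The names add_one and call_add_one are
made up for this example and are not glusterfs functions. A plain "inline"
definition in C99 does not provide an external symbol, which is why hello.c
fails to link when the call is not inlined; "static inline" avoids that
because the function has internal linkage:

```c
#include <assert.h>
#include <limits.h>

/* `static inline` gives the function internal linkage, so the compiler
 * emits a local out-of-line copy whenever a call is not inlined.  This
 * holds under both gnu89 and C99 inline semantics. */
static inline int
add_one (int x)
{
        return x + 1;
}

/* Ordinary external function that uses the helper; it links cleanly even
 * at -O0, where the call to add_one is typically not inlined. */
int
call_add_one (int x)
{
        assert (x < INT_MAX);   /* guard against signed overflow */
        return add_one (x);
}
```

For a helper that really must be shared across translation units, the other
route is an inline definition in a header plus exactly one "extern inline"
declaration in a .c file, but that only works with the C99 semantics.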

> Should we change all the inline function to "static inline" or
> "extern inline" in gluster, appropriately to their scope of use
> (IMHO would be a right thing to do)? or should we use a compiler
> flag to suppress this?
> 
> Regards, Joe
> 
> 
> 
> 


Re: [Gluster-devel] Gluster and GCC 5.1

2015-06-29 Thread Prashanth Pai
Ah, I thought it was just me who was running into this.
http://review.gluster.org/11214

Regards,
 -Prashanth Pai

- Original Message -
> From: "Joseph Fernandes" 
> To: "Gluster Devel" 
> Sent: Monday, June 29, 2015 5:54:28 PM
> Subject: [Gluster-devel] Gluster and GCC 5.1
> 
> Hi All,
> 
> Recently I installed Fedora 22 on some fresh vms, which comes with gcc
> 5.1.1-1(which can be upgraded to 5.1.1-4)
> Observed one thing that normal "inline functions" will be undefined symbols
> in the "so" files.
> As a result I had trouble in start volume as "gf_sql_str2sync_t"
> 
> [2015-06-29 05:52:38.491378] I [MSGID: 101190]
> [event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread with
> index 1
> [2015-06-29 05:52:38.499205] W [MSGID: 101095] [xlator.c:189:xlator_dynload]
> 0-xlator: /usr/local/lib/libgfdb.so.0: undefined symbol: gf_sql_str2sync_t
> [2015-06-29 05:52:38.499229] E [MSGID: 101002] [graph.y:211:volume_type]
> 0-parser: Volume 'test-changetimerecorder', line 16: type
> 'features/changetimerecorder' is not valid or not found on this machine
> [2015-06-29 05:52:38.499262] E [MSGID: 101019] [graph.y:319:volume_end]
> 0-parser: "type" not specified for volume test-changetimerecorder
> [2015-06-29 05:52:38.499335] E [MSGID: 100026]
> [glusterfsd.c:2151:glusterfs_process_volfp] 0-: failed to construct the
> graph
> [2015-06-29 05:52:38.499470] W [glusterfsd.c:1214:cleanup_and_exit] (--> 0-:
> received signum (0), shutting down
> 
> when gf_sql_str2sync_t was made "static inline gf_sql_str2sync_t" the next
> issue was with "changelog_dispatch_vec"
> 
> [2015-06-29 07:11:33.367259] I [MSGID: 101190]
> [event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread with
> index 1
> [2015-06-29 07:11:33.368816] W [MSGID: 101095] [xlator.c:189:xlator_dynload]
> 0-xlator: /usr/local/lib/glusterfs/3.8dev/xlator/features/changelog.so:
> undefined symbol: changelog_dispatch_vec
> [2015-06-29 07:11:33.368829] E [MSGID: 101002] [graph.y:211:volume_type]
> 0-parser: Volume 'test-changelog', line 32: type 'features/changelog' is not
> valid or not found on this machine
> [2015-06-29 07:11:33.368843] E [MSGID: 101019] [graph.y:319:volume_end]
> 0-parser: "type" not specified for volume test-changelog
> [2015-06-29 07:11:33.368922] E [MSGID: 100026]
> [glusterfsd.c:2151:glusterfs_process_volfp] 0-: failed to construct the
> graph
> [2015-06-29 07:11:33.369025] W [glusterfsd.c:1214:cleanup_and_exit] (--> 0-:
> received signum (0), shutting down
> 
> and so on.
> 
> Looks like "inline" functions should be marked as "static inline" or "extern
> inline" explicitly
> Please refer https://gcc.gnu.org/gcc-5/porting_to.html
> 
> To recreate the issue without glusterfs, try out this sample code program on
> fedora 22 gcc 5.1.1 or higher
> 
> hello.c
> ===
> 
> #include <stdio.h>
> 
> inline void foo () {
> printf ("hello world");
> }
> 
> int main () {
> 
> foo ();
> return 0;
> }
> 
> # gcc hello.c
> /tmp/ccUQ1XPp.o: In function `main':
> hello.c:(.text+0xa): undefined reference to `foo'
> collect2: error: ld returned 1 exit status
> #
> 
> Should we change all the inline function to "static inline" or "extern
> inline" in gluster, appropriately to their scope of use (IMHO would be a
> right thing to do)?
> or should we use a compiler flag to suppress this?
> 
> Regards,
> Joe
> 
> 
> 
> 


Re: [Gluster-devel] GF_FOP_IPC changes

2015-06-29 Thread Soumya Koduri



On 06/29/2015 08:18 PM, Niels de Vos wrote:
> On Wed, Jun 24, 2015 at 07:44:13PM +0530, Soumya Koduri wrote:
>>
>>
>> On 06/24/2015 10:14 AM, Krishnan Parthasarathi wrote:
>>>
>>>
>>> - Original Message -
>>>> I've been looking at the recent patches to redirect GF_FOP_IPC to an active
>>>> subvolume instead of always to the first.  Specifically, these:
>>>>
>>>> http://review.gluster.org/11346 for DHT
>>>> http://review.gluster.org/11347 for EC
>>>> http://review.gluster.org/11348 for AFR
>>>>
>>>> I can't help but wonder if there's a simpler and more generic way to do this,
>>>> instead of having to do this in a translator-specific way each time - then
>>>> again for NSR, or for a separate tiering translator, and so on.  For example
>>>> what if each translator had a first_active_child callback?
>>>>
>>>> xlator_t * (*first_active_child) (xlator_t *parent);
>>>>
>>>> Then default_ipc could invoke this, if it exists, where it currently invokes
>>>> FIRST_CHILD.  Each translator could implement a bare minimum to select a
>>>> child, then "step out of the way" for a fop it really wasn't all that
>>>> interested in to begin with.  Any thoughts?
>>>
>>> We should do this right away. This change doesn't affect external interfaces.
>>> We should be bold and implement the first solution. Over time we could improve
>>> on this.
>>
>> +1. It would definitely ease the implementation of many such fops which have
>> to default to first active child. We need not keep track of all the fops
>> which may get affected with new clustering xlators being added.
>
> I think it is a great improvement and makes the code much easier to
> understand. Do we have a volunteer that wants to have a go at
> implementing this?

I volunteer. Pranith already seems to have some thoughts on it. I shall
check with him and update with the initial findings.

Thanks,
Soumya

>
> Thanks,
> Niels
>



Re: [Gluster-devel] GF_FOP_IPC changes

2015-06-29 Thread Niels de Vos
On Wed, Jun 24, 2015 at 07:44:13PM +0530, Soumya Koduri wrote:
> 
> 
> On 06/24/2015 10:14 AM, Krishnan Parthasarathi wrote:
> >
> >
> >- Original Message -
> >>I've been looking at the recent patches to redirect GF_FOP_IPC to an active
> >>subvolume instead of always to the first.  Specifically, these:
> >>
> >>http://review.gluster.org/11346 for DHT
> >>http://review.gluster.org/11347 for EC
> >>http://review.gluster.org/11348 for AFR
> >>
> >>I can't help but wonder if there's a simpler and more generic way to do
> >>this, instead of having to do this in a translator-specific way each time -
> >>then again for NSR, or for a separate tiering translator, and so on.  For
> >>example what if each translator had a first_active_child callback?
> >>
> >>xlator_t * (*first_active_child) (xlator_t *parent);
> >>
> >>Then default_ipc could invoke this, if it exists, where it currently invokes
> >>FIRST_CHILD.  Each translator could implement a bare minimum to select a
> >>child, then "step out of the way" for a fop it really wasn't all that
> >>interested in to begin with.  Any thoughts?
> >
> >We should do this right away. This change doesn't affect external interfaces.
> >We should be bold and implement the first solution. Over time we could
> >improve on this.
> 
> +1. It would definitely ease the implementation of many such fops which have
> to default to first active child. We need not keep track of all the fops
> which may get affected with new clustering xlators being added.

I think it is a great improvement and makes the code much easier to
understand. Do we have a volunteer that wants to have a go at
implementing this?
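
For whoever picks this up, a minimal sketch of how the proposed callback
could be consulted. This is not the real libglusterfs code: the struct below
is a simplified stand-in for xlator_t (the flat children array and the "up"
flag are assumptions for illustration), and cluster_first_active_child and
select_ipc_child are hypothetical names. Only the first_active_child
signature comes from the proposal quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in; the real xlator_t has many more fields and keeps
 * its children in a different structure. */
typedef struct xlator xlator_t;

struct xlator {
        const char  *name;
        xlator_t   **children;      /* hypothetical flat child array */
        int          child_count;
        int          up;            /* hypothetical "child is reachable" flag */
        /* Optional callback with the signature proposed in this thread. */
        xlator_t  *(*first_active_child) (xlator_t *parent);
};

/* What a clustering translator might supply: the first child that is up. */
static xlator_t *
cluster_first_active_child (xlator_t *parent)
{
        int i;

        for (i = 0; i < parent->child_count; i++) {
                if (parent->children[i]->up)
                        return parent->children[i];
        }
        return NULL;
}

/* Child selection a default fop such as default_ipc could use: ask the
 * translator if it has an opinion, otherwise fall back to the first child. */
static xlator_t *
select_ipc_child (xlator_t *this)
{
        assert (this != NULL);
        if (this->first_active_child)
                return this->first_active_child (this);
        return this->child_count > 0 ? this->children[0] : NULL;
}
```

Translators that do not set the callback keep today's FIRST_CHILD behaviour,
so only the clustering xlators would need to change.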

Thanks,
Niels


[Gluster-devel] xattr creation failure in posix_lookup

2015-06-29 Thread Raghavendra Bhat


Hi,

In posix_lookup, it allocates a dict for storing the values of the 
extended attributes and other hint keys set into the xdata of call path 
(i.e. wind path) by higher xlators (such as quick-read, bit-rot-stub etc).


But if the creation of the new dict fails, then a NULL dict is returned in 
the callback path. There might be many xlators for which the key-value 
information present in the dict is very important for making 
certain decisions. (Ex: bit-rot-stub tries to fetch an extended 
attribute which tells whether the object is bad or not. If the key 
is present in the dict, the object is bad and the xlator records 
that in the inode context. Later, when there is any read/modify 
operation on that object, the fop is failed instead of being allowed to 
continue.)


Now suppose the dict creation fails in posix_lookup. posix simply 
proceeds with the lookup operation, and if the other stat operations 
succeed, lookup returns success with a NULL dict.


        if (xdata && (op_ret == 0)) {
                xattr = posix_xattr_fill (this, real_path, loc, NULL, -1,
                                          xdata, &buf);
        }

The above piece of code in posix_lookup creates a new dict called 
@xattr. The return value of posix_xattr_fill is not checked.


So in this case, going by the bit-rot-stub example above, the object 
being looked up may be a bad object (marked by the scrubber). Since 
lookup succeeded but the bad-object xattr was not obtained in the 
callback (the dict itself being NULL), the bit-rot-stub xlator does not 
mark that object as bad and might allow further read/write requests, 
thus allowing bad data to be served.


There might be other xlators as well that depend on the xattrs being 
returned in lookup.


Should we fail lookup if the dict creation fails?
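
If we do go that way, the check could look roughly like the sketch below. 
This is a standalone illustration, not the real posix code: dict_t, 
xattr_fill () and lookup_sketch () are hypothetical stand-ins, and it 
assumes that a NULL return from the fill function means the dict could 
not be allocated.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real dict_t. */
typedef struct { int bad_object; } dict_t;

/* Hypothetical stand-in for posix_xattr_fill (): returns NULL when the
 * dict cannot be allocated (simulated here with a flag). */
static dict_t *
xattr_fill (int simulate_alloc_failure)
{
        if (simulate_alloc_failure)
                return NULL;
        return calloc (1, sizeof (dict_t));
}

/* Returns 0 on success, -1 with *op_errno set on failure. */
static int
lookup_sketch (int xdata_requested, int simulate_alloc_failure,
               dict_t **xattr, int *op_errno)
{
        int op_ret = 0;  /* assume the stat part of lookup succeeded */

        *xattr = NULL;
        *op_errno = 0;

        if (xdata_requested && (op_ret == 0)) {
                *xattr = xattr_fill (simulate_alloc_failure);
                if (*xattr == NULL) {
                        /* Fail the lookup instead of unwinding success
                         * with a NULL dict. */
                        op_ret = -1;
                        *op_errno = ENOMEM;
                }
        }

        return op_ret;
}
```

With this, an xlator like bit-rot-stub would see a failed lookup rather 
than a successful one with no xattrs to inspect.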

Regards,
Raghavendra Bhat


Re: [Gluster-devel] [Gluster-users] Looking for Fedora Package Maintainers.

2015-06-29 Thread Niels de Vos
On Mon, Jun 29, 2015 at 05:40:37PM +0530, Humble Devassy Chirammal wrote:
> Thanks Vishwanath++, Ragavendra++ & Saravana for volunteering!

Oh, wow, only Red Hat people? I did hope that some community users of
the RPMs could help out. Still not too late (never will be!) to sign up.

Thanks,
Niels


> 
> 
> --Humble
> 
> 
> On Wed, Jun 24, 2015 at 6:53 PM, Saravanakumar Arumugam wrote:
> 
> > mailto:humble.deva...@gmail.com>> wrote:
> >
> >>
> >>> Hi All,
> >>>
> >>> As we maintain 3 releases ( currently 3.5, 3.6 and 3.7)  of
> >>> GlusterFS and having an average of  one release per week , we need
> >>> more helping hands on this task.
> >>>
> >>> The responsibility includes building fedora and epel rpms using koji
> >>> build system and deploying  the rpms to download.gluster.org
> >>>  [1] after signing and creating repos.
> >>>
> >>> If any one is interested to help us on maintaining fedora GlusterFS
> >>> packaging, please let us ( kkeithley,  ndevos or myself ) know.
> >>>
> >>>
> >>> I'm interested in helping/maintaining of gluster packaging.
> >>>
> >>> Best Regards,
> >>> Vishwanath
> >>>
> >>>
> >>> [1] http://download.gluster.org/pub/gluster/glusterfs/
> >>>
> >>> --Humble
> >>>
> >>
> >> Add my name to list of volunteers.
> >>
> >> Raghavendra Talur
> >>
> >
> > Hi Humble,
> > You can count me too for any help related.
> >
> > Thanks,
> > Saravana
> >
> >



[Gluster-devel] Gluster and GCC 5.1

2015-06-29 Thread Joseph Fernandes
Hi All,

Recently I installed Fedora 22 on some fresh VMs. It comes with gcc 
5.1.1-1 (which can be upgraded to 5.1.1-4).
I observed that plain "inline" functions end up as undefined symbols in 
the ".so" files.
As a result, starting a volume failed, first on "gf_sql_str2sync_t":

[2015-06-29 05:52:38.491378] I [MSGID: 101190] 
[event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread with 
index 1
[2015-06-29 05:52:38.499205] W [MSGID: 101095] [xlator.c:189:xlator_dynload] 
0-xlator: /usr/local/lib/libgfdb.so.0: undefined symbol: gf_sql_str2sync_t
[2015-06-29 05:52:38.499229] E [MSGID: 101002] [graph.y:211:volume_type] 
0-parser: Volume 'test-changetimerecorder', line 16: type 
'features/changetimerecorder' is not valid or not found on this machine
[2015-06-29 05:52:38.499262] E [MSGID: 101019] [graph.y:319:volume_end] 
0-parser: "type" not specified for volume test-changetimerecorder
[2015-06-29 05:52:38.499335] E [MSGID: 100026] 
[glusterfsd.c:2151:glusterfs_process_volfp] 0-: failed to construct the graph
[2015-06-29 05:52:38.499470] W [glusterfsd.c:1214:cleanup_and_exit] (--> 0-: 
received signum (0), shutting down

When gf_sql_str2sync_t was made "static inline", the next issue was with 
"changelog_dispatch_vec":

[2015-06-29 07:11:33.367259] I [MSGID: 101190] 
[event-epoll.c:627:event_dispatch_epoll_worker] 0-epoll: Started thread with 
index 1
[2015-06-29 07:11:33.368816] W [MSGID: 101095] [xlator.c:189:xlator_dynload] 
0-xlator: /usr/local/lib/glusterfs/3.8dev/xlator/features/changelog.so: 
undefined symbol: changelog_dispatch_vec
[2015-06-29 07:11:33.368829] E [MSGID: 101002] [graph.y:211:volume_type] 
0-parser: Volume 'test-changelog', line 32: type 'features/changelog' is not 
valid or not found on this machine
[2015-06-29 07:11:33.368843] E [MSGID: 101019] [graph.y:319:volume_end] 
0-parser: "type" not specified for volume test-changelog
[2015-06-29 07:11:33.368922] E [MSGID: 100026] 
[glusterfsd.c:2151:glusterfs_process_volfp] 0-: failed to construct the graph
[2015-06-29 07:11:33.369025] W [glusterfsd.c:1214:cleanup_and_exit] (--> 0-: 
received signum (0), shutting down

and so on.

It looks like "inline" functions should be marked as "static inline" or 
"extern inline" explicitly.
Please refer to https://gcc.gnu.org/gcc-5/porting_to.html

To recreate the issue without glusterfs, try this sample program on 
Fedora 22 with gcc 5.1.1 or higher:

hello.c
===

#include <stdio.h>

inline void foo () {
        printf ("hello world");
}

int main () {

        foo ();
        return 0;
}

# gcc hello.c 
/tmp/ccUQ1XPp.o: In function `main':
hello.c:(.text+0xa): undefined reference to `foo'
collect2: error: ld returned 1 exit status
#

Should we change all the inline functions in gluster to "static inline" or 
"extern inline", according to their scope of use (IMHO that would be the 
right thing to do)?
Or should we use a compiler flag to suppress this?
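
For reference, a minimal sketch of the "static inline" variant of the 
sample above. call_foo () is a hypothetical caller that stands in for 
main () in hello.c. With "static", the function has internal linkage, so 
the compiler must emit a local copy whenever the call is not inlined and 
the link no longer depends on an external definition of foo.

```c
#include <stdio.h>

/* static inline: internal linkage, so this translation unit carries its
 * own definition and no external symbol "foo" is needed at link time. */
static inline void
foo (void)
{
        printf ("hello world\n");
}

/* Hypothetical caller, standing in for main () in hello.c. */
int
call_foo (void)
{
        foo ();
        return 0;
}
```

The other routes, as far as I understand them: keep the plain "inline" 
definition and add an "extern inline void foo (void);" declaration in 
exactly one .c file so that file provides the external definition, or 
build with -std=gnu89 or -fgnu89-inline to get the old GNU inline 
semantics back.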

Regards,
Joe 






Re: [Gluster-devel] [Gluster-users] Looking for Fedora Package Maintainers.

2015-06-29 Thread Humble Devassy Chirammal
Thanks Vishwanath++, Ragavendra++ & Saravana for volunteering!


--Humble


On Wed, Jun 24, 2015 at 6:53 PM, Saravanakumar Arumugam wrote:

> mailto:humble.deva...@gmail.com>> wrote:
>
>>
>>> Hi All,
>>>
>>> As we maintain 3 releases ( currently 3.5, 3.6 and 3.7)  of
>>> GlusterFS and having an average of  one release per week , we need
>>> more helping hands on this task.
>>>
>>> The responsibility includes building fedora and epel rpms using koji
>>> build system and deploying  the rpms to download.gluster.org
>>>  [1] after signing and creating repos.
>>>
>>> If any one is interested to help us on maintaining fedora GlusterFS
>>> packaging, please let us ( kkeithley,  ndevos or myself ) know.
>>>
>>>
>>> I'm interested in helping/maintaining of gluster packaging.
>>>
>>> Best Regards,
>>> Vishwanath
>>>
>>>
>>> [1] http://download.gluster.org/pub/gluster/glusterfs/
>>>
>>> --Humble
>>>
>>
>> Add my name to list of volunteers.
>>
>> Raghavendra Talur
>>
>
> Hi Humble,
> You can count me too for any help related.
>
> Thanks,
> Saravana
>
>


Re: [Gluster-devel] Unknown core reg.,

2015-06-29 Thread Anuradha Talur


- Original Message -
> From: "Niels de Vos" 
> To: "Raghavendra Talur" 
> Cc: "Mohamed Ashiq Liyazudeen" , "Gluster Devel" 
> 
> Sent: Sunday, June 28, 2015 7:01:56 PM
> Subject: Re: [Gluster-devel] Unknown core reg.,
> 
> On Fri, Jun 26, 2015 at 01:58:13AM +0530, Raghavendra Talur wrote:
> > On Thu, Jun 25, 2015 at 6:23 PM, Vijay Bellur  wrote:
> > 
> > > On Thursday 25 June 2015 07:13 AM, Mohamed Ashiq Liyazudeen wrote:
> > >
> > >> Hi,
> > >>
> > >> There is a core created while building the patch, the failure is not
> > >> related to my patch. can anyone look into
> > >> http://build.gluster.org/job/rackspace-regression-2GB-triggered/11353/console
> > >>
> > >>
> > > Since core files and the associated runtime get archived, it is not very
> > > complicated to obtain a backtrace from the core. Providing as much specific
> > > information as possible while reporting a problem usually helps in gaining
> > > more attention and quicker resolution.
> > >
> > > Can you please check if you can grab the backtrace and post it here?
> > >
> > 
> > Refer to this mail by Shyam on how to get the details
> > http://www.gluster.org/pipermail/gluster-devel/2014-July/041352.html
> 
> We should include these steps in the developers documentation. Who wants
> to take that small task?

I've sent a doc on github for this.
I have also submitted it for review here: http://review.gluster.org/#/c/11453/
as I was unsure of the final protocol for sending docs.

> 
> Thanks,
> Niels
> 
> 
> > 
> > Thanks,
> > Raghavendra Talur
> > 
> > 
> > > Regards,
> > > Vijay
> > >
> > >
> > 
> > 
> > 
> > --
> > *Raghavendra Talur *
> 
> 
> 

-- 
Thanks,
Anuradha.