Re: [Gluster-users] Questions about healing

2016-05-20 Thread Kevin Lemonnier
>Anyway, how is it possible to keep a VM up and running while healing is
>happening on a shard? That part of the disk image is not accessible, and thus
>the VM could have some issues on its filesystem.

Yeah, but healing a shard of a few MB takes a few seconds, so the VM is frozen
for a very small amount of time. Without sharding, the VM is frozen until the
whole disk has been healed, which can take hours on big clusters.
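For anyone who wants to try it, enabling sharding is just a couple of volume
options; the volume name and block size below are placeholders, so check the
defaults for your version:

  gluster volume set myvol features.shard on
  gluster volume set myvol features.shard-block-size 64MB

As far as I know, only files created after the option is enabled get sharded;
existing images are not split retroactively.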

-- 
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Questions about healing

2016-05-20 Thread Alastair Neil
Well, it's not magic; the algorithm is documented, and it is trivial to script
the recreation of the file from the shards if gluster were truly unavailable:
>
>
> #!/bin/bash
> #
> # quick and dirty: reconstruct a file from its shards
> # takes brick path and file name as arguments
> # Copyright May 20th 2016 A. Neil
> #
> brick="$1"
> filen="$2"
> # locate the base file on the brick and derive its gfid via the inode
> file=$(find "$brick" -name "$filen")
> inode=$(stat -c %i "$file")
> pushd "$brick/.glusterfs" > /dev/null
> gfid=$(find . -inum "$inode" | cut -d'/' -f4)
> popd > /dev/null
> # copy the base file, then append each shard in numeric order
> nshard=$(ls -1 "$brick"/.shard/"${gfid}".* | wc -l)
> cp "$file" "./${filen}.restored"
> for i in $(seq 1 "$nshard"); do
>     cat "$brick/.shard/${gfid}.$i" >> "./${filen}.restored"
> done
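
For example, assuming the script above is saved as reconstruct.sh, a
hypothetical run (brick path and file name below are just placeholders) would
look like:

  ./reconstruct.sh /data/brick/vm-store vm-disk-01.qcow2
  # the reassembled image ends up in ./vm-disk-01.qcow2.restored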


 Admittedly this is not as easy as pulling the image file straight from the
brick file system, but then the advantages are pretty big.

The point is that each shard is small and healing them is fast.  The
majority of the time when you need to heal a VM it is only a few blocks
that have changed, and without sharding you might have to heal 10, 20 or
100 GB.  In my experience, if you have 30 or 40 VMs it can take hours to
heal.  With the limited testing I have done, I have found that yes, some VMs
will experience IO timeouts, freeze, and then need to be restarted.
However, at least you don't need to wait hours before you can do that.
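
If you want to watch a heal in progress, the pending entries are easy to poll
(the volume name is a placeholder):

  # list entries still pending heal, per brick
  gluster volume heal myvol info
  # rough count of pending entries, handy for tracking progress
  gluster volume heal myvol statistics heal-count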






On 20 May 2016 at 15:20, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> On 20 May 2016 at 20:14, "Alastair Neil"  wrote:
> >
> > I think you are confused about what sharding does.   In a sharded
> replica 3 volume all the shards exist on all the replicas so there is no
> distribution.  Might you be getting confused with erasure coding?  The
> upshot of sharding is that if you have a failure, instead of healing
> multiple gigabyte vm files for example, you only heal the shards that have
> changed. This generally shortens the heal time dramatically.
>
> I know what sharding is.
> It splits each file into multiple, smaller chunks.
>
> But if everything goes bad, how can I reconstruct a file from its shards
> without gluster? It would be a pain.
> Let's assume tens of terabytes of shards to be manually reconstructed ...
>
> Anyway, how is it possible to keep a VM up and running while healing is
> happening on a shard? That part of the disk image is not accessible, and thus
> the VM could have some issues on its filesystem.
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Questions about healing

2016-05-20 Thread Gandalf Corvotempesta
On 20 May 2016 at 20:14, "Alastair Neil"  wrote:
>
> I think you are confused about what sharding does.   In a sharded replica
3 volume all the shards exist on all the replicas so there is no
distribution.  Might you be getting confused with erasure coding?  The
upshot of sharding is that if you have a failure, instead of healing
multiple gigabyte vm files for example, you only heal the shards that have
changed. This generally shortens the heal time dramatically.

I know what sharding is.
It splits each file into multiple, smaller chunks.

But if everything goes bad, how can I reconstruct a file from its shards
without gluster? It would be a pain.
Let's assume tens of terabytes of shards to be manually reconstructed ...

Anyway, how is it possible to keep a VM up and running while healing is
happening on a shard? That part of the disk image is not accessible, and thus
the VM could have some issues on its filesystem.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Questions about healing

2016-05-20 Thread Alastair Neil
I think you are confused about what sharding does.   In a sharded replica 3
volume all the shards exist on all the replicas so there is no
distribution.  Might you be getting confused with erasure coding?  The
upshot of sharding is that if you have a failure, instead of healing
multiple gigabyte vm files for example, you only heal the shards that have
changed. This generally shortens the heal time dramatically.

Alastair

On 18 May 2016 at 12:54, Gandalf Corvotempesta <
gandalf.corvotempe...@gmail.com> wrote:

> On 18/05/2016 at 13:55, Kevin Lemonnier wrote:
>
>> Yes, that's why you need to use sharding. With sharding, the heal is much
>> quicker and the whole VM isn't frozen during the heal, only the shard
>> being healed. I'm testing that right now myself and that's almost invisible
>> for the VM using 3.7.11. Use the latest version though, it really really
>> wasn't transparent in 3.7.6 :).
>>
> I don't like sharding. With sharding, all files are split into shards and
> distributed across the whole cluster.
> If everything went bad, reconstructing a file from its shards could be hard.
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Using a volume during rebalance

2016-05-20 Thread Christoph Schäfer
Hi all,

I'm wondering if it is OK to use a volume during a rebalance (Gluster
3.6.9). Currently I'm preparing to expand a distributed replicated
volume by 2 nodes (replica is 2).

I'm currently testing with a virtual machine setup. If I do heavy copy
operations during the rebalance after the 2 nodes are added, I've already had
oddities in 2 of 2 tries. One time a directory with 10k files was not
copied completely, and the other time the copy suddenly asked to overwrite
files, even though the target directory was empty before the copy command.
This makes me really uneasy about rebalancing a production system...
The client is a Linux box with the mounted volume.

Are there any known limitations? Should I avoid using the volume during a
rebalance?
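
For clarity, by expanding and rebalancing I mean the usual sequence, roughly
like this (host names, brick paths and the volume name are placeholders):

  gluster volume add-brick myvol replica 2 node5:/data/brick node6:/data/brick
  gluster volume rebalance myvol start
  gluster volume rebalance myvol status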

Thanks for your feedback and best regards
Christoph
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Gluster logs filling the disk.

2016-05-20 Thread Serkan Çoban
This is a bug which will be fixed in 3.7.12. You can try setting the log
level to WARNING to get rid of it.
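
Something along these lines should do it (the volume name is a placeholder;
it only affects messages logged after the change):

  gluster volume set myvol diagnostics.brick-log-level WARNING
  gluster volume set myvol diagnostics.client-log-level WARNING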

On Fri, May 20, 2016 at 7:18 PM, Ernie Dunbar  wrote:
> We had one of our gluster servers in the cluster fail on us yesterday, and
> now one (and only one) of the other servers in the cluster has managed to
> collect about 7 gigabytes of logs in the past 12 hours, seemingly only with
> lines like this:
>
> [2016-05-20 16:08:05.119529] I [dict.c:473:dict_get]
> (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xac)
> [0x7f3ece2bd17c]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7)
> [0x7f3ec2ca6877]
> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0xac)
> [0x7f3ece2ad91c] ) 0-dict: !this || key=() [Invalid argument]
> [2016-05-20 16:08:05.121478] I [dict.c:473:dict_get]
> (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xac)
> [0x7f3ece2bd17c]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7)
> [0x7f3ec2ca6877]
> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0xac)
> [0x7f3ece2ad91c] ) 0-dict: !this || key=() [Invalid argument]
> [2016-05-20 16:08:05.123640] I [dict.c:473:dict_get]
> (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xac)
> [0x7f3ece2bd17c]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7)
> [0x7f3ec2ca6877]
> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0xac)
> [0x7f3ece2ad91c] ) 0-dict: !this || key=() [Invalid argument]
>
>
> I've set the loglevel to INFO for the brick, and I'm still getting all this.
> It's streaming in at an insane rate. Perhaps the logger could note that this
> one message has been logged 27,000,000 times in the past 17 seconds, instead
> of spewing into the log files like this? Nevermind that this is pure debug
> information that no user could ever decipher.
>
> We're using Ubuntu and Gluster v3.7.11.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] CentOS 7.2 + Gluster 3.6.3 volume stuck in heal

2016-05-20 Thread Kingsley
Hi,

We've got a volume that has been stuck in "Possibly undergoing heal"
status for several months. I would like to upgrade gluster to a newer
version but I would feel safer if we could get the volume fixed first.

Some info:

# gluster volume info voicemail

Volume Name: voicemail
Type: Replicate
Volume ID: 8f628d09-60d0-4cb8-8d1a-b7a272c42a23
Status: Started
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: gluster1a-1:/data/brick/voicemail
Brick2: gluster1b-1:/data/brick/voicemail
Brick3: gluster2a-1:/data/brick/voicemail
Brick4: gluster2b-1:/data/brick/voicemail



# gluster volume heal voicemail info
Brick gluster1a-1.dns99.co.uk:/data/brick/voicemail/
/borked/Old - Possibly undergoing heal

Number of entries: 1

Brick gluster1b-1.dns99.co.uk:/data/brick/voicemail/
Number of entries: 0

Brick gluster2a-1.dns99.co.uk:/data/brick/voicemail/
/borked/Old - Possibly undergoing heal

Number of entries: 1

Brick gluster2b-1.dns99.co.uk:/data/brick/voicemail/
/borked/Old - Possibly undergoing heal

Number of entries: 1


There's a statedump here; I've appended the server name to each file (I
grabbed a copy from each server, in case it helped):

http://gluster.dogwind.com/files/statedump.voicemail.tar.gz


What should I do?

Cheers,
Kingsley.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Gluster logs filling the disk.

2016-05-20 Thread Ernie Dunbar
We had one of our gluster servers in the cluster fail on us yesterday, 
and now one (and only one) of the other servers in the cluster has 
managed to collect about 7 gigabytes of logs in the past 12 hours, 
seemingly only with lines like this:


[2016-05-20 16:08:05.119529] I [dict.c:473:dict_get] 
(-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xac) 
[0x7f3ece2bd17c] 
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7) 
[0x7f3ec2ca6877] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0xac) 
[0x7f3ece2ad91c] ) 0-dict: !this || key=() [Invalid argument]
[2016-05-20 16:08:05.121478] I [dict.c:473:dict_get] 
(-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xac) 
[0x7f3ece2bd17c] 
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7) 
[0x7f3ec2ca6877] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0xac) 
[0x7f3ece2ad91c] ) 0-dict: !this || key=() [Invalid argument]
[2016-05-20 16:08:05.123640] I [dict.c:473:dict_get] 
(-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_getxattr_cbk+0xac) 
[0x7f3ece2bd17c] 
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.11/xlator/features/marker.so(marker_getxattr_cbk+0xa7) 
[0x7f3ec2ca6877] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_get+0xac) 
[0x7f3ece2ad91c] ) 0-dict: !this || key=() [Invalid argument]



I've set the loglevel to INFO for the brick, and I'm still getting all 
this. It's streaming in at an insane rate. Perhaps the logger could note 
that this one message has been logged 27,000,000 times in the past 17 
seconds, instead of spewing into the log files like this? Nevermind that 
this is pure debug information that no user could ever decipher.


We're using Ubuntu and Gluster v3.7.11.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Query!

2016-05-20 Thread Atin Mukherjee
-Atin
Sent from one plus one
On 20-May-2016 5:34 PM, "ABHISHEK PALIWAL"  wrote:
>
> Actually we have some other files related to the system's initial
> configuration, and for those we need to format the volume where these
> bricks are also created. After this we are facing some abnormal behavior
> in gluster, and some failure logs like a volume ID mismatch.
>
> That is why I am asking whether this is the right way to format the
> volume where the bricks are created.

No, certainly not. If you format your brick, you lose the data as well as
all the extended attributes. In this case your volume is bound to behave
abnormally.
>
> And also, is there any link between /var/lib/glusterd and the xattrs stored
> in the .glusterfs directory at the brick path?
>
> Regards,
> Abhishek
>
> On Fri, May 20, 2016 at 5:25 PM, Atin Mukherjee 
wrote:
>>
>> And most importantly why would you do that? What's your use case
Abhishek?
>>
>> On 05/20/2016 05:03 PM, Lindsay Mathieson wrote:
>> > On 20/05/2016 8:37 PM, ABHISHEK PALIWAL wrote:
>> >> I am not getting any failure and after restart the glusterd when I run
>> >> volume info command it creates the brick directory
> >> as well as .glusterfs (xattrs).
>> >>
>> >> but some time even after restart the glusterd, volume info command
>> >> showing no volume present.
>> >>
>> >> Could you please tell me why this unpredictable problem is occurring.
>> >>
>> >
>> > Because as stated earlier you erase all the information about the
>> > brick?  How is this unpredictable?
>> >
>> >
>> > If you want to delete and recreate a brick you should have used the
>> > remove-brick/add-brick commands.
>> >
>> > --
>> > Lindsay Mathieson
>> >
>> >
>> >
>> > ___
>> > Gluster-users mailing list
>> > Gluster-users@gluster.org
>> > http://www.gluster.org/mailman/listinfo/gluster-users
>> >
>
>
>
>
> --
>
>
>
>
> Regards
> Abhishek Paliwal
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

-Atin
Sent from one plus one
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Possible error not being returned

2016-05-20 Thread Ankireddypalle Reddy
Hi,
Did anyone get a chance to check this? We are intermittently receiving
corrupted data in read operations because of this.

Thanks and Regards,
Ram

From: gluster-users-boun...@gluster.org 
[mailto:gluster-users-boun...@gluster.org] On Behalf Of Ankireddypalle Reddy
Sent: Thursday, May 19, 2016 3:59 PM
To: gluster-users@gluster.org
Subject: [Gluster-users] Possible error not being returned

Hi,
   A disperse volume was configured on servers with limited network
bandwidth. Some of the read operations failed with errors:

[2016-05-16 18:38:36.035559] E [MSGID: 122034] 
[ec-common.c:461:ec_child_select] 0-SDSStoragePool-disperse-2: Insufficient 
available childs for this request (have 1, need 2)
[2016-05-16 18:38:36.035713] W [fuse-bridge.c:2213:fuse_readv_cbk] 
0-glusterfs-fuse: 155121179: READ => -1 (Input/output error)

For some read operations only the following error was logged, but the I/O did
not fail.
[2016-05-16 18:42:45.401570] E [MSGID: 122034] 
[ec-common.c:461:ec_child_select] 0-SDSStoragePool-disperse-3: Insufficient 
available childs for this request (have 1, need 2)
[2016-05-16 18:42:45.402054] W [MSGID: 122053] 
[ec-common.c:116:ec_check_status] 0-SDSStoragePool-disperse-3: Operation failed 
on some subvolumes (up=7, mask=6, remaining=0, good=6, bad=1)

We receive corrupted data from the read operation when this error is logged,
even though the read call does not return any error.

Thanks and Regards,
Ram

***Legal Disclaimer***
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or distribution
by others is strictly prohibited. If you have received the message by mistake,
please advise the sender by reply email and delete the message. Thank you."
**
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Idea: Alternate Release process

2016-05-20 Thread Shyam

On 05/19/2016 10:25 PM, Pranith Kumar Karampuri wrote:

Once every 3 months i.e. option 3 sounds good to me.


+1 from my end.

Every 2 months seems to be a bit too much; 4 months is still fine, but
gives us only 1 in 3 releases to pick the LTS from. I like 1:4 odds better
for the LTS, hence the 3 months (or 'alternative 2').




Pranith

On 05/13/2016 at 1:46 PM, Aravinda <avish...@redhat.com> wrote:

Hi,

Based on the discussion in last community meeting and previous
discussions,

1. Too frequent releases are difficult to manage (without a dedicated
release manager).
2. Users want to see features early for testing or POC.
3. Backporting patches to more than two release branches is a pain.

Enclosed visualizations to understand existing release and support
cycle and proposed alternatives.

- Each grid interval is 6 months
- Green rectangle shows a supported release or LTS
- Black dots are minor releases while it is supported (once a month)
- Orange rectangle is a non-LTS release with minor releases (support
ends when the next version is released)

Enclosed following images
1. Existing Release cycle and support plan(6 months release cycle, 3
releases supported all the time)
2. Proposed alternative 1 - One LTS every year and non LTS stable
release once in every 2 months
3. Proposed alternative 2 - One LTS every year and non LTS stable
release once in every 3 months
4. Proposed alternative 3 - One LTS every year and non LTS stable
release once in every 4 months
5. Proposed alternative 4 - One LTS every year and non LTS stable
release once in every 6 months (Similar to existing but only
alternate one will become LTS)

Please do vote for the proposed alternatives about release intervals
and LTS releases. You can also vote for the existing plan.

Do let me know if I missed anything.

regards
Aravinda

On 05/11/2016 12:01 AM, Aravinda wrote:


I couldn't find any solution for the backward-incompatible
changes. As you mentioned, this model will not work for LTS.

How about adopting this only for non-LTS releases? We will not
have the backward incompatibility problem since we need not release
minor updates to non-LTS releases.

regards
Aravinda
On 05/05/2016 04:46 PM, Aravinda wrote:


regards
Aravinda

On 05/05/2016 03:54 PM, Kaushal M wrote:

On Thu, May 5, 2016 at 11:48 AM, Aravinda 
 wrote:

Hi,

Sharing an idea to manage multiple releases without maintaining
multiple release branches and backports.

This idea is heavily inspired by the Rust release model (you may feel
it is exactly the same except for the LTS part). I think Chrome/Firefox
also follow the same model.

http://blog.rust-lang.org/2014/10/30/Stability.html

Feature Flag:
--
Compile-time variable to prevent compiling feature-related code when
disabled. (For example, ./configure --disable-geo-replication
or ./configure --disable-xml etc.)
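
A rough sketch of what the stable build step could look like under this
scheme; the flag names are just the illustrative ones from above, not final:

  # nightly: build with all features enabled
  ./configure && make

  # stable/LTS: compile out features not yet marked stable
  ./configure --disable-geo-replication --disable-xml
  make && make install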

Plan
-
- Nightly build with all the features enabled(./build --nightly)

- All new patches will land in Master, if the patch belongs to a
   existing feature then it should be written behind that
feature flag.

- If a feature is still work in progress then it will be only
enabled in
   nightly build and not enabled in beta or stable builds.
   Once the maintainer thinks the feature is ready for testing
then that
   feature will be enabled in beta build.

- Every 6 weeks, beta branch will be created by enabling all the
   features which maintainers thinks it is stable and previous
beta
   branch will be promoted as stable.
   All the previous beta features will be enabled in stable
unless it
   is marked as unstable during beta testing.

- LTS builds are same as stable builds but without enabling all
the
   features. If we decide last stable build will become LTS
release,
   then the feature list from last stable build will be saved as
   `features-release-.yaml`, For example:
   features-release-3.9.yaml`
   Same feature list will be used while building minor releases
for the
   LTS. For example, `./build --stable --features
features-release-3.8.yaml`

- Three branches, nightly/master, testing/beta, stable

To summarize,
- One stable release once in 6 weeks
- One Beta release once in 6 weeks
- Nightly builds every day
- LTS release once in 6 months or 1 year, Minor releases once
in 6 weeks.

Advantageous:
-
1. No more backports required to different release branches.(only
exceptional backports, discussed below)
2. Non feature Bugfix will never get missed in releases.
3. Release process can be automated.
4. Bugzilla process can be simplified.

Challenges:

1. Enforci

Re: [Gluster-users] [Gluster-devel] Query!

2016-05-20 Thread ABHISHEK PALIWAL
Actually we have some other files related to the system's initial configuration,
and for those we need to format the volume where these bricks are also created.
After this we are facing some abnormal behavior in gluster, and some failure
logs like a volume ID mismatch.

That is why I am asking whether this is the right way to format the volume
where the bricks are created.

And also, is there any link between /var/lib/glusterd and the xattrs stored in
the .glusterfs directory at the brick path?

Regards,
Abhishek

On Fri, May 20, 2016 at 5:25 PM, Atin Mukherjee  wrote:

> And most importantly why would you do that? What's your use case Abhishek?
>
> On 05/20/2016 05:03 PM, Lindsay Mathieson wrote:
> > On 20/05/2016 8:37 PM, ABHISHEK PALIWAL wrote:
> >> I am not getting any failure and after restart the glusterd when I run
> >> volume info command it creates the brick directory
> >> as well as .glusterfs (xattrs).
> >>
> >> but some time even after restart the glusterd, volume info command
> >> showing no volume present.
> >>
> >> Could you please tell me why this unpredictable problem is occurring.
> >>
> >
> > Because as stated earlier you erase all the information about the
> > brick?  How is this unpredictable?
> >
> >
> > If you want to delete and recreate a brick you should have used the
> > remove-brick/add-brick commands.
> >
> > --
> > Lindsay Mathieson
> >
> >
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-users
> >
>



-- 




Regards
Abhishek Paliwal
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Query!

2016-05-20 Thread Atin Mukherjee
And most importantly why would you do that? What's your use case Abhishek?

On 05/20/2016 05:03 PM, Lindsay Mathieson wrote:
> On 20/05/2016 8:37 PM, ABHISHEK PALIWAL wrote:
>> I am not getting any failure and after restart the glusterd when I run
>> volume info command it creates the brick directory
>> as well as .glusterfs (xattrs).
>>
>> but some time even after restart the glusterd, volume info command
>> showing no volume present.
>>
>> Could you please tell me why this unpredictable problem is occurring.
>>
> 
> Because as stated earlier you erase all the information about the
> brick?  How is this unpredictable?
> 
> 
> If you want to delete and recreate a brick you should have used the
> remove-brick/add-brick commands.
> 
> -- 
> Lindsay Mathieson
> 
> 
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [Gluster-devel] Query!

2016-05-20 Thread Lindsay Mathieson

On 20/05/2016 8:37 PM, ABHISHEK PALIWAL wrote:
I am not getting any failure and after restart the glusterd when I run 
volume info command it creates the brick directory

as well as .glusterfs (xattrs).

but some time even after restart the glusterd, volume info command 
showing no volume present.


Could you please tell me why this unpredictable problem is occurring.



Because as stated earlier you erase all the information about the 
brick?  How is this unpredictable?



If you want to delete and recreate a brick you should have used the 
remove-brick/add-brick commands.


--
Lindsay Mathieson

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Query!

2016-05-20 Thread ABHISHEK PALIWAL
I am not getting any failure, and after restarting glusterd, when I run the
volume info command it creates the brick directory
as well as .glusterfs (xattrs).

But sometimes, even after restarting glusterd, the volume info command shows
no volume present.

Could you please tell me why this unpredictable problem is occurring?

Regards,
Abhishek

On Fri, May 20, 2016 at 3:50 PM, Kaushal M  wrote:

> This would erase the xattrs set on the brick root (volume-id), which
> identify it as a brick. Brick processes will fail to start when this
> xattr isn't present.
>
>
> On Fri, May 20, 2016 at 3:42 PM, ABHISHEK PALIWAL
>  wrote:
> > Hi
> >
> > What will happen if we format the volume where the bricks of replicate
> > gluster volume's are created and restart the glusterd on both node.
> >
> > It will work fine or in this case need to remove /var/lib/glusterd
> directory
> > as well.
> >
> > --
> > Regards
> > Abhishek Paliwal
> >
> > ___
> > Gluster-devel mailing list
> > gluster-de...@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
>



-- 




Regards
Abhishek Paliwal
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Query!

2016-05-20 Thread Kaushal M
This would erase the xattrs set on the brick root (volume-id), which
identify it as a brick. Brick processes will fail to start when this
xattr isn't present.
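
For reference, you can see those xattrs on a healthy brick root with getfattr
(the brick path is a placeholder):

  # dump the trusted.* xattrs gluster sets on the brick root
  getfattr -d -m . -e hex /data/brick/myvol
  # a healthy brick shows trusted.glusterfs.volume-id among them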


On Fri, May 20, 2016 at 3:42 PM, ABHISHEK PALIWAL
 wrote:
> Hi
>
> What will happen if we format the volume where the bricks of replicate
> gluster volume's are created and restart the glusterd on both node.
>
> It will work fine or in this case need to remove /var/lib/glusterd directory
> as well.
>
> --
> Regards
> Abhishek Paliwal
>
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Query!

2016-05-20 Thread ABHISHEK PALIWAL
Hi

What will happen if we format the volume where the bricks of a replicated
gluster volume are created and then restart glusterd on both nodes?

Will it work fine, or do we in this case need to remove the /var/lib/glusterd
directory as well?

-- 
Regards
Abhishek Paliwal
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Re: Re: Re: Re: geo-replication status partial faulty

2016-05-20 Thread vyyy杨雨阳
Hello, Kotresh

I ran 'create force', but still some nodes work and some nodes are faulty.

On the faulty nodes, etc-glusterfs-glusterd.vol.log shows:
[2016-05-20 06:27:03.260870] I 
[glusterd-geo-rep.c:3516:glusterd_read_status_file] 0-: Using passed config 
template(/var/lib/glusterd/geo-replication/filews_glusterfs01.sh3.ctripcorp.com_filews_slave/gsyncd.conf).
[2016-05-20 06:27:03.404544] E 
[glusterd-geo-rep.c:3200:glusterd_gsync_read_frm_status] 0-: Unable to read 
gsyncd status file
[2016-05-20 06:27:03.404583] E 
[glusterd-geo-rep.c:3603:glusterd_read_status_file] 0-: Unable to read the 
statusfile for /export/sdb/brick1 brick for  filews(master), 
glusterfs01.sh3.ctripcorp.com::filews_slave(slave) session


/var/log/glusterfs/geo-replication/filews/ssh%3A%2F%2Froot%4010.15.65.66%3Agluster%3A%2F%2F127.0.0.1%3Afilews_slave.log
 shows:
[2016-05-20 15:04:01.858340] I [monitor(monitor):215:monitor] Monitor: 

[2016-05-20 15:04:01.858688] I [monitor(monitor):216:monitor] Monitor: starting 
gsyncd worker
[2016-05-20 15:04:01.986754] D [gsyncd(agent):627:main_i] : rpc_fd: 
'7,11,10,9'
[2016-05-20 15:04:01.987505] I [changelogagent(agent):72:__init__] 
ChangelogAgent: Agent listining...
[2016-05-20 15:04:01.988079] I [repce(agent):92:service_loop] RepceServer: 
terminating on reaching EOF.
[2016-05-20 15:04:01.988238] I [syncdutils(agent):214:finalize] : exiting.
[2016-05-20 15:04:01.988250] I [monitor(monitor):267:monitor] Monitor: 
worker(/export/sdb/brick1) died before establishing connection 

Can you help me!


Best Regards 
杨雨阳 Yuyang Yang



-----Original Message-----
From: vyyy杨雨阳 
Sent: Thursday, May 19, 2016 7:45 PM
To: 'Kotresh Hiremath Ravishankar' 
Cc: Saravanakumar Arumugam ; Gluster-users@gluster.org; 
Aravinda Vishwanathapura Krishna Murthy 
Subject: Re: Re: Re: [Gluster-users] Re: geo-replication status partial faulty

It still does not work.

I needed to copy /var/lib/glusterd/geo-replication/secret.* to /root/.ssh/id_rsa
to make passwordless ssh work.

I generated the /var/lib/glusterd/geo-replication/secret.pem file on every
master node.

I am not sure whether this is right.


[root@sh02svr5956 ~]# gluster volume geo-replication filews \
    glusterfs01.sh3.ctripcorp.com::filews_slave create push-pem force
Passwordless ssh login has not been setup with glusterfs01.sh3.ctripcorp.com for user root.
geo-replication command failed

[root@sh02svr5956 .ssh]# cp /var/lib/glusterd/geo-replication/secret.pem 
./id_rsa
cp: overwrite `./id_rsa'? y
[root@sh02svr5956 .ssh]# cp /var/lib/glusterd/geo-replication/secret.pem.pub 
./id_rsa.pub
cp: overwrite `./id_rsa.pub'?

[root@sh02svr5956 ~]# gluster volume geo-replication filews \
    glusterfs01.sh3.ctripcorp.com::filews_slave create push-pem force
Creating geo-replication session between filews &
glusterfs01.sh3.ctripcorp.com::filews_slave has been successful
[root@sh02svr5956 ~]#




Best Regards
杨雨阳 Yuyang Yang
OPS
Ctrip Infrastructure Service (CIS)
Ctrip Computer Technology (Shanghai) Co., Ltd
Phone: + 86 21 34064880-15554 | Fax: + 86 21 52514588-13389
Web: www.Ctrip.com


-----Original Message-----
From: Kotresh Hiremath Ravishankar [mailto:khire...@redhat.com]
Sent: Thursday, May 19, 2016 5:07 PM
To: vyyy杨雨阳 
Cc: Saravanakumar Arumugam ; Gluster-users@gluster.org; 
Aravinda Vishwanathapura Krishna Murthy 
Subject: Re: Re: Re: [Gluster-users] Re: geo-replication status partial faulty

Hi,

Could you just try 'create force' once to fix those status file errors?

e.g., 'gluster volume geo-rep  :: create push-pem force'

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "vyyy杨雨阳" 
> To: "Saravanakumar Arumugam" , 
> Gluster-users@gluster.org, "Aravinda Vishwanathapura Krishna Murthy"
> , "Kotresh Hiremath Ravishankar" 
> 
> Sent: Thursday, May 19, 2016 2:15:34 PM
> Subject: Re: Re: [Gluster-users] Re: geo-replication status partial 
> faulty
> 
> I have checked all the nodes both on masters and slaves, the software 
> is the same.
> 
> I am puzzled why half of the masters work and half are faulty.
> 
> 
> [admin@SVR6996HW2285 ~]$ rpm -qa |grep gluster
> glusterfs-api-3.6.3-1.el6.x86_64
> glusterfs-fuse-3.6.3-1.el6.x86_64
> glusterfs-geo-replication-3.6.3-1.el6.x86_64
> glusterfs-3.6.3-1.el6.x86_64
> glusterfs-cli-3.6.3-1.el6.x86_64
> glusterfs-server-3.6.3-1.el6.x86_64
> glusterfs-libs-3.6.3-1.el6.x86_64
> 
> 
> 
> 
> Best Regards
> 杨雨阳 Yuyang Yang
> 
> OPS
> Ctrip Infrastructure Service (CIS)
> Ctrip Computer Technology (Shanghai) Co., Ltd
> Phone: + 86 21 34064880-15554 | Fax: + 86 21 52514588-13389
> Web: www.Ctrip.com
> 
> 
> 
> From: Saravanakumar Arumugam [mailto:sarum...@redhat.com]
> Sent: Thursday, May 19, 2016 4:33 PM
> To: vyyy杨雨阳 ; Gluster-users@gluster.org; 
> Aravinda Vishwanathapura Krishna Murthy ; Kotresh 
> Hiremath Ravishankar 
> Subject: Re: Re: [Gluster-users] Re: geo-replication status partial faulty
> 
> Hi,
> +geo-rep team.
> 
> Can you get the gluster version you are using?
> 
> # For example:
> rpm