[Gluster-devel] Geo-rep: Solving changelog ordering problem!

2015-09-02 Thread Kotresh Hiremath Ravishankar
Hi DHT Team and Others,

Changelog is a server-side translator that sits above POSIX and records FOPs.
Hence, the order of operations is preserved only within a brick; the ordering
is lost across bricks.

e.g., (f1 hashes to brick1 and f2 to brick2)

      brick1                    brick2
      ------                    ------
      CREATE f1
      RENAME f1, f2
      ---- re-balance happens (very common with Tiering in place) ----
                                RENAME f2, f3
                                DATA f3

The moment re-balance happens, the changelogs related to the same entry are
distributed across bricks, and since geo-rep syncs these changes independently
per brick, it may well process them in the wrong order and end up in an
inconsistent state on the slave.

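To make the hazard concrete before going into solutions, here is a small
stand-alone simulation (illustrative Python, not gsyncd code) of the example
above: each brick's changelog is replayed in order, but the bricks are replayed
independently, so some interleavings leave the slave broken.

    # Minimal simulation of per-brick changelog replay. Any interleaving that
    # preserves per-brick order is possible when bricks are synced independently;
    # only some of them leave the slave in a consistent state.

    brick1 = [("CREATE", "f1"), ("RENAME", "f1", "f2")]
    brick2 = [("RENAME", "f2", "f3"), ("DATA", "f3")]   # recorded after re-balance

    def replay(ops):
        """Apply ops to an empty slave namespace; return files, or None on failure."""
        files = set()
        for op in ops:
            kind = op[0]
            if kind == "CREATE":
                files.add(op[1])
            elif kind == "RENAME":
                if op[1] not in files:
                    return None              # rename source missing: replayed too early
                files.remove(op[1])
                files.add(op[2])
            elif kind == "DATA" and op[1] not in files:
                return None                  # data for a file the slave doesn't have yet
        return files

    def interleavings(a, b):
        """All merges of a and b that keep each list's internal order."""
        if not a or not b:
            yield list(a) + list(b)
            return
        for rest in interleavings(a[1:], b):
            yield [a[0]] + rest
        for rest in interleavings(a, b[1:]):
            yield [b[0]] + rest

    for order in interleavings(brick1, brick2):
        print(replay(order), "<-", order)    # None marks an inconsistent slave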
SOLUTION APPROACHES:

1. Capture re-balance traffic as well and work out all combinations of FOPs so
   that the slave ends up in the correct state. Though we started thinking along
   these lines, one corner case or another always exists and we still end up with
   out-of-order syncing.

2. The changes related to an 'entry' (file) should always be captured on the
   brick where it was recorded initially, no matter where the file moves because
   of re-balance. This implicitly retains the ordering for an entry, and yet
   geo-rep can sync in a distributed manner from each brick, keeping the
   performance up.

   DHT needs to maintain, for each entry, the brick where it was first cached (to
   be precise, the brick whose changelog it first got recorded in) and always
   notify that changelog of the FOP.

   I think if we can achieve the second solution, it would solve geo-rep's
   out-of-order syncing problem for ever.

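   To illustrate the bookkeeping the second approach needs, a rough sketch
   follows (an assumption about how it could look, not existing DHT or changelog
   code): remember the brick that first recorded an entry and keep routing that
   entry's changelog records to it, even after re-balance moves the file.

       # Sketch only: gfid -> "home" brick mapping that approach 2 would need DHT
       # to maintain, so an entry's records stay on one brick across re-balance.

       home_brick = {}      # gfid -> brick that first recorded the entry
       changelogs = {}      # brick -> ordered list of (op, gfid) records

       def record(current_brick, gfid, op):
           """Record op for gfid in the changelog of its home brick."""
           home = home_brick.setdefault(gfid, current_brick)   # first recorder wins
           changelogs.setdefault(home, []).append((op, gfid))

       # The example from above: after re-balance the FOPs arrive via brick2,
       # but they are still recorded in brick1's changelog, preserving order.
       record("brick1", "G1", "CREATE f1")
       record("brick1", "G1", "RENAME f1 f2")
       record("brick2", "G1", "RENAME f2 f3")
       record("brick2", "G1", "DATA f3")

       print(changelogs)    # all four records, in order, under 'brick1'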
   Let me know your comments and suggestions on this!

 
  

Thanks and Regards,
Kotresh H R

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Venky Shankar
>
> Hi Venky,
>
> It is not apparent to me what issues you see with approach 2. If you could
> lay them out here, it would be helpful in taking the discussions further.
>
> -Krutika
>

It's unclean (to me at least). Replicating shard sizes looks like
*stitching* a filesystem by hand.

What do you think?
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Krutika Dhananjay
- Original Message -

> From: "Shyam" 
> To: "Krutika Dhananjay" 
> Cc: "Aravinda" , "Gluster Devel"
> 
> Sent: Wednesday, September 2, 2015 11:13:55 PM
> Subject: Re: [Gluster-devel] Gluster Sharding and Geo-replication

> On 09/02/2015 10:47 AM, Krutika Dhananjay wrote:
> >
> >
> > 
> >
> > *From: *"Shyam" 
> > *To: *"Aravinda" , "Gluster Devel"
> > 
> > *Sent: *Wednesday, September 2, 2015 8:09:55 PM
> > *Subject: *Re: [Gluster-devel] Gluster Sharding and Geo-replication
> >
> > On 09/02/2015 03:12 AM, Aravinda wrote:
> > > Geo-replication and Sharding Team today discussed about the approach
> > > to make Sharding aware Geo-replication. Details are as below
> > >
> > > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
> > >
> > > - Both Master and Slave Volumes should be Sharded Volumes with same
> > > configurations.
> >
> > If I am not mistaken, geo-rep supports replicating to a non-gluster
> > local FS at the slave end. Is this correct? If so, would this
> > limitation
> > not make that problematic?
> >
> > When you state *same configuration*, I assume you mean the sharding
> > configuration, not the volume graph, right?
> >
> > That is correct. The only requirement is for the slave to have shard
> > translator (for, someone needs to present aggregated view of the file to
> > the READers on the slave).
> > Also the shard-block-size needs to be kept same between master and
> > slave. Rest of the configuration (like the number of subvols of DHT/AFR)
> > can vary across master and slave.

> Do we need to have the sharded block size the same? As I assume the file
> carries an xattr that contains the size it is sharded with
> (trusted.glusterfs.shard.block-size), so if this is synced across, it
> would do. If this is true, what it would mean is that "a sharded volume
> needs a shard supported slave to ge-rep to".

Yep. Even I feel it should probably not be necessary to enforce the same shard
size everywhere, as long as the shard translator on the slave takes care not to
further "shard" the individual shards gsyncd writes to the slave volume.
This is especially true if different files/images/vdisks on the master volume
are associated with different block sizes.
This logic has to be built into the shard translator based on parameters
(client-pid, parent directory of the file being written to).
What this means is that the shard-block-size attribute on the slave would
essentially be a don't-care parameter. I need to give all this some more
thought though.

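A minimal sketch of the check described above, assuming gsyncd's well-known
client-pid and the .shards parent directory are the inputs. The real shard
translator is C; the constants and helpers below are purely illustrative.

    # Illustration only: decide whether a write on the slave should skip shard
    # splitting. Constant values are placeholders, not the translator's real ones.

    GSYNCD_PID = -1            # geo-rep's client pid, as mentioned in this thread
    DOT_SHARDS_GFID = "PGS"    # placeholder gfid for the .shards directory
    BLOCK_SIZE = 4             # tiny block size, just for the demo

    def should_bypass_sharding(client_pid, parent_gfid):
        """gsyncd writing under .shards is writing an already-made shard."""
        return client_pid == GSYNCD_PID and parent_gfid == DOT_SHARDS_GFID

    def writev(client_pid, parent_gfid, data):
        if should_bypass_sharding(client_pid, parent_gfid):
            return [data]                                   # store the chunk as-is
        return [data[i:i + BLOCK_SIZE]                      # regular client path:
                for i in range(0, len(data), BLOCK_SIZE)]   # split per block-size

    print(writev(-1, "PGS", b"chunkdata"))     # gsyncd: one untouched piece
    print(writev(1234, "G0", b"chunkdata"))    # normal client: split into blocks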
-Krutika 

> >
> > -Krutika
> >
> >
> >
> > > - In Changelog record changes related to Sharded files also. Just
> > like
> > > any regular files.
> > > - Sharding should allow Geo-rep to list/read/write Sharding internal
> > > Xattrs if Client PID is gsyncd(-1)
> > > - Sharding should allow read/write of Sharded files(that is in
> > .shards
> > > directory) if Client PID is GSYNCD
> > > - Sharding should return actual file instead of returning the
> > > aggregated content when the Main file is requested(Client PID
> > > GSYNCD)
> > >
> > > For example, a file f1 is created with GFID G1.
> > >
> > > When the file grows it gets sharded into chunks(say 5 chunks).
> > >
> > > f1 G1
> > > .shards/G1.1 G2
> > > .shards/G1.2 G3
> > > .shards/G1.3 G4
> > > .shards/G1.4 G5
> > >
> > > In Changelog, this is recorded as 5 different files as below
> > >
> > > CREATE G1 f1
> > > DATA G1
> > > META G1
> > > CREATE G2 PGS/G1.1
> > > DATA G2
> > > META G1
> > > CREATE G3 PGS/G1.2
> > > DATA G3
> > > META G1
> > > CREATE G4 PGS/G1.3
> > > DATA G4
> > > META G1
> > > CREATE G5 PGS/G1.4
> > > DATA G5
> > > META G1
> > >
> > > Where PGS is GFID of .shards directory.
> > >
> > > Geo-rep will create these files independently in Slave Volume and
> > > syncs Xattrs of G1. Data can be read only when all the chunks are
> > > synced to Slave Volume. Data can be read partially if main/first file
> > > and some of the chunks synced to Slave.
> > >
> > > Please add if I missed anything. C & S Welcome.
> > >
> > > regards
> > > Aravinda
> > >
> > > On 08/11/2015 04:36 PM, Aravinda wrote:
> > >> Hi,
> > >>
> > >> We are thinking different approaches to add support in
> > Geo-replication
> > >> for Sharded Gluster Volumes[1]
> > >>
> > >> *Approach 1: Geo-rep: Sync Full file*
> > >> - In Changelog only record main file details in the same brick
> > >> where it is created
> > >> - Record as DATA in Changelog whenever any addition/changes
> > to the
> > >> sharded file
> > >> - Geo-rep rsync will do checksum as a full file from mount and
> > >> syncs as new file
> > >> - Slave side sharding is managed by Slave Volume
> > >> *Approach 2: Geo-rep: Sync sharded file separately*
> > >> - Geo-rep rsync will do checksum for sharded files only
> > >> - Geo-rep syncs each sharded files independently as new files
> > >> - [UNKNOWN] Sync internal xattrs(file size and block count)
> > in the
> > >

Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Krutika Dhananjay
- Original Message -

> From: "Venky Shankar" 
> To: "Aravinda" 
> Cc: "Shyam" , "Krutika Dhananjay" ,
> "Gluster Devel" 
> Sent: Thursday, September 3, 2015 8:29:37 AM
> Subject: Re: [Gluster-devel] Gluster Sharding and Geo-replication

> On Wed, Sep 2, 2015 at 11:39 PM, Aravinda  wrote:
> >
> > On 09/02/2015 11:13 PM, Shyam wrote:
> >>
> >> On 09/02/2015 10:47 AM, Krutika Dhananjay wrote:
> >>>
> >>>
> >>>
> >>> 
> >>>
> >>> *From: *"Shyam" 
> >>> *To: *"Aravinda" , "Gluster Devel"
> >>> 
> >>> *Sent: *Wednesday, September 2, 2015 8:09:55 PM
> >>> *Subject: *Re: [Gluster-devel] Gluster Sharding and Geo-replication
> >>>
> >>> On 09/02/2015 03:12 AM, Aravinda wrote:
> >>> > Geo-replication and Sharding Team today discussed about the
> >>> approach
> >>> > to make Sharding aware Geo-replication. Details are as below
> >>> >
> >>> > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay
> >>> Bellur
> >>> >
> >>> > - Both Master and Slave Volumes should be Sharded Volumes with
> >>> same
> >>> > configurations.
> >>>
> >>> If I am not mistaken, geo-rep supports replicating to a non-gluster
> >>> local FS at the slave end. Is this correct? If so, would this
> >>> limitation
> >>> not make that problematic?
> >>>
> >>> When you state *same configuration*, I assume you mean the sharding
> >>> configuration, not the volume graph, right?
> >>>
> >>> That is correct. The only requirement is for the slave to have shard
> >>> translator (for, someone needs to present aggregated view of the file to
> >>> the READers on the slave).
> >>> Also the shard-block-size needs to be kept same between master and
> >>> slave. Rest of the configuration (like the number of subvols of DHT/AFR)
> >>> can vary across master and slave.
> >>
> >>
> >> Do we need to have the sharded block size the same? As I assume the file
> >> carries an xattr that contains the size it is sharded with
> >> (trusted.glusterfs.shard.block-size), so if this is synced across, it
> >> would
> >> do. If this is true, what it would mean is that "a sharded volume needs a
> >> shard supported slave to ge-rep to".
> >
> > Yes. Number of bricks and replica count can be different. But sharded block
> > size should be same. Only the first file will have
> > xattr(trusted.glusterfs.shard.block-size), Geo-rep should sync this xattr
> > also to Slave. Only Gsyncd can read/write the sharded chunks. Sharded Slave
> > Volume is required to understand these chunks when read(non Gsyncd clients)

> Even if this works I am very much is disagreement with this mechanism
> of synchronization (not that I have a working solution in my head as
> of now).

Hi Venky, 

It is not apparent to me what issues you see with approach 2. If you could lay 
them out here, it would be helpful in taking the discussions further. 

-Krutika 

> >
> >>
> >>>
> >>> -Krutika
> >>>
> >>>
> >>>
> >>> > - In Changelog record changes related to Sharded files also. Just
> >>> like
> >>> > any regular files.
> >>> > - Sharding should allow Geo-rep to list/read/write Sharding
> >>> internal
> >>> > Xattrs if Client PID is gsyncd(-1)
> >>> > - Sharding should allow read/write of Sharded files(that is in
> >>> .shards
> >>> > directory) if Client PID is GSYNCD
> >>> > - Sharding should return actual file instead of returning the
> >>> > aggregated content when the Main file is requested(Client PID
> >>> > GSYNCD)
> >>> >
> >>> > For example, a file f1 is created with GFID G1.
> >>> >
> >>> > When the file grows it gets sharded into chunks(say 5 chunks).
> >>> >
> >>> > f1 G1
> >>> > .shards/G1.1 G2
> >>> > .shards/G1.2 G3
> >>> > .shards/G1.3 G4
> >>> > .shards/G1.4 G5
> >>> >
> >>> > In Changelog, this is recorded as 5 different files as below
> >>> >
> >>> > CREATE G1 f1
> >>> > DATA G1
> >>> > META G1
> >>> > CREATE G2 PGS/G1.1
> >>> > DATA G2
> >>> > META G1
> >>> > CREATE G3 PGS/G1.2
> >>> > DATA G3
> >>> > META G1
> >>> > CREATE G4 PGS/G1.3
> >>> > DATA G4
> >>> > META G1
> >>> > CREATE G5 PGS/G1.4
> >>> > DATA G5
> >>> > META G1
> >>> >
> >>> > Where PGS is GFID of .shards directory.
> >>> >
> >>> > Geo-rep will create these files independently in Slave Volume and
> >>> > syncs Xattrs of G1. Data can be read only when all the chunks are
> >>> > synced to Slave Volume. Data can be read partially if main/first
> >>> file
> >>> > and some of the chunks synced to Slave.
> >>> >
> >>> > Please add if I missed anything. C & S Welcome.
> >>> >
> >>> > regards
> >>> > Aravinda
> >>> >
> >>> > On 08/11/2015 04:36 PM, Aravinda wrote:
> >>> >> Hi,
> >>> >>
> >>> >> We are thinking different approaches to add support in
> >>> Geo-replication
> >>> >> for Sharded Gluster Volumes[1]
> >>> >>
> >>> >> *Approach 1: Geo-rep: Sync Full file*
> >>> >> - In Changelog only record main file details in the same brick
> >>> >> where it is created
> >>> >> - Record as DATA in Changelog whenever any addition/changes
> >>> to the
> >>> >>

Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Aravinda


On 09/03/2015 08:29 AM, Venky Shankar wrote:

On Wed, Sep 2, 2015 at 11:39 PM, Aravinda  wrote:

On 09/02/2015 11:13 PM, Shyam wrote:

On 09/02/2015 10:47 AM, Krutika Dhananjay wrote:





 *From: *"Shyam" 
 *To: *"Aravinda" , "Gluster Devel"
 
 *Sent: *Wednesday, September 2, 2015 8:09:55 PM
 *Subject: *Re: [Gluster-devel] Gluster Sharding and Geo-replication

 On 09/02/2015 03:12 AM, Aravinda wrote:
  > Geo-replication and Sharding Team today discussed about the
approach
  > to make Sharding aware Geo-replication. Details are as below
  >
  > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay
Bellur
  >
  > - Both Master and Slave Volumes should be Sharded Volumes with
same
  >configurations.

 If I am not mistaken, geo-rep supports replicating to a non-gluster
 local FS at the slave end. Is this correct? If so, would this
 limitation
 not make that problematic?

 When you state *same configuration*, I assume you mean the sharding
 configuration, not the volume graph, right?

That is correct. The only requirement is for the slave to have shard
translator (for, someone needs to present aggregated view of the file to
the READers on the slave).
Also the shard-block-size needs to be kept same between master and
slave. Rest of the configuration (like the number of subvols of DHT/AFR)
can vary across master and slave.


Do we need to have the sharded block size the same? As I assume the file
carries an xattr that contains the size it is sharded with
(trusted.glusterfs.shard.block-size), so if this is synced across, it would
do. If this is true, what it would mean is that "a sharded volume needs a
shard supported slave to ge-rep to".

Yes. Number of bricks and replica count can be different. But sharded block
size should be same. Only the first file will have
xattr(trusted.glusterfs.shard.block-size), Geo-rep should sync this xattr
also to Slave. Only Gsyncd can read/write the sharded chunks. Sharded Slave
Volume is required to understand these chunks when read(non Gsyncd clients)

Even if this works I am very much is disagreement with this mechanism
of synchronization (not that I have a working solution in my head as
of now).
Supporting a non-sharded slave Volume should be easy. As discussed, let
Changelog record everything, including Sharded file changes (maybe with a
flag marking them as internal).
Sharding has to make sure an Xattr operation happens on the main file whenever
any of its chunks is updated.


In Geo-rep, based on a config option (say --use-slave-sharding), decide whether
to sync chunks or the full file.

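A hedged sketch of that gsyncd-side decision; the --use-slave-sharding option
name and the "internal" flag come from this mail, while the record layout and
helper are assumptions made for illustration.

    # Illustration only: choose a sync strategy per changelog record depending
    # on whether the slave volume has the shard translator.

    def sync_action(record, use_slave_sharding):
        """Return what geo-rep should do with one changelog record."""
        if use_slave_sharding:
            # Slave is sharded: replicate every chunk (.shards/GFID.N) as-is.
            return ("sync-as-is", record["path"])
        if record.get("internal"):
            # Slave is unsharded: skip chunk records flagged as internal...
            return ("skip", record["path"])
        # ...and rsync the whole file through the master mount instead.
        return ("sync-full-file", record["path"])

    chunk = {"gfid": "G2", "path": ".shards/G1.1", "internal": True}
    print(sync_action(chunk, use_slave_sharding=True))    # ('sync-as-is', ...)
    print(sync_action(chunk, use_slave_sharding=False))   # ('skip', ...)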


-Krutika



  > - In Changelog record changes related to Sharded files also. Just
 like
  >any regular files.
  > - Sharding should allow Geo-rep to list/read/write Sharding
internal
  >Xattrs if Client PID is gsyncd(-1)
  > - Sharding should allow read/write of Sharded files(that is in
 .shards
  >directory) if Client PID is GSYNCD
  > - Sharding should return actual file instead of returning the
  >aggregated content when the Main file is requested(Client PID
  >GSYNCD)
  >
  > For example, a file f1 is created with GFID G1.
  >
  > When the file grows it gets sharded into chunks(say 5 chunks).
  >
  >  f1   G1
  >  .shards/G1.1   G2
  >  .shards/G1.2   G3
  >  .shards/G1.3   G4
  >  .shards/G1.4   G5
  >
  > In Changelog, this is recorded as 5 different files as below
  >
  >  CREATE G1 f1
  >  DATA G1
  >  META G1
  >  CREATE G2 PGS/G1.1
  >  DATA G2
  >  META G1
  >  CREATE G3 PGS/G1.2
  >  DATA G3
  >  META G1
  >  CREATE G4 PGS/G1.3
  >  DATA G4
  >  META G1
  >  CREATE G5 PGS/G1.4
  >  DATA G5
  >  META G1
  >
  > Where PGS is GFID of .shards directory.
  >
  > Geo-rep will create these files independently in Slave Volume and
  > syncs Xattrs of G1. Data can be read only when all the chunks are
  > synced to Slave Volume. Data can be read partially if main/first
file
  > and some of the chunks synced to Slave.
  >
  > Please add if I missed anything. C & S Welcome.
  >
  > regards
  > Aravinda
  >
  > On 08/11/2015 04:36 PM, Aravinda wrote:
  >> Hi,
  >>
  >> We are thinking different approaches to add support in
 Geo-replication
  >> for Sharded Gluster Volumes[1]
  >>
  >> *Approach 1: Geo-rep: Sync Full file*
  >>- In Changelog only record main file details in the same brick
  >> where it is created
  >>- Record as DATA in Changelog whenever any addition/changes
 to the
  >> sharded file
  >>- Geo-rep rsync will do checksum as a full file from mount and
  >> syncs as new file
  >>- Slave side s

Re: [Gluster-devel] Introducing georepsetup - Gluster Geo-replication Setup Tool

2015-09-02 Thread Aravinda

Thanks Kotresh.

The existing Geo-replication create command does many things asynchronously
(distributing and updating SSH keys using hook scripts), which makes it
difficult to understand where a problem occurs. This tool does all the steps
synchronously, so that we get to know about issues immediately.

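For illustration, that synchronous, fail-fast behaviour could look roughly like
this (a sketch, not the georepsetup source; the gluster commands shown are the
usual geo-rep setup steps, and the exact arguments may differ):

    # Sketch of running each setup step synchronously and stopping at the first
    # failure, so the broken step is obvious immediately.

    import subprocess
    import sys

    def run_step(title, cmd):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        if proc.returncode != 0:
            print("[ NOT OK ] %s\n%s" % (title, proc.stderr.strip()))
            sys.exit(1)                  # fail fast: report exactly which step broke
        print("[   OK   ] %s" % title)

    run_step("Generate common pem keys (gsec_create)",
             ["gluster", "system::", "execute", "gsec_create"])
    run_step("Create geo-rep session (push-pem)",
             ["gluster", "volume", "geo-replication", "MASTERVOL",
              "SLAVEHOST::SLAVEVOL", "create", "push-pem"])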

Run this tool against your already-corrupted setup to figure out whether any of
the steps failed previously. If all the steps succeed and the session is still
Faulty, then manual checking may be required.


I will try to accommodate verification using a --verify option. Thanks for
the feedback.


regards
Aravinda

On 09/03/2015 10:40 AM, Kotresh Hiremath Ravishankar wrote:

Hi Aravinda,

I used it yesterday. It greatly simplifies the geo-rep setup.
It would be great if it is enhanced to troubleshoot what's
wrong in already corrupted setup.

Thanks and Regards,
Kotresh H R

- Original Message -

From: "Aravinda" 
To: "Gluster Devel" , "gluster-users" 

Sent: Wednesday, September 2, 2015 11:25:15 PM
Subject: [Gluster-devel] Introducing georepsetup - Gluster Geo-replication  
Setup Tool

Hi,

Created a CLI tool using Python to simplify the Geo-replication Setup
process. This tool takes care of running gsec_create command,
distributing the SSH keys from Master to all Slave nodes etc. All in
one single command :)

Initial password less SSH login is not required, this tool prompts the
Root's password during run. Will not store password!


Wrote a blog post about thesame.
http://aravindavk.in/blog/introducing-georepsetup

Comments and Suggestions Welcome.

--
regards
Aravinda
http://aravindavk.in

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Introducing an eventing framework for GlusterFS through storaged

2015-09-02 Thread Deepak Shetty
On Mon, Aug 31, 2015 at 7:35 PM, Shyam  wrote:

> On 08/31/2015 12:24 AM, Samikshan Bairagya wrote:
>
>> Hi everyone,
>>
>> I have been working on this project for the past few weeks that aims at
>> improving the eventing framework for GlusterFS through storaged [0][1].
>> Through a DBus API over the existing GlusterFS CLI, storaged could help
>> with better notifications for gluster events.
>>
>
> Is this a Linux only solution? What is the alternative for NetBSD? (a
> quick search for systemd/storaged on NetBSD yielded nothing of significance)
>
>
>> The plan:
>> =
>>
>> storaged exports objects implementing the respective interfaces for
>> Linux block devices, drives, RAID arrays, etc. on the DBus. More objects
>> implementing other interfaces can be exported through modules. The
>> "glusterfs" module in storaged will populate the DBus with GlusterFS
>> specific objects implementing the required interfaces. As an example,
>> the "iscsi" module in storaged adds DBus objects implementing the
>> org.storaged.Storaged.ISCSI.Session interface[2].
>>
>> Once DBus objects are exported to the DBus for GlusterFS volumes and
>> bricks, implementing the respective interfaces, it would be convenient
>> for interested clients to receive event notifications through DBus
>> signals or method calls. This enables clients to get updates wrt changes
>> on the glusterfs side in real time over the existing logging framework.
>>
>
> Can you provide a sample list of events that the clients would be
> interested in, or Gluster would need to provide? I am curious to know the
> level of integration sought here based on the type of events that we intend
> to publish.


+1, same here.

Also to add, it would be good to provide the below:

1) A list of use cases where this will be used/useful. It seems this can be
helpful in OpenStack, but that is not fully clear to me yet.

2) An example of how a client will use it, and what it takes for the client to
use/consume this interface? (A hypothetical sketch follows this list.)

3) A mapping of gluster events (hooks or otherwise?) to storaged/DBus events.

4) For example: we have tiering in Gluster; is it possible for the client to
know when a file was moved from the hot to the cold tier (or vice versa) using
this interface? If not, what would it take to do so? Something like this is
what I would expect in the response to #1 above.

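On #2, a hypothetical sketch of what consuming such events might look like for
a client. The interface name used below (org.storaged.Storaged.Glusterfs.Volume)
is an assumption modelled on the ISCSI.Session example cited above, not an
existing storaged interface, so treat this purely as an illustration of the
consumption model (dbus-python plus a GLib main loop):

    # Hypothetical consumer: listen for signals from an (assumed) glusterfs
    # storaged module over the system bus. Only the DBus/GLib plumbing is real API.

    import dbus
    from dbus.mainloop.glib import DBusGMainLoop
    from gi.repository import GLib

    def on_gluster_event(*args):
        # e.g. volume started/stopped, brick added, tier migration -- whatever
        # the glusterfs module ends up emitting as signals.
        print("gluster event:", args)

    DBusGMainLoop(set_as_default=True)
    bus = dbus.SystemBus()
    bus.add_signal_receiver(on_gluster_event,
                            dbus_interface="org.storaged.Storaged.Glusterfs.Volume")
    GLib.MainLoop().run()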
thanx,
deepak
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Raghavendra Gowdappa


- Original Message -
> From: "Emmanuel Dreyfus" 
> To: "Raghavendra Gowdappa" , "Pranith Kumar Karampuri" 
> 
> Cc: gluster-devel@gluster.org
> Sent: Wednesday, September 2, 2015 8:12:37 PM
> Subject: Re: [Gluster-devel] FOP ratelimit?
> 
> Raghavendra Gowdappa  wrote:
> 
> > Its helpful if you can give some pointers on what parameters (like
> > latency, throughput etc) you want us to consider for QoS.
> 
> Full blown QoS would be nice, but a first line of defense against
> resource hogs seems just badly required.
> 
> A bare minimum could be to process client's FOP in a round robin
> fashion. That way even if one client sends a lot of FOPs, there is
> always some window for others to slip in.
> 
> Any opinion?

As of now we depend on epoll/poll events to inform servers about incoming
messages. All sockets are put in the same event-pool, represented by a single
poll-control fd. So, the order in which we process messages from various clients
really depends on how epoll/poll picks events across multiple sockets. Do
poll/epoll have any sort of scheduling, or is it random? Any pointers on this
are appreciated.

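As an illustration of Emmanuel's "first line of defense" (a sketch in Python,
not Gluster's C event layer): drain per-client FOP queues in round-robin order
so that one chatty client cannot starve the others, whatever order epoll
delivers events in. The queue contents below are made up.

    # Round-robin draining of per-client FOP queues: at most one FOP per client
    # per pass, so a backlog from one client never blocks the rest.

    from collections import deque
    from itertools import cycle

    queues = {
        "client-A": deque(["fop-%d" % i for i in range(6)]),   # the resource hog
        "client-B": deque(["fop-0"]),
        "client-C": deque(["fop-0", "fop-1"]),
    }

    def round_robin(queues):
        remaining = sum(len(q) for q in queues.values())
        for client in cycle(queues):
            if not remaining:
                return
            if queues[client]:
                yield client, queues[client].popleft()
                remaining -= 1

    for client, fop in round_robin(queues):
        print(client, fop)   # B and C get served on every pass despite A's backlog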
> 
> --
> Emmanuel Dreyfus
> http://hcpnet.free.fr/pubz
> m...@netbsd.org
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Wanted - 3.7.5 release manager

2015-09-02 Thread Prasanna Kalever
Hi Pranith,

If you need any assistance, please let me know; I will be happy to learn this.

Thanks & regards,
Prasanna Kumar K

- Original Message -
From: "Pranith Kumar Karampuri" 
To: "Vijay Bellur" , "Atin Mukherjee" 

Cc: "Gluster Devel" 
Sent: Wednesday, September 2, 2015 7:34:08 PM
Subject: Re: [Gluster-devel] Wanted - 3.7.5 release manager



On 09/02/2015 07:33 PM, Vijay Bellur wrote:
> On Wednesday 02 September 2015 06:38 PM, Atin Mukherjee wrote:
>> IIRC, Pranith already volunteered for it in one of the last community
>> meetings?
>>
>
> Thanks Atin. I do recollect it now.
>
> Pranith - can you confirm being the release manager for 3.7.5?
Yes, I can do this.

Pranith
>
> -Vijay
>
>> -Atin
>> Sent from one plus one
>>
>> On Sep 2, 2015 6:00 PM, "Vijay Bellur" > > wrote:
>>
>> Hi All,
>>
>> We have been rotating release managers for minor releases in the
>> 3.7.x train. We just released 3.7.4 and are looking for volunteers
>> to be release managers for 3.7.5 (scheduled for 30th September). If
>> anybody is interested in volunteering, please drop a note here.
>>
>> Thanks,
>> Vijay
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org 
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>
>

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Introducing georepsetup - Gluster Geo-replication Setup Tool

2015-09-02 Thread Kotresh Hiremath Ravishankar
Hi Aravinda,

I used it yesterday. It greatly simplifies the geo-rep setup.
It would be great if it is enhanced to troubleshoot what's
wrong in already corrupted setup.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Aravinda" 
> To: "Gluster Devel" , "gluster-users" 
> 
> Sent: Wednesday, September 2, 2015 11:25:15 PM
> Subject: [Gluster-devel] Introducing georepsetup - Gluster Geo-replication
> Setup Tool
> 
> Hi,
> 
> Created a CLI tool using Python to simplify the Geo-replication Setup
> process. This tool takes care of running gsec_create command,
> distributing the SSH keys from Master to all Slave nodes etc. All in
> one single command :)
> 
> Initial password less SSH login is not required, this tool prompts the
> Root's password during run. Will not store password!
> 
> 
> Wrote a blog post about thesame.
> http://aravindavk.in/blog/introducing-georepsetup
> 
> Comments and Suggestions Welcome.
> 
> --
> regards
> Aravinda
> http://aravindavk.in
> 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Venky Shankar
On Wed, Sep 2, 2015 at 11:39 PM, Aravinda  wrote:
>
> On 09/02/2015 11:13 PM, Shyam wrote:
>>
>> On 09/02/2015 10:47 AM, Krutika Dhananjay wrote:
>>>
>>>
>>>
>>> 
>>>
>>> *From: *"Shyam" 
>>> *To: *"Aravinda" , "Gluster Devel"
>>> 
>>> *Sent: *Wednesday, September 2, 2015 8:09:55 PM
>>> *Subject: *Re: [Gluster-devel] Gluster Sharding and Geo-replication
>>>
>>> On 09/02/2015 03:12 AM, Aravinda wrote:
>>>  > Geo-replication and Sharding Team today discussed about the
>>> approach
>>>  > to make Sharding aware Geo-replication. Details are as below
>>>  >
>>>  > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay
>>> Bellur
>>>  >
>>>  > - Both Master and Slave Volumes should be Sharded Volumes with
>>> same
>>>  >configurations.
>>>
>>> If I am not mistaken, geo-rep supports replicating to a non-gluster
>>> local FS at the slave end. Is this correct? If so, would this
>>> limitation
>>> not make that problematic?
>>>
>>> When you state *same configuration*, I assume you mean the sharding
>>> configuration, not the volume graph, right?
>>>
>>> That is correct. The only requirement is for the slave to have shard
>>> translator (for, someone needs to present aggregated view of the file to
>>> the READers on the slave).
>>> Also the shard-block-size needs to be kept same between master and
>>> slave. Rest of the configuration (like the number of subvols of DHT/AFR)
>>> can vary across master and slave.
>>
>>
>> Do we need to have the sharded block size the same? As I assume the file
>> carries an xattr that contains the size it is sharded with
>> (trusted.glusterfs.shard.block-size), so if this is synced across, it would
>> do. If this is true, what it would mean is that "a sharded volume needs a
>> shard supported slave to ge-rep to".
>
> Yes. Number of bricks and replica count can be different. But sharded block
> size should be same. Only the first file will have
> xattr(trusted.glusterfs.shard.block-size), Geo-rep should sync this xattr
> also to Slave. Only Gsyncd can read/write the sharded chunks. Sharded Slave
> Volume is required to understand these chunks when read(non Gsyncd clients)

Even if this works, I am very much in disagreement with this mechanism
of synchronization (not that I have a working solution in my head as
of now).

>
>>
>>>
>>> -Krutika
>>>
>>>
>>>
>>>  > - In Changelog record changes related to Sharded files also. Just
>>> like
>>>  >any regular files.
>>>  > - Sharding should allow Geo-rep to list/read/write Sharding
>>> internal
>>>  >Xattrs if Client PID is gsyncd(-1)
>>>  > - Sharding should allow read/write of Sharded files(that is in
>>> .shards
>>>  >directory) if Client PID is GSYNCD
>>>  > - Sharding should return actual file instead of returning the
>>>  >aggregated content when the Main file is requested(Client PID
>>>  >GSYNCD)
>>>  >
>>>  > For example, a file f1 is created with GFID G1.
>>>  >
>>>  > When the file grows it gets sharded into chunks(say 5 chunks).
>>>  >
>>>  >  f1   G1
>>>  >  .shards/G1.1   G2
>>>  >  .shards/G1.2   G3
>>>  >  .shards/G1.3   G4
>>>  >  .shards/G1.4   G5
>>>  >
>>>  > In Changelog, this is recorded as 5 different files as below
>>>  >
>>>  >  CREATE G1 f1
>>>  >  DATA G1
>>>  >  META G1
>>>  >  CREATE G2 PGS/G1.1
>>>  >  DATA G2
>>>  >  META G1
>>>  >  CREATE G3 PGS/G1.2
>>>  >  DATA G3
>>>  >  META G1
>>>  >  CREATE G4 PGS/G1.3
>>>  >  DATA G4
>>>  >  META G1
>>>  >  CREATE G5 PGS/G1.4
>>>  >  DATA G5
>>>  >  META G1
>>>  >
>>>  > Where PGS is GFID of .shards directory.
>>>  >
>>>  > Geo-rep will create these files independently in Slave Volume and
>>>  > syncs Xattrs of G1. Data can be read only when all the chunks are
>>>  > synced to Slave Volume. Data can be read partially if main/first
>>> file
>>>  > and some of the chunks synced to Slave.
>>>  >
>>>  > Please add if I missed anything. C & S Welcome.
>>>  >
>>>  > regards
>>>  > Aravinda
>>>  >
>>>  > On 08/11/2015 04:36 PM, Aravinda wrote:
>>>  >> Hi,
>>>  >>
>>>  >> We are thinking different approaches to add support in
>>> Geo-replication
>>>  >> for Sharded Gluster Volumes[1]
>>>  >>
>>>  >> *Approach 1: Geo-rep: Sync Full file*
>>>  >>- In Changelog only record main file details in the same brick
>>>  >> where it is created
>>>  >>- Record as DATA in Changelog whenever any addition/changes
>>> to the
>>>  >> sharded file
>>>  >>- Geo-rep rsync will do checksum as a full file from mount and
>>>  >> syncs as new file
>>>  >>- Slave side sh

Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Aravinda


On 09/02/2015 11:13 PM, Shyam wrote:

On 09/02/2015 10:47 AM, Krutika Dhananjay wrote:





*From: *"Shyam" 
*To: *"Aravinda" , "Gluster Devel"

*Sent: *Wednesday, September 2, 2015 8:09:55 PM
*Subject: *Re: [Gluster-devel] Gluster Sharding and Geo-replication

On 09/02/2015 03:12 AM, Aravinda wrote:
 > Geo-replication and Sharding Team today discussed about the 
approach

 > to make Sharding aware Geo-replication. Details are as below
 >
 > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay 
Bellur

 >
 > - Both Master and Slave Volumes should be Sharded Volumes with 
same

 >configurations.

If I am not mistaken, geo-rep supports replicating to a non-gluster
local FS at the slave end. Is this correct? If so, would this
limitation
not make that problematic?

When you state *same configuration*, I assume you mean the sharding
configuration, not the volume graph, right?

That is correct. The only requirement is for the slave to have shard
translator (for, someone needs to present aggregated view of the file to
the READers on the slave).
Also the shard-block-size needs to be kept same between master and
slave. Rest of the configuration (like the number of subvols of DHT/AFR)
can vary across master and slave.


Do we need to have the sharded block size the same? As I assume the 
file carries an xattr that contains the size it is sharded with 
(trusted.glusterfs.shard.block-size), so if this is synced across, it 
would do. If this is true, what it would mean is that "a sharded 
volume needs a shard supported slave to ge-rep to".
Yes. The number of bricks and the replica count can be different, but the
shard block size should be the same. Only the main (first) file will have the
xattr (trusted.glusterfs.shard.block-size); Geo-rep should sync this xattr to
the Slave as well. Only gsyncd can read/write the sharded chunks. A sharded
Slave Volume is required to understand these chunks when they are read (by
non-gsyncd clients).

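As a small illustration of the xattr hand-off described above (a hedged sketch,
not gsyncd code; the mount paths are placeholders, and this assumes the mount
lets gsyncd read/write the internal xattr, as proposed earlier in the thread):

    # Copy the shard block-size xattr of the main file from a master mount to
    # the corresponding file on a slave mount. Paths below are placeholders.

    import os

    SHARD_XATTR = "trusted.glusterfs.shard.block-size"

    def sync_shard_block_size(master_path, slave_path):
        """Propagate the block-size xattr if the file is sharded; True if copied."""
        try:
            value = os.getxattr(master_path, SHARD_XATTR)
        except OSError:
            return False                 # xattr absent: not a sharded main file
        os.setxattr(slave_path, SHARD_XATTR, value)
        return True

    # e.g. sync_shard_block_size("/mnt/master/f1", "/mnt/slave/f1")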



-Krutika



 > - In Changelog record changes related to Sharded files also. Just
like
 >any regular files.
 > - Sharding should allow Geo-rep to list/read/write Sharding 
internal

 >Xattrs if Client PID is gsyncd(-1)
 > - Sharding should allow read/write of Sharded files(that is in
.shards
 >directory) if Client PID is GSYNCD
 > - Sharding should return actual file instead of returning the
 >aggregated content when the Main file is requested(Client PID
 >GSYNCD)
 >
 > For example, a file f1 is created with GFID G1.
 >
 > When the file grows it gets sharded into chunks(say 5 chunks).
 >
 >  f1   G1
 >  .shards/G1.1   G2
 >  .shards/G1.2   G3
 >  .shards/G1.3   G4
 >  .shards/G1.4   G5
 >
 > In Changelog, this is recorded as 5 different files as below
 >
 >  CREATE G1 f1
 >  DATA G1
 >  META G1
 >  CREATE G2 PGS/G1.1
 >  DATA G2
 >  META G1
 >  CREATE G3 PGS/G1.2
 >  DATA G3
 >  META G1
 >  CREATE G4 PGS/G1.3
 >  DATA G4
 >  META G1
 >  CREATE G5 PGS/G1.4
 >  DATA G5
 >  META G1
 >
 > Where PGS is GFID of .shards directory.
 >
 > Geo-rep will create these files independently in Slave Volume and
 > syncs Xattrs of G1. Data can be read only when all the chunks are
 > synced to Slave Volume. Data can be read partially if 
main/first file

 > and some of the chunks synced to Slave.
 >
 > Please add if I missed anything. C & S Welcome.
 >
 > regards
 > Aravinda
 >
 > On 08/11/2015 04:36 PM, Aravinda wrote:
 >> Hi,
 >>
 >> We are thinking different approaches to add support in
Geo-replication
 >> for Sharded Gluster Volumes[1]
 >>
 >> *Approach 1: Geo-rep: Sync Full file*
 >>- In Changelog only record main file details in the same 
brick

 >> where it is created
 >>- Record as DATA in Changelog whenever any addition/changes
to the
 >> sharded file
 >>- Geo-rep rsync will do checksum as a full file from mount 
and

 >> syncs as new file
 >>- Slave side sharding is managed by Slave Volume
 >> *Approach 2: Geo-rep: Sync sharded file separately*
 >>- Geo-rep rsync will do checksum for sharded files only
 >>- Geo-rep syncs each sharded files independently as new files
 >>- [UNKNOWN] Sync internal xattrs(file size and block count)
in the
 >> main sharded file to Slave Volume to maintain the same state as
in Master.
 >>- Sharding translator to allow file creation under .shards
dir for
 >> gsyncd. that is Parent GFID is .shards directory
 >>- If sharded files are modified during Geo-rep run may end up
stale
 >> data in Slave.
 

[Gluster-devel] Introducing georepsetup - Gluster Geo-replication Setup Tool

2015-09-02 Thread Aravinda

Hi,

Created a CLI tool using Python to simplify the Geo-replication setup
process. This tool takes care of running the gsec_create command,
distributing the SSH keys from Master to all Slave nodes, etc. All in
one single command :)

Initial passwordless SSH login is not required; the tool prompts for the
root password during the run, and will not store the password!


Wrote a blog post about the same:
http://aravindavk.in/blog/introducing-georepsetup

Comments and Suggestions Welcome.

--
regards
Aravinda
http://aravindavk.in

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Shyam

On 09/02/2015 10:47 AM, Krutika Dhananjay wrote:





*From: *"Shyam" 
*To: *"Aravinda" , "Gluster Devel"

*Sent: *Wednesday, September 2, 2015 8:09:55 PM
*Subject: *Re: [Gluster-devel] Gluster Sharding and Geo-replication

On 09/02/2015 03:12 AM, Aravinda wrote:
 > Geo-replication and Sharding Team today discussed about the approach
 > to make Sharding aware Geo-replication. Details are as below
 >
 > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
 >
 > - Both Master and Slave Volumes should be Sharded Volumes with same
 >configurations.

If I am not mistaken, geo-rep supports replicating to a non-gluster
local FS at the slave end. Is this correct? If so, would this
limitation
not make that problematic?

When you state *same configuration*, I assume you mean the sharding
configuration, not the volume graph, right?

That is correct. The only requirement is for the slave to have shard
translator (for, someone needs to present aggregated view of the file to
the READers on the slave).
Also the shard-block-size needs to be kept same between master and
slave. Rest of the configuration (like the number of subvols of DHT/AFR)
can vary across master and slave.


Do we need to have the shard block size the same? I assume the file 
carries an xattr that contains the size it is sharded with 
(trusted.glusterfs.shard.block-size), so if this is synced across, it 
would do. If this is true, what it would mean is that "a sharded volume 
needs a shard-supporting slave to geo-rep to".




-Krutika



 > - In Changelog record changes related to Sharded files also. Just
like
 >any regular files.
 > - Sharding should allow Geo-rep to list/read/write Sharding internal
 >Xattrs if Client PID is gsyncd(-1)
 > - Sharding should allow read/write of Sharded files(that is in
.shards
 >directory) if Client PID is GSYNCD
 > - Sharding should return actual file instead of returning the
 >aggregated content when the Main file is requested(Client PID
 >GSYNCD)
 >
 > For example, a file f1 is created with GFID G1.
 >
 > When the file grows it gets sharded into chunks(say 5 chunks).
 >
 >  f1   G1
 >  .shards/G1.1   G2
 >  .shards/G1.2   G3
 >  .shards/G1.3   G4
 >  .shards/G1.4   G5
 >
 > In Changelog, this is recorded as 5 different files as below
 >
 >  CREATE G1 f1
 >  DATA G1
 >  META G1
 >  CREATE G2 PGS/G1.1
 >  DATA G2
 >  META G1
 >  CREATE G3 PGS/G1.2
 >  DATA G3
 >  META G1
 >  CREATE G4 PGS/G1.3
 >  DATA G4
 >  META G1
 >  CREATE G5 PGS/G1.4
 >  DATA G5
 >  META G1
 >
 > Where PGS is GFID of .shards directory.
 >
 > Geo-rep will create these files independently in Slave Volume and
 > syncs Xattrs of G1. Data can be read only when all the chunks are
 > synced to Slave Volume. Data can be read partially if main/first file
 > and some of the chunks synced to Slave.
 >
 > Please add if I missed anything. C & S Welcome.
 >
 > regards
 > Aravinda
 >
 > On 08/11/2015 04:36 PM, Aravinda wrote:
 >> Hi,
 >>
 >> We are thinking different approaches to add support in
Geo-replication
 >> for Sharded Gluster Volumes[1]
 >>
 >> *Approach 1: Geo-rep: Sync Full file*
 >>- In Changelog only record main file details in the same brick
 >> where it is created
 >>- Record as DATA in Changelog whenever any addition/changes
to the
 >> sharded file
 >>- Geo-rep rsync will do checksum as a full file from mount and
 >> syncs as new file
 >>- Slave side sharding is managed by Slave Volume
 >> *Approach 2: Geo-rep: Sync sharded file separately*
 >>- Geo-rep rsync will do checksum for sharded files only
 >>- Geo-rep syncs each sharded files independently as new files
 >>- [UNKNOWN] Sync internal xattrs(file size and block count)
in the
 >> main sharded file to Slave Volume to maintain the same state as
in Master.
 >>- Sharding translator to allow file creation under .shards
dir for
 >> gsyncd. that is Parent GFID is .shards directory
 >>- If sharded files are modified during Geo-rep run may end up
stale
 >> data in Slave.
 >>- Files on Slave Volume may not be readable unless all sharded
 >> files sync to Slave(Each bricks in Master independently sync
files to
 >> slave)
 >>
 >> First approach looks more clean, but we have to analize the Rsync
 >> checksum performance on big files(Sharded in backend, accessed
as one
 >> big file from rsync)
 >>
 >> Let us know your thoughts. Thanks
  

Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Venky Shankar
]
On Wed, Sep 2, 2015 at 8:50 PM, Aravinda  wrote:
>
> On 09/02/2015 08:22 PM, Venky Shankar wrote:
>>
>> On Wed, Sep 2, 2015 at 12:42 PM, Aravinda  wrote:
>>>
>>> Geo-replication and Sharding Team today discussed about the approach
>>> to make Sharding aware Geo-replication. Details are as below
>>>
>>> Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
>>>
>>> - Both Master and Slave Volumes should be Sharded Volumes with same
>>>configurations.
>>> - In Changelog record changes related to Sharded files also. Just like
>>>any regular files.
>>> - Sharding should allow Geo-rep to list/read/write Sharding internal
>>>Xattrs if Client PID is gsyncd(-1)
>>> - Sharding should allow read/write of Sharded files(that is in .shards
>>>directory) if Client PID is GSYNCD
>>> - Sharding should return actual file instead of returning the
>>>aggregated content when the Main file is requested(Client PID
>>>GSYNCD)
>>>
>>> For example, a file f1 is created with GFID G1.
>>>
>>> When the file grows it gets sharded into chunks(say 5 chunks).
>>>
>>>  f1   G1
>>>  .shards/G1.1   G2
>>>  .shards/G1.2   G3
>>>  .shards/G1.3   G4
>>>  .shards/G1.4   G5
>>>
>>> In Changelog, this is recorded as 5 different files as below
>>>
>>>  CREATE G1 f1
>>>  DATA G1
>>>  META G1
>>>  CREATE G2 PGS/G1.1
>>>  DATA G2
>>>  META G1
>>>  CREATE G3 PGS/G1.2
>>>  DATA G3
>>>  META G1
>>>  CREATE G4 PGS/G1.3
>>>  DATA G4
>>>  META G1
>>>  CREATE G5 PGS/G1.4
>>>  DATA G5
>>>  META G1
>>>
>>> Where PGS is GFID of .shards directory.
>>>
>>> Geo-rep will create these files independently in Slave Volume and
>>> syncs Xattrs of G1. Data can be read only when all the chunks are
>>> synced to Slave Volume. Data can be read partially if main/first file
>>> and some of the chunks synced to Slave.
>>
>> So, before replicating data to the salve, all shards needs to be created
>> there?
>
> No. each files will be synced independently. But for reading complete file
> all the shards should be present, else partial data is read.

Oh yes. I figured that out a bit late. Option #2 is restrictive.
Probably needs more thinking...

>
>>
>>> Please add if I missed anything. C & S Welcome.
>>>
>>> regards
>>> Aravinda
>>>
>>> On 08/11/2015 04:36 PM, Aravinda wrote:
>>>
>>> Hi,
>>>
>>> We are thinking different approaches to add support in Geo-replication
>>> for
>>> Sharded Gluster Volumes[1]
>>>
>>> Approach 1: Geo-rep: Sync Full file
>>> - In Changelog only record main file details in the same brick where
>>> it
>>> is created
>>> - Record as DATA in Changelog whenever any addition/changes to the
>>> sharded file
>>> - Geo-rep rsync will do checksum as a full file from mount and syncs
>>> as
>>> new file
>>> - Slave side sharding is managed by Slave Volume
>>>
>>> Approach 2: Geo-rep: Sync sharded file separately
>>> - Geo-rep rsync will do checksum for sharded files only
>>> - Geo-rep syncs each sharded files independently as new files
>>> - [UNKNOWN] Sync internal xattrs(file size and block count) in the
>>> main
>>> sharded file to Slave Volume to maintain the same state as in Master.
>>> - Sharding translator to allow file creation under .shards dir for
>>> gsyncd. that is Parent GFID is .shards directory
>>> - If sharded files are modified during Geo-rep run may end up stale
>>> data
>>> in Slave.
>>> - Files on Slave Volume may not be readable unless all sharded files
>>> sync
>>> to Slave(Each bricks in Master independently sync files to slave)
>>>
>>> First approach looks more clean, but we have to analize the Rsync
>>> checksum
>>> performance on big files(Sharded in backend, accessed as one big file
>>> from
>>> rsync)
>>>
>>> Let us know your thoughts. Thanks
>>>
>>> Ref:
>>> [1]
>>>
>>> http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator
>>>
>>> --
>>> regards
>>> Aravinda
>>>
>>>
>>>
>>> ___
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>
>>>
>>>
>>> ___
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>
>
> regards
> Aravinda
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Aravinda


On 09/02/2015 08:22 PM, Venky Shankar wrote:

On Wed, Sep 2, 2015 at 12:42 PM, Aravinda  wrote:

Geo-replication and Sharding Team today discussed about the approach
to make Sharding aware Geo-replication. Details are as below

Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur

- Both Master and Slave Volumes should be Sharded Volumes with same
   configurations.
- In Changelog record changes related to Sharded files also. Just like
   any regular files.
- Sharding should allow Geo-rep to list/read/write Sharding internal
   Xattrs if Client PID is gsyncd(-1)
- Sharding should allow read/write of Sharded files(that is in .shards
   directory) if Client PID is GSYNCD
- Sharding should return actual file instead of returning the
   aggregated content when the Main file is requested(Client PID
   GSYNCD)

For example, a file f1 is created with GFID G1.

When the file grows it gets sharded into chunks(say 5 chunks).

 f1   G1
 .shards/G1.1   G2
 .shards/G1.2   G3
 .shards/G1.3   G4
 .shards/G1.4   G5

In Changelog, this is recorded as 5 different files as below

 CREATE G1 f1
 DATA G1
 META G1
 CREATE G2 PGS/G1.1
 DATA G2
 META G1
 CREATE G3 PGS/G1.2
 DATA G3
 META G1
 CREATE G4 PGS/G1.3
 DATA G4
 META G1
 CREATE G5 PGS/G1.4
 DATA G5
 META G1

Where PGS is GFID of .shards directory.

Geo-rep will create these files independently in Slave Volume and
syncs Xattrs of G1. Data can be read only when all the chunks are
synced to Slave Volume. Data can be read partially if main/first file
and some of the chunks synced to Slave.

So, before replicating data to the salve, all shards needs to be created there?
No, each file will be synced independently. But to read the complete 
file all the shards should be present; otherwise partial data is read.



Please add if I missed anything. C & S Welcome.

regards
Aravinda

On 08/11/2015 04:36 PM, Aravinda wrote:

Hi,

We are thinking different approaches to add support in Geo-replication for
Sharded Gluster Volumes[1]

Approach 1: Geo-rep: Sync Full file
- In Changelog only record main file details in the same brick where it
is created
- Record as DATA in Changelog whenever any addition/changes to the
sharded file
- Geo-rep rsync will do checksum as a full file from mount and syncs as
new file
- Slave side sharding is managed by Slave Volume

Approach 2: Geo-rep: Sync sharded file separately
- Geo-rep rsync will do checksum for sharded files only
- Geo-rep syncs each sharded files independently as new files
- [UNKNOWN] Sync internal xattrs(file size and block count) in the main
sharded file to Slave Volume to maintain the same state as in Master.
- Sharding translator to allow file creation under .shards dir for
gsyncd. that is Parent GFID is .shards directory
- If sharded files are modified during Geo-rep run may end up stale data
in Slave.
- Files on Slave Volume may not be readable unless all sharded files sync
to Slave(Each bricks in Master independently sync files to slave)

First approach looks more clean, but we have to analize the Rsync checksum
performance on big files(Sharded in backend, accessed as one big file from
rsync)

Let us know your thoughts. Thanks

Ref:
[1]
http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator

--
regards
Aravinda



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel



regards
Aravinda

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Wanted - 3.7.5 release manager

2015-09-02 Thread Vijay Bellur

On Wednesday 02 September 2015 07:34 PM, Pranith Kumar Karampuri wrote:



On 09/02/2015 07:33 PM, Vijay Bellur wrote:

On Wednesday 02 September 2015 06:38 PM, Atin Mukherjee wrote:

IIRC, Pranith already volunteered for it in one of the last community
meetings?



Thanks Atin. I do recollect it now.

Pranith - can you confirm being the release manager for 3.7.5?

Yes, I can do this.



Thanks, Pranith!

-Vijay

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Venky Shankar
On Wed, Sep 2, 2015 at 8:09 PM, Shyam  wrote:
> On 09/02/2015 03:12 AM, Aravinda wrote:
>>
>> Geo-replication and Sharding Team today discussed about the approach
>> to make Sharding aware Geo-replication. Details are as below
>>
>> Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
>>
>> - Both Master and Slave Volumes should be Sharded Volumes with same
>>configurations.
>
>
> If I am not mistaken, geo-rep supports replicating to a non-gluster local FS
> at the slave end. Is this correct? If so, would this limitation not make
> that problematic?

That was taken out when distributed geo-replication was developed
(with support for GFID synchronization between master and slave). The slave
therefore needs to be a Gluster volume.

>
> When you state *same configuration*, I assume you mean the sharding
> configuration, not the volume graph, right?
>
>> - In Changelog record changes related to Sharded files also. Just like
>>any regular files.
>> - Sharding should allow Geo-rep to list/read/write Sharding internal
>>Xattrs if Client PID is gsyncd(-1)
>> - Sharding should allow read/write of Sharded files(that is in .shards
>>directory) if Client PID is GSYNCD
>> - Sharding should return actual file instead of returning the
>>aggregated content when the Main file is requested(Client PID
>>GSYNCD)
>>
>> For example, a file f1 is created with GFID G1.
>>
>> When the file grows it gets sharded into chunks(say 5 chunks).
>>
>>  f1   G1
>>  .shards/G1.1   G2
>>  .shards/G1.2   G3
>>  .shards/G1.3   G4
>>  .shards/G1.4   G5
>>
>> In Changelog, this is recorded as 5 different files as below
>>
>>  CREATE G1 f1
>>  DATA G1
>>  META G1
>>  CREATE G2 PGS/G1.1
>>  DATA G2
>>  META G1
>>  CREATE G3 PGS/G1.2
>>  DATA G3
>>  META G1
>>  CREATE G4 PGS/G1.3
>>  DATA G4
>>  META G1
>>  CREATE G5 PGS/G1.4
>>  DATA G5
>>  META G1
>>
>> Where PGS is GFID of .shards directory.
>>
>> Geo-rep will create these files independently in Slave Volume and
>> syncs Xattrs of G1. Data can be read only when all the chunks are
>> synced to Slave Volume. Data can be read partially if main/first file
>> and some of the chunks synced to Slave.
>>
>> Please add if I missed anything. C & S Welcome.
>>
>> regards
>> Aravinda
>>
>> On 08/11/2015 04:36 PM, Aravinda wrote:
>>>
>>> Hi,
>>>
>>> We are thinking different approaches to add support in Geo-replication
>>> for Sharded Gluster Volumes[1]
>>>
>>> *Approach 1: Geo-rep: Sync Full file*
>>>- In Changelog only record main file details in the same brick
>>> where it is created
>>>- Record as DATA in Changelog whenever any addition/changes to the
>>> sharded file
>>>- Geo-rep rsync will do checksum as a full file from mount and
>>> syncs as new file
>>>- Slave side sharding is managed by Slave Volume
>>> *Approach 2: Geo-rep: Sync sharded file separately*
>>>
>>>- Geo-rep rsync will do checksum for sharded files only
>>>- Geo-rep syncs each sharded files independently as new files
>>>- [UNKNOWN] Sync internal xattrs(file size and block count) in the
>>> main sharded file to Slave Volume to maintain the same state as in
>>> Master.
>>>- Sharding translator to allow file creation under .shards dir for
>>> gsyncd. that is Parent GFID is .shards directory
>>>- If sharded files are modified during Geo-rep run may end up stale
>>> data in Slave.
>>>- Files on Slave Volume may not be readable unless all sharded
>>> files sync to Slave(Each bricks in Master independently sync files to
>>> slave)
>>>
>>> First approach looks more clean, but we have to analize the Rsync
>>> checksum performance on big files(Sharded in backend, accessed as one
>>> big file from rsync)
>>>
>>> Let us know your thoughts. Thanks
>>>
>>> Ref:
>>> [1]
>>>
>>> http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator
>>> --
>>> regards
>>> Aravinda
>>>
>>>
>>> ___
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>
>>
>>
>>
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Vijay Bellur

On Wednesday 02 September 2015 08:09 PM, Shyam wrote:

On 09/02/2015 03:12 AM, Aravinda wrote:

Geo-replication and Sharding Team today discussed about the approach
to make Sharding aware Geo-replication. Details are as below

Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur

- Both Master and Slave Volumes should be Sharded Volumes with same
   configurations.


If I am not mistaken, geo-rep supports replicating to a non-gluster
local FS at the slave end. Is this correct? If so, would this limitation
not make that problematic?



With recent Gluster releases (>= 3.5), remote replication to a local 
filesystem is no longer possible with geo-replication. Since 
geo-replication's syncing relies on gfids rather than paths (for better 
consistency), we lost the previous behavior.


-Vijay

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Venky Shankar
On Wed, Sep 2, 2015 at 12:42 PM, Aravinda  wrote:
> Geo-replication and Sharding Team today discussed about the approach
> to make Sharding aware Geo-replication. Details are as below
>
> Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
>
> - Both Master and Slave Volumes should be Sharded Volumes with same
>   configurations.
> - In Changelog record changes related to Sharded files also. Just like
>   any regular files.
> - Sharding should allow Geo-rep to list/read/write Sharding internal
>   Xattrs if Client PID is gsyncd(-1)
> - Sharding should allow read/write of Sharded files(that is in .shards
>   directory) if Client PID is GSYNCD
> - Sharding should return actual file instead of returning the
>   aggregated content when the Main file is requested(Client PID
>   GSYNCD)
>
> For example, a file f1 is created with GFID G1.
>
> When the file grows it gets sharded into chunks(say 5 chunks).
>
> f1   G1
> .shards/G1.1   G2
> .shards/G1.2   G3
> .shards/G1.3   G4
> .shards/G1.4   G5
>
> In Changelog, this is recorded as 5 different files as below
>
> CREATE G1 f1
> DATA G1
> META G1
> CREATE G2 PGS/G1.1
> DATA G2
> META G1
> CREATE G3 PGS/G1.2
> DATA G3
> META G1
> CREATE G4 PGS/G1.3
> DATA G4
> META G1
> CREATE G5 PGS/G1.4
> DATA G5
> META G1
>
> Where PGS is GFID of .shards directory.
>
> Geo-rep will create these files independently in Slave Volume and
> syncs Xattrs of G1. Data can be read only when all the chunks are
> synced to Slave Volume. Data can be read partially if main/first file
> and some of the chunks synced to Slave.

So, before replicating data to the slave, do all shards need to be created there?

>
> Please add if I missed anything. C & S Welcome.
>
> regards
> Aravinda
>
> On 08/11/2015 04:36 PM, Aravinda wrote:
>
> Hi,
>
> We are thinking different approaches to add support in Geo-replication for
> Sharded Gluster Volumes[1]
>
> Approach 1: Geo-rep: Sync Full file
>- In Changelog only record main file details in the same brick where it
> is created
>- Record as DATA in Changelog whenever any addition/changes to the
> sharded file
>- Geo-rep rsync will do checksum as a full file from mount and syncs as
> new file
>- Slave side sharding is managed by Slave Volume
>
> Approach 2: Geo-rep: Sync sharded file separately
>- Geo-rep rsync will do checksum for sharded files only
>- Geo-rep syncs each sharded files independently as new files
>- [UNKNOWN] Sync internal xattrs(file size and block count) in the main
> sharded file to Slave Volume to maintain the same state as in Master.
>- Sharding translator to allow file creation under .shards dir for
> gsyncd. that is Parent GFID is .shards directory
>- If sharded files are modified during Geo-rep run may end up stale data
> in Slave.
>- Files on Slave Volume may not be readable unless all sharded files sync
> to Slave(Each bricks in Master independently sync files to slave)
>
> First approach looks cleaner, but we have to analyze the Rsync checksum
> performance on big files (sharded in backend, accessed as one big file from
> rsync)
>
> Let us know your thoughts. Thanks
>
> Ref:
> [1]
> http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator
>
> --
> regards
> Aravinda
>
>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Krutika Dhananjay
- Original Message -

> From: "Shyam" 
> To: "Aravinda" , "Gluster Devel"
> 
> Sent: Wednesday, September 2, 2015 8:09:55 PM
> Subject: Re: [Gluster-devel] Gluster Sharding and Geo-replication

> On 09/02/2015 03:12 AM, Aravinda wrote:
> > Geo-replication and Sharding Team today discussed about the approach
> > to make Sharding aware Geo-replication. Details are as below
> >
> > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
> >
> > - Both Master and Slave Volumes should be Sharded Volumes with same
> > configurations.

> If I am not mistaken, geo-rep supports replicating to a non-gluster
> local FS at the slave end. Is this correct? If so, would this limitation
> not make that problematic?

> When you state *same configuration*, I assume you mean the sharding
> configuration, not the volume graph, right?
That is correct. The only requirement is for the slave to have the shard translator 
(since something needs to present the aggregated view of the file to readers on 
the slave). 
Also, the shard-block-size needs to be kept the same between master and slave. The rest 
of the configuration (like the number of subvols of DHT/AFR) can vary across 
master and slave. 

-Krutika 

> > - In Changelog record changes related to Sharded files also. Just like
> > any regular files.
> > - Sharding should allow Geo-rep to list/read/write Sharding internal
> > Xattrs if Client PID is gsyncd(-1)
> > - Sharding should allow read/write of Sharded files(that is in .shards
> > directory) if Client PID is GSYNCD
> > - Sharding should return actual file instead of returning the
> > aggregated content when the Main file is requested(Client PID
> > GSYNCD)
> >
> > For example, a file f1 is created with GFID G1.
> >
> > When the file grows it gets sharded into chunks(say 5 chunks).
> >
> > f1 G1
> > .shards/G1.1 G2
> > .shards/G1.2 G3
> > .shards/G1.3 G4
> > .shards/G1.4 G5
> >
> > In Changelog, this is recorded as 5 different files as below
> >
> > CREATE G1 f1
> > DATA G1
> > META G1
> > CREATE G2 PGS/G1.1
> > DATA G2
> > META G1
> > CREATE G3 PGS/G1.2
> > DATA G3
> > META G1
> > CREATE G4 PGS/G1.3
> > DATA G4
> > META G1
> > CREATE G5 PGS/G1.4
> > DATA G5
> > META G1
> >
> > Where PGS is GFID of .shards directory.
> >
> > Geo-rep will create these files independently in Slave Volume and
> > syncs Xattrs of G1. Data can be read only when all the chunks are
> > synced to Slave Volume. Data can be read partially if main/first file
> > and some of the chunks synced to Slave.
> >
> > Please add if I missed anything. C & S Welcome.
> >
> > regards
> > Aravinda
> >
> > On 08/11/2015 04:36 PM, Aravinda wrote:
> >> Hi,
> >>
> >> We are thinking different approaches to add support in Geo-replication
> >> for Sharded Gluster Volumes[1]
> >>
> >> *Approach 1: Geo-rep: Sync Full file*
> >> - In Changelog only record main file details in the same brick
> >> where it is created
> >> - Record as DATA in Changelog whenever any addition/changes to the
> >> sharded file
> >> - Geo-rep rsync will do checksum as a full file from mount and
> >> syncs as new file
> >> - Slave side sharding is managed by Slave Volume
> >> *Approach 2: Geo-rep: Sync sharded file separately*
> >> - Geo-rep rsync will do checksum for sharded files only
> >> - Geo-rep syncs each sharded files independently as new files
> >> - [UNKNOWN] Sync internal xattrs(file size and block count) in the
> >> main sharded file to Slave Volume to maintain the same state as in Master.
> >> - Sharding translator to allow file creation under .shards dir for
> >> gsyncd. that is Parent GFID is .shards directory
> >> - If sharded files are modified during Geo-rep run may end up stale
> >> data in Slave.
> >> - Files on Slave Volume may not be readable unless all sharded
> >> files sync to Slave(Each bricks in Master independently sync files to
> >> slave)
> >>
> >> First approach looks cleaner, but we have to analyze the Rsync
> >> checksum performance on big files (sharded in backend, accessed as one
> >> big file from rsync)
> >>
> >> Let us know your thoughts. Thanks
> >>
> >> Ref:
> >> [1]
> >> http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator
> >> --
> >> regards
> >> Aravinda
> >>
> >>
> >> ___
> >> Gluster-devel mailing list
> >> Gluster-devel@gluster.org
> >> http://www.gluster.org/mailman/listinfo/gluster-devel
> >
> >
> >
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
> >
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Shyam

On 09/02/2015 03:12 AM, Aravinda wrote:

Geo-replication and Sharding Team today discussed about the approach
to make Sharding aware Geo-replication. Details are as below

Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur

- Both Master and Slave Volumes should be Sharded Volumes with same
   configurations.


If I am not mistaken, geo-rep supports replicating to a non-gluster 
local FS at the slave end. Is this correct? If so, would this limitation 
not make that problematic?


When you state *same configuration*, I assume you mean the sharding 
configuration, not the volume graph, right?



- In Changelog record changes related to Sharded files also. Just like
   any regular files.
- Sharding should allow Geo-rep to list/read/write Sharding internal
   Xattrs if Client PID is gsyncd(-1)
- Sharding should allow read/write of Sharded files(that is in .shards
   directory) if Client PID is GSYNCD
- Sharding should return actual file instead of returning the
   aggregated content when the Main file is requested(Client PID
   GSYNCD)

For example, a file f1 is created with GFID G1.

When the file grows it gets sharded into chunks(say 5 chunks).

 f1   G1
 .shards/G1.1   G2
 .shards/G1.2   G3
 .shards/G1.3   G4
 .shards/G1.4   G5

In Changelog, this is recorded as 5 different files as below

 CREATE G1 f1
 DATA G1
 META G1
 CREATE G2 PGS/G1.1
 DATA G2
 META G1
 CREATE G3 PGS/G1.2
 DATA G3
 META G1
 CREATE G4 PGS/G1.3
 DATA G4
 META G1
 CREATE G5 PGS/G1.4
 DATA G5
 META G1

Where PGS is GFID of .shards directory.

Geo-rep will create these files independently in Slave Volume and
syncs Xattrs of G1. Data can be read only when all the chunks are
synced to Slave Volume. Data can be read partially if main/first file
and some of the chunks synced to Slave.

Please add if I missed anything. C & S Welcome.

regards
Aravinda

On 08/11/2015 04:36 PM, Aravinda wrote:

Hi,

We are thinking different approaches to add support in Geo-replication
for Sharded Gluster Volumes[1]

*Approach 1: Geo-rep: Sync Full file*
   - In Changelog only record main file details in the same brick
where it is created
   - Record as DATA in Changelog whenever any addition/changes to the
sharded file
   - Geo-rep rsync will do checksum as a full file from mount and
syncs as new file
   - Slave side sharding is managed by Slave Volume
*Approach 2: Geo-rep: Sync sharded file separately*
   - Geo-rep rsync will do checksum for sharded files only
   - Geo-rep syncs each sharded files independently as new files
   - [UNKNOWN] Sync internal xattrs(file size and block count) in the
main sharded file to Slave Volume to maintain the same state as in Master.
   - Sharding translator to allow file creation under .shards dir for
gsyncd. that is Parent GFID is .shards directory
   - If sharded files are modified during Geo-rep run may end up stale
data in Slave.
   - Files on Slave Volume may not be readable unless all sharded
files sync to Slave(Each bricks in Master independently sync files to
slave)

First approach looks cleaner, but we have to analyze the Rsync
checksum performance on big files (sharded in backend, accessed as one
big file from rsync)

Let us know your thoughts. Thanks

Ref:
[1]
http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator
--
regards
Aravinda


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Emmanuel Dreyfus
Raghavendra Gowdappa  wrote:

> It would be helpful if you can give some pointers on what parameters (like
> latency, throughput etc.) you want us to consider for QoS.

Full-blown QoS would be nice, but a first line of defense against
resource hogs seems badly needed.

A bare minimum could be to process clients' FOPs in a round-robin
fashion. That way, even if one client sends a lot of FOPs, there is
always some window for others to slip in.
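
To make that concrete, here is a minimal Python sketch of per-client round-robin
dispatch (the class and names are invented for illustration; this is not existing
gluster code):

    from collections import deque, OrderedDict

    class RoundRobinFopScheduler:
        """Toy model: one queue per client, serve one FOP per client per turn."""

        def __init__(self):
            self.queues = OrderedDict()      # client_id -> deque of pending FOPs

        def enqueue(self, client_id, fop):
            self.queues.setdefault(client_id, deque()).append(fop)

        def next_fop(self):
            # Walk clients in order, serve the first non-empty queue,
            # then rotate that client to the back so others get a turn.
            for client_id in list(self.queues):
                queue = self.queues[client_id]
                self.queues.move_to_end(client_id)
                if queue:
                    return client_id, queue.popleft()
            return None

    sched = RoundRobinFopScheduler()
    for i in range(1000):
        sched.enqueue("noisy-client", ("RENAME", i))
    sched.enqueue("quiet-client", ("STAT", 0))
    print(sched.next_fop())   # ('noisy-client', ('RENAME', 0))
    print(sched.next_fop())   # ('quiet-client', ('STAT', 0)) -- not starved

Something equivalent in the brick-side request queues would cap how far a single
chatty client can push everyone else back.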

Any opinion?

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Wanted - 3.7.5 release manager

2015-09-02 Thread Vijay Bellur

On Wednesday 02 September 2015 06:38 PM, Atin Mukherjee wrote:

IIRC, Pranith already volunteered for it in one of the last community
meetings?



Thanks Atin. I do recollect it now.

Pranith - can you confirm being the release manager for 3.7.5?

-Vijay


-Atin
Sent from one plus one

On Sep 2, 2015 6:00 PM, "Vijay Bellur" <vbel...@redhat.com> wrote:

Hi All,

We have been rotating release managers for minor releases in the
3.7.x train. We just released 3.7.4 and are looking for volunteers
to be release managers for 3.7.5 (scheduled for 30th September). If
anybody is interested in volunteering, please drop a note here.

Thanks,
Vijay
___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Why are all the jenkins jobs running in fr_FR locale lately?

2015-09-02 Thread Michael Scherer
On Wednesday, 2 September 2015 at 15:59 +0200, Michael Scherer wrote:
> On Monday, 31 August 2015 at 10:28 -0400, Kaleb S. KEITHLEY wrote:
> > While it doesn't bother me, some of our devs may have some trouble
> > reading the output.
> > 
> > Could we go back to the C locale?
> 
> I suspect it is a locale leak from my ssh session to jenkins. Not sure
> how to fix it, except by restarting jenkins after cleaning my env.

Yep, it seems that when I ssh, my $LANG is kept, and the same when I sudo. And
since the initscript does not clean anything, it stays in the jenkins
process, which then runs everything in French.

I tend to use "service jenkins restart", which takes care of that (IIRC),
but this time I didn't, so it got leaked.

-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS



signature.asc
Description: This is a digitally signed message part
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Wanted - 3.7.5 release manager

2015-09-02 Thread Pranith Kumar Karampuri



On 09/02/2015 07:33 PM, Vijay Bellur wrote:

On Wednesday 02 September 2015 06:38 PM, Atin Mukherjee wrote:

IIRC, Pranith already volunteered for it in one of the last community
meetings?



Thanks Atin. I do recollect it now.

Pranith - can you confirm being the release manager for 3.7.5?

Yes, I can do this.

Pranith


-Vijay


-Atin
Sent from one plus one

On Sep 2, 2015 6:00 PM, "Vijay Bellur" <vbel...@redhat.com> wrote:

Hi All,

We have been rotating release managers for minor releases in the
3.7.x train. We just released 3.7.4 and are looking for volunteers
to be release managers for 3.7.5 (scheduled for 30th September). If
anybody is interested in volunteering, please drop a note here.

Thanks,
Vijay
___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-devel





___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Why are all the jenkins jobs running in fr_FR locale lately?

2015-09-02 Thread Michael Scherer
On Monday, 31 August 2015 at 10:28 -0400, Kaleb S. KEITHLEY wrote:
> While it doesn't bother me, some of our devs may have some trouble
> reading the output.
> 
> Could we go back to the C locale?

I suspect it is a locale leak from my ssh session to jenkins. Not sure
how to fix it, except by restarting jenkins after cleaning my env.

-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS



signature.asc
Description: This is a digitally signed message part
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Minutes of the Weekly Gluster Community Meeting held on 2nd September, 2015

2015-09-02 Thread Krutika Dhananjay
Minutes: 
http://meetbot.fedoraproject.org/gluster-meeting/2015-09-02/gluster-meeting.2015-09-02-12.00.html
 
Minutes (text): 
http://meetbot.fedoraproject.org/gluster-meeting/2015-09-02/gluster-meeting.2015-09-02-12.00.txt
 
Log: 
http://meetbot.fedoraproject.org/gluster-meeting/2015-09-02/gluster-meeting.2015-09-02-12.00.log.html
 

Meeting summary
--
* agenda is available @
  https://public.pad.fsfe.org/p/gluster-community-meetings  (kdhananjay,
  12:00:37)
* roll call  (kdhananjay, 12:00:46)

* Action items from last week  (kdhananjay, 12:02:30)

* raghu to fill in the contents for release schedule and ask tigert to
  push the page in gluster.org  (kdhananjay, 12:02:51)
  * LINK: - http://www.gluster.org/community/release-schedule/
(hagarth, 12:03:54)

* msvbhat/rastar to update the mailing list with a DiSTAF how-to and
  start a discussion on enhancements to DiSTAF.  (kdhananjay, 12:04:39)
  * LINK: DiSTAF improvements for review http://review.gluster.org/12048
(ndevos, 12:05:48)
  * ACTION: msvbhat to make announcement on DiSTAF enhancements and
review request on the mailing list  (kdhananjay, 12:07:35)

* kshlm to check back with misc on the new jenkins slaves.  (kdhananjay,
  12:08:02)

* kshlm to send out an announcement regarding the approaching deadline
  for Gluster 3.7 next week.  (kdhananjay, 12:09:27)

* hchiramm to improve release documentation  (kdhananjay, 12:10:53)

* raghu to announce 3.6.5 on mailing lists  (kdhananjay, 12:13:36)
  * LINK:
http://www.gluster.org/pipermail/gluster-devel/2015-August/046570.html
(anoopcs, 12:15:17)

* krishnan_p to update Gluster News about Gluster.next progress
  (kdhananjay, 12:15:51)
  * ACTION: krishnan_p to update Gluster News about Gluster.next
progress  (kdhananjay, 12:17:24)

* krishnan_p to send an email about nanomsg.org to gluster-dev
  (kdhananjay, 12:17:37)

* GlusterFS 3.7  (kdhananjay, 12:18:19)
  * ACTION: hagarth to post a note on gluster-devel asking for
volunteers for the role of release maintainer for 3.7.5
(kdhananjay, 12:23:54)

* GlusterFS 3.6  (kdhananjay, 12:24:13)
  * Backports to release-3.6 towards the 3.6.6 release are welcome.
(kdhananjay, 12:25:21)
  * raghu to make 3.6.6 on 20th of September  (kdhananjay, 12:25:40)

* GlusterFS 3.5  (kdhananjay, 12:25:59)
  * poornimag needs help in porting glfs_fini patches to 3.5
(kdhananjay, 12:27:06)
  * ACTION: poornimag to send a mail on gluster-devel asking for
volunteers to backport glfs_fini patches to release-3.5
(kdhananjay, 12:30:07)

* GlusterFS 3.8  (kdhananjay, 12:31:23)

* GlusterFS 4.0  (kdhananjay, 12:33:48)
  * LINK:
http://www.meetup.com/GlusterFS-Silicon-Valley/events/224932563/
(jdarcy, 12:35:49)
  * LINK: http://www.meetup.com/glusterfs-India/events/01221/
(hagarth, 12:38:45)
  * GlusterFS meetup in Bengaluru on September 12th -
http://www.meetup.com/glusterfs-India/events/01221/
(kdhananjay, 12:39:00)

* Open Floor  (kdhananjay, 12:42:15)
  * ACTION: rastar to initiate discussion on exploring the different
approaches towards doing GlusterFS release management and
announcements  (kdhananjay, 12:57:27)
  * REMINDER to put (even minor) interesting topics on
https://public.pad.fsfe.org/p/gluster-weekly-news  (kdhananjay,
12:59:49)
  * REMINDER to announce Gluster attendance of events:
https://public.pad.fsfe.org/p/gluster-events  (kdhananjay, 12:59:56)

Meeting ended at 13:01:08 UTC.




Action Items

* msvbhat to make announcement on DiSTAF enhancements and review request
  on the mailing list
* krishnan_p to update Gluster News about Gluster.next progress
* hagarth to post a note on gluster-devel asking for volunteers for the
  role of release maintainer for 3.7.5
* poornimag to send a mail on gluster-devel asking for volunteers to
  backport glfs_fini patches to release-3.5
* rastar to initiate discussion on exploring the different approaches
  towards doing GlusterFS release management and announcements




Action Items, by person
---
* hagarth
  * hagarth to post a note on gluster-devel asking for volunteers for
the role of release maintainer for 3.7.5
* msvbhat
  * msvbhat to make announcement on DiSTAF enhancements and review
request on the mailing list
* poornimag
  * poornimag to send a mail on gluster-devel asking for volunteers to
backport glfs_fini patches to release-3.5
* rastar
  * rastar to initiate discussion on exploring the different approaches
towards doing GlusterFS release management and announcements
* **UNASSIGNED**
  * krishnan_p to update Gluster News about Gluster.next progress




People Present (lines said)
---
* kdhananjay (80)
* hagarth (34)
* ndevos (33)
* jdarcy (17)
* justinclift (17)
* rastar (16)
* poornimag (10)
* lpabon (6)
* raghu (4)
* anoopcs (3)
* msvbhat (2)
* zodbot (2)
* skoduri (1)
* rjoseph (1)
* rafi (1)
* tigert (1)
*

Re: [Gluster-devel] Wanted - 3.7.5 release manager

2015-09-02 Thread Atin Mukherjee
IIRC, Pranith already volunteered for it in one of the last community
meetings?

-Atin
Sent from one plus one
On Sep 2, 2015 6:00 PM, "Vijay Bellur"  wrote:

> Hi All,
>
> We have been rotating release managers for minor releases in the 3.7.x
> train. We just released 3.7.4 and are looking for volunteers to be release
> managers for 3.7.5 (scheduled for 30th September). If anybody is interested
> in volunteering, please drop a note here.
>
> Thanks,
> Vijay
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Jeff Darcy
> Do you have any ideas here on QoS? Can it be provided as a use-case for
> multi-tenancy you were working on earlier?

My interpretation of QoS would include rate limiting, but more per
*activity* (e.g. self-heal, rebalance, user I/O) or per *tenant* rather
than per *client*.  Also, it's easier to implement at the message level
(which can be done on the servers) rather than the fop level (which has
to be on clients).  How well does that apply to what we've been
discussing in this thread?
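
For what it's worth, a rough Python sketch of message-level limiting keyed per
tenant and activity (all names, keys and rates below are invented for
illustration; none of this corresponds to an existing gluster option or API):

    import time

    class Bucket:
        """Tiny token bucket: refill at `rate` tokens/sec, hold at most `burst`."""
        def __init__(self, rate, burst):
            self.rate, self.capacity = float(rate), float(burst)
            self.tokens, self.last = float(burst), time.monotonic()

        def admit(self, cost=1.0):
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= cost:
                self.tokens -= cost
                return True
            return False              # caller defers or queues the message

    # Limits keyed by (tenant, activity), enforced where messages arrive on the server.
    limits = {
        ("tenant-a", "user-io"):   Bucket(rate=5000, burst=500),
        ("tenant-a", "self-heal"): Bucket(rate=500,  burst=50),
        ("tenant-a", "rebalance"): Bucket(rate=200,  burst=20),
    }

    def on_message(tenant, activity):
        bucket = limits.get((tenant, activity))
        return bucket.admit() if bucket else True   # unclassified traffic passes

    print(on_message("tenant-a", "self-heal"))      # True until the bucket drains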
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Wanted - 3.7.5 release manager

2015-09-02 Thread Vijay Bellur

Hi All,

We have been rotating release managers for minor releases in the 3.7.x 
train. We just released 3.7.4 and are looking for volunteers to be 
release managers for 3.7.5 (scheduled for 30th September). If anybody is 
interested in volunteering, please drop a note here.


Thanks,
Vijay
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] REMINDER: Weekly gluster community meeting to start in ~2 hours

2015-09-02 Thread Krutika Dhananjay
Hi All, 

In about 2 hours from now we will have the regular weekly Gluster 
Community meeting. 

Meeting details: 
- location: #gluster-meeting on Freenode IRC 
- date: every Wednesday 
- time: 12:00 UTC, 14:00 CEST, 17:30 IST 
(in your terminal, run: date -d "12:00 UTC") 
- agenda: https://public.pad.fsfe.org/p/gluster-community-meetings 

Currently the following items are listed: 
* Roll Call 
* Status of last week's action items 
* Gluster 3.7 
* Gluster 3.8 
* Gluster 3.6 
* Gluster 3.5 
* Gluster 4.0 
* Open Floor 
- bring your own topic! 

The last topic has space for additions. If you have a suitable topic to 
discuss, please add it to the agenda. 


-Krutika 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Emmanuel Dreyfus
On Wed, Sep 02, 2015 at 02:04:32PM +0530, Pranith Kumar Karampuri wrote:
> >And more generally, do we have a way to ratelimit FOPs per client, so
> >that one client cannot make the cluster unusable for the others?
> Do you have profile data?

No, it was on a production setup and I was too focused on restoring
functionality to have thought about it.

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Emmanuel Dreyfus
On Wed, Sep 02, 2015 at 02:05:03PM +0530, Venky Shankar wrote:
> > I understand rename on DHT can be very costly because data really have
> > to be moved from a brick to another one just for a file name change.
> > Is there a workaround for this behavior?
> 
> Not really. DHT uses pointer files (so called link-to) to work around
> moving file contents on rename().

Then I must have been misled by the huge amount of DHT rename operations in 
the logs; the user killed performance in some other way. Too bad I did 
not collect profile data at that time.

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Raghavendra Gowdappa
+Jeff.

Jeff,

Do you have any ideas here on QoS? Can it be provided as a use-case for the 
multi-tenancy work you were doing earlier?

regards,
Raghavendra.

- Original Message -
> From: "Raghavendra Gowdappa" 
> To: "Pranith Kumar Karampuri" 
> Cc: gluster-devel@gluster.org
> Sent: Wednesday, September 2, 2015 2:11:35 PM
> Subject: Re: [Gluster-devel] FOP ratelimit?
> 
> 
> 
> - Original Message -
> > From: "Pranith Kumar Karampuri" 
> > To: "Emmanuel Dreyfus" , gluster-devel@gluster.org
> > Sent: Wednesday, September 2, 2015 2:04:32 PM
> > Subject: Re: [Gluster-devel] FOP ratelimit?
> > 
> > 
> > 
> > On 09/02/2015 01:59 PM, Emmanuel Dreyfus wrote:
> > > Hi
> > >
> > > Yesterday I experienced the problem of a single user bringing down
> > > a glusterfs cluster to its knees because of a high amount of rename
> > > operations.
> > >
> > > I understand rename on DHT can be very costly because data really have
> > > to be moved from a brick to another one just for a file name change.
> > > Is there a workaround for this behavior?
> > This is not true.
> 
> Data is not moved across bricks during rename. So maybe something else is
> causing the issue. Were you running rebalance while these renames were being
> done?
> 
> > >
> > > And more generally, do we have a way to ratelimit FOPs per client, so
> > > that one client cannot make the cluster unusable for the others?
> > Do you have profile data?
> > 
> > Raghavendra G is working on some QoS-related enhancements in gluster.
> > Please let us know if you have any inputs here.
> 
> Thanks Pranith.
> 
> @Manu and others,
> 
> It would be helpful if you can give some pointers on what parameters (like latency,
> throughput etc.) you want us to consider for QoS. Also, any ideas (like an
> interface for QoS) in this area are welcome. From my very basic search, it seems
> there are not many filesystems with QoS functionality.
> 
> regards,
> Raghavendra.
> > 
> > Pranith
> > >
> > 
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
> > 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Raghavendra Gowdappa


- Original Message -
> From: "Pranith Kumar Karampuri" 
> To: "Emmanuel Dreyfus" , gluster-devel@gluster.org
> Sent: Wednesday, September 2, 2015 2:04:32 PM
> Subject: Re: [Gluster-devel] FOP ratelimit?
> 
> 
> 
> On 09/02/2015 01:59 PM, Emmanuel Dreyfus wrote:
> > Hi
> >
> > Yesterday I experienced the problem of a single user bringing down
> > a glusterfs cluster to its knees because of a high amount of rename
> > operations.
> >
> > I understand rename on DHT can be very costly because data really have
> > to be moved from a brick to another one just for a file name change.
> > Is there a workaround for this behavior?
> This is not true.

Data is not moved across bricks during rename. So maybe something else is 
causing the issue. Were you running rebalance while these renames were being 
done?

> >
> > And more generally, do we have a way to ratelimit FOPs per client, so
> > that one client cannot make the cluster unusable for the others?
> Do you have profile data?
> 
> Raghavendra G is working on some QoS-related enhancements in gluster.
> Please let us know if you have any inputs here.

Thanks Pranith. 

@Manu and others,

It would be helpful if you can give some pointers on what parameters (like latency, 
throughput etc.) you want us to consider for QoS. Also, any ideas (like an 
interface for QoS) in this area are welcome. From my very basic search, it seems 
there are not many filesystems with QoS functionality.

regards,
Raghavendra.
> 
> Pranith
> >
> 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Venky Shankar
On Wed, Sep 2, 2015 at 2:05 PM, Venky Shankar  wrote:
> On Wed, Sep 2, 2015 at 1:59 PM, Emmanuel Dreyfus  wrote:
>> Hi
>>
>> Yesterday I experienced the problem of a single user bringing down
>> a glusterfs cluster to its knees because of a high amount of rename
>> operations.
>>
>> I understand rename on DHT can be very costly because data really have
>> to be moved from a brick to another one just for a file name change.
>> Is there a workaround for this behavior?
>
> Not really. DHT uses pointer files (so called link-to) to work around
> moving file contents on rename().
>
>>
>> And more generally, do we have a way to ratelimit FOPs per client, so
>> that one client cannot make the cluster unusable for the others?
>
> There is some form of limiting based on priority (w/ client-pids) in
> io-threads. For bit-rot, I had used token bucket
> based throttling[1] during hash calculation. But that resides on the
> client side for bitrot xlator. It may be beneficial
> to have that on the server side.

[1]: 
https://github.com/gluster/glusterfs/blob/master/xlators/features/bit-rot/src/bitd/bit-rot-tbf.c
>
>>
>> --
>> Emmanuel Dreyfus
>> m...@netbsd.org
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Venky Shankar
On Wed, Sep 2, 2015 at 1:59 PM, Emmanuel Dreyfus  wrote:
> Hi
>
> Yesterday I experienced the problem of a single user bringing down
> a glusterfs cluster to its knees because of a high amount of rename
> operations.
>
> I understand rename on DHT can be very costly because data really have
> to be moved from a brick to another one just for a file name change.
> Is there a workaround for this behavior?

Not really. DHT uses pointer files (so-called link-to files) to avoid
moving file contents on rename().
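
As an aside, the link-to marker is easy to observe on a brick backend: it is an
empty file with the sticky bit set whose trusted.glusterfs.dht.linkto xattr names
the subvolume that holds the data. A small sketch (the brick path is a placeholder,
and this assumes direct root access to the brick's backend filesystem):

    import os
    import stat

    BRICK_FILE = "/bricks/brick1/data/renamed-file"   # hypothetical backend path

    def is_dht_linkto(path):
        """Return True if `path` looks like a DHT link-to (pointer) file."""
        st = os.lstat(path)
        if st.st_size != 0 or not (st.st_mode & stat.S_ISVTX):
            return False
        try:
            target = os.getxattr(path, "trusted.glusterfs.dht.linkto")
        except OSError:
            return False
        print("link-to file; data lives on subvolume:",
              target.rstrip(b"\0").decode())
        return True

    if __name__ == "__main__":
        is_dht_linkto(BRICK_FILE)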

>
> And more generally, do we have a way to ratelimit FOPs per client, so
> that one client cannot make the cluster unusable for the others?

There is some form of limiting based on priority (w/ client-pids) in
io-threads. For bit-rot, I had used token-bucket based throttling[1]
during hash calculation, but that resides on the client side in the
bitrot xlator. It may be beneficial to have that on the server side.
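
The token-bucket part translates to roughly the following Python sketch (the real
thing is C inside the bitrot xlator's tbf; the class name and numbers here are
made up):

    import time
    import threading

    class TokenBucket:
        """Blocking token-bucket throttle, in the spirit of bit-rot's tbf."""

        def __init__(self, rate, burst):
            self.rate = float(rate)        # tokens refilled per second
            self.capacity = float(burst)   # maximum tokens held
            self.tokens = float(burst)
            self.last = time.monotonic()
            self.lock = threading.Lock()

        def throttle(self, cost=1.0):
            """Block until `cost` tokens are available, then consume them."""
            while True:
                with self.lock:
                    now = time.monotonic()
                    self.tokens = min(self.capacity,
                                      self.tokens + (now - self.last) * self.rate)
                    self.last = now
                    if self.tokens >= cost:
                        self.tokens -= cost
                        return
                    wait = (cost - self.tokens) / self.rate
                time.sleep(wait)

    # e.g. cap hashing (or, server side, FOP servicing) at ~10 MB/s with 1 MB bursts
    bucket = TokenBucket(rate=10 * 1024 * 1024, burst=1024 * 1024)
    for _ in range(8):
        bucket.throttle(cost=128 * 1024)   # pay for each 128 KiB chunk up front
        # ... read/hash the chunk here ...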

>
> --
> Emmanuel Dreyfus
> m...@netbsd.org
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Pranith Kumar Karampuri



On 09/02/2015 01:59 PM, Emmanuel Dreyfus wrote:

Hi

Yesterday I experienced the problem of a single user bringing down
a glusterfs cluster to its knees because of a high amount of rename
operations.

I understand rename on DHT can be very costly because data really have
to be moved from a brick to another one just for a file name change.
Is there a workaround for this behavior?

This is not true.


And more generally, do we have a way to ratelimit FOPs per client, so
that one client cannot make the cluster unusable for the others?

Do you have profile data?

Raghavendra G is working on some QoS-related enhancements in gluster. 
Please let us know if you have any inputs here.


Pranith




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] FOP ratelimit?

2015-09-02 Thread Emmanuel Dreyfus
Hi

Yesterday I ran into a problem where a single user brought a glusterfs
cluster to its knees with a high volume of rename operations.

I understand rename on DHT can be very costly because data really has
to be moved from one brick to another just for a file name change.
Is there a workaround for this behavior?

And more generally, do we have a way to ratelimit FOPs per client, so
that one client cannot make the cluster unusable for the others?

-- 
Emmanuel Dreyfus
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster Sharding and Geo-replication

2015-09-02 Thread Aravinda

The Geo-replication and Sharding teams today discussed the approach
to make Geo-replication Sharding-aware. Details are below.

Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur

- Both Master and Slave Volumes should be Sharded Volumes with the same
  configuration.
- In the Changelog, record changes related to Sharded files as well, just
  like any regular file.
- Sharding should allow Geo-rep to list/read/write Sharding internal
  Xattrs if the Client PID is gsyncd's (-1)
- Sharding should allow read/write of Sharded files (that is, files in the
  .shards directory) if the Client PID is GSYNCD's
- Sharding should return the actual file instead of returning the
  aggregated content when the Main file is requested (Client PID
  GSYNCD)

For example, a file f1 is created with GFID G1.

When the file grows it gets sharded into chunks(say 5 chunks).

f1   G1
.shards/G1.1   G2
.shards/G1.2   G3
.shards/G1.3   G4
.shards/G1.4   G5

In Changelog, this is recorded as 5 different files as below

CREATE G1 f1
DATA G1
META G1
CREATE G2 PGS/G1.1
DATA G2
META G1
CREATE G3 PGS/G1.2
DATA G3
META G1
CREATE G4 PGS/G1.3
DATA G4
META G1
CREATE G5 PGS/G1.4
DATA G5
META G1

Where PGS is GFID of .shards directory.

Geo-rep will create these files independently in the Slave Volume and
sync the Xattrs of G1. Data can be read fully only when all the chunks
are synced to the Slave Volume; it can be read partially if the main/first
file and some of the chunks have synced to the Slave.
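
To make the readability rule concrete, a small Python sketch of the check implied
above (the record tuples simply mirror the example in this mail; they are not the
on-disk changelog encoding):

    records = [
        ("CREATE", "G1", "f1"),
        ("CREATE", "G2", "PGS/G1.1"),
        ("CREATE", "G3", "PGS/G1.2"),
        ("CREATE", "G4", "PGS/G1.3"),
        ("CREATE", "G5", "PGS/G1.4"),
    ]

    def needed_gfids(records, main_gfid):
        """Main file GFID plus every shard GFID created under .shards for it."""
        needed = {main_gfid}
        for op, gfid, name in records:
            if op == "CREATE" and name.startswith("PGS/" + main_gfid + "."):
                needed.add(gfid)
        return needed

    synced_on_slave = {"G1", "G2", "G3", "G5"}            # G4 still in flight
    needed = needed_gfids(records, "G1")
    print("fully readable:", needed <= synced_on_slave)   # False
    print("missing chunks:", sorted(needed - synced_on_slave))  # ['G4']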

Please add if I missed anything. C & S Welcome.

regards
Aravinda

On 08/11/2015 04:36 PM, Aravinda wrote:

Hi,

We are considering different approaches to add support in Geo-replication 
for Sharded Gluster Volumes [1]


*Approach 1: Geo-rep: Sync Full file*
   - In Changelog only record main file details on the same brick 
where it is created
   - Record as DATA in Changelog whenever there is any addition/change to the 
sharded file
   - Geo-rep rsync will do the checksum as a full file from the mount and 
sync it as a new file
   - Slave side sharding is managed by the Slave Volume

*Approach 2: Geo-rep: Sync sharded files separately*
   - Geo-rep rsync will do the checksum for sharded files only
   - Geo-rep syncs each sharded file independently as a new file
   - [UNKNOWN] Sync internal xattrs (file size and block count) of the 
main sharded file to the Slave Volume to maintain the same state as on the 
Master (see the sketch below)
   - Sharding translator to allow file creation under the .shards dir for 
gsyncd, that is, when the Parent GFID is the .shards directory
   - If sharded files are modified during a Geo-rep run, the Slave may end 
up with stale data
   - Files on the Slave Volume may not be readable unless all sharded 
files sync to the Slave (each brick on the Master independently syncs files 
to the slave)


The first approach looks cleaner, but we have to analyze the Rsync 
checksum performance on big files (sharded in the backend, accessed as one 
big file from rsync).
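
For the [UNKNOWN] item in Approach 2, the step could look roughly like the sketch
below: copy the shard translator's internal size/block-count xattr on the main file
from a Master mount to a Slave mount. The xattr name and the mount paths are my
assumptions, and it presumes sharding lets a gsyncd client read/write that xattr:

    import os

    SHARD_SIZE_XATTR = "trusted.glusterfs.shard.file-size"   # assumed xattr name

    def sync_shard_size_xattr(master_path, slave_path):
        """Copy the main file's internal size/block-count xattr to the slave."""
        value = os.getxattr(master_path, SHARD_SIZE_XATTR)
        os.setxattr(slave_path, SHARD_SIZE_XATTR, value)

    # e.g. for the main file f1 (GFID G1) on the aux mounts geo-rep uses:
    sync_shard_size_xattr("/mnt/master/f1", "/mnt/slave/f1")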


Let us know your thoughts. Thanks

Ref:
[1] 
http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator

--
regards
Aravinda


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel