Re: [Gluster-users] To GlusterFS or not...

2014-09-23 Thread Roman
Hi,

just a question ...

Would SAS disks be better in a situation with lots of seek time when using
GlusterFS?

2014-09-22 23:03 GMT+03:00 Jeff Darcy jda...@redhat.com:


  The biggest issue that we are having is that we are talking about
  -billions- of small (max 5MB) files. Seek times are killing us
  completely from what we can make out. (OS and HW/RAID have been tweaked to
  kingdom come and back).

 This is probably the key point.  It's unlikely that seek times are going
 to get better with GlusterFS, unless it's because the new servers have
 more memory and disks, but if that's the case then you might as well
 just deploy more memory and disks in your existing scheme.  On top of
 that, using any distributed file system is likely to mean more network
 round trips, to maintain consistency.  There would be a benefit from
 letting GlusterFS handle the distribution (and redistribution) of files
 automatically instead of having to do your own sharding, but that's not
 the same as a performance benefit.

  I’m not yet too clued up on all the GlusterFS naming, but essentially
  if we do go the GlusterFS route, we would like to use non-replicated
  storage bricks on all the front-end as well as back-end servers in
  order to maximize storage.

 That's fine, so long as you recognize that recovering from a failed
 server becomes more of a manual process, but it's probably a moot point
 in light of the seek-time issue mentioned above.  As much as I hate to
 discourage people from using GlusterFS, it's even worse to have them be
 disappointed, or for other users with other needs to be so as we spend
 time trying to fix the unfixable.
 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://supercolony.gluster.org/mailman/listinfo/gluster-users




-- 
Best regards,
Roman.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] To GlusterFS or not...

2014-09-23 Thread Chris Knipe
Hi,

SSD has been considered but is not an option due to cost.  SAS has
been considered but is not an option due to the relatively small sizes
of the drives.  We are *rapidly* growing towards a PB of actual online
storage.

We are exploring RAID controllers with onboard SSD cache, which may help.

On Tue, Sep 23, 2014 at 7:59 AM, Roman rome...@gmail.com wrote:
 Hi,

 just a question ...

 Would SAS disks be better in a situation with lots of seek time when using
 GlusterFS?




-- 

Regards,
Chris Knipe
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] To GlusterFS or not...

2014-09-23 Thread Roman
Hi,

SAS 7200 RPM disks are not that small at all (basically the same sizes as SATA).
If I remember correctly, the reason for switching to SAS here would be full
duplex operation with SAS (you can read and write to them at the same time),
instead of half duplex with SATA disks (only a read or a write at any one moment).

2014-09-23 9:02 GMT+03:00 Chris Knipe sav...@savage.za.org:

 Hi,

 SSD has been considered but is not an option due to cost.  SAS has
 been considered but is not an option due to the relatively small sizes
 of the drives.  We are *rapidly* growing towards a PB of actual online
 storage.

 We are exploring RAID controllers with onboard SSD cache, which may help.





-- 
Best regards,
Roman.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Unable to run Gluster Test Framework on CentOS 7

2014-09-23 Thread Kiran Patil
Hi,

I followed the steps below to run the tests.

The link
http://gluster.org/community/documentation/index.php/Using_the_Gluster_Test_Framework
does not have info for CentOS 7, so I tried to follow the same steps with
EPEL pointing to release 7.

Here are the steps for CentOS 7:

1. Install EPEL:

$ sudo yum install -y
http://epel.mirror.net.in/epel/7/x86_64/e/epel-release-7-1.noarch.rpm

2. Install the CentOS 7.x dependencies:

$ sudo yum install -y --enablerepo=epel cmockery2-devel dbench git
libacl-devel mock nfs-utils perl-Test-Harness yajl xfsprogs
$ sudo yum install -y --enablerepo=epel python-webob1.0
python-paste-deploy1.5 python-sphinx10 redhat-rpm-config
== The missing packages are
No package python-webob1.0 available.
No package python-paste-deploy1.5 available.
No package python-sphinx10 available.

$ sudo yum install -y --enablerepo=epel autoconf automake bison
dos2unix flex fuse-devel libaio-devel libibverbs-devel \
 librdmacm-devel libtool libxml2-devel lvm2-devel make
openssl-devel pkgconfig \
 python-devel python-eventlet python-netifaces python-paste-deploy \
 python-simplejson python-sphinx python-webob pyxattr
readline-devel rpm-build \
 systemtap-sdt-devel tar

3. Create the mock user

$ sudo useradd -g mock mock

4. Running the test cases results in errors as below:

[root@fractal-c5ac glusterfs]# ./run-tests.sh

... GlusterFS Test Framework ...

Running all the regression test cases
[09:55:02] ./tests/basic/afr/gfid-mismatch.t  Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/afr/gfid-self-heal.t ... Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/afr/metadata-self-heal.t ... Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/afr/read-subvol-data.t . Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/afr/read-subvol-entry.t  Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/afr/resolve.t .. Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/afr/self-heal.t  Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/afr/sparse-file-self-heal.t  Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/afr/stale-file-lookup.t  Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/bd.t ... Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/cdc.t .. Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/ec/ec-12-4.t ... Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/ec/ec-3-1.t  Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/ec/ec-4-1.t  Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/ec/ec-5-1.t  Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/ec/ec-5-2.t  Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/ec/ec-6-2.t  Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/ec/ec-7-3.t  Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/ec/ec.t  Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/ec/nfs.t ... Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/ec/self-heal.t . Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/file-snapshot.t  Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:02] ./tests/basic/fops-sanity.t .. Dubious,
test returned 1 (wstat 256, 0x100)
No subtests run
[09:55:03] ./tests/basic/gfid-access.t .. Dubious,
test returned 1 (wstat 256, 0x100)

Both glusterd and glusterfsd are running fine:

[root@fractal-c5ac glusterfs]# systemctl status glusterd.service
glusterd.service - GlusterFS an clustered file-system server
   Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled)
   Active: active (running) since Tue 2014-09-23 14:45:42 IST; 45min ago
  Process: 12246 ExecStart=/usr/sbin/glusterd -p /run/glusterd.pid
(code=exited, status=0/SUCCESS)
 Main PID: 12247 (glusterd)
   CGroup: /system.slice/glusterd.service
   └─12247 /usr/sbin/glusterd -p 

Re: [Gluster-users] Unable to run Gluster Test Framework on CentOS 7

2014-09-23 Thread Kiran Patil
Hi,

Sorry, I had not checked out the gluster source matching the version I am
running before running the test cases.

It is working fine.
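
For anyone hitting the same thing, a minimal sketch of matching the source
tree to the installed version before running the tests (the paths and the
tag name below are illustrative, not from my setup):

$ glusterfs --version          # e.g. glusterfs 3.5.2
$ cd /path/to/glusterfs        # your git checkout of the glusterfs sources
$ git checkout v3.5.2          # tag corresponding to the installed version
$ ./run-tests.sh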

Thanks,
Kiran.


[Gluster-users] Fwd: GlusterFest Test Week

2014-09-23 Thread Humble Devassy Chirammal
Hi All,


GlusterFS 3.6.0beta1 RPMs for el5-el7 (RHEL, CentOS, etc.) and Fedora
(19, 20, 21, 22) are available at download.gluster.org [1].

Please use them for GlusterFest; we welcome your
suggestions/comments/feedback about this release.

[1] http://download.gluster.org/pub/gluster/glusterfs/qa-releases/3.6.0beta1



--Humble


On Mon, Sep 22, 2014 at 3:40 PM, Humble Devassy Chirammal 
humble.deva...@gmail.com wrote:

 Hi Justin,

 We are working on releasing RPMs for the 3.6.0beta1 release, but it will take
 some time. We will notify the list once the RPMs are available.

 --Humble

 On Mon, Sep 22, 2014 at 3:27 PM, Justin Clift jus...@gluster.org wrote:

 On 21/09/2014, at 7:47 PM, Vijay Bellur wrote:
  On 09/18/2014 05:42 PM, Humble Devassy Chirammal wrote:
  Greetings,
 
  As decided in our last GlusterFS meeting and the 3.6 planning schedule,
  we shall conduct GlusterFS 3.6 test days  starting from next week.
  This time we intend testing one component and functionality per day.
 
  GlusterFS 3.6.0beta1 is now available to kick start the test week [1].
 As we find issues and more patches do get merged in over the test week, I
 will be triggering further beta releases.


 We're going to need RPMs... and probably .deb's too.

 + Justin

 --
 GlusterFS - http://www.gluster.org

 An open source, distributed file system scaling to several
 petabytes, and handling thousands of clients.

 My personal twitter: twitter.com/realjustinclift



___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] REMINDER: Gluster Bug Triage meeting starting in approx. 2 hours from now

2014-09-23 Thread Niels de Vos
Hi all,

please join the #gluster-meeting IRC channel on irc.freenode.net to
participate in the following topics:

* Roll call
* Status of last week's action items
* What happens after a bug has been marked Triaged?
* Add distinction between problem reports and enhancement requests
* Group Triage
* Open Floor

More details on the above, and last minute changes to the agenda are
kept in the etherpad for this meeting:
- https://public.pad.fsfe.org/p/gluster-bug-triage

The meeting starts at 12:00 UTC; you can convert that to your own
timezone with the 'date' command:

$ date -d "12:00 UTC"
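
For example, on a machine in the CEST timezone that prints something like
(the exact output will of course vary):

Tue Sep 23 14:00:00 CEST 2014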

Cheers,
Niels
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] REMINDER: Gluster Bug Triage meeting starting in approx. 2 hours from now

2014-09-23 Thread Niels de Vos
On Tue, Sep 23, 2014 at 11:57:54AM +0200, Niels de Vos wrote:

Due to low attendance we did not go through the agenda, and moved
immediately to the Open Floor. Minutes of the meeting can be found here:

Minutes: 
http://meetbot.fedoraproject.org/gluster-meeting/2014-09-23/gluster-meeting.2014-09-23-12.02.html
Minutes (text): 
http://meetbot.fedoraproject.org/gluster-meeting/2014-09-23/gluster-meeting.2014-09-23-12.02.txt
Log: 
http://meetbot.fedoraproject.org/gluster-meeting/2014-09-23/gluster-meeting.2014-09-23-12.02.log.html

I hope we'll have more people in the meeting next week.

Thanks,
Niels
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs replica volume self heal lots of small file very very slow?how to improve? why slow?

2014-09-23 Thread Jocelyn Hotte
We tried, but the process that hits 100% CPU is glusterfsd, so the impact on
Gluster is still there: that process is effectively pinned at 100% CPU.
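
For anyone wanting to try it, a minimal cgroup (v1) sketch of capping
glusterfsd's CPU (the group name, mount path and quota values are
illustrative, not taken from our setup):

# create a cpu cgroup allowed at most ~2 cores
mkdir /sys/fs/cgroup/cpu/glusterfsd_limit
echo 100000 > /sys/fs/cgroup/cpu/glusterfsd_limit/cpu.cfs_period_us
echo 200000 > /sys/fs/cgroup/cpu/glusterfsd_limit/cpu.cfs_quota_us
# move the running brick processes into it
for pid in $(pidof glusterfsd); do
    echo $pid > /sys/fs/cgroup/cpu/glusterfsd_limit/cgroup.procs
done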

-Original Message-
From: James [mailto:purplei...@gmail.com] 
Sent: 22 septembre 2014 13:25
To: Jocelyn Hotte
Cc: justgluste...@gmail.com; gluster-users@gluster.org; 
gluster-de...@gluster.org
Subject: Re: [Gluster-users] glusterfs replica volume self heal lots of small 
file very very slow?how to improve? why slow?

On Mon, Sep 22, 2014 at 10:04 AM, Jocelyn Hotte jocelyn.ho...@ubisoft.com 
wrote:
 When a self-heal kicks in for our use case, there is a direct impact on performance
 for the users. The CPU of the Gluster nodes hits 100% and stays there, usually for
 1 hour but sometimes up to 4-5 hours.
 This usually renders the Gluster cluster unusable, with a high impact for us.


Have you tried using cgroups to limit the effect of self-heal?
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] WORM seems to be broken.

2014-09-23 Thread Paul Guo
Here are the steps to reproduce this issue. (gluster version 3.5.2)


On one server, lab1 (there is another server, lab2, for replica 2):


[root@lab1 ~]# gluster volume set g1 worm on
volume set: success
[root@lab1 ~]# gluster volume stop g1
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) 
y
volume stop: g1: success
[root@lab1 ~]# gluster volume start g1
volume start: g1: success
[root@lab1 ~]# gluster volume info
 
Volume Name: g1
Type: Distributed-Replicate
Volume ID: fbd79e82-b079-4d19-ae5a-b4279c31bac3
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: lab1:/brick1
Brick2: lab2:/brick1
Brick3: lab1:/brick2
Brick4: lab2:/brick2
Options Reconfigured:
diagnostics.brick-log-level: TRACE
features.worm: on


On the client (which is also one of the servers, lab1):
[root@lab1 g1]# mount -t glusterfs -o worm lab1:g1 /mnt

Process info of the gluster mount process:
root  2851 1  0 21:28 ?00:00:01 /usr/sbin/glusterfs --worm 
--volfile-server=lab1 --volfile-id=g1 /mnt



Here is the log of the client.


[2014-09-23 13:28:55.345226] W [client-rpc-fops.c:1632:client3_3_entrylk_cbk] 
0-g1-client-0: remote operation failed: Read-only file system
[2014-09-23 13:28:55.346172] W [client-rpc-fops.c:1632:client3_3_entrylk_cbk] 
0-g1-client-0: remote operation failed: Read-only file system
[2014-09-23 13:28:55.346222] I [afr-lk-common.c:1092:afr_lock_blocking] 
0-g1-replicate-0: unable to lock on even one child
[2014-09-23 13:28:55.346245] I 
[afr-transaction.c:1295:afr_post_blocking_entrylk_cbk] 0-g1-replicate-0: 
Blocking entrylks failed.
[2014-09-23 13:28:55.346321] W [fuse-bridge.c:1911:fuse_create_cbk] 
0-glusterfs-fuse: 5: /test = -1 (Read-only file system)


It looks like it fails to lock the directory (?) because WORM calls
ro_entrylk(), which was originally written for the read-only feature of the file system.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] To GlusterFS or not...

2014-09-23 Thread Alexey Zilber
Yes, Roman is correct.  Also, if you have lots of random IO you're better
off with many smaller SAS drives: the more spindles you have, the more
random IO you can sustain.  That is also why we went with SSD drives; SAS
drives weren't cutting it on the random-IO front.

Another option you may try is using SAS drives with ZFS compression.
Compression will be especially helpful if you're using SATA drives.
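
On ZFS this is just a property on the dataset holding the bricks; a minimal
sketch (the pool/dataset name below is only an example):

# enable lz4 compression on the dataset holding the bricks
zfs set compression=lz4 tank/gluster-bricks
# later, check how well it is compressing
zfs get compressratio tank/gluster-bricks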

-Alex
On Sep 23, 2014 2:10 PM, Roman rome...@gmail.com wrote:

 Hi,

 SAS 7200 RPM disks are not that small at all (basically the same sizes as
 SATA). If I remember correctly, the reason for switching to SAS here would
 be full duplex operation with SAS (you can read and write to them at the same
 time), instead of half duplex with SATA disks (only a read or a write at any one moment).


___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] To GlusterFS or not...

2014-09-23 Thread Jeff Darcy
 SSD has been considered but is not an option due to cost.  SAS has
 been considered but is not an option due to the relatively small sizes
 of the drives.  We are *rapidly* growing towards a PB of actual online
 storage.

 We are exploring RAID controllers with onboard SSD cache, which may help.

We have had some pretty good results with those in the lab.  They're not
*always* beneficial, and getting the right SSD:disk ratio for your
workload might require some experimentation, but it's certainly a good
direction to explore.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Lots of these in my brick logs.

2014-09-23 Thread Alex Crow

Hi,

Are the messages below in my brick logs anything to worry about?

Seen a hell of a lot of them today.

Thanks

Alex

[2014-09-23 15:11:00.252041] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (f7e985f2-381d-4fa7-9f7c-f70745f9d5d6) is not found. anonymous fd 
creation failed
[2014-09-23 15:11:00.643212] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (bb4f48f3-27ac-4909-9f0f-130b214b9576) is not found. anonymous fd 
creation failed
[2014-09-23 15:11:58.076899] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (acc3709f-b428-40ad-83b1-80838fc3916d) is not found. anonymous fd 
creation failed
[2014-09-23 15:11:58.488305] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (0e55f37b-5cd9-4fbd-be3c-0645e6fe12d9) is not found. anonymous fd 
creation failed
[2014-09-23 15:12:55.911618] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (b73d3253-ecda-4153-8555-d3152c322808) is not found. anonymous fd 
creation failed
[2014-09-23 15:12:56.333404] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (aca64a10-e232-4953-8b58-f1bf90369ea2) is not found. anonymous fd 
creation failed
[2014-09-23 15:14:03.933481] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (4ef5e96f-d3ee-4ae8-ab8a-56da60d9a882) is not found. anonymous fd 
creation failed
[2014-09-23 15:14:04.324223] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (e1fdbd09-b854-4e3e-aabf-65be845c99ff) is not found. anonymous fd 
creation failed
[2014-09-23 15:15:01.075043] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (d040a0a7-d963-41c4-8fd5-0c711dc7bc8b) is not found. anonymous fd 
creation failed
[2014-09-23 15:15:01.470147] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (a28a1758-58f9-416c-b3e0-b32aa34a3876) is not found. anonymous fd 
creation failed
[2014-09-23 15:16:03.439833] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (7d7fc84e-902c-43d8-9c6a-bdf5b0acfd30) is not found. anonymous fd 
creation failed
[2014-09-23 15:18:20.906178] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (e0a31024-528a-48ca-bc14-3ce6eec8c8d5) is not found. anonymous fd 
creation failed
[2014-09-23 15:18:21.299944] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (f0926a06-3ef0-4f68-aa1e-167f127a5d98) is not found. anonymous fd 
creation failed
[2014-09-23 15:18:21.687941] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (18fff46e-19fa-43d2-8c53-d32b57a75dbe) is not found. anonymous fd 
creation failed
[2014-09-23 15:19:21.332023] W 
[server-resolve.c:420:resolve_anonfd_simple] 0-server: inode for the 
gfid (ad7be2d4-5420-4588-be52-f00683ba69d3) is not found. anonymous fd 
creation failed
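
In case it helps with debugging, a rough sketch of mapping one of these GFIDs
back to a path on the brick (the brick path below is illustrative):

GFID=f7e985f2-381d-4fa7-9f7c-f70745f9d5d6
# the .glusterfs tree keeps a hard link (or a symlink, for directories) per GFID
ls -l /path/to/brick/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID
# for regular files, find the other hard link(s), i.e. the real path on the brick
find /path/to/brick -samefile /path/to/brick/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID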


--
This message is intended only for the addressee and may contain
confidential information. Unless you are that person, you may not
disclose its contents or use it in any way and are requested to delete
the message along with any attachments and notify us immediately.
Transact is operated by Integrated Financial Arrangements plc. 29
Clement's Lane, London EC4N 7AE. Tel: (020) 7608 4900 Fax: (020) 7608
5300. (Registered office: as above; Registered in England and Wales
under number: 3727592). Authorised and regulated by the Financial
Conduct Authority (entered on the Financial Services Register; no. 190856).

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Migrating data from a failing filesystem

2014-09-23 Thread james . bellinger
I inherited a non-replicated gluster system based on antique hardware.

One of the brick filesystems is flaking out, and remounts read-only.  I
repair it and remount it, but this is only postponing the inevitable.

How can I migrate files off a failing brick that intermittently turns
read-only?  I have enough space, thanks to a catastrophic failure on
another brick; I don't want to present people with another one.  But if I
understand migration correctly, references have to be deleted, which isn't
possible if the filesystem turns read-only.

What I want to do is migrate the files off, remove it from gluster,
rebuild the array, rebuild the filesystem, and then add it back as a
brick.  (Actually what I'd really like is to hear that the students are
all done with the system and I can turn the whole thing off, but theses
aren't complete yet.)

Any advice or words of warning will be appreciated.

James Bellinger




___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Unable to run Gluster Test Framework on CentOS 7

2014-09-23 Thread Justin Clift
Cool, I've not gotten around to testing EL7 yet. :)

Would you have the time / interest to add CentOS 7 steps to the page?

+ Justin


On 23/09/2014, at 8:47 AM, Kiran Patil wrote:
 Hi,
 
 Sorry, I had not checked out the gluster source matching the version I am running
 before running the test cases.
 
 It is working fine.
 
 Thanks,
 Kiran.
 

[Gluster-users] Problem setting up geo-replication: gsyncd using incorrect slave hostname

2014-09-23 Thread Darrell Budic
I was having trouble setting up geo-replication, and I finally figured out why.
Gsyncd is trying to use the wrong (but valid) name for the slave server, and
it’s resolving to an address it can’t reach. It does this even though I tried
to set up the geo-replication to a specific IP address, and the initial create
succeeded.

The environment I’m working with has server1 and server3 with bricks for a
replicated volume, and my slave server ‘spire’ is located remotely. Because
there are multiple paths between the systems, I wanted to force geo-replication
to occur over a specific slave address, so I tried to set up replication with an
IP-based URL.

Setup/config of the geo-replication works, and appears to start, but is 
permanently in a failure state:

[root@server1 geo-replication]# gluster volume geo-replication status
No active geo-replication sessions
[root@server1 geo-replication]# gluster volume geo-replication gvOvirt 
10.78.4.1::geobk-Ovirt create push-pem
Creating geo-replication session between gvOvirt & 10.78.4.1::geobk-Ovirt has
been successful
[root@server1 geo-replication]# gluster volume geo-replication status
 
MASTER NODE            MASTER VOL    MASTER BRICK             SLAVE                           STATUS         CHECKPOINT STATUS    CRAWL STATUS
-----------------------------------------------------------------------------------------------------------------------------------------------
server1.xxx.xxx.xxx    gvOvirt       /v0/ha-engine/gbOvirt    ssh://10.78.4.1::geobk-Ovirt    Not Started    N/A                  N/A
server3.xxx.xxx.xxx    gvOvirt       /v0/ha-engine/gbOvirt    ssh://10.78.4.1::geobk-Ovirt    Not Started    N/A                  N/A

[root@server1 geo-replication]# gluster volume geo-replication gvOvirt 
10.78.4.1::geobk-Ovirt start
Starting geo-replication session between gvOvirt & 10.78.4.1::geobk-Ovirt has
been successful
[root@server1 geo-replication]# gluster volume geo-replication status
 
MASTER NODE            MASTER VOL    MASTER BRICK             SLAVE                           STATUS    CHECKPOINT STATUS    CRAWL STATUS
------------------------------------------------------------------------------------------------------------------------------------------
server1.xxx.xxx.xxx    gvOvirt       /v0/ha-engine/gbOvirt    ssh://10.78.4.1::geobk-Ovirt    faulty    N/A                  N/A
server3.xxx.xxx.xxx    gvOvirt       /v0/ha-engine/gbOvirt    ssh://10.78.4.1::geobk-Ovirt    faulty    N/A                  N/A

Tracking it down in the logs, it’s because it’s trying to connect to 
“root@spire” instead of “root@10.78.4.1”, even though the setup seemed to use 
10.78.4.1 just fine:

[2014-09-23 14:13:41.595123] I [monitor(monitor):130:monitor] Monitor: starting 
gsyncd worker
[2014-09-23 14:13:41.764898] I [gsyncd(/v0/ha-engine/gbOvirt):532:main_i] 
top: syncing: gluster://localhost:gvOvirt - 
ssh://root@spire:gluster://localhost:geobk-Ovirt
[2014-09-23 14:13:53.40934] E 
[syncdutils(/v0/ha-engine/gbOvirt):223:log_raise_exception] top: connection 
to peer is broken
[2014-09-23 14:13:53.41244] E [resource(/v0/ha-engine/gbOvirt):204:errlog] 
Popen: command ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i 
/var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S 
/tmp/gsyncd-aux-ssh-gaHXrp/e68410dcff276efa39a94dc5b33e0d8e.sock root@spire 
/nonexistent/gsyncd --session-owner c9a371cd-8644-4706-a4b7-f12bc2c37ac6 -N 
--listen --timeout 120 gluster://localhost:geobk-Ovirt returned with 255, 
saying:
[2014-09-23 14:13:53.41356] E [resource(/v0/ha-engine/gbOvirt):207:logerr] 
Popen: ssh ssh_exchange_identification: Connection closed by remote host
[2014-09-23 14:13:53.41569] I [syncdutils(/v0/ha-engine/gbOvirt):192:finalize] 
top: exiting.
[2014-09-23 14:13:54.44226] I [monitor(monitor):157:monitor] Monitor: 
worker(/v0/ha-engine/gbOvirt) died in startup phase

So where did it pick this up from? If anything, I think it’d try the full name 
of the slave server, but it didn’t even try that. The slave’s hostname, btw, is 
spire.yyy.yyy.xxx, not even in the same domain as the master.

I worked around this by setting up a new hostname resolution for ‘spire’ and
tweaking the search path so it resolved things the way I wanted (to 10.78.4.1),
but that shouldn’t be necessary. It also seems like this could cause security
issues (well, OK, probably not, since it’s rsyncing over ssh, but traffic could
definitely wind up where you didn’t expect it), because it’s not trying to
connect to what I explicitly told it to. In this case it ran afoul of some sshd
root-login rules, but it could just as easily have been other firewall rules.
It also causes confusion, because the geo-replication status report is misleading:
it’s trying to sync to an address that isn’t reflected there.
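
For reference, the workaround boils down to something like this on the master
nodes (a sketch only; the short name gsyncd picks and the address you want it
to use depend on your own naming):

# /etc/hosts -- pin the name gsyncd ends up using to the intended replication address
10.78.4.1   spire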

Seems like a bug to me, but figured I’d check here first.

  

Re: [Gluster-users] Migrating data from a failing filesystem

2014-09-23 Thread Ravishankar N

On 09/23/2014 08:56 PM, james.bellin...@icecube.wisc.edu wrote:

I inherited a non-replicated gluster system based on antique hardware.

One of the brick filesystems is flaking out, and remounts read-only.  I
repair it and remount it, but this is only postponing the inevitable.

How can I migrate files off a failing brick that intermittently turns
read-only?  I have enough space, thanks to a catastrophic failure on
another brick; I don't want to present people with another one.  But if I
understand migration correctly references have to be deleted, which isn't
possible if the filesystem turns read-only.


What you could do is initiate the migration with `remove-brick start` and
monitor the progress with `remove-brick status`. Irrespective of whether
the rebalance completes or fails (due to the brick turning read-only), you
can then update the volume configuration with `remove-brick commit`. If the
brick still has files left, mount the gluster volume on that node and copy
the files from the brick to the volume via the mount. You can then safely
rebuild the array, add a different brick, or whatever.
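
A rough sketch of that sequence (the volume name, server and brick path below
are placeholders, not taken from your setup):

# drain the failing brick, then drop it from the volume
gluster volume remove-brick VOLNAME server:/failing-brick start
gluster volume remove-brick VOLNAME server:/failing-brick status
gluster volume remove-brick VOLNAME server:/failing-brick commit
# copy anything the rebalance could not move, via a mount of the volume (not the brick)
mount -t glusterfs server:/VOLNAME /mnt
cp -a /failing-brick/* /mnt/    # the hidden .glusterfs directory is not needed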



What I want to do is migrate the files off, remove it from gluster,
rebuild the array, rebuild the filesystem, and then add it back as a
brick.  (Actually what I'd really like is to hear that the students are
all done with the system and I can turn the whole thing off, but theses
aren't complete yet.)

Any advice or words of warning will be appreciated.


Looks like your bricks have been in trouble for over a year now
(http://gluster.org/pipermail/gluster-users.old/2013-September/014319.html).
Better get them fixed sooner rather than later! :-)

HTH,
Ravi


James Bellinger




___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users