Re: [Gluster-users] [Gluster-Maintainers] Meeting minutes : May 2nd, 2018 Maintainers meeting.

2018-05-02 Thread Amar Tumballi
On Wed, May 2, 2018 at 8:27 PM, Shyamsundar Ranganathan  wrote:

> Meeting date: 05/02/2018 (May 02nd, 2018), 19:30 IST, 14:00 UTC, 10:00 EDT
> BJ Link
>
>- Bridge: https://bluejeans.com/205933580
>- Download: https://bluejeans.com/s/fPavr


Attendance
>
>- Raghavendra M (Raghavendra Bhat), Kaleb, Atin, Amar, Nithya, Rafi,
>Shyam
>
> Agenda
>
>-
>
>Commitment (GPLv2 Cure)
>- Email
>   
> 
>   and Patch 
>   - [amarts] 20+ people have already given a +1. Will wait another
>   15 days before taking any action on this.
>   - AI: Send a reminder to the lists and get the changes merged
>   around next maintainers meeting [Amar]
>-
>
>GD2 testing upstream
>- Is there a glusterd v1 vs v2 parity check matrix?
>  - Functional parity of the CLI
>   - As the cli format is not 100% compatible, how to proceed further
>   with regression tests without much fuss?
>  - [amarts] Easier option is to handle it similar to the brick-mux
>  test: create a new directory ‘tests2/’ which is a copy of the current
>  tests, with files changed as per glusterd2/glustercli needs. We can do
>  bulk replaces etc.; start small, make incremental progress. Run it
>  once a day.
> - Add smoke check for core glusterfs to keep things working
> with GD2
> - Add GD2 tests into the said patch, to ensure functionality
> of GD2 itself
> - Approach ack: Shyam
> - Approach nack:
>  -
>
>Coding standards
>- Did we come to a conclusion? What next?
>  - Need some more votes to take it forward
>  - Settle current conflicts to the settings
>   - [amarts] Need to see what the ‘deadline’ for this should be. Ideal to
>   have it before 4.1, or else backporting would be a serious problem.
>   - AI: Reminder mail on release to get this closed [Shyam]
>   - Conversion work should be doable in 1/2 a day
>   - Per-patch format auto-verifier/correction job readiness
>  - Possibly not ready during roll-out
>  - Not looking at it as the blocker, as we can get it within a
>  week and sanitize patches that were merged in that week [Shyam]
>   -
>
>Branching for 4.1?
>- Today would be branching date [Shyam]
>  - No time to fold in slipping features, as we are 2 weeks off
>  already!
>  - Branching is on 4th May, 2018; whatever makes it by then gets in,
>  the rest is pushed to 4.2
>   - Leases?
>   - Ctime?
>   - Thin-Arbiter?
>   - ?
>-
>
>Round Robin:
>- [amarts] - Is the plan for version change (or not change) ready
>   after 4.1? or do we need to extend the period for this?
>  - AI: Send proposal to devel -> users and take action on getting
>  this done before 4.2
>  - Fold in xlator maturity states into the same release
>   - [kkeithley] - new (untriaged) upstream bugs list is getting
>   longer.
>  - Triage beyond assignment
>  - Tracking fixes and closure of the same
>  - AI: Shyam to work on this and next steps
>
> Decisions and Actions
>
>- AI (GPL cure): Send a reminder to the lists and get the changes
>merged around next maintainers meeting [Amar]
>- Decision: GD2 tests to create a patch of tests and work on it using
>nightly runs to get GD2 integrated
>- AI (format conversion): Reminder mail on release to get this closed
>[Shyam]
>- AI (format conversion): Conversion done Thursday, ready for merge
>Friday [Amar/Nigel]
>- Decision (4.1 branching): 4.1 branching date set at 4th May, no
>feature slips allowed beyond that [Shyam]
>- AI (release cadence and version numbers change): Send proposal to
>devel -> users and take action on getting this done before 4.2 [Shyam]
>- AI (Bugs triage cadence): Shyam to work on this and next steps
>[Shyam]
>
>
> ___
> maintainers mailing list
> maintain...@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
>


-- 
Amar Tumballi (amarts)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Announcing mountpoint, August 27-28, 2018

2018-05-02 Thread Amye Scavarda
Our first mountpoint is coming!
Software-defined Storage (SDS) is changing the traditional way we think of
storage. Decoupling software from hardware allows you to choose your
hardware vendors and provides enterprises with more flexibility.

Attend mountpoint on August 27 - 28, 2018 in Vancouver, BC, just before Open
Source Summit North America, for this first-time event. We are joining
forces with the Ceph and Gluster communities, SDS experts, and partners to
bring you an exciting 2-day event. Help lead the conversation on open
source software defined storage and share your knowledge!

Our CFP is open from May 3rd through June 15th, 2018.
More details available, including sponsorship:
http://mountpoint.io/

-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] unable to remove ACLs

2018-05-02 Thread Vijay Bellur
On Wed, May 2, 2018 at 3:54 AM, lejeczek  wrote:

>
>
> On 01/05/18 23:59, Vijay Bellur wrote:
>
>>
>>
>> On Tue, May 1, 2018 at 5:46 AM, lejeczek <pelj...@yahoo.co.uk> wrote:
>>
>> hi guys
>>
>> I have a simple case of:
>> $ setfacl -b
>> not working!
>> If I copy a folder outside of the autofs-mounted gluster vol,
>> to a regular fs, removing the acl works as expected.
>> Inside the mounted gluster vol I seem to be able to
>> modify/remove ACLs for users, groups and masks, but
>> that one simple, important thing does not work.
>> It is also not a case of default ACLs being enforced
>> from the parent, for I mkdir a folder next to that
>> problematic folder and there are no ACLs, as expected.
>>
>> glusterfs 3.12.9, Centos 7.4
>>
>> Any thoughts, suggestions?
>>
>>
>> Are you mounting glusterfs with -o acl? If you are not,
>> mounting with the acl option is necessary for glusterfs to honor ACLs.
>>
>> Regards,
>> Vijay
>>
>>
> Surely I am. Otherwise I'd have no "working" acls, right? Like I say, I
> can operate setfacl and this seems to work, except I cannot remove the acl
> completely with "-b", which should just work, right?
>

Yes, it should ideally work.



>
> I think it should be easily reproducible; my setup is pretty "regular".
> I'm on CentOS 7.4 and mount via autofs/manually. Can anybody check that?



Here's a test that passed with a manual mount while using 3.12.9:

[root@deepthought ~]# mount -t glusterfs -o acl deepthought:/foo
/mnt/gluster
[root@deepthought ~]# cd /mnt/gluster/
[root@deepthought gluster]# setfacl -m u:nobody:rw foo
[root@deepthought gluster]# getfacl foo
# file: foo
# owner: root
# group: root
user::-w-
user:nobody:rw-
group::r--
mask::rw-
other::r--

[root@deepthought gluster]# setfacl -b foo
[root@deepthought gluster]# echo $?
0
[root@deepthought gluster]# getfacl foo
# file: foo
# owner: root
# group: root
user::-w-
group::r--
other::r--


Can you please share the sequence of acl commands that causes "setfacl -b"
to fail in your setup?
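
For reference, a transcript along these lines would capture the relevant
detail (the volume, mount point and file names below are only placeholders):

mount -t glusterfs -o acl server1:/myvol /mnt/myvol
cd /mnt/myvol
getfacl problem-folder        # ACLs before the attempted removal
setfacl -b problem-folder     # the failing command
echo $?                       # its exit status
getfacl problem-folder        # ACLs left behind afterwards
gluster volume info myvol     # volume options in effect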

Thanks,
Vijay

>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-Maintainers] Meeting minutes : May 2nd, 2018 Maintainers meeting.

2018-05-02 Thread Shyamsundar Ranganathan
Meeting date: 05/02/2018 (May 02nd, 2018), 19:30 IST, 14:00 UTC, 10:00 EDT 
BJ Link 


* Bridge: https://bluejeans.com/205933580 
* Download: https://bluejeans.com/s/fPavr 

Attendance 


* Raghavendra M (Raghavendra Bhat), Kaleb, Atin, Amar, Nithya, Rafi, Shyam 

Agenda 


* 

Commitment (GPLv2 Cure) 
* Email and Patch 
* [amarts] 20+ people have already given a +1. Will wait another 15 days 
before taking any action on this. 
* AI: Send a reminder to the lists and get the changes merged around 
next maintainers meeting [Amar] 
* 

GD2 testing upstream 
* Is there a glusterd v1 vs v2 parity check matrix? 
* Functional parity of the CLI 
* As the cli format is not 100% compatible, how to proceed further with 
regression tests without much fuss? 
* [amarts] Easier option is to handle it similar to the brick-mux test. 
Create a new directory ‘tests2/’ which is a copy of the current tests, with 
files changed as per glusterd2/glustercli needs. We can do bulk replaces etc. 
(see the sketch after this agenda item); start small, make incremental 
progress. Run it once a day. 
* Add smoke check for core glusterfs to keep things working 
with GD2 
* Add GD2 tests into the said patch, to ensure functionality of 
GD2 itself 
* Approach ack: Shyam 
* Approach nack: 
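
A minimal sketch of the bulk-replace idea above, assuming the GD2 client is
invoked as ‘glustercli’ (the actual v1 -> v2 command mapping still has to be
worked out, so the sed expression is only illustrative):

# copy the existing regression tests and rewrite CLI invocations for GD2
cp -a tests tests2
grep -rl --include='*.t' 'gluster ' tests2 | xargs sed -i 's/\bgluster /glustercli /g'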
* 

Coding standards 
* Did we come to a conclusion? What next? 
* Need some more votes to take it forward 
* Settle current conflicts to the settings 
* [amarts] Need to see what the ‘deadline’ for this should be. Ideal to 
have it before 4.1, or else backporting would be a serious problem. 
* AI: Reminder mail on release to get this closed [Shyam] 
* Conversion work should be doable in 1/2 a day 
* Per-patch format auto-verifier/correction job readiness 
* Possibly not ready during roll-out 
* Not looking at it as the blocker, as we can get it within a week 
and sanitize patches that were merged in that week [Shyam] 
* 

Branching for 4.1? 
* Today would be branching date [Shyam] 
* No time to fold in slipping features, as we are 2 weeks off already! 
* Branching is on 4th May, 2018; whatever makes it by then gets in, the 
rest is pushed to 4.2 
* Leases? 
* Ctime? 
* Thin-Arbiter? 
* ? 
* 

Round Robin: 
* [amarts] - Is the plan for version change (or not change) ready after 
4.1? or do we need to extend the period for this? 
* AI: Send proposal to devel -> users and take action on getting 
this done before 4.2 
* Fold in xlator maturity states into the same release 
* [kkeithley] - new (untriaged) upstream bugs list is getting longer. 
* Triage beyond assignment 
* Tracking fixes and closure of the same 
* AI: Shyam to work on this and next steps 

Decisions and Actions 


* AI (GPL cure): Send a reminder to the lists and get the changes merged 
around next maintainers meeting [Amar] 
* Decision: GD2 tests to create a patch of tests and work on it using 
nightly runs to get GD2 integrated 
* AI (format conversion): Reminder mail on release to get this closed 
[Shyam] 
* AI (format conversion): Conversion done Thursday, ready for merge Friday 
[Amar/Nigel] 
* Decision (4.1 branching): 4.1 branching date set at 4th May, no feature 
slips allowed beyond that [Shyam] 
* AI (release cadence and version numbers change): Send proposal to devel 
-> users and take action on getting this done before 4.2 [Shyam] 
* AI (Bugs triage cadence): Shyam to work on this and next steps [Shyam] 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Turn off replication

2018-05-02 Thread Jose Sanchez

Hi, All and thank you. 

We left it alone and it finished rebalancing; it seems to be working. Thanks 
again for your help.

J. Sanchez

-
Jose Sanchez
Systems/Network Analyst 
Center of Advanced Research Computing
1601 Central Ave.
MSC 01 1190
Albuquerque, NM 87131-0001
carc.unm.edu 
575.636.4232

> On May 2, 2018, at 3:20 AM, Hari Gowtham  wrote:
> 
> Hi,
> 
> Removing data to speed up the rebalance is not something that is recommended.
> Rebalance can be stopped, but if started again it will start from the beginning
> (it will have to check and skip the files already moved).
> 
> Rebalance will take a while; better to let it run. It doesn't have any
> downside. Unless you touch the backend, the data on the gluster volume will
> remain available for use while the rebalance is running. If you want to
> speed things up, the rebalance throttle option can be set to aggressive
> (this might increase the cpu and disk usage).
> 
> 
> On Mon, Apr 30, 2018 at 6:24 PM, Jose Sanchez  wrote:
>> Hi All
>> 
>> We were able to get all 4 bricks distributed, and we can see the right
>> amount of space, but we have been rebalancing for 4 days for 16TB and are
>> still only at 8TB. Is there a way to speed this up? There is also data we
>> can remove to speed it up, but what is the best procedure for removing
>> data: from the Gluster main export point, or by going onto each brick and
>> removing it? We would like to stop rebalancing, delete the data, and
>> rebalance again.
>> 
>> Is there a downside to doing this? What happens with Gluster missing data
>> when rebalancing?
>> 
>> Thanks
>> 
>> Jose
>> 
>> 
>> 
>> 
>> 
>> 
>> -
>> Jose Sanchez
>> Systems/Network Analyst 1
>> Center of Advanced Research Computing
>> 1601 Central Ave.
>> MSC 01 1190
>> Albuquerque, NM 87131-0001
>> carc.unm.edu
>> 575.636.4232
>> 
>> On Apr 27, 2018, at 4:16 AM, Hari Gowtham  wrote:
>> 
>> Hi Jose,
>> 
>> Why are all the bricks visible in volume info if the pre-validation
>> for add-brick failed? I suspect that the remove brick wasn't done
>> properly.
>> 
>> You can provide the cmd_history.log to verify this. It would be better to
>> get the other log messages as well.
>> 
>> Also, I need to know which bricks were actually removed, the command used,
>> and its output.
>> 
>> On Thu, Apr 26, 2018 at 3:47 AM, Jose Sanchez  wrote:
>> 
>> Looking at the logs, it seems that it is trying to add the brick using the
>> same port that was assigned to gluster01ib:
>> 
>> 
>> Any Ideas??
>> 
>> Jose
>> 
>> 
>> 
>> [2018-04-25 22:08:55.169302] I [MSGID: 106482]
>> [glusterd-brick-ops.c:447:__glusterd_handle_add_brick] 0-management:
>> Received add brick req
>> [2018-04-25 22:08:55.186037] I [run.c:191:runner_log]
>> (-->/usr/lib64/glusterfs/3.8.15/xlator/mgmt/glusterd.so(+0x33045)
>> [0x7f5464b9b045]
>> -->/usr/lib64/glusterfs/3.8.15/xlator/mgmt/glusterd.so(+0xcbd85)
>> [0x7f5464c33d85] -->/lib64/libglusterfs.so.0(runner_log+0x115)
>> [0x7f54704cf1e5] ) 0-management: Ran script:
>> /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh
>> --volname=scratch --version=1 --volume-op=add-brick
>> --gd-workdir=/var/lib/glusterd
>> [2018-04-25 22:08:55.309534] I [MSGID: 106143]
>> [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
>> /gdata/brick1/scratch on port 49152
>> [2018-04-25 22:08:55.309659] I [MSGID: 106143]
>> [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
>> /gdata/brick1/scratch.rdma on port 49153
>> [2018-04-25 22:08:55.310231] E [MSGID: 106005]
>> [glusterd-utils.c:4877:glusterd_brick_start] 0-management: Unable to start
>> brick gluster02ib:/gdata/brick1/scratch
>> [2018-04-25 22:08:55.310275] E [MSGID: 106074]
>> [glusterd-brick-ops.c:2493:glusterd_op_add_brick] 0-glusterd: Unable to add
>> bricks
>> [2018-04-25 22:08:55.310304] E [MSGID: 106123]
>> [glusterd-mgmt.c:294:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit
>> failed.
>> [2018-04-25 22:08:55.310316] E [MSGID: 106123]
>> [glusterd-mgmt.c:1427:glusterd_mgmt_v3_commit] 0-management: Commit failed
>> for operation Add brick on local node
>> [2018-04-25 22:08:55.310330] E [MSGID: 106123]
>> [glusterd-mgmt.c:2018:glusterd_mgmt_v3_initiate_all_phases] 0-management:
>> Commit Op Failed
>> [2018-04-25 22:09:11.678141] E [MSGID: 106452]
>> [glusterd-utils.c:6064:glusterd_new_brick_validate] 0-management: Brick:
>> gluster02ib:/gdata/brick1/scratch not available. Brick may be containing or
>> be contained by an existing brick
>> [2018-04-25 22:09:11.678184] W [MSGID: 106122]
>> [glusterd-mgmt.c:188:gd_mgmt_v3_pre_validate_fn] 0-management: ADD-brick
>> prevalidation failed.
>> [2018-04-25 22:09:11.678200] E [MSGID: 106122]
>> [glusterd-mgmt-handler.c:337:glusterd_handle_pre_validate_fn] 0-management:
>> Pre Validation failed on operation Add brick
>> [root@gluster02 glusterfs]# gluster 

Re: [Gluster-users] Finding performance bottlenecks

2018-05-02 Thread Tony Hoyle
On 01/05/2018 02:27, Thing wrote:
> Hi,
> 
> So are you using KVM or VMware as the host(s)?  I basically have the same setup,
> i.e. 3 x 1TB "raid1" nodes and VMs, but 1gb networking.  I do notice that with
> vmware using NFS, disk was pretty slow (40% of a single disk), but this
> was over 1gb networking which was clearly saturating.  Hence I am moving
> to KVM to use glusterfs, hoping for better performance and bonding; it
> will be interesting to see which host type runs faster.

1gb will always be the bottleneck in that situation - that's going to
max out at the speed of a single disk or lower.  You need, at minimum, to
bond interfaces and preferably go to 10gb to get past that.

Our NFS actually ends up faster than local disk because the read speed
of the raid is faster than the read speed of the local disk.

> Which operating system is gluster on?  

Debian Linux.  Supermicro motherboards, 24 core i7 with 128GB of RAM on
the VM hosts.

> Did you do iperf between all nodes?

Yes, around 9.7Gb/s

It doesn't appear to be raw read speed but iowait.  Under NFS load with
multiple VMs I get an iowait of around 0.3%.  Under gluster it's never less
than 10%, and glusterfsd is often at the top of the CPU usage.  This causes
a load average of ~12 compared to 3 over NFS, and absolutely kills VMs,
esp. Windows ones - one machine I set booting was still booting
30 minutes later!
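
For anyone wanting to collect comparable numbers, a minimal check on each
host (assuming the sysstat package is installed for iostat) would be:

iostat -x 5    # the avg-cpu line reports %iowait; per-device stats follow
top            # check whether glusterfsd sits at the top of CPU usage
uptime         # load averages for the NFS vs gluster comparison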

Tony
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Usage monitoring per user

2018-05-02 Thread JOHE (John Hearns)
I rather like agedu. It probably does what you want.

But as Mohammad says you do have to traverse your filesystem.


https://www.chiark.greenend.org.uk/~sgtatham/agedu/

agedu: track down wasted disk space - chiark home page
www.chiark.greenend.org.uk
agedu, a Unix utility for tracking down wasted disk space. Suppose you're
running low on disk space. You need to free some up, by finding something
that's a waste of space and deleting it (or moving it to an archive
medium).
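
A typical run looks something like the following (the path is a placeholder);
agedu scans once, then serves or dumps a report from the index it built:

agedu -s /mnt/glustervol    # scan the tree and build the agedu.dat index
agedu -w                    # browse an interactive report over HTTP
agedu -t /mnt/glustervol    # or dump a plain-text per-directory summary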







From: gluster-users-boun...@gluster.org  on 
behalf of Alex Chekholko 
Sent: 01 May 2018 18:45
To: mohammad kashif
Cc: gluster-users
Subject: Re: [Gluster-users] Usage monitoring per user

Hi,

There are several programs that will basically take the outputs of your scans 
and store the results in a database. If you size the database appropriately, 
then querying that database will be much quicker than querying the filesystem.  
But of course the results will be a little bit outdated.

One such project is robinhood. 
https://github.com/cea-hpc/robinhood/wiki

A simpler way might be to just have daily/weekly cron jobs that output text 
reports, without maintaining a separate database.

But there is no way to avoid doing a recursive POSIX tree traversal, since that 
is how you get your info out of your filesystem.
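
As a rough illustration of that traversal, a per-user roll-up can be done in
one pass over the mount (path is a placeholder; assumes GNU find); tools like
robinhood essentially run this kind of scan and keep the result in a database:

# sum apparent file sizes per owner across the whole tree
find /mnt/glustervol -type f -printf '%u %s\n' \
  | awk '{u[$1] += $2} END {for (n in u) printf "%-12s %.1f GiB\n", n, u[n]/2^30}' \
  | sort -k2 -rn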

Regards,
Alex

On Tue, May 1, 2018 at 5:30 AM, mohammad kashif  wrote:
Hi

Is there any easy way to find usage per user in Gluster? We have 300TB of storage 
with almost 100 million files. Running du takes too much time. Are people aware 
of any other tool which can be used to break up storage per user?

Thanks

Kashif

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] unable to remove ACLs

2018-05-02 Thread lejeczek



On 01/05/18 23:59, Vijay Bellur wrote:



On Tue, May 1, 2018 at 5:46 AM, lejeczek <pelj...@yahoo.co.uk> wrote:


hi guys

I have a simple case of:
$ setfacl -b
not working!
If I copy a folder outside of the autofs-mounted gluster vol,
to a regular fs, removing the acl works as expected.
Inside the mounted gluster vol I seem to be able to
modify/remove ACLs for users, groups and masks, but
that one simple, important thing does not work.
It is also not a case of default ACLs being enforced
from the parent, for I mkdir a folder next to that
problematic folder and there are no ACLs, as expected.

glusterfs 3.12.9, Centos 7.4

Any thoughts, suggestions?


Are you mounting glusterfs with -o acl? If you are not, 
mounting with the acl option is necessary for 
glusterfs to honor ACLs.


Regards,
Vijay


Surely I am. Otherwise I'd have no "working" acls, right? 
Like I say, I can operate setfacl and this seems to work, 
except I cannot remove the acl completely with "-b", which 
should just work, right?


I think it should be easily reproducible; my setup is pretty 
"regular". I'm on CentOS 7.4 and mount via autofs/manually. 
Can anybody check that?


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Healing : No space left on device

2018-05-02 Thread Hoggins!
Oh, and *there is* space on the device where the brick's data is located.

    /dev/mapper/fedora-home   942G    868G   74G  93% /export

Le 02/05/2018 à 11:49, Hoggins! a écrit :
> Hello list,
>
> I have an issue on my Gluster cluster. It is composed of two data nodes
> and an arbiter for all my volumes.
>
> After having upgraded my bricks to gluster 3.12.9 (Fedora 27), this is
> what I get :
>
>     - on node 1, volumes won't start, and glusterd.log shows a lot of :
>         [2018-05-02 09:46:06.267817] W
> [glusterd-locks.c:843:glusterd_mgmt_v3_unlock]
> (-->/usr/lib64/glusterfs/3.12.9/xlator/mgmt/glusterd.so(+0x22549)
> [0x7f0047ae2549]
> -->/usr/lib64/glusterfs/3.12.9/xlator/mgmt/glusterd.so(+0x2bdf0)
> [0x7f0047aebdf0]
> -->/usr/lib64/glusterfs/3.12.9/xlator/mgmt/glusterd.so(+0xd8371)
> [0x7f0047b98371] ) 0-management: Lock for vol thedude not held
>         The message "W [MSGID: 106118]
> [glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock
> not released for rom" repeated 3 times between [2018-05-02
> 09:45:57.262321] and [2018-05-02 09:46:06.267804]
>         [2018-05-02 09:46:06.267826] W [MSGID: 106118]
> [glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock
> not released for thedude
>
>
>     - on node 2, volumes are up but don't seem to be willing to correctly
> heal. The logs show a lot of :
>         [2018-05-02 09:23:01.054196] I [MSGID: 108026]
> [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-thedude-replicate-0:
> performing entry selfheal on 4dc0ae36-c365-4fc7-b44c-d717392c7bd3
>         [2018-05-02 09:23:01.222596] E [MSGID: 114031]
> [client-rpc-fops.c:233:client3_3_mknod_cbk] 0-thedude-client-2: remote
> operation failed. Path:  [No
> space left on device]
>
>
>     - on arbiter, glustershd.log shows a lot of :
>         [2018-05-02 09:44:54.619476] I [MSGID: 108026]
> [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-web-replicate-0:
> performing entry selfheal on 146a9a84-3db1-42ef-828e-0e4131af3667
>         [2018-05-02 09:44:54.640276] E [MSGID: 114031]
> [client-rpc-fops.c:295:client3_3_mkdir_cbk] 0-web-client-2: remote
> operation failed. Path:  [No
> space left on device]
>         [2018-05-02 09:44:54.657045] I [MSGID: 108026]
> [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-web-replicate-0:
> performing entry selfheal on 9f9122ed-2794-4ed1-91db-be0c7fe89389
>         [2018-05-02 09:47:09.121060] W [MSGID: 101088]
> [common-utils.c:4166:gf_backtrace_save] 0-mailer-replicate-0: Failed to
> save the backtrace.
>
>
> The clients connecting to the cluster experience problems, such as
> Gluster refusing to create files, etc.
>
> I'm lost here, where should I start ?
>
>     Thanks for your help !
>
>         Hoggins!
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users




___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Healing : No space left on device

2018-05-02 Thread Hoggins!
Hello list,

I have an issue on my Gluster cluster. It is composed of two data nodes
and an arbiter for all my volumes.

After having upgraded my bricks to gluster 3.12.9 (Fedora 27), this is
what I get :

    - on node 1, volumes won't start, and glusterd.log shows a lot of :
        [2018-05-02 09:46:06.267817] W
[glusterd-locks.c:843:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.12.9/xlator/mgmt/glusterd.so(+0x22549)
[0x7f0047ae2549]
-->/usr/lib64/glusterfs/3.12.9/xlator/mgmt/glusterd.so(+0x2bdf0)
[0x7f0047aebdf0]
-->/usr/lib64/glusterfs/3.12.9/xlator/mgmt/glusterd.so(+0xd8371)
[0x7f0047b98371] ) 0-management: Lock for vol thedude not held
        The message "W [MSGID: 106118]
[glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock
not released for rom" repeated 3 times between [2018-05-02
09:45:57.262321] and [2018-05-02 09:46:06.267804]
        [2018-05-02 09:46:06.267826] W [MSGID: 106118]
[glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock
not released for thedude


    - on node 2, volumes are up but don't seem to be willing to correctly
heal. The logs show a lot of :
        [2018-05-02 09:23:01.054196] I [MSGID: 108026]
[afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-thedude-replicate-0:
performing entry selfheal on 4dc0ae36-c365-4fc7-b44c-d717392c7bd3
        [2018-05-02 09:23:01.222596] E [MSGID: 114031]
[client-rpc-fops.c:233:client3_3_mknod_cbk] 0-thedude-client-2: remote
operation failed. Path:  [No
space left on device]


    - on arbiter, glustershd.log shows a lot of :
        [2018-05-02 09:44:54.619476] I [MSGID: 108026]
[afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-web-replicate-0:
performing entry selfheal on 146a9a84-3db1-42ef-828e-0e4131af3667
        [2018-05-02 09:44:54.640276] E [MSGID: 114031]
[client-rpc-fops.c:295:client3_3_mkdir_cbk] 0-web-client-2: remote
operation failed. Path:  [No
space left on device]
        [2018-05-02 09:44:54.657045] I [MSGID: 108026]
[afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-web-replicate-0:
performing entry selfheal on 9f9122ed-2794-4ed1-91db-be0c7fe89389
        [2018-05-02 09:47:09.121060] W [MSGID: 101088]
[common-utils.c:4166:gf_backtrace_save] 0-mailer-replicate-0: Failed to
save the backtrace.


The clients connecting to the cluster experience problems, such as
Gluster refusing to create files, etc.

I'm lost here, where should I start ?

    Thanks for your help !

        Hoggins!



___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Turn off replication

2018-05-02 Thread Hari Gowtham
Hi,

Removing data to speed up the rebalance is not something that is recommended.
Rebalance can be stopped, but if started again it will start from the beginning
(it will have to check and skip the files already moved).

Rebalance will take a while; better to let it run. It doesn't have any
downside. Unless you touch the backend, the data on the gluster volume will
remain available for use while the rebalance is running. If you want to
speed things up, the rebalance throttle option can be set to aggressive
(this might increase the cpu and disk usage).
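
For reference, a minimal sketch of setting the throttle (the volume name is a
placeholder; the option is cluster.rebal-throttle on recent 3.x releases):

# allowed values are lazy, normal and aggressive
gluster volume set myvol cluster.rebal-throttle aggressive
# and keep an eye on progress with
gluster volume rebalance myvol status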


On Mon, Apr 30, 2018 at 6:24 PM, Jose Sanchez  wrote:
> Hi All
>
> We were able to get all 4 bricks distributed, and we can see the right
> amount of space, but we have been rebalancing for 4 days for 16TB and are
> still only at 8TB. Is there a way to speed this up? There is also data we
> can remove to speed it up, but what is the best procedure for removing
> data: from the Gluster main export point, or by going onto each brick and
> removing it? We would like to stop rebalancing, delete the data, and
> rebalance again.
>
> Is there a downside to doing this? What happens with Gluster missing data
> when rebalancing?
>
> Thanks
>
> Jose
>
>
>
>
>
>
> -
> Jose Sanchez
> Systems/Network Analyst 1
> Center of Advanced Research Computing
> 1601 Central Ave.
> MSC 01 1190
> Albuquerque, NM 87131-0001
> carc.unm.edu
> 575.636.4232
>
> On Apr 27, 2018, at 4:16 AM, Hari Gowtham  wrote:
>
> Hi Jose,
>
> Why are all the bricks visible in volume info if the pre-validation
> for add-brick failed? I suspect that the remove brick wasn't done
> properly.
>
> You can provide the cmd_history.log to verify this. It would be better to
> get the other log messages as well.
>
> Also, I need to know which bricks were actually removed, the command used,
> and its output.
>
> On Thu, Apr 26, 2018 at 3:47 AM, Jose Sanchez  wrote:
>
> Looking at the logs, it seems that it is trying to add the brick using the
> same port that was assigned to gluster01ib:
>
>
> Any Ideas??
>
> Jose
>
>
>
> [2018-04-25 22:08:55.169302] I [MSGID: 106482]
> [glusterd-brick-ops.c:447:__glusterd_handle_add_brick] 0-management:
> Received add brick req
> [2018-04-25 22:08:55.186037] I [run.c:191:runner_log]
> (-->/usr/lib64/glusterfs/3.8.15/xlator/mgmt/glusterd.so(+0x33045)
> [0x7f5464b9b045]
> -->/usr/lib64/glusterfs/3.8.15/xlator/mgmt/glusterd.so(+0xcbd85)
> [0x7f5464c33d85] -->/lib64/libglusterfs.so.0(runner_log+0x115)
> [0x7f54704cf1e5] ) 0-management: Ran script:
> /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh
> --volname=scratch --version=1 --volume-op=add-brick
> --gd-workdir=/var/lib/glusterd
> [2018-04-25 22:08:55.309534] I [MSGID: 106143]
> [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
> /gdata/brick1/scratch on port 49152
> [2018-04-25 22:08:55.309659] I [MSGID: 106143]
> [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
> /gdata/brick1/scratch.rdma on port 49153
> [2018-04-25 22:08:55.310231] E [MSGID: 106005]
> [glusterd-utils.c:4877:glusterd_brick_start] 0-management: Unable to start
> brick gluster02ib:/gdata/brick1/scratch
> [2018-04-25 22:08:55.310275] E [MSGID: 106074]
> [glusterd-brick-ops.c:2493:glusterd_op_add_brick] 0-glusterd: Unable to add
> bricks
> [2018-04-25 22:08:55.310304] E [MSGID: 106123]
> [glusterd-mgmt.c:294:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit
> failed.
> [2018-04-25 22:08:55.310316] E [MSGID: 106123]
> [glusterd-mgmt.c:1427:glusterd_mgmt_v3_commit] 0-management: Commit failed
> for operation Add brick on local node
> [2018-04-25 22:08:55.310330] E [MSGID: 106123]
> [glusterd-mgmt.c:2018:glusterd_mgmt_v3_initiate_all_phases] 0-management:
> Commit Op Failed
> [2018-04-25 22:09:11.678141] E [MSGID: 106452]
> [glusterd-utils.c:6064:glusterd_new_brick_validate] 0-management: Brick:
> gluster02ib:/gdata/brick1/scratch not available. Brick may be containing or
> be contained by an existing brick
> [2018-04-25 22:09:11.678184] W [MSGID: 106122]
> [glusterd-mgmt.c:188:gd_mgmt_v3_pre_validate_fn] 0-management: ADD-brick
> prevalidation failed.
> [2018-04-25 22:09:11.678200] E [MSGID: 106122]
> [glusterd-mgmt-handler.c:337:glusterd_handle_pre_validate_fn] 0-management:
> Pre Validation failed on operation Add brick
> [root@gluster02 glusterfs]# gluster volume status scratch
> Status of volume: scratch
> Gluster process                            TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick gluster01ib:/gdata/brick1/scratch    49152     49153      Y       1819
> Brick gluster01ib:/gdata/brick2/scratch    49154     49155      Y       1827
> Brick gluster02ib:/gdata/brick1/scratch    N/A       N/A        N       N/A
>
> Task Status of Volume scratch
> ------------------------------------------------------------------------------
> There are no active 

Re: [Gluster-users] Usage monitoring per user

2018-05-02 Thread mohammad kashif
Hi Alex, John

Thanks for confirming my suspicion that there is no getting away from POSIX
tree traversal. I was aware of agedu but not robinhood.

Cheers

Kashif

On Wed, May 2, 2018 at 8:57 AM, JOHE (John Hearns) 
wrote:

> I rather like agedu. It probably does what you want.
>
> But as Mohammad says you do have to traverse your filesystem.
>
>
> https://www.chiark.greenend.org.uk/~sgtatham/agedu/
> agedu: track down wasted disk space - chiark home page
> www.chiark.greenend.org.uk
> agedu, a Unix utility for tracking down wasted disk space. Suppose you're
> running low on disk space. You need to free some up, by finding something
> that's a waste of space and deleting it (or moving it to an archive medium).
>
>
>
>
>
> --
> *From:* gluster-users-boun...@gluster.org  on behalf of Alex Chekholko 
> *Sent:* 01 May 2018 18:45
> *To:* mohammad kashif
> *Cc:* gluster-users
> *Subject:* Re: [Gluster-users] Usage monitoring per user
>
> Hi,
>
> There are several programs that will basically take the outputs of your
> scans and store the results in a database. If you size the database
> appropriately, then querying that database will be much quicker than
> querying the filesystem.  But of course the results will be a little bit
> outdated.
>
> One such project is robinhood. https://github.com/cea-hpc/robinhood/wiki
> 
>
> A simpler way might be to just have daily/weekly cron jobs that output
> text reports, without maintaining a separate database.
>
> But there is no way to avoid doing a recursive POSIX tree traversal, since
> that is how you get your info out of your filesystem.
>
> Regards,
> Alex
>
> On Tue, May 1, 2018 at 5:30 AM, mohammad kashif  wrote:
>
> Hi
>
> Is there any easy way to find usage per user in Gluster? We have 300TB of
> storage with almost 100 million files. Running du takes too much time. Are
> people aware of any other tool which can be used to break up storage per
> user?
>
> Thanks
>
> Kashif
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
> 
>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster rebalance taking many years

2018-05-02 Thread Nithya Balachandran
Hi,

There is not much information in this log file. Which server is this log file
from? I will need to see the rebalance logs from both nodes.

It sounds like there are a lot of files on the volume which is why the
rebalance will take time. What is the current rebalance status for the
volume?

Rebalance should not affect volume operations so is there a particular
reason why the estimated time is a cause of concern?


Regards,
Nithya




On 30 April 2018 at 13:10, shadowsocks飞飞  wrote:

> I cannot count the number of files by normal means.
>
> Through df -i I got that the approximate number of files is 63694442.
>
> [root@CentOS-73-64-minimal ~]# df -i
> Filesystem         Inodes      IUsed     IFree       IUse%  Mounted on
> /dev/md2           131981312   30901030  101080282   24%    /
> devtmpfs           8192893     435       8192458     1%     /dev
> tmpfs              8199799     8029      8191770     1%     /dev/shm
> tmpfs              8199799     1415      8198384     1%     /run
> tmpfs              8199799     16        8199783     1%     /sys/fs/cgroup
> /dev/md3           110067712   29199861  80867851    27%    /home
> /dev/md1           131072      363       130709      1%     /boot
> gluster1:/web      2559860992  63694442  2496166550  3%     /web
> tmpfs              8199799     1         8199798     1%     /run/user/0
>
>
> The rebalance log is in the attachment
>
> the cluster information
>
> gluster volume status web detail
> Status of volume: web
> ------------------------------------------------------------------------------
> Brick: Brick gluster1:/home/export/md3/brick
> TCP Port : 49154
> RDMA Port: 0
> Online   : Y
> Pid  : 16730
> File System  : ext4
> Device   : /dev/md3
> Mount Options: rw,noatime,nodiratime,nobarrier,data=ordered
> Inode Size   : 256
> Disk Space Free  : 239.4GB
> Total Disk Space : 1.6TB
> Inode Count  : 110067712
> Free Inodes  : 80867992
> ------------------------------------------------------------------------------
> Brick: Brick gluster1:/export/md2/brick
> TCP Port : 49155
> RDMA Port: 0
> Online   : Y
> Pid  : 16758
> File System  : ext4
> Device   : /dev/md2
> Mount Options: rw,noatime,nodiratime,nobarrier,data=ordered
> Inode Size   : 256
> Disk Space Free  : 589.4GB
> Total Disk Space : 1.9TB
> Inode Count  : 131981312
> Free Inodes  : 101080484
> ------------------------------------------------------------------------------
> Brick: Brick gluster2:/home/export/md3/brick
> TCP Port : 49152
> RDMA Port: 0
> Online   : Y
> Pid  : 12556
> File System  : xfs
> Device   : /dev/md3
> Mount Options: rw,noatime,nodiratime,attr2,inode64,sunit=1024,swidth=3072,noquota
> Inode Size   : 256
> Disk Space Free  : 10.7TB
> Total Disk Space : 10.8TB
> Inode Count  : 2317811968
> Free Inodes  : 2314218207
>
> Most of the files in the cluster are pictures smaller than 1M
>
>
> 2018-04-30 15:16 GMT+08:00 Nithya Balachandran :
>
>> Hi,
>>
>>
>> This value is an ongoing rough estimate based on the amount of data
>> rebalance has migrated since it started. The values will change as the
>> rebalance progresses.
>> A few questions:
>>
>>1. How many files/dirs do you have on this volume?
>>2. What is the average size of the files?
>>3. What is the total size of the data on the volume?
>>
>>
>> Can you send us the rebalance log?
>>
>>
>> Thanks,
>> Nithya
>>
>> On 30 April 2018 at 10:33, kiwizhang618  wrote:
>>
>>>  I met a big problem: the cluster rebalance takes a long time after
>>> adding a new node.
>>>
>>> gluster volume rebalance web status
>>> Node        Rebalanced-files      size    scanned   failures   skipped        status   run time in h:m:s
>>> ---------   ----------------   -------   --------   --------   -------   -----------   -----------------
>>> localhost                900    43.5MB       2232          0        69   in progress             0:36:49
>>> gluster2                1052    39.3MB       4393          0      1052   in progress             0:36:49
>>> Estimated time left for rebalance to complete : 9919:44:34
>>> volume rebalance: web: success
>>>
>>> the rebalance log
>>> [glusterfsd.c:2511:main] 0-/usr/sbin/glusterfs: Started running
>>>