Re: [Gluster-users] [Gluster-devel] Gluster Volume as object storage with S3 interface
It's difficult to give an ETA at this point; we are still discussing the tech stack on which to implement S3. In the meantime, please let us know which S3 features you use most often.

On 03/08/2017 05:54 PM, Gandalf Corvotempesta wrote:
> 2017-03-08 13:09 GMT+01:00 Saravanakumar Arumugam:
>> We are working on a custom solution which avoids gluster-swift altogether.
>> We will update here once it is ready. Stay tuned.
>
> Any ETA ?

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Cannot remove-brick/migrate data
On 8 March 2017 at 23:34, Jarsulic, Michael [CRI] <mjarsu...@bsd.uchicago.edu> wrote:
> I am having issues with one of my systems that houses two bricks and want
> to bring it down for maintenance. I was able to remove the first brick
> successfully and committed the changes. The second brick is giving me a lot
> of problems with the rebalance when I try to remove it. It seems like it is
> stuck somewhere in that process:
>
> # gluster volume remove-brick hpcscratch cri16fs002-ib:/data/brick4/scratch status
>       Node   Rebalanced-files   size     scanned   failures   skipped   status        run time in secs
>  ---------   ----------------   ------   -------   --------   -------   -----------   ----------------
>  localhost                  0   0Bytes       522          0         0   in progress             915.00
>
> The rebalance logs show the following error message.
>
> [2017-03-08 17:48:19.329934] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr]
>   0-hpcscratch-dht: fixing the layout of /userx/Ethiopian_imputation
> [2017-03-08 17:48:19.329960] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory]
>   0-hpcscratch-dht: subvolume 0 (hpcscratch-client-0): 45778954 chunks
> [2017-03-08 17:48:19.329968] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory]
>   0-hpcscratch-dht: subvolume 1 (hpcscratch-client-1): 45778954 chunks
> [2017-03-08 17:48:19.329974] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory]
>   0-hpcscratch-dht: subvolume 2 (hpcscratch-client-4): 45778954 chunks
> [2017-03-08 17:48:19.329979] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory]
>   0-hpcscratch-dht: subvolume 3 (hpcscratch-client-5): 45778954 chunks
> [2017-03-08 17:48:19.329983] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory]
>   0-hpcscratch-dht: subvolume 4 (hpcscratch-client-7): 45778954 chunks
> [2017-03-08 17:48:19.400394] I [MSGID: 109036] [dht-common.c:7869:dht_log_new_layout_for_dir_selfheal]
>   0-hpcscratch-dht: Setting layout of /userx/Ethiopian_imputation with
>   [Subvol_name: hpcscratch-client-0, Err: -1 , Start: 1052915942 , Stop: 2105831883 , Hash: 1 ],
>   [Subvol_name: hpcscratch-client-1, Err: -1 , Start: 3158747826 , Stop: 4294967295 , Hash: 1 ],
>   [Subvol_name: hpcscratch-client-4, Err: -1 , Start: 0 , Stop: 1052915941 , Hash: 1 ],
>   [Subvol_name: hpcscratch-client-5, Err: -1 , Start: 2105831884 , Stop: 3158747825 , Hash: 1 ],
>   [Subvol_name: hpcscratch-client-7, Err: 22 , Start: 0 , Stop: 0 , Hash: 0 ],
> [2017-03-08 17:48:19.480882] I [dht-rebalance.c:2446:gf_defrag_process_dir]
>   0-hpcscratch-dht: migrate data called on /userx/Ethiopian_imputation

These are not error messages - they are info messages logged when the layout for a directory is being set, and they can be ignored. According to the status output, the remove-brick operation is still in progress. What is it that makes you feel it is stuck? Is there no change in the status output even after a considerable interval?

Regards,
Nithya

> Any suggestions on how I can get this brick out of play and preserve the data?
>
> --
> Mike Jarsulic
> Sr. HPC Administrator
> Center for Research Informatics | University of Chicago
> 773.702.2066

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
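[Editorial note: for reference, the usual remove-brick workflow that this thread assumes can be sketched as follows. The volume and brick names are taken from Mike's mail; this is a sketch of the standard GlusterFS 3.x CLI sequence, to be run on a live cluster.]

```shell
# Start migrating data off the brick
gluster volume remove-brick hpcscratch cri16fs002-ib:/data/brick4/scratch start

# Re-run status periodically until it reports "completed"
gluster volume remove-brick hpcscratch cri16fs002-ib:/data/brick4/scratch status

# Only once status shows "completed", finalize the removal
gluster volume remove-brick hpcscratch cri16fs002-ib:/data/brick4/scratch commit
```

Committing while the migration is still "in progress" can leave unmigrated files stranded on the removed brick, which is why the status check matters here.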
Re: [Gluster-users] one brick vs multiple brick on the same ZFS zpool.
Not necessarily. ZFS does things fairly differently than other filesystems, and can be faster than HW RAID. I'd recommend spending a bit of time reading up - the Linux ZFS-discuss list archives are a great place to start - http://list.zfsonlinux.org/pipermail/zfs-discuss/ .

That said, if you're not using the ZFS feature set and don't want to invest some time in learning about it (it really is rather different from most file systems), I'd recommend *not* using it, because people all too often lose data due to misconfiguration, misunderstanding, or the assumption that it behaves like other filesystems. In that case, XFS is probably a good option for you.

-j

> On Mar 6, 2017, at 3:12 PM, Dung Le wrote:
>
> Hi,
>
> How about hardware RAID with XFS? I assume it would be faster than ZFS RAID
> since it has a physical cache on the RAID controller for reads and writes.
>
> Thanks,
>
>> On Mar 6, 2017, at 3:08 PM, Gandalf Corvotempesta wrote:
>>
>> Hardware RAID with ZFS should be avoided.
>> ZFS needs direct access to the disks, and with hardware RAID you have a
>> controller in the middle.
>>
>> If you need ZFS, skip the hardware RAID and use ZFS RAID.
>>
>> Il 6 mar 2017 9:23 PM, "Dung Le" ha scritto:
>> Hi,
>>
>> Since I am new to Gluster, I need your advice. I have 2 different Gluster
>> configurations. Purpose: create 5 Gluster volumes. I am running gluster
>> version 3.9.0.
>>
>> Config #1: 5 bricks from one zpool
>> • 3 storage nodes.
>> • Use hardware RAID to create one RAID-5 (9+1) array per storage node.
>> • Create a zpool on top of the array per storage node.
>> • Create 5 ZFS shares (each share is a brick) per storage node.
>> • Create 5 volumes with replica of 3, using 5 different bricks.
>>
>> Config #2: 1 brick from one zpool
>> • 3 storage nodes.
>> • Use hardware RAID to create one RAID-5 (9+1) array per storage node.
>> • Create a zpool on top of the array per storage node.
>> • Create 1 ZFS share per storage node, using the share as the brick.
>> • Create 5 volumes with replica of 3 on the same share.
>>
>> 1) Is there any difference in performance between the two configs?
>> 2) Will the single brick handle parallel writes as well as multiple bricks would?
>> 3) Since I am using a hardware RAID controller, are there any options I need
>> to enable or disable for the gluster volumes?
>>
>> Best Regards,
>> ~ Vic Le

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
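[Editorial note: config #1 from Vic's mail can be sketched in commands roughly as follows. The device name /dev/sdb, the pool name "tank", and the volume names are illustrative assumptions, not taken from the thread; the zfs/gluster commands are the standard CLI forms and assume a live 3-node cluster.]

```shell
# On each of the 3 storage nodes: one zpool on top of the HW RAID-5 LUN
zpool create tank /dev/sdb              # /dev/sdb = the RAID virtual disk (assumed name)

# Five datasets per node, each one to be used as a brick
for i in 1 2 3 4 5; do zfs create tank/brick$i; done

# From any one node: five replica-3 volumes, one brick per node each
gluster volume create vol1 replica 3 \
    node1:/tank/brick1 node2:/tank/brick1 node3:/tank/brick1
gluster volume start vol1
# ... repeat for vol2..vol5 using brick2..brick5
```

Note jpl's caveat above still applies: ZFS on top of a hardware RAID controller hides the disks from ZFS, so this layout trades away some of ZFS's integrity features.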
[Gluster-users] Cannot remove-brick/migrate data
I am having issues with one of my systems that houses two bricks and want to bring it down for maintenance. I was able to remove the first brick successfully and committed the changes. The second brick is giving me a lot of problems with the rebalance when I try to remove it. It seems like it is stuck somewhere in that process: # gluster volume remove-brick hpcscratch cri16fs002-ib:/data/brick4/scratch status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost00Bytes 522 0 0 in progress 915.00 The rebalance logs show the following error message. [2017-03-08 17:48:19.329934] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-hpcscratch-dht: fixing the layout of /userx/Ethiopian_imputation [2017-03-08 17:48:19.329960] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 0 (hpcscratch-client-0): 45778954 chunks [2017-03-08 17:48:19.329968] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 1 (hpcscratch-client-1): 45778954 chunks [2017-03-08 17:48:19.329974] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 2 (hpcscratch-client-4): 45778954 chunks [2017-03-08 17:48:19.329979] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 3 (hpcscratch-client-5): 45778954 chunks [2017-03-08 17:48:19.329983] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 4 (hpcscratch-client-7): 45778954 chunks [2017-03-08 17:48:19.400394] I [MSGID: 109036] [dht-common.c:7869:dht_log_new_layout_for_dir_selfheal] 0-hpcscratch-dht: Setting layout of /userx/Ethiopian_imputation with [Subvol_name: hpcscratch-client-0, Err: -1 , Start: 1052915942 , Stop: 2105831883 , Hash: 1 ], [Subvol_name: hpcscratch-client-1, Err: -1 , Start: 3158747826 , Stop: 4294967295 , Hash: 1 ], [Subvol_name: hpcscratch-client-4, Err: -1 , 
Start: 0 , Stop: 1052915941 , Hash: 1 ], [Subvol_name: hpcscratch-client-5, Err: -1 , Start: 2105831884 , Stop: 3158747825 , Hash: 1 ], [Subvol_name: hpcscratch-client-7, Err: 22 , Start: 0 , Stop: 0 , Hash: 0 ], [2017-03-08 17:48:19.480882] I [dht-rebalance.c:2446:gf_defrag_process_dir] 0-hpcscratch-dht: migrate data called on /userx/Ethiopian_imputation Any suggestions on how I can get this brick out of play and preserve the data? -- Mike Jarsulic Sr. HPC Administrator Center for Research Informatics | University of Chicago 773.702.2066 ___ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] RE : Frequent connect and disconnect messages flooded in logs
Just to let you know: I have reverted back to glusterfs 3.4.2 and everything is working again. No more disconnects, no more errors in the kernel log. So there *has* to be some kind of regression in the newer versions. Sadly, I guess, it will be hard to find.

2016-12-20 13:31 GMT+01:00 Micha Ober:
> Hi Rafi,
>
> here are the log files:
>
> NFS: http://paste.ubuntu.com/23658653/
> Brick: http://paste.ubuntu.com/23658656/
>
> The brick log is of the brick which caused the last disconnect
> at 2016-12-20 06:46:36 (0-gv0-client-7).
>
> For completeness, here is also the dmesg output: http://paste.ubuntu.com/23658691/
>
> Regards,
> Micha
>
> 2016-12-19 7:28 GMT+01:00 Mohammed Rafi K C:
>> Hi Micha,
>>
>> Sorry for the late reply. I was busy with some other things.
>>
>> If you still have the setup available, can you enable the TRACE log level
>> [1],[2] and see if you can find any log entries when the network starts
>> disconnecting? Basically I'm trying to find out whether any disconnection
>> occurred other than the ping-timer-expired issue.
>>
>> [1]: gluster volume set <volname> diagnostics.brick-log-level TRACE
>> [2]: gluster volume set <volname> diagnostics.client-log-level TRACE
>>
>> Regards
>> Rafi KC
>>
>> On 12/08/2016 07:59 PM, Atin Mukherjee wrote:
>> On Thu, Dec 8, 2016 at 4:37 PM, Micha Ober wrote:
>>> Hi Rafi,
>>>
>>> thank you for your support. It is greatly appreciated.
>>>
>>> Just some more thoughts from my side:
>>>
>>> There have been no reports from other users in *this* thread until now,
>>> but I have found at least one user with a very similar problem in an older
>>> thread:
>>>
>>> https://www.gluster.org/pipermail/gluster-users/2014-November/019637.html
>>>
>>> He is also reporting disconnects with no apparent reason, although his
>>> setup is a bit more complicated, also involving a firewall. In our setup,
>>> all servers/clients are connected via 1 GbE with no firewall or anything
>>> that might block/throttle traffic. Also, we are using exactly the same
>>> software versions on all nodes.
>>>
>>> I can also find some reports in the bug tracker when searching for
>>> "rpc_client_ping_timer_expired" and "rpc_clnt_ping_timer_expired"
>>> (it looks like the spelling changed between versions).
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1096729
>>
>> Just FYI, this is a different issue; here GlusterD fails to handle the
>> volume of incoming requests in time since MT-epoll is not enabled.
>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1370683
>>>
>>> But both reports involve heavy traffic/load on the bricks/disks, which
>>> is not the case for our setup.
>>> To give a ballpark figure: over three days, 30 GiB were written. And the
>>> data was not written at once, but continuously over the whole time.
>>>
>>> Just to be sure, I have checked the logfiles of one of the other
>>> clusters right now, which are sitting in the same building, in the same
>>> rack, even on the same switch, running the same jobs, but with glusterfs
>>> 3.4.2, and I can see no disconnects in the logfiles. So I can definitely
>>> rule out our infrastructure as the problem.
>>>
>>> Regards,
>>> Micha
>>>
>>> Am 07.12.2016 um 18:08 schrieb Mohammed Rafi K C:
>>> Hi Micha,
>>>
>>> This is great. I will provide you a debug build which has two fixes
>>> that I suspect for the frequent disconnect issue, though I don't
>>> have much data to validate my theory. So I will take one more day to dig
>>> into that.
>>>
>>> Thanks for your support, and opensource++
>>>
>>> Regards
>>> Rafi KC
>>>
>>> On 12/07/2016 05:02 AM, Micha Ober wrote:
>>> Hi,
>>>
>>> thank you for your answer and even more for the question!
>>> Until now, I was using FUSE. Today I changed all mounts to NFS using the
>>> same 3.7.17 version.
>>>
>>> But: the problem is still the same. Now, the NFS logfile contains lines
>>> like these:
>>>
>>> [2016-12-06 15:12:29.006325] C [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired]
>>>   0-gv0-client-7: server X.X.18.62:49153 has not responded in the last 42
>>>   seconds, disconnecting.
>>>
>>> Interestingly enough, the IP address X.X.18.62 is the same machine! As
>>> I wrote earlier, each node serves both as a server and a client, as each
>>> node contributes bricks to the volume. Every server is connecting to itself
>>> via its hostname. For example, the fstab on the node "giant2" looks like:
>>>
>>> #giant2:/gv0  /shared_data   glusterfs  defaults,noauto  0 0
>>> #giant2:/gv2  /shared_slurm  glusterfs  defaults,noauto  0 0
>>>
>>> giant2:/gv0   /shared_data   nfs  defaults,_netdev,vers=3  0 0
>>> giant2:/gv2   /shared_slurm  nfs  defaults,_netdev,vers=3  0 0
>>>
>>> So I understand the disconnects even less.
>>>
>>> I don't know if it's possible to create a dummy cluster which exposes
>>> the same behaviour, because the
Re: [Gluster-users] Increase or performance tune READ perf for glusterfs distributed volume
Hi Karan,

>> Are you reading a small file data-set or large files data-set and secondly,
>> volume is mounted using which protocol?

I am using a 1 MB block size to test, using the RDMA transport.

--
Deepak

> On Mar 8, 2017, at 2:48 AM, Karan Sandha wrote:
>
> Are you reading a small file data-set or large files data-set and secondly,
> volume is mounted using which protocol?

---
This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.
---

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Deleting huge file from glusterfs hangs the cluster for a while
Thanks for your feedback. May I know what the shard-block-size was?

One way to fix this would be to make the shard translator delete only the base file (the 0th shard) in the I/O path and move the deletion of the rest of the shards to the background. I'll work on this.

-Krutika

On Fri, Mar 3, 2017 at 10:35 PM, GEORGI MIRCHEV wrote:
> Hi,
>
> I have deleted two large files (around 1 TB each) via the gluster client (mounted
> on the /mnt folder). I used a simple rm command, e.g. "rm /mnt/hugefile". This
> resulted in a hang of the cluster (no I/O could be done, the VMs hanged). After a
> few minutes my ssh connection to the gluster node got disconnected - I had to
> reconnect, which was very strange, probably some kind of timeout. Nothing in
> dmesg, so it's probably ssh that terminated the connection.
>
> After that the cluster works and everything seems fine: the file is gone on the
> client, but the space is not reclaimed.
>
> The deleted file is also gone from the bricks, but the shards are still there and
> use up all the space.
>
> I need to reclaim the space. How do I delete the shards / other metadata for a
> file that no longer exists?
>
> Versions:
> glusterfs-server-3.8.9-1.el7.x86_64
> glusterfs-client-xlators-3.8.9-1.el7.x86_64
> glusterfs-geo-replication-3.8.9-1.el7.x86_64
> glusterfs-3.8.9-1.el7.x86_64
> glusterfs-fuse-3.8.9-1.el7.x86_64
> vdsm-gluster-4.19.4-1.el7.centos.noarch
> glusterfs-cli-3.8.9-1.el7.x86_64
> glusterfs-libs-3.8.9-1.el7.x86_64
> glusterfs-api-3.8.9-1.el7.x86_64
>
> --
> Georgi Mirchev

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
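[Editorial note: some background on why the shards linger. With sharding enabled, GlusterFS stores the first block as the base file and the remaining blocks on the bricks under a hidden `.shard` directory, named `<gfid-of-base-file>.<N>`. The sketch below shows how the shard names and count are derived; the GFID, file size, and 512 MB shard-block-size are made-up illustration values, not taken from Georgi's setup.]

```shell
# Hypothetical example values -- substitute your own
GFID="6de0e525-6a25-4f28-9b7d-7b1e9bf0a4c2"   # from: getfattr -n trusted.gfid <file> (while it still exists)
FILE_MB=$((1024 * 1024))                      # a 1 TB file, expressed in MB
BLOCK_MB=512                                  # assumed features.shard-block-size of 512MB

# Total number of blocks, rounding up
BLOCKS=$(( (FILE_MB + BLOCK_MB - 1) / BLOCK_MB ))

echo "block 0 lives in the base file itself"
echo "shards on each brick: .shard/${GFID}.1 .. .shard/${GFID}.$((BLOCKS - 1))"
```

This is why capturing the base file's GFID before deletion makes leftover shards easy to find on the bricks afterwards; without it, matching orphaned shards to a deleted file is much harder.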
Re: [Gluster-users] [Gluster-devel] Gluster Volume as object storage with S3 interface
2017-03-08 13:09 GMT+01:00 Saravanakumar Arumugam:
> We are working on a custom solution which avoids gluster-swift altogether.
> We will update here once it is ready. Stay tuned.

Any ETA ?

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] [Gluster-devel] Gluster Volume as object storage with S3 interface
On 03/08/2017 04:55 PM, Gandalf Corvotempesta wrote:
> I'm really interested in this.

Cool.

> Let me know if I understood properly: is it now possible to access a
> Gluster volume as object storage via the S3 API?

Yes, it is possible. Authentication is currently turned off; you can expect updates on authentication soon.

> Is gluster-swift (and with that, the rings, auth and so on coming from
> OpenStack) still needed?

You are right: gluster-swift is still needed, but it is part of the Docker container. All gluster-swift processes run inside the Docker container in order to provide the object interface, and the container accesses the Gluster volume to get/put objects.

We are working on a custom solution which avoids gluster-swift altogether. We will update here once it is ready. Stay tuned.

> 2017-03-08 9:53 GMT+01:00 Saravanakumar Arumugam:
>> Hi,
>> I have posted a blog about accessing a Gluster volume via an S3 interface. [1]

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] [Gluster-devel] Gluster Volume as object storage with S3 interface
I'm really interested in this.

Let me know if I understood properly: is it now possible to access a Gluster volume as object storage via the S3 API?

Is gluster-swift (and with that, the rings, auth and so on coming from OpenStack) still needed?

2017-03-08 9:53 GMT+01:00 Saravanakumar Arumugam:
> Hi,
>
> I have posted a blog about accessing a Gluster volume via an S3 interface. [1]
>
> Here, the Gluster volume is exposed as object storage.
>
> Object storage functionality is implemented with changes to Swift storage, and
> the swift3 plugin is used to expose the S3 interface. [4]
>
> gluster-object is available on Docker Hub [2] and the corresponding GitHub
> link is [3].
>
> You can expect further updates on this, to provide object storage on the
> Kubernetes/OpenShift platform.
>
> Thanks to Prashanth Pai for all his help.
>
> [1] https://uyirpodiru.blogspot.in/2017/03/building-gluster-object-in-docker.html
> [2] https://hub.docker.com/r/gluster/gluster-object/
> [3] https://github.com/SaravanaStorageNetwork/docker-gluster-s3
> [4] https://github.com/gluster/gluster-swift
>
> Thanks,
> Saravana

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Increase or performance tune READ perf for glusterfs distributed volume
2017-03-08 11:48 GMT+01:00 Karan Sandha:
> Hi Deepak,
>
> Are you reading a small file data-set or large files data-set and secondly,
> volume is mounted using which protocol?
>
> For small-file data-sets:
>
> gluster volume set vol-name cluster.lookup-optimize on (default=off)
> gluster volume set vol-name server.event-threads 4 (default=2)
> gluster volume set vol-name client.event-threads 4 (default=2)
>
> Then do a rebalance on the volume and check the performance again; we
> generally see a performance bump when these parameters are turned on.

Are these settings shown somewhere in the official docs?

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Increase or performance tune READ perf for glusterfs distributed volume
Hi Deepak,

Are you reading a small file data-set or a large files data-set and, secondly, which protocol is the volume mounted with?

For small-file data-sets:

gluster volume set vol-name cluster.lookup-optimize on (default=off)
gluster volume set vol-name server.event-threads 4 (default=2)
gluster volume set vol-name client.event-threads 4 (default=2)

Then do a rebalance on the volume and check the performance again; we generally see a performance bump when these parameters are turned on.

Thanks & regards
Karan Sandha

On 03/08/2017 02:21 AM, Deepak Naidu wrote:
> Are there any tuning params for READ that I need to set to get maximum
> read throughput on a glusterfs distributed volume? Currently, I am trying to
> compare this with my local SSD disk performance.
>
> · My local SSD (/dev/sdb) can random-read 6.3 TB in 56 minutes on an XFS filesystem.
> · I have a 2-node distributed glusterfs volume. When I read the same workload,
>   it takes around 63 minutes.
> · The network is IPoIB using RDMA. The Infiniband network is 1x 100 Gb/sec (4X EDR).
>
> Any suggestion is appreciated.
>
> --
> Deepak

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
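[Editorial note: Karan's small-file settings, put together as one sketch. The volume name vol-name is a placeholder; the rebalance commands are the standard CLI forms and assume a live cluster.]

```shell
gluster volume set vol-name cluster.lookup-optimize on   # default: off
gluster volume set vol-name server.event-threads 4       # default: 2
gluster volume set vol-name client.event-threads 4       # default: 2

# Then rebalance and re-run the benchmark
gluster volume rebalance vol-name start
gluster volume rebalance vol-name status
```

lookup-optimize only helps files created after a rebalance has fixed the layout, which is why the rebalance step follows the volume-set commands here.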
[Gluster-users] Gluster Volume as object storage with S3 interface
Hi,

I have posted a blog about accessing a Gluster volume via an S3 interface. [1]

Here, the Gluster volume is exposed as object storage.

Object storage functionality is implemented with changes to Swift storage, and the swift3 plugin is used to expose the S3 interface. [4]

gluster-object is available on Docker Hub [2] and the corresponding GitHub link is [3].

You can expect further updates on this, to provide object storage on the Kubernetes/OpenShift platform.

Thanks to Prashanth Pai for all his help.

[1] https://uyirpodiru.blogspot.in/2017/03/building-gluster-object-in-docker.html
[2] https://hub.docker.com/r/gluster/gluster-object/
[3] https://github.com/SaravanaStorageNetwork/docker-gluster-s3
[4] https://github.com/gluster/gluster-swift

Thanks,
Saravana

___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
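[Editorial note: to illustrate what the S3 interface means in practice, access would look something like the sketch below. The endpoint address/port and bucket name are assumptions for illustration, not taken from the blog post; since authentication is reported as turned off in this thread, no real credentials are configured.]

```shell
# Create a bucket on the S3-compatible endpoint exposed by the gluster-object container
s3cmd --host=127.0.0.1:8080 --no-ssl \
      --host-bucket='127.0.0.1:8080/%(bucket)' \
      mb s3://testbucket

# Upload and retrieve an object
echo "hello gluster" > hello.txt
s3cmd --host=127.0.0.1:8080 --no-ssl \
      --host-bucket='127.0.0.1:8080/%(bucket)' \
      put hello.txt s3://testbucket/hello.txt
s3cmd --host=127.0.0.1:8080 --no-ssl \
      --host-bucket='127.0.0.1:8080/%(bucket)' \
      get s3://testbucket/hello.txt hello-copy.txt
```

Any S3-compatible client (s3cmd, aws-cli with a custom endpoint, SDKs) should work the same way against such an endpoint; the objects land as files on the underlying Gluster volume.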