Re: [Gluster-users] timestamps getting updated during self-heal after primary brick rebuild

2013-03-05 Thread Todd Stansell
In the interest of pinging the community for *any* sort of feedback, I'd like
to note that we rebuilt things on CentOS 6 with btrfs as the filesystem, to try
something entirely different.  We see the same behavior.  After rebuilding the
first brick in the 2-brick replicate cluster, all file timestamps get updated
to the time self-heal copies the data back to that brick.

This is obviously a bug in 3.3.1.  We basically did what's described here:

  
http://gluster.org/community/documentation/index.php/Gluster_3.2:_Brick_Restoration_-_Replace_Crashed_Server

and timestamps get updated on all files.  Can someone acknowledge that this
sounds like a bug?  Does anyone care?
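For reference, the identity-pinning step from that brick-restoration procedure can be sketched roughly like this. This is a sketch under the assumption of GlusterFS 3.3 default paths; the UUID, hostnames, and volume name are placeholders, not values from this thread:

```shell
#!/bin/sh
# Sketch of the "replace crashed server" procedure from the wiki page above,
# assuming the 3.3 default state directory layout. All names are placeholders.

# Write the dead server's UUID into glusterd.info so the rebuilt node keeps
# its old identity. Parameterized on the state directory for testability;
# on a real server this would be /var/lib/glusterd.
pin_glusterd_uuid() {
    statedir="$1"
    uuid="$2"
    mkdir -p "$statedir"
    printf 'UUID=%s\n' "$uuid" > "$statedir/glusterd.info"
}

# On an actual rebuild (not run here) the remaining steps would be roughly:
#   pin_glusterd_uuid /var/lib/glusterd <old-uuid>
#   service glusterd restart
#   gluster peer probe <surviving-server>
#   gluster volume heal <volname> full
```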

Being relatively new to glusterfs, it's painful to watch the mailing list and
even the IRC channel and see many folks ask questions with nothing but
silence.  I honestly wasn't sure if glusterfs was actively being supported
anymore.  Given the recent flurry of mail about lack of documentation I see
that's not really true.  Unfortunately, given that what I'm seeing is a form
of data corruption (yes, timestamps do matter), I'm surprised nobody's
interested in helping figure out what's going wrong.  Hopefully it's something
about the way I've built our cluster (though that seems less and less likely
given we are able to reproduce the problem so easily).
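For anyone trying to reproduce this, one quick way to check for the symptom is to compare a file's mtime as seen on each brick. A minimal sketch, assuming GNU coreutils `stat`; the brick paths and hostnames in the comments are illustrative:

```shell
#!/bin/sh
# Compare a file's mtime on two replica copies. If self-heal has rewritten
# timestamps on one brick, the two values will diverge from the original.

mtime_of() {
    stat -c '%Y' "$1"   # epoch mtime (GNU stat)
}

same_mtime() {
    [ "$(mtime_of "$1")" = "$(mtime_of "$2")" ]
}

# Against a real cluster you might run something like (not executed here):
#   ssh host13 stat -c '%y %n' /gfs0/data/bin/sync-data
#   ssh host14 stat -c '%y %n' /gfs0/data/bin/sync-data
```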

Todd

On Thu, Feb 28, 2013 at 11:23:34PM +, Todd Stansell wrote:
> We're looking at using glusterfs to provide a shared filesystem between two
> nodes, using just local disk.  They are both gluster servers as well as
> clients.  This is on CentOS 5.9 64-bit.  The bricks are simply ext3
> filesystems on top of LVM:
> 
> /dev/mapper/VolGroup00-LogVol0 on /gfs0 type ext3 (rw,user_xattr)
> 
> We set up a test volume with:
> 
> host14# gluster volume create gv0 replica 2 transport tcp host14:/gfs0 host13:/gfs0
> host14# gluster volume set gv0 nfs.disable on
> host14# gluster volume start gv0
> 
> This works just fine.  The issue is simulating hardware failure where we need
> to rebuild an entire node.  In this case, we kickstart our server which creates
> all fresh new filesystems.  We have a kickstart postinstall script that sets
> the glusterd UUID of the server so that it never changes.  It then does a probe
> of the other server, looks for existing volumes, sets up fstab entries for them
> (to also act as a client) and also sets up an init script to force a full heal
> every time the server boots just to ensure all data is replicated to both
> nodes.  All of this works great when I'm rebuilding the second brick.
> 
> The issue I have is when we rebuild the server that hosts the primary brick
> (host14:/gfs0).  It will come online and start copying data from host13:/gfs0,
> but as it does so, it sets the timestamps of the files on host13:/gfs0 to the
> time it healed the data on host14:/gfs0.  As a result, all files in the
> filesystem end up with timestamps of when the first brick was healed.
> 
> I enabled client debug logs and the following indicates that it *thinks* it is
> doing the right thing:
> 
> after rebuilding gv0-client-1:
> [2013-02-28 00:01:37.264018] D [afr-self-heal-metadata.c:329:afr_sh_metadata_sync] 0-gv0-replicate-0: self-healing metadata of /data/bin/sync-data from gv0-client-0 to gv0-client-1
> 
> after rebuilding gv0-client-0:
> [2013-02-28 00:17:03.578377] D [afr-self-heal-metadata.c:329:afr_sh_metadata_sync] 0-gv0-replicate-0: self-healing metadata of /data/bin/sync-data from gv0-client-1 to gv0-client-0
> 
> Unfortunately, in the second case, the timestamp of the files changed from:
> 
> -r-xr-xr-x 1 root root 2717 Feb 27 23:32 /data/data/bin/sync-data*
> 
>  to:
> 
> -r-xr-xr-x 1 root root 2717 Feb 28 00:17 /data/data/bin/sync-data*
> 
> And remember, there's nothing accessing any data in this volume so there's no
> "client" access going on anywhere.  No changes happening on the filesystem,
> other than self-heal screwing things up.
> 
> The only thing I could find in any logs that would indicate a problem was this
> in the brick log:
> 
> [2013-02-28 00:17:03.583063] D [posix.c:323:posix_do_utimes] 0-gv0-posix: /gfs0/data/bin/sync-data (Function not implemented)
> 
> I've also now built a CentOS 6 host and verified that the same behavior
> happens there, though I get a slightly different brick debug log (which makes
> me think this has nothing to do with what I'm seeing):
> 
> [2013-02-28 23:07:41.879440] D [posix.c:262:posix_do_chmod] 0-gv0-posix: /gfs0/data/bin/sync-data (Function not implemented)
> 
> Here's some basic info that might help folks see what's going on:
> 
> # rpm -qa | grep gluster
> glusterfs-server-3.3.1-1.el5
> glusterfs-3.3.1-1.el5
> glusterfs-fuse-3.3.1-1.el5
> 
> # gluster volume info
> 
> Volume Name: gv0
> Type: Replicate
> Volume ID: 7cec2ba3-f69c-409a-a259-0d055792b11a
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: host14:/g

[Gluster-users] What will 3.5 look like?

2013-03-05 Thread John Mark Walker
I realize that we haven't released 3.4 yet, but we're quickly approaching a new 
release cycle that will ultimately produce GlusterFS 3.5.

The Gluster Dev Summit starts tomorrow, and one of the things we'll discuss is 
the new roadmap. What we should all do is take a look at what is on the table, 
discuss which should take priority, and vote on your favorites. 

See the very early list of proposed features here:
http://www.gluster.org/community/documentation/index.php/Planning35

If you want to submit new features, please use this template:
http://www.gluster.org/community/documentation/index.php/Features/Feature_Template

- save as a new wiki page and then link to it from the planning page, above


To see what we did for 3.4, see these pages:
http://www.gluster.org/community/documentation/index.php/Features
http://www.gluster.org/community/documentation/index.php/Features34
http://www.gluster.org/community/documentation/index.php/Planning34

I've added space on the Planning35 page for new feature submissions and 
discussions on proposed features. 

If one of the features you'd like to propose is not for the software itself, 
but rather for infrastructure pieces, i.e. the website, documentation, dev 
process, etc., those are also welcome. 

We will look to stream and record the dev summit sessions, so look for links 
here and on the blog soon!

-John Mark
Gluster Community Lead
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread John Mark Walker

- Original Message -
> Is there anybody who reviews modifications to catch obvious mistakes or
> errors?
> 

There are a few of us who review and edit the wiki on a regular basis. 

A good way to make sure people see it is to 1) link to it from a prominent page 
on the wiki and 2) send a note out to this list asking for review and comment.

You can also post drafts on your user page and ask for reviews there before 
posting it "live" on the wiki.

-JM


Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread John Mark Walker
Hey guys - thanks for chiming in. As many of you have noted, we do need better 
documentation. There are a couple of things to note here. For one, we did 
create new docs that are 3.3-specific around most issues related to getting a 
cluster up and running - 
http://www.gluster.org/community/documentation/index.php/QuickStart

As for the admin guide, I agree that having it in PDF (or HTML tarball) is not 
ideal, but well, there it is - 
http://www.gluster.org/wp-content/uploads/2012/05/Gluster_File_System-3.3.0-Administration_Guide-en-US.pdf
 

I also agree that it doesn't cover nearly enough of the issues that come up 
with many use cases, and we should definitely address that.

There is one other resource that no one has mentioned here yet, and that is the 
documentation listed under Red Hat Storage: 
https://access.redhat.com/knowledge/docs/Red_Hat_Storage/

We should flesh out docs on gluster.org more, and I'm hopeful of getting a new 
employee to do precisely that in the next couple of weeks. More on that soon, 
as in this month. 

And finally, there was a comment, I believe from jjulian, that we didn't 
backport enough patches to the current release cycle, and I think that's 
correct. This is one of the issues we plan to address this week at the dev 
summit tomorrow and Friday.

For 3.4, the goal is to have point releases at regular intervals with patches 
backported as we move forward. Obviously, we didn't do nearly enough of this 
for the 3.3 release cycle, and we're going to address that for this next 
release cycle.

So yes, it's fair to say that our docs are lacking, and we should take steps to 
address that. The thing is, whatever documentation strategy we adopt, we will 
need your help to really execute against a plan. If you're up for this, let's 
start a docs project on gluster.org and try to address the largest gaps.

Thanks!
John Mark
Gluster Community Lead




Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Joe Julian

On 03/05/2013 05:12 PM, Toby Corkindale wrote:

On 06/03/13 03:33, Joe Julian wrote:

It comes up on this list from time to time that there's not sufficient
documentation on troubleshooting. I assume that's what some people mean
when they refer to disappointing documentation as the current
documentation is far more detailed and useful than it was 3 years ago
when I got started. I'm not really sure what's being asked for here, nor
am I sure how one would document how to troubleshoot. In my mind, if
there's a trouble that can be documented with a clear path to
resolution, then a bug report should be filed and that should be fixed.
Any other cases that cannot be coded for require human intervention and
are already documented.

Please tell me your thoughts.


So I've asked things like this and received no help:
http://www.gluster.org/pipermail/gluster-users/2012-June/033398.html

I've found answers relating to old versions of glusterfs and asked if 
they still apply, to no response:

http://www.gluster.org/pipermail/gluster-users/2012-June/033478.html


Not intending to excuse the lack of response, but I know I was on 
vacation at that time. June 7th was a Friday and I've noticed that 
Fridays in general are sub-optimal for finding support on things.


I'll also freely admit that I despise doing support over email. That 
question was open-ended and actually requires a very long response, of 
which I suppose everyone felt overwhelmed and wanted to get home and 
have a beer. I know for me, one problem I have with email support is 
that once my train of thought has derailed, I'm loath to get it on track 
again.


I don't suppose you tried the IRC channel? There's usually someone there 
that is willing to walk people through just about anything.




I've seen other people ask about split-brain issues to no response:
http://thr3ads.net/gluster-users/2012/06/1962160-managing-split-brain-in-3.3 



There are several responses there.




I've tried to figure things out about the self-heal daemon, and found 
no documentation (but at least received some responses on list):
http://thr3ads.net/gluster-users/2012/05/1919111-3.3-beta3-When-should-the-self-heal-daemon-be-triggered 



Definitely hard to get support on beta or qa releases. Probably because 
everyone's still in the process of testing. When anything is that young, 
there's a very small knowledge pool. Normally there are a lot of bugs 
found during the qa and beta releases and the people best suited to 
answering those questions are busy squashing them.


Does anybody have any thoughts on how that problem could be mitigated?



I found that the official documentation for the 3.2->3.3 upgrade path 
was in fact erroneous and did not work:

http://www.mail-archive.com/gluster-users@gluster.org/msg10890.html
(This turned out to be because the blog the wiki had copied the 
commands from had turned hyphens into em dashes)


Yes, I remember that discovery. It wasn't like that when it was 
originally posted but there was a drastic circumstance that required 
exporting and importing. It should have been on the wiki though. Bad 
Vijay... ;)




I've gone looking for info on the options available to simply mounting 
a glusterfs volume, and would you believe the latest version available 
with docs on the website is 3.2, not 3.3? (At least as far as a google 
search is concerned):

https://www.google.com.au/search?q=glusterfs+mount+volume


I don't believe the mount options changed from 3.2. I was aware that 
someone made the decision that the 3.2 documentation was "still current" 
for 3.3 but I don't recall who said that. (JMW?)


That goes back to what I was saying about converting from Publican. It's 
far too cumbersome to contribute to.




So, yeah, from my point of view GlusterFS's documentation fails at 
covering (a) simple, day-to-day actions, (b) upgrade paths, (c) 
handling failures.




I'll disagree on the "simple day-to-day" actions. Healing split-brain is 
neither simple, nor should it be day-to-day. The simple actions are 
covered. And, frankly, the only day-to-day operations that are commonly 
done are scouring logs and monitoring systems. The rest of the simple 
operations generally happen rarely but for fixed conditions.




I note there is also this site:
http://community.gluster.org/t/glusterfs/

It's full of people asking questions, and almost completely empty of 
people receiving useful replies. I know it's a *community* driven 
thing, but still.. it doesn't keep a good impression of community 
support.




99% of the "questions" are statements, and the quality of the questions 
is so bad that I got frustrated and stopped looking at them. Top 
that with people posing additional questions as answers, and my OCD 
tendencies finalized that decision. Yes, it's probably irrational that 
it makes me so angry that people don't even know the difference between 
a question and an answer, but that's my problem. :/




The mailing list is often a source of good

Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Toby Corkindale

On 06/03/13 03:33, Joe Julian wrote:

It comes up on this list from time to time that there's not sufficient
documentation on troubleshooting. I assume that's what some people mean
when they refer to disappointing documentation as the current
documentation is far more detailed and useful than it was 3 years ago
when I got started. I'm not really sure what's being asked for here, nor
am I sure how one would document how to troubleshoot. In my mind, if
there's a trouble that can be documented with a clear path to
resolution, then a bug report should be filed and that should be fixed.
Any other cases that cannot be coded for require human intervention and
are already documented.

Please tell me your thoughts.


So I've asked things like this and received no help:
http://www.gluster.org/pipermail/gluster-users/2012-June/033398.html

I've found answers relating to old versions of glusterfs and asked if 
they still apply, to no response:

http://www.gluster.org/pipermail/gluster-users/2012-June/033478.html

I've seen other people ask about split-brain issues to no response:
http://thr3ads.net/gluster-users/2012/06/1962160-managing-split-brain-in-3.3


I've tried to figure things out about the self-heal daemon, and found no 
documentation (but at least received some responses on list):

http://thr3ads.net/gluster-users/2012/05/1919111-3.3-beta3-When-should-the-self-heal-daemon-be-triggered


I found that the official documentation for the 3.2->3.3 upgrade path 
was in fact erroneous and did not work:

http://www.mail-archive.com/gluster-users@gluster.org/msg10890.html
(This turned out to be because the blog the wiki had copied the commands 
from had turned hyphens into em dashes)



I've gone looking for info on the options available to simply mounting a 
glusterfs volume, and would you believe the latest version available 
with docs on the website is 3.2, not 3.3? (At least as far as a google 
search is concerned):

https://www.google.com.au/search?q=glusterfs+mount+volume
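For what it's worth, a typical fstab entry for a native-client mount of a volume like the ones discussed in this thread looks roughly like this. This is a sketch: the hostnames and volume name are placeholders, and the backupvolfile-server option is a mount.glusterfs option that may vary by version:

```
# /etc/fstab -- native GlusterFS (FUSE) client mount; names are placeholders
server1:/gv0  /mnt/gv0  glusterfs  defaults,_netdev,backupvolfile-server=server2  0 0
```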



So, yeah, from my point of view GlusterFS's documentation fails at 
covering (a) simple, day-to-day actions, (b) upgrade paths, (c) handling 
failures.



I note there is also this site:
http://community.gluster.org/t/glusterfs/

It's full of people asking questions, and almost completely empty of 
people receiving useful replies. I know it's a *community* driven thing, 
but still.. it doesn't keep a good impression of community support.




The mailing list is often a source of good, useful information, and I 
appreciate everyone's help. Thanks for continuing to provide that support!
It would just be great if some of the collective's knowledge was 
available online in an easily-searched manner, and kept up to date.


Cheers,
Toby



Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Jeff Darcy
On 03/05/2013 04:17 PM, Joe Julian wrote:
> On 03/05/2013 09:33 AM, Shawn Nock wrote:
>> However, since the switch to the new release cycle, bugs don't seem to
>> get fixed (within a release)
> Starting to frustrate me as well. There's too many new features in 3.4 
> for me to feel comfortable making the switch and not enough bugs being 
> backported.

Actually (as the person maintaining the list of fixes which are being
backported to 3.4) it seems like the list is quite extensive.

http://www.gluster.org/glusterfs-3-4-planning/
https://bugzilla.redhat.com/showdependencytree.cgi?id=895528

If anybody has suggestions for other fixes which they'd like to see
backported to 3.4, please *please* let me know.  Direct email will get
my attention fastest, list email will probably work too, comments on
bugs will probably only reach me if they're bugs that I'm already
subscribed to.  I know Joe has already done this for a few bugs.
As 3.4 is about to move from alpha to beta, some such requests might not
be fulfilled due to the inherent risk of a complicated fix merely replacing
one bug with a new one, but a fix is even less likely to make it into
3.4 if it never gets mentioned, or if it's only requested after 3.4 exits
beta.  We'll keep
trying to improve this communication in future releases.

BTW, I've greatly enjoyed reading this conversation (and not only
because Shawn said nice things about my articles on hekafs.org *blush*).
 Now is a fantastic time to be bringing this stuff up, because we're all
here in Bangalore right now with the dev summit coming up tomorrow and
Friday.  That means we can get small groups of developers together and
actually do something about some of the ideas that have been discussed.
 I've already made a plan to spend some time on Friday implementing the
idea of providing descriptions for some of the most common or important
error codes.  Some day I'd like to embed this knowledge into a software
agent which watches the logs for the "signatures" of known issues and
boils that information down into concrete and actionable suggestions for
the administrator, but that's more of a long-term thing.  Right now just
having some explanatory text around some of the log messages would be a
help.

Please, keep the ideas coming.  This kind of conversation is extremely
helpful.


Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Papp Tamas

On 03/06/2013 12:38 AM, Joe Julian wrote:


On 03/05/2013 03:02 PM, Papp Tamas wrote:

On 03/05/2013 11:50 PM, Joe Julian wrote:


I know I'm probably starting to sound like a broken record, repeating the same 
mantra over and over
again, but I'll ask anyway: can you add the better information you found to the 
wiki please?


I can try :)
I didn't know I could just edit the wiki without special permission.



Yes, anybody can create an account and add/edit the wiki. There are some spam 
filters in place, so
just post an email here if the content you're trying to publish is rejected and 
we'll try to resolve
any problems.


Is there anybody who reviews modifications to catch obvious mistakes or errors?

10x
tamas


Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Joe Julian

On 03/05/2013 03:02 PM, Papp Tamas wrote:

On 03/05/2013 11:50 PM, Joe Julian wrote:


I know I'm probably starting to sound like a broken record, repeating 
the same mantra over and over
again, but I'll ask anyway: can you add the better information you 
found to the wiki please?


I can try :)
I didn't know I could just edit the wiki without special permission.



Yes, anybody can create an account and add/edit the wiki. There are some 
spam filters in place, so just post an email here if the content you're 
trying to publish is rejected and we'll try to resolve any problems.



Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Papp Tamas

On 03/05/2013 11:50 PM, Joe Julian wrote:


I know I'm probably starting to sound like a broken record, repeating the same 
mantra over and over
again, but I'll ask anyway: can you add the better information you found to the 
wiki please?


I can try :)
I didn't know I could just edit the wiki without special permission.

10x
tamas


Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Joe Julian

On 03/05/2013 02:46 PM, Papp Tamas wrote:

On 03/05/2013 05:33 PM, Joe Julian wrote:


It comes up on this list from time to time that there's not 
sufficient documentation on
troubleshooting. I assume that's what some people mean when they 
refer to disappointing
documentation as the current documentation is far more detailed and 
useful than it was 3 years ago
when I got started. I'm not really sure what's being asked for here, 
nor am I sure how one would
document how to troubleshoot. In my mind, if there's a trouble that 
can be documented with a clear
path to resolution, then a bug report should be filed and that should 
be fixed. Any other cases that
cannot be coded for require human intervention and are already 
documented.


Please tell me your thoughts.


There is no online documentation for v3.3. Every google search points to
http://gluster.org/community/documentation/index.php/Gluster_3.2_Filesystem_Administration_Guide 




There is an Admin guide, but it's only available as a PDF or gzipped 
HTML? Is that fun? :)
Also I'm afraid this guide is RH-specific.

For example there is this section:

"If you are using Samba to access GlusterFS FUSE mount, then POSIX 
ACLs are enabled by default.

Samba has been compiled with the --with-acl-support option, so no 
special flags are required when accessing or mounting a Samba share."

How do you know about that?


Also there is no good cifs documentation.
I suggest adding a full smb.conf.
I asked here on the list for tuning help, with no answer; later I 
found the solution via Google.
For me, max/min protocol = SMB2 was the trick. But there can be locking 
issues (for example with Adobe Premiere and Photoshop).
I don't blame the list, but the online, searchable documentation 
should be more helpful.


Of course, this is also a general Samba issue.


And I think this guide is aimed more at experts than at absolute 
beginners.
If someone is looking for the right information, it's hard to find 
because the guide is too verbose:


"Chapter 6. Accessing Data - Setting Up GlusterFS Client
6.3.1.2. Manually Mounting Volumes Using CIFS
You can manually mount Gluster volumes using CIFS on Microsoft 
Windows-based client machines.

To manually mount a Gluster volume using CIFS:
1. Using Windows Explorer, choose Tools > Map Network Drive... from 
the menu. The Map Network Drive window appears.
2. Choose the drive letter using the Drive drop-down list.
3. Click Browse, select the volume to map to the network drive, and 
click OK.
4. Click Finish.
The network drive (mapped to the volume) appears in the Computer window.

Alternatively, to manually mount a Gluster volume using CIFS:
• Click Start > Run and enter the following:
\\SERVERNAME\VOLNAME
For example:
\\server1\test-volume"



This whole cifs section is obviously only an example :)


I know I'm probably starting to sound like a broken record, repeating 
the same mantra over and over again, but I'll ask anyway: can you add 
the better information you found to the wiki please?



Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Papp Tamas

On 03/05/2013 05:33 PM, Joe Julian wrote:


It comes up on this list from time to time that there's not sufficient 
documentation on
troubleshooting. I assume that's what some people mean when they refer to 
disappointing
documentation as the current documentation is far more detailed and useful than 
it was 3 years ago
when I got started. I'm not really sure what's being asked for here, nor am I 
sure how one would
document how to troubleshoot. In my mind, if there's a trouble that can be 
documented with a clear
path to resolution, then a bug report should be filed and that should be fixed. 
Any other cases that
cannot be coded for require human intervention and are already documented.

Please tell me your thoughts.


There is no online documentation for v3.3. Every google search points to
http://gluster.org/community/documentation/index.php/Gluster_3.2_Filesystem_Administration_Guide


There is an Admin guide, but it's only available as a PDF or gzipped HTML? Is 
that fun? :)
Also I'm afraid this guide is RH-specific.

For example there is this section:

"If you are using Samba to access GlusterFS FUSE mount, then POSIX ACLs are 
enabled by default.
Samba has been compiled with the --with-acl-support option, so no special 
flags are required when accessing or mounting a Samba share."

How do you know about that?


Also there is no good cifs documentation.
I suggest adding a full smb.conf.
I asked here on the list for tuning help, with no answer; later I found 
the solution via Google.
For me, max/min protocol = SMB2 was the trick. But there can be locking 
issues (for example with Adobe Premiere and Photoshop).

I don't blame the list, but the online, searchable documentation should be more 
helpful.

Of course, this is also a general Samba issue.
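For anyone hitting the same thing, the protocol setting mentioned above lives in the [global] section of smb.conf. A minimal sketch, not a full recommended config: the share name and path are placeholders, and SMB2 negotiation requires a Samba version that supports it (3.6+):

```ini
[global]
    # protocol negotiation range; forcing SMB2 was the tuning win noted above
    max protocol = SMB2
    min protocol = SMB2

[gv0]
    # export of a locally mounted GlusterFS FUSE mount (path is illustrative)
    path = /mnt/gv0
    read only = no
```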


And I think this guide is aimed more at experts than at absolute 
beginners.
If someone is looking for the right information, it's hard to find because 
the guide is too verbose:

"Chapter 6. Accessing Data - Setting Up GlusterFS Client
6.3.1.2. Manually Mounting Volumes Using CIFS
You can manually mount Gluster volumes using CIFS on Microsoft Windows-based 
client machines.

To manually mount a Gluster volume using CIFS:
1. Using Windows Explorer, choose Tools > Map Network Drive... from the menu. 
The Map Network Drive window appears.
2. Choose the drive letter using the Drive drop-down list.
3. Click Browse, select the volume to map to the network drive, and click OK.
4. Click Finish.
The network drive (mapped to the volume) appears in the Computer window.

Alternatively, to manually mount a Gluster volume using CIFS:
• Click Start > Run and enter the following:
\\SERVERNAME\VOLNAME
For example:
\\server1\test-volume"



This whole cifs section is obviously only an example :)

Cheers,
tamas




[Gluster-users] memory leak in 3.3.1 rebalance?

2013-03-05 Thread Pierre-Francois Laquerre
I started rebalancing my 25x2 distributed-replicate volume two days ago. 
Since then, the memory usage of the rebalance processes has been 
steadily climbing by 1-2 megabytes per minute. Following 
http://gluster.org/community/documentation/index.php/High_Memory_Usage, 
I tried "echo 2 > /proc/sys/vm/drop_caches". This had no effect on the 
processes' memory usage. Some of the servers are already eating >10G of 
memory. At this rate, I will have to cancel this rebalance, even though 
brick usage is heavily skewed right now (most bricks are at 87% 
capacity, but recently added ones are at 18-28%).


Any ideas what might be causing this? The only references to rebalance 
memory leaks I could find were related to 3.2.x, not 3.3.1.
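A crude way to quantify that growth is to sample the rebalance process's resident set size from /proc at intervals. A minimal sketch; the pid-file path in the comment is taken from the rebalance command line quoted later in this message, and the 60-second interval is arbitrary:

```shell
#!/bin/sh
# Sample a process's resident set size (in kB) from /proc/<pid>/status,
# to measure growth like the ~1-2 MB/min described above. Linux-only.

rss_kb() {
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

# Example usage against the rebalance daemon (not run here):
#   pid=$(cat /var/lib/glusterd/vols/bigdata/rebalance/*.pid)
#   while sleep 60; do echo "$(date +%T) $(rss_kb "$pid") kB"; done
```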


gluster volume info:
Volume Name: bigdata
Type: Distributed-Replicate
Volume ID: 56498956-7b4b-4ee3-9d2b-4c8cfce26051
Status: Started
Number of Bricks: 25 x 2 = 50
Transport-type: tcp
Bricks:
Brick1: ml43:/mnt/donottouch/localb
Brick2: ml44:/mnt/donottouch/localb
Brick3: ml43:/mnt/donottouch/localc
Brick4: ml44:/mnt/donottouch/localc
Brick5: ml45:/mnt/donottouch/localb
Brick6: ml46:/mnt/donottouch/localb
Brick7: ml45:/mnt/donottouch/localc
Brick8: ml46:/mnt/donottouch/localc
Brick9: ml47:/mnt/donottouch/localb
Brick10: ml48:/mnt/donottouch/localb
Brick11: ml47:/mnt/donottouch/localc
Brick12: ml48:/mnt/donottouch/localc
Brick13: ml45:/mnt/donottouch/locald
Brick14: ml46:/mnt/donottouch/locald
Brick15: ml47:/mnt/donottouch/locald
Brick16: ml48:/mnt/donottouch/locald
Brick17: ml51:/mnt/donottouch/localb
Brick18: ml52:/mnt/donottouch/localb
Brick19: ml51:/mnt/donottouch/localc
Brick20: ml52:/mnt/donottouch/localc
Brick21: ml51:/mnt/donottouch/locald
Brick22: ml52:/mnt/donottouch/locald
Brick23: ml59:/mnt/donottouch/locald
Brick24: ml54:/mnt/donottouch/locald
Brick25: ml59:/mnt/donottouch/localc
Brick26: ml54:/mnt/donottouch/localc
Brick27: ml59:/mnt/donottouch/localb
Brick28: ml54:/mnt/donottouch/localb
Brick29: ml55:/mnt/donottouch/localb
Brick30: ml29:/mnt/donottouch/localb
Brick31: ml55:/mnt/donottouch/localc
Brick32: ml29:/mnt/donottouch/localc
Brick33: ml30:/mnt/donottouch/localc
Brick34: ml31:/mnt/donottouch/localc
Brick35: ml30:/mnt/donottouch/localb
Brick36: ml31:/mnt/donottouch/localb
Brick37: ml40:/mnt/donottouch/localb
Brick38: ml41:/mnt/donottouch/localb
Brick39: ml40:/mnt/donottouch/localc
Brick40: ml41:/mnt/donottouch/localc
Brick41: ml56:/mnt/donottouch/localb
Brick42: ml57:/mnt/donottouch/localb
Brick43: ml56:/mnt/donottouch/localc
Brick44: ml57:/mnt/donottouch/localc
Brick45: ml25:/mnt/donottouch/localb
Brick46: ml26:/mnt/donottouch/localb
Brick47: ml01:/mnt/donottouch/localb/brick
Brick48: ml25:/mnt/donottouch/localc/brick
Brick49: ml01:/mnt/donottouch/localc/brick
Brick50: ml26:/mnt/donottouch/localc/brick
Options Reconfigured:
nfs.register-with-portmap: OFF
nfs.disable: on
performance.quick-read: on

The majority of these bricks are ext4, except for the latest 7 which are 
xfs. Each is backed by a 2T hard drive.


gluster --version:
glusterfs 3.3.1 built on Oct 11 2012 21:49:37

cat /etc/system-release:
Scientific Linux release 6.1 (Carbon)

uname -a:
Linux ml01 2.6.32-131.17.1.el6.x86_64 #1 SMP Wed Oct 5 17:19:54 CDT 2011 
x86_64 x86_64 x86_64 GNU/Linux


df -i /mnt/bigdata:
FilesystemInodes   IUsed   IFree IUse% Mounted on
ml43:/bigdata 3272292992 114236317 3158056675    4% /mnt/bigdata

df /mnt/bigdata:
Filesystem   1K-blocks  Used Available Use% Mounted on
ml43:/bigdata 48160570880 33787913600 12223793792  74% /mnt/bigdata


the process I am referring to:

/usr/sbin/glusterfs -s localhost --volfile-id bigdata --xlator-option 
*dht.use-readdirp=yes --xlator-option *dht.lookup-unhashed=yes 
--xlator-option *dht.assert-no-child-down=yes --xlator-option 
*replicate*.data-self-heal=off --xlator-option 
*replicate*.metadata-self-heal=off --xlator-option 
*replicate*.entry-self-heal=off --xlator-option *dht.rebalance-cmd=1 
--xlator-option *dht.node-uuid=5c338e03-28ff-429b-b702-0a04e25565f8 
--socket-file 
/var/lib/glusterd/vols/bigdata/rebalance/5c338e03-28ff-429b-b702-0a04e25565f8.sock 
--pid-file 
/var/lib/glusterd/vols/bigdata/rebalance/5c338e03-28ff-429b-b702-0a04e25565f8.pid 
-l /var/log/glusterfs/bigdata-rebalance.log


I am seeing a lot of:
[2013-03-05 10:59:51.170051] I [dht-rebalance.c:647:dht_migrate_file] 
0-bigdata-dht: /a/c/lu/lu-nod/_4.nrm: attempting to move from 
bigdata-replicate-23 to bigdata-replicate-17
[2013-03-05 10:59:51.296487] W 
[dht-rebalance.c:361:__dht_check_free_space] 0-bigdata-dht: data 
movement attempted from node (bigdata-replicate-23) with higher disk 
space to a node (bigdata-replicate-17) with lesser disk space 
(/a/c/lu/lu-nod/_4.nrm)
[2013-03-05 10:59:51.296604] E 
[dht-rebalance.c:1202:gf_defrag_migrate_data] 0-bigdata-dht: 
migrate-data failed for /a/c/lu/lu-nod/_4.nrm


in the rebalance log, and a lot of:

[2013-03-05 11:00:02.609033] I [server3_1-fops.c:576:server_mknod_cbk] 
0-bigdata-

Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Joe Julian

On 03/05/2013 09:43 AM, harry mangalam wrote:

Mostly a ditto on Shawn's piece.  Among the best formats of documentation for a
complex system like gluster is what was produced for MySQL a while ago.  I
think it still exists:

http://dev.mysql.com/doc/refman/5.5/en/installing.html
(see comment at bottom)

Don't know what they use to do this, but it's pretty effective.

This allows user comments to be added directly to the docs, possibly verified
or at least filtered by the doc manager, so that the comments and additional
helpful links are available to people browsing the docs without having to go
off to other sites (or at least it makes the other sites' relevant docs
available from the mainline docs).  This consolidates all the info in one
place.

Since all software is essentially a rolling beta, it makes sense for the docs
to reflect that and fold user comments into the main documentation.

hjm



On Tuesday, March 05, 2013 12:33:18 PM Shawn Nock wrote:

Joe Julian  writes:

It comes up on this list from time to time that there's not sufficient
documentation on troubleshooting. I assume that's what some people
mean when they refer to disappointing documentation as the current
documentation is far more detailed and useful than it was 3 years ago
when I got started. I'm not really sure what's being asked for here,
nor am I sure how one would document how to troubleshoot. In my mind,
if there's a trouble that can be documented with a clear path to
resolution, then a bug report should be filed and that should be
fixed. Any other cases that cannot be coded for require human
intervention and are already documented.

It is true that the documentation has gotten better.

However, since the switch to the new release cycle, bugs don't seem to
get fixed (within a release) and the documentation could do a better job
listing some of the holes new users starting with the current GA will
likely fall into:

Examples:

- Don't use ext4
(https://bugzilla.redhat.com/show_bug.cgi?id=838784)

- Don't use fix-layout after adding a brick
(https://bugzilla.redhat.com/show_bug.cgi?id=913699), maybe fixed by
10617e6cbc73329f259b471327d88375352042b0 in 3.3.1 but:

- Don't upgrade from 3.3 to 3.3.1 if you need NFS
(https://bugzilla.redhat.com/show_bug.cgi?id=893778)

1. Perhaps a wiki entry like "Known Issues" with links to all these bugs?

2. Copying the excellent info about gluster's xattrs from this blog
post (http://cloudfs.org/2011/04/glusterfs-extended-attributes/) into
the admin guide would be a start.

3. A brief guide on how to collect info on problematic files
(permissions, xattrs, client log, brick log) would probably help generate
more helpful bug reports and help users sort out many of their own
problems.

It's all stuff you pick up after you've been in the game for a while, but
they must really flummox new users.
I've started to look at converting the admin guide to asciidoc so it'll 
be easier to contribute to. I've also talked (briefly) about splitting 
the documentation away from the release source to make contributing 
easier. I plan on putting it up on github after I make at least enough 
progress to make it obvious where it's heading.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Joe Julian

On 03/05/2013 09:33 AM, Shawn Nock wrote:

However, since the switch to the new release cycle, bugs don't seem to
get fixed (within a release)
Starting to frustrate me as well. There's too many new features in 3.4 
for me to feel comfortable making the switch and not enough bugs being 
backported.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Joe Julian

On 03/05/2013 09:57 AM, Brian Candler wrote:

On Tue, Mar 05, 2013 at 08:33:28AM -0800, Joe Julian wrote:

It comes up on this list from time to time that there's not
sufficient documentation on troubleshooting. I assume that's what
some people mean when they refer to disappointing documentation as
the current documentation is far more detailed and useful than it
was 3 years ago when I got started. I'm not really sure what's being
asked for here, nor am I sure how one would document how to
troubleshoot. In my mind, if there's a trouble that can be
documented with a clear path to resolution, then a bug report should
be filed and that should be fixed. Any other cases that cannot be
coded for require human intervention and are already documented.

When people come to this list and say "I am seeing split brain errors" or
"ls shows question marks for file attributes"
Article(s) on the official Q&A Site, but that [censored] site can't find 
them with a search. Grrr.

or "I need to replace a failed server with a new one"

Article also on the official Q&A Site but again search isn't finding them.

I'll try to grab the contents of those and paste them into the wiki 
somewhere (unless you do it first. It is a wiki after all).

or "probing a server fails",
Agreed. This would be good. Does anyone actually know how to answer 
this? Please write it up on the wiki. I know I even have trouble 
sometimes figuring out why someone's probe fails.

I don't think there's
any official documentation to help them.

"Documenting how to troubleshoot" would include what log messages you should
look for and what they mean, what xattrs you should expect to see on the
bricks and what they mean (for each case of distributed, replicated etc).
Given a basic checklist of these things, it would be easy for users to
report to the list "I checked A, B and C and the output from B was  when
the docs say it should be  on a working system", which is at least a
starting point.
This is where all open source seems to hit problems. Sure, there's error 
messages (at least they're not "Error ##" like mysql does...) but they 
seem to generally only make sense to whoever wrote the software. There 
are 7216 log entries in the source. That's a lot of man-hours to 
document all of those even without any degree of detail.


Now, there are only 136 critical errors but I'm not sure I've ever seen 
one of those. 2991 at the level of "error" so I'm really not sure how 
that could be handled. Even if someone could volunteer 8 hours/day to 
spend 15 minutes describing each error message, it would take them 
around 4 1/2 months. That's longer than a production cycle (granted, 
once they were documented the production cycle would be unlikely to 
produce nearly 3000 new error messages).
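The estimate above is easy to sanity-check. A back-of-envelope sketch (the ~21 working days per month is my assumption):

```shell
# Check the "around 4 1/2 months" estimate: 2991 error-level log entries,
# 15 minutes each, 8 volunteered hours per day.
entries=2991
total_minutes=$((entries * 15))        # 44865 minutes
total_hours=$((total_minutes / 60))    # 747 hours (truncated)
work_days=$((total_hours / 8))         # 93 working days
echo "$work_days working days"         # ~4.5 months at ~21 working days/month
```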


I'd be willing to make the list and document 1 or 2 a day. Anyone else?

As far as I'm aware, the official admin guide is completely oblivious to
internals like this.

Users may be able to find suggestions by perusing mailing list archives, or
by trying gluster 2.x wiki documentation (which may be stale), or some blog
postings.
Thanks for pointing these out. Some I (obviously) wasn't even aware were 
a problem.


By the way - if anyone wants to copy-paste stuff from my blog into the 
wiki, feel free. I keep meaning to but have been behind schedule at work 
and just haven't had enough free time lately.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] UFO - new version capable of S3 API?

2013-03-05 Thread Shawn Heisey
On 3/5/2013 1:27 PM, Peter Portante wrote:
> If I am not mistaken, as of OpenStack Folsom (1.7.4), the s3

> compatibility middleware was broken out into its own project,
> found here: https://github.com/fujita/swift3
>
> You should be able to install that and configure as before, be sure
> to read that project's updated readme for any changes.

I'm relying on kkeithley's packaging, the separate one for CentOS or the 
built-in one for F18.  I tried to install openstack-swift-plugin-swift3, 
but got the following error, which didn't surprise me:


Error: glusterfs-swift conflicts with openstack-swift-1.7.4-1.fc18.noarch

There does not appear to be a swift3 plugin in the glusterfs-ufo package 
set, either for CentOS 6 or F18.  Any chance one could be made?


Thanks,
Shawn

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Joe Julian

On 03/05/2013 08:51 AM, Jason Villalta wrote:
I think some people would benefit from more recipe examples, 
especially around use with virtualization (KVM, OpenStack, 
CloudStack, OpenNebula).  I know it is not the job of Gluster to tell 
people how to configure these other systems, but maybe a quick list of 
what will work, sorta work, and not work at all would help.  I know from 
my personal experience I have spent a lot of time testing different 
configurations/combinations of these virtualization systems and Gluster 
before finding what seems to work acceptably for my uses.
I don't suppose you added the results of your testing to the wiki 
somewhere?
Admittedly most of the issues are around the stubborn 
FUSE/Direct-IO/O_DIRECT support in most distributions (Ubuntu/CentOS). 
I think if these FUSE mounting IO issues were resolved (or made better), 
people would be A LOT happier to use Gluster.  I think there will 
always be weird hardware related issues that will crop up in Gluster 
that all the documentation in the world can't fix but getting a clear 
supported path for (Cloud/KVM/Gluster/FS of choice) would 
help tremendously or at least allow people to determine if Gluster is 
the right fit.
That O_DIRECT issue has been resolved in EL (aka RHEL, CentOS, 
Scientific Linux, etc) 6.3.


My 2 cents.



Thanks.

On that note, as well, I'd like to remind everybody that open source 
isn't about being a producer or a consumer. If you know something, share 
it. Add it to the wiki, or blog, email, Facebook, Google+, etc. You 
don't have to be a coder to be part of any open source project. I 
haven't written anything in C for over 20 years, but I'm still a part of 
the Gluster Community. I'm also becoming more involved in puppet, 
logstash, and OpenStack.


I know that before I got involved with Gluster I always felt like there 
was "them" that made the software, and "us" that used whatever they gave 
us. I knew I didn't have the time to contribute code so I quietly used 
free software. Since I've gotten involved, I've realized how easy it is. 
Get involved. It's fun. :)


On Tue, Mar 5, 2013 at 11:33 AM, Joe Julian wrote:


It comes up on this list from time to time that there's not
sufficient documentation on troubleshooting. I assume that's what
some people mean when they refer to disappointing documentation as
the current documentation is far more detailed and useful than it
was 3 years ago when I got started. I'm not really sure what's
being asked for here, nor am I sure how one would document how to
troubleshoot. In my mind, if there's a trouble that can be
documented with a clear path to resolution, then a bug report
should be filed and that should be fixed. Any other cases that
cannot be coded for require human intervention and are already
documented.

Please tell me your thoughts.



___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] UFO - new version capable of S3 API?

2013-03-05 Thread Peter Portante
- Original Message -
> From: "Shawn Heisey" 
> To: gluster-users@gluster.org
> Sent: Tuesday, March 5, 2013 2:22:47 PM
> Subject: [Gluster-users] UFO - new version capable of S3 API?
> 
> I was running the previous version of UFO, the 3.3 one that was based
> on Swift 1.4.8.  Now there is a 3.3.1 based on Swift 1.7.4.  The
> config that I used last time to enable S3 isn't working with the new
> one, just updated yesterday using yum.  I was using tempauth in the
> old version and I'm still using tempauth.
> 
> I have a CentOS 6 system and a Fedora 18 system with UFO on them.
> The CentOS one is using kkeithley's repository, the F18 is using
> standard repositories.
> 
> Is there any way to get S3 with UFO?

If I am not mistaken, as of OpenStack Folsom (1.7.4), the s3
compatibility middleware was broken out into its own project,
found here: https://github.com/fujita/swift3

You should be able to install that and configure as before, be sure
to read that project's updated readme for any changes.

-peter


> Thanks,
> Shawn
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] UFO - new version capable of S3 API?

2013-03-05 Thread Shawn Heisey
I was running the previous version of UFO, the 3.3 one that was based on 
Swift 1.4.8.  Now there is a 3.3.1 based on Swift 1.7.4.  The config 
that I used last time to enable S3 isn't working with the new one, just 
updated yesterday using yum.  I was using tempauth in the old version 
and I'm still using tempauth.


I have a CentOS 6 system and a Fedora 18 system with UFO on them.  The 
CentOS one is using kkeithley's repository, the F18 is using standard 
repositories.


Is there any way to get S3 with UFO?

Thanks,
Shawn
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Brian Candler
On Tue, Mar 05, 2013 at 08:33:28AM -0800, Joe Julian wrote:
> It comes up on this list from time to time that there's not
> sufficient documentation on troubleshooting. I assume that's what
> some people mean when they refer to disappointing documentation as
> the current documentation is far more detailed and useful than it
> was 3 years ago when I got started. I'm not really sure what's being
> asked for here, nor am I sure how one would document how to
> troubleshoot. In my mind, if there's a trouble that can be
> documented with a clear path to resolution, then a bug report should
> be filed and that should be fixed. Any other cases that cannot be
> coded for require human intervention and are already documented.

When people come to this list and say "I am seeing split brain errors" or
"ls shows question marks for file attributes" or "I need to replace a failed
server with a new one" or "probing a server fails", I don't think there's
any official documentation to help them.

"Documenting how to troubleshoot" would include what log messages you should
look for and what they mean, what xattrs you should expect to see on the
bricks and what they mean (for each case of distributed, replicated etc). 
Given a basic checklist of these things, it would be easy for users to
report to the list "I checked A, B and C and the output from B was  when
the docs say it should be  on a working system", which is at least a
starting point.
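As one sketch of the kind of internals documentation being asked for: the replication changelog xattrs that appear on bricks (trusted.afr.<volume>-client-N, visible as root via `getfattr -e hex`) encode three big-endian 32-bit counters for pending data, metadata, and entry operations, and a nonzero counter indicates a replica that still needs healing (see Jeff Darcy's extended-attributes post mentioned elsewhere in this thread). A minimal decoder for the hex form (the helper name is mine):

```shell
# Decode a trusted.afr.* value: 24 hex digits = three big-endian
# 32-bit counters (data, metadata, entry pending operations).
decode_afr() {
    v=${1#0x}                                        # strip leading 0x
    data=$(printf '%d' "0x$(echo "$v" | cut -c1-8)")
    meta=$(printf '%d' "0x$(echo "$v" | cut -c9-16)")
    entry=$(printf '%d' "0x$(echo "$v" | cut -c17-24)")
    echo "data=$data metadata=$meta entry=$entry"
}

decode_afr 0x000000020000000000000001   # prints: data=2 metadata=0 entry=1
```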

As far as I'm aware, the official admin guide is completely oblivious to
internals like this.

Users may be able to find suggestions by perusing mailing list archives, or
by trying gluster 2.x wiki documentation (which may be stale), or some blog
postings.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread harry mangalam
Mostly a ditto on Shawn's piece.  Among the best formats of documentation for a 
complex system like gluster is what was produced for MySQL a while ago.  I 
think it still exists:

http://dev.mysql.com/doc/refman/5.5/en/installing.html
(see comment at bottom)

Don't know what they use to do this, but it's pretty effective.

This allows user comments to be added directly to the docs, possibly verified 
or at least filtered by the doc manager, so that the comments and additional 
helpful links are available to people browsing the docs without having to go 
off to other sites (or at least it makes the other sites' relevant docs 
available from the mainline docs).  This consolidates all the info in one 
place.

Since all software is essentially a rolling beta, it makes sense for the docs 
to reflect that and fold user comments into the main documentation.

hjm



On Tuesday, March 05, 2013 12:33:18 PM Shawn Nock wrote:
> Joe Julian  writes:
> > It comes up on this list from time to time that there's not sufficient
> > documentation on troubleshooting. I assume that's what some people
> > mean when they refer to disappointing documentation as the current
> > documentation is far more detailed and useful than it was 3 years ago
> > when I got started. I'm not really sure what's being asked for here,
> > nor am I sure how one would document how to troubleshoot. In my mind,
> > if there's a trouble that can be documented with a clear path to
> > resolution, then a bug report should be filed and that should be
> > fixed. Any other cases that cannot be coded for require human
> > intervention and are already documented.
> 
> It is true that the documentation has gotten better.
> 
> However, since the switch to the new release cycle, bugs don't seem to
> get fixed (within a release) and the documentation could do a better job
> listing some of the holes new users starting with the current GA will
> likely fall into:
> 
> Examples:
> 
> - Don't use ext4
> (https://bugzilla.redhat.com/show_bug.cgi?id=838784)
> 
> - Don't use fix-layout after adding a brick
> (https://bugzilla.redhat.com/show_bug.cgi?id=913699), maybe fixed by
> 10617e6cbc73329f259b471327d88375352042b0 in 3.3.1 but:
> 
> - Don't upgrade from 3.3 to 3.3.1 if you need NFS
> (https://bugzilla.redhat.com/show_bug.cgi?id=893778)
> 
> 1. Perhaps a wiki entry like "Known Issues" with links to all these bugs?
> 
> 2. Copying the excellent info about gluster's xattrs from this blog
> post (http://cloudfs.org/2011/04/glusterfs-extended-attributes/) into
> the admin guide would be a start.
> 
> 3. A brief guide on how to collect info on problematic files
> (permissions, xattrs, client log, brick log) would probably help generate
> more helpful bug reports and help users sort out many of their own
> problems.
> 
> It's all stuff you pick up after you've been in the game for a while, but
> they must really flummox new users.

---
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
---
"Something must be done. [X] is something. Therefore, we must do it."
Bruce Schneier, on American response to just about anything.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Shawn Nock
Shawn Nock  writes:
> 2. Copying the excellent info about gluster's xattrs from this blog
> post (http://cloudfs.org/2011/04/glusterfs-extended-attributes/) into
> the admin guide would be a start.

The correct link is:
http://hekafs.org/index.php/2011/04/glusterfs-extended-attributes/ ,
but so many of Jeff Darcy's posts have proven useful to me.

Perhaps the correct word is "adapt" and not "copy".

-- 
Shawn Nock (OpenPGP: 0x65118FA5)


___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Disappointing documentation?

2013-03-05 Thread Shawn Nock
Joe Julian  writes:

> It comes up on this list from time to time that there's not sufficient
> documentation on troubleshooting. I assume that's what some people
> mean when they refer to disappointing documentation as the current
> documentation is far more detailed and useful than it was 3 years ago
> when I got started. I'm not really sure what's being asked for here,
> nor am I sure how one would document how to troubleshoot. In my mind,
> if there's a trouble that can be documented with a clear path to
> resolution, then a bug report should be filed and that should be
> fixed. Any other cases that cannot be coded for require human
> intervention and are already documented.

It is true that the documentation has gotten better. 

However, since the switch to the new release cycle, bugs don't seem to
get fixed (within a release) and the documentation could do a better job
listing some of the holes new users starting with the current GA will
likely fall into:

Examples:

- Don't use ext4
(https://bugzilla.redhat.com/show_bug.cgi?id=838784)

- Don't use fix-layout after adding a brick
(https://bugzilla.redhat.com/show_bug.cgi?id=913699), maybe fixed by
10617e6cbc73329f259b471327d88375352042b0 in 3.3.1 but:

- Don't upgrade from 3.3 to 3.3.1 if you need NFS 
(https://bugzilla.redhat.com/show_bug.cgi?id=893778)

1. Perhaps a wiki entry like "Known Issues" with links to all these bugs?

2. Copying the excellent info about gluster's xattrs from this blog
post (http://cloudfs.org/2011/04/glusterfs-extended-attributes/) into
the admin guide would be a start.

3. A brief guide on how to collect info on problematic files
(permissions, xattrs, client log, brick log) would probably help generate
more helpful bug reports and help users sort out many of their own
problems.

It's all stuff you pick up after you've been in the game for a while, but
they must really flummox new users.
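A starting point for item 3 might look like the sketch below (the helper name is mine; `getfattr` comes from the attr package, GNU `stat` is assumed, and the trusted.* xattrs are only visible as root on the brick itself):

```shell
# Gather the basics for one suspect file: permissions/ownership plus any
# extended attributes visible to the current user.
collect_file_info() {
    f=$1
    stat -c 'mode=%a uid=%u gid=%g size=%s %n' "$f"
    if command -v getfattr >/dev/null 2>&1; then
        # -m . matches all xattr namespaces; -e hex matches the form
        # usually shown in gluster discussions
        getfattr -m . -d -e hex "$f" 2>/dev/null
    fi
}

# demo against a throwaway file
demo=$(mktemp)
chmod 640 "$demo"
collect_file_info "$demo"
rm -f "$demo"
```

Pasting that output, together with the matching client and brick log excerpts, into a bug report would cover most of what's listed above.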

-- 
Shawn Nock (OpenPGP: 0x65118FA5)


___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Disappointing documentation?

2013-03-05 Thread Joe Julian
It comes up on this list from time to time that there's not sufficient 
documentation on troubleshooting. I assume that's what some people mean 
when they refer to disappointing documentation as the current 
documentation is far more detailed and useful than it was 3 years ago 
when I got started. I'm not really sure what's being asked for here, nor 
am I sure how one would document how to troubleshoot. In my mind, if 
there's a trouble that can be documented with a clear path to 
resolution, then a bug report should be filed and that should be fixed. 
Any other cases that cannot be coded for require human intervention and 
are already documented.


Please tell me your thoughts.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] GlusterFS performance

2013-03-05 Thread harry mangalam
This kind of info is surprisingly hard to obtain.  The gluster docs do contain 
some of it, e.g.:



I also found well-described kernel tuning parameters in the FHGFS wiki (as 
another distributed fs, they share some characteristics)

http://www.fhgfs.com/wiki/wikka.php?wakka=StorageServerTuning

and more XFS tuning filesystem params here:



and here:


But of course, YMMV and a number of these parameters conflict and/or have 
serious tradeoffs, as you discovered.

LSI recently loaned me a Nytro SAS controller (on-card SSD-cached) which seems 
pretty phenomenal on a single brick (and is predicted to perform well based on 
their profiling), but am waiting for another node to arrive before I can test 
it under true gluster conditions.  Anyone else tried this hardware?

hjm

On Tuesday, March 05, 2013 12:34:41 PM Nikita A Kardashin wrote:
> Hello all!
> 
> I solved this problem today.
> The root of it all is an incompatibility between the gluster cache and the kvm cache.
> 
> The bug reproduces if a KVM virtual machine is created with the cache=writethrough
> option (the default for OpenStack) and hosted on a GlusterFS volume. If any other
> cache mode is used (cache=writeback, or cache=none with direct-io), the performance
> of writing to an existing file inside the VM is equal to bare storage (from the
> host machine) write performance.
> 
> I think it must be documented in Gluster, and maybe a bug should be filed.
> 
> Another question: where can I read something about gluster tuning (optimal
> cache size, write-behind, flush-behind use cases, and so on)? I found only an
> options list, without any how-to or tested cases.
> 
> 
> 2013/3/5 Toby Corkindale 
> 
> > On 01/03/13 21:12, Brian Candler wrote:
> >> On Fri, Mar 01, 2013 at 03:30:07PM +0600, Nikita A Kardashin wrote:
> >>> If I try to execute above command inside virtual machine (KVM),
> >>> first
> >>> time all going right - about 900MB/s (cache effect, I think), but if
> >>> 
> >>> I
> >>> 
> >>> run this test again on existing file - task (dd) hungs up and can be
> >>> stopped only by Ctrl+C.
> >>> Overall virtual system latency is poor too. For example, apt-get
> >>> upgrade upgrading system very, very slow, freezing on "Unpacking
> >>> replacement" and other io-related steps.
> >>> Does glusterfs have any tuning options, that can help me?
> >> 
> >> If you are finding that processes hang or freeze indefinitely, this is
> >> not
> >> a question of "tuning", this is simply "broken".
> >> 
> >> Anyway, you're asking the wrong person - I'm currently in the process of
> >> stripping out glusterfs, although I remain interested in the project.
> >> 
> >> I did find that KVM performed very poorly, but KVM was not my main
> >> application and that's not why I'm abandoning it.  I'm stripping out
> >> glusterfs primarily because it's not supportable in my environment,
> >> because
> >> there is no documentation on how to analyse and recover from failure
> >> scenarios which can and do happen. This point in more detail:
> >> http://www.gluster.org/pipermail/gluster-users/2013-January/035118.html
> >> 
> >> The other downside of gluster was its lack of flexibility, in particular
> >> the
> >> fact that there is no usage scaling factor on bricks, so that even with a
> >> simple distributed setup all your bricks have to be the same size.  Also,
> >> the object store feature which I wanted to use, has clearly had hardly
> >> any
> >> testing (even the RPM packages don't install properly).
> >> 
> >> I *really* wanted to deploy gluster, because in principle I like the idea
> >> of
> >> a virtual distribution/replication system which sits on top of existing
> >> local filesystems.  But for storage, I need something where operational
> >> supportability is at the top of the pile.
> > 
> > I have to agree; GlusterFS has been in use here in production for a while,
> > and while it mostly works, it's been fragile and documentation has been
> > disappointing. Despite 3.3 being in beta for a year, it still seems to
> > have
> > been poorly tested. For eg, I can't believe almost no-one else noticed
> > that
> > the log files were busted.. nor that the bug report has been around for
> > quarter of a year without being responded to or fixed.
> > 
> > I have to ask -- what are you moving to now, Brian?
> > 
> > -Toby
> > 
> > 
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://supercolony.gluster.org/mailman/listinfo/gluster-users

---
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South C

Re: [Gluster-users] GlusterFS performance

2013-03-05 Thread Brian Candler
On Tue, Mar 05, 2013 at 12:01:35PM +1100, Toby Corkindale wrote:
> I have to ask -- what are you moving to now, Brian?

Nothing clever: NFSv4 to the storage bricks, behind-time replication using
rsync, application-layer distribution of files between bricks.

We may have a future need for a storage pool with S3 API, at which point
I'll probably spend some time testing Ceph.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Fwd: Performance in VM guests when hosting VM images on Gluster

2013-03-05 Thread Torbjørn Thorsen
On Fri, Mar 1, 2013 at 7:01 PM, Brian Foster  wrote:
> On 03/01/2013 11:48 AM, Torbjørn Thorsen wrote:
>> On Thu, Feb 28, 2013 at 4:54 PM, Brian Foster  wrote:
>> All writes are done with sync, so I don't quite understand how cache
>> flushing comes in.
>>
>
> Flushing doesn't seem to be a factor, I was just noting previously that
> the only slowdown I noticed in my brief tests were associated with flushing.
>
> Note again though that loop seems to flush on close(). I suspect a
> reason for this is so 'losetup -d' can return immediately, but that's
> just a guess. IOW, if you hadn't used oflag=sync, the close() issued by
> dd before it actually exits would result in flushing the buffers
> associated with the loop device to the backing store. You are using
> oflag=sync, so that doesn't really matter.

Ah, I see.
I thought you meant close() on the FD that was backing the loop device,
but now I see what you mean.
Doing a non-sync dd run towards a loop device, it felt like that was the case.
I was seeing high throughput, but pressing ^C didn't stop dd,
and I'm guessing that's because it was blocking on close().
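For reference, the write pattern being measured in these tests can be reproduced against a plain file (the path here is a throwaway; against a real setup you would point of= at the loop device):

```shell
# 16 synchronous writes of 64KiB each -- the request size a fresh loop
# device was showing in the io-stats output.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=64k count=16 oflag=sync conv=notrunc 2>/dev/null
stat -c %s "$img"    # 16 * 65536 = 1048576 bytes written
rm -f "$img"
```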

>>
..
>> To me it seems that a fresh loop device does mostly 64kb writes,
>> and at some point during a 24 hour window, changes to doing 4kb writes ?
>>
>
> Yeah, interesting data. One thing I was curious about is whether
> write-behind or some other caching translator was behind this one way or
> another (including the possibility that the higher throughput value is
> actually due to a bug, rather than the other way around). If I
> understand the io-stats translator correctly however, these request size
> metrics should match the size of the requests coming into gluster and
> thus suggest something else is going on.
>
> Regardless, I think it's best to narrow the problem down and rule out as
> much as possible. Could you try some of the commands in my previous
> email to disable performance translators and see if it affects
> throughput? For example, does disabling any particular translator
> degrade throughput consistently (even on new loop devices)? If so, does
> re-enabling a particular translator enhance throughput on an already
> mapped and "degraded" loop (without unmapping/remapping the loop)?
>

I was running with defaults, no configuration had been done after
installing Gluster.

If I disable the write-behind translator, I immediately see pretty
much the same speeds
as the "degraded loop", ie. ~3MB/s.
Gluster profiling tells me the same story, all writes are now 4KB requests.

If write-behind is disabled, the loop device is slow even if it's fresh.
Enabling write-behind, even while dd is writing to the loop device,
seems to increase the speed right away, without needing a new fd to the device.

A degraded loop device without an open fd will be fast after a toggle
of write-behind.
However, it seems that an open fd will keep the loop device slow.
I've only tested that with Xen, as that was the only thing I had with
a long-lived open fd to a loop device.
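For anyone wanting to repeat the toggle described above, write-behind is a volume-level option (the volume name gv0 is a placeholder):

```shell
# Disabling write-behind immediately drops loop-device writes to ~4KB requests:
gluster volume set gv0 performance.write-behind off

# Re-enabling it restores the larger aggregated writes right away,
# except for loop devices held open by a long-lived fd:
gluster volume set gv0 performance.write-behind on
```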

> Also, what gluster and kernel versions are you on?

# uname -a
Linux xen-storage01 2.6.32-5-xen-amd64 #1 SMP Sun May 6 08:57:29 UTC
2012 x86_64 GNU/Linux

# dpkg -l | grep $(uname -r)
ii  linux-image-2.6.32-5-xen-amd64  2.6.32-46
Linux 2.6.32 for 64-bit PCs, Xen dom0 support

# dpkg -l | grep gluster
ii  glusterfs-client3.3.1-1
clustered file-system (client package)
ii  glusterfs-common3.3.1-1
GlusterFS common libraries and translator modules


--
Best regards,
Torbjørn Thorsen
Developer / operations engineer

Trollweb Solutions AS
- Professional Magento Partner
www.trollweb.no

Daytime phone: +47 51215300
Evening/weekend phone: for customers with a service agreement

Visiting address: Luramyrveien 40, 4313 Sandnes
Postal address: Maurholen 57, 4316 Sandnes

Please note that all our standard terms and conditions always apply
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users