Re: litte bug in initscripts

2012-12-13 Thread norbi
hm... now the MOD and MON don't start... the problem seems to be the configuration differences in ceph.conf http://ceph.com/docs/master/rados/configuration/ceph-conf/#the-ceph-conf-file or is it a failure in the documentation? [osd.0] hostname = {hostname} [mon.a] host = hostName
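
The message contrasts two spellings of the host key, `hostname` under [osd.0] and `host` under [mon.a]. A minimal sketch of how one might check which spelling each section of a config file uses is below. It assumes the file is plain enough INI for Python's `configparser` to read; the sample text and the helper name `host_keys` are hypothetical, and the sketch makes no claim about which spelling the init script actually expects.

```python
import configparser

# Hypothetical ceph.conf fragment using the two spellings quoted in the message.
SAMPLE = """
[osd.0]
hostname = {hostname}

[mon.a]
host = hostName
"""


def host_keys(conf_text):
    """Map each section name to the host-like keys it defines."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return {
        section: [
            key for key in ("host", "hostname")
            if parser.has_option(section, key)
        ]
        for section in parser.sections()
    }


report = host_keys(SAMPLE)
for section, keys in report.items():
    print(section, keys)
```

A section that reports only one of the two keys is a candidate for the start/stop mismatch described above.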

litte bug in initscripts

2012-12-13 Thread norbi
Hi Ceph-List, I have found a little bug in "ceph_common.sh". "/usr/local/bin/ceph-conf --help" shows the right option to get a hostname from the ceph.conf. In my case, the ceph init-script doesn't stop/start the OSDs. FLAGS --name name Set type.id or the example EXAMPLES $

Re: Debian packaging question

2012-12-13 Thread Gary Lowell
On Dec 13, 2012, at 1:09 AM, James Page wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > On 12/12/12 23:38, Gary Lowell wrote: >> I took your new rules file out for a spin. It built ok, but we >> still got the libcephfs-java_0.55.1-1precise_all.deb built despite >> the --binary-ar

ceph on nilfs2?

2012-12-13 Thread Sage Weil
nilfs2 has a 'continuous snapshotting' architecture that the ceph-osd can take advantage of for making fully-consistent checkpoints of state for recovering from a crash. If these checkpoints are efficient enough, in fact, you may even get away with not having a journal file/device at all. Pro

Re: A couple of OSD-crashes after serious network trouble

2012-12-13 Thread Samuel Just
Most likely what happened is that the block represented by that file was fully overwritten replacing both copies. You can probably consider that one healed. The others should be dealt with similarly: the larger file should be the more correct one (since it should also reflect writes made recently

Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-13 Thread Noah Watkins
The bindings use the default Hadoop settings (e.g. 64 or 128 MB chunks) when creating new files. The chunk size can also be specified on a per-file basis using the same interface as Hadoop. Additionally, while Hadoop doesn't provide an interface to configuration parameters beyond chunk size, we wil

Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-13 Thread Gregory Farnum
On Thu, Dec 13, 2012 at 12:23 PM, Cameron Bahar wrote: > Is the chunk size tunable in a Ceph cluster? I don't mean dynamic, but even > statically configurable when a cluster is first installed? Yeah. You can set chunk size on a per-file basis; you just can't change it once the file has any data

Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-13 Thread Cameron Bahar
Is the chunk size tunable in a Ceph cluster? I don't mean dynamic, but even statically configurable when a cluster is first installed? Thanks, Cameron Sent from my iPhone On Dec 13, 2012, at 9:41 AM, Gregory Farnum wrote: > On Thu, Dec 13, 2012 at 9:27 AM, Sage Weil wrote: >> Hi Jutta, >> >

Re: Crush and Monitor questions

2012-12-13 Thread Bryant Ng
Thanks Joao. Makes more sense now. What are your thoughts on my other question about expected load a monitor can handle? My understanding is that it just returns the cluster map to the client requesting it? The documentation mentions 3 to 5 monitors in a ceph cluster but what is the reques

Re: rbd map command hangs for 15 minutes during system start up

2012-12-13 Thread Alex Elder
On 12/13/2012 01:00 PM, Nick Bartos wrote: > Here's another log with the kernel debugging enabled: > https://gist.github.com/raw/4278697/1c9e41d275e614783fbbdee8ca5842680f46c249/rbd-hang-1355424455.log > > Note that it hung on the 2nd try. OK, thanks for the info. We'll keep looking. -Alex >

Re: rbd map command hangs for 15 minutes during system start up

2012-12-13 Thread Nick Bartos
Here's another log with the kernel debugging enabled: https://gist.github.com/raw/4278697/1c9e41d275e614783fbbdee8ca5842680f46c249/rbd-hang-1355424455.log Note that it hung on the 2nd try. On Wed, Dec 12, 2012 at 4:57 PM, Nick Bartos wrote: > Using wip-nick-newer, the problem still presented it

Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-13 Thread Gregory Farnum
On Thu, Dec 13, 2012 at 9:27 AM, Sage Weil wrote: > Hi Jutta, > > On Thu, 13 Dec 2012, Lachfeld, Jutta wrote: >> Hi all, >> >> I am currently doing some comparisons between CEPH FS and HDFS as a file >> system for Hadoop using Hadoop's integrated benchmark TeraSort. This >> benchmark first gener

Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-13 Thread Sage Weil
Hi Jutta, On Thu, 13 Dec 2012, Lachfeld, Jutta wrote: > Hi all, > > I am currently doing some comparisons between CEPH FS and HDFS as a file > system for Hadoop using Hadoop's integrated benchmark TeraSort. This > benchmark first generates the specified amount of data in the file system > used

Re: [PATCH] rbd: Add --json flag for the showmapped command

2012-12-13 Thread Yehuda Sadeh
On Thu, Dec 13, 2012 at 7:37 AM, Stratos Psomadakis wrote: > Signed-off-by: Stratos Psomadakis > --- > Hi Josh, > > This patch adds the '--json' flag to enable dumping the showmapped output in I think that it should be "--format=json" rather than --json. This will make it more in line with other

v0.55.1 is released

2012-12-13 Thread Sage Weil
There were some packaging and init script issues with v0.55, so a small point release is out. It fixes a few odds and ends: * init-ceph: typo (new 'fs type' stuff was broken) * debian: fixed conflicting upstart and sysvinit scripts * auth: fixed default auth settings * osd: dropped some brok

[PATCH 8/9] rbd: fix ceph_pg_poolid_by_name()

2012-12-13 Thread Alex Elder
Currently ceph_pg_poolid_by_name() returns an int, which is used to encode a ceph pool id. This could be a problem because a pool id (at least in some cases) is a 64-bit value. We have a defined pool id value that represents "no pool," and that's a very sensible return value here. This patch cha
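
To illustrate why returning an int for a 64-bit pool id can be a problem, here is a small sketch that models the narrowing with `ctypes`. It assumes a 32-bit signed int; the constant `NO_POOL` is a hypothetical stand-in for the "no pool" value mentioned above, not the kernel's actual definition.

```python
import ctypes

# Hypothetical stand-in for a 64-bit "no pool" sentinel (all bits set).
NO_POOL = 2**64 - 1


def as_c_int(value):
    """Return what survives when a 64-bit value is stored in a 32-bit signed int."""
    return ctypes.c_int32(value).value


big_pool_id = 2**32 + 7

print(as_c_int(5))            # small ids pass through unchanged
print(as_c_int(big_pool_id))  # the high 32 bits are silently dropped
print(as_c_int(NO_POOL))      # the sentinel collapses to a negative number
```

Under these assumptions a large pool id becomes indistinguishable from a small one after narrowing, which is the kind of ambiguity a 64-bit return type avoids.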

[PATCH 9/9] libceph: socket can close in any connection state

2012-12-13 Thread Alex Elder
A connection's socket can close for any reason, independent of the state of the connection (and irrespective of the connection mutex). As a result, the connection can be in pretty much any state at the time its socket is closed. Handle those other cases at the top of con_work(). Pull thi

[PATCH 7/9] rbd: don't use ENOTSUPP

2012-12-13 Thread Alex Elder
ENOTSUPP is not a standard errno (it shows up as "Unknown error 524" in an error message). This is what was getting produced when the local rbd code does not implement features required by a discovered rbd image. Change the error code returned in this case to ENXIO. Signed-off-by: Alex Elder
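
The difference between the two codes can be seen from user space. The sketch below looks both values up in Python's `errno` table; the numeric value 524 is taken from the message above, the helper name `describe` is hypothetical, and the exact message text returned by `os.strerror` depends on the platform's C library.

```python
import errno
import os

ENOTSUPP = 524  # kernel-internal value quoted in the patch description


def describe(code):
    """Return (symbolic name or None, message text) for an errno value."""
    return errno.errorcode.get(code), os.strerror(code)


print(describe(ENOTSUPP))     # no symbolic name is known for 524
print(describe(errno.ENXIO))  # ENXIO has a standard name and message
```

A code with no entry in the standard table is what produces an "Unknown error" message for the user, which is the motivation given for switching to ENXIO.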

[PATCH 6/9] rbd: remove linger unconditionally

2012-12-13 Thread Alex Elder
In __unregister_linger_request(), the request is being removed from the osd client's req_linger list only when the request has a non-null osd pointer. It should be done whether or not the request currently has an osd. This is most likely a non-issue because I believe the request will always have

[PATCH 5/9] libceph: init osd->o_node in create_osd()

2012-12-13 Thread Alex Elder
It turns out to be harmless but the red-black node o_node in the ceph osd structure is not initialized in create_osd(). Add a call to rb_init_node() to initialize it. Signed-off-by: Alex Elder --- net/ceph/osd_client.c |1 + 1 file changed, 1 insertion(+) diff --git a/net/ceph/osd_client.c b/

[PATCH 4/9] rbd: get rid of RBD_MAX_SEG_NAME_LEN

2012-12-13 Thread Alex Elder
RBD_MAX_SEG_NAME_LEN represents the maximum length of an rbd object name (i.e., one of the objects providing storage backing an rbd image). Another symbol, MAX_OBJ_NAME_SIZE, is used in the osd client code to define the maximum length of any object name in an osd request. Right now they disagree,

[PATCH 3/9] libceph: avoid using freed osd in __kick_osd_requests()

2012-12-13 Thread Alex Elder
If an osd has no requests and no linger requests, __reset_osd() will just remove it with a call to __remove_osd(). That drops a reference to the osd, and therefore the osd may have been freed by the time __reset_osd() returns. That function offers no indication this may have occurred, and as a res

[PATCH 2/9] ceph: don't reference req after put

2012-12-13 Thread Alex Elder
In __unregister_request(), there is a call to list_del_init() referencing a request that was the subject of a call to ceph_osdc_put_request() on the previous line. This is not safe, because the request structure could have been freed by the time we reach the list_del_init(). Fix this by reversing
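
The ordering hazard described above can be modeled outside the kernel. The following is a toy sketch, not the kernel code: the `Request` class, its `put()` and `list_del()` methods, and both `unregister_*` helpers are hypothetical, and "freeing" is simulated with a flag. It assumes the fix amounts to doing the list removal before dropping the reference.

```python
class FreedError(RuntimeError):
    """Raised when the model detects a use of a freed request."""


class Request:
    def __init__(self):
        self.refs = 1        # one reference held by the registry
        self.freed = False   # simulated "memory released" flag
        self.on_list = True

    def put(self):
        """Drop one reference; the last put frees the request."""
        self.refs -= 1
        if self.refs == 0:
            self.freed = True

    def list_del(self):
        """Unlink from the list; touching a freed request is an error."""
        if self.freed:
            raise FreedError("request touched after final put")
        self.on_list = False


def unregister_unsafe(req):
    req.put()        # may free the request...
    req.list_del()   # ...so this can touch freed memory


def unregister_fixed(req):
    req.list_del()   # finish every use of the request first
    req.put()        # drop the reference last


safe = Request()
unregister_fixed(safe)
print(safe.freed, safe.on_list)
```

In the model the unsafe order raises `FreedError` whenever the put drops the last reference, while the reordered version never touches the request after its final put.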

[PATCH 1/9] rbd: do not allow remove of mounted-on image

2012-12-13 Thread Alex Elder
There is no check in rbd_remove() to see if anybody holds open the image being removed. That's not cool. Add a simple open count that goes up and down with opens and closes (releases) of the device, and don't allow an rbd image to be removed if the count is non-zero. Protect the updates of the o
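
A rough user-space model of the open-count idea is sketched below. Everything in it is an assumption for illustration: the `Device` class, the use of a `threading.Lock` to protect the counter, and the choice of `EBUSY` as the refusal code are not taken from the patch itself.

```python
import errno
import threading


class Device:
    def __init__(self):
        self._lock = threading.Lock()  # assumed protection for the counter
        self.open_count = 0
        self.removed = False

    def open(self):
        with self._lock:
            if self.removed:
                return -errno.ENOENT
            self.open_count += 1
            return 0

    def release(self):
        with self._lock:
            self.open_count -= 1

    def remove(self):
        """Refuse removal while anybody still holds the device open."""
        with self._lock:
            if self.open_count > 0:
                return -errno.EBUSY
            self.removed = True
            return 0


dev = Device()
dev.open()
busy = dev.remove()   # refused: one opener remains
dev.release()
done = dev.remove()   # allowed: count is back to zero
print(busy, done)
```

Checking the count and marking the device removed under the same lock keeps an open from slipping in between the check and the removal.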

[PATCH 0/9] ceph: re-post of bug fixes

2012-12-13 Thread Alex Elder
I've pulled out most of the more important patches from the backlog I posted the other day and am reposting them here so they can hopefully get some needed attention. I'm in the midst of rearranging the patches in the testing branch so reviewed patches are all before those that are not yet reviewe

Re: A couple of OSD-crashes after serious network trouble

2012-12-13 Thread Oliver Francke
Hi Sam, On 12/13/2012 05:15 AM, Samuel Just wrote: Apologies, I missed your reply on Monday. Any attempt to read or no prob ;) We are busy, too, with preparing new nodes and switch to 10GE this evening. write the object will hit the file on the primary (the smaller one with the newer sysl

btrfs vulnerable to hash-DoS attack

2012-12-13 Thread Christopher Kunz
Hi all, since a couple people here are using btrfs (we, fortunately, are not), I thought you might be interested in this blog entry: Basically, it seems that if you create files with filenames that have colliding CRC32 hashes, all kind of

[PATCH] rbd: Add --json flag for the showmapped command

2012-12-13 Thread Stratos Psomadakis
Signed-off-by: Stratos Psomadakis --- Hi Josh, This patch adds the '--json' flag to enable dumping the showmapped output in json format (as you suggested). I'm not sure if any other rbd subcommands could make use of this flag (so the patch is showmapped-specific atm). Comments? Thanks, Stratos

Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-13 Thread Lachfeld, Jutta
Hi all, I am currently doing some comparisons between CEPH FS and HDFS as a file system for Hadoop using Hadoop's integrated benchmark TeraSort. This benchmark first generates the specified amount of data in the file system used by Hadoop, e.g. 1TB of data, and then sorts the data via the MapRe

Re: mds don´t start after upgrade to 0.55

2012-12-13 Thread David McBride
On 13/12/12 12:20, Soporte wrote: -1> 2012-12-13 09:04:00.745484 7f0944848700 10 monclient(hunting): none of our auth protocols are supported by the server This suggests that the change in 0.55 that enabled cephx authentication by default has caught you out. See: http://ceph.com/releas

mds don´t start after upgrade to 0.55

2012-12-13 Thread Soporte
Hi. I can´t start mds after upgrade to 0.55. The logs show: -40> 2012-12-13 09:03:54.264433 7f0949da0780 5 asok(0x2754000) register_command 2 hook 0x273c010 -39> 2012-12-13 09:03:54.264440 7f0949da0780 5 asok(0x2754000) register_command perf schema hook 0x273c010 -38> 2012-12-13 09:0

Re: Some osd cannot boot after a host down

2012-12-13 Thread Drunkard Zhang
It looks related to bug: http://tracker.newdream.net/issues/2843 . Is there any workaround? I'm not sure that recreating the osd node works :-( 2012/12/12 Drunkard Zhang : > Some osd node can not boot on a host of my cluster, the reason from > osd.log maybe different, looks like journal broken. > >

Re: Debian packaging question

2012-12-13 Thread James Page
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On 12/12/12 23:38, Gary Lowell wrote: > I took your new rules file out for a spin. It built ok, but we > still got the libcephfs-java_0.55.1-1precise_all.deb built despite > the --binary-arch flag. The command used for the build is: > > sudo pbuil

Re: [ceph-commit] [ceph/ceph] e6a154: osx: compile on OSX

2012-12-13 Thread Christoph Hellwig
On Mon, Dec 10, 2012 at 07:11:44AM -1000, Sam Lang wrote: > >>Is libaio really needed to build ceph-fuse? I use macports on my system > >>and the last time I tried to make a change set to let ceph/ceph-fuse > >>build on my laptop failed as I didn't have libaio, though I could just > >>write a port