Re: Braindump: multiple clusters on the same hardware

2012-10-18 Thread Tommi Virtanen
On Thu, Oct 18, 2012 at 7:40 AM, Jimmy Tang jt...@tchpc.tcd.ie wrote: What I actually meant to ask was, is it possible to copy objects or pools from one ceph cluster to another (for disaster recovery reasons) and if this feature is planned or even considered? That's the async replication for

Braindump/announce: ceph-deploy

2012-10-18 Thread Tommi Virtanen
Hi. We've been working on the Chef cookbook and Crowbar barclamp for Ceph for a while now. At the same time, Clint Byrum and James Page have been working on the Juju Charm, and I've seen at least two separate efforts for Puppet scripts. All this time, I've repeatedly gotten one item of feedback,

Braindump: multiple clusters on the same hardware

2012-10-17 Thread Tommi Virtanen
You can run multiple Ceph clusters on the same hardware. They will have completely separate monitors, OSDs (including separate data disks and journals that will not be shared between clusters), MDSs etc. This provides a higher level of isolation than e.g. just using multiple RADOS pools and CRUSH
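
For orientation, a minimal sketch of what that separation looks like in practice, assuming a second cluster named "backup" (the name, paths and commands below are illustrative, not from the thread):

    # each cluster is selected by name; the name picks the config file
    /etc/ceph/ceph.conf      # default cluster, "ceph"
    /etc/ceph/backup.conf    # second cluster, "backup"

    # most tools and daemons take --cluster to choose between them
    ceph --cluster backup health
    ceph-osd --cluster backup -i 0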

Braindump: ceph-disk-*, upstart scripts, ceph-create-keys

2012-10-17 Thread Tommi Virtanen
Alright. I've written a few braindumps on OSD hotplugging before, this is an update on what's in place now, and will hopefully form the core of the relevant documentation later. New-school deployments of Ceph have OSDs consume data disks fully -- that is, admin hands off the whole disk, Ceph

Re: Braindump: multiple clusters on the same hardware

2012-10-17 Thread Tommi Virtanen
On Wed, Oct 17, 2012 at 3:59 PM, John Wilkins john.wilk...@inktank.com wrote: Are there any issues related to ports or network conflicts? Yeah, two monitors belonging to different clusters on the same host are probably going to collide on the listening TCP port, if admin doesn't explicitly avoid
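
A hedged illustration of avoiding that collision by pinning distinct monitor ports per cluster (hosts, addresses and the second port are made up for the example):

    # /etc/ceph/ceph.conf
    [mon.a]
        host = node1
        mon addr = 192.168.0.10:6789

    # /etc/ceph/backup.conf
    [mon.a]
        host = node1
        mon addr = 192.168.0.10:6790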

Re: s3 UnboundLocalError

2012-10-16 Thread Tommi Virtanen
On Mon, Oct 15, 2012 at 10:04 PM, Lokesh Krishnappa lokesh.professional...@gmail.com wrote: am following, the below given link : https://github.com/ceph/s3-tests for my testing purpose ... and while am executing the command as: S3TEST_CONF=example.conf ./virtualenv/bin/nosetests getting

New branch: Python packaging integrated into automake

2012-10-15 Thread Tommi Virtanen
Hi. While working on the external journal stuff, for a while I thought I needed more python code than I ended up needing. To support that code, I put in the skeleton of import ceph.foo support. While I ultimately didn't need it, I didn't want to throw away the results. If you later need to have

Re: Ignore O_SYNC for rbd cache

2012-10-12 Thread Tommi Virtanen
On Wed, Oct 10, 2012 at 9:23 AM, Sage Weil s...@inktank.com wrote: I certainly wouldn't recommend it, but there are probably use cases where it makes sense (i.e., the data isn't as important as the performance). This would make a lot of sense for e.g. service orchestration-style setups where

Re: Deleting files from radosgw-bucket doesn't free up space in ceph?

2012-10-09 Thread Tommi Virtanen
On Tue, Oct 9, 2012 at 9:31 AM, John Axel Eriksson j...@insane.se wrote: I'm worried that data deleted in radosgw wasn't actually deleted from disk/cluster. Are you aware of radosgw-admin temp remove? I was trying to point you to docs, but couldn't find any, so I filed

Re: How to write data to snapshot.

2012-10-08 Thread Tommi Virtanen
On Sat, Oct 6, 2012 at 10:11 PM, ramu eppa ramu.freesyst...@gmail.com wrote: Hi, in rbd I am creating a snapshot, and the next time I copied data to that snapshot it was not copying the data. Please help me with how to copy data to a snapshot. RBD snapshots are immutable by design. You can do a

Re: How to write data to rbd pool.

2012-10-03 Thread Tommi Virtanen
On Tue, Oct 2, 2012 at 9:55 PM, ramu eppa ramu.freesyst...@gmail.com wrote: When map rbd to /dev it's giving error, rbd map /dev/mypool/mytest The syntax is rbd map mypool/mytest. http://ceph.com/docs/master/man/8/rbd/ -- To unsubscribe from this list: send the line unsubscribe ceph-devel
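
For example (pool and image names as in the thread; the /dev/rbd symlink layout is what the rbd udev rules create):

    rbd map mypool/mytest
    # the kernel exposes the image as /dev/rbdN, with a friendlier
    # symlink at /dev/rbd/mypool/mytest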

Re: How to write data to rbd pool.

2012-10-02 Thread Tommi Virtanen
On Tue, Oct 2, 2012 at 6:13 AM, ramu eppa ramu.freesyst...@gmail.com wrote: How to know the rbd volume path. Umm. What is an rbd volume path? If you rbd map it, the block device will be called /dev/rbd/poolname/imagename -- maybe that's what you were looking for? -- To unsubscribe from this

Re: [ceph-commit] teuthology value error

2012-10-02 Thread Tommi Virtanen
On Tue, Oct 2, 2012 at 9:56 AM, Gregory Farnum g...@inktank.com wrote: File /home/lokesh/Downloads/ceph-teuthology-78b7b02/teuthology/misc.py, line 501, in read_config ctx.teuthology_config.update(new) ValueError: dictionary update sequence element #0 has length 1; 2 is required I haven't

Re: PG recovery reservation state chart

2012-10-02 Thread Tommi Virtanen
On Tue, Oct 2, 2012 at 12:48 PM, Mike Ryan mike.r...@inktank.com wrote: Tried sending this earlier but it seems the list doesn't like PNGs. dotty or dot -Tpng will make short work of the .dot file I've attached. vger discards messages with attachments. It's old school mailing list software.

Re: How to write data to rbd pool.

2012-10-01 Thread Tommi Virtanen
On Mon, Oct 1, 2012 at 8:38 AM, ramu eppa ramu.freesyst...@gmail.com wrote: Hi all, I want to write data or a text file to the rbd pool. Please help me. You can create objects in RADOS pools with rados put. See http://ceph.com/docs/master/man/8/rados/ for the documentation. The rbd pool usually
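
A minimal sketch of doing that with the rados CLI (object and file names are illustrative):

    # store the local file notes.txt as object "notes" in pool "rbd"
    rados -p rbd put notes notes.txt
    # read it back, then list the pool's objects
    rados -p rbd get notes /tmp/notes.txt
    rados -p rbd ls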

Re: Collection of strange lockups on 0.51

2012-10-01 Thread Tommi Virtanen
On Sun, Sep 30, 2012 at 2:55 PM, Andrey Korolyov and...@xdel.ru wrote: Short post mortem - EX3200/12.1R2.9 may begin to drop packets (seems to appear more likely on 0.51 traffic patterns, which is very strange for L2 switching) when a bunch of the 802.3ad pairs, sixteen in my case, exposed to

Re: Slow ceph fs performance

2012-10-01 Thread Tommi Virtanen
On Thu, Sep 27, 2012 at 11:04 AM, Gregory Farnum g...@inktank.com wrote: However, my suspicion is that you're limited by metadata throughput here. How large are your files? There might be some MDS or client tunables we can adjust, but rsync's workload is a known weak spot for CephFS. I feel

Re: How to direct data inject to specific OSDs

2012-10-01 Thread Tommi Virtanen
On Thu, Sep 27, 2012 at 3:52 AM, hemant surale hemant.sur...@gmail.com wrote: I have upgraded my cluster to Ceph v0.48. and cluster is fine (except gceph not working) . gceph has been broken for a long time, wasn't seen to be worth fixing, and was removed in 0.48argonaut. How can I direct my

Re: Problem creating osd.0 ??

2012-10-01 Thread Tommi Virtanen
On Wed, Sep 26, 2012 at 10:43 PM, hemant surale hemant.sur...@gmail.com wrote: root@hemant-virtual-machine:/etc/ceph# mkcephfs -a -c /etc/ceph/ceph.conf -k ceph.keyring temp dir is /tmp/mkcephfs.E1M7ay6HBa preparing monmap in /tmp/mkcephfs.E1M7ay6HBa/monmap /usr/bin/monmaptool --create

Re: Hardware Requirements for RADOS Gateway Cluster

2012-09-26 Thread Tommi Virtanen
On Mon, Sep 24, 2012 at 3:58 PM, Brice Burgess briceb...@gmail.com wrote: 1. Is it preferable to run the RADOS Gateway on a MDS machine [for latency ... My assumption is to provision a dedicated machine for the RADOS Gateway. I'd treat this machine as a front end proxy/caching server meaning it

Re: safe to defrag XFS on live system?

2012-09-14 Thread Tommi Virtanen
On Fri, Sep 14, 2012 at 8:51 AM, Travis Rhoden trho...@gmail.com wrote: On a running Ceph cluster using XFS for the OSD's, is it safe to defrag the OSD devices while the system is live? I did a quick check of one device: xfs_db -c frag -r /dev/sdd actual 637596, ideal 144935, fragmentation
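
For context, the tools involved, assuming the OSD's XFS filesystem is mounted at /var/lib/ceph/osd/ceph-0 (path illustrative); xfs_fsr is the online defragmenter:

    # report fragmentation without modifying anything
    xfs_db -c frag -r /dev/sdd
    # defragment the mounted filesystem while it is in use
    xfs_fsr -v /var/lib/ceph/osd/ceph-0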

Re: ceph-level build system fixes cleanup

2012-09-12 Thread Tommi Virtanen
On Tue, Sep 11, 2012 at 7:22 PM, Jan Engelhardt jeng...@inai.de wrote: per our short exchange in private, I am taking to spruce up the ceph build system definitions. First comes leveldb, more shall follow soonish. Thanks! Patches #1–2 are mandatory for continued successful operation, #3–5

Re: does ceph consider the device performance for distributing data?

2012-09-12 Thread Tommi Virtanen
On Wed, Sep 12, 2012 at 1:15 PM, sheng qiu herbert1984...@gmail.com wrote: may i ask which part of codes within ceph deal with the distribution of workloads among the OSDs? i am interested in ceph's source codes and want to understand that part of codes. You want to read the academic papers on

Re: does ceph consider the device performance for distributing data?

2012-09-12 Thread Tommi Virtanen
On Wed, Sep 12, 2012 at 1:27 PM, sheng qiu herbert1984...@gmail.com wrote: actually i have read both these two papers. i understand the theory but just curious about how is it implemented within the codes. i looked into the ceph source codes, it seems very complicated for me. The CRUSH library

Re: Collection of strange lockups on 0.51

2012-09-12 Thread Tommi Virtanen
On Wed, Sep 12, 2012 at 10:33 AM, Andrey Korolyov and...@xdel.ru wrote: Hi, This is completely off-list, but I`m asking because only ceph trigger such a bug :) . With 0.51, following happens: if I kill an osd, one or more neighbor nodes may go to hanged state with cpu lockups, not related to

Re: crush create-or-move

2012-09-11 Thread Tommi Virtanen
On Tue, Sep 11, 2012 at 9:09 AM, Sage Weil s...@inktank.com wrote: I pushed a wip-crush branch that implements 'ceph osd crush create-or-move ...'. I don't think it's worth backporting this to argonaut, since the user is actually in the deploy scripts, so any chef/juju/whatever won't care if
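
For reference, the shape of the new command (weight and location arguments here are illustrative):

    # create the OSD's crush entry if it is missing, or move an existing
    # entry to the given location without clobbering its weight
    ceph osd crush create-or-move osd.0 1.0 root=default host=node1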

Re: does ceph consider the device performance for distributing data?

2012-09-11 Thread Tommi Virtanen
On Mon, Sep 10, 2012 at 8:54 PM, sheng qiu herbert1984...@gmail.com wrote: i have a simple question. for distribution workload among OSDs, does ceph do any online modeling for OSDs, i.e. collect the online IO latency and try to distribute more workloads to lower latency OSDs? or only based on

Re: issues with adjusting the crushmap in 0.51

2012-09-06 Thread Tommi Virtanen
On Thu, Sep 6, 2012 at 11:51 AM, Jimmy Tang jt...@tchpc.tcd.ie wrote: Also, the ceph osd setcrushmap... command doesn't up when a ceph --help is run in the 0.51 release, however it is documented on the wiki as far as I recall. It'd be real nice if the applications emitted all the available

Re: [PATCH] docs: Add CloudStack documentation

2012-09-05 Thread Tommi Virtanen
On Wed, Sep 5, 2012 at 8:28 AM, Wido den Hollander w...@widodh.nl wrote: You can only specify one monitor in CloudStack, but your cluster can have multiple. This is due to the internals of CloudStack. It stores storage pools in a URI format, like: rbd://admin:secret@1.2.3.4/rbd You know, for

Re: ceph-fs tests

2012-09-05 Thread Tommi Virtanen
On Tue, Sep 4, 2012 at 4:26 PM, Smart Weblications GmbH - Florian Wiessner f.wiess...@smart-weblications.de wrote: i set up a 3 node ceph cluster 0.48.1argonaut to test ceph-fs. i mount ceph via fuse, then i downloaded kernel tree and decompressed a few times, then stopping one osd (osd.1),

Re: rbd 0.48 storage support for kvm proxmox distribution available

2012-09-05 Thread Tommi Virtanen
On Wed, Sep 5, 2012 at 5:35 AM, Wido den Hollander w...@widodh.nl wrote: I also thought it is on the roadmap to not read /etc/ceph/ceph.conf by default with librbd/librados to take away these kind of issues. Hmm. I'm not intimately familiar with librbd, but it seems it just takes RADOS ioctx as

Re: rbd 0.48 storage support for kvm proxmox distribution available

2012-09-05 Thread Tommi Virtanen
On Wed, Sep 5, 2012 at 9:40 AM, Tommi Virtanen t...@inktank.com wrote: On Wed, Sep 5, 2012 at 5:35 AM, Wido den Hollander w...@widodh.nl wrote: I also thought it is on the roadmap to not read /etc/ceph/ceph.conf by default with librbd/librados to take away these kind of issues. Hmm. I'm

Re: [PATCH] docs: Add CloudStack documentation

2012-09-05 Thread Tommi Virtanen
On Wed, Sep 5, 2012 at 10:05 AM, Wido den Hollander w...@widodh.nl wrote: For example, rbd:?id=admin&secret=s3kr1t&mon=1.2.3.4&mon=5.6.7.8&pool=rbd is perfectly legal. Whether some Java library fails to implement generic URIs is another concern.. It is indeed a Java library in this case:

Re: [PATCH] docs: Add CloudStack documentation

2012-09-05 Thread Tommi Virtanen
On Wed, Sep 5, 2012 at 3:39 PM, Wido den Hollander w...@widodh.nl wrote: The main problem with that is how CloudStack internally stores the data. At the storage driver the URI doesn't arrive in plain format, it gets splitted with getHost(), getAuthUsername(), getPath() and arrives in these

Re: Integration work

2012-09-04 Thread Tommi Virtanen
On Fri, Aug 31, 2012 at 11:02 PM, Ryan Nicholson ryan.nichol...@kcrg.com wrote: Secondly: Through some trials, I've found that if one loses all of his Monitors in a way that they also lose their disks, one basically loses their cluster. I would like to recommend a lower priority shift in

Re: Very unbalanced storage

2012-09-04 Thread Tommi Virtanen
On Fri, Aug 31, 2012 at 11:58 PM, Andrew Thompson andre...@aktzero.com wrote: Looking at old archives, I found this thread which shows that to mount a pool as cephfs, it needs to be added to mds: http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/5685 I started a `rados cppool

Re: radosgw keeps dying

2012-09-04 Thread Tommi Virtanen
On Sun, Sep 2, 2012 at 6:36 AM, Nick Couchman nick.couch...@seakr.com wrote: One additional piece of info...I did find the -d flag (documented in the radosgw-admin man page, but not in the radosgw man page) that keeps the daemon in the foreground and prints messages to stderr. When I use

Re: Very unbalanced storage

2012-09-04 Thread Tommi Virtanen
On Tue, Sep 4, 2012 at 9:19 AM, Andrew Thompson andre...@aktzero.com wrote: Yes, it was my `data` pool I was trying to grow. After renaming and removing the original data pool, I can `ls` my folders/files, but not access them. Yup, you're seeing ceph-mds being able to access the metadata pool,

Re: Best insertion point for storage shim

2012-08-31 Thread Tommi Virtanen
On Fri, Aug 31, 2012 at 10:37 AM, Stephen Perkins perk...@netmass.com wrote: Would this require 2 clusters because of the need to have RADOS keep N copies on one and 1 copy on the other? That's doable with just multiple RADOS pools, no need for multiple clusters. And CephFS is even able to

Re: Best insertion point for storage shim

2012-08-31 Thread Tommi Virtanen
On Fri, Aug 31, 2012 at 11:59 AM, Atchley, Scott atchle...@ornl.gov wrote: I think what he is looking for is not to bring data to a client to convert from replication to/from erasure coding, but to have the servers do it based on some metric _or_ have the client indicate which file needs to

Review please: wip-create-admin-key and #3065

2012-08-30 Thread Tommi Virtanen
Hi. I'm trying to improve the admin key management, and change how we use client.admin. I'd appreciate if you could review the branch and the somewhat unrelated ticket http://tracker.newdream.net/issues/3065 . -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a

Re: Mapping between PG OSD

2012-08-29 Thread Tommi Virtanen
On Wed, Aug 29, 2012 at 12:46 AM, hemant surale hemant.sur...@gmail.com wrote: Hi Tommi , Actually I was thinking of Availability of related files to particular host. I wanted to guide ceph in some way to store related files for host on his local osd so that I/O over network can be

Re: Integration work

2012-08-29 Thread Tommi Virtanen
On Wed, Aug 29, 2012 at 9:40 AM, Wido den Hollander w...@widodh.nl wrote: Huh ... I've never heard this. Also the guys in ##xen haven't either. I'm not really involved in xen dev and don't follow it closely but that seems unlikely. The few slides I looked at from the Xen Summit a couple days

Re: Unable to set pg_num property of pool data

2012-08-29 Thread Tommi Virtanen
On Wed, Aug 29, 2012 at 9:42 AM, Wido den Hollander w...@widodh.nl wrote: So you are trying to decrease the number of PGs. Now, I'm not sure if that is supported. But why would you want to do this? Support for pg shrinking will come pretty much exactly at the same time as support for pg

Re: mkcehfs: ** ERROR: error creating empty object store in /data/sd1: (2) No such file or directory

2012-08-28 Thread Tommi Virtanen
On Tue, Aug 28, 2012 at 1:38 PM, Tim Flavin tim.fla...@gmail.com wrote: Sage: Yes I tried that. I wonder if the journal file exists.. osd journal = /user1/log -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org

Re: Integration work

2012-08-28 Thread Tommi Virtanen
On Tue, Aug 28, 2012 at 5:03 PM, Florian Haas flor...@hastexo.com wrote: I for my part, in the documentation space, would love for the admin tools to become self-documenting. For example, I would love a help subcommand at any level of the ceph shell, listing the supported subcommands in that

Re: Best insertion point for storage shim

2012-08-24 Thread Tommi Virtanen
On Fri, Aug 24, 2012 at 8:49 AM, Stephen Perkins perk...@netmass.com wrote: I'd like to get feedback from folks as to where the best place would be to insert a shim into the RADOS object storage. ... I would assume that it is possible to configure RADOS to store only 1 copy of a file (bear

Re: Ceph performance improvement

2012-08-22 Thread Tommi Virtanen
On Wed, Aug 22, 2012 at 9:23 AM, Denis Fondras c...@ledeuns.net wrote: Are you sure your osd data and journal are on the disks you think? The /home paths look suspicious -- especially for journal, which often should be a block device. I am :) ... -rw-r--r-- 1 root root 1048576000 août 22

Re: Ceph performance improvement / journal on block-dev

2012-08-22 Thread Tommi Virtanen
On Wed, Aug 22, 2012 at 12:12 PM, Dieter Kasper (KD) d.kas...@kabelmail.de wrote: Your journal is a file on a btrfs partition. That is probably a bad idea for performance. I'd recommend partitioning the drive and using partitions as journals directly. can you please teach me how to use the
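
A hedged ceph.conf sketch of what "partitions as journals directly" looks like (device name is illustrative):

    [osd.0]
        host = node1
        # point the journal at a raw partition; no filesystem or file needed
        osd journal = /dev/sdb1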

Re: ceph osd create

2012-08-20 Thread Tommi Virtanen
On Mon, Aug 20, 2012 at 1:30 PM, Mandell Degerness mand...@pistoncloud.com wrote: Is there now, or will there be a migration path that works to add existing OSDs to the function such that future calls to ceph osd create uuid return the current OSD number used in an existing configuration? I
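
As a sketch of the uuid-keyed behaviour being asked about (the uuid is made up): the command is meant to be idempotent per uuid, so repeating it returns the same OSD id instead of allocating a new one:

    UUID=$(uuidgen)
    ceph osd create $UUID   # allocates and prints a new id, e.g. 12
    ceph osd create $UUID   # prints the same id, 12, on later calls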

Re: Ceph setup on single node : prob with

2012-08-16 Thread Tommi Virtanen
On Wed, Aug 15, 2012 at 9:44 PM, hemant surale hemant.sur...@gmail.com wrote: Hello Tommi, Ceph community I did mkdir the directory. Infact I have created a new partition by the same name and formatted using ext3. I also executed the following command for the partition/directory: mount -o

Re: host settings in ceph.conf when ipaddress != hostname

2012-08-16 Thread Tommi Virtanen
On Thu, Aug 16, 2012 at 12:38 PM, Travis Rhoden trho...@gmail.com wrote: I've seen a couple threads in the past about /etc/init.d/ceph not working right when the output of 'hostname' did not match the 'host' entry in ceph.conf. I'm hoping you can advise me on how to get the following setup
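
A minimal sketch, assuming the init script matches the 'host' value against the machine's short hostname rather than its IP (names and address are illustrative):

    [osd.0]
        ; should match `hostname -s` on the box running this daemon
        host = storage01
        ; the address can be pinned separately if DNS and hostnames disagree
        public addr = 10.0.0.21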

Re: [PATCH] make mkcephfs and init-ceph osd filesystem handling more flexible

2012-08-09 Thread Tommi Virtanen
On Thu, Aug 9, 2012 at 8:42 AM, Danny Kukawka danny.kuka...@bisect.de wrote: Remove btrfs specific keys and replace them by more generic keys to be able to replace btrfs with e.g. xfs or ext4 easily. Add new key to define the osd fs type: 'fstype', which can get defined in the [osd] section

Re: [PATCH] make mkcephfs and init-ceph osd filesystem handling more flexible

2012-08-09 Thread Tommi Virtanen
On Thu, Aug 9, 2012 at 9:46 AM, Jim Schutt jasc...@sandia.gov wrote: I'm embarrassed to admit I haven't been keeping up with this, but I seem to recall that early versions didn't handle a journal on a partition. Did I get that wrong, or maybe that capability exists now? In the past I've

Re: [PATCH] make mkcephfs and init-ceph osd filesystem handling more flexible

2012-08-09 Thread Tommi Virtanen
On Thu, Aug 9, 2012 at 9:49 AM, Danny Kukawka danny.kuka...@bisect.de wrote: And where can I find this new OSD hotplugging style init? Is there any documentation? mkcephfs does not use the new style. The Chef cookbooks we have do. If you use the Juju Charms that Canonical has been working on,

Re: Puppet modules for Ceph

2012-08-09 Thread Tommi Virtanen
On Tue, Aug 7, 2012 at 6:51 AM, Jonathan D. Proulx j...@csail.mit.edu wrote: :Juju seems to provide a real-time notification mechanism between :peers, using it's name-relation-changed hook. Other CM frameworks :may need to step up their game, or be subject to the keep re-running :chef-client

Re: [PATCH] make mkcephfs and init-ceph osd filesystem handling more flexible

2012-08-09 Thread Tommi Virtanen
On Thu, Aug 9, 2012 at 10:03 AM, Danny Kukawka danny.kuka...@bisect.de wrote: So you mean chef?! Will there be an alternative to simply setup a cluster from console? We (SUSE) are already working on an own chef ceph cookbook. But from what I've seen till now it's really hard and more

Re: no key for auth when running auth export mon. -o /tmp/monkey?

2012-08-09 Thread Tommi Virtanen
On Mon, Aug 6, 2012 at 2:07 PM, Matthew Roy imjustmatt...@gmail.com wrote: On Mon, Aug 6, 2012 at 11:55 AM, Tommi Virtanen t...@inktank.com wrote: Do you have the file keyring in your mon data dir, and does it contain a [mon.] section? If so, that section is what you need in /tmp/monkey

Re: [PATCH] doc: Correct Git URL for clone

2012-08-02 Thread Tommi Virtanen
On Thu, Aug 2, 2012 at 3:48 AM, Wido den Hollander w...@widodh.nl wrote: Signed-off-by: Wido den Hollander w...@widodh.nl --- doc/source/clone-source.rst |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/source/clone-source.rst b/doc/source/clone-source.rst index

Re: ceph osd create

2012-08-01 Thread Tommi Virtanen
On Wed, Aug 1, 2012 at 4:27 PM, Mandell Degerness mand...@pistoncloud.com wrote: As of this time, we are allocating OSD numbers based on servers. The ceph osd create command seems like it may be useful, but is not documented nearly well enough yet. Can someone please provide answers to the

Re: About teuthology

2012-07-31 Thread Tommi Virtanen
On Tue, Jul 31, 2012 at 6:59 AM, Mehdi Abaakouk sil...@sileht.net wrote: Hi, I have taken a look into teuthology, the automation of all this tests are good, but are they any way to run it into a already installed ceph clusters ? Thanks in advance. Many of the actual tests being run are

Re: High-availability testing of ceph

2012-07-31 Thread Tommi Virtanen
On Tue, Jul 31, 2012 at 12:31 AM, eric_yh_c...@wiwynn.com wrote: If the performance of rbd device is n MB/s under replica=2, then that means the total io throughputs on hard disk is over 3 * n MB/s. Because I think the total number of copies is 3 in original. So, it seems not correct now,

Re: How to integrate ceph with opendedup.

2012-07-31 Thread Tommi Virtanen
On Tue, Jul 31, 2012 at 2:18 AM, ramu ramu.freesyst...@gmail.com wrote: I want to integrate ceph with opendedup(sdfs) using java-rados. Please help me to integration of ceph with opendedup. It sounds like you could use radosgw and just use S3ChunkStore. If you really want to implement your own

Re: Puppet modules for Ceph

2012-07-31 Thread Tommi Virtanen
On Tue, Jul 24, 2012 at 6:15 AM, loic.dach...@enovance.com wrote: Note that if puppet client was run on nodeB before it was run on nodeA, all three steps would have been run in sequence instead of being spread over two puppet client invocations. Unfortunately, even that is not enough. The

Re: Puppet modules for Ceph

2012-07-31 Thread Tommi Virtanen
On Tue, Jul 31, 2012 at 11:51 AM, Sage Weil s...@inktank.com wrote: It is also possible to feed initial keys to the monitors during the 'mkfs' stage. If the keys can be agreed on somehow beforehand, then they will already be in place when the initial quorum is reached. Not sure if that helps

Re: RadosGW on ubuntu is OK but CentOS fails

2012-07-30 Thread Tommi Virtanen
On Mon, Jul 30, 2012 at 12:58 AM, 袁冬 yuandong1...@gmail.com wrote: I deploy an instance including 1 mon, 1 osd and 1 radosgw on one machine which based on CentOS 6.2. I can access the system from browser(Chrome) which tells me that: no bucket for anonymous user. I think I get everything fine,

Re: uneven placement

2012-07-30 Thread Tommi Virtanen
On Fri, Jul 27, 2012 at 6:07 AM, Yann Dupont yann.dup...@univ-nantes.fr wrote: My ceph cluster is made of 8 OSD with quite big storage attached. All OSD nodes are equal, except 4 OSD have 6,2 TB, 4 have 8 TB storage. Sounds like you should just set the weights yourself, based on the capacities
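
A hedged sketch of setting those weights by hand, using the common convention of weight ≈ capacity in TB (osd ids are illustrative, and the subcommand is assumed to be available in this release):

    # 6.2 TB OSDs
    ceph osd crush reweight osd.0 6.2
    # 8 TB OSDs
    ceph osd crush reweight osd.4 8.0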

Re: Ceph Benchmark HowTo

2012-07-25 Thread Tommi Virtanen
On Wed, Jul 25, 2012 at 1:25 PM, Gregory Farnum g...@inktank.com wrote: Yeah, an average isn't necessarily very useful here — it's what you get because that's easy to implement (with a sum and a counter variable, instead of binning). The inclusion of max and min latencies is an attempt to

Re: Ceph Benchmark HowTo

2012-07-24 Thread Tommi Virtanen
On Tue, Jul 24, 2012 at 8:55 AM, Mark Nelson mark.nel...@inktank.com wrote: personally I think it's fine to have it on the wiki. I do want to stress that performance is going to be (hopefully!) improving over the next couple of months so we will probably want to have updated results (or at

Re: Increase number of PG

2012-07-23 Thread Tommi Virtanen
On Sun, Jul 22, 2012 at 11:57 PM, Sławomir Skowron szi...@gmail.com wrote: My workload looks like this: - Max 20% are PUTs, with 99% of objects smaller then 4MB, - 80% are GETs, and S3 metadata operations. Well, the good news is that that's actually the easy to fix part -- just increase the

Re: Poor read performance in KVM

2012-07-20 Thread Tommi Virtanen
On Fri, Jul 20, 2012 at 9:17 AM, Vladimir Bashkirtsev vladi...@bashkirtsev.com wrote: not running. So I ended up rebooting hosts and that's where fun begin: btrfs has failed to umount , on boot up it spit out btrfs: free space inode generation (0) did not match free space cache generation

Re: Increase number of PG

2012-07-20 Thread Tommi Virtanen
On Fri, Jul 20, 2012 at 8:31 AM, Sławomir Skowron szi...@gmail.com wrote: I know that this feature is disabled, are you planning to enable this in near future ?? PG splitting/joining is the next major project for the OSD. It won't be backported to argonaut, but it will be in the next stable

Re: Running into an error with mkcephfs

2012-07-20 Thread Tommi Virtanen
On Fri, Jul 20, 2012 at 3:04 PM, Gregory Farnum g...@inktank.com wrote: If I remember mkcephfs correctly, it deliberately does not create the directories for each store (you'll notice that http://ceph.com/docs/master/start/quick-start/#deploy-the-configuration includes creating the directory
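
In other words, something along these lines before running mkcephfs, assuming the default data paths (adjust to whatever 'osd data' and 'mon data' point at in ceph.conf):

    mkdir -p /var/lib/ceph/osd/ceph-0
    mkdir -p /var/lib/ceph/mon/ceph-a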

Re: Ceph doesn't update the block device size while a rbd image is mounted

2012-07-19 Thread Tommi Virtanen
On Thu, Jul 19, 2012 at 8:26 AM, Sébastien Han han.sebast...@gmail.com wrote: Ok I got your point seems logic, but why is this possible with LVM for example? You can easily do this with LVM without un-mounting the device. Do your LVM volumes have partition tables inside them? That might be

Re: Ceph doesn't update the block device size while a rbd image is mounted

2012-07-19 Thread Tommi Virtanen
On Thu, Jul 19, 2012 at 8:38 AM, Tommi Virtanen t...@inktank.com wrote: Do your LVM volumes have partition tables inside them? That might be the difference. Of course, you can put your filesystem straight on the RBD; that would be a good experiment to run. Oops, I see you did put your fs

Re: Poor read performance in KVM

2012-07-19 Thread Tommi Virtanen
On Thu, Jul 19, 2012 at 5:19 AM, Vladimir Bashkirtsev vladi...@bashkirtsev.com wrote: Look like that osd.0 performs with low latency but osd.1 latency is way too high and on average it appears as 200ms. osd is backed by btrfs over LVM2. May be issue lie in backing fs selection? All four osds

Re: Puppet modules for Ceph

2012-07-18 Thread Tommi Virtanen
On Wed, Jul 18, 2012 at 6:58 AM, François Charlier francois.charl...@enovance.com wrote: I'm currently working on writing a Puppet module for Ceph. As after some research I found no existing module, I'll start from scratch but I would be glad to hear from people who would already have started

Re: Puppet modules for Ceph

2012-07-18 Thread Tommi Virtanen
On Wed, Jul 18, 2012 at 2:59 PM, Teyo Tyree t...@puppetlabs.com wrote: As you probably know, Puppet Labs is based in Portland. Are you attending OScon? It might be a good opportunity for us to have some face to face hacking time on a Puppet module. Let me know if you would like for us to

Re: Puppet modules for Ceph

2012-07-18 Thread Tommi Virtanen
On Wed, Jul 18, 2012 at 3:26 PM, Teyo Tyree t...@puppetlabs.com wrote: Ha, that would be an interesting experiment indeed. I think Francois would like to have the Puppet module done sooner rather than later. Are the current Chef cookbooks functional enough for us to get started with them as a

Re: Intermittent loss of connectivity with KVM-Ceph-Network (solved)

2012-07-17 Thread Tommi Virtanen
On Mon, Jul 16, 2012 at 5:21 PM, Australian Jade sa...@australianjade.com wrote: all in vain. I also noticed that when hold up on interface happened pings come back with second difference: ie they piling up on interface and then suddenly sent through all at once and remote host returns all of

Re: can rbd unmap detect if device is mounted?

2012-07-16 Thread Tommi Virtanen
On Mon, Jul 16, 2012 at 3:43 PM, Josh Durgin josh.dur...@inktank.com wrote: I've made this mistake a couple of times now (completely my fault, when will I learn?), and am wondering if a bit of protection can be put in place against user errors. Yeah, we've been working on advisory locking. The
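
Until such protection lands, a hedged shell guard one can wrap around unmap by hand (device path is illustrative):

    DEV=$(readlink -f /dev/rbd/rbd/myimage)
    # refuse to unmap while the device is still mounted somewhere
    if grep -q "^$DEV " /proc/mounts; then
        echo "$DEV is still mounted, refusing to unmap" >&2
    else
        rbd unmap "$DEV"
    fi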

Re: Linux large IO size.

2012-07-13 Thread Tommi Virtanen
On Fri, Jul 13, 2012 at 12:07 PM, Matt Weil mw...@genome.wustl.edu wrote: Is it possible to get IO sizes 1024k and larger? Can you be a bit more explicit in what you're asking? Are you talking about submitting IO to RBD (via the kernel module? inside a qemu vm?), are you talking about using the

Re: [PATCH] vstart: use absolute path for keyring

2012-07-13 Thread Tommi Virtanen
On Fri, Jul 13, 2012 at 5:01 PM, Noah Watkins jayh...@cs.ucsc.edu wrote: Stores absolute path to the generated keyring so that tests running in other directories (e.g. src/java/test) can simply reference the generated ceph.conf. Perhaps relative paths in ceph.conf should be interpreted as

Re: librbd: error finding header

2012-07-12 Thread Tommi Virtanen
On Wed, Jul 11, 2012 at 9:41 PM, Josh Durgin josh.dur...@inktank.com wrote: You're right about the object name - you can get its offset in the image that way. Since rbd is thin-provisioned, however, the highest index object might not be the highest possible object. When you first create an

Re: [PATCH] Robustify ceph-rbdnamer and adapt udev rules

2012-07-11 Thread Tommi Virtanen
On Wed, Jul 11, 2012 at 9:28 AM, Josh Durgin josh.dur...@inktank.com wrote: This all makes sense, but maybe we should put the -part suffix in another namespace to avoid colliding with images that happen to have -partN in their name, e.g.: /dev/rbd/mypool/myrbd/part1 - /dev/rbd3p1 Good

Re: domino-style OSD crash

2012-07-10 Thread Tommi Virtanen
On Tue, Jul 10, 2012 at 2:46 AM, Yann Dupont yann.dup...@univ-nantes.fr wrote: As I've keeped the original broken btrfs volumes, I tried this morning to run the old osd in parrallel, using the $cluster variable. I only have partial success. The cluster mechanism was never intended for moving

ioping: tool to monitor I/O latency in real time

2012-07-10 Thread Tommi Virtanen
Hi. I stumbled on this over the weekend, and thought people here might be interested in seeing whether it's useful in figuring out things like btrfs health: http://code.google.com/p/ioping/ -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to

Re: domino-style OSD crash

2012-07-10 Thread Tommi Virtanen
On Tue, Jul 10, 2012 at 9:39 AM, Yann Dupont yann.dup...@univ-nantes.fr wrote: The cluster mechanism was never intended for moving existing osds to other clusters. Trying that might not be a good idea. Ok, good to know. I saw that the remaining maps could lead to problem, but in 2 words, what

Re: domino-style OSD crash

2012-07-10 Thread Tommi Virtanen
On Tue, Jul 10, 2012 at 10:36 AM, Yann Dupont yann.dup...@univ-nantes.fr wrote: Fundamentally, it comes down to this: the two clusters will still have the same fsid, and you won't be isolated from configuration errors or (CEPH-PROD is the old btrfs volume ). /CEPH is new xfs volume, completely

Re: domino-style OSD crash

2012-07-09 Thread Tommi Virtanen
On Wed, Jul 4, 2012 at 1:06 AM, Yann Dupont yann.dup...@univ-nantes.fr wrote: Well, I probably wasn't clear enough. I talked about crashed FS, but i was talking about ceph. The underlying FS (btrfs in that case) of 1 node (and only one) has PROBABLY crashed in the past, causing corruption in

Re: domino-style OSD crash

2012-07-09 Thread Tommi Virtanen
On Mon, Jul 9, 2012 at 12:05 PM, Yann Dupont yann.dup...@univ-nantes.fr wrote: The information here isn't enough to say whether the cause of the corruption is btrfs or LevelDB, but the recovery needs to handled by LevelDB -- and upstream is working on making it more robust:

Re: Unable to restart Mon after reboot

2012-07-03 Thread Tommi Virtanen
On Tue, Jul 3, 2012 at 9:35 AM, David Blundell david.blund...@100percentit.com wrote: We switched to Ubuntu 12.04 for the tests which stopped all btrfs problems. We have now spent a week running the iotester corruption tests in KVM instances while live migrating them every 5 minutes (with and

Re: domino-style OSD crash

2012-07-03 Thread Tommi Virtanen
On Tue, Jul 3, 2012 at 1:40 AM, Yann Dupont yann.dup...@univ-nantes.fr wrote: Upgraded the kernel to 3.5.0-rc4 + some patches, seems btrfs is OK right now. Tried to restart osd with 0.47.3, then next branch, and today with 0.48. 4 of 8 nodes fails with the same message : ceph version

Re: domino-style OSD crash

2012-07-03 Thread Tommi Virtanen
On Tue, Jul 3, 2012 at 1:54 PM, Yann Dupont yann.dup...@univ-nantes.fr wrote: In the case I could repair, do you think a crashed FS as it is right now is valuable for you, for future reference , as I saw you can't reproduce the problem ? I can make an archive (or a btrfs dump ?), but it will be

Re: ceph was didn't stop.

2012-06-28 Thread Tommi Virtanen
On Wed, Jun 27, 2012 at 10:05 PM, ramu ramu.freesyst...@gmail.com wrote: + ssh gamma cd /etc/ceph ; ulimit -c unlimited ; while [ 1 ]; do        [ -e /var/run/ceph/osd.1.pid ] || break        pid=`cat /var/run/ceph/osd.1.pid`        while [ -e /proc/$pid ] && grep -q ceph-osd /proc/$pid/cmdline

Re: reproducable osd crash

2012-06-26 Thread Tommi Virtanen
On Mon, Jun 25, 2012 at 10:48 PM, Stefan Priebe s.pri...@profihost.ag wrote: Strange just copied /core.hostname and /usr/bin/ceph-osd no idea how this can happen. For building I use the provided Debian scripts. Perhaps you upgraded the debs but did not restart the daemons? That would make the

Re: Unable to restart Mon after reboot

2012-06-25 Thread Tommi Virtanen
On Sat, Jun 23, 2012 at 3:43 AM, David Blundell david.blund...@100percentit.com wrote: The logs on all three servers are full of messages like: Jun 23 04:02:19 Store2 kernel: [63811.494955] ceph-osd: page allocation failure: order:3, mode:0x4020 The difference between the lines is that

Re: Ceph as a NOVA-INST-DIR/instances/ storage backend

2012-06-25 Thread Tommi Virtanen
On Sat, Jun 23, 2012 at 11:42 AM, Igor Laskovy igor.lask...@gmail.com wrote: Hi all from hot Kiev)) Does anybody use Ceph as a backend storage for NOVA-INST-DIR/instances/ ? Is it in production use? Live migration is still possible? I kindly ask any advice of best practices point of view.

Re: RBD layering design draft

2012-06-22 Thread Tommi Virtanen
On Fri, Jun 22, 2012 at 7:36 AM, Guido Winkelmann guido-c...@thisisnotatest.de wrote: rbd: Cannot unpreserve: Still in use by pool2/image2 What if it's in use by a lot of images? Should it print them all, or should it print something like Still in use by pool2/image2 and 50 others, use

Re: RBD layering design draft

2012-06-22 Thread Tommi Virtanen
On Thu, Jun 21, 2012 at 2:51 PM, Alex Elder el...@dreamhost.com wrote: Before cloning a snapshot, you must mark it as preserved, to prevent it from being deleted while child images refer to it: ::     $ rbd preserve pool/image@snap Why is it necessary to do this?  I think it may be
