Re: [RFC][PATCH] osd: Add local_connection to fast_dispatch in func _send_boot.

2014-07-16 Thread Gregory Farnum
I'm looking at this and getting a little confused. Can you provide a log of the crash occurring? (preferably with debug_ms=20, debug_osd=20) -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Sun, Jul 13, 2014 at 8:17 PM, Ma, Jianpeng jianpeng...@intel.com wrote: When do
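One way to capture such a log, sketched here with a placeholder daemon id and the default log path (not taken from the thread), is to raise the debug levels on the affected OSD before reproducing the crash:

    # in ceph.conf on the affected node
    [osd]
        debug ms = 20
        debug osd = 20

    # or injected at runtime without a restart (osd.0 is a placeholder id)
    ceph tell osd.0 injectargs '--debug-ms 20 --debug-osd 20'

The resulting log ends up in /var/log/ceph/ceph-osd.0.log by default.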

Re: Disabling CRUSH for erasure code and doing custom placement

2014-07-15 Thread Gregory Farnum
is contained within a centralized sort of metadata server? I understand that for a simple object store the MDS is not used, but is there a way to utilize it for faster querying? Regards, Shayan Saeed On Tue, Jun 24, 2014 at 11:37 AM, Gregory Farnum g...@inktank.com wrote: On Tue, Jun 24, 2014 at 8:29 AM

Re: A log bug in OSD.cc

2014-07-07 Thread Gregory Farnum
Thanks! I've fixed this up in master. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Fri, Jul 4, 2014 at 11:41 PM, 李沛伦 lpl6338...@gmail.com wrote: It seems that the 2 log sentences OSD.cc:1220 dout(10) init creating/touching snapmapper object dendl; OSD.cc:1230

Re: QA Session for ReplicatedPG.cc

2014-07-03 Thread Gregory Farnum
: Yeah I wish there was a way to do it inline. You can add them as comments at the bottom of the Gist or as a reply to this email. Whatever is easiest. - Luis On 07/02/2014 03:44 PM, Gregory Farnum wrote: I had a few minutes unexpectedly free so I was going to answer (some of) these, but I

Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-07-02 Thread Gregory Farnum
. As this mail went unseen for some days, I thought nobody had an idea or could help. Stefan On Wed, Jul 2, 2014 at 9:01 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: Am 02.07.2014 00:51, schrieb Gregory Farnum: On Thu, Jun 26, 2014 at 11:49 PM, Stefan Priebe - Profihost AG s.pri

Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-07-02 Thread Gregory Farnum
On Wed, Jul 2, 2014 at 12:00 PM, Stefan Priebe s.pri...@profihost.ag wrote: Am 02.07.2014 16:00, schrieb Gregory Farnum: Yeah, it's fighting for attention with a lot of other urgent stuff. :( Anyway, even if you can't look up any details or reproduce at this time, I'm sure you know what

Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-07-02 Thread Gregory Farnum
On Wed, Jul 2, 2014 at 12:44 PM, Stefan Priebe s.pri...@profihost.ag wrote: Hi Greg, Am 02.07.2014 21:36, schrieb Gregory Farnum: On Wed, Jul 2, 2014 at 12:00 PM, Stefan Priebe s.pri...@profihost.ag wrote: Am 02.07.2014 16:00, schrieb Gregory Farnum: Yeah, it's fighting for attention

Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-07-01 Thread Gregory Farnum
On Thu, Jun 26, 2014 at 11:49 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: Hi Greg, Am 26.06.2014 02:17, schrieb Gregory Farnum: Sorry we let this drop; we've all been busy traveling and things. There have been a lot of changes to librados between Dumpling and Firefly

Re: master does not compile

2014-06-26 Thread Gregory Farnum
Looking at our gitbuilder page, we haven't built Ceph on Fedora for a while, but it's fine elsewhere. And the Fedora issues look to be some config issue; it never actually makes it to the compilation stage... http://ceph.com/gitbuilder.cgi -Greg Software Engineer #42 @ http://inktank.com |

Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-06-25 Thread Gregory Farnum
Sorry we let this drop; we've all been busy traveling and things. There have been a lot of changes to librados between Dumpling and Firefly, but we have no idea what would have made it slower. Can you provide more details about how you were running these tests? -Greg Software Engineer #42 @

Re: Disabling CRUSH for erasure code and doing custom placement

2014-06-24 Thread Gregory Farnum
On Tue, Jun 24, 2014 at 8:29 AM, Shayan Saeed shayansaee...@gmail.com wrote: Hi, CRUSH placement algorithm works really nice with replication. However, with erasure code, my cluster has some issues which require making changes that I cannot specify with CRUSH maps. Sometimes, depending on

Re: Disabling CRUSH for erasure code and doing custom placement

2014-06-24 Thread Gregory Farnum
challenging. The source changes you're talking about will prove really challenging. ;) -Greg Regards, Shayan Saeed On Tue, Jun 24, 2014 at 11:37 AM, Gregory Farnum g...@inktank.com wrote: On Tue, Jun 24, 2014 at 8:29 AM, Shayan Saeed shayansaee...@gmail.com wrote: Hi, CRUSH placement algorithm

Re: CEPH IOPS Baseline Measurements with MemStore

2014-06-23 Thread Gregory Farnum
On Fri, Jun 20, 2014 at 12:41 AM, Alexandre DERUMIER aderum...@odiso.com wrote: There is also a tracker here http://tracker.ceph.com/issues/7191 Replace Mutex to RWLock with fdcache_lock in FileStore seems to be done, but I'm not sure it's already in the master branch? I believe this

Re: Ubuntu 12.04 MDS tcmalloc leaks

2014-06-23 Thread Gregory Farnum
. On Fri, Apr 11, 2014 at 6:23 PM, Gregory Farnum g...@inktank.com wrote: On Fri, Apr 11, 2014 at 11:07 AM, Milosz Tanski mil...@adfin.com wrote: On Fri, Apr 11, 2014 at 1:07 PM, Gregory Farnum g...@inktank.com wrote: On Fri, Apr 11, 2014 at 8:59 AM, Milosz Tanski mil...@adfin.com wrote

MDS suggestions for upcoming CDS

2014-06-17 Thread Gregory Farnum
Hey all, It's a bit of a broken record, but we are again trying to kickstart CephFS development going forward. To that end, I've created one blueprint for the next CDS on CephFS Forward Scrub (https://wiki.ceph.com/Planning/Blueprints/Submissions/CephFS%3A_Forward_Scrub) based off discussions on

Re: Changes of scrubbing?

2014-06-11 Thread Gregory Farnum
On Wed, Jun 11, 2014 at 12:54 AM, Guang Yang yguan...@outlook.com wrote: On Jun 11, 2014, at 6:33 AM, Gregory Farnum g...@inktank.com wrote: On Tue, May 20, 2014 at 6:44 PM, Guang Yang yguan...@outlook.com wrote: Hi ceph-devel, Like some users of Ceph, we are using Ceph for a latency

Re: [Ceph] Managing crushmap

2014-06-11 Thread Gregory Farnum
That doesn't sound right. Can you supply your decompiled CRUSH map, the exact commands you ran against the ceph cluster, and the exact version(s) you ran the test against? -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Wed, Jun 11, 2014 at 2:17 AM,
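For reference, the decompiled map and version information being asked for can be gathered roughly like this (file names are arbitrary):

    ceph osd getcrushmap -o crushmap.bin        # dump the compiled CRUSH map
    crushtool -d crushmap.bin -o crushmap.txt   # decompile it to editable text
    ceph --version                              # record the version under test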

Re: xattr spillout appears broken :(

2014-06-11 Thread Gregory Farnum
, 2014 at 3:00 AM, Haomai Wang haomaiw...@gmail.com wrote: Yes, maybe you can add it in your branch. Because it will happen when creating object and set spill out On Sat, Jun 7, 2014 at 2:59 AM, Gregory Farnum g...@inktank.com wrote: On Fri, Jun 6, 2014 at 11:55 AM, Haomai Wang haomaiw

Re: Changes of scrubbing?

2014-06-10 Thread Gregory Farnum
On Tue, May 20, 2014 at 6:44 PM, Guang Yang yguan...@outlook.com wrote: Hi ceph-devel, Like some users of Ceph, we are using Ceph for a latency sensitive project, and scrubbing (especially deep-scrubbing) impacts the SLA in a non-trivial way, as commodity hardware could fail in one way or

Re: [Feature]Proposal for adding a new flag named shared to support performance and statistic purpose

2014-06-10 Thread Gregory Farnum
We discussed a great deal of this during the initial format 2 work as well, when we were thinking about having bitmaps of allocated space. (Although we also have interval sets which might be a better fit?) I think there was more thought behind it than is in the copy-on-read blueprint; do you know

Re: Locally repairable code description revisited (was Pyramid ...)

2014-06-09 Thread Gregory Farnum
On Fri, Jun 6, 2014 at 7:30 AM, Loic Dachary l...@dachary.org wrote: Hi Andreas, On 06/06/2014 13:46, Andreas Joachim Peters wrote: Hi Loic, the basic implementation looks very clean. I have few comments/ideas: - the reconstruction strategy using the three levels is certainly efficient

Re: xattr spillout appears broken :(

2014-06-06 Thread Gregory Farnum
On Fri, Jun 6, 2014 at 11:55 AM, Haomai Wang haomaiw...@gmail.com wrote: The fix should make the clone method copy cephos-prefixed xattrs On Sat, Jun 7, 2014 at 2:54 AM, Haomai Wang haomaiw...@gmail.com wrote: Hi Greg, I have found the reason. user.cephos.spill_out can't be applied to a new object

Re: Librbd licensing

2014-06-02 Thread Gregory Farnum
On Mon, Jun 2, 2014 at 11:00 AM, Josh Durgin josh.dur...@inktank.com wrote: On 06/02/2014 10:22 AM, Sage Weil wrote: Ideally the change comes from Josh, who originally put the notice there, but I think it shouldn't matter. We relicensed rbd.cc as LGPL2 a while back (it was GPL due to a

Re: xattr spillout appears broken :(

2014-05-30 Thread Gregory Farnum
On Fri, May 30, 2014 at 2:18 AM, Haomai Wang haomaiw...@gmail.com wrote: Hi Gregory, I try to reproduce the bug in my local machine but failed. My test cmdline: ./ceph_test_rados --op read 100 --op write 100 --op delete 50 --max-ops 40 --objects 1024 --max-in-flight 64 --size 400

xattr spillout appears broken :(

2014-05-29 Thread Gregory Farnum
I got a bug report a little while ago that we were hitting leveldb on every xattr lookup with Firefly, and after looking into it I indeed discovered that we never set the XATTR_SPILL_OUT_NAME xattr when creating new files. I pushed a branch wip-xattr-spillout containing a fix for that, plus some
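For anyone checking their own filestore OSDs, the flag in question is stored as an xattr on the object file itself; a rough way to inspect it (the object path below is a placeholder, not taken from the report):

    getfattr -n user.cephos.spill_out \
        /var/lib/ceph/osd/ceph-0/current/<pg>_head/<object-file>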

Re: XioMessenger (RDMA) Status Update

2014-05-16 Thread Gregory Farnum
On Fri, May 16, 2014 at 11:21 AM, Sage Weil s...@inktank.com wrote: Hi Matt, I've rebased this branch on top of master and pushed it to wip-xio in ceph.git, and then opened a pull request to capture review: https://github.com/ceph/ceph/pull/1819 I would like to get some of the

Re: [ceph-users] Does CEPH rely on any multicasting?

2014-05-15 Thread Gregory Farnum
On Thu, May 15, 2014 at 9:52 AM, Amit Vijairania amit.vijaira...@gmail.com wrote: Hello! Does CEPH rely on any multicasting? Appreciate the feedback.. Nope! All networking is point-to-point. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com -- To unsubscribe from this list:

Re: Suspend and the ceph clients

2014-05-15 Thread Gregory Farnum
On Thu, May 15, 2014 at 1:13 AM, Holger Hoffstätte holger.hoffstae...@googlemail.com wrote: On Wed, 14 May 2014 15:07:44 -0700, Gregory Farnum wrote: [..] Unfortunately, I don't know anything about Linux's suspend functionality or APIs, and my weak attempts at googling and grepping aren't

Suspend and the ceph clients

2014-05-14 Thread Gregory Farnum
There's a recent ticket discussing the behavior of ceph-fuse after the machine it's running on has been suspended: http://tracker.ceph.com/issues/8291 In short, CephFS clients which are disconnected from the cluster for a sufficiently long time are generally forbidden from reconnecting — after a

Re: XioMessenger (RDMA) Status Update

2014-05-13 Thread Gregory Farnum
On Tue, May 13, 2014 at 2:08 PM, Matt W. Benjamin m...@cohortfs.com wrote: Hi Ceph Devs, I've pushed two Ceph+Accelio branches, xio-firefly and xio-firefly-cmake to our public ceph repository https://github.com/linuxbox2/linuxbox-ceph.git . These branches are pulled up to the HEAD of

Re: XioMessenger (RDMA) Status Update

2014-05-13 Thread Gregory Farnum
On Tue, May 13, 2014 at 6:30 PM, Matt W. Benjamin m...@cohortfs.com wrote: Hi Greg, 1. there isn't support for mark_down() yet (!), but we're working on it. I've had good discussions with the Accelio team regarding exposing explicit channel and flow control interfaces, which round out

Re: CEPH - messenger rebind race

2014-05-08 Thread Gregory Farnum
On Thu, May 8, 2014 at 12:12 AM, Guang yguan...@yahoo.com wrote: Thanks Sage and Greg! I just captured a crash for which I have the logs from both sides (sadly we still use debug_ms=0, as increasing the debug_ms level dramatically consumes more disk space); from the log, I think I get some more

Re: [ceph-users] v0.80 Firefly released

2014-05-07 Thread Gregory Farnum
On Wed, May 7, 2014 at 8:44 AM, Dan van der Ster daniel.vanders...@cern.ch wrote: Hi, Sage Weil wrote: * *Primary affinity*: Ceph now has the ability to skew selection of OSDs as the primary copy, which allows the read workload to be cheaply skewed away from parts of the cluster
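A minimal sketch of the new knob, assuming osd.3 is the OSD whose read load should be reduced (the id and weight are examples only):

    # the monitors must allow the feature first (off by default in firefly)
    ceph tell mon.* injectargs '--mon-osd-allow-primary-affinity 1'
    # 1.0 is the default; lower values make this OSD primary less often
    ceph osd primary-affinity osd.3 0.5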

Re: [ceph-users] v0.80 Firefly released

2014-05-07 Thread Gregory Farnum
On Wed, May 7, 2014 at 11:18 AM, Mike Dawson mike.daw...@cloudapt.com wrote: On 5/7/2014 11:53 AM, Gregory Farnum wrote: On Wed, May 7, 2014 at 8:44 AM, Dan van der Ster daniel.vanders...@cern.ch wrote: Hi, Sage Weil wrote: * *Primary affinity*: Ceph now has the ability to skew

Re: CEPH - messenger rebind race

2014-05-07 Thread Gregory Farnum
On Wed, May 7, 2014 at 5:52 PM, Sage Weil s...@inktank.com wrote: [CCing Greg and ceph-devel] On Wed, 7 May 2014, Guang Yang wrote: Hi Sage, Sorry to bother you directly, I am debugging / fixing issue http://tracker.ceph.com/issues/8232, during which time I studied the messenger component

Re: [PATCH] locks: ensure that fl_owner is always initialized properly in flock and lease codepaths

2014-05-06 Thread Gregory Farnum
The Ceph bit is fine. Acked-by: Greg Farnum g...@inktank.com On Mon, Apr 28, 2014 at 10:50 AM, Jeff Layton jlay...@poochiereds.net wrote: Currently, the fl_owner isn't set for flock locks. Some filesystems use byte-range locks to simulate flock locks and there is a common idiom in those that

Re: xio-rados-firefly branch update

2014-05-01 Thread Gregory Farnum
On Mon, Apr 28, 2014 at 8:14 PM, Matt W. Benjamin m...@linuxbox.com wrote: Hi Greg, - Gregory Farnum g...@inktank.com wrote: The re-org mostly looks fine. I notice you're adding a few more friend declarations though, and I don't think those should be necessary — Connection can label

Re: default filestore max sync interval

2014-04-29 Thread Gregory Farnum
On Tue, Apr 29, 2014 at 1:10 PM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi all, Why is the default max sync interval only 5 seconds? Today we realized what a huge difference that increasing this to 30 or 60s can do for the small write latency. Basically, with a 5s interval our 4k
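The tunables under discussion, as a ceph.conf sketch (the values are illustrative, not recommendations):

    [osd]
        filestore min sync interval = 0.01
        filestore max sync interval = 30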

Re: default filestore max sync interval

2014-04-29 Thread Gregory Farnum
On Tue, Apr 29, 2014 at 1:35 PM, Stefan Priebe s.pri...@profihost.ag wrote: Hi Greg, Am 29.04.2014 22:23, schrieb Gregory Farnum: On Tue, Apr 29, 2014 at 1:10 PM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi all, Why is the default max sync interval only 5 seconds? Today we

Re: xio-rados-firefly branch update

2014-04-28 Thread Gregory Farnum
A few days later than I wanted, but I got through various pieces of this today. It wasn't a thorough review but more a shape of things check, but I have a bunch of notes. On Tue, Apr 22, 2014 at 2:50 PM, Matt W. Benjamin m...@linuxbox.com wrote: Hi Greg, Sure. I'm interested in all feedback

Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-25 Thread Gregory Farnum
Hmm, it looks like your on-disk SessionMap is horrendously out of date. Did your cluster get full at some point? In any case, we're working on tools to repair this now but they aren't ready for use yet. Probably the only thing you could do is create an empty sessionmap with a higher version than

Re: bandwidth with Ceph - v0.59 (Bobtail)

2014-04-25 Thread Gregory Farnum
Bobtail is really too old to draw any meaningful conclusions from; why did you choose it? That's not to say that performance on current code will be better (though it very much might be), but the internal architecture has changed in some ways that will be particularly important for the futex

Re: xio-rados-firefly branch update

2014-04-22 Thread Gregory Farnum
Awesome! I'll try and take a preliminary look at this in the next day or two. What kind of feedback are you interested in right now? -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue, Apr 22, 2014 at 12:44 PM, Matt W. Benjamin m...@linuxbox.com wrote: Hi, We've

Re: Ceph daemon memory utilization: 'heap release' drops use by 50%

2014-04-14 Thread Gregory Farnum
What distro are you running on? -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Mon, Apr 14, 2014 at 5:28 AM, David McBride dw...@cam.ac.uk wrote: Hello, I'm currently experimenting with a Ceph deployment, and am noting that some of my machines are having processes

Re: Ceph daemon memory utilization: 'heap release' drops use by 50%

2014-04-14 Thread Gregory Farnum
as well — it might be a tcmalloc issue they can resolve in their repo. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Mon, Apr 14, 2014 at 7:04 AM, David McBride dw...@cam.ac.uk wrote: On 14/04/14 14:53, Gregory Farnum wrote: What distro are you running on? -Greg Hi Greg
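The heap commands referred to in the subject line are issued through the admin interface; a sketch with a placeholder daemon id:

    ceph tell osd.0 heap stats      # report tcmalloc heap usage
    ceph tell osd.0 heap release    # hand freed-but-cached memory back to the OS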

Re: Ubuntu 12.04 MDS tcmalloc leaks

2014-04-11 Thread Gregory Farnum
On Fri, Apr 11, 2014 at 8:59 AM, Milosz Tanski mil...@adfin.com wrote: I'd like to restart this debate about tcmalloc slow leaks in MDS. This time around I have some charts. Looking at OSDs and MONs, it doesn't seem to affect those (as much). Here's the chart: http://i.imgur.com/xMCINAD.png

Re: [Share]Performance tunning on Ceph FileStore with SSD backend

2014-04-09 Thread Gregory Farnum
On Wed, Apr 9, 2014 at 3:05 AM, Haomai Wang haomaiw...@gmail.com wrote: Hi all, I would like to share some ideas about how to improve performance on Ceph with SSDs. It's not very precise. Our SSDs are 500GB and each OSD owns an SSD (the journal is on the same SSD). The ceph version is 0.67.5 (Dumpling). At

Re: Deterministic thrashing

2014-04-07 Thread Gregory Farnum
This would be really nice but there are unfortunately even more hiccups than you've noted here: 1) Thrashing is both time and disk access sensitive, and hardware differs 2) The teuthology thrashing is triggered largely based on PG state events (eg, all PGs are clean, so restart an OSD) 3) The

Re: Deterministic thrashing

2014-04-07 Thread Gregory Farnum
On Mon, Apr 7, 2014 at 10:13 AM, Loic Dachary l...@dachary.org wrote: On 07/04/2014 18:55, Gregory Farnum wrote: This would be really nice but there are unfortunately even more hiccups than you've noted here: 1) Thrashing is both time and disk access sensitive, and hardware differs 2

Re: [ceph-users] Ceph User Committee monthly meeting #1 : executive summary

2014-04-04 Thread Gregory Farnum
On Fri, Apr 4, 2014 at 11:15 AM, Milosz Tanski mil...@adfin.com wrote: Loic, The writeup has been helpful. What I'm curious about (and hasn't been mentioned) is can we use erasure with CephFS? What steps have to be taken in order to setup erasure coding for CephFS? Lots. CephFS takes

Re: Multiple Posix namespaces?

2014-04-04 Thread Gregory Farnum
Not yet, no. There are a couple different approaches to this that a third-party contributor could work on without too much difficulty (I *think* there's a blueprint floating around somewhere), but nobody's done so yet. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Fri,

Re: Assertion error in librados

2014-03-31 Thread Gregory Farnum
Nope, I don't think anybody's looked into it. If you have core dumps you could get a backtrace and the return value referenced. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Fri, Mar 28, 2014 at 2:54 AM, Filippos Giannakos philipg...@grnet.gr wrote: Hello, We recently
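A rough outline of pulling that information out of a core dump (the binary and core file names are placeholders):

    gdb /path/to/your-librados-app core.12345
    (gdb) bt            # backtrace of the thread that hit the assert
    (gdb) info locals   # inspect the return value the assert complains about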

Re: Question about librados notification

2014-03-25 Thread Gregory Farnum
On Tue, Mar 25, 2014 at 5:50 AM, Shinji Matsumoto shinji.matsum...@us.sios.com wrote: Hello all, I have a question about Ceph notification mechanism. http://ceph.com/docs/master/architecture/#object-watch-notify Scenario: (1) 3 clients (client1, client2, client3) have interests on a Ceph

Re: ceph-0.77-900.gce9bfb8 Testing Rados EC/Tiering CephFS ...

2014-03-25 Thread Gregory Farnum
On Thu, Mar 20, 2014 at 3:49 AM, Andreas Joachim Peters andreas.joachim.pet...@cern.ch wrote: Hi, I did some Firefly ceph-0.77-900.gce9bfb8 testing of EC/Tiering deploying 64 OSD with in-memory filesystems (RapidDisk with ext4) on a single 256 GB box. The raw write performance of this box

Re: Limiting specific to specific directory, client separation

2014-03-24 Thread Gregory Farnum
This is not currently a priority in Inktank's roadmap for the MDS. :( But we discussed client security in more detail than those tickets during the Dumpling Ceph Developer Summit: http://wiki.ceph.com/Planning/CDS/Dumpling (search for 1G: Client Security for CephFS -- there's a blueprint, an

Re: rbd client map error

2014-03-19 Thread Gregory Farnum
, 1129 GB / 1489 GB avail 6553604/9830406 objects degraded (66.667%) 1776 active+degraded Thanks Regards Somnath -Original Message- From: Gregory Farnum [mailto:g...@inktank.com] Sent: Wednesday, March 19, 2014 3:38 PM To: Somnath Roy Cc: Sage Weil

Re: XioMessenger (RDMA) Performance results

2014-03-18 Thread Gregory Farnum
On Tue, Mar 18, 2014 at 1:05 PM, Yaron Haviv yar...@mellanox.com wrote: I'm happy to share test results we ran in the lab with Matt's latest XioMessenger code, which implements Ceph messaging over the Accelio RDMA library. Results look pretty encouraging, demonstrating a * 20x * performance boost

Re: help - i/o error when mounting with cephfs

2014-03-14 Thread Gregory Farnum
On Fri, Mar 14, 2014 at 6:06 PM, Shaun Keenan skee...@gmail.com wrote: When trying to mount off my ceph cluster I get this: mount error 5 = Input/output error cluster looks healthy: [root@ceph-mds2 ~]# ceph -s cluster 61b6dda1-5412-41f7-9769-3ae7e47241b7 health HEALTH_OK monmap
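For comparison, a kernel CephFS mount generally looks like the following (the monitor address and secret file are placeholders); mount error 5 is usually accompanied by a more specific message in dmesg:

    mount -t ceph 192.168.0.1:6789:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret
    dmesg | tail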

Re: contraining crush placement possibilities

2014-03-10 Thread Gregory Farnum
, why not map object_id to OSD combinations directly, will it achieve a more uniform distribution? On 2014/3/8 1:43, Sage Weil wrote: On Fri, 7 Mar 2014, Gregory Farnum wrote: On Fri, Mar 7, 2014 at 7:10 AM, Sage Weil s...@inktank.com wrote: On Fri, 7 Mar 2014, Dan van der Ster wrote

Re: [PATCH v2] ceph: use fl-fl_file as owner identifier of flock and posix lock

2014-03-10 Thread Gregory Farnum
Okay, this problem makes sense to me and I think your basic approach is good. I've got no problem with it after all. :) Just throwing this out there, if we're worried about exposing kernel addresses to external processes, and don't want them to collide, should we just keep a mapping of

Re: contraining crush placement possibilities

2014-03-07 Thread Gregory Farnum
On Fri, Mar 7, 2014 at 7:10 AM, Sage Weil s...@inktank.com wrote: On Fri, 7 Mar 2014, Dan van der Ster wrote: On Thu, Mar 6, 2014 at 9:30 PM, Sage Weil s...@inktank.com wrote: Sheldon just pointed out a talk from ATC that discusses the basic problem:

Re: contraining crush placement possibilities

2014-03-07 Thread Gregory Farnum
On Fri, Mar 7, 2014 at 9:43 AM, Sage Weil s...@inktank.com wrote: On Fri, 7 Mar 2014, Gregory Farnum wrote: On Fri, Mar 7, 2014 at 7:10 AM, Sage Weil s...@inktank.com wrote: On Fri, 7 Mar 2014, Dan van der Ster wrote: On Thu, Mar 6, 2014 at 9:30 PM, Sage Weil s...@inktank.com wrote

Re: cache pool user interfaces

2014-02-28 Thread Gregory Farnum
On Fri, Feb 28, 2014 at 7:21 AM, Sage Weil s...@inktank.com wrote: On Wed, 26 Feb 2014, Gregory Farnum wrote: We/you/somebody need(s) to sit down and decide on what kind of interface we want to actually expose to users for working with caching pools. What we have right now is very flexible

Re: location-aware file placement in Ceph

2014-02-27 Thread Gregory Farnum
There are some options within CRUSH to let you decide where you want to place particular classes of data, but it's not really available on a per-object or per-file basis. You should look through the CRUSH stuff at ceph.com/docs to get an idea of what's possible. -Greg Software Engineer #42 @

Re: Assertion error in librados

2014-02-25 Thread Gregory Farnum
Do you have logs? The assert indicates that the messenger got back something other than okay when trying to grab a local Mutex, which shouldn't be able to happen. It may be that some error-handling path didn't drop it (within the same thread that later tried to grab it again), but we'll need more

Re: Assertion error in librados

2014-02-25 Thread Gregory Farnum
, Unfortunately we don't keep any Ceph-related logs on the client side. On the server side, we kept the default log settings to avoid overlogging. Do you think that there might be something useful on the OSD side? On Tue, Feb 25, 2014 at 07:28:30AM -0800, Gregory Farnum wrote: Do you have logs? The assert

Re: [ceph-users] PG folder hierarchy

2014-02-25 Thread Gregory Farnum
On Tue, Feb 25, 2014 at 7:13 PM, Guang yguan...@yahoo.com wrote: Hello, Most recently when looking at PG's folder splitting, I found that there was only one sub folder in the top 3 / 4 levels and start having 16 sub folders starting from level 6, what is the design consideration behind this?

Re: [ceph-users] Ceph GET latency

2014-02-20 Thread Gregory Farnum
On Tue, Feb 18, 2014 at 7:24 AM, Guang Yang yguan...@yahoo.com wrote: Hi ceph-users, We are using Ceph (radosgw) to store user-generated images; as GET latency is critical for us, most recently I did some investigation over the GET path to understand where time is spent. I first confirmed that

Re: DISCARD support in kernel driver

2014-01-30 Thread Gregory Farnum
On Thu, Jan 30, 2014 at 1:31 AM, Jean-Tiare LE BIGOT jean-tiare.le-bi...@ovh.net wrote: Hi, I started to implement 'DISCARD' support in RBD kernel driver as described on http://tracker.ceph.com/issues/190 This first (easy) step was to add at the end of drivers/block/rbd.c:rbd_init_disk

Re: DISCARD support in kernel driver

2014-01-30 Thread Gregory Farnum
that $ fstrim /mnt # neither Maybe missing something there ? I expected '-o discard' to be enough ? On 01/30/14 16:24, Gregory Farnum wrote: On Thu, Jan 30, 2014 at 1:31 AM, Jean-Tiare LE BIGOT jean-tiare.le-bi...@ovh.net wrote: Hi, I started to implement 'DISCARD' support in RBD

Re: [ceph-users] many meta files in osd

2014-01-27 Thread Gregory Farnum
Looks like you got lost over the Christmas holidays; sorry! I'm not an expert on running rgw but it sounds like garbage collection isn't running or something. What version are you on, and have you done anything to set it up? -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On

Re: Proposal for adding disable FileJournal option

2014-01-09 Thread Gregory Farnum
The FileJournal is also for data safety whenever we're using write ahead. To disable it we need a backing store that we know can provide us consistent checkpoints (i.e., we can use parallel journaling mode — so for the FileJournal, we're using btrfs, or maybe zfs someday). But for those systems

Re: Proposal for adding disable FileJournal option

2014-01-09 Thread Gregory Farnum
Wang haomaiw...@gmail.com wrote: On Fri, Jan 10, 2014 at 1:28 AM, Gregory Farnum g...@inktank.com wrote: The FileJournal is also for data safety whenever we're using write ahead. To disable it we need a backing store that we know can provide us consistent checkpoints (i.e., we can use parallel

Re: [PATCH] mds: handle setxattr ceph.parent

2014-01-07 Thread Gregory Farnum
On Mon, Jan 6, 2014 at 8:15 PM, Alexandre Oliva ol...@gnu.org wrote: On Jan 6, 2014, Gregory Farnum g...@inktank.com wrote: On Fri, Dec 20, 2013 at 4:50 PM, Alexandre Oliva ol...@gnu.org wrote: On Dec 20, 2013, Alexandre Oliva ol...@gnu.org wrote: back many of the osds to recent snapshots

Re: CephFS standup

2014-01-06 Thread Gregory Farnum
A little late on this (I was on vacation and my phone doesn't do plain-text email!), but I prefer the morning slot to the later ones. :) -Greg On Thu, Jan 2, 2014 at 12:51 PM, Sage Weil s...@inktank.com wrote: 2014 will be the Year of the Linux Desktop^W^W^WCephFS! To that end, we should

Re: [PATCH] mds: handle setxattr ceph.parent

2014-01-06 Thread Gregory Farnum
On Fri, Dec 20, 2013 at 4:50 PM, Alexandre Oliva ol...@gnu.org wrote: On Dec 20, 2013, Alexandre Oliva ol...@gnu.org wrote: back many of the osds to recent snapshots thereof, from which I'd cleaned all traces of the user.ceph._parent. I intended to roll back Err, I meant user.ceph._path, of

Re: enable old OSD snapshot to re-join a cluster

2013-12-19 Thread Gregory Farnum
On Wed, Dec 18, 2013 at 11:32 PM, Alexandre Oliva ol...@gnu.org wrote: On Dec 18, 2013, Gregory Farnum g...@inktank.com wrote: On Tue, Dec 17, 2013 at 3:36 AM, Alexandre Oliva ol...@gnu.org wrote: Here's an updated version of the patch, that makes it much faster than the earlier version

Re: [PATCH] reinstate ceph cluster_snap support

2013-12-18 Thread Gregory Farnum
On Tue, Dec 17, 2013 at 4:14 AM, Alexandre Oliva ol...@gnu.org wrote: On Aug 27, 2013, Sage Weil s...@inktank.com wrote: Hi, On Sat, 24 Aug 2013, Alexandre Oliva wrote: On Aug 23, 2013, Sage Weil s...@inktank.com wrote: FWIW Alexandre, this feature was never really complete. For it to

Re: enable old OSD snapshot to re-join a cluster

2013-12-18 Thread Gregory Farnum
On Tue, Dec 17, 2013 at 3:36 AM, Alexandre Oliva ol...@gnu.org wrote: On Feb 20, 2013, Gregory Farnum g...@inktank.com wrote: On Tue, Feb 19, 2013 at 2:52 PM, Alexandre Oliva ol...@gnu.org wrote: It recently occurred to me that I messed up an OSD's storage, and decided that the easiest way

Re: [PATCH] mds: handle setxattr ceph.parent

2013-12-18 Thread Gregory Farnum
On Wed, Dec 18, 2013 at 9:09 AM, Sage Weil s...@inktank.com wrote: On Wed, 18 Dec 2013, Alexandre Oliva wrote: On Dec 18, 2013, Yan, Zheng uker...@gmail.com wrote: On Tue, Dec 17, 2013 at 7:25 PM, Alexandre Oliva ol...@gnu.org wrote: # setfattr -n ceph.parent /cephfs/mount/path/name Can

Re: [PATCH] mds: drop unused find_ino_dir

2013-12-18 Thread Gregory Farnum
Sage applied this in commit f5d32a33d25a5f9ddccadb4c3ebbd5ccd211204f; thanks! -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue, Dec 17, 2013 at 3:00 AM, Alexandre Oliva ol...@gnu.org wrote: I was looking at inconsistencies in xattrs in my OSDs, and found out that only

Re: Ceph Messaging on Accelio (libxio) RDMA

2013-12-18 Thread Gregory Farnum
(Sorry for the delay getting back on this.) On Wed, Dec 11, 2013 at 5:13 PM, Matt W. Benjamin m...@cohortfs.com wrote: Hi Greg, I haven't fixed the decision to reify replies in the Messenger at this point, but it is what the current prototype code tries to do. The request/response model is

Re: tcmalloc

2013-12-17 Thread Gregory Farnum
On Tue, Dec 17, 2013 at 4:15 PM, Milosz Tanski mil...@adfin.com wrote: I wanted to bring up an issue with Ceph's use of tcmalloc. I know that in Ubuntu (12.04) Ceph uses the distro version of tcmalloc, which is older. I've personally run into issues with tcmalloc for our application where the

Re: Ceph Messaging on Accelio (libxio) RDMA

2013-12-11 Thread Gregory Farnum
On Wed, Dec 11, 2013 at 2:32 PM, Matt W. Benjamin m...@cohortfs.com wrote: Hi Ceph devs, For the last several weeks, we've been working with engineers at Mellanox on a prototype Ceph messaging implementation that runs on the Accelio RDMA messaging service (libxio). Very cool! An RDMA

Re: [PATCH 0/3] block I/O when cluster is full

2013-12-09 Thread Gregory Farnum
On Mon, Dec 9, 2013 at 4:11 PM, Josh Durgin josh.dur...@inktank.com wrote: On 12/06/2013 06:24 PM, Gregory Farnum wrote: On Fri, Dec 6, 2013 at 6:16 PM, Josh Durgin josh.dur...@inktank.com wrote: Don't bother trying to stop ENOSPC on the client side, since it'd need some restructuring

Re: [PATCH 0/3] block I/O when cluster is full

2013-12-06 Thread Gregory Farnum
On Fri, Dec 6, 2013 at 6:16 PM, Josh Durgin josh.dur...@inktank.com wrote: On 12/05/2013 08:58 PM, Gregory Farnum wrote: On Thu, Dec 5, 2013 at 5:47 PM, Josh Durgin josh.dur...@inktank.com wrote: On 12/03/2013 03:12 PM, Josh Durgin wrote: These patches allow rbd to block writes instead

Re: [PATCH 0/3] block I/O when cluster is full

2013-12-05 Thread Gregory Farnum
On Thu, Dec 5, 2013 at 5:47 PM, Josh Durgin josh.dur...@inktank.com wrote: On 12/03/2013 03:12 PM, Josh Durgin wrote: These patches allow rbd to block writes instead of returning errors when OSDs are full enough that the FULL flag is set in the osd map. This avoids filesystems on top of rbd

Re: MDS can't join in

2013-12-03 Thread Gregory Farnum
Does the MDS have access to a keyring which contains its key, and does that match what's on the monitor? You're just referring to the client.admin one, which it won't use (it's not a client). It certainly looks like there's a mismatch based on the verification error. -Greg Software Engineer #42 @
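A quick way to compare the two keys (mds.a and the default keyring path are assumptions, not taken from the thread):

    ceph auth get mds.a                      # the key the monitor expects
    cat /var/lib/ceph/mds/ceph-a/keyring     # the key the daemon will actually load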

Re: How to use the class Filer in Ceph

2013-11-27 Thread Gregory Farnum
=102450 size=10 mtime=2013-11-25 15:46:15.539420 caps=- objectset[134 ts 1/18446744073709551615 objects 0 dirty_or_tx 0] parents=0x31d4ba0 0x3253480) 2013-11-25 15:46:15.539563 7fb1b37a4780 10 client.5705 nothing to flush -Original Message- From: Gregory Farnum [mailto:g

Re: possible bug in init-ceph.in

2013-11-24 Thread Gregory Farnum
Merged; thanks guys. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Thu, Nov 21, 2013 at 8:54 PM, Dietmar Maurer diet...@proxmox.com wrote: Can we take that diff you provided as coming with a signed-off-by, as in the pull request Loic generated? :) Sure. -- To

Re: How to use the class Filer in Ceph

2013-11-24 Thread Gregory Farnum
I haven't looked at the Filer code (or anything around it) in a while, but if I were to guess, in-snapid is set to something which doesn't exist. Are you actually using the Filer in some new code that includes inodes, or modifying the Client classes? Looking at how they initialize things should

Re: possible bug in init-ceph.in

2013-11-21 Thread Gregory Farnum
Can we take that diff you provided as coming with a signed-off-by, as in the pull request Loic generated? :) -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Thu, Nov 21, 2013 at 9:57 AM, Loic Dachary l...@dachary.org wrote: Hi, It turns out there was no pull request or

Re: Out-of-tree build of ceph

2013-11-14 Thread Gregory Farnum
Hrm, I don't think many of the developers do builds like that too often, so it's not surprising it got a little busted. :( Can you make a ticket in the tracker with whatever you've figured out about the cause and timing so we don't lose it? :) -Greg Software Engineer #42 @ http://inktank.com |

Re: CDS blueprint: strong auth for cephfs

2013-11-14 Thread Gregory Farnum
On Thu, Nov 14, 2013 at 2:00 AM, Dan van der Ster d...@vanderster.com wrote: Hi Greg, On Wed, Nov 13, 2013 at 6:45 PM, Gregory Farnum g...@inktank.com wrote: On Wed, Nov 13, 2013 at 8:05 AM, Dan van der Ster d...@vanderster.com wrote: Hi all, This mail is just to let you know that we've

Re: CDS blueprint: strong auth for cephfs

2013-11-14 Thread Gregory Farnum
On Thu, Nov 14, 2013 at 12:30 PM, Arne Wiebalck arne.wieba...@cern.ch wrote: On Nov 14, 2013, at 5:37 PM, Gregory Farnum g...@inktank.com wrote: On Thu, Nov 14, 2013 at 8:21 AM, Dan van der Ster d...@vanderster.com wrote: On Thu, Nov 14, 2013 at 4:55 PM, Gregory Farnum g...@inktank.com

Re: CDS blueprint: strong auth for cephfs

2013-11-13 Thread Gregory Farnum
On Wed, Nov 13, 2013 at 8:05 AM, Dan van der Ster d...@vanderster.com wrote: Hi all, This mail is just to let you know that we've prepared a draft blueprint related to adding strong(er) authn/authz to cephfs:

Re: [RFC] Ceph encryption support

2013-11-12 Thread Gregory Farnum
On Tue, Nov 12, 2013 at 6:10 AM, Li Wang liw...@ubuntukylin.com wrote: Hi, We want to implement encryption support for Ceph. Currently, we have a draft design: 1 When a user mounts a ceph directory for the first time, they can specify a passphrase and the encryption algorithm and length of

Re: HSM

2013-11-11 Thread Gregory Farnum
On Mon, Nov 11, 2013 at 3:04 AM, John Spray john.sp...@inktank.com wrote: This is a really useful summary from Malcolm. In addition to the coordinator/copytool interface, there is the question of where the policy engine gets its data from. Lustre has the MDS changelog, which Robinhood uses

Re: messenger refactor notes

2013-11-11 Thread Gregory Farnum
On Mon, Nov 11, 2013 at 7:00 AM, Atchley, Scott atchle...@ornl.gov wrote: On Nov 9, 2013, at 4:18 AM, Sage Weil s...@inktank.com wrote: The SimpleMessenger implementation of the Messenger interface has grown organically over many years and is one of the cruftier bits of code in Ceph. The

Re: cache tier blueprint (part 2)

2013-11-09 Thread Gregory Farnum
On Fri, Nov 8, 2013 at 10:25 PM, Sage Weil s...@inktank.com wrote: On Fri, 8 Nov 2013, Gregory Farnum wrote: On Thu, Nov 7, 2013 at 6:56 AM, Sage Weil s...@inktank.com wrote: I typed up what I think is remaining for the cache tier work for firefly. Greg, can you take a look? I'm most likely
