Re: FreeBSD Building and Testing

2016-01-05 Thread Gregory Farnum
On Mon, Dec 28, 2015 at 8:53 AM, Willem Jan Withagen wrote: > Hi, > > Can somebody try to help me and explain why > > in test: Func: test/mon/osd-crush > Func: TEST_crush_reject_empty started > > Fails with a python error which sort of startles me: > test/mon/osd-crush.sh:227:

Re: CBT on an existing cluster

2016-01-05 Thread Gregory Farnum
On Tue, Jan 5, 2016 at 9:56 AM, Deneau, Tom wrote: > Having trouble getting a reply from c...@cbt.com so trying ceph-devel list... > > To get familiar with CBT, I first wanted to use it on an existing cluster. > (i.e., not have CBT do any cluster setup). > > Is there a .yaml

deprecation and build warnings

2016-01-05 Thread Gregory Farnum
I was annoyed again at our gitbuilders being all yellow because of compile warnings so I went to check out how many of them are real and how many of them are self-inflicted warnings. I just spot-checked

Re: Create one million empty files with cephfs

2016-01-04 Thread Gregory Farnum
On Tue, Dec 29, 2015 at 4:55 AM, Fengguang Gong wrote: > hi, > We create one million empty files through filebench, here is the test env: > MDS: one MDS > MON: one MON > OSD: two OSD, each with one Intel P3700; data on OSD with 2x replica > Network: all nodes are

Re: Let's Not Destroy the World in 2038

2015-12-22 Thread Gregory Farnum
On Tue, Dec 22, 2015 at 12:10 PM, Adam C. Emerson wrote: > Comrades, > > Ceph's victory is assured. It will be the storage system of The Future. > Matt Benjamin has reminded me that if we don't act fast¹ Ceph will be > responsible for destroying the world. > > utime_t() uses

Re: Issue with Ceph File System and LIO

2015-12-21 Thread Gregory Farnum
On Sun, Dec 20, 2015 at 6:38 PM, Eric Eastman wrote: > On Fri, Dec 18, 2015 at 12:18 AM, Yan, Zheng wrote: >> On Fri, Dec 18, 2015 at 2:23 PM, Eric Eastman >> wrote: Hi Yan Zheng, Eric Eastman Similar

Re: RFC: tool for applying 'ceph daemon ' command to all OSDs

2015-12-21 Thread Gregory Farnum
On Mon, Dec 21, 2015 at 9:59 PM, Dan Mick wrote: > I needed something to fetch current config values from all OSDs (sorta > the opposite of 'injectargs --key value), so I hacked it, and then > spiffed it up a bit. Does this seem like something that would be useful > in this

Re: Improving Data-At-Rest encryption in Ceph

2015-12-15 Thread Gregory Farnum
On Tue, Dec 15, 2015 at 1:58 AM, Adam Kupczyk <akupc...@mirantis.com> wrote: > > > On Mon, Dec 14, 2015 at 9:28 PM, Gregory Farnum <gfar...@redhat.com> wrote: >> >> On Mon, Dec 14, 2015 at 5:17 AM, Radoslaw Zarzynski >> <rzarzyn...@mirantis.com> wr

Re: Improving Data-At-Rest encryption in Ceph

2015-12-14 Thread Gregory Farnum
On Mon, Dec 14, 2015 at 5:17 AM, Radoslaw Zarzynski wrote: > Hello Folks, > > I would like to publish a proposal regarding improvements to Ceph > data-at-rest encryption mechanism. Adam Kupczyk and I worked > on that in last weeks. > > Initially we considered several

Re: Improving Data-At-Rest encryption in Ceph

2015-12-14 Thread Gregory Farnum
On Mon, Dec 14, 2015 at 2:02 PM, Martin Millnert <mar...@millnert.se> wrote: > On Mon, 2015-12-14 at 12:28 -0800, Gregory Farnum wrote: >> On Mon, Dec 14, 2015 at 5:17 AM, Radoslaw Zarzynski > >> > In typical case ciphertext data transferred from OSD to OSD can

Re: OSD public / cluster network isolation using VRF:s

2015-12-10 Thread Gregory Farnum
On Mon, Dec 7, 2015 at 7:31 AM, Martin Millnert <mar...@millnert.se> wrote: > On Mon, 2015-12-07 at 06:48 -0800, Gregory Farnum wrote: > >> >> I'm probably just being dense here, but I don't quite understand what >> >> all this is trying to accomplish. It look

Re: OSD public / cluster network isolation using VRF:s

2015-12-07 Thread Gregory Farnum
On Mon, Dec 7, 2015 at 5:36 AM, Martin Millnert <mar...@millnert.se> wrote: > Greg, > > see below. > > On Thu, 2015-12-03 at 13:25 -0800, Gregory Farnum wrote: >> On Thu, Dec 3, 2015 at 12:13 PM, Martin Millnert <mar...@millnert.se> wrote: >> > H

Re: proposal to run Ceph tests on pull requests

2015-12-07 Thread Gregory Farnum
On Mon, Dec 7, 2015 at 3:29 AM, John Spray wrote: > On Sat, Dec 5, 2015 at 11:49 AM, Loic Dachary wrote: >> Hi Ceph, >> >> TL;DR: a ceph-qa-suite bot running on pull requests is sustainable and is an >> incentive for contributors to use teuthology-openstack

Re: Why FailedAssertion is not my favorite exception

2015-12-04 Thread Gregory Farnum
On Fri, Dec 4, 2015 at 12:31 PM, Adam C. Emerson wrote: > Noble Creators of the Squid Cybernetic Swimming in a Distributed Data Sea, > > There is a spectre haunting src/common/assert.cc: The spectre of throw > FailedAssertion. > > This seemingly inconsequential yet villainous

Re: Why FailedAssertion is not my favorite exception

2015-12-04 Thread Gregory Farnum
On Fri, Dec 4, 2015 at 12:40 PM, Adam C. Emerson <aemer...@redhat.com> wrote: > On 04/12/2015, Gregory Farnum wrote: >> I must be missing something here. As far as I can tell, "throw >> FailedAssertion" only happens in assert.cc, and I know that stuff >> does

Re: Compiling for FreeBSD

2015-12-04 Thread Gregory Farnum
On Fri, Dec 4, 2015 at 10:30 AM, Willem Jan Withagen wrote: > On 3-12-2015 01:27, Yan, Zheng wrote: >> On Thu, Dec 3, 2015 at 4:52 AM, Willem Jan Withagen wrote: >>> On 2-12-2015 15:13, Yan, Zheng wrote: > >>> I see that you have disabled uuid? >>> Might I ask

Re: OSD->ReplicaOSD->OSD questions

2015-12-03 Thread Gregory Farnum
On Thu, Dec 3, 2015 at 1:30 AM, Lakis, Jacek wrote: > Hi cephers! > I got two questions about "client->osd->replica osd->osd->client" path that > appears during my deep dive into this part. > 1. eval_repop() is called twice [in C_OSD_RepopCommit and >

Re: Compiling for FreeBSD, runtimes for separate tests.

2015-12-03 Thread Gregory Farnum
On Thu, Dec 3, 2015 at 1:50 AM, Willem Jan Withagen wrote: > On 2-12-2015 22:10, Willem Jan Withagen wrote: > >> Running gmake check > > > Now I start wondering how long certain tests are able to run: > > I've killed: > unittest_chain_xattr > because it was running

Re: OSD public / cluster network isolation using VRF:s

2015-12-03 Thread Gregory Farnum
On Thu, Dec 3, 2015 at 12:13 PM, Martin Millnert wrote: > Hi, > > we're deploying Ceph on Linux for multiple purposes. > We want to build network isolation in our L3 DC network using VRF:s. > > In the case of Ceph this means that we are separating the Ceph public > network

Re: ack vs commit

2015-12-03 Thread Gregory Farnum
On Thu, Dec 3, 2015 at 4:54 PM, Sage Weil wrote: > From the beginning Ceph has had two kinds of acks for rados write/update > operations: ack (indicating the operation is accepted, serialized, and > staged in the osd's buffer cache) and commit (indicating the write is >

Re: Suggestions on tracker 13578

2015-12-02 Thread Gregory Farnum
On Tue, Dec 1, 2015 at 5:23 AM, Vimal wrote: > Hello, > > This mail is to discuss the feature request at > http://tracker.ceph.com/issues/13578. > > If done, such a tool should help point out several mis-configurations that > may cause problems in a cluster later. > > Some of

Re: CodingStyle on existing code

2015-12-01 Thread Gregory Farnum
On Tue, Dec 1, 2015 at 5:47 AM, Loic Dachary wrote: > > > On 01/12/2015 14:10, Wido den Hollander wrote: >> Hi, >> >> While working on mon/PGMonitor.cc I see that there is a lot of >> inconsistency on the code. >> >> A lot of whitespaces, indentation which is not correct, well,

Re: RFC: teuthology field in commit messages

2015-11-30 Thread Gregory Farnum
On Sun, Nov 29, 2015 at 6:15 PM, Loic Dachary wrote: > > > On 29/11/2015 23:55, John Spray wrote: >> On Sun, Nov 29, 2015 at 9:25 PM, Loic Dachary wrote: >>> >>> >>> On 29/11/2015 21:47, John Spray wrote: On Sun, Nov 29, 2015 at 8:25 PM, Loic Dachary

Re: Compiling for FreeBSD

2015-11-30 Thread Gregory Farnum
On Mon, Nov 30, 2015 at 11:04 AM, Willem Jan Withagen wrote: > On 30-11-2015 15:40, Willem Jan Withagen wrote: >> On 30-11-2015 15:13, Mykola Golub wrote: >>> On Mon, Nov 30, 2015 at 12:53:54PM +0100, Willem Jan Withagen wrote: >>> > git clone --recursive -b wip-freebsd

Scaling Ceph reviews and testing

2015-11-25 Thread Gregory Farnum
Everybody, Ceph is popular! The global community of developers is growing quickly, and that’s leading to some challenges for our leads and core development team as we try to absorb incoming pull requests. Over the past few weeks our leads have discussed (internally and with a few external

Re: Multiple OSDs suicide because of client issues?

2015-11-23 Thread Gregory Farnum
9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 > > > On Mon, Nov 23, 2015 at 10:33 AM, Gregory Farnum wrote: >> On Mon, Nov 23, 2015 at 11:27 AM, Robert LeBlanc wrote: >>> -BEGIN PGP SIGNED MESSAGE- >>> Hash: SHA256 >>> >>> I checked the SAR da

Re: Wiping object content on removal

2015-11-23 Thread Gregory Farnum
On Wed, Nov 18, 2015 at 8:42 AM, Igor Fedotov wrote: > Hi Cephers. > > Does Ceph have an ability to wipe object content during one's removal? > Surely one can do that manually from the client but I think that's > ineffective and not 100% secure. > > If no - what's about

Re: Multiple OSDs suicide because of client issues?

2015-11-23 Thread Gregory Farnum

Re: Multiple OSDs suicide because of client issues?

2015-11-23 Thread Gregory Farnum
attaching the entire OSD log in case it is useful. Uh, that doesn't have the backtrace in it. -Greg > > Thanks for taking a look at this. > > - -------- > Robert LeBlanc > PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 > > > On Mon, Nov 23, 2015 at

Re: Crc32 Challenge

2015-11-23 Thread Gregory Farnum
On Tue, Nov 17, 2015 at 10:51 AM, chris holcombe wrote: > Hello Ceph Devs, > > I'm almost certain at this point that I have discovered a major bug in > ceph's crc32c mechanism. http://tracker.ceph.com/issues/13713 I'm totally > open to be proven wrong and that's

Re: recommendations for good newbie bug to look at?

2015-11-13 Thread Gregory Farnum
Looks like we only have two tagged right now :( but periodically things in the tracker get tagged with "new-dev". http://tracker.ceph.com/projects/ceph/search?utf8=✓=1=new-dev ...and looking at that, the osdmap_subscribe ones I think are mostly dealt with in

Re: why ShardedWQ in osd using smart pointer for PG?

2015-11-10 Thread Gregory Farnum
On Tue, Nov 10, 2015 at 7:19 AM, 池信泽 wrote: > hi, all: > > op_wq is declared as ShardedThreadPool::ShardedWQ < pair <PGRef, OpRequestRef> > &op_wq. I do not know why we should use PGRef in this? > > Because the overhead of the smart pointer is not small. Maybe the >

Re: [ceph-users] Permanent MDS restarting under load

2015-11-10 Thread Gregory Farnum
On Tue, Nov 10, 2015 at 6:32 AM, Oleksandr Natalenko wrote: > Hello. > > We have CephFS deployed over Ceph cluster (0.94.5). > > We experience constant MDS restarting under high IOPS workload (e.g. > rsyncing lots of small mailboxes from another storage to CephFS using >

Re: why ShardedWQ in osd using smart pointer for PG?

2015-11-10 Thread Gregory Farnum
> 2015-11-11 2:28 GMT+08:00 Gregory Farnum <gfar...@redhat.com>: >> On Tue, Nov 10, 2015 at 7:19 AM, 池信泽 <xmdx...@gmail.com> wrote: >>> hi, all: >>> >>> op_wq is declared as ShardedThreadPool::ShardedWQ < pair <PGRef, >>> O

Re: ceph encoding optimization

2015-11-09 Thread Gregory Farnum
On Wed, Nov 4, 2015 at 7:07 AM, Gregory Farnum <gfar...@redhat.com> wrote: > The problem with this approach is that the encoded versions need to be > platform-independent — they are shared over the wire and written to > disks that might get transplanted to different machines. Apart

Re: Question about how rebuild works.

2015-11-06 Thread Gregory Farnum
On Thu, Nov 5, 2015 at 9:59 PM, Allen Samuels wrote: > I have a question about rebuild in the following situation: > > I have a pool with 3x replication. > For one particular PG we'll designate the active OSD set as [1,2,3] with 1 as > the primary. > Assume 2 and 3

Re: Would it make sense to require ntp

2015-11-06 Thread Gregory Farnum
On Fri, Nov 6, 2015 at 4:26 AM, John Spray wrote: > On Fri, Nov 6, 2015 at 10:06 AM, Nathan Cutler wrote: >> Hi Ceph: >> >> Recently I encountered a "clock skew" issue with 0.94.3. I have >> some small demo clusters in AWS. When I boot them up, in most

Re: Question about how rebuild works.

2015-11-06 Thread Gregory Farnum
would actually have on > overall durability (how frequent is this case?). Once Allen does the > math, we'll have a better idea :) > -Sam > > On Fri, Nov 6, 2015 at 8:43 AM, Gregory Farnum <gfar...@redhat.com> wrote: >> Argh, I guess I was wrong. Sorry for the misinformati

Re: Request for Comments: Weighted Round Robin OP Queue

2015-11-05 Thread Gregory Farnum
On Thu, Nov 5, 2015 at 7:14 AM, Robert LeBlanc wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > Thanks Gregory, > > People are most likely busy and haven't had time to digest this and I > may be expecting more excitement from it (I'm excited due to the >

Re: ceph encoding optimization

2015-11-04 Thread Gregory Farnum
On Wed, Nov 4, 2015 at 7:00 AM, 池信泽 wrote: > hi, all: > > I am focusing on the CPU usage of Ceph now. I find that encoding and decoding structs (such > as pg_info_t, transaction and so on) consume too > much CPU. > > For now, we should encode every member variable

Re: NIC with Erasure offload feature support and Ceph

2015-11-03 Thread Gregory Farnum
On Tue, Nov 3, 2015 at 3:15 AM, Mike wrote: > Hello! > > In our project we are planning to build a petabyte cluster with an erasure-coded pool. > We are also looking at Mellanox ConnectX-4 Lx EN Cards/ConnectX-4 EN Cards > to use their erasure-code offload feature. > > Does anyone use this

Re: Fix OP dequeuing order

2015-10-30 Thread Gregory Farnum

Re: Fix OP dequeuing order

2015-10-30 Thread Gregory Farnum
As we've discussed on the PR, this isn't right. We deliberately iterate through the lower-priority queues first so that the higher-priority ones don't starve them out entirely: they might be generating tokens more quickly than tokens can be used, which would prevent low queues from ever getting

Re: server unclear options of buckets stats info

2015-10-27 Thread Gregory Farnum
On Tue, Oct 27, 2015 at 6:34 AM, huang jun wrote: > Hi, all > I am looking rgw storage object, when use the following command to > view the bucket information: radosgw-admin bucket stats --bucket = bk0 > {"Bucket": "bk0", >"pool": ".rgw.buckets", >"index_pool":

Re: PG: all requests stuck when acting set < min_size

2015-10-27 Thread Gregory Farnum
On Tue, Oct 27, 2015 at 11:47 AM, GuangYang wrote: > Hi there, > Is there any reason we stuck read only requests as well for a PG when the > acting set size is less than min_size? A few. The most important reason: PGs don't have any concept of a read-only mode in the code.
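A minimal sketch of the behaviour described in this reply, assuming a simplified model in which a PG has one activity gate that covers both reads and writes. The names `PG`, `acting` and `can_serve` are illustrative only and are not Ceph identifiers:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PG:
    """Toy placement group: a single activity gate, no separate read-only state."""
    min_size: int
    acting: List[int] = field(default_factory=list)  # OSD ids currently in the acting set

    def is_active(self) -> bool:
        # The PG serves I/O only while the acting set is at least min_size.
        return len(self.acting) >= self.min_size

    def can_serve(self, op: str) -> bool:
        # Reads and writes go through the same gate, so a read blocks
        # whenever a write would.
        if op not in ("read", "write"):
            raise ValueError(f"unknown op: {op}")
        return self.is_active()


pg = PG(min_size=2, acting=[1])
print(pg.can_serve("read"), pg.can_serve("write"))
```

In this toy model, serving reads below min_size would need a second state alongside `is_active()`, which is the kind of read-only mode the reply says the PG code does not have.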

Re: newstore direction

2015-10-23 Thread Gregory Farnum
On Fri, Oct 23, 2015 at 7:59 AM, Howard Chu wrote: > If the stream of writes is large enough, you could omit fsync because > everything is being forced out of the cache to disk anyway. In that > scenario, the only thing that matters is that the writes get forced out in > the order

Re: cephfs and the next firefly release v0.80.11

2015-10-23 Thread Gregory Farnum
Sounds good! On Fri, Oct 23, 2015 at 1:12 PM, Loic Dachary wrote: > Hi Greg, > > The next firefly release as found at > https://github.com/ceph/ceph/tree/firefly passed the fs suite > (http://tracker.ceph.com/issues/11644#note-112). Do you think the firefly > branch is ready

Re: MDS stuck in a crash loop

2015-10-21 Thread Gregory Farnum
On Mon, Oct 19, 2015 at 8:31 AM, Milosz Tanski <mil...@adfin.com> wrote: > On Wed, Oct 14, 2015 at 12:46 AM, Gregory Farnum <gfar...@redhat.com> wrote: >> On Sun, Oct 11, 2015 at 7:36 PM, Milosz Tanski <mil...@adfin.com> wrote: >>> On Sun, Oct 11, 2015 at 6:44

Re: MDS stuck in a crash loop

2015-10-21 Thread Gregory Farnum
On Wed, Oct 21, 2015 at 2:33 PM, John Spray wrote: > On Wed, Oct 21, 2015 at 10:33 PM, John Spray wrote: >>> John, I know you've got >>> https://github.com/ceph/ceph-qa-suite/pull/647. I think that's >>> supposed to be for this, but I'm not sure if you

Re: newstore direction

2015-10-20 Thread Gregory Farnum
On Tue, Oct 20, 2015 at 12:44 PM, Sage Weil wrote: > On Tue, 20 Oct 2015, Ric Wheeler wrote: >> The big problem with consuming block devices directly is that you ultimately >> end up recreating most of the features that you had in the file system. Even >> enterprise databases

Re: dump_historic_ops, slow requests

2015-10-13 Thread Gregory Farnum
On Mon, Oct 12, 2015 at 2:22 PM, Deneau, Tom wrote: > I have a small ceph cluster (3 nodes, 5 osds each, journals all just > partitions > on the spinner disks) and I have noticed that when I hit it with a bunch of > rados bench clients all doing writes of large (40M objects)

Re: Re: The questions of data collection and cache tiering in Ceph

2015-10-13 Thread Gregory Farnum
s readable out of page cache! As soon as the write operation completes in-memory the OSD will continue processing subsequent ops on the object (because the OSD journal has persisted the operation). -Greg > Thank you so much. > Yours, > Chay

Re: MDS stuck in a crash loop

2015-10-13 Thread Gregory Farnum
PM, Milosz Tanski <mil...@adfin.com> wrote: >>>> On Sun, Oct 11, 2015 at 5:24 PM, Milosz Tanski <mil...@adfin.com> wrote: >>>>> On Sun, Oct 11, 2015 at 1:16 PM, Gregory Farnum <gfar...@redhat.com> >>>>> wrote: >>>>>> On

Re: Initial performance cluster SimpleMessenger vs AsyncMessenger results

2015-10-12 Thread Gregory Farnum
On Mon, Oct 12, 2015 at 9:50 AM, Mark Nelson wrote: > Hi Guy, > > Given all of the recent data on how different memory allocator > configurations improve SimpleMessenger performance (and the effect of memory > allocators and transparent hugepages on RSS memory usage), I

Re: MDS stuck in a crash loop

2015-10-11 Thread Gregory Farnum
On Sun, Oct 11, 2015 at 10:09 AM, Milosz Tanski wrote: > About an hour ago my MDSs (primary and follower) started ping-pong > crashing with this message. I've spent about 30 minutes looking into > it but nothing yet. > > This is from a 0.94.3 MDS > > 0> 2015-10-11

Re: CephFS and the next hammer release v0.94.4

2015-10-08 Thread Gregory Farnum
On Tue, Oct 6, 2015 at 8:16 AM, Loic Dachary wrote: > Hi Greg, > > The next hammer release as found at https://github.com/ceph/ceph/tree/hammer > passed the fs suite (http://tracker.ceph.com/issues/12701#note-66). Do you > think the hammer branch is ready for QE to start their

Re: The questions of data collection and cache tiering in Ceph

2015-10-08 Thread Gregory Farnum
On Thu, Oct 8, 2015 at 9:09 AM, 蔡毅 wrote: > > Dear developers, > > Recently I ran into some trouble while reading Ceph's source code and trying to > understand the architecture. > The details of the problems are as follows. > > 1. In monitoring tools, they can collect much data when

Re: Adding Data-At-Rest compression support to Ceph

2015-09-24 Thread Gregory Farnum
On Thu, Sep 24, 2015 at 8:13 AM, Igor Fedotov <ifedo...@mirantis.com> wrote: > On 23.09.2015 21:03, Gregory Farnum wrote: >> >> On Wed, Sep 23, 2015 at 6:15 AM, Sage Weil <s...@newdream.net> wrote: >>>>> >>>>> >>>>> The id

Re: Where does the data go ??

2015-09-24 Thread Gregory Farnum
On Tue, Sep 22, 2015 at 6:58 PM, Tomy Cheru wrote: > Noticed while benchmarking newstore with rocksdb backend that, the data is > missing in "dev/osd0/fragments" > >>64k sized objects produces content in above mentioned dir, however missing >>with <=64k sized objects >

Re: full cluster/pool handling

2015-09-24 Thread Gregory Farnum
On Thu, Sep 24, 2015 at 8:04 AM, Sage Weil wrote: > On Thu, 24 Sep 2015, Robert LeBlanc wrote: >> -BEGIN PGP SIGNED MESSAGE- >> Hash: SHA256 >> >> >> On Thu, Sep 24, 2015 at 6:30 AM, Sage Weil wrote: >> > Xuan Liu recently pointed out that there is a problem with our

Re: Very slow recovery/peering with latest master

2015-09-24 Thread Gregory Farnum
On Wed, Sep 23, 2015 at 4:42 PM, Handzik, Joe wrote: > Ok. When configuring with ceph-disk, it does something nifty and actually > gives the OSD the uuid of the disk's partition as its fsid. I bootstrap off > that to get an argument to pass into the function you have

Re: full cluster/pool handling

2015-09-24 Thread Gregory Farnum
On Thu, Sep 24, 2015 at 1:36 PM, John Spray <jsp...@redhat.com> wrote: > On Thu, Sep 24, 2015 at 7:26 PM, Gregory Farnum <gfar...@redhat.com> wrote: >> That latter switch already exists, by the way, although I don't think >> it's actually enforced via cephx caps (i

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Gregory Farnum
On Wed, Sep 23, 2015 at 6:15 AM, Sage Weil wrote: > On Wed, 23 Sep 2015, Igor Fedotov wrote: >> Hi Sage, >> thanks a lot for your feedback. >> >> Regarding issues with offset mapping and stripe size exposure. >> What's about the idea to apply compression in two-tier

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Gregory Farnum
On Wed, Sep 23, 2015 at 8:26 AM, Igor Fedotov <ifedo...@mirantis.com> wrote: > > > On 23.09.2015 17:05, Gregory Farnum wrote: >> >> On Wed, Sep 23, 2015 at 6:15 AM, Sage Weil <s...@newdream.net> wrote: >>> >>> On Wed, 23 Sep 2015, Igor Fedotov w

Re: perf counters from a performance discrepancy

2015-09-23 Thread Gregory Farnum
On Wed, Sep 23, 2015 at 11:19 AM, Sage Weil wrote: > On Wed, 23 Sep 2015, Deneau, Tom wrote: >> Hi all -- >> >> Looking for guidance with perf counters... >> I am trying to see whether the perf counters can tell me anything about the >> following discrepancy >> >> I populate a

Re: perf counters from a performance discrepancy

2015-09-23 Thread Gregory Farnum
On Wed, Sep 23, 2015 at 9:33 AM, Deneau, Tom wrote: > Hi all -- > > Looking for guidance with perf counters... > I am trying to see whether the perf counters can tell me anything about the > following discrepancy > > I populate a number of 40k size objects in each of two

Re: perf counters from a performance discrepancy

2015-09-23 Thread Gregory Farnum
On Wed, Sep 23, 2015 at 1:51 PM, Deneau, Tom <tom.den...@amd.com> wrote: > > >> -Original Message- >> From: Gregory Farnum [mailto:gfar...@redhat.com] >> So if you've got 20k objects and 5 OSDs then each OSD is getting ~4k reads >> during this test.
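A quick check of the arithmetic in the reply above, as a sketch that assumes each object is read once and that objects are spread evenly across OSDs (real placement is only approximately uniform). The function name is made up for illustration:

```python
def reads_per_osd(num_objects: int, num_osds: int) -> float:
    """Average reads per OSD when every object is read once and placement is uniform."""
    if num_osds <= 0:
        raise ValueError("num_osds must be positive")
    return num_objects / num_osds


# 20k objects over 5 OSDs, the figures quoted in the reply.
print(reads_per_osd(20_000, 5))  # 4000.0
```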

Re: OSD::op_is_discardable condition doubled

2015-09-22 Thread Gregory Farnum
On Tue, Sep 22, 2015 at 5:26 AM, Sage Weil wrote: > On Tue, 22 Sep 2015, Lakis, Jacek wrote: >> Hi ceph-devel! >> We're checking OSD::op_is_discardable condition two times: in dispatcher >> (handle_op) and in worker threads >> (do_request->can_discard_request->can_discard_op)

Re: Review Request

2015-09-16 Thread Gregory Farnum
On Wed, Sep 16, 2015 at 11:09 AM, Adam C. Emerson wrote: > Creators of the Storage Squid, > > If you're interested in less use of the allocator, you are interested in > Context* elimination. If so, please review the top two commits on the > wip-decontextualization branch of:

Re: Very slow recovery/peering with latest master

2015-09-16 Thread Gregory Farnum
On Tue, Sep 15, 2015 at 8:04 PM, Somnath Roy wrote: > Hi, > I am seeing very slow recovery when I am adding OSDs with the latest master. > Also, If I just restart all the OSDs (no IO is going on in the cluster) , > cluster is taking a significant amount of time to reach

Re: gitbuilder mirrors

2015-09-15 Thread Gregory Farnum
On Tue, Sep 15, 2015 at 1:13 AM, Dan van der Ster wrote: > Hi, > Downloading from gitbuilder.ceph.com is super slow from where I'm > sitting (<100KB/s). Does anyone have a publicly accessible mirror? > Cheers, Dan I'm not aware of any...it would be pretty bandwidth intensive

Re: Re: Re: Re: 2 replications, flapping can not stop for a very long time

2015-09-15 Thread Gregory Farnum
On Mon, Sep 14, 2015 at 7:56 PM, zhao.ming...@h3c.com wrote: > The OSD is supposed to stay down if any of the networks are missing. > osd1 osd2 osd3,if cut off osd2's cluster-network,then all of > the osd( osd1/osd2/osd3) will all stay down? No, but osd2

Re: Brewer's theorem also known as CAP theorem

2015-09-15 Thread Gregory Farnum
Congratulations, you've just hit on my biggest pet peeve in distributed systems discussions. Sorry if this gets a little hot. :) On Tue, Sep 15, 2015 at 5:38 AM, Owen Synge <osy...@suse.com> wrote: > On Mon, 14 Sep 2015 13:57:26 -0700 > Gregory Farnum <gfar...@redhat.com> w

Re: A new rados api set_client_priority

2015-09-14 Thread Gregory Farnum
On Thu, Sep 10, 2015 at 2:43 AM, 瞿天善 wrote: > Hi, > I'm working on the topic > > , first step I add a new rados api set_client_priority, then I will do > more test to get the idea of how to

Re: Re: Re: 2 replications, flapping can not stop for a very long time

2015-09-14 Thread Gregory Farnum
The OSD is supposed to stay down if any of the networks are missing. Ceph is a CP system in CAP parlance; there's no such thing as a CA system. ;) What version of Ceph are you testing right now? -Greg On Mon, Sep 14, 2015 at 1:02 AM, zhao.ming...@h3c.com wrote: > Thanks

Re: osd: new pool flags: noscrub, nodeep-scrub

2015-09-11 Thread Gregory Farnum
On Fri, Sep 11, 2015 at 7:42 AM, Mykola Golub wrote: > Hi, > > I would like to add new pool flags: noscrub and nodeep-scrub, to be > able to control scrubbing on per pool basis. In our case it could be > helpful in order to disable scrubbing on cache pools, which does not >

Re: [ceph-users] Ceph.conf

2015-09-10 Thread Gregory Farnum
On Thu, Sep 10, 2015 at 9:44 AM, Shinobu Kinjo wrote: > Hello, > > I'm seeing 859 parameters in the output of: > > $ ./ceph --show-config | wc -l > *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH *** > 859 > > In: > > $ ./ceph --version >

Re: [ceph-users] how to improve ceph cluster capacity usage

2015-09-08 Thread Gregory Farnum
On Tue, Sep 1, 2015 at 3:58 PM, huang jun wrote: > hi,all > > Recently, i did some experiments on OSD data distribution, > we set up a cluster with 72 OSDs,all 2TB sata disk, > and ceph version is v0.94.3 and linux kernel version is 3.18, > and set "ceph osd crush tunables

Re: [Backport] assigning a pull request when asking permission to merge

2015-09-08 Thread Gregory Farnum
On Sun, Sep 6, 2015 at 10:47 PM, Loic Dachary wrote: > Hi, > > Today I realized that when we ask for permission to merge [1], we don't > usually assign the pull request to the person who is supposed to answer. This > may not be of consequence if there is just one pull request

Re: [NewStore]About PGLog Workload With RocksDB

2015-09-08 Thread Gregory Farnum
On Tue, Sep 8, 2015 at 3:06 PM, Haomai Wang wrote: > Hit "Send" by accident for previous mail. :-( > > some points about pglog: > 1. short-alive but frequency(HIGH) Is this really true? The default length of the log is 1000 entries, and most OSDs have ~100 PGs, so on a hard

Re: [NewStore]About PGLog Workload With RocksDB

2015-09-08 Thread Gregory Farnum
On Tue, Sep 8, 2015 at 3:12 PM, Gregory Farnum <gfar...@redhat.com> wrote: > On Tue, Sep 8, 2015 at 3:06 PM, Haomai Wang <haomaiw...@gmail.com> wrote: >> Hit "Send" by accident for previous mail. :-( >> >> some points about pglog: >> 1. short-

Re: pull request labels : related to

2015-09-08 Thread Gregory Farnum
On Sun, Sep 6, 2015 at 5:04 PM, Loic Dachary wrote: > Hi, > > I went through all pull requests looking for those not related to anything (like > having "bug fix" without "core" or "rgw"). Now that there are many labels, > it's not trivial for someone not used to Ceph to sort out

Re: [ceph-users] Opensource plugin for pulling out cluster recovery and client IO metric

2015-08-28 Thread Gregory Farnum
On Mon, Aug 24, 2015 at 4:03 PM, Vickey Singh vickey.singh22...@gmail.com wrote: Hello Ceph Geeks I am planning to develop a python plugin that pulls out cluster recovery IO and client IO operation metrics , that can be further used with collectd. For example , i need to take out these

Re: Signed-off-by and aliases

2015-08-12 Thread Gregory Farnum
On Mon, Aug 3, 2015 at 11:10 PM, Loic Dachary l...@dachary.org wrote: On 03/08/2015 21:18, John Spray wrote: On Fri, Jul 31, 2015 at 8:59 PM, Loic Dachary l...@dachary.org wrote: Hi Ceph, We require that each commit has a Signed-off-by line with the name and email of the author. The

Re: CephFS and the next hammer release v0.94.3

2015-08-03 Thread Gregory Farnum
On Mon, Aug 3, 2015 at 6:43 PM Loic Dachary l...@dachary.org wrote: Hi Greg, The next hammer release as found at https://github.com/ceph/ceph/tree/hammer passed the fs suite (http://tracker.ceph.com/issues/11990#fs). Do you think it is ready for QE to start their own round of testing ?

Re: Criteria to become a Ceph project

2015-07-28 Thread Gregory Farnum
On Tue, Jul 28, 2015 at 7:38 AM, Loic Dachary l...@dachary.org wrote: Hi Ceph, The title sounds a little strange (Criteria to become a Ceph project) because I'm not aware of projects initiated by someone external to Ceph that later became part of the Ceph nebula of projects (as found at

Re: pre_start command not executed with upstart

2015-07-28 Thread Gregory Farnum
On Tue, Jul 28, 2015 at 8:55 AM, Wido den Hollander w...@42on.com wrote: Hi, I was trying to inject a pre_start command on a bunch of OSDs under Ubuntu 14.04, but that didn't work. I found out that only the sysvinit script executes pre_start commands, but neither the upstart nor the systemd scripts

Re: timeout 120 teuthology-kill is highly recommended

2015-07-21 Thread Gregory Farnum
On Tue, Jul 21, 2015 at 5:13 PM, Loic Dachary l...@dachary.org wrote: Hi Ceph, Today I did something wrong and that blocked the lab for a good half hour. a) I ran two teuthology-kill simultaneously and that makes them deadlock each other b) I let them run unattended only to come back to

Re: Ceph Tech Talk next week

2015-07-21 Thread Gregory Farnum
On Tue, Jul 21, 2015 at 6:09 PM, Patrick McGarry pmcga...@redhat.com wrote: Hey cephers, Just a reminder that the Ceph Tech Talk on CephFS that was scheduled for last month (and cancelled due to technical difficulties) has been rescheduled for this month's talk. It will be happening next

Re: The design of the eviction improvement

2015-07-21 Thread Gregory Farnum
On Tue, Jul 21, 2015 at 3:15 PM, Matt W. Benjamin m...@cohortfs.com wrote: Hi, Couple of points. 1) a successor to 2Q is MQ (Li et al). We have an intrusive MQ LRU implementation with 2 levels currently, plus a pinned queue, that addresses stuff like partitioning (sharding), scan

Re: Difference between convention and enforcement.

2015-07-14 Thread Gregory Farnum
Hey Owen, I haven't followed any of the conversations you've had in ceph-deploy land, but I've been trying to keep track of the ones on ceph-devel et al. I can't comment on very much of it because I suck at Python — I can write C in any language, and do so! ;) I interject this comment because

Client: what is supposed to protect racing readdirs and unlinks?

2015-07-14 Thread Gregory Farnum
I spent a bunch of today looking at http://tracker.ceph.com/issues/12297. Long story short: the workload is doing a readdir at the same time as it's unlinking files. The readdir functions (in this case, _readdir_cache_cb) drop the client_lock each time they invoke the callback (for obvious

Re: osd suicide timeout

2015-07-13 Thread Gregory Farnum
On Fri, Jul 10, 2015 at 10:45 PM, Deneau, Tom tom.den...@amd.com wrote: I have an osd log file from an osd that hit a suicide timeout (with the previous 1 events logged). (On this node I have also seen this suicide timeout happen once before and also a sync_entry timeout. I can see

Re: osd suicide timeout

2015-07-13 Thread Gregory Farnum
0.00 0.00 0.00 03:54:32 PM sde1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 03:54:32 PM sde2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -- Tom -Original Message- From: Gregory Farnum

Re: load-gen from an osd node

2015-07-10 Thread Gregory Farnum
to be in memory. So the newer load-gen works fine as long as you explicitly set --max-ops on the command line. Can you create a ticket on the tracker describing this issue? That's the kind of thing we'll want to fix. ;) -Greg -- Tom Deneau -Original Message- From: Gregory Farnum

Re: load-gen from an osd node

2015-07-01 Thread Gregory Farnum
Hmm, the only changes I see between those two versions are some pretty precise cleanups which shouldn't cause this. But it means that a bisect or determined look should be easy. Can you create a ticket which includes the exact output you're seeing and the exact versions you're running? -Greg On

Re: Probable memory leak in Hammer write path ?

2015-07-01 Thread Gregory Farnum
On Mon, Jun 29, 2015 at 4:39 PM, Somnath Roy somnath@sandisk.com wrote: Greg, Updating to the new kernel updating the gcc version too. Recent kernel is changing tcmalloc version too, but, 3.16 has old tcmalloc but still exhibiting the issue. Yes, the behavior is very confusing and

Re: What is omap

2015-06-29 Thread Gregory Farnum
On Fri, Jun 26, 2015 at 8:02 PM, Pete Zaitcev zait...@redhat.com wrote: On Fri, 26 Jun 2015 14:48:15 +0100 Gregory Farnum g...@gregs42.com wrote: Each object consists of three different data storage areas, all of which are 100% optional: the bundle of bits object data, the object xattrs

Re: Inline dedup/compression

2015-06-29 Thread Gregory Farnum
We discuss this periodically but not in any great depth. Compression and dedupe are both best performed at a single point with some sort of global knowledge, which is very antithetical to Ceph's design. Blue-sky discussions for dedupe generally center around trying out some kind of CAS system with

Re: CRC32 of messages

2015-06-29 Thread Gregory Farnum
On Mon, Jun 29, 2015 at 12:30 PM, Daniel Swarbrick daniel.swarbr...@profitbricks.com wrote: On 29/06/15 12:51, Gregory Farnum wrote: Yes, we have our own CRC32 checksum because lng ago (before I started!) Sage saw a lot of network corruption that wasn't being caught by the TCP checksums
