to let me know.
thanks!!
vicente
2015-12-09 10:50 GMT+08:00 Sage Weil <sw...@redhat.com>:
On Tue, 8 Dec 2015, David Zafman wrote:
Remember I really think we want a disk replacement feature that would
retain the OSD id so that it avoids unnecessary data movement. See
tracker http://tracker.ceph.com/issues/13732
David
On 12/5/15 8:49 AM, Loic Dachary wrote:
Hi Sage,
The problem described at "new OSD re-using old OSD
dout() is used for an OSD to log information about what it is doing
locally and might become very chatty. It is saved on the local node's
disk only.
clog is the cluster log and is used for major events that should be
known by the administrator (see ceph -w). Clog should be used sparingly
I can't remember the details now, but I know that recovery needed
additional work. If it were a simple fix
I would have done it when implementing that code.
I found this bug related to recovery and ec errors
(http://tracker.ceph.com/issues/13493)
BUG #13493: osd: for ec, cascading crash
:00 David Zafman <dzaf...@redhat.com>:
There are two reasons for having a ceph-disk replace feature.
1. To simplify the steps required to replace a disk
2. To allow a disk to be replaced proactively without causing any data
movement.
Hi David,
It good to without causing any data movement w
There are two reasons for having a ceph-disk replace feature.
1. To simplify the steps required to replace a disk
2. To allow a disk to be replaced proactively without causing any data
movement.
So keeping the osd id the same is required and is what motivated the
feature for me.
David
On
Initiating a manual deep-scrub like you are doing should always run.
The command you are running doesn't report any information; it just
initiates a background process. If you follow the command with ceph -w
you'll see what is happening:
After I corrupted one of my replicas I see this.
$
Good point. In my previous response I did "echo garbage >
./foo__head_7FC1F406__1" to corrupt a replica.
David
On 10/28/15 5:13 PM, Sage Weil wrote:
Because you *just* wrote the object, and the FileStore caches open file
handles. Vim renames a new inode over the old one so the open
I don't understand how encode/decode of entity_addr_t is changing
without versioning in the encode/decode. This means that this branch is
changing the ceph-objectstore-tool export format if
CEPH_FEATURE_MSG_ADDR2 is part of the features. So we could bump
super_header::super_ver if the
There would be a benefit to doing fadvise POSIX_FADV_DONTNEED after
deep-scrub reads for objects not recently accessed by clients.
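A minimal sketch of the idea, not Ceph code: read a file the way a scrub pass might, then advise the kernel that its cached pages are no longer needed. The function name read_and_drop_cache is hypothetical, and the check for "not recently accessed by clients" is left out; the sketch only shows the fadvise call itself, using Python's os.posix_fadvise on a Unix system.

```python
import os


def read_and_drop_cache(path, chunk_size=1 << 20):
    """Read the whole file, then hint that its page cache can be dropped.

    Returns the number of bytes read. Hypothetical helper for illustration.
    """
    fd = os.open(path, os.O_RDONLY)
    total = 0
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
        # offset=0 and len=0 cover the whole file. This is only advice:
        # the kernel may keep pages that are dirty or still in use.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return total
```

A real implementation would first have to decide whether clients touched the object recently, since dropping pages that clients are actively reading would hurt them.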
I see the NewStore objectstore sometimes using the O_DIRECT flag for
writes. This concerns me because the open(2) man page says:
"Applications should avoid
Sage,
I restored the branch wip-digest-repair which merged post-hammer in pull
request #4365. Do you think that 4365 fixes the reported bug #12577?
I cherry-picked the 9 commits off of hammer-backports-next as pull
request #5458 and assigned to Loic.
David
--
To unsubscribe from this
at least aren't valuable to keep around.
-Sam
- Original Message -
From: Sage Weil sw...@redhat.com
To: Samuel Just sj...@redhat.com
Cc: David Zafman dzaf...@redhat.com, ceph-devel@vger.kernel.org
Sent: Tuesday, July 7, 2015 10:22:32 AM
Subject: Re: ceph-objectstore-tool import failures
On Tue
and assume that after replay the clear_temp_objects() will
clean them up?
David
On 7/6/15 1:28 PM, Sage Weil wrote:
On Fri, 19 Jun 2015, David Zafman wrote:
This ghobject_t which has a pool of -3 is part of the export. This caused
the assert:
Read -3/1c/temp_recovering_1.1c_33'50_39_head/head
Regards,
Igor.
-Original Message-
From: David Zafman [mailto:dzaf...@redhat.com]
Sent: Friday, June 26, 2015 3:46 AM
To: Podoski, Igor; Deneau, Tom; Dałek, Piotr; ceph-devel
Subject: Re: deleting objects from a pool
If you have rados bench data around, you'll need to run cleanup a second
time because the first time the benchmark_last_metadata object
will be consulted to find what objects to remove.
Also, using cleanup this way will only remove objects from the default
namespace unless a namespace is
Have not seen this as an assert before. Given the code below in
do_import() of master branch the assert is impossible (?).
if (!curmap.have_pg_pool(pgid.pgid.m_pool)) {
  cerr << "Pool " << pgid.pgid.m_pool << " no longer exists"
       << std::endl;
// Special exit code for this error, used by test
or recreate it on import with special handling.
David
On 6/19/15 7:38 PM, David Zafman wrote:
Have not seen this as an assert before. Given the code below in
do_import() of master branch the assert is impossible (?).
if (!curmap.have_pg_pool(pgid.pgid.m_pool)) {
cerr Pool
Greg,
Have you changed anything (log rotation related?) that would uninstall
or cause rsyslog to not be able to start?
I'm sometimes seeing machines fail with this error probably in
teuthology/nuke.py reset_syslog_dir().
CommandFailedError: Command failed on plana94 with status 1: 'sudo
I'm wondering if this issue could be the cause of #11511. Could a proxy
write have raced with the fill_in_copy_get() so object_info_t size
doesn't correspond with the size of the object in the filestore?
David
On 6/3/15 6:22 PM, Wang, Zhiqiang wrote:
Making the 'copy get' op to be a cache
In early march I ran rados:thrash on the firefly backport of the
ceph-objectstore-tool changes (wip-cot-firefly). We considered it
passed, even though an obscure segfault was seen:
bug #11141: Segmentation Violation: ceph-objectstore-tool doing --op
list-pgs
David
On 4/21/15 8:52 AM,
I found that I could not build the docs on Ubuntu 14.10 with the proper
packages installed. Kefu is looking into Asphyxiate which is very
temperamental. I installed an Ubuntu 11.10 in order to generate docs.
David
On 3/17/15 10:11 AM, Sage Weil wrote:
On Tue, 17 Mar 2015, Josh Durgin
During upgrade testing it is interesting that one node has the
transaction hints feature, but other nodes still running firefly don't.
Is this a case where we don't have to wait for all OSDs to update
before the cluster can start handling OP_COLL_HINT operations?
David Zafman
in use old-releases.ubuntu.com to install
additional packages. Just like gitbuilder-doc the admin/build-doc
command runs without errors.
I assume other distributions with more up to date packages will see the
same problem. I filed bug #11077 with the sphinx log attached.
David Zafman
On 2 of my rados thrash runs the clocks were out of sync. Is this an occasional
issue or did we have an infrastructure problem?
On burnupi19 and burnupi25:
2015-02-20 12:52:52.636017 mon.1 10.214.134.14:6789/0 177 : cluster
[WRN] message from mon.0 was stamped 0.501458s in the future, clocks not
A recent test run had an EIO on the following disk:
plana74 /dev/sdb
The machine is locked right now.
David Zafman
Senior Developer
: (1)
Operation not permitted
David Zafman
--
To unsubscribe from this list: send the line unsubscribe ceph-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
not behave properly
David Zafman
Senior Developer
http://www.redhat.com
The most secure way would be one in which you can only create pools with
WORM set and can't ever change the WORM state of a pool. I like this
simple/secure approach as a first cut.
David
On 1/17/15 11:09 AM, Alex Elsayed wrote:
Sage Weil wrote:
On Fri, 16 Jan 2015, Alex Elsayed wrote:
We are seeing gitbuilder failures. This is what I saw on one.
error: Failed build dependencies:
xmlstarlet is needed by ceph-1:0.90-821.g680fe3c.el7.x86_64
David Zafman
Senior Developer
http://www.redhat.com
of the tool will always be executed.
David Zafman
Senior Developer
http://www.redhat.com
time to remind people to dedicate time to code
reviews.
David Zafman
Senior Developer
http://www.inktank.com
On Nov 9, 2014, at 4:08 AM, Joao Eduardo Luis j...@redhat.com wrote:
On 11/08/2014 05:32 PM, Loic Dachary wrote:
Hi Ceph,
In the past few weeks the number of pending pull
I just realized what it is. The way killall is used when stopping a vstart
cluster, is to kill all processes by name! You can't stop vstarted tests
running in parallel.
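One way around the kill-by-name problem is to remember the PIDs each cluster started and signal only those. The following is a sketch under that assumption, not how vstart works today; start_cluster, stop_cluster and is_running are hypothetical names, and the "daemons" are just sleeping Python processes standing in for real ones.

```python
import subprocess
import sys


def start_cluster(n_daemons):
    """Start n placeholder daemons and return their Popen handles."""
    # Each child only sleeps; it stands in for a real daemon process.
    cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
    return [subprocess.Popen(cmd) for _ in range(n_daemons)]


def stop_cluster(procs, timeout=5.0):
    """Stop only the given processes, addressed by PID rather than by name."""
    for proc in procs:
        proc.terminate()
    for proc in procs:
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # Escalate if the process ignored the polite request.
            proc.kill()
            proc.wait()


def is_running(proc):
    """True while the process has not exited."""
    return proc.poll() is None
```

With this shape, two clusters started side by side can be stopped independently, because stopping one never touches a process it did not start.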
David Zafman
Senior Developer
http://www.inktank.com
On Oct 21, 2014, at 7:55 PM, Loic Dachary l...@dachary.org wrote
On Oct 22, 2014, at 3:43 PM, Sage Weil s...@newdream.net wrote:
On Wed, 22 Oct 2014, David Zafman wrote:
I just realized what it is. The way killall is used when stopping a
vstart cluster, is to kill all processes by name! You can't stop
vstarted tests running in parallel.
Ah. FWIW
I have this change in my branch so that test/ceph_objectstore_tool.py works
again after that change from John. I wonder if this would fix your case too:
commit 18937cf49be616d32b4e2d0b6deef2882321fbe4
Author: David Zafman dzaf...@redhat.com
Date: Tue Oct 14 18:45:41 2014 -0700
vstart.sh
After updating my master branch "make check" passes now.
David Zafman
Senior Developer
http://www.inktank.com
On Oct 7, 2014, at 11:28 PM, Loic Dachary l...@dachary.org wrote:
[cc'ing the list in case someone else experiences problems with make check]
Hi David,
Yesterday you mentioned
are expressed or implied about the correctness or suitability
of this branch for future use.
David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com
-N ns1 ls
ns1-obj5
ns1-obj4
ns1-obj10
ns1-obj2
ns1-obj9
ns1-obj3
ns1-obj6
ns1-obj1
ns1-obj8
ns1-obj7
David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com
: main (ceph_objectstore_tool.cc:1849)
David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com
The import-rados feature (#8276) uses librados so in my wip-8231 branch I now
link with librados. It is hard to reproduce, but I’ll play with that commit
and branch.
David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com
On Aug 21, 2014, at 4:56 PM, Sage Weil sw
:
$ ./autogen.sh
$ ./configure
$ make
David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com
On Jun 13, 2014, at 11:51 AM, Sushma Gurram sushma.gur...@sandisk.com wrote:
Hi Xinxin,
I tried to compile the wip-rocksdb branch, but the src/rocksdb directory
seems
to manipulate erasure coded pools.
David Zafman
Senior Developer
http://www.inktank.com
Another way to look at this is to enumerate the recovery cases:
primary starts with head and no snapdir:
A Recovery sets last_backfill_started to head and sends head object where
needed
head (1.b case while backfills in flight - 1.a when done)
snapdir (2)
B
]: 2013-10-04
10:39:02.072487 7f57fa316780 -1 *** Caught signal (Segmentation fault) **
2013-10-04T10:39:02.074
INFO:teuthology.task.ceph-fuse.ceph-fuse.0.err:[10.214.132.22]: in thread
7f57fa316780
David Zafman
Senior Developer
http://www.inktank.com
Here is the test script:
xattr-test.sh
Description: Binary data
David Zafman
Senior Developer
http://www.inktank.com
On Oct 3, 2013, at 11:02 PM, Loic Dachary l...@dachary.org wrote:
Hi David,
Would you mind attaching the script to the mail for completeness? It's a
useful thing
done
rm src.$$
exit 0
David Zafman
Senior Developer
http://www.inktank.com
` is needed to
interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted
David Zafman
Senior Developer
http://www.inktank.com
On Oct 4, 2013, at 4:55 PM, Sage Weil s...@inktank.com wrote:
This point release fixes an important performance issue with radosgw
I want to record with the ceph-devel archive results from testing limits of
xattrs for Linux filesystems used with Ceph.
Script that creates xattrs with name user.test1, user.test2, ... on a single file
3.10 linux kernel
ext4
value bytes    number of entries
1              148
16
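The original test script is not reproduced here. As a rough sketch of the same kind of probe, assuming Linux and Python's os.setxattr, the hypothetical helper below adds user.test1, user.test2, ... with a fixed value size to one file until the filesystem refuses, and reports how many fit. It makes no claim about matching the numbers above.

```python
import errno
import os


def count_xattrs_until_full(path, value_size, limit=10000):
    """Add user.testN xattrs to path until one fails or limit is reached.

    Returns (count, reason): count is how many were stored, reason is the
    errno name of the failure, or None if limit was reached without error.
    """
    value = b"x" * value_size
    for i in range(1, limit + 1):
        try:
            os.setxattr(path, f"user.test{i}", value)
        except OSError as exc:
            # Typically ENOSPC or E2BIG when the limit is hit, or
            # ENOTSUP if the filesystem has no user xattr support.
            return i - 1, errno.errorcode.get(exc.errno, str(exc.errno))
    return limit, None
```

Running it against a fresh file for each value size keeps the runs from interfering with one another, since xattrs left over from an earlier run would count against the limit.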
executable` is needed to
interpret this.
David Zafman
Senior Developer
http://www.inktank.com
On Sep 24, 2013, at 12:03 PM, Sage Weil s...@inktank.com wrote:
On Tue, 24 Sep 2013, David Zafman wrote:
Rados suite test run results for wip-5862. 2 scrub mismatch from mon
(known problem). 2
take responsibility for
holding the data assigned to that rack.
Though I didn't look at the data movement, I'm confident that it will work.
You can simply mark your OSDs out manually to verify that missing replicas are
replaced.
David Zafman
Senior Developer
http://www.inktank.com
On Apr 26
defined.
David Zafman
Senior Developer
http://www.inktank.com
On Apr 26, 2013, at 6:44 AM, Mike Dawson mike.daw...@scholarstack.com wrote:
David / Martin,
I can confirm this issue. At present I am running monitors only with 100% of
my OSD processes shut down. For the past couple hours
I filed tracker bug 4822 and have wip-4822 with a fix. My manual testing shows
that it works. I'm building a teuthology test.
Given your osd tree has a single rack it should always mark OSDs down after 5
minutes by default.
David Zafman
Senior Developer
http://www.inktank.com
On Apr 25
# src/tpbench
# src/xattr_bench
nothing added to commit but untracked files present (use git add to track)
David Zafman
Senior Developer
david.zaf...@inktank.com
You could look at the wip-wireshark-zafman branch. I rebased it and force
pushed it. It has changes to the wireshark.patch and a minor change I needed
to get it to build. I'm surprised the recent checkin didn't include the change
to packet-ceph.c which I needed to get it to build.
David
active+remapped, 5 active+degraded; 0 bytes data,
798 GB used, 3050 GB / 4055 GB avail
mdsmap e2: 0/0/0 up
David Zafman
Senior Developer
david.zaf...@inktank.com
I sent this proposal out to the developers that own the FSAL CEPH portion of
Nfs-Ganesha. They have changes to Ceph that expose additional interfaces for
this. This is our initial cut at improving the interfaces.
David Zafman
Senior Developer
david.zaf...@inktank.com
Begin forwarded
I reviewed these.
Reviewed-by: David Zafman david.zaf...@inktank.com
David Zafman
Senior Developer
david.zaf...@inktank.com
On Jan 3, 2013, at 11:04 AM, Alex Elder el...@inktank.com wrote:
I'm re-posting my patch backlog, in chunks that may or may not
match how they got posted before
I amended the last 5 commits which I committed to the testing branch last
night. Please update your repositories accordingly.
David
Keep in mind that some of the init.d stuff doesn't work with a ceph-deploy
installed system. Not clear to me if we need to fix ceph-deploy or if for those
types of setups only upstart should be used/available.
David
On Dec 5, 2012, at 11:41 AM, Dan Mick dan.m...@inktank.com wrote:
The story as
On Nov 27, 2012, at 9:03 AM, Sage Weil s...@inktank.com wrote:
On Tue, 27 Nov 2012, Sam Lang wrote:
3. When a client acquires the cap for a file, have the mds provide its
current
time as well. As the client updates the mtime, it uses the timestamp
provided
by the mds and the time
On Nov 27, 2012, at 11:05 AM, Sam Lang sam.l...@inktank.com wrote:
On 11/27/2012 12:01 PM, Sage Weil wrote:
On Tue, 27 Nov 2012, David Zafman wrote:
On Nov 27, 2012, at 9:03 AM, Sage Weil s...@inktank.com wrote:
On Tue, 27 Nov 2012, Sam Lang wrote:
3. When a client acquires the cap
On Nov 27, 2012, at 1:14 PM, Sam Lang sam.l...@inktank.com wrote:
On 11/27/2012 01:38 PM, David Zafman wrote:
On Nov 27, 2012, at 11:05 AM, Sam Lang sam.l...@inktank.com wrote:
On 11/27/2012 12:01 PM, Sage Weil wrote:
On Tue, 27 Nov 2012, David Zafman wrote:
On Nov 27, 2012, at 9:03
I also added a kcon_most teuthology task which does almost the same thing as
ceph/src/script/kcon_most.sh to all or any set of clients. The teuthology
version does not raise the console log level.
For example:
tasks:
- ceph:
- kclient:
- kcon_most:
- interactive:
On Oct 24, 2012, at 11:14
63 matches