Also worth pointing out something that may be obvious: this kind of
faster/destructive migration should only be attempted if all your pools are at
least 3x replicated.
For example, if you had a 1x replicated pool you would lose data using this
approach.
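(As a quick sanity check before starting, you can confirm the replication size
of every pool with something like the commands below; the pool name is just a
placeholder.)

  ceph osd dump | grep 'replicated size'
  ceph osd pool get <pool-name> size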
-- Dan
> On Jan 11, 2018, at 14:24, Reed
To add a little color here... we started an rsync last night to copy about 4TB
worth of files to CephFS. Paused it this morning because CephFS was
unresponsive on the machine (e.g. can't cat a file from the filesystem).
Been waiting about 3 hours for the log jam to clear. Slow requests have …
> On Dec 2, 2016, at 10:48, Sage Weil wrote:
>
> On Fri, 2 Dec 2016, Dan Jakubiec wrote:
>> For what it's worth... this sounds like the condition we hit when we
>> re-enabled scrub on our 16 OSDs (after 6 to 8 weeks of noscrub). They
>> flapped for about 30 minutes…
For what it's worth... this sounds like the condition we hit when we re-enabled
scrub on our 16 OSDs (after 6 to 8 weeks of noscrub). They flapped for about
30 minutes as most of the OSDs randomly hit suicide timeouts here and there.
This settled down after about an hour and the OSDs stopped dying.
Thanks Greg, makes sense.
Our Ceph cluster currently has 16 OSDs, each with an 8 TB disk.
Sounds like 32 PGs at 3x replication might be a reasonable starting point?
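(As a rough sketch of the arithmetic, using the common ~100-PGs-per-OSD rule of
thumb: 16 OSDs * 100 / 3 replicas ≈ 533 PGs across all pools combined, and
since the metadata pool is tiny relative to the data pool, giving it a small
share such as 32 seems sane. Creating it would look something like the line
below; the pool name is only an example.)

  ceph osd pool create cephfs_metadata 32 32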
Thanks,
-- Dan
> On Nov 8, 2016, at 14:02, Gregory Farnum wrote:
>
> On Tue, Nov 8, 2016 at 9:37 AM, Dan Jakubiec wrote:
Hello,
Picking the number of PGs for the CephFS data pool seems straightforward, but
how does one do this for the metadata pool?
Any rules of thumb or recommendations?
Thanks,
-- Dan Jakubiec
We currently have one master RADOS pool in our cluster that is shared among
many applications. All objects in the pool are currently stored under specific
namespaces -- nothing is stored in the default namespace.
We would like to add a CephFS filesystem to our cluster, and would like to …
Hi John,
How does one configure namespaces for file/dir layouts? I'm looking here, but
am not seeing any mention of namespaces:
http://docs.ceph.com/docs/jewel/cephfs/file-layouts/
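(For anyone finding this later: newer revisions of that layout documentation do
describe a pool_namespace layout field. Assuming your release and clients
support it, setting it on a directory would look roughly like this made-up
example.)

  setfattr -n ceph.dir.layout.pool_namespace -v myapp_ns /mnt/cephfs/myapp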
Thanks,
-- Dan
> On Oct 28, 2016, at 04:11, John Spray wrote:
>
> On Thu, Oct 27, 2016 at 9:43 PM, Reed Di
Thanks Kostis, great read.
We also had a Ceph disaster back in August and a lot of this experience looked
familiar. Sadly, in the end we were not able to recover our cluster, but we are
glad to hear that you were successful.
LevelDB corruptions were one of our big problems. Your note below about …
… how many PGs are backfilling and the load on machines and network.
kind regards
Ronny Aasen
--
Dan Jakubiec
VP Development
Focus VQ
Hello, I need to issue the following commands on millions of objects:
rados_write_full(oid1, ...)
rados_setxattr(oid1, "attr1", ...)
rados_setxattr(oid1, "attr2", ...)
Would it make it any faster if I combined all 3 of these into a single
rados_write_op and issued them "together" as a single call?
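(For reference, the batched version via the write-op API would look roughly
like the sketch below; it assumes an open ioctx and uses made-up attribute
values.)

  #include <rados/librados.h>
  #include <string.h>

  /* Sketch: one write op carrying the write_full plus both xattrs,
   * so they are submitted to the OSD together rather than as three
   * separate round trips. */
  int write_with_xattrs(rados_ioctx_t io, const char *oid,
                        const char *data, size_t len)
  {
      rados_write_op_t op = rados_create_write_op();
      rados_write_op_write_full(op, data, len);
      rados_write_op_setxattr(op, "attr1", "value1", strlen("value1"));
      rados_write_op_setxattr(op, "attr2", "value2", strlen("value2"));
      int ret = rados_write_op_operate(op, io, oid, NULL, 0);
      rados_release_write_op(op);
      return ret;
  }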
Hi Brad, thank you very much for the response:
> On Sep 3, 2016, at 17:05, Brad Hubbard wrote:
>
>
>
> On Sun, Sep 4, 2016 at 6:21 AM, Dan Jakubiec <mailto:dan.jakub...@gmail.com>> wrote:
>
>> 2016-09-03 16:12:44.124033 7fec728c9700 15
>> …
Hi Samuel,
Here is another assert, but this time with debug filestore = 20.
Does this reveal anything?
2016-09-03 16:12:44.122451 7fec728c9700 20 list_by_hash_bitwise prefix 08F3
2016-09-03 16:12:44.123046 7fec728c9700 20 list_by_hash_bitwise prefix 08F30042
2016-09-03 16:12:44.123068 7fec728c97
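(In case anyone wants to reproduce this level of logging, it can be turned up
with something along these lines, either live via injectargs or in ceph.conf
under [osd]:)

  ceph tell osd.* injectargs '--debug-filestore 20'
  # or in ceph.conf:  debug filestore = 20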
There is no command to remove the old OSD, so I think our next step will be to
bring up a new/real/empty OSD.8 and see if that will clear the log jam. But it
seems like there should be a tool to deal with this kind of thing?
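(For the archives, the manual route would be something like the sketch below:
pull the CRUSH map, hand-edit out the stale "device0"-style entries, and inject
it back.)

  ceph osd getcrushmap -o crush.bin
  crushtool -d crush.bin -o crush.txt
  # edit crush.txt and delete the phantom device lines
  crushtool -c crush.txt -o crush.new
  ceph osd setcrushmap -i crush.new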
Thanks,
-- Dan
> On Sep 2, 2016, at 15:01, Dan Jakubiec wrote:
A while back we removed two damaged OSDs from our cluster, osd.0 and osd.8.
They are now gone from most Ceph commands, but are still showing up in the
CRUSH map with weird device names:
...
# devices
device 0 device0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
…
Re-packaging this question which was buried in a larger, less-specific thread
from a couple of days ago. Hoping this will be more useful here.
We have been working on restoring our Ceph cluster after losing a large number
of OSDs. We have all PGs active now except for 80 PGs that are stuck in the
"incomplete" state.
Thank you for all the help, Wido:
> On Sep 1, 2016, at 14:03, Wido den Hollander wrote:
>
> You have to mark those OSDs as lost and also force create the incomplete PGs.
>
This might be the root of our problems. We didn't mark the parent OSD as
"lost" before we removed it. Now ceph won't le
Thanks Wido. Reed and I have been working together to try to restore this
cluster for about 3 weeks now. I have been accumulating a number of failure
modes that I am hoping to share with the Ceph group soon, but have been holding
off a bit until we see the full picture clearly so that we can …
>
> You are more than welcome to send a Pull Request though!
> https://github.com/ceph/rados-java/pulls
>
> Wido
>
>> On 24 August 2016 at 21:58, Dan Jakubiec wrote:
>>
>>
>> Hello,
>>
>> Is anyone planning to implement support for Rados locks in the Java API…
Hello,
Is anyone planning to implement support for Rados locks in the Java API anytime
soon?
Thanks,
-- Dan J
Hi Wido,
Thank you for the response:
> On Aug 17, 2016, at 16:25, Wido den Hollander wrote:
>
>
>> On 17 August 2016 at 17:44, Dan Jakubiec wrote:
>>
>>
>> Hello, we have a Ceph cluster with 8 OSDs that recently lost power on all 8
>> machines…
Hello, we have a Ceph cluster with 8 OSDs that recently lost power on all 8
machines. We've managed to recover the XFS filesystems on 7 of the machines,
but the OSD service is only starting on 1 of them.
The other 5 machines all have complaints similar to the following:
2016-08-17 09:32