Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Daniel Baumann
On 05/19/2018 01:13 AM, Webert de Souza Lima wrote:
> New question: will it make any difference in the balancing if instead of
> having the MAIL directory in the root of cephfs and the domains'
> subtrees inside it, I discard the parent dir and put all the subtrees right
> in cephfs root?

the balancing between the MDSes is influenced by which directories are
accessed; the currently accessed directory trees are divided between the
MDSes (also check the dirfrag option in the docs). assuming you have the
same access pattern, the "fragmentation" between the MDSes happens at
these "target directories", so it doesn't matter if these directories
are further up or down in the same filesystem tree.
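
One way to see this in practice (a sketch only; the daemon name "mds.a" and the jq projection are illustrative assumptions) is to ask an MDS which subtrees it is authoritative for via its admin socket:

```
# On an MDS host, dump the subtree map for a given daemon:
ceph daemon mds.a get subtrees | jq '.[] | {path: .dir.path, auth: .auth_first}'
```

Running this against each active MDS shows how the accessed trees are currently divided between the ranks.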

in the multi-MDS scenario where the MDS serving rank 0 fails, the
effects at the moment of failure for any cephfs client accessing a
directory/file are the same (as described in an earlier mail),
regardless of which level the directory/file is at within the filesystem.

Regards,
Daniel
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] in retrospect get OSD for "slow requests are blocked" ? / get detailed health status via librados?

2018-05-18 Thread Brad Hubbard
On Thu, May 17, 2018 at 6:06 PM, Uwe Sauter  wrote:
> Brad,
>
> thanks for the bug report. This is exactly the problem I am having (log-wise).

 You don't give any indication what version you are running but see
 https://tracker.ceph.com/issues/23205
>>>
>>>
>>> the cluster is a Proxmox installation which is based on an Ubuntu kernel.
>>>
>>> # ceph -v
>>> ceph version 12.2.5 (dfcb7b53b2e4fcd2a5af0240d4975adc711ab96e) luminous
>>> (stable)
>>>
>>> The mystery is that these blocked requests occur in large numbers when at
>>> least one of the 6 servers is booted with kernel 4.15.17; if all are running
>>> 4.13.16, blocked requests are infrequent and few.
>>
>> Sounds like you need to profile your two kernel versions and work out
>> why one is under-performing.
>>
>
> Well, the problem is that I see this behavior only in our production system 
> (6 hosts and 22 OSDs total). The test system I have is
> a bit smaller (only 3 hosts with 12 OSDs on older hardware) and shows no sign 
> of this possible regression…

Are you saying you can't gather performance data from your production system?

>
>
> Regards,
>
> Uwe



-- 
Cheers,
Brad


Re: [ceph-users] Multi-MDS Failover

2018-05-18 Thread Scottix
So we have been testing this quite a bit. Having the failure domain be only
partially available is OK for us, but odd, since we don't know in advance what
will be down. With a single MDS, by contrast, we know everything will be blocked.

It would be nice to have an option to block all IO once the filesystem hits a
degraded state, until it recovers. Since each MDS is unaware of the others'
state, though, that seems like it would be tough to do.

I'll leave this as a feature request possibly in the future.

On Fri, May 18, 2018 at 3:15 PM Gregory Farnum  wrote:

> On Fri, May 18, 2018 at 11:56 AM Webert de Souza Lima <
> webert.b...@gmail.com> wrote:
>
>> Hello,
>>
>>
>> On Mon, Apr 30, 2018 at 7:16 AM Daniel Baumann 
>> wrote:
>>
>>> additionally: if rank 0 is lost, the whole FS stands still (no new
>>> client can mount the fs; no existing client can change a directory,
>>> etc.).
>>>
>>> my guess is that the root of a cephfs (/; which is always served by rank
>>> 0) is needed in order to do traversals/lookups of any directories on the
>>> top-level (which then can be served by ranks 1-n).
>>>
>>
>> Could someone confirm if this is actually how it works? Thanks.
>>
>
> Yes, although I'd expect that clients can keep doing work in directories
> they've already got opened (or in descendants of those). Perhaps I'm
> missing something about that, though...
> -Greg
>
>
>>
>> Regards,
>>
>> Webert Lima
>> DevOps Engineer at MAV Tecnologia
>> *Belo Horizonte - Brasil*
>> *IRC NICK - WebertRLZ*
>>
>>
>


Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Webert de Souza Lima
Hi Patrick

On Fri, May 18, 2018 at 6:20 PM Patrick Donnelly 
wrote:

> Each MDS may have multiple subtrees they are authoritative for. Each
> MDS may also replicate metadata from another MDS as a form of load
> balancing.


Ok, it's good to know that it actually does some load balancing. Thanks.
New question: will it make any difference in the balancing if, instead of
having the MAIL directory in the root of cephfs and the domains' subtrees
inside it, I discard the parent dir and put all the subtrees right in the
cephfs root?


> standby-replay daemons are not available to take over for ranks other
> than the one they follow. So, you would want to have a standby-replay
> daemon for each rank or just have normal standbys. It will likely
> depend on the size of your MDS (cache size) and available hardware.
>
> It's best if you see if the normal balancer (especially in v12.2.6
> [1]) can handle the load for you without trying to micromanage things
> via pins. You can use pinning to isolate metadata load from other
> ranks as a stop-gap measure.
>

Ok, I will start with the simplest way. This can be changed after deployment
if it turns out to be necessary.

On Fri, May 18, 2018 at 6:38 PM Daniel Baumann 
wrote:

> jftr, having 3 active mds and 3 standby-replay resulted in a longer
> downtime for us in 2017 due to http://tracker.ceph.com/issues/21749
>
> we're not using standby-replay MDS's anymore but only "normal" standby,
> and haven't had any problems since (we were running kraken then, upgraded
> to luminous last fall).
>

Thank you very much for your feedback Daniel. I'll go for the regular
standby daemons, then.

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*


Re: [ceph-users] Help/advice with crush rules

2018-05-18 Thread Gregory Farnum
On Thu, May 17, 2018 at 9:05 AM Andras Pataki 
wrote:

> I've been trying to wrap my head around crush rules, and I need some
> help/advice.  I'm thinking of using erasure coding instead of
> replication, and trying to understand the possibilities for planning for
> failure cases.
>
> For a simplified example, consider a 2 level topology, OSDs live on
> hosts, and hosts live in racks.  I'd like to set up a rule for a 6+3
> erasure code that would put at most 1 of the 9 chunks on a host, and no
> more than 3 chunks in a rack (so in case the rack is lost, we still have
> a way to recover).  Some racks may not have 3 hosts in them, so they
> could potentially accept only 1 or 2 chunks then.  How can something
> like this be implemented as a crush rule?  Or, if not exactly this,
> something in this spirit?  I don't want to say that all chunks need to
> live in a separate rack because that is too restrictive (some racks may
> be much bigger than others, or there might not even be 9 racks).
>

Unfortunately what you describe here is a little too detailed in ways CRUSH
can't easily specify. You should think of a CRUSH rule as a sequence of
steps that start out at a root (the "take" step), and incrementally specify
more detail about which piece of the CRUSH hierarchy they run on, but run
the *same* rule on every piece they select.

So the simplest thing that comes close to what you suggest is:
(forgive me if my syntax is slightly off, I'm doing this from memory)
step take default
step chooseleaf n type=rack
step emit

That would start at the default root, select "n" racks (9, in your case)
and then for each rack find an OSD within it. (chooseleaf is special and
more flexible than most of the CRUSH language; it's nice because if it
can't find an OSD in one of the selected racks, it will pick another rack.)
But a rule that's more illustrative of how things work is:
step take default
step choose 3 type=rack
step chooseleaf 3 type=host
step emit

That one selects three racks, then selects three OSDs within different
hosts *in each rack*. (You'll note that it doesn't necessarily work out so
well if you don't want 9 OSDs!) If one of the racks it selected doesn't
have 3 separate hosts...well, tough, it tried to do what you told it. :/

If you were dedicated, you could split up your racks into
equivalently-sized units — let's say rows. Then you could do
step take default
step choose 3 type=row
step chooseleaf 3 type=host
step emit

Assuming you have 3+ rows of good size, that'll get you 9 OSDs which are
all on different hosts.
-Greg
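
For reference, the last sketch written out as a complete rule in a decompiled crushmap might look like the following (a sketch only; the rule name, id, and the existence of a `row` level in your hierarchy are assumptions):

```
rule ec63_by_row {
    id 1
    type erasure
    min_size 9
    max_size 9
    step take default
    step choose indep 3 type row       # pick 3 rows
    step chooseleaf indep 3 type host  # 3 OSDs on distinct hosts per row
    step emit
}
```

With `indep` (the erasure-coded variant of `firstn`), each of the 3 rows contributes 3 chunks on different hosts, giving the 9 chunks of a 6+3 profile.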


>
> Thanks,
>
> Andras
>
>


Re: [ceph-users] Ceph MeetUp Berlin – May 28

2018-05-18 Thread Gregory Farnum
Is there any chance of sharing those slides when the meetup has finished?
It sounds interesting! :)

On Fri, May 18, 2018 at 6:53 AM Robert Sander 
wrote:

> Hi,
>
> we are organizing a bi-monthly meetup in Berlin, Germany and invite any
> interested party to join us for the next one on May 28:
>
> https://www.meetup.com/Ceph-Berlin/events/qbpxrhyxhblc/
>
> The presented topic is "High available (active/active) NFS and CIFS
> exports upon CephFS".
>
> Kindest Regards
> --
> Robert Sander
> Heinlein Support GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
> http://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Zwangsangaben lt. §35a GmbHG:
> HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
> Geschäftsführer: Peer Heinlein -- Sitz: Berlin
>
>


Re: [ceph-users] Multi-MDS Failover

2018-05-18 Thread Gregory Farnum
On Fri, May 18, 2018 at 11:56 AM Webert de Souza Lima 
wrote:

> Hello,
>
>
> On Mon, Apr 30, 2018 at 7:16 AM Daniel Baumann 
> wrote:
>
>> additionally: if rank 0 is lost, the whole FS stands still (no new
>> client can mount the fs; no existing client can change a directory, etc.).
>>
>> my guess is that the root of a cephfs (/; which is always served by rank
>> 0) is needed in order to do traversals/lookups of any directories on the
>> top-level (which then can be served by ranks 1-n).
>>
>
> Could someone confirm if this is actually how it works? Thanks.
>

Yes, although I'd expect that clients can keep doing work in directories
they've already got opened (or in descendants of those). Perhaps I'm
missing something about that, though...
-Greg


>
> Regards,
>
> Webert Lima
> DevOps Engineer at MAV Tecnologia
> *Belo Horizonte - Brasil*
> *IRC NICK - WebertRLZ*
>
>
>


Re: [ceph-users] Kubernetes/Ceph block performance

2018-05-18 Thread Gregory Farnum
You're doing 4K direct IOs on a distributed storage system and then
comparing it to what the local device does with 1GB blocks? :)

Try feeding Ceph with some larger IOs and check how it does.
-Greg
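
As a concrete illustration of that suggestion, a fio job with larger sequential writes might look like this (a sketch; the target filename, block size, and depths are assumptions to adapt to your environment):

```
; larger-block sequential write test against a Ceph-backed file
[global]
direct=1
ioengine=libaio
iodepth=16
readwrite=write
size=10G

[large-writes]
filename=/mnt/ceph-test/fio.dat
blocksize=4M
numjobs=4
```

Comparing the aggregate bandwidth of this job against the 4K run separates per-op latency effects from raw throughput limits.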

On Fri, May 18, 2018 at 1:22 PM Rhugga Harper  wrote:

>
> We're evaluating persistent block providers for Kubernetes and looking at
> ceph at the moment.
>
> We aren't seeing performance anywhere near what we expect.
>
> I have a 50-node proof of concept cluster with 40 nodes available for
> storage and configured with rook/ceph. Each has 10Gb NICs and 8 x 1TB
> SSD's. (only 3 drives on each node have been allocated to ceph use)
>
> We are testing with replicated pools of size 1 and 3. I've been doing fio
> tests in parallel (pod setup to run fio) and it seems to average aggregate
> bandwidth around 150 MB/sec.
>
> I'm running the fio tests as follows:
>
> direct=1, fsync=8|16|32|64, readwrite=write, blocksize=4k, numjobs=4|8,
> size=10G
>
> Regardless of doing 1 stream or 40, the aggregate bandwidth as reported by
> "ceph -s" is ~150MB/sec:
>
> I'm creating my pool with pg_num/pgp_num=1024|2048|4096
>
> A baseline dd (100GB file using blocksize 1G) on these SSD's shows them
> capable of 1.6 GB/s.
>
> I can't seem to find any limitations or bottlenecks on the nodes or the
> network.
>
> Anyone have any idea where else I can look?
>
> I'm new to ceph and it just seems like this should be pushing more I/O.
> I've dug thru a lot of performance tuning sites and have implemented most
> of the suggestions.
>
>
> # ceph -s
>   cluster:
> id: 949a8caf-9a9b-4f09-8711-1d5158a65bd8
> health: HEALTH_OK
>
>   services:
> mon: 7 daemons, quorum
> rook-ceph-mon1,rook-ceph-mon3,rook-ceph-mon0,rook-ceph-mon5,rook-ceph-mon4,rook-ceph-mon2,rook-ceph-mon6
> mgr: rook-ceph-mgr0(active)
> osd: 123 osds: 123 up, 123 in
>
>   data:
> pools:   1 pools, 2048 pgs
> objects: 134k objects, 508 GB
> usage:   1240 GB used, 110 TB / 112 TB avail
> pgs: 2048 active+clean
>
>   io:
> client:   138 MB/s wr, 0 op/s rd, 71163 op/s wr
>
> Thanks for any help,
> CC
>


Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Daniel Baumann
On 05/18/2018 11:19 PM, Patrick Donnelly wrote:
> So, you would want to have a standby-replay
> daemon for each rank or just have normal standbys. It will likely
> depend on the size of your MDS (cache size) and available hardware.

jftr, having 3 active mds and 3 standby-replay resulted in a longer
downtime for us in 2017 due to http://tracker.ceph.com/issues/21749

(http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/thread.html#21390
- thanks again for the help back then, still much appreciated)

we're not using standby-replay MDS's anymore but only "normal" standby,
and haven't had any problems since (we were running kraken then, upgraded
to luminous last fall).

Regards,
Daniel


Re: [ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Patrick Donnelly
Hello Webert,

On Fri, May 18, 2018 at 1:10 PM, Webert de Souza Lima
 wrote:
> Hi,
>
> We're migrating from a Jewel / filestore based cephfs architecture to a
> Luminous / bluestore based one.
>
> One MUST HAVE is multiple active MDS daemons. I'm still lacking knowledge of
> how it actually works.
> After reading the docs and ML we learned that they work by sort of dividing
> the responsibilities, each with its own and only directory subtree. (please
> correct me if I'm wrong).

Each MDS may have multiple subtrees they are authoritative for. Each
MDS may also replicate metadata from another MDS as a form of load
balancing.

> Question 1: I'd like to know if it is viable to have 4 MDS daemons, being 3
> Active and 1 Standby (or Standby-Replay if that's still possible with
> multi-mds).

standby-replay daemons are not available to take over for ranks other
than the one they follow. So, you would want to have a standby-replay
daemon for each rank, or just have normal standbys. It will likely
depend on the size of your MDS (cache size) and available hardware.

> Basically, what we have is 2 subtrees used by dovecot: INDEX and MAIL.
> Their tree is almost identical but INDEX stores all dovecot metadata with
> heavy IO going on and MAIL stores actual email files, with much more writes
> than reads.
>
> I don't know by now which one could bottleneck the MDS servers most so I
> wonder if I can take metrics on MDS usage per pool when it's deployed.
> Question 2: If the metadata workloads are very different I wonder if I can
> isolate them, like pinning MDS servers X and Y to one of the directories.

It's best if you see if the normal balancer (especially in v12.2.6
[1]) can handle the load for you without trying to micromanage things
via pins. You can use pinning to isolate metadata load from other
ranks as a stop-gap measure.

[1] https://github.com/ceph/ceph/pull/21412
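
If pinning does become necessary, it is done with the ceph.dir.pin virtual xattr. For example (a sketch; the mount point is hypothetical, and the INDEX/MAIL layout is taken from the question):

```
# pin the metadata-heavy index tree to rank 0 and the mail store to rank 1
setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/INDEX
setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/MAIL
```

Everything under a pinned directory is then served by that rank, regardless of what the balancer would otherwise do.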

-- 
Patrick Donnelly


[ceph-users] (yet another) multi active mds advise needed

2018-05-18 Thread Webert de Souza Lima
Hi,

We're migrating from a Jewel / filestore based cephfs architecture to a
Luminous / bluestore based one.

One MUST HAVE is multiple active MDS daemons. I'm still lacking knowledge
of how it actually works.
After reading the docs and ML we learned that they work by sort of dividing
the responsibilities, each with its own and only directory subtree. (please
correct me if I'm wrong).

Question 1: I'd like to know if it is viable to have 4 MDS daemons, being 3
Active and 1 Standby (or Standby-Replay if that's still possible with
multi-mds).

Basically, what we have is 2 subtrees used by dovecot: INDEX and MAIL.
Their tree is almost identical but INDEX stores all dovecot metadata with
heavy IO going on and MAIL stores actual email files, with much more writes
than reads.

I don't know by now which one could bottleneck the MDS servers most so I
wonder if I can take metrics on MDS usage per pool when it's deployed.
Question 2: If the metadata workloads are very different I wonder if I can
isolate them, like pinning MDS servers X and Y to one of the directories.

Cache tiering is deprecated, so:
Question 3: how can I think of a read cache mechanism in Luminous with
bluestore, mainly to keep newly created files (emails that just arrived and
will probably be fetched by the user in a few seconds via IMAP/POP3).

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*


Re: [ceph-users] Multi-MDS Failover

2018-05-18 Thread Webert de Souza Lima
Hello,


On Mon, Apr 30, 2018 at 7:16 AM Daniel Baumann 
wrote:

> additionally: if rank 0 is lost, the whole FS stands still (no new
> client can mount the fs; no existing client can change a directory, etc.).
>
> my guess is that the root of a cephfs (/; which is always served by rank
> 0) is needed in order to do traversals/lookups of any directories on the
> top-level (which then can be served by ranks 1-n).
>

Could someone confirm if this is actually how it works? Thanks.

Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*


>


Re: [ceph-users] ceph osd status output

2018-05-18 Thread John Spray
On Fri, May 18, 2018 at 9:55 AM, Marc Roos  wrote:
>
> Shouldn't ceph osd status write to stdout?

Oops, that's a bug.
http://tracker.ceph.com/issues/24175
https://github.com/ceph/ceph/pull/22089

John

> So I can do something like this
>
> [@ ~]# ceph osd status |grep c01
>
> And don't need to do this
>
> [@ ~]# ceph osd status 2>&1 |grep c01
>
>
>


[ceph-users] ceph osd status output

2018-05-18 Thread Marc Roos

Shouldn't ceph osd status write to stdout?

So I can do something like this

[@ ~]# ceph osd status |grep c01

And don't need to do this

[@ ~]# ceph osd status 2>&1 |grep c01





Re: [ceph-users] Increasing number of PGs by not a factor of two?

2018-05-18 Thread Bryan Banister
+1

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Kai 
Wagner
Sent: Thursday, May 17, 2018 4:20 PM
To: David Turner 
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Increasing number of PGs by not a factor of two?


Great summary David. Wouldn't this be worth a blog post?

On 17.05.2018 20:36, David Turner wrote:
By sticking with PG numbers as a base 2 number (1024, 16384, etc) all of your 
PGs will be the same size and easier to balance and manage.  What happens when 
you have a non base 2 number is something like this.  Say you have 4 PGs that 
are all 2GB in size.  If you increase pg(p)_num to 6, then you will have 2 PGs 
that are 2GB and 4 PGs that are 1GB as you've split 2 of the PGs into 4 to get 
to the 6 total.  If you increase the pg(p)_num to 8, then all 8 PGs will be 
1GB.  Depending on how you manage your cluster, that doesn't really matter, but 
for some methods of balancing your cluster, that will greatly imbalance things.
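
That splitting arithmetic can be sketched numerically (a toy model of our own, not Ceph code; real splitting selects PGs by object hash, but the resulting sizes come out the same):

```python
# Start with 4 PGs of 2 GB each; raising pg_num splits existing PGs in half,
# so a non-power-of-two target leaves a mix of split and unsplit PG sizes.
def split_pgs(sizes_gb, target):
    """Split the largest PG in half until there are `target` PGs."""
    sizes = sorted(sizes_gb, reverse=True)
    while len(sizes) < target:
        big = sizes.pop(0)          # take the largest remaining PG
        sizes += [big / 2, big / 2]  # replace it with two halves
        sizes.sort(reverse=True)
    return sizes

print(split_pgs([2, 2, 2, 2], 6))  # mixed: [2, 2, 1.0, 1.0, 1.0, 1.0]
print(split_pgs([2, 2, 2, 2], 8))  # uniform: eight PGs of 1.0 GB
```

Going to 6 PGs leaves two 2 GB PGs next to four 1 GB ones, while the power-of-two target gives uniform sizes, which is exactly the imbalance described above.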

This would be a good time to go to a base 2 number.  I think you're thinking 
about Gluster where if you have 4 bricks and you want to increase your 
capacity, going to anything other than a multiple of 4 (8, 12, 16) kills 
performance (worse than increasing storage already does) and takes longer as it 
has to weirdly divide the data instead of splitting a single brick up to 
multiple bricks.

As you increase your PGs, do this slowly and in a loop.  I like to increase my
PGs by 256, wait for all PGs to create, activate, and peer, rinse/repeat until
I get to my target.  [1] This is an example of a script that should accomplish
this with no interference.  Notice the use of flags while increasing the PGs:
if an OSD OOMs or dies for any reason mid-change, it would otherwise add to the
peering that needs to happen and make things take much longer.  It would also
be wasted IO to start backfilling while you're still making changes; it's best
to wait until you finish increasing your PGs and everything peers before you
let data start moving.

Another thing to keep in mind is how long your cluster will be moving data 
around.  Increasing your PG count on a pool full of data is one of the most 
intensive operations you can tell a cluster to do.  The last time I had to do 
this, I increased pg(p)_num by 4k PGs from 16k to 32k, let it backfill, 
rinse/repeat until the desired PG count was achieved.  For me, that 4k PGs 
would take 3-5 days depending on other cluster load and how full the cluster 
was.  If you do decide to increase your PGs by 4k instead of the full increase, 
change the 16384 to the number you decide to go to, backfill, continue.


[1]
# Make sure to set the pool variable as well as the number ranges to the
# appropriate values.
flags="nodown nobackfill norecover"
for flag in $flags; do
  ceph osd set $flag
done

pool=rbd
echo "$pool currently has $(ceph osd pool get $pool pg_num) PGs"

# Wait until no PGs are peering, stale, activating, creating, or inactive.
wait_for_clean_pgs() {
  while sleep 10; do
    ceph health | grep -q 'peering\|stale\|activating\|creating\|inactive' || break
  done
}

# seq arguments: current PG count, step size, target PG count; the target is
# appended explicitly in case the step does not land on it exactly.
for num in $(seq 7700 256 16384) 16384; do
  ceph osd pool set $pool pg_num $num
  wait_for_clean_pgs
  ceph osd pool set $pool pgp_num $num
  wait_for_clean_pgs
done

for flag in $flags; do
  ceph osd unset $flag
done

On Thu, May 17, 2018 at 9:27 AM Kai Wagner wrote:
Hi Oliver,

a good value is 100-150 PGs per OSD. So in your case between 20k and 30k.

You can increase your PGs, but keep in mind that this will keep the
cluster quite busy for some while. That said I would rather increase in
smaller steps than in one large move.
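
For reference, the rule-of-thumb arithmetic can be sketched like this (a helper of our own, not a Ceph tool; the 20k-30k figure above appears to count PG replicas across all OSDs, so the pg_num you actually set divides that by the replication factor):

```python
def suggest_pg_num(num_osds, replication=3, pgs_per_osd=100):
    """Round (OSDs * PGs-per-OSD / replicas) to the nearest power of two."""
    raw = num_osds * pgs_per_osd / replication
    lower = 1 << (int(raw).bit_length() - 1)  # largest power of two <= raw
    upper = lower * 2
    return upper if (raw - lower) > (upper - raw) else lower

# ~200 OSDs, 3x replication: 200*100/3 ~ 6667, which rounds up to 8192
print(suggest_pg_num(200))  # -> 8192
```

So for the 200-OSD cluster in question, 8192 (or 16384 at 150+ PGs per OSD) is in the recommended range.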

Kai


On 17.05.2018 01:29, Oliver Schulz wrote:
> Dear all,
>
> we have a Ceph cluster that has slowly evolved over several
> years and Ceph versions (started with 18 OSDs and 54 TB
> in 2013, now about 200 OSDs and 1.5 PB, still the same
> cluster, with data continuity). So there are some
> "early sins" in the cluster configuration, left over from
> the early days.
>
> One of these sins is the number of PGs in our CephFS "data"
> pool, which is 7200 and therefore not (as recommended)
> a power of two. Pretty much all of our data is in the
> "data" pool, the only other pools are "rbd" and "metadata",
> both contain little data (and they have way too many PGs
> already, another early sin).
>
> Is it possible - and safe - to change the number of "data"
> pool PGs from 7200 to 8192 or 16384? As we recently added
> more OSDs, I guess it would be time to increase the number
> of PGs anyhow. Or would we have to go to 14400 instead of
> 16384?
>
>
> Thanks for any advice,
>
> Oliver

[ceph-users] Ceph MeetUp Berlin – May 28

2018-05-18 Thread Robert Sander
Hi,

we are organizing a bi-monthly meetup in Berlin, Germany and invite any
interested party to join us for the next one on May 28:

https://www.meetup.com/Ceph-Berlin/events/qbpxrhyxhblc/

The presented topic is "High available (active/active) NFS and CIFS
exports upon CephFS".

Kindest Regards
-- 
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin





Re: [ceph-users] Poor CentOS 7.5 client performance

2018-05-18 Thread Ilya Dryomov
On Fri, May 18, 2018 at 3:25 PM, Donald "Mac" McCarthy
 wrote:
> Ilya,
>   Your recommendation worked beautifully.  Thank you!
>
> Is this expected behavior, or something that should be filed as a bug?
>
> I ask because I have just enough experience with ceph at this point to be 
> very dangerous and not enough history to know if this was expected from past 
> behavior.
>
> I did the dd testing after noticing poor read/write from a set of machines 
> that use ceph to back their home directories.

This is a bug, definitely not expected.  A Red Hat BZ has already been
filed.

Thanks,

Ilya


Re: [ceph-users] Poor CentOS 7.5 client performance

2018-05-18 Thread Donald "Mac" McCarthy
Ilya,
  Your recommendation worked beautifully.  Thank you!

Is this expected behavior, or something that should be filed as a bug?

I ask because I have just enough experience with ceph at this point to be very 
dangerous and not enough history to know if this was expected from past 
behavior.

I did the dd testing after noticing poor read/write from a set of machines that 
use ceph to back their home directories.


Mac
Please excuse any typos.  Autocorrect is evil!

> On May 17, 2018, at 16:57, Ilya Dryomov  wrote:
> 
> Ilya


Re: [ceph-users] [PROBLEM] Fail in deploy do ceph on RHEL

2018-05-18 Thread Jacob DeGlopper
Hi Antonio - you need to set !requiretty in your sudoers file.  This is 
documented here: 
http://docs.ceph.com/docs/jewel/start/quick-ceph-deploy/   but it 
appears that section may not have been copied into the current docs.


You can test this by running 'ssh sds@node1 sudo whoami' from your admin 
node.


    -- jacob


On 05/18/2018 09:00 AM, Antonio Novaes wrote:
I tried to create a new ceph cluster, but my first command produced the
error shown below. I searched Google for this error and believe it is an
SSH problem, not a ceph one.


I tried:
alias ssh="ssh -t" on the admin node

I modified the file

Host node01
   Hostname node01.domain.local
   User sds
   PreferredAuthentications publickey
   IdentityFile /home/sds/.ssh/id_rsa

I also tried:
- starting the command with sudo
- adding PermitRootLogin without-password to /etc/ssh/sshd_config on the
host node01


But the error persists:

[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[node01][DEBUG ] connected to host: cadmfsd001.tjba.jus.br 


[node01][INFO ] Running command: ssh -CT -o BatchMode=yes node01
[ceph_deploy.new][WARNIN] could not connect via SSH
[ceph_deploy.new][INFO  ] will connect again with password prompt
[node01][DEBUG ] connected to host: sds@node01
[node01][DEBUG ] detect platform information from remote host
[node01][DEBUG ] detect machine type
[ceph_deploy.new][INFO  ] adding public keys to authorized_keys
[node01][DEBUG ] append contents to file
[node01][DEBUG ] connection detected need for sudo
*sudo: I'm sorry, you should have a tty to run sudo*
*[ceph_deploy][ERROR ] RuntimeError: connecting to host: sds@node01 
resulted in errors: IOError cannot send (already closed?)*


Someone can help me?

Att,
Antonio Novaes de C. Jr
Analista TIC - Sistema e Infraestrutura
Especialista em Segurança de Rede de Computadores
Information Security Foundation based on ISO/IEC 27002 | ISFS
EXIN Cloud Computing (CLOUDF)
Red Hat Certified Engineer (RHCE)
Red Hat Certified Jboss Administrator (RHCJA)
Linux Certified Engineer (LPIC-2)
Novell Certified Linux Administrator (SUSE CLA)
ID Linux: 481126 | LPI000255169
LinkedIN: Perfil Público 











Re: [ceph-users] [PROBLEM] Fail in deploy do ceph on RHEL

2018-05-18 Thread David Turner
That error is a sudo error, not an SSH error.  Making root login possible
without password doesn't affect this at all.  ceph-deploy is successfully
logging in as sds to node01, but is failing to be able to execute sudo
commands without a password.  To fix that you need to use `visudo` to give
the sds user the ability to run sudo commands with nopasswd.
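
For example, a drop-in sudoers entry like the following (a sketch; the file name is an assumption, the username matches the sds account from the question, and it must be edited with visudo) grants passwordless sudo and also disables the requiretty default that the error message points at:

```
# /etc/sudoers.d/ceph-deploy  (edit with: visudo -f /etc/sudoers.d/ceph-deploy)
sds ALL=(ALL) NOPASSWD: ALL
Defaults:sds !requiretty
```

After that, `ssh sds@node01 sudo whoami` should print "root" without prompting, which is what ceph-deploy needs.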

On Fri, May 18, 2018 at 9:01 AM Antonio Novaes 
wrote:

> I tried to create a new ceph cluster, but my first command produced the
> error shown below. I searched Google for this error and believe it is an
> SSH problem, not a ceph one.
>
> I tried:
> alias ssh="ssh -t" on the admin node
>
> I modified the file
>
> Host node01
>Hostname node01.domain.local
>User sds
>PreferredAuthentications publickey
>IdentityFile /home/sds/.ssh/id_rsa
>
> I also tried:
> - starting the command with sudo
> - adding PermitRootLogin without-password to /etc/ssh/sshd_config on the
> host node01
>
> But the error persists:
>
> [ceph_deploy.new][DEBUG ] Creating new cluster named ceph
> [ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
> [node01][DEBUG ] connected to host: cadmfsd001.tjba.jus.br
> [node01][INFO  ] Running command: ssh -CT -o BatchMode=yes node01
> [ceph_deploy.new][WARNIN] could not connect via SSH
> [ceph_deploy.new][INFO  ] will connect again with password prompt
> [node01][DEBUG ] connected to host: sds@node01
> [node01][DEBUG ] detect platform information from remote host
> [node01][DEBUG ] detect machine type
> [ceph_deploy.new][INFO  ] adding public keys to authorized_keys
> [node01][DEBUG ] append contents to file
> [node01][DEBUG ] connection detected need for sudo
> *sudo: I'm sorry, you should have a tty to run sudo*
> *[ceph_deploy][ERROR ] RuntimeError: connecting to host: sds@node01
> resulted in errors: IOError cannot send (already closed?)*
>
> Someone can help me?
>
> Att,
> Antonio Novaes de C. Jr
> Analista TIC - Sistema e Infraestrutura
> Especialista em Segurança de Rede de Computadores
> Information Security Foundation based on ISO/IEC 27002 | ISFS
> EXIN Cloud Computing (CLOUDF)
> Red Hat Certified Engineer (RHCE)
> Red Hat Certified Jboss Administrator (RHCJA)
> Linux Certified Engineer (LPIC-2)
> Novell Certified Linux Administrator (SUSE CLA)
> ID Linux: 481126 | LPI000255169
> LinkedIN: Perfil Público
> 
>
>
>


[ceph-users] [PROBLEM] Fail in deploy do ceph on RHEL

2018-05-18 Thread Antonio Novaes
I tried to create a new ceph cluster, but my first command produced the
error shown below. I searched Google for this error and believe it is an
SSH problem, not a ceph one.

I tried:
alias ssh="ssh -t" on the admin node

I modified the file

Host node01
   Hostname node01.domain.local
   User sds
   PreferredAuthentications publickey
   IdentityFile /home/sds/.ssh/id_rsa

I also tried:
- starting the command with sudo
- adding PermitRootLogin without-password to /etc/ssh/sshd_config on the host
node01

But the error persists:

[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[node01][DEBUG ] connected to host: cadmfsd001.tjba.jus.br
[node01][INFO  ] Running command: ssh -CT -o BatchMode=yes node01
[ceph_deploy.new][WARNIN] could not connect via SSH
[ceph_deploy.new][INFO  ] will connect again with password prompt
[node01][DEBUG ] connected to host: sds@node01
[node01][DEBUG ] detect platform information from remote host
[node01][DEBUG ] detect machine type
[ceph_deploy.new][INFO  ] adding public keys to authorized_keys
[node01][DEBUG ] append contents to file
[node01][DEBUG ] connection detected need for sudo
*sudo: I'm sorry, you should have a tty to run sudo*
*[ceph_deploy][ERROR ] RuntimeError: connecting to host: sds@node01
resulted in errors: IOError cannot send (already closed?)*

Someone can help me?

Att,
Antonio Novaes de C. Jr
Analista TIC - Sistema e Infraestrutura
Especialista em Segurança de Rede de Computadores
Information Security Foundation based on ISO/IEC 27002 | ISFS
EXIN Cloud Computing (CLOUDF)
Red Hat Certified Engineer (RHCE)
Red Hat Certified Jboss Administrator (RHCJA)
Linux Certified Engineer (LPIC-2)
Novell Certified Linux Administrator (SUSE CLA)
ID Linux: 481126 | LPI000255169
LinkedIN: Perfil Público



[ceph-users] (no subject)

2018-05-18 Thread Don Doerner
unsubscribe ceph-users