[ceph-users] remove require_jewel_osds flag after upgrade to kraken

2017-07-12 Thread Jan Krcmar
hi,

is it possible to remove the require_jewel_osds flag after upgrade to kraken?

$ ceph osd stat
 osdmap e29021: 40 osds: 40 up, 40 in
flags sortbitwise,require_jewel_osds,require_kraken_osds

it seems that ceph osd unset does not support require_jewel_osds

$ ceph osd unset require_jewel_osds
Invalid command:  require_jewel_osds not in
full|pause|noup|nodown|noout|noin|nobackfill|norebalance|norecover|noscrub|nodeep-scrub|notieragent|sortbitwise
osd unset 
full|pause|noup|nodown|noout|noin|nobackfill|norebalance|norecover|noscrub|nodeep-scrub|notieragent|sortbitwise
:  unset 
Error EINVAL: invalid command

is there any way to remove it?
if not, is it ok to leave the flag there?

thanks
fous
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] calculate past_intervals wrong, lead to choose wrong authority osd, then osd assert(newhead >= log.tail)

2017-07-12 Thread Chenyehua
Hi Sage
I find that the OSD asserts due to wrongly generated past_intervals. Could you
give me some advice and a solution for this problem?

Here is the detail:

Ceph version: 0.94.5

HOST-A    HOST-B    HOST-C
osd 7     osd 21    osd 11
1. at osdmap epoch 95, pg 1.20f has acting set [11,7] / up set [11,7]; then
   shutdown HOST-C
2. for a long time the cluster has only HOST-A and HOST-B; write data
3. shutdown HOST-A, then start HOST-C and restart HOST-B about 4 times
4. start HOST-A; osd 21 asserts

Analysis:
When osd.11 starts, it generates past_intervals wrongly and puts [92~1000] into
the same interval.
At pg map epoch 1673, osd.11 becomes the primary and pg 1.20f changes from
peering to activating+undersized+degraded, which modifies last_epoch_started.
When osd.7 starts, find_best_info chooses the info with the bigger
last_epoch_started, although osd.7 has the latest data.
past_intervals on osd 7:
    ~95      [11,7]/[11,7]
  96~100     [7]/[7]
 101         [7,21]/[7,21]
 102~178     [7,21]/[7]
 179~1663    [7,21]/[7,21]
1664~1672    [21]/[21]
1673~1692    [11]/[11]

past_intervals on osd 11:
  92~1000    [11,7]/[11,7]   <-- the wrong pi
1001~1663    [7,21]/[7,21]   no rw
1664~1672    [21]/[21]       no rw
1673~1692    [11]/[11]
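
To illustrate the analysis, here is a much-simplified sketch (Python, not the
actual hammer find_best_info code) of why the inflated last_epoch_started on
osd.11 wins even though osd.7 holds the newest writes; the numbers below are
only illustrative:

# peering picks the authoritative info primarily by last_epoch_started;
# last_update only breaks ties, so the newer data on osd.7 cannot win here
def choose_authoritative(infos):
    # infos: (osd_id, last_epoch_started, last_update)
    return max(infos, key=lambda info: (info[1], info[2]))

candidates = [
    ("osd.7",  93,   (95, 397)),   # correct les, newest data
    ("osd.11", 1673, (95, 300)),   # les inflated by the wrong [92~1000] interval
]
print(choose_authoritative(candidates)[0])  # -> osd.11, despite stale data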



Logs:
Assert on osd7:
2017-07-10 16:08:29.836722 7f4fac24a700 -1 osd/PGLog.cc: In function 'void 
PGLog::rewind_divergent_log(ObjectStore::Transaction&, eversion_t, pg_info_t&, 
PGLog::LogEntryHandler*, bool&, bool&)' thread 7f4fac24a700 time 2017-07-10 
16:08:29.833699
osd/PGLog.cc: 503: FAILED assert(newhead >= log.tail)
ceph version 0.94.5 (664cc0b54fdb496233a81ab19d42df3f46dcda50)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) 
[0xbd1ebb]
2: (PGLog::rewind_divergent_log(ObjectStore::Transaction&, eversion_t, 
pg_info_t&, PGLog::LogEntryHandler*, bool&, bool&)+0x60b) [0x7840fb]
3: (PG::rewind_divergent_log(ObjectStore::Transaction&, eversion_t)+0x97) 
[0x7df4b7]
4: (PG::RecoveryState::Stray::react(PG::MInfoRec const&)+0x22f) [0x80109f]
5: (boost::statechart::simple_state, 
(boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base 
const&, void const*)+0x216) [0x83c8d6]
6: (boost::statechart::state_machine, 
boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base
 const&)+0x5b) [0x827edb]
7: (PG::handle_peering_event(std::tr1::shared_ptr, 
PG::RecoveryCtx*)+0x1ce) [0x7d5dce]
8: (OSD::process_peering_events(std::list > const&, 
ThreadPool::TPHandle&)+0x2c0) [0x6b5930]
9: (OSD::PeeringWQ::_process(std::list > const&, 
ThreadPool::TPHandle&)+0x18) [0x70ef18]
10: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa56) [0xbc2aa6]
11: (ThreadPool::WorkThread::entry()+0x10) [0xbc3b50]
12: (()+0x8182) [0x7f4fc6458182]
13: (clone()+0x6d) [0x7f4fc49c347d]


osd 11 log:
2017-07-10 15:40:43.742547 7f8b1567a740 10 osd.11 95 pgid 1.20f coll 1.20f_head
2017-07-10 15:40:43.742580 7f8b1567a740 10 osd.11 95 _open_lock_pg 1.20f
2017-07-10 15:40:43.742592 7f8b1567a740  5 osd.11 pg_epoch: 95 
pg[1.20f(unlocked)] enter Initial
2017-07-10 15:40:43.743207 7f8b1567a740 10 osd.11 pg_epoch: 95 pg[1.20f( v 
95'397 (95'300,95'397] local-les=93 n=69 ec=75 les/c 93/93 92/92/92) [11,7] r=0 
lpr=0 crt=95'394 lcod 0'0 mlcod 0'0 inactive] handle_loaded
2017-07-10 15:40:43.743214 7f8b1567a740  5 osd.11 pg_epoch: 95 pg[1.20f( v 
95'397 (95'300,95'397] local-les=93 n=69 ec=75 les/c 93/93 92/92/92) [11,7] r=0 
lpr=0 crt=95'394 lcod 0'0 mlcod 0'0 inactive] exit Initial 0.000622 0 0.00
2017-07-10 15:40:43.743220 7f8b1567a740  5 osd.11 pg_epoch: 95 pg[1.20f( v 
95'397 (95'300,95'397] local-les=93 n=69 ec=75 les/c 93/93 92/92/92) [11,7] r=0 
lpr=0 crt=95'394 lcod 0'0 mlcod 0'0 inactive] enter Reset
2017-07-10 15:40:43.743224 7f8b1567a740 10 osd.11 pg_epoch: 95 pg[1.20f( v 
95'397 (95'300,95'397] local-les=93 n=69 ec=75 les/c 93/93 92/92/92) [11,7] r=0 
lpr=95 crt=95'394 lcod 0'0 mlcod 0'0 inactive] Clearing blocked outgoing 
recovery messages
2017-07-10 15:40:43.743228 7f8b1567a740 10 osd.11 pg_epoch: 95 pg[1.20f( v 
95'397 (95'300,95'397] local-les=93 n=69 ec=75 les/c 93/93 92/92/92) [11,7] r=0 
lpr=95 crt=95'394 lcod 0'0 mlcod 0'0 inactive] Not blocking outgoing recovery 
messages
2017-07-10 15:40:43.743232 7f8b1567a740 10 osd.11 95 load_pgs loaded pg[1.20f( 
v 95'397 (95'300,95'397] local-les=93 n=69 ec=75 les/c 93/93 92/92/92) [11,7] 
r=0 lpr=95 crt=95'394 lcod 0'0 mlcod 0'0 inactive] log((95'300,95'397], 
crt=95'394)
2017-07-10 15:40:43.829867 7f8b1567a740 10 osd.11 pg_epoch: 95 
pg[1.20f(unlocked)] _calc_past_interval_range start epoch 93 >= end epoch 92, 
nothing to do
2017-07-10 15:42:40.899157 7f8b1567a740 10 osd.11 pg_epoch: 95 pg[1.20f( v 
95'397 (95'300,95'397] local-les=93 n=69 ec=75 les/c 93/93 92/92/92) [11,7] r=0 
lpr=95 crt=95'394 lcod 0'0 mlcod 0'0 inactive] null
2017-07-10 15:42:40.902520 7f8afa1bd700 10 osd.11 pg_epoch: 95 pg[1.20f( v 
95'397 (95'300,95'397] local-les=93 n=69 ec=75 les/c 93/93 92/92/92) [11,7] r=0 
lpr=95 crt=9

Re: [ceph-users] Bucket policies in Luminous

2017-07-12 Thread Pritha Srivastava

- Original Message -
> From: "Adam C. Emerson" 
> To: "Graham Allan" 
> Cc: "Ceph Users" 
> Sent: Thursday, July 13, 2017 1:23:27 AM
> Subject: Re: [ceph-users] Bucket policies in Luminous
> 
> Graham Allan Wrote:
> > I thought I'd try out the new bucket policy support in Luminous. My goal
> > was simply to permit access on a bucket to another user.
> [snip]
> > Thanks for any ideas,
> 
> It's probably the 'blank' tenant. I'll make up a test case to exercise
> this and come up with a patch for it. Sorry about the trouble.
> 

The fix in this PR: https://github.com/ceph/ceph/pull/15997 should help.

Thanks,
Pritha

> --
> Senior Software Engineer   Red Hat Storage, Ann Arbor, MI, US
> IRC: Aemerson@{RedHat, OFTC}
> 0x80F7544B90EDBFB9 E707 86BA 0C1B 62CC 152C  7C12 80F7 544B 90ED BFB9
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD journaling benchmarks

2017-07-12 Thread Jason Dillaman
On Mon, Jul 10, 2017 at 3:41 PM, Maged Mokhtar  wrote:
> On 2017-07-10 20:06, Mohamad Gebai wrote:
>
>
> On 07/10/2017 01:51 PM, Jason Dillaman wrote:
>
> On Mon, Jul 10, 2017 at 1:39 PM, Maged Mokhtar  wrote:
>
> These are significant differences, to the point where it may not make sense
> to use rbd journaling / mirroring unless there is only 1 active client.
>
> I interpreted the results as the same RBD image was being concurrently
> used by two fio jobs -- which we strongly recommend against since it
> will result in the exclusive-lock ping-ponging back and forth between
> the two clients / jobs. Each fio RBD job should utilize its own
> backing image to avoid such a scenario.
>
>
> That is correct. The single job runs are more representative of the
> overhead of journaling only, and it is worth noting the (expected)
> inefficiency of multiple clients for the same RBD image, as explained by
> Jason.
>
> Mohamad
>
> Yes, I expected a penalty, but not as large. There are some use cases that
> would benefit from concurrent access to the same block device: in VMware and
> Hyper-V, several hypervisors can share the same device, formatted with a
> clustered file system like MS CSV (Cluster Shared Volumes) or VMFS, which
> creates a volume/datastore that houses many VMs.

Both of these use-cases would first need support for active/active
iSCSI. While A/A iSCSI via MPIO is trivial to enable, getting it to
properly handle failure conditions without the possibility of data
corruption is not, since it relies heavily on arbitrary initiator and
target-based timers. The only realistic and safe solution is to rely
on an MCS-based active/active implementation.

> I was wondering if such a setup could be supported in the future, and maybe
> there could be a way to minimize the overhead of the exclusive lock... for
> example by distributing a sequence number to the different active client
> writers and having each writer maintain its own journal. I doubt that the
> overhead would reach the values you showed.

The journal used by the librbd mirroring feature was designed to
support multiple concurrent writers. Of course, that original design
was more in line with the goal of supporting multiple images within a
consistency group.
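
For anyone reproducing the benchmarks, here is a minimal fio sketch of the
recommended one-image-per-job layout (pool and image names are placeholders,
not taken from the original tests):

[global]
ioengine=rbd
clientname=admin
pool=rbd
rw=randwrite
bs=4k
iodepth=32

[image1-job]
rbdname=bench-img-1

[image2-job]
rbdname=bench-img-2

Each job writes to its own backing image, so the exclusive lock never has to
bounce between clients.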

> Maged
>
>

-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] mds replay forever after a power failure

2017-07-12 Thread Su, Zhan
Hi,

I have a single-node Ceph cluster. After a power failure, the MDS is stuck
replaying the logs and CephFS stops working.

ceph -s gives:

health HEALTH_WARN
mds cluster is degraded
noscrub,nodeep-scrub flag(s) set
 monmap e4: 1 mons at {redbox=10.1.0.194:6789/0}
election epoch 41, quorum 0 redbox
  fsmap e68: 1/1/1 up {0=redbox=up:replay}
mgr no daemons active
 osdmap e369: 10 osds: 10 up, 10 in
flags
noscrub,nodeep-scrub,sortbitwise,require_jewel_osds,require_kraken_osds
  pgmap v528504: 870 pgs, 11 pools, 24050 GB data, 6030 kobjects
48200 GB used, 44938 GB / 93139 GB avail
 870 active+clean


And log:

2017-07-12 16:29:15.763089 7fe48be74380  0 ceph version 11.2.0
(f223e27eeb35991352ebc1f67423d4ebc252adb7), process ceph-mds, pid 6925
2017-07-12 16:29:15.764181 7fe48be74380  0 pidfile_write: ignore empty
--pid-file
2017-07-12 16:29:15.766205 7fe48be74380 -1 WARNING: the following dangerous
and experimental features are enabled: bluestore
2017-07-12 16:29:20.243839 7fe48534e700  1 mds.0.68 handle_mds_map i am now
mds.0.68
2017-07-12 16:29:20.243852 7fe48534e700  1 mds.0.68 handle_mds_map state
change up:boot --> up:replay
2017-07-12 16:29:20.243880 7fe48534e700  1 mds.0.68 replay_start
2017-07-12 16:29:20.243886 7fe48534e700  1 mds.0.68  recovery set is
2017-07-12 16:29:20.243892 7fe48534e700  1 mds.0.68  waiting for osdmap 369
(which blacklists prior instance)
2017-07-12 16:31:10.333118 7fe4872ef700  0 -- 10.1.0.194:6800/422236473 >>
- conn(0x559af36d6000 :6800 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0
l=0).fault with nothing to send and in the half  accept state just closed
2017-07-12 16:31:10.533666 7fe487af0700  0 -- 10.1.0.194:6800/422236473 >>
- conn(0x559af36d7800 :6800 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0
l=0).fault with nothing to send and in the half  accept state just closed
2017-07-12 16:31:10.934493 7fe487af0700  0 -- 10.1.0.194:6800/422236473 >>
- conn(0x559af36d9000 :6800 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0
l=0).fault with nothing to send and in the half  accept state just closed
2017-07-12 16:31:11.244383 7fe487af0700  0 -- 10.1.0.194:6800/422236473 >>
- conn(0x559af36d9000 :6800 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0
l=0).fault with nothing to send and in the half  accept state just closed
2017-07-12 16:31:11.445010 7fe487af0700  0 -- 10.1.0.194:6800/422236473 >>
- conn(0x559af36d6000 :6800 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0
l=0).fault with nothing to send and in the half  accept state just closed
2017-07-12 16:31:11.848886 7fe487af0700  0 -- 10.1.0.194:6800/422236473 >>
- conn(0x559af36d7800 :6800 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0
l=0).fault with nothing to send and in the half  accept state just closed
2017-07-12 16:31:12.650186 7fe487af0700  0 -- 10.1.0.194:6800/422236473 >>
- conn(0x559af3675800 :6800 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0
l=0).fault with nothing to send and in the half  accept state just closed
2017-07-12 16:31:14.252245 7fe487af0700  0 -- 10.1.0.194:6800/422236473 >>
-

And above line just repeatedly printed out.

I tried to bump logging level by:

ceph tell mds.redbox injectargs --debug_journaler 20
ceph tell mds.redbox injectargs --debug_mds 20
ceph tell mds.redbox injectargs --debug_mds_log 20

And log:

2017-07-12 16:33:36.675521 7fe487af0700  0 -- 10.1.0.194:6800/422236473 >>
- conn(0x559af36d9000 :6800 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0
l=0).fault with nothing to send and in the half  accept state just closed
2017-07-12 16:33:51.676004 7fe487af0700  0 -- 10.1.0.194:6800/422236473 >>
- conn(0x559af36d7800 :6800 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0
l=0).fault with nothing to send and in the half  accept state just closed
2017-07-12 16:34:06.679509 7fe487af0700  0 -- 10.1.0.194:6800/422236473 >>
- conn(0x559af36d6000 :6800 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0
l=0).fault with nothing to send and in the half  accept state just closed
2017-07-12 16:34:21.680076 7fe487af0700  0 -- 10.1.0.194:6800/422236473 >>
- conn(0x559af3678800 :6800 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0
l=0).fault with nothing to send and in the half  accept state just closed
2017-07-12 16:34:36.684580 7fe487af0700  0 -- 10.1.0.194:6800/422236473 >>
- conn(0x559af36d7800 :6800 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0
l=0).fault with nothing to send and in the half  accept state just closed
2017-07-12 16:34:40.401344 7fe48534e700  5 mds.redbox ms_handle_reset on
10.1.0.194:0/193586809
2017-07-12 16:34:40.401354 7fe48534e700  3 mds.redbox ms_handle_reset
closing connection for session client.144148 10.1.0.194:0/193586809
2017-07-12 16:34:40.776533 7fe482b49700 20 mds.0.bal get_load no root, no
load
2017-07-12 16:34:40.776631 7fe482b49700 15 mds.0.bal get_load mdsload<[0,0
0]/[0,0 0], req 0, hr 0, qlen 0, cpu 0>
2017-07-12 16:34:40.776703 7fe482b49700 20 mds.beacon.redbox 0 slow request
found

[ceph-users] Fwd: installing specific version of ceph-common

2017-07-12 Thread Brad Hubbard
Sorry meant to include the list.


-- Forwarded message --
From: Brad Hubbard 
Date: Wed, Jul 12, 2017 at 9:12 PM
Subject: Re: [ceph-users] installing specific version of ceph-common
To: Buyens Niels 


On Wed, Jul 12, 2017 at 8:14 PM, Buyens Niels  wrote:
>
>
> I tried installing librados2-10.2.7 separately first (which worked). Then 
> trying to install ceph-common-10.2.7 again:

Try removing librados2 and issuing the command I gave previously. You
need to restrict the versions explicitly.
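
For reference, pinning the whole dependency chain in one transaction would look
something like this (the package list is inferred from the dependency output
quoted below; adjust as needed):

# yum install ceph-common-10.2.7-0.el7 librados2-10.2.7-0.el7 \
      librbd1-10.2.7-0.el7 libcephfs1-10.2.7-0.el7 \
      libradosstriper1-10.2.7-0.el7 librgw2-10.2.7-0.el7 \
      python-rados-10.2.7-0.el7 python-rbd-10.2.7-0.el7 python-cephfs-10.2.7-0.el7

Once installed, the yum-plugin-versionlock plugin can lock those versions so a
later 'yum update' does not pull in 10.2.8.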

Late here so this will be my last post today.

>
> Error: Package: 1:ceph-common-10.2.7-0.el7.x86_64 (Ceph)
>Requires: librados2 = 1:10.2.7-0.el7
>Removing: 1:librados2-10.2.7-0.el7.x86_64 (@Ceph)
>librados2 = 1:10.2.7-0.el7
>Updated By: 1:librados2-10.2.8-0.el7.x86_64 (Ceph)
>librados2 = 1:10.2.8-0.el7
>
> 
> From: Brad Hubbard 
> Sent: Wednesday, July 12, 2017 11:59
> To: Buyens Niels
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] installing specific version of ceph-common
>
> On Wed, Jul 12, 2017 at 6:19 PM, Buyens Niels  wrote:
>> Hello,
>>
>>
>> When trying to install a specific version of ceph-common when a newer
>> version has been released, the installation fails.
>>
>>
>> I have an environment running version 10.2.7 on CentOS 7. Recently, 10.2.8
>> has been released to the repos.
>>
>>
>> Trying to install version 10.2.7 will fail because it is installing 10.2.8
>> dependencies (even though it says it's going to install 10.2.7
>> dependencies):
>>
>>
>> # yum install ceph-common-10.2.7
>
> Try something like...
>
> # yum install ceph-common-10.2.7 librados2-10.2.7
>
> You may need to add more packages with that specific version depending on what
> other errors you get and also check for existing installed packages that may 
> get
> in the way and remove them as necessary.
>
>> Loaded plugins: fastestmirror
>> Loading mirror speeds from cached hostfile
>>  * base: mirror.unix-solutions.be
>>  * epel: epel.mirrors.ovh.net
>>  * extras: mirror.unix-solutions.be
>>  * updates: mirror.unix-solutions.be
>> Resolving Dependencies
>> --> Running transaction check
>> ---> Package ceph-common.x86_64 1:10.2.7-0.el7 will be installed
>> --> Processing Dependency: python-rados = 1:10.2.7-0.el7 for package:
>> 1:ceph-common-10.2.7-0.el7.x86_64
>> --> Processing Dependency: librbd1 = 1:10.2.7-0.el7 for package:
>> 1:ceph-common-10.2.7-0.el7.x86_64
>> --> Processing Dependency: python-rbd = 1:10.2.7-0.el7 for package:
>> 1:ceph-common-10.2.7-0.el7.x86_64
>> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
>> 1:ceph-common-10.2.7-0.el7.x86_64
>> --> Processing Dependency: python-cephfs = 1:10.2.7-0.el7 for package:
>> 1:ceph-common-10.2.7-0.el7.x86_64
>> --> Processing Dependency: libcephfs1 = 1:10.2.7-0.el7 for package:
>> 1:ceph-common-10.2.7-0.el7.x86_64
>> --> Processing Dependency: librbd.so.1()(64bit) for package:
>> 1:ceph-common-10.2.7-0.el7.x86_64
>> --> Processing Dependency: librados.so.2()(64bit) for package:
>> 1:ceph-common-10.2.7-0.el7.x86_64
>> --> Processing Dependency: libbabeltrace.so.1()(64bit) for package:
>> 1:ceph-common-10.2.7-0.el7.x86_64
>> --> Processing Dependency: libbabeltrace-ctf.so.1()(64bit) for package:
>> 1:ceph-common-10.2.7-0.el7.x86_64
>> --> Processing Dependency: libradosstriper.so.1()(64bit) for package:
>> 1:ceph-common-10.2.7-0.el7.x86_64
>> --> Processing Dependency: librgw.so.2()(64bit) for package:
>> 1:ceph-common-10.2.7-0.el7.x86_64
>> --> Running transaction check
>> ---> Package libbabeltrace.x86_64 0:1.2.4-3.el7 will be installed
>> ---> Package libcephfs1.x86_64 1:10.2.7-0.el7 will be installed
>> ---> Package librados2.x86_64 1:10.2.7-0.el7 will be installed
>> --> Processing Dependency: liblttng-ust.so.0()(64bit) for package:
>> 1:librados2-10.2.7-0.el7.x86_64
>> ---> Package libradosstriper1.x86_64 1:10.2.8-0.el7 will be installed
>> --> Processing Dependency: librados2 = 1:10.2.8-0.el7 for package:
>> 1:libradosstriper1-10.2.8-0.el7.x86_64
>> ---> Package librbd1.x86_64 1:10.2.7-0.el7 will be installed
>> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
>> 1:librbd1-10.2.7-0.el7.x86_64
>> ---> Package librgw2.x86_64 1:10.2.8-0.el7 will be installed
>> --> Processing Dependency: libfcgi.so.0()(64bit) for package:
>> 1:librgw2-10.2.8-0.el7.x86_64
>> ---> Package python-cephfs.x86_64 1:10.2.7-0.el7 will be installed
>> ---> Package python-rados.x86_64 1:10.2.7-0.el7 will be installed
>> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
>> 1:python-rados-10.2.7-0.el7.x86_64
>> ---> Package python-rbd.x86_64 1:10.2.7-0.el7 will be installed
>> --> Running transaction check
>> ---> Package fcgi.x86_64 0:2.4.0-25.el7 will be installed
>> ---> Package librados2.x86_64 1:10.2.7-0.el7 will be installed
>> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
>> 1:python-rados-10.2.7-0

Re: [ceph-users] Change the meta data pool of cephfs

2017-07-12 Thread Patrick Donnelly
On Tue, Jul 11, 2017 at 12:22 AM, Marc Roos  wrote:
>
>
> Is it possible to change the cephfs meta data pool. I would like to
> lower the pg's. And thought about just making a new pool, copying the
> pool and then renaming them. But I guess cephfs works with the pool id
> not? How can this be best done?

There is currently no way to change the metadata pool except through
manual recovery into a new pool:
http://docs.ceph.com/docs/master/cephfs/disaster-recovery/#using-an-alternate-metadata-pool-for-recovery

I would strongly recommend backups before trying such a procedure.

-- 
Patrick Donnelly
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Stealth Jewel release?

2017-07-12 Thread Patrick Donnelly
On Wed, Jul 12, 2017 at 11:31 AM, Dan van der Ster  wrote:
> On Wed, Jul 12, 2017 at 5:51 PM, Abhishek L
>  wrote:
>> On Wed, Jul 12, 2017 at 9:13 PM, Xiaoxi Chen  wrote:
>>> +However, it also introduced a regression that could cause MDS damage.
>>> +Therefore, we do *not* recommend that Jewel users upgrade to this version -
>>> +instead, we recommend upgrading directly to v10.2.9 in which the 
>>> regression is
>>> +fixed.
>>>
>>> It looks like this version is NOT production ready. Curious why we
>>> want a not-recommended version to be released?
>>
>> We found a regression in MDS right after packages were built, and the release
>> was about to be announced. This is why we didn't announce the release.
>> We're  currently running tests after the fix for MDS was merged.
>>
>> So when we do announce the release we'll announce 10.2.9 so that users
>> can upgrade from 10.2.7->10.2.9
>
> Suppose some users already upgraded their CephFS to 10.2.8 -- what is
> the immediate recommended course of action? Downgrade or wait for the
> 10.2.9 ?

I'm not aware of or see any changes that would make downgrading back
to 10.2.7 a problem but the safest thing to do would be to replace the
v10.2.8 ceph-mds binaries with the v10.2.7 binary. If that's not
practical, I would recommend a cluster-wide downgrade to 10.2.7.

-- 
Patrick Donnelly
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bucket policies in Luminous

2017-07-12 Thread Graham Allan
I'm not sure if it is the blank tenant - I should have thought to try this
before writing, but I added a new user which does have a tenancy, and I get
the same issue.


policy:
{
  "Version": "2012-10-17",
  "Statement": [
{
  "Effect": "Allow",
  "Principal": { "AWS": "arn:aws:iam::lemming:user/gta4"},
  "Action": "s3:*",
  "Resource": ["arn:aws:s3:::gta/*"]
}
  ]
}

Using principal
- { "AWS": "arn:aws:iam::lemming:user/gta4"} causes the same crash
- { "AWS": "arn:aws:iam::lemming:gta4"} causes principal discarded

Of course, that is the user I am trying to grant access to. Possibly the
problem is the blank tenant for the bucket owner?


Thanks,

Graham

On 07/12/2017 02:53 PM, Adam C. Emerson wrote:

Graham Allan Wrote:

I thought I'd try out the new bucket policy support in Luminous. My goal
was simply to permit access on a bucket to another user.

[snip]

Thanks for any ideas,


It's probably the 'blank' tenant. I'll make up a test case to exercise
this and come up with a patch for it. Sorry about the trouble.



--
Graham Allan
Minnesota Supercomputing Institute - g...@umn.edu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bucket policies in Luminous

2017-07-12 Thread Adam C. Emerson
Graham Allan Wrote:
> I thought I'd try out the new bucket policy support in Luminous. My goal
> was simply to permit access on a bucket to another user.
[snip]
> Thanks for any ideas,

It's probably the 'blank' tenant. I'll make up a test case to exercise
this and come up with a patch for it. Sorry about the trouble.

-- 
Senior Software Engineer   Red Hat Storage, Ann Arbor, MI, US
IRC: Aemerson@{RedHat, OFTC}
0x80F7544B90EDBFB9 E707 86BA 0C1B 62CC 152C  7C12 80F7 544B 90ED BFB9
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] autoconfigured haproxy service?

2017-07-12 Thread Chris Jones
Hi Sage,

The automated tool Cepheus https://github.com/cepheus-io/cepheus does this
with ceph-chef. It's based on json data for a given environment. It uses
Chef and Ansible. If someone wanted to break out the haproxy (ADC) portion
into a package, it has a good model for HAProxy they could look at. Cepheus was
originally created due to the need for our own software LB solution instead of
a hardware LB. It also supports keepalived and bird (BGP).
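
For a sense of what a generated config could look like, here is a minimal
haproxy sketch for two civetweb radosgw backends (addresses and ports are
made up):

frontend rgw_http
    bind *:80
    mode http
    default_backend rgw

backend rgw
    mode http
    balance roundrobin
    option httpchk GET /
    server rgw1 10.0.0.11:7480 check
    server rgw2 10.0.0.12:7480 check

An auto-configuration service would essentially just regenerate the server
lines (or drive them through the runtime API Sage mentions) as gateways
register with and disappear from the service map.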

Thanks

On Tue, Jul 11, 2017 at 11:03 AM, Sage Weil  wrote:

> Hi all,
>
> Luminous features a new 'service map' that lets rgw's (and rgw nfs
> gateways and iscsi gateways and rbd mirror daemons and ...) advertise
> themselves to the cluster along with some metadata (like the addresses
> they are binding to and the services the provide).
>
> It should be pretty straightforward to build a service that
> auto-configures haproxy based on this information so that you can deploy
> an rgw front-end that dynamically reconfigures itself when additional
> rgw's are deployed or removed.  haproxy has a facility to adjust its
> backend configuration at runtime[1].
>
> Anybody interested in tackling this?  Setting up the load balancer in
> front of rgw is one of the more annoying pieces of getting ceph up and
> running in production and until now has been mostly treated as out of
> scope.  It would be awesome if there was an autoconfigured service that
> did it out of the box (and had all the right haproxy options set).
>
> sage
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] updating the documentation

2017-07-12 Thread Sage Weil
On Wed, 12 Jul 2017, Patrick Donnelly wrote:
> On Wed, Jul 12, 2017 at 11:29 AM, Sage Weil  wrote:
> > In the meantime, we can also avoid making the problem worse by requiring
> > that all pull requests include any relevant documentation updates.  This
> > means (1) helping educate contributors that doc updates are needed, (2)
> > helping maintainers and reviewers remember that doc updates are part of
> > the merge criteria (it will likely take a bit of time before this is
> > second nature), and (3) generally inducing developers to become aware of
> > the documentation that exists so that they know what needs to be updated
> > when they make a change.
> 
> There was a joke to add a bot which automatically fails PRs for no
> documentation but I think there is a way to make that work in a
> reasonable way. Perhaps the bot could simply comment on all PRs
> touching src/ that documentation is required and where to look, and
> then fails a doc check. A developer must comment on the PR to say it
> passes documentation requirements before the bot changes the check to
> pass.
> 
> This addresses all three points in an automatic way.

This is a great idea.  Greg brought up the idea of a bot but we 
didn't think of a "docs ok"-type comment to make it happy.

Anybody interested in coding it up?

Piotr makes a good point about config_opts.h, although that problem is 
about to go away (or at least change) with John's config update:

https://github.com/ceph/ceph/pull/16211

(Config options will be documented in the code where the schema is 
defined, and docs.ceph.com .rst will eventually be auto-generated from 
that.)

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Bucket policies in Luminous

2017-07-12 Thread Graham Allan
I thought I'd try out the new bucket policy support in Luminous. My goal 
was simply to permit access on a bucket to another user.


I have 2 users, "gta" and "gta2", both of which are in the default ("") 
tenant. "gta" also owns the bucket named "gta". I want to grant access 
on this bucket to user gta2.


My policy file:

{
  "Version": "2012-10-17",
  "Statement": [
{
  "Effect": "Allow",
  "Principal": {"AWS": "arn:aws:iam:::gta2"},
  "Action": "s3:*",
  "Resource": ["arn:aws:s3:::gta/*"]
}
  ]
}

Uploads ok to ceph...
% s3cmd setpolicy s3policy s3://gta

and I can read this back with awscli:
% aws s3api get-bucket-policy --bucket=gta
{
"Policy": "{\n  \"Version\": \"2012-10-17\",\n  \"Statement\": [\n 
  {\n  \"Effect\": \"Allow\",\n  \"Principal\": {\"AWS\": 
\"arn:aws:iam:::gta2\"},\n  \"Action\": \"s3:*\",\n 
\"Resource\": [\"arn:aws:s3:::gta/*\"]\n}\n  ]\n}\n"

}

However gta2 still cannot access the bucket...
% ./s3cmd-fceph-gta2 ls s3://gta
ERROR: Access to bucket 'gta' was denied
ERROR: S3 error: 403 (AccessDenied)

I see that when uploading the policy file, radosgw logs this:
Supplied principal is discarded: arn:aws:iam:::gta2

Looking at the source for rgw_iam_policy.cc it looks like this is 
because I'm not prefixing the user name with "user/" as hinted on the 
doc page http://docs.ceph.com/docs/master/radosgw/bucketpolicy/. I can 
just set a wildcard, {"AWS": "*"}, which radosgw seems to accept without
discarding; however, user gta2 still has no access.


So tried setting the principal to {"AWS": "arn:aws:iam:::user/gta2"}

This just resulted in a crash dump from radosgw... in summary

rgw_iam_policy.cc: In function 'boost::optional 
rgw::IAM::parse_principal(CephContext*, rgw::IAM::TokenID, 
std::string&&)' thread 7fe8332a0700 time 2017-07-12 13:46:27.468022

rgw_iam_policy.cc: 716: FAILED assert(match.size() == 2)

Is this because I'm specifying the policy incorrectly... or does it not 
work if there's no tenant?


Thanks for any ideas,

Graham
--
Graham Allan
Minnesota Supercomputing Institute - g...@umn.edu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] updating the documentation

2017-07-12 Thread Patrick Donnelly
On Wed, Jul 12, 2017 at 11:29 AM, Sage Weil  wrote:
> In the meantime, we can also avoid making the problem worse by requiring
> that all pull requests include any relevant documentation updates.  This
> means (1) helping educate contributors that doc updates are needed, (2)
> helping maintainers and reviewers remember that doc updates are part of
> the merge criteria (it will likely take a bit of time before this is
> second nature), and (3) generally inducing developers to become aware of
> the documentation that exists so that they know what needs to be updated
> when they make a change.

There was a joke to add a bot which automatically fails PRs for no
documentation but I think there is a way to make that work in a
reasonable way. Perhaps the bot could simply comment on all PRs
touching src/ that documentation is required and where to look, and
then fails a doc check. A developer must comment on the PR to say it
passes documentation requirements before the bot changes the check to
pass.

This addresses all three points in an automatic way.
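
A rough sketch of what such a bot could do (Python; the repo, token handling
and the exact "docs ok" phrase are assumptions, not a decided design):

import os
import requests

GITHUB = "https://api.github.com"
REPO = "ceph/ceph"
HEADERS = {"Authorization": "token " + os.environ["GITHUB_TOKEN"]}

def touches_src(pr):
    # does the PR modify anything under src/?
    files = requests.get("%s/repos/%s/pulls/%d/files" % (GITHUB, REPO, pr),
                         headers=HEADERS).json()
    return any(f["filename"].startswith("src/") for f in files)

def docs_acked(pr):
    # has anyone commented the magic "docs ok" phrase?
    comments = requests.get("%s/repos/%s/issues/%d/comments" % (GITHUB, REPO, pr),
                            headers=HEADERS).json()
    return any("docs ok" in c["body"].lower() for c in comments)

def set_docs_status(sha, ok):
    # set a "docs" commit status on the PR head commit
    requests.post("%s/repos/%s/statuses/%s" % (GITHUB, REPO, sha),
                  headers=HEADERS,
                  json={"state": "success" if ok else "failure",
                        "context": "docs",
                        "description": "docs reviewed" if ok
                                       else "needs doc update or a 'docs ok' ack"})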

-- 
Patrick Donnelly
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Stealth Jewel release?

2017-07-12 Thread Dan van der Ster
On Wed, Jul 12, 2017 at 5:51 PM, Abhishek L
 wrote:
> On Wed, Jul 12, 2017 at 9:13 PM, Xiaoxi Chen  wrote:
>> +However, it also introduced a regression that could cause MDS damage.
>> +Therefore, we do *not* recommend that Jewel users upgrade to this version -
>> +instead, we recommend upgrading directly to v10.2.9 in which the regression 
>> is
>> +fixed.
>>
>> It looks like this version is NOT production ready. Curious why we
>> want a not-recommended version to be released?
>
> We found a regression in MDS right after packages were built, and the release
> was about to be announced. This is why we didn't announce the release.
> We're  currently running tests after the fix for MDS was merged.
>
> So when we do announce the release we'll announce 10.2.9 so that users
> can upgrade from 10.2.7->10.2.9

Suppose some users already upgraded their CephFS to 10.2.8 -- what is
the immediate recommended course of action? Downgrade or wait for the
10.2.9 ?

Cheers, Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] updating the documentation

2017-07-12 Thread Sage Weil
We have a fair-sized list of documentation items to update for the 
luminous release.  The other day when I starting looking through what is 
there now, though, I was also immediately struck by how out of date much 
of the content is.  In addition to addressing the immediate updates for 
luminous, I think we also need a systematic review of the current docs 
(including the information structure) and a coordinated effort to make 
updates and revisions.

First question is, of course: is anyone interested in helping 
coordinate this effort?

In the meantime, we can also avoid making the problem worse by requiring 
that all pull requests include any relevant documentation updates.  This 
means (1) helping educate contributors that doc updates are needed, (2) 
helping maintainers and reviewers remember that doc updates are part of 
the merge criteria (it will likely take a bit of time before this is 
second nature), and (3) generally inducing developers to become aware of 
the documentation that exists so that they know what needs to be updated 
when they make a change.

Comments or concerns?

Thanks!
sage

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Kernel mounted RBD's hanging

2017-07-12 Thread Nick Fisk


> -Original Message-
> From: Nick Fisk [mailto:n...@fisk.me.uk]
> Sent: 12 July 2017 13:47
> To: 'Ilya Dryomov' 
> Cc: 'Ceph Users' 
> Subject: RE: [ceph-users] Kernel mounted RBD's hanging
> 
> > -Original Message-
> > From: Nick Fisk [mailto:n...@fisk.me.uk]
> > Sent: 08 July 2017 21:50
> > To: 'Ilya Dryomov' 
> > Cc: 'Ceph Users' 
> > Subject: RE: [ceph-users] Kernel mounted RBD's hanging
> >
> > > -Original Message-
> > > From: Ilya Dryomov [mailto:idryo...@gmail.com]
> > > Sent: 07 July 2017 11:32
> > > To: Nick Fisk 
> > > Cc: Ceph Users 
> > > Subject: Re: [ceph-users] Kernel mounted RBD's hanging
> > >
> > > On Fri, Jul 7, 2017 at 12:10 PM, Nick Fisk  wrote:
> > > > Managed to catch another one, osd.75 again, not sure if that is an
> > > indication of anything or just a co-incidence. osd.75 is one of 8
> > > OSD's in a cache tier, so all IO will be funnelled through them.
> > > >
> > > >
> > > >
> > > > Also found this in the log of osd.75 at the same time, but the
> > > > client IP is not
> > > the same as the node which experienced the hang.
> > >
> > > Can you bump debug_ms and debug_osd to 30 on osd75?  I doubt it's an
> > > issue with that particular OSD, but if it goes down the same way
> > > again, I'd have something to look at.  Make sure logrotate is
> > > configured and working before doing that though... ;)
> > >
> > > Thanks,
> > >
> > > Ilya
> >
> > So, osd.75 was a coincidence, several other hangs have had outstanding
> > requests to other OSD's. I haven't been able to get the debug logs of
> > the OSD during a hang yet because of this. Although I think the crc problem 
> > may now be fixed, by upgrading all clients to 4.11.1+.
> >
> > Here is a series of osdc dumps every minute during one of the hangs
> > with a different target OSD. The osdc dumps on another node show IO
> > being processed normally whilst the other node hangs, so the cluster
> > is definitely handling IO fine whilst the other node hangs. And as I am 
> > using cache tiering with proxying, all IO will be going through
> just 8 OSD's. The host has 3 RBD's mounted and all 3 hang.
> >
> > Latest hang:
> > Sat  8 Jul 18:49:01 BST 2017
> > REQUESTS 4 homeless 0
> > 174662831   osd25   17.77737285 [25,74,14]/25   [25,74,14]/25   
> > rbd_data.15d8670238e1f29.000cf9f8   0x4000241
> > 0'0 set-alloc-hint,write
> > 174662863   osd25   17.7b91a345 [25,74,14]/25   [25,74,14]/25   
> > rbd_data.1555406238e1f29.0002571c   0x4000241
> > 0'0 set-alloc-hint,write
> > 174662887   osd25   17.6c2eaa93 [25,75,14]/25   [25,75,14]/25   
> > rbd_data.158f204238e1f29.0008   0x4000241
> > 0'0 set-alloc-hint,write
> > 174662925   osd25   17.32271445 [25,74,14]/25   [25,74,14]/25   
> > rbd_data.1555406238e1f29.0001   0x4000241
> > 0'0 set-alloc-hint,write
> > LINGER REQUESTS
> > 18446462598732840990osd74   17.145baa0f [74,72,14]/74   
> > [74,72,14]/74   rbd_header.158f204238e1f29  0x208   WC/0
> > 18446462598732840991osd74   17.7b4e2a06 [74,72,25]/74   
> > [74,72,25]/74   rbd_header.1555406238e1f29  0x209   WC/0
> > 18446462598732840992osd74   17.eea94d58 [74,73,25]/74   
> > [74,73,25]/74   rbd_header.15d8670238e1f29  0x208   WC/0
> > Sat  8 Jul 18:50:01 BST 2017
> > REQUESTS 5 homeless 0
> > 174662831   osd25   17.77737285 [25,74,14]/25   [25,74,14]/25   
> > rbd_data.15d8670238e1f29.000cf9f8   0x4000241
> > 0'0 set-alloc-hint,write
> > 174662863   osd25   17.7b91a345 [25,74,14]/25   [25,74,14]/25   
> > rbd_data.1555406238e1f29.0002571c   0x4000241
> > 0'0 set-alloc-hint,write
> > 174662887   osd25   17.6c2eaa93 [25,75,14]/25   [25,75,14]/25   
> > rbd_data.158f204238e1f29.0008   0x4000241
> > 0'0 set-alloc-hint,write
> > 174662925   osd25   17.32271445 [25,74,14]/25   [25,74,14]/25   
> > rbd_data.1555406238e1f29.0001   0x4000241
> > 0'0 set-alloc-hint,write
> > 174663129   osd25   17.32271445 [25,74,14]/25   [25,74,14]/25   
> > rbd_data.1555406238e1f29.0001   0x4000241
> > 0'0 set-alloc-hint,write
> > LINGER REQUESTS
> > 18446462598732840990osd74   17.145baa0f [74,72,14]/74   
> > [74,72,14]/74   rbd_header.158f204238e1f29  0x208   WC/0
> > 18446462598732840991osd74   17.7b4e2a06 [74,72,25]/74   
> > [74,72,25]/74   rbd_header.1555406238e1f29  0x209   WC/0
> > 18446462598732840992osd74   17.eea94d58 [74,73,25]/74   
> > [74,73,25]/74   rbd_header.15d8670238e1f29  0x208   WC/0
> > Sat  8 Jul 18:51:01 BST 2017
> > REQUESTS 5 homeless 0
> > 174662831   osd25   17.77737285 [25,74,14]/25   [25,74,14]/25   
> > rbd_data.15d8670238e1f29.000cf9f8   0x400024 

[ceph-users] libceph: auth method 'x' error -1

2017-07-12 Thread c . monty
Hi!

I have installed Ceph using ceph-deploy.
The Ceph Storage Cluster setup includes these nodes:
ld4257 Monitor0 + Admin
ld4258 Montor1
ld4259 Monitor2
ld4464 OSD0
ld4465 OSD1

Ceph Health status is OK.

However, I cannot mount Ceph FS.
When I enter this command on ld4257
mount -t ceph ldcephmon1,ldcephmon2,ldcephmon3:/ /mnt/cephfs/ -o
name=client.openattic,secret=[secretkey]
I get this error:
mount error 1 = Operation not permitted
In syslog I find these entries:
[ 3657.493337] libceph: client264233 fsid
5f6f168d-2ade-4d16-a7e6-3704f93ad94e
[ 3657.493542] libceph: auth method 'x' error -1

When I use another mount command on ld4257
mount.ceph ld4257,ld4258,ld4259:/cephfs /mnt/cephfs/ -o
name=client.openattic,secretfile=/etc/ceph/ceph.client.openattic.keyring
I get this error:
secret is not valid base64: Invalid argument.
adding ceph secret key to kernel failed: Invalid argument.
failed to parse ceph_options
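
I suspect the secretfile may need to contain only the base64 key rather than
the full keyring (and that name= wants the ID without the "client." prefix),
so I also plan to try something like:

ceph auth get-key client.openattic > /etc/ceph/openattic.secret
mount -t ceph ld4257,ld4258,ld4259:/ /mnt/cephfs -o name=openattic,secretfile=/etc/ceph/openattic.secret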

Question:
Is mount-option "secretfile" not supported anymore?
How can I fix the authentication error?

THX
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Multipath configuration for Ceph storage nodes

2017-07-12 Thread Benjeman Meekhof
We use a puppet module to deploy them.  We give it devices to
configure from hiera data specific to our different types of storage
nodes.  The module is a fork from
https://github.com/openstack/puppet-ceph.

Ultimately the module ends up running 'ceph-disk prepare [arguments]
/dev/mapper/mpathXX /dev/nvmeXX'   (data dev, journal dev).

thanks,
Ben



On Wed, Jul 12, 2017 at 12:13 PM,   wrote:
> Hi Ben,
>
> Thanks for this, much appreciated.
>
> Can I just check: Do you use ceph-deploy to create your OSDs? E.g.:
>
> ceph-deploy disk zap ceph-sn1.example.com:/dev/mapper/disk1
> ceph-deploy osd prepare ceph-sn1.example.com:/dev/mapper/disk1
>
> Best wishes,
> Bruno
>
>
> -Original Message-
> From: Benjeman Meekhof [mailto:bmeek...@umich.edu]
> Sent: 11 July 2017 18:46
> To: Canning, Bruno (STFC,RAL,SC)
> Cc: ceph-users
> Subject: Re: [ceph-users] Multipath configuration for Ceph storage nodes
>
> Hi Bruno,
>
> We have similar types of nodes and minimal configuration is required 
> (RHEL7-derived OS).  Install device-mapper-multipath or equivalent package, 
> configure /etc/multipath.conf and enable 'multipathd'.  If working correctly 
> the command 'multipath -ll' should output multipath devices and component 
> devices on all paths.
>
> For reference, our /etc/multipath.conf is just these few lines:
>
> defaults {
> user_friendly_names yes
> find_multipaths yes
> }
>
> thanks,
> Ben
>
> On Tue, Jul 11, 2017 at 10:48 AM,   wrote:
>> Hi All,
>>
>>
>>
>> I’d like to know if anyone has any experience of configuring multipath
>> on ceph storage nodes, please. I’d like to know how best to go about it.
>>
>>
>>
>> We have a number of Dell PowerEdge R630 servers, each of which are
>> fitted with two SAS 12G HBA cards and each of which have two
>> associated Dell MD1400 storage units connected to them via HD-Mini -
>> HD-Mini cables, see the attached graphic (ignore colours: two direct
>> connections from the server to each storage unit, two connections running 
>> between each storage unit).
>>
>>
>>
>> Best wishes,
>>
>> Bruno
>>
>>
>>
>>
>>
>> Bruno Canning
>>
>> LHC Data Store System Administrator
>>
>> Scientific Computing Department
>>
>> STFC Rutherford Appleton Laboratory
>>
>> Harwell Oxford
>>
>> Didcot
>>
>> OX11 0QX
>>
>> Tel. +44 ((0)1235) 446621
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Stealth Jewel release?

2017-07-12 Thread Xiaoxi Chen
Understood, thanks Abhishek.

 So 10.2.9 will not be another release cycle but just 10.2.8 plus the MDS
fix, and is expected to be out soon, right?


2017-07-12 23:51 GMT+08:00 Abhishek L :
> On Wed, Jul 12, 2017 at 9:13 PM, Xiaoxi Chen  wrote:
>> +However, it also introduced a regression that could cause MDS damage.
>> +Therefore, we do *not* recommend that Jewel users upgrade to this version -
>> +instead, we recommend upgrading directly to v10.2.9 in which the regression 
>> is
>> +fixed.
>>
>> It looks like this version is NOT production ready. Curious why we
>> want a not-recommended version to be released?
>
> We found a regression in MDS right after packages were built, and the release
> was about to be announced. This is why we didn't announce the release.
> We're  currently running tests after the fix for MDS was merged.
>
> So when we do announce the release we'll announce 10.2.9 so that users
> can upgrade from 10.2.7->10.2.9
>
> Best,
> Abhishek
>
>> 2017-07-12 22:44 GMT+08:00 David Turner :
>>> The lack of communication on this makes me tentative to upgrade to it.  Are
>>> the packages available to Ubuntu/Debian systems production ready and
>>> intended for upgrades?
>>>
>>> On Tue, Jul 11, 2017 at 8:33 PM Brad Hubbard  wrote:

 On Wed, Jul 12, 2017 at 12:58 AM, David Turner 
 wrote:
 > I haven't seen any release notes for 10.2.8 yet.  Is there a document
 > somewhere stating what's in the release?

 https://github.com/ceph/ceph/pull/16274 for now although it should
 make it into the master doc tree soon.

 >
 > On Mon, Jul 10, 2017 at 1:41 AM Henrik Korkuc  wrote:
 >>
 >> On 17-07-10 08:29, Christian Balzer wrote:
 >> > Hello,
 >> >
 >> > so this morning I was greeted with the availability of 10.2.8 for
 >> > both
 >> > Jessie and Stretch (much appreciated), but w/o any announcement here
 >> > or
 >> > updated release notes on the website, etc.
 >> >
 >> > Any reason other "Friday" (US time) for this?
 >> >
 >> > Christian
 >>
 >> My guess is that they didn't have time to announce it yet. Maybe pkgs
 >> were not ready yet on friday?
 >>
 >> ___
 >> ceph-users mailing list
 >> ceph-users@lists.ceph.com
 >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 >
 >
 > ___
 > ceph-users mailing list
 > ceph-users@lists.ceph.com
 > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 >



 --
 Cheers,
 Brad
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Multipath configuration for Ceph storage nodes

2017-07-12 Thread bruno.canning
Hi Ben,

Thanks for this, much appreciated.

Can I just check: Do you use ceph-deploy to create your OSDs? E.g.:

ceph-deploy disk zap ceph-sn1.example.com:/dev/mapper/disk1
ceph-deploy osd prepare ceph-sn1.example.com:/dev/mapper/disk1

Best wishes,
Bruno


-Original Message-
From: Benjeman Meekhof [mailto:bmeek...@umich.edu] 
Sent: 11 July 2017 18:46
To: Canning, Bruno (STFC,RAL,SC)
Cc: ceph-users
Subject: Re: [ceph-users] Multipath configuration for Ceph storage nodes

Hi Bruno,

We have similar types of nodes and minimal configuration is required 
(RHEL7-derived OS).  Install device-mapper-multipath or equivalent package, 
configure /etc/multipath.conf and enable 'multipathd'.  If working correctly 
the command 'multipath -ll' should output multipath devices and component 
devices on all paths.

For reference, our /etc/multipath.conf is just these few lines:

defaults {
user_friendly_names yes
find_multipaths yes
}

thanks,
Ben

On Tue, Jul 11, 2017 at 10:48 AM,   wrote:
> Hi All,
>
>
>
> I’d like to know if anyone has any experience of configuring multipath 
> on ceph storage nodes, please. I’d like to know how best to go about it.
>
>
>
> We have a number of Dell PowerEdge R630 servers, each of which are 
> fitted with two SAS 12G HBA cards and each of which have two 
> associated Dell MD1400 storage units connected to them via HD-Mini - 
> HD-Mini cables, see the attached graphic (ignore colours: two direct 
> connections from the server to each storage unit, two connections running 
> between each storage unit).
>
>
>
> Best wishes,
>
> Bruno
>
>
>
>
>
> Bruno Canning
>
> LHC Data Store System Administrator
>
> Scientific Computing Department
>
> STFC Rutherford Appleton Laboratory
>
> Harwell Oxford
>
> Didcot
>
> OX11 0QX
>
> Tel. +44 ((0)1235) 446621
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RGW/Civet: Reads too much data when client doesn't close the connection

2017-07-12 Thread Aaron Bassett
Yup, already working on fixing the client, but it seems like a potentially nasty
issue for RGW, as a malicious client could DOS an endpoint pretty easily this
way.

Aaron

> On Jul 12, 2017, at 11:48 AM, Jens Rosenboom  wrote:
>
> 2017-07-12 15:23 GMT+00:00 Aaron Bassett :
>> I have a situation where a client is GET'ing a large key (100GB) from 
>> RadosGW and just reading the first few bytes to determine if it's a gzip 
>> file or not, and then just moving on without closing the connection. RadosGW
>> then goes on to read the rest of the object out of the cluster,
>> while sending nothing to the client as it's no longer listening. When this 
>> client does this to many objects in quick succession, it essentially creates 
>> a DOS on my cluster as all my rgws are reading out of the cluster as fast as 
>> they can but not sending the data anywhere. This is on an up to date Jewel 
>> cluster, using civetweb for the web server.
>>
>> I just wanted to reach out and see if anyone else has seen this before I dig 
>> in more and try to find more details about where the problem may lay.
>
> I would say your client is broken, if it is only interested in a range
> of the object, it should include a corresponding range header with the
> GET request.
>
> Though I agree that the behaviour for closed connections could
> probably be improved, too. See http://tracker.ceph.com/issues/20166 for a
> similar issue, something like the opposite of your case.

CONFIDENTIALITY NOTICE
This e-mail message and any attachments are only for the use of the intended 
recipient and may contain information that is privileged, confidential or 
exempt from disclosure under applicable law. If you are not the intended 
recipient, any disclosure, distribution or other use of this e-mail message or 
attachments is prohibited. If you have received this e-mail message in error, 
please delete and notify the sender immediately. Thank you.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Stealth Jewel release?

2017-07-12 Thread Abhishek L
On Wed, Jul 12, 2017 at 9:13 PM, Xiaoxi Chen  wrote:
> +However, it also introduced a regression that could cause MDS damage.
> +Therefore, we do *not* recommend that Jewel users upgrade to this version -
> +instead, we recommend upgrading directly to v10.2.9 in which the regression 
> is
> +fixed.
>
> It looks like this version is NOT production ready. Curious why we
> want a not-recommended version to be released?

We found a regression in MDS right after packages were built, and the release
was about to be announced. This is why we didn't announce the release.
We're  currently running tests after the fix for MDS was merged.

So when we do announce the release we'll announce 10.2.9 so that users
can upgrade from 10.2.7->10.2.9

Best,
Abhishek

> 2017-07-12 22:44 GMT+08:00 David Turner :
>> The lack of communication on this makes me tentative to upgrade to it.  Are
>> the packages available to Ubuntu/Debian systems production ready and
>> intended for upgrades?
>>
>> On Tue, Jul 11, 2017 at 8:33 PM Brad Hubbard  wrote:
>>>
>>> On Wed, Jul 12, 2017 at 12:58 AM, David Turner 
>>> wrote:
>>> > I haven't seen any release notes for 10.2.8 yet.  Is there a document
>>> > somewhere stating what's in the release?
>>>
>>> https://github.com/ceph/ceph/pull/16274 for now although it should
>>> make it into the master doc tree soon.
>>>
>>> >
>>> > On Mon, Jul 10, 2017 at 1:41 AM Henrik Korkuc  wrote:
>>> >>
>>> >> On 17-07-10 08:29, Christian Balzer wrote:
>>> >> > Hello,
>>> >> >
>>> >> > so this morning I was greeted with the availability of 10.2.8 for
>>> >> > both
>>> >> > Jessie and Stretch (much appreciated), but w/o any announcement here
>>> >> > or
>>> >> > updated release notes on the website, etc.
>>> >> >
>>> >> > Any reason other "Friday" (US time) for this?
>>> >> >
>>> >> > Christian
>>> >>
>>> >> My guess is that they didn't have time to announce it yet. Maybe pkgs
>>> >> were not ready yet on friday?
>>> >>
>>> >> ___
>>> >> ceph-users mailing list
>>> >> ceph-users@lists.ceph.com
>>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >
>>> >
>>> > ___
>>> > ceph-users mailing list
>>> > ceph-users@lists.ceph.com
>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >
>>>
>>>
>>>
>>> --
>>> Cheers,
>>> Brad
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RGW/Civet: Reads too much data when client doesn't close the connection

2017-07-12 Thread Jens Rosenboom
2017-07-12 15:23 GMT+00:00 Aaron Bassett :
> I have a situation where a client is GET'ing a large key (100GB) from RadosGW 
> and just reading the first few bytes to determine if it's a gzip file or not, 
> and then just moving on without closing the connection. RadosGW then goes
> on to read the rest of the object out of the cluster, while sending nothing 
> to the client as it's no longer listening. When this client does this to many 
> objects in quick succession, it essentially creates a DOS on my cluster as 
> all my rgws are reading out of the cluster as fast as they can but not 
> sending the data anywhere. This is on an up to date Jewel cluster, using 
> civetweb for the web server.
>
> I just wanted to reach out and see if anyone else has seen this before I dig 
> in more and try to find more details about where the problem may lay.

I would say your client is broken: if it is only interested in a range
of the object, it should include a corresponding Range header with the
GET request.
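
For example (endpoint and key names made up), the client could fetch just the
gzip magic bytes with a ranged request:

curl -s -H "Range: bytes=0-1" http://rgw.example.com/mybucket/mykey | xxd
# gzip objects start with the two bytes 1f 8b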

Though I agree that the behaviour for closed connections could
probably be improved, too. See http://tracker.ceph.com/issues/20166 for a
similar issue, something like the opposite of your case.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Stealth Jewel release?

2017-07-12 Thread Xiaoxi Chen
+However, it also introduced a regression that could cause MDS damage.
+Therefore, we do *not* recommend that Jewel users upgrade to this version -
+instead, we recommend upgrading directly to v10.2.9 in which the regression is
+fixed.

It looks like this version is NOT production ready. Curious why we
want a not-recommended version  to be released?

2017-07-12 22:44 GMT+08:00 David Turner :
> The lack of communication on this makes me tentative to upgrade to it.  Are
> the packages available to Ubuntu/Debian systems production ready and
> intended for upgrades?
>
> On Tue, Jul 11, 2017 at 8:33 PM Brad Hubbard  wrote:
>>
>> On Wed, Jul 12, 2017 at 12:58 AM, David Turner 
>> wrote:
>> > I haven't seen any release notes for 10.2.8 yet.  Is there a document
>> > somewhere stating what's in the release?
>>
>> https://github.com/ceph/ceph/pull/16274 for now although it should
>> make it into the master doc tree soon.
>>
>> >
>> > On Mon, Jul 10, 2017 at 1:41 AM Henrik Korkuc  wrote:
>> >>
>> >> On 17-07-10 08:29, Christian Balzer wrote:
>> >> > Hello,
>> >> >
>> >> > so this morning I was greeted with the availability of 10.2.8 for
>> >> > both
>> >> > Jessie and Stretch (much appreciated), but w/o any announcement here
>> >> > or
>> >> > updated release notes on the website, etc.
>> >> >
>> >> > Any reason other "Friday" (US time) for this?
>> >> >
>> >> > Christian
>> >>
>> >> My guess is that they didn't have time to announce it yet. Maybe pkgs
>> >> were not ready yet on friday?
>> >>
>> >> ___
>> >> ceph-users mailing list
>> >> ceph-users@lists.ceph.com
>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>>
>>
>>
>> --
>> Cheers,
>> Brad
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RGW/Civet: Reads too much data when client doesn't close the connection

2017-07-12 Thread Aaron Bassett
I have a situation where a client is GET'ing a large key (100GB) from RadosGW 
and just reading the first few bytes to determine if it's a gzip file or not, 
and then just moving on without closing the connection. RadosGW then goes
on to read the rest of the object out of the cluster, while sending nothing to 
the client as it's no longer listening. When this client does this to many 
objects in quick succession, it essentially creates a DOS on my cluster as all 
my rgws are reading out of the cluster as fast as they can but not sending the 
data anywhere. This is on an up to date Jewel cluster, using civetweb for the 
web server.

I just wanted to reach out and see if anyone else has seen this before I dig in 
more and try to find more details about where the problem may lay.
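
For context, the client-side access pattern looks roughly like this (a boto3
sketch; endpoint, credentials, bucket and key are made up):

import boto3

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com",
                  aws_access_key_id="...", aws_secret_access_key="...")
obj = s3.get_object(Bucket="mybucket", Key="huge-object")
magic = obj["Body"].read(2)        # read only the first two bytes
is_gzip = (magic == b"\x1f\x8b")   # gzip magic number
# the StreamingBody is then abandoned without read()/close(), while RGW keeps
# reading the remaining ~100GB of the object out of the cluster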

Aaron

CONFIDENTIALITY NOTICE
This e-mail message and any attachments are only for the use of the intended 
recipient and may contain information that is privileged, confidential or 
exempt from disclosure under applicable law. If you are not the intended 
recipient, any disclosure, distribution or other use of this e-mail message or 
attachments is prohibited. If you have received this e-mail message in error, 
please delete and notify the sender immediately. Thank you.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Writing to EC Pool in degraded state?

2017-07-12 Thread David Turner
Make sure to test that stuff.  I've never had to modify the min_size on an
EC pool before.
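
A rough way to test it on a throwaway pool before touching production (the
profile name, pool name and PG count below are just placeholders):

ceph osd erasure-code-profile set testec k=5 m=2
ceph osd pool create ectest 64 64 erasure testec
ceph osd pool set ectest min_size 6
# stop the OSDs on one host (or set noout and power it off), then check that
# writes still complete:
rados -p ectest put probe-object /etc/hosts
rados -p ectest stat probe-object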

On Wed, Jul 12, 2017 at 11:12 AM Jake Grimmett 
wrote:

> Hi David,
>
> put that way, the docs make complete sense, thank you!
>
> i.e. to allow writing to a 5+2 EC cluster with one node down:
>
> default is:
> # ceph osd pool get ecpool min_size
> min_size: 7
>
> to tolerate one node failure, set:
> # ceph osd pool set ecpool min_size 6
> set pool 1 min_size to 6
>
> to tolerate two nodes failing, set:
> # ceph osd pool set ecpool min_size 5
> set pool 1 min_size to 5
>
> thanks again!
>
> Jake
>
> On 12/07/17 14:36, David Turner wrote:
> > As long as you have the 7 copies online if you're using 7+2 then you can
> > still work and read to the EC pool.  For EC pool, size is equivalently 9
> > and min_size is 7.
> >
> > I have a 3 node cluster with 2+1 and I can restart 1 node at a time with
> > host failure domain.
> >
> >
> > On Wed, Jul 12, 2017, 6:34 AM Jake Grimmett  > > wrote:
> >
> > Dear All,
> >
> > Quick question; is it possible to write to a degraded EC pool?
> >
> > i.e. is there an equivalent to this setting for a replicated pool..
> >
> > osd pool default size = 3
> > osd pool default min size = 2
> >
> > My reason for asking, is that it would be nice if we could build a EC
> > 7+2 cluster, and actively use the cluster while a node was off-line
> by
> > setting osd noout.
> >
> > BTW, Currently testing the Luminous RC, it's looking really nice!
> >
> > thanks,
> >
> > Jake
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com 
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Writing to EC Pool in degraded state?

2017-07-12 Thread Jake Grimmett
Hi David,

put that way, the docs make complete sense, thank you!

i.e. to allow writing to a 5+2 EC cluster with one node down:

default is:
# ceph osd pool get ecpool min_size
min_size: 7

to tolerate one node failure, set:
# ceph osd pool set ecpool min_size 6
set pool 1 min_size to 6

to tolerate two nodes failing, set:
# ceph osd pool set ecpool min_size 5
set pool 1 min_size to 5

thanks again!

Jake

On 12/07/17 14:36, David Turner wrote:
> As long as you have the 7 copies online if you're using 7+2 then you can
> still work and read to the EC pool.  For EC pool, size is equivalently 9
> and min_size is 7.
> 
> I have a 3 node cluster with 2+1 and I can restart 1 node at a time with
> host failure domain.
> 
> 
> On Wed, Jul 12, 2017, 6:34 AM Jake Grimmett  > wrote:
> 
> Dear All,
> 
> Quick question; is it possible to write to a degraded EC pool?
> 
> i.e. is there an equivalent to this setting for a replicated pool..
> 
> osd pool default size = 3
> osd pool default min size = 2
> 
> My reason for asking, is that it would be nice if we could build a EC
> 7+2 cluster, and actively use the cluster while a node was off-line by
> setting osd noout.
> 
> BTW, Currently testing the Luminous RC, it's looking really nice!
> 
> thanks,
> 
> Jake
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com 
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrating RGW from FastCGI to Civetweb

2017-07-12 Thread Roger Brown
SOLVED! S3-style subdomains work now!

In summary, to cut over from apache to civetweb without breaking other sites
on the same domain, here are the changes that worked for me:

/etc/ceph/ceph.conf:
# FASTCGI SETTINGS
#rgw socket path = ""
#rgw print continue = false
#rgw frontends = fastcgi socket_port=9000 socket_host=0.0.0.0
# CIVETWEB SETTINGS
rgw frontends = civetweb port=7480

httpd.conf/ssl.conf:
# FASTCGI SETTINGS
#ProxyPass / fcgi://localhost:9000/
# CIVETWEB SETTINGS
ProxyPass / http://localhost:7480/
ProxyPassReverse / http://localhost:7480/
ProxyPreserveHost On

Then restart ceph-radosgw and httpd.

If you run CentOS 7 like I do, you may have SELinux interference to deal
with. After the above, I would get this error accessing the gateway: "503
Service Unavailable" and in /var/log/messages a corresponding error,
"SELinux is preventing /usr/sbin/httpd from name_connect access on the
tcp_socket port 7480."

I temporarily fixed the SELinux error with: setenforce 0

I permanently fixed the SELinux error with: semanage port -a -t http_port_t
-p tcp 7480
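
A couple of sanity checks after making that change (assuming the
policycoreutils-python tools are installed):

# confirm 7480 is now listed under http_port_t
semanage port -l | grep http_port_t
# confirm civetweb itself resolves the bucket from the Host header, bypassing
# Apache entirely
curl -H 'Host: roger-public.s3.e-prepared.com' http://localhost:7480/index.html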

Thanks to Richard Hesketh for steering me in the right direction!


On Wed, Jul 12, 2017 at 3:52 AM Richard Hesketh <
richard.hesk...@rd.bbc.co.uk> wrote:

> Oh, correcting myself. When HTTP proxying Apache translates the host
> header to whatever was specified in the ProxyPass line, so your civetweb
> server is receiving requests with host headers for localhost! Presumably
> for fcgi protocol it works differently. Nonetheless ProxyPreserveHost
> should solve your problem.
>
> Rich
>
> On 12/07/17 10:40, Richard Hesketh wrote:
> > Best guess, apache is munging together everything it picks up using the
> aliases and translating the host to the ServerName before passing on the
> request. Try setting ProxyPreserveHost on as per
> https://httpd.apache.org/docs/2.4/mod/mod_proxy.html#proxypreservehost ?
> >
> > Rich
> >
> > On 11/07/17 21:47, Roger Brown wrote:
> >> Thank you Richard, that mostly worked for me.
> >>
> >> But I notice that when I switch it from FastCGI to Civitweb that the
> S3-style subdomains (e.g., bucket-name.domain-name.com <
> http://bucket-name.domain-name.com>) stops working and I haven't been
> able to figure out why on my own.
> >>
> >> - ceph.conf excerpt:
> >> [client.radosgw.gateway]
> >> host = nuc1
> >> keyring = /etc/ceph/ceph.client.radosgw.keyring
> >> log file = /var/log/ceph/client.radosgw.gateway.log
> >> rgw dns name = s3.e-prepared.com 
> >> # FASTCGI SETTINGS
> >> rgw socket path = ""
> >> rgw print continue = false
> >> rgw frontends = fastcgi socket_port=9000 socket_host=0.0.0.0
> >> # CIVETWEB SETTINGS
> >> #rgw frontends = civetweb port=7480
> >>
> >> - httpd.conf excerpt
> >> 
> >> ServerName s3.e-prepared.com 
> >> ServerAlias *.s3.e-prepared.com 
> >> ServerAlias s3.amazonaws.com 
> >> ServerAlias *.amazonaws.com 
> >> DocumentRoot /srv/www/html/e-prepared_com/s3
> >> ErrorLog /var/log/httpd/rgw_error.log
> >> CustomLog /var/log/httpd/rgw_access.log combined
> >> # LogLevel debug
> >> RewriteEngine On
> >> RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]
> >> SetEnv proxy-nokeepalive 1
> >> # FASTCGI SETTINGS
> >> ProxyPass / fcgi://localhost:9000/
> >> # CIVETWEB SETTINGS
> >> #ProxyPass / http://localhost:7480/
> >> #ProxyPassReverse / http://localhost:7480/
> >> 
> >>
> >> With the above FastCGI settings, S3-style subdomains work. Eg.
> >> [root@nuc1 ~]# curl http://roger-public.s3.e-prepared.com/index.html
> >> 
> >> 
> >>   
> >> Hello, World!
> >>   
> >> 
> >>
> >> But when I comment out the fastcgi settings, uncomment the civetweb
> settings, and restart ceph-radosgw and http (and disable selinux), I get
> output like this:
> >> [root@nuc1 ~]# curl http://roger-public.s3.e-prepared.com/index.html
> >> <?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchBucket</Code><BucketName>index.html</BucketName><RequestId>tx3-00596536b0-1465f8-default</RequestId><HostId>1465f8-default-default</HostId></Error>
> >>
> >> However I can still access the bucket the old-fashioned way (e.g.,
> domain-name.com/bucket-name ) even
> with Civetweb running:
> >> [root@nuc1 ~]# curl http://s3.e-prepared.com/roger-public/index.html
> >> 
> >> 
> >>   
> >> Hello, World!
> >>   
> >> 
> >>
> >> Thoughts, anyone?
> >>
> >> Roger
> >
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
> --
> ---
> Richard Hesketh
> Linux Systems Administrator, Research Platforms
> BBC Research & Development
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Stealth Jewel release?

2017-07-12 Thread David Turner
The lack of communication on this makes me tentative to upgrade to it.  Are
the packages available to Ubuntu/Debian systems production ready and
intended for upgrades?

On Tue, Jul 11, 2017 at 8:33 PM Brad Hubbard  wrote:

> On Wed, Jul 12, 2017 at 12:58 AM, David Turner 
> wrote:
> > I haven't seen any release notes for 10.2.8 yet.  Is there a document
> > somewhere stating what's in the release?
>
> https://github.com/ceph/ceph/pull/16274 for now although it should
> make it into the master doc tree soon.
>
> >
> > On Mon, Jul 10, 2017 at 1:41 AM Henrik Korkuc  wrote:
> >>
> >> On 17-07-10 08:29, Christian Balzer wrote:
> >> > Hello,
> >> >
> >> > so this morning I was greeted with the availability of 10.2.8 for both
> >> > Jessie and Stretch (much appreciated), but w/o any announcement here
> or
> >> > updated release notes on the website, etc.
> >> >
> >> > Any reason other "Friday" (US time) for this?
> >> >
> >> > Christian
> >>
> >> My guess is that they didn't have time to announce it yet. Maybe pkgs
> >> were not ready yet on friday?
> >>
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
>
> --
> Cheers,
> Brad
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph @ OpenStack Sydney Summit

2017-07-12 Thread Blair Bethwaite
Hi Trilliams,

Sounds good, I bet that would be popular among OpenStack newcomers.

I'm interested in seeing some content that looks forward at the next
couple of years of server and storage hardware roadmaps alongside new
and upcoming Ceph features (e.g. BlueStore, EC overwrite support) to
suggest best-practice architecture updates and discuss approaches to
migrating/adding things into existing clusters.

Cheers,

On 7 July 2017 at 22:15, T. Nichole Williams  wrote:
> I submitted a "ceph for absolute, complete beginners" presentation but idk if 
> it will be approved since I'm kind of a newcomer myself. I'd also like a Ceph 
> BoF.
>
> <3 Trilliams
>
> Sent from my iPhone
>
>> On Jul 6, 2017, at 10:50 PM, Blair Bethwaite  
>> wrote:
>>
>> Oops, this time plain text...
>>
>>> On 7 July 2017 at 13:47, Blair Bethwaite  wrote:
>>>
>>> Hi all,
>>>
>>> Are there any "official" plans to have Ceph events co-hosted with OpenStack 
>>> Summit Sydney, like in Boston?
>>>
>>> The call for presentations closes in a week. The Forum will be organised 
>>> throughout September and (I think) that is the most likely place to have 
>>> e.g. Ceph ops sessions like we have in the past. Some of my local 
>>> colleagues have also expressed interest in having a CephFS BoF.
>>>
>>> --
>>> Cheers,
>>> ~Blairo
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
~Blairo
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph @ OpenStack Sydney Summit

2017-07-12 Thread Blair Bethwaite
Hi Greg,

On 12 July 2017 at 03:48, Gregory Farnum  wrote:
> I poked at Patrick about this and it sounds like the venue is a little
> smaller than usual (and community planning is a little less
> planned-out for those ranges than usual) so things are still up in the
> air. :/

Yes, it is a smaller venue. I've posted on the OpenStack User
Committee ML asking about where these sorts of sessions might fit...

> In Boston we had an ops dinner meetup beyond the open-source days
> track, so you can probably count on that much if nothing more formal
> happens.

I would really like a proper ops session where we and other users (of
particular interest to me are others running large clusters with
non-homogeneous topologies) can discuss common issues and challenges,
which could hopefully surface some useful feature suggestions.

-- 
Cheers,
~Blairo
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Writing to EC Pool in degraded state?

2017-07-12 Thread David Turner
As long as you have the 7 copies online if you're using 7+2 then you can
still write to and read from the EC pool.  For an EC pool like that, size is
effectively 9 and min_size is 7.

I have a 3 node cluster with 2+1 and I can restart 1 node at a time with
host failure domain.
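
For reference, the values being discussed can be read straight off the pool and
its erasure-code profile; a quick sketch with placeholder pool/profile names:

ceph osd pool get ecpool size
ceph osd pool get ecpool min_size
ceph osd pool get ecpool erasure_code_profile
ceph osd erasure-code-profile get myprofile   # shows k and m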

On Wed, Jul 12, 2017, 6:34 AM Jake Grimmett  wrote:

> Dear All,
>
> Quick question; is it possible to write to a degraded EC pool?
>
> i.e. is there an equivalent to this setting for a replicated pool..
>
> osd pool default size = 3
> osd pool default min size = 2
>
> My reason for asking, is that it would be nice if we could build a EC
> 7+2 cluster, and actively use the cluster while a node was off-line by
> setting osd noout.
>
> BTW, Currently testing the Luminous RC, it's looking really nice!
>
> thanks,
>
> Jake
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Writing to EC Pool in degraded state?

2017-07-12 Thread Jake Grimmett
Dear All,

Quick question; is it possible to write to a degraded EC pool?

i.e. is there an equivalent to this setting for a replicated pool..

osd pool default size = 3
osd pool default min size = 2

My reason for asking is that it would be nice if we could build an EC
7+2 cluster, and actively use the cluster while a node was off-line by
setting osd noout.

BTW, Currently testing the Luminous RC, it's looking really nice!

thanks,

Jake


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] installing specific version of ceph-common

2017-07-12 Thread Buyens Niels


I tried installing librados2-10.2.7 separately first (which worked). Then,
when trying to install ceph-common-10.2.7 again, I got:

Error: Package: 1:ceph-common-10.2.7-0.el7.x86_64 (Ceph)
   Requires: librados2 = 1:10.2.7-0.el7
   Removing: 1:librados2-10.2.7-0.el7.x86_64 (@Ceph)
   librados2 = 1:10.2.7-0.el7
   Updated By: 1:librados2-10.2.8-0.el7.x86_64 (Ceph)
   librados2 = 1:10.2.8-0.el7


From: Brad Hubbard 
Sent: Wednesday, July 12, 2017 11:59
To: Buyens Niels
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] installing specific version of ceph-common

On Wed, Jul 12, 2017 at 6:19 PM, Buyens Niels  wrote:
> Hello,
>
>
> When trying to install a specific version of ceph-common when a newer
> version has been released, the installation fails.
>
>
> I have an environment running version 10.2.7 on CentOS 7. Recently, 10.2.8
> has been released to the repos.
>
>
> Trying to install version 10.2.7 will fail because it is installing 10.2.8
> dependencies (even though it says it's going to install 10.2.7
> dependencies):
>
>
> # yum install ceph-common-10.2.7

Try something like...

# yum install ceph-common-10.2.7 librados2-10.2.7

You may need to add more packages with that specific version depending what
other errors you get and also check for existing installed packages that may get
in the way and remove them as necessary.

> Loaded plugins: fastestmirror
> Loading mirror speeds from cached hostfile
>  * base: mirror.unix-solutions.be
>  * epel: epel.mirrors.ovh.net
>  * extras: mirror.unix-solutions.be
>  * updates: mirror.unix-solutions.be
> Resolving Dependencies
> --> Running transaction check
> ---> Package ceph-common.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: python-rados = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: librbd1 = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: python-rbd = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: python-cephfs = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: libcephfs1 = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: librbd.so.1()(64bit) for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: librados.so.2()(64bit) for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: libbabeltrace.so.1()(64bit) for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: libbabeltrace-ctf.so.1()(64bit) for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: libradosstriper.so.1()(64bit) for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: librgw.so.2()(64bit) for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Running transaction check
> ---> Package libbabeltrace.x86_64 0:1.2.4-3.el7 will be installed
> ---> Package libcephfs1.x86_64 1:10.2.7-0.el7 will be installed
> ---> Package librados2.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: liblttng-ust.so.0()(64bit) for package:
> 1:librados2-10.2.7-0.el7.x86_64
> ---> Package libradosstriper1.x86_64 1:10.2.8-0.el7 will be installed
> --> Processing Dependency: librados2 = 1:10.2.8-0.el7 for package:
> 1:libradosstriper1-10.2.8-0.el7.x86_64
> ---> Package librbd1.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:librbd1-10.2.7-0.el7.x86_64
> ---> Package librgw2.x86_64 1:10.2.8-0.el7 will be installed
> --> Processing Dependency: libfcgi.so.0()(64bit) for package:
> 1:librgw2-10.2.8-0.el7.x86_64
> ---> Package python-cephfs.x86_64 1:10.2.7-0.el7 will be installed
> ---> Package python-rados.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:python-rados-10.2.7-0.el7.x86_64
> ---> Package python-rbd.x86_64 1:10.2.7-0.el7 will be installed
> --> Running transaction check
> ---> Package fcgi.x86_64 0:2.4.0-25.el7 will be installed
> ---> Package librados2.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:python-rados-10.2.7-0.el7.x86_64
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:librbd1-10.2.7-0.el7.x86_64
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> ---> Package librados2.x86_64 1:10.2.8-0.el7 will be installed
> ---> Package librbd1.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:librbd1-10.2.7-0.el7.x86_64
> ---> Package lttng-ust.x86_64 0:2.4.1-4.el7 will be installed
> --> Processing Dependency: liburcu-cds.so.1()(64

Re: [ceph-users] installing specific version of ceph-common

2017-07-12 Thread Brad Hubbard


On Wed, Jul 12, 2017 at 6:19 PM, Buyens Niels  wrote:
> Hello,
>
>
> When trying to install a specific version of ceph-common when a newer
> version has been released, the installation fails.
>
>
> I have an environment running version 10.2.7 on CentOS 7. Recently, 10.2.8
> has been released to the repos.
>
>
> Trying to install version 10.2.7 will fail because it is installing 10.2.8
> dependencies (even though it says it's going to install 10.2.7
> dependencies):
>
>
> # yum install ceph-common-10.2.7

Try something like...

# yum install ceph-common-10.2.7 librados2-10.2.7

You may need to add more packages with that specific version depending what
other errors you get and also check for existing installed packages that may get
in the way and remove them as necessary.
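
If the resolver keeps dragging 10.2.8 back in, one way around it is to pin
every dependency to 10.2.7 in a single transaction. The list below is taken
from the dependency errors in this thread and may still need extending:

yum install ceph-common-10.2.7-0.el7 librados2-10.2.7-0.el7 librbd1-10.2.7-0.el7 \
    libradosstriper1-10.2.7-0.el7 librgw2-10.2.7-0.el7 libcephfs1-10.2.7-0.el7 \
    python-rados-10.2.7-0.el7 python-rbd-10.2.7-0.el7 python-cephfs-10.2.7-0.el7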

> Loaded plugins: fastestmirror
> Loading mirror speeds from cached hostfile
>  * base: mirror.unix-solutions.be
>  * epel: epel.mirrors.ovh.net
>  * extras: mirror.unix-solutions.be
>  * updates: mirror.unix-solutions.be
> Resolving Dependencies
> --> Running transaction check
> ---> Package ceph-common.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: python-rados = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: librbd1 = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: python-rbd = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: python-cephfs = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: libcephfs1 = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: librbd.so.1()(64bit) for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: librados.so.2()(64bit) for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: libbabeltrace.so.1()(64bit) for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: libbabeltrace-ctf.so.1()(64bit) for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: libradosstriper.so.1()(64bit) for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Processing Dependency: librgw.so.2()(64bit) for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> --> Running transaction check
> ---> Package libbabeltrace.x86_64 0:1.2.4-3.el7 will be installed
> ---> Package libcephfs1.x86_64 1:10.2.7-0.el7 will be installed
> ---> Package librados2.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: liblttng-ust.so.0()(64bit) for package:
> 1:librados2-10.2.7-0.el7.x86_64
> ---> Package libradosstriper1.x86_64 1:10.2.8-0.el7 will be installed
> --> Processing Dependency: librados2 = 1:10.2.8-0.el7 for package:
> 1:libradosstriper1-10.2.8-0.el7.x86_64
> ---> Package librbd1.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:librbd1-10.2.7-0.el7.x86_64
> ---> Package librgw2.x86_64 1:10.2.8-0.el7 will be installed
> --> Processing Dependency: libfcgi.so.0()(64bit) for package:
> 1:librgw2-10.2.8-0.el7.x86_64
> ---> Package python-cephfs.x86_64 1:10.2.7-0.el7 will be installed
> ---> Package python-rados.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:python-rados-10.2.7-0.el7.x86_64
> ---> Package python-rbd.x86_64 1:10.2.7-0.el7 will be installed
> --> Running transaction check
> ---> Package fcgi.x86_64 0:2.4.0-25.el7 will be installed
> ---> Package librados2.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:python-rados-10.2.7-0.el7.x86_64
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:librbd1-10.2.7-0.el7.x86_64
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:ceph-common-10.2.7-0.el7.x86_64
> ---> Package librados2.x86_64 1:10.2.8-0.el7 will be installed
> ---> Package librbd1.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:librbd1-10.2.7-0.el7.x86_64
> ---> Package lttng-ust.x86_64 0:2.4.1-4.el7 will be installed
> --> Processing Dependency: liburcu-cds.so.1()(64bit) for package:
> lttng-ust-2.4.1-4.el7.x86_64
> --> Processing Dependency: liburcu-bp.so.1()(64bit) for package:
> lttng-ust-2.4.1-4.el7.x86_64
> ---> Package python-rados.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:python-rados-10.2.7-0.el7.x86_64
> --> Running transaction check
> ---> Package librados2.x86_64 1:10.2.7-0.el7 will be installed
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:python-rados-10.2.7-0.el7.x86_64
> --> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package:
> 1:librbd1-10.2.7-0.el7.x86_64
> --> Processing Depend

Re: [ceph-users] Degraded objects while OSD is being added/filled

2017-07-12 Thread Richard Hesketh
On 11/07/17 20:05, Eino Tuominen wrote:
> Hi Richard,
> 
> Thanks for the explanation, that makes perfect sense. I've missed the 
> difference between ceph osd reweight and ceph osd crush reweight. I have to 
> study that better.
> 
> Is there a way to get ceph to prioritise fixing degraded objects over fixing 
> misplaced ones?

The difference between out/reweight and crush weight caught me out recently as
well; I was seeing a lot more data movement than I expected during the process
of replacing old disks until someone explained the difference to me.
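
For anyone else caught out by it, a minimal illustration of the two commands
(the OSD id and weights are made up):

# temporary 0.0-1.0 override; it gets reset when the OSD is marked out and
# back in again
ceph osd reweight 12 0.8
# permanent CRUSH weight, conventionally the disk's size in TiB
ceph osd crush reweight osd.12 1.819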

As I understand it, ceph already prioritises fixing things which are degraded 
over things which are misplaced - the problem is that this is granular at the 
level of PGs, then objects. It will try and choose to recover PGs with degraded 
objects before PGs which only have misplaced objects, and I think that within a 
specific PG recovery it will try to fix degraded objects before misplaced 
objects. However it will still finish recovering the whole PG, misplaced 
objects and all - it's got no mechanism to put a recovery on hold once it's 
fixed the degraded objects to free capacity for recovering a different PG.

The default recovery limits are really conservative, though; you can probably
increase the rate quite a lot by raising osd_max_backfills above its default of
1, either on a running config with injectargs or persistently in the [osd]
section of your ceph.conf.
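
For example (the value 3 is just an illustration; watch client latency while
raising it):

# at runtime, across all OSDs:
ceph tell osd.* injectargs '--osd_max_backfills 3'

# persistently, in ceph.conf:
[osd]
osd_max_backfills = 3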

Rich



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrating RGW from FastCGI to Civetweb

2017-07-12 Thread Richard Hesketh
Oh, correcting myself. When HTTP proxying, Apache translates the host header to
whatever was specified in the ProxyPass line, so your civetweb server is
receiving requests with host headers for localhost! Presumably it works
differently for the fcgi protocol. Nonetheless, ProxyPreserveHost should solve
your problem.

Rich

On 12/07/17 10:40, Richard Hesketh wrote:
> Best guess, apache is munging together everything it picks up using the 
> aliases and translating the host to the ServerName before passing on the 
> request. Try setting ProxyPreserveHost on as per 
> https://httpd.apache.org/docs/2.4/mod/mod_proxy.html#proxypreservehost ?
> 
> Rich
> 
> On 11/07/17 21:47, Roger Brown wrote:
>> Thank you Richard, that mostly worked for me. 
>>
>> But I notice that when I switch it from FastCGI to Civitweb that the 
>> S3-style subdomains (e.g., bucket-name.domain-name.com 
>> ) stops working and I haven't been able 
>> to figure out why on my own.
>>
>> - ceph.conf excerpt:
>> [client.radosgw.gateway]
>> host = nuc1
>> keyring = /etc/ceph/ceph.client.radosgw.keyring
>> log file = /var/log/ceph/client.radosgw.gateway.log
>> rgw dns name = s3.e-prepared.com 
>> # FASTCGI SETTINGS
>> rgw socket path = ""
>> rgw print continue = false
>> rgw frontends = fastcgi socket_port=9000 socket_host=0.0.0.0
>> # CIVETWEB SETTINGS
>> #rgw frontends = civetweb port=7480
>>
>> - httpd.conf excerpt
>> 
>> ServerName s3.e-prepared.com 
>> ServerAlias *.s3.e-prepared.com 
>> ServerAlias s3.amazonaws.com 
>> ServerAlias *.amazonaws.com 
>> DocumentRoot /srv/www/html/e-prepared_com/s3
>> ErrorLog /var/log/httpd/rgw_error.log
>> CustomLog /var/log/httpd/rgw_access.log combined
>> # LogLevel debug
>> RewriteEngine On
>> RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]
>> SetEnv proxy-nokeepalive 1
>> # FASTCGI SETTINGS
>> ProxyPass / fcgi://localhost:9000/
>> # CIVETWEB SETTINGS
>> #ProxyPass / http://localhost:7480/
>> #ProxyPassReverse / http://localhost:7480/
>> 
>>
>> With the above FastCGI settings, S3-style subdomains work. Eg.
>> [root@nuc1 ~]# curl http://roger-public.s3.e-prepared.com/index.html
>> 
>> 
>>   
>> Hello, World!
>>   
>> 
>>
>> But when I comment out the fastcgi settings, uncomment the civetweb 
>> settings, and restart ceph-radosgw and http (and disable selinux), I get 
>> output like this:
>> [root@nuc1 ~]# curl http://roger-public.s3.e-prepared.com/index.html
>> <?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchBucket</Code><BucketName>index.html</BucketName><RequestId>tx3-00596536b0-1465f8-default</RequestId><HostId>1465f8-default-default</HostId></Error>
>>
>> However I can still access the bucket the old-fashioned way (e.g., 
>> domain-name.com/bucket-name ) even with 
>> Civetweb running:
>> [root@nuc1 ~]# curl http://s3.e-prepared.com/roger-public/index.html 
>> 
>> 
>>   
>> Hello, World!
>>   
>> 
>>
>> Thoughts, anyone?
>>
>> Roger
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


-- 
---
Richard Hesketh
Linux Systems Administrator, Research Platforms
BBC Research & Development



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrating RGW from FastCGI to Civetweb

2017-07-12 Thread Richard Hesketh
Best guess, apache is munging together everything it picks up using the aliases 
and translating the host to the ServerName before passing on the request. Try 
setting ProxyPreserveHost on as per 
https://httpd.apache.org/docs/2.4/mod/mod_proxy.html#proxypreservehost ?

Rich

On 11/07/17 21:47, Roger Brown wrote:
> Thank you Richard, that mostly worked for me. 
> 
> But I notice that when I switch it from FastCGI to Civitweb that the S3-style 
> subdomains (e.g., bucket-name.domain-name.com 
> ) stops working and I haven't been able 
> to figure out why on my own.
> 
> - ceph.conf excerpt:
> [client.radosgw.gateway]
> host = nuc1
> keyring = /etc/ceph/ceph.client.radosgw.keyring
> log file = /var/log/ceph/client.radosgw.gateway.log
> rgw dns name = s3.e-prepared.com 
> # FASTCGI SETTINGS
> rgw socket path = ""
> rgw print continue = false
> rgw frontends = fastcgi socket_port=9000 socket_host=0.0.0.0
> # CIVETWEB SETTINGS
> #rgw frontends = civetweb port=7480
> 
> - httpd.conf excerpt
> 
> ServerName s3.e-prepared.com 
> ServerAlias *.s3.e-prepared.com 
> ServerAlias s3.amazonaws.com 
> ServerAlias *.amazonaws.com 
> DocumentRoot /srv/www/html/e-prepared_com/s3
> ErrorLog /var/log/httpd/rgw_error.log
> CustomLog /var/log/httpd/rgw_access.log combined
> # LogLevel debug
> RewriteEngine On
> RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]
> SetEnv proxy-nokeepalive 1
> # FASTCGI SETTINGS
> ProxyPass / fcgi://localhost:9000/
> # CIVETWEB SETTINGS
> #ProxyPass / http://localhost:7480/
> #ProxyPassReverse / http://localhost:7480/
> 
> 
> With the above FastCGI settings, S3-style subdomains work. Eg.
> [root@nuc1 ~]# curl http://roger-public.s3.e-prepared.com/index.html
> 
> 
>   
> Hello, World!
>   
> 
> 
> But when I comment out the fastcgi settings, uncomment the civetweb settings, 
> and restart ceph-radosgw and http (and disable selinux), I get output like 
> this:
> [root@nuc1 ~]# curl http://roger-public.s3.e-prepared.com/index.html
> <?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchBucket</Code><BucketName>index.html</BucketName><RequestId>tx3-00596536b0-1465f8-default</RequestId><HostId>1465f8-default-default</HostId></Error>
> 
> However I can still access the bucket the old-fashioned way (e.g., 
> domain-name.com/bucket-name ) even with 
> Civetweb running:
> [root@nuc1 ~]# curl http://s3.e-prepared.com/roger-public/index.html 
> 
> 
>   
> Hello, World!
>   
> 
> 
> Thoughts, anyone?
> 
> Roger



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] installing specific version of ceph-common

2017-07-12 Thread Buyens Niels
Hello,


When trying to install a specific version of ceph-common when a newer version 
has been released, the installation fails.


I have an environment running version 10.2.7 on CentOS 7. Recently, 10.2.8 has 
been released to the repos.


Trying to install version 10.2.7 will fail because it is installing 10.2.8 
dependencies (even though it says it's going to install 10.2.7 dependencies):

# yum install ceph-common-10.2.7
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.unix-solutions.be
 * epel: epel.mirrors.ovh.net
 * extras: mirror.unix-solutions.be
 * updates: mirror.unix-solutions.be
Resolving Dependencies
--> Running transaction check
---> Package ceph-common.x86_64 1:10.2.7-0.el7 will be installed
--> Processing Dependency: python-rados = 1:10.2.7-0.el7 for package: 
1:ceph-common-10.2.7-0.el7.x86_64
--> Processing Dependency: librbd1 = 1:10.2.7-0.el7 for package: 
1:ceph-common-10.2.7-0.el7.x86_64
--> Processing Dependency: python-rbd = 1:10.2.7-0.el7 for package: 
1:ceph-common-10.2.7-0.el7.x86_64
--> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package: 
1:ceph-common-10.2.7-0.el7.x86_64
--> Processing Dependency: python-cephfs = 1:10.2.7-0.el7 for package: 
1:ceph-common-10.2.7-0.el7.x86_64
--> Processing Dependency: libcephfs1 = 1:10.2.7-0.el7 for package: 
1:ceph-common-10.2.7-0.el7.x86_64
--> Processing Dependency: librbd.so.1()(64bit) for package: 
1:ceph-common-10.2.7-0.el7.x86_64
--> Processing Dependency: librados.so.2()(64bit) for package: 
1:ceph-common-10.2.7-0.el7.x86_64
--> Processing Dependency: libbabeltrace.so.1()(64bit) for package: 
1:ceph-common-10.2.7-0.el7.x86_64
--> Processing Dependency: libbabeltrace-ctf.so.1()(64bit) for package: 
1:ceph-common-10.2.7-0.el7.x86_64
--> Processing Dependency: libradosstriper.so.1()(64bit) for package: 
1:ceph-common-10.2.7-0.el7.x86_64
--> Processing Dependency: librgw.so.2()(64bit) for package: 
1:ceph-common-10.2.7-0.el7.x86_64
--> Running transaction check
---> Package libbabeltrace.x86_64 0:1.2.4-3.el7 will be installed
---> Package libcephfs1.x86_64 1:10.2.7-0.el7 will be installed
---> Package librados2.x86_64 1:10.2.7-0.el7 will be installed
--> Processing Dependency: liblttng-ust.so.0()(64bit) for package: 
1:librados2-10.2.7-0.el7.x86_64
---> Package libradosstriper1.x86_64 1:10.2.8-0.el7 will be installed
--> Processing Dependency: librados2 = 1:10.2.8-0.el7 for package: 
1:libradosstriper1-10.2.8-0.el7.x86_64
---> Package librbd1.x86_64 1:10.2.7-0.el7 will be installed
--> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package: 
1:librbd1-10.2.7-0.el7.x86_64
---> Package librgw2.x86_64 1:10.2.8-0.el7 will be installed
--> Processing Dependency: libfcgi.so.0()(64bit) for package: 
1:librgw2-10.2.8-0.el7.x86_64
---> Package python-cephfs.x86_64 1:10.2.7-0.el7 will be installed
---> Package python-rados.x86_64 1:10.2.7-0.el7 will be installed
--> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package: 
1:python-rados-10.2.7-0.el7.x86_64
---> Package python-rbd.x86_64 1:10.2.7-0.el7 will be installed
--> Running transaction check
---> Package fcgi.x86_64 0:2.4.0-25.el7 will be installed
---> Package librados2.x86_64 1:10.2.7-0.el7 will be installed
--> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package: 
1:python-rados-10.2.7-0.el7.x86_64
--> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package: 
1:librbd1-10.2.7-0.el7.x86_64
--> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package: 
1:ceph-common-10.2.7-0.el7.x86_64
---> Package librados2.x86_64 1:10.2.8-0.el7 will be installed
---> Package librbd1.x86_64 1:10.2.7-0.el7 will be installed
--> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package: 
1:librbd1-10.2.7-0.el7.x86_64
---> Package lttng-ust.x86_64 0:2.4.1-4.el7 will be installed
--> Processing Dependency: liburcu-cds.so.1()(64bit) for package: 
lttng-ust-2.4.1-4.el7.x86_64
--> Processing Dependency: liburcu-bp.so.1()(64bit) for package: 
lttng-ust-2.4.1-4.el7.x86_64
---> Package python-rados.x86_64 1:10.2.7-0.el7 will be installed
--> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package: 
1:python-rados-10.2.7-0.el7.x86_64
--> Running transaction check
---> Package librados2.x86_64 1:10.2.7-0.el7 will be installed
--> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package: 
1:python-rados-10.2.7-0.el7.x86_64
--> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package: 
1:librbd1-10.2.7-0.el7.x86_64
--> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package: 
1:ceph-common-10.2.7-0.el7.x86_64
---> Package librbd1.x86_64 1:10.2.7-0.el7 will be installed
--> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package: 
1:librbd1-10.2.7-0.el7.x86_64
---> Package python-rados.x86_64 1:10.2.7-0.el7 will be installed
--> Processing Dependency: librados2 = 1:10.2.7-0.el7 for package: 
1:python-rados-10.2.7-0.el7.x86_64
---> Package userspace-rcu.x86_64 0:0.7.16-1.el7 will be installed
--> Finished D

Re: [ceph-users] PG Stuck EC Pool

2017-07-12 Thread Ashley Merrick
Is this planned to be merged into Luminous at some point?

,Ashley

From: Gregory Farnum [mailto:gfar...@redhat.com]
Sent: Tuesday, 6 June 2017 2:24 AM
To: Ashley Merrick ; ceph-us...@ceph.com
Cc: David Zafman 
Subject: Re: [ceph-users] PG Stuck EC Pool

It looks to me like this is related to http://tracker.ceph.com/issues/18162.

You might see if they came up with good resolution steps, and it looks like 
David is working on it in master but hasn't finished it yet.
-Greg

On Sat, Jun 3, 2017 at 2:47 AM Ashley Merrick 
mailto:ash...@amerrick.co.uk>> wrote:
Attaching the log with logging at level 20.

After repeated attempts (removing nobackfill each time) I have got it down to:


recovery 31892/272325586 objects degraded (0.012%)
recovery 2/272325586 objects misplaced (0.000%)

However, any further attempt after removing nobackfill just causes an instant
crash on 83 & 84. At this point I feel there is some corruption on the
remaining 11 OSDs of the PG, though the errors aren't directly saying that;
the crash always ends with:

-1 *** Caught signal (Aborted) ** in thread 7f716e862700 
thread_name:tp_osd_recov

,Ashley

From: ceph-users 
[mailto:ceph-users-boun...@lists.ceph.com]
 On Behalf Of Ashley Merrick
Sent: 03 June 2017 17:14
To: ceph-us...@ceph.com
Subject: Re: [ceph-users] PG Stuck EC Pool


I have now done some further testing and am seeing these errors on 84 / 83, the
OSDs that crash while backfilling to 10,11:

   -60> 2017-06-03 10:08:56.651768 7f6f76714700  1 -- 
172.16.3.14:6823/2694 <== osd.3 
172.16.2.101:0/25361 10  osd_ping(ping e71688 
stamp 2017-06-03 10:08:56.652035) v2  47+0+0 (1097709006 0 0) 
0x5569ea88d400 con 0x5569e900e300
   -59> 2017-06-03 10:08:56.651804 7f6f76714700  1 -- 
172.16.3.14:6823/2694 --> 
172.16.2.101:0/25361 -- osd_ping(ping_reply e71688 
stamp 2017-06-03 10:08:56.652035) v2 -- ?+0 0x5569e985fc00 con 0x5569e900e300
-6> 2017-06-03 10:08:56.937156 7f6f5ee4d700  1 -- 
172.16.3.14:6822/2694 <== osd.53 
172.16.3.7:6816/15230 13  
MOSDECSubOpReadReply(6.14s3 71688 ECSubReadReply(tid=83, attrs_read=0)) v1  
148+0+0 (2355392791 0 0) 0x5569e8b22080 con 0x5569e9538f00
-5> 2017-06-03 10:08:56.937193 7f6f5ee4d700  5 -- op tracker -- seq: 2409, 
time: 2017-06-03 10:08:56.937193, event: queued_for_pg, op: 
MOSDECSubOpReadReply(6.14s3 71688 ECSubReadReply(tid=83, attrs_read=0))
-4> 2017-06-03 10:08:56.937241 7f6f8ef8a700  5 -- op tracker -- seq: 2409, 
time: 2017-06-03 10:08:56.937240, event: reached_pg, op: 
MOSDECSubOpReadReply(6.14s3 71688 ECSubReadReply(tid=83, attrs_read=0))
-3> 2017-06-03 10:08:56.937266 7f6f8ef8a700  0 osd.83 pg_epoch: 71688 
pg[6.14s3( v 71685'35512 (68694'30812,71685'35512] local-les=71688 n=15928 
ec=31534 les/c/f 71688/69510/67943 71687/71687/71687) 
[11,10,2147483647,83,22,26,69,72,53,59,8,4,46]/[2147483647,2147483647,2147483647,83,22,26,69,72,53,59,8,4,46]
 r=3 lpr=71687 pi=47065-71686/711 rops=1 bft=10(1),11(0) crt=71629'35509 mlcod 
0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE] 
failed_push 6:28170432:::rbd_data.e3d8852ae8944a.00047d28:head from 
shard 53(8), reps on  unfound? 0
-2> 2017-06-03 10:08:56.937346 7f6f8ef8a700  5 -- op tracker -- seq: 2409, 
time: 2017-06-03 10:08:56.937345, event: done, op: MOSDECSubOpReadReply(6.14s3 
71688 ECSubReadReply(tid=83, attrs_read=0))
-1> 2017-06-03 10:08:56.937351 7f6f89f80700 -1 osd.83 pg_epoch: 71688 
pg[6.14s3( v 71685'35512 (68694'30812,71685'35512] local-les=71688 n=15928 
ec=31534 les/c/f 71688/69510/67943 71687/71687/71687) 
[11,10,2147483647,83,22,26,69,72,53,59,8,4,46]/[2147483647,2147483647,2147483647,83,22,26,69,72,53,59,8,4,46]
 r=3 lpr=71687 pi=47065-71686/711 bft=10(1),11(0) crt=71629'35509 mlcod 0'0 
active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE] 
recover_replicas: object added to missing set for backfill, but is not in 
recovering, error!
   -42> 2017-06-03 10:08:56.968433 7f6f5f04f700  1 -- 
172.16.2.114:6822/2694 <== client.22857445 
172.16.2.212:0/2238053329 56  
osd_op(client.22857445.1:759236283 2.e732321d 
rbd_data.61b4c6238e1f29.0001ea27 [set-alloc-hint object_size 4194304 
write_size 4194304,write 126976~45056] snapc 0=[] ondisk+write e71688) v4  
217+0+45056 (2626314663 0 3883338397) 0x5569ea886b00 con 0x5569ea99c880

From: Ashley Merrick
Sent: 03 June 2017 14:27
To: 'ceph-us...@ceph.com' 
mailto:ceph-us...@ceph.com>>
Subject: RE: PG