[ceph-users] Re: Erasure Profile Pool caps at pg_num 1024

2020-02-16 Thread Eugen Block

Hi,

with EC you need to take into account the total number of chunks (k+m)
for the PG calculation. Given your 16384 target, dividing that number by
12 (if I'm reading your crush rule correctly) results in roughly 1365,
which would be rounded to either 1024 or 2048 PGs for that pool.
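
As a rough sketch of that arithmetic on the command line (nothing
Ceph-specific, just the integer division; 7+5 comes from your profile):

# echo $(( 16384 / (7 + 5) ))
1365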


By the way, I wouldn't recommend using all 12 hosts for your EC
profile, since that would mean a degraded cluster as soon as one host
fails. You should consider keeping one or two spare nodes so the PGs
can be recovered when a node fails.
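
For example (profile name and the exact k/m split are only illustrative,
pick values that match your durability needs), a profile that leaves two
of the 12 hosts as spares could be created like this:

# ceph osd erasure-code-profile set ec-hdd-k7m3 k=7 m=3 \
      crush-device-class=hdd crush-failure-domain=host

With k+m=10 and failure domain "host", the chunks of a failed host can
then be rebuilt on one of the two remaining hosts.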


Regards,
Eugen



Quoting Gunnar Bandelow:


Hello Everyone,

I've run into problems with placement groups.

We have a 12-host Ceph cluster with 408 OSDs (HDD and SSD).

If I create a replicated pool with a large pg_num (16384), there are no
problems and everything works.


If I do this with an erasure-coded pool, I get a warning which is fixable
by raising mon_max_pg_per_osd, and afterwards I get a HEALTH_WARN in the
ceph status due to too few PGs.


Checking the pg_num after pool creation, it is capped at 1024.

I'm stuck at this point. Maybe I did something fundamentally wrong?

To illustrate my steps I tried to summarize everything in a small example:

# ceph -v
ceph version 14.2.7 (fb8e34a687d76cd3bd45c2a0fb445432ab69b4ff)  
nautilus (stable)


# ceph osd erasure-code-profile get myerasurehdd
crush-device-class=hdd
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
k=7
m=5
plugin=jerasure
technique=reed_sol_van
w=8

# ceph osd crush rule dump sas_rule
{
    "rule_id": 0,
    "rule_name": "sas_rule",
    "ruleset": 0,
    "type": 3,
    "min_size": 1,
    "max_size": 12,
    "steps": [
    {
    "op": "take",
    "item": -2,
    "item_name": "default~hdd"
    },
    {
    "op": "chooseleaf_firstn",
    "num": 0,
    "type": "rack"
    },
    {
    "op": "emit"
    }
    ]
}


# ceph osd pool create sas-pool 16384 16384 erasure myerasurehdd sas_rule
Error ERANGE:  pg_num 16384 size 12 would mean 196704 total pgs,  
which exceeds max 102000 (mon_max_pg_per_osd 250 * num_in_osds 408)


# ceph tell mon.\* injectargs '--mon-max-pg-per-osd=500'
mon.ceph-fs01: injectargs:mon_max_pg_per_osd = '500' (not observed,  
change may require restart)
mon.ceph-fs05: injectargs:mon_max_pg_per_osd = '500' (not observed,  
change may require restart)
mon.ceph-fs09: injectargs:mon_max_pg_per_osd = '500' (not observed,  
change may require restart)
ceph-fs01:/opt/ceph-setup# ceph osd pool create sas-pool 16384 16384  
erasure myerasurehdd sas_rule

pool 'sas-pool' created

# ceph -s

cluster:
    id: b9471b57-95a2-4e58-8f69-b5e6048bea7c
    health: HEALTH_WARN
    Reduced data availability: 1024 pgs incomplete
    too few PGs per OSD (7 < min 30)


# ceph osd pool get sas-pool pg_num
pg_num: 1024

Best regards,

Gunnar





[ceph-users] Re: Bucket rename with

2020-02-16 Thread Janne Johansson
On Fri, Feb 14, 2020 at 23:02, EDH - Manuel Rios <mrios...@easydatahost.com> wrote:

> Honestly, not having a function to rename a bucket in radosgw-admin is
> like not having a function to copy or move; it is something basic. Without
> it, the workaround is to create a new bucket and move all the objects, with
> the resulting loss of time and compute cost, in addition to the interruption.
> I'm sure I'm not the only RGW administrator who needs to rename buckets,
> because by default the system lets users create names with CAPITAL letters,
> for example, and that is not compliant.
>

I wonder whether the fact that bucket names might also become hostnames
weighed into the reluctance to implement renames. But I have also wanted
renames, in the reverse situation where I copied contents to another bucket
(placed somewhere better) and wanted to "copy OLD to NEW, rename OLD to
OLD2, rename NEW to OLD, slowly delete OLD2" to have the shortest possible
downtime while still doing a full object copy.
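
For the copy step itself something like the AWS CLI pointed at the RGW
endpoint does the job; the endpoint and bucket names below are only
placeholders:

# aws --endpoint-url https://rgw.example.com s3 sync s3://OLD s3://NEW

It still walks every object and issues one copy request per object, which
is exactly the time and compute overhead a real rename would avoid.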

-- 
May the most significant bit of your life be positive.


[ceph-users] Erasure Profile Pool caps at pg_num 1024

2020-02-16 Thread Gunnar Bandelow

Hello Everyone,

I've run into problems with placement groups.

We have a 12-host Ceph cluster with 408 OSDs (HDD and SSD).

If I create a replicated pool with a large pg_num (16384), there are no
problems and everything works.


If I do this with an erasure-coded pool, I get a warning which is fixable
by raising mon_max_pg_per_osd, and afterwards I get a HEALTH_WARN in the
ceph status due to too few PGs.


Checking the pg_num after pool creation, it is capped at 1024.

I'm stuck at this point. Maybe I did something fundamentally wrong?

To illustrate my steps I tried to summarize everything in a small example:

# ceph -v
ceph version 14.2.7 (fb8e34a687d76cd3bd45c2a0fb445432ab69b4ff) nautilus 
(stable)


# ceph osd erasure-code-profile get myerasurehdd
crush-device-class=hdd
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
k=7
m=5
plugin=jerasure
technique=reed_sol_van
w=8
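
For reference, a profile with these values would typically have been
created with something along these lines (command shown only for
illustration; the plugin and technique above are the jerasure defaults):

# ceph osd erasure-code-profile set myerasurehdd k=7 m=5 \
      crush-device-class=hdd crush-failure-domain=host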

# ceph osd crush rule dump sas_rule
{
    "rule_id": 0,
    "rule_name": "sas_rule",
    "ruleset": 0,
    "type": 3,
    "min_size": 1,
    "max_size": 12,
    "steps": [
    {
    "op": "take",
    "item": -2,
    "item_name": "default~hdd"
    },
    {
    "op": "chooseleaf_firstn",
    "num": 0,
    "type": "rack"
    },
    {
    "op": "emit"
    }
    ]
}


# ceph osd pool create sas-pool 16384 16384 erasure myerasurehdd sas_rule
Error ERANGE:  pg_num 16384 size 12 would mean 196704 total pgs, which 
exceeds max 102000 (mon_max_pg_per_osd 250 * num_in_osds 408)
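
The numbers in that error are easy to reproduce by hand (the small gap to
196704 presumably comes from the PGs of pools that already exist):

# echo $(( 16384 * 12 ))
196608
# echo $(( 250 * 408 ))
102000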


# ceph tell mon.\* injectargs '--mon-max-pg-per-osd=500'
mon.ceph-fs01: injectargs:mon_max_pg_per_osd = '500' (not observed, 
change may require restart)
mon.ceph-fs05: injectargs:mon_max_pg_per_osd = '500' (not observed, 
change may require restart)
mon.ceph-fs09: injectargs:mon_max_pg_per_osd = '500' (not observed, 
change may require restart)
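
Since injectargs reports the change as "not observed", the same limit can
also be set persistently through the config database on Nautilus, for
example:

# ceph config set mon mon_max_pg_per_osd 500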
ceph-fs01:/opt/ceph-setup# ceph osd pool create sas-pool 16384 16384 
erasure myerasurehdd sas_rule

pool 'sas-pool' created

# ceph -s

cluster:
    id: b9471b57-95a2-4e58-8f69-b5e6048bea7c
    health: HEALTH_WARN
    Reduced data availability: 1024 pgs incomplete
    too few PGs per OSD (7 < min 30)


# ceph osd pool get sas-pool pg_num
pg_num: 1024

Best regards,

Gunnar





[ceph-users] Re: Extended security attributes on cephfs (nautilus) not working with kernel 5.3

2020-02-16 Thread Stolte, Felix
Hi Ilya,

thank you very much for the information. I will switch to 5.4.

Regards 
Felix
 

On Feb 14, 2020 at 16:29, Ilya Dryomov wrote:

On Fri, Feb 14, 2020 at 12:20 PM Stolte, Felix wrote:
>
> Hi guys,
>
> I am exporting cephfs with samba using vfs_acl_xattr, which stores
> NTFS ACLs in the security extended attributes. This works fine using a
> cephfs kernel mount with kernel version 4.15.
>
> Using kernel 5.3 I cannot access the security.ntacl attributes anymore.
> Attributes in the user or ceph namespace are still working.
>
> Can someone enlighten me on this issue (or even better tell me how to
> fix it)?

Hi Felix,

I think this issue was introduced by [1] in 5.3 and fixed by [2] in
5.4.   Looks like Jeff didn't realise he was fixing a regression and
didn't flag the fix for backporting.

There is nothing we can do about 5.3 now, so upgrading to 5.4 or 5.5
is probably the only option.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ac6713ccb5a6d13b59a2e3fda4fb049a2c4e0af2
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b8fe918b090442447d821b32b7dd6e17d5b5dfc1
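
A quick way to check the behaviour on a given kernel is to read one
attribute from each namespace on the cephfs kernel mount (paths below are
placeholders; vfs_acl_xattr stores its blob in security.NTACL, which only
exists on files Samba has already written ACLs for):

# getfattr -n security.NTACL /mnt/cephfs/share/somefile
# setfattr -n user.test -v 1 /mnt/cephfs/share/somefile
# getfattr -n user.test /mnt/cephfs/share/somefile

Per the report above, the security call fails on 5.3 while the
user-namespace calls keep working; on 4.15 or 5.4+ all of them should
succeed.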

Thanks,

Ilya




[ceph-users] Re: Ceph and Windows - experiences or suggestions

2020-02-16 Thread Stuart Longland
On 13/2/20 6:33 pm, Lars Täuber wrote:
> I got the task to connect a Windows client to our existing ceph cluster.
> I'm looking for experiences or suggestions from the community.
> There came two possibilities to my mind:
> 1. iSCSI Target on RBD exported to Windows
> 2. NFS-Ganesha on CephFS exported to Windows

Virtualise the Windows Client under KVM (which natively supports Ceph)
and attach the RBD target as a virtual disk.
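
A minimal sketch of that setup (pool, image name and sizes are just
placeholders, and the Windows guest needs the virtio drivers installed):

# qemu-img create -f raw rbd:rbd/win-disk 200G
# qemu-system-x86_64 -m 8192 \
      -drive format=raw,file=rbd:rbd/win-disk:id=admin,if=virtio \
      [remaining VM options]

With libvirt the same disk becomes a <disk type='network'> element, but
the rbd: drive syntax above is the quickest way to test.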
-- 
Stuart Longland (aka Redhatter, VK4MSL)

I haven't lost my mind...
  ...it's backed up on a tape somewhere.


[ceph-users] Re: slow using ISCSI - Help-me

2020-02-16 Thread Gesiel Galvão Bernardes
On Fri, Feb 14, 2020 at 13:25, Mike Christie wrote:

> On 02/13/2020 08:52 PM, Gesiel Galvão Bernardes wrote:
> > Hi
> >
> > On Sun, Feb 9, 2020 at 18:27, Mike Christie wrote:
> >
> > On 02/08/2020 11:34 PM, Gesiel Galvão Bernardes wrote:
> > > Hi,
> > >
> > > On Thu, Feb 6, 2020 at 18:56, Mike Christie wrote:
> > >
> > > On 02/05/2020 07:03 AM, Gesiel Galvão Bernardes wrote:
> > > > On Sun, Feb 2, 2020 at 00:37, Gesiel Galvão Bernardes wrote:
> > > > Hi,
> > > >
> > > > Only now was it possible to continue this. Below is the
> > > > information required. Thanks advan
> > >
> > >
> > > Hey, sorry for the late reply. I just got back from PTO.
> > >
> > > >
> > > > esxcli storage nmp device list -d naa.6001405ba48e0b99e4c418ca13506c8e
> > > > naa.6001405ba48e0b99e4c418ca13506c8e
> > > >    Device Display Name: LIO-ORG iSCSI Disk (naa.6001405ba48e0b99e4c418ca13506c8e)
> > > >    Storage Array Type: VMW_SATP_ALUA
> > > >    Storage Array Type Device Config: {implicit_support=on; explicit_support=off; explicit_allow=on; alua_followover=on; action_OnRetryErrors=on; {TPG_id=1,TPG_state=ANO}}
> > > >    Path Selection Policy: VMW_PSP_MRU
> > > >    Path Selection Policy Device Config: Current Path=vmhba68:C0:T0:L0
> > > >    Path Selection Policy Device Custom Config:
> > > >    Working Paths: vmhba68:C0:T0:L0
> > > >    Is USB: false
> > >
> > > 
> > >
> > > > Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x2 0x4 0xa. Act:FAILOVER
> > >
> > >
> > > Are you sure you are using tcmu-runner 1.4? Is that the actual daemon
> > > version running? Did you by any chance install the 1.4 rpm, but you/it
> > > did not restart the daemon? The error code above is returned in 1.3 and
> > > earlier.
> > >
> > > You are probably hitting a combo of 2 issues.
> > >
> > > We had only listed ESX 6.5 in the docs you probably saw, and in 6.7 the
> > > value of action_OnRetryErrors defaulted to on instead of off. You should
> > > set this back to off.
> > >
> > > You should also upgrade to the current version of tcmu-runner 1.5.x. It
> > > should fix the issue you are hitting, so non-IO commands like inquiry,
> > > RTPG, etc. are executed while failing over/back, so you would not hit the
> > > problem where path initialization and path testing IO is failed, causing
> > > the path to be marked as failed.
> > >
> > >
> > > I updated tcmu-runner to 1.5.2 and changed action_OnRetryErrors to
> > > off, but the problem continues 😭
> > >
> > > Attached is vmkernel.log.
> > >
> >
> >
> > When you stopped the iscsi gw at around 2020-02-09T01:51:25.820Z, how
> > many paths did your device have? Did:
> >
> > esxcli storage nmp path list -d your_device
> >
> > report only one path? Did
> >
> > esxcli iscsi session connection list
> >
> > show an iSCSI connection to each gw?
> >
> > Hmmm, I believe the problem may be here. I verified that I was listing
> > only one GW for each path. So I ran a "rescan HBA" in VMware on both
> > ESX hosts; now one of them lists the 3 gateways (I added one more), but
> > the other ESX host with the same configuration continues to list only
> > one gateway. See the different outputs:
> >
> >  [root@tcnvh7:~] esxcli iscsi session connection list
> > vmhba68,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,00023d01,0
> >Adapter: vmhba68
> >Target: iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
> >ISID: 00023d01
> >CID: 0
> >DataDigest: NONE
> >HeaderDigest: NONE
> >IFMarker: false
> >IFMarkerInterval: 0
> >MaxRecvDataSegmentLength: 131072
> >MaxTransmitDataSegmentLength: 262144
> >OFMarker: false
> >OFMarkerInterval: 0
> >ConnectionAddress: 192.168.201.1
> >RemoteAddress: 192.168.201.1
> >LocalAddress: 192.168.201.107
> >SessionCreateTime: 01/19/20 00:11:25
> >ConnectionCreateTime: 01/19/20 00:11:25
> >ConnectionStartTime: 02/13/20 23:03:10
> >State: