[ceph-users] Ceph & iSCSI

2024-02-26 Thread Michael Worsham
I was reading on the Ceph site that iSCSI is no longer under active development 
since November 2022. Why is that?

https://docs.ceph.com/en/latest/rbd/iscsi-overview/

-- Michael



[ceph-users] ceph-iscsi on RL9

2023-12-23 Thread duluxoz

Hi All,

Just successfully(?) completed a "live" update of the first node of a 
Ceph Quincy cluster from RL8 to RL9. Everything "seems" to be working - 
EXCEPT the iSCSI Gateway on that box.


During the update the ceph-iscsi package was removed (i.e. 
`ceph-iscsi-3.6-2.g97f5b02.el8.noarch.rpm` - the latest package available 
from the Ceph repos), so, obviously, I reinstalled the package.


However, `dnf` is throwing errors (unsurprisingly, as that package is an 
el8 package and this box is now running el9): the package requires 
Python 3.6, while el9 ships Python 3.9.


So my questions are: Can I simply "downgrade" Python to 3.6? Is there an 
el9-compatible version of `ceph-iscsi` somewhere? And/or is there some 
process I need to follow to get the iSCSI Gateway back up and running?


Some further info: the next step in my 
"happy-happy-fun-time-holiday-ICT-maintenance" was to convert the current 
Ceph cluster to `cephadm` and to go from Ceph Quincy to Ceph Reef - is 
that my ultimate upgrade path to get the iSCSI gateway back?


BTW, the Ceph cluster is used *only* to provide iSCSI LUNs to an oVirt 
(KVM) cluster front-end. Because it is the holidays I can take the entire 
network down (i.e. shut down all the VMs) to facilitate this update 
process, which also means that I can use some other (non-iSCSI) way to 
connect the Ceph SAN cluster to the oVirt VM-hosting cluster. If *that* is 
the solution (i.e. no iSCSI), does anyone have experience running oVirt 
off of Ceph in a non-iSCSI way - and could you be so kind as to provide 
some pointers/documentation/help?


And before anyone says it, let me: "I broke it, now I own it" :-)

Thanks in advance, and everyone have a Merry Christmas, Heavenly 
Hanukkah, Quality Kwanzaa, Really-good (upcoming) Ramadan, and/or a 
Happy Holidays.


Cheers

Dulux-Oz


[ceph-users] Ceph iSCSI GW is too slow when compared with Raw RBD performance

2023-06-22 Thread Work Ceph
Hello guys,

We have a Ceph cluster that runs just fine with Ceph Octopus; we use RBD
for some workloads, RadosGW (via S3) for others, and iSCSI for some Windows
clients.

We started noticing some unexpected performance issues with iSCSI. I mean, 
an SSD pool is reaching only about 100 MB/s of write speed for an image 
that can reach 600 MB/s or more when mounted and consumed directly via RBD.

Is that performance degradation expected? We would expect some degradation, 
but not this much.
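
For an apples-to-apples comparison it can help to run the same benchmark 
once against a kernel-mapped RBD and once against the iSCSI-attached block 
device from a Linux test initiator; a rough sketch only (the pool/image 
names and the /dev/sdX device are placeholders, and both runs are 
destructive, so use a scratch image):

```
# direct krbd path: map the scratch image and benchmark it
rbd map ssd-pool/bench-image
fio --name=krbd-write --filename=/dev/rbd0 --ioengine=libaio --direct=1 \
    --rw=write --bs=4M --iodepth=16 --runtime=60 --time_based --group_reporting

# iSCSI path: log in to the gateway from the same initiator, then benchmark
# the resulting device (assumed here to show up as /dev/sdb)
fio --name=iscsi-write --filename=/dev/sdb --ioengine=libaio --direct=1 \
    --rw=write --bs=4M --iodepth=16 --runtime=60 --time_based --group_reporting
```

If the krbd number is close to the 600 MB/s seen with native RBD, the gap 
is in the gateway/tcmu path rather than in the pool itself.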

Also, we have a question regarding the use of Intel Turbo Boost. Should we 
disable it? Is it possible that the root cause of the slowness in the iSCSI 
GW is the Intel Turbo Boost feature, which reduces the clock of some cores?

Any feedback is much appreciated.


[ceph-users] Ceph iSCSI GW not working with VMware VMFS and Windows Clustered Storage Volumes (CSV)

2023-06-19 Thread Work Ceph
Hello guys,

We have a Ceph cluster that runs just fine with Ceph Octopus; we use RBD
for some workloads, RadosGW (via S3) for others, and iSCSI for some Windows
clients.

Recently, we had the need to add some VMware clusters as clients of the 
iSCSI GW, and also Windows systems using Clustered Storage Volumes (CSV), 
and we are facing a weird situation. In Windows, for instance, the iSCSI 
block can be mounted, formatted and consumed by all nodes, but when we add 
it to a CSV it fails with a generic exception. The same happens in VMware: 
when we try to use it with VMFS, it fails.

We have not been able to find the root cause of these errors. However, they 
seem to be linked to multiple nodes consuming the same block device through 
shared file systems. Have you seen this before?

Are we missing some basic configuration in the iSCSI GW?


[ceph-users] Ceph iscsi gateway semi deprecation warning?

2023-05-26 Thread Mark Kirkwood
I am looking at using an iSCSI gateway in front of a Ceph setup. However, 
the warning in the docs is concerning:


The iSCSI gateway is in maintenance as of November 2022. This means that 
it is no longer in active development and will not be updated to add new 
features.


Does this mean I should be wary of using it, or is it simply that it does 
everything it needs to and no further development is needed?


regards

Mark


[ceph-users] ceph-iscsi-cli: cannot remove duplicated gateways.

2023-02-16 Thread luckydog xf
Hi, please see the output below.
ceph-iscsi-gw-1.ipa.pthl.hklocalhost.localdomain is the one that got messed 
up with a wrong hostname; I want to delete it.

/iscsi-target...-igw/gateways> ls
o- gateways ................................................... [Up: 2/3, Portals: 3]
  o- ceph-iscsi-gw-1.ipa.pthl.hk ............................... [172.16.202.251 (UP)]
  o- ceph-iscsi-gw-1.ipa.pthl.hklocalhost.localdomain . [172.16.202.251 (UNAUTHORIZED)]
  o- ceph-iscsi-gw-2.ipa.pthl.hk ............................... [172.16.202.252 (UP)]

/iscsi-target...-igw/gateways> delete gateway_name=ceph-iscsi-gw-1.ipa.pthl.hklocalhost.localdomain confirm=true
Deleting gateway, ceph-iscsi-gw-1.ipa.pthl.hklocalhost.localdomain
Could not contact ceph-iscsi-gw-1.ipa.pthl.hklocalhost.localdomain. If
the gateway is permanently down. Use confirm=true to force removal.
WARNING: Forcing removal of a gateway that can still be reached by an
initiator may result in data corruption.
/iscsi-target...-igw/gateways>
/iscsi-target...-igw/gateways> delete gateway_name=ceph-iscsi-gw-1.ipa.pthl.hklocalhost.localdomain confirm=true
Deleting gateway, ceph-iscsi-gw-1.ipa.pthl.hklocalhost.localdomain
Failed : Unhandled exception: list.remove(x): x not in list

However, ceph-iscsi-gw-1.ipa.pthl.hklocalhost.localdomain is still there.
Version info: ceph-iscsi-3.5-1.el8cp.noarch on RHEL 8.4.

/iscsi-target...-igw/gateways> ls
o- gateways ................................................... [Up: 2/3, Portals: 3]
  o- ceph-iscsi-gw-1.ipa.pthl.hk ............................... [172.16.202.251 (UP)]
  o- ceph-iscsi-gw-1.ipa.pthl.hklocalhost.localdomain ...... [172.16.202.251 (UNKNOWN)]
  o- ceph-iscsi-gw-2.ipa.pthl.hk ............................... [172.16.202.252 (UP)]
/iscsi-target...-igw/gateways> delete ceph-iscsi-gw-1.ipa.pthl.hklocalhost.localdomain confirm=true
Deleting gateway, ceph-iscsi-gw-1.ipa.pthl.hklocalhost.localdomain
Failed : Unhandled exception: list.remove(x): x not in list

However, ceph-iscsi-gw-1.ipa.pthl.hklocalhost.localdomain is still there.


Please help, thanks.
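
In case it helps whoever picks this up: gwcli/rbd-target-api keep their 
state in a JSON object in RADOS (by default `gateway.conf` in the `rbd` 
pool), so when the CLI refuses to drop a stale entry, one last-resort 
option is to inspect - and very carefully hand-edit - that object. This is 
only a rough sketch of the idea, not a supported procedure: take a backup 
first and stop rbd-target-api/rbd-target-gw on all gateways while doing it.

```
# dump the configuration object and keep a backup copy
rados -p rbd get gateway.conf /tmp/gateway.conf
cp /tmp/gateway.conf /tmp/gateway.conf.bak

# pretty-print the JSON; the stale host should appear under "gateways"
# (and possibly under each target's portal list)
python3 -m json.tool /tmp/gateway.conf | less

# after removing the stale entries in an editor, push the object back
# and restart the gateway services on every node
rados -p rbd put gateway.conf /tmp/gateway.conf
```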


[ceph-users] Ceph iSCSI & oVirt

2022-09-21 Thread duluxoz

Hi Everybody (Hi Dr. Nick),

I'm attacking this issue from both ends (i.e. from the Ceph end and from 
the oVirt end - I've posted questions on both mailing lists to ensure we 
capture the required knowledge-bearer(s)).


We've got a Ceph cluster set up with three iSCSI gateways configured, and 
we want to use this cluster as the back-end storage for an oVirt cluster. 
*Somewhere* in the oVirt documentation I read that when using oVirt with 
multiple iSCSI paths (in my case, multiple Ceph iSCSI gateways) we need to 
set up DM Multipath.
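
For reference, on the initiator side this is usually just a multipath 
`device` stanza for the LIO-ORG target that the Ceph gateways present, 
roughly along these lines (recalled from the Ceph iSCSI initiator docs, so 
double-check against the docs for your release; note too that oVirt/VDSM 
manages /etc/multipath.conf itself, so local changes generally need to be 
marked private or dropped into a conf.d file):

```
# a sketch of the multipath configuration for Ceph iSCSI gateways
cat > /etc/multipath/conf.d/ceph-iscsi.conf <<'EOF'
devices {
    device {
        vendor                 "LIO-ORG"
        hardware_handler       "1 alua"
        path_grouping_policy   "failover"
        path_selector          "queue-length 0"
        failback               60
        path_checker           tur
    }
}
EOF
systemctl reload multipathd
multipath -ll    # each LUN should list one path per gateway
```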


My question is: Has anyone done what we're trying to do, and if so, are 
there any "gotchas" we should be aware of?


Cheers

Dulux-Oz



[ceph-users] Ceph iSCSI rbd-target.api Failed to Load

2022-09-07 Thread duluxoz

Hi All,

I've followed the instructions on the Ceph documentation site for 
configuring the iSCSI target. Everything went A-OK up to the point where I 
try to start the rbd-target-api service, which fails (the rbd-target-gw 
service started OK).


A `systemctl status rbd-target-api` gives:

~~~
rbd-target-api.service - Ceph iscsi target configuration API
   Loaded: loaded (/usr/lib/systemd/system/rbd-target-api.service; 
enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2022-09-07 18:07:26 
AEST; 1h 5min ago

  Process: 32547 ExecStart=/usr/bin/rbd-target-api (code=exited, status=16)
 Main PID: 32547 (code=exited, status=16)

Sep 07 19:19:03 ceph-host1.mydomain.local systemd[1]: 
rbd-target-api.service: Start request repeated too quickly.
Sep 07 19:19:03 ceph-host1.mydomain.local systemd[1]: 
rbd-target-api.service: Failed with result 'exit-code'.
Sep 07 19:19:03 ceph-host1.mydomain.local systemd[1]: Failed to start 
Ceph iscsi target configuration API.

~~~

A `journalctl -xe` gives:

~~~
Sep 07 19:19:03 ceph-host1.mydomain.local systemd[1]: 
rbd-target-api.service: Start request repeated too quickly.
Sep 07 19:19:03 ceph-host1.mydomain.local systemd[1]: 
rbd-target-api.service: Failed with result 'exit-code'.

-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- The unit rbd-target-api.service has entered the 'failed' state with 
result 'exit-code'.
Sep 07 19:19:03 ceph-host1.mydomain.local systemd[1]: Failed to start 
Ceph iscsi target configuration API.

-- Subject: Unit rbd-target-api.service has failed
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit rbd-target-api.service has failed.
--
-- The result is failed.
~~~

The `rbd-target-api.log` gives:

~~~
2022-09-07 19:19:01,084DEBUG [common.py:141:_open_ioctx()] - 
(_open_ioctx) Opening connection to rbd pool
2022-09-07 19:19:01,086DEBUG [common.py:148:_open_ioctx()] - 
(_open_ioctx) connection opened
2022-09-07 19:19:01,105DEBUG [common.py:438:init_config()] - 
(init_config) using pre existing config object
2022-09-07 19:19:01,105DEBUG [common.py:141:_open_ioctx()] - 
(_open_ioctx) Opening connection to rbd pool
2022-09-07 19:19:01,105DEBUG [common.py:148:_open_ioctx()] - 
(_open_ioctx) connection opened
2022-09-07 19:19:01,106DEBUG [common.py:120:_read_config_object()] - 
_read_config_object reading the config object
2022-09-07 19:19:01,107DEBUG [common.py:170:_get_ceph_config()] - 
(_get_rbd_config) config object contains 'b'{\n"created": "2022/09/07 
07:25:58",\n"discovery_auth": {\n"mutual_password": 
"",\n"mutual_password_encryption_enabled": false,\n"mutual_username": 
"",\n"password": "",\n"password_encryption_enabled": false,\n"username": 
""\n},\n"disks": {},\n"epoch": 0,\n"gateways": {},\n"targets": 
{},\n"updated": "",\n"version": 11\n}''
2022-09-07 19:19:01,107 INFO [rbd-target-api:2810:run()] - Started the 
configuration object watcher
2022-09-07 19:19:01,107 INFO [rbd-target-api:2812:run()] - Checking for 
config object changes every 1s
2022-09-07 19:19:01,109 INFO [gateway.py:82:osd_blocklist_cleanup()] - 
Processing osd blocklist entries for this node
2022-09-07 19:19:01,497 INFO [gateway.py:140:osd_blocklist_cleanup()] - 
No OSD blocklist entries found
2022-09-07 19:19:01,497 INFO [gateway.py:250:define()] - Reading the 
configuration object to update local LIO configuration
2022-09-07 19:19:01,497 INFO [gateway.py:261:define()] - Configuration 
does not have an entry for this host(ceph-host1.mydomain.local) - 
nothing to define to LIO
2022-09-07 19:19:01,507 CRITICAL [rbd-target-api:2942:main()] - Secure 
API requested but the crt/key files missing/incompatible?
2022-09-07 19:19:01,508 CRITICAL [rbd-target-api:2944:main()] - Unable 
to start
2022-09-07 19:19:01,956DEBUG [common.py:141:_open_ioctx()] - 
(_open_ioctx) Opening connection to rbd pool
2022-09-07 19:19:01,958DEBUG [common.py:148:_open_ioctx()] - 
(_open_ioctx) connection opened
2022-09-07 19:19:01,976DEBUG [common.py:438:init_config()] - 
(init_config) using pre existing config object
2022-09-07 19:19:01,976DEBUG [common.py:141:_open_ioctx()] - 
(_open_ioctx) Opening connection to rbd pool
2022-09-07 19:19:01,976DEBUG [common.py:148:_open_ioctx()] - 
(_open_ioctx) connection opened
2022-09-07 19:19:01,977DEBUG [common.py:120:_read_config_object()] - 
_read_config_object reading the config object
2022-09-07 19:19:01,978DEBUG [common.py:170:_get_ceph_config()] - 
(_get_rbd_config) config object contains 'b'{\n"created": "2022/09/07 
07:25:58",\n"discovery_auth": {\n"mutual_password": 
"",\n"mutual_password_encryption_enabled": false,\n"mutual_username": 
"",\n"password": "",\n"password_encryption_enabled": false,\n"username": 
""\n},\n"disks": {},\n"epoch": 0,\n"gateways": {},\n"targets": 
{},\n"updated": "",\n"version": 11\n}''
2022-09-07 19:19:01,979 INFO [rbd-target-api:2810:run()] - Started the 
configuration object
~~~
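
The two CRITICAL lines ("Secure API requested but the crt/key files 
missing/incompatible?" / "Unable to start") look like the actual failure: 
the API has been told to run over TLS but cannot find a usable 
certificate/key pair. A minimal sketch of the relevant part of 
`/etc/ceph/iscsi-gateway.cfg` (option names as I recall them from the 
ceph-iscsi docs - verify against your version; the trusted IPs are 
placeholders), which must be identical on every gateway node:

```
cat > /etc/ceph/iscsi-gateway.cfg <<'EOF'
[config]
cluster_name = ceph
gateway_keyring = ceph.client.admin.keyring
api_user = admin
api_password = admin
api_port = 5000
# set api_secure = true only once a matching certificate/key pair is in
# place; otherwise the API refuses to start, as in the log above
api_secure = false
trusted_ip_list = 192.168.0.11,192.168.0.12
EOF
systemctl restart rbd-target-api
```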

[ceph-users] Ceph-iscsi

2022-03-29 Thread Michel Niyoyita
Hello team,

I have a problem I would like the team to help me with.
I have a Ceph cluster with HEALTH_OK running in a testing environment, with 
3 nodes of 4 OSDs each, 3 MONs and 2 managers, deployed using Ansible. The 
purpose of the cluster is to work as the storage backend for OpenStack, and 
it works perfectly. My question is: I have another server for which I want 
to use Ceph as its storage. How can I proceed? A link or a hint would be 
helpful.
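
If the new server is a Linux host that can reach the Ceph public network 
directly, the simplest route is usually a native RBD client rather than 
iSCSI; a rough sketch (pool, image and client names are placeholders):

```
# on an admin node: create a pool, an image and a restricted client key
ceph osd pool create app-data 64
ceph osd pool application enable app-data rbd
rbd create app-data/disk01 --size 100G
ceph auth get-or-create client.appserver \
    mon 'profile rbd' osd 'profile rbd pool=app-data' \
    -o /etc/ceph/ceph.client.appserver.keyring

# on the new server: install ceph-common, copy ceph.conf and the keyring
# above into /etc/ceph/, then map and use the image like a local disk
rbd map app-data/disk01 --id appserver
mkfs.xfs /dev/rbd0
mount /dev/rbd0 /mnt/app-data
```

If the server cannot talk RADOS directly (e.g. a Windows box), the iSCSI 
gateway discussed elsewhere in this digest is the usual alternative.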

Best Regards,

Michel


[ceph-users] CEPH iSCSI Gateway

2022-01-03 Thread Carlos Rebelato de Alcantara
Hi Guys.
I have a doubt about Ceph working with iSCSI gateways.
Today we have a cluster with 10 OSD nodes, 3 monitors and 2 iSCSI gateways. 
We are planning to expand the gateways to 4 machines. We understand the 
process to do this, but we would like to know whether it is necessary to 
adjust some configuration on the clients to allow them to balance the I/O 
between all 4 gateway machines.
Thanks,

Carlos


[ceph-users] ceph-iscsi / tcmu-runner bad pefromance with vmware esxi

2021-09-23 Thread José H . Freidhof
Hello all,

I need some help with our Ceph 16.2.5 cluster, which we use as an iSCSI 
target for ESXi nodes.

Background info:

   - we have built 3x OSD nodes with 60 BlueStore OSDs on 60x 6 TB
   spinning disks, plus 12 SSDs and 3 NVMe devices
   - the OSD nodes have 32 cores and 256 GB RAM
   - the OSD disks are connected to a SCSI RAID controller ... each disk is
   configured as RAID0 with write-back enabled to use the RAID controller
   cache etc.
   - we have 3x MONs and 2x iSCSI gateways
   - all servers are connected to a 10 Gbit network (switches)
   - all servers have two 10 Gbit network adapters configured as bond-rr
   - we created one RBD pool with autoscaling and 128 PGs (at the moment)
   - the pool currently holds 5 RBD images... 2x 10 TB and 3x 500 GB, with
   the exclusive-lock feature and striping v2 (4 MB object / 1 MB stripe / count 4)
   - all the images are attached to the two iSCSI gateways running
   tcmu-runner 1.5.4 and exposed as iSCSI targets
   - we have 6 ESXi 6.7u3 servers as compute nodes connected to the Ceph
   iSCSI target

esxi iscsi config:
esxcli system settings advanced set -o /ISCSI/MaxIoSizeKB -i 512
esxcli system module parameters set -m iscsi_vmk -p iscsivmk_LunQDepth=64
esxcli system module parameters set -m iscsi_vmk -p iscsivmk_HostQDepth=64
esxcli system settings advanced set --int-value 1 --option
/DataMover/HardwareAcceleratedMove

The OSD nodes, MONs, RGW/iSCSI gateways and ESXi nodes are all connected to 
the 10 Gbit network with bond-rr.

rbd benchmark test:

root@cd133-ceph-osdh-01:~# rados bench -p rbd 10 write
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size
4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_cd133-ceph-osdh-01_87894
  sec  Cur ops  started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0        0        0         0         0         0            -           0
    1       16       69        53   211.987       212     0.250578    0.249261
    2       16      129       113   225.976       240     0.296519    0.266439
    3       16      183       167   222.641       216     0.219422    0.273838
    4       16      237       221   220.974       216     0.469045     0.28091
    5       16      292       276   220.773       220     0.249321     0.27565
    6       16      339       323   215.307       188     0.205553     0.28624
    7       16      390       374   213.688       204     0.188404    0.290426
    8       16      457       441   220.472       268     0.181254    0.286525
    9       16      509       493   219.083       208     0.250538    0.286832
   10       16      568       552   220.772       236     0.307829    0.286076
Total time run: 10.2833
Total writes made:  568
Write size: 4194304
Object size:4194304
Bandwidth (MB/sec): 220.941
Stddev Bandwidth:   22.295
Max bandwidth (MB/sec): 268
Min bandwidth (MB/sec): 188
Average IOPS:   55
Stddev IOPS:5.57375
Max IOPS:   67
Min IOPS:   47
Average Latency(s): 0.285903
Stddev Latency(s):  0.115162
Max latency(s): 0.88187
Min latency(s): 0.119276
Cleaning up (deleting benchmark objects)
Removed 568 objects
Clean up completed and total clean up time :3.18627

The rados bench run says that at least 250 MB/s is possible... but I have 
seen really much more... up to 550 MB/s.
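
One more data point that can help: benchmark one of the actual images 
through librbd (rather than the pool with rados bench), since the 
striping-v2 settings only apply at the image level; a sketch, assuming one 
of the 500 GB images is free for testing (adjust the pool/image name):

```
# sequential 4M writes through librbd, bypassing tcmu-runner/iSCSI entirely
rbd bench --io-type write --io-size 4M --io-threads 16 --io-total 10G rbd/test-image-500g

# smaller random writes, closer to what a VMFS datastore generates
rbd bench --io-type write --io-size 64K --io-threads 16 --io-total 2G \
    --io-pattern rand rbd/test-image-500g
```

If these numbers are close to the rados bench results, the loss is 
happening in the tcmu-runner/iSCSI layer (or the ESXi initiator settings) 
rather than in the pool or the image layout.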

If I start iftop on one OSD node I see the Ceph iSCSI gateways (named as 
rgw) and the traffic is nearly 80 MB/s.


The Ceph dashboard shows that the iSCSI write performance is only 40 MB/s; 
the maximum value I saw was between 40 and 60 MB/s... very poor.


If I look at the vCenter and ESXi datastore performance I see very high 
storage device latencies, between 50 and 100 ms... very bad.


root@cd133-ceph-mon-01:/home/cephadm# ceph config dump
WHO     MASK  LEVEL     OPTION                     VALUE                                                                                        RO
global        basic     container_image            docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb  *
global        advanced  journal_max_write_bytes    1073714824
global        advanced  journal_max_write_entries  1
global        advanced  mon_osd_cache_size         1024
global        dev       osd_client_watch_timeout   15
global

[ceph-users] ceph-iscsi issue after upgrading from nautilus to octopus

2021-04-15 Thread icy chan
Hi,

I have several clusters running Nautilus that are pending an upgrade to 
Octopus.

I am now testing the upgrade steps for a Ceph cluster from Nautilus to 
Octopus using cephadm adopt in a lab, following the link below:
- https://docs.ceph.com/en/octopus/cephadm/adoption/

Lab environment:
3 all-in-one nodes.
OS: CentOS 7.9.2009 with podman 1.6.4.

After the adoption, ceph health keeps warning that tcmu-runner is not 
managed by cephadm.
# ceph health detail
HEALTH_WARN 12 stray daemon(s) not managed by cephadm; 1 pool(s) have no
replicas configured
[WRN] CEPHADM_STRAY_DAEMON: 12 stray daemon(s) not managed by cephadm
stray daemon tcmu-runner.ceph-aio1:iSCSI/iscsi_image_01 on host
ceph-aio1 not managed by cephadm
stray daemon tcmu-runner.ceph-aio1:iSCSI/iscsi_image_02 on host
ceph-aio1 not managed by cephadm
stray daemon tcmu-runner.ceph-aio1:iSCSI/iscsi_image_03 on host
ceph-aio1 not managed by cephadm
stray daemon tcmu-runner.ceph-aio1:iSCSI/iscsi_image_test on host
ceph-aio1 not managed by cephadm
stray daemon tcmu-runner.ceph-aio2:iSCSI/iscsi_image_01 on host
ceph-aio2 not managed by cephadm
stray daemon tcmu-runner.ceph-aio2:iSCSI/iscsi_image_02 on host
ceph-aio2 not managed by cephadm
stray daemon tcmu-runner.ceph-aio2:iSCSI/iscsi_image_03 on host
ceph-aio2 not managed by cephadm
stray daemon tcmu-runner.ceph-aio2:iSCSI/iscsi_image_test on host
ceph-aio2 not managed by cephadm
stray daemon tcmu-runner.ceph-aio3:iSCSI/iscsi_image_01 on host
ceph-aio3 not managed by cephadm
stray daemon tcmu-runner.ceph-aio3:iSCSI/iscsi_image_02 on host
ceph-aio3 not managed by cephadm
stray daemon tcmu-runner.ceph-aio3:iSCSI/iscsi_image_03 on host
ceph-aio3 not managed by cephadm
stray daemon tcmu-runner.ceph-aio3:iSCSI/iscsi_image_test on host
ceph-aio3 not managed by cephadm

And tcmu-runner is still running with the old version.
# ceph versions
{
"mon": {
"ceph version 15.2.10 (27917a557cca91e4da407489bbaa64ad4352cc02)
octopus (stable)": 3
},
"mgr": {
"ceph version 15.2.10 (27917a557cca91e4da407489bbaa64ad4352cc02)
octopus (stable)": 1
},
"osd": {
"ceph version 15.2.10 (27917a557cca91e4da407489bbaa64ad4352cc02)
octopus (stable)": 9
},
"mds": {},
"tcmu-runner": {
"ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9)
nautilus (stable)": 12
},
"overall": {
"ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9)
nautilus (stable)": 12,
"ceph version 15.2.10 (27917a557cca91e4da407489bbaa64ad4352cc02)
octopus (stable)": 13
}
}

I didn't find any ceph-iscsi related upgrade steps at the above reference 
link.
Can anyone here point me in the right direction for upgrading the 
ceph-iscsi version?
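
For what it's worth, once the cluster itself is adopted, recent Octopus 
releases let cephadm manage the iSCSI gateways as a regular service via an 
`iscsi` service spec (check the cephadm docs for your exact version), after 
which the old packaged rbd-target-api/tcmu-runner daemons can be stopped 
and removed. A sketch only - the pool name and credentials are 
placeholders; the host names are taken from the health output above:

```
cat > iscsi-spec.yaml <<'EOF'
service_type: iscsi
service_id: igw
placement:
  hosts:
    - ceph-aio1
    - ceph-aio2
    - ceph-aio3
spec:
  pool: iscsi-pool
  api_user: admin
  api_password: admin
  trusted_ip_list: "192.168.1.11,192.168.1.12,192.168.1.13"
EOF
ceph orch apply -i iscsi-spec.yaml
ceph orch ls iscsi          # confirm the service was created
ceph orch ps | grep iscsi   # see where the daemons landed
```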

Thanks.

Regs,
Icy


[ceph-users] Ceph iscsi help

2021-03-01 Thread Várkonyi János
Hi All,

I'd like to install Ceph Nautilus on Ubuntu 18.04 LTS and present the 
storage to 2 Windows servers via iSCSI. I chose Nautilus because of the 
ceph-deploy function - I don't want yet another VM just for cephadm. I can 
install Ceph and it works properly, but I can't set up the iSCSI gateway. 
The services are running (tcmu-runner, rbd-target-gw and rbd-target-api). 
I can go into gwcli, but I can't create the first gateway; I get this 
message:

/iscsi-target...-igw/gateways> create cf01 192.168.203.51 skipchecks=true
OS version/package checks have been bypassed
Get gateway hostname failed : 403 Forbidden
Please check api_host setting and make sure host cf01 IP is listening on port 
5000

In the syslog at the same time:

Mar  1 15:43:02 cf01 there is no tcmu-runner data avaliable
Mar  1 15:43:06 cf01 :::127.0.0.1 - - [01/Mar/2021 15:43:06] "GET 
/api/config HTTP/1.1" 200 -

I can see python listening on port 5000 (maybe this is my problem):

netstat -tulpn | grep 5000
tcp6   0  0 :::5000 :::*LISTEN  
1976/python

I cannot find anything about this error and I can't figure out what the solution is.

Ubuntu 18.04.5 LTS
4.15.0-136-generic
I also tried with 4.20.0-042000-generic but the error was the same.
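
In case it is useful to anyone hitting the same 403: rbd-target-api only 
answers requests coming from addresses listed in its trusted_ip_list, and 
gwcli contacts it on the address the gateway name resolves to, so the usual 
first things to check are name resolution for cf01 and a config along these 
lines (identical on every gateway; the IPs are placeholders for your own):

```
cat > /etc/ceph/iscsi-gateway.cfg <<'EOF'
[config]
cluster_name = ceph
gateway_keyring = ceph.client.admin.keyring
api_secure = false
api_user = admin
api_password = admin
api_port = 5000
# every gateway IP (and any host you run gwcli from) must be listed here,
# otherwise the API answers 403 Forbidden
trusted_ip_list = 192.168.203.51,192.168.203.52
EOF
systemctl restart rbd-target-gw rbd-target-api
```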

jansz0





[ceph-users] CEPH-ISCSI fails when restarting rbd-target-api and won't work anymore

2020-11-28 Thread Hamidreza Hosseini
I have an issue with ceph-iscsi (Ubuntu 20.04 LTS and Ceph 15.2.6): after I 
restart rbd-target-api, it fails and does not start again:

```

sudo systemctl status rbd-target-api.service
● rbd-target-api.service - Ceph iscsi target configuration API
 Loaded: loaded (/lib/systemd/system/rbd-target-api.service; enabled; 
vendor preset: enabled)
 Active: deactivating (stop-sigterm) since Sat 2020-11-28 17:01:40 +0330; 
20s ago
   Main PID: 37651 (rbd-target-api)
  Tasks: 55 (limit: 9451)
 Memory: 141.4M
 CGroup: /system.slice/rbd-target-api.service
 ├─15289 /usr/bin/python3 /usr/bin/rbd-target-api
 └─37651 /usr/bin/python3 /usr/bin/rbd-target-api

Nov 28 14:36:53 dev11 systemd[1]: Started Ceph iscsi target configuration API.
Nov 28 14:36:54 dev11 rbd-target-api[37651]: Started the configuration object 
watcher
Nov 28 14:36:54 dev11 rbd-target-api[37651]: Processing osd blacklist entries 
for this node
Nov 28 14:36:54 dev11 rbd-target-api[37651]: Checking for config object changes 
every 1s
Nov 28 14:36:55 dev11 rbd-target-api[37651]: Reading the configuration object 
to update local LIO configuration
Nov 28 14:36:55 dev11 rbd-target-api[37651]: Processing Gateway configuration
Nov 28 14:36:55 dev11 rbd-target-api[37651]: Setting up 
iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 14:36:55 dev11 rbd-target-api[37651]: (Gateway.load_config) successfully 
loaded existing target definition
Nov 28 17:01:40 dev11 systemd[1]: Stopping Ceph iscsi target configuration 
API...

```
journalctl:
```
Nov 28 17:00:01 dev11 kernel: Unable to locate Target Portal Group on 
iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:01 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:04 dev11 kernel: Unable to locate Target Portal Group on 
iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:04 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:06 dev11 ceph-mgr[3184]: [172.16.1.3:57002] [GET] [500] [45.074s] 
[admin] [513.0B] /api/health/minimal
Nov 28 17:00:06 dev11 ceph-mgr[3184]: [b'{"status": "500 Internal Server 
Error", "detail": "The server encountered an unexpected condition which 
prevented it from fulfilling the request.", "request_id": 
"68eed46b-3ece-4e60-bc17-a172358f2d76"} 



   ']
Nov 28 17:00:06 dev11 ceph-mgr[3184]: [172.16.1.3:60128] [GET] [500] [45.070s] 
[admin] [513.0B] /api/health/minimal
Nov 28 17:00:06 dev11 ceph-mgr[3184]: [b'{"status": "500 Internal Server 
Error", "detail": "The server encountered an unexpected condition which 
prevented it from fulfilling the request.", "request_id": 
"5b6fdaa2-dc70-48a7-b01f-ca554ecfec41"} 



   ']
Nov 28 17:00:07 dev11 kernel: Unable to locate Target Portal Group on 
iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:07 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:11 dev11 kernel: Unable to locate Target Portal Group on 
iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:11 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:11 dev11 ceph-mgr[3184]: :::127.0.0.1 - - 
[28/Nov/2020:17:00:11] "GET /metrics HTTP/1.1" 200 151419 "" "Prometheus/2.7.2"
Nov 28 17:00:14 dev11 kernel: Unable to locate Target Portal Group on 
iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:14 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:17 dev11 kernel: Unable to locate Target Portal Group on 
iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:17 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:20 dev11 kernel: Unable to locate Target Portal Group on 
iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:20 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:22 dev11 ceph-mgr[3184]: [172.16.1.3:59834] [GET] [500] [45.062s] 
[admin] [513.0B] /api/health/minimal
Nov 28 17:00:22 dev11 ceph-mgr[3184]: [b'{"status": "500 Internal Server 
Error", "detail": "The server encountered an unexpected condition which 
prevented it from fulfilling the request.", "request_id": 
"1ba61331-1dfd-43e7-8ced-9f28aeb8a39c"} 



   ']
Nov 28 17:00:23 dev11 kernel: Unable to 

[ceph-users] Ceph iSCSI Performance

2020-10-05 Thread DHilsbos
All;

I've finally gotten around to setting up iSCSI gateways on my primary 
production cluster, and performance is terrible.

We're talking 1/4 to 1/3 of our current solution.

I see no evidence of network congestion on any involved network link.  I 
see no evidence of CPU or memory being a problem on any involved server 
(MON / OSD / gateway / client).

What can I look at to tune this, preferably on the iSCSI gateways?
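
One way to narrow down where the time is going before touching tunables is 
to watch cluster-side latency while driving load through the gateways; a 
sketch using standard tools (the rbd perf command needs Nautilus or later):

```
# per-image IOPS/throughput/latency as seen from the cluster side
rbd perf image iostat

# per-OSD commit/apply latency - high numbers here point at the OSDs
# rather than at the iSCSI gateways
ceph osd perf

# on each gateway, check whether tcmu-runner itself is the bottleneck
# (a single saturated core is a common symptom)
top -b -n 1 -p "$(pgrep -d, tcmu-runner)"
```

If the cluster-side latencies stay low while the initiators see high 
latency, the place to look is the gateway path (tcmu-runner, LIO settings, 
network between initiators and gateways) rather than the OSDs.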

Thank you,

Dominic L. Hilsbos, MBA 
Director - Information Technology 
Perform Air International, Inc.
dhils...@performair.com 
www.PerformAir.com



[ceph-users] ceph iscsi latency too high for esxi?

2020-10-04 Thread Golasowski Martin
Hi,
does anyone here use Ceph iSCSI with VMware ESXi? It seems that we are 
hitting the 5-second timeout limit on the software HBA in ESXi. It appears 
whenever there is increased load on the cluster, like a deep scrub or 
rebalance. Is this normal behaviour in production? Or is there something 
special we need to tune?

We are on the latest Nautilus, 12 x 10 TB OSDs (4 servers), 25 Gbit/s 
Ethernet, an erasure-coded RBD pool with 128 PGs, around 200 PGs per OSD in 
total.


ESXi Log:

2020-10-04T01:57:04.314Z cpu34:2098959)WARNING: iscsi_vmk: 
iscsivmk_ConnReceiveAtomic:517: vmhba64:CH:1 T:0 CN:0: Failed to receive data: 
Connection closed by peer
2020-10-04T01:57:04.314Z cpu34:2098959)iscsi_vmk: 
iscsivmk_ConnRxNotifyFailure:1235: vmhba64:CH:1 T:0 CN:0: Connection rx 
notifying failure: Failed to Receive. State=Bound
2020-10-04T01:57:04.566Z cpu19:2098979)WARNING: iscsi_vmk: 
iscsivmk_StopConnection:741: vmhba64:CH:1 T:0 CN:0: iSCSI connection is being 
marked "OFFLINE" (Event:4)
2020-10-04T01:57:04.654Z cpu7:2097866)WARNING: VMW_SATP_ALUA: 
satp_alua_issueCommandOnPath:788: Probe cmd 0xa3 failed for path 
"vmhba64:C2:T0:L0" (0x5/0x20/0x0). Check if failover mode is still ALUA.


OSD Log:

[303088.450088] Did not receive response to NOPIN on CID: 0, failing connection 
for I_T Nexus 
iqn.1994-05.com.redhat:esxi1,i,0x00023d02,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,t,0x01
[324926.694077] Did not receive response to NOPIN on CID: 0, failing connection 
for I_T Nexus 
iqn.1994-05.com.redhat:esxi2,i,0x00023d01,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,t,0x01
[407067.404538] ABORT_TASK: Found referenced iSCSI task_tag: 5891
[407076.077175] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 5891
[411677.887690] ABORT_TASK: Found referenced iSCSI task_tag: 6722
[411683.297425] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 6722
[481459.755876] ABORT_TASK: Found referenced iSCSI task_tag: 7930
[481460.787968] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 7930

Cheers,
Martin



[ceph-users] CEPH iSCSI issue - ESXi command timeout

2020-10-01 Thread Golasowski Martin
Dear All,

a week ago we had to reboot our ESXi nodes since our Ceph cluster suddenly 
stopped serving all I/O. We identified a VM (the vCenter appliance) that 
was swapping heavily and causing heavy load. However, since then we have 
been experiencing strange issues, as if the cluster cannot handle any spike 
in I/O load like a migration or a VM reboot.

The main problem is that the iSCSI commands issued by ESXi sometimes time 
out and ESXi reports an inaccessible datastore. This disrupts I/O heavily; 
we have had to reboot the VMware cluster entirely several times. It started 
suddenly after approx. 10 months of operation without problems.

I can see a steadily increasing number of dropped RX packets on the iSCSI 
network interfaces of the OSD nodes.

Our Ceph setup is as follows: 4 OSD nodes, each with 3x 10 TB 7.2k rpm 
HDDs. The OSD nodes are connected by 25 Gbps Ethernet to the other nodes. 
For the RBD pools I have 64 PGs. The OSD nodes have 32 GB RAM; free memory 
is around 1 GB on each - I have seen even lower, though. The OS is CentOS 
7, the Ceph release is Nautilus 14.2.11, deployed by ceph-ansible. The MONs 
are virtualized on the ESXi nodes, on local SSD drives.

The iSCSI NICs are on a separate VLAN; other traffic is served via a bond 
with balance-xor (LACP is unusable due to a VMware limitation when using 
the SW iSCSI HBA) in a different VLAN. Our network is Mellanox-based - 
SN2100 switches and ConnectX-5 NICs.

The iSCSI target serves 2 LUNs in an RBD pool which is erasure coded. 
Yesterday I increased the number of PGs for that pool from 64 to 128, 
without much effect after the cluster finished rebalancing.

In OSD servers kernel log we see the following:

[299560.618893] iSCSI Login negotiation failed.
[303088.450088] Did not receive response to NOPIN on CID: 0, failing connection 
for I_T Nexus 
iqn.1994-05.com.redhat:esxi1,i,0x00023d02,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,t,0x01
[324926.694077] Did not receive response to NOPIN on CID: 0, failing connection 
for I_T Nexus 
iqn.1994-05.com.redhat:esxi2,i,0x00023d01,iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw,t,0x01
[407067.404538] ABORT_TASK: Found referenced iSCSI task_tag: 5891
[407076.077175] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 5891
[411677.887690] ABORT_TASK: Found referenced iSCSI task_tag: 6722
[411683.297425] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 6722


The error in ESXi looks like this:

naa.60014053b46fc760ff0470dbd7980263" on path "vmhba64:C1:T0:L0" Failed:
2020-10-01T05:38:51.291Z cpu49:2144076)NMP: nmp_ThrottleLogForDevice:3856: Cmd 
0x89 (0x459a5b1b9480, 2097241) to dev "naa.6001405a527d78935724451aa5f53513" on 
path "vmhba64:C2:T0:L1" Failed:
2020-10-01T05:38:57.098Z cpu44:2099346)NMP: nmp_ThrottleLogForDevice:3856: Cmd 
0x8a (0x45ba96710ec0, 2107403) to dev "naa.60014053b46fc760ff0470dbd7980263" on 
path "vmhba64:C1:T0:L0" Failed:
2020-10-01T05:38:57.122Z cpu71:2098965)NMP: nmp_ThrottleLogForDevice:3856: Cmd 
0x89 (0x45ba9676aec0, 2146212) to dev "naa.60014053b46fc760ff0470dbd7980263" on 
path "vmhba64:C1:T0:L0" Failed:
2020-10-01T05:38:57.256Z cpu65:2098959)NMP: nmp_ThrottleLogForDevice:3856: Cmd 
0x89 (0x459a4179d8c0, 2146269) to dev "naa.6001405a527d78935724451aa5f53513" on 
path "vmhba64:C2:T0:L1" Failed:

We would appreciate any help you can give us.

Thank you very much.

Regards,
Martin Golasowski






[ceph-users] Ceph iSCSI Questions

2020-09-04 Thread DHilsbos
All;

We've used iSCSI to support virtualization for a while, and have used 
multi-pathing almost the entire time.  Now, I'm looking to move from our single 
box iSCSI hosts to iSCSI on Ceph.

We have 2 independent, non-routed, subnets assigned to iSCSI (let's call them 
192.168.250.0/24 and 192.168.251.0/24).  These subnets are hosted in VLANs 250 
and 251, respectively, on our switches.  Currently; each target and each 
initiator have a dedicated network port for each subnet (i.e. 2 NIC  per target 
& 2 NIC per initiator).

I have 2 servers prepared to be set up as Ceph iSCSI targets (let's call 
them ceph-iscsi1 & ceph-iscsi2), and I'm wondering about their network 
configuration.  My initial plan is to configure one on the 250 network, and 
the other on the 251 network.

Would it be possible to have both servers on both networks?  In other words, 
can I give ceph-iscsi1 both 192.168.250.200 and 192.168.251.200, and 
ceph-iscsi2 192.168.250.201 and 192.168.251.201?

If that works, I would expect the initiators to see 4 paths to each portal, 
correct?
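
If the dual-homing works as intended, each Linux-style initiator would end 
up with four iSCSI sessions (two portals on each of two gateways) and 
multipath would show four paths per LUN; a quick way to sanity-check it 
(portal address as in your example - adjust for the real ones):

```
# discover targets through any one portal, then log in everywhere
iscsiadm -m discovery -t sendtargets -p 192.168.250.200
iscsiadm -m node --login

# expect four sessions: two gateways x two subnets
iscsiadm -m session

# and four paths per mapped LUN
multipath -ll
```

Whether ceph-iscsi will happily advertise two portals per gateway is worth 
confirming against the gwcli docs for your release; the commands above at 
least show what the initiators actually ended up with.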

Thank you,

Dominic L. Hilsbos, MBA 
Director - Information Technology 
Perform Air International Inc.
dhils...@performair.com 
www.PerformAir.com




[ceph-users] ceph-iscsi create RBDs on erasure coded data pools

2020-01-30 Thread Wesley Dillingham
Is it possible to create an EC-backed RBD via the ceph-iscsi tools (gwcli, 
rbd-target-api)? It appears that a pre-existing RBD created with the rbd 
command can be imported, but there is no means to directly create an 
EC-backed RBD. The API seems to expect a single pool field in the request 
body to work with.

Perhaps there is a lower-level construct where you can set the metadata of 
a particular RADOS pool to always use pool X as the data-pool when pool Y 
is used for the RBD header and metadata. That way the clients - in our case 
ceph-iscsi - needn't be modified or concerned with the dual-pool situation 
unless it is explicitly specified.
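
That lower-level construct does exist on the RBD side: there is a per-pool 
config override, `rbd_default_data_pool`, so any client that creates an 
image in the replicated "main" pool (ceph-iscsi included, as far as I can 
tell - worth verifying) gets its data objects placed in the EC pool without 
having to pass a data-pool itself. A sketch with placeholder pool names:

```
# EC pool for the data objects (RBD on EC needs overwrites enabled)
ceph osd pool create rbd-ec-data 64 64 erasure
ceph osd pool set rbd-ec-data allow_ec_overwrites true
ceph osd pool application enable rbd-ec-data rbd

# replicated pool that ceph-iscsi / rbd-target-api is pointed at; it keeps
# the image headers and metadata
ceph osd pool create iscsi-meta 32
ceph osd pool application enable iscsi-meta rbd
rbd config pool set iscsi-meta rbd_default_data_pool rbd-ec-data

# any image created in iscsi-meta from now on uses rbd-ec-data for data
rbd create iscsi-meta/lun01 --size 1T
rbd info iscsi-meta/lun01 | grep data_pool
```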

For our particular use case we expose limited functionality of 
rbd-target-api to clients, and it would be helpful for them to keep track 
of a single pool rather than two; but if a data-pool and a "main" pool 
could be passed via the API, that would be okay too.

Thanks a lot.

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com


[ceph-users] ceph-iscsi and tcmu-runner RPMs for CentOS?

2019-09-07 Thread Robert Sander
Hi,

In the documentation at 
https://docs.ceph.com/docs/nautilus/rbd/iscsi-target-cli/ it is stated that 
you need at least CentOS 7.5 with at least kernel 4.16, and that you should 
install tcmu-runner and ceph-iscsi "from your Linux distribution's software 
repository".

CentOS does not ship tcmu-runner or ceph-iscsi.

Where do I get these RPMs from?
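
Not an answer from the thread, but for the record: these were never in the 
stock CentOS repos; upstream builds of ceph-iscsi were published by the 
Ceph project (download.ceph.com), and tcmu-runner was typically taken from 
the upstream project (github.com/open-iscsi/tcmu-runner) or the Ceph build 
service (shaman.ceph.com). A rough, unverified sketch of the kind of repo 
file the manual-install docs describe - verify the baseurl against the 
current iscsi-target-cli-manual-install page before using it:

```
cat > /etc/yum.repos.d/ceph-iscsi.repo <<'EOF'
[ceph-iscsi]
name=Ceph iSCSI packages (unverified example baseurl - check the Ceph docs)
baseurl=https://download.ceph.com/ceph-iscsi/3/rpm/el7/noarch/
enabled=1
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc
EOF
yum install -y ceph-iscsi
```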

Regards
-- 
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin


