Hi Eugen.

OK: I edited the file /etc/salt/minion, uncommented the "log_level_logfile" line and set it to the "debug" level.
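For reference, the relevant line in /etc/salt/minion now reads as below (if I understood the docs right, the extra output should land in /var/log/salt/minion):

    log_level_logfile: debug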
Then I turned the computer off, waited a few minutes so that the time frame would stand out in /var/log/messages, and restarted it. Using vi I cut out the reboot section (a sketch of an equivalent command is in the P.S. below), and from that I also removed most of what seemed totally unrelated to ceph, salt, minions, grafana, prometheus and so on. I got the lines below. As far as I can see, it does not complain about anything. :(

################
2018-08-30T15:41:46.455383-03:00 torcello systemd[1]: systemd 234 running in system mode. (+PAM -AUDIT +SELINUX -IMA +APPARMOR -SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT -GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID -ELFUTILS +KMOD -IDN2 -IDN default-hierarchy=hybrid)
2018-08-30T15:41:46.456330-03:00 torcello systemd[1]: Detected architecture x86-64.
2018-08-30T15:41:46.456350-03:00 torcello systemd[1]: nss-lookup.target: Dependency Before=nss-lookup.target dropped
2018-08-30T15:41:46.456357-03:00 torcello systemd[1]: Started Load Kernel Modules.
2018-08-30T15:41:46.456369-03:00 torcello systemd[1]: Starting Apply Kernel Variables...
2018-08-30T15:41:46.457230-03:00 torcello systemd[1]: Started Alertmanager for prometheus.
2018-08-30T15:41:46.457237-03:00 torcello systemd[1]: Started Monitoring system and time series database.
2018-08-30T15:41:46.457403-03:00 torcello systemd[1]: Starting NTP client/server...
*2018-08-30T15:41:46.457425-03:00 torcello systemd[1]: Started Prometheus exporter for machine metrics.
2018-08-30T15:41:46.457706-03:00 torcello prometheus[695]: level=info ts=2018-08-30T18:41:44.797896888Z caller=main.go:225 msg="Starting Prometheus" version="(version=2.1.0, branch=non-git, revision=non-git)"
2018-08-30T15:41:46.457712-03:00 torcello prometheus[695]: level=info ts=2018-08-30T18:41:44.797969232Z caller=main.go:226 build_context="(go=go1.9.4, user=abuild@lamb69, date=20180513-03:46:03)"
2018-08-30T15:41:46.457719-03:00 torcello prometheus[695]: level=info ts=2018-08-30T18:41:44.798008802Z caller=main.go:227 host_details="(Linux 4.12.14-lp150.12.4-default #1 SMP Tue May 22 05:17:22 UTC 2018 (66b2eda) x86_64 torcello (none))"
2018-08-30T15:41:46.457726-03:00 torcello prometheus[695]: level=info ts=2018-08-30T18:41:44.798044088Z caller=main.go:228 fd_limits="(soft=1024, hard=4096)"
2018-08-30T15:41:46.457738-03:00 torcello prometheus[695]: level=info ts=2018-08-30T18:41:44.802067189Z caller=web.go:383 component=web msg="Start listening for connections" address=0.0.0.0:9090
2018-08-30T15:41:46.457745-03:00 torcello prometheus[695]: level=info ts=2018-08-30T18:41:44.802037354Z caller=main.go:499 msg="Starting TSDB ..."*
2018-08-30T15:41:46.458145-03:00 torcello smartd[809]: Monitoring 1 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices
2018-08-30T15:41:46.458321-03:00 torcello systemd[1]: Started NTP client/server.
*2018-08-30T15:41:50.387157-03:00 torcello ceph_exporter[690]: 2018/08/30 15:41:50 Starting ceph exporter on ":9128"*
2018-08-30T15:41:52.658272-03:00 torcello wicked[905]: lo up
2018-08-30T15:41:52.658738-03:00 torcello wicked[905]: eth0 up
2018-08-30T15:41:52.659989-03:00 torcello systemd[1]: Started wicked managed network interfaces.
2018-08-30T15:41:52.660514-03:00 torcello systemd[1]: Reached target Network.
2018-08-30T15:41:52.667938-03:00 torcello systemd[1]: Starting OpenSSH Daemon...
2018-08-30T15:41:52.668292-03:00 torcello systemd[1]: Reached target Network is Online.
*2018-08-30T15:41:52.669132-03:00 torcello systemd[1]: Started Ceph cluster monitor daemon.
2018-08-30T15:41:52.669328-03:00 torcello systemd[1]: Reached target ceph target allowing to start/stop all ceph-mon@.service instances at once.
2018-08-30T15:41:52.670346-03:00 torcello systemd[1]: Started Ceph cluster manager daemon.
2018-08-30T15:41:52.670565-03:00 torcello systemd[1]: Reached target ceph target allowing to start/stop all ceph-mgr@.service instances at once.
2018-08-30T15:41:52.670839-03:00 torcello systemd[1]: Reached target ceph target allowing to start/stop all ceph*@.service instances at once.*
2018-08-30T15:41:52.671246-03:00 torcello systemd[1]: Starting Login and scanning of iSCSI devices...
*2018-08-30T15:41:52.672402-03:00 torcello systemd[1]: Starting Grafana instance...*
2018-08-30T15:41:52.678922-03:00 torcello systemd[1]: Started Backup of /etc/sysconfig.
2018-08-30T15:41:52.679109-03:00 torcello systemd[1]: Reached target Timers.
*2018-08-30T15:41:52.679630-03:00 torcello systemd[1]: Started The Salt API.*
2018-08-30T15:41:52.692944-03:00 torcello systemd[1]: Starting Postfix Mail Transport Agent...
*2018-08-30T15:41:52.694687-03:00 torcello systemd[1]: Started The Salt Master Server.*
*2018-08-30T15:41:52.696821-03:00 torcello systemd[1]: Starting The Salt Minion...*
2018-08-30T15:41:52.772750-03:00 torcello sshd-gen-keys-start[1408]: Checking for missing server keys in /etc/ssh
2018-08-30T15:41:52.818695-03:00 torcello iscsiadm[1412]: iscsiadm: No records found
2018-08-30T15:41:52.819541-03:00 torcello systemd[1]: Started Login and scanning of iSCSI devices.
2018-08-30T15:41:52.820214-03:00 torcello systemd[1]: Reached target Remote File Systems.
2018-08-30T15:41:52.821418-03:00 torcello systemd[1]: Starting Permit User Sessions...
2018-08-30T15:41:53.045278-03:00 torcello systemd[1]: Started Permit User Sessions.
2018-08-30T15:41:53.048482-03:00 torcello systemd[1]: Starting Hold until boot process finishes up...
2018-08-30T15:41:53.054461-03:00 torcello echo[1415]: Starting mail service (Postfix)
2018-08-30T15:41:53.447390-03:00 torcello sshd[1431]: Server listening on 0.0.0.0 port 22.
2018-08-30T15:41:53.447685-03:00 torcello sshd[1431]: Server listening on :: port 22.
2018-08-30T15:41:53.447907-03:00 torcello systemd[1]: Started OpenSSH Daemon.
*2018-08-30T15:41:54.519192-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Starting Grafana" logger=server version=5.1.3 commit=NA compiled=2018-08-30T15:41:53-0300
2018-08-30T15:41:54.519664-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Config loaded from" logger=settings file=/usr/share/grafana/conf/defaults.ini
2018-08-30T15:41:54.519979-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Config loaded from" logger=settings file=/etc/grafana/grafana.ini
2018-08-30T15:41:54.520257-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Config overridden from command line" logger=settings arg="default.paths.data=/var/lib/grafana"
2018-08-30T15:41:54.520546-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Config overridden from command line" logger=settings arg="default.paths.logs=/var/log/grafana"
2018-08-30T15:41:54.520823-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Config overridden from command line" logger=settings arg="default.paths.plugins=/var/lib/grafana/plugins"
2018-08-30T15:41:54.521085-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Config overridden from command line" logger=settings arg="default.paths.provisioning=/etc/grafana/provisioning"
2018-08-30T15:41:54.521343-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Path Home" logger=settings path=/usr/share/grafana
2018-08-30T15:41:54.521593-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Path Data" logger=settings path=/var/lib/grafana
2018-08-30T15:41:54.521843-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Path Logs" logger=settings path=/var/log/grafana
2018-08-30T15:41:54.522108-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Path Plugins" logger=settings path=/var/lib/grafana/plugins
2018-08-30T15:41:54.522361-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Path Provisioning" logger=settings path=/etc/grafana/provisioning
2018-08-30T15:41:54.522611-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="App mode production" logger=settings
2018-08-30T15:41:54.522885-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Writing PID file" logger=server path=/var/run/grafana/grafana-server.pid pid=1413*
*2018-08-30T15:41:54.523148-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Initializing DB" logger=sqlstore dbtype=sqlite3
2018-08-30T15:41:54.523398-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Starting DB migration" logger=migrator
2018-08-30T15:41:54.804052-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Executing migration" logger=migrator id="copy data account to org"
2018-08-30T15:41:54.804423-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Skipping migration condition not fulfilled" logger=migrator id="copy data account to org"
2018-08-30T15:41:54.804724-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Executing migration" logger=migrator id="copy data account_user to org_user"
2018-08-30T15:41:54.804985-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Skipping migration condition not fulfilled" logger=migrator id="copy data account_user to org_user"
2018-08-30T15:41:54.838327-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Starting plugin search" logger=plugins*
2018-08-30T15:41:54.947408-03:00 torcello systemd[1]: Starting Locale Service...
2018-08-30T15:41:54.979069-03:00 torcello systemd[1]: Started Locale Service.
*2018-08-30T15:41:55.023859-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=info msg="Registering plugin" logger=plugins name=Discrete
2018-08-30T15:41:55.028462-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=info msg="Registering plugin" logger=plugins name=Monasca
2018-08-30T15:41:55.065227-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=eror msg="can't read datasource provisioning files from directory" logger=provisioning.datasources path=/etc/grafana/provisioning/datasources
2018-08-30T15:41:55.065462-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=eror msg="can't read dashboard provisioning files from directory" logger=provisioning.dashboard path=/etc/grafana/provisioning/dashboards
2018-08-30T15:41:55.065636-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=info msg="Initializing Alerting" logger=alerting.engine
2018-08-30T15:41:55.065779-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=info msg="Initializing CleanUpService" logger=cleanup*
2018-08-30T15:41:55.274779-03:00 torcello systemd[1]: Started Grafana instance.
*2018-08-30T15:41:55.313056-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=info msg="Initializing Stream Manager"
2018-08-30T15:41:55.313251-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=info msg="Initializing HTTP Server" logger=http.server address=0.0.0.0:3000 protocol=http subUrl= socket=*
2018-08-30T15:41:58.304749-03:00 torcello systemd[1]: Started Command Scheduler.
2018-08-30T15:41:58.381694-03:00 torcello systemd[1]: Started The Salt Minion.
2018-08-30T15:41:58.386643-03:00 torcello cron[1611]: (CRON) INFO (RANDOM_DELAY will be scaled with factor 11% if used.)
2018-08-30T15:41:58.396087-03:00 torcello cron[1611]: (CRON) INFO (running with inotify support)
2018-08-30T15:42:06.367096-03:00 torcello systemd[1]: Started Hold until boot process finishes up.
2018-08-30T15:42:06.369301-03:00 torcello systemd[1]: Started Getty on tty1.
2018-08-30T15:42:11.535310-03:00 torcello systemd[1792]: Reached target Paths.
2018-08-30T15:42:11.536128-03:00 torcello systemd[1792]: Starting D-Bus User Message Bus Socket.
2018-08-30T15:42:11.536378-03:00 torcello systemd[1792]: Reached target Timers.
2018-08-30T15:42:11.598968-03:00 torcello systemd[1792]: Listening on D-Bus User Message Bus Socket.
2018-08-30T15:42:11.599151-03:00 torcello systemd[1792]: Reached target Sockets.
2018-08-30T15:42:11.599277-03:00 torcello systemd[1792]: Reached target Basic System.
2018-08-30T15:42:11.599398-03:00 torcello systemd[1792]: Reached target Default.
2018-08-30T15:42:11.599514-03:00 torcello systemd[1792]: Startup finished in 145ms.
2018-08-30T15:42:11.599636-03:00 torcello systemd[1]: Started User Manager for UID 464.
2018-08-30T15:42:12.471869-03:00 torcello systemd[1792]: Started D-Bus User Message Bus.
2018-08-30T15:42:15.898853-03:00 torcello systemd[1]: Starting Disk Manager...
2018-08-30T15:42:15.974641-03:00 torcello systemd[1]: Started Disk Manager.
2018-08-30T15:42:16.897412-03:00 torcello node_exporter[807]: time="2018-08-30T15:42:16-03:00" level=error msg="ERROR: ntp collector failed after 0.000087s: couldn't get SNTP reply: read udp 127.0.0.1:42089->127.0.0.1:123: read: connection refused" source="collector.go:123"
2018-08-30T15:42:17.589461-03:00 torcello chronyd[845]: Selected source 200.189.40.8
2018-08-30T15:43:16.899040-03:00 torcello node_exporter[807]: time="2018-08-30T15:43:16-03:00" level=error msg="ERROR: ntp collector failed after 0.000105s: couldn't get SNTP reply: read udp 127.0.0.1:59525->127.0.0.1:123: read: connection refused" source="collector.go:123"
2018-08-30T15:44:15.496595-03:00 torcello systemd[1792]: Stopped target Default.
2018-08-30T15:44:15.496824-03:00 torcello systemd[1792]: Stopping D-Bus User Message Bus...
2018-08-30T15:44:15.502438-03:00 torcello systemd[1792]: Stopped D-Bus User Message Bus.
2018-08-30T15:44:15.502627-03:00 torcello systemd[1792]: Stopped target Basic System.
2018-08-30T15:44:15.502776-03:00 torcello systemd[1792]: Stopped target Paths.
2018-08-30T15:44:15.502923-03:00 torcello systemd[1792]: Stopped target Timers.
2018-08-30T15:44:15.503062-03:00 torcello systemd[1792]: Stopped target Sockets.
2018-08-30T15:44:15.503200-03:00 torcello systemd[1792]: Closed D-Bus User Message Bus Socket.
2018-08-30T15:44:15.503356-03:00 torcello systemd[1792]: Reached target Shutdown.
2018-08-30T15:44:15.503572-03:00 torcello systemd[1792]: Starting Exit the Session...
2018-08-30T15:44:15.511298-03:00 torcello systemd[2295]: Starting D-Bus User Message Bus Socket.
2018-08-30T15:44:15.511493-03:00 torcello systemd[2295]: Reached target Timers.
2018-08-30T15:44:15.511664-03:00 torcello systemd[2295]: Reached target Paths.
2018-08-30T15:44:15.517873-03:00 torcello systemd[2295]: Listening on D-Bus User Message Bus Socket.
2018-08-30T15:44:15.518060-03:00 torcello systemd[2295]: Reached target Sockets.
2018-08-30T15:44:15.518216-03:00 torcello systemd[2295]: Reached target Basic System.
2018-08-30T15:44:15.518373-03:00 torcello systemd[2295]: Reached target Default.
2018-08-30T15:44:15.518501-03:00 torcello systemd[2295]: Startup finished in 31ms.
2018-08-30T15:44:15.518634-03:00 torcello systemd[1]: Started User Manager for UID 1000.
2018-08-30T15:44:15.518759-03:00 torcello systemd[1792]: Received SIGRTMIN+24 from PID 2300 (kill).
2018-08-30T15:44:15.537634-03:00 torcello systemd[1]: Stopped User Manager for UID 464.
2018-08-30T15:44:15.538422-03:00 torcello systemd[1]: Removed slice User Slice of sddm.
2018-08-30T15:44:15.613246-03:00 torcello systemd[2295]: Started D-Bus User Message Bus.
2018-08-30T15:44:15.623989-03:00 torcello dbus-daemon[2311]: [session uid=1000 pid=2311] Successfully activated service 'org.freedesktop.systemd1'
2018-08-30T15:44:16.447162-03:00 torcello kapplymousetheme[2350]: kcm_input: Using X11 backend
2018-08-30T15:44:16.901642-03:00 torcello node_exporter[807]: time="2018-08-30T15:44:16-03:00" level=error msg="ERROR: ntp collector failed after 0.000205s: couldn't get SNTP reply: read udp 127.0.0.1:53434->127.0.0.1:123: read: connection refused" source="collector.go:123"
################

Any ideas?

Thanks a lot,

Jones
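P.S.: I actually cut the reboot window out by hand in vi and then pruned the unrelated lines, but for anyone wanting to reproduce the extraction, something along these lines should give the same time window (a rough sketch; the two timestamp patterns are just the boundaries of this particular boot):

    sed -n '/^2018-08-30T15:41:46/,/^2018-08-30T15:44:17/p' /var/log/messages > reboot-window.txt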
On Thu, Aug 30, 2018 at 4:14 AM Eugen Block <ebl...@nde.ag> wrote:

> Hi,
>
> > So, it only contains logs concerning the node itself (is it correct?
> > since node01 is also the master, I was expecting it to have logs from
> > the others too) and, moreover, no ceph-osd* files. Also, I'm looking at
> > the logs I have available, and nothing "shines out" (sorry for my poor
> > English) as a possible error.
>
> The logging is not configured to be centralised by default, you would
> have to configure that yourself.
>
> Regarding the OSDs, if there are OSD logs created, they're created on
> the OSD nodes, not on the master. But since the OSD deployment fails,
> there probably are no OSD specific logs yet. So you'll have to take a
> look into the syslog (/var/log/messages), that's where the salt-minion
> reports its attempts to create the OSDs. Chances are high that you'll
> find the root cause in there.
>
> If the output is not enough, set the log-level to debug:
>
> osd-1:~ # grep -E "^log_level" /etc/salt/minion
> log_level: debug
>
> Regards,
> Eugen
>
> Zitat von Jones de Andrade <johanne...@gmail.com>:
>
> > Hi Eugen.
> >
> > Sorry for the delay in answering.
> >
> > Just looked in the /var/log/ceph/ directory. It only contains the
> > following files (for example on node01):
> >
> > #######
> > # ls -lart
> > total 3864
> > -rw------- 1 ceph ceph     904 ago 24 13:11 ceph.audit.log-20180829.xz
> > drwxr-xr-x 1 root root     898 ago 28 10:07 ..
> > -rw-r--r-- 1 ceph ceph  189464 ago 28 23:59 ceph-mon.node01.log-20180829.xz
> > -rw------- 1 ceph ceph   24360 ago 28 23:59 ceph.log-20180829.xz
> > -rw-r--r-- 1 ceph ceph   48584 ago 29 00:00 ceph-mgr.node01.log-20180829.xz
> > -rw------- 1 ceph ceph       0 ago 29 00:00 ceph.audit.log
> > drwxrws--T 1 ceph ceph     352 ago 29 00:00 .
> > -rw-r--r-- 1 ceph ceph 1908122 ago 29 12:46 ceph-mon.node01.log
> > -rw------- 1 ceph ceph  175229 ago 29 12:48 ceph.log
> > -rw-r--r-- 1 ceph ceph 1599920 ago 29 12:49 ceph-mgr.node01.log
> > #######
> >
> > So, it only contains logs concerning the node itself (is it correct?
> > since node01 is also the master, I was expecting it to have logs from
> > the others too) and, moreover, no ceph-osd* files. Also, I'm looking at
> > the logs I have available, and nothing "shines out" (sorry for my poor
> > English) as a possible error.
> >
> > Any suggestion on how to proceed?
> >
> > Thanks a lot in advance,
> >
> > Jones
> >
> > On Mon, Aug 27, 2018 at 5:29 AM Eugen Block <ebl...@nde.ag> wrote:
> >
> >> Hi Jones,
> >>
> >> all ceph logs are in the directory /var/log/ceph/, each daemon has its
> >> own log file, e.g. OSD logs are named ceph-osd.*.
> >>
> >> I haven't tried it but I don't think SUSE Enterprise Storage deploys
> >> OSDs on partitioned disks. Is there a way to attach a second disk to
> >> the OSD nodes, maybe via USB or something?
> >>
> >> Although this thread is ceph related it is referring to a specific
> >> product, so I would recommend to post your question in the SUSE forum
> >> [1].
> >>
> >> Regards,
> >> Eugen
> >>
> >> [1] https://forums.suse.com/forumdisplay.php?99-SUSE-Enterprise-Storage
> >>
> >> Zitat von Jones de Andrade <johanne...@gmail.com>:
> >>
> >> > Hi Eugen.
> >> >
> >> > Thanks for the suggestion. I'll look for the logs (since it's our first
> >> > attempt with ceph, I'll have to discover where they are, but no problem).
> >> >
> >> > One thing called my attention in your response, however:
> >> >
> >> > I hadn't made myself clear, but one of the failures we encountered was
> >> > that the files now containing:
> >> >
> >> > node02:
> >> >     ----------
> >> >     storage:
> >> >         ----------
> >> >         osds:
> >> >             ----------
> >> >             /dev/sda4:
> >> >                 ----------
> >> >                 format:
> >> >                     bluestore
> >> >                 standalone:
> >> >                     True
> >> >
> >> > were originally empty, and we filled them by hand following a model
> >> > found elsewhere on the web. It was necessary so that we could continue,
> >> > but the model indicated that, for example, it should have the path for
> >> > /dev/sda here, not /dev/sda4. We chose to include the specific partition
> >> > identification because we won't have dedicated disks here, rather just
> >> > the very same partition, as all disks were partitioned exactly the same.
> >> >
> >> > While that was enough for the procedure to continue at that point, now I
> >> > wonder if it was the right call and, if it indeed was, if it was done
> >> > properly. As such, I wonder: what do you mean by "wipe" the partition
> >> > here? /dev/sda4 is created, but is both empty and unmounted: should a
> >> > different operation be performed on it, should I remove it first, or
> >> > should I have written the files above with only /dev/sda as the target?
> >> >
> >> > I know that I probably wouldn't run into these issues with dedicated
> >> > disks, but unfortunately that is absolutely not an option.
> >> >
> >> > Thanks a lot in advance for any comments and/or extra suggestions.
> >> >
> >> > Sincerely yours,
> >> >
> >> > Jones
> >> >
> >> > On Sat, Aug 25, 2018 at 5:46 PM Eugen Block <ebl...@nde.ag> wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> take a look into the logs, they should point you in the right
> >> >> direction. Since the deployment stage fails at the OSD level, start
> >> >> with the OSD logs. Something's not right with the disks/partitions,
> >> >> did you wipe the partition from previous attempts?
> >> >>
> >> >> Regards,
> >> >> Eugen
> >> >>
> >> >> Zitat von Jones de Andrade <johanne...@gmail.com>:
> >> >>
> >> >>> (Please forgive my previous email: I was using another message and
> >> >>> completely forgot to update the subject.)
> >> >>>
> >> >>> Hi all.
> >> >>>
> >> >>> I'm new to ceph, and after having serious problems in ceph stages 0,
> >> >>> 1 and 2 that I could solve myself, now it seems that I have hit a
> >> >>> wall harder than my head. :)
> >> >>>
> >> >>> When I run salt-run state.orch ceph.stage.deploy and monitor it, I
> >> >>> see it going up to here:
> >> >>>
> >> >>> #######
> >> >>> [14/71] ceph.sysctl on
> >> >>>         node01....................................... ✓ (0.5s)
> >> >>>         node02........................................ ✓ (0.7s)
> >> >>>         node03....................................... ✓ (0.6s)
> >> >>>         node04......................................... ✓ (0.5s)
> >> >>>         node05....................................... ✓ (0.6s)
> >> >>>         node06.......................................... ✓ (0.5s)
> >> >>>
> >> >>> [15/71] ceph.osd on
> >> >>>         node01...................................... ❌ (0.7s)
> >> >>>         node02........................................ ❌ (0.7s)
> >> >>>         node03....................................... ❌ (0.7s)
> >> >>>         node04......................................... ❌ (0.6s)
> >> >>>         node05....................................... ❌ (0.6s)
> >> >>>         node06.......................................... ❌ (0.7s)
> >> >>>
> >> >>> Ended stage: ceph.stage.deploy succeeded=14/71 failed=1/71 time=624.7s
> >> >>>
> >> >>> Failures summary:
> >> >>>
> >> >>> ceph.osd (/srv/salt/ceph/osd):
> >> >>>   node02:
> >> >>>     deploy OSDs: Module function osd.deploy threw an exception.
> >> >>>     Exception: Mine on node02 for cephdisks.list
> >> >>>   node03:
> >> >>>     deploy OSDs: Module function osd.deploy threw an exception.
> >> >>>     Exception: Mine on node03 for cephdisks.list
> >> >>>   node01:
> >> >>>     deploy OSDs: Module function osd.deploy threw an exception.
> >> >>>     Exception: Mine on node01 for cephdisks.list
> >> >>>   node04:
> >> >>>     deploy OSDs: Module function osd.deploy threw an exception.
> >> >>>     Exception: Mine on node04 for cephdisks.list
> >> >>>   node05:
> >> >>>     deploy OSDs: Module function osd.deploy threw an exception.
> >> >>>     Exception: Mine on node05 for cephdisks.list
> >> >>>   node06:
> >> >>>     deploy OSDs: Module function osd.deploy threw an exception.
> >> >>>     Exception: Mine on node06 for cephdisks.list
> >> >>> #######
> >> >>>
> >> >>> Since this is a first attempt on 6 simple test machines, we are going
> >> >>> to put the mon, osds, etc. on all nodes at first. Only the master is
> >> >>> left on a single machine (node01) for now.
> >> >>>
> >> >>> As they are simple machines, they have a single HDD, which is
> >> >>> partitioned as follows (the sda4 partition is unmounted and left for
> >> >>> the ceph system):
> >> >>>
> >> >>> ###########
> >> >>> # lsblk
> >> >>> NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
> >> >>> sda      8:0    0 465,8G  0 disk
> >> >>> ├─sda1   8:1    0   500M  0 part /boot/efi
> >> >>> ├─sda2   8:2    0    16G  0 part [SWAP]
> >> >>> ├─sda3   8:3    0  49,3G  0 part /
> >> >>> └─sda4   8:4    0   400G  0 part
> >> >>> sr0     11:0    1   3,7G  0 rom
> >> >>>
> >> >>> # salt -I 'roles:storage' cephdisks.list
> >> >>> node01:
> >> >>> node02:
> >> >>> node03:
> >> >>> node04:
> >> >>> node05:
> >> >>> node06:
> >> >>>
> >> >>> # salt -I 'roles:storage' pillar.get ceph
> >> >>> node02:
> >> >>>     ----------
> >> >>>     storage:
> >> >>>         ----------
> >> >>>         osds:
> >> >>>             ----------
> >> >>>             /dev/sda4:
> >> >>>                 ----------
> >> >>>                 format:
> >> >>>                     bluestore
> >> >>>                 standalone:
> >> >>>                     True
> >> >>> (and so on for all 6 machines)
> >> >>> ##########
> >> >>>
> >> >>> Finally, and just in case, my policy.cfg file reads:
> >> >>>
> >> >>> #########
> >> >>> #cluster-unassigned/cluster/*.sls
> >> >>> cluster-ceph/cluster/*.sls
> >> >>> profile-default/cluster/*.sls
> >> >>> profile-default/stack/default/ceph/minions/*yml
> >> >>> config/stack/default/global.yml
> >> >>> config/stack/default/ceph/cluster.yml
> >> >>> role-master/cluster/node01.sls
> >> >>> role-admin/cluster/*.sls
> >> >>> role-mon/cluster/*.sls
> >> >>> role-mgr/cluster/*.sls
> >> >>> role-mds/cluster/*.sls
> >> >>> role-ganesha/cluster/*.sls
> >> >>> role-client-nfs/cluster/*.sls
> >> >>> role-client-cephfs/cluster/*.sls
> >> >>> ##########
> >> >>>
> >> >>> Please, could someone help me and shed some light on this issue?
> >> >>>
> >> >>> Thanks a lot in advance,
> >> >>>
> >> >>> Regards,
> >> >>>
> >> >>> Jones
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com