Hi Eugen.

OK: I edited the file /etc/salt/minion, uncommented the "log_level_logfile" line and set it to the "debug" level.
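For reference, the relevant line in /etc/salt/minion now reads as below (if I understood the docs right, the extra output should land in /var/log/salt/minion):

    log_level_logfile: debug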
Then I turned the computer off, waited a few minutes so that the time frame would stand out in /var/log/messages, and restarted it. Using vi I cut out the reboot section (a sketch of an equivalent command is in the P.S. below), and from that I also removed most of what seemed totally unrelated to ceph, salt, minions, grafana, prometheus and so on. I got the lines below. As far as I can see, it does not complain about anything. :(

################
2018-08-30T15:41:46.455383-03:00 torcello systemd[1]: systemd 234 running in system mode. (+PAM -AUDIT +SELINUX -IMA +APPARMOR -SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT -GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID -ELFUTILS +KMOD -IDN2 -IDN default-hierarchy=hybrid)
2018-08-30T15:41:46.456330-03:00 torcello systemd[1]: Detected architecture x86-64.
2018-08-30T15:41:46.456350-03:00 torcello systemd[1]: nss-lookup.target: Dependency Before=nss-lookup.target dropped
2018-08-30T15:41:46.456357-03:00 torcello systemd[1]: Started Load Kernel Modules.
2018-08-30T15:41:46.456369-03:00 torcello systemd[1]: Starting Apply Kernel Variables...
2018-08-30T15:41:46.457230-03:00 torcello systemd[1]: Started Alertmanager for prometheus.
2018-08-30T15:41:46.457237-03:00 torcello systemd[1]: Started Monitoring system and time series database.
2018-08-30T15:41:46.457403-03:00 torcello systemd[1]: Starting NTP client/server...
*2018-08-30T15:41:46.457425-03:00 torcello systemd[1]: Started Prometheus exporter for machine metrics.
2018-08-30T15:41:46.457706-03:00 torcello prometheus[695]: level=info ts=2018-08-30T18:41:44.797896888Z caller=main.go:225 msg="Starting Prometheus" version="(version=2.1.0, branch=non-git, revision=non-git)"
2018-08-30T15:41:46.457712-03:00 torcello prometheus[695]: level=info ts=2018-08-30T18:41:44.797969232Z caller=main.go:226 build_context="(go=go1.9.4, user=abuild@lamb69, date=20180513-03:46:03)"
2018-08-30T15:41:46.457719-03:00 torcello prometheus[695]: level=info ts=2018-08-30T18:41:44.798008802Z caller=main.go:227 host_details="(Linux 4.12.14-lp150.12.4-default #1 SMP Tue May 22 05:17:22 UTC 2018 (66b2eda) x86_64 torcello (none))"
2018-08-30T15:41:46.457726-03:00 torcello prometheus[695]: level=info ts=2018-08-30T18:41:44.798044088Z caller=main.go:228 fd_limits="(soft=1024, hard=4096)"
2018-08-30T15:41:46.457738-03:00 torcello prometheus[695]: level=info ts=2018-08-30T18:41:44.802067189Z caller=web.go:383 component=web msg="Start listening for connections" address=0.0.0.0:9090
2018-08-30T15:41:46.457745-03:00 torcello prometheus[695]: level=info ts=2018-08-30T18:41:44.802037354Z caller=main.go:499 msg="Starting TSDB ..."*
2018-08-30T15:41:46.458145-03:00 torcello smartd[809]: Monitoring 1 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices
2018-08-30T15:41:46.458321-03:00 torcello systemd[1]: Started NTP client/server.
*2018-08-30T15:41:50.387157-03:00 torcello ceph_exporter[690]: 2018/08/30 15:41:50 Starting ceph exporter on ":9128"*
2018-08-30T15:41:52.658272-03:00 torcello wicked[905]: lo up
2018-08-30T15:41:52.658738-03:00 torcello wicked[905]: eth0 up
2018-08-30T15:41:52.659989-03:00 torcello systemd[1]: Started wicked managed network interfaces.
2018-08-30T15:41:52.660514-03:00 torcello systemd[1]: Reached target Network.
2018-08-30T15:41:52.667938-03:00 torcello systemd[1]: Starting OpenSSH Daemon...
2018-08-30T15:41:52.668292-03:00 torcello systemd[1]: Reached target Network is Online.
*2018-08-30T15:41:52.669132-03:00 torcello systemd[1]: Started Ceph cluster monitor daemon.
2018-08-30T15:41:52.669328-03:00 torcello systemd[1]: Reached target ceph target allowing to start/stop all ceph-mon@.service instances at once.
2018-08-30T15:41:52.670346-03:00 torcello systemd[1]: Started Ceph cluster manager daemon.
2018-08-30T15:41:52.670565-03:00 torcello systemd[1]: Reached target ceph target allowing to start/stop all ceph-mgr@.service instances at once.
2018-08-30T15:41:52.670839-03:00 torcello systemd[1]: Reached target ceph target allowing to start/stop all ceph*@.service instances at once.*
2018-08-30T15:41:52.671246-03:00 torcello systemd[1]: Starting Login and scanning of iSCSI devices...
*2018-08-30T15:41:52.672402-03:00 torcello systemd[1]: Starting Grafana instance...*
2018-08-30T15:41:52.678922-03:00 torcello systemd[1]: Started Backup of /etc/sysconfig.
2018-08-30T15:41:52.679109-03:00 torcello systemd[1]: Reached target Timers.
*2018-08-30T15:41:52.679630-03:00 torcello systemd[1]: Started The Salt API.*
2018-08-30T15:41:52.692944-03:00 torcello systemd[1]: Starting Postfix Mail Transport Agent...
*2018-08-30T15:41:52.694687-03:00 torcello systemd[1]: Started The Salt Master Server.*
*2018-08-30T15:41:52.696821-03:00 torcello systemd[1]: Starting The Salt Minion...*
2018-08-30T15:41:52.772750-03:00 torcello sshd-gen-keys-start[1408]: Checking for missing server keys in /etc/ssh
2018-08-30T15:41:52.818695-03:00 torcello iscsiadm[1412]: iscsiadm: No records found
2018-08-30T15:41:52.819541-03:00 torcello systemd[1]: Started Login and scanning of iSCSI devices.
2018-08-30T15:41:52.820214-03:00 torcello systemd[1]: Reached target Remote File Systems.
2018-08-30T15:41:52.821418-03:00 torcello systemd[1]: Starting Permit User Sessions...
2018-08-30T15:41:53.045278-03:00 torcello systemd[1]: Started Permit User Sessions.
2018-08-30T15:41:53.048482-03:00 torcello systemd[1]: Starting Hold until boot process finishes up...
2018-08-30T15:41:53.054461-03:00 torcello echo[1415]: Starting mail service (Postfix)
2018-08-30T15:41:53.447390-03:00 torcello sshd[1431]: Server listening on 0.0.0.0 port 22.
2018-08-30T15:41:53.447685-03:00 torcello sshd[1431]: Server listening on :: port 22.
2018-08-30T15:41:53.447907-03:00 torcello systemd[1]: Started OpenSSH Daemon.
*2018-08-30T15:41:54.519192-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Starting Grafana" logger=server version=5.1.3 commit=NA compiled=2018-08-30T15:41:53-0300
2018-08-30T15:41:54.519664-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Config loaded from" logger=settings file=/usr/share/grafana/conf/defaults.ini
2018-08-30T15:41:54.519979-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Config loaded from" logger=settings file=/etc/grafana/grafana.ini
2018-08-30T15:41:54.520257-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Config overridden from command line" logger=settings arg="default.paths.data=/var/lib/grafana"
2018-08-30T15:41:54.520546-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Config overridden from command line" logger=settings arg="default.paths.logs=/var/log/grafana"
2018-08-30T15:41:54.520823-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Config overridden from command line" logger=settings arg="default.paths.plugins=/var/lib/grafana/plugins"
2018-08-30T15:41:54.521085-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Config overridden from command line" logger=settings arg="default.paths.provisioning=/etc/grafana/provisioning"
2018-08-30T15:41:54.521343-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Path Home" logger=settings path=/usr/share/grafana
2018-08-30T15:41:54.521593-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Path Data" logger=settings path=/var/lib/grafana
2018-08-30T15:41:54.521843-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Path Logs" logger=settings path=/var/log/grafana
2018-08-30T15:41:54.522108-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Path Plugins" logger=settings path=/var/lib/grafana/plugins
2018-08-30T15:41:54.522361-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Path Provisioning" logger=settings path=/etc/grafana/provisioning
2018-08-30T15:41:54.522611-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="App mode production" logger=settings
2018-08-30T15:41:54.522885-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Writing PID file" logger=server path=/var/run/grafana/grafana-server.pid pid=1413*
*2018-08-30T15:41:54.523148-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Initializing DB" logger=sqlstore dbtype=sqlite3
2018-08-30T15:41:54.523398-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Starting DB migration" logger=migrator
2018-08-30T15:41:54.804052-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Executing migration" logger=migrator id="copy data account to org"
2018-08-30T15:41:54.804423-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Skipping migration condition not fulfilled" logger=migrator id="copy data account to org"
2018-08-30T15:41:54.804724-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Executing migration" logger=migrator id="copy data account_user to org_user"
2018-08-30T15:41:54.804985-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Skipping migration condition not fulfilled" logger=migrator id="copy data account_user to org_user"
2018-08-30T15:41:54.838327-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:54-0300 lvl=info msg="Starting plugin search" logger=plugins*
2018-08-30T15:41:54.947408-03:00 torcello systemd[1]: Starting Locale Service...
2018-08-30T15:41:54.979069-03:00 torcello systemd[1]: Started Locale Service.
*2018-08-30T15:41:55.023859-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=info msg="Registering plugin" logger=plugins name=Discrete
2018-08-30T15:41:55.028462-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=info msg="Registering plugin" logger=plugins name=Monasca
2018-08-30T15:41:55.065227-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=eror msg="can't read datasource provisioning files from directory" logger=provisioning.datasources path=/etc/grafana/provisioning/datasources
2018-08-30T15:41:55.065462-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=eror msg="can't read dashboard provisioning files from directory" logger=provisioning.dashboard path=/etc/grafana/provisioning/dashboards
2018-08-30T15:41:55.065636-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=info msg="Initializing Alerting" logger=alerting.engine
2018-08-30T15:41:55.065779-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=info msg="Initializing CleanUpService" logger=cleanup*
2018-08-30T15:41:55.274779-03:00 torcello systemd[1]: Started Grafana instance.
*2018-08-30T15:41:55.313056-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=info msg="Initializing Stream Manager"
2018-08-30T15:41:55.313251-03:00 torcello grafana-server[1413]: t=2018-08-30T15:41:55-0300 lvl=info msg="Initializing HTTP Server" logger=http.server address=0.0.0.0:3000 protocol=http subUrl= socket=*
2018-08-30T15:41:58.304749-03:00 torcello systemd[1]: Started Command Scheduler.
2018-08-30T15:41:58.381694-03:00 torcello systemd[1]: Started The Salt Minion.
2018-08-30T15:41:58.386643-03:00 torcello cron[1611]: (CRON) INFO (RANDOM_DELAY will be scaled with factor 11% if used.)
2018-08-30T15:41:58.396087-03:00 torcello cron[1611]: (CRON) INFO (running with inotify support)
2018-08-30T15:42:06.367096-03:00 torcello systemd[1]: Started Hold until boot process finishes up.
2018-08-30T15:42:06.369301-03:00 torcello systemd[1]: Started Getty on tty1.
2018-08-30T15:42:11.535310-03:00 torcello systemd[1792]: Reached target Paths.
2018-08-30T15:42:11.536128-03:00 torcello systemd[1792]: Starting D-Bus User Message Bus Socket.
2018-08-30T15:42:11.536378-03:00 torcello systemd[1792]: Reached target Timers.
2018-08-30T15:42:11.598968-03:00 torcello systemd[1792]: Listening on D-Bus User Message Bus Socket.
2018-08-30T15:42:11.599151-03:00 torcello systemd[1792]: Reached target Sockets.
2018-08-30T15:42:11.599277-03:00 torcello systemd[1792]: Reached target Basic System.
2018-08-30T15:42:11.599398-03:00 torcello systemd[1792]: Reached target Default.
2018-08-30T15:42:11.599514-03:00 torcello systemd[1792]: Startup finished in 145ms.
2018-08-30T15:42:11.599636-03:00 torcello systemd[1]: Started User Manager for UID 464.
2018-08-30T15:42:12.471869-03:00 torcello systemd[1792]: Started D-Bus User Message Bus.
2018-08-30T15:42:15.898853-03:00 torcello systemd[1]: Starting Disk Manager...
2018-08-30T15:42:15.974641-03:00 torcello systemd[1]: Started Disk Manager.
2018-08-30T15:42:16.897412-03:00 torcello node_exporter[807]: time="2018-08-30T15:42:16-03:00" level=error msg="ERROR: ntp collector failed after 0.000087s: couldn't get SNTP reply: read udp 127.0.0.1:42089->127.0.0.1:123: read: connection refused" source="collector.go:123"
2018-08-30T15:42:17.589461-03:00 torcello chronyd[845]: Selected source 200.189.40.8
2018-08-30T15:43:16.899040-03:00 torcello node_exporter[807]: time="2018-08-30T15:43:16-03:00" level=error msg="ERROR: ntp collector failed after 0.000105s: couldn't get SNTP reply: read udp 127.0.0.1:59525->127.0.0.1:123: read: connection refused" source="collector.go:123"
2018-08-30T15:44:15.496595-03:00 torcello systemd[1792]: Stopped target Default.
2018-08-30T15:44:15.496824-03:00 torcello systemd[1792]: Stopping D-Bus User Message Bus...
2018-08-30T15:44:15.502438-03:00 torcello systemd[1792]: Stopped D-Bus User Message Bus.
2018-08-30T15:44:15.502627-03:00 torcello systemd[1792]: Stopped target Basic System.
2018-08-30T15:44:15.502776-03:00 torcello systemd[1792]: Stopped target Paths.
2018-08-30T15:44:15.502923-03:00 torcello systemd[1792]: Stopped target Timers.
2018-08-30T15:44:15.503062-03:00 torcello systemd[1792]: Stopped target Sockets.
2018-08-30T15:44:15.503200-03:00 torcello systemd[1792]: Closed D-Bus User Message Bus Socket.
2018-08-30T15:44:15.503356-03:00 torcello systemd[1792]: Reached target Shutdown.
2018-08-30T15:44:15.503572-03:00 torcello systemd[1792]: Starting Exit the Session...
2018-08-30T15:44:15.511298-03:00 torcello systemd[2295]: Starting D-Bus User Message Bus Socket.
2018-08-30T15:44:15.511493-03:00 torcello systemd[2295]: Reached target Timers.
2018-08-30T15:44:15.511664-03:00 torcello systemd[2295]: Reached target Paths.
2018-08-30T15:44:15.517873-03:00 torcello systemd[2295]: Listening on D-Bus User Message Bus Socket.
2018-08-30T15:44:15.518060-03:00 torcello systemd[2295]: Reached target Sockets.
2018-08-30T15:44:15.518216-03:00 torcello systemd[2295]: Reached target Basic System.
2018-08-30T15:44:15.518373-03:00 torcello systemd[2295]: Reached target Default.
2018-08-30T15:44:15.518501-03:00 torcello systemd[2295]: Startup finished in 31ms.
2018-08-30T15:44:15.518634-03:00 torcello systemd[1]: Started User Manager for UID 1000.
2018-08-30T15:44:15.518759-03:00 torcello systemd[1792]: Received SIGRTMIN+24 from PID 2300 (kill).
2018-08-30T15:44:15.537634-03:00 torcello systemd[1]: Stopped User Manager for UID 464.
2018-08-30T15:44:15.538422-03:00 torcello systemd[1]: Removed slice User Slice of sddm.
2018-08-30T15:44:15.613246-03:00 torcello systemd[2295]: Started D-Bus User Message Bus.
2018-08-30T15:44:15.623989-03:00 torcello dbus-daemon[2311]: [session uid=1000 pid=2311] Successfully activated service 'org.freedesktop.systemd1'
2018-08-30T15:44:16.447162-03:00 torcello kapplymousetheme[2350]: kcm_input: Using X11 backend
2018-08-30T15:44:16.901642-03:00 torcello node_exporter[807]: time="2018-08-30T15:44:16-03:00" level=error msg="ERROR: ntp collector failed after 0.000205s: couldn't get SNTP reply: read udp 127.0.0.1:53434->127.0.0.1:123: read: connection refused" source="collector.go:123"
################

Any ideas?

Thanks a lot,

Jones
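P.S.: I actually cut the reboot window out by hand in vi and then pruned the unrelated lines, but for anyone wanting to reproduce the extraction, something along these lines should give the same time window (a rough sketch; the two timestamp patterns are just the boundaries of this particular boot):

    sed -n '/^2018-08-30T15:41:46/,/^2018-08-30T15:44:17/p' /var/log/messages > reboot-window.txt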
On Thu, Aug 30, 2018 at 4:14 AM Eugen Block <ebl...@nde.ag> wrote:

> Hi,
>
> > So, it only contains logs concerning the node itself (is it correct?
> > since node01 is also the master, I was expecting it to have logs from
> > the others too) and, moreover, no ceph-osd* files. Also, I'm looking at
> > the logs I have available, and nothing "shines out" (sorry for my poor
> > English) as a possible error.
>
> The logging is not configured to be centralised by default, you would
> have to configure that yourself.
>
> Regarding the OSDs, if there are OSD logs created, they're created on
> the OSD nodes, not on the master. But since the OSD deployment fails,
> there probably are no OSD specific logs yet. So you'll have to take a
> look into the syslog (/var/log/messages), that's where the salt-minion
> reports its attempts to create the OSDs. Chances are high that you'll
> find the root cause in there.
>
> If the output is not enough, set the log-level to debug:
>
> osd-1:~ # grep -E "^log_level" /etc/salt/minion
> log_level: debug
>
> Regards,
> Eugen
>
> Zitat von Jones de Andrade <johanne...@gmail.com>:
>
> > Hi Eugen.
> >
> > Sorry for the delay in answering.
> >
> > Just looked in the /var/log/ceph/ directory. It only contains the
> > following files (for example on node01):
> >
> > #######
> > # ls -lart
> > total 3864
> > -rw------- 1 ceph ceph     904 ago 24 13:11 ceph.audit.log-20180829.xz
> > drwxr-xr-x 1 root root     898 ago 28 10:07 ..
> > -rw-r--r-- 1 ceph ceph  189464 ago 28 23:59 ceph-mon.node01.log-20180829.xz
> > -rw------- 1 ceph ceph   24360 ago 28 23:59 ceph.log-20180829.xz
> > -rw-r--r-- 1 ceph ceph   48584 ago 29 00:00 ceph-mgr.node01.log-20180829.xz
> > -rw------- 1 ceph ceph       0 ago 29 00:00 ceph.audit.log
> > drwxrws--T 1 ceph ceph     352 ago 29 00:00 .
> > -rw-r--r-- 1 ceph ceph 1908122 ago 29 12:46 ceph-mon.node01.log
> > -rw------- 1 ceph ceph  175229 ago 29 12:48 ceph.log
> > -rw-r--r-- 1 ceph ceph 1599920 ago 29 12:49 ceph-mgr.node01.log
> > #######
> >
> > So, it only contains logs concerning the node itself (is it correct?
> > since node01 is also the master, I was expecting it to have logs from
> > the others too) and, moreover, no ceph-osd* files. Also, I'm looking at
> > the logs I have available, and nothing "shines out" (sorry for my poor
> > English) as a possible error.
> >
> > Any suggestion on how to proceed?
> >
> > Thanks a lot in advance,
> >
> > Jones
> >
> > On Mon, Aug 27, 2018 at 5:29 AM Eugen Block <ebl...@nde.ag> wrote:
> >
> >> Hi Jones,
> >>
> >> all ceph logs are in the directory /var/log/ceph/, each daemon has its
> >> own log file, e.g. OSD logs are named ceph-osd.*.
> >>
> >> I haven't tried it but I don't think SUSE Enterprise Storage deploys
> >> OSDs on partitioned disks. Is there a way to attach a second disk to
> >> the OSD nodes, maybe via USB or something?
> >>
> >> Although this thread is ceph related it is referring to a specific
> >> product, so I would recommend to post your question in the SUSE forum
> >> [1].
> >>
> >> Regards,
> >> Eugen
> >>
> >> [1] https://forums.suse.com/forumdisplay.php?99-SUSE-Enterprise-Storage
> >>
> >> Zitat von Jones de Andrade <johanne...@gmail.com>:
> >>
> >> > Hi Eugen.
> >> >
> >> > Thanks for the suggestion. I'll look for the logs (since it's our first
> >> > attempt with ceph, I'll have to discover where they are, but no problem).
> >> >
> >> > One thing called my attention in your response, however:
> >> >
> >> > I hadn't made myself clear, but one of the failures we encountered was
> >> > that the files now containing:
> >> >
> >> > node02:
> >> >     ----------
> >> >     storage:
> >> >         ----------
> >> >         osds:
> >> >             ----------
> >> >             /dev/sda4:
> >> >                 ----------
> >> >                 format:
> >> >                     bluestore
> >> >                 standalone:
> >> >                     True
> >> >
> >> > were originally empty, and we filled them by hand following a model
> >> > found elsewhere on the web. It was necessary so that we could continue,
> >> > but the model indicated that, for example, it should have the path for
> >> > /dev/sda here, not /dev/sda4. We chose to include the specific partition
> >> > identification because we won't have dedicated disks here, rather just
> >> > the very same partition, as all disks were partitioned exactly the same.
> >> >
> >> > While that was enough for the procedure to continue at that point, now I
> >> > wonder if it was the right call and, if it indeed was, if it was done
> >> > properly. As such, I wonder: what do you mean by "wipe" the partition
> >> > here? /dev/sda4 is created, but is both empty and unmounted: should a
> >> > different operation be performed on it, should I remove it first, or
> >> > should I have written the files above with only /dev/sda as the target?
> >> >
> >> > I know that I probably wouldn't run into these issues with dedicated
> >> > disks, but unfortunately that is absolutely not an option.
> >> >
> >> > Thanks a lot in advance for any comments and/or extra suggestions.
> >> >
> >> > Sincerely yours,
> >> >
> >> > Jones
> >> >
> >> > On Sat, Aug 25, 2018 at 5:46 PM Eugen Block <ebl...@nde.ag> wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> take a look into the logs, they should point you in the right
> >> >> direction. Since the deployment stage fails at the OSD level, start
> >> >> with the OSD logs. Something's not right with the disks/partitions,
> >> >> did you wipe the partition from previous attempts?
> >> >>
> >> >> Regards,
> >> >> Eugen
> >> >>
> >> >> Zitat von Jones de Andrade <johanne...@gmail.com>:
> >> >>
> >> >>> (Please forgive my previous email: I was using another message and
> >> >>> completely forgot to update the subject.)
> >> >>>
> >> >>> Hi all.
> >> >>>
> >> >>> I'm new to ceph, and after having serious problems in ceph stages 0,
> >> >>> 1 and 2 that I could solve myself, now it seems that I have hit a
> >> >>> wall harder than my head. :)
> >> >>>
> >> >>> When I run salt-run state.orch ceph.stage.deploy and monitor it, I
> >> >>> see it going up to here:
> >> >>>
> >> >>> #######
> >> >>> [14/71] ceph.sysctl on
> >> >>>         node01....................................... ✓ (0.5s)
> >> >>>         node02........................................ ✓ (0.7s)
> >> >>>         node03....................................... ✓ (0.6s)
> >> >>>         node04......................................... ✓ (0.5s)
> >> >>>         node05....................................... ✓ (0.6s)
> >> >>>         node06.......................................... ✓ (0.5s)
> >> >>>
> >> >>> [15/71] ceph.osd on
> >> >>>         node01...................................... ❌ (0.7s)
> >> >>>         node02........................................ ❌ (0.7s)
> >> >>>         node03....................................... ❌ (0.7s)
> >> >>>         node04......................................... ❌ (0.6s)
> >> >>>         node05....................................... ❌ (0.6s)
> >> >>>         node06.......................................... ❌ (0.7s)
> >> >>>
> >> >>> Ended stage: ceph.stage.deploy succeeded=14/71 failed=1/71 time=624.7s
> >> >>>
> >> >>> Failures summary:
> >> >>>
> >> >>> ceph.osd (/srv/salt/ceph/osd):
> >> >>>   node02:
> >> >>>     deploy OSDs: Module function osd.deploy threw an exception.
> >> >>>     Exception: Mine on node02 for cephdisks.list
> >> >>>   node03:
> >> >>>     deploy OSDs: Module function osd.deploy threw an exception.
> >> >>>     Exception: Mine on node03 for cephdisks.list
> >> >>>   node01:
> >> >>>     deploy OSDs: Module function osd.deploy threw an exception.
> >> >>>     Exception: Mine on node01 for cephdisks.list
> >> >>>   node04:
> >> >>>     deploy OSDs: Module function osd.deploy threw an exception.
> >> >>>     Exception: Mine on node04 for cephdisks.list
> >> >>>   node05:
> >> >>>     deploy OSDs: Module function osd.deploy threw an exception.
> >> >>>     Exception: Mine on node05 for cephdisks.list
> >> >>>   node06:
> >> >>>     deploy OSDs: Module function osd.deploy threw an exception.
> >> >>>     Exception: Mine on node06 for cephdisks.list
> >> >>> #######
> >> >>>
> >> >>> Since this is a first attempt on 6 simple test machines, we are going
> >> >>> to put the mon, osds, etc. on all nodes at first. Only the master is
> >> >>> left on a single machine (node01) for now.
> >> >>>
> >> >>> As they are simple machines, they have a single HDD, which is
> >> >>> partitioned as follows (the sda4 partition is unmounted and left for
> >> >>> the ceph system):
> >> >>>
> >> >>> ###########
> >> >>> # lsblk
> >> >>> NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
> >> >>> sda      8:0    0 465,8G  0 disk
> >> >>> ├─sda1   8:1    0   500M  0 part /boot/efi
> >> >>> ├─sda2   8:2    0    16G  0 part [SWAP]
> >> >>> ├─sda3   8:3    0  49,3G  0 part /
> >> >>> └─sda4   8:4    0   400G  0 part
> >> >>> sr0     11:0    1   3,7G  0 rom
> >> >>>
> >> >>> # salt -I 'roles:storage' cephdisks.list
> >> >>> node01:
> >> >>> node02:
> >> >>> node03:
> >> >>> node04:
> >> >>> node05:
> >> >>> node06:
> >> >>>
> >> >>> # salt -I 'roles:storage' pillar.get ceph
> >> >>> node02:
> >> >>>     ----------
> >> >>>     storage:
> >> >>>         ----------
> >> >>>         osds:
> >> >>>             ----------
> >> >>>             /dev/sda4:
> >> >>>                 ----------
> >> >>>                 format:
> >> >>>                     bluestore
> >> >>>                 standalone:
> >> >>>                     True
> >> >>> (and so on for all 6 machines)
> >> >>> ##########
> >> >>>
> >> >>> Finally, and just in case, my policy.cfg file reads:
> >> >>>
> >> >>> #########
> >> >>> #cluster-unassigned/cluster/*.sls
> >> >>> cluster-ceph/cluster/*.sls
> >> >>> profile-default/cluster/*.sls
> >> >>> profile-default/stack/default/ceph/minions/*yml
> >> >>> config/stack/default/global.yml
> >> >>> config/stack/default/ceph/cluster.yml
> >> >>> role-master/cluster/node01.sls
> >> >>> role-admin/cluster/*.sls
> >> >>> role-mon/cluster/*.sls
> >> >>> role-mgr/cluster/*.sls
> >> >>> role-mds/cluster/*.sls
> >> >>> role-ganesha/cluster/*.sls
> >> >>> role-client-nfs/cluster/*.sls
> >> >>> role-client-cephfs/cluster/*.sls
> >> >>> ##########
> >> >>>
> >> >>> Please, could someone help me and shed some light on this issue?
> >> >>>
> >> >>> Thanks a lot in advance,
> >> >>>
> >> >>> Regards,
> >> >>>
> >> >>> Jones
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com