Hi community, 10 months ago we discovered an issue after removing a cache tier
from a healthy cluster, and started an email thread; as a result, a new
bug was created on the tracker by Samuel Just
http://tracker.ceph.com/issues/12738
Since then, I have been looking for a good moment to upgrade (after the fix
Voloshanenko Igor <igor.voloshane...@gmail.com>:
> Wido, that's the main issue. No records at all...
>
>
> So, from last time:
>
>
> 2015-11-02 11:40:33,204 DEBUG [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-2:null) Executing: /bin/bash -c free|grep
Wido den Hollander <w...@42on.com>:
>
>
> On 03-11-15 10:04, Voloshanenko Igor wrote:
> > Wido, also a minor issue with 0.2.0 java-rados
> >
>
> Did you also re-compile CloudStack against the new rados-java? I still
> think it's related to when the Agent starts c
... but I can't find any
bad code there (((
2015-11-03 10:40 GMT+02:00 Wido den Hollander <w...@42on.com>:
>
>
> On 03-11-15 01:54, Voloshanenko Igor wrote:
> > Thank you, Jason!
> >
> > Any advice for troubleshooting?
> >
> > I'm looking into the code, and righ
Dear all, can anybody help?
2015-10-30 10:37 GMT+02:00 Voloshanenko Igor <igor.voloshane...@gmail.com>:
> It's a pain, but no... :(
> We already used your updated lib in the dev env... :(
>
> 2015-10-30 10:06 GMT+02:00 Wido den Hollander <w...@42on.com>:
>
>>
>
f. The most likely problem is that the RADOS IO
> context is being closed prior to closing the RBD image.
>
> --
>
> Jason Dillaman
>
>
> - Original Message -
>
> > From: "Voloshanenko Igor" <igor.voloshane...@gmail.com>
> > To: "Ceph
It's a pain, but no... :(
We already used your updated lib in the dev env... :(
2015-10-30 10:06 GMT+02:00 Wido den Hollander <w...@42on.com>:
>
>
> On 29-10-15 16:38, Voloshanenko Igor wrote:
> > Hi Wido and all community.
> >
> > We caught a very idiotic issue on o
Hi Wido and all community.
We caught a very idiotic issue on our CloudStack installation, which is related
to Ceph and possibly to the java-rados lib.
So, we constantly have the agent crashing (which causes a very big problem for
us...).
When the agent crashes, it crashes the JVM. And there is no event in the logs at all.
We
From all we analyzed, it looks like it's this issue:
http://tracker.ceph.com/issues/13045
PR: https://github.com/ceph/ceph/pull/6097
Can anyone help us to confirm this? :)
2015-10-29 23:13 GMT+02:00 Voloshanenko Igor <igor.voloshane...@gmail.com>:
> Additional trace:
>
> #0
e.c:312
#12 0x7f30f995547d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
2015-10-29 17:38 GMT+02:00 Voloshanenko Igor <igor.voloshane...@gmail.com>:
> Hi Wido and all community.
>
> > We caught a very idiotic issue on our CloudStack installation, which
> > relate
Great!
Yes, the behaviour is exactly as I described. So it looks like that's the root cause )
Thank you, Sam, Ilya!
2015-08-21 21:08 GMT+03:00 Samuel Just sj...@redhat.com:
I think I found the bug -- need to whiteout the snapset (or decache
it) upon evict.
http://tracker.ceph.com/issues/12748
-Sam
On Fri,
To be honest, the Samsung 850 PRO is not a 24/7 series... it's something like a
desktop+ series, but anyway - the results from these drives are very, very bad in
any scenario acceptable in real life...
Possibly the 845 PRO is better, but we don't want to experiment anymore... So
we chose the S3500 240G. Yes, it's
Exactly as in our case.
Ilya, same for images from our side. Headers opened from the hot tier.
On Friday, 21 August 2015, Ilya Dryomov wrote:
On Fri, Aug 21, 2015 at 2:02 AM, Samuel Just sj...@redhat.com wrote:
What's supposed to happen is that the client transparently
Not yet. I will create one.
But according to the mailing lists and Inktank docs, it's expected behaviour when
the cache is enabled
2015-08-20 19:56 GMT+03:00 Samuel Just sj...@redhat.com:
Is there a bug for this in the tracker?
-Sam
On Thu, Aug 20, 2015 at 9:54 AM, Voloshanenko Igor
igor.voloshane
, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
Not yet. I will create one.
But according to the mailing lists and Inktank docs, it's expected behaviour
when the cache is enabled
2015-08-20 19:56 GMT+03:00 Samuel Just sj...@redhat.com:
Is there a bug for this in the tracker?
-Sam
On Thu, Aug 20
these two images.
-Sam
On Thu, Aug 20, 2015 at 3:42 PM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
Sam, I tried to understand which rbd contains these chunks... but no luck. No
rbd image block names start with this...
Actually, now that I think about it, you probably didn't remove
?
-Sam
On Thu, Aug 20, 2015 at 3:58 PM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
We used the 4.x branch, as we have the very good Samsung 850 Pro in
production,
and they don't support ncq_trim...
And 4.x is the first branch which includes exceptions for this in libata.
sure we can
probably where the bug is. Odd. It could also be a bug
specific to 'forward' mode, either in the client or on the osd. Why did
you have it in that mode?
-Sam
On Thu, Aug 20, 2015 at 3:58 PM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
We used 4.x branch, as we have very good Samsung
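For context, 'forward' mode is the first step of the documented cache-tier removal
procedure; a minimal sketch of that sequence (assuming the base pool is cold-storage
and the cache pool is simply named "cache") looks roughly like:

  # switch the cache tier to forward so new writes go straight to the base pool
  ceph osd tier cache-mode cache forward
  # flush and evict everything still held in the cache pool
  rados -p cache cache-flush-evict-all
  # detach the tier once the cache pool is empty
  ceph osd tier remove-overlay cold-storage
  ceph osd tier remove cold-storage cache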
Right. But issues started...
2015-08-21 2:20 GMT+03:00 Samuel Just sj...@redhat.com:
But that was still in writeback mode, right?
-Sam
On Thu, Aug 20, 2015 at 4:18 PM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
We haven't set values for max_bytes / max_objects... and all data
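For reference, those limits are per-pool settings on the cache pool; an illustrative
example (the pool name and the values are assumptions, not the actual cluster settings):

  ceph osd pool set cache target_max_bytes 1099511627776    # ~1 TiB
  ceph osd pool set cache target_max_objects 1000000
  ceph osd pool set cache cache_target_dirty_ratio 0.4
  ceph osd pool set cache cache_target_full_ratio 0.8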
As we use journal collocation now (because we want to utilize the cache
layer ((( ), I use ceph-disk to create the new OSDs (with the journal size
changed in ceph.conf). I prefer to avoid manual work ))
So I created a very simple script to update the journal size.
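A rough sketch of what that per-OSD recreation looks like with ceph-disk on hammer
(the OSD id, device path and init commands are placeholders; the larger journal size
is picked up from osd_journal_size in ceph.conf):

  ID=56                      # placeholder OSD id
  DEV=/dev/sdb               # placeholder device
  ceph osd set noout
  stop ceph-osd id=$ID       # upstart on Ubuntu; use whatever init system is in place
  ceph osd out $ID
  ceph osd crush remove osd.$ID
  ceph auth del osd.$ID
  ceph osd rm $ID
  umount /var/lib/ceph/osd/ceph-$ID
  ceph-disk zap $DEV
  ceph-disk prepare $DEV     # recreates data plus the collocated journal partition
  ceph osd unset noout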
2015-08-21 2:25 GMT+03:00 Voloshanenko Igor
Will do, Sam!
Thanks in advance for your help!
2015-08-21 2:28 GMT+03:00 Samuel Just sj...@redhat.com:
Ok, create a ticket with a timeline and all of this information, I'll
try to look into it more tomorrow.
-Sam
On Thu, Aug 20, 2015 at 4:25 PM, Voloshanenko Igor
igor.voloshane...@gmail.com
attach the
whole ceph.log from the 6 hours before and after the snippet you
linked above? Are you using cache/tiering? Can you attach the osdmap
(ceph osd getmap -o outfile)?
-Sam
On Tue, Aug 18, 2015 at 4:15 AM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
ceph - 0.94.2
Its
, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
No, when we started draining the cache, the bad pgs were already in place...
We had a big rebalance (disk by disk - to change the journal size on both
hot/cold layers)... All was OK, but after 2 days scrub errors arrived, and 2
pgs went inconsistent
PM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
root@test:~# uname -a
Linux ix-s5 4.0.4-040004-generic #201505171336 SMP Sun May 17 17:37:22
UTC
2015 x86_64 x86_64 x86_64 GNU/Linux
2015-08-21 1:54 GMT+03:00 Samuel Just sj...@redhat.com:
Also, can you include the kernel
values). For any new images - no
2015-08-21 2:21 GMT+03:00 Voloshanenko Igor igor.voloshane...@gmail.com:
Right. But issues started...
2015-08-21 2:20 GMT+03:00 Samuel Just sj...@redhat.com:
But that was still in writeback mode, right?
-Sam
On Thu, Aug 20, 2015 at 4:18 PM, Voloshanenko Igor
, 2015 at 4:11 PM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
No, when we started draining the cache, the bad pgs were already in place...
We had a big rebalance (disk by disk - to change the journal size on both
hot/cold layers)... All was OK, but after 2 days scrub errors arrived,
and 2
pgs
:
This was related to the caching layer, which doesn't support snapshotting per
the docs... for the sake of closing the thread.
On 17 August 2015 at 21:15, Voloshanenko Igor
igor.voloshane...@gmail.com
wrote:
Hi all, can you please help me with an unexplained situation...
All snapshots inside ceph
-o cm.new
echo Inject new CRUSHMAP
ceph osd setcrushmap -i cm.new
#echo Clean...
#rm -rf cm cm.new
echo Unset noout option for CEPH cluster
ceph osd unset noout
echo OSD recreated... Waiting for rebalancing...
2015-08-21 2:37 GMT+03:00 Voloshanenko Igor igor.voloshane...@gmail.com:
As i we
the thread.
On 17 August 2015 at 21:15, Voloshanenko Igor
igor.voloshane...@gmail.com
wrote:
Hi all, can you please help me with an unexplained situation...
All snapshots inside ceph are broken...
So, as an example, we have a VM template as an rbd inside ceph.
We can map it and mount it to check that all
, Samuel Just sj...@redhat.com wrote:
Yeah, I'm trying to confirm that the issues did happen in writeback mode.
-Sam
On Thu, Aug 20, 2015 at 4:21 PM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
Right. But issues started...
2015-08-21 2:20 GMT+03:00 Samuel Just sj...@redhat.com
Exactly
On Friday, 21 August 2015, Samuel Just wrote:
And you adjusted the journals by removing the osd, recreating it with
a larger journal, and reinserting it?
-Sam
On Thu, Aug 20, 2015 at 4:24 PM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
Right
-21 1:56 GMT+03:00 Voloshanenko Igor igor.voloshane...@gmail.com:
root@test:~# uname -a
Linux ix-s5 4.0.4-040004-generic #201505171336 SMP Sun May 17 17:37:22 UTC
2015 x86_64 x86_64 x86_64 GNU/Linux
2015-08-21 1:54 GMT+03:00 Samuel Just sj...@redhat.com:
Also, can you include the kernel
correctly?)
-Sam
On Thu, Aug 20, 2015 at 4:07 PM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
Good joke )
2015-08-21 2:06 GMT+03:00 Samuel Just sj...@redhat.com:
Certainly, don't reproduce this with a cluster you care about :).
-Sam
On Thu, Aug 20, 2015 at 4:02 PM
corruption (except possibly on snapshots of those two images)?
-Sam
On Thu, Aug 20, 2015 at 10:07 AM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
Inktank:
https://download.inktank.com/docs/ICE%201.2%20-%20Cache%20and%20Erasure%20Coding%20FAQ.pdf
Mail-list:
https://www.mail
. Please scrub both inconsistent pgs and post the
ceph.log from before when you started the scrub until after. Also,
what command are you using to take snapshots?
-Sam
On Thu, Aug 20, 2015 at 3:59 AM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
Hi Samuel, we tried to fix it in a tricky way
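For anyone following along, the manual scrub Sam asks for can be triggered per PG;
a minimal example using the two PG IDs reported later in the thread:

  ceph pg deep-scrub 2.490
  ceph pg deep-scrub 2.c4
  # then collect /var/log/ceph/ceph.log covering the window around the scrub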
, Aug 20, 2015 at 9:41 AM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
Samuel, we turned off the cache layer a few hours ago...
I will post ceph.log in a few minutes.
For the snapshots - we found the issue, it was connected with the cache tier...
2015-08-20 19:23 GMT+03:00 Samuel Just sj...@redhat.com:
Ok
be the right place to
ask questions.
Otherwise, it'll probably get done in the next few weeks.
-Sam
On Thu, Aug 20, 2015 at 3:10 PM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
Thank you, Sam!
I also noticed these linked errors during scrub...
Now it all looks reasonable!
So we
No. This will not help (((
I tried to find the data, but it looks like it either exists with the same timestamp on
all osds or is missing on all osds...
So I need advice on what to do...
On Tuesday, 18 August 2015, Abhishek L wrote:
Voloshanenko Igor writes:
Hi Irek, please read carefully )))
You
-- Forwarded message -
From: *Voloshanenko Igor* igor.voloshane...@gmail.com
Date: Tuesday, 18 August 2015
Subject: Repair inconsistent pgs..
To: Irek Fasikhov malm...@gmail.com
Some additional information (Tnx Irek for the questions!)
Pool values:
root@test:~# ceph osd pool
{'print $1'}`; do ceph pg repair $i; done
Best regards, Irek Fasikhov
Mob.: +79229045757
2015-08-18 8:27 GMT+03:00 Voloshanenko Igor igor.voloshane...@gmail.com:
Hi all, at our production cluster, due to high rebalancing ((( we have 2 pgs
in an inconsistent state...
root@temp:~# ceph health
Hi all, at our production cluster, due to high rebalancing ((( we have 2 pgs
in an inconsistent state...
root@temp:~# ceph health detail | grep inc
HEALTH_ERR 2 pgs inconsistent; 18 scrub errors
pg 2.490 is active+clean+inconsistent, acting [56,15,29]
pg 2.c4 is active+clean+inconsistent, acting
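Irek's truncated repair one-liner quoted earlier presumably iterates over exactly these
PGs; a plausible reconstruction (assuming it pulls the PG IDs from ceph health detail,
where the ID is the second field) is:

  for i in $(ceph health detail | awk '/^pg/ && /inconsistent/ {print $2}'); do
      ceph pg repair $i
  done

Keep in mind that on a replicated pool ceph pg repair copies the primary's object over
the replicas, so it is worth checking which copy is actually the good one first.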
Hi all, can you please help me with an unexplained situation...
All snapshots inside ceph are broken...
So, as an example, we have a VM template as an rbd inside ceph.
We can map it and mount it to check that all is ok with it
root@test:~# rbd map cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5
/dev/rbd0
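Checking one of the snapshots the same way would look something like the following
(the snapshot name here is hypothetical, for illustration only):

  rbd snap ls cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5
  rbd map cold-storage/0e23c701-401d-4465-b9b4-c02939d57bb5@snap1
  # mount the mapped snapshot read-only and compare it against the parent image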
, as a conclusion - I'd recommend you get a bigger budget and buy durable
and fast SSDs for Ceph.
Megov Igor
CIO, Yuterra
From: ceph-users ceph-users-boun...@lists.ceph.com on
behalf of Voloshanenko
Igor igor.voloshane...@gmail.com
any SSD) on write performance. This is a very small
cluster.
Pieter
On Aug 12, 2015, at 04:33 PM, Voloshanenko Igor
igor.voloshane...@gmail.com wrote:
Hi all, we have set up a CEPH cluster with 60 OSDs (2 diff types) (5 nodes, 12
disks on each, 10 HDD, 2 SSD)
Also we cover this with a custom
cheap but are out of stock at the moment (here).
Faster than Intels, cheaper, and slightly different technology (3D V-NAND)
which IMO makes them superior without needing many tricks to do their job.
Jan
On 13 Aug 2015, at 14:40, Voloshanenko Igor igor.voloshane...@gmail.com
wrote:
Tnx, Irek
Hi all, we have set up a CEPH cluster with 60 OSDs (2 diff types) (5 nodes, 12
disks on each, 10 HDD, 2 SSD)
Also we cover this with a custom crushmap with 2 root leaves
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-100 5.0 root ssd
-102 1.0 host ix-s2-ssd
2
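For reference, with two roots like this the pools are usually pointed at per-root rules;
a minimal sketch (the hdd root name, rule names, ruleset IDs and the cache pool name are
assumptions, not taken from this cluster):

  ceph osd crush rule create-simple ssd-rule ssd host
  ceph osd crush rule create-simple hdd-rule hdd host
  # point each pool at the matching ruleset (IDs from `ceph osd crush rule dump`)
  ceph osd pool set cache crush_ruleset 1
  ceph osd pool set cold-storage crush_ruleset 2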