Re: [ceph-users] ceph-osd failure following 0.92 -> 0.94 upgrade

2015-04-09 Thread Gregory Farnum
On Thu, Apr 9, 2015 at 2:05 PM, Dirk Grunwald
 wrote:
> Ceph cluster, U14.10 base system, OSDs using BTRFS, journal on a partition on
> the same disk
> (done using ceph-deploy)
>
> I had been running 0.92 without (significant) issue. I upgraded
> to Hammer (0.94) by modifying /etc/apt/sources.list, apt-get update, apt-get
> upgrade
>
> Upgraded and restarted ceph-mon and then ceph-osd
>
> Most of the 50 OSDs are in a failure cycle with the error
> "os/Transaction.cc: 504: FAILED assert(ops == data.ops)"
>
> Right now, the entire cluster is useless because of this.
>
> Any suggestions?

It looks like maybe it's under the v0.80.x section instead of general
upgrading, but the release notes include:

* If you are upgrading specifically from v0.92, you must stop all OSD
  daemons and flush their journals (``ceph-osd -i NNN
  --flush-journal``) before upgrading.  There was a transaction
  encoding bug in v0.92 that broke compatibility.  Upgrading from v0.93,
  v0.91, or anything earlier is safe.
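
In practice, on each OSD host that means something along these lines before
installing the 0.94 packages (a rough sketch, assuming the usual Ubuntu upstart
scripts and the default /var/lib/ceph/osd layout -- adjust the id list to the
OSDs actually on that host):

  # stop and flush every OSD journal on this host prior to the upgrade
  for id in $(ls /var/lib/ceph/osd | sed 's/^ceph-//'); do
      stop ceph-osd id=$id
      ceph-osd -i $id --flush-journal
  done

Then upgrade the packages and start the OSDs again.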
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-osd failure following 0.92 -> 0.94 upgrade

2015-04-09 Thread Dirk Grunwald
Ceph cluster, U14.10 base system, OSDs using BTRFS, journal on a partition
on the same disk
(done using ceph-deploy)

I had been running 0.92 without (significant) issue. I upgraded
to Hammer (0.94) by modifying /etc/apt/sources.list, apt-get update,
apt-get upgrade

Upgraded and restarted ceph-mon and then ceph-osd

Most of the 50 OSDs are in a failure cycle with the error
"os/Transaction.cc: 504: FAILED assert(ops == data.ops)"

Right now, the entire cluster is useless because of this.

Any suggestions?

--

root@zfs2:/var/log/ceph# ceph-osd -i 2 -f

starting osd.2 at :/0 osd_data /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
 HDIO_DRIVE_CMD(identify) failed: Invalid argument
os/Transaction.cc: In function 'void ObjectStore::Transaction::_build_actions_from_tbl()' thread 7fa97d88c900 time 2015-04-09 14:57:33.763110
os/Transaction.cc: 504: FAILED assert(ops == data.ops)
 ceph version 0.94 (e61c4f093f88e44961d157f65091733580cea79a)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xbc26ab]
 2: (ObjectStore::Transaction::_build_actions_from_tbl()+0x3735) [0x99fa55]
 3: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x407b) [0x92497b]
 4: (FileStore::_do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long, ThreadPool::TPHandle*)+0x64) [0x9279e4]
 5: (JournalingObjectStore::journal_replay(unsigned long)+0x5cb) [0x94058b]
 6: (FileStore::mount()+0x3bb6) [0x910d46]
 7: (OSD::init()+0x259) [0x6c47b9]
 8: (main()+0x2860) [0x651fc0]
 9: (__libc_start_main()+0xf5) [0x7fa97a9ceec5]
 10: ceph-osd() [0x66aff7]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

2015-04-09 14:57:33.766270 7fa97d88c900 -1 os/Transaction.cc: In function 'void ObjectStore::Transaction::_build_actions_from_tbl()' thread 7fa97d88c900 time 2015-04-09 14:57:33.763110
os/Transaction.cc: 504: FAILED assert(ops == data.ops)

 cep
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] installing and updating while leaving osd drive data intact

2015-04-09 Thread Deneau, Tom
Referencing this old thread below, I am wondering what the proper way is
to install, say, new versions of ceph and start up daemons while keeping
all the data on the OSD drives.

I had been using ceph-deploy new which I guess creates a new cluster fsid.
Normally for my testing I had been starting with clean osd drives but
I would also like to be able to restart and leave the osd drives as is.

-- Tom


> Hi,
> I have faced a similar issue. This happens if the ceph disks aren't
> purged/cleaned completely. Clear the contents of the /dev/sdb1 device.
> There is a file named ceph_fsid on the disk which would have the old
> cluster's fsid. This needs to be deleted for it to work.
>
> Hope it helps.
>
> Sharmila
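
(For reference, the fsid stamped on an OSD data partition can be compared
against the running cluster roughly like this -- the device and mount point
below are only examples:)

  mount /dev/sdb1 /mnt
  cat /mnt/ceph_fsid
  ceph fsid              # or: grep fsid /etc/ceph/ceph.conf
  umount /mnt

If the two values differ, the disk still carries the previous cluster's identity.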


On Mon, May 26, 2014 at 2:52 PM, JinHwan Hwang  wrote:

> I'm trying to install ceph 0.80.1 on ubuntu 14.04. All other things go
> well except the 'activate osd' phase. It tells me it can't find the proper fsid
> when i do 'activate osd'. This is not my first time installing ceph, and
> the same process worked fine when I did it on others (though they were ubuntu
> 12.04, virtual machines, ceph-emperor)
>
> ceph at ceph-mon:~$ ceph-deploy osd activate ceph-osd0:/dev/sdb1
> ceph-osd0:/dev/sdc1 ceph-osd1:/dev/sdb1 ceph-osd1:/dev/sdc1
> ...
> [ceph-osd0][WARNIN] ceph-disk: Error: No cluster conf found in /etc/ceph
> with fsid 05b994a0-20f9-48d7-8d34-107ffcb39e5b
> ..
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Firefly - Giant : CentOS 7 : install failed ceph-deploy

2015-04-09 Thread Ken Dreyer
On 04/08/2015 03:00 PM, Travis Rhoden wrote:
> Hi Vickey,
> 
> The easiest way I know of to get around this right now is to add the
> following line in section for epel in /etc/yum.repos.d/epel.repo
> 
> exclude=python-rados python-rbd
> 
> So this is what my epel.repo file looks like: http://fpaste.org/208681/
> 
> It is those two packages in EPEL that are causing problems.  I also
> tried enabling epel-testing, but that didn't work either.

My wild guess is that enabling epel-testing is not enough, because the
offending 0.80.7-0.4.el7 build in the stable EPEL repository is still
visible to yum.

When you set that "exclude=" parameter in /etc/yum.repos.d/epel.repo,
like "exclude=python-rados python-rbd python-cephfs", *and* also try
"--enablerepo=epel-testing", does it work?

- Ken
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] use ZFS for OSDs

2015-04-09 Thread Michal Kozanecki
I had surgery and have been off for a while. I had to rebuild the test ceph+openstack 
cluster with whatever spare parts I had. I apologize for the delay to anyone 
who's been interested.

Here are the results;
==
Hardware/Software
3 node CEPH cluster, 3 OSDs (one OSD per node)
--
CPU = 1x E5-2670 v1
RAM = 8GB
OS Disk = 500GB SATA
OSD = 900GB 10k SAS (sdc - whole device)
Journal = Shared Intel SSD DC3500 80GB (sdb1 - 10GB partition)
ZFS log = Shared Intel SSD DC3500 80GB (sdb2 - 4GB partition)
ZFS L2ARC = Intel SSD 320 40GB (sdd - whole device)  
-
ceph 0.87
ZoL 0.6.3
CentOS 7.0

2 node KVM/Openstack cluster

CPU = 2x Xeon X5650
RAM = 24 GB
OS Disk = 500GB SATA
-
Ubuntu 14.04
OpenStack Juno

The rough performance of this oddball-sized test ceph cluster is 1000-1500 
IOPS at 8k.

==
Compression; (cut out unneeded details)
Various Debian and CentOS images with lots of test SVN and GIT data, under 
KVM/OpenStack

[root@ceph03 ~]# zfs get all SAS1
NAME  PROPERTY  VALUE  SOURCE
SAS1  used  586G   -
SAS1  compressratio 1.50x  -
SAS1  recordsize32Klocal
SAS1  checksum  on default
SAS1  compression   lz4local
SAS1  refcompressratio  1.50x  -
SAS1  written   586G   -
SAS1  logicalused   877G   -
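
(For reference, a pool/dataset with these properties can be created roughly
like this -- a sketch, not necessarily the exact commands used here; the
xattr=sa line is a common ZoL recommendation for ceph-osd rather than
something shown above:)

  zpool create SAS1 sdc log sdb2 cache sdd
  zfs set compression=lz4 SAS1
  zfs set recordsize=32K SAS1
  zfs set xattr=sa SAS1        # keep xattrs inline for filestore workloads
  zfs set dedup=on SAS1        # only if you want the dedupe numbers below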

==
Dedupe; (dedupe is enabled at the dataset level, but dedupe space savings can only 
be viewed at the pool level - a bit odd, I know)
Various Debian and CentOS images with lots of test SVN and GIT data, under 
KVM/OpenStack

[root@ceph01 ~]# zpool get all SAS1
NAME  PROPERTY   VALUE  SOURCE
SAS1  size   836G   -
SAS1  capacity   70%-
SAS1  dedupratio 1.02x  -
SAS1  free   250G   -
SAS1  allocated  586G   -

==
Bitrot/Corruption;
Injected random data at random locations (changed seek to a random value) on sdc 
with;

dd if=/dev/urandom of=/dev/sdc seek=54356 bs=4k count=1
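
A full scrub of the pool then sweeps for any further damage (the status output
below was captured mid-scrub), e.g.:

  zpool scrub SAS1
  zpool status -v SAS1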

Results;

1. ZFS detects an error on disk affecting PG files; since this is a single vdev 
(no raidz or mirror) it cannot automatically fix it. It blocks all access 
(except delete) to the affected files, making them inaccessible. 
*note: I ran this status after already repairing 2 PGs (5.15 and 5.25); 
ZFS status will no longer list a filename after it has been 
repaired/deleted/cleared*



[root@ceph01 ~]# zpool status -v
  pool: SAS1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub in progress since Thu Apr  9 13:04:54 2015
153G scanned out of 586G at 40.3M/s, 3h3m to go
0 repaired, 26.05% done
config:

NAME        STATE     READ WRITE CKSUM
SAS1        ONLINE       0     0    35
  sdc       ONLINE       0     0    70
logs
  sdb2      ONLINE       0     0     0
cache
  sdd       ONLINE       0     0     0

errors: Permanent errors have been detected in the following files: 

/SAS1/current/5.e_head/DIR_E/DIR_0/DIR_6/rbd\udata.2ba762ae8944a.24cc__head_6153260E__5



2. CEPH-OSD cannot read PG file. Kicks off scrub/deep-scrub



/var/log/ceph/ceph-osd.2.log
2015-04-09 13:10:18.319312 7fcbb163a700 -1 log_channel(default) log [ERR] : 
5.18 shard 1: soid cd635018/rbd_data.93d1f74b0dc51.18ee/head//5 
candidate had a read error, digest 1835988768 != known digest 473354757
2015-04-09 13:11:38.587014 7fcbb1e3b700 -1 log_channel(default) log [ERR] : 
5.18 deep-scrub 0 missing, 1 inconsistent objects
2015-04-09 13:11:38.587020 7fcbb1e3b700 -1 log_channel(default) log [ERR] : 
5.18 deep-scrub 1 errors

/var/log/ceph/ceph-osd.1.log
2015-04-09 13:11:43.640499 7fe10b3c5700 -1 log_channel(default) log [ERR] : 
5.25 shard 1: soid 73eb0125/rbd_data.5315b2ae8944a.5348/head//5 
candidate had a read error, digest 1522345897 != known digest 1180025616
2015-04-09 13:12:44.781546 7fe10abc4700 -1 log_channel(default) log [ERR] : 
5.25 deep-scrub 0 missing, 1 inconsistent objects
2015-04-09 13:12:44.781553 7fe10abc4700 -1 log_channel(default) log [ERR] : 
5.25 deep-scrub 1 errors
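
Repairing afterwards is just the standard inconsistent-PG procedure, something
like the following (PG ids taken from the logs above):

  ceph health detail | grep inconsistent     # lists the affected PGs
  ceph pg repair 5.18
  ceph pg repair 5.25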

---

3. CEPH STATUS reports an error

---

[root@client01 ~]# ceph status

cluster e93ce4d3-3a46-4082-9ec5-e23c82ca616e
 health HEALTH_WARN 2 p

Re: [ceph-users] Ceph Hammer : Ceph-deploy 1.5.23-0 : RGW civetweb :: Not getting installed

2015-04-09 Thread Iain Geddes
Hi Vickey,

The keyring gets created as part of the initial deployment so it should be
on your admin node right alongside the admin keyring etc. FWIW, I tried
this quickly yesterday and it failed because the RGW directory didn't exist
on the node that I was attempting to deploy to ... but I didn't actually
look that deeply into it as it's not critical for what I wanted to complete
today. The keyring was definitely there following a successful deployment
though.
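
If the key genuinely doesn't exist on the monitors (e.g. the cluster was first
created before Hammer), I believe you can create it by hand on a monitor node
and then re-run gatherkeys -- untested, but something like:

  mkdir -p /var/lib/ceph/bootstrap-rgw
  ceph auth get-or-create client.bootstrap-rgw mon 'allow profile bootstrap-rgw' \
      -o /var/lib/ceph/bootstrap-rgw/ceph.keyring
  ceph-deploy gatherkeys ceph-node1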

Kind regards


Iain

On Thu, Apr 9, 2015 at 7:41 PM, Vickey Singh 
wrote:

> Hello Cephers
>
> I am trying to setup RGW using Ceph-deploy which is described here
>
>
> http://docs.ceph.com/docs/master/start/quick-ceph-deploy/#add-an-rgw-instance
>
>
> But unfortunately it doesn't seem to be working
>
> Is there something I am missing, or do you know of a fix for this?
>
>
>
>
> [root@ceph-node1 yum.repos.d]# ceph -v
>
> *ceph version 0.94* (e61c4f093f88e44961d157f65091733580cea79a)
>
> [root@ceph-node1 yum.repos.d]#
>
>
>
> # yum update ceph-deploy
>
>
> << SKIPPED >
>
>
>
>   Verifying  : ceph-deploy-1.5.22-0.noarch
> 2/2
>
>
> Updated:
>
>  * ceph-deploy.noarch 0:1.5.23-0*
>
>
> Complete!
>
> [root@ceph-node1 ceph]#
>
>
>
>
>
> [root@ceph-node1 ceph]# ceph-deploy rgw create rgw-node1
>
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /root/.cephdeploy.conf
>
> [ceph_deploy.cli][INFO  ] Invoked (1.5.23): /usr/bin/ceph-deploy rgw
> create rgw-node1
>
> [ceph_deploy.rgw][DEBUG ] Deploying rgw, cluster ceph hosts
> rgw-node1:rgw.rgw-node1
>
> *[ceph_deploy][ERROR ] RuntimeError: bootstrap-rgw keyring not found; run
> 'gatherkeys'*
>
>
>
>
> [root@ceph-node1 ceph]# ceph-deploy --overwrite-conf mon create-initial
>
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /root/.cephdeploy.conf
>
> [ceph_deploy.cli][INFO  ] Invoked (1.5.23): /usr/bin/ceph-deploy
> --overwrite-conf mon create-initial
>
> << SKIPPED >
>
> [ceph_deploy.mon][INFO  ] mon.ceph-node1 monitor has reached quorum!
>
> [ceph_deploy.mon][INFO  ] all initial monitors are running and have formed
> quorum
>
> [ceph_deploy.mon][INFO  ] Running gatherkeys...
>
> [ceph_deploy.gatherkeys][DEBUG ] Have ceph.client.admin.keyring
>
> [ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring
>
> [ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-osd.keyring
>
> [ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-mds.keyring
>
> [ceph_deploy.gatherkeys][DEBUG ] Checking ceph-node1 for
> /var/lib/ceph/bootstrap-rgw/ceph.keyring
>
> [ceph-node1][DEBUG ] connected to host: ceph-node1
>
> [ceph-node1][DEBUG ] detect platform information from remote host
>
> [ceph-node1][DEBUG ] detect machine type
>
> [ceph-node1][DEBUG ] fetch remote file
>
> *[ceph_deploy.gatherkeys][WARNIN] Unable to find
> /var/lib/ceph/bootstrap-rgw/ceph.keyring on ceph-node1*
>
> *[ceph_deploy.gatherkeys][WARNIN] No RGW bootstrap key found. Will not be
> able to deploy RGW daemons*
>
> [root@ceph-node1 ceph]#
>
>
>
> [root@ceph-node1 ceph]# ceph-deploy gatherkeys ceph-node1
>
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /root/.cephdeploy.conf
>
> [ceph_deploy.cli][INFO  ] Invoked (1.5.23): /usr/bin/ceph-deploy
> gatherkeys ceph-node1
>
> [ceph_deploy.gatherkeys][DEBUG ] Have ceph.client.admin.keyring
>
> [ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring
>
> [ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-osd.keyring
>
> [ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-mds.keyring
>
> [ceph_deploy.gatherkeys][DEBUG ] Checking ceph-node1 for
> /var/lib/ceph/bootstrap-rgw/ceph.keyring
>
> [ceph-node1][DEBUG ] connected to host: ceph-node1
>
> [ceph-node1][DEBUG ] detect platform information from remote host
>
> [ceph-node1][DEBUG ] detect machine type
>
> [ceph-node1][DEBUG ] fetch remote file
>
> *[ceph_deploy.gatherkeys][WARNIN] Unable to find
> /var/lib/ceph/bootstrap-rgw/ceph.keyring on ceph-node1*
>
> *[ceph_deploy.gatherkeys][WARNIN] No RGW bootstrap key found. Will not be
> able to deploy RGW daemons*
>
> [root@ceph-node1 ceph]#
>
>
>
> Regards
>
> VS
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Iain Geddes
Application Engineer, Cyan
1383 North McDowell Blvd.
Petaluma, CA 94954
M +353 89 432 6811
e iain.ged...@cyaninc.com
www.cyaninc.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph Hammer : Ceph-deploy 1.5.23-0 : RGW civetweb :: Not getting installed

2015-04-09 Thread Vickey Singh
Hello Cephers

I am trying to setup RGW using Ceph-deploy which is described here

http://docs.ceph.com/docs/master/start/quick-ceph-deploy/#add-an-rgw-instance


But unfortunately it doesn't seem to be working

Is there something I am missing, or do you know of a fix for this?




[root@ceph-node1 yum.repos.d]# ceph -v

*ceph version 0.94* (e61c4f093f88e44961d157f65091733580cea79a)

[root@ceph-node1 yum.repos.d]#



# yum update ceph-deploy


<< SKIPPED >



  Verifying  : ceph-deploy-1.5.22-0.noarch
2/2


Updated:

 * ceph-deploy.noarch 0:1.5.23-0*


Complete!

[root@ceph-node1 ceph]#





[root@ceph-node1 ceph]# ceph-deploy rgw create rgw-node1

[ceph_deploy.conf][DEBUG ] found configuration file at:
/root/.cephdeploy.conf

[ceph_deploy.cli][INFO  ] Invoked (1.5.23): /usr/bin/ceph-deploy rgw create
rgw-node1

[ceph_deploy.rgw][DEBUG ] Deploying rgw, cluster ceph hosts
rgw-node1:rgw.rgw-node1

*[ceph_deploy][ERROR ] RuntimeError: bootstrap-rgw keyring not found; run
'gatherkeys'*




[root@ceph-node1 ceph]# ceph-deploy --overwrite-conf mon create-initial

[ceph_deploy.conf][DEBUG ] found configuration file at:
/root/.cephdeploy.conf

[ceph_deploy.cli][INFO  ] Invoked (1.5.23): /usr/bin/ceph-deploy
--overwrite-conf mon create-initial

<< SKIPPED >

[ceph_deploy.mon][INFO  ] mon.ceph-node1 monitor has reached quorum!

[ceph_deploy.mon][INFO  ] all initial monitors are running and have formed
quorum

[ceph_deploy.mon][INFO  ] Running gatherkeys...

[ceph_deploy.gatherkeys][DEBUG ] Have ceph.client.admin.keyring

[ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring

[ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-osd.keyring

[ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-mds.keyring

[ceph_deploy.gatherkeys][DEBUG ] Checking ceph-node1 for
/var/lib/ceph/bootstrap-rgw/ceph.keyring

[ceph-node1][DEBUG ] connected to host: ceph-node1

[ceph-node1][DEBUG ] detect platform information from remote host

[ceph-node1][DEBUG ] detect machine type

[ceph-node1][DEBUG ] fetch remote file

*[ceph_deploy.gatherkeys][WARNIN] Unable to find
/var/lib/ceph/bootstrap-rgw/ceph.keyring on ceph-node1*

*[ceph_deploy.gatherkeys][WARNIN] No RGW bootstrap key found. Will not be
able to deploy RGW daemons*

[root@ceph-node1 ceph]#



[root@ceph-node1 ceph]# ceph-deploy gatherkeys ceph-node1

[ceph_deploy.conf][DEBUG ] found configuration file at:
/root/.cephdeploy.conf

[ceph_deploy.cli][INFO  ] Invoked (1.5.23): /usr/bin/ceph-deploy gatherkeys
ceph-node1

[ceph_deploy.gatherkeys][DEBUG ] Have ceph.client.admin.keyring

[ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring

[ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-osd.keyring

[ceph_deploy.gatherkeys][DEBUG ] Have ceph.bootstrap-mds.keyring

[ceph_deploy.gatherkeys][DEBUG ] Checking ceph-node1 for
/var/lib/ceph/bootstrap-rgw/ceph.keyring

[ceph-node1][DEBUG ] connected to host: ceph-node1

[ceph-node1][DEBUG ] detect platform information from remote host

[ceph-node1][DEBUG ] detect machine type

[ceph-node1][DEBUG ] fetch remote file

*[ceph_deploy.gatherkeys][WARNIN] Unable to find
/var/lib/ceph/bootstrap-rgw/ceph.keyring on ceph-node1*

*[ceph_deploy.gatherkeys][WARNIN] No RGW bootstrap key found. Will not be
able to deploy RGW daemons*

[root@ceph-node1 ceph]#



Regards

VS
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD Hardware recommendation

2015-04-09 Thread f...@univ-lr.fr

Hi all,

just an update - but an important one - of the previous benchmark with 2 
new "10 DWPD class" contenders :

- Seagate 1200 -  ST200FM0053 - SAS 12Gb/s
- Intel DC S3700 - SATA 6Gb/s

The graph :
   
http://www.4shared.com/download/yaeJgJiFce/Perf-SSDs-Toshiba-Seagate-Inte.png?lgfp=3000 



It speaks for itself: the Seagate is clearly a massive improvement over 
our best SSD so far (the Toshiba M2).
That's a 430MB/s write bandwidth reached with blocks as small as 4KB, 
written with the SYNC and DIRECT flags.
This was somewhat expected after reading this review 
http://www.tweaktown.com/reviews/6075/seagate-1200-stx00fm-12gb-s-sas-enterprise-ssd-review/index.html
An impressive result that should make the Seagate an SSD of choice for 
journals on hosts with SAS controllers.


I also had access to an Intel DC S3700, an unavoidable reference as a Ceph 
journal. Indeed, not bad on 4k blocks for the price.


The benchmarks were made on a Dell R730xd with an H730P SAS controller (LSI 3108, 
12Gb/s SAS)


Frederic

f...@univ-lr.fr  a écrit le 31/03/15 14:09 :

Hi,

in our quest to get the right SSD for OSD journals, I managed to 
benchmark two kinds of "10 DWPD" SSDs :

- Toshiba M2 PX02SMF020
- Samsung 845DC PRO

I want to determine whether a disk is appropriate considering its absolute 
performance, and the optimal number of ceph-osd processes that can use the 
SSD as a journal.
The benchmark consists of a fio command, with SYNC and DIRECT access 
options, and 4k blocks write accesses.


fio --filename=/dev/sda --direct=1 --sync=1 --rw=write --bs=4k 
--runtime=60 --time_based --group_reporting --name=journal-test 
--iodepth=<1 or 16> --numjobs=< ranging from 1 to 16>


I think numjobs can represent the number of OSDs concurrently served by 
this SSD. Am I right on this?
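
For example, a single point on the graph (4 jobs, shallow queue) corresponds to
a run like the following (just the command above with concrete values filled in):

  fio --filename=/dev/sda --direct=1 --sync=1 --rw=write --bs=4k \
      --runtime=60 --time_based --group_reporting --name=journal-test \
      --iodepth=1 --numjobs=4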



http://www.4shared.com/download/WOvooKVXce/Fio-Direct-Sync-ToshibaM2-Sams.png?lgfp=3000


My understanding of that data is that the 845DC Pro cannot be used for 
more than 4 OSDs.

The M2 is very consistent in its behaviour.
The iodepth has almost no impact on performance here.

Could someone with other SSD types run the same test to consolidate 
the data?


Among the short list that could be considered for that task (for their 
price/performance/DWPD/...) :

- Seagate 1200 SSD 200GB, SAS 12Gb/s ST200FM0053
- Hitachi SSD800MM MLC HUSMM8020ASS200
- Intel DC3700

I've not yet considered the write amplification mentioned in other posts.

Frederic

Josef Johansson  a écrit le 20/03/15 10:29 :



The 845DC Pro does look really nice, comparable with the s3700 even on TBW.
The price is what really does it, as it's almost a third compared with the s3700..

  




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
  
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Recovering incomplete PGs with ceph_objectstore_tool

2015-04-09 Thread Paul Evans
Congrats Chris and nice "save" on that RBD!

--
Paul 

> On Apr 9, 2015, at 11:11 AM, Chris Kitzmiller  
> wrote:
> 
> Success! Hopefully my notes from the process will help:
> 
> In the event of multiple disk failures the cluster could lose PGs. Should 
> this occur it is best to attempt to restart the OSD process and have the 
> drive marked as up+out. Marking the drive as out will cause data to flow off 
> the drive to elsewhere in the cluster. In the event that the ceph-osd process 
> is unable to keep running you could try using the ceph_objectstore_tool 
> program to extract just the damaged PGs and import them into working PGs.
> 
> Fixing Journals
> In this particular scenario things were complicated by the fact that 
> ceph_objectstore_tool came out in Giant but we were running Firefly. Not 
> wanting to upgrade the cluster in a degraded state this required that the OSD 
> drives be moved to a different physical machine for repair. This added a lot 
> of steps related to the journals but it wasn't a big deal. That process looks 
> like:
> 
> On Storage1:
> stop ceph-osd id=15
> ceph-osd -i 15 --flush-journal
> ls -l /var/lib/ceph/osd/ceph-15/journal
> 
> Note the journal device UUID then pull the disk and move it to Ithome:
> rm /var/lib/ceph/osd/ceph-15/journal
> ceph-osd -i 15 --mkjournal
> 
> That creates a colocated journal to use during the 
> ceph_objectstore_tool commands. Once done then:
> ceph-osd -i 15 --flush-journal
> rm /var/lib/ceph/osd/ceph-15/journal
> 
> Pull the disk and bring it back to Storage1. Then:
> ln -s /dev/disk/by-partuuid/b4f8d911-5ac9-4bf0-a06a-b8492e25a00f 
> /var/lib/ceph/osd/ceph-15/journal
> ceph-osd -i 15 --mkjournal
> start ceph-osd id=15
> 
> This all won't be needed once the cluster is running Hammer because then 
> there will be an available version of ceph_objectstore_tool on the local 
> machine and you can keep the journals throughout the process.
> 
> 
> Recovery Process
> We were missing two PGs, 3.c7 and 3.102. These PGs were hosted on OSD.0 and 
> OSD.15 which were the two disks which failed out of Storage1. The disk for 
> OSD.0 seemed to be a total loss while the disk for OSD.15 was somewhat more 
> cooperative but not in a place to be up and running in the cluster. I took 
> the dying OSD.15 drive and placed it into a new physical machine with a fresh 
> install of Ceph Giant. Using Giant's ceph_objectstore_tool I was able to 
> extract the PGs with a command like:
> for i in 3.c7 3.102 ; do ceph_objectstore_tool --data 
> /var/lib/ceph/osd/ceph-15 --journal /var/lib/ceph/osd/ceph-15/journal --op 
> export --pgid $i --file ~/${i}.export ; done
> 
> Once both PGs were successfully exported I attempted to import them into a 
> new temporary OSD following instructions from here. For some reason that 
> didn't work. The OSD was up+in but wasn't backfilling the PGs into the 
> cluster. If you find yourself in this process I would try that first just in 
> case it provides a cleaner process.
> Considering the above didn't work and we were looking at the possibility of 
> losing the RBD volume (or perhaps worse, the potential of fruitlessly fscking 
> 35TB) I took what I might describe as heroic measures:
> 
> Running
> ceph pg dump | grep incomplete
> 
> 3.c7   0  0  0  0  0  0  0  incomplete  2015-04-02  20:49:32.968841  0'0  
> 15730:17  [15,0]  15  [15,0]  15  13985'54076  2015-03-31  19:14:22.721695  
> 13985'54076  2015-03-31  19:14:22.721695
> 3.102  0  0  0  0  0  0  0  incomplete  2015-04-02  20:49:32.529594  0'0  
> 15730:21  [0,15]  0   [0,15]  0   13985'53107  2015-03-29  21:17:15.568125  
> 13985'49195  2015-03-24  18:38:08.244769
> 
> Then I stopped all OSDs, which blocked all I/O to the cluster, with:
> stop ceph-osd-all
> 
> Then I looked for all copies of the PG on all OSDs with:
> for i in 3.c7 3.102 ; do find /var/lib/ceph/osd/ -maxdepth 3 -type d -name 
> "$i" ; done | sort -V
> 
> /var/lib/ceph/osd/ceph-0/current/3.c7_head
> /var/lib/ceph/osd/ceph-0/current/3.102_head
> /var/lib/ceph/osd/ceph-3/current/3.c7_head
> /var/lib/ceph/osd/ceph-13/current/3.102_head
> /var/lib/ceph/osd/ceph-15/current/3.c7_head
> /var/lib/ceph/osd/ceph-15/current/3.102_head
> 
> Then I flushed the journals for all of those OSDs with:
> for i in 0 3 13 15 ; do ceph-osd -i $i --flush-journal ; done
> 
> Then I removed all of those drives and moved them (using Journal Fixing 
> above) to Ithome where I used ceph_objectstore_tool to remove all traces of 
> 3.102 and 3.c7:
> for i in 0 3 13 15 ; do for j in 3.c7 3.102 ; do ceph_objectstore_tool --data 
> /var/lib/ceph/osd/ceph-$i --journal /var/lib/ceph/osd/ceph-$i/journal --op 
> remove --pgid $j ; done ; done
> 
> Then I imported the PGs onto OSD.0 and OSD.15 with:
> for i in 0 15 ; do for j in 3.c7 3.102 ; do ceph_objectstore_tool --data 
> /var/lib/ceph/osd/ceph-$i --journal /var/lib/ceph/osd/ceph-$i/journal --op 
> import --file ~/${j}.export ; done ; done
> for i in 0 15 ; do ceph-osd -i $i --flush-jo

Re: [ceph-users] Recovering incomplete PGs with ceph_objectstore_tool

2015-04-09 Thread Chris Kitzmiller
Success! Hopefully my notes from the process will help:

In the event of multiple disk failures the cluster could lose PGs. Should this 
occur it is best to attempt to restart the OSD process and have the drive 
marked as up+out. Marking the drive as out will cause data to flow off the 
drive to elsewhere in the cluster. In the event that the ceph-osd process is 
unable to keep running you could try using the ceph_objectstore_tool program to 
extract just the damaged PGs and import them into working PGs.

Fixing Journals
In this particular scenario things were complicated by the fact that 
ceph_objectstore_tool came out in Giant but we were running Firefly. Not 
wanting to upgrade the cluster in a degraded state this required that the OSD 
drives be moved to a different physical machine for repair. This added a lot of 
steps related to the journals but it wasn't a big deal. That process looks like:

On Storage1:
stop ceph-osd id=15
ceph-osd -i 15 --flush-journal
ls -l /var/lib/ceph/osd/ceph-15/journal

Note the journal device UUID then pull the disk and move it to Ithome:
rm /var/lib/ceph/osd/ceph-15/journal
ceph-osd -i 15 --mkjournal

That creates a colocated journal to use during the 
ceph_objectstore_tool commands. Once done then:
ceph-osd -i 15 --flush-journal
rm /var/lib/ceph/osd/ceph-15/journal

Pull the disk and bring it back to Storage1. Then:
ln -s /dev/disk/by-partuuid/b4f8d911-5ac9-4bf0-a06a-b8492e25a00f 
/var/lib/ceph/osd/ceph-15/journal
ceph-osd -i 15 --mkjournal
start ceph-osd id=15

This all won't be needed once the cluster is running Hammer because then there 
will be an available version of ceph_objectstore_tool on the local machine and 
you can keep the journals throughout the process.


Recovery Process
We were missing two PGs, 3.c7 and 3.102. These PGs were hosted on OSD.0 and 
OSD.15 which were the two disks which failed out of Storage1. The disk for 
OSD.0 seemed to be a total loss while the disk for OSD.15 was somewhat more 
cooperative but not in a place to be up and running in the cluster. I took the 
dying OSD.15 drive and placed it into a new physical machine with a fresh 
install of Ceph Giant. Using Giant's ceph_objectstore_tool I was able to 
extract the PGs with a command like:
for i in 3.c7 3.102 ; do ceph_objectstore_tool --data /var/lib/ceph/osd/ceph-15 
--journal /var/lib/ceph/osd/ceph-15/journal --op export --pgid $i --file 
~/${i}.export ; done

Once both PGs were successfully exported I attempted to import them into a new 
temporary OSD following instructions from here. For some reason that didn't 
work. The OSD was up+in but wasn't backfilling the PGs into the cluster. If you 
find yourself in this process I would try that first just in case it provides a 
cleaner process.
Considering the above didn't work and we were looking at the possibility of 
losing the RBD volume (or perhaps worse, the potential of fruitlessly fscking 
35TB) I took what I might describe as heroic measures:

Running
ceph pg dump | grep incomplete

3.c7   0  0  0  0  0  0  0  incomplete  2015-04-02  20:49:32.968841  0'0  
15730:17  [15,0]  15  [15,0]  15  13985'54076  2015-03-31  19:14:22.721695  
13985'54076  2015-03-31  19:14:22.721695
3.102  0  0  0  0  0  0  0  incomplete  2015-04-02  20:49:32.529594  0'0  
15730:21  [0,15]  0   [0,15]  0   13985'53107  2015-03-29  21:17:15.568125  
13985'49195  2015-03-24  18:38:08.244769

Then I stopped all OSDs, which blocked all I/O to the cluster, with:
stop ceph-osd-all

Then I looked for all copies of the PG on all OSDs with:
for i in 3.c7 3.102 ; do find /var/lib/ceph/osd/ -maxdepth 3 -type d -name "$i" 
; done | sort -V

/var/lib/ceph/osd/ceph-0/current/3.c7_head
/var/lib/ceph/osd/ceph-0/current/3.102_head
/var/lib/ceph/osd/ceph-3/current/3.c7_head
/var/lib/ceph/osd/ceph-13/current/3.102_head
/var/lib/ceph/osd/ceph-15/current/3.c7_head
/var/lib/ceph/osd/ceph-15/current/3.102_head

Then I flushed the journals for all of those OSDs with:
for i in 0 3 13 15 ; do ceph-osd -i $i --flush-journal ; done

Then I removed all of those drives and moved them (using Journal Fixing above) 
to Ithome where I used ceph_objectstore_tool to remove all traces of 3.102 and 
3.c7:
for i in 0 3 13 15 ; do for j in 3.c7 3.102 ; do ceph_objectstore_tool --data 
/var/lib/ceph/osd/ceph-$i --journal /var/lib/ceph/osd/ceph-$i/journal --op 
remove --pgid $j ; done ; done

Then I imported the PGs onto OSD.0 and OSD.15 with:
for i in 0 15 ; do for j in 3.c7 3.102 ; do ceph_objectstore_tool --data 
/var/lib/ceph/osd/ceph-$i --journal /var/lib/ceph/osd/ceph-$i/journal --op 
import --file ~/${j}.export ; done ; done
for i in 0 15 ; do ceph-osd -i $i --flush-journal && rm 
/var/lib/ceph/osd/ceph-$i/journal ; done

Then I moved the disks back to Storage1 and started them all back up again. I 
think that this should have worked but what happened in this case was that 
OSD.0 didn't start up for some reason. I initially thought that that wouldn't 
matter because OSD.15 did start a

Re: [ceph-users] MDS unmatched rstat after upgrade hammer

2015-04-09 Thread Scottix
I fully understand, why it is just a comment :)

Can't wait for scrub.

Thanks!

On Thu, Apr 9, 2015 at 10:13 AM John Spray  wrote:

>
>
> On 09/04/2015 17:09, Scottix wrote:
> > Alright sounds good.
> >
> > Only one comment then:
> > From an IT/ops perspective all I see is ERR and that raises red flags.
> > So the exposure of the message might need some tweaking. In production
> > I like to be notified of an issue but have reassurance it was fixed
> > within the system.
>
> Fair point.  Unfortunately, in general we can't distinguish between
> inconsistencies we're fixing up due to a known software bug, and
> inconsistencies that we're encountering for unknown reasons.  The reason
> this is an error rather than a warning is that we handle this case by
> arbitrarily trusting one statistic when it disagrees with another, so we
> don't *know* that we've correctly repaired, we just hope.
>
> Anyway: the solution is the forthcoming scrub functionality, which will
> be able to unambiguously repair things like this, and give you a clearer
> statement about what happened.
>
> Cheers,
> John
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS unmatched rstat after upgrade hammer

2015-04-09 Thread John Spray



On 09/04/2015 17:09, Scottix wrote:

Alright sounds good.

Only one comment then:
From an IT/ops perspective all I see is ERR and that raises red flags. 
So the exposure of the message might need some tweaking. In production 
I like to be notified of an issue but have reassurance it was fixed 
within the system.


Fair point.  Unfortunately, in general we can't distinguish between 
inconsistencies we're fixing up due to a known software bug, and 
inconsistencies that we're encountering for unknown reasons.  The reason 
this is an error rather than a warning is that we handle this case by 
arbitrarily trusting one statistic when it disagrees with another, so we 
don't *know* that we've correctly repaired, we just hope.


Anyway: the solution is the forthcoming scrub functionality, which will 
be able to unambiguously repair things like this, and give you a clearer 
statement about what happened.


Cheers,
John
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "protocol feature mismatch" after upgrading to Hammer

2015-04-09 Thread Kyle Hutson
http://people.beocat.cis.ksu.edu/~kylehutson/crushmap

On Thu, Apr 9, 2015 at 11:25 AM, Gregory Farnum  wrote:

> Hmmm. That does look right and neither I nor Sage can come up with
> anything via code inspection. Can you post the actual binary crush map
> somewhere for download so that we can inspect it with our tools?
> -Greg
>
> On Thu, Apr 9, 2015 at 7:57 AM, Kyle Hutson  wrote:
> > Here 'tis:
> > https://dpaste.de/POr1
> >
> >
> > On Thu, Apr 9, 2015 at 9:49 AM, Gregory Farnum  wrote:
> >>
> >> Can you dump your crush map and post it on pastebin or something?
> >>
> >> On Thu, Apr 9, 2015 at 7:26 AM, Kyle Hutson  wrote:
> >> > Nope - it's 64-bit.
> >> >
> >> > (Sorry, I missed the reply-all last time.)
> >> >
> >> > On Thu, Apr 9, 2015 at 9:24 AM, Gregory Farnum 
> wrote:
> >> >>
> >> >> [Re-added the list]
> >> >>
> >> >> Hmm, I'm checking the code and that shouldn't be possible. What's
> your
> >> >> client? (In particular, is it 32-bit? That's the only thing I can
> >> >> think of that might have slipped through our QA.)
> >> >>
> >> >> On Thu, Apr 9, 2015 at 7:17 AM, Kyle Hutson 
> wrote:
> >> >> > I did nothing to enable anything else. Just changed my ceph repo
> from
> >> >> > 'giant' to 'hammer', then did 'yum update' and restarted services.
> >> >> >
> >> >> > On Thu, Apr 9, 2015 at 9:15 AM, Gregory Farnum 
> >> >> > wrote:
> >> >> >>
> >> >> >> Did you enable the straw2 stuff? CRUSHV4 shouldn't be required by
> >> >> >> the
> >> >> >> cluster unless you made changes to the layout requiring it.
> >> >> >>
> >> >> >> If you did, the clients have to be upgraded to understand it. You
> >> >> >> could disable all the v4 features; that should let them connect
> >> >> >> again.
> >> >> >> -Greg
> >> >> >>
> >> >> >> On Thu, Apr 9, 2015 at 7:07 AM, Kyle Hutson 
> >> >> >> wrote:
> >> >> >> > This particular problem I just figured out myself ('ceph -w' was
> >> >> >> > still
> >> >> >> > running from before the upgrade, and ctrl-c and restarting
> solved
> >> >> >> > that
> >> >> >> > issue), but I'm still having a similar problem on the ceph
> client:
> >> >> >> >
> >> >> >> > libceph: mon19 10.5.38.20:6789 feature set mismatch, my
> >> >> >> > 2b84a042aca <
> >> >> >> > server's 102b84a042aca, missing 1
> >> >> >> >
> >> >> >> > It appears that even the latest kernel doesn't have support for
> >> >> >> > CEPH_FEATURE_CRUSH_V4
> >> >> >> >
> >> >> >> > How do I make my ceph cluster backward-compatible with the old
> >> >> >> > cephfs
> >> >> >> > client?
> >> >> >> >
> >> >> >> > On Thu, Apr 9, 2015 at 8:58 AM, Kyle Hutson  >
> >> >> >> > wrote:
> >> >> >> >>
> >> >> >> >> I upgraded from giant to hammer yesterday and now 'ceph -w' is
> >> >> >> >> constantly
> >> >> >> >> repeating this message:
> >> >> >> >>
> >> >> >> >> 2015-04-09 08:50:26.318042 7f95dbf86700  0 --
> 10.5.38.1:0/2037478
> >> >> >> >> >>
> >> >> >> >> 10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0
> cs=0
> >> >> >> >> l=1
> >> >> >> >> c=0x7f95e0023670).connect protocol feature mismatch, my
> >> >> >> >> 3fff
> >> >> >> >> <
> >> >> >> >> peer
> >> >> >> >> 13fff missing 1
> >> >> >> >>
> >> >> >> >> It isn't always the same IP for the destination - here's
> another:
> >> >> >> >> 2015-04-09 08:50:20.322059 7f95dc087700  0 --
> 10.5.38.1:0/2037478
> >> >> >> >> >>
> >> >> >> >> 10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0
> cs=0
> >> >> >> >> l=1
> >> >> >> >> c=0x7f95e002b480).connect protocol feature mismatch, my
> >> >> >> >> 3fff
> >> >> >> >> <
> >> >> >> >> peer
> >> >> >> >> 13fff missing 1
> >> >> >> >>
> >> >> >> >> Some details about our install:
> >> >> >> >> We have 24 hosts with 18 OSDs each. 16 per host are spinning
> >> >> >> >> disks
> >> >> >> >> in
> >> >> >> >> an
> >> >> >> >> erasure coded pool (k=8 m=4). 2 OSDs per host are SSD
> partitions
> >> >> >> >> used
> >> >> >> >> for a
> >> >> >> >> caching tier in front of the EC pool. All 24 hosts are
> monitors.
> >> >> >> >> 4
> >> >> >> >> hosts are
> >> >> >> >> mds. We are running cephfs with a client trying to write data
> >> >> >> >> over
> >> >> >> >> cephfs
> >> >> >> >> when we're seeing these messages.
> >> >> >> >>
> >> >> >> >> Any ideas?
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> > ___
> >> >> >> > ceph-users mailing list
> >> >> >> > ceph-users@lists.ceph.com
> >> >> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >> >> >
> >> >> >
> >> >> >
> >> >
> >> >
> >
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "protocol feature mismatch" after upgrading to Hammer

2015-04-09 Thread Gregory Farnum
Hmmm. That does look right and neither I nor Sage can come up with
anything via code inspection. Can you post the actual binary crush map
somewhere for download so that we can inspect it with our tools?
-Greg

On Thu, Apr 9, 2015 at 7:57 AM, Kyle Hutson  wrote:
> Here 'tis:
> https://dpaste.de/POr1
>
>
> On Thu, Apr 9, 2015 at 9:49 AM, Gregory Farnum  wrote:
>>
>> Can you dump your crush map and post it on pastebin or something?
>>
>> On Thu, Apr 9, 2015 at 7:26 AM, Kyle Hutson  wrote:
>> > Nope - it's 64-bit.
>> >
>> > (Sorry, I missed the reply-all last time.)
>> >
>> > On Thu, Apr 9, 2015 at 9:24 AM, Gregory Farnum  wrote:
>> >>
>> >> [Re-added the list]
>> >>
>> >> Hmm, I'm checking the code and that shouldn't be possible. What's your
>> >> client? (In particular, is it 32-bit? That's the only thing I can
>> >> think of that might have slipped through our QA.)
>> >>
>> >> On Thu, Apr 9, 2015 at 7:17 AM, Kyle Hutson  wrote:
>> >> > I did nothing to enable anything else. Just changed my ceph repo from
>> >> > 'giant' to 'hammer', then did 'yum update' and restarted services.
>> >> >
>> >> > On Thu, Apr 9, 2015 at 9:15 AM, Gregory Farnum 
>> >> > wrote:
>> >> >>
>> >> >> Did you enable the straw2 stuff? CRUSHV4 shouldn't be required by
>> >> >> the
>> >> >> cluster unless you made changes to the layout requiring it.
>> >> >>
>> >> >> If you did, the clients have to be upgraded to understand it. You
>> >> >> could disable all the v4 features; that should let them connect
>> >> >> again.
>> >> >> -Greg
>> >> >>
>> >> >> On Thu, Apr 9, 2015 at 7:07 AM, Kyle Hutson 
>> >> >> wrote:
>> >> >> > This particular problem I just figured out myself ('ceph -w' was
>> >> >> > still
>> >> >> > running from before the upgrade, and ctrl-c and restarting solved
>> >> >> > that
>> >> >> > issue), but I'm still having a similar problem on the ceph client:
>> >> >> >
>> >> >> > libceph: mon19 10.5.38.20:6789 feature set mismatch, my
>> >> >> > 2b84a042aca <
>> >> >> > server's 102b84a042aca, missing 1
>> >> >> >
>> >> >> > It appears that even the latest kernel doesn't have support for
>> >> >> > CEPH_FEATURE_CRUSH_V4
>> >> >> >
>> >> >> > How do I make my ceph cluster backward-compatible with the old
>> >> >> > cephfs
>> >> >> > client?
>> >> >> >
>> >> >> > On Thu, Apr 9, 2015 at 8:58 AM, Kyle Hutson 
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> I upgraded from giant to hammer yesterday and now 'ceph -w' is
>> >> >> >> constantly
>> >> >> >> repeating this message:
>> >> >> >>
>> >> >> >> 2015-04-09 08:50:26.318042 7f95dbf86700  0 -- 10.5.38.1:0/2037478
>> >> >> >> >>
>> >> >> >> 10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0 cs=0
>> >> >> >> l=1
>> >> >> >> c=0x7f95e0023670).connect protocol feature mismatch, my
>> >> >> >> 3fff
>> >> >> >> <
>> >> >> >> peer
>> >> >> >> 13fff missing 1
>> >> >> >>
>> >> >> >> It isn't always the same IP for the destination - here's another:
>> >> >> >> 2015-04-09 08:50:20.322059 7f95dc087700  0 -- 10.5.38.1:0/2037478
>> >> >> >> >>
>> >> >> >> 10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0 cs=0
>> >> >> >> l=1
>> >> >> >> c=0x7f95e002b480).connect protocol feature mismatch, my
>> >> >> >> 3fff
>> >> >> >> <
>> >> >> >> peer
>> >> >> >> 13fff missing 1
>> >> >> >>
>> >> >> >> Some details about our install:
>> >> >> >> We have 24 hosts with 18 OSDs each. 16 per host are spinning
>> >> >> >> disks
>> >> >> >> in
>> >> >> >> an
>> >> >> >> erasure coded pool (k=8 m=4). 2 OSDs per host are SSD partitions
>> >> >> >> used
>> >> >> >> for a
>> >> >> >> caching tier in front of the EC pool. All 24 hosts are monitors.
>> >> >> >> 4
>> >> >> >> hosts are
>> >> >> >> mds. We are running cephfs with a client trying to write data
>> >> >> >> over
>> >> >> >> cephfs
>> >> >> >> when we're seeing these messages.
>> >> >> >>
>> >> >> >> Any ideas?
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > ___
>> >> >> > ceph-users mailing list
>> >> >> > ceph-users@lists.ceph.com
>> >> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> >> >
>> >> >
>> >> >
>> >
>> >
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSDs not coming up on one host

2015-04-09 Thread Jacob Reid
On Thu, Apr 09, 2015 at 08:46:07AM -0700, Gregory Farnum wrote:
> On Thu, Apr 9, 2015 at 8:14 AM, Jacob Reid  
> wrote:
> > On Thu, Apr 09, 2015 at 06:43:45AM -0700, Gregory Farnum wrote:
> >> You can turn up debugging ("debug osd = 10" and "debug filestore = 10"
> >> are probably enough, or maybe 20 each) and see what comes out to get
> >> more information about why the threads are stuck.
> >>
> >> But just from the log my answer is the same as before, and now I don't
> >> trust that controller (or maybe its disks), regardless of what it's
> >> admitting to. ;)
> >> -Greg
> >>
> >
> > Ran with osd and filestore debug both at 20; still nothing jumping out at 
> > me. Logfile attached as it got huge fairly quickly, but mostly seems to be 
> > the same extra lines. I tried running some test I/O on the drives in 
> > question to try and provoke some kind of problem, but they seem fine now...
> 
> Okay, this is strange. Something very wonky is happening with your
> scheduler — it looks like these threads are all idle, and they're
> scheduling wakeups that handle an appreciable amount of time after
> they're supposed to. For instance:
> 2015-04-09 15:56:55.953116 7f70a7963700 20
> filestore(/var/lib/ceph/osd/osd.15) sync_entry woke after 5.416704
> 2015-04-09 15:56:55.953153 7f70a7963700 20
> filestore(/var/lib/ceph/osd/osd.15) sync_entry waiting for
> max_interval 5.00
> 
> This is the thread that syncs your backing store, and it always sets
> itself to get woken up at 5-second intervals — but here it took >5.4
> seconds, and later on in your log it takes more than 6 seconds.
> It looks like all the threads which are getting timed out are also
> idle, but are taking so much longer to wake up than they're set for
> that they get a timeout warning.
> 
> There might be some bugs in here where we're expecting wakeups to be
> more precise than they can be, but these sorts of misses are
> definitely not normal. Is this server overloaded on the CPU? Have you
> done something to make the scheduler or wakeups wonky?
> -Greg

CPU load is minimal - the host does nothing but run OSDs and has 8 cores that 
are all sitting idle with a load average of 0.1. I haven't done anything to 
scheduling. That was with the debug logging on, if that could be the cause of 
any delays. A scheduler issue seems possible - I haven't done anything to it, 
but `time sleep 5` run a few times returns anything spread randomly from 5.002 
to 7.1(!) seconds but mostly in the 5.5-6.0 region where it managed fairly 
consistently <5.2 on the other servers in the cluster and <5.02 on my desktop. 
I have disabled the CPU power saving mode as the only thing I could think of 
that might be having an effect on this, and running the same test again gives 
more sane results... we'll see if this reflects in the OSD logs or not, I 
guess. If this is the cause, it's probably something that the next version 
might want to detect and warn about specifically. I will keep you 
updated as to their behaviour now...
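
(For anyone wanting to check their own hosts, the test was just the one
mentioned above run in a loop, watching how far past 5 seconds the reported
real time drifts:)

  for i in $(seq 1 10); do time sleep 5; done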
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS unmatched rstat after upgrade hammer

2015-04-09 Thread Scottix
Alright sounds good.

Only one comment then:
From an IT/ops perspective all I see is ERR and that raises red flags. So
the exposure of the message might need some tweaking. In production I like
to be notified of an issue but have reassurance it was fixed within the
system.

Best Regards

On Wed, Apr 8, 2015 at 8:10 PM Yan, Zheng  wrote:

> On Thu, Apr 9, 2015 at 7:09 AM, Scottix  wrote:
> > I was testing the upgrade on our dev environment and after I restarted
> the
> > mds I got the following errors.
> >
> > 2015-04-08 15:58:34.056470 mds.0 [ERR] unmatched rstat on 605, inode has
> > n(v70 rc2015-03-16 09:11:34.390905), dirfrags have n(v0 rc2015-03-16
> > 09:11:34.390905 1=0+1)
> > 2015-04-08 15:58:34.056530 mds.0 [ERR] unmatched rstat on 604, inode has
> > n(v69 rc2015-03-31 08:07:09.265241), dirfrags have n(v0 rc2015-03-31
> > 08:07:09.265241 1=0+1)
> > 2015-04-08 15:58:34.056581 mds.0 [ERR] unmatched rstat on 606, inode has
> > n(v67 rc2015-03-16 08:54:36.314790), dirfrags have n(v0 rc2015-03-16
> > 08:54:36.314790 1=0+1)
> > 2015-04-08 15:58:34.056633 mds.0 [ERR] unmatched rstat on 607, inode has
> > n(v57 rc2015-03-16 08:54:46.797240), dirfrags have n(v0 rc2015-03-16
> > 08:54:46.797240 1=0+1)
> > 2015-04-08 15:58:34.056687 mds.0 [ERR] unmatched rstat on 608, inode has
> > n(v23 rc2015-03-16 08:54:59.634299), dirfrags have n(v0 rc2015-03-16
> > 08:54:59.634299 1=0+1)
> > 2015-04-08 15:58:34.056737 mds.0 [ERR] unmatched rstat on 609, inode has
> > n(v62 rc2015-03-16 08:55:06.598286), dirfrags have n(v0 rc2015-03-16
> > 08:55:06.598286 1=0+1)
> > 2015-04-08 15:58:34.056789 mds.0 [ERR] unmatched rstat on 600, inode has
> > n(v101 rc2015-03-16 08:55:16.153175), dirfrags have n(v0 rc2015-03-16
> > 08:55:16.153175 1=0+1)
>
> These errors are likely caused by the bug that rstats are not set to
> correct values
> when creating new fs. Nothing to worry about, the MDS automatically fixes
> rstat
> errors.
>
> >
> > I am not sure if this is an issue or got fixed or something I should
> worry
> > about. But would just like some context around this issue since it came
> up
> > in the ceph -w and other users might see it as well.
> >
> > I have done a lot of "unsafe" stuff on this mds so not to freak anyone
> out
> > if that is the issue.
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low power single disk nodes

2015-04-09 Thread Mark Nelson
Notice that this is under their emerging technologies section.  I don't 
think you can buy them yet.  Hopefully we'll know more as time goes on. :)


Mark


On 04/09/2015 10:52 AM, Stillwell, Bryan wrote:

These are really interesting to me, but how can you buy them?  What's the
performance like in ceph?  Are they using the keyvaluestore backend, or
something specific to these drives?  Also what kind of chassis do they go
into (some kind of ethernet JBOD)?

Bryan

On 4/9/15, 9:43 AM, "Mark Nelson"  wrote:


How about drives that run Linux with an ARM processor, RAM, and an
ethernet port right on the drive?  Notice the Ceph logo. :)

https://www.hgst.com/science-of-storage/emerging-technologies/open-etherne
t-drive-architecture

Mark

On 04/09/2015 10:37 AM, Scott Laird wrote:

Minnowboard Max?  2 atom cores, 1 SATA port, and a real (non-USB)
Ethernet port.


On Thu, Apr 9, 2015, 8:03 AM p...@philw.com 
mailto:p...@philw.com>> wrote:

 Rather expensive option:

 Applied Micro X-Gene, overkill for a single disk, and only really
 available in a
 development kit format right now.


>

 Better Option:

 Ambedded CY7 - 7 nodes in 1U half Depth, 6 positions for SATA disks,
 and one
 node with mSATA SSD

 >

 --phil

  > On 09 April 2015 at 15:57 Quentin Hartman
 mailto:qhart...@direwolfdigital.com>>
  > wrote:
  >
  >  I'm skeptical about how well this would work, but a Banana Pi
 might be a
  > place to start. Like a raspberry pi, but it has a SATA connector:
  > http://www.bananapi.org/
  >
  >  On Thu, Apr 9, 2015 at 3:18 AM, Jerker Nyberg
 mailto:jer...@update.uu.se>
  > > >
wrote:
  >> >Hello ceph users,
  > >
  > >Is anyone running any low powered single disk nodes with
 Ceph now?
  > > Calxeda seems to be no more according to Wikipedia. I do not
 think HP
  > > moonshot is what I am looking for - I want stand-alone nodes,
 not server
  > > cartridges integrated into server chassis. And I do not want to
 be locked to
  > > a single vendor.
  > >
  > >I was playing with Raspberry Pi 2 for signage when I thought
 of my old
  > > experiments with Ceph.
  > >
  > >I am thinking of for example Odroid-C1 or Odroid-XU3 Lite or
 maybe
  > > something with a low-power Intel x64/x86 processor. Together
 with one SSD or
  > > one low power HDD the node could get all power via PoE (via
 splitter or
  > > integrated into board if such boards exist). PoE provide remote
 power-on
  > > power-off even for consumer grade nodes.
  > >
  > >The cost for a single low power node should be able to
 compete with
  > > traditional PC-servers price per disk. Ceph take care of
 redundancy.
  > >
  > >I think simple custom casing should be good enough - maybe
 just strap or
  > > velcro everything on trays in the rack, at least for the nodes
 with SSD.
  > >
  > >Kind regards,
  > >--
  > >Jerker Nyberg, Uppsala, Sweden.
  > >_
  > >ceph-users mailing list
  > > ceph-users@lists.ceph.com 
 >
  > > http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com
 
  > > >
  > >  >  _
  >  ceph-users mailing list
  > ceph-users@lists.ceph.com 
  > http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com
 
  >


 _
 ceph-users mailing list
 ceph-users@lists.ceph.com 
 http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com
 



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




Re: [ceph-users] low power single disk nodes

2015-04-09 Thread Stillwell, Bryan
These are really interesting to me, but how can you buy them?  What's the
performance like in ceph?  Are they using the keyvaluestore backend, or
something specific to these drives?  Also what kind of chassis do they go
into (some kind of ethernet JBOD)?

Bryan

On 4/9/15, 9:43 AM, "Mark Nelson"  wrote:

>How about drives that run Linux with an ARM processor, RAM, and an
>ethernet port right on the drive?  Notice the Ceph logo. :)
>
>https://www.hgst.com/science-of-storage/emerging-technologies/open-etherne
>t-drive-architecture
>
>Mark
>
>On 04/09/2015 10:37 AM, Scott Laird wrote:
>> Minnowboard Max?  2 atom cores, 1 SATA port, and a real (non-USB)
>> Ethernet port.
>>
>>
>> On Thu, Apr 9, 2015, 8:03 AM p...@philw.com 
>> mailto:p...@philw.com>> wrote:
>>
>> Rather expensive option:
>>
>> Applied Micro X-Gene, overkill for a single disk, and only really
>> available in a
>> development kit format right now.
>>
>>
>>>ent-kits/
>>
>>>kits/>>
>>
>> Better Option:
>>
>> Ambedded CY7 - 7 nodes in 1U half Depth, 6 positions for SATA disks,
>> and one
>> node with mSATA SSD
>>
>> > >
>>
>> --phil
>>
>>  > On 09 April 2015 at 15:57 Quentin Hartman
>> mailto:qhart...@direwolfdigital.com>>
>>  > wrote:
>>  >
>>  >  I'm skeptical about how well this would work, but a Banana Pi
>> might be a
>>  > place to start. Like a raspberry pi, but it has a SATA connector:
>>  > http://www.bananapi.org/
>>  >
>>  >  On Thu, Apr 9, 2015 at 3:18 AM, Jerker Nyberg
>> mailto:jer...@update.uu.se>
>>  > > >
>>wrote:
>>  >> >Hello ceph users,
>>  > >
>>  > >Is anyone running any low powered single disk nodes with
>> Ceph now?
>>  > > Calxeda seems to be no more according to Wikipedia. I do not
>> think HP
>>  > > moonshot is what I am looking for - I want stand-alone nodes,
>> not server
>>  > > cartridges integrated into server chassis. And I do not want to
>> be locked to
>>  > > a single vendor.
>>  > >
>>  > >I was playing with Raspberry Pi 2 for signage when I thought
>> of my old
>>  > > experiments with Ceph.
>>  > >
>>  > >I am thinking of for example Odroid-C1 or Odroid-XU3 Lite or
>> maybe
>>  > > something with a low-power Intel x64/x86 processor. Together
>> with one SSD or
>>  > > one low power HDD the node could get all power via PoE (via
>> splitter or
>>  > > integrated into board if such boards exist). PoE provide remote
>> power-on
>>  > > power-off even for consumer grade nodes.
>>  > >
>>  > >The cost for a single low power node should be able to
>> compete with
>>  > > traditional PC-servers price per disk. Ceph take care of
>> redundancy.
>>  > >
>>  > >I think simple custom casing should be good enough - maybe
>> just strap or
>>  > > velcro everything on trays in the rack, at least for the nodes
>> with SSD.
>>  > >
>>  > >Kind regards,
>>  > >--
>>  > >Jerker Nyberg, Uppsala, Sweden.
>>  > >_
>>  > >ceph-users mailing list
>>  > > ceph-users@lists.ceph.com 
>> >>
>>  > > http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com
>> 
>>  > > > >
>>  > >  >  _
>>  >  ceph-users mailing list
>>  > ceph-users@lists.ceph.com 
>>  > http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com
>> 
>>  >
>>
>>
>> _
>> ceph-users mailing list
>> ceph-users@lists.ceph.com 
>> http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com
>> 
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



Re: [ceph-users] low power single disk nodes

2015-04-09 Thread Quentin Hartman
Where's the "take my money" button?

On Thu, Apr 9, 2015 at 9:43 AM, Mark Nelson  wrote:

> How about drives that run Linux with an ARM processor, RAM, and an
> ethernet port right on the drive?  Notice the Ceph logo. :)
>
> https://www.hgst.com/science-of-storage/emerging-
> technologies/open-ethernet-drive-architecture
>
> Mark
>
> On 04/09/2015 10:37 AM, Scott Laird wrote:
>
>> Minnowboard Max?  2 atom cores, 1 SATA port, and a real (non-USB)
>> Ethernet port.
>>
>>
>> On Thu, Apr 9, 2015, 8:03 AM p...@philw.com 
>> mailto:p...@philw.com>> wrote:
>>
>> Rather expensive option:
>>
>> Applied Micro X-Gene, overkill for a single disk, and only really
>> available in a
>> development kit format right now.
>>
>> > __c1-development-kits/
>> > c1-development-kits/>>
>>
>> Better Option:
>>
>> Ambedded CY7 - 7 nodes in 1U half Depth, 6 positions for SATA disks,
>> and one
>> node with mSATA SSD
>>
>> > >
>>
>> --phil
>>
>>  > On 09 April 2015 at 15:57 Quentin Hartman
>> mailto:qhart...@direwolfdigital.com>>
>>  > wrote:
>>  >
>>  >  I'm skeptical about how well this would work, but a Banana Pi
>> might be a
>>  > place to start. Like a raspberry pi, but it has a SATA connector:
>>  > http://www.bananapi.org/
>>  >
>>  >  On Thu, Apr 9, 2015 at 3:18 AM, Jerker Nyberg
>> mailto:jer...@update.uu.se>
>>  > > > wrote:
>>  >> >Hello ceph users,
>>  > >
>>  > >Is anyone running any low powered single disk nodes with
>> Ceph now?
>>  > > Calxeda seems to be no more according to Wikipedia. I do not
>> think HP
>>  > > moonshot is what I am looking for - I want stand-alone nodes,
>> not server
>>  > > cartridges integrated into server chassis. And I do not want to
>> be locked to
>>  > > a single vendor.
>>  > >
>>  > >I was playing with Raspberry Pi 2 for signage when I thought
>> of my old
>>  > > experiments with Ceph.
>>  > >
>>  > >I am thinking of for example Odroid-C1 or Odroid-XU3 Lite or
>> maybe
>>  > > something with a low-power Intel x64/x86 processor. Together
>> with one SSD or
>>  > > one low power HDD the node could get all power via PoE (via
>> splitter or
>>  > > integrated into board if such boards exist). PoE provide remote
>> power-on
>>  > > power-off even for consumer grade nodes.
>>  > >
>>  > >The cost for a single low power node should be able to
>> compete with
>>  > > traditional PC-servers price per disk. Ceph take care of
>> redundancy.
>>  > >
>>  > >I think simple custom casing should be good enough - maybe
>> just strap or
>>  > > velcro everything on trays in the rack, at least for the nodes
>> with SSD.
>>  > >
>>  > >Kind regards,
>>  > >--
>>  > >Jerker Nyberg, Uppsala, Sweden.
>>  > >_
>>  > >ceph-users mailing list
>>  > > ceph-users@lists.ceph.com 
>> > >>
>>  > > http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com
>> 
>>  > > > >
>>  > >  >  _
>>  >  ceph-users mailing list
>>  > ceph-users@lists.ceph.com 
>>  > http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com
>> 
>>  >
>>
>>
>> _
>> ceph-users mailing list
>> ceph-users@lists.ceph.com 
>> http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com
>> 
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>  ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSDs not coming up on one host

2015-04-09 Thread Gregory Farnum
On Thu, Apr 9, 2015 at 8:14 AM, Jacob Reid  wrote:
> On Thu, Apr 09, 2015 at 06:43:45AM -0700, Gregory Farnum wrote:
>> You can turn up debugging ("debug osd = 10" and "debug filestore = 10"
>> are probably enough, or maybe 20 each) and see what comes out to get
>> more information about why the threads are stuck.
>>
>> But just from the log my answer is the same as before, and now I don't
>> trust that controller (or maybe its disks), regardless of what it's
>> admitting to. ;)
>> -Greg
>>
>
> Ran with osd and filestore debug both at 20; still nothing jumping out at me. 
> Logfile attached as it got huge fairly quickly, but mostly seems to be the 
> same extra lines. I tried running some test I/O on the drives in question to 
> try and provoke some kind of problem, but they seem fine now...

Okay, this is strange. Something very wonky is happening with your
scheduler — it looks like these threads are all idle, and the wakeups
they schedule are being handled an appreciable amount of time after
they're supposed to be. For instance:
2015-04-09 15:56:55.953116 7f70a7963700 20
filestore(/var/lib/ceph/osd/osd.15) sync_entry woke after 5.416704
2015-04-09 15:56:55.953153 7f70a7963700 20
filestore(/var/lib/ceph/osd/osd.15) sync_entry waiting for
max_interval 5.00

This is the thread that syncs your backing store, and it always sets
itself to get woken up at 5-second intervals — but here it took >5.4
seconds, and later on in your log it takes more than 6 seconds.
It looks like all the threads which are getting timed out are also
idle, but are taking so much longer to wake up than they're set for
that they get a timeout warning.

There might be some bugs in here where we're expecting wakeups to be
more precise than they can be, but these sorts of misses are
definitely not normal. Is this server overloaded on the CPU? Have you
done something to make the scheduler or wakeups wonky?
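
(A couple of quick, generic checks on that host might help narrow it down;
these are standard Linux tools, nothing ceph-specific, so adjust as needed:)

  uptime                     # load average vs. number of cores
  vmstat 1 5                 # run queue length and iowait over a few seconds
  cat /sys/devices/system/clocksource/clocksource0/current_clocksource
                             # an unusual clocksource can make timed waits drift
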
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low power single disk nodes

2015-04-09 Thread Mark Nelson
How about drives that run Linux with an ARM processor, RAM, and an 
ethernet port right on the drive?  Notice the Ceph logo. :)


https://www.hgst.com/science-of-storage/emerging-technologies/open-ethernet-drive-architecture

Mark

On 04/09/2015 10:37 AM, Scott Laird wrote:

Minnowboard Max?  2 atom cores, 1 SATA port, and a real (non-USB)
Ethernet port.


On Thu, Apr 9, 2015, 8:03 AM p...@philw.com 
mailto:p...@philw.com>> wrote:

Rather expensive option:

Applied Micro X-Gene, overkill for a single disk, and only really
available in a
development kit format right now.


>

Better Option:

Ambedded CY7 - 7 nodes in 1U half Depth, 6 positions for SATA disks,
and one
node with mSATA SSD

>

--phil

 > On 09 April 2015 at 15:57 Quentin Hartman
mailto:qhart...@direwolfdigital.com>>
 > wrote:
 >
 >  I'm skeptical about how well this would work, but a Banana Pi
might be a
 > place to start. Like a raspberry pi, but it has a SATA connector:
 > http://www.bananapi.org/
 >
 >  On Thu, Apr 9, 2015 at 3:18 AM, Jerker Nyberg
mailto:jer...@update.uu.se>
 > > > wrote:
 >> >Hello ceph users,
 > >
 > >Is anyone running any low powered single disk nodes with
Ceph now?
 > > Calxeda seems to be no more according to Wikipedia. I do not
think HP
 > > moonshot is what I am looking for - I want stand-alone nodes,
not server
 > > cartridges integrated into server chassis. And I do not want to
be locked to
 > > a single vendor.
 > >
 > >I was playing with Raspberry Pi 2 for signage when I thought
of my old
 > > experiments with Ceph.
 > >
 > >I am thinking of for example Odroid-C1 or Odroid-XU3 Lite or
maybe
 > > something with a low-power Intel x64/x86 processor. Together
with one SSD or
 > > one low power HDD the node could get all power via PoE (via
splitter or
 > > integrated into board if such boards exist). PoE provide remote
power-on
 > > power-off even for consumer grade nodes.
 > >
 > >The cost for a single low power node should be able to
compete with
 > > traditional PC-servers price per disk. Ceph take care of
redundancy.
 > >
 > >I think simple custom casing should be good enough - maybe
just strap or
 > > velcro everything on trays in the rack, at least for the nodes
with SSD.
 > >
 > >Kind regards,
 > >--
 > >Jerker Nyberg, Uppsala, Sweden.
 > >_
 > >ceph-users mailing list
 > > ceph-users@lists.ceph.com 
>
 > > http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com

 > > >
 > >  >  _
 >  ceph-users mailing list
 > ceph-users@lists.ceph.com 
 > http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com

 >


_
ceph-users mailing list
ceph-users@lists.ceph.com 
http://lists.ceph.com/__listinfo.cgi/ceph-users-ceph.__com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low power single disk nodes

2015-04-09 Thread Scott Laird
Minnowboard Max?  2 atom cores, 1 SATA port, and a real (non-USB) Ethernet
port.

On Thu, Apr 9, 2015, 8:03 AM p...@philw.com  wrote:

> Rather expensive option:
>
> Applied Micro X-Gene, overkill for a single disk, and only really
> available in a
> development kit format right now.
>
>  c1-development-kits/>
>
> Better Option:
>
> Ambedded CY7 - 7 nodes in 1U half Depth, 6 positions for SATA disks, and
> one
> node with mSATA SSD
>
> 
>
> --phil
>
> > On 09 April 2015 at 15:57 Quentin Hartman 
> > wrote:
> >
> >  I'm skeptical about how well this would work, but a Banana Pi might be a
> > place to start. Like a raspberry pi, but it has a SATA connector:
> > http://www.bananapi.org/
> >
> >  On Thu, Apr 9, 2015 at 3:18 AM, Jerker Nyberg  >  > wrote:
> >> >Hello ceph users,
> > >
> > >Is anyone running any low powered single disk nodes with Ceph now?
> > > Calxeda seems to be no more according to Wikipedia. I do not think HP
> > > moonshot is what I am looking for - I want stand-alone nodes, not
> server
> > > cartridges integrated into server chassis. And I do not want to be
> locked to
> > > a single vendor.
> > >
> > >I was playing with Raspberry Pi 2 for signage when I thought of my
> old
> > > experiments with Ceph.
> > >
> > >I am thinking of for example Odroid-C1 or Odroid-XU3 Lite or maybe
> > > something with a low-power Intel x64/x86 processor. Together with one
> SSD or
> > > one low power HDD the node could get all power via PoE (via splitter or
> > > integrated into board if such boards exist). PoE provide remote
> power-on
> > > power-off even for consumer grade nodes.
> > >
> > >The cost for a single low power node should be able to compete with
> > > traditional PC-servers price per disk. Ceph take care of redundancy.
> > >
> > >I think simple custom casing should be good enough - maybe just
> strap or
> > > velcro everything on trays in the rack, at least for the nodes with
> SSD.
> > >
> > >Kind regards,
> > >--
> > >Jerker Nyberg, Uppsala, Sweden.
> > >___
> > >ceph-users mailing list
> > >ceph-users@lists.ceph.com 
> > >http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > 
> > >  >  ___
> >  ceph-users mailing list
> >  ceph-users@lists.ceph.com
> >  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low power single disk nodes

2015-04-09 Thread p...@philw.com
Rather expensive option: 

Applied Micro X-Gene, overkill for a single disk, and only really available in a
development kit format right now.
 

 
Better Option: 

Ambedded CY7 - 7 nodes in 1U half Depth, 6 positions for SATA disks, and one
node with mSATA SSD
 


--phil

> On 09 April 2015 at 15:57 Quentin Hartman 
> wrote:
> 
>  I'm skeptical about how well this would work, but a Banana Pi might be a
> place to start. Like a raspberry pi, but it has a SATA connector:
> http://www.bananapi.org/
> 
>  On Thu, Apr 9, 2015 at 3:18 AM, Jerker Nyberg   > wrote:
>> >Hello ceph users,
> > 
> >Is anyone running any low powered single disk nodes with Ceph now?
> > Calxeda seems to be no more according to Wikipedia. I do not think HP
> > moonshot is what I am looking for - I want stand-alone nodes, not server
> > cartridges integrated into server chassis. And I do not want to be locked to
> > a single vendor.
> > 
> >I was playing with Raspberry Pi 2 for signage when I thought of my old
> > experiments with Ceph.
> > 
> >I am thinking of for example Odroid-C1 or Odroid-XU3 Lite or maybe
> > something with a low-power Intel x64/x86 processor. Together with one SSD or
> > one low power HDD the node could get all power via PoE (via splitter or
> > integrated into board if such boards exist). PoE provide remote power-on
> > power-off even for consumer grade nodes.
> > 
> >The cost for a single low power node should be able to compete with
> > traditional PC-servers price per disk. Ceph take care of redundancy.
> > 
> >I think simple custom casing should be good enough - maybe just strap or
> > velcro everything on trays in the rack, at least for the nodes with SSD.
> > 
> >Kind regards,
> >--
> >Jerker Nyberg, Uppsala, Sweden.
> >___
> >ceph-users mailing list
> >ceph-users@lists.ceph.com 
> >http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > 
> >  >  ___
>  ceph-users mailing list
>  ceph-users@lists.ceph.com
>  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "protocol feature mismatch" after upgrading to Hammer

2015-04-09 Thread Kyle Hutson
Here 'tis:
https://dpaste.de/POr1


On Thu, Apr 9, 2015 at 9:49 AM, Gregory Farnum  wrote:

> Can you dump your crush map and post it on pastebin or something?
>
> On Thu, Apr 9, 2015 at 7:26 AM, Kyle Hutson  wrote:
> > Nope - it's 64-bit.
> >
> > (Sorry, I missed the reply-all last time.)
> >
> > On Thu, Apr 9, 2015 at 9:24 AM, Gregory Farnum  wrote:
> >>
> >> [Re-added the list]
> >>
> >> Hmm, I'm checking the code and that shouldn't be possible. What's your
> >> ciient? (In particular, is it 32-bit? That's the only thing i can
> >> think of that might have slipped through our QA.)
> >>
> >> On Thu, Apr 9, 2015 at 7:17 AM, Kyle Hutson  wrote:
> >> > I did nothing to enable anything else. Just changed my ceph repo from
> >> > 'giant' to 'hammer', then did 'yum update' and restarted services.
> >> >
> >> > On Thu, Apr 9, 2015 at 9:15 AM, Gregory Farnum 
> wrote:
> >> >>
> >> >> Did you enable the straw2 stuff? CRUSHV4 shouldn't be required by the
> >> >> cluster unless you made changes to the layout requiring it.
> >> >>
> >> >> If you did, the clients have to be upgraded to understand it. You
> >> >> could disable all the v4 features; that should let them connect
> again.
> >> >> -Greg
> >> >>
> >> >> On Thu, Apr 9, 2015 at 7:07 AM, Kyle Hutson 
> wrote:
> >> >> > This particular problem I just figured out myself ('ceph -w' was
> >> >> > still
> >> >> > running from before the upgrade, and ctrl-c and restarting solved
> >> >> > that
> >> >> > issue), but I'm still having a similar problem on the ceph client:
> >> >> >
> >> >> > libceph: mon19 10.5.38.20:6789 feature set mismatch, my
> 2b84a042aca <
> >> >> > server's 102b84a042aca, missing 1
> >> >> >
> >> >> > It appears that even the latest kernel doesn't have support for
> >> >> > CEPH_FEATURE_CRUSH_V4
> >> >> >
> >> >> > How do I make my ceph cluster backward-compatible with the old
> cephfs
> >> >> > client?
> >> >> >
> >> >> > On Thu, Apr 9, 2015 at 8:58 AM, Kyle Hutson 
> >> >> > wrote:
> >> >> >>
> >> >> >> I upgraded from giant to hammer yesterday and now 'ceph -w' is
> >> >> >> constantly
> >> >> >> repeating this message:
> >> >> >>
> >> >> >> 2015-04-09 08:50:26.318042 7f95dbf86700  0 -- 10.5.38.1:0/2037478
> >>
> >> >> >> 10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0 cs=0
> l=1
> >> >> >> c=0x7f95e0023670).connect protocol feature mismatch, my
> 3fff
> >> >> >> <
> >> >> >> peer
> >> >> >> 13fff missing 1
> >> >> >>
> >> >> >> It isn't always the same IP for the destination - here's another:
> >> >> >> 2015-04-09 08:50:20.322059 7f95dc087700  0 -- 10.5.38.1:0/2037478
> >>
> >> >> >> 10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0 cs=0
> l=1
> >> >> >> c=0x7f95e002b480).connect protocol feature mismatch, my
> 3fff
> >> >> >> <
> >> >> >> peer
> >> >> >> 13fff missing 1
> >> >> >>
> >> >> >> Some details about our install:
> >> >> >> We have 24 hosts with 18 OSDs each. 16 per host are spinning disks
> >> >> >> in
> >> >> >> an
> >> >> >> erasure coded pool (k=8 m=4). 2 OSDs per host are SSD partitions
> >> >> >> used
> >> >> >> for a
> >> >> >> caching tier in front of the EC pool. All 24 hosts are monitors. 4
> >> >> >> hosts are
> >> >> >> mds. We are running cephfs with a client trying to write data over
> >> >> >> cephfs
> >> >> >> when we're seeing these messages.
> >> >> >>
> >> >> >> Any ideas?
> >> >> >
> >> >> >
> >> >> >
> >> >> > ___
> >> >> > ceph-users mailing list
> >> >> > ceph-users@lists.ceph.com
> >> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >> >
> >> >
> >> >
> >
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low power single disk nodes

2015-04-09 Thread Quentin Hartman
I'm skeptical about how well this would work, but a Banana Pi might be a
place to start. Like a raspberry pi, but it has a SATA connector:
http://www.bananapi.org/

On Thu, Apr 9, 2015 at 3:18 AM, Jerker Nyberg  wrote:

>
> Hello ceph users,
>
> Is anyone running any low powered single disk nodes with Ceph now? Calxeda
> seems to be no more according to Wikipedia. I do not think HP moonshot is
> what I am looking for - I want stand-alone nodes, not server cartridges
> integrated into server chassis. And I do not want to be locked to a single
> vendor.
>
> I was playing with Raspberry Pi 2 for signage when I thought of my old
> experiments with Ceph.
>
> I am thinking of for example Odroid-C1 or Odroid-XU3 Lite or maybe
> something with a low-power Intel x64/x86 processor. Together with one SSD
> or one low power HDD the node could get all power via PoE (via splitter or
> integrated into board if such boards exist). PoE provide remote power-on
> power-off even for consumer grade nodes.
>
> The cost for a single low power node should be able to compete with
> traditional PC-servers price per disk. Ceph take care of redundancy.
>
> I think simple custom casing should be good enough - maybe just strap or
> velcro everything on trays in the rack, at least for the nodes with SSD.
>
> Kind regards,
> --
> Jerker Nyberg, Uppsala, Sweden.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "protocol feature mismatch" after upgrading to Hammer

2015-04-09 Thread Gregory Farnum
Can you dump your crush map and post it on pastebin or something?
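
(For reference, the usual way to pull it out in text form, using standard
ceph/crushtool commands; just a sketch:)

  ceph osd getcrushmap -o /tmp/crushmap.bin
  crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt
  grep alg /tmp/crushmap.txt    # any "straw2" buckets here are what would require CRUSH_V4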

On Thu, Apr 9, 2015 at 7:26 AM, Kyle Hutson  wrote:
> Nope - it's 64-bit.
>
> (Sorry, I missed the reply-all last time.)
>
> On Thu, Apr 9, 2015 at 9:24 AM, Gregory Farnum  wrote:
>>
>> [Re-added the list]
>>
>> Hmm, I'm checking the code and that shouldn't be possible. What's your
>> ciient? (In particular, is it 32-bit? That's the only thing i can
>> think of that might have slipped through our QA.)
>>
>> On Thu, Apr 9, 2015 at 7:17 AM, Kyle Hutson  wrote:
>> > I did nothing to enable anything else. Just changed my ceph repo from
>> > 'giant' to 'hammer', then did 'yum update' and restarted services.
>> >
>> > On Thu, Apr 9, 2015 at 9:15 AM, Gregory Farnum  wrote:
>> >>
>> >> Did you enable the straw2 stuff? CRUSHV4 shouldn't be required by the
>> >> cluster unless you made changes to the layout requiring it.
>> >>
>> >> If you did, the clients have to be upgraded to understand it. You
>> >> could disable all the v4 features; that should let them connect again.
>> >> -Greg
>> >>
>> >> On Thu, Apr 9, 2015 at 7:07 AM, Kyle Hutson  wrote:
>> >> > This particular problem I just figured out myself ('ceph -w' was
>> >> > still
>> >> > running from before the upgrade, and ctrl-c and restarting solved
>> >> > that
>> >> > issue), but I'm still having a similar problem on the ceph client:
>> >> >
>> >> > libceph: mon19 10.5.38.20:6789 feature set mismatch, my 2b84a042aca <
>> >> > server's 102b84a042aca, missing 1
>> >> >
>> >> > It appears that even the latest kernel doesn't have support for
>> >> > CEPH_FEATURE_CRUSH_V4
>> >> >
>> >> > How do I make my ceph cluster backward-compatible with the old cephfs
>> >> > client?
>> >> >
>> >> > On Thu, Apr 9, 2015 at 8:58 AM, Kyle Hutson 
>> >> > wrote:
>> >> >>
>> >> >> I upgraded from giant to hammer yesterday and now 'ceph -w' is
>> >> >> constantly
>> >> >> repeating this message:
>> >> >>
>> >> >> 2015-04-09 08:50:26.318042 7f95dbf86700  0 -- 10.5.38.1:0/2037478 >>
>> >> >> 10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0 cs=0 l=1
>> >> >> c=0x7f95e0023670).connect protocol feature mismatch, my 3fff
>> >> >> <
>> >> >> peer
>> >> >> 13fff missing 1
>> >> >>
>> >> >> It isn't always the same IP for the destination - here's another:
>> >> >> 2015-04-09 08:50:20.322059 7f95dc087700  0 -- 10.5.38.1:0/2037478 >>
>> >> >> 10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0 cs=0 l=1
>> >> >> c=0x7f95e002b480).connect protocol feature mismatch, my 3fff
>> >> >> <
>> >> >> peer
>> >> >> 13fff missing 1
>> >> >>
>> >> >> Some details about our install:
>> >> >> We have 24 hosts with 18 OSDs each. 16 per host are spinning disks
>> >> >> in
>> >> >> an
>> >> >> erasure coded pool (k=8 m=4). 2 OSDs per host are SSD partitions
>> >> >> used
>> >> >> for a
>> >> >> caching tier in front of the EC pool. All 24 hosts are monitors. 4
>> >> >> hosts are
>> >> >> mds. We are running cephfs with a client trying to write data over
>> >> >> cephfs
>> >> >> when we're seeing these messages.
>> >> >>
>> >> >> Any ideas?
>> >> >
>> >> >
>> >> >
>> >> > ___
>> >> > ceph-users mailing list
>> >> > ceph-users@lists.ceph.com
>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> >
>> >
>> >
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "protocol feature mismatch" after upgrading to Hammer

2015-04-09 Thread Kyle Hutson
Nope - it's 64-bit.

(Sorry, I missed the reply-all last time.)

On Thu, Apr 9, 2015 at 9:24 AM, Gregory Farnum  wrote:

> [Re-added the list]
>
> Hmm, I'm checking the code and that shouldn't be possible. What's your
> ciient? (In particular, is it 32-bit? That's the only thing i can
> think of that might have slipped through our QA.)
>
> On Thu, Apr 9, 2015 at 7:17 AM, Kyle Hutson  wrote:
> > I did nothing to enable anything else. Just changed my ceph repo from
> > 'giant' to 'hammer', then did 'yum update' and restarted services.
> >
> > On Thu, Apr 9, 2015 at 9:15 AM, Gregory Farnum  wrote:
> >>
> >> Did you enable the straw2 stuff? CRUSHV4 shouldn't be required by the
> >> cluster unless you made changes to the layout requiring it.
> >>
> >> If you did, the clients have to be upgraded to understand it. You
> >> could disable all the v4 features; that should let them connect again.
> >> -Greg
> >>
> >> On Thu, Apr 9, 2015 at 7:07 AM, Kyle Hutson  wrote:
> >> > This particular problem I just figured out myself ('ceph -w' was still
> >> > running from before the upgrade, and ctrl-c and restarting solved that
> >> > issue), but I'm still having a similar problem on the ceph client:
> >> >
> >> > libceph: mon19 10.5.38.20:6789 feature set mismatch, my 2b84a042aca <
> >> > server's 102b84a042aca, missing 1
> >> >
> >> > It appears that even the latest kernel doesn't have support for
> >> > CEPH_FEATURE_CRUSH_V4
> >> >
> >> > How do I make my ceph cluster backward-compatible with the old cephfs
> >> > client?
> >> >
> >> > On Thu, Apr 9, 2015 at 8:58 AM, Kyle Hutson 
> wrote:
> >> >>
> >> >> I upgraded from giant to hammer yesterday and now 'ceph -w' is
> >> >> constantly
> >> >> repeating this message:
> >> >>
> >> >> 2015-04-09 08:50:26.318042 7f95dbf86700  0 -- 10.5.38.1:0/2037478 >>
> >> >> 10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0 cs=0 l=1
> >> >> c=0x7f95e0023670).connect protocol feature mismatch, my 3fff
> <
> >> >> peer
> >> >> 13fff missing 1
> >> >>
> >> >> It isn't always the same IP for the destination - here's another:
> >> >> 2015-04-09 08:50:20.322059 7f95dc087700  0 -- 10.5.38.1:0/2037478 >>
> >> >> 10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0 cs=0 l=1
> >> >> c=0x7f95e002b480).connect protocol feature mismatch, my 3fff
> <
> >> >> peer
> >> >> 13fff missing 1
> >> >>
> >> >> Some details about our install:
> >> >> We have 24 hosts with 18 OSDs each. 16 per host are spinning disks in
> >> >> an
> >> >> erasure coded pool (k=8 m=4). 2 OSDs per host are SSD partitions used
> >> >> for a
> >> >> caching tier in front of the EC pool. All 24 hosts are monitors. 4
> >> >> hosts are
> >> >> mds. We are running cephfs with a client trying to write data over
> >> >> cephfs
> >> >> when we're seeing these messages.
> >> >>
> >> >> Any ideas?
> >> >
> >> >
> >> >
> >> > ___
> >> > ceph-users mailing list
> >> > ceph-users@lists.ceph.com
> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >
> >
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "protocol feature mismatch" after upgrading to Hammer

2015-04-09 Thread Gregory Farnum
[Re-added the list]

Hmm, I'm checking the code and that shouldn't be possible. What's your
client? (In particular, is it 32-bit? That's the only thing I can
think of that might have slipped through our QA.)

On Thu, Apr 9, 2015 at 7:17 AM, Kyle Hutson  wrote:
> I did nothing to enable anything else. Just changed my ceph repo from
> 'giant' to 'hammer', then did 'yum update' and restarted services.
>
> On Thu, Apr 9, 2015 at 9:15 AM, Gregory Farnum  wrote:
>>
>> Did you enable the straw2 stuff? CRUSHV4 shouldn't be required by the
>> cluster unless you made changes to the layout requiring it.
>>
>> If you did, the clients have to be upgraded to understand it. You
>> could disable all the v4 features; that should let them connect again.
>> -Greg
>>
>> On Thu, Apr 9, 2015 at 7:07 AM, Kyle Hutson  wrote:
>> > This particular problem I just figured out myself ('ceph -w' was still
>> > running from before the upgrade, and ctrl-c and restarting solved that
>> > issue), but I'm still having a similar problem on the ceph client:
>> >
>> > libceph: mon19 10.5.38.20:6789 feature set mismatch, my 2b84a042aca <
>> > server's 102b84a042aca, missing 1
>> >
>> > It appears that even the latest kernel doesn't have support for
>> > CEPH_FEATURE_CRUSH_V4
>> >
>> > How do I make my ceph cluster backward-compatible with the old cephfs
>> > client?
>> >
>> > On Thu, Apr 9, 2015 at 8:58 AM, Kyle Hutson  wrote:
>> >>
>> >> I upgraded from giant to hammer yesterday and now 'ceph -w' is
>> >> constantly
>> >> repeating this message:
>> >>
>> >> 2015-04-09 08:50:26.318042 7f95dbf86700  0 -- 10.5.38.1:0/2037478 >>
>> >> 10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0 cs=0 l=1
>> >> c=0x7f95e0023670).connect protocol feature mismatch, my 3fff <
>> >> peer
>> >> 13fff missing 1
>> >>
>> >> It isn't always the same IP for the destination - here's another:
>> >> 2015-04-09 08:50:20.322059 7f95dc087700  0 -- 10.5.38.1:0/2037478 >>
>> >> 10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0 cs=0 l=1
>> >> c=0x7f95e002b480).connect protocol feature mismatch, my 3fff <
>> >> peer
>> >> 13fff missing 1
>> >>
>> >> Some details about our install:
>> >> We have 24 hosts with 18 OSDs each. 16 per host are spinning disks in
>> >> an
>> >> erasure coded pool (k=8 m=4). 2 OSDs per host are SSD partitions used
>> >> for a
>> >> caching tier in front of the EC pool. All 24 hosts are monitors. 4
>> >> hosts are
>> >> mds. We are running cephfs with a client trying to write data over
>> >> cephfs
>> >> when we're seeing these messages.
>> >>
>> >> Any ideas?
>> >
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cache-tier do not evict

2015-04-09 Thread Patrik Plank
Hi,



set the cache-tier size to 644245094400.

This should work.

But it is the same.
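
(For the record: 644245094400 is 600 GiB. With 3 x 512 GB OSDs and replica 2
the cache pool only has roughly 768 GB of usable capacity, so the earlier
750 GB target left almost no headroom. A sketch of the adjusted setting,
using the same syntax as the original commands:)

  ceph osd pool set cache-pool target_max_bytes $(( 600 * 1024 ** 3 ))   # = 644245094400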



thanks

regards



-Original message-
From: Gregory Farnum 
Sent: Thursday 9th April 2015 15:44
To: Patrik Plank 
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] cache-tier do not evict


On Thu, Apr 9, 2015 at 4:56 AM, Patrik Plank  wrote:
> Hi,
>
>
> i have build a cach-tier pool (replica 2) with 3 x 512gb ssd for my kvm
> pool.
>
> these are my settings :
>
>
> ceph osd tier add kvm cache-pool
>
> ceph osd tier cache-mode cache-pool writeback
>
> ceph osd tier set-overlay kvm cache-pool
>
>
> ceph osd pool set cache-pool hit_set_type bloom
>
> ceph osd pool set cache-pool hit_set_count 1
>
> ceph osd pool set cache-pool hit set period 3600
>
>
> ceph osd pool set cache-pool target_max_bytes 751619276800

 ^ 750 GB. For 3*512GB disks that's too large a target value.

>
> ceph osd pool set cache-pool target_max_objects 100
>
>
> ceph osd pool set cache-pool cache_min_flush_age 1800
>
> ceph osd pool set cache-pool cache_min_evict_age 600
>
>
> ceph osd pool cache-pool cache_target_dirty_ratio .4
>
> ceph osd pool cache-pool cache target_full_ratio .8
>
>
> So the problem is, the cache-tier do no evict automatically.
>
> If i copy some kvm images to the ceph cluster, the cache osds always run
> full.
>
>
> Is that normal?
>
> Is there a miss configuration?
>
>
> thanks
>
> best regards
>
> Patrik
>
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "protocol feature mismatch" after upgrading to Hammer

2015-04-09 Thread Gregory Farnum
Did you enable the straw2 stuff? CRUSHV4 shouldn't be required by the
cluster unless you made changes to the layout requiring it.

If you did, the clients have to be upgraded to understand it. You
could disable all the v4 features; that should let them connect again.
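
(Concretely, that means making sure no bucket in the crush map uses the straw2
algorithm. A hedged sketch with standard tooling, assuming a straw2 bucket
really is what set the CRUSH_V4 bit; note that changing a bucket's algorithm
will move some data around:)

  ceph osd getcrushmap -o cm.bin
  crushtool -d cm.bin -o cm.txt
  sed -i 's/alg straw2/alg straw/' cm.txt
  crushtool -c cm.txt -o cm-new.bin
  ceph osd setcrushmap -i cm-new.bin
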
-Greg

On Thu, Apr 9, 2015 at 7:07 AM, Kyle Hutson  wrote:
> This particular problem I just figured out myself ('ceph -w' was still
> running from before the upgrade, and ctrl-c and restarting solved that
> issue), but I'm still having a similar problem on the ceph client:
>
> libceph: mon19 10.5.38.20:6789 feature set mismatch, my 2b84a042aca <
> server's 102b84a042aca, missing 1
>
> It appears that even the latest kernel doesn't have support for
> CEPH_FEATURE_CRUSH_V4
>
> How do I make my ceph cluster backward-compatible with the old cephfs
> client?
>
> On Thu, Apr 9, 2015 at 8:58 AM, Kyle Hutson  wrote:
>>
>> I upgraded from giant to hammer yesterday and now 'ceph -w' is constantly
>> repeating this message:
>>
>> 2015-04-09 08:50:26.318042 7f95dbf86700  0 -- 10.5.38.1:0/2037478 >>
>> 10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0 cs=0 l=1
>> c=0x7f95e0023670).connect protocol feature mismatch, my 3fff < peer
>> 13fff missing 1
>>
>> It isn't always the same IP for the destination - here's another:
>> 2015-04-09 08:50:20.322059 7f95dc087700  0 -- 10.5.38.1:0/2037478 >>
>> 10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0 cs=0 l=1
>> c=0x7f95e002b480).connect protocol feature mismatch, my 3fff < peer
>> 13fff missing 1
>>
>> Some details about our install:
>> We have 24 hosts with 18 OSDs each. 16 per host are spinning disks in an
>> erasure coded pool (k=8 m=4). 2 OSDs per host are SSD partitions used for a
>> caching tier in front of the EC pool. All 24 hosts are monitors. 4 hosts are
>> mds. We are running cephfs with a client trying to write data over cephfs
>> when we're seeing these messages.
>>
>> Any ideas?
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] long blocking with writes on rbds

2015-04-09 Thread Jeff Epstein



On 04/09/2015 03:14 AM, Christian Balzer wrote:


> Your 6 OSDs are on a single VM from what I gather?
> Aside from being a very small number for something that you seem to be
> using in some sort of production environment (Ceph gets faster the more
> OSDs you add), where is the redundancy, HA in that?

We are running one OSD per VM. All data is replicated across three VMs.

> The number of your PGs and PGPs need to have at least a semblance of being
> correctly sized, as others mentioned before.
> You want to re-read the Ceph docs about that and check out the PG
> calculator:
> http://ceph.com/pgcalc/

My choice of pgs is based on this page. Since each pool is spread across
3 OSDs, 100 seemed like a good number. Am I misinterpreting this
documentation?

http://ceph.com/docs/master/rados/operations/placement-groups/

> Since RBDs are sparsely allocated, the actual data used is the key factor.
> But you're adding the pool removal overhead to this.

How much overhead does pool removal add?

> Both and the fact that you have overloaded the PGs by nearly a factor of
> 10 (or 20 if you're actually using a replica of 3 and not 1) doesn't help
> one bit.
> And lets clarify what objects are in the Ceph/RBD context, they're the (by
> default) 4MB blobs that make up a RBD image.


I'm curious how you reached your estimation of overloading. According to 
the pg calculator you linked to, given that each pool occupies only 3 
OSDs, the suggested number of pgs is around 100. Can you explain?
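
(Trying to reproduce the estimate myself from the cluster status I posted,
4400 PGs across 44 pools on 6 OSDs; rough arithmetic only:)

  echo $(( 4400 * 3 / 6 ))   # replica 3: 2200 PG copies per OSD
  echo $(( 4400 * 1 / 6 ))   # replica 1: 733 PG copies per OSD
  # versus the commonly cited target of roughly 100-200 per OSD

If the pgcalc figure of ~100 is meant per OSD across all pools rather than
per pool, then 44 pools of 100 PGs each on the same 6 OSDs would indeed add
up quickly.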

>> - Somewhat off-topic, but for my own curiosity: Why is deleting data so
>> slow, in terms of ceph's architecture? Shouldn't it just be a matter of
>> flagging a region as available and allowing it to be overwritten, as
>> would a traditional file system?

> Apples and oranges, as RBD is block storage, not a FS.
> That said, a traditional FS is local and updates an inode or equivalent
> bit.
> For Ceph to delete a RBD image, it has to go to all cluster nodes with
> OSDs that have PGs that contain objects of that image. Then those objects
> have to be deleted on the local filesystem of the OSD and various maps
> updated cluster wide. Rince and repeat until all objects have been dealt
> with.
> Quite a bit more involved, but that's the price you have to pay when you
> have a DISTRIBUTED storage architecture that doesn't rely on a single item
> (like an inode) to reflect things for the whole system.

Thank you for explaining.

Jeff
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "protocol feature mismatch" after upgrading to Hammer

2015-04-09 Thread Kyle Hutson
This particular problem I just figured out myself ('ceph -w' was still
running from before the upgrade, and ctrl-c and restarting solved that
issue), but I'm still having a similar problem on the ceph client:

libceph: mon19 10.5.38.20:6789 feature set mismatch, my 2b84a042aca <
server's 102b84a042aca, missing 1

It appears that even the latest kernel doesn't have support
for CEPH_FEATURE_CRUSH_V4

How do I make my ceph cluster backward-compatible with the old cephfs
client?
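
(A quick way to decode which bit is missing, using the two values from the
libceph line above; plain shell arithmetic:)

  printf '0x%x\n' $(( 0x102b84a042aca & ~0x2b84a042aca ))   # -> 0x1000000000000, i.e. bit 48

Bit 48 lines up with CEPH_FEATURE_CRUSH_V4 (straw2 support), which matches the
kernel message.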

On Thu, Apr 9, 2015 at 8:58 AM, Kyle Hutson  wrote:

> I upgraded from giant to hammer yesterday and now 'ceph -w' is constantly
> repeating this message:
>
> 2015-04-09 08:50:26.318042 7f95dbf86700  0 -- 10.5.38.1:0/2037478 >>
> 10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0 cs=0 l=1
> c=0x7f95e0023670).connect protocol feature mismatch, my 3fff < peer
> 13fff missing 1
>
> It isn't always the same IP for the destination - here's another:
> 2015-04-09 08:50:20.322059 7f95dc087700  0 -- 10.5.38.1:0/2037478 >>
> 10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0 cs=0 l=1
> c=0x7f95e002b480).connect protocol feature mismatch, my 3fff < peer
> 13fff missing 1
>
> Some details about our install:
> We have 24 hosts with 18 OSDs each. 16 per host are spinning disks in an
> erasure coded pool (k=8 m=4). 2 OSDs per host are SSD partitions used for a
> caching tier in front of the EC pool. All 24 hosts are monitors. 4 hosts
> are mds. We are running cephfs with a client trying to write data over
> cephfs when we're seeing these messages.
>
> Any ideas?
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] "protocol feature mismatch" after upgrading to Hammer

2015-04-09 Thread Kyle Hutson
I upgraded from giant to hammer yesterday and now 'ceph -w' is constantly
repeating this message:

2015-04-09 08:50:26.318042 7f95dbf86700  0 -- 10.5.38.1:0/2037478 >>
10.5.38.1:6789/0 pipe(0x7f95e00256e0 sd=3 :39489 s=1 pgs=0 cs=0 l=1
c=0x7f95e0023670).connect protocol feature mismatch, my 3fff < peer
13fff missing 1

It isn't always the same IP for the destination - here's another:
2015-04-09 08:50:20.322059 7f95dc087700  0 -- 10.5.38.1:0/2037478 >>
10.5.38.8:6789/0 pipe(0x7f95e00262f0 sd=3 :54047 s=1 pgs=0 cs=0 l=1
c=0x7f95e002b480).connect protocol feature mismatch, my 3fff < peer
13fff missing 1

Some details about our install:
We have 24 hosts with 18 OSDs each. 16 per host are spinning disks in an
erasure coded pool (k=8 m=4). 2 OSDs per host are SSD partitions used for a
caching tier in front of the EC pool. All 24 hosts are monitors. 4 hosts
are mds. We are running cephfs with a client trying to write data over
cephfs when we're seeing these messages.

Any ideas?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cache-tier do not evict

2015-04-09 Thread Gregory Farnum
On Thu, Apr 9, 2015 at 4:56 AM, Patrik Plank  wrote:
> Hi,
>
>
> i have build a cach-tier pool (replica 2) with 3 x 512gb ssd for my kvm
> pool.
>
> these are my settings :
>
>
> ceph osd tier add kvm cache-pool
>
> ceph osd tier cache-mode cache-pool writeback
>
> ceph osd tier set-overlay kvm cache-pool
>
>
> ceph osd pool set cache-pool hit_set_type bloom
>
> ceph osd pool set cache-pool hit_set_count 1
>
> ceph osd pool set cache-pool hit set period 3600
>
>
> ceph osd pool set cache-pool target_max_bytes 751619276800

 ^ 750 GB. For 3*512GB disks that's too large a target value.

>
> ceph osd pool set cache-pool target_max_objects 100
>
>
> ceph osd pool set cache-pool cache_min_flush_age 1800
>
> ceph osd pool set cache-pool cache_min_evict_age 600
>
>
> ceph osd pool cache-pool cache_target_dirty_ratio .4
>
> ceph osd pool cache-pool cache target_full_ratio .8
>
>
> So the problem is, the cache-tier do no evict automatically.
>
> If i copy some kvm images to the ceph cluster, the cache osds always run
> full.
>
>
> Is that normal?
>
> Is there a miss configuration?
>
>
> thanks
>
> best regards
>
> Patrik
>
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSDs not coming up on one host

2015-04-09 Thread Gregory Farnum
You can turn up debugging ("debug osd = 10" and "debug filestore = 10"
are probably enough, or maybe 20 each) and see what comes out to get
more information about why the threads are stuck.
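
(In ceph.conf on that host it would look something like the snippet below;
restart the OSD afterwards, or inject the same values at runtime if you prefer:)

  [osd]
      debug osd = 10
      debug filestore = 10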

But just from the log my answer is the same as before, and now I don't
trust that controller (or maybe its disks), regardless of what it's
admitting to. ;)
-Greg

On Thu, Apr 9, 2015 at 1:28 AM, Jacob Reid  wrote:
> On Wed, Apr 08, 2015 at 03:42:29PM +, Gregory Farnum wrote:
>> Im on my phone so can't check exactly what those threads are trying to do,
>> but the osd has several threads which are stuck. The FileStore threads are
>> certainly trying to access the disk/local filesystem. You may not have a
>> hardware fault, but it looks like something in your stack is not behaving
>> when the osd asks the filesystem to do something. Check dmesg, etc.
>> -Greg
>
>
> Noticed a bit in dmesg that seems to be controller-related (HP Smart Array 
> P420i) where I/O was hanging in some cases[1]; fixed by updating from 5.42 to 
> 6.00
>
> [1] http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c03555882
>
> In dmesg:
> [11775.779477] hpsa :08:00.0: ABORT REQUEST on C1:B0:T0:L0 
> Tag:0x:0010 Command:0x2a SN:0x49fb  REQUEST SUCCEEDED.
> [11812.170350] hpsa :08:00.0: Abort request on C1:B0:T0:L0
> [11817.386773] hpsa :08:00.0: cp 880522bff000 is reported invalid 
> (probably means target device no longer present)
> [11817.386784] hpsa :08:00.0: ABORT REQUEST on C1:B0:T0:L0 
> Tag:0x:0010 Command:0x2a SN:0x4a13  REQUEST SUCCEEDED.
>
> The problem still appears to be persisting in the cluster, although I am no 
> longer seeing the disk-related errors in dmesg, I am still getting errors in 
> the osd logs:
>
> 2015-04-08 17:24:15.024820 7f0f21e9f700  1 heartbeat_map reset_timeout 
> 'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4
> 2015-04-08 17:24:15.025043 7f0f2169e700  1 heartbeat_map reset_timeout 
> 'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4
> 2015-04-08 17:48:33.146399 7f0f21e9f700  1 heartbeat_map reset_timeout 
> 'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4
> 2015-04-08 17:48:33.146439 7f0f2169e700  1 heartbeat_map reset_timeout 
> 'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4
> 2015-04-08 18:55:31.107727 7f0f16740700  1 heartbeat_map reset_timeout 
> 'OSD::recovery_tp thread 0x7f0f16740700' had timed out after 4
> 2015-04-08 18:55:31.107774 7f0f2169e700  1 heartbeat_map reset_timeout 
> 'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4
> 2015-04-08 18:55:31.107789 7f0f21e9f700  1 heartbeat_map reset_timeout 
> 'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4
> 2015-04-08 18:55:31.108225 7f0f29eaf700  1 heartbeat_map is_healthy 
> 'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
> 2015-04-08 18:55:31.108268 7f0f15f3f700  1 heartbeat_map reset_timeout 
> 'OSD::disk_tp thread 0x7f0f15f3f700' had timed out after 4
> 2015-04-08 18:55:31.108272 7f0f29eaf700  1 heartbeat_map is_healthy 
> 'OSD::op_tp thread 0x7f0f17742700' had timed out after 4
> 2015-04-08 18:55:31.108281 7f0f29eaf700  1 heartbeat_map is_healthy 
> 'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4
> 2015-04-08 18:55:31.108285 7f0f1573e700  1 heartbeat_map reset_timeout 
> 'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
> 2015-04-08 18:55:31.108345 7f0f16f41700  1 heartbeat_map reset_timeout 
> 'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4
> 2015-04-08 18:55:31.108378 7f0f17742700  1 heartbeat_map reset_timeout 
> 'OSD::op_tp thread 0x7f0f17742700' had timed out after 4
> 2015-04-08 19:01:20.694897 7f0f15f3f700  1 heartbeat_map reset_timeout 
> 'OSD::disk_tp thread 0x7f0f15f3f700' had timed out after 4
> 2015-04-08 19:01:20.694928 7f0f17742700  1 heartbeat_map reset_timeout 
> 'OSD::op_tp thread 0x7f0f17742700' had timed out after 4
> 2015-04-08 19:01:20.694970 7f0f16f41700  1 heartbeat_map reset_timeout 
> 'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4
> 2015-04-08 19:01:20.695544 7f0f1573e700  1 heartbeat_map reset_timeout 
> 'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
> 2015-04-08 19:01:20.695665 7f0f16740700  1 heartbeat_map reset_timeout 
> 'OSD::recovery_tp thread 0x7f0f16740700' had timed out after 4
> 2015-04-08 19:01:34.979288 7f0f1573e700  1 heartbeat_map reset_timeout 
> 'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
> 2015-04-08 19:01:34.979498 7f0f21e9f700  1 heartbeat_map reset_timeout 
> 'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4
> 2015-04-08 19:01:34.979513 7f0f16f41700  1 heartbeat_map reset_timeout 
> 'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4
> 2015-04-08 19:01:34.979535 7f0f2169e700  1 heartbeat_map reset_timeout 
> 'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4
> 2015-04-08 19:01:34.980021 7f0f15f3f700  1 heartbeat_map reset_timeout 
> 'OSD::disk_tp thread 0

Re: [ceph-users] cache-tier do not evict

2015-04-09 Thread Patrik Plank
Hi,



ceph version 0.87.1 

thanks

best regards



-Original message-
From: Chu Duc Minh 
Sent: Thursday 9th April 2015 15:03
To: Patrik Plank 
Cc: ceph-users@lists.ceph.com

Subject: Re: [ceph-users] cache-tier do not evict

What ceph version do you use?

Regards,

On 9 Apr 2015 18:58, "Patrik Plank" <pat...@plank.me> wrote:
Hi,



i have build a cach-tier pool (replica 2) with 3 x 512gb ssd for my kvm pool.

these are my settings :



ceph osd tier add kvm cache-pool

ceph osd tier cache-mode cache-pool writeback

ceph osd tier set-overlay kvm cache-pool



ceph osd pool set cache-pool hit_set_type bloom

ceph osd pool set cache-pool hit_set_count 1

ceph osd pool set cache-pool hit set period 3600



ceph osd pool set cache-pool target_max_bytes 751619276800

ceph osd pool set cache-pool target_max_objects 100



ceph osd pool set cache-pool cache_min_flush_age 1800

ceph osd pool set cache-pool cache_min_evict_age 600



ceph osd pool cache-pool cache_target_dirty_ratio .4

ceph osd pool cache-pool cache target_full_ratio .8



So the problem is, the cache-tier do no evict automatically.

If i copy some kvm images to the ceph cluster, the cache osds always run full.



Is that normal?

Is there a miss configuration?



thanks

best regards

Patrik












___
 ceph-users mailing list
 ceph-users@lists.ceph.com  
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
 
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cache-tier do not evict

2015-04-09 Thread Chu Duc Minh
What ceph version do you use?

Regards,
On 9 Apr 2015 18:58, "Patrik Plank"  wrote:

>  Hi,
>
>
> i have build a cach-tier pool (replica 2) with 3 x 512gb ssd for my kvm
> pool.
>
> these are my settings :
>
>
> ceph osd tier add kvm cache-pool
>
> ceph osd tier cache-mode cache-pool writeback
>
> ceph osd tier set-overlay kvm cache-pool
>
>
> ceph osd pool set cache-pool hit_set_type bloom
>
> ceph osd pool set cache-pool hit_set_count 1
>
> ceph osd pool set cache-pool hit set period 3600
>
>
> ceph osd pool set cache-pool target_max_bytes 751619276800
>
> ceph osd pool set cache-pool target_max_objects 100
>
>
> ceph osd pool set cache-pool cache_min_flush_age 1800
>
> ceph osd pool set cache-pool cache_min_evict_age 600
>
>
> ceph osd pool cache-pool cache_target_dirty_ratio .4
>
> ceph osd pool cache-pool cache target_full_ratio .8
>
>
> So the problem is, the cache-tier do no evict automatically.
>
> If i copy some kvm images to the ceph cluster, the cache osds always run
> full.
>
>
> Is that normal?
>
> Is there a miss configuration?
>
>
> thanks
>
> best regards
>
> Patrik
>
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Motherboard recommendation?

2015-04-09 Thread Mohamed Pakkeer
Hi Markus,

I think that if you connect more than 16 drives on the backplane, the
X10DRH-CT will detect and show only 16 drives in the BIOS. I am not sure
about that. If you test this motherboard, please let me know the result.

Msg form supermicro site

LSI 3108 SAS3 (12Gbps) controller;

   - 2GB cache; HW RAID 0, 1, 5, 6, 10, 50, 60
   - Supports up to 16 devices as default, more HDD devices support is also
   available as an option *

   For special SKU, please contact your Supermicro Sales.


Thanks
K.Mohamed Pakkeer

On Thu, Apr 9, 2015 at 5:05 PM, Markus Goldberg 
wrote:

>  Hi Mohamed,
> thank you for your reply.
> I thougt, there is a SAS-Expander on the backplanes of the SC847, so all
> drives can be run. Am i wrong?
>
> thanks,
>   Markus
>
> Am 09.04.2015 um 10:24 schrieb Mohamed Pakkeer:
>
>  Hi Markus,
>
>  X10DRH-CT can support only 16 drive as default. If you want to connect
> more drive,there is a special SKU for more drive support from super
> micro or you need additional SAS controller. We are using 2630 V3( 8 core -
> 2.4GHz) *2 for 30 drives on SM X10DRI-T. It is working perfectly on
> replication based cluster. If you are planning to use erasure coding, you
> have to think about higher spec.
>
>  Does any one know about the exact processor requirement of 30 drives
> node for erasure coding? . I can't find suitable hardware recommendation
> for erasure coding.
>
>  Cheers
> K.Mohamed Pakkeer
>
>
>
>
> On Thu, Apr 9, 2015 at 1:30 PM, Markus Goldberg <
> goldb...@uni-hildesheim.de> wrote:
>
>> Hi,
>> i have a backup-storage with ceph 0,93
>> As every backup-system it is only been written and hopefully never read.
>>
>> The hardware is 3 Supermicro SC847-cases with 30 SATA-HDDS each (2- and
>> 4-TB-WD-disks) = 250TB
>> I have realized, that the motherboards and CPUs are totally undersized,
>> so i want to install new boards.
>> I'm thinking of the following:
>> 3 Supermicro X10DRH-CT or X10DRC-T4+ with 128GB memory each.
>> What do you think about these boards? Will they fit into the SC847?
>> They have SAS and 10G-Base-T onboard, so no extra controller seems to be
>> necessary.
>> What Xeon-v3 should i take, how many cores?
>> Does anyone know if M.2-SSDs are supported in their pci-e-slots?
>>
>> Thank you very much,
>>   Markus
>>
>> --
>> Markus Goldberg   Universität Hildesheim
>>   Rechenzentrum
>> Tel +49 5121 88392822 Universitätsplatz 1, D-31141 Hildesheim, Germany
>> Fax +49 5121 88392823 email goldb...@uni-hildesheim.de
>> --
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>
>  --
> Thanks & Regards
> K.Mohamed Pakkeer
> Mobile- 0091-8754410114
>
>
>
> --
> MfG,
>   Markus Goldberg
>
> --
> Markus Goldberg   Universität Hildesheim
>   Rechenzentrum
> Tel +49 5121 88392822 Universitätsplatz 1, D-31141 Hildesheim, Germany
> Fax +49 5121 88392823 email goldb...@uni-hildesheim.de
> --
>
>


-- 
Thanks & Regards
K.Mohamed Pakkeer
Mobile- 0091-8754410114
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] cache-tier do not evict

2015-04-09 Thread Patrik Plank
Hi,



I have built a cache-tier pool (replica 2) with 3 x 512GB SSDs for my KVM pool.

these are my settings :



ceph osd tier add kvm cache-pool

ceph osd tier cache-mode cache-pool writeback

ceph osd tier set-overlay kvm cache-pool



ceph osd pool set cache-pool hit_set_type bloom

ceph osd pool set cache-pool hit_set_count 1

ceph osd pool set cache-pool hit set period 3600



ceph osd pool set cache-pool target_max_bytes 751619276800

ceph osd pool set cache-pool target_max_objects 100



ceph osd pool set cache-pool cache_min_flush_age 1800

ceph osd pool set cache-pool cache_min_evict_age 600



ceph osd pool cache-pool cache_target_dirty_ratio .4

ceph osd pool cache-pool cache target_full_ratio .8



So the problem is that the cache tier does not evict automatically.

If I copy some KVM images to the Ceph cluster, the cache OSDs always run full.

Is that normal?

Is there a misconfiguration?



thanks

best regards

Patrik











___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Motherboard recommendation?

2015-04-09 Thread Markus Goldberg

Hi Mohamed,
thank you for your reply.
I thought there is a SAS expander on the backplanes of the SC847, so all
drives can be run. Am I wrong?


thanks,
  Markus
Am 09.04.2015 um 10:24 schrieb Mohamed Pakkeer:

Hi Markus,

X10DRH-CT can support only 16 drive as default. If you want to connect 
more drive,there is a special SKU for more drive support from super 
micro or you need additional SAS controller. We are using 2630 V3( 8 
core - 2.4GHz) *2 for 30 drives on SM X10DRI-T. It is working 
perfectly on replication based cluster. If you are planning to use 
erasure coding, you have to think about higher spec.


Does any one know about the exact processor requirement of 30 drives 
node for erasure coding? . I can't find suitable hardware 
recommendation for erasure coding.


Cheers
K.Mohamed Pakkeer




On Thu, Apr 9, 2015 at 1:30 PM, Markus Goldberg <goldb...@uni-hildesheim.de> wrote:


Hi,
i have a backup-storage with ceph 0,93
As every backup-system it is only been written and hopefully never
read.

The hardware is 3 Supermicro SC847-cases with 30 SATA-HDDS each
(2- and 4-TB-WD-disks) = 250TB
I have realized, that the motherboards and CPUs are totally
undersized, so i want to install new boards.
I'm thinking of the following:
3 Supermicro X10DRH-CT or X10DRC-T4+ with 128GB memory each.
What do you think about these boards? Will they fit into the SC847?
They have SAS and 10G-Base-T onboard, so no extra controller seems
to be necessary.
What Xeon-v3 should i take, how many cores?
Does anyone know if M.2-SSDs are supported in their pci-e-slots?

Thank you very much,
  Markus

--
Markus Goldberg   Universität Hildesheim
  Rechenzentrum
Tel +49 5121 88392822 Universitätsplatz 1, D-31141 Hildesheim, Germany
Fax +49 5121 88392823 email goldb...@uni-hildesheim.de

--

___
ceph-users mailing list
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Thanks & Regards
K.Mohamed Pakkeer
Mobile- 0091-8754410114




--
MfG,
  Markus Goldberg

--
Markus Goldberg   Universität Hildesheim
  Rechenzentrum
Tel +49 5121 88392822 Universitätsplatz 1, D-31141 Hildesheim, Germany
Fax +49 5121 88392823 email goldb...@uni-hildesheim.de
--

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD hard crash on kernel 3.10

2015-04-09 Thread Shawn Edwards
Thanks for the pointer to the patched kernel.  I'll give that a shot.

On Thu, Apr 9, 2015, 5:56 AM Ilya Dryomov  wrote:

> On Wed, Apr 8, 2015 at 5:25 PM, Shawn Edwards 
> wrote:
> > We've been working on a storage repository for xenserver 6.5, which uses
> the
> > 3.10 kernel (ug).  I got the xenserver guys to include the rbd and
> libceph
> > kernel modules into the 6.5 release, so that's at least available.
> >
> > Where things go bad is when we have many (>10 or so) VMs on one host, all
> > using RBD clones for the storage mapped using the rbd kernel module.  The
> > Xenserver crashes so badly that it doesn't even get a chance to kernel
> > panic.  The whole box just hangs.
>
> I'm not very familiar with Xen and ways to debug it but if the problem
> lies in libceph or rbd kernel modules we'd like to fix it.  Perhaps try
> grabbing a vmcore?  If it just hangs and doesn't panic you can normally
> induce a crash with a sysrq.
>
> >
> > Has anyone else seen this sort of behavior?
> >
> > We have a lot of ways to try to work around this, but none of them are
> very
> > pretty:
> >
> > * move the code to user space, ditch the kernel driver:  The build tools
> for
> > Xenserver are all CentOS5 based, and it is painful to get all of the deps
> > built to get the ceph user space libs built.
> >
> > * backport the ceph and rbd kernel modules to 3.10.  Has proven painful,
> as
> > the block device code changed somewhere in the 3.14-3.16 timeframe.
>
> https://github.com/ceph/ceph-client/commits/rhel7-3.10.0-123.9.3 branch
> would be a good start - it has libceph.ko and rbd.ko as of 3.18-rc5
> backported to rhel7 (which is based on 3.10) and may be updated in the
> future as well, although no promises on that.
>
> Thanks,
>
> Ilya
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD hard crash on kernel 3.10

2015-04-09 Thread Ilya Dryomov
On Wed, Apr 8, 2015 at 5:25 PM, Shawn Edwards  wrote:
> We've been working on a storage repository for xenserver 6.5, which uses the
> 3.10 kernel (ug).  I got the xenserver guys to include the rbd and libceph
> kernel modules into the 6.5 release, so that's at least available.
>
> Where things go bad is when we have many (>10 or so) VMs on one host, all
> using RBD clones for the storage mapped using the rbd kernel module.  The
> Xenserver crashes so badly that it doesn't even get a chance to kernel
> panic.  The whole box just hangs.

I'm not very familiar with Xen and ways to debug it but if the problem
lies in libceph or rbd kernel modules we'd like to fix it.  Perhaps try
grabbing a vmcore?  If it just hangs and doesn't panic you can normally
induce a crash with a sysrq.
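
(Assuming kdump or similar is configured to catch it, a crash can be forced
via the magic sysrq trigger, e.g.:)

  echo 1 > /proc/sys/kernel/sysrq     # make sure sysrq is enabled
  echo c > /proc/sysrq-trigger        # force an immediate crash so a vmcore is written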

>
> Has anyone else seen this sort of behavior?
>
> We have a lot of ways to try to work around this, but none of them are very
> pretty:
>
> * move the code to user space, ditch the kernel driver:  The build tools for
> Xenserver are all CentOS5 based, and it is painful to get all of the deps
> built to get the ceph user space libs built.
>
> * backport the ceph and rbd kernel modules to 3.10.  Has proven painful, as
> the block device code changed somewhere in the 3.14-3.16 timeframe.

https://github.com/ceph/ceph-client/commits/rhel7-3.10.0-123.9.3 branch
would be a good start - it has libceph.ko and rbd.ko as of 3.18-rc5
backported to rhel7 (which is based on 3.10) and may be updated in the
future as well, although no promises on that.

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] long blocking with writes on rbds

2015-04-09 Thread Ilya Dryomov
On Wed, Apr 8, 2015 at 7:36 PM, Lionel Bouton  wrote:
> On 04/08/15 18:24, Jeff Epstein wrote:
>> Hi, I'm having sporadic very poor performance running ceph. Right now
>> mkfs, even with nodiscard, takes 30 minutes or more. These kinds of
>> delays happen often but irregularly. There seems to be no common
>> denominator. Clearly, however, they make it impossible to deploy ceph
>> in production.
>>
>> I reported this problem earlier on ceph's IRC, and was told to add
>> nodiscard to mkfs. That didn't help. Here is the command that I'm
>> using to format an rbd:
>>
>> For example: mkfs.ext4 -text4 -m0 -b4096 -E nodiscard /dev/rbd1
>
> I probably won't be able to help much, but people knowing more will need
> at least:
> - your Ceph version,
> - the kernel version of the host on which you are trying to format
> /dev/rbd1,
> - which hardware and network you are using for this cluster (CPU, RAM,
> HDD or SSD models, network cards, jumbo frames, ...).
>
>>
>> Ceph says everything is okay:
>>
>> cluster e96e10d3-ad2b-467f-9fe4-ab5269b70206
>>  health HEALTH_OK
>>  monmap e1: 3 mons at
>> {a=192.168.224.4:6789/0,b=192.168.232.4:6789/0,c=192.168.240.4:6789/0},
>> election epoch 12, quorum 0,1,2 a,b,c
>>  osdmap e972: 6 osds: 6 up, 6 in
>>   pgmap v4821: 4400 pgs, 44 pools, 5157 MB data, 1654 objects
>> 46138 MB used, 1459 GB / 1504 GB avail
>> 4400 active+clean

Are there any "slow request" warnings in the logs?

Assuming a 30 minute mkfs is somewhat reproducible, can you bump osd
and ms log levels and try to capture it?
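
Something along these lines should do it (just a sketch - adjust the levels
and log paths to taste):

  ceph tell osd.* injectargs '--debug-osd 20 --debug-ms 1'    # bump debug levels at runtime
  grep 'slow request' /var/log/ceph/ceph-osd.*.log            # look for slow request warnings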

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] low power single disk nodes

2015-04-09 Thread Jerker Nyberg


Hello ceph users,

Is anyone running any low-powered single-disk nodes with Ceph now? Calxeda 
seems to be no more, according to Wikipedia. I do not think HP Moonshot is 
what I am looking for - I want stand-alone nodes, not server cartridges 
integrated into a server chassis. And I do not want to be locked to a single 
vendor.


I was playing with Raspberry Pi 2 for signage when I thought of my old 
experiments with Ceph.


I am thinking of, for example, an Odroid-C1 or Odroid-XU3 Lite, or maybe 
something with a low-power Intel x64/x86 processor. Together with one SSD 
or one low-power HDD, the node could get all its power via PoE (via a 
splitter, or integrated into the board if such boards exist). PoE provides 
remote power-on/power-off even for consumer-grade nodes.


The cost of a single low-power node should be able to compete with a 
traditional PC server's price per disk. Ceph takes care of redundancy.


I think simple custom casing should be good enough - maybe just strap or 
velcro everything on trays in the rack, at least for the nodes with SSD.


Kind regards,
--
Jerker Nyberg, Uppsala, Sweden.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Rebuild bucket index

2015-04-09 Thread Laurent Barbe

Hello ceph users,

Do you know a way to rebuild a bucket index?
I would like to change num_shards for an existing bucket.
If I change this value in the bucket metadata, the new index objects are 
created, but they are empty (bucket listing returns nothing). It would be 
nice to be able to recreate the index from the objects themselves.
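
To be concrete, the kind of metadata edit I mean is roughly the following
(just a sketch; the bucket name and instance id are placeholders):

  radosgw-admin metadata get bucket.instance:mybucket:<instance_id> > bucket.json
  # edit "num_shards" in bucket.json, then:
  radosgw-admin metadata put bucket.instance:mybucket:<instance_id> < bucket.json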

Does anyone have an idea for doing this?

Thanks.

Laurent Barbe
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSDs not coming up on one host

2015-04-09 Thread Jacob Reid
On Wed, Apr 08, 2015 at 03:42:29PM +, Gregory Farnum wrote:
> I'm on my phone so I can't check exactly what those threads are trying to do,
> but the osd has several threads which are stuck. The FileStore threads are
> certainly trying to access the disk/local filesystem. You may not have a
> hardware fault, but it looks like something in your stack is not behaving
> when the osd asks the filesystem to do something. Check dmesg, etc.
> -Greg


Noticed something in dmesg that seems to be controller-related (HP Smart Array 
P420i), where I/O was hanging in some cases [1]; fixed by updating the 
controller firmware from 5.42 to 6.00.

[1] http://h20564.www2.hp.com/hpsc/doc/public/display?docId=emr_na-c03555882
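
For anyone else hitting this: the running controller firmware can apparently
also be checked via sysfs, without HP's tools (a sketch, assuming the hpsa
driver exposes this attribute on your kernel):

  cat /sys/class/scsi_host/host*/firmware_revision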

In dmesg:
[11775.779477] hpsa :08:00.0: ABORT REQUEST on C1:B0:T0:L0 
Tag:0x:0010 Command:0x2a SN:0x49fb  REQUEST SUCCEEDED.
[11812.170350] hpsa :08:00.0: Abort request on C1:B0:T0:L0
[11817.386773] hpsa :08:00.0: cp 880522bff000 is reported invalid 
(probably means target device no longer present)
[11817.386784] hpsa :08:00.0: ABORT REQUEST on C1:B0:T0:L0 
Tag:0x:0010 Command:0x2a SN:0x4a13  REQUEST SUCCEEDED.

The problem still appears to persist in the cluster: although I am no 
longer seeing the disk-related errors in dmesg, I am still getting errors in 
the OSD logs:

2015-04-08 17:24:15.024820 7f0f21e9f700  1 heartbeat_map reset_timeout 
'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4
2015-04-08 17:24:15.025043 7f0f2169e700  1 heartbeat_map reset_timeout 
'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4
2015-04-08 17:48:33.146399 7f0f21e9f700  1 heartbeat_map reset_timeout 
'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4
2015-04-08 17:48:33.146439 7f0f2169e700  1 heartbeat_map reset_timeout 
'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4
2015-04-08 18:55:31.107727 7f0f16740700  1 heartbeat_map reset_timeout 
'OSD::recovery_tp thread 0x7f0f16740700' had timed out after 4
2015-04-08 18:55:31.107774 7f0f2169e700  1 heartbeat_map reset_timeout 
'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4
2015-04-08 18:55:31.107789 7f0f21e9f700  1 heartbeat_map reset_timeout 
'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4
2015-04-08 18:55:31.108225 7f0f29eaf700  1 heartbeat_map is_healthy 
'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
2015-04-08 18:55:31.108268 7f0f15f3f700  1 heartbeat_map reset_timeout 
'OSD::disk_tp thread 0x7f0f15f3f700' had timed out after 4
2015-04-08 18:55:31.108272 7f0f29eaf700  1 heartbeat_map is_healthy 'OSD::op_tp 
thread 0x7f0f17742700' had timed out after 4
2015-04-08 18:55:31.108281 7f0f29eaf700  1 heartbeat_map is_healthy 'OSD::op_tp 
thread 0x7f0f16f41700' had timed out after 4
2015-04-08 18:55:31.108285 7f0f1573e700  1 heartbeat_map reset_timeout 
'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
2015-04-08 18:55:31.108345 7f0f16f41700  1 heartbeat_map reset_timeout 
'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4
2015-04-08 18:55:31.108378 7f0f17742700  1 heartbeat_map reset_timeout 
'OSD::op_tp thread 0x7f0f17742700' had timed out after 4
2015-04-08 19:01:20.694897 7f0f15f3f700  1 heartbeat_map reset_timeout 
'OSD::disk_tp thread 0x7f0f15f3f700' had timed out after 4
2015-04-08 19:01:20.694928 7f0f17742700  1 heartbeat_map reset_timeout 
'OSD::op_tp thread 0x7f0f17742700' had timed out after 4
2015-04-08 19:01:20.694970 7f0f16f41700  1 heartbeat_map reset_timeout 
'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4
2015-04-08 19:01:20.695544 7f0f1573e700  1 heartbeat_map reset_timeout 
'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
2015-04-08 19:01:20.695665 7f0f16740700  1 heartbeat_map reset_timeout 
'OSD::recovery_tp thread 0x7f0f16740700' had timed out after 4
2015-04-08 19:01:34.979288 7f0f1573e700  1 heartbeat_map reset_timeout 
'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
2015-04-08 19:01:34.979498 7f0f21e9f700  1 heartbeat_map reset_timeout 
'FileStore::op_tp thread 0x7f0f21e9f700' had timed out after 4
2015-04-08 19:01:34.979513 7f0f16f41700  1 heartbeat_map reset_timeout 
'OSD::op_tp thread 0x7f0f16f41700' had timed out after 4
2015-04-08 19:01:34.979535 7f0f2169e700  1 heartbeat_map reset_timeout 
'FileStore::op_tp thread 0x7f0f2169e700' had timed out after 4
2015-04-08 19:01:34.980021 7f0f15f3f700  1 heartbeat_map reset_timeout 
'OSD::disk_tp thread 0x7f0f15f3f700' had timed out after 4
2015-04-08 19:01:34.980051 7f0f17742700  1 heartbeat_map reset_timeout 
'OSD::op_tp thread 0x7f0f17742700' had timed out after 4
2015-04-08 19:01:34.980392 7f0f16740700  1 heartbeat_map reset_timeout 
'OSD::recovery_tp thread 0x7f0f16740700' had timed out after 4
2015-04-08 19:03:34.731872 7f0f1573e700  1 heartbeat_map reset_timeout 
'OSD::command_tp thread 0x7f0f1573e700' had timed out after 4
2015-04-08 19:03:34.731972 7f0f21e9f700  1 heartbeat_map reset_timeout 
'FileStore::op_tp thread 0x7f0f21e9f70

Re: [ceph-users] Motherboard recommendation?

2015-04-09 Thread Mohamed Pakkeer
Hi Markus,

The X10DRH-CT can support only 16 drives by default. If you want to connect more
drives, there is a special SKU from Supermicro with support for more drives, or
you need an additional SAS controller. We are using two E5-2630 v3 (8 cores,
2.4 GHz) for 30 drives on an SM X10DRI-T. It is working perfectly on a
replication-based cluster. If you are planning to use erasure coding, you have
to plan for a higher spec.

Does anyone know the exact processor requirement for a 30-drive node
with erasure coding? I can't find a suitable hardware recommendation for
erasure coding.

Cheers
K.Mohamed Pakkeer




On Thu, Apr 9, 2015 at 1:30 PM, Markus Goldberg 
wrote:

> Hi,
> i have a backup-storage with ceph 0,93
> As every backup-system it is only been written and hopefully never read.
>
> The hardware is 3 Supermicro SC847-cases with 30 SATA-HDDS each (2- and
> 4-TB-WD-disks) = 250TB
> I have realized, that the motherboards and CPUs are totally undersized, so
> i want to install new boards.
> I'm thinking of the following:
> 3 Supermicro X10DRH-CT or X10DRC-T4+ with 128GB memory each.
> What do you think about these boards? Will they fit into the SC847?
> They have SAS and 10G-Base-T onboard, so no extra controller seems to be
> necessary.
> What Xeon-v3 should i take, how many cores?
> Does anyone know if M.2-SSDs are supported in their pci-e-slots?
>
> Thank you very much,
>   Markus
>
> --
> Markus Goldberg   Universität Hildesheim
>   Rechenzentrum
> Tel +49 5121 88392822 Universitätsplatz 1, D-31141 Hildesheim, Germany
> Fax +49 5121 88392823 email goldb...@uni-hildesheim.de
> --
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Thanks & Regards
K.Mohamed Pakkeer
Mobile- 0091-8754410114
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Motherboard recommendation?

2015-04-09 Thread Markus Goldberg

Hi,
I have a backup storage cluster running Ceph 0.93.
As with every backup system, it is only written to and hopefully never read.

The hardware is 3 Supermicro SC847 cases with 30 SATA HDDs each (2 TB and 
4 TB WD disks) = 250 TB.
I have realized that the motherboards and CPUs are totally undersized, 
so I want to install new boards.

I'm thinking of the following:
3 Supermicro X10DRH-CT or X10DRC-T4+ with 128GB memory each.
What do you think about these boards? Will they fit into the SC847?
They have SAS and 10G-Base-T onboard, so no extra controller seems to be 
necessary.

Which Xeon v3 should I take, and how many cores?
Does anyone know if M.2 SSDs are supported in their PCIe slots?

Thank you very much,
  Markus

--
Markus Goldberg   Universität Hildesheim
  Rechenzentrum
Tel +49 5121 88392822 Universitätsplatz 1, D-31141 Hildesheim, Germany
Fax +49 5121 88392823 email goldb...@uni-hildesheim.de
--

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cascading Failure of OSDs

2015-04-09 Thread HEWLETT, Paul (Paul)** CTR **

I use the following:

cat /sys/class/net/em1/statistics/rx_bytes

for the em1 interface

all other stats are available
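
e.g. to dump them all at once (a trivial sketch):

  for f in /sys/class/net/em1/statistics/*; do printf '%s: %s\n' "${f##*/}" "$(cat "$f")"; done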

Paul Hewlett
Senior Systems Engineer
Velocix, Cambridge
Alcatel-Lucent
t: +44 1223 435893 m: +44 7985327353




From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Carl-Johan 
Schenström [carl-johan.schenst...@gu.se]
Sent: 09 April 2015 07:34
To: Francois Lafont; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Cascading Failure of OSDs

Francois Lafont wrote:

> Just in case it could be useful, I have noticed the -s option (on my
> Ubuntu) that offers an output that is probably easier to parse:
>
> # "column -t" is just to make it's nice for the human eyes.
> ifconfig -s | column -t

Since ifconfig is deprecated, one should use iproute2 instead.

ip -s link show p2p1 | awk '/(RX|TX):/{getline; print $3;}'

However, the sysfs interface is probably a better alternative. See 

 and .

--
Carl-Johan Schenström
Driftansvarig / System Administrator
Språkbanken & Svensk nationell datatjänst /
The Swedish Language Bank & Swedish National Data Service
Göteborgs universitet / University of Gothenburg
carl-johan.schenst...@gu.se / +46 709 116769
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Firefly - Giant : CentOS 7 : install failed ceph-deploy

2015-04-09 Thread Vickey Singh
Thanks for the help guys , here is my feedback with the tests


@Michael Kidd: yum install ceph ceph-common --disablerepo=base
--disablerepo=epel

Did not work; here are the logs: http://fpaste.org/208828/56448714/


@Travis Rhoden: Yep, *exclude=python-rados python-rbd* under epel.repo did
the trick, and I can install Firefly / Giant without errors. Thanks.
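
For anyone else hitting this, the exclude line just goes into the [epel]
section of /etc/yum.repos.d/epel.repo, roughly like this (abbreviated sketch,
other lines left as shipped):

  [epel]
  name=Extra Packages for Enterprise Linux 7 - $basearch
  enabled=1
  exclude=python-rados python-rbd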


Any idea when this will be fixed once and for all (so I no longer need to patch
epel.repo to exclude python-r*)?


- VS -

On Thu, Apr 9, 2015 at 4:26 AM, Michael Kidd  wrote:

> I don't think this came through the first time.. resending.. If it's a
> dupe, my apologies..
>
> For Firefly / Giant installs, I've had success with the following:
>
> yum install ceph ceph-common --disablerepo=base --disablerepo=epel
>
> Let us know if this works for you as well.
>
> Thanks,
>
> Michael J. Kidd
> Sr. Storage Consultant
> Inktank Professional Services
>  - by Red Hat
>
> On Wed, Apr 8, 2015 at 9:07 PM, Michael Kidd  wrote:
>
>> For Firefly / Giant installs, I've had success with the following:
>>
>> yum install ceph ceph-common --disablerepo=base --disablerepo=epel
>>
>> Let us know if this works for you as well.
>>
>> Thanks,
>>
>> Michael J. Kidd
>> Sr. Storage Consultant
>> Inktank Professional Services
>>  - by Red Hat
>>
>> On Wed, Apr 8, 2015 at 8:55 PM, Travis Rhoden  wrote:
>>
>>> I did also confirm that, as Ken mentioned, this is not a problem on
>>> Hammer since Hammer includes the package split (python-ceph became
>>> python-rados and python-rbd).
>>>
>>>  - Travis
>>>
>>> On Wed, Apr 8, 2015 at 5:00 PM, Travis Rhoden  wrote:
>>>
 Hi Vickey,

 The easiest way I know of to get around this right now is to add the
 following line in section for epel in /etc/yum.repos.d/epel.repo

 exclude=python-rados python-rbd

 So this is what my epel.repo file looks like: http://fpaste.org/208681/

 It is those two packages in EPEL that are causing problems.  I also
 tried enabling epel-testing, but that didn't work either.

 Unfortunately you would need to add this line on each node where Ceph
 Giant is being installed.

  - Travis

 On Wed, Apr 8, 2015 at 4:11 PM, Vickey Singh <
 vickey.singh22...@gmail.com> wrote:

> Community , need help.
>
>
> -VS-
>
> On Wed, Apr 8, 2015 at 4:36 PM, Vickey Singh <
> vickey.singh22...@gmail.com> wrote:
>
>> Any suggestion  geeks
>>
>>
>> VS
>>
>> On Wed, Apr 8, 2015 at 2:15 PM, Vickey Singh <
>> vickey.singh22...@gmail.com> wrote:
>>
>>>
>>> Hi
>>>
>>>
>>> The below suggestion also didn’t worked
>>>
>>>
>>> Full logs here : http://paste.ubuntu.com/10771939/
>>>
>>>
>>>
>>>
>>> [root@rgw-node1 yum.repos.d]# yum --showduplicates list ceph
>>>
>>> Loaded plugins: fastestmirror, priorities
>>>
>>> Loading mirror speeds from cached hostfile
>>>
>>>  * base: mirror.zetup.net
>>>
>>>  * epel: ftp.fi.muni.cz
>>>
>>>  * extras: mirror.zetup.net
>>>
>>>  * updates: mirror.zetup.net
>>>
>>> 25 packages excluded due to repository priority protections
>>>
>>> Available Packages
>>>
>>> ceph.x86_64        0.80.6-0.el7.centos        Ceph
>>> ceph.x86_64        0.80.7-0.el7.centos        Ceph
>>> ceph.x86_64        0.80.8-0.el7.centos        Ceph
>>> ceph.x86_64        0.80.9-0.el7.centos        Ceph
>>>
>>> [root@rgw-node1 yum.repos.d]#
>>>
>>>
>>>
>>>
>>>
>>> It's not able to install the latest available package; yum is getting
>>> confused by the other dot releases.
>>>
>>>
>>> Any other suggestion to fix this ???
>>>
>>>
>>>
>>> --> Processing Dependency: libboost_system-mt.so.1.53.0()(64bit) for
>>> package: librbd1-0.80.9-0.el7.centos.x86_64
>>>
>>> --> Processing Dependency: libboost_thread-mt.so.1.53.0()(64bit) for
>>> package: librbd1-0.80.9-0.el7.centos.x86_64
>>>
>>> --> Finished Dependency Resolution
>>>
>>> Error: Package: librbd1-0.80.7-0.el7.centos.x86_64 (Ceph)
>>>
>>>Requires: libboost_system-mt.so.1.53.0()(64bit)
>>>
>>> Error: Package: ceph-0.80.7-0.el7.centos.x86_64 (Ceph)
>>>
>>>Requires: libboost_system-mt.so.1.53.0()(64bit)
>>>
>>> Error: Package: ceph-0.80.7-0.el7.centos.x86_64 (Ceph)
>>>
>>>Requires: libaio.so.1(LIBAIO_0.4)(64bit)
>>>
>>> Error: Package: ceph-common-0.80.7-0.el7.centos.x86_64 (Ceph)
>>>
>>>Requires: libboost_thread-mt.so.1.53.0()(64bit)
>>>
>>> Error: Package: ceph-common-0.80.7-0.el7.centos.x86_64 (Ceph)
>>>
>>>Requires: librados2 = 0.80.7-0.el7.centos
>>>
>>>Available: librados2-0.80.6-0.el7.centos.x86_64 (Ceph)
>>>
>>>librados2 = 0.80.6-0

Re: [ceph-users] long blocking with writes on rbds

2015-04-09 Thread Christian Balzer
On Thu, 09 Apr 2015 00:25:08 -0400 Jeff Epstein wrote:

Running Ceph on AWS is, as was mentioned before, certainly not going to
improve things when compared to real HW.
At the very least it will make performance unpredictable.

Your 6 OSDs are on a single VM from what I gather?
Aside from being a very small number for something that you seem to be
using in some sort of production environment (Ceph gets faster the more
OSDs you add), where is the redundancy, HA in that?

The number of your PGs and PGPs needs to have at least a semblance of being
correctly sized, as others mentioned before.
You want to re-read the Ceph docs about that and check out the PG
calculator:
http://ceph.com/pgcalc/
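
To put numbers on that, using the usual rule of thumb from the docs (total
PGs across ALL pools ~= OSDs x 100 / replica count, rounded to a power of two):

  6 OSDs x 100 / 1 replica  = 600  -> ~512 PGs total
  6 OSDs x 100 / 3 replicas = 200  -> ~256 PGs total

versus the 44 pools x 100 PGs = 4400 PGs you actually have.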

> 
> >> Our workload involves creating and destroying a lot of pools. Each
> >> pool has 100 pgs, so it adds up. Could this be causing the problem?
> >> What would you suggest instead?
> >
> > ...this is most likely the cause. Deleting a pool causes the data and
> > pgs associated with it to be deleted asynchronously, which can be a lot
> > of background work for the osds.
> >
> > If you're using the cfq scheduler you can try decreasing the priority 
> > of these operations with the "osd disk thread ioprio..." options:
> >
> > http://ceph.com/docs/master/rados/configuration/osd-config-ref/#operations 
> >
> >
> > If that doesn't help enough, deleting data from pools before deleting
> > the pools might help, since you can control the rate more finely. And
> > of course not creating/deleting so many pools would eliminate the
> > hidden background cost of deleting the pools.
> 
> Thanks for your answer. Some follow-up questions:
> 
> - I wouldn't expect that pool deletion is the problem, since our pools, 
> although many, don't contain much data. Typically, we will have one rbd 
> per pool, several GB in size, but in practice containing little data. 
> Would you expect that performance penalty from deleting pool to be 
> relative to the requested size of the rbd, or relative to the quantity 
> of data actually stored in it?
> 
Since RBDs are sparsely allocated, the actual data used is the key factor.
But you're adding the pool removal overhead to this.

> - Rather than creating and deleting multiple pools, each containing a 
> single rbd, do you think we would see a speed-up if we were to instead 
> have one pool, containing multiple (frequently created and deleted) 
> rbds? Does the performance penalty stem only from deleting pools 
> themselves, or from deleting objects within the pool as well?
> 
Both, and the fact that you have overloaded the PGs by nearly a factor of
10 (or 20 if you're actually using a replica count of 3 and not 1) doesn't
help one bit.
And let's clarify what objects are in the Ceph/RBD context: they're the (by
default) 4 MB blobs that make up an RBD image.

> - Somewhat off-topic, but for my own curiosity: Why is deleting data so 
> slow, in terms of ceph's architecture? Shouldn't it just be a matter of 
> flagging a region as available and allowing it to be overwritten, as 
> would a traditional file system?
> 
Apples and oranges, as RBD is block storage, not a FS.
That said, a traditional FS is local and updates an inode or equivalent
bit.
For Ceph to delete a RBD image, it has to go to all cluster nodes with
OSDs that have PGs that contain objects of that image. Then those objects
have to be deleted on the local filesystem of the OSD and various maps
updated cluster-wide. Rinse and repeat until all objects have been dealt
with.
Quite a bit more involved, but that's the price you have to pay when you
have a DISTRIBUTED storage architecture that doesn't rely on a single item
(like an inode) to reflect things for the whole system.
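
If you do keep the frequent deletions, you can at least try to de-prioritize
that background work with the ioprio options Josh mentioned - a sketch of a
runtime injection (only has an effect with the CFQ disk scheduler; double-check
the option names against your release):

  ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle --osd_disk_thread_ioprio_priority 7'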


Christian

> Jeff
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com