Re: [ceph-users] Erasure coding

2014-05-19 Thread yalla.gnan.kumar
Hi John,

Thanks for the reply.

It seems that to understand the internal mechanism and the algorithmic 
structure of Ceph, knowledge of information theory is necessary ☺.


Thanks
Kumar

From: John Wilkins [mailto:john.wilk...@inktank.com]
Sent: Monday, May 19, 2014 10:18 PM
To: Gnan Kumar, Yalla
Cc: Loic Dachary; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Erasure coding

I have also added a big part of Loic's discussion of the architecture into the 
Ceph architecture document here:

http://ceph.com/docs/master/architecture/#erasure-coding

On Mon, May 19, 2014 at 5:35 AM, 
mailto:yalla.gnan.ku...@accenture.com>> wrote:
Hi Loic,

Thanks for the reply.


Thanks
Kumar


-Original Message-
From: Loic Dachary [mailto:l...@dachary.org]
Sent: Monday, May 19, 2014 6:04 PM
To: Gnan Kumar, Yalla; 
ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Erasure coding

Hi,

The general idea is to preserve resilience while saving space compared to replication. 
It costs more in terms of CPU and network. You will find a short introduction 
here:

https://wiki.ceph.com/Planning/Blueprints/Dumpling/Erasure_encoding_as_a_storage_backend
https://wiki.ceph.com/Planning/Blueprints/Firefly/Erasure_coded_storage_backend_%28step_3%29
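To make the replication-vs-erasure-coding trade-off concrete, here is a minimal, self-contained sketch (an illustration only, not Ceph's jerasure plugin) of the simplest erasure code: k=2 data chunks plus m=1 XOR parity chunk. Any single lost chunk can be rebuilt, at 1.5x storage instead of the 2x of two-way replication:

```python
# Minimal XOR-parity erasure coding sketch (k=2 data chunks, m=1 parity).
# Illustration of the idea only; Ceph's plugins use Reed-Solomon codes.

def encode(data: bytes):
    """Split data into two chunks and add one XOR parity chunk."""
    half = (len(data) + 1) // 2
    d0, d1 = data[:half], data[half:].ljust(half, b"\0")
    parity = bytes(a ^ b for a, b in zip(d0, d1))
    return [d0, d1, parity]

def recover(chunks, lost: int, orig_len: int) -> bytes:
    """Rebuild the object even though any one chunk is lost (None)."""
    survivors = [c for i, c in enumerate(chunks) if i != lost and c is not None]
    rebuilt = bytes(a ^ b for a, b in zip(*survivors))  # XOR of the two survivors
    full = list(chunks)
    full[lost] = rebuilt
    return (full[0] + full[1])[:orig_len]

obj = b"hello world!"
chunks = encode(obj)
# Storage cost: 3 chunks of 6 bytes = 18 bytes vs 24 bytes for 2x replication.
damaged = [chunks[0], None, chunks[2]]  # lose the second data chunk
print(recover(damaged, 1, len(obj)))
```

Ceph's actual erasure code plugins generalize this idea with Reed-Solomon codes, so that any m of the k+m chunks can be lost and still recovered.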

For the next Ceph release, Pyramid Codes will help reduce the bandwidth 
requirements: 
https://wiki.ceph.com/Planning/Blueprints/Giant/Pyramid_Erasure_Code

Cheers

On 19/05/2014 13:52, 
yalla.gnan.ku...@accenture.com wrote:
> Hi All,
>
>
>
> What exactly is erasure coding and why is it used in ceph ? I could not get 
> enough explanatory information from the documentation.
>
>
>
>
>
> Thanks
>
> Kumar
>
>
> --
>
> This message is for the designated recipient only and may contain privileged, 
> proprietary, or otherwise confidential information. If you have received it 
> in error, please notify the sender immediately and delete the original. Any 
> other use of the e-mail by you is prohibited. Where allowed by local law, 
> electronic communications with Accenture and its affiliates, including e-mail 
> and instant messaging (including content), may be scanned by our systems for 
> the purposes of information security and assessment of internal compliance 
> with Accenture policy.
> __
>
> www.accenture.com
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Loïc Dachary, Artisan Logiciel Libre


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



--
John Wilkins
Senior Technical Writer
Inktank
john.wilk...@inktank.com
(415) 425-9599
http://inktank.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Access denied error for list users

2014-05-19 Thread Shanil S
Hi,

I am trying to create and list all users using the admin ops functions
(http://ceph.com/docs/master/radosgw/adminops/). I successfully created
the access tokens, but I am getting an access denied (403) error for the
list-users function. GET /{admin}/user is used for getting the complete
user list, but it is not listing anything and returns the error. The user
which called this function has full permissions; these are the
capabilities of that user:

{ "type": "admin",
  "perm": "*"},
{ "type": "buckets",
  "perm": "*"},
{ "type": "caps",
  "perm": "*"},
{ "type": "metadata",
  "perm": "*"},
{ "type": "usage",
  "perm": "*"},
{ "type": "users",
  "perm": "*"}],
  "op_mask": "read, write, delete",
  "default_placement": "",
  "placement_tags": [],
  "bucket_quota": { "enabled": false,
  "max_size_kb": -1,
  "max_objects": -1}}



This is in the log file which executed the list user function

-

GET

application/x-www-form-urlencoded
Tue, 20 May 2014 05:06:57 GMT
/admin/user/
2014-05-20 13:06:59.506233 7f0497fa7700 15 calculated
digest=Z8FgXRLk+ah5MUThpP9IBJrMnrA=
2014-05-20 13:06:59.506236 7f0497fa7700 15
auth_sign=Z8FgXRLk+ah5MUThpP9IBJrMnrA=
2014-05-20 13:06:59.506237 7f0497fa7700 15 compare=0
2014-05-20 13:06:59.506240 7f0497fa7700  2 req 98:0.000308::GET
/admin/user/:get_user_info:reading permissions
2014-05-20 13:06:59.506244 7f0497fa7700  2 req 98:0.000311::GET
/admin/user/:get_user_info:init op
2014-05-20 13:06:59.506247 7f0497fa7700  2 req 98:0.000314::GET
/admin/user/:get_user_info:verifying op mask
2014-05-20 13:06:59.506249 7f0497fa7700 20 required_mask= 0 user.op_mask=7
2014-05-20 13:06:59.506251 7f0497fa7700  2 req 98:0.000319::GET
/admin/user/:get_user_info:verifying op permissions
2014-05-20 13:06:59.506254 7f0497fa7700  2 req 98:0.000322::GET
/admin/user/:get_user_info:verifying op params
2014-05-20 13:06:59.506257 7f0497fa7700  2 req 98:0.000324::GET
/admin/user/:get_user_info:executing
2014-05-20 13:06:59.506291 7f0497fa7700  2 req 98:0.000359::GET
/admin/user/:get_user_info:http status=403
2014-05-20 13:06:59.506294 7f0497fa7700  1 == req done
req=0x7f04c800d7f0 http_status=403 ==
2014-05-20 13:06:59.506302 7f0497fa7700 20 process_request() returned -13

-
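As an aside on the log excerpt: digest and auth_sign match (compare=0), so the signature check passed and the 403 is an authorization (caps) failure rather than an authentication one. The digest itself is the standard S3-style HMAC-SHA1 over the canonical request; a sketch of that computation follows (the secret key below is a made-up placeholder, not taken from the log):

```python
import base64
import hmac
from hashlib import sha1

# Canonical string-to-sign, as radosgw logs it just before the digest:
#   HTTP verb, Content-MD5 (empty), Content-Type, Date, canonical resource.
string_to_sign = "\n".join([
    "GET",
    "",                                   # Content-MD5 (empty)
    "application/x-www-form-urlencoded",  # Content-Type
    "Tue, 20 May 2014 05:06:57 GMT",      # Date
    "/admin/user/",                       # canonical resource
])

secret_key = "0555b35654ad1656d804"  # hypothetical placeholder key

digest = base64.b64encode(
    hmac.new(secret_key.encode(), string_to_sign.encode(), sha1).digest()
).decode()
print(digest)  # compared against the Authorization header's signature
```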


Could you please check what the issue is?

I am using ceph version 0.80.1.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mon create error

2014-05-19 Thread reistlin87

Ok, maybe it is a bug. But on the Ubuntu distro everything works correctly; maybe they know 
about it and will fix it.

19.05.2014, 23:05, "Alfredo Deza" :
> On Sun, May 18, 2014 at 1:36 PM, reistlin87  wrote:
>
>>  Yes, I have tried. I have found that Ubuntu Server creates the admin socket with
>>  the right name. All other distros (I have tested CentOS 6.5, OpenSUSE 13.1,
>>  Debian 7.5) create the wrong admin socket name with a non-default cluster name. Is
>>  it a bug, or is it my mistake?
>>
>>  18.05.2014, 18:30, "John Wilkins" :
>>
>>  Have you tried specifying the socket path in your Ceph configuration file?
>>
>>  On Sat, May 17, 2014 at 9:38 AM, reistlin87  wrote:
>>
>>  Hi all! Sorry for my English, I am Russian)
>>
>>  We get the same error on different Linux distros (CentOS 6.4 & SuSE 11) and
>>  different Ceph versions (0.67, 0.72, 0.8).
>>
>>  Point of error:
>>
>>  We want to create a new cluster with a non-standard name (for example cephtst):
>>  [root@admin ceph]# ceph-deploy --cluster cephtst new mon
>>  Cluster creates Ok.
>>
>>  And then we want to create monitor:
>>  [root@admin ceph]# ceph-deploy --cluster cephtst mon create mon
>>
>>  We get an error related to the name of the admin socket:
>>  [root@admin ceph]# ceph-deploy --cluster cephtst mon create mon
>>  [ceph_deploy.conf][DEBUG ] found configuration file at:
>>  /root/.cephdeploy.conf
>>  [ceph_deploy.cli][INFO  ] Invoked (1.5.1): /usr/bin/ceph-deploy --cluster
>>  cephtst mon create mon
>>  [ceph_deploy.mon][DEBUG ] Deploying mon, cluster cephtst hosts mon
>>  [ceph_deploy.mon][DEBUG ] detecting platform for host mon ...
>>  [mon][DEBUG ] connected to host: mon
>>  [mon][DEBUG ] detect platform information from remote host
>>  [mon][DEBUG ] detect machine type
>>  [ceph_deploy.mon][INFO  ] distro info: CentOS 6.5 Final
>>  [mon][DEBUG ] determining if provided host has same hostname in remote
>>  [mon][DEBUG ] get remote short hostname
>>  [mon][DEBUG ] deploying mon to mon
>>  [mon][DEBUG ] get remote short hostname
>>  [mon][DEBUG ] remote hostname: mon
>>  [mon][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
>>  [mon][DEBUG ] create the mon path if it does not exist
>>  [mon][DEBUG ] checking for done path: /var/lib/ceph/mon/cephtst-mon/done
>>  [mon][DEBUG ] create a done file to avoid re-doing the mon deployment
>>  [mon][DEBUG ] create the init path if it does not exist
>>  [mon][DEBUG ] locating the `service` executable...
>>  [mon][INFO  ] Running command: /sbin/service ceph -c /etc/ceph/cephtst.conf
>>  start mon.mon
>>  [mon][DEBUG ] === mon.mon ===
>>  [mon][DEBUG ] Starting Ceph mon.mon on mon...already running
>>  [mon][INFO  ] Running command: ceph --cluster=cephtst --admin-daemon
>>  /var/run/ceph/cephtst-mon.mon.asok mon_status
>>  [mon][ERROR ] admin_socket: exception getting command descriptions: [Errno
>>  2] No such file or directory
>>  [mon][WARNIN] monitor: mon.mon, might not be running yet
>>  [mon][INFO  ] Running command: ceph --cluster=cephtst --admin-daemon
>>  /var/run/ceph/cephtst-mon.mon.asok mon_status
>>  [mon][ERROR ] admin_socket: exception getting command descriptions: [Errno
>>  2] No such file or directory
>>  [mon][WARNIN] monitor mon does not exist in monmap
>>  [mon][WARNIN] neither `public_addr` nor `public_network` keys are defined
>>  for monitors
>>  [mon][WARNIN] monitors may not be able to form quorum
>>  Unhandled exception in thread started by
>>  Error in sys.excepthook:
>>  Original exception was:
>>
>>  And in this time in folder /var/run/ceph/ present file with name
>>  ceph-mon.mon.asok
>>
>>  Why does the name of the admin socket not change to the right one?
>
> There is this one line that looks very suspicious to me:
>
>>  [mon][DEBUG ] Starting Ceph mon.mon on mon...already running
>
> This suggests that you may have already tried to create a cluster with
> the default cluster name and then have tried
> to re-create the cluster with a custom name without cleaning everything out.
>
> However! even though that might be the case, while trying to replicate
> this issue I found that I was unable to get
> a monitor daemon to start.
>
> The ticket for that issue is http://tracker.ceph.com/issues/8391
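For reference, the socket name mismatch in the output above follows directly from the admin socket naming convention, which embeds the cluster name; a daemon first started under the default name "ceph" keeps a ceph-* socket that a cephtst-* probe will never find. A small sketch of that convention (assuming the default /var/run/ceph run directory):

```python
def admin_socket_path(cluster: str, daemon_type: str, daemon_id: str) -> str:
    """Default admin socket path: $run_dir/$cluster-$type.$id.asok."""
    return f"/var/run/ceph/{cluster}-{daemon_type}.{daemon_id}.asok"

# What `ceph-deploy --cluster cephtst` probes for:
expected = admin_socket_path("cephtst", "mon", "mon")
# What actually exists, because the monitor was first started as cluster "ceph":
actual = admin_socket_path("ceph", "mon", "mon")
print(expected)
print(actual)
```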
>
>>  ___
>>  ceph-users mailing list
>>  ceph-users@lists.ceph.com
>>  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>  --
>>  John Wilkins
>>  Senior Technical Writer
>>  Inktank
>>  john.wilk...@inktank.com
>>  (415) 425-9599
>>  http://inktank.com
>>
>>  ___
>>  ceph-users mailing list
>>  ceph-users@lists.ceph.com
>>  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: "rbd map" command hangs

2014-05-19 Thread Jay Janardhan
Got the stack trace when it crashed. I had to enable the serial port to capture
this. Would this help?

 [  172.227318] libceph: mon0 192.168.56.102:6789 feature set mismatch, my
40002 < server's 20042040002, missing 2004200

[  172.451109] libceph: mon0 192.168.56.102:6789 socket error on read

[  172.539837] [ cut here ]

[  172.640704] kernel BUG at /home/apw/COD/linux/net/ceph/messenger.c:2366!

[  172.740775] invalid opcode:  [#1] SMP

[  172.805429] Modules linked in: rbd libceph libcrc32c nfsd nfs_acl
auth_rpcgss nfs fscache lockd sunrpc ext2 ppdev microcode psmouse serio_raw
parport_pc i2c_piix4 mac_hid lp parport e1000

[  173.072985] CPU 0

[  173.143909] Pid: 385, comm: kworker/0:3 Not tainted 3.6.9-030609-generic
#201212031610 innotek GmbH VirtualBox/VirtualBox

[  173.358836] RIP: 0010:[]  []
ceph_fault+0x267/0x270 [libceph]

[  173.629918] RSP: 0018:88007b497d90  EFLAGS: 00010286

[  173.731786] RAX: fffe RBX: 88007b909298 RCX:
0003

[  173.901361] RDX:  RSI:  RDI:
0039

[  174.040360] RBP: 88007b497dc0 R08: 000a R09:
fffb

[  174.235587] R10:  R11: 0199 R12:
88007b9092c8

[  174.385067] R13:  R14: a0199580 R15:
a0195773

[  174.541288] FS:  () GS:88007fc0()
knlGS:

[  174.620856] CS:  0010 DS:  ES:  CR0: 8005003b

[  174.740551] CR2: 7fefd16c5168 CR3: 7bb41000 CR4:
06f0

[  174.948095] DR0:  DR1:  DR2:


[  175.076881] DR3:  DR6: 0ff0 DR7:
0400

[  175.320731] Process kworker/0:3 (pid: 385, threadinfo 88007b496000,
task 880079735bc0)

[  175.565218] Stack:

[  175.630655]   88007b909298 88007b909690
88007b9093d0

[  175.699571]  88007b909418 88007fc0e300 88007b497df0
a018525c

[  175.710012]  88007b909690 880078e4d800 88007fc1bf00
88007fc0e340

[  175.859748] Call Trace:

[  175.909572]  [] con_work+0x14c/0x1c0 [libceph]

[  176.010436]  [] process_one_work+0x136/0x550

[  176.131098]  [] ? try_read+0x440/0x440 [libceph]

[  176.249904]  [] worker_thread+0x165/0x3c0

[  176.368412]  [] ? manage_workers+0x190/0x190

[  176.512415]  [] kthread+0x93/0xa0

[  176.623469]  [] kernel_thread_helper+0x4/0x10

[  176.670502]  [] ? flush_kthread_worker+0xb0/0xb0

[  176.731089]  [] ? gs_change+0x13/0x13

[  176.901284] Code: 00 00 00 00 48 8b 83 38 01 00 00 a8 02 0f 85 f6 fe ff
ff 3e 80 a3 38 01 00 00 fb 48 c7 83 40 01 00 00 06 00 00 00 e9 37 ff ff ff
<0f> 0b 0f 0b 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8

[  177.088895] RIP  [] ceph_fault+0x267/0x270 [libceph]

[  177.251573]  RSP 

[  177.310320] ---[ end trace f66ddfdda09b9821 ]---

[  177.461430] BUG: unable to handle kernel paging request at
fff8

[  177.464615] IP: [] kthread_data+0x11/0x20

[  177.464615] PGD 1c0e067 PUD 1c0f067 PMD 0

[  177.464615] Oops:  [#2] SMP

[  177.464615] Modules linked in: rbd libceph libcrc32c nfsd nfs_acl
auth_rpcgss nfs fscache lockd sunrpc ext2 ppdev microcode psmouse serio_raw
parport_pc i2c_piix4 mac_hid lp parport e1000

[  177.464615] CPU 0

[  177.464615] Pid: 385, comm: kworker/0:3 Tainted: G  D
3.6.9-030609-generic #201212031610 innotek GmbH VirtualBox/VirtualBox

[  177.464615] RIP: 0010:[]  []
kthread_data+0x11/0x20

[  177.464615] RSP: 0018:88007b497a70  EFLAGS: 00010096

[  177.464615] RAX:  RBX:  RCX:


[  177.464615] RDX: 81e593c0 RSI:  RDI:
880079735bc0

[  177.464615] RBP: 88007b497a88 R08: 00989680 R09:
0400

[  177.464615] R10:  R11: 880078fb09e0 R12:


[  177.464615] R13: 880079735f90 R14: 0001 R15:
0006

[  177.464615] FS:  () GS:88007fc0()
knlGS:

[  177.464615] CS:  0010 DS:  ES:  CR0: 8005003b

[  177.464615] CR2: fff8 CR3: 7b73e000 CR4:
06f0

[  177.464615] DR0:  DR1:  DR2:


[  177.464615] DR3:  DR6: 0ff0 DR7:
0400

[  177.464615] Process kworker/0:3 (pid: 385, threadinfo 88007b496000,
task 880079735bc0)

[  177.464615] Stack:

[  177.464615]  81077dc5 88007b497a88 88007fc13dc0
88007b497b08

[  177.464615]  816ade3f 88007b497ab8 
88007b497fd8

[  177.464615]  88007b497fd8 88007b497fd8 00013dc0
880078d8d618

[  177.464615] Call Trace:

[  177.464615]  [] ? wq_worker_sleeping+0x15/0xc0

[  177.464615]  [] __schedule+0x5cf/0x6f0

[  177.464615]  [] schedule+0x29/0x70

[  177.464615]  [] do_exit+0x2b3/0x4
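The first libceph line is the root cause rather than the BUG itself: the kernel client's feature mask is older than what the monitors require (the "missing" value in the log line looks truncated). The arithmetic can be checked directly from the two hex values; the meaning of each bit depends on the Ceph and kernel versions, so no bit names are guessed here:

```python
# Feature-set mismatch arithmetic from the libceph log line:
#   "my 40002 < server's 20042040002, missing ..."
client = 0x40002        # features the kernel client advertises
server = 0x20042040002  # features the monitor requires

missing = server & ~client  # server bits the client lacks
print(hex(missing))

# Individual missing bit positions, for looking up in the version's headers:
bits = [i for i in range(missing.bit_length()) if missing >> i & 1]
print(bits)
```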

[ceph-users] How do I do deep-scrub manually?

2014-05-19 Thread Jianing Yang

I found that deep scrub has a significant impact on my cluster. I've
used "ceph osd set nodeep-scrub" to disable it, but now I get the warning
"HEALTH_WARN nodeep-scrub flag(s) set". What is the proper way to
disable deep scrub, and how can I run it manually?

--
 _
/ Install 'denyhosts' to help protect \
| against brute force SSH attacks,|
\ auto-blocking multiple attempts./
 -
  \
   \
\
.-  -..--.  ,---.  .-=<>=-.
   /_-\'''/-_\  / / '' \ \ |,-.| /____\
  |/  o) (o  \|| | ')(' | |   /,'-'.\   |/ (')(') \|
   \   ._.   /  \ \/ /   {_/(') (')\_}   \   __   /
   ,>-_,,,_-<.   >'=jf='< `.   _   .','--__--'.
 /  .  \/\ /'-___-'\/:|\
(_) . (_)  /  \   / \  (_)   :|   (_)
 \_-'--/  (_)(_) (_)___(_)   |___:||
  \___/ || \___/ |_|
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] crushmap for datacenters

2014-05-19 Thread Vladislav Gorbunov
I created a new, more complex rule:
rule datacenter_rep2 {
ruleset 2
type replicated
min_size 1
max_size 10
step take default
step choose firstn 2 type datacenter
step chooseleaf firstn -1 type host
step emit
}
assigned it to the pools, and now the cluster works as I expect.
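Why the two-step rule helps: a single `step chooseleaf firstn 0 type host` ignores the datacenter level, so nothing forces the two replicas into different DCs, and re-placement can come up empty when a failure leaves too few choices under one branch. Below is a toy Python stand-in for the two-step rule (real CRUSH uses pseudo-random straw hashing; this just takes the first live candidates to show the shape of the selection, with the host/OSD names from the osd tree in this thread):

```python
# Toy emulation of:
#   step take default
#   step choose firstn 2 type datacenter
#   step chooseleaf firstn -1 type host
tree = {
    "dc1": {"tceph3": ["osd.2"], "tceph4": ["osd.3"]},
    "dc2": {"tceph1": ["osd.0"], "tceph2": ["osd.1"]},
}

def place(tree, up, replicas=2):
    dcs = list(tree)[:2]           # choose firstn 2 type datacenter
    per_dc = replicas // len(dcs)  # spread remaining replicas across DCs
    acting = []
    for dc in dcs:
        # chooseleaf: pick hosts that still have a live OSD underneath
        hosts = [h for h, osds in tree[dc].items() if any(o in up for o in osds)]
        for host in hosts[:per_dc]:
            acting += [o for o in tree[dc][host] if o in up][:1]
    return acting

up = {"osd.0", "osd.1", "osd.2", "osd.3"} - {"osd.0"}  # osd.0 is down
print(place(tree, up))  # one OSD from each datacenter, despite the failure
```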


2014-05-20 11:59 GMT+12:00 Vladislav Gorbunov :

> Hi!
>
> Can you help me to understand why crushmap with
> step chooseleaf firstn 0 type host
> can't work with hosts in data centers?
>
> If I have the osd tree:
> # id weight type name up/down reweight
> -1 0.12 root default
> -3 0.03 host tceph2
> 1 0.03 osd.1 up 1
> -4 0.03 host tceph3
> 2 0.03 osd.2 up 1
> -2 0.03 host tceph1
> 0 0.03 osd.0 up 1
> -5 0.03 host tceph4
> 3 0.03 osd.3 up 1
> -7 0 datacenter dc1
> -6 0 datacenter dc2
>
> and default crush map rule
>
> { "rule_id": 0,
>   "rule_name": "data",
>   "ruleset": 0,
>   "type": 1,
>   "min_size": 1,
>   "max_size": 10,
>   "steps": [
> { "op": "take",
>   "item": -1},
> { "op": "chooseleaf_firstn",
>   "num": 0,
>   "type": "host"},
> { "op": "emit"}]},
>
>
> used by pools:
> pool 0 'data' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
> pg_num 64 pgp_num 64 last_change 1176 owner 0 crash_replay_interval 45
> pool 1 'metadata' rep size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 1190 owner 0
> pool 2 'rbd' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
> pg_num 64 pgp_num 64 last_change 1182 owner 0
>
> When one of the osd is down, cluster successfully rebalance to OK state:
>
> # id weight type name up/down reweight
> -1 0.12 root default
> -3 0.03 host tceph2
> 1 0.03 osd.1 down 0
> -4 0.03 host tceph3
> 2 0.03 osd.2 up 1
> -2 0.03 host tceph1
> 0 0.03 osd.0 up 1
> -5 0.03 host tceph4
> 3 0.03 osd.3 up 1
> -7 0 datacenter dc1
> -6 0 datacenter dc2
>
> ceph -s
>   cluster 6bdb23fb-adac-4113-8c75-e6bd245fcfd6
>health HEALTH_OK
>monmap e1: 1 mons at {tceph1=10.166.10.95:6789/0}, election epoch 1,
> quorum 0 tceph1
>osdmap e1207: 4 osds: 3 up, 3 in
> pgmap v4114539: 480 pgs: 480 active+clean; 2628 MB data, 5840 MB used,
> 89341 MB / 95182 MB avail
>mdsmap e1: 0/0/1 up
>
> But if hosts moved to datacenters  like in this map:
> # id weight type name up/down reweight
> -1 0.12 root default
> -7 0.06 datacenter dc1
> -4 0.03 host tceph3
> 2 0.03 osd.2 up 1
> -5 0.03 host tceph4
> 3 0.03 osd.3 up 1
> -6 0.06 datacenter dc2
> -2 0.03 host tceph1
> 0 0.03 osd.0 down 0
> -3 0.03 host tceph2
> 1 0.03 osd.1 up 1
>
> cluster can't reach OK state when one host is out of cluster:
>
>   cluster 6bdb23fb-adac-4113-8c75-e6bd245fcfd6
>health HEALTH_WARN 6 pgs incomplete; 6 pgs stuck inactive; 6 pgs stuck
> unclean
>monmap e1: 1 mons at {tceph1=10.166.10.95:6789/0}, election epoch 1,
> quorum 0 tceph1
>osdmap e1256: 4 osds: 3 up, 3 in
> pgmap v4114707: 480 pgs: 474 active+clean, 6 incomplete; 2516 MB data,
> 5606 MB used, 89575 MB / 95182 MB avail
>mdsmap e1: 0/0/1 up
>
> if the downed host comes back up and returns to the cluster, then health is OK. If
> the downed osd is manually reweighted to 0, then cluster health is OK.
>
> A crushmap with
> step chooseleaf firstn 0 type datacenter
> has the same issue.
>
> { "rule_id": 3,
>   "rule_name": "datacenter_rule",
>   "ruleset": 3,
>   "type": 1,
>   "min_size": 1,
>   "max_size": 10,
>   "steps": [
> { "op": "take",
>   "item": -8},
> { "op": "chooseleaf_firstn",
>   "num": 0,
>   "type": "datacenter"},
> { "op": "emit"}]},
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Firefly 0.80 rados bench cleanup / object removal broken?

2014-05-19 Thread Guang Yang
Hi Matt,
The problem you came across is due to a change made to rados bench in the 
Firefly release. It aimed to solve the problem that if there were 
multiple rados bench instances (for writing), we want to be able to do a rados 
read for each run as well.

Unfortunately, that change broke your use case. Here is my suggestion to solve 
your problem:
1. Remove the pre-defined metadata object:
$ rados -p {pool_name} rm benchmark_last_metadata
2. Clean up by prefix:
$ sudo rados -p {pool_name} cleanup --prefix bench

Moving forward, you can use the new parameter '--run-name' to name each run 
and clean up on that basis. If you still want to do a slow linear search to 
clean up, be sure to remove the benchmark_last_metadata object before you kick 
off the cleanup.
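The steps above can be modeled as plain prefix filtering over the pool's object listing; the point of the ordering is that benchmark_last_metadata goes away first, so the cleanup falls back to the linear prefix scan instead of consulting stale metadata. A toy model (standing in for `rados ls` / `rados rm`, which this sketch does not call):

```python
# Toy model of the suggested cleanup order against a pool's object listing.
pool = [
    "benchmark_data_host1_4096_object0",
    "benchmark_data_host1_4096_object1",
    "benchmark_last_metadata",
    "my_real_object",
]

def cleanup(objects, prefix="bench"):
    # Step 1: drop the metadata object so nothing consults a stale record.
    objects = [o for o in objects if o != "benchmark_last_metadata"]
    # Step 2: linear scan, removing everything that matches the prefix.
    return [o for o in objects if not o.startswith(prefix)]

print(cleanup(pool))  # only the non-benchmark object survives
```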

Let me know if that helps.

Thanks,
Guang

On May 20, 2014, at 6:45 AM, matt.lat...@hgst.com wrote:

> 
> I was experimenting previously with 0.72 , and could easily cleanup pool
> objects from several previous rados bench (write) jobs with :
> 
> rados -p <pool> cleanup bench  (would remove all objects starting
> with "bench")
> 
> I quickly realised when I moved to 0.80 that my script was broken and
> theoretically I now need:
> 
> rados -p <pool> cleanup --prefix benchmark_data
> 
> But this only works sometimes, and sometimes only partially. Issuing the command
> line twice seems to help a bit! Also, if I do "rados -p <pool> ls"
> beforehand, it seems to increase my chances of success, but often I am
> still left with benchmark objects undeleted. I also tried using the
> --run-name option, to no avail.
> 
> The story gets more bizarre now that I have set up a "hot SSD" cache pool in
> front of the backing OSD (SATA) pool. Objects won't delete from either pool
> with rados cleanup. I tried
> 
> "rados -p <pool> cache-flush-evict-all"
> 
> which worked (rados df shows all objects now on the backing pool). Then,
> bizarrely, trying cleanup from the backing OSD pool just appears to copy
> them back into the cache pool, and they remain on the backing pool.
> 
> I can list individual object names with
> 
> rados -p <pool> ls
> 
> but rados rm <object> will not remove individual objects, stating "file
> or directory not found".
> 
> Are others seeing these things and any ways to work around or am I doing
> something wrong?  Are these commands now deprecated in which case what
> should I use?
> 
> Ubuntu 12.04, Kernel 3.14.0
> 
> Matt Latter
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ' rbd username specified but secret not found' error, virsh live migration on rbd

2014-05-19 Thread JinHwan Hwang
Thank you for your help.
Yes, I also defined it on the second node (the destination node). I've
double-checked ephemeral and private; both of them show me this:

~# virsh secret-dumpxml b34526f2-8d32-ed5d-3153-e90d011dd37e
<secret ephemeral='no' private='no'>
  <uuid>b34526f2-8d32-ed5d-3153-e90d011dd37e</uuid>
  <usage type='ceph'>
    <name>client.libvirt secret</name>
  </usage>
</secret>


https://ceph.com/docs/master/rbd/libvirt/#configuring-ceph
I did configuring-ceph steps 4 to 7 on the destination node. One different
thing is defining the same UUID as on the source node, to retain the secret
UUID in the virsh xml description of the vm. I mean, at step 4, I actually did this:

cat > secret.xml <<EOF
<secret ephemeral='no' private='no'>
  <usage type='ceph'>
    <name>client.libvirt secret</name>
  </usage>
  <uuid>b34526f2-8d32-ed5d-3153-e90d011dd37e</uuid>
</secret>
EOF

Except for this, everything I did for the secret on the destination node is the
same as on the source node.





2014-05-20 9:04 GMT+09:00 Josh Durgin :

> On 05/19/2014 01:48 AM, JinHwan Hwang wrote:
>
>> I have been trying to do live migration on vm which is running on rbd.
>> But so far, they only give me ' internal error: rbd username 'libvirt'
>> specified but secret not found' when i do live migration.
>>
>> ceph-admin : source host
>> host : destination host
>>
>> root@main:/home/ceph-admin# virsh migrate --live rbd1-1
>>   qemu+ssh://host/system
>> error: internal error: rbd username 'libvirt' specified but secret not
>> found
>>
>> These are rbd1-1 vm dump. It worked for running rbd1-1.
>> .
>>   
>>
>>
>>  
>>
>>
>>  
>>  
>>
>> 
>>
>> Is '...' not sufficient for doing live migration? I also
>> have tried setting same secret on both source and destination host
>> virsh(so both host virsh have
>> uuid='b34526f2-8d32-ed5d-3153-e90d011dd37e' ), But it didn't worked.
>> I've followed this('http://ceph.com/docs/master/rbd/libvirt/')
>> instruction and this is only secret so far i know. Am i miss something?
>> If that so, Where should i put those missing 'secrets'?
>> Thanks in advance for any helps.
>>
>
> Did you set the value of the secret on the second node?
> Is it defined with ephemeral='no' and private='no'?
>
> Josh
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ' rbd username specified but secret not found' error, virsh live migration on rbd

2014-05-19 Thread Josh Durgin

On 05/19/2014 01:48 AM, JinHwan Hwang wrote:

I have been trying to do live migration of a vm which is running on rbd.
But so far, it only gives me 'internal error: rbd username 'libvirt'
specified but secret not found' when I do the live migration.

ceph-admin : source host
host : destination host

root@main:/home/ceph-admin# virsh migrate --live rbd1-1
  qemu+ssh://host/system
error: internal error: rbd username 'libvirt' specified but secret not found

These are rbd1-1 vm dump. It worked for running rbd1-1.
.
  
   
   
 
   
   
 
 
   


Is '...' not sufficient for doing live migration? I have also
tried setting the same secret on both the source and destination host
virsh (so both hosts have
uuid='b34526f2-8d32-ed5d-3153-e90d011dd37e'), but it didn't work.
I've followed the instructions at 'http://ceph.com/docs/master/rbd/libvirt/',
and this is the only secret as far as I know. Am I missing something?
If so, where should I put those missing 'secrets'?
Thanks in advance for any help.


Did you set the value of the secret on the second node?
Is it defined with ephemeral='no' and private='no'?

Josh
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] crushmap for datacenters

2014-05-19 Thread Vladislav Gorbunov
Hi!

Can you help me understand why a crushmap with
step chooseleaf firstn 0 type host
can't work with hosts in datacenters?

If I have the osd tree:
# id weight type name up/down reweight
-1 0.12 root default
-3 0.03 host tceph2
1 0.03 osd.1 up 1
-4 0.03 host tceph3
2 0.03 osd.2 up 1
-2 0.03 host tceph1
0 0.03 osd.0 up 1
-5 0.03 host tceph4
3 0.03 osd.3 up 1
-7 0 datacenter dc1
-6 0 datacenter dc2

and default crush map rule

{ "rule_id": 0,
  "rule_name": "data",
  "ruleset": 0,
  "type": 1,
  "min_size": 1,
  "max_size": 10,
  "steps": [
{ "op": "take",
  "item": -1},
{ "op": "chooseleaf_firstn",
  "num": 0,
  "type": "host"},
{ "op": "emit"}]},


used by pools:
pool 0 'data' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
pg_num 64 pgp_num 64 last_change 1176 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 1190 owner 0
pool 2 'rbd' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
pg_num 64 pgp_num 64 last_change 1182 owner 0

When one of the OSDs is down, the cluster successfully rebalances to the OK state:

# id weight type name up/down reweight
-1 0.12 root default
-3 0.03 host tceph2
1 0.03 osd.1 down 0
-4 0.03 host tceph3
2 0.03 osd.2 up 1
-2 0.03 host tceph1
0 0.03 osd.0 up 1
-5 0.03 host tceph4
3 0.03 osd.3 up 1
-7 0 datacenter dc1
-6 0 datacenter dc2

ceph -s
  cluster 6bdb23fb-adac-4113-8c75-e6bd245fcfd6
   health HEALTH_OK
   monmap e1: 1 mons at {tceph1=10.166.10.95:6789/0}, election epoch 1,
quorum 0 tceph1
   osdmap e1207: 4 osds: 3 up, 3 in
pgmap v4114539: 480 pgs: 480 active+clean; 2628 MB data, 5840 MB used,
89341 MB / 95182 MB avail
   mdsmap e1: 0/0/1 up

But if the hosts are moved into datacenters like in this map:
# id weight type name up/down reweight
-1 0.12 root default
-7 0.06 datacenter dc1
-4 0.03 host tceph3
2 0.03 osd.2 up 1
-5 0.03 host tceph4
3 0.03 osd.3 up 1
-6 0.06 datacenter dc2
-2 0.03 host tceph1
0 0.03 osd.0 down 0
-3 0.03 host tceph2
1 0.03 osd.1 up 1

the cluster can't reach the OK state when one host is out of the cluster:

  cluster 6bdb23fb-adac-4113-8c75-e6bd245fcfd6
   health HEALTH_WARN 6 pgs incomplete; 6 pgs stuck inactive; 6 pgs stuck
unclean
   monmap e1: 1 mons at {tceph1=10.166.10.95:6789/0}, election epoch 1,
quorum 0 tceph1
   osdmap e1256: 4 osds: 3 up, 3 in
pgmap v4114707: 480 pgs: 474 active+clean, 6 incomplete; 2516 MB data,
5606 MB used, 89575 MB / 95182 MB avail
   mdsmap e1: 0/0/1 up

if the downed host comes back up and returns to the cluster, then health is OK. If the
downed osd is manually reweighted to 0, then cluster health is OK.

A crushmap with
step chooseleaf firstn 0 type datacenter
has the same issue.

{ "rule_id": 3,
  "rule_name": "datacenter_rule",
  "ruleset": 3,
  "type": 1,
  "min_size": 1,
  "max_size": 10,
  "steps": [
{ "op": "take",
  "item": -8},
{ "op": "chooseleaf_firstn",
  "num": 0,
  "type": "datacenter"},
{ "op": "emit"}]},
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Firefly 0.80 rados bench cleanup / object removal broken?

2014-05-19 Thread Matt . Latter

I was experimenting previously with 0.72 , and could easily cleanup pool
objects from several previous rados bench (write) jobs with :

rados -p <pool> cleanup bench  (would remove all objects starting
with "bench")

I quickly realised when I moved to 0.80 that my script was broken and
theoretically I now need:

rados -p <pool> cleanup --prefix benchmark_data

But this only works sometimes, and sometimes only partially. Issuing the command
line twice seems to help a bit! Also, if I do "rados -p <pool> ls"
beforehand, it seems to increase my chances of success, but often I am
still left with benchmark objects undeleted. I also tried using the
--run-name option, to no avail.

The story gets more bizarre now that I have set up a "hot SSD" cache pool in
front of the backing OSD (SATA) pool. Objects won't delete from either pool
with rados cleanup. I tried

"rados -p <pool> cache-flush-evict-all"

which worked (rados df shows all objects now on the backing pool). Then,
bizarrely, trying cleanup from the backing OSD pool just appears to copy
them back into the cache pool, and they remain on the backing pool.

I can list individual object names with

rados -p <pool> ls

but rados rm <object> will not remove individual objects, stating "file
or directory not found".

Are others seeing these things and any ways to work around or am I doing
something wrong?  Are these commands now deprecated in which case what
should I use?

Ubuntu 12.04, Kernel 3.14.0

Matt Latter

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD for ephemeral

2014-05-19 Thread Pierre Grandin
With the help of Josh on IRC, we found that the glance_api_version
directive actually has to be in the [default] block of your cinder.conf (I happen
to have two storage backends, and this directive was in my rbd block).

After fixing this config, my volumes created via cinder are indeed COW.

Now I need to figure out why nova is still doing rbd --import...

I'm not sure if the v1 url below is correct... Any idea?

2014-05-19 13:40:12.780 27548 AUDIT nova.compute.manager
[req-17b62396-4385-4788-b7c7-9bf5a5134c68
8bb8a6f4fe73422a9f464875eecac1d7 a17f155c44324038b6587b233bc45794]
[instance: 6ab377be-706d-45c2-a139-2860b875e89d] Starting instance...
2014-05-19 13:40:13.858 27548 DEBUG nova.image.glance
[req-17b62396-4385-4788-b7c7-9bf5a5134c68
8bb8a6f4fe73422a9f464875eecac1d7 a17f155c44324038b6587b233bc45794]
fetching image 6850844a-2899-4bc6-a957-de9705d56130 from glance
get_remote_image_service
/usr/lib64/python2.7/site-packages/nova/image/glance.py:575
2014-05-19 13:40:13.859 27548 DEBUG stevedore.extension [-] found
extension EntryPoint.parse('file = nova.image.download.file')
_load_plugins /usr/lib64/python2.7/site-packages/stevedore/extension.py:156
2014-05-19 13:40:13.859 27548 DEBUG stevedore.extension [-] found
extension EntryPoint.parse('file = nova.image.download.file')
_load_plugins /usr/lib64/python2.7/site-packages/stevedore/extension.py:156
2014-05-19 13:40:13.864 27548 DEBUG glanceclient.common.http [-] curl
-i -X HEAD -H 'X-Service-Catalog: [{"endpoints_links": [],
"endpoints": [{"adminURL":
"http://172.16.128.32:8776/v1/a17f155c44324038b6587b233bc45794";,
"region": "RegionOne", "publicURL":
"http://172.16.128.223:8776/v1/a17f155c44324038b6587b233bc45794";,
"internalURL": "http://172.16.128.32:8776/v1/a17f155c44324038b6587b233bc45794";,
"id": "5366b2488dba4acfba6190cfbeb740eb"}], "type": "volume", "name":
"cinder"}]' -H 'X-Identity-Status: Confirmed' -H 'X-Roles:
Member,admin' -H 'User-Agent: python-glanceclient' -H 'X-Tenant-Id:
a17f155c44324038b6587b233bc45794' -H 'X-User-Id:
8bb8a6f4fe73422a9f464875eecac1d7' -H 'X-Auth-Token:
48754a34ec124f9b8787c921235df775' -H 'Content-Type:
application/octet-stream'
http://172.16.128.223:9292/v1/images/6850844a-2899-4bc6-a957-de9705d56130
log_curl_request
/usr/lib64/python2.7/site-packages/glanceclient/common/http.py:142
2014-05-19 13:40:13.887 27548 DEBUG glanceclient.common.http [-]
HTTP/1.1 200 OK
content-length: 0
x-image-meta-id: 6850844a-2899-4bc6-a957-de9705d56130
date: Mon, 19 May 2014 20:40:13 GMT
x-image-meta-deleted: False
x-image-meta-checksum: cca00ef7030ebc1cb6b2613b50c21bf1
x-image-meta-container_format: bare
x-image-meta-protected: False
x-image-meta-min_disk: 0
x-image-meta-created_at: 2014-05-13T20:17:20
x-image-meta-size: 2147483648
x-image-meta-status: active
etag: cca00ef7030ebc1cb6b2613b50c21bf1
location: 
http://172.16.128.223:9292/v1/images/6850844a-2899-4bc6-a957-de9705d56130
x-image-meta-is_public: True
x-image-meta-min_ram: 0
x-image-meta-owner: 6abac8a5da7d4928aea04ffb0c75571a
x-image-meta-updated_at: 2014-05-13T20:19:00
content-type: text/html; charset=UTF-8
x-openstack-request-id: req-d71c7a3b-801d-43f1-8055-8222cd175084
x-image-meta-disk_format: raw
x-image-meta-name: base-image-20130909-1-rbd
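The glance v1 HEAD response above carries all the image metadata in `x-image-meta-*` headers. A minimal sketch (illustrative, not actual glanceclient code) of how such headers collapse into a metadata dict — note the log shows `disk_format: raw`, which is a precondition for RBD copy-on-write cloning:

```python
def image_meta_from_headers(headers):
    """Collect x-image-meta-* headers from a glance v1 HEAD response
    into a plain metadata dict (hypothetical helper for illustration)."""
    prefix = "x-image-meta-"
    meta = {}
    for key, value in headers.items():
        key = key.lower()
        if key.startswith(prefix):
            meta[key[len(prefix):]] = value
    return meta

# Sample headers taken from the log above:
headers = {
    "x-image-meta-id": "6850844a-2899-4bc6-a957-de9705d56130",
    "x-image-meta-disk_format": "raw",
    "x-image-meta-size": "2147483648",
    "content-type": "text/html; charset=UTF-8",
}
meta = image_meta_from_headers(headers)
```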



On Mon, May 19, 2014 at 8:22 AM, Michael J. Kidd
wrote:

> After sending my earlier email, I found another commit that was merged in
> March:
> https://review.openstack.org/#/c/59149/
>
> It seems to follow a newer image-handling technique that was being sought,
> which is what prevented the first patch from being merged in...
>
> Michael J. Kidd
> Sr. Storage Consultant
> Inktank Professional Services
>
>
> On Mon, May 19, 2014 at 11:20 AM, Pierre Grandin <
> pierre.gran...@tubemogul.com> wrote:
>
>> Actually you can get the patched code from here for Havana :
>> https://github.com/jdurgin/nova/tree/havana-ephemeral-rbd
>>
>> But I'm still trying to get it to work (in my case the volumes are still
>> copies, not copy-on-write).
>>
>>
>> On Mon, May 19, 2014 at 7:19 AM, Michael J. Kidd <
>> michael.k...@inktank.com> wrote:
>>
>>> Since the status is 'Abandoned', it would appear that the fix has not
>>> been merged into any release of OpenStack.
>>>
>>> Thanks,
>>>
>>> Michael J. Kidd
>>> Sr. Storage Consultant
>>> Inktank Professional Services
>>>
>>>
>>> On Sun, May 18, 2014 at 5:13 PM, Yuming Ma (yumima) wrote:
>>>
  Wondering what is the status of this fix
 https://review.openstack.org/#/c/46879/? Which release has it?
 — Yuming

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>>
>>
>> --
>> *Pierre Grandin *| Senior Site Reliability Engineer
>> *M: *510.423.2231  | 
>> @p_grandin

Re: [ceph-users] Ceph Plugin for Collectd

2014-05-19 Thread David McBride
On 14/05/14 13:24, Christian Eichelmann wrote:
> Hi Ceph User!
> 
> I had a look at the "official" collectd fork for ceph, which is quite
> outdated and not compatible with the upstream version.
> 
> Since this was not an option for us, I've written a Python plugin for
> collectd that gets all the precious information out of the admin
> socket's "perf dump" command. It runs on our production cluster right now
> and I'd like to share it with you:
> 
> https://github.com/Crapworks/collectd-ceph
> 
> Any feedback is welcome!

Fun, I'd just implemented something very similar!

I've just pushed my version upstream to:

 https://github.com/dwm/collectd-ceph

There appear to be some minor differences between our designs:

 * I don't require that a types DB be kept up to date and consistent;
   rather, I've reused the generic 'counter' and 'gauge' types.

 * My version includes some historical processing to allow for the
   calculation of per-period, rather than global, average values.

 * Your version is nicer in that it communicates with the admin socket
   directly; I was lazy and simply invoked the `ceph` command-line tool
   to do that work for me.  It's not currently a significant
   performance hit, but should be improved.
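Either style of poller boils down to the same core step: flattening the nested JSON that "perf dump" returns into dotted metric names. A minimal sketch (the socket path and the sample schema are assumptions; `perf_dump_metrics` mirrors the lazy CLI-invocation approach described above):

```python
import json
import subprocess

def flatten(prefix, obj, out):
    """Flatten nested 'perf dump' JSON into dotted metric names."""
    for key, value in obj.items():
        name = "%s.%s" % (prefix, key) if prefix else key
        if isinstance(value, dict):
            flatten(name, value, out)
        else:
            out[name] = value

def perf_dump_metrics(socket_path="/var/run/ceph/ceph-osd.0.asok"):
    """Invoke the ceph CLI against an admin socket (the lazy approach)."""
    raw = subprocess.check_output(
        ["ceph", "--admin-daemon", socket_path, "perf", "dump"])
    metrics = {}
    flatten("", json.loads(raw), metrics)
    return metrics

# Flattening a hand-written fragment shaped like "perf dump" output:
sample = {"osd": {"op_r": 10, "op_w": 4},
          "filestore": {"journal_queue_ops": 0}}
metrics = {}
flatten("", sample, metrics)
```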

I've been feeding all of my Ceph performance counter data to a Graphite
cluster via CollectD, using the CollectD AMQP plugin and a RabbitMQ
cluster as an intermediary, and Grafana as a query/graphing tool.

Apart from causing some stress on the disk-subsystem attempting to
write-out all those metrics, this has been working out quite well...

Cheers,
David
-- 
David McBride 
Unix Specialist, University Information Services
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] is cephfs ready for production ?

2014-05-19 Thread Gregory Farnum
On Mon, May 19, 2014 at 9:05 AM, Ignazio Cassano
 wrote:
> Hi all, I'd like to know if CephFS is in heavy development or ready
> for production.
> The documentation reports it is not for production, but I think the
> documentation on ceph.com is not recent enough.

There are groups successfully using CephFS, but we aren't ready to
label it as ready for general production use at this time. We're
expanding our testing coverage and working on stability as fast as we
can. :)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] metadata pool : size growing

2014-05-19 Thread Gregory Farnum
On Mon, May 19, 2014 at 4:05 AM, Florent B  wrote:
> On 05/19/2014 12:40 PM, Wido den Hollander wrote:
>> On 05/19/2014 12:31 PM, Florent B wrote:
>>> Hi all,
>>>
>>> I use CephFS and I am wondering what does "metadata" pool contains
>>> exactly ?
>>>
>>
>> It contains the directory structure of your filesystem
>>
>>> Its size is growing (multiple GB).
>>>
>>> It seems that it stores all operations occurring in CephFS, but is it
>>> totally necessary ?
>>>
>>
>> It doesn't store all operations, but it stores all the metadata like
>> directories, owners, file sizes, but not the contents of the files.
>
> It really seems to contain more than that. In our previous FS (MooseFS),
> metadata was around 600MB for more files than we have on CephFS (whose
> metadata is around 3GB)! That's really strange...

It also contains per-MDS journals, which can get reasonably large.
And just to clarify, when you restart an MDS it does not need to read
*all* the metadata in the pool; it will initially read in the
still-necessary parts of the journal and then will page in the rest of
the structure as needed by subsequent client requests.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS parallel reads from multiple replicas ?

2014-05-19 Thread Gregory Farnum
On Sat, May 17, 2014 at 5:40 PM, Michal Pazdera
 wrote:
> Hi everyone,
>
> I wonder if CephFS is able to read from all replicas simultaneously, which
> would result in doubled read performance if
> a replica count of 2 is used.

Nope, definitely not.

> I have done some humble testing on 5 PCs (2 OSDs
> (each on 1 PC) + 1 MON/MDS PC + 2 CephFS clients).
> I tried two configurations:
>
> 1) default from ceph-deploy, replication 2, 4MB objects with 1 stripe count
> and 4MB stripe size.
> 2) custom conf without replication, 4MB object with 4 stripe count and 1MB
> stripe size.
>
> If replication is used, write performance is around 30MB/s and read 100MB/s.
> If not, write is around 55MB/s and read is surprisingly lower at 95MB/s. I
> would expect it to be around 160 or 180 since I used two hard drives.
>
> The journal was on the same OSD.
>
> So I think with replication the data was served from both replicas, but I'm
> not sure. I have tried to find some more information about
> this but couldn't find any.

I think you're just seeing random variation based on on-disk layouts
or something. 95 and 100 are not far off for tests run over multiple
hard drives.
And while there are a number of other things it could be (lots of
stuff bottlenecks around 100MB/s), it might just be the combination of
your network throughput (which in practice can get ~115MB/s, if it's
gigabit) and occasional latencies.
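The ~115MB/s practical figure for gigabit follows from simple framing arithmetic; a sketch (standard 1500-byte MTU, no TCP options — the overhead constants are the usual Ethernet/IP/TCP header sizes):

```python
# Theoretical goodput of gigabit Ethernet for bulk TCP transfers.
line_rate = 125e6                   # 1 Gb/s = 125 MB/s on the wire

mtu = 1500                          # standard Ethernet MTU
tcp_payload = mtu - 20 - 20         # minus IP and TCP headers = 1460 bytes
wire_frame = mtu + 14 + 4 + 8 + 12  # + Ethernet header, FCS, preamble, gap

goodput = line_rate * tcp_payload / wire_frame  # roughly 118.7 MB/s
```

TCP options (timestamps) and occasional latencies shave that toward the ~115MB/s seen in practice.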

> I have one last question. Do journals impact read performance? I mean, if
> the journal is on the same OSD, the write performance is
> two times slower because the write is done to the journal and then onto the
> OSD. Does something similar happen for reads too? I think not,
> because it seems unlikely to me.

Nope, the journals have no direct impact on reads.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Looking for ceph consultant

2014-05-19 Thread Glen Aidukas
Hello, 

We are looking for a Ceph consultant to help with setting up an S3 gateway.  

If interested, please contact me via email.

Thanks!

-Glen

gaidu...@behaviormatrix.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mon create error

2014-05-19 Thread Alfredo Deza
On Sun, May 18, 2014 at 1:36 PM, reistlin87  wrote:
>
> Yes, I have tried. I have found that Ubuntu Server creates the admin socket
> with the right name. All other distros (I have tested CentOS 6.5, OpenSUSE
> 13.1, Debian 7.5) create the wrong admin socket name with a non-default
> cluster name. Is it a bug, or is it my mistake?
>
> 18.05.2014, 18:30, "John Wilkins" :
>
> Have you tried specifying the socket path in your Ceph configuration file?
>
>
> On Sat, May 17, 2014 at 9:38 AM, reistlin87  wrote:
>
> Hi all! Sorry for my English, I am Russian :)
>
> We get the same error on different Linux distros (CentOS 6.4 & SuSE 11) and
> different Ceph versions (0.67, 0.72, 0.8).
>
> Point of error:
>
> We want to create a new cluster with a non-standard name (for example cephtst):
> [root@admin ceph]# ceph-deploy --cluster cephtst new mon
> Cluster creates Ok.
>
> And then we want to create monitor:
> [root@admin ceph]# ceph-deploy --cluster cephtst mon create mon
>
> We get an error related to the name of the admin socket:
> [root@admin ceph]# ceph-deploy --cluster cephtst mon create mon
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /root/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (1.5.1): /usr/bin/ceph-deploy --cluster
> cephtst mon create mon
> [ceph_deploy.mon][DEBUG ] Deploying mon, cluster cephtst hosts mon
> [ceph_deploy.mon][DEBUG ] detecting platform for host mon ...
> [mon][DEBUG ] connected to host: mon
> [mon][DEBUG ] detect platform information from remote host
> [mon][DEBUG ] detect machine type
> [ceph_deploy.mon][INFO  ] distro info: CentOS 6.5 Final
> [mon][DEBUG ] determining if provided host has same hostname in remote
> [mon][DEBUG ] get remote short hostname
> [mon][DEBUG ] deploying mon to mon
> [mon][DEBUG ] get remote short hostname
> [mon][DEBUG ] remote hostname: mon
> [mon][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
> [mon][DEBUG ] create the mon path if it does not exist
> [mon][DEBUG ] checking for done path: /var/lib/ceph/mon/cephtst-mon/done
> [mon][DEBUG ] create a done file to avoid re-doing the mon deployment
> [mon][DEBUG ] create the init path if it does not exist
> [mon][DEBUG ] locating the `service` executable...
> [mon][INFO  ] Running command: /sbin/service ceph -c /etc/ceph/cephtst.conf
> start mon.mon
> [mon][DEBUG ] === mon.mon ===
> [mon][DEBUG ] Starting Ceph mon.mon on mon...already running
> [mon][INFO  ] Running command: ceph --cluster=cephtst --admin-daemon
> /var/run/ceph/cephtst-mon.mon.asok mon_status
> [mon][ERROR ] admin_socket: exception getting command descriptions: [Errno
> 2] No such file or directory
> [mon][WARNIN] monitor: mon.mon, might not be running yet
> [mon][INFO  ] Running command: ceph --cluster=cephtst --admin-daemon
> /var/run/ceph/cephtst-mon.mon.asok mon_status
> [mon][ERROR ] admin_socket: exception getting command descriptions: [Errno
> 2] No such file or directory
> [mon][WARNIN] monitor mon does not exist in monmap
> [mon][WARNIN] neither `public_addr` nor `public_network` keys are defined
> for monitors
> [mon][WARNIN] monitors may not be able to form quorum
> Unhandled exception in thread started by
> Error in sys.excepthook:
> Original exception was:
>
> And in this time in folder /var/run/ceph/ present file with name
> ceph-mon.mon.asok
>
> Why does the name of the admin socket not change to the right one?

There is this one line that looks very suspicious to me:


> [mon][DEBUG ] Starting Ceph mon.mon on mon...already running

This suggests that you may have already tried to create a cluster with
the default cluster name and then tried
to re-create the cluster with a custom name without cleaning everything out.

However, even though that might be the case, while trying to replicate
this issue I found that I was unable to get
a monitor daemon to start.

The ticket for that issue is http://tracker.ceph.com/issues/8391
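For reference, John's earlier question — specifying the socket path in the Ceph configuration file — would correspond to something like the following ceph.conf fragment. This is a hedged sketch: the $cluster/$name metavariables are the documented way to make the socket name track the cluster name, which for cluster "cephtst" should yield the expected cephtst-mon.mon.asok rather than ceph-mon.mon.asok:

```ini
[global]
# Make the admin socket name follow the cluster name:
admin socket = /var/run/ceph/$cluster-$name.asok
```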


> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>
> --
> John Wilkins
> Senior Technical Writer
> Inktank
> john.wilk...@inktank.com
> (415) 425-9599
> http://inktank.com
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Web Gateway Start problem after upgrading Emperor to Firefly

2014-05-19 Thread Julien Calvet
Hello,

I have a big problem with the web gateway after upgrading Emperor to Firefly.

I have only this in my logs:

2014-05-19 20:29:17.303454 7f131217a780  0 ceph version 0.80.1 
(a38fe1169b6d2ac98b427334c12d7cf81f809b74), process radosgw, pid 4002
2014-05-19 20:29:17.329183 7f131217a780  0 client.8004.objecter  FULL, paused 
modify 0xf86db0 tid 8
2014-05-19 20:34:17.304370 7f1309d68700 -1 Initialization timeout, failed to 
initialize


Could anybody help me?


Best regards

Julien
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: "rbd map" command hangs

2014-05-19 Thread Ilya Dryomov
On Mon, May 19, 2014 at 8:37 PM, Jay Janardhan  wrote:
> Ilya, The SysRq is not doing anything as the kernel is hung. Btw, this is a
> VirtualBox environment so I used VBoxManage to send the SysRq commands.
> Just to let you know, the system locks up and the only way out is a hard
> reset.

Well, that's not much to go on.  Was there something in dmesg when it
locked up or in response to SysRqs?

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Erasure coding

2014-05-19 Thread John Wilkins
I have also added a big part of Loic's discussion of the architecture into
the Ceph architecture document here:

http://ceph.com/docs/master/architecture/#erasure-coding


On Mon, May 19, 2014 at 5:35 AM,  wrote:

> Hi Loic,
>
> Thanks for the reply.
>
>
> Thanks
> Kumar
>
>
> -Original Message-
> From: Loic Dachary [mailto:l...@dachary.org]
> Sent: Monday, May 19, 2014 6:04 PM
> To: Gnan Kumar, Yalla; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Erasure coding
>
> Hi,
>
> The general idea is to preserve resilience but save space compared to
> replication. It costs more in terms of CPU and network. You will find a
> short introduction here:
>
>
> https://wiki.ceph.com/Planning/Blueprints/Dumpling/Erasure_encoding_as_a_storage_backend
>
> https://wiki.ceph.com/Planning/Blueprints/Firefly/Erasure_coded_storage_backend_%28step_3%29
>
> For the next Ceph release Pyramid Codes will help reduce the bandwidth
> requirements
> https://wiki.ceph.com/Planning/Blueprints/Giant/Pyramid_Erasure_Code
>
> Cheers
>
> On 19/05/2014 13:52, yalla.gnan.ku...@accenture.com wrote:
> > Hi All,
> >
> >
> >
> > What exactly is erasure coding and why is it used in Ceph? I could not
> get enough explanatory information from the documentation.
> >
> >
> >
> >
> >
> > Thanks
> >
> > Kumar
> >
> >
> >
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
John Wilkins
Senior Technical Writer
Inktank
john.wilk...@inktank.com
(415) 425-9599
http://inktank.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
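Loic's space-versus-CPU trade-off above can be illustrated with the simplest possible erasure code: a toy k=2, m=1 XOR parity scheme (a sketch only — Ceph's actual plugins use Reed-Solomon via jerasure). Two data chunks plus one parity chunk survive the loss of any single chunk at 1.5x storage overhead, where whole-copy replication would need 2x or 3x for comparable resilience:

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data):
    """Split data into k=2 equal chunks plus m=1 XOR parity chunk."""
    half = (len(data) + 1) // 2
    d1 = data[:half]
    d2 = data[half:] + b"\x00" * (2 * half - len(data))  # pad to equal size
    return d1, d2, xor_bytes(d1, d2)

def recover(chunks):
    """Rebuild the one missing chunk (marked None) from the other two,
    returning the two data chunks."""
    missing = chunks.index(None)
    others = [c for c in chunks if c is not None]
    rebuilt = xor_bytes(others[0], others[1])
    chunks = list(chunks)
    chunks[missing] = rebuilt
    return chunks[0], chunks[1]

d1, d2, parity = encode(b"erasure!")
r1, r2 = recover((None, d2, parity))  # lose the first chunk, rebuild it
```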


Re: [ceph-users] Fwd: "rbd map" command hangs

2014-05-19 Thread Jay Janardhan
Ilya, The SysRq is not doing anything as the kernel is hung. Btw, this is a
VirtualBox environment so I used VBoxManage to send the SysRq commands.
Just to let you know, the system locks up and the only way out is a hard
reset.




On Mon, May 19, 2014 at 12:10 PM, Jay Janardhan wrote:

> Thanks for the response Ilya. I need to figure out how to use SysRq on my
> Mac. Meanwhile, here is the strace output and ceph version:
>
> *Ceph Version: *ceph version 0.80.1
> (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
>
> Note that IP 192.168.56.102:6789 is reachable from the client node.
>
>  *192.168.56.102 is a monitor node.*
>
> $
> ceph status
>
>  cluster df4f503a-04a9-4572-96d3-e31218592cfa
>
>  health HEALTH_OK
>
>  monmap e1: 1 mons at {ceph-node1=192.168.56.102:6789/0}, election
> epoch 2, quorum 0 ceph-node1
>
>  osdmap e102: 3 osds: 3 up, 3 in
>
>   pgmap v1604: 192 pgs, 3 pools, 1373 bytes data, 4 objects
>
> 22744 MB used, 202 GB / 236 GB avail
>
>  192 active+clean
>
> *strace output:*
>
> map(NULL, 8392704, PROT_READ|PROT_WRITE|PROT_EXEC,
> MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f7899894000
>
> mprotect(0x7f7899894000, 4096, PROT_NONE) = 0
>
> clone(child_stack=0x7f789a093f70,
> flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
> parent_tidptr=0x7f789a0949d0, tls=0x7f789a094700,
> child_tidptr=0x7f789a0949d0) = 1629
>
> rt_sigprocmask(SIG_SETMASK, [PIPE], NULL, 8) = 0
>
> open("/etc/ceph/ceph.client.admin.keyring", O_RDONLY) = 3
>
> close(3)= 0
>
> open("/etc/ceph/ceph.client.admin.keyring", O_RDONLY) = 3
>
> fstat(3, {st_mode=S_IFREG|0644, st_size=63, ...}) = 0
>
> read(3, "[client.admin]\n\tkey = AQDb7HRTkB"..., 63) = 63
>
> close(3)= 0
>
> futex(0x7f789cc741a4, FUTEX_WAKE_PRIVATE, 2147483647) = 0
>
> brk(0x17e3000)  = 0x17e3000
>
> futex(0x17a4b84, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x17a4b80, {FUTEX_OP_SET, 0,
> FUTEX_OP_CMP_GT, 1}) = 1
>
> futex(0x17a4b00, FUTEX_WAKE_PRIVATE, 1) = 1
>
> brk(0x17e1000)  = 0x17e1000
>
> add_key(0x425208, 0x7fffce5220b0, 0x7fffce521fe0, 0x22, 0xfffe) = -1
> ENODEV (No such device)
>
> stat("/sys/bus/rbd", 0x7fffce522230)= -1 ENOENT (No such file or
> directory)
>
> rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7f789c6d84c0}, {SIG_DFL,
> [], 0}, 8) = 0
>
> rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7f789c6d84c0},
> {SIG_DFL, [], 0}, 8) = 0
>
> rt_sigprocmask(SIG_BLOCK, [CHLD], [PIPE], 8) = 0
>
> clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD,
> parent_tidptr=0x7fffce522060) = 1630
>
> wait4(1630, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 1630
>
> rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f789c6d84c0}, NULL, 8)
> = 0
>
> rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7f789c6d84c0}, NULL, 8)
> = 0
>
> rt_sigprocmask(SIG_SETMASK, [PIPE], NULL, 8) = 0
>
> rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7f789c6d84c0}, {SIG_DFL,
> [], SA_RESTORER, 0x7f789c6d84c0}, 8) = 0
>
> rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7f789c6d84c0},
> {SIG_DFL, [], SA_RESTORER, 0x7f789c6d84c0}, 8) = 0
>
> rt_sigprocmask(SIG_BLOCK, [CHLD], [PIPE], 8) = 0
>
> clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD,
> parent_tidptr=0x7fffce522060) = 1633
>
> wait4(1633, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 1633
>
> rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f789c6d84c0}, NULL, 8)
> = 0
>
> rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7f789c6d84c0}, NULL, 8)
> = 0
>
> rt_sigprocmask(SIG_SETMASK, [PIPE], NULL, 8) = 0
>
> open("/sys/bus/rbd/add_single_major", O_WRONLY) = -1 ENOENT (No such file
> or directory)
>
> open("/sys/bus/rbd/add", O_WRONLY)  = 3
>
> write(3, "192.168.56.102:6789 name=admin,s"..., 87
>
>
>
>
>
>
>
> On Mon, May 19, 2014 at 10:16 AM, Ilya Dryomov 
> wrote:
>
>> On Mon, May 19, 2014 at 5:42 PM, Jay Janardhan 
>> wrote:
>> > (Sorry if this is a duplicate message - email server is acting up this
>> > morning).
>> >
>> >
>> > I'm following quick start guide and have a ceph cluster with three
>> nodes.
>> > When I try to map image to block device my command hangs. This seems
>> like a
>> > kernel hang as the only way I was able to get out is via a hard reset
>> of the
>> > image. The following is my configuration. Any help is greatly
>> appreciated.
>> >
>> > command on the ceph-client node (that hangs):
>> >
>> > $ sudo rbd map foo1 --pool rbd --name client.admin
>>
>> What's your ceph version (ceph --version)?  Can you run 'rbd map' under
>> strace,
>> and when it hangs do SysRq+w followed by SysRq+t and send along strace and
>> SysRq outputs?
>>
>> Thanks,
>>
>> Ilya
>>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-use

Re: [ceph-users] Fwd: "rbd map" command hangs

2014-05-19 Thread Jay Janardhan
Thanks for the response Ilya. I need to figure out how to use SysRq on my
Mac. Meanwhile, here is the strace output and ceph version:

*Ceph Version: *ceph version 0.80.1
(a38fe1169b6d2ac98b427334c12d7cf81f809b74)

Note that IP 192.168.56.102:6789 is reachable from the client node.

*192.168.56.102 is a monitor node.*

$
ceph status

 cluster df4f503a-04a9-4572-96d3-e31218592cfa

 health HEALTH_OK

 monmap e1: 1 mons at {ceph-node1=192.168.56.102:6789/0}, election
epoch 2, quorum 0 ceph-node1

 osdmap e102: 3 osds: 3 up, 3 in

  pgmap v1604: 192 pgs, 3 pools, 1373 bytes data, 4 objects

22744 MB used, 202 GB / 236 GB avail

 192 active+clean

*strace output:*

map(NULL, 8392704, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f7899894000

mprotect(0x7f7899894000, 4096, PROT_NONE) = 0

clone(child_stack=0x7f789a093f70,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7f789a0949d0, tls=0x7f789a094700,
child_tidptr=0x7f789a0949d0) = 1629

rt_sigprocmask(SIG_SETMASK, [PIPE], NULL, 8) = 0

open("/etc/ceph/ceph.client.admin.keyring", O_RDONLY) = 3

close(3)= 0

open("/etc/ceph/ceph.client.admin.keyring", O_RDONLY) = 3

fstat(3, {st_mode=S_IFREG|0644, st_size=63, ...}) = 0

read(3, "[client.admin]\n\tkey = AQDb7HRTkB"..., 63) = 63

close(3)= 0

futex(0x7f789cc741a4, FUTEX_WAKE_PRIVATE, 2147483647) = 0

brk(0x17e3000)  = 0x17e3000

futex(0x17a4b84, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x17a4b80, {FUTEX_OP_SET, 0,
FUTEX_OP_CMP_GT, 1}) = 1

futex(0x17a4b00, FUTEX_WAKE_PRIVATE, 1) = 1

brk(0x17e1000)  = 0x17e1000

add_key(0x425208, 0x7fffce5220b0, 0x7fffce521fe0, 0x22, 0xfffe) = -1
ENODEV (No such device)

stat("/sys/bus/rbd", 0x7fffce522230)= -1 ENOENT (No such file or
directory)

rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7f789c6d84c0}, {SIG_DFL,
[], 0}, 8) = 0

rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7f789c6d84c0}, {SIG_DFL,
[], 0}, 8) = 0

rt_sigprocmask(SIG_BLOCK, [CHLD], [PIPE], 8) = 0

clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD,
parent_tidptr=0x7fffce522060) = 1630

wait4(1630, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 1630

rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f789c6d84c0}, NULL, 8) =
0

rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7f789c6d84c0}, NULL, 8)
= 0

rt_sigprocmask(SIG_SETMASK, [PIPE], NULL, 8) = 0

rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7f789c6d84c0}, {SIG_DFL,
[], SA_RESTORER, 0x7f789c6d84c0}, 8) = 0

rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER, 0x7f789c6d84c0}, {SIG_DFL,
[], SA_RESTORER, 0x7f789c6d84c0}, 8) = 0

rt_sigprocmask(SIG_BLOCK, [CHLD], [PIPE], 8) = 0

clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD,
parent_tidptr=0x7fffce522060) = 1633

wait4(1633, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 1633

rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f789c6d84c0}, NULL, 8) =
0

rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7f789c6d84c0}, NULL, 8)
= 0

rt_sigprocmask(SIG_SETMASK, [PIPE], NULL, 8) = 0

open("/sys/bus/rbd/add_single_major", O_WRONLY) = -1 ENOENT (No such file
or directory)

open("/sys/bus/rbd/add", O_WRONLY)  = 3

write(3, "192.168.56.102:6789 name=admin,s"..., 87
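The tail of the strace shows what `rbd map` actually does on kernels without the single-major interface: it writes a one-line device spec to /sys/bus/rbd/add. A hedged sketch of how such a line is assembled — `rbd_add_spec` is a hypothetical helper, with the field order (monitor addresses, options, pool, image, snapshot) inferred from the kernel's rbd sysfs interface and the truncated write above:

```python
def rbd_add_spec(mon_addrs, options, pool, image, snap="-"):
    """Build the one-line spec written to /sys/bus/rbd/add.
    A '-' snapshot field means map the image head, not a snapshot."""
    return "%s %s %s %s %s" % (",".join(mon_addrs), options, pool, image, snap)

spec = rbd_add_spec(["192.168.56.102:6789"],
                    "name=admin,secret=XXXX",  # placeholder secret
                    "rbd", "foo1")

# The real client then does, in effect:
#   with open("/sys/bus/rbd/add", "w") as f:
#       f.write(spec)
# and it is inside that write(2) that the hang reported here occurs.
```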






On Mon, May 19, 2014 at 10:16 AM, Ilya Dryomov wrote:

> On Mon, May 19, 2014 at 5:42 PM, Jay Janardhan 
> wrote:
> > (Sorry if this is a duplicate message - email server is acting up this
> > morning).
> >
> >
> > I'm following quick start guide and have a ceph cluster with three nodes.
> > When I try to map image to block device my command hangs. This seems
> like a
> > kernel hang as the only way I was able to get out is via a hard reset of
> the
> > image. The following is my configuration. Any help is greatly
> appreciated.
> >
> > command on the ceph-client node (that hangs):
> >
> > $ sudo rbd map foo1 --pool rbd --name client.admin
>
> What's your ceph version (ceph --version)?  Can you run 'rbd map' under
> strace,
> and when it hangs do SysRq+w followed by SysRq+t and send along strace and
> SysRq outputs?
>
> Thanks,
>
> Ilya
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] is cephfs ready for production ?

2014-05-19 Thread Ignazio Cassano
Hi all, I'd like to know if CephFS is in heavy development or ready
for production.
The documentation reports it is not for production, but I think the
documentation on ceph.com is not recent enough.

Regards
Ignazio
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Subscribe

2014-05-19 Thread Jay Janardhan

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] "rbd map" command hangs

2014-05-19 Thread Jay Janardhan
I'm following quick start guide and have a ceph cluster with three nodes.
When I try to map image to block device my command hangs. This seems like a
kernel hang as the only way I was able to get out is via a hard reset of
the image. The following is my configuration. Any help is greatly
appreciated.

command on the ceph-client node (that hangs):

$ sudo rbd map foo1 --pool rbd --name client.admin

*ceph-client node info:*

$ rbd info foo1

rbd image 'foo1':

size 4096 MB in 1024 objects

order 22 (4096 kB objects)

block_name_prefix: rb.0.1050.74b0dc51

format: 1

Kernel and Ubuntu release:

$ uname -r

3.6.9-030609-generic

$ lsb_release -a

No LSB modules are available.

Distributor ID: Ubuntu

Description: Ubuntu 12.04 LTS

Release: 12.04

Codename: precise


logs from /var/log/syslog:

May 17 14:12:48 ceph-client kernel: [  128.866445] Key type ceph registered

May 17 14:12:48 ceph-client kernel: [  128.866453] libceph: loaded (mon/osd
proto 15/24, osdmap 5/6 5/6)

May 17 14:12:48 ceph-client kernel: [  128.867313] rbd: loaded rbd (rados
block device)



*ceph-node1 node info (node2 and node3 are similar):*

$ uname -r

3.2.0-23-generic

$ lsb_release -a

No LSB modules are available.

Distributor ID: Ubuntu

Description: Ubuntu 12.04 LTS

Release: 12.04

Codename: precise



$ ceph status

cluster df4f503a-04a9-4572-96d3-e31218592cfa

 health HEALTH_OK

 monmap e1: 1 mons at {ceph-node1=192.168.56.102:6789/0}, election
epoch 2, quorum 0 ceph-node1

 osdmap e60: 3 osds: 3 up, 3 in

  pgmap v1168: 192 pgs, 3 pools, 1373 bytes data, 4 objects

22739 MB used, 202 GB / 236 GB avail

 192 active+clean
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD for ephemeral

2014-05-19 Thread Michael J. Kidd
After sending my earlier email, I found another commit that was merged in
March:
https://review.openstack.org/#/c/59149/

It seems to follow a newer image-handling technique that was being sought,
which is what prevented the first patch from being merged in...

Michael J. Kidd
Sr. Storage Consultant
Inktank Professional Services


On Mon, May 19, 2014 at 11:20 AM, Pierre Grandin <
pierre.gran...@tubemogul.com> wrote:

> Actually you can get the patched code from here for Havana :
> https://github.com/jdurgin/nova/tree/havana-ephemeral-rbd
>
> But I'm still trying to get it to work (in my case the volumes are still
> copies, not copy-on-write).
>
>
> On Mon, May 19, 2014 at 7:19 AM, Michael J. Kidd  > wrote:
>
>> Since the status is 'Abandoned', it would appear that the fix has not
>> been merged into any release of OpenStack.
>>
>> Thanks,
>>
>> Michael J. Kidd
>> Sr. Storage Consultant
>> Inktank Professional Services
>>
>>
>> On Sun, May 18, 2014 at 5:13 PM, Yuming Ma (yumima) wrote:
>>
>>>  Wondering what is the status of this fix
>>> https://review.openstack.org/#/c/46879/? Which release has it?
>>> — Yuming
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
>
> --
> *Pierre Grandin *| Senior Site Reliability Engineer
> *M: *510.423.2231  | 
> @p_grandin
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD for ephemeral

2014-05-19 Thread Pierre Grandin
Actually you can get the patched code from here for Havana :
https://github.com/jdurgin/nova/tree/havana-ephemeral-rbd

But I'm still trying to get it to work (in my case the volumes are still
copies, not copy-on-write).


On Mon, May 19, 2014 at 7:19 AM, Michael J. Kidd
wrote:

> Since the status is 'Abandoned', it would appear that the fix has not been
> merged into any release of OpenStack.
>
> Thanks,
>
> Michael J. Kidd
> Sr. Storage Consultant
> Inktank Professional Services
>
>
> On Sun, May 18, 2014 at 5:13 PM, Yuming Ma (yumima) wrote:
>
>>  Wondering what is the status of this fix
>> https://review.openstack.org/#/c/46879/? Which release has it?
>> — Yuming
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
*Pierre Grandin *| Senior Site Reliability Engineer
*M: *510.423.2231  |
@p_grandin

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD for ephemeral

2014-05-19 Thread Michael J. Kidd
Since the status is 'Abandoned', it would appear that the fix has not been
merged into any release of OpenStack.

Thanks,

Michael J. Kidd
Sr. Storage Consultant
Inktank Professional Services


On Sun, May 18, 2014 at 5:13 PM, Yuming Ma (yumima) wrote:

>  Wondering what is the status of this fix
> https://review.openstack.org/#/c/46879/? Which release has it?
> — Yuming
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: "rbd map" command hangs

2014-05-19 Thread Ilya Dryomov
On Mon, May 19, 2014 at 5:42 PM, Jay Janardhan  wrote:
> (Sorry if this is a duplicate message - email server is acting up this
> morning).
>
>
> I'm following the quick start guide and have a Ceph cluster with three nodes.
> When I try to map an image to a block device, my command hangs. This seems like
> a kernel hang, as the only way I was able to get out was via a hard reset of
> the machine. The following is my configuration. Any help is greatly appreciated.
>
> command on the ceph-client node (that hangs):
>
> $ sudo rbd map foo1 --pool rbd --name client.admin

What's your ceph version (ceph --version)?  Can you run 'rbd map' under strace,
and when it hangs do SysRq+w followed by SysRq+t and send along strace and
SysRq outputs?

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Fwd: "rbd map" command hangs

2014-05-19 Thread Jay Janardhan
(Sorry if this is a duplicate message - email server is acting up this
morning).


I'm following the quick start guide and have a Ceph cluster with three nodes.
When I try to map an image to a block device, my command hangs. This seems
like a kernel hang, as the only way I was able to get out was via a hard
reset of the machine. The following is my configuration. Any help is greatly
appreciated.

command on the ceph-client node (that hangs):

$ sudo rbd map foo1 --pool rbd --name client.admin

*ceph-client node info:*

$ rbd info foo1

rbd image 'foo1':

size 4096 MB in 1024 objects

order 22 (4096 kB objects)

block_name_prefix: rb.0.1050.74b0dc51

format: 1

Kernel and Ubuntu release:

$ uname -r

3.6.9-030609-generic

$ lsb_release -a

No LSB modules are available.

Distributor ID: Ubuntu

Description: Ubuntu 12.04 LTS

Release: 12.04

Codename: precise


logs from /var/log/syslog:

May 17 14:12:48 ceph-client kernel: [  128.866445] Key type ceph registered

May 17 14:12:48 ceph-client kernel: [  128.866453] libceph: loaded (mon/osd
proto 15/24, osdmap 5/6 5/6)

May 17 14:12:48 ceph-client kernel: [  128.867313] rbd: loaded rbd (rados
block device)



*ceph-node1 node info (node2 and node3 are similar):*

$ uname -r

3.2.0-23-generic

$ lsb_release -a

No LSB modules are available.

Distributor ID: Ubuntu

Description: Ubuntu 12.04 LTS

Release: 12.04

Codename: precise



$ ceph status

cluster df4f503a-04a9-4572-96d3-e31218592cfa

 health HEALTH_OK

 monmap e1: 1 mons at {ceph-node1=192.168.56.102:6789/0}, election
epoch 2, quorum 0 ceph-node1

 osdmap e60: 3 osds: 3 up, 3 in

  pgmap v1168: 192 pgs, 3 pools, 1373 bytes data, 4 objects

22739 MB used, 202 GB / 236 GB avail

 192 active+clean
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Working at RedHat & Ceph User Committee

2014-05-19 Thread Karan Singh
Nice to hear.

I would say it's a good move from RH to hire you :-). Good luck, and keep
contributing to Ceph *core features* as you have been doing.

- Karan Singh -

On 19 May 2014, at 12:21, Wido den Hollander  wrote:

> On 05/19/2014 10:46 AM, Loic Dachary wrote:
>> Hi Ceph,
>> 
>> TL;DR: I'm starting a new position at RedHat in the Ceph development team, 
>> mid-august. I will keep working under the newly elected head of the Ceph 
>> User Committee.
>> 
>> Cloudwatt is a company based in France which provides cloud services. My 
>> participation in Ceph (and erasure code specifically) was driven by their 
>> need for cheaper storage. I'm extremely grateful for the opportunity 
>> Cloudwatt gave me to exclusively work upstream. Now that Firefly is released 
>> and packaged, Cloudwatt will be able to deploy it in production and indeed 
>> save money. A lot more money than my salary, which is a sign of a sound 
>> relationship. There still is work to be done and I'll focus on pyramid codes 
>> for Giant. After that I'm not sure but I trust Samuel Just will find ways to 
>> keep me busy.
>> 
>> When RedHat acquired Inktank my first reaction was : "Calamari will be Free 
>> Software, no more proprietary software !". The Atlanta OpenStack summit was 
>> the opportunity to discuss the consequences of the acquisition with dozens 
>> of people and I've not heard anyone say it was bad for the Ceph project. 
>> However  I've heard about four problems worth considering: a) communication 
>> about the Ceph project is going to be driven by RedHat marketing b) there is 
>> no incentive for RedHat to establish a foundation, c) RedHat has much less 
>> incentive than Inktank to support multiple GNU/Linux distributions, d) 
>> Inktank customers are worried about the transition and need to be reassured.
>> 
>> I also selfishly thought that RedHat/Inktank became a very appealing 
>> workplace. I'm committed to exclusively work on Free Software and RedHat is 
>> one of the few companies in the world where that is possible. I discussed it 
>> with Ian Colle & Sage Weil, sent a job application and was accepted. Since 
>> I've been working with the core team for some time now, it will not change 
>> much in terms of what I do. But one difference matters to me : I will get to 
>> hear about a variety of Ceph use cases that were previously kept secret 
>> because Inktank would not discuss their customers with an external 
>> contributor.
>> 
> 
> Congratulations Loic!
> 
>> The downside of being hired by RedHat is that I'm no longer a community 
>> member and my judgment will be biased. I'm told RedHat employees are 
>> encouraged to have their own opinion and express them as I just did. But no 
>> matter how hard I try, I will gradually become less objective about the 
>> company paying my salary: it is the root of all conflicts of interest. To 
>> acknowledge that and keep working for the Ceph User Committee, I will not be 
>> a candidate to the next elections. I however pledge to assist the newly 
>> elected head of the Ceph User Committee, i.e. doing what I've done so far, 
>> only without the title nor the authority.
>> 
> 
> I very much appreciate that you are brave enough to admit this yourself. 
> Indeed, your vision on certain things will be 'clouded' ;) by the company who 
> is paying your salary, but that's logical as well.
> 
> Good luck at Red Hat and keep improving Ceph!
> 
>> Cheers
>> 
>> 
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
> 
> 
> -- 
> Wido den Hollander
> 42on B.V.
> Ceph trainer and consultant
> 
> Phone: +31 (0)20 700 9902
> Skype: contact42on
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Erasure coding

2014-05-19 Thread yalla.gnan.kumar
Hi Loic,

Thanks for the reply.


Thanks
Kumar


-Original Message-
From: Loic Dachary [mailto:l...@dachary.org] 
Sent: Monday, May 19, 2014 6:04 PM
To: Gnan Kumar, Yalla; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Erasure coding

Hi,

The general idea is to preserve resilience while saving space compared to 
replication. It costs more in terms of CPU and network. You will find a short 
introduction here:
https://wiki.ceph.com/Planning/Blueprints/Dumpling/Erasure_encoding_as_a_storage_backend
https://wiki.ceph.com/Planning/Blueprints/Firefly/Erasure_coded_storage_backend_%28step_3%29

For the next Ceph release Pyramid Codes will help reduce the bandwidth 
requirements 
https://wiki.ceph.com/Planning/Blueprints/Giant/Pyramid_Erasure_Code

Cheers

On 19/05/2014 13:52, yalla.gnan.ku...@accenture.com wrote:
> Hi All,
> 
>  
> 
> What exactly is erasure coding, and why is it used in Ceph? I could not get 
> enough explanatory information from the documentation.
> 
>  
> 
>  
> 
> Thanks
> 
> Kumar
> 
> 
> --
> 
> This message is for the designated recipient only and may contain privileged, 
> proprietary, or otherwise confidential information. If you have received it 
> in error, please notify the sender immediately and delete the original. Any 
> other use of the e-mail by you is prohibited. Where allowed by local law, 
> electronic communications with Accenture and its affiliates, including e-mail 
> and instant messaging (including content), may be scanned by our systems for 
> the purposes of information security and assessment of internal compliance 
> with Accenture policy.
> __
> 
> www.accenture.com
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

-- 
Loïc Dachary, Artisan Logiciel Libre


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Erasure coding

2014-05-19 Thread Loic Dachary
Hi,

The general idea is to preserve resilience while saving space compared to 
replication. It costs more in terms of CPU and network. You will find a short 
introduction here:
https://wiki.ceph.com/Planning/Blueprints/Dumpling/Erasure_encoding_as_a_storage_backend
https://wiki.ceph.com/Planning/Blueprints/Firefly/Erasure_coded_storage_backend_%28step_3%29

For the next Ceph release Pyramid Codes will help reduce the bandwidth 
requirements 
https://wiki.ceph.com/Planning/Blueprints/Giant/Pyramid_Erasure_Code
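
To make the replication-versus-erasure-code trade-off concrete, here is a toy
sketch. It is deliberately simplified (plain XOR parity, i.e. k data chunks
plus m=1 coding chunk); Ceph's erasure-coded pools use pluggable
Reed-Solomon-style codes, not this, but the space arithmetic is the same:
(k+m)/k times the data instead of 2x or 3x for replication, at the price of
CPU to encode and rebuild.

```python
# Toy k+1 XOR erasure code: illustrative only, NOT Ceph's actual
# jerasure plugin. Any single lost chunk is rebuilt from the survivors.

def encode(data: bytes, k: int):
    """Split data into k equal chunks and append one XOR parity chunk."""
    chunk_len = -(-len(data) // k)              # ceiling division
    padded = data.ljust(k * chunk_len, b"\0")   # pad so chunks are equal
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    parity = bytearray(chunk_len)
    for c in chunks:
        for i, b in enumerate(c):
            parity[i] ^= b
    return chunks + [bytes(parity)]

def reconstruct(chunks, lost: int):
    """Rebuild the chunk at index `lost` by XORing all surviving chunks."""
    chunk_len = len(chunks[(lost + 1) % len(chunks)])
    out = bytearray(chunk_len)
    for i, c in enumerate(chunks):
        if i == lost:
            continue
        for j, b in enumerate(c):
            out[j] ^= b
    return bytes(out)

stripes = encode(b"ABCDEFGH", k=2)      # 2 data chunks + 1 parity chunk
assert reconstruct(stripes, lost=0) == stripes[0]
```

With k=2, m=1 this stores 1.5x the data and survives the loss of any one
chunk; two-way replication needs 2x for the same guarantee, which is the
space saving described above.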

Cheers

On 19/05/2014 13:52, yalla.gnan.ku...@accenture.com wrote:
> Hi All,
> 
>  
> 
> What exactly is erasure coding, and why is it used in Ceph? I could not get 
> enough explanatory information from the documentation.
> 
>  
> 
>  
> 
> Thanks
> 
> Kumar
> 
> 
> --
> 
> This message is for the designated recipient only and may contain privileged, 
> proprietary, or otherwise confidential information. If you have received it 
> in error, please notify the sender immediately and delete the original. Any 
> other use of the e-mail by you is prohibited. Where allowed by local law, 
> electronic communications with Accenture and its affiliates, including e-mail 
> and instant messaging (including content), may be scanned by our systems for 
> the purposes of information security and assessment of internal compliance 
> with Accenture policy.
> __
> 
> www.accenture.com
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Erasure coding

2014-05-19 Thread yalla.gnan.kumar
Hi All,

What exactly is erasure coding, and why is it used in Ceph? I could not get 
enough explanatory information from the documentation.


Thanks
Kumar



This message is for the designated recipient only and may contain privileged, 
proprietary, or otherwise confidential information. If you have received it in 
error, please notify the sender immediately and delete the original. Any other 
use of the e-mail by you is prohibited. Where allowed by local law, electronic 
communications with Accenture and its affiliates, including e-mail and instant 
messaging (including content), may be scanned by our systems for the purposes 
of information security and assessment of internal compliance with Accenture 
policy.
__

www.accenture.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with radosgw and some file name characters

2014-05-19 Thread Andrei Mikhailovsky
Yehuda, 

Never mind my last post; I've found the issue with the rule that you 
suggested: my FastCGI script is named differently, which is why I was getting 
the 404. 

I've tried your rewrite rule and I am still having the same issues. The same 
characters are failing with the rule you've suggested. 


Any idea how to fix the issue? 

Cheers 

Andrei 
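
A hedged guess at the failure mode (an assumption until a 'debug rgw = 20'
log shows what string is actually being signed): the failing characters
(=;()@$*&+) all require percent-encoding in a URL path, so if the client and
the gateway disagree about the encoded form of the object key, the S3
signatures won't match and every request for those keys comes back as
403/AccessDenied. A quick Python sketch of the encoding those keys need:

```python
from urllib.parse import quote

def s3_key_to_path(key: str) -> str:
    """Percent-encode an object key for use in a request path.

    safe="" means everything outside RFC 3986 'unreserved'
    (letters, digits, '-', '_', '.', '~') gets encoded.
    """
    return quote(key, safe="")

# '&' and '+' must travel as %26 / %2B; if sent raw, the gateway
# reconstructs a different canonical string than the one the client
# signed, and the request is rejected.
assert s3_key_to_path("Testing & Testing.txt") == "Testing%20%26%20Testing.txt"
assert s3_key_to_path("Testing + Testing.txt") == "Testing%20%2B%20Testing.txt"
```

Hyphen and underscore are unreserved, which lines up with
'Testing - Testing.txt' and 'Testing _ Testing.txt' uploading fine while
several of the reserved characters fail.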
- Original Message -

From: "Andrei Mikhailovsky"  
To: "Yehuda Sadeh"  
Cc: ceph-users@lists.ceph.com 
Sent: Monday, 19 May, 2014 9:30:03 AM 
Subject: Re: [ceph-users] Problem with radosgw and some file name characters 


Yehuda, 

I've tried the rewrite rule that you've suggested, but it is not working for 
me. I get 404 when trying to access the service. 

RewriteRule ^/(.*) /s3gw.3.fcgi?%{QUERY_STRING} 
[E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L] 

Any idea what is wrong with this rule? 


Cheers 

Andrei 

- Original Message -



From: "Yehuda Sadeh"  
To: "Andrei Mikhailovsky"  
Cc: ceph-users@lists.ceph.com 
Sent: Friday, 16 May, 2014 5:44:52 PM 
Subject: Re: [ceph-users] Problem with radosgw and some file name characters 

Was talking about this. There is a different and simpler rule that we 
use nowadays, for some reason it's not well documented: 

RewriteRule ^/(.*) /s3gw.3.fcgi?%{QUERY_STRING} 
[E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L] 

I still need to see a more verbose log to make a better educated guess. 

Yehuda 

On Thu, May 15, 2014 at 3:01 PM, Andrei Mikhailovsky  wrote: 
> 
> Yehuda, 
> 
> what do you mean by the rewrite rule? is this for Apache? I've used the ceph 
> documentation to create it. My rule is: 
> 
> 
> RewriteRule ^/([a-zA-Z0-9-_.]*)([/]?.*) 
> /s3gw.fcgi?page=$1&params=$2&%{QUERY_STRING} 
> [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L] 
> 
> Or are you talking about something else? 
> 
> Cheers 
> 
> Andrei 
>  
> From: "Yehuda Sadeh"  
> To: "Andrei Mikhailovsky"  
> Cc: ceph-users@lists.ceph.com 
> Sent: Thursday, 15 May, 2014 4:05:06 PM 
> Subject: Re: [ceph-users] Problem with radosgw and some file name characters 
> 
> 
> Your rewrite rule might be off a bit. Can you provide log with 'debug rgw = 
> 20'? 
> 
> Yehuda 
> 
> On Thu, May 15, 2014 at 8:02 AM, Andrei Mikhailovsky  
> wrote: 
>> Hello guys, 
>> 
>> 
>> I am trying to figure out what is the problem here. 
>> 
>> 
>> Currently running Ubuntu 12.04 with latest updates and radosgw version 
>> 0.72.2-1precise. My ceph.conf file is pretty standard from the radosgw 
>> howto. 
>> 
>> 
>> 
>> I am testing radosgw as a backup solution to S3 compatible clients. I am 
>> planning to copy a large number of files/folders and I am having issues 
>> with 
>> a large number of files. The client reports the following error on some 
>> files: 
>> 
>> 
>>  
>> 
>>  
>> 
>> AccessDenied 
>> 
>>  
>> 
>> 
>> Looking on the server backup I only see the following errors in the 
>> radosgw.log file: 
>> 
>> 2014-05-13 23:50:35.786181 7f09467dc700 1 == starting new request 
>> req=0x245d7e0 = 
>> 2014-05-13 23:50:35.786470 7f09467dc700 1 == req done req=0x245d7e0 
>> http_status=403 == 
>> 
>> 
>> So, i've done a small file set comprising of test files including the 
>> following names: 
>> 
>> Testing and Testing.txt 
>> Testing ^ Testing.txt 
>> Testing = Testing.txt 
>> Testing _ Testing.txt 
>> Testing - Testing.txt 
>> Testing ; Testing.txt 
>> Testing ! Testing.txt 
>> Testing ? Testing.txt 
>> Testing ( Testing.txt 
>> Testing ) Testing.txt 
>> Testing @ Testing.txt 
>> Testing $ Testing.txt 
>> Testing * Testing.txt 
>> Testing & Testing.txt 
>> Testing # Testing.txt 
>> Testing % Testing.txt 
>> Testing + Testing.txt 
>> 
>> From the above list the files with the following characters are giving me 
>> Access Denied / 403 error: 
>> 
>> =;()@$*&+ 
>> 
>> The rest of the files are successfully uploaded. 
>> 
>> Does anyone know what is required to fix the problem? 
>> 
>> Many thanks 
>> 
>> Andrei 
>> 
>> 
>> ___ 
>> ceph-users mailing list 
>> ceph-users@lists.ceph.com 
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
>> 
> 


___ 
ceph-users mailing list 
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] metadata pool : size growing

2014-05-19 Thread Wido den Hollander

On 05/19/2014 12:31 PM, Florent B wrote:

Hi all,

I use CephFS and I am wondering what exactly the "metadata" pool contains.



It contains the directory structure of your filesystem


Its size is growing (multiple GB).

It seems that it stores all operations occurring in CephFS, but is it
totally necessary?



It doesn't store all operations, but it stores all the metadata like 
directories, owners, file sizes, but not the contents of the files.


This pool is mandatory and do NOT try to remove it or any objects in it.


When a MDS is taking failover, does it have to read all content of
metadata pool (replay state) ?



Yes, if another MDS is doing a cold-takeover it has to read the contents 
of this pool.



I haven't found architecture information about this pool; can someone tell me?


File data goes into the pool 'data' and all the directory information 
goes into 'metadata'.




Thank you.

Florent
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Wido den Hollander
Ceph consultant and trainer
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Working at RedHat & Ceph User Committee

2014-05-19 Thread Wido den Hollander

On 05/19/2014 10:46 AM, Loic Dachary wrote:

Hi Ceph,

TL;DR: I'm starting a new position at RedHat in the Ceph development team, 
mid-august. I will keep working under the newly elected head of the Ceph User 
Committee.

Cloudwatt is a company based in France which provides cloud services. My 
participation in Ceph (and erasure code specifically) was driven by their need 
for cheaper storage. I'm extremely grateful for the opportunity Cloudwatt gave 
me to exclusively work upstream. Now that Firefly is released and packaged, 
Cloudwatt will be able to deploy it in production and indeed save money. A lot 
more money than my salary, which is a sign of a sound relationship. There still 
is work to be done and I'll focus on pyramid codes for Giant. After that I'm 
not sure but I trust Samuel Just will find ways to keep me busy.

When RedHat acquired Inktank my first reaction was : "Calamari will be Free 
Software, no more proprietary software !". The Atlanta OpenStack summit was the 
opportunity to discuss the consequences of the acquisition with dozens of people and I've 
not heard anyone say it was bad for the Ceph project. However  I've heard about four 
problems worth considering: a) communication about the Ceph project is going to be driven 
by RedHat marketing b) there is no incentive for RedHat to establish a foundation, c) 
RedHat has much less incentive than Inktank to support multiple GNU/Linux distributions, 
d) Inktank customers are worried about the transition and need to be reassured.

I also selfishly thought that RedHat/Inktank became a very appealing workplace. I'm 
committed to exclusively work on Free Software and RedHat is one of the few 
companies in the world where that is possible. I discussed it with Ian Colle & 
Sage Weil, sent a job application and was accepted. Since I've been working with 
the core team for some time now, it will not change much in terms of what I do. But 
one difference matters to me : I will get to hear about a variety of Ceph use cases 
that were previously kept secret because Inktank would not discuss their 
customers with an external contributor.



Congratulations Loic!


The downside of being hired by RedHat is that I'm no longer a community member 
and my judgment will be biased. I'm told RedHat employees are encouraged to 
have their own opinion and express them as I just did. But no matter how hard I 
try, I will gradually become less objective about the company paying my salary: 
it is the root of all conflicts of interest. To acknowledge that and keep 
working for the Ceph User Committee, I will not be a candidate to the next 
elections. I however pledge to assist the newly elected head of the Ceph User 
Committee, i.e. doing what I've done so far, only without the title nor the 
authority.



I very much appreciate that you are brave enough to admit this yourself. 
Indeed, your vision on certain things will be 'clouded' ;) by the 
company who is paying your salary, but that's logical as well.


Good luck at Red Hat and keep improving Ceph!


Cheers



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph booth at http://www.solutionslinux.fr/

2014-05-19 Thread Loic Dachary
Hi Cedric,

No mug but ... let's discuss this at the booth. We will figure something out ;-)

Cheers

On 19/05/2014 10:48, Cedric Lemarchand wrote:
> Le 12/05/2014 19:14, Loic Dachary a écrit :
>> On 12/05/2014 11:00, Alexandre DERUMIER wrote:
>>> I'll be there !
> Will be there too !
>>>
>>>
>>> (Do you known if it'll be possible to buy some ceph t-shirts ?)
> As a side note, have some mug ? ;-)
> 
> Cheers !
>>>
>>
>> Absolutely ! If you want large quantities it can also be done ;-)
>>
>> Cheers
>>
>>>
>>>
>>> - Mail original -
>>>
>>> De: "Loic Dachary" 
>>> À: "ceph-users" 
>>> Envoyé: Lundi 12 Mai 2014 16:38:31
>>> Objet: [ceph-users] Ceph booth at http://www.solutionslinux.fr/
>>>
>>> Hi Ceph,
>>>
>>> Ceph will be at http://www.solutionslinux.fr/ in Paris next week, booth C36 
>>> next week ( 20 / 21 may 2014 ) in the non-profit village. Drop by anytime 
>>> to discuss Ceph if you are around :-) We're also having a meetup / lunch 
>>> the 20th ( details are at 
>>> http://www.meetup.com/Ceph-in-Paris/events/164818312/ ).
>>>
>>> Cheers
>>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> -- 
> Cédric
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ' rbd username specified but secret not found' error, virsh live migration on rbd

2014-05-19 Thread JinHwan Hwang
I have been trying to do live migration of a VM which is running on RBD. But
so far, it only gives me 'internal error: rbd username 'libvirt' specified
but secret not found' when I do live migration.

ceph-admin : source host
host : destination host

root@main:/home/ceph-admin# virsh migrate --live rbd1-1 qemu+ssh://host/system
error: internal error: rbd username 'libvirt' specified but secret not found

This is from the rbd1-1 VM dump, which worked for running rbd1-1; the disk
XML (including its auth and secret elements) was lost in the archive.

Is '...' not sufficient for doing live migration? I have also tried setting
the same secret on both the source and destination hosts' virsh (so both have
uuid='b34526f2-8d32-ed5d-3153-e90d011dd37e'), but it didn't work. I've
followed the instructions at http://ceph.com/docs/master/rbd/libvirt/ and
this is the only secret I know of so far. Am I missing something? If so,
where should I put those missing 'secrets'?
Thanks in advance for any helps.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph booth at http://www.solutionslinux.fr/

2014-05-19 Thread Cedric Lemarchand
Le 12/05/2014 19:14, Loic Dachary a écrit :
> On 12/05/2014 11:00, Alexandre DERUMIER wrote:
>> I'll be there !
Will be there too !
>>
>>
>> (Do you known if it'll be possible to buy some ceph t-shirts ?)
As a side note, have some mug ? ;-)

Cheers !
>>
>
> Absolutely ! If you want large quantities it can also be done ;-)
>
> Cheers
>
>>
>>
>> - Mail original -
>>
>> De: "Loic Dachary" 
>> À: "ceph-users" 
>> Envoyé: Lundi 12 Mai 2014 16:38:31
>> Objet: [ceph-users] Ceph booth at http://www.solutionslinux.fr/
>>
>> Hi Ceph,
>>
>> Ceph will be at http://www.solutionslinux.fr/ in Paris next week,
booth C36 next week ( 20 / 21 may 2014 ) in the non-profit village. Drop
by anytime to discuss Ceph if you are around :-) We're also having a
meetup / lunch the 20th ( details are at
http://www.meetup.com/Ceph-in-Paris/events/164818312/ ).
>>
>> Cheers
>>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
Cédric

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Working at RedHat & Ceph User Committee

2014-05-19 Thread Loic Dachary
Hi Ceph,

TL;DR: I'm starting a new position at RedHat in the Ceph development team, 
mid-august. I will keep working under the newly elected head of the Ceph User 
Committee.

Cloudwatt is a company based in France which provides cloud services. My 
participation in Ceph (and erasure code specifically) was driven by their need 
for cheaper storage. I'm extremely grateful for the opportunity Cloudwatt gave 
me to exclusively work upstream. Now that Firefly is released and packaged, 
Cloudwatt will be able to deploy it in production and indeed save money. A lot 
more money than my salary, which is a sign of a sound relationship. There still 
is work to be done and I'll focus on pyramid codes for Giant. After that I'm 
not sure but I trust Samuel Just will find ways to keep me busy.

When RedHat acquired Inktank my first reaction was : "Calamari will be Free 
Software, no more proprietary software !". The Atlanta OpenStack summit was the 
opportunity to discuss the consequences of the acquisition with dozens of 
people and I've not heard anyone say it was bad for the Ceph project. However  
I've heard about four problems worth considering: a) communication about the 
Ceph project is going to be driven by RedHat marketing b) there is no incentive 
for RedHat to establish a foundation, c) RedHat has much less incentive than 
Inktank to support multiple GNU/Linux distributions, d) Inktank customers are 
worried about the transition and need to be reassured.

I also selfishly thought that RedHat/Inktank became a very appealing workplace. 
I'm committed to exclusively work on Free Software and RedHat is one of the few 
companies in the world where that is possible. I discussed it with Ian Colle & 
Sage Weil, sent a job application and was accepted. Since I've been working 
with the core team for some time now, it will not change much in terms of what 
I do. But one difference matters to me : I will get to hear about a variety of 
Ceph use cases that were previously kept secret because Inktank would not 
discuss their customers with an external contributor.

The downside of being hired by RedHat is that I'm no longer a community member 
and my judgment will be biased. I'm told RedHat employees are encouraged to 
have their own opinion and express them as I just did. But no matter how hard I 
try, I will gradually become less objective about the company paying my salary: 
it is the root of all conflicts of interest. To acknowledge that and keep 
working for the Ceph User Committee, I will not be a candidate to the next 
elections. I however pledge to assist the newly elected head of the Ceph User 
Committee, i.e. doing what I've done so far, only without the title nor the 
authority.

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with radosgw and some file name characters

2014-05-19 Thread Andrei Mikhailovsky
Yehuda, 

I've tried the rewrite rule that you've suggested, but it is not working for 
me. I get 404 when trying to access the service. 

RewriteRule ^/(.*) /s3gw.3.fcgi?%{QUERY_STRING} 
[E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L] 

Any idea what is wrong with this rule? 


Cheers 

Andrei 

- Original Message -



From: "Yehuda Sadeh"  
To: "Andrei Mikhailovsky"  
Cc: ceph-users@lists.ceph.com 
Sent: Friday, 16 May, 2014 5:44:52 PM 
Subject: Re: [ceph-users] Problem with radosgw and some file name characters 

Was talking about this. There is a different and simpler rule that we 
use nowadays, for some reason it's not well documented: 

RewriteRule ^/(.*) /s3gw.3.fcgi?%{QUERY_STRING} 
[E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L] 

I still need to see a more verbose log to make a better educated guess. 

Yehuda 

On Thu, May 15, 2014 at 3:01 PM, Andrei Mikhailovsky  wrote: 
> 
> Yehuda, 
> 
> what do you mean by the rewrite rule? is this for Apache? I've used the ceph 
> documentation to create it. My rule is: 
> 
> 
> RewriteRule ^/([a-zA-Z0-9-_.]*)([/]?.*) 
> /s3gw.fcgi?page=$1&params=$2&%{QUERY_STRING} 
> [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L] 
> 
> Or are you talking about something else? 
> 
> Cheers 
> 
> Andrei 
>  
> From: "Yehuda Sadeh"  
> To: "Andrei Mikhailovsky"  
> Cc: ceph-users@lists.ceph.com 
> Sent: Thursday, 15 May, 2014 4:05:06 PM 
> Subject: Re: [ceph-users] Problem with radosgw and some file name characters 
> 
> 
> Your rewrite rule might be off a bit. Can you provide log with 'debug rgw = 
> 20'? 
> 
> Yehuda 
> 
> On Thu, May 15, 2014 at 8:02 AM, Andrei Mikhailovsky  
> wrote: 
>> Hello guys, 
>> 
>> 
>> I am trying to figure out what is the problem here. 
>> 
>> 
>> Currently running Ubuntu 12.04 with latest updates and radosgw version 
>> 0.72.2-1precise. My ceph.conf file is pretty standard from the radosgw 
>> howto. 
>> 
>> 
>> 
>> I am testing radosgw as a backup solution to S3 compatible clients. I am 
>> planning to copy a large number of files/folders and I am having issues 
>> with 
>> a large number of files. The client reports the following error on some 
>> files: 
>> 
>> 
>>  
>> 
>>  
>> 
>> AccessDenied 
>> 
>>  
>> 
>> 
>> Looking on the server backup I only see the following errors in the 
>> radosgw.log file: 
>> 
>> 2014-05-13 23:50:35.786181 7f09467dc700 1 == starting new request 
>> req=0x245d7e0 = 
>> 2014-05-13 23:50:35.786470 7f09467dc700 1 == req done req=0x245d7e0 
>> http_status=403 == 
>> 
>> 
>> So, i've done a small file set comprising of test files including the 
>> following names: 
>> 
>> Testing and Testing.txt 
>> Testing ^ Testing.txt 
>> Testing = Testing.txt 
>> Testing _ Testing.txt 
>> Testing - Testing.txt 
>> Testing ; Testing.txt 
>> Testing ! Testing.txt 
>> Testing ? Testing.txt 
>> Testing ( Testing.txt 
>> Testing ) Testing.txt 
>> Testing @ Testing.txt 
>> Testing $ Testing.txt 
>> Testing * Testing.txt 
>> Testing & Testing.txt 
>> Testing # Testing.txt 
>> Testing % Testing.txt 
>> Testing + Testing.txt 
>> 
>> From the above list the files with the following characters are giving me 
>> Access Denied / 403 error: 
>> 
>> =;()@$*&+ 
>> 
>> The rest of the files are successfully uploaded. 
>> 
>> Does anyone know what is required to fix the problem? 
>> 
>> Many thanks 
>> 
>> Andrei 
>> 
>> 
>> ___ 
>> ceph-users mailing list 
>> ceph-users@lists.ceph.com 
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
>> 
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Various file lengths while uploading the same file

2014-05-19 Thread Yehuda Sadeh
On Mon, May 19, 2014 at 12:19 AM, Arthur Tumanyan
 wrote:
> Hi,
> I'm writing a FastCGI application that reads data from stdin and
> appends it to Ceph. Please have a look:
>
> while (0 != (written = fread(buffer, sizeof(char), sizeof buffer, stdin))) {
>     if (written < 0) {
>         logger("Error: %s\n", strerror(errno));
>         break;
>     }
>     err = rados_aio_append(io, filename, comp, (const char *) buffer, written);
>     if (err < 0) {
>         logger("Could not schedule aio append: %s\n", strerror(-err));
>         rados_aio_release(comp);
>         rados_ioctx_destroy(io);
>         rados_shutdown(cluster);
>         break;
>     } else {
>         if (len >= (100 * MB)) {
>             if (0 == (i % (50 * MB)) && i != 0) {
>                 logger("Uploaded %d bytes", i);
>             }
>         }
>         i += written;
>     }
> }
>
> I test like this:
> curl -XPUT -T guide.pdf  "http://95.211.192.129:81/upload?pics|guide.pdf" -i
> -v
>
> In most cases everything works fine, but sometimes (for big files) Ceph or
> the application corrupts the file, and the same file may end up with
> various sizes:
>
> ceph1 07:14:46 UploadToCeph # rados -p pics stat 1024.img
> pics/1024.img mtime 1400335784, size 1019215872
>
> As you can see, the resulting size is smaller than it should be. Can anyone
> point out what's wrong, please?

In your code you don't actually wait for the aio completions, so you
don't know whether the rados requests even succeeded. A useful tool is
the Ceph messenger debug log, which will show you what is actually
going to the OSDs and each request's return status. You can turn it on
by setting 'debug ms = 1'.
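For a librados client like the one in the quoted code, that switch can go in the `[client]` section of ceph.conf on the client host (the log file path below is only an example):

```ini
[client]
debug ms = 1
log file = /var/log/ceph/rados-client.log
```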
I'd also replace the append operation with a write to a specific
offset. It'll help with debugging, and you'd avoid any potential
ordering issues.
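A sketch combining both suggestions — write at an explicit offset and wait on each completion before trusting it — assuming `io` is an open `rados_ioctx_t` as in the original snippet (this needs librados and a live cluster, so it is illustrative only):

```c
#include <rados/librados.h>

/* Write one chunk at an explicit offset and block until the cluster
 * acknowledges it, returning the operation's actual status. */
static int write_chunk_sync(rados_ioctx_t io, const char *oid,
                            const char *buf, size_t len, uint64_t off)
{
    rados_completion_t comp;
    int err = rados_aio_create_completion(NULL, NULL, NULL, &comp);
    if (err < 0)
        return err;

    err = rados_aio_write(io, oid, comp, buf, len, off);
    if (err < 0) {
        rados_aio_release(comp);
        return err;
    }

    rados_aio_wait_for_complete(comp);      /* block until acknowledged */
    err = rados_aio_get_return_value(comp); /* did it actually succeed? */
    rados_aio_release(comp);
    return err;
}
```

Calling something like this per chunk (advancing `off` by `written` each iteration) surfaces a failed write immediately instead of silently dropping data, at the cost of serializing the I/O; keeping a few completions in flight and reaping them as they finish is the usual middle ground.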
Finally, you shouldn't really create raw rados objects of unbounded
size. The right way is to stripe everything larger than a specific
size (we usually use 4 MB stripes). Besides causing problems with data
distribution, certain operations (e.g., deep scrub) don't handle large
objects very well.

Yehuda
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Various file lengths while uploading the same file

2014-05-19 Thread Arthur Tumanyan
Hi,
I'm writing a FastCGI application that reads data from stdin and
appends it to Ceph. Please have a look:

while (0 != (written = fread(buffer, sizeof(char), sizeof buffer, stdin))) {
    if (written < 0) {
        logger("Error: %s\n", strerror(errno));
        break;
    }
    err = rados_aio_append(io, filename, comp, (const char *) buffer, written);
    if (err < 0) {
        logger("Could not schedule aio append: %s\n", strerror(-err));
        rados_aio_release(comp);
        rados_ioctx_destroy(io);
        rados_shutdown(cluster);
        break;
    } else {
        if (len >= (100 * MB)) {
            if (0 == (i % (50 * MB)) && i != 0) {
                logger("Uploaded %d bytes", i);
            }
        }
        i += written;
    }
}

I test like this:
curl -XPUT -T guide.pdf  "http://95.211.192.129:81/upload?pics|guide.pdf"
-i -v

In most cases everything works fine, but sometimes (for big files) Ceph or
the application corrupts the file, and the same file may end up with
various sizes:

ceph1 07:14:46 UploadToCeph # rados -p pics stat 1024.img
pics/1024.img mtime 1400335784, size 1019215872

As you can see, the resulting size is smaller than it should be. Can anyone
point out what's wrong, please?

Thanks in advance
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com