[ceph-users] Delta Lake Support

2019-05-08 Thread Scottix
Hey Cephers,
There is a new piece of OSS software called Delta Lake: https://delta.io/

It is compatible with HDFS, but it seems ripe for Ceph support as a backend
storage layer. Just want to put this on the radar and see if anyone is
interested.

Best
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Right way to delete OSD from cluster?

2019-01-30 Thread Scottix
I have generally gone the crush reweight 0 route.
This way the drive can participate in the rebalance, and the rebalance
only happens once. Then you can take it out and purge it.

If I am not mistaken, this is the safest approach.

ceph osd crush reweight osd.<id> 0
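For example, a minimal sketch of that sequence (hedged; osd.12 is just a
placeholder id, and `ceph osd purge` needs Luminous or newer):

$ ceph osd crush reweight osd.12 0
# wait until `ceph -s` shows HEALTH_OK and no misplaced objects remain
$ ceph osd out 12
$ sudo systemctl stop ceph-osd@12      # on the OSD's host
$ ceph osd purge 12 --yes-i-really-mean-it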

On Wed, Jan 30, 2019 at 7:45 AM Fyodor Ustinov  wrote:
>
> Hi!
>
> But won't I get undersized objects after "ceph osd crush remove"? That is, 
> isn't this the same thing as simply turning off the OSD and waiting for the 
> cluster to recover?
>
> - Original Message -
> From: "Wido den Hollander" 
> To: "Fyodor Ustinov" , "ceph-users" 
> Sent: Wednesday, 30 January, 2019 15:05:35
> Subject: Re: [ceph-users] Right way to delete OSD from cluster?
>
> On 1/30/19 2:00 PM, Fyodor Ustinov wrote:
> > Hi!
> >
> > I thought I should first do "ceph osd out", wait for the relocation of the 
> > misplaced objects to finish, and after that do "ceph osd purge".
> > But after "purge" the cluster starts relocation again.
> >
> > Maybe I'm doing something wrong? Then what is the correct way to delete the 
> > OSD from the cluster?
> >
>
> You are not doing anything wrong; this is the expected behavior. There
> are two CRUSH changes:
>
> - Marking it out
> - Purging it
>
> You could do:
>
> $ ceph osd crush remove osd.X
>
> Wait for all good
>
> $ ceph osd purge X
>
> The last step should then not initiate any data movement.
>
> Wido
>
> > WBR,
> > Fyodor.
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
T: @Thaumion
IG: Thaumion
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bionic Upgrade 12.2.10

2019-01-14 Thread Scottix
Wow, OK.
I wish there were some official stance on this.

Now I have to remove those OSDs, downgrade to 16.04, and re-add them;
this is going to take a while.

--Scott

On Mon, Jan 14, 2019 at 10:53 AM Reed Dier  wrote:
>
> This is because Luminous is not being built for Bionic for whatever reason.
> There are some other mailing list entries detailing this.
>
> Right now you have ceph installed from the Ubuntu bionic-updates repo, which 
> has 12.2.8, but does not get regular release updates.
>
> This is what I ended up having to do for my ceph nodes that were upgraded 
> from Xenial to Bionic, as well as new ceph nodes that installed straight to 
> Bionic, due to the repo issues. Even if you try to use the xenial packages, 
> you will run into issues with libcurl4 and libcurl3 I imagine.
>
> Reed
>
> On Jan 14, 2019, at 12:21 PM, Scottix  wrote:
>
> https://download.ceph.com/debian-luminous/
>
>


-- 
T: @Thaumion
IG: Thaumion
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Bionic Upgrade 12.2.10

2019-01-14 Thread Scottix
Hey,
I am having some issues upgrading to 12.2.10 on my 18.04 server. It is
saying 12.2.8 is the latest.
I am not sure why it is not going to 12.2.10; the rest of my cluster is
already on 12.2.10 except this one machine.

$ cat /etc/apt/sources.list.d/ceph.list
deb https://download.ceph.com/debian-luminous/ bionic main

$ apt update
Hit:1 http://us.archive.ubuntu.com/ubuntu bionic InRelease
Hit:2 http://us.archive.ubuntu.com/ubuntu bionic-updates InRelease
Hit:3 http://security.ubuntu.com/ubuntu bionic-security InRelease
Hit:4 http://us.archive.ubuntu.com/ubuntu bionic-backports InRelease
Hit:5 https://download.ceph.com/debian-luminous bionic InRelease

$ apt-cache policy ceph-osd
ceph-osd:
  Installed: 12.2.8-0ubuntu0.18.04.1
  Candidate: 12.2.8-0ubuntu0.18.04.1
  Version table:
 *** 12.2.8-0ubuntu0.18.04.1 500
500 http://us.archive.ubuntu.com/ubuntu bionic-updates/main
amd64 Packages
100 /var/lib/dpkg/status
 12.2.4-0ubuntu1 500
500 http://us.archive.ubuntu.com/ubuntu bionic/main amd64 Packages
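One more thing I can check (a hedged aside: this command just lists every
version each configured repository advertises, which should show whether
download.ceph.com is actually publishing anything newer than 12.2.8 for
bionic):

$ apt-cache madison ceph-osd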

Any help on what could be the issue would be appreciated.

Thanks,
Scott

-- 
T: @Thaumion
IG: Thaumion
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs free space issue

2019-01-10 Thread Scottix
I just had this question as well.

I am interested in what you mean by fullest: is it percentage-wise or raw
space? If I have an uneven distribution and adjusted it, would it potentially
make more space available?
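(Sanity-checking Wido's rule of thumb below against the numbers in this
thread: (65.5 TiB / 3) * 0.85 is roughly 18.6 TiB, which is right around the
19 TB that df reports, so the ~15-20% buffer for imbalance seems to line up.
A hedged way to see how uneven the distribution actually is, and which OSD is
the fullest one driving MAX AVAIL down:

$ ceph osd df

The %USE and VAR columns should show the spread across OSDs.)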

Thanks
Scott
On Thu, Jan 10, 2019 at 12:05 AM Wido den Hollander  wrote:

>
>
> On 1/9/19 2:33 PM, Yoann Moulin wrote:
> > Hello,
> >
> > I have a CEPH cluster in luminous 12.2.10 dedicated to cephfs.
> >
> > The raw size is 65.5 TB; with replica 3, I should have ~21.8 TB usable.
> >
> > But the size of the cephfs reported by df is *only* 19 TB. Is that normal?
> >
>
> Yes. Ceph will calculate this based on the fullest OSD. As data
> distribution is never 100% perfect you will get such numbers.
>
> To go from raw to usable I use this calculation:
>
> (RAW / 3) * 0.85
>
> So yes, I take a 20%, sometimes even 30% buffer.
>
> Wido
>
> > Best regards,
> >
> > here some hopefully useful information :
> >
> >> apollo@icadmin004:~$ ceph -s
> >>   cluster:
> >> id: fc76846a-d0f0-4866-ae6d-d442fc885469
> >> health: HEALTH_OK
> >>
> >>   services:
> >> mon: 3 daemons, quorum icadmin006,icadmin007,icadmin008
> >> mgr: icadmin006(active), standbys: icadmin007, icadmin008
> >> mds: cephfs-3/3/3 up
> {0=icadmin008=up:active,1=icadmin007=up:active,2=icadmin006=up:active}
> >> osd: 40 osds: 40 up, 40 in
> >>
> >>   data:
> >> pools:   2 pools, 2560 pgs
> >> objects: 26.12M objects, 15.6TiB
> >> usage:   49.7TiB used, 15.8TiB / 65.5TiB avail
> >> pgs: 2560 active+clean
> >>
> >>   io:
> >> client:   510B/s rd, 24.1MiB/s wr, 0op/s rd, 35op/s wr
> >
> >> apollo@icadmin004:~$ ceph df
> >> GLOBAL:
> >>     SIZE    AVAIL   RAW USED %RAW USED
> >>     65.5TiB 15.8TiB  49.7TiB      75.94
> >> POOLS:
> >>     NAME            ID USED    %USED MAX AVAIL  OBJECTS
> >>     cephfs_data     1  15.6TiB 85.62   2.63TiB 25874848
> >>     cephfs_metadata 2   571MiB  0.02   2.63TiB   245778
> >
> >> apollo@icadmin004:~$ rados df
> >> POOL_NAME       USED    OBJECTS  CLONES COPIES   MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS     RD      WR_OPS   WR
> >> cephfs_data     15.6TiB 25874848      0 77624544                  0       0        0  324156851 25.9TiB 20114360 9.64TiB
> >> cephfs_metadata  571MiB   245778      0   737334                  0       0        0 1802713236 87.7TiB 75729412 16.0TiB
> >>
> >> total_objects    26120626
> >> total_used   49.7TiB
> >> total_avail  15.8TiB
> >> total_space  65.5TiB
> >
> >> apollo@icadmin004:~$ ceph osd pool ls detail
> >> pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 6197 lfor 0/3885
> flags hashpspool stripe_width 0 application cephfs
> >> pool 2 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 512 pgp_num 512 last_change 6197 lfor 0/703
> flags hashpspool stripe_width 0 application cephfs
> >
> >> apollo@icadmin004:~$ df -h /apollo/
> >> Filesystem Size  Used Avail Use% Mounted on
> >> 10.90.36.16,10.90.36.17,10.90.36.18:/   19T   16T  2.7T  86% /apollo
> >
> >> apollo@icadmin004:~$ ceph fs get cephfs
> >> Filesystem 'cephfs' (1)
> >> fs_name  cephfs
> >> epoch    49277
> >> flags    c
> >> created  2018-01-23 14:06:43.460773
> >> modified 2019-01-09 14:17:08.520888
> >> tableserver  0
> >> root 0
> >> session_timeout  60
> >> session_autoclose    300
> >> max_file_size    1099511627776
> >> last_failure 0
> >> last_failure_osd_epoch   6216
> >> compat   compat={},rocompat={},incompat={1=base v0.20,2=client
> writeable ranges,3=default file layouts on dirs,4=dir inode in separate
> object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no
> anchor table,9=file layout v2}
> >> max_mds  3
> >> in   0,1,2
> >> up   {0=424203,1=424158,2=424146}
> >> failed
> >> damaged
> >> stopped
> >> data_pools   [1]
> >> metadata_pool2
> >> inline_data  disabled
> >> balancer
> >> standby_count_wanted 0
> >> 424203:  10.90.36.18:6800/3885954695 'icadmin008' mds.0.49202
> up:active seq 6 export_targets=1,2
> >> 424158:  10.90.36.17:6800/152758094 'icadmin007' mds.1.49198
> up:active seq 16 export_targets=0,2
> >> 424146:  10.90.36.16:6801/1771587593 'icadmin006' mds.2.49195
> up:active seq 19 export_targets=0
> >
> >> apollo@icadmin004:~$ ceph osd tree
> >> ID  CLASS WEIGHT   TYPE NAME              STATUS REWEIGHT PRI-AFF
> >>  -1       65.49561 root default
> >>  -7        3.27478     host iccluster150
> >> 160   hdd   1.63739         osd.160           up  1.0      1.0
> >> 165   hdd   1.63739         osd.165           up  1.0      1.0
> >> -11        3.27478     host iccluster151
> >> 163   hdd   1.63739         osd.163           up  1.0      1.0
> >> 168   hdd   1.63739         osd.168           up  1.0      1.0
> >>  -5        3.27478 

Re: [ceph-users] cephfs tell command not working

2018-07-30 Thread Scottix
Awww that makes more sense now. I guess I didn't quite comprehend EPERM at
the time.
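(For anyone else who hits this, a hedged sketch of checking and fixing the
caps; the client name below is just an example, and note that "ceph auth caps"
overwrites all existing caps for that entity, so include everything the client
already needs, not just the mds part:

$ ceph auth get client.admin
$ ceph auth caps client.myuser mon 'allow r' mds 'allow *' osd 'allow rw'
)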

Thank You,
Scott
On Mon, Jul 30, 2018 at 7:19 AM John Spray  wrote:

> On Fri, Jul 27, 2018 at 8:35 PM Scottix  wrote:
> >
> > ceph tell mds.0 client ls
> > 2018-07-27 12:32:40.344654 7fa5e27fc700  0 client.89408629
> ms_handle_reset on 10.10.1.63:6800/1750774943
> > Error EPERM: problem getting command descriptions from mds.0
>
> You need "mds allow *" capabilities (the default client.admin user has
> this) to send commands to MDS daemons.
>
> John
>
>
>
> >
> > mds log
> > 2018-07-27 12:32:40.342753 7fc9c1239700  1 mds.CephMon203
> handle_command: received command from client without `tell` capability:
> 10.10.1.x:0/953253037
> >
> >
> > We are trying to run this and getting an error.
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] cephfs tell command not working

2018-07-27 Thread Scottix
ceph tell mds.0 client ls
2018-07-27 12:32:40.344654 7fa5e27fc700  0 client.89408629 ms_handle_reset
on 10.10.1.63:6800/1750774943
Error EPERM: problem getting command descriptions from mds.0

mds log
2018-07-27 12:32:40.342753 7fc9c1239700  1 mds.CephMon203 handle_command:
received command from client without `tell` capability:
10.10.1.x:0/953253037


We are trying to run this and getting an error.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Multi-MDS Failover

2018-05-18 Thread Scottix
So we have been testing this quite a bit. Having the failure domain be
partially available is OK for us but odd, since we don't know what will be
down; compared to a single MDS, where we know everything will be blocked.

It would be nice to have an option to block all IO if the filesystem hits a
degraded state, until it recovers. Since each MDS is unaware of the other
MDSs' state, it seems like that would be tough to do.

I'll leave this as a feature request, possibly for the future.

On Fri, May 18, 2018 at 3:15 PM Gregory Farnum  wrote:

> On Fri, May 18, 2018 at 11:56 AM Webert de Souza Lima <
> webert.b...@gmail.com> wrote:
>
>> Hello,
>>
>>
>> On Mon, Apr 30, 2018 at 7:16 AM Daniel Baumann 
>> wrote:
>>
>>> additionally: if rank 0 is lost, the whole FS stands still (no new
>>> client can mount the fs; no existing client can change a directory,
>>> etc.).
>>>
>>> my guess is that the root of a cephfs (/; which is always served by rank
>>> 0) is needed in order to do traversals/lookups of any directories on the
>>> top-level (which then can be served by ranks 1-n).
>>>
>>
>> Could someone confirm if this is actually how it works? Thanks.
>>
>
> Yes, although I'd expect that clients can keep doing work in directories
> they've already got opened (or in descendants of those). Perhaps I'm
> missing something about that, though...
> -Greg
>
>
>>
>> Regards,
>>
>> Webert Lima
>> DevOps Engineer at MAV Tecnologia
>> *Belo Horizonte - Brasil*
>> *IRC NICK - WebertRLZ*
>>
>>
>>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy on 14.04

2018-04-30 Thread Scottix
Alright, I'll try that.

Thanks
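(For the record, what I understand the downgrade to look like, assuming pip is
managing ceph-deploy on the admin node:

$ sudo pip uninstall ceph-deploy
$ sudo pip install ceph-deploy==1.5.39
)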
On Mon, Apr 30, 2018 at 5:45 PM Vasu Kulkarni <vakul...@redhat.com> wrote:

> If you are on 14.04 or need to use ceph-disk, then you can install
> version 1.5.39 from pip. To downgrade, just uninstall the current one
> and reinstall 1.5.39; you don't have to delete your conf file folder.
>
> On Mon, Apr 30, 2018 at 5:31 PM, Scottix <scot...@gmail.com> wrote:
> > It looks like ceph-deploy@2.0.0 is incompatible with systems running
> 14.04
> > and it got released in the luminous branch with the new deployment
> commands.
> >
> > Is there any way to downgrade to an older version?
> >
> > Log of osd list
> >
> > XYZ@XYZStat200:~/XYZ-cluster$ ceph-deploy --overwrite-conf osd list
> > XYZCeph204
> > [ceph_deploy.conf][DEBUG ] found configuration file at:
> > /home/XYZ/.cephdeploy.conf
> > [ceph_deploy.cli][INFO  ] Invoked (2.0.0): /usr/bin/ceph-deploy
> > --overwrite-conf osd list XYZCeph204
> > [ceph_deploy.cli][INFO  ] ceph-deploy options:
> > [ceph_deploy.cli][INFO  ]  username  : None
> > [ceph_deploy.cli][INFO  ]  verbose   : False
> > [ceph_deploy.cli][INFO  ]  debug : False
> > [ceph_deploy.cli][INFO  ]  overwrite_conf: True
> > [ceph_deploy.cli][INFO  ]  subcommand: list
> > [ceph_deploy.cli][INFO  ]  quiet : False
> > [ceph_deploy.cli][INFO  ]  cd_conf   :
> > 
> > [ceph_deploy.cli][INFO  ]  cluster   : ceph
> > [ceph_deploy.cli][INFO  ]  host  : ['XYZCeph204']
> > [ceph_deploy.cli][INFO  ]  func  :  at
> > 0x7f12af1e80c8>
> > [ceph_deploy.cli][INFO  ]  ceph_conf : None
> > [ceph_deploy.cli][INFO  ]  default_release   : False
> > XYZ@XYZceph204's password:
> > [XYZCeph204][DEBUG ] connection detected need for sudo
> > XYZ@XYZceph204's password:
> > [XYZCeph204][DEBUG ] connected to host: XYZCeph204
> > [XYZCeph204][DEBUG ] detect platform information from remote host
> > [XYZCeph204][DEBUG ] detect machine type
> > [XYZCeph204][DEBUG ] find the location of an executable
> > [XYZCeph204][INFO  ] Running command: sudo /sbin/initctl version
> > [XYZCeph204][DEBUG ] find the location of an executable
> > [ceph_deploy.osd][INFO  ] Distro info: Ubuntu 14.04 trusty
> > [ceph_deploy.osd][DEBUG ] Listing disks on XYZCeph204...
> > [XYZCeph204][DEBUG ] find the location of an executable
> > [XYZCeph204][INFO  ] Running command: sudo /usr/sbin/ceph-volume lvm list
> > [XYZCeph204][DEBUG ]  stderr: /sbin/lvs: unrecognized option '--readonly'
> > [XYZCeph204][WARNIN] No valid Ceph devices found
> > [XYZCeph204][DEBUG ]  stderr: Error during parsing of command line.
> > [XYZCeph204][DEBUG ]  stderr: /sbin/lvs: unrecognized option '--readonly'
> > [XYZCeph204][DEBUG ]  stderr: Error during parsing of command line.
> > [XYZCeph204][ERROR ] RuntimeError: command returned non-zero exit
> status: 1
> > [ceph_deploy][ERROR ] RuntimeError: Failed to execute command:
> > /usr/sbin/ceph-volume lvm list
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-deploy on 14.04

2018-04-30 Thread Scottix
It looks like ceph-deploy@2.0.0 is incompatible with systems running 14.04
and it got released in the luminous branch with the new deployment commands.

Is there any way to downgrade to an older version?

Log of osd list

XYZ@XYZStat200:~/XYZ-cluster$ ceph-deploy --overwrite-conf osd list
XYZCeph204
[ceph_deploy.conf][DEBUG ] found configuration file at:
/home/XYZ/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.0): /usr/bin/ceph-deploy
--overwrite-conf osd list XYZCeph204
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username  : None
[ceph_deploy.cli][INFO  ]  verbose   : False
[ceph_deploy.cli][INFO  ]  debug : False
[ceph_deploy.cli][INFO  ]  overwrite_conf: True
[ceph_deploy.cli][INFO  ]  subcommand: list
[ceph_deploy.cli][INFO  ]  quiet : False
[ceph_deploy.cli][INFO  ]  cd_conf   :

[ceph_deploy.cli][INFO  ]  cluster   : ceph
[ceph_deploy.cli][INFO  ]  host  : ['XYZCeph204']
[ceph_deploy.cli][INFO  ]  func  : 
[ceph_deploy.cli][INFO  ]  ceph_conf : None
[ceph_deploy.cli][INFO  ]  default_release   : False
XYZ@XYZceph204's password:
[XYZCeph204][DEBUG ] connection detected need for sudo
XYZ@XYZceph204's password:
[XYZCeph204][DEBUG ] connected to host: XYZCeph204
[XYZCeph204][DEBUG ] detect platform information from remote host
[XYZCeph204][DEBUG ] detect machine type
[XYZCeph204][DEBUG ] find the location of an executable
[XYZCeph204][INFO  ] Running command: sudo /sbin/initctl version
[XYZCeph204][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: Ubuntu 14.04 trusty
[ceph_deploy.osd][DEBUG ] Listing disks on XYZCeph204...
[XYZCeph204][DEBUG ] find the location of an executable
[XYZCeph204][INFO  ] Running command: sudo /usr/sbin/ceph-volume lvm list
[XYZCeph204][DEBUG ]  stderr: /sbin/lvs: unrecognized option '--readonly'
[XYZCeph204][WARNIN] No valid Ceph devices found
[XYZCeph204][DEBUG ]  stderr: Error during parsing of command line.
[XYZCeph204][DEBUG ]  stderr: /sbin/lvs: unrecognized option '--readonly'
[XYZCeph204][DEBUG ]  stderr: Error during parsing of command line.
[XYZCeph204][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command:
/usr/sbin/ceph-volume lvm list
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] *** SPAM *** Re: Multi-MDS Failover

2018-04-27 Thread Scottix
Hey Dan,

Thank you for the response; the namespace methodology makes more sense and
I think it explains what would be up or not.
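(For anyone searching the archives later, a hedged sketch of the
standby / standby-for-rank setup Dan describes below, written as ceph.conf
sections; the daemon names and ranks are only examples:

[mds.mds-a]
    mds standby for rank = 0
    mds standby replay = true

[mds.mds-b]
    mds standby for rank = 1
    mds standby replay = true

A plain standby, with neither option set, can take over any failed rank.)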

In regards to item 4 of my original email (the directory listing showing 0
files): I will try to recreate it with debug on and submit an issue if it
turns out to be a bug.

I am sorry if I have offended anyone with my attitude, I am just trying to
get information and understand what is going on. I want Ceph and CephFS to
be the best out there.

Thank you all

On Fri, Apr 27, 2018 at 12:14 AM Dan van der Ster <d...@vanderster.com>
wrote:

> Hi Scott,
>
> Multi MDS just assigns different parts of the namespace to different
> "ranks". Each rank (0, 1, 2, ...) is handled by one of the active
> MDSs. (You can query which parts of the name space are assigned to
> each rank using the jq tricks in [1]). If a rank is down and there are
> no more standby's, then you need to bring up a new MDS to handle that
> down rank. In the meantime, part of the namespace will have IO
> blocked.
>
> To handle these failures, you need to configure sufficient standby
> MDSs to handle the failure scenarios you foresee in your environment.
> A strictly "standby" MDS can takeover from *any* of the failed ranks,
> and you can have several "standby" MDSs to cover multiple failures. So
> just run 2 or 3 standby's if you want to be on the safe side.
>
> You can also configure "standby-for-rank" MDSs -- that is, a given
> standby MDS can be watching a specific rank and then take over if the
> MDS holding that rank fails. Those standby-for-rank MDSs can even be "hot"
> standbys to speed up the failover process.
>
> An active MDS for a given rank does not act as a standby for the other
> ranks. I'm not sure if it *could* following some code changes, but
> anyway that just not how it works today.
>
> Does that clarify things?
>
> Cheers, Dan
>
> [1] https://ceph.com/community/new-luminous-cephfs-subtree-pinning/
>
>
> On Fri, Apr 27, 2018 at 4:04 AM, Scottix <scot...@gmail.com> wrote:
> > Ok let me try to explain this better, we are doing this back and forth
> and
> > its not going anywhere. I'll just be as genuine as I can and explain the
> > issue.
> >
> > What we are testing is a critical failure scenario and actually more of a
> > real world scenario. Basically just what happens when it is 1AM and the
> shit
> > hits the fan, half of your servers are down and 1 of the 3 MDS boxes are
> > still alive.
> > There is one very important fact that happens with CephFS and when the
> > single Active MDS server fails. It is guaranteed 100% all IO is blocked.
> No
> > split-brain, no corrupted data, 100% guaranteed ever since we started
> using
> > CephFS
> >
> > Now with multi_mds, I understand this changes the logic and I understand
> how
> > difficult and how hard this problem is, trust me I would not be able to
> > tackle this. Basically I need to answer the question; what happens when
> 1 of
> > 2 multi_mds fails with no standbys ready to come save them?
> > What I have tested is not the same of a single active MDS; this
> absolutely
> > changes the logic of what happens and how we troubleshoot. The CephFS is
> > still alive and it does allow operations and does allow resources to go
> > through. How, why and what is affected are very relevant questions if
> this
> > is what the failure looks like since it is not 100% blocking.
> >
> > This is the problem, I have programs writing a massive amount of data
> and I
> > don't want it corrupted or lost. I need to know what happens and I need
> to
> > have guarantees.
> >
> > Best
> >
> >
> > On Thu, Apr 26, 2018 at 5:03 PM Patrick Donnelly <pdonn...@redhat.com>
> > wrote:
> >>
> >> On Thu, Apr 26, 2018 at 4:40 PM, Scottix <scot...@gmail.com> wrote:
> >> >> Of course -- the mons can't tell the difference!
> >> > That is really unfortunate, it would be nice to know if the filesystem
> >> > has
> >> > been degraded and to what degree.
> >>
> >> If a rank is laggy/crashed, the file system as a whole is generally
> >> unavailable. The span between partial outage and full is small and not
> >> worth quantifying.
> >>
> >> >> You must have standbys for high availability. This is the docs.
> >> > Ok but what if you have your standby go down and a master go down.
> This
> >> > could happen in the real world and is a valid error scenario.
> >> >Also there is
> >> > a period between when the standby becomes active what happens
>

Re: [ceph-users] Multi-MDS Failover

2018-04-26 Thread Scottix
OK, let me try to explain this better; we are doing this back and forth and
it's not going anywhere. I'll just be as genuine as I can and explain the
issue.

What we are testing is a critical failure scenario and actually more of a
real world scenario. Basically just what happens when it is 1AM and the
shit hits the fan, half of your servers are down and 1 of the 3 MDS boxes
are still alive.
There is one very important fact about CephFS when the single active MDS
server fails: it is guaranteed that 100% of IO is blocked. No split-brain,
no corrupted data; 100% guaranteed, ever since we started using CephFS.

Now with multi_mds, I understand this changes the logic, and I understand
how difficult this problem is; trust me, I would not be able to tackle it.
Basically I need to answer the question: what happens when 1 of 2 active MDSs
fails with no standbys ready to save them?
What I have tested is not the same as a single active MDS; this absolutely
changes the logic of what happens and how we troubleshoot. The CephFS is
still alive, and it does allow operations and does allow resources to go
through. How, why, and what is affected are very relevant questions if this
is what the failure looks like, since it is not 100% blocking.

This is the problem: I have programs writing a massive amount of data, and I
don't want it corrupted or lost. I need to know what happens, and I need to
have guarantees.
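(One concrete thing we can do on our side, if I'm reading the docs right, is
make the cluster warn loudly whenever the standby pool is empty; a hedged
sketch, using the filesystem name from our test cluster:

$ ceph fs set cephfs standby_count_wanted 1

With that set, losing the last standby should at least surface as a
HEALTH_WARN instead of going unnoticed.)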

Best


On Thu, Apr 26, 2018 at 5:03 PM Patrick Donnelly <pdonn...@redhat.com>
wrote:

> On Thu, Apr 26, 2018 at 4:40 PM, Scottix <scot...@gmail.com> wrote:
> >> Of course -- the mons can't tell the difference!
> > That is really unfortunate, it would be nice to know if the filesystem
> has
> > been degraded and to what degree.
>
> If a rank is laggy/crashed, the file system as a whole is generally
> unavailable. The span between partial outage and full is small and not
> worth quantifying.
>
> >> You must have standbys for high availability. This is the docs.
> > Ok but what if you have your standby go down and a master go down. This
> > could happen in the real world and is a valid error scenario.
> >Also there is
> > a period between when the standby becomes active what happens in-between
> > that time?
>
> The standby MDS goes through a series of states where it recovers the
> lost state and connections with clients. Finally, it goes active.
>
> >> It depends(tm) on how the metadata is distributed and what locks are
> > held by each MDS.
> > Your saying depending on which mds had a lock on a resource it will block
> > that particular POSIX operation? Can you clarify a little bit?
> >
> >> Standbys are not optional in any production cluster.
> > Of course in production I would hope people have standbys but in theory
> > there is no enforcement in Ceph for this other than a warning. So when
> you
> > say not optional that is not exactly true it will still run.
>
> It's self-defeating to expect CephFS to enforce having standbys --
> presumably by throwing an error or becoming unavailable -- when the
> standbys exist to make the system available.
>
> There's nothing to enforce. A warning is sufficient for the operator
> that (a) they didn't configure any standbys or (b) MDS daemon
> processes/boxes are going away and not coming back as standbys (i.e.
> the pool of MDS daemons is decreasing with each failover)
>
> --
> Patrick Donnelly
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Multi-MDS Failover

2018-04-26 Thread Scottix
> Of course -- the mons can't tell the difference!
That is really unfortunate; it would be nice to know if the filesystem has
been degraded and to what degree.

> You must have standbys for high availability. This is the docs.
OK, but what if your standby goes down and a master goes down? This
could happen in the real world and is a valid error scenario. Also, there is
a period before the standby becomes active; what happens in between during
that time?

> It depends(tm) on how the metadata is distributed and what locks are
held by each MDS.
You're saying that, depending on which MDS holds a lock on a resource, it will
block that particular POSIX operation? Can you clarify a little bit?

> Standbys are not optional in any production cluster.
Of course, in production I would hope people have standbys, but in theory
there is no enforcement in Ceph for this other than a warning. So when you
say "not optional", that is not exactly true; it will still run.

On Thu, Apr 26, 2018 at 3:37 PM Patrick Donnelly <pdonn...@redhat.com>
wrote:

> On Thu, Apr 26, 2018 at 3:16 PM, Scottix <scot...@gmail.com> wrote:
> > Updated to 12.2.5
> >
> > We are starting to test multi_mds cephfs and we are going through some
> > failure scenarios in our test cluster.
> >
> > We are simulating a power failure to one machine and we are getting mixed
> > results of what happens to the file system.
> >
> > This is the status of the mds once we simulate the power loss considering
> > there are no more standbys.
> >
> > mds: cephfs-2/2/2 up
> > {0=CephDeploy100=up:active,1=TigoMDS100=up:active(laggy or crashed)}
> >
> > 1. It is a little unclear if it is laggy or really is down, using this
> line
> > alone.
>
> Of course -- the mons can't tell the difference!
>
> > 2. The first time we lost total access to ceph folder and just blocked
> i/o
>
> You must have standbys for high availability. This is the docs.
>
> > 3. One time we were still able to access ceph folder and everything
> seems to
> > be running.
>
> It depends(tm) on how the metadata is distributed and what locks are
> held by each MDS.
>
> Standbys are not optional in any production cluster.
>
> --
> Patrick Donnelly
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Multi-MDS Failover

2018-04-26 Thread Scottix
Updated to 12.2.5

We are starting to test multi_mds cephfs and we are going through some
failure scenarios in our test cluster.

We are simulating a power failure to one machine and we are getting mixed
results of what happens to the file system.

This is the status of the mds once we simulate the power loss considering
there are no more standbys.

mds: cephfs-2/2/2 up
{0=CephDeploy100=up:active,1=TigoMDS100=up:active(laggy or crashed)}

1. It is a little unclear whether it is laggy or really down, using this line
alone.
2. The first time, we lost total access to the ceph folder and all I/O was
simply blocked.
3. One time we were still able to access the ceph folder and everything seemed
to be running.
4. One time we had a script creating a bunch of files, simulated the crash,
then listed the directory and it showed 0 files; we expected lots of files.

I mean, we could go into the details of each of those, but really I am trying
to understand Ceph's logic in dealing with a crashed MDS in a multi-MDS setup:
is it marked degraded, or what exactly is going on?

It just seems a little unclear what is going to happen.

Good news: once it comes back online, everything is as it should be.

Thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrade Order with ceph-mgr

2018-04-26 Thread Scottix
Thanks.
Can you make sure there is a ticket to update the doc? I am sure others
will have this same question, and that location is hard to find and parse.
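(For anyone else landing on this thread, my hedged reading of the Luminous
release notes is: upgrade the packages everywhere, then restart daemons in
this order, one node or failure domain at a time:

sudo systemctl restart ceph-mon.target      # 1. monitors
sudo systemctl restart ceph-mgr.target      # 2. managers, often colocated with the mons
sudo systemctl restart ceph-osd.target      # 3. OSDs
sudo systemctl restart ceph-mds.target      # 4. MDS daemons
sudo systemctl restart ceph-radosgw.target  # 5. RGW daemons
)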

On Thu, Apr 26, 2018 at 9:13 AM Vasu Kulkarni <vakul...@redhat.com> wrote:

> Step 6 says the mons should be upgraded *first*, and step 7 there indicates
> that the mgr's place in the order is after the mon upgrade and before the
> OSDs. There are a couple of threads related to colocated mon/osd upgrade
> scenarios.
>
> On Thu, Apr 26, 2018 at 9:05 AM, Scottix <scot...@gmail.com> wrote:
> > Right I have ceph-mgr but when I do an update I want to make sure it is
> the
> > recommended order to update things or maybe it just doesn't matter.
> > Either way usually there is a recommended order with ceph so just asking
> to
> > see what the official response is.
> >
> > On Thu, Apr 26, 2018 at 8:59 AM Vasu Kulkarni <vakul...@redhat.com>
> wrote:
> >>
> >> On Thu, Apr 26, 2018 at 8:52 AM, Scottix <scot...@gmail.com> wrote:
> >> > Now that we have ceph-mgr in luminous what is the best upgrade order
> for
> >> > the
> >> > ceph-mgr?
> >> >
> >> > http://docs.ceph.com/docs/master/install/upgrading-ceph/
> >> I think that is outdated and needs some fix but release notes is what
> >> gets updated and has accurate steps
> >> Check the section "upgrade from Jewel or Kraken" section here
> >> https://ceph.com/releases/v12-2-0-luminous-released/
> >>
> >> >
> >> > Thanks.
> >> >
> >> > ___
> >> > ceph-users mailing list
> >> > ceph-users@lists.ceph.com
> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrade Order with ceph-mgr

2018-04-26 Thread Scottix
Right, I have ceph-mgr, but when I do an update I want to make sure I follow
the recommended order to update things, or maybe it just doesn't matter.
Either way, there is usually a recommended order with Ceph, so I'm just asking
to see what the official response is.

On Thu, Apr 26, 2018 at 8:59 AM Vasu Kulkarni <vakul...@redhat.com> wrote:

> On Thu, Apr 26, 2018 at 8:52 AM, Scottix <scot...@gmail.com> wrote:
> > Now that we have ceph-mgr in luminous what is the best upgrade order for
> the
> > ceph-mgr?
> >
> > http://docs.ceph.com/docs/master/install/upgrading-ceph/
> I think that is outdated and needs some fixes, but the release notes are what
> get updated and have accurate steps.
> Check the section "upgrade from Jewel or Kraken" section here
> https://ceph.com/releases/v12-2-0-luminous-released/
>
> >
> > Thanks.
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Upgrade Order with ceph-mgr

2018-04-26 Thread Scottix
Now that we have ceph-mgr in Luminous, what is the best upgrade order for
the ceph-mgr?

http://docs.ceph.com/docs/master/install/upgrading-ceph/

Thanks.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Install previous version of Ceph

2018-02-26 Thread Scottix
I have been trying the dpkg -i route but am hitting a lot of dependency
issues, so I'm still working on it.
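(Along the lines of David's suggestion below about locking the version, a
hedged sketch of an apt pin that should hold every ceph package at 12.2.2
until I'm ready to upgrade; this would go in /etc/apt/preferences.d/ceph.pref:

Package: ceph* librados* librbd* libcephfs* radosgw*
Pin: version 12.2.2*
Pin-Priority: 1001
)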

On Mon, Feb 26, 2018 at 7:36 AM David Turner <drakonst...@gmail.com> wrote:

> In the past I downloaded the packages for a version and configured it as a
> local repo on the server.  basically it was a tar.gz that I would extract
> that would place the ceph packages in a folder for me and swap out the repo
> config file to a version that points to the local folder.  I haven't needed
> to do that much, but it was helpful.  Generally it's best to just mirror
> the upstream and lock it to the version you're using in production.  That's
> a good rule of thumb for other repos as well, especially for ceph nodes.
> When I install a new ceph node, I want all of its package versions to
> match 100% to the existing nodes.  Troubleshooting problems becomes
> drastically simpler once you get to that point.
>
> On Mon, Feb 26, 2018 at 9:08 AM Ronny Aasen <ronny+ceph-us...@aasen.cx>
> wrote:
>
>> On 23. feb. 2018 23:37, Scottix wrote:
>> > Hey,
>> > We had one of our monitor servers die on us and I have a replacement
>> > computer now. In between that time you have released 12.2.3 but we are
>> > still on 12.2.2.
>> >
>> > We are on Ubuntu servers
>> >
>> > I see all the binaries are in the repo but your package cache only shows
>> > 12.2.3, is there a reason for not keeping the previous builds like in my
>> > case.
>> >
>> > I could do an install like
>> > apt install ceph-mon=12.2.2
>> >
>> > Also how would I go installing 12.2.2 in my scenario since I don't want
>> > to update till have this monitor running again.
>> >
>> > Thanks,
>> > Scott
>>
>> did you figure out a solution to this ? I have the same problem now.
>> I assume you have to download the old version manually and install with
>> dpkg -i
>>
>> optionally mirror the ceph repo and build your own repo index containing
>> all versions.
>>
>> kind regards
>> Ronny Aasen
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Install previous version of Ceph

2018-02-23 Thread Scottix
Hey,
We had one of our monitor servers die on us and I have a replacement
computer now. In between that time you have released 12.2.3 but we are
still on 12.2.2.

We are on Ubuntu servers

I see all the binaries are in the repo, but your package cache only shows
12.2.3. Is there a reason for not keeping the previous builds, for a case
like mine?

I could do an install like
apt install ceph-mon=12.2.2

Also, how would I go about installing 12.2.2 in my scenario, since I don't
want to update until I have this monitor running again?

Thanks,
Scott
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore osd_max_backfills

2017-11-08 Thread Scottix
When I add in the next HDD, I'll try the method again and see if I just
needed to wait longer.
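(Next time I'll also confirm the value the daemon is actually running with,
rather than trusting the injectargs output alone; a hedged example, run on the
OSD's host via the admin socket:

$ sudo ceph daemon osd.35 config get osd_max_backfills
)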

On Tue, Nov 7, 2017 at 11:19 PM Wido den Hollander <w...@42on.com> wrote:

>
> > On 7 November 2017 at 22:54, Scottix <scot...@gmail.com> wrote:
> >
> >
> > Hey,
> > I recently updated to luminous and started deploying bluestore osd
> nodes. I
> > normally set osd_max_backfills = 1 and then ramp up as time progresses.
> >
> > Although with bluestore it seems like I wasn't able to do this on the fly
> > like I used to with XFS.
> >
> > ceph tell osd.* injectargs '--osd-max-backfills 5'
> >
> > osd.34: osd_max_backfills = '5'
> > osd.35: osd_max_backfills = '5' rocksdb_separate_wal_dir = 'false' (not
> > observed, change may require restart)
> > osd.36: osd_max_backfills = '5'
> > osd.37: osd_max_backfills = '5'
> >
> > As I incorporate more bluestore osds not being able to control this is
> > going to drastically affect recovery speed and with the default as 1, on
> a
> > big rebalance, I would be afraid restarting a bunch of osd.
> >
>
> Are you sure the backfills are really not increasing? If you re-run the
> command, what does it output?
>
> I've seen this as well, but the backfills seemed to increase anyway.
>
> Wido
>
> > Any advice in how to control this better?
> >
> > Thanks,
> > Scott
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Bluestore osd_max_backfills

2017-11-07 Thread Scottix
Hey,
I recently updated to luminous and started deploying bluestore osd nodes. I
normally set osd_max_backfills = 1 and then ramp up as time progresses.

Although with bluestore it seems like I wasn't able to do this on the fly
like I used to with XFS.

ceph tell osd.* injectargs '--osd-max-backfills 5'

osd.34: osd_max_backfills = '5'
osd.35: osd_max_backfills = '5' rocksdb_separate_wal_dir = 'false' (not
observed, change may require restart)
osd.36: osd_max_backfills = '5'
osd.37: osd_max_backfills = '5'

As I incorporate more bluestore OSDs, not being able to control this is
going to drastically affect recovery speed, and with the default of 1 on a
big rebalance, I would be afraid to restart a bunch of OSDs.

Any advice in how to control this better?

Thanks,
Scott
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph release cadence

2017-09-08 Thread Scottix
Personally I kind of like the current format; fundamentally we are talking
about data storage, which should be the most tested and scrutinized piece of
software on your computer. I would rather have XYZ feature later than sooner,
compared to "oh, I lost all my data". I am thinking of a recent filesystem
that shipped a feature it shouldn't have. I appreciate the extra time it takes
to make a release resilient.

Having an LTS version to rely on provides good assurance that the upgrade
process will be thoroughly tested.
Having a version for more experimental features keeps the new features at bay;
it basically follows the Ubuntu model.

I feel there were a lot of underpinning features in Luminous that checked a
lot of the boxes you have been wanting for a while. One thing to consider is
that possibly a lot of the core features become more incremental from here.

I guess for my use case Ceph actually does everything I need it to do at the
moment. Yes, new features and better processes make it better, but more or
less I am pretty content. Maybe I am in a small minority with this logic.

On Fri, Sep 8, 2017 at 2:20 AM Matthew Vernon  wrote:

> Hi,
>
> On 06/09/17 16:23, Sage Weil wrote:
>
> > Traditionally, we have done a major named "stable" release twice a year,
> > and every other such release has been an "LTS" release, with fixes
> > backported for 1-2 years.
>
> We use the ceph version that comes with our distribution (Ubuntu LTS);
> those come out every 2 years (though we won't move to a brand-new
> distribution until we've done some testing!). So from my POV, LTS ceph
> releases that come out such that adjacent ceph LTSs fit neatly into
> adjacent Ubuntu LTSs is the ideal outcome. We're unlikely to ever try
> putting a non-LTS ceph version into production.
>
> I hope this isn't an unusual requirement :)
>
> Matthew
>
>
> --
>  The Wellcome Trust Sanger Institute is operated by Genome Research
>  Limited, a charity registered in England with number 1021457 and a
>  company registered in England with number 2742969, whose registered
>  office is 215 Euston Road, London, NW1 2BE.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mon osd down out subtree limit default

2017-08-21 Thread Scottix
Great to hear.
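(For anyone else who doesn't want to wait for a new default, a hedged sketch
of changing it now: set it under [mon] in ceph.conf and restart the monitors.

[mon]
    mon osd down out subtree limit = host
)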

Best

On Mon, Aug 21, 2017 at 8:54 AM John Spray <jsp...@redhat.com> wrote:

> On Mon, Aug 21, 2017 at 4:34 PM, Scottix <scot...@gmail.com> wrote:
> > I don't want to hijack another thread so here is my question.
> > I just learned about this option from another thread and from my
> > understanding with our Ceph cluster that we have setup, the default
> value is
> > not good. Which is "rack" and I should have it on "host".
> > Which comes to my point why is it set to rack? To be on the safer side
> > wouldn't the option make more sense as host as default? Then if you are
> rack
> > aware then you can change the default.
>
> Yes!
>
> As it happens, we (Sage was in the room, not sure who else) talked
> about this recently, and the idea was to make the default conditional
> depending on system size.  So for smallish systems, we would set it to
> host, and on larger systems it would be rack.
>
> John
>
> >
> > Best,
> > Scott
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] mon osd down out subtree limit default

2017-08-21 Thread Scottix
I don't want to hijack another thread so here is my question.
I just learned about this option from another thread, and from my
understanding of the Ceph cluster we have set up, the default value is not
good for us: it is "rack" and I should have it on "host".
Which brings me to my point: why is it set to rack? To be on the safer side,
wouldn't the option make more sense with host as the default? Then if you
are rack-aware you can change it.

Best,
Scott
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Mysql performance on CephFS vs RBD

2017-05-01 Thread Scottix
I'm by no means a Ceph expert, but I feel this is not a fair representation
of Ceph. I am not saying the numbers would be better or worse; it's just that
I see some major holes that don't represent a typical Ceph setup.

1 Mon? Most have a minimum of 3.
1 OSD? Basically all your reads and writes are going to 1 HDD? (I would say
this is the biggest flaw in the benchmark setup.)
Is everything on a VM? Worse, is it all on 1 machine?
What is your network setup?
Why are you testing CephFS and RBD on an older kernel?
Why did you compile from source?
Journal and data on the same disk: is it a spinning drive, SSD, or other? (We
need way more specs to understand.)

I would suggest that if you want to benchmark, you need actual hardware
representing what you would do in production, to try to maximize the
performance of this type of test. Otherwise these numbers are basically
meaningless.


On Sun, Apr 30, 2017 at 9:57 PM Babu Shanmugam  wrote:

>
>
> On Monday 01 May 2017 10:24 AM, David Turner wrote:
>
> You don't have results that include the added network latency of having
> replica 3 replicating across multiple hosts. The reads would be very
> similar as the primary is the only thing that is read from, but writes will
> not return until after all 3 copies are written.
>
> I started this as an experiment to see why table creation takes too much
> time on CephFS. That was my prime focus, David. So haven't tried it on
> pools with size > 1.
>
>
> On Sat, Apr 29, 2017, 9:46 PM Babu Shanmugam  wrote:
>
>> Hi,
>> I did some basic experiments with mysql and measured the time taken by a
>> set of operations on CephFS and RBD. The RBD measurements are taken on a
>> 1GB RBD disk with ext4 filesystem. Following are my observation. The time
>> listed below are in seconds.
>>
>>
>>
>>                            *Plain file system*  *CephFS*  *RBD*
>> Mysql install db            7.9                  38.3      36.4
>> Create table                0.43                 4.2       2.5
>> Drop table                  0.14                 0.21      0.40
>> Create table + 1000 recs    2.76                 4.69      5.07
>> Create table + 1 recs                            7.69      11.96
>> Create table + 100K recs                         12.06     29.65
>>
>>
>> From the above numbers, CephFS seems to fare very well while creating
>> records whereas RBD does well while creating a table. I tried measuring the
>> syscalls of ceph-osd, ceph-mds and the mysqld while creating a table on
>> CephFS and RBD. Following is how the key syscalls of mysqld performed while
>> creating a table (time includes wait time as well).
>>
>> *Syscalls of MYSQLD* *CephFS* *RBD*
>> fsync 338.237 ms 183.697 ms
>> fdatasync 75.635 ms 96.359 ms
>> io_submit 50 us 151 us
>> open 2266 us 61 us
>> close 1186 us 33 us
>> write 115 us 51 us
>>
>> From the above numbers, open, close and fsync syscalls take too much time
>> on CephFs as compared to RBD.
>>
>> Sysbench results are below;
>>
>>
>> *Sysbence 100K records in 60 secs* *CephFS* *RBD*
>> Read Queries performed 631876 501690
>> Other Queries performed 90268 71670
>> No. of transactions 45134 35835
>> No. of transactions per sec 752.04 597.17
>> R/W requests per sec 10528.55 8360.37
>> Other operations per sec 1504.08 1194.34
>> Above numbers seems to indicate the CephFS does very well with MYSQL
>> transactions, better than RBD.
>>
>>
>> Following is my setup;
>>
>> Num MONs: 1
>> Num OSDs: 1
>> Num MDSs: 1
>> Disk  : 10 GB Qemu disk file (Both journal and data in the
>> same disk)
>> Ceph version : 10.2.5 (Built from source)
>> 
>> Build config   : ./configure --without-debug --without-fuse
>> --with-libaio \
>>   --without-libatomic-ops --without-hadoop --with-nss
>> --without-cryptopp \
>>   --without-gtk2 --disable-static --with-jemalloc \
>>   --without-libzfs --without-lttng --without-babeltrace \
>>   --with-eventfd --with-python -without-kinetic
>> --without-librocksdb \
>>   --without-openldap \
>>   CFLAGS="-g -O2 -fPIC" CXXFLAGS="-g -O2 -std=c++11 -fPIC
>>
>> Ceph conf : Apart from host and network settings nothing else is
>> configured
>> CephFS mount options: rw,relatime,name=cephfs,secret=,acl
>> RBD mount options: rw,relatime,stripe=1024,data=ordered
>>
>> All the processes were run in a Qemu virtual machine with Linux 4.4.18
>> kernel
>>
>> Searching for "Mysql on CephFS" in google does not give any useful
>> results. If this kind of experiments had been done previously and shared
>> publicly, kindly share a link to it.
>>
>> If you are aware of anything that I can do to optimise this, kindly let
>> me know. I am willing to continue this experiment to see how well we can
>> optimise CephFs for mysql.
>>
>>
>>
>> Thank you,
>> Babu Shanmugam
>> www.aalam.io
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> 

Re: [ceph-users] Random Health_warn

2017-02-23 Thread Scottix
That sounds about right; I do see blocked requests sometimes when the cluster
is under really heavy load.

Looking at some examples, I think the summary field should list the issues.
"summary": [],
"overall_status": "HEALTH_OK",

I'll try logging that too.
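(Something like this minimal loop is what I have in mind for the extra
logging, as a hedged sketch:

while true; do
    date -u '+%F %T'
    ceph health detail
    ceph status
    sleep 60
done >> /var/log/ceph-health-poll.log
)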

Scott

On Thu, Feb 23, 2017 at 3:00 PM David Turner <david.tur...@storagecraft.com>
wrote:

> There are multiple approaches to give you more information about the
> Health state.  CLI has these 2 options:
> ceph health detail
> ceph status
>
> I also like using ceph-dash.  ( https://github.com/Crapworks/ceph-dash )
>  It has an associated nagios check to scrape the ceph-dash page.
>
> I personally do `watch ceph status` when I'm monitoring the cluster
> closely.  It will show you things like blocked requests, osds flapping, mon
> clock skew, or whatever your problem is causing the health_warn state.  The
> most likely cause for health_warn off and on is blocked requests.  Those
> are caused by any number of things that you would need to diagnose further
> if that is what is causing the health_warn state.
>
> David Turner | Cloud Operations Engineer | StorageCraft Technology Corporation
>
> ____
> From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of John
> Spray [jsp...@redhat.com]
> Sent: Thursday, February 23, 2017 3:47 PM
> To: Scottix
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Random Health_warn
>
>
> On Thu, Feb 23, 2017 at 9:49 PM, Scottix <scot...@gmail.com> wrote:
> > ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
> >
> > We are seeing a weird behavior or not sure how to diagnose what could be
> > going on. We started monitoring the overall_status from the json query
> and
> > every once in a while we would get a HEALTH_WARN for a minute or two.
> >
> > Monitoring logs.
> > 02/23/2017 07:25:54 AM HEALTH_OK
> > 02/23/2017 07:24:54 AM HEALTH_WARN
> > 02/23/2017 07:23:55 AM HEALTH_OK
> > 02/23/2017 07:22:54 AM HEALTH_OK
> > ...
> > 02/23/2017 05:13:55 AM HEALTH_OK
> > 02/23/2017 05:12:54 AM HEALTH_WARN
> > 02/23/2017 05:11:54 AM HEALTH_WARN
> > 02/23/2017 05:10:54 AM HEALTH_OK
> > 02/23/2017 05:09:54 AM HEALTH_OK
> >
> > When I check the mon leader logs there is no indication of an error or
> > issues that could be occuring. Is there a way to find what is causing the
> > HEALTH_WARN?
>
> Possibly not without grabbing more than just the overall status at the
> same time as you're grabbing the OK/WARN status.
>
> Internally, the OK/WARN/ERROR health state is generated on-demand by
> applying a bunch of checks to the state of the system when the user
> runs the health command -- the system doesn't know it's in a warning
> state until it's asked.  Often you will see a corresponding log
> message, but not necessarily.
>
> John
>
> > Best,
> > Scott
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Random Health_warn

2017-02-23 Thread Scottix
Ya the ceph-mon.$ID.log

I was running ceph -w when one of them occurred too and it never output
anything.

Here is a snippet for the 5:11 AM occurrence.

On Thu, Feb 23, 2017 at 1:56 PM Robin H. Johnson <robb...@gentoo.org> wrote:

> On Thu, Feb 23, 2017 at 09:49:21PM +, Scottix wrote:
> > ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
> >
> > We are seeing a weird behavior or not sure how to diagnose what could be
> > going on. We started monitoring the overall_status from the json query
> and
> > every once in a while we would get a HEALTH_WARN for a minute or two.
> >
> > Monitoring logs.
> > 02/23/2017 07:25:54 AM HEALTH_OK
> > 02/23/2017 07:24:54 AM HEALTH_WARN
> > 02/23/2017 07:23:55 AM HEALTH_OK
> > 02/23/2017 07:22:54 AM HEALTH_OK
> > ...
> > 02/23/2017 05:13:55 AM HEALTH_OK
> > 02/23/2017 05:12:54 AM HEALTH_WARN
> > 02/23/2017 05:11:54 AM HEALTH_WARN
> > 02/23/2017 05:10:54 AM HEALTH_OK
> > 02/23/2017 05:09:54 AM HEALTH_OK
> >
> > When I check the mon leader logs there is no indication of an error or
> > issues that could be occuring. Is there a way to find what is causing the
> > HEALTH_WARN?
> By leader logs, do you mean the cluster log (mon_cluster_log_file), or
> the mon log (log_file)? Eg /var/log/ceph/ceph.log vs
> /var/log/ceph/ceph-mon.$ID.log.
>
> Could you post the log entries for a time period between two HEALTH_OK
> states with a HEALTH_WARN in the middle?
>
> The reason for WARN _should_ be included on the logged status line.
>
> Alternatively, you should be able to just log the output of 'ceph -w'
> for a while, and find the WARN status as well.
>
> --
> Robin Hugh Johnson
> Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
> E-Mail   : robb...@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
2017-02-23 05:10:54.139358 7f5c17894700  0 mon.CephMon200@0(leader) e7 handle_command mon_command({"prefix": "status", "format": "json"} v 0) v1
2017-02-23 05:10:54.139549 7f5c17894700  0 log_channel(audit) log [DBG] : from='client.? 10.10.1.30:0/1031767' entity='client.admin' cmd=[{"prefix": "status", "format": "json"}]: dispatch
2017-02-23 05:10:54.535319 7f5c1a25c700  0 log_channel(cluster) log [INF] : pgmap v77496604: 5120 pgs: 2 active+clean+scrubbing, 5111 active+clean, 7 active+clean+scrubbing+deep; 58071 GB data, 114 TB used, 113 TB / 227 TB avail; 16681 kB/s rd, 11886 kB/s wr, 705 op/s
2017-02-23 05:10:55.600104 7f5c1a25c700  0 log_channel(cluster) log [INF] : pgmap v77496605: 5120 pgs: 2 active+clean+scrubbing, 5111 active+clean, 7 active+clean+scrubbing+deep; 58071 GB data, 114 TB used, 113 TB / 227 TB avail; 14716 kB/s rd, 6627 kB/s wr, 408 op/s
2017-02-23 05:10:56.170435 7f5c17894700  0 mon.CephMon200@0(leader) e7 handle_command mon_command({"prefix": "status", "format": "json"} v 0) v1
2017-02-23 05:10:56.170502 7f5c17894700  0 log_channel(audit) log [DBG] : from='client.? 10.10.1.30:0/1031899' entity='client.admin' cmd=[{"prefix": "status", "format": "json"}]: dispatch
2017-02-23 05:10:56.642040 7f5c1a25c700  0 log_channel(cluster) log [INF] : pgmap v77496606: 5120 pgs: 2 active+clean+scrubbing, 5111 active+clean, 7 active+clean+scrubbing+deep; 58071 GB data, 114 TB used, 113 TB / 227 TB avail; 14617 kB/s rd, 6580 kB/s wr, 537 op/s
2017-02-23 05:10:57.667496 7f5c1a25c700  0 log_channel(cluster) log [INF] : pgmap v77496607: 5120 pgs: 2 active+clean+scrubbing, 5110 active+clean, 8 active+clean+scrubbing+deep; 58071 GB data, 114 TB used, 113 TB / 227 TB avail; 8862 kB/s rd, 7126 kB/s wr, 552 op/s
2017-02-23 05:10:58.736114 7f5c1a25c700  0 log_channel(cluster) log [INF] : pgmap v77496608: 5120 pgs: 2 active+clean+scrubbing, 5110 active+clean, 8 active+clean+scrubbing+deep; 58071 GB data, 114 TB used, 113 TB / 227 TB avail; 14126 kB/s rd, 11254 kB/s wr, 974 op/s
2017-02-23 05:10:59.451884 7f5c17894700  0 mon.CephMon200@0(leader) e7 handle_command mon_command({"prefix": "status", "format": "json"} v 0) v1
2017-02-23 05:10:59.451903 7f5c17894700  0 log_channel(audit) log [DBG] : from='client.? 10.10.1.30:0/1031932' entity='client.admin' cmd=[{"prefix": "status", "format": "json"}]: dispatch
2017-02-23 05:10:59.812909 7f5c1a25c700  0 log_channel(cluster) log [INF] : pgmap v77496609: 5120 pgs: 2 active+clean+scrubbing, 5110 active+clean, 8 active+clean+scrubbing+deep; 58071 

[ceph-users] Random Health_warn

2017-02-23 Thread Scottix
ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)

We are seeing some weird behavior and are not sure how to diagnose what could
be going on. We started monitoring the overall_status from the JSON query, and
every once in a while we get a HEALTH_WARN for a minute or two.

Monitoring logs.
02/23/2017 07:25:54 AM HEALTH_OK
02/23/2017 07:24:54 AM HEALTH_WARN
02/23/2017 07:23:55 AM HEALTH_OK
02/23/2017 07:22:54 AM HEALTH_OK
...
02/23/2017 05:13:55 AM HEALTH_OK
02/23/2017 05:12:54 AM HEALTH_WARN
02/23/2017 05:11:54 AM HEALTH_WARN
02/23/2017 05:10:54 AM HEALTH_OK
02/23/2017 05:09:54 AM HEALTH_OK

When I check the mon leader logs there is no indication of an error or
issue that could be occurring. Is there a way to find out what is causing
the HEALTH_WARN?
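
(For anyone else chasing the same thing: logging the detailed health output
on the same interval narrows it down, since the detail output names the
specific check behind a transient HEALTH_WARN. A minimal /etc/cron.d entry,
with the log path only as an example:

* * * * * root ceph health detail >> /var/log/ceph/health-history.log 2>&1
)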

Best,
Scott
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Failing to Activate new OSD ceph-deploy

2017-01-10 Thread Scottix
My guess is that ceph-deploy doesn't know how to handle that setting. I
just remove it on the host machine to add the disk, then put it back so the
other OSDs will boot as root.

--Scottie

On Tue, Jan 10, 2017 at 11:02 AM David Turner <david.tur...@storagecraft.com>
wrote:

> Removing the setuser_match_path resolved this.  This seems like an
> oversight: a setting that allows people to run osds as root ends up
> preventing them from adding storage.
>
> --
>
> <https://storagecraft.com> David Turner | Cloud Operations Engineer | 
> StorageCraft
> Technology Corporation <https://storagecraft.com>
> 380 Data Drive Suite 300 | Draper | Utah | 84020
> Office: 801.871.2760 | Mobile: 385.224.2943
>
> --
>
> If you are not the intended recipient of this message or received it
> erroneously, please notify the sender and delete it, together with any
> attachments, and be advised that any dissemination or copying of this
> message is prohibited.
> --
>
> --
> *From:* Scottix [scot...@gmail.com]
> *Sent:* Tuesday, January 10, 2017 11:52 AM
> *To:* David Turner; ceph-users
>
> *Subject:* Re: [ceph-users] Failing to Activate new OSD ceph-deploy
> I think I got it to work by removing setuser_match_path =
> /var/lib/ceph/$type/$cluster-$id from the host machine.
> I think I also did a reboot; it was a while ago, so I don't remember exactly.
> Then I ran ceph-deploy activate.
>
> --Scott
>
> On Tue, Jan 10, 2017 at 10:16 AM David Turner <
> david.tur...@storagecraft.com> wrote:
>
> Did you ever figure out why the /var/lib/ceph/osd/ceph-22 folder was not
> being created automatically?  We are having this issue while testing adding
> storage to a ceph cluster upgraded to jewel.  Like you, manually creating
> the directory and setting its permissions allows us to activate the osd,
> and it comes up and in without issue.
>
> --
>
> <https://storagecraft.com> David Turner | Cloud Operations Engineer | 
> StorageCraft
> Technology Corporation <https://storagecraft.com>
> 380 Data Drive Suite 300 | Draper | Utah | 84020
> Office: 801.871.2760 | Mobile: 385.224.2943
>
> --
>
> If you are not the intended recipient of this message or received it
> erroneously, please notify the sender and delete it, together with any
> attachments, and be advised that any dissemination or copying of this
> message is prohibited.
>
> --
>
> --
> *From:* ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of
> Scottix [scot...@gmail.com]
> *Sent:* Thursday, July 07, 2016 5:01 PM
> *To:* ceph-users
> *Subject:* Re: [ceph-users] Failing to Activate new OSD ceph-deploy
>
> I played with it enough to make it work.
>
> Basically I created the directory it was going to put the data in:
> mkdir /var/lib/ceph/osd/ceph-22
>
> Then I ran ceph-deploy activate, which did a little bit more toward putting
> it in the cluster, but it still didn't start because of permissions on the
> journal.
>
> Some of the permissions were set to ceph:ceph. I tried the new permissions
> but it failed to start, and after reading a mailing list post, a reboot may
> have fixed that.
> Anyway, I ran chown -R root:root ceph-22 and after that it started.
>
> I still need to fix permissions, but I am happy I got it in at least.
>
> --Scott
>
>
>
> On Thu, Jul 7, 2016 at 2:54 PM Scottix <scot...@gmail.com> wrote:
>
> Hey,
> This is the first time I have had a problem with ceph-deploy
>
> I have attached the log but I can't seem to activate the osd.
>
> I am running
> ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)
>
> I did upgrade from Infernalis->Jewel
> I haven't changed ceph ownership but I do have the config option
> setuser_match_path = /var/lib/ceph/$type/$cluster-$id
>
> Any help would be appreciated,
> Scott
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Failing to Activate new OSD ceph-deploy

2017-01-10 Thread Scottix
I think I got it to work by removing setuser_match_path =
/var/lib/ceph/$type/$cluster-$id from the host machine.
I think I also did a reboot; it was a while ago, so I don't remember exactly.
Then I ran ceph-deploy activate.
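
(Roughly, for anyone who needs it, and with host/device names as
placeholders: comment the setuser_match_path line out of /etc/ceph/ceph.conf
on the OSD host, run something like

$ ceph-deploy --overwrite-conf osd activate <host>:/dev/sdX1:/dev/sdY4

and then put the setuser_match_path line back so the existing root-owned
OSDs keep starting.)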

--Scott

On Tue, Jan 10, 2017 at 10:16 AM David Turner <david.tur...@storagecraft.com>
wrote:

Did you ever figure out why the /var/lib/ceph/osd/ceph-22 folder was not
being created automatically?  We are having this issue while testing adding
storage to a ceph cluster upgraded to jewel.  Like you, manually creating
the directory and setting its permissions allows us to activate the osd,
and it comes up and in without issue.

--

<https://storagecraft.com> David Turner | Cloud Operations Engineer |
StorageCraft
Technology Corporation <https://storagecraft.com>
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2760 | Mobile: 385.224.2943

--

If you are not the intended recipient of this message or received it
erroneously, please notify the sender and delete it, together with any
attachments, and be advised that any dissemination or copying of this
message is prohibited.

--

--
*From:* ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Scottix
[scot...@gmail.com]
*Sent:* Thursday, July 07, 2016 5:01 PM
*To:* ceph-users
*Subject:* Re: [ceph-users] Failing to Activate new OSD ceph-deploy

I played with it enough to make it work.

Basically I created the directory it was going to put the data in:
mkdir /var/lib/ceph/osd/ceph-22

Then I ran ceph-deploy activate, which did a little bit more toward putting
it in the cluster, but it still didn't start because of permissions on the
journal.

Some of the permissions were set to ceph:ceph. I tried the new permissions
but it failed to start, and after reading a mailing list post, a reboot may
have fixed that.
Anyway, I ran chown -R root:root ceph-22 and after that it started.

I still need to fix permissions, but I am happy I got it in at least.

--Scott



On Thu, Jul 7, 2016 at 2:54 PM Scottix <scot...@gmail.com> wrote:

Hey,
This is the first time I have had a problem with ceph-deploy

I have attached the log but I can't seem to activate the osd.

I am running
ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)

I did upgrade from Infernalis->Jewel
I haven't changed ceph ownership but I do have the config option
setuser_match_path = /var/lib/ceph/$type/$cluster-$id

Any help would be appreciated,
Scott
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Feedback wanted: health warning when standby MDS dies?

2016-10-19 Thread Scottix
I would take the analogy of a RAID scenario. Basically a standby is
considered like a spare drive. If that spare drive goes down, it is good to
know about the event, but it in no way indicates a degraded system;
everything keeps running at top speed.

If you had multi-active MDS and one went down, then I would say that is a
degraded system, but we are still waiting for that feature.
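
(In the meantime the lost-standby case can be caught externally just by
counting standbys in the mds map, e.g. alerting when the "up:standby" count
in the output of

$ ceph mds stat

drops below what you expect.)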

On Tue, Oct 18, 2016 at 10:18 AM Goncalo Borges <
goncalo.bor...@sydney.edu.au> wrote:

> Hi John.
>
> That would be good.
>
> In our case we are just picking that up simply through nagios and some
> fancy scripts parsing the dump of the MDS maps.
>
> Cheers
> Goncalo
> 
> From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of John
> Spray [jsp...@redhat.com]
> Sent: 18 October 2016 22:46
> To: ceph-users
> Subject: [ceph-users] Feedback wanted: health warning when standby MDS
> dies?
>
> Hi all,
>
> Someone asked me today how to get a list of down MDS daemons, and I
> explained that currently the MDS simply forgets about any standby that
> stops sending beacons.  That got me thinking about the case where a
> standby dies while the active MDS remains up -- the cluster has gone
> into a non-highly-available state, but we are not giving the admin any
> indication.
>
> I've suggested a solution here:
> http://tracker.ceph.com/issues/17604
>
> This is probably going to be a bit of a subjective thing in terms of
> whether people find it useful or find it to be annoying noise, so I'd
> be interested in feedback from people currently running cephfs.
>
> Cheers,
> John
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] 10.2.3 release announcement?

2016-09-26 Thread Scottix
Agreed, there was no announcement like there usually is; what is going on?
Hopefully there is an explanation. :|

On Mon, Sep 26, 2016 at 6:01 AM Henrik Korkuc  wrote:

> Hey,
>
> 10.2.3 has been tagged in the jewel branch for more than 5 days already, but
> there has been no announcement for it yet. Is there any reason for that?
> Packages seem to be present too
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Failing to Activate new OSD ceph-deploy

2016-07-07 Thread Scottix
I played with it enough to make it work.

Basically I created the directory it was going to put the data in:
mkdir /var/lib/ceph/osd/ceph-22

Then I ran ceph-deploy activate, which did a little bit more toward putting
it in the cluster, but it still didn't start because of permissions on the
journal.

Some of the permissions were set to ceph:ceph. I tried the new permissions
but it failed to start, and after reading a mailing list post, a reboot may
have fixed that.
Anyway, I ran chown -R root:root ceph-22 and after that it started.

I still need to fix permissions, but I am happy I got it in at least.
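
(For anyone hitting the same thing, the workaround boiled down to roughly
the following; the osd id, host and device names are the ones from this
thread and will differ on other setups:

$ sudo mkdir /var/lib/ceph/osd/ceph-22
$ ceph-deploy --overwrite-conf osd activate tCeph203:/dev/sdl1:/dev/sdc4
$ sudo chown -R root:root /var/lib/ceph/osd/ceph-22
$ sudo start ceph-osd id=22      # upstart on Ubuntu 14.04 / trusty
)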

--Scott



On Thu, Jul 7, 2016 at 2:54 PM Scottix <scot...@gmail.com> wrote:

> Hey,
> This is the first time I have had a problem with ceph-deploy
>
> I have attached the log but I can't seem to activate the osd.
>
> I am running
> ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)
>
> I did upgrade from Infernalis->Jewel
> I haven't changed ceph ownership but I do have the config option
> setuser_match_path = /var/lib/ceph/$type/$cluster-$id
>
> Any help would be appreciated,
> Scott
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Failing to Activate new OSD ceph-deploy

2016-07-07 Thread Scottix
Hey,
This is the first time I have had a problem with ceph-deploy

I have attached the log but I can't seem to activate the osd.

I am running
ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)

I did upgrade from Infernalis->Jewel
I haven't changed ceph ownership but I do have the config option
setuser_match_path = /var/lib/ceph/$type/$cluster-$id

Any help would be appreciated,
Scott
Stat200:~/t-cluster$ ceph-deploy --overwrite-conf osd create tCeph203:/dev/sdl:/dev/sdc4
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/t/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.34): /usr/bin/ceph-deploy --overwrite-conf osd create tCeph203:/dev/sdl:/dev/sdc4
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username  : None
[ceph_deploy.cli][INFO  ]  disk  : [('tCeph203', '/dev/sdl', '/dev/sdc4')]
[ceph_deploy.cli][INFO  ]  dmcrypt   : False
[ceph_deploy.cli][INFO  ]  verbose   : False
[ceph_deploy.cli][INFO  ]  bluestore : None
[ceph_deploy.cli][INFO  ]  overwrite_conf: True
[ceph_deploy.cli][INFO  ]  subcommand: create
[ceph_deploy.cli][INFO  ]  dmcrypt_key_dir   : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO  ]  quiet : False
[ceph_deploy.cli][INFO  ]  cd_conf   : 
[ceph_deploy.cli][INFO  ]  cluster   : ceph
[ceph_deploy.cli][INFO  ]  fs_type   : xfs
[ceph_deploy.cli][INFO  ]  func  : 
[ceph_deploy.cli][INFO  ]  ceph_conf : None
[ceph_deploy.cli][INFO  ]  default_release   : False
[ceph_deploy.cli][INFO  ]  zap_disk  : False
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks tCeph203:/dev/sdl:/dev/sdc4
t@tceph203's password: 
[tCeph203][DEBUG ] connection detected need for sudo
t@tceph203's password: 
[tCeph203][DEBUG ] connected to host: tCeph203 
[tCeph203][DEBUG ] detect platform information from remote host
[tCeph203][DEBUG ] detect machine type
[tCeph203][DEBUG ] find the location of an executable
[tCeph203][INFO  ] Running command: sudo /sbin/initctl version
[tCeph203][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: Ubuntu 14.04 trusty
[ceph_deploy.osd][DEBUG ] Deploying osd to tCeph203
[tCeph203][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.osd][DEBUG ] Preparing host tCeph203 disk /dev/sdl journal /dev/sdc4 activate True
[tCeph203][DEBUG ] find the location of an executable
[tCeph203][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /dev/sdl /dev/sdc4
[tCeph203][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[tCeph203][WARNIN] command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster ceph
[tCeph203][WARNIN] command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
[tCeph203][WARNIN] command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
[tCeph203][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdl uuid path is /sys/dev/block/8:176/dm/uuid
[tCeph203][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
[tCeph203][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdl uuid path is /sys/dev/block/8:176/dm/uuid
[tCeph203][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdl uuid path is /sys/dev/block/8:176/dm/uuid
[tCeph203][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdl uuid path is /sys/dev/block/8:176/dm/uuid
[tCeph203][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
[tCeph203][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
[tCeph203][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[tCeph203][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
[tCeph203][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdc4 uuid path is /sys/dev/block/8:36/dm/uuid
[tCeph203][WARNIN] prepare_device: Journal /dev/sdc4 is a partition
[tCeph203][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdc4 uuid path is /sys/dev/block/8:36/dm/uuid
[tCeph203][WARNIN] prepare_device: OSD will not be hot-swappable if journal is not the same device as the osd data
[tCeph203][WARNIN] command: Running command: /sbin/blkid -o udev -p /dev/sdc4
[tCeph203][WARNIN] prepare_device: Journal /dev/sdc4 was not prepared with ceph-disk. Symlinking directly.
[tCeph203][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdl uuid path is /sys/dev/block/8:176/dm/uuid
[tCeph203][WARNIN] set_data_partition: Creating osd partition on /dev/sdl
[tCeph203][WARNIN] get_dm_uuid: get_dm_uuid /dev/sdl uuid path is 

Re: [ceph-users] Required maintenance for upgraded CephFS filesystems

2016-06-03 Thread Scottix
Great thanks.

--Scott

On Fri, Jun 3, 2016 at 8:59 AM John Spray <jsp...@redhat.com> wrote:

> On Fri, Jun 3, 2016 at 4:49 PM, Scottix <scot...@gmail.com> wrote:
> > Is there any way to check what it is currently using?
>
> Since Firefly, the MDS rewrites TMAPs to OMAPs whenever a directory is
> updated, so a pre-firefly filesystem might already be all OMAPs, or
> might still have some TMAPs -- there's no way to know without scanning
> the whole system.
>
> If you think you might have used a pre-firefly version to create
> the filesystem, then run the tool: if there aren't any TMAPs in the
> system it'll be a no-op.
>
> John
>
> > Best,
> > Scott
> >
> > On Fri, Jun 3, 2016 at 4:26 AM John Spray <jsp...@redhat.com> wrote:
> >>
> >> Hi,
> >>
> >> If you do not have a CephFS filesystem that was created with a Ceph
> >> version older than Firefly, then you can ignore this message.
> >>
> >> If you have such a filesystem, you need to run a special command at
> >> some point while you are using Jewel, but before upgrading to future
> >> versions.  Please see the documentation here:
> >> http://docs.ceph.com/docs/jewel/cephfs/upgrading/
> >>
> >> In Kraken, we are removing all the code that handled legacy TMAP
> >> objects, so this is something you need to take care of during the
> >> Jewel lifetime.
> >>
> >> Thanks,
> >> John
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS in the wild

2016-06-02 Thread Scottix
I have three comments on our CephFS deployment. Some background first: we
have been using CephFS since Giant with some not-so-important data. We are
using it more heavily now on Infernalis. We have our own raw data storage
using POSIX semantics and keep everything as basic as possible: basically
open, read, and write.

1st: if you have a lot of files or directories in a folder, the lookup can
get slow; I would say when you get to about 5000 items you can feel the
latency. Traditionally this has never been ultra fast on regular file
systems either, but just be aware.
2nd: we do see an increase in the parallelization of reads and writes
compared to a traditional spinning RAID file system. I think this is a
testament to Ceph.
3rd: when we upgrade an MDS, we basically have to stop all activity on
cephfs to restart the MDS. Replaying the backlog at startup, if it is
large, can eat a lot of memory, and you hope you don't hit swap. This does
create some downtime for us, but it usually isn't long.
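
(If it helps anyone, the replay progress can be followed by watching the
MDS state while the daemon restarts, e.g.:

$ watch -n 2 ceph mds stat

and waiting for the state to go from up:replay through up:reconnect and
up:rejoin back to up:active before letting clients loose again.)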

I am hoping for more improvements in MDS like HA and various other things
to make it even better.

On Thu, Jun 2, 2016 at 9:11 AM Brady Deetz  wrote:

> On Wed, Jun 1, 2016 at 8:18 PM, Christian Balzer  wrote:
>
>>
>> Hello,
>>
>> On Wed, 1 Jun 2016 15:50:19 -0500 Brady Deetz wrote:
>>
>> > Question:
>> > I'm curious if there is anybody else out there running CephFS at the
>> > scale I'm planning for. I'd like to know some of the issues you didn't
>> > expect that I should be looking out for. I'd also like to simply see
>> > when CephFS hasn't worked out and why. Basically, give me your war
>> > stories.
>> >
>> Not me, but diligently search the archives, there are people with large
>> CephFS deployments (despite the non-production status when they did them).
>> Also look at the current horror story thread about what happens when you
>> have huge directories.
>>
>> >
>> > Problem Details:
>> > Now that I'm out of my design phase and finished testing on VMs, I'm
>> > ready to drop $100k on a pilo. I'd like to get some sense of confidence
>> > from the community that this is going to work before I pull the trigger.
>> >
>> > I'm planning to replace my 110 disk 300TB (usable) Oracle ZFS 7320 with
>> > CephFS by this time next year (hopefully by December). My workload is a
>> > mix of small and vary large files (100GB+ in size). We do fMRI analysis
>> > on DICOM image sets as well as other physio data collected from
>> > subjects. We also have plenty of spreadsheets, scripts, etc. Currently
>> > 90% of our analysis is I/O bound and generally sequential.
>> >
>> There are other people here doing similar things (medical institutes,
>> universities), again search the archives and maybe contact them directly.
>>
>> > In deploying Ceph, I am hoping to see more throughput than the 7320 can
>> > currently provide. I'm also looking to get away from traditional
>> > file-systems that require forklift upgrades. That's where Ceph really
>> > shines for us.
>> >
>> > I don't have a total file count, but I do know that we have about 500k
>> > directories.
>> >
>> >
>> > Planned Architecture:
>> >
>> Well, we talked about this 2 months ago, you seem to have changed only a
>> few things.
>> So lets dissect this again...
>>
>> > Storage Interconnect:
>> > Brocade VDX 6940 (40 gig)
>> >
>> Is this a flat (single) network for all the storage nodes?
>> And then from these 40Gb/s switches links to the access switches?
>>
>
> This will start as a single 40Gb/s switch with a single link to each node
> (upgraded in the future to dual-switch + dual-link). The 40Gb/s switch will
> also be connected to several 10Gb/s and 1Gb/s access switches with dual
> 40Gb/s uplinks.
>
> We do intend to segment the public and private networks using VLANs
> untagged at the node. There are obviously many subnets on our network. The
> 40Gb/s switch will handle routing for those networks.
>
> You can see list discussion in "Public and Private network over 1
> interface" May 23,2016 regarding some of this.
>
>
>>
>> > Access Switches for clients (servers):
>> > Brocade VDX 6740 (10 gig)
>> >
>> > Access Switches for clients (workstations):
>> > Brocade ICX 7450
>> >
>> > 3x MON:
>> > 128GB RAM
>> > 2x 200GB SSD for OS
>> > 2x 400GB P3700 for LevelDB
>> > 2x E5-2660v4
>> > 1x Dual Port 40Gb Ethernet
>> >
>> Total overkill in the CPU core arena, fewer but faster cores would be more
>> suited for this task.
>> A 6-8 core, 2.8-3GHz base speed would be nice, alas Intel has nothing like
>> that, the closest one would be the E5-2643v4.
>>
>> Same for RAM, MON processes are pretty frugal.
>>
>> No need for NVMes for the leveldb, use 2 400GB DC S3710 for OS (and thus
>> the leveldb) and that's being overly generous in the speed/IOPS
>> department.
>>
>> Note also that 40Gb/s isn't really needed here, alas latency and KISS do
>> speak in favor of it, especially if you can afford it.
>>
>
> Noted
>
>
>>
>> > 2x MDS:
>> > 

Re: [ceph-users] Weighted Priority Queue testing

2016-05-12 Thread Scottix
We have run into this same scenario, with the long tail taking much
longer to recover than the initial portion.

It happens either when we are adding OSDs or when an OSD gets taken down. At
first we have max-backfills set to 1 so recovery doesn't kill the cluster
with io. As time passes, a single osd ends up performing the backfill, so we
gradually increase max-backfills up to 10 to reduce the amount of time it
needs to recover fully. I know there are a few other factors at play here,
but we tend to do this procedure every time.
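
(Concretely the ramp-up is just repeated injectargs calls; the step sizes
and timing are whatever the cluster tolerates, e.g.:

$ ceph tell osd.* injectargs '--osd-max-backfills 1'
# ... later, once client io looks healthy ...
$ ceph tell osd.* injectargs '--osd-max-backfills 4'
$ ceph tell osd.* injectargs '--osd-max-backfills 10'
)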

On Wed, May 11, 2016 at 6:29 PM Christian Balzer  wrote:

> On Wed, 11 May 2016 16:10:06 + Somnath Roy wrote:
>
> > I bumped up the backfill/recovery settings to match up Hammer. It is
> > probably unlikely that long tail latency is a parallelism issue. If so,
> > entire recovery would be suffering not the tail alone. It's probably a
> > prioritization issue. Will start looking and update my findings. I can't
> > add devl because of the table but needed to add community that's why
> > ceph-users :-).. Also, wanted to know from Ceph's user if they are also
> > facing similar issues..
> >
>
> What I meant with lack of parallelism is that at the start of a rebuild,
> there are likely to be many candidate PGs for recovery and backfilling, so
> many things happen at the same time, up to the limits of what is
> configured (max backfill etc).
>
> From looking at my test cluster, it starts with 8-10 backfills and
> recoveries (out of 140 affected PGs), but later on in the game there are
> less and less PGs (and OSDs/nodes) to choose from, so things slow down
> around 60 PGs to just 3-4 backfills.
> And around 20 PGs it's down to 1-2 backfills, so the parallelism is
> clearly gone at that point and recovery speed is down to what a single
> PG/OSD can handle.
>
> Christian
>
> > Thanks & Regards
> > Somnath
> >
> > -Original Message-
> > From: Christian Balzer [mailto:ch...@gol.com]
> > Sent: Wednesday, May 11, 2016 12:31 AM
> > To: Somnath Roy
> > Cc: Mark Nelson; Nick Fisk; ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] Weighted Priority Queue testing
> >
> >
> >
> > Hello,
> >
> > not sure if the Cc: to the users ML was intentional or not, but either
> > way.
> >
> > The issue seen in the tracker:
> > http://tracker.ceph.com/issues/15763
> > and what you have seen (and I as well) feels a lot like the lack of
> > parallelism towards the end of rebuilds.
> >
> > This becomes even more obvious when backfills and recovery settings are
> > lowered.
> >
> > Regards,
> >
> > Christian
> > --
> > Christian BalzerNetwork/Systems Engineer
> > ch...@gol.com   Global OnLine Japan/Rakuten Communications
> > http://www.gol.com/
> > PLEASE NOTE: The information contained in this electronic mail message
> > is intended only for the use of the designated recipient(s) named above.
> > If the reader of this message is not the intended recipient, you are
> > hereby notified that you have received this message in error and that
> > any review, dissemination, distribution, or copying of this message is
> > strictly prohibited. If you have received this communication in error,
> > please notify the sender by telephone or e-mail (as shown above)
> > immediately and destroy any and all copies of this message in your
> > possession (whether hard copies or electronically stored copies).
> >
>
>
> --
> Christian BalzerNetwork/Systems Engineer
> ch...@gol.com   Global OnLine Japan/Rakuten Communications
> http://www.gol.com/
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs rm -rf on directory of 160TB /40M files

2016-04-06 Thread Scottix
I have been running some speed tests of POSIX file operations and I noticed
that even just listing files can take a while compared to an attached HDD. I
am wondering whether there is a reason it takes so long to even just list
files.

Here is the test I ran

time for i in {1..10}; do touch $i; done

Internal HDD:
real 4m37.492s
user 0m18.125s
sys 1m5.040s

Ceph Dir
real 12m30.059s
user 0m16.749s
sys 0m53.451s

~300% faster on HDD

*I am actually OK with this, but it would be nice if it were quicker.

When I am listing the directory it is taking a lot longer compared to an
attached HDD

time ls -1

Internal HDD
real 0m2.112s
user 0m0.560s
sys 0m0.440s

Ceph Dir
real 3m35.982s
user 0m2.788s
sys 0m4.580s

~1000% faster on HDD

*I understand some of that is display time, so what really makes it odd is
the following test.

time ls -1 > /dev/null

Internal HDD
real 0m0.367s
user 0m0.324s
sys 0m0.040s

Ceph Dir
real 0m2.807s
user 0m0.128s
sys 0m0.052s

~700% faster on HDD

My guess is that the performance issue is with the batched requests, as you
stated. So I am wondering whether the deletion of the 40M files is slow not
just because of the deletes themselves, but because even traversing that
many files takes a while.

I am running this on 0.94.6 with Ceph Fuse Client
And config
fuse multithreaded = false

Since multithreaded crashes in hammer.

It would be interesting to see the performance on newer versions.

Any thoughts or comments would be good.

On Tue, Apr 5, 2016 at 9:22 AM Gregory Farnum  wrote:

> On Mon, Apr 4, 2016 at 9:55 AM, Gregory Farnum  wrote:
> > Deletes are just slow right now. You can look at the ops in flight on your
> > client or MDS admin socket to see how far along it is and watch them to
> see
> > how long stuff is taking -- I think it's a sync disk commit for each
> unlink
> > though so at 40M it's going to be a good looong while. :/
> > -Greg
>
> Oh good, I misremembered — it's a synchronous request to the MDS, but
> it's not a synchronous disk commit. They get batched up normally in
> the metadata log. :)
> Still, a sync MDS request can take a little bit of time. Someday we
> will make the client able to respond to these more quickly locally and
> batch up MDS requests or something, but it'll be tricky. Faster file
> creates will probably come first. (If we're lucky they can use some of
> the same client-side machinery.)
> -Greg
>
> >
> >
> > On Monday, April 4, 2016, Kenneth Waegeman 
> > wrote:
> >>
> >> Hi all,
> >>
> >> I want to remove a large directory containing +- 40M files /160TB of
> data
> >> in CephFS by running rm -rf on the directory via the ceph kernel client.
> >> After 7h , the rm command is still running. I checked the rados df
> output,
> >> and saw that only about  2TB and 2M files are gone.
> >> I know this output of rados df can be confusing because ceph should
> delete
> >> objects asyncroniously, but then I don't know why the rm command still
> >> hangs.
> >> Is there some way to speed this up? And is there a way to check how far
> >> the marked for deletion has progressed ?
> >>
> >> Thank you very much!
> >>
> >> Kenneth
> >>
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Old MDS resurrected after update

2016-02-24 Thread Scottix
Thanks for the responses John.

--Scott

On Wed, Feb 24, 2016 at 3:07 AM John Spray <jsp...@redhat.com> wrote:

> On Tue, Feb 23, 2016 at 5:36 PM, Scottix <scot...@gmail.com> wrote:
> > I had a weird thing happen when I was testing an upgrade in a dev
> > environment where I have removed an MDS from a machine a while back.
> >
> > I upgraded to 0.94.6 and, lo and behold, the mds daemon started up on the
> > machine again. I know the /var/lib/ceph/mds folder was removed because I
> > renamed it /var/lib/ceph/mds-removed and I definitely have restarted this
> > machine several times with mds not starting before.
> >
> > Only thing I noticed was the auth keys were still in play. I am assuming
> the
> > upgrade recreated the folder and found it still had access so it started
> > back up.
>
> I don't see how the upgrade would have recreated auth keys for the
> MDS, unless you mean some external upgrade script rather than just the
> packages on the machine?
>
> > I am guessing we have to add one more step in the removal mds from this
> post
> >
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-January/045649.html
> >
> >  1 Stop the old MDS
> >  2 Run "ceph mds fail 0"
> >  3 Run "ceph auth del mds."
>
> Yes, good point, folks should also do the "auth del" part.
>
> > I am a little wary of command 2 since there is no clear depiction of
> what 0
> > is. Is this command better since it is more clear "ceph mds rm 0
> mds."
>
> '0' in this context means rank 0, i.e. whichever active daemon holds
> that rank at the time.  If you have more than one daemon, then you may
> not need to do this; if the daemon you're removing is not currently
> active (i.e. not holding a rank) then you don't actually need to do
> this.
>
> Cheers,
> John
>
> > Is there anything else that could possibly resurrect it?
>
> Nope, not that I can think of.  I actually don't understand how it got
> resurrected in this instance because removing its
> /var/lib/ceph/mds/... directory should have destroyed its auth keys.
>
> John
>
> > Best,
> > Scott
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Old MDS resurrected after update

2016-02-23 Thread Scottix
I had a weird thing happen when I was testing an upgrade in a dev
environment where I had removed an MDS from a machine a while back.

I upgraded to 0.94.6 and, lo and behold, the mds daemon started up on the
machine again. I know the /var/lib/ceph/mds folder was removed because I
renamed it /var/lib/ceph/mds-removed, and I have definitely restarted this
machine several times before without the mds starting.

The only thing I noticed was that the auth keys were still in play. I am
assuming the upgrade recreated the folder and found it still had access, so
it started back up.

I am guessing we have to add one more step to the MDS removal procedure
from this post:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-January/045649.html

 1 Stop the old MDS
 2 Run "ceph mds fail 0"
 3 Run "ceph auth del mds."
I am a little wary of command 2 since there is no clear depiction of what
0 is. Is this command better, since it is more explicit: "ceph mds rm 0
mds."

Is there anything else that could possibly resurrect it?

Best,
Scott
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd become unusable, blocked by xfsaild (?) and load > 5000

2016-02-17 Thread Scottix
It looks like the kernel bug with ceph and XFS was fixed. I haven't tested
it yet, but just wanted to give an update.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1527062

On Tue, Dec 8, 2015 at 8:05 AM Scottix <scot...@gmail.com> wrote:

> I can confirm it seems to be kernels greater than 3.16, we had this
> problem where servers would lock up and had to perform restarts on a weekly
> basis.
> We downgraded to 3.16, since then we have not had to do any restarts.
>
> I did find this thread in the XFS forums and I am not sure if it has been
> fixed or not
> http://oss.sgi.com/archives/xfs/2015-07/msg00034.html
>
>
> On Tue, Dec 8, 2015 at 2:06 AM Tom Christensen <pav...@gmail.com> wrote:
>
>> We run deep scrubs via cron with a script so we know when deep scrubs are
>> happening, and we've seen nodes fail both during deep scrubbing and while
>> no deep scrubs are occurring so I'm pretty sure its not related.
>>
>>
>> On Tue, Dec 8, 2015 at 2:42 AM, Benedikt Fraunhofer <
>> fraunho...@traced.net> wrote:
>>
>>> Hi Tom,
>>>
>>> 2015-12-08 10:34 GMT+01:00 Tom Christensen <pav...@gmail.com>:
>>>
>>> > We didn't go forward to 4.2 as its a large production cluster, and we
>>> just
>>> > needed the problem fixed.  We'll probably test out 4.2 in the next
>>> couple
>>>
>>> unfortunately we don't have the luxury of a test cluster.
>>> and to add to that, we couldnt simulate the load, altough it does not
>>> seem to be load related.
>>> Did you try running with nodeep-scrub as a short-term workaround?
>>>
>>> I'll give ~30% of the nodes 4.2 and see how it goes.
>>>
>>> > In our experience it takes about 2 weeks to start happening
>>>
>>> we're well below that. Somewhat between 1 and 4 days.
>>> And yes, once one goes south, it affects the rest of the cluster.
>>>
>>> Thx!
>>>
>>>  Benedikt
>>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd become unusable, blocked by xfsaild (?) and load > 5000

2015-12-08 Thread Scottix
I can confirm it seems to be kernels greater than 3.16; we had this problem
where servers would lock up, and we had to perform restarts on a weekly
basis. We downgraded to 3.16, and since then we have not had to do any
restarts.

I did find this thread on the XFS mailing list, and I am not sure if it has
been fixed or not:
http://oss.sgi.com/archives/xfs/2015-07/msg00034.html


On Tue, Dec 8, 2015 at 2:06 AM Tom Christensen  wrote:

> We run deep scrubs via cron with a script so we know when deep scrubs are
> happening, and we've seen nodes fail both during deep scrubbing and while
> no deep scrubs are occurring so I'm pretty sure its not related.
>
>
> On Tue, Dec 8, 2015 at 2:42 AM, Benedikt Fraunhofer  > wrote:
>
>> Hi Tom,
>>
>> 2015-12-08 10:34 GMT+01:00 Tom Christensen :
>>
>> > We didn't go forward to 4.2 as its a large production cluster, and we
>> just
>> > needed the problem fixed.  We'll probably test out 4.2 in the next
>> couple
>>
>> unfortunately we don't have the luxury of a test cluster.
>> and to add to that, we couldnt simulate the load, altough it does not
>> seem to be load related.
>> Did you try running with nodeep-scrub as a short-term workaround?
>>
>> I'll give ~30% of the nodes 4.2 and see how it goes.
>>
>> > In our experience it takes about 2 weeks to start happening
>>
>> we're well below that. Somewhat between 1 and 4 days.
>> And yes, once one goes south, it affects the rest of the cluster.
>>
>> Thx!
>>
>>  Benedikt
>>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS Attributes Question Marks

2015-09-30 Thread Scottix
OpenSuse 12.1

3.1.10-1.29-desktop

On Wed, Sep 30, 2015, 5:34 AM Yan, Zheng <uker...@gmail.com> wrote:

> On Tue, Sep 29, 2015 at 9:51 PM, Scottix <scot...@gmail.com> wrote:
>
>> I'm positive the client I sent you the log is 94. We do have one client
>> still on 87.
>>
> which version of kernel are you using? I found a kernel bug which can
> cause this issue in 4.1 and later kernels.
>
> Regards
> Yan, Zheng
>
>
>
>>
>> On Tue, Sep 29, 2015, 6:42 AM John Spray <jsp...@redhat.com> wrote:
>>
>>>
>>> Hmm, so apparently a similar bug was fixed in 0.87: Scott, can you
>>> confirm that your *clients* were 0.94 (not just the servers)?
>>>
>>> Thanks,
>>> John
>>>
>>> On Tue, Sep 29, 2015 at 11:56 AM, John Spray <jsp...@redhat.com> wrote:
>>>
>>>> Ah, this is a nice clear log!
>>>>
>>>> I've described the bug here:
>>>> http://tracker.ceph.com/issues/13271
>>>>
>>>> In the short term, you may be able to mitigate this by increasing
>>>> client_cache_size (on the client) if your RAM allows it.
>>>>
>>>> John
>>>>
>>>> On Tue, Sep 29, 2015 at 12:58 AM, Scottix <scot...@gmail.com> wrote:
>>>>
>>>>> I know this is an old one but I got a log in ceph-fuse for it.
>>>>> I got this on OpenSuse 12.1
>>>>> 3.1.10-1.29-desktop
>>>>>
>>>>> Using ceph-fuse
>>>>> ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
>>>>>
>>>>> I am running an rsync in the background and then doing a simple ls -la
>>>>> so the log is large.
>>>>>
>>>>> I am guessing this is the problem. The file is there and if I list the
>>>>> directory again it shows up properly.
>>>>>
>>>>> 2015-09-28 16:34:21.548631 7f372effd700  3 client.28239198 ll_lookup
>>>>> 0x7f370d1b1c50 data.2015-08-23_00-00-00.csv.bz2
>>>>> 2015-09-28 16:34:21.548635 7f372effd700 10 client.28239198 _lookup
>>>>> concluded ENOENT locally for 19d72a1.head(ref=4 ll_ref=5 cap_refs={}
>>>>> open={} mode=42775 size=0/0 mtime=2015-09-28 05:57:57.259306
>>>>> caps=pAsLsXsFs(0=pAsLsXsFs) COMPLETE parents=0x7f3732ff97c0 
>>>>> 0x7f370d1b1c50)
>>>>> dn 'data.2015-08-23_00-00-00.csv.bz2'
>>>>>
>>>>>
>>>>> [image: Selection_034.png]
>>>>>
>>>>> It seems to show up more if multiple things are access the ceph mount,
>>>>> just my observations.
>>>>>
>>>>> Best,
>>>>> Scott
>>>>>
>>>>> On Tue, Mar 3, 2015 at 3:05 PM Scottix <scot...@gmail.com> wrote:
>>>>>
>>>>>> Ya we are not at 0.87.1 yet, possibly tomorrow. I'll let you know if
>>>>>> it still reports the same.
>>>>>>
>>>>>> Thanks John,
>>>>>> --Scottie
>>>>>>
>>>>>>
>>>>>> On Tue, Mar 3, 2015 at 2:57 PM John Spray <john.sp...@redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On 03/03/2015 22:35, Scottix wrote:
>>>>>>> > I was testing a little bit more and decided to run the
>>>>>>> cephfs-journal-tool
>>>>>>> >
>>>>>>> > I ran across some errors
>>>>>>> >
>>>>>>> > $ cephfs-journal-tool journal inspect
>>>>>>> > 2015-03-03 14:18:54.453981 7f8e29f86780 -1 Bad entry start ptr
>>>>>>> > (0x2aebf6) at 0x2aeb32279b
>>>>>>> > 2015-03-03 14:18:54.539060 7f8e29f86780 -1 Bad entry start ptr
>>>>>>> > (0x2aeb000733) at 0x2aeb322dd8
>>>>>>> > 2015-03-03 14:18:54.584539 7f8e29f86780 -1 Bad entry start ptr
>>>>>>> > (0x2aeb000d70) at 0x2aeb323415
>>>>>>> > 2015-03-03 14:18:54.669991 7f8e29f86780 -1 Bad entry start ptr
>>>>>>> > (0x2aeb0013ad) at 0x2aeb323a52
>>>>>>> > 2015-03-03 14:18:54.707724 7f8e29f86780 -1 Bad entry start ptr
>>>>>>> > (0x2aeb0019ea) at 0x2aeb32408f
>>>>>>> > Overall journal integrity: DAMAGED
>>>>>>>
>>>>>>> I expect this is http://tracker.ceph.com/issues/9977, which is
>>>>>>> fixed in
>>>>>>> master.
>>>>>>>
>>>>>>> You are in *very* bleeding edge territory here, and I'd suggest using
>>>>>>> the latest development release if you want to experiment with the
>>>>>>> latest
>>>>>>> CephFS tooling.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> John
>>>>>>>
>>>>>>
>>>>
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS Fuse Issue

2015-09-21 Thread Scottix
I didn't get the core dump.

I set it up now and I'll try to see if I can get it to crash again.
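
(For reference, "setting it up" here is basically just allowing core files
in the environment that launches ceph-fuse; the paths below are only
examples:

$ sudo mkdir -p /var/crash
$ sudo sysctl -w kernel.core_pattern=/var/crash/core.%e.%p
$ ulimit -c unlimited
$ ceph-fuse -m <mon-host> /mnt/cephfs
)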

On Mon, Sep 21, 2015 at 3:40 PM Gregory Farnum <gfar...@redhat.com> wrote:

> Do you have a core file from the crash? If you do and can find out
> which pointers are invalid that would help...I think "cct" must be the
> broken one, but maybe it's just the Inode* or something.
> -Greg
>
> On Mon, Sep 21, 2015 at 2:03 PM, Scottix <scot...@gmail.com> wrote:
> > I was rsyncing files to ceph from an older machine and I ran into a
> > ceph-fuse crash.
> >
> > OpenSUSE 12.1, 3.1.10-1.29-desktop
> > ceph-fuse 0.94.3
> >
> > The rsync was running for about 48 hours then crashed somewhere along the
> > way.
> >
> > I added the log, and can run more if you like, I am not sure how to
> > reproduce it easily except run it again, which will take a while to see
> with
> > any results.
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] CephFS Fuse Issue

2015-09-21 Thread Scottix
I was rsyncing files to ceph from an older machine and I ran into a
ceph-fuse crash.

OpenSUSE 12.1, 3.1.10-1.29-desktop
ceph-fuse 0.94.3

The rsync was running for about 48 hours then crashed somewhere along the
way.

I added the log and can run more if you like. I am not sure how to
reproduce it easily except by running it again, which will take a while to
see any results.


ceph.log.bz2
Description: application/bzip
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cephfs total throughput

2015-09-15 Thread Scottix
I have a program that monitors the speed, and I have seen 1TB/s pop up;
there is just no way that is true.
The way it is calculated is probably prone to extreme momentary readings;
if you average it out you get a more realistic number.
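
(A quick sanity check along the lines Greg suggests below, with the pool
name as a placeholder:

$ rados bench -p <pool> 30 write

then compare what the bench reports against the client io line of "ceph -s"
while it runs.)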

On Tue, Sep 15, 2015 at 12:25 PM Mark Nelson  wrote:

> FWIW I wouldn't totally trust these numbers.  At one point a while back
> I had ceph reporting 226GB/s for several seconds sustained. While that
> would have been really fantastic, I suspect it probably wasn't the case. ;)
>
> Mark
>
> On 09/15/2015 11:25 AM, Barclay Jameson wrote:
> > Unfortunately, it's not longer idle as my CephFS cluster is now in
> production :)
> >
> > On Tue, Sep 15, 2015 at 11:17 AM, Gregory Farnum 
> wrote:
> >> On Tue, Sep 15, 2015 at 9:10 AM, Barclay Jameson
> >>  wrote:
> >>> So, I asked this on the irc as well but I will ask it here as well.
> >>>
> >>> When one does 'ceph -s' it shows client IO.
> >>>
> >>> The question is simple.
> >>>
> >>> Is this total throughput or what the clients would see?
> >>>
> >>> Since it's replication factor of 3 that means for every write 3 are
> >>> actually written.
> >>>
> >>> First lets assume I have only one cephfs client writing data.
> >>>
> >>> If this is total throughput then to get the maximum throughput for
> >>> what a client would see do I need to divide it by 3?
> >>>
> >>> Else, if this is what my client sees then do I need to multiply this
> >>> by 3 to see what my maximum cluster throughput would be?
> >>
> >> I believe this is client-facing IO. It's pretty simple to check if
> >> you've got an idle cluster; run rados bench and see if they're about
> >> the same or about three times as large. ;)
> >> -Greg
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Object Storage and POSIX Mix

2015-08-21 Thread Scottix
I saw this article on Linux Today and immediately thought of Ceph.

http://www.enterprisestorageforum.com/storage-management/object-storage-vs.-posix-storage-something-in-the-middle-please-1.html

I was thinking: would it theoretically be possible with RGW to do a GET and
set a BEGIN_SEEK and OFFSET to retrieve only a specific portion of the
file?
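
(For the read side this already exists in practice: RGW's S3 API honours the
standard HTTP Range header on GET, so fetching just a byte range of an
object looks roughly like the following, with endpoint, bucket and object as
placeholders and the usual S3 authentication on top:

$ curl -H "Range: bytes=1048576-2097151" \
    http://rgw.example.com/mybucket/myobject -o part.bin
)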

The other option would be to append data to an RGW object instead of
rewriting the entire object.
And so on...

Just food for thought.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS vs Lustre performance

2015-08-04 Thread Scottix
I'll be more of a third-party person and try to be factual. =)

I wouldn't write off Gluster too fast yet.
Besides what you described with the object versus disk storage, it uses the
Amazon Dynamo paper's eventually consistent methodology for organizing data.
Gluster has a different feature set, so I would look into that as well.

What I have experienced with Lustre is more geared towards supercomputing
and tuning storage to your workload. In terms of scaling, Lustre with HA is
fairly difficult: not impossible, but be careful what you wish for.

It depends on what you are trying to accomplish as the end result. I am not
saying Ceph isn't a great option, but make smart choices and test them out.
Testing is how I learned about the differences, and that is how we ended up
with our choice.

Disclaimer: I run a Ceph cluster, so I am more familiar with it, but Gluster
was a big contender for us.


On Tue, Aug 4, 2015 at 8:12 AM Mark Nelson mnel...@redhat.com wrote:

 So despite the performance overhead of replication (or EC + cache
 tiering) I think CephFS is still a really good solution going forward.
 We still have a lot of testing/tuning to do, but as you said there are
 definitely advantages.

 I haven't looked closely at either Lustre or Gluster for several years,
 so I'd prefer not to comment on the state of either these days. :)

 Hope that helps!

 Mark

 On 08/04/2015 05:38 AM, jupiter wrote:
  Hi Mark,
 
  Thanks for the comments; those were the same arguments people raise about
  CephFS performance here. But one thing I like about Ceph is that it is
  capable of running everything, including replication, directly on XFS on
  commodity hardware disks. I am not clear whether Lustre can do that as
  well, or were you alluding that Lustre has to run on top of RAID
  for replication and fault tolerance?
 
  We are also looking for CephFS and Gluster, apart from the main
  difference that Gluster is based on block storage and CephFS is based
  on object storage, Ceph is cetainly has much better scalibility, any
  insight comments of pros, cons and performance between CephFS and
  Gluster?
 
  Thank you and appreciate it.
 
  - jupiter
 
  On 8/3/15, Mark Nelson mnel...@redhat.com wrote:
  On 08/03/2015 06:31 AM, jupiter wrote:
  Hi,
 
  I'd like to deploy Cephfs in a cluster, but I need to have a
 performance
  report compared with Lustre and Gluster. Could anyone point me
 documents
  / links for performance between CephFS, Gluster and Lustre?
 
  Thank you.
 
  Kind regards,
 
  - j
 
  Hi,
 
  I don't know that anything like this really exists yet to be honest.  We
  wrote a paper with ORNL several years ago looking at Ceph performance on
  a DDN SFA10K and basically saw that we could hit about 6GB/s with CephFS
  while Lustre could do closer to 11GB/s.  Primarily that was due to the
  journal on the write side (using local SSDs for journal would have
  improved things dramatically as the limitation was the IB connections
  between the SFA10K and the OSD nodes rather than the disks).  On the
  read side we ended up running out of time to figure it out.  We could do
  about 8GB/s with RADOS but CephFS was again limited to about 6GB/s.
  This was several years ago now so things may have changed.
 
  In general you should expect that Lustre will probably be faster for
  large sequential writes (especially if you use Ceph replication vs RAID6
  for Lustre) and may be faster for large sequential reads.  For small IO
  I suspect that Ceph may do better, and for metadata I would expect the
  situation will be mixed with Ceph faster at some things but possibly
  slower at others since afaik we haven't done a lot of tuning of the MDS
  yet.
 
  Mark
 
 
 
 
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Unexpected period of iowait, no obvious activity?

2015-06-23 Thread Scottix
Yeah, Ubuntu has a package called mlocate whose cron job runs updatedb.

We basically turn it off, as shown here:
http://askubuntu.com/questions/268130/can-i-disable-updatedb-mlocate

If you still want it, you could edit the settings in /etc/updatedb.conf and
add a prune path for your ceph directory.
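
Something along these lines in /etc/updatedb.conf should do it (the cephfs
mount point here is only an example):

PRUNEPATHS="/tmp /var/spool /media /var/lib/ceph /mnt/cephfs"
PRUNEFS="NFS nfs nfs4 rpc_pipefs ceph fuse.ceph-fuse"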

On Tue, Jun 23, 2015 at 7:47 AM Daniel Schneller 
daniel.schnel...@centerdevice.com wrote:


  On 23.06.2015, at 14:13, Gregory Farnum g...@gregs42.com wrote:
 
  ...
  On the other hand, there are lots of administrative tasks that can run
  and do something like this. The CERN guys had a lot of trouble with
  some daemon which wanted to scan the OSD's entire store for tracking
  changes, and was installed by their standard Ubuntu deployment.

 Thanks! Good hint. I will look into that.

 Daniel

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v0.94.2 Hammer released

2015-06-12 Thread Scottix
I noticed the amd64 Ubuntu 12.04 (precise) repo hasn't updated its packages
to 0.94.2; can you check this?

http://ceph.com/debian-hammer/dists/precise/main/binary-amd64/Packages

Package: ceph
Version: 0.94.1-1precise
Architecture: amd64
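
(That is taken straight from the repo metadata; a quick way to re-check once
it has been rebuilt is something like:

$ curl -s http://ceph.com/debian-hammer/dists/precise/main/binary-amd64/Packages \
    | grep -A 10 '^Package: ceph$' | grep '^Version:'
)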

On Thu, Jun 11, 2015 at 10:35 AM Sage Weil sw...@redhat.com wrote:

 This Hammer point release fixes a few critical bugs in RGW that can
 prevent objects starting with underscore from behaving properly and that
 prevent garbage collection of deleted objects when using the Civetweb
 standalone mode.

 All v0.94.x Hammer users are strongly encouraged to upgrade, and to make
 note of the repair procedure below if RGW is in use.

 Upgrading from previous Hammer release
 --

 Bug #11442 introduced a change that made rgw objects that start with
 underscore incompatible with previous versions. The fix to that bug
 reverts to the previous behavior. In order to be able to access objects
 that start with an underscore and were created in prior Hammer releases,
 following the upgrade it is required to run (for each affected bucket)::

 $ radosgw-admin bucket check --check-head-obj-locator \
  --bucket=bucket [--fix]

 You can get a list of buckets with

 $ radosgw-admin bucket list

 Notable changes
 ---

 * build: compilation error: No high-precision counter available  (armhf,
   powerpc..) (#11432, James Page)
 * ceph-dencoder links to libtcmalloc, and shouldn't (#10691, Boris Ranto)
 * ceph-disk: disk zap sgdisk invocation (#11143, Owen Synge)
 * ceph-disk: use a new disk as journal disk,ceph-disk prepare fail
   (#10983, Loic Dachary)
 * ceph-objectstore-tool should be in the ceph server package (#11376, Ken
   Dreyer)
 * librados: can get stuck in redirect loop if osdmap epoch ==
   last_force_op_resend (#11026, Jianpeng Ma)
 * librbd: A retransmit of proxied flatten request can result in -EINVAL
   (Jason Dillaman)
 * librbd: ImageWatcher should cancel in-flight ops on watch error (#11363,
   Jason Dillaman)
 * librbd: Objectcacher setting max object counts too low (#7385, Jason
   Dillaman)
 * librbd: Periodic failure of TestLibRBD.DiffIterateStress (#11369, Jason
   Dillaman)
 * librbd: Queued AIO reference counters not properly updated (#11478,
   Jason Dillaman)
 * librbd: deadlock in image refresh (#5488, Jason Dillaman)
 * librbd: notification race condition on snap_create (#11342, Jason
   Dillaman)
 * mds: Hammer uclient checking (#11510, John Spray)
 * mds: remove caps from revoking list when caps are voluntarily released
   (#11482, Yan, Zheng)
 * messenger: double clear of pipe in reaper (#11381, Haomai Wang)
 * mon: Total size of OSDs is a magnitude less than it is supposed to be.
   (#11534, Zhe Zhang)
 * osd: don't check order in finish_proxy_read (#11211, Zhiqiang Wang)
 * osd: handle old semi-deleted pgs after upgrade (#11429, Samuel Just)
 * osd: object creation by write cannot use an offset on an erasure coded
   pool (#11507, Jianpeng Ma)
 * rgw: Improve rgw HEAD request by avoiding read the body of the first
   chunk (#11001, Guang Yang)
 * rgw: civetweb is hitting a limit (number of threads 1024) (#10243,
   Yehuda Sadeh)
 * rgw: civetweb should use unique request id (#10295, Orit Wasserman)
 * rgw: critical fixes for hammer (#11447, #11442, Yehuda Sadeh)
 * rgw: fix swift COPY headers (#10662, #10663, #11087, #10645, Radoslaw
   Zarzynski)
 * rgw: improve performance for large object  (multiple chunks) GET
   (#11322, Guang Yang)
 * rgw: init-radosgw: run RGW as root (#11453, Ken Dreyer)
 * rgw: keystone token cache does not work correctly (#11125, Yehuda Sadeh)
 * rgw: make quota/gc thread configurable for starting (#11047, Guang Yang)
 * rgw: make swift responses of RGW return last-modified, content-length,
   x-trans-id headers.(#10650, Radoslaw Zarzynski)
 * rgw: merge manifests correctly when there's prefix override (#11622,
   Yehuda Sadeh)
 * rgw: quota not respected in POST object (#11323, Sergey Arkhipov)
 * rgw: restore buffer of multipart upload after EEXIST (#11604, Yehuda
   Sadeh)
 * rgw: shouldn't need to disable rgw_socket_path if frontend is configured
   (#11160, Yehuda Sadeh)
 * rgw: swift: Response header of GET request for container does not
   contain X-Container-Object-Count, X-Container-Bytes-Used and x-trans-id
   headers (#10666, Dmytro Iurchenko)
 * rgw: swift: Response header of POST request for object does not contain
   content-length and x-trans-id headers (#10661, Radoslaw Zarzynski)
 * rgw: swift: response for GET/HEAD on container does not contain the
   X-Timestamp header (#10938, Radoslaw Zarzynski)
 * rgw: swift: response for PUT on /container does not contain the
   mandatory Content-Length header when FCGI is used (#11036, #10971,
   Radoslaw Zarzynski)
 * rgw: swift: wrong handling of empty metadata on Swift container (#11088,
   Radoslaw Zarzynski)
 * tests: TestFlatIndex.cc races with TestLFNIndex.cc (#11217, Xinze Chi)
 * tests: ceph-helpers 

Re: [ceph-users] Discuss: New default recovery config settings

2015-06-04 Thread Scottix
From an ease-of-use standpoint, and depending on the environment you are
setting up, the idea is as follows:

It would be nice to have simple, on-demand control where you don't have to
think about much beyond how, in a general sense, a setting will affect your
cluster.

The two extremes, plus a general limitation, would be:
1. Prioritize data recovery
2. Prioritize client usability
3. Hardware constraints, e.g. a 1Gb connection

With predefined levels you could ship sensible settings for each, plus maybe
one custom level for the advanced user.
Example command (caveat: I don't fully know how your configs work):
ceph osd set priority low|medium|high|custom
* With a priority set, it would lock certain attributes
* With the priority unset, it would unlock those attributes

In our use case activity drops way down after 8pm. At that point I can raise
the priority to medium or high, then at 6am adjust it back down to low.

With cron I can easily schedule that, or, depending on the situation, I can
schedule maintenance and change the priority to fit my needs.
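
A rough sketch of what that cron schedule can already look like with the
existing injectargs mechanism (the file path and the numbers below are purely
illustrative, not a recommendation):

# /etc/cron.d/ceph-recovery-priority
# 20:00 - client activity drops off, let recovery run harder
0 20 * * * root ceph tell osd.* injectargs '--osd-max-backfills 3 --osd-recovery-max-active 3'
# 06:00 - clients are back, throttle recovery down again
0 6 * * * root ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'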



On Thu, Jun 4, 2015 at 2:01 PM Mike Dawson mike.daw...@cloudapt.com wrote:

 With a write-heavy RBD workload, I add the following to ceph.conf:

 osd_max_backfills = 2
 osd_recovery_max_active = 2

 If things are going well during recovery (i.e. guests happy and no slow
 requests), I will often bump both up to three:

 # ceph tell osd.* injectargs '--osd-max-backfills 3
 --osd-recovery-max-active 3'

 If I see slow requests, I drop them down.

 The biggest downside to setting either to 1 seems to be the long tail
 issue detailed in:

 http://tracker.ceph.com/issues/9566

 Thanks,
 Mike Dawson


 On 6/3/2015 6:44 PM, Sage Weil wrote:
  On Mon, 1 Jun 2015, Gregory Farnum wrote:
  On Mon, Jun 1, 2015 at 6:39 PM, Paul Von-Stamwitz
  pvonstamw...@us.fujitsu.com wrote:
  On Fri, May 29, 2015 at 4:18 PM, Gregory Farnum g...@gregs42.com
 wrote:
  On Fri, May 29, 2015 at 2:47 PM, Samuel Just sj...@redhat.com
 wrote:
  Many people have reported that they need to lower the osd recovery
 config options to minimize the impact of recovery on client io.  We are
 talking about changing the defaults as follows:
 
  osd_max_backfills to 1 (from 10)
  osd_recovery_max_active to 3 (from 15)
  osd_recovery_op_priority to 1 (from 10)
  osd_recovery_max_single_start to 1 (from 5)
 
  I'm under the (possibly erroneous) impression that reducing the
 number of max backfills doesn't actually reduce recovery speed much (but
 will reduce memory use), but that dropping the op priority can. I'd rather
 we make users manually adjust values which can have a material impact on
 their data safety, even if most of them choose to do so.
 
  After all, even under our worst behavior we're still doing a lot
 better than a resilvering RAID array. ;) -Greg
  --
 
 
  Greg,
  When we set...
 
  osd recovery max active = 1
  osd max backfills = 1
 
  We see rebalance times go down by more than half and client write
 performance increase significantly while rebalancing. We initially played
 with these settings to improve client IO expecting recovery time to get
 worse, but we got a 2-for-1.
  This was with firefly using replication, downing an entire node with
 lots of SAS drives. We left osd_recovery_threads, osd_recovery_op_priority,
 and osd_recovery_max_single_start default.
 
  We dropped osd_recovery_max_active and osd_max_backfills together. If
 you're right, do you think osd_recovery_max_active=1 is primary reason for
 the improvement? (higher osd_max_backfills helps recovery time with erasure
 coding.)
 
  Well, recovery max active and max backfills are similar in many ways.
  Both are about moving data into a new or outdated copy of the PG - the
  difference is that recovery refers to our log-based recovery (where we
  compare the PG logs and move over the objects which have changed)
  whereas backfill requires us to incrementally move through the entire
  PG's hash space and compare.
  I suspect dropping down max backfills is more important than reducing
  max recovery (gathering recovery metadata happens largely in memory)
  but I don't really know either way.
 
  My comment was meant to convey that I'd prefer we not reduce the
  recovery op priority levels. :)
 
  We could make a less extreme move than to 1, but IMO we have to reduce it
  one way or another.  Every major operator I've talked to does this, our
 PS
  folks have been recommending it for years, and I've yet to see a single
  complaint about recovery times... meanwhile we're drowning in a sea of
  complaints about the impact on clients.
 
  How about
 
osd_max_backfills to 1 (from 10)
osd_recovery_max_active to 3 (from 15)
osd_recovery_op_priority to 3 (from 10)
osd_recovery_max_single_start to 1 (from 5)
 
  (same as above, but 1/3rd the recovery op prio instead of 1/10th)
  ?
 
  sage
  ___
  ceph-users mailing 

Re: [ceph-users] How to backup hundreds or thousands of TB

2015-05-06 Thread Scottix
On the point of
* someone accidentally removed a thing, and now they need a thing back

MooseFS has an interesting feature that I think would be good for CephFS and
maybe others.

Basically a timed trash bin: deleted files are retained for a configurable
period of time (a file-system-level trash bin).

It's an idea that covers exactly this use case.
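
For what it's worth, CephFS directory snapshots (still experimental) get part
of the way there for the accidental-delete case, assuming snapshots are
enabled and taken on some schedule; the paths below are only placeholders:

mkdir /mnt/ceph/data/.snap/before-cleanup    # snapshot the directory
rm /mnt/ceph/data/important.csv              # oops
cp /mnt/ceph/data/.snap/before-cleanup/important.csv /mnt/ceph/data/

A timed trash bin would still be nicer, since it needs no schedule and covers
every delete.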


On Wed, May 6, 2015 at 3:35 AM Mariusz Gronczewski 
mariusz.gronczew...@efigence.com wrote:

 Snapshot on same storage cluster should definitely NOT be treated as
 backup

 Snapshot as a source for backup however can be pretty good solution for
 some cases, but not every case.

 For example if using ceph to serve static web files, I'd rather have
 possibility to restore given file from given path than snapshot of
 whole multiple TB cluster.

 There are 2 cases for backup restore:

 * something failed, need to fix it - usually full restore needed
 * someone accidentally removed a thing, and now they need a thing back

 Snapshots fix first problem, but not the second one, restoring 7TB of
 data to recover few GBs is not reasonable.

 As it is now we just backup from inside VMs (file-based backup) and have
 puppet to easily recreate machine config but if (or rather when) we
 would use object store we would backup it in a way that allows for
 partial restore.

 On Wed, 6 May 2015 10:50:34 +0100, Nick Fisk n...@fisk.me.uk wrote:
  For me personally I would always feel more comfortable with backups on a
 completely different storage technology.
 
  Whilst there are many things you can do with snapshots and replication,
 there is always a small risk that whatever causes data loss on your primary
 system may affect/replicate to your 2nd copy.
 
  I guess it all really depends on what you are trying to protect against,
 but Tape still looks very appealing if you want to maintain a completely
 isolated copy of data.
 
   -Original Message-
   From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
 Of
   Alexandre DERUMIER
   Sent: 06 May 2015 10:10
   To: Götz Reinicke
   Cc: ceph-users
   Subject: Re: [ceph-users] How to backup hundreds or thousands of TB
  
   for the moment, you can use snapshot for backup
  
   https://ceph.com/community/blog/tag/backup/
  
   I think that async mirror is on the roadmap
   https://wiki.ceph.com/Planning/Blueprints/Hammer/RBD%3A_Mirroring
  
  
  
   if you use qemu, you can do qemu full backup. (qemu incremental backup
 is
   coming for qemu 2.4)
  
  
   - Mail original -
   De: Götz Reinicke goetz.reini...@filmakademie.de
   À: ceph-users ceph-users@lists.ceph.com
   Envoyé: Mercredi 6 Mai 2015 10:25:01
   Objet: [ceph-users] How to backup hundreds or thousands of TB
  
   Hi folks,
  
   beside hardware and performance and failover design: How do you manage
   to backup hundreds or thousands of TB :) ?
  
   Any suggestions? Best practice?
  
   A second ceph cluster at a different location? bigger archive Disks
 in good
   boxes? Or tabe-libs?
  
   What kind of backupsoftware can handle such volumes nicely?
  
   Thanks and regards . Götz
   --
   Götz Reinicke
   IT-Koordinator
  
   Tel. +49 7141 969 82 420
   E-Mail goetz.reini...@filmakademie.de
  
   Filmakademie Baden-Württemberg GmbH
   Akademiehof 10
   71638 Ludwigsburg
   www.filmakademie.de
  
   Eintragung Amtsgericht Stuttgart HRB 205016
  
   Vorsitzender des Aufsichtsrats: Jürgen Walter MdL Staatssekretär im
   Ministerium für Wissenschaft, Forschung und Kunst Baden-Württemberg
  
   Geschäftsführer: Prof. Thomas Schadt
  
  
   ___
   ceph-users mailing list
   ceph-users@lists.ceph.com
   http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
   ___
   ceph-users mailing list
   ceph-users@lists.ceph.com
   http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



 --
 Mariusz Gronczewski, Administrator

 Efigence S. A.
 ul. Wołoska 9a, 02-583 Warszawa
 T: [+48] 22 380 13 13
 F: [+48] 22 380 13 14
 E: mariusz.gronczew...@efigence.com
 mailto:mariusz.gronczew...@efigence.com
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS unmatched rstat after upgrade hammer

2015-04-09 Thread Scottix
Alright, sounds good.

Only one comment then:
From an IT/ops perspective, all I see is ERR, and that raises red flags, so
the severity/exposure of the message might need some tweaking. In production
I like to be notified of an issue, but with reassurance that it was fixed
within the system.

Best Regards

On Wed, Apr 8, 2015 at 8:10 PM Yan, Zheng uker...@gmail.com wrote:

 On Thu, Apr 9, 2015 at 7:09 AM, Scottix scot...@gmail.com wrote:
  I was testing the upgrade on our dev environment and after I restarted
 the
  mds I got the following errors.
 
  2015-04-08 15:58:34.056470 mds.0 [ERR] unmatched rstat on 605, inode has
  n(v70 rc2015-03-16 09:11:34.390905), dirfrags have n(v0 rc2015-03-16
  09:11:34.390905 1=0+1)
  2015-04-08 15:58:34.056530 mds.0 [ERR] unmatched rstat on 604, inode has
  n(v69 rc2015-03-31 08:07:09.265241), dirfrags have n(v0 rc2015-03-31
  08:07:09.265241 1=0+1)
  2015-04-08 15:58:34.056581 mds.0 [ERR] unmatched rstat on 606, inode has
  n(v67 rc2015-03-16 08:54:36.314790), dirfrags have n(v0 rc2015-03-16
  08:54:36.314790 1=0+1)
  2015-04-08 15:58:34.056633 mds.0 [ERR] unmatched rstat on 607, inode has
  n(v57 rc2015-03-16 08:54:46.797240), dirfrags have n(v0 rc2015-03-16
  08:54:46.797240 1=0+1)
  2015-04-08 15:58:34.056687 mds.0 [ERR] unmatched rstat on 608, inode has
  n(v23 rc2015-03-16 08:54:59.634299), dirfrags have n(v0 rc2015-03-16
  08:54:59.634299 1=0+1)
  2015-04-08 15:58:34.056737 mds.0 [ERR] unmatched rstat on 609, inode has
  n(v62 rc2015-03-16 08:55:06.598286), dirfrags have n(v0 rc2015-03-16
  08:55:06.598286 1=0+1)
  2015-04-08 15:58:34.056789 mds.0 [ERR] unmatched rstat on 600, inode has
  n(v101 rc2015-03-16 08:55:16.153175), dirfrags have n(v0 rc2015-03-16
  08:55:16.153175 1=0+1)

 These errors are likely caused by the bug that rstats are not set to
 correct values
 when creating new fs. Nothing to worry about, the MDS automatically fixes
 rstat
 errors.

 
  I am not sure if this is an issue or got fixed or something I should
 worry
  about. But would just like some context around this issue since it came
 up
  in the ceph -w and other users might see it as well.
 
  I have done a lot of unsafe stuff on this mds so not to freak anyone
 out
  if that is the issue.
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Firefly, cephfs issues: different unix rights depending on the client and ls are slow

2015-03-13 Thread Scottix
…


 The time variation is caused by cache coherence. When the client has valid
 information in its cache, the 'stat' operation will be fast. Otherwise the
 client needs to send a request to the MDS and wait for the reply, which will
 be slow.


This sounds like the behavior I had with CephFS giving me question marks.
I had a directory with a large number of files in it, and the first ls -la
took a while to populate and ended with some unknown stats; the second time
I ran ls -la it was quick, with no question marks. My question is whether a
timeout could occur: since the client has to go ask the MDS on a different
machine, it seems plausible that the full response is not coming back in
time, or that fetching some of the stats fails at some point.

I could test this more; is there a command or process I can run to flush the
ceph-fuse cache?
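
The bluntest approach I can think of is simply remounting the client, and
dropping the kernel's cached dentries/inodes on top of that; note this is a
guess on my part, the drop_caches step only forces lookups back to the
ceph-fuse daemon and does not empty ceph-fuse's own internal cache, and the
mount point and monitor address are placeholders:

fusermount -u /mnt/ceph
ceph-fuse -m mon1:6789 /mnt/ceph
sync && echo 2 > /proc/sys/vm/drop_caches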

Thanks,
Scott


On Fri, Mar 13, 2015 at 1:49 PM Francois Lafont flafdiv...@free.fr wrote:

 Hi,

 Yan, Zheng wrote :

  http://tracker.ceph.com/issues/11059
 
 
  It's a bug in ACL code, I have updated http://tracker.ceph.com/
 issues/11059

 Ok, thanks. I have seen and I will answer quickly. ;)

  I'm still surprised by such times. For instance, It seems to me
  that, with a mounted nfs share, commands like ls -la are very
  fast in comparison (with a directory which contains the same number
  of files). Can anyone explain to me why there is a such difference
  between the nfs case and the cephfs case? This is absolutely not a
  criticism but it's just to understand the concepts that come into
  play. In the case of ls -al ie just reading (it is assumed that
  there is no writing on the directory), the nfs and the cephfs cases
  seem to me very similar: the client just requests a stat on each file
  in the directory. Am I wrong?
 
  NFS has no cache coherence mechanism. It can't guarantee one client
 always
  see other client's change.

 Ah ok, I didn't know that. Indeed, now I understand that can generate
 performance impact.

  The time variation is caused cache coherence. when client has valid
 information
  in its cache, 'stat' operation will be fast. Otherwise the client need
 to send
  request to MDS and wait for reply, which will be slow.

 Ok, thanks a lot for your explanations.
 Regards.

 --
 François Lafont
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS Attributes Question Marks

2015-03-03 Thread Scottix
I did a bit more testing.
1. I tried on a newer kernel and was not able to recreate the problem; maybe
it is that kernel bug you mentioned, although it's not an exact replica of
the load.
2. I haven't tried the debug yet since I have to wait for the right moment
(a sketch of the invocation I have in mind is just below).
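
Roughly what I have in mind for that debug run, with the client id, monitor
address and log path below being placeholders rather than our real values:

ceph-fuse -m mon1:6789 /mnt/ceph --id admin \
    --debug-client=20 --log-file=/var/log/ceph/ceph-fuse.debug.log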

One thing I realized, and maybe it is not an issue, is that we are using a
symlink to a folder inside the ceph mount.
ceph-fuse on /mnt/ceph type fuse.ceph-fuse
(rw,nosuid,nodev,noatime,user_id=0,group_id=0,default_permissions,allow_other)
lrwxrwxrwx 1 root   root   metadata -> /mnt/ceph/DataCenter/metadata
Not sure whether that would create any issues.

Anyway, we are going to update the machine soon, so I can report back if we
keep having the issue.

Thanks for your support,
Scott


On Mon, Mar 2, 2015 at 4:07 PM Scottix scot...@gmail.com wrote:

 I'll try the following things and report back to you.

 1. I can get a new kernel on another machine and mount to the CephFS and
 see if I get the following errors.
 2. I'll run the debug and see if anything comes up.

 I'll report back to you when I can do these things.

 Thanks,
 Scottie

 On Mon, Mar 2, 2015 at 4:04 PM Gregory Farnum g...@gregs42.com wrote:

 I bet it's that permission issue combined with a minor bug in FUSE on
 that kernel, or maybe in the ceph-fuse code (but I've not seen it
 reported before, so I kind of doubt it). If you run ceph-fuse with
 debug client = 20 it will output (a whole lot of) logging to the
 client's log file and you could see what requests are getting
 processed by the Ceph code and how it's responding. That might let you
 narrow things down. It's certainly not any kind of timeout.
 -Greg

 On Mon, Mar 2, 2015 at 3:57 PM, Scottix scot...@gmail.com wrote:
  3 Ceph servers on Ubuntu 12.04.5 - kernel 3.13.0-29-generic
 
  We have an old server that we compiled the ceph-fuse client on
  Suse11.4 - kernel 2.6.37.6-0.11
  This is the only mount we have right now.
 
  We don't have any problems reading the files and the directory shows
 full
  775 permissions and doing a second ls fixes the problem.
 
  On Mon, Mar 2, 2015 at 3:51 PM Bill Sanders billysand...@gmail.com
 wrote:
 
  Forgive me if this is unhelpful, but could it be something to do with
  permissions of the directory and not Ceph at all?
 
  http://superuser.com/a/528467
 
  Bill
 
  On Mon, Mar 2, 2015 at 3:47 PM, Gregory Farnum g...@gregs42.com
 wrote:
 
  On Mon, Mar 2, 2015 at 3:39 PM, Scottix scot...@gmail.com wrote:
   We have a file system running CephFS and for a while we had this
 issue
   when
   doing an ls -la we get question marks in the response.
  
   -rw-r--r-- 1 wwwrun root14761 Feb  9 16:06
   data.2015-02-08_00-00-00.csv.bz2
   -? ? ?  ?   ??
   data.2015-02-09_00-00-00.csv.bz2
  
   If we do another directory listing it show up fine.
  
   -rw-r--r-- 1 wwwrun root14761 Feb  9 16:06
   data.2015-02-08_00-00-00.csv.bz2
   -rw-r--r-- 1 wwwrun root13675 Feb 10 15:21
   data.2015-02-09_00-00-00.csv.bz2
  
   It hasn't been a problem but just wanted to see if this is an issue,
   could
   the attributes be timing out? We do have a lot of files in the
   filesystem so
   that could be a possible bottleneck.
 
  Huh, that's not something I've seen before. Are the systems you're
  doing this on the same? What distro and kernel version? Is it reliably
  one of them showing the question marks, or does it jump between
  systems?
  -Greg
 
  
   We are using the ceph-fuse mount.
   ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
   We are planning to do the update soon to 87.1
  
   Thanks
   Scottie
  
  
   ___
   ceph-users mailing list
   ceph-users@lists.ceph.com
   http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
  
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS Attributes Question Marks

2015-03-03 Thread Scottix
Ya we are not at 0.87.1 yet, possibly tomorrow. I'll let you know if it
still reports the same.

Thanks John,
--Scottie


On Tue, Mar 3, 2015 at 2:57 PM John Spray john.sp...@redhat.com wrote:

 On 03/03/2015 22:35, Scottix wrote:
  I was testing a little bit more and decided to run the
 cephfs-journal-tool
 
  I ran across some errors
 
  $ cephfs-journal-tool journal inspect
  2015-03-03 14:18:54.453981 7f8e29f86780 -1 Bad entry start ptr
  (0x2aebf6) at 0x2aeb32279b
  2015-03-03 14:18:54.539060 7f8e29f86780 -1 Bad entry start ptr
  (0x2aeb000733) at 0x2aeb322dd8
  2015-03-03 14:18:54.584539 7f8e29f86780 -1 Bad entry start ptr
  (0x2aeb000d70) at 0x2aeb323415
  2015-03-03 14:18:54.669991 7f8e29f86780 -1 Bad entry start ptr
  (0x2aeb0013ad) at 0x2aeb323a52
  2015-03-03 14:18:54.707724 7f8e29f86780 -1 Bad entry start ptr
  (0x2aeb0019ea) at 0x2aeb32408f
  Overall journal integrity: DAMAGED

 I expect this is http://tracker.ceph.com/issues/9977, which is fixed in
 master.

 You are in *very* bleeding edge territory here, and I'd suggest using
 the latest development release if you want to experiment with the latest
 CephFS tooling.

 Cheers,
 John

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS Attributes Question Marks

2015-03-03 Thread Scottix
I was testing a little bit more and decided to run the cephfs-journal-tool

I ran across some errors

$ cephfs-journal-tool journal inspect
2015-03-03 14:18:54.453981 7f8e29f86780 -1 Bad entry start ptr
(0x2aebf6) at 0x2aeb32279b
2015-03-03 14:18:54.539060 7f8e29f86780 -1 Bad entry start ptr
(0x2aeb000733) at 0x2aeb322dd8
2015-03-03 14:18:54.584539 7f8e29f86780 -1 Bad entry start ptr
(0x2aeb000d70) at 0x2aeb323415
2015-03-03 14:18:54.669991 7f8e29f86780 -1 Bad entry start ptr
(0x2aeb0013ad) at 0x2aeb323a52
2015-03-03 14:18:54.707724 7f8e29f86780 -1 Bad entry start ptr
(0x2aeb0019ea) at 0x2aeb32408f
Overall journal integrity: DAMAGED
Corrupt regions:
  0x2aeb3226a5-2aeb32279b
  0x2aeb32279b-2aeb322dd8
  0x2aeb322dd8-2aeb323415
  0x2aeb323415-2aeb323a52
  0x2aeb323a52-2aeb32408f
  0x2aeb32408f-2aeb3246cc

$ cephfs-journal-tool header get
{ "magic": "ceph fs volume v011",
  "write_pos": 184430420380,
  "expire_pos": 184389995327,
  "trimmed_pos": 184389992448,
  "stream_format": 1,
  "layout": { "stripe_unit": 4194304,
      "stripe_count": 4194304,
      "object_size": 4194304,
      "cas_hash": 4194304,
      "object_stripe_unit": 4194304,
      "pg_pool": 4194304}}

$ cephfs-journal-tool event get summary
2015-03-03 14:32:50.102863 7f47c3006780 -1 Bad entry start ptr
(0x2aee8000e6) at 0x2aee800c25
2015-03-03 14:32:50.242576 7f47c3006780 -1 Bad entry start ptr
(0x2aee800b3f) at 0x2aee80167e
2015-03-03 14:32:50.486354 7f47c3006780 -1 Bad entry start ptr
(0x2aee800e4f) at 0x2aee80198e
2015-03-03 14:32:50.577443 7f47c3006780 -1 Bad entry start ptr
(0x2aee801f65) at 0x2aee802aa4
Events by type:
no output here


On Tue, Mar 3, 2015 at 12:01 PM Scottix scot...@gmail.com wrote:

 I did a bit more testing.
 1. I tried on a newer kernel and was not able to recreate the problem,
 maybe it is that kernel bug you mentioned. Although its not an exact
 replica of the load.
 2. I haven't tried the debug yet since I have to wait for the right moment.

 One thing I realized and maybe it is not an issue is we are using a
 symlink to a folder in the ceph mount.
 ceph-fuse on /mnt/ceph type fuse.ceph-fuse
 (rw,nosuid,nodev,noatime,user_id=0,group_id=0,default_permissions,allow_other)
 lrwxrwxrwx 1 root   root   metadata - /mnt/ceph/DataCenter/metadata
 Not sure if that would create any issues.

 Anyway we are going to update the machine soon so, I can report if we keep
 having the issue.

 Thanks for your support,
 Scott


 On Mon, Mar 2, 2015 at 4:07 PM Scottix scot...@gmail.com wrote:

 I'll try the following things and report back to you.

 1. I can get a new kernel on another machine and mount to the CephFS and
 see if I get the following errors.
 2. I'll run the debug and see if anything comes up.

 I'll report back to you when I can do these things.

 Thanks,
 Scottie

 On Mon, Mar 2, 2015 at 4:04 PM Gregory Farnum g...@gregs42.com wrote:

 I bet it's that permission issue combined with a minor bug in FUSE on
 that kernel, or maybe in the ceph-fuse code (but I've not seen it
 reported before, so I kind of doubt it). If you run ceph-fuse with
 debug client = 20 it will output (a whole lot of) logging to the
 client's log file and you could see what requests are getting
 processed by the Ceph code and how it's responding. That might let you
 narrow things down. It's certainly not any kind of timeout.
 -Greg

 On Mon, Mar 2, 2015 at 3:57 PM, Scottix scot...@gmail.com wrote:
  3 Ceph servers on Ubuntu 12.04.5 - kernel 3.13.0-29-generic
 
  We have an old server that we compiled the ceph-fuse client on
  Suse11.4 - kernel 2.6.37.6-0.11
  This is the only mount we have right now.
 
  We don't have any problems reading the files and the directory shows
 full
  775 permissions and doing a second ls fixes the problem.
 
  On Mon, Mar 2, 2015 at 3:51 PM Bill Sanders billysand...@gmail.com
 wrote:
 
  Forgive me if this is unhelpful, but could it be something to do with
  permissions of the directory and not Ceph at all?
 
  http://superuser.com/a/528467
 
  Bill
 
  On Mon, Mar 2, 2015 at 3:47 PM, Gregory Farnum g...@gregs42.com
 wrote:
 
  On Mon, Mar 2, 2015 at 3:39 PM, Scottix scot...@gmail.com wrote:
   We have a file system running CephFS and for a while we had this
 issue
   when
   doing an ls -la we get question marks in the response.
  
   -rw-r--r-- 1 wwwrun root14761 Feb  9 16:06
   data.2015-02-08_00-00-00.csv.bz2
   -? ? ?  ?   ??
   data.2015-02-09_00-00-00.csv.bz2
  
   If we do another directory listing it show up fine.
  
   -rw-r--r-- 1 wwwrun root14761 Feb  9 16:06
   data.2015-02-08_00-00-00.csv.bz2
   -rw-r--r-- 1 wwwrun root13675 Feb 10 15:21
   data.2015-02-09_00-00-00.csv.bz2
  
   It hasn't been a problem but just wanted to see if this is an
 issue,
   could
   the attributes be timing out? We do have a lot of files in the
   filesystem so
   that could be a possible bottleneck.
 
  Huh, that's not something I've seen before. Are the systems you're
  doing this on the same? What

[ceph-users] CephFS Attributes Question Marks

2015-03-02 Thread Scottix
We have a file system running CephFS, and for a while we have had this issue
where doing an ls -la gives us question marks in the response.

-rw-r--r-- 1 wwwrun root14761 Feb  9 16:06
data.2015-02-08_00-00-00.csv.bz2
-? ? ?  ?   ??
data.2015-02-09_00-00-00.csv.bz2

If we do another directory listing it show up fine.

-rw-r--r-- 1 wwwrun root14761 Feb  9 16:06
data.2015-02-08_00-00-00.csv.bz2
-rw-r--r-- 1 wwwrun root13675 Feb 10 15:21
data.2015-02-09_00-00-00.csv.bz2

It hasn't been a problem, but I just wanted to see if this is a known issue:
could the attributes be timing out? We do have a lot of files in the
filesystem, so that could be a possible bottleneck.

We are using the ceph-fuse mount.
ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
We are planning to do the update soon to 87.1

Thanks
Scottie
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS Attributes Question Marks

2015-03-02 Thread Scottix
3 Ceph servers on Ubuntu 12.04.5 - kernel 3.13.0-29-generic

We have an old server that we compiled the ceph-fuse client on
Suse11.4 - kernel 2.6.37.6-0.11
This is the only mount we have right now.

We don't have any problems reading the files and the directory shows full
775 permissions and doing a second ls fixes the problem.

On Mon, Mar 2, 2015 at 3:51 PM Bill Sanders billysand...@gmail.com wrote:

 Forgive me if this is unhelpful, but could it be something to do with
 permissions of the directory and not Ceph at all?

 http://superuser.com/a/528467

 Bill

 On Mon, Mar 2, 2015 at 3:47 PM, Gregory Farnum g...@gregs42.com wrote:

 On Mon, Mar 2, 2015 at 3:39 PM, Scottix scot...@gmail.com wrote:
  We have a file system running CephFS and for a while we had this issue
 when
  doing an ls -la we get question marks in the response.
 
  -rw-r--r-- 1 wwwrun root14761 Feb  9 16:06
  data.2015-02-08_00-00-00.csv.bz2
  -? ? ?  ?   ??
  data.2015-02-09_00-00-00.csv.bz2
 
  If we do another directory listing it show up fine.
 
  -rw-r--r-- 1 wwwrun root14761 Feb  9 16:06
  data.2015-02-08_00-00-00.csv.bz2
  -rw-r--r-- 1 wwwrun root13675 Feb 10 15:21
  data.2015-02-09_00-00-00.csv.bz2
 
  It hasn't been a problem but just wanted to see if this is an issue,
 could
  the attributes be timing out? We do have a lot of files in the
 filesystem so
  that could be a possible bottleneck.

 Huh, that's not something I've seen before. Are the systems you're
 doing this on the same? What distro and kernel version? Is it reliably
 one of them showing the question marks, or does it jump between
 systems?
 -Greg

 
  We are using the ceph-fuse mount.
  ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
  We are planning to do the update soon to 87.1
 
  Thanks
  Scottie
 
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS Attributes Question Marks

2015-03-02 Thread Scottix
I'll try the following things and report back to you.

1. I can get a new kernel on another machine and mount to the CephFS and
see if I get the following errors.
2. I'll run the debug and see if anything comes up.

I'll report back to you when I can do these things.

Thanks,
Scottie

On Mon, Mar 2, 2015 at 4:04 PM Gregory Farnum g...@gregs42.com wrote:

 I bet it's that permission issue combined with a minor bug in FUSE on
 that kernel, or maybe in the ceph-fuse code (but I've not seen it
 reported before, so I kind of doubt it). If you run ceph-fuse with
 debug client = 20 it will output (a whole lot of) logging to the
 client's log file and you could see what requests are getting
 processed by the Ceph code and how it's responding. That might let you
 narrow things down. It's certainly not any kind of timeout.
 -Greg

 On Mon, Mar 2, 2015 at 3:57 PM, Scottix scot...@gmail.com wrote:
  3 Ceph servers on Ubuntu 12.04.5 - kernel 3.13.0-29-generic
 
  We have an old server that we compiled the ceph-fuse client on
  Suse11.4 - kernel 2.6.37.6-0.11
  This is the only mount we have right now.
 
  We don't have any problems reading the files and the directory shows full
  775 permissions and doing a second ls fixes the problem.
 
  On Mon, Mar 2, 2015 at 3:51 PM Bill Sanders billysand...@gmail.com
 wrote:
 
  Forgive me if this is unhelpful, but could it be something to do with
  permissions of the directory and not Ceph at all?
 
  http://superuser.com/a/528467
 
  Bill
 
  On Mon, Mar 2, 2015 at 3:47 PM, Gregory Farnum g...@gregs42.com
 wrote:
 
  On Mon, Mar 2, 2015 at 3:39 PM, Scottix scot...@gmail.com wrote:
   We have a file system running CephFS and for a while we had this
 issue
   when
   doing an ls -la we get question marks in the response.
  
   -rw-r--r-- 1 wwwrun root14761 Feb  9 16:06
   data.2015-02-08_00-00-00.csv.bz2
   -? ? ?  ?   ??
   data.2015-02-09_00-00-00.csv.bz2
  
   If we do another directory listing it show up fine.
  
   -rw-r--r-- 1 wwwrun root14761 Feb  9 16:06
   data.2015-02-08_00-00-00.csv.bz2
   -rw-r--r-- 1 wwwrun root13675 Feb 10 15:21
   data.2015-02-09_00-00-00.csv.bz2
  
   It hasn't been a problem but just wanted to see if this is an issue,
   could
   the attributes be timing out? We do have a lot of files in the
   filesystem so
   that could be a possible bottleneck.
 
  Huh, that's not something I've seen before. Are the systems you're
  doing this on the same? What distro and kernel version? Is it reliably
  one of them showing the question marks, or does it jump between
  systems?
  -Greg
 
  
   We are using the ceph-fuse mount.
   ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
   We are planning to do the update soon to 87.1
  
   Thanks
   Scottie
  
  
   ___
   ceph-users mailing list
   ceph-users@lists.ceph.com
   http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
  
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Replacing Ceph mons understanding initial members

2014-11-18 Thread Scottix
We currently have a 3-node system with 3 monitor nodes. I created them during
the initial setup, and ceph.conf has:

mon initial members = Ceph200, Ceph201, Ceph202
mon host = 10.10.5.31,10.10.5.32,10.10.5.33

We are in the process of expanding and installing dedicated mon servers.

I know I can run:
ceph-deploy mon create Ceph300, etc.
to install the new mons, but then I will eventually need to destroy the old
mons (Ceph200, etc.).
Will this create issues, or is there anything I need to update in mon
initial members or mon host?
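
A hedged sketch of the sequence I'm picturing (the host names are just ours;
adjust as needed):

ceph-deploy mon create Ceph300 Ceph301 Ceph302   # bring the new mons into quorum
ceph quorum_status                               # confirm they have joined
ceph-deploy mon destroy Ceph200                  # then retire the old ones,
ceph-deploy mon destroy Ceph201                  # one at a time
ceph-deploy mon destroy Ceph202

followed by updating mon initial members / mon host in ceph.conf on every
node and client so they list only the new monitors.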

Thanks in advance
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] jbod + SMART : how to identify failing disks ?

2014-11-12 Thread Scottix
I would say it depends on your system and what the drives are connected to.
Some HBAs have a CLI tool to manage the attached drives, the way a RAID card
would.
One other method I found is that the kernel sometimes exposes the LEDs for
you: http://fabiobaltieri.com/2011/09/21/linux-led-subsystem/ has an article
on /sys/class/leds, but it's not guaranteed.

On my laptop I could turn lights on and off, but our server didn't expose
anything. It seems like a feature either Linux or smartctl should have. I
have run into this problem before and used a couple of tricks to figure it
out.

I guess the best solution is just to track the drives' serial numbers. That
might be a good note to add to the Ceph docs so cluster operators are aware
of it.
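
A couple of tricks along those lines, hedged because support depends entirely
on the HBA/enclosure, and the device names below are placeholders:

# ledmon's ledctl can blink a slot if the enclosure speaks SES/SGPIO:
ledctl locate=/dev/sdg
ledctl locate_off=/dev/sdg

# otherwise, map device names to serial numbers and match the sticker:
for d in /dev/sd?; do printf '%s ' $d; smartctl -i $d | grep -i 'serial number'; done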

On Wed, Nov 12, 2014 at 9:06 AM, Erik Logtenberg e...@logtenberg.eu wrote:
 I have no experience with the DELL SAS controller, but usually the
 advantage of using a simple controller (instead of a RAID card) is that
 you can use full SMART directly.

 $ sudo smartctl -a /dev/sda

 === START OF INFORMATION SECTION ===
 Device Model: INTEL SSDSA2BW300G3H
 Serial Number:PEPR2381003E300EGN

 Personally, I make sure that I know which serial number drive is in
 which bay, so I can easily tell which drive I'm talking about.

 So you can use SMART both to notice (pre)failing disks -and- to
 physically identify them.

 The same smartctl command also returns the health status like so:

 233 Media_Wearout_Indicator 0x0032   099   099   000Old_age   Always
   -   0

 This specific SSD has 99% media lifetime left, so it's in the green. But
 it will continue to gradually degrade, and at some time It'll hit a
 percentage where I like to replace it. To keep an eye on the speed of
 decay, I'm graphing those SMART values in Cacti. That way I can somewhat
 predict how long a disk will last, especially SSD's which die very
 gradually.

 Erik.


 On 12-11-14 14:43, JF Le Fillâtre wrote:

 Hi,

 May or may not work depending on your JBOD and the way it's identified
 and set up by the LSI card and the kernel:

 cat /sys/block/sdX/../../../../sas_device/end_device-*/bay_identifier

 The weird path and the wildcards are due to the way the sysfs is set up.

 That works with a Dell R520, 6GB HBA SAS cards and Dell MD1200s, running
 CentOS release 6.5.

 Note that you can make your life easier by writing an udev script that
 will create a symlink with a sane identifier for each of your external
 disks. If you match along the lines of

 KERNEL==sd*[a-z], KERNELS==end_device-*:*:*

 then you'll just have to cat /sys/class/sas_device/${1}/bay_identifier
 in a script (with $1 being the $id of udev after that match, so the
 string end_device-X:Y:Z) to obtain the bay ID.

 Thanks,
 JF



 On 12/11/14 14:05, SCHAER Frederic wrote:
 Hi,



 I’m used to RAID software giving me the failing disks & slots, and most
 often blinking the disks on the disk bays.

 I recently installed a  DELL “6GB HBA SAS” JBOD card, said to be an LSI
 2008 one, and I now have to identify 3 pre-failed disks (so says
 S.M.A.R.T) .



 Since this is an LSI, I thought I’d use MegaCli to identify the disks
 slot, but MegaCli does not see the HBA card.

 Then I found the LSI “sas2ircu” utility, but again, this one fails at
 giving me the disk slots (it finds the disks, serials and others, but
 slot is always 0)

 Because of this, I’m going to head over to the disk bay and unplug the
 disk which I think corresponds to the alphabetical order in linux, and
 see if it’s the correct one…. But even if this is correct this time, it
 might not be next time.



 But this makes me wonder : how do you guys, Ceph users, manage your
 disks if you really have JBOD servers ?

 I can’t imagine having to guess slots that each time, and I can’t
 imagine neither creating serial number stickers for every single disk I
 could have to manage …

 Is there any specific advice reguarding JBOD cards people should (not)
 use in their systems ?

 Any magical way to “blink” a drive in linux ?



 Thanks & regards



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Follow Me: @Taijutsun
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs survey results

2014-11-04 Thread Scottix
Agreed, multi-MDS is a nice-to-have but not required for full production use.
TBH, stability and recovery will win over any IT person dealing with
filesystems.

On Tue, Nov 4, 2014 at 7:33 AM, Mariusz Gronczewski
mariusz.gronczew...@efigence.com wrote:
 On Tue, 4 Nov 2014 10:36:07 +1100, Blair Bethwaite
 blair.bethwa...@gmail.com wrote:


 TBH I'm a bit surprised by a couple of these and hope maybe you guys
 will apply a certain amount of filtering on this...

 fsck and quotas were there for me, but multimds and snapshots are what
 I'd consider icing features - they're nice to have but not on the
 critical path to using cephfs instead of e.g. nfs in a production
 setting. I'd have thought stuff like small file performance and
 gateway support was much more relevant to uptake and
 positive/pain-free UX. Interested to hear others rationale here.


 Those are related; if small file performance will be enough for one
 MDS to handle high load with a lot of small files (typical case of
 webserver), having multiple acive MDS will be less of a priority;

 And if someone currently have OSD on bunch of relatively weak nodes,
 again, having active-active setup with MDS will be more interesting to
 him than someone that can just buy new fast machine for it.


 --
 Mariusz Gronczewski, Administrator

 Efigence S. A.
 ul. Wołoska 9a, 02-583 Warszawa
 T: [+48] 22 380 13 13
 F: [+48] 22 380 13 14
 E: mariusz.gronczew...@efigence.com
 mailto:mariusz.gronczew...@efigence.com

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Follow Me: @Scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph features monitored by nagios

2014-07-23 Thread Scottix
We use zabbix, but the same concept applies when writing your own scripts.

We take advantage of the command
$ ceph -s --format=json 2>/dev/null
stderr sometimes comes up with some noise, so we filter that out.
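
A minimal sketch of a check along these lines (not our exact script; the
output format is illustrative, and nagios/zabbix mostly just care about the
exit code):

#!/bin/sh
STATUS=$(ceph health 2>/dev/null)
case "$STATUS" in
    HEALTH_OK*)   echo "OK - $STATUS";       exit 0 ;;
    HEALTH_WARN*) echo "WARNING - $STATUS";  exit 1 ;;
    *)            echo "CRITICAL - $STATUS"; exit 2 ;;
esac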

On Wed, Jul 23, 2014 at 6:32 AM, Wolfgang Hennerbichler wo...@wogri.com wrote:
 Nagios can monitor anything you can script. If there isn’t a plugin for it, 
 write it yourself, it’s really not hard. I’d go for icinga by the way, which 
 is more actively maintained than nagios.

 On Jul 23, 2014, at 3:07 PM, pragya jain prag_2...@yahoo.co.in wrote:

 Hi all,

 I am studying nagios for monitoring ceph features.

 different plugins of nagios monitor ceph cluster health, osd status,
 monitor status etc.

 My questions are:
 * Does Nagios monitor ceph for cluster, pool and each PG for
 - CPU utilization
 - memory utilization
 - Network Utilization
 - total storage capacity, storage capacity used, storage capacity remaining 
 etc.

 * Does Nagios monitor ceph for drive configuration, Bad sectors/ fragmented 
 disk, Co-resident monitors/OSDs, Co-resident processes, Kernel version, 
 Mounted filesystem for each OSD?

 Please help me to find out the answers of my questions

 Regards
 Pragya Jain
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Follow Me: @Scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph-fuse remount

2014-07-22 Thread Scottix
Thanks for the info.
In case anyone wants to know: I was able to do a lazy unmount and start the
new version back up fine.
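
For the archives, roughly the sequence this amounts to (hedged: anything
still doing I/O keeps talking to the old ceph-fuse process until it lets go,
and the monitor address is a placeholder):

umount -l /mnt/ceph                # lazy unmount; the old process lingers until idle
ceph-fuse -m mon1:6789 /mnt/ceph   # the upgraded binary mounts in its place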

On Wed, Jul 16, 2014 at 10:29 AM, Gregory Farnum g...@inktank.com wrote:
 On Wed, Jul 16, 2014 at 9:20 AM, Scottix scot...@gmail.com wrote:
 I wanted to update ceph-fuse to a new version and I would like to have
 it seamless.
 I thought I could do a remount to update the running version but came to a 
 fail.
 Here is the error I got.

 # mount /mnt/ceph/ -o remount
 2014-07-16 09:08:57.690464 7f669be1a760 -1 asok(0x1285eb0)
 AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen:
 failed to bind the UNIX domain socket to
 '/var/run/ceph/ceph-client.admin.asok': (17) File exists
 ceph-fuse[10474]: starting ceph client
 fuse: mountpoint is not empty
 fuse: if you are sure this is safe, use the 'nonempty' mount option
 ceph-fuse[10474]: fuse failed to initialize
 2014-07-16 09:08:57.784900 7f669be1a760 -1
 fuse_mount(mountpoint=/mnt/ceph) failed.
 ceph-fuse[10461]: mount failed: (5) Input/output error

 Or is there a better way to do this?

 I don't think that FUSE supports remounting, and the Ceph client
 implementation definitely doesn't. Sorry!
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com



-- 
Follow Me: @Scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph-fuse remount

2014-07-16 Thread Scottix
I wanted to update ceph-fuse to a new version and I would like it to be
seamless.
I thought I could do a remount to pick up the new version, but it failed.
Here is the error I got.

# mount /mnt/ceph/ -o remount
2014-07-16 09:08:57.690464 7f669be1a760 -1 asok(0x1285eb0)
AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen:
failed to bind the UNIX domain socket to
'/var/run/ceph/ceph-client.admin.asok': (17) File exists
ceph-fuse[10474]: starting ceph client
fuse: mountpoint is not empty
fuse: if you are sure this is safe, use the 'nonempty' mount option
ceph-fuse[10474]: fuse failed to initialize
2014-07-16 09:08:57.784900 7f669be1a760 -1
fuse_mount(mountpoint=/mnt/ceph) failed.
ceph-fuse[10461]: mount failed: (5) Input/output error

Or is there a better way to do this?

-- 
Follow Me: @Scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS MDS Setup

2014-05-28 Thread Scottix
Looks like we are going to put CephFS on hold and use RBD until it is fully
supported.
Which brings me to my next question:
I am trying to remove the MDS completely and seem to be having issues.

I disabled all mounts

disabled all the startup scripts

// Cleaned the mdsmap
ceph mds newfs 0 1 --yes-i-really-mean-it

// Then tried but got error
ceph osd pool delete metadata metadata --yes-i-really-really-mean-it
Error EBUSY: pool 'metadata' is in use by CephFS

// Tried and it looks like a bug of sort
ceph mds cluster_down
// Still get
mdsmap e78: 0/0/0 up
// Shouldn't it be down?

ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)

Do I need to start over and not add the mds to be clean?

Thanks for your time

On Wed, May 21, 2014 at 12:18 PM, Wido den Hollander w...@42on.com wrote:
 On 05/21/2014 09:04 PM, Scottix wrote:

 I am setting a CephFS cluster and wondering about MDS setup.
 I know you are still hesitant to put the stable label on it but I have
 a few questions what would be an adequate setup.

 I know active active is not developed yet so that is pretty much out
 of the question right now.

 What about active standby? How reliable is the standby? or should a
 single active mds be sufficient?


 Active/Standby is fairly stable, but I wouldn't recommend putting it into
 production right now.

 The general advice is always to run a recent Ceph version and a recent
 kernel as well. Like 3.13 in Ubuntu 14.04

 But the best advice: Test your use-case extensively! The more feedback, the
 better.

 Thanks



 --
 Wido den Hollander
 42on B.V.
 Ceph trainer and consultant

 Phone: +31 (0)20 700 9902
 Skype: contact42on
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Follow Me: @Scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] CephFS MDS Setup

2014-05-21 Thread Scottix
I am setting up a CephFS cluster and wondering about the MDS setup.
I know you are still hesitant to put the stable label on it, but I have a
few questions about what would be an adequate setup.

I know active/active is not developed yet, so that is pretty much out of the
question right now.

What about active/standby? How reliable is the standby? Or should a single
active MDS be sufficient?

Thanks

-- 
Follow Me: @Scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph User Committee monthly meeting #1 : executive summary

2014-04-04 Thread Scottix
CephFS use case
Wanted to throw in our use case.

We store a massive amount of time-series data files that we need to process
as they come in. Right now RAID is holding us over, but we are hitting its
upper limits; we would have to spend a serious amount of money to go to the
next level, and even then we would soon need more throughput. That is where
Ceph would be great. Although RBD is feasible, we need the storage to be
reliably expandable, and this is where Gluster has one up on Ceph. I have put
off Gluster because I don't want to deal with split-brain scenarios. Anyway,
we have been experimenting with CephFS and believe that is the way to go, but
we are being very cautious at the moment.



On Fri, Apr 4, 2014 at 9:34 AM, Loic Dachary l...@dachary.org wrote:

 Hi Ceph,

 This month Ceph User Committee meeting was about:

 Tiering, erasure code
 Using http://tracker.ceph.com/
 CephFS
 Miscellaneous

 You will find an executive summary at:


 https://wiki.ceph.com/Community/Meetings/Ceph_User_Committee_meeting_2014-04-03

 The full log of the IRC conversation is also included to provide more
 context when needed. Feel free to edit if you see a mistake.

 Cheers

 --
 Loïc Dachary, Artisan Logiciel Libre


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Follow Me: @Scottix http://www.twitter.com/scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] shutting down for maintenance

2013-12-31 Thread Scottix
The way I have done it is so the osd don't get set out.

Check the link below

http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/#stopping-w-out-rebalancing
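
In short, what that page boils down to (the noout flag keeps the stopped OSDs
from being marked out and triggering a rebalance):

ceph osd set noout
# stop the daemons, do the maintenance, bring everything back up
ceph osd unset noout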


On Tue, Dec 31, 2013 at 12:43 AM, James Harper 
james.har...@bendigoit.com.au wrote:

 I need to shut down ceph for maintenance to make some hardware changes. Is
 it sufficient to just stop all services on all nodes, or is there a way to
 put the whole cluster into standby or something first?

 And when things come back up, IP addresses on the cluster network will be
 different (public network will not change though). Is it sufficient to just
 change the config files and the osd's will register themselves correctly,
 or is there more involved?

 Thanks

 James
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Follow Me: @Scottix http://www.twitter.com/scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] optimal setup with 4 x ethernet ports

2013-12-03 Thread Scottix
I found the network to be the most limiting factor in Ceph.
Any chance to move to 10G+ would be beneficial.
I did have success with bonding; even a simple round-robin (balance-rr) setup
increased the throughput.
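
For reference, a balance-rr bond on Ubuntu/Debian with ifenslave looks
roughly like this in /etc/network/interfaces; this is a sketch with
illustrative interface names and addresses, not the exact config used here:

auto bond0
iface bond0 inet static
    address 192.168.100.31
    netmask 255.255.255.0
    bond-mode balance-rr
    bond-miimon 100
    bond-slaves eth2 eth3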


On Mon, Dec 2, 2013 at 10:17 PM, Kyle Bader kyle.ba...@gmail.com wrote:

  Is having two cluster networks like this a supported configuration?
 Every osd and mon can reach every other so I think it should be.

 Maybe. If your back end network is a supernet and each cluster network is
 a subnet of that supernet. For example:

 Ceph.conf cluster network (supernet): 10.0.0.0/8

 Cluster network #1:  10.1.1.0/24
 Cluster network #2: 10.1.2.0/24

 With that configuration OSD address autodection *should* just work.

  1. move osd traffic to eth1. This obviously limits maximum throughput to
 ~100Mbytes/second, but I'm getting nowhere near that right now anyway.

 Given three links I would probably do this if your replication factor is
 = 3. Keep in mind 100Mbps links could very well end up being a limiting
 factor.

 What are you backing each OSD with storage wise and how many OSDs do you
 expect to participate in this cluster?

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Follow Me: @Scottix http://www.twitter.com/scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Newbie question

2013-10-02 Thread Scottix
I am actually looking for a similar answer. If 1 OSD = 1 HDD, dumpling will
relocate the data for me after the timeout, which is great. But if I just want
to replace the OSD with an unformatted new HDD, what is the procedure?

One method that has worked for me is to remove it from the crush map and then
re-add the OSD drive to the cluster. This works, but it seems like a lot of
overhead just to replace a single drive. Is there a better way to do this?
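
Roughly the steps that remove-and-re-add method involves (a sketch only;
osd.12, the host name and the device are placeholders):

ceph osd crush remove osd.12
ceph auth del osd.12
ceph osd rm 12
# swap the physical drive, then recreate an OSD on the fresh disk:
ceph-deploy osd create ceph-node3:sdg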


On Wed, Oct 2, 2013 at 8:10 AM, Andy Paluch a...@webguyz.net wrote:

 What happens when a drive goes bad in ceph and has to be replaced (at the
 physical level) . In the Raid world you pop out the bad disk and stick a
 new one in and the controller takes care of getting it back into the
 system. With what I've been reading so far, it probably going be a mess to
 do this with ceph  and involve a lot of low level linux tweaking to remove
 and replace the disk that failed. Not a big Linux guy so was wondering if
 anyone can point to any docs on how to recover from a bad disk in a ceph
 node.

 Thanks


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Follow Me: @Scottix http://www.twitter.com/scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Documentation OS Recommendations

2013-09-09 Thread Scottix
I was looking at someone's question on the list and started looking up some
documentation, and found this page:
http://ceph.com/docs/next/install/os-recommendations/

Do you think you can provide an update for dumpling?

Best Regards
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Hadoop Configuration

2013-08-05 Thread Scottix
Hey Noah,
Yes, it does look like an older version (56.6); I got it from the Ubuntu repo.
Is there another method I can use, or something I can pull, to get the latest?
I am having a hard time finding it.
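
For example, to check whether the packaged jar even contains the class the
stack trace complains about (the path assumes the Ubuntu libcephfs-java
package):

jar tf /usr/share/java/libcephfs.jar | grep -i CephPool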

Thanks


On Sun, Aug 4, 2013 at 10:33 PM, Noah Watkins noah.watk...@inktank.comwrote:

 Hey Scott,

 Things look OK, but I'm a little foggy on what exactly was shipping in
 the libcephfs-java jar file back at 0.61. There was definitely a time
 where Hadoop and libcephfs.jar in the Debian repos were out of sync,
 and that might be what you are seeing.

 Could you list the contents of the libcephfs.jar file, to see if
 CephPoolException.class is in there? It might just be that the
 libcephfs.jar is out-of-date.

 -Noah

 On Sun, Aug 4, 2013 at 8:44 PM, Scottix scot...@gmail.com wrote:
  I am running into an issues connecting hadoop to my ceph cluster and I'm
  sure I am missing something but can't figure it out.
  I have a Ceph cluster with MDS running fine and I can do a basic mount
  perfectly normal.
  I have hadoop fs -ls with basic file:/// working well.
 
  Info:
  ceph cluster version 0.61.7
  Ubuntu Server 13.04 x86_64
  hadoop 1.2.1-1 deb install (stable now I did try 1.1.2 same issue)
  libcephfs-java both hadoop-cephfs.jar and libcephfs.jar show up in
 hadoop
  classpath
  libcephfs-jni with symlink trick
 /usr/share/hadoop/lib/native/Linux-amd64-64
  listed here
 
 http://thread.gmane.org/gmane.comp.file-systems.ceph.user/1788/focus=1806
  and the LD_LIBRARY_PATH in hadoop-env.sh
 
  When I try to setup the ceph mount within Hadoop I get an exception
 
  $ hadoop fs -ls
  Exception in thread main java.lang.NoClassDefFoundError:
  com/ceph/fs/CephPoolException
  at
 
 org.apache.hadoop.fs.ceph.CephFileSystem.initialize(CephFileSystem.java:96)
  at
  org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:124)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:247)
  at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
  at org.apache.hadoop.fs.FsShell.ls(FsShell.java:583)
  at org.apache.hadoop.fs.FsShell.run(FsShell.java:1812)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
  at org.apache.hadoop.fs.FsShell.main(FsShell.java:1916)
  Caused by: java.lang.ClassNotFoundException:
 com.ceph.fs.CephPoolException
 
  Followed the tutorial here
  http://ceph.com/docs/next/cephfs/hadoop/
 
  core-site.xml settings
  ...
  <property>
    <name>fs.ceph.impl</name>
    <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>ceph://192.168.1.11:6789</value>
  </property>
  <property>
    <name>ceph.data.pools</name>
    <value>hadoop1</value>
  </property>
  <property>
    <name>ceph.auth.id</name>
    <value>admin</value>
  </property>
  <property>
    <name>ceph.auth.keyfile</name>
    <value>/etc/ceph/admin.secret</value>
  </property>
 
  Any Help Appreciated
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 




-- 
Follow Me: @Scottix http://www.twitter.com/scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph Hadoop Configuration

2013-08-04 Thread Scottix
I am running into an issue connecting Hadoop to my Ceph cluster, and I'm
sure I am missing something but can't figure it out.
I have a Ceph cluster with MDS running fine and I can do a basic mount
perfectly normal.
I have hadoop fs -ls with basic file:/// working well.

Info:
ceph cluster version 0.61.7
Ubuntu Server 13.04 x86_64
hadoop 1.2.1-1 deb install (stable now I did try 1.1.2 same issue)
libcephfs-java both hadoop-cephfs.jar and libcephfs.jar show up in hadoop
classpath
libcephfs-jni with symlink trick
/usr/share/hadoop/lib/native/Linux-amd64-64 listed here
http://thread.gmane.org/gmane.comp.file-systems.ceph.user/1788/focus=1806 and
the LD_LIBRARY_PATH in hadoop-env.sh

When I try to setup the ceph mount within Hadoop I get an exception

$ hadoop fs -ls
Exception in thread main java.lang.NoClassDefFoundError:
com/ceph/fs/CephPoolException
at
org.apache.hadoop.fs.ceph.CephFileSystem.initialize(CephFileSystem.java:96)
at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:124)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:247)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at org.apache.hadoop.fs.FsShell.ls(FsShell.java:583)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1812)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1916)
Caused by: java.lang.ClassNotFoundException: com.ceph.fs.CephPoolException

Followed the tutorial here
http://ceph.com/docs/next/cephfs/hadoop/

core-site.xml settings
...
<property>
  <name>fs.ceph.impl</name>
  <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>ceph://192.168.1.11:6789</value>
</property>
<property>
  <name>ceph.data.pools</name>
  <value>hadoop1</value>
</property>
<property>
  <name>ceph.auth.id</name>
  <value>admin</value>
</property>
<property>
  <name>ceph.auth.keyfile</name>
  <value>/etc/ceph/admin.secret</value>
</property>

Any Help Appreciated
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy questions

2013-06-19 Thread Scottix
Couple things I caught.
The first wasn't a huge issue but good to note.
The second took me a while to figure out.

1. Default attribute:

ceph-deploy new [HOST]
by default writes filestore xattr use omap = true to ceph.conf, which is the
setting intended for ext4
(http://eu.ceph.com/docs/wip-3060/config-cluster/ceph-conf/#osds), but
ceph-deploy osd create {node-name}:{disk}[:{path/to/journal}]
formats the OSD with XFS by default.
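
The two settings involved look like this in ceph.conf (a sketch; the omap line
only matters when the OSD data disks are formatted ext4):

[osd]
# needed for ext4 data disks; harmless but unnecessary on the default XFS
filestore xattr use omap = true
# the filesystem ceph-deploy osd create formats unless told otherwise
osd mkfs type = xfs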

2. The command to push the admin keyring to a monitor host doesn't work as expected:

ceph-deploy admin [HOST]
copies the keyring to /etc/ceph/ceph.client.admin.keyring, but the monitor host is
expecting /etc/ceph/keyring
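
Until that is sorted out, a workaround along these lines seems to do it (a
sketch; paths assume the default cluster name "ceph"):

sudo cp /etc/ceph/ceph.client.admin.keyring /etc/ceph/keyring
# or point the daemons/clients at the deployed file explicitly in ceph.conf:
#   [global]
#   keyring = /etc/ceph/ceph.client.admin.keyring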

Hope this helps some people,
Scottix



On Wed, Jun 12, 2013 at 12:12 PM, Scottix scot...@gmail.com wrote:

 Thanks Greg,
 I am starting to understand it better.
 I soon realized as well after doing some searching I hit this bug.
 http://tracker.ceph.com/issues/5194
 Which created the problem upon rebooting.

 Thank You,
 Scottix


 On Wed, Jun 12, 2013 at 10:29 AM, Gregory Farnum g...@inktank.com wrote:

 On Wed, Jun 12, 2013 at 9:40 AM, Scottix scot...@gmail.com wrote:
  Hi John,
  That makes sense it affects the ceph cluster map, but it actually does a
  little more like partitioning drives and setting up other parameters and
  even starts the service. So the part I see is a little confusing is
 that I
  have to configure the ceph.conf file on top of using ceph-deploy so it
  starts to feel like double work and potential for error if you get
 mixed up
  or you were expecting one thing and ceph-deploy does another.
  I think I can figure out a best practice, but I think it is worth noting
  that just running the commands will get it up and running but it is
 probably
  best to edit the config file as well. I like the new ceph-deploy
 commands
  definitely makes things more manageable.
  A single page example for install and setup would be highly appreciated,
  especially for new users.
 
  I must have skimmed that section in the runtime-changes thanks for
 pointing
  me to the page.

 Just as a little more context, ceph-deploy is trying to provide a
 reference for how we expect users to manage ceph when using a
 configuration management system like Chef. Rather than trying to
 maintain a canonical ceph.conf (because let's be clear, there is no
 canonical one as far as Ceph is concerned), each host gets the
 information it needs in its ceph.conf, and the cluster is put together
 dynamically based on who's talking to the monitors.
 The reason you aren't seeing individual OSD entries in any of the
 configuration files is because the OSDs on a host are actually defined
 by the presence of OSD stores in /var/lib/ceph/osd-*. Those daemons
 should be activated automatically thanks to the magic of udev and our
 init scripts whenever you reboot, plug in a drive which stores an OSD,
 etc.
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com




 --
 Follow Me: @Scottix http://www.twitter.com/scottix
 http://about.me/scottix
 scot...@gmail.com




-- 
Follow Me: @Scottix http://www.twitter.com/scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy questions

2013-06-12 Thread Scottix
Hi John,
That makes sense that it affects the ceph cluster map, but it actually does a
little more, like partitioning drives, setting up other parameters, and even
starting the service. So the part I find a little confusing is that I have to
configure the ceph.conf file on top of using ceph-deploy, so it starts to feel
like double work, with potential for error if you get mixed up or expect one
thing while ceph-deploy does another.
I think I can figure out a best practice, but it is worth noting that just
running the commands will get it up and running, though it is probably best to
edit the config file as well. I like the new ceph-deploy commands; they
definitely make things more manageable.
A single page example for install and setup would be highly appreciated,
especially for new users.

I must have skimmed that section in the runtime-changes doc; thanks for pointing
me to the page.
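
For the record, a runtime change of that kind would look like this (a sketch,
assuming an osd.0 exists; note the underscored form on the command line):

ceph tell osd.0 injectargs '--debug_osd 20'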

Thanks for responding,
Scottix




On Wed, Jun 12, 2013 at 6:35 AM, John Wilkins john.wilk...@inktank.comwrote:

 ceph-deploy adds the OSDs to the cluster map. You can add the OSDs to
 the ceph.conf manually.

 In the ceph.conf file, the settings don't require underscores. If you
 modify your configuration at runtime, you need to add the underscores
 on the command line.

 http://ceph.com/docs/master/rados/configuration/ceph-conf/
 http://ceph.com/docs/master/rados/configuration/ceph-conf/#runtime-changes

 Underscores and dashes work with the config settings.

 On Tue, Jun 11, 2013 at 4:41 PM, Scottix scot...@gmail.com wrote:
  Hi Everyone,
  I am new to ceph but loving every moment of it. I am learning all of this
  now, so maybe this will help with documentation.
 
  Anyway, I have a few question about ceph-deploy. I was able to setup a
  cluster and be able to get it up and running no problem with ubuntu
 12.04.2
  that isn't the problem. The ceph.conf file is a little bit of a mystery
 for
  me on ceph-deploy. For example when I create a mon or osd on a machine
 the
  ceph.conf file doesn't change at all. Then if I reboot an osd, I have to
  re-activate it every time. Am I supposed to edit the
  config file for each osd? If I don't edit the file how do I keep track of
  each machine? or set special parameters for some machines? or does it
  matter?
 
  One last thing is why does it put underscores '_' for spaces when it does
  deploy the ceph.conf? Seems odd since the documentation doesn't show
  underscores, but I guess it doesn't matter since it works.
 
  Thanks for clarification,
  Scottix
 
  --
  Follow Me: @Scottix
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 



 --
 John Wilkins
 Senior Technical Writer
 Intank
 john.wilk...@inktank.com
 (415) 425-9599
 http://inktank.com




-- 
Follow Me: @Scottix http://www.twitter.com/scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy questions

2013-06-12 Thread Scottix
Thanks Greg,
I am starting to understand it better.
I soon realized as well, after doing some searching, that I had hit this bug:
http://tracker.ceph.com/issues/5194
which created the problem upon rebooting.
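
For anyone following along, the layout Greg describes below can be checked
directly (a sketch, assuming the default data path and the Ubuntu upstart
scripts):

# each ceph-<id> directory under here defines one OSD on this host
ls /var/lib/ceph/osd/
# manually start a single OSD by id if udev/init did not bring it up after a reboot
sudo start ceph-osd id=0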

Thank You,
Scottix


On Wed, Jun 12, 2013 at 10:29 AM, Gregory Farnum g...@inktank.com wrote:

 On Wed, Jun 12, 2013 at 9:40 AM, Scottix scot...@gmail.com wrote:
  Hi John,
  That makes sense it affects the ceph cluster map, but it actually does a
  little more like partitioning drives and setting up other parameters and
  even starts the service. So the part I see is a little confusing is that
 I
  have to configure the ceph.conf file on top of using ceph-deploy so it
  starts to feel like double work and potential for error if you get mixed
 up
  or you were expecting one thing and ceph-deploy does another.
  I think I can figure out a best practice, but I think it is worth noting
  that just running the commands will get it up and running but it is
 probably
  best to edit the config file as well. I like the new ceph-deploy commands
  definitely makes things more manageable.
  A single page example for install and setup would be highly appreciated,
  especially for new users.
 
  I must have skimmed that section in the runtime-changes thanks for
 pointing
  me to the page.

 Just as a little more context, ceph-deploy is trying to provide a
 reference for how we expect users to manage ceph when using a
 configuration management system like Chef. Rather than trying to
 maintain a canonical ceph.conf (because let's be clear, there is no
 canonical one as far as Ceph is concerned), each host gets the
 information it needs in its ceph.conf, and the cluster is put together
 dynamically based on who's talking to the monitors.
 The reason you aren't seeing individual OSD entries in any of the
 configuration files is because the OSDs on a host are actually defined
 by the presence of OSD stores in /var/lib/ceph/osd-*. Those daemons
 should be activated automatically thanks to the magic of udev and our
 init scripts whenever you reboot, plug in a drive which stores an OSD,
 etc.
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com




-- 
Follow Me: @Scottix http://www.twitter.com/scottix
http://about.me/scottix
scot...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com