Re: [ceph-users] (no subject)

2016-07-12 Thread Anand Bhat
Use qemu-img convert to convert from one format to another.
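
A minimal sketch of what that looks like (image names, pool name and the
direct-to-RBD target are illustrative, and writing straight into RBD assumes
qemu-img was built with rbd support):

  # convert a qcow2 image to raw, writing directly into an RBD image,
  # so no large raw file has to be staged on local disk
  qemu-img convert -f qcow2 -O raw server.qcow2 rbd:volumes/server-disk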

Regards,
Anand

On Mon, Jul 11, 2016 at 9:37 PM, Gaurav Goyal 
wrote:

> Thanks!
>
> I need to create a VM from a qcow2 image file that is 6.7 GB, but as a raw
> image it becomes 600 GB, which is too big.
> Is there a way to avoid converting the qcow2 file to raw while still having
> it work well with rbd?
>
>
> Regards
> Gaurav Goyal
>
> On Mon, Jul 11, 2016 at 11:46 AM, Kees Meijs  wrote:
>
>> Glad to hear it works now! Good luck with your setup.
>>
>> Regards,
>> Kees
>>
>> On 11-07-16 17:29, Gaurav Goyal wrote:
>> > Hello it worked for me after removing the following parameter from
>> > /etc/nova/nova.conf file
>>
>
>


-- 

Never say never.


Re: [ceph-users] Filestore merge and split

2016-07-11 Thread Anand Bhat
Merge happens either due to movement of objects caused by CRUSH recalculation
(when the cluster grows or shrinks for various reasons) or due to deletion of
objects.

Split happens when portions of objects/volumes that were previously sparse
get populated. Each RADOS object is by default a 4 MB chunk, and volumes are
made up of these objects; no RADOS object is created for a region that has
never been written. When a write spans sparse portions of the volume, new
RADOS objects are created under the directory of the PG to which each object
belongs.
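
For reference, a sketch of the filestore tunables this thread is about (the
values are just the ones quoted below, and the split-trigger formula is the
commonly cited one, so treat it as an approximation):

  # ceph.conf, [osd] section; needs an OSD restart, injectargs reports
  # these as unchangeable at runtime
  filestore merge threshold = 40
  filestore split multiple = 8
  # a PG subdirectory is split roughly when it holds more than
  # filestore split multiple * abs(filestore merge threshold) * 16 files
  # (8 * 40 * 16 = 5120 with the values above)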

Regards,
Anand

On Mon, Jul 11, 2016 at 5:38 PM, Nick Fisk  wrote:

> I believe splitting will happen on writes, merging I think only happens on
> deletions.
>
>
>
> *From:* Paul Renner [mailto:renner...@gmail.com]
> *Sent:* 10 July 2016 19:40
> *To:* n...@fisk.me.uk
> *Cc:* ceph-users@lists.ceph.com
> *Subject:* Re: [ceph-users] Filestore merge and split
>
>
>
> Thanks...
>
> Do you know when splitting or merging will happen? Is it enough that a
> directory is read, e.g. through scrub? If possible I would like to initiate
> the process.
>
> Regards
>
> Paul
>
>
>
> On Sun, Jul 10, 2016 at 10:47 AM, Nick Fisk  wrote:
>
> You need to set the option in ceph.conf and restart the OSD, I think. But
> it will only take effect when splitting or merging in the future; it won't
> adjust the current folder layout.
>
>
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of Paul Renner
> > Sent: 09 July 2016 22:18
> > To: ceph-users@lists.ceph.com
> > Subject: [ceph-users] Filestore merge and split
> >
> > Hello cephers
> > we have many (millions of) small objects in our RadosGW system and are
> > getting rather poor write performance, 100-200 PUTs/sec.
> >
> > I have read on the mailing list that one possible tuning option would be
> > to increase the max. number of files per directory on OSDs with e.g.
> >
> > filestore merge threshold = 40
> > filestore split multiple = 8
> >
> > Now my question is, do we need to rebuild the OSDs to make this
> > effective? Or is it a runtime setting?
> > I'm asking because when setting this with injectargs I get the message
> > "unchangeable" back.
> > Thanks for any insight.
>
>
>
>


-- 

Never say never.


Re: [ceph-users] RBD - Deletion / Discard - IO Impact

2016-07-07 Thread Anand Bhat
These are known problems.

Are you doing mkfs.xfs on an SSD? If so, please check the SSD data sheet to
see whether UNMAP is supported. To avoid the discard pass during mkfs, use
mkfs.xfs -K.
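
For example (the device name is just a placeholder):

  # -K tells mkfs.xfs to skip discarding (TRIM/UNMAP) blocks before
  # building the filesystem, so the device is not discarded up front
  mkfs.xfs -K /dev/rbd0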

Regards,
Anand

On Thu, Jul 7, 2016 at 5:23 PM, Nick Fisk  wrote:

> Hi All,
>
>
>
> Does anybody else see a massive (i.e. 10x) performance impact when either
> deleting an RBD or running something like mkfs.xfs against an existing RBD,
> which would zero/discard all blocks?
>
>
>
> In the case of deleting a 4TB RBD, I’m seeing latency in some cases rise
> up to 10s.
>
>
>
> It looks like it is the XFS deletions on the OSD which are potentially
> responsible for the massive drop in performance, as I see random OSDs in
> turn peak at 100% utilisation.
>
>
>
> I’m not aware of any throttling that can be done to reduce this impact,
> but would be interested to hear from anyone else that may experience this.
>
>
>
> Nick
>
>


-- 

Never say never.


[ceph-users] Regarding GET BUCKET ACL REST call

2016-06-27 Thread Anand Bhat
Hi,

When a GET Bucket ACL REST call is issued with X-Auth-Token set, the call
fails. This is due to the bucket in question not having a CORS configuration.
Is there a way to set CORS on the S3 bucket with the REST APIs? I know a way
using boto S3 that works; I am looking for the REST API for setting CORS.
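
For what it's worth, the standard S3 REST call for this is a PUT on the cors
subresource (host, bucket name and rule contents are placeholders; the
request has to be signed with S3 credentials rather than a Swift
X-Auth-Token, and this assumes the RGW version in use supports the CORS
subresource):

  PUT /mybucket?cors HTTP/1.1
  Host: rgw.example.com
  Authorization: AWS <access-key>:<signature>

  <CORSConfiguration>
    <CORSRule>
      <AllowedOrigin>*</AllowedOrigin>
      <AllowedMethod>GET</AllowedMethod>
      <AllowedHeader>*</AllowedHeader>
    </CORSRule>
  </CORSConfiguration>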

Regards,
Anand

-- 

Never say never.


Re: [ceph-users] ceph pg level IO sequence

2016-06-24 Thread Anand Bhat
Correct. This is guaranteed.

Regards,
Anand

On Fri, Jun 24, 2016 at 10:37 AM, min fang  wrote:

> Hi, as I understand it, at the PG level IOs are executed sequentially,
> as in the following cases:
>
> Case 1:
> Write A, Write B, Write C to the same data area in a PG --> A committed,
> then B committed, then C. The final data will be from write C. It is
> impossible for mixed (A, B, C) data to end up in the data area.
>
> Case 2:
> Write A, Write B, Read C to the same data area in a PG --> Read C will
> return the data from Write B, not Write A.
>
> Are the above cases true?
>
> thanks.
>


-- 

Never say never.


Re: [ceph-users] Ceph file change monitor

2016-06-09 Thread Anand Bhat
I think you are looking for inotify/fanotify-style events for Ceph. These are
usually implemented for local file systems. Ceph being a networked file
system, this would not be easy to implement and would involve network traffic
to generate the events.

I'm not sure it is on the roadmap, though.

Regards,
Anand

On Wed, Jun 8, 2016 at 2:46 PM, John Spray  wrote:

> On Wed, Jun 8, 2016 at 8:40 AM, siva kumar <85s...@gmail.com> wrote:
> > Dear Team,
> >
> > We are using ceph storage & cephFS for mounting .
> >
> > Our configuration :
> >
> > 3 osd
> > 3 monitor
> > 4 clients .
> > ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
> >
> > We would like to get file change notifications: what the event is
> > (ADDED, MODIFIED, DELETED) and for which file the event occurred. These
> > notifications should be sent to our server.
> > How can we get these notifications?
>
> This isn't a feature that CephFS has right now.  Still, I would be
> interested to know what protocol/format your server would consume
> these kinds of notifications in?
>
> John
>
> > Ultimately we would like to add our custom file watch notification hooks
> > to ceph so that we can handle these notifications ourselves.
> >
> > Additional Info :
> >
> > [test@ceph-zclient1 ~]$ ceph -s
> >
> >> cluster a8c92ae6-6842-4fa2-bfc9-8cdefd28df5c
> >
> >  health HEALTH_WARN
> > mds0: ceph-client1 failing to respond to cache pressure
> > mds0: ceph-client2 failing to respond to cache pressure
> > mds0: ceph-client3 failing to respond to cache pressure
> > mds0: ceph-client4 failing to respond to cache pressure
> >  monmap e1: 3 mons at
> >
> > {ceph-zadmin=xxx.xxx.xxx.xxx:6789/0,ceph-zmonitor=xxx.xxx.xxx.xxx:6789/0,ceph-zmonitor1=xxx.xxx.xxx.xxx:6789/0}
> > election epoch 16, quorum 0,1,2
> > ceph-zadmin,ceph-zmonitor1,ceph-zmonitor
> >  mdsmap e52184: 1/1/1 up {0=ceph-zstorage1=up:active}
> >  osdmap e3278: 3 osds: 3 up, 3 in
> >   pgmap v5068139: 384 pgs, 3 pools, 518 GB data, 7386 kobjects
> > 1149 GB used, 5353 GB / 6503 GB avail
> >  384 active+clean
> >
> >   client io 1259 B/s rd, 179 kB/s wr, 11 op/s
> >
> >
> >
> > Thanks,
> > S.Sivakumar
> >
> >
> >
> >



-- 

Never say never.


Re: [ceph-users] civetweb vs Apache for rgw

2016-05-23 Thread Anand Bhat
For performance, civetweb is better, as the FastCGI module used with Apache
is single-threaded. Apache does have features that civetweb lacks, but if you
are only after performance, go for civetweb.
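
A minimal sketch of switching RGW to the civetweb frontend (the client
section name and port are illustrative):

  # ceph.conf
  [client.rgw.gateway1]
  rgw frontends = civetweb port=7480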

Regards,
Anand

On Mon, May 23, 2016 at 12:43 PM, fridifree  wrote:

> Hi everyone,
> What would give the best performance in most of the cases, civetweb or
> apache?
>
> Thank you
>


-- 

Never say never.


Re: [ceph-users] ceph -s output

2016-05-23 Thread Anand Bhat
Check the states of the PGs using "ceph pg dump" and, for every PG that is
not "active+clean", issue "ceph pg map <pg-id>" to get the OSDs it maps to.
Then check the state of those OSDs by looking at their logs under
/var/log/ceph/.
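
A rough sequence for the inconsistent PGs in that output (the PG id is a
placeholder; the "too many PGs per OSD" warning is a separate matter,
typically addressed by adding OSDs):

  ceph health detail     # lists the inconsistent PGs and scrub errors
  ceph pg map 2.1f       # shows which OSDs serve a given PG
  ceph pg repair 2.1f    # asks the primary OSD to repair the inconsistency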

Regards,
Anand

On Mon, May 23, 2016 at 6:53 AM, Ken Peng  wrote:

> Hi,
>
> # ceph -s
> cluster 82c855ce-b450-4fba-bcdf-df2e0c958a41
>  health HEALTH_ERR
> 5 pgs inconsistent
> 7 scrub errors
> too many PGs per OSD (318 > max 300)
>
>
> It's HEALTH_ERR above, how to fix up them? Thanks.
>
>
>


-- 

Never say never.


Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Anand Bhat
Page reclamation in Linux is NUMA aware.  So page reclamation is not an issue.

You can see performance improvements only if all the components of a given IO
complete on a single core. This is hard to achieve in Ceph, as a single IO
goes through multiple thread switches and the threads are not bound to any
core. Starting an OSD with numactl and binding it to one core might aggravate
the problem, as all the threads spawned by that OSD will compete for the CPU
on a single core; an OSD with the default configuration has 20+ threads.
Binding the OSD process to one core using taskset does not help either, as
some memory (especially heap) may already be allocated on the other NUMA node.
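
For reference, the kind of pinning being discussed looks roughly like this
(OSD id, NUMA node and core list are illustrative, and <osd-pid> is a
placeholder):

  # start an OSD with both CPU and memory confined to NUMA node 0
  numactl --cpunodebind=0 --membind=0 ceph-osd -i 3 --cluster ceph
  # or pin all threads of an already-running OSD to cores 0-7
  # (memory already allocated elsewhere stays where it is)
  taskset -acp 0-7 <osd-pid>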

It looks like the design principle is to fan out by spawning multiple threads
at each pipeline stage to utilize the available cores in the system. Because
an IO won't complete on the same core it was issued on, a lot of cycles are
lost to cache coherency.

Regards,
Anand



-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Stijn 
De Weirdt
Sent: Monday, September 22, 2014 2:36 PM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] IRQ balancing, distribution

 but another issue is the OSD processes: do you pin those as well? and
 how much data do they actually handle. to checksum, the OSD process
 needs all data, so that can also cause a lot of NUMA traffic, esp if
 they are not pinned.

 That's why all my (production) storage nodes have only a single 6 or 8
 core CPU. Unfortunately that also limits the amount of RAM in there,
 16GB modules have just recently become an economically viable
 alternative to 8GB ones.

 Thus I don't pin OSD processes, given that on my 8 core nodes with 8
 OSDs and 4 journal SSDs I can make Ceph eat babies and nearly all CPU
 (not
 IOwait!) resources with the right (or is that wrong) tests, namely 4K
 FIOs.

 The linux scheduler usually is quite decent in keeping processes where
 the action is, thus you see for example a clear preference of DRBD or
 KVM vnet processes to be near or on the CPU(s) where the IRQs are.
The scheduler has improved recently, but I don't know since what version
(certainly not backported to the RHEL6 kernel).

Pinning the OSDs might actually be a bad idea, unless the page cache is
flushed before each OSD restart. The kernel VM has this nice feature where
allocating memory in a NUMA domain does not trigger freeing of cache memory
in that domain; it will first try to allocate memory on another NUMA domain.
Although the VM cache will typically be maxed out on OSD boxes, I'm not sure
the cache clearing itself is NUMA aware, so who knows where the memory ends
up when it's allocated.


stijn






[ceph-users] Regarding ceph osd setmaxosd

2014-07-18 Thread Anand Bhat
Hi,

I have a question on the intent of the Ceph setmaxosd command. From the
source code, it appears to be a way to limit the number of OSDs in the Ceph
cluster. Can it be used to shrink the number of OSDs in the cluster without
gracefully shutting down OSDs and letting recovery/remapping happen?
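
For context, the command in question looks like this (the value is just an
example):

  ceph osd setmaxosd 12   # sets the max_osd field in the OSD map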

Regards,
Anand



