Went through it briefly, looks fine, though I'd like to go over it
some more before picking this up. Note that LIBRADOS_VER_MINOR needs
to be bumped up too.
Thanks,
Yehuda
On Fri, Dec 14, 2012 at 3:18 AM, Filippos Giannakos wrote:
> ---
> src/include/rados/librados.h | 14 ++
>
Reviewed-by: Sage Weil
On Thu, 13 Dec 2012, Alex Elder wrote:
> A connection's socket can close for any reason, independent of the
> state of the connection (and irrespective of the connection
> mutex). As a result, the connection can be in pretty much any state
> at the time its socket
Most of the code uses int64_t/__s64 for the pool id, although in a few
cases we screwed up and limited it to 32 bits. In reality, that's way
overkill anyway; we could have left it at 32 bits to begin with.
My first instinct would be to change the return type to long long or s64
and avoid the u
Reviewed-by: Sage Weil
On Thu, 13 Dec 2012, Alex Elder wrote:
> ENOTSUPP is not a standard errno (it shows up as "Unknown error 524"
> in an error message). This is what was getting produced when
> the local rbd code does not implement features required by a
> discovered rbd image.
>
> Cha
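A quick user-space illustration of the point in the quoted commit message: the value 524 comes from the "Unknown error 524" text above, and the snippet below only checks whether that value has a symbolic name in the standard errno table. EOPNOTSUPP is used purely as an example of a standard code; the truncated message does not show which code the patch switches to, so treat that choice as an assumption.

```python
import errno
import os

# Value taken from the "Unknown error 524" message quoted above.
ENOTSUPP_KERNEL = 524


def describe(code):
    """Return (symbolic name or None, message text) for an errno value."""
    return errno.errorcode.get(code), os.strerror(code)


# No symbolic name is expected for 524; the message text is platform-dependent.
print(describe(ENOTSUPP_KERNEL))

# A standard code, by contrast, has both a symbolic name and a real message.
print(describe(errno.EOPNOTSUPP))
```

On Linux with glibc the first message is expected to read "Unknown error 524", which matches what the commit message describes.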
Reviewed-by: Sage Weil
On Thu, 13 Dec 2012, Alex Elder wrote:
> In __unregister_linger_request(), the request is being removed
> from the osd client's req_linger list only when the request
> has a non-null osd pointer. It should be done whether or not
> the request currently has an osd.
>
> Th
Reviewed-by: Sage Weil
On Fri, 14 Dec 2012, Alex Elder wrote:
> When a connection's socket disconnects, or if there's a protocol
> error of some kind on the connection, a fault is signaled and
> the connection is reset (closed and reopened, basically). We
> currently get an error message on the
When a connection's socket disconnects, or if there's a protocol
error of some kind on the connection, a fault is signaled and
the connection is reset (closed and reopened, basically). We
currently get an error message on the log whenever this occurs.
A ceph connection will attempt to reestablish
We should drop this one, I think. See upstream commit
4c199a93a2d36b277a9fd209a0f2793f8460a215. When we added the similar call
on the request tree it caused some noise in linux-next and then got
removed.
sage
On Thu, 13 Dec 2012, Alex Elder wrote:
> It turns out to be harmless but the red-b
Reviewed-by: Sage Weil
On Thu, 13 Dec 2012, Alex Elder wrote:
> RBD_MAX_SEG_NAME_LEN represents the maximum length of an rbd object
> name (i.e., one of the objects providing storage backing an rbd
> image).
>
> Another symbol, MAX_OBJ_NAME_SIZE, is used in the osd client code to
> define the m
Reviewed-by: Sage Weil
On Thu, 13 Dec 2012, Alex Elder wrote:
> If an osd has no requests and no linger requests, __reset_osd()
> will just remove it with a call to __remove_osd(). That drops
> a reference to the osd, and therefore the osd may have been freed
> by the time __reset_osd() returns.
Reviewed-by: Sage Weil
On Thu, 13 Dec 2012, Alex Elder wrote:
> There is no check in rbd_remove() to see if anybody holds open the
> image being removed. That's not cool.
>
> Add a simple open count that goes up and down with opens and closes
> (releases) of the device, and don't allow an rbd
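The open-count idea in the quoted message can be sketched in a few lines. This is a toy user-space model, not the rbd driver code: the `Device` class, its method names, and the use of EBUSY as the refusal error are all assumptions made for illustration, since the snippet above is cut off before it says what the patch returns.

```python
import errno


class Device:
    """Toy stand-in for a block device that tracks how many times it is open."""

    def __init__(self, name):
        self.name = name
        self.open_count = 0
        self.removed = False

    def open(self):
        if self.removed:
            raise OSError(errno.ENOENT, "device is gone")
        self.open_count += 1

    def release(self):
        if self.open_count == 0:
            raise RuntimeError("release without a matching open")
        self.open_count -= 1

    def remove(self):
        # Refuse removal while anyone still holds the device open.
        # EBUSY is an assumed error code, not taken from the patch.
        if self.open_count > 0:
            raise OSError(errno.EBUSY, "device is still open")
        self.removed = True
```

Removal succeeds only once every open has been balanced by a release.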
I have updated the "testing" branch in the ceph-client git
repository again, and you'll find that a "forced update" is
needed to bring your own repository up to date.
This will probably be necessary again at some point once
we get some reviews done on commits still in this branch,
but we'll try no
Hi Sage,
this was just an idea and i need to fix MY uuid problem. But then the
crash is still a problem of ceph. Have you looked into my log?
On 14.12.2012 20:42, Sage Weil wrote:
On Fri, 14 Dec 2012, Stefan Priebe wrote:
One more IMPORTANT note. This might happen due to the fact that a dis
On Fri, 14 Dec 2012, Stefan Priebe wrote:
> One more IMPORTANT note. This might happen due to the fact that a disk was
> missing (disk failure) after the reboot.
>
> fstab and mountpoint are working with UUIDs so they match but the journal
> block device:
> osd journal = /dev/sde1
>
> didn't matc
On 12/14/2012 10:53 AM, Nick Bartos wrote:
> Yes I was only enabling debugging for libceph. I'm adding debugging
> for rbd as well. I'll do a repro later today when a test cluster
> opens up.
Excellent, thank you. -Alex
On 12/14/2012 09:59 AM, Joao Eduardo Luis wrote:
> On 12/14/2012 03:41 PM, Jim Schutt wrote:
>> Hi,
>>
>> I'm looking at commit e3ed28eb2 in the next branch,
>> and I have a question.
>>
>> Shouldn't the limit be pg_num > 65536, because
>> PGs are numbered 0 thru pg_num-1?
>>
>> If not, what am I m
On 12/14/2012 03:41 PM, Jim Schutt wrote:
Hi,
I'm looking at commit e3ed28eb2 in the next branch,
and I have a question.
Shouldn't the limit be pg_num > 65536, because
PGs are numbered 0 thru pg_num-1?
If not, what am I missing?
FWIW, up through yesterday I've been using the next branch and t
The kernel is 3.5.7 with the following patches applied (and in the
order specified below):
001-libceph_eliminate_connection_state_DEAD_13_days_ago.patch
002-libceph_kill_bad_proto_ceph_connection_op_13_days_ago.patch
003-libceph_rename_socket_callbacks_13_days_ago.patch
004-libceph_rename_kvec_res
On 12/13/2012 01:00 PM, Nick Bartos wrote:
> Here's another log with the kernel debugging enabled:
> https://gist.github.com/raw/4278697/1c9e41d275e614783fbbdee8ca5842680f46c249/rbd-hang-1355424455.log
>
> Note that it hung on the 2nd try.
Just to make sure I'm working with the right code base, c
Hello Mark,
On 14.12.2012 16:20, Mark Nelson wrote:
sudo parted -s -a optimal /dev/$DEV mklabel gpt
sudo parted -s -a optimal /dev/$DEV mkpart osd-device-$i-journal 0% 10G
sudo parted -s -a optimal /dev/$DEV mkpart osd-device-$i-data 10G 100%
Isn't that the part type you're using?
mkpart par
Hi,
I'm looking at commit e3ed28eb2 in the next branch,
and I have a question.
Shouldn't the limit be pg_num > 65536, because
PGs are numbered 0 thru pg_num-1?
If not, what am I missing?
FWIW, up through yesterday I've been using the next branch and this:
ceph osd pool set data pg_num 65536
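The off-by-one reasoning in the question can be checked with a little arithmetic. The sketch below assumes the 65536 figure comes from a 16-bit PG number field; that is an inference from the number itself, not something stated in the message or verified against commit e3ed28eb2, and the helper names are hypothetical.

```python
# Assumption: PG numbers must fit in 16 bits, as the 65536 figure suggests.
PG_ID_BITS = 16


def max_pg_id(pg_num):
    """PGs are numbered 0 thru pg_num-1, so the largest id is pg_num - 1."""
    return pg_num - 1


def fits(pg_num, bits=PG_ID_BITS):
    """True if every PG id for this pg_num fits in the given number of bits."""
    return pg_num > 0 and max_pg_id(pg_num) < (1 << bits)


print(max_pg_id(65536))  # 65535
print(fits(65536))       # True
print(fits(65537))       # False
```

Under that assumption, pg_num = 65536 is still representable, so a check that rejects it would need to be `pg_num > 65536` rather than `pg_num >= 65536`.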
Hi Mark,
On 14.12.2012 16:20, Mark Nelson wrote:
sudo parted -s -a optimal /dev/$DEV mklabel gpt
sudo parted -s -a optimal /dev/$DEV mkpart osd-device-$i-journal 0% 10G
sudo parted -s -a optimal /dev/$DEV mkpart osd-device-$i-data 10G 100%
My disks are gpt too and i'm also using parted. But
Hi Stefan,
Here's what I often do when I have a journal and data partition sharing
a disk:
sudo parted -s -a optimal /dev/$DEV mklabel gpt
sudo parted -s -a optimal /dev/$DEV mkpart osd-device-$i-journal 0% 10G
sudo parted -s -a optimal /dev/$DEV mkpart osd-device-$i-data 10G 100%
Mark
On 12
Hi Mark,
but how do I set a label for a partition without an FS, like the journal blockdev?
On 14.12.2012 16:01, Mark Nelson wrote:
I often map partitions to something in /dev/disk/by-partlabel and use
those in my ceph.conf files. that way disks can be remapped behind the
scenes and the ceph configur
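To show why a stable name helps here, the sketch below resolves a symlink to whatever device node it currently points at. It uses a temporary directory in place of /dev so it can run anywhere; the label `osd-device-0-journal` follows the naming pattern from the parted commands in this thread with `$i` set to 0, and the device names `sde1` and `sdd1` are illustrative only.

```python
import os
import tempfile


def resolve_device(stable_path):
    """Follow a stable symlink to the device node it currently points at."""
    return os.path.realpath(stable_path)


def demo():
    """Simulate a disk being renumbered, with a temp dir standing in for /dev."""
    with tempfile.TemporaryDirectory() as dev:
        by_label = os.path.join(dev, "by-partlabel")
        os.mkdir(by_label)
        link = os.path.join(by_label, "osd-device-0-journal")
        results = []
        # Same partition, before and after the kernel renumbers the disks.
        for node in ("sde1", "sdd1"):
            target = os.path.join(dev, node)
            open(target, "w").close()
            if os.path.lexists(link):
                os.remove(link)
            os.symlink(target, link)
            results.append(os.path.basename(resolve_device(link)))
        return results


print(demo())  # ['sde1', 'sdd1']
```

The path used in the configuration stays the same in both cases; only the node it resolves to changes.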
Hello Dennis,
On 14.12.2012 15:52, Dennis Jacobfeuerborn wrote:
didn't match anymore - as the numbers got renumbered due to the failed disk.
Is there a way to use some kind of UUIDs here too for journal?
You should be able to use /dev/disk/by-uuid/* instead. That should give you
a stable view
On 12/14/2012 08:52 AM, Dennis Jacobfeuerborn wrote:
On 12/14/2012 10:14 AM, Stefan Priebe wrote:
One more IMPORTANT note. This might happen due to the fact that a disk was
missing (disk failure) after the reboot.
fstab and mountpoint are working with UUIDs so they match but the journal
block de
On 12/13/2012 08:54 AM, Lachfeld, Jutta wrote:
Hi all,
Hi! Sorry to send this a bit late, it looks like the reply I authored
yesterday from my phone got eaten by vger.
I am currently doing some comparisons between CEPH FS and HDFS as a file system
for Hadoop using Hadoop's integrated ben
On 12/14/2012 10:14 AM, Stefan Priebe wrote:
> One more IMPORTANT note. This might happen due to the fact that a disk was
> missing (disk failure) after the reboot.
>
> fstab and mountpoint are working with UUIDs so they match but the journal
> block device:
> osd journal = /dev/sde1
>
> didn't m
Hi Noah, Gregory and Sage,
first of all, thanks for your quick replies. Here are some answers to your
questions.
Gregory, I have got the output of "ceph -s" before and after this specific
TeraSort run, and to me it looks ok; all 30 osds are "up":
health HEALTH_OK
monmap e1: 1 mons at {0=
Hi team,
I forgot to include a description (also cc-ing correctly the
synnefo-devel list).
I am a member of the Synnefo team, where we are experimenting with RADOS
as a storage backend to host blocks for our volume block storage named
"archipelago".
In this patch I implement aio stat and a
---
src/include/rados/librados.h | 14 ++
src/include/rados/librados.hpp | 15 +-
src/librados/IoCtxImpl.cc | 42
src/librados/IoCtxImpl.h | 9 +
src/librados/librados.cc | 10 ++
5 files
One more IMPORTANT note. This might happen due to the fact that a disk
was missing (disk failure) after the reboot.
fstab and mountpoint are working with UUIDs so they match but the
journal block device:
osd journal = /dev/sde1
didn't match anymore - as the numbers got renumbered due to the fai
On 12/13/2012 07:17 PM, Yehuda Sadeh wrote:
> On Thu, Dec 13, 2012 at 7:37 AM, Stratos Psomadakis wrote:
>> Signed-off-by: Stratos Psomadakis
>> ---
>> Hi Josh,
>>
>> This patch adds the '--json' flag to enable dumping the showmapped output in
> I think that it should be "--format=json" rather th
On 14/12/12 04:38, Gary Lowell wrote:
>> I think that the "--debbuildopts '-j8 -b'" might be trouncing
>> the
>>> --binary-arch flag - I'll get pbuilder setup and give it a
>>> test - I normally use sbuild (for which the packaging changes
>>> did h
same log more verbose:
11 ec=10 les/c 3307/3307 3306/3306/3306) [] r=0 lpr=0 lcod 0'0 mlcod 0'0
inactive] read_log done
-11> 2012-12-14 09:17:50.648572 7fb6e0d6b780 10 osd.3 pg_epoch: 3996
pg[3.44b( v 3988'3969 (1379'2968,3988'3969] local-les=3307 n=11 ec=10
les/c 3307/3307 3306/3306/3306) [
Hello list,
after a reboot of my node i see this on all OSDs of this node after the
reboot:
2012-12-14 09:03:20.393224 7f8e652f8780 -1 osd/OSD.cc: In function
'OSDMapRef OSDService::get_map(epoch_t)' thread 7f8e652f8780 time
2012-12-14 09:03:20.392528
osd/OSD.cc: 4385: FAILED assert(_get_ma