Re: OSD deadlock with cephfs client and OSD on same machine

2012-06-01 Thread Amon Ott
On Wednesday 30 May 2012 wrote Amon Ott:
> On Tuesday 29 May 2012 you wrote:
> > On Tue, 29 May 2012, Amon Ott wrote:
> > > Please consider putting out a fat warning at least at build time, if
> > > syncfs() is not available, e.g. "No syncfs() syscall, please expect a
> > > deadlock when running osd on non-btrfs together with a local cephfs
> > > mount." Even better would be a quick runtime test for missing syncfs()
> > > and storage on non-btrfs that spits out a warning, if deadlock is
> > > possible.
> >
> > I think a runtime warning makes more sense; nobody will see the build
> > time warning (e.g., those installed debs).
>
> Yes, fully agreed.

Thanks for the new log lines in master git. The warning without syncfs() 
support could be a bit clearer though - the system is not only slower, it 
hangs and needs a reset and reboot. This is much worse, especially if cephfs is 
permanently broken by bug 1047 afterwards. And I am pretty sure that our 
systems were not running out of memory, because during our load tests we 
always have several GB of unused memory.

After backporting syncfs() support into Debian stable libc6 2.11 and 
recompiling Ceph with it, our test cluster is now running with syncfs().
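
(For anyone else stuck on an older glibc: a minimal sketch of calling syncfs()
directly via syscall(). The syscall number below is the x86_64 value and is an
assumption - check asm/unistd.h for your architecture.)

#include <unistd.h>
#include <sys/syscall.h>

#ifndef SYS_syncfs
#define SYS_syncfs 306	/* x86_64 value; assumed - verify against <asm/unistd.h> */
#endif

/* Sync only the filesystem containing fd, instead of sync()ing everything. */
static int syncfs_compat(int fd)
{
	return syscall(SYS_syncfs, fd);
}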

A first two-hour load test this morning did not produce any problems, so I can 
say that syncfs() makes it significantly more stable than sync(). We will 
run a several-day load test soon.

Amon Ott
-- 
Dr. Amon Ott
m-privacy GmbH           Tel: +49 30 24342334
Am Köllnischen Park 1    Fax: +49 30 24342336
10179 Berlin             http://www.m-privacy.de

Amtsgericht Charlottenburg, HRB 84946

Managing directors:
 Dipl.-Kfm. Holger Maczkowsky,
 Roman Maczkowsky

GnuPG-Key-ID: 0x2DD3A649


Re: How will Ceph cope with a failed Journal device?

2012-06-01 Thread Jerker Nyberg

On Fri, 18 May 2012, Tommi Virtanen wrote:


Losing a journal with btrfs: creating a new journal should let the osd
recover the missing parts from replicas (and your data is safe mostly
because of Ceph replication, recovery is just faster).


Cool! No more SSDs (which might fail from being written to continuously 
after a couple of months, depending on size, price, write cycles etc.) - just 
add a lot of RAM, keep the journals on tmpfs and make sure to run Ceph on 
Btrfs? While keeping the replicas separated so they don't all fail at once.


The contents of the storage node will not be corrupted or anything (just a 
bit old) when losing the journal?


--jerker


Fwd: Re: [PATCH 04/13] libceph: rename socket callbacks

2012-06-01 Thread Alex Elder

Forgot to "reply-all" my response. -Alex

 Original Message 
Subject: Re: [PATCH 04/13] libceph: rename socket callbacks
Date: Fri, 01 Jun 2012 07:00:10 -0500
From: Alex Elder 
To: Sage Weil 

On 05/31/2012 11:02 PM, Sage Weil wrote:

On Wed, 30 May 2012, Alex Elder wrote:

Change the names of the three socket callback functions to make it
more obvious they're specifically associated with a connection's
socket (not the ceph connection that uses it).

Signed-off-by: Alex Elder
---
  net/ceph/messenger.c |   28 ++--
  1 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index fe3c2a1..5ad1f0a 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -153,46 +153,46 @@ EXPORT_SYMBOL(ceph_msgr_flush);
   */

  /* data available on socket, or listen socket received a connect */
-static void ceph_data_ready(struct sock *sk, int count_unused)
+static void ceph_sock_data_ready(struct sock *sk, int count_unused)
  {
struct ceph_connection *con = sk->sk_user_data;

if (sk->sk_state != TCP_CLOSE_WAIT) {
-   dout("ceph_data_ready on %p state = %lu, queueing work\n",
+   dout("%s on %p state = %lu, queueing work\n", __func__,


I think it's marginally better to do

dout(__func__ " on %p state = %lu, queueing work\n",

so that the concatenation happens at compile-time instead of runtime.


I think that's a good idea, but we can't assume __func__ is in fact
a constant, can we?  __LINE__ and __FILE__ are, but I think __func__
is named lower case to emphasize that it is not a string literal but
a variable.  And this concatenation only works for string literals.

-Alex
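
(For reference, a small standalone sketch of the distinction: C99 defines
__func__ as an implicitly declared array object, not a string literal, so it
cannot take part in literal concatenation.)

#include <stdio.h>

void some_function(void)
{
	/*
	 * C99 6.4.2.2: the compiler behaves as if the following declaration
	 * appeared at the top of the function body:
	 *
	 *	static const char __func__[] = "some_function";
	 */
	printf("%s on %p\n", __func__, (void *)0);	/* fine: runtime formatting */

	/*
	 * printf(__func__ " on %p\n", (void *)0);
	 *
	 * ...does not compile: string concatenation is a translation-time
	 * operation on string literals only.
	 */
}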


Otherwise, looks good!

Reviewed-by: Sage Weil


con, con->state);
queue_con(con);
}
  }

  /* socket has buffer space for writing */
-static void ceph_write_space(struct sock *sk)
+static void ceph_sock_write_space(struct sock *sk)
  {
struct ceph_connection *con = sk->sk_user_data;

/* only queue to workqueue if there is data we want to write,
 * and there is sufficient space in the socket buffer to accept
-* more data.  clear SOCK_NOSPACE so that ceph_write_space()
+* more data.  clear SOCK_NOSPACE so that ceph_sock_write_space()
 * doesn't get called again until try_write() fills the socket
 * buffer. See net/ipv4/tcp_input.c:tcp_check_space()
 * and net/core/stream.c:sk_stream_write_space().
 */
if (test_bit(WRITE_PENDING,&con->state)) {
if (sk_stream_wspace(sk)>= sk_stream_min_wspace(sk)) {
-   dout("ceph_write_space %p queueing write work\n",
con);
+   dout("%s %p queueing write work\n", __func__, con);
clear_bit(SOCK_NOSPACE,&sk->sk_socket->flags);
queue_con(con);
}
} else {
-   dout("ceph_write_space %p nothing to write\n", con);
+   dout("%s %p nothing to write\n", __func__, con);
}
  }

  /* socket's state has changed */
-static void ceph_state_change(struct sock *sk)
+static void ceph_sock_state_change(struct sock *sk)
  {
struct ceph_connection *con = sk->sk_user_data;

-   dout("ceph_state_change %p state = %lu sk_state = %u\n",
+   dout("%s %p state = %lu sk_state = %u\n", __func__,
 con, con->state, sk->sk_state);

if (test_bit(CLOSED,&con->state))
@@ -200,9 +200,9 @@ static void ceph_state_change(struct sock *sk)

switch (sk->sk_state) {
case TCP_CLOSE:
-   dout("ceph_state_change TCP_CLOSE\n");
+   dout("%s TCP_CLOSE\n", __func__);
case TCP_CLOSE_WAIT:
-   dout("ceph_state_change TCP_CLOSE_WAIT\n");
+   dout("%s TCP_CLOSE_WAIT\n", __func__);
if (test_and_set_bit(SOCK_CLOSED,&con->state) == 0) {
if (test_bit(CONNECTING,&con->state))
con->error_msg = "connection failed";
@@ -212,7 +212,7 @@ static void ceph_state_change(struct sock *sk)
}
break;
case TCP_ESTABLISHED:
-   dout("ceph_state_change TCP_ESTABLISHED\n");
+   dout("%s TCP_ESTABLISHED\n", __func__);
queue_con(con);
break;
default:/* Everything else is uninteresting */
@@ -228,9 +228,9 @@ static void set_sock_callbacks(struct socket *sock,
  {
struct sock *sk = sock->sk;
sk->sk_user_data = con;
-   sk->sk_data_ready = ceph_data_ready;
-   sk->sk_write_space = ceph_write_space;
-   sk->sk_state_change = ceph_state_change;
+   sk->sk_data_ready = ceph_sock_data_ready;
+   sk->sk_write_space = ceph_sock_write_space;
+   sk->sk_state_change = ceph_sock_state_change;
  }


--
1.7.5.4


Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client

2012-06-01 Thread Alex Elder

On 05/31/2012 11:24 PM, Sage Weil wrote:

On Wed, 30 May 2012, Alex Elder wrote:

A monitor client has a pointer to a ceph connection structure in it.
This is the only one of the three ceph client types that does it this
way; the OSD and MDS clients embed the connection into their main
structures.  There is always exactly one ceph connection for a
monitor client, so there is no need to allocate it separately from the
monitor client structure.

So switch the ceph_mon_client structure to embed its
ceph_connection structure.

Signed-off-by: Alex Elder
---
  include/linux/ceph/mon_client.h |2 +-
  net/ceph/mon_client.c   |   47 --
  2 files changed, 21 insertions(+), 28 deletions(-)

diff --git a/include/linux/ceph/mon_client.h b/include/linux/ceph/mon_client.h
index 545f859..2113e38 100644
--- a/include/linux/ceph/mon_client.h
+++ b/include/linux/ceph/mon_client.h
@@ -70,7 +70,7 @@ struct ceph_mon_client {
bool hunting;
int cur_mon;   /* last monitor i contacted */
unsigned long sub_sent, sub_renew_after;
-   struct ceph_connection *con;
+   struct ceph_connection con;
bool have_fsid;

/* pending generic requests */
diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
index 704dc95..ac4d6b1 100644
--- a/net/ceph/mon_client.c
+++ b/net/ceph/mon_client.c
@@ -106,9 +106,9 @@ static void __send_prepared_auth_request(struct
ceph_mon_client *monc, int len)
monc->pending_auth = 1;
monc->m_auth->front.iov_len = len;
monc->m_auth->hdr.front_len = cpu_to_le32(len);
-   ceph_con_revoke(monc->con, monc->m_auth);
+   ceph_con_revoke(&monc->con, monc->m_auth);
ceph_msg_get(monc->m_auth);  /* keep our ref */
-   ceph_con_send(monc->con, monc->m_auth);
+   ceph_con_send(&monc->con, monc->m_auth);
  }

  /*
@@ -117,8 +117,8 @@ static void __send_prepared_auth_request(struct
ceph_mon_client *monc, int len)
  static void __close_session(struct ceph_mon_client *monc)
  {
dout("__close_session closing mon%d\n", monc->cur_mon);
-   ceph_con_revoke(monc->con, monc->m_auth);
-   ceph_con_close(monc->con);
+   ceph_con_revoke(&monc->con, monc->m_auth);
+   ceph_con_close(&monc->con);
monc->cur_mon = -1;
monc->pending_auth = 0;
ceph_auth_reset(monc->auth);
@@ -142,9 +142,9 @@ static int __open_session(struct ceph_mon_client *monc)
monc->want_next_osdmap = !!monc->want_next_osdmap;

dout("open_session mon%d opening\n", monc->cur_mon);
-   monc->con->peer_name.type = CEPH_ENTITY_TYPE_MON;
-   monc->con->peer_name.num = cpu_to_le64(monc->cur_mon);
-   ceph_con_open(monc->con,
+   monc->con.peer_name.type = CEPH_ENTITY_TYPE_MON;
+   monc->con.peer_name.num = cpu_to_le64(monc->cur_mon);
+   ceph_con_open(&monc->con,
&monc->monmap->mon_inst[monc->cur_mon].addr);

/* initiatiate authentication handshake */
@@ -226,8 +226,8 @@ static void __send_subscribe(struct ceph_mon_client *monc)

msg->front.iov_len = p - msg->front.iov_base;
msg->hdr.front_len = cpu_to_le32(msg->front.iov_len);
-   ceph_con_revoke(monc->con, msg);
-   ceph_con_send(monc->con, ceph_msg_get(msg));
+   ceph_con_revoke(&monc->con, msg);
+   ceph_con_send(&monc->con, ceph_msg_get(msg));

monc->sub_sent = jiffies | 1;  /* never 0 */
}
@@ -247,7 +247,7 @@ static void handle_subscribe_ack(struct ceph_mon_client
*monc,
if (monc->hunting) {
pr_info("mon%d %s session established\n",
monc->cur_mon,
-   ceph_pr_addr(&monc->con->peer_addr.in_addr));
+   ceph_pr_addr(&monc->con.peer_addr.in_addr));
monc->hunting = false;
}
dout("handle_subscribe_ack after %d seconds\n", seconds);
@@ -461,7 +461,7 @@ static int do_generic_request(struct ceph_mon_client
*monc,
req->request->hdr.tid = cpu_to_le64(req->tid);
__insert_generic_request(monc, req);
monc->num_generic_requests++;
-   ceph_con_send(monc->con, ceph_msg_get(req->request));
+   ceph_con_send(&monc->con, ceph_msg_get(req->request));
mutex_unlock(&monc->mutex);

err = wait_for_completion_interruptible(&req->completion);
@@ -684,8 +684,8 @@ static void __resend_generic_request(struct
ceph_mon_client *monc)

for (p = rb_first(&monc->generic_request_tree); p; p = rb_next(p)) {
req = rb_entry(p, struct ceph_mon_generic_request, node);
-   ceph_con_revoke(monc->con, req->request);
-   ceph_con_send(monc->con, ceph_msg_get(req->request));
+   ceph_con_revoke(&monc->con, req->request);
+   ceph_con_send(&monc->con, ceph_msg_get(req->request));
   

Re: [PATCH 08/13] libceph: start separating connection flags from state

2012-06-01 Thread Alex Elder

On 05/31/2012 11:25 PM, Sage Weil wrote:

On Wed, 30 May 2012, Alex Elder wrote:

A ceph_connection holds a mixture of connection state (as in "state
machine" state) and connection flags in a single "state" field.  To
make the distinction more clear, define a new "flags" field and use
it rather than the "state" field to hold Boolean flag values.

Signed-off-by: Alex Elder
---
  include/linux/ceph/messenger.h |   18 +
  net/ceph/messenger.c   |   50

  2 files changed, 37 insertions(+), 31 deletions(-)

diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 3fbd4be..920235e 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -103,20 +103,25 @@ struct ceph_msg_pos {
  #define MAX_DELAY_INTERVAL(5 * 60 * HZ)

  /*
- * ceph_connection state bit flags
+ * ceph_connection flag bits
   */
+
  #define LOSSYTX 0  /* we can close channel or drop messages on errors
*/
-#define CONNECTING 1
-#define NEGOTIATING2
  #define KEEPALIVE_PENDING  3
  #define WRITE_PENDING 4  /* we have data ready to send */
+#define SOCK_CLOSED11 /* socket state changed to closed */
+#define BACKOFF 15
+
+/*
+ * ceph_connection states
+ */
+#define CONNECTING 1
+#define NEGOTIATING2
  #define STANDBY   8  /* no outgoing messages, socket closed.  we
keep
* the ceph_connection around to maintain shared
* state with the peer. */
  #define CLOSED10 /* we've closed the connection */
-#define SOCK_CLOSED11 /* socket state changed to closed */
  #define OPENING 13 /* open connection w/ (possibly new) peer */
-#define BACKOFF 15


Later it might be worth prefixing these with FLAG_ and/or STATE_.


Absolutely, I'm saving that easy stuff for the end.  I'll move the
definitions into messenger.c as well if I can.


Reviewed-by: Sage Weil


Thanks.

-Alex



  /*
   * A single connection with another host.
@@ -133,7 +138,8 @@ struct ceph_connection {

struct ceph_messenger *msgr;
struct socket *sock;
-   unsigned long state;/* connection state (see flags above) */
+   unsigned long flags;
+   unsigned long state;
const char *error_msg;  /* error message, if any */

struct ceph_entity_addr peer_addr; /* peer address */
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 19f1948..29055df 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -176,7 +176,7 @@ static void ceph_sock_write_space(struct sock *sk)
 * buffer. See net/ipv4/tcp_input.c:tcp_check_space()
 * and net/core/stream.c:sk_stream_write_space().
 */
-   if (test_bit(WRITE_PENDING,&con->state)) {
+   if (test_bit(WRITE_PENDING,&con->flags)) {
if (sk_stream_wspace(sk)>= sk_stream_min_wspace(sk)) {
dout("%s %p queueing write work\n", __func__, con);
clear_bit(SOCK_NOSPACE,&sk->sk_socket->flags);
@@ -203,7 +203,7 @@ static void ceph_sock_state_change(struct sock *sk)
dout("%s TCP_CLOSE\n", __func__);
case TCP_CLOSE_WAIT:
dout("%s TCP_CLOSE_WAIT\n", __func__);
-   if (test_and_set_bit(SOCK_CLOSED,&con->state) == 0) {
+   if (test_and_set_bit(SOCK_CLOSED,&con->flags) == 0) {
if (test_bit(CONNECTING,&con->state))
con->error_msg = "connection failed";
else
@@ -393,9 +393,9 @@ void ceph_con_close(struct ceph_connection *con)
 ceph_pr_addr(&con->peer_addr.in_addr));
set_bit(CLOSED,&con->state);  /* in case there's queued work */
clear_bit(STANDBY,&con->state);  /* avoid connect_seq bump */
-   clear_bit(LOSSYTX,&con->state);  /* so we retry next connect */
-   clear_bit(KEEPALIVE_PENDING,&con->state);
-   clear_bit(WRITE_PENDING,&con->state);
+   clear_bit(LOSSYTX,&con->flags);  /* so we retry next connect */
+   clear_bit(KEEPALIVE_PENDING,&con->flags);
+   clear_bit(WRITE_PENDING,&con->flags);
mutex_lock(&con->mutex);
reset_connection(con);
con->peer_global_seq = 0;
@@ -612,7 +612,7 @@ static void prepare_write_message(struct ceph_connection
*con)
prepare_write_message_footer(con);
}

-   set_bit(WRITE_PENDING,&con->state);
+   set_bit(WRITE_PENDING,&con->flags);
  }

  /*
@@ -633,7 +633,7 @@ static void prepare_write_ack(struct ceph_connection *con)
&con->out_temp_ack);

con->out_more = 1;  /* more will follow.. eventually.. */
-   set_bit(WRITE_PENDING,&con->state);
+   set_bit(WRITE_PENDING,&con->flags);
  }

  /*
@@ -644,7 +644,7 @@ static void prepare_write_keepalive(struct ceph_connection
*con)
dout("prepare_write_keepalive %p\

Re: [PATCH 09/13] libceph: start tracking connection socket state

2012-06-01 Thread Alex Elder

On 05/31/2012 11:28 PM, Sage Weil wrote:

On Wed, 30 May 2012, Alex Elder wrote:

Start explicitly keeping track of the state of a ceph connection's
socket, separate from the state of the connection itself.  Create
placeholder functions to encapsulate the state transitions.

 
 | NEW* |  transient initial state
 
 | con_sock_state_init()
 v
 --
 | CLOSED |  initialized, but no socket (and no
 --  TCP connection)
  ^  \
  |   \ con_sock_state_connecting()
  |--
  |  \
  + con_sock_state_closed()   \
  |\   \
  | \   \
  |  --- \
  |  | CLOSING |  socket event;   \
  |  ---  await close  \
  |   ^|
  |   ||
  |   + con_sock_state_closing()   |
  |  / \   |
  | /   ---|
  |/   \   v
  |   /--
  |  /-| CONNECTING |  socket created, TCP
  |  |   / --  connect initiated
  |  |   | con_sock_state_connected()
  |  |   v
 -
 | CONNECTED |  TCP connection established
 -


Can we put this beautiful picture in the header next to the states?


I can be quite the ASCII artist.  Yes, I will add this, when I update
the state definitions with better names and numbers.

-Alex


Reviewed-by: Sage Weil



Make the socket state an atomic variable, reinforcing that it's a
distinct transition with no possible "intermediate/both" states.
This is almost certainly overkill at this point, though the
transitions into CONNECTED and CLOSING state do get called via
socket callback (the rest of the transitions occur with the
connection mutex held).  We can back out the atomicity later.

Signed-off-by: Alex Elder
---
  include/linux/ceph/messenger.h |8 -
  net/ceph/messenger.c   |   63

  2 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 920235e..5e852f4 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -137,14 +137,18 @@ struct ceph_connection {
const struct ceph_connection_operations *ops;

struct ceph_messenger *msgr;
+
+   atomic_t sock_state;
struct socket *sock;
+   struct ceph_entity_addr peer_addr; /* peer address */
+   struct ceph_entity_addr peer_addr_for_me;
+
unsigned long flags;
unsigned long state;
const char *error_msg;  /* error message, if any */

-   struct ceph_entity_addr peer_addr; /* peer address */
struct ceph_entity_name peer_name; /* peer name */
-   struct ceph_entity_addr peer_addr_for_me;
+
unsigned peer_features;
u32 connect_seq;  /* identify the most recent connection
 attempt for this connection, client */
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 29055df..7e11b07 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -29,6 +29,14 @@
   * the sender.
   */

+/* State values for ceph_connection->sock_state; NEW is assumed to be 0 */
+
+#define CON_SOCK_STATE_NEW 0   /* ->  CLOSED */
+#define CON_SOCK_STATE_CLOSED  1   /* ->  CONNECTING */
+#define CON_SOCK_STATE_CONNECTING  2   /* ->  CONNECTED or ->  CLOSING
*/
+#define CON_SOCK_STATE_CONNECTED   3   /* ->  CLOSING or ->  CLOSED */
+#define CON_SOCK_STATE_CLOSING 4   /* ->  CLOSED */
+
  /* static tag bytes (protocol control messages) */
  static char tag_msg = CEPH_MSGR_TAG_MSG;
  static char tag_ack = CEPH_MSGR_TAG_ACK;
@@ -147,6 +155,54 @@ void ceph_msgr_flush(void)
  }
  EXPORT_SYMBOL(ceph_msgr_flush);

+/* Connection socket state transition functions */
+
+static void con_sock_state_init(struct ceph_connection *con)
+{
+   int old_state;
+
+   old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
+   if (WARN_ON(old_state != CON_SOCK_STATE_NEW))
+   printk("%s: unexpected old state %d\n", __func__, old_state);
+}
+
+static void con_sock_state_connecting(struct ceph_connection *con)
+{
+   int old_state;
+
+   old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTING);
+   if (WARN_ON(old_state != CON_SOCK_STATE_CLOSED))
+   printk("%s: unexpected old state %d\n", __func__, old_state);
+}
+
+static void con_sock_state_connected(struct ceph_connection *con)
+{
+   int old_state;
+
+   old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTED);
+   if (WARN_ON(

Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client

2012-06-01 Thread Alex Elder

On 06/01/2012 07:12 AM, Alex Elder wrote:

On 05/31/2012 11:24 PM, Sage Weil wrote:

On Wed, 30 May 2012, Alex Elder wrote:

A monitor client has a pointer to a ceph connection structure in it.
This is the only one of the three ceph client types that do it this
way; the OSD and MDS clients embed the connection into their main
structures. There is always exactly one ceph connection for a
monitor client, so there is no need to allocate it separate from the
monitor client structure.

So switch the ceph_mon_client structure to embed its
ceph_connection structure.

Signed-off-by: Alex Elder


. . .


/* authentication */
monc->auth = ceph_auth_init(cl->options->name,
cl->options->key);
if (IS_ERR(monc->auth)) {
err = PTR_ERR(monc->auth);
- goto out_con;
+ goto out_monmap;
}
monc->auth->want_keys =
CEPH_ENTITY_TYPE_AUTH | CEPH_ENTITY_TYPE_MON |
@@ -824,8 +821,6 @@ out_subscribe_ack:
ceph_msg_put(monc->m_subscribe_ack);
out_auth:
ceph_auth_destroy(monc->auth);
-out_con:
- monc->con->ops->put(monc->con);


AH!

This reminds me, these connections need to be refcounted. There's a
->get() and ->put() op defined so that you can refcount the containing
structure. That means that this patch needs to also change


Looking at this again.  Why do they need to be refcounted?  If
this patch is correct in embedding the connection into the
containing structure, then the last reference to the containing
structure is coincident with with the last reference to the
connection.  And the other connections are already embedded into
other containing structures.

So--again assuming it's OK to embed the connection--I would rather
see the ->get and ->put methods for the connection go away entirely
and have the containing structure take care of its own damned ref
counting...

This actually gets into another thing I wanted to do anyway (while
digging through raw memory trying to figure out what's going on).
I want every ceph_message to point back to the connection it is
associated with.  That way there's no need for the OSD (for example)
to keep track of the connection--a revoke is simply an operation
on the message, which would already know the connection from which
it is being revoked.
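
(A rough sketch of the shape being proposed - the field and helper below are
hypothetical illustrations, not an actual patch:)

struct ceph_msg {
	struct ceph_connection *con;	/* hypothetical back-pointer, set when
					 * the message is queued on a connection */
	/* ... existing fields ... */
};

/* Revoke becomes an operation on the message itself. */
static void ceph_msg_revoke(struct ceph_msg *msg)
{
	if (msg->con)
		ceph_con_revoke(msg->con, msg);	/* existing per-connection revoke */
}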

If you think the above approach is good I'll gladly do it now
rather than later.  I think it might eliminate any need to
reference count the connections.

Anyway, when I'm done (if I ever finish!) the state of the connection
will tell you whether any legitimate uses of a connection remain.

-Alex


static const struct ceph_connection_operations mon_con_ops = {
.get = ceph_con_get,
.put = ceph_con_put,

in mon_client.c. Hopefully the mon_client itself is refcounted, *or* we
can ensure that it won't go away before the msgr workqueue is drained and
the get/put ops can turn to no-ops.



Earlier I looked at the ref counting stuff a bit and stopped myself
from going off on that tangent. But it didn't look like it was used
consistently, and I made a note to myself to revisit it.


Also: when poking around, I noticed that ceph_con_get() and put() are
called directly from osd_client.c... that's a bug! Those connections have
a get and put op defined that twiddles the containing ceph_osd struct's
ref count.

I pushed several patches to your latest (wip-messenger-2) branch that fix
these issues. Compile tested only! The first should probably be folded
into this one, the others follow.


I'll look at your patches and incorporate them as appropriate. But at
the moment I don't see them; whenever you are back online again perhaps
you'll send me a link.

-Alex




out_monmap:
kfree(monc->monmap);
out:
@@ -841,9 +836,7 @@ void ceph_monc_stop(struct ceph_mon_client *monc)
mutex_lock(&monc->mutex);
__close_session(monc);

- monc->con->private = NULL;
- monc->con->ops->put(monc->con);
- monc->con = NULL;
+ monc->con.private = NULL;

mutex_unlock(&monc->mutex);

@@ -1021,7 +1014,7 @@ static void mon_fault(struct ceph_connection *con)
if (!monc->hunting)
pr_info("mon%d %s session lost, "
"hunting for new mon\n", monc->cur_mon,
- ceph_pr_addr(&monc->con->peer_addr.in_addr));
+ ceph_pr_addr(&monc->con.peer_addr.in_addr));

__close_session(monc);
if (!monc->hunting) {
--
1.7.5.4



Re: rbd command : is it possible to pass authkey in argument ?

2012-06-01 Thread Alexandre DERUMIER
Hi Tommi,
I was looking for a way to pass a keyfile, just found it

--keyfile  (both rbd and qemu drive option)


Better indeed ;)



----- Original Message ----- 

From: "Tommi Virtanen"  
To: "Alexandre DERUMIER"  
Cc: ceph-devel@vger.kernel.org 
Sent: Wednesday, 30 May 2012 19:40:46 
Subject: Re: rbd command : is it possible to pass authkey in argument ? 

On Wed, May 30, 2012 at 2:15 AM, Alexandre DERUMIER  
wrote: 
> Is it possible to pass authkey as argument in rbd command line ? (I can do in 
> with qemu-rbd drive option) 

I see you got an answer for your actual question. I wanted to take a 
different angle. 

You should really avoid putting secrets on command lines, or in 
process environment. Those are readable to all local users. This is 
why I advocate keyring files. 

Alternatively, with qemu, the monitor command mechanism they have 
would let you add the drives, before starting up the vm, without the 
secrets being visible to others. 



-- 

Alexandre Derumier 
Systems Engineer 
Phone: 03 20 68 88 90 
Fax: 03 20 68 90 81 
45 Bvd du Général Leclerc 59100 Roubaix - France 
12 rue Marivaux 75002 Paris - France 



Re: "rbd rm image" slow with big images ?

2012-06-01 Thread Guido Winkelmann
On Thursday, 31 May 2012, 11:19:44, you wrote:
> On Thu, 31 May 2012, Wido den Hollander wrote:
> > Hi,
> > 
> > > Is it the normal behaviour ? Maybe some xfs tuning could help ?
> > 
> > It's in the nature of RBD.
> 
> Yes.
> 
> That said, the current implementation is also stupid: it's doing a single
> io at a time.  #2256 (next sprint) will parallelize this to make it go
> much faster (probably an order of magnitude?).

Will it speed up copy operations as well? Those are a lot more important in 
practice... A delete operation I can usually just fire off and leave running 
in the background, but if I'm running a copy operation, there's usually 
something else waiting (like starting a virtual server that's waiting for its 
disk) that cannot proceed until the copy is actually finished.

On another note, it looks to me (correct me if I'm wrong) like rbd copy 
operations always involve copying all the data objects from the source volume 
to the machine on which the rbd command is running, and then back to the 
cluster, even if that machine isn't even part of the cluster. Are there any 
plans to streamline this?

Regards,
Guido


Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client

2012-06-01 Thread Sage Weil
On Fri, 1 Jun 2012, Alex Elder wrote:
> On 06/01/2012 07:12 AM, Alex Elder wrote:
> > On 05/31/2012 11:24 PM, Sage Weil wrote:
> > > On Wed, 30 May 2012, Alex Elder wrote:
> > > > A monitor client has a pointer to a ceph connection structure in it.
> > > > This is the only one of the three ceph client types that do it this
> > > > way; the OSD and MDS clients embed the connection into their main
> > > > structures. There is always exactly one ceph connection for a
> > > > monitor client, so there is no need to allocate it separate from the
> > > > monitor client structure.
> > > > 
> > > > So switch the ceph_mon_client structure to embed its
> > > > ceph_connection structure.
> > > > 
> > > > Signed-off-by: Alex Elder
> 
> . . .
> 
> > > > /* authentication */
> > > > monc->auth = ceph_auth_init(cl->options->name,
> > > > cl->options->key);
> > > > if (IS_ERR(monc->auth)) {
> > > > err = PTR_ERR(monc->auth);
> > > > - goto out_con;
> > > > + goto out_monmap;
> > > > }
> > > > monc->auth->want_keys =
> > > > CEPH_ENTITY_TYPE_AUTH | CEPH_ENTITY_TYPE_MON |
> > > > @@ -824,8 +821,6 @@ out_subscribe_ack:
> > > > ceph_msg_put(monc->m_subscribe_ack);
> > > > out_auth:
> > > > ceph_auth_destroy(monc->auth);
> > > > -out_con:
> > > > - monc->con->ops->put(monc->con);
> > > 
> > > AH!
> > > 
> > > This reminds me, these connections need to be refcounted. There's a
> > > ->get() and ->put() op defined so that you can refcount the containing
> > > structure. That means that this patch needs to also change
> 
> Looking at this again.  Why do they need to be refcounted?  If
> this patch is correct in embedding the connection into the
> containing structure, then the last reference to the containing
> structure is coincident with the last reference to the
> connection.  And the other connections are already embedded into
> other containing structures.
> 
> So--again assuming it's OK to embed the connection--I would rather
> see the ->get and ->put methods for the connection go away entirely
> and have the containing structure take care of its own damned ref
> counting...

The problem is that socket events queue work, which can take a while, and 
race with, say, osd_client getting an osdmap and dropping its 
struct ceph_osd.  The ->get and ->put ops just twiddle the containing 
struct's refcount, in that case, so the con_work will find the (now 
closed) ceph_connection and do nothing...
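
(Roughly, the pattern being described looks like the simplified sketch below;
this is not the exact messenger.c code:)

/* Queued work pins the structure containing the connection. */
static void queue_con(struct ceph_connection *con)
{
	if (!con->ops->get(con)) {		/* e.g. get_osd() on the ceph_osd */
		dout("queue_con %p ref count 0\n", con);
		return;				/* container already going away */
	}
	if (!queue_delayed_work(ceph_msgr_wq, &con->work, 0)) {
		dout("queue_con %p - already queued\n", con);
		con->ops->put(con);
	}
}

static void con_work(struct work_struct *work)
{
	struct ceph_connection *con =
		container_of(work, struct ceph_connection, work.work);

	/* ... if the connection was closed meanwhile, do nothing ... */

	con->ops->put(con);			/* drop the ref taken in queue_con */
}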

> This actually gets into another thing I wanted to do anyway (while
> digging through raw memory trying to figure out what's going on).
> I want every ceph_message to point back to the connection it is
> associated with.  That way there's no need for the OSD (for example)
> to keep track of the connection--a revoke is simply an operation
> on the message, which would already know the connection from which
> it is being revoked.
> 
> If you think the above approach is good I'll gladly do it now
> rather than later.  I think it might eliminate any need to
> reference count the connections.

That sounds reasonable.. but i'm pretty sure the con refcounts can't go 
away :)

sage


> 
> Anyway, when I'm done (if I ever finish!) the state of the connection
> will tell you whether any legitimate uses of a connection remain.
> 
>   -Alex
> 
> > > static const struct ceph_connection_operations mon_con_ops = {
> > > .get = ceph_con_get,
> > > .put = ceph_con_put,
> > > 
> > > in mon_client.c. Hopefully the mon_client itself is refcounted, *or* we
> > > can ensure that it won't go away before the msgr workqueue is drained and
> > > the get/put ops can turn to no-ops.
> > 
> > 
> > Earlier I looked at the ref counting stuff a bit and stopped myself
> > from going off on that tangent. But it didn't look like it was used
> > consistently and made a note to myself to revisit it.
> > 
> > > Also: when poking around, I noticed that ceph_con_get() and put() are
> > > called directly from osd_client.c... that's a bug! Those connections have
> > > a get and put op defined that twiddles the containing ceph_osd struct's
> > > ref count.
> > > 
> > > I pushed several patches to your latest (wip-messenger-2) branch that fix
> > > these issues. Compile tested only! The first should probably be folded
> > > into this one, the others follow.
> > 
> > I'll look at your patches and incorporate them as appropriate. But at
> > the moment I don't see them; whenever you are back online again perhaps
> > you'll send me a link.
> > 
> > -Alex
> > 
> > > 
> > > > out_monmap:
> > > > kfree(monc->monmap);
> > > > out:
> > > > @@ -841,9 +836,7 @@ void ceph_monc_stop(struct ceph_mon_client *monc)
> > > > mutex_lock(&monc->mutex);
> > > > __close_session(monc);
> > > > 
> > > > - monc->con->private = NULL;
> > > > - monc->con->ops->put(monc->con);
> > > > - monc->con = NULL;
> > > > + monc->con.private = NULL;
> > > > 
> > > > mutex_unlock(&monc->mutex);
> > > > 
> > > > @@ -1021,7 +1014,7 @@ static void mon_fault(struct ceph

Re: iozone test crashed on ceph

2012-06-01 Thread Greg Farnum
On Thursday, May 31, 2012 at 5:58 PM, udit agarwal wrote:
> Hi,
> I have set up a ceph system with a client, mon and mds on one system, which is
> connected to 2 osds. I ran an iozone test with a 10G file and it ran fine. But
> when I ran an iozone test with a 5G file, the process got killed and our ceph
> system hung. Can anyone please help me with this.

What do you mean, "the process got killed"? It hung and some task watcher 
killed it? Or it got OOMed?
How did you determine that the "ceph system" hung? The cluster stopped 
responding to requests, or just the local mount point?
-Greg



Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client

2012-06-01 Thread Alex Elder

On 06/01/2012 11:20 AM, Sage Weil wrote:

The problem is that socket events queue work, which can take a while, and
race with, say, osd_client getting an osdmap and dropping its
struct ceph_osd.  The ->get and ->put ops just twiddle the containing
struct's refcount, in that case, so the con_work will find the (now
closed) ceph_connection and do nothing...


I think you're saying that the connection (or its socket) needs to
be protected from its containing structure going away.  So the
connection needs to hold a reference to its container.  If that's
the case then the disposal of the ceph_osd needs to clean up
the connection fully before it goes away.

Anyway, I think I see why there might be a need for the ref counts
and they obviously won't go away if they're needed...

-Alex


Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client

2012-06-01 Thread Sage Weil
On Fri, 1 Jun 2012, Alex Elder wrote:
> On 06/01/2012 11:20 AM, Sage Weil wrote:
> > The problem is that socket events queue work, which can take a while, and
> > race with, say, osd_client getting an osdmap and dropping its
> > struct ceph_osd.  The ->get and ->put ops just twiddle the containing
> > struct's refcount, in that case, so the con_work will find the (now
> > closed) ceph_connection and do nothing...
> 
> I think you're saying that the connection (or its socket) needs to
> be protected from its containing structure going away.  So the
> connection needs to hold a reference to its container.  If that's
> the case then the disposal of the ceph_osd needs to clean up
> the connection fully before it goes away.

Yeah.  I think it happens already before we drop the ref:

static void __remove_osd(struct ceph_osd_client *osdc, struct ceph_osd *osd)
{
dout("__remove_osd %p\n", osd);
BUG_ON(!list_empty(&osd->o_requests));
rb_erase(&osd->o_node, &osdc->osds);
list_del_init(&osd->o_osd_lru);
ceph_con_close(&osd->o_con);
put_osd(osd);
}

So it's just the con reference in the workqueue that matters.

sage



> 
> Anyway, I think I see why there might be a need for the ref counts
> and they obviously won't go away if they're needed...
> 
>   -Alex


Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client

2012-06-01 Thread Alex Elder

On 05/31/2012 11:24 PM, Sage Weil wrote:

Also: when poking around, I noticed that ceph_con_get() and put() are
called directly from osd_client.c... that's a bug!  Those connections have
a get and put op defined that twiddles the containing ceph_osd struct's
ref count.


So are you saying that the calls in "osd_client.c" to ceph_con_get()
and ceph_con_put() should instead be calls to get_osd_con() and
put_osd_con(), respectively?  (Or more generally con->ops->get()
and con->ops->put()?)

-Alex
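
(For reference, the per-client ops being discussed look roughly like this
simplified sketch of osd_client.c:)

static struct ceph_connection *get_osd_con(struct ceph_connection *con)
{
	struct ceph_osd *osd = con->private;

	if (get_osd(osd))		/* take a ref on the containing ceph_osd */
		return con;
	return NULL;
}

static void put_osd_con(struct ceph_connection *con)
{
	struct ceph_osd *osd = con->private;

	put_osd(osd);			/* may free the ceph_osd (and this con) */
}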


Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client

2012-06-01 Thread Sage Weil
On Fri, 1 Jun 2012, Alex Elder wrote:
> On 05/31/2012 11:24 PM, Sage Weil wrote:
> > Also: when poking around, I noticed that ceph_con_get() and put() are
> > called directly from osd_client.c... that's a bug!  Those connections have
> > a get and put op defined that twiddles the containing ceph_osd struct's
> > ref count.
> 
> So are you saying that the calls in "osd_client.c" to ceph_con_get()
> and ceph_con_put() should instead be calls to get_osd_con() and
> put_osd_con(), respectively?  (Or more generally con->ops->get()
> and con->ops->put()?)

Yeah.. one of the patches I pushed [er, and pushing now] fixes that.  

sage


Re: [PATCH 03/13] libceph: delete useless SOCK_CLOSED manipulations

2012-06-01 Thread Alex Elder

On 05/30/2012 02:34 PM, Alex Elder wrote:

In con_close_socket(), SOCK_CLOSED is set in the connection state,
then cleared again after shutting down the socket. Nothing between
the setting and clearing of that bit will ever be affected by it,
so there's no point in setting/clearing it at all. So don't.

Signed-off-by: Alex Elder 


I am retracting this proposed change.

I believe it's possible for the con->sock->ops->shutdown()
call to trigger a TCP_CLOSE socket state change event,
which means that there *is* something that can be affected
by that state bit being set.

-Alex


---
net/ceph/messenger.c | 2 --
1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 07af994..fe3c2a1 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -338,11 +338,9 @@ static int con_close_socket(struct ceph_connection
*con)
dout("con_close_socket on %p sock %p\n", con, con->sock);
if (!con->sock)
return 0;
- set_bit(SOCK_CLOSED, &con->state);
rc = con->sock->ops->shutdown(con->sock, SHUT_RDWR);
sock_release(con->sock);
con->sock = NULL;
- clear_bit(SOCK_CLOSED, &con->state);
return rc;
}





Re: "rbd rm image" slow with big images ?

2012-06-01 Thread Wido den Hollander

Hi,

On 06/01/2012 03:51 PM, Guido Winkelmann wrote:

On Thursday, 31 May 2012, 11:19:44, you wrote:

On Thu, 31 May 2012, Wido den Hollander wrote:

Hi,


Is it the normal behaviour ? Maybe some xfs tuning could help ?


It's in the nature of RBD.


Yes.

That said, the current implementation is also stupid: it's doing a single
io at a time.  #2256 (next sprint) will parallelize this to make it go
much faster (probably an order of magnitude?).


Will it speed up copy operations as well? Those are a lot more important in
practice... A delete operation I can usually just fire off and leave running
in the background, but if I'm running a copy operation, there's usually
something else waiting (like starting a virtual server that's waiting for its
disk) that cannot proceed until the copy is actually finished.



#2256 is only about parallelizing deletions: 
http://tracker.newdream.net/issues/2256
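
(For illustration, parallelizing the per-object deletes would look roughly
like the librados AIO sketch below - hypothetical, not the actual #2256
implementation; the AIO calls are assumed from the librados C API:)

#include <stdlib.h>
#include <rados/librados.h>

/* Queue all deletions, then wait for them, instead of one remove at a time. */
static int delete_objects(rados_ioctx_t io, const char **oids, int count)
{
	rados_completion_t *c = calloc(count, sizeof(*c));
	int i, ret = 0;

	for (i = 0; i < count; i++) {
		rados_aio_create_completion(NULL, NULL, NULL, &c[i]);
		rados_aio_remove(io, oids[i], c[i]);	/* queue the delete */
	}
	for (i = 0; i < count; i++) {			/* wait for all of them */
		rados_aio_wait_for_complete(c[i]);
		if (rados_aio_get_return_value(c[i]) < 0)
			ret = -1;
		rados_aio_release(c[i]);
	}
	free(c);
	return ret;
}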


I don't see a feature request in the tracker for parallelizing a copy, 
but we can always create that one :)



On another note, it looks to me (correct me if I'm wrong) like rbd copy
operations always involve copying all the data objects from the source volume
to the machine on which the rbd command is running, and then back to the
cluster, even if that machine isn't even part of the cluster. Are there any
plans to streamline this?



You are running the rbd command on that client, so that client will read 
the objects and write them again as new RADOS objects.


What you are asking is a "cluster-side" clone of a volume, correct?

There is work ongoing for layering, where you have one "golden 
image" with multiple children. With that you can achieve what you want, 
but it's not always desired in every situation.


There has been talk about promoting a child to a fresh volume; that 
would be the same as the cloning you are talking about. I don't know the 
status of that.


Wido


Regards,
Guido


Re: How will Ceph cope with a failed Journal device?

2012-06-01 Thread Tommi Virtanen
[Whoops, resending as plain text to make vger happy.]

On Fri, Jun 1, 2012 at 4:35 AM, Jerker Nyberg  wrote:
> Cool! No more SSDs (which might fail from being written to continuously
> after a couple of months, depending on size, price, write cycles etc.) - just
> add a lot of RAM, keep the journals on tmpfs and make sure to run Ceph on
> Btrfs? While keeping the replicas separated so they don't all fail at once.
>
> The contents of the storage node will not be corrupted or anything (just a
> bit old) when losing the journal?

The problem with storing things in RAM is, what if your rack/row/data
center loses power, all at once. It's really hard to guard against
those kinds of massive failures. If you don't have a persistent
journal, you might as well not have a journal at all.

The memory-only systems you see out there are typically only used to
the kinds of applications where rolling back to last (on-disk)
snapshot is acceptable -- stuff like shopping carts. Ceph is built on
significantly stronger promises, so it's not the ideal match for an
architecture like that.

It's also unclear to me what would happen to Ceph if all the
replicas lost their journals at the same time. That might cause bigger
problems, since there's no up-to-date replica to pull the lost data
from.


Re: OSD deadlock with cephfs client and OSD on same machine

2012-06-01 Thread Tommi Virtanen
[Whoops, resending as plain text to make vger happy.]

On Fri, Jun 1, 2012 at 2:35 AM, Amon Ott  wrote:
> Thanks for the new log lines in master git. The warning without syncfs()
> support could be a bit more clear though - the system is not only slower, it
> hangs needing a reset and reboot. This is much worse, specially if cephfs is

That warning, introduced in
https://github.com/ceph/ceph/commit/07498d66233f388807a458554640cb77424114c0
, is more about running multiple OSDs on a single server, where without
syncfs(2) one OSD syncing causes all of them to sync. It's not related to your
case of loopback mounting, which has *never* worked well, with the
apparent exception of ceph-fuse.

> say that syncfs() makes it significantly more stable than sync(). We will
> make a several day load test soon.

That still won't make it reliable, just less likely to trigger. Good
luck, you'll need it with loopback mounts.


[PATCH 03/27] ceph: Push file_update_time() into ceph_page_mkwrite()

2012-06-01 Thread Jan Kara
CC: Sage Weil 
CC: ceph-devel@vger.kernel.org
Acked-by: Sage Weil 
Signed-off-by: Jan Kara 
---
 fs/ceph/addr.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 173b1d2..12b139f 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1181,6 +1181,9 @@ static int ceph_page_mkwrite(struct vm_area_struct *vma, 
struct vm_fault *vmf)
loff_t size, len;
int ret;
 
+   /* Update time before taking page lock */
+   file_update_time(vma->vm_file);
+
size = i_size_read(inode);
if (off + PAGE_CACHE_SIZE <= size)
len = PAGE_CACHE_SIZE;
-- 
1.7.1



ceph fs

2012-06-01 Thread Martin Wilderoth
I have some problems with my ceph filesystem. I have a folder that I can't 
remove.

I.E.
root@lintx2:/mnt/backuppc/pc# ls -la toberemoved/ 
total 0
drwxr-x--- 1 backuppc backuppc28804802 May 15 13:29 .
drwxr-x--- 1 backuppc backuppc 29421083732 Jun  1 15:16 ..

root@lintx2:/mnt/backuppc/pc# rm -rf toberemoved 
rm: cannot remove `toberemoved': Directory not empty

I also have a folder where, when I do ls, I get the following message:

[ 1828.569091] ceph: ceph_add_cap: couldn't find snap realm 100
[ 1828.569105] [ cut here ]
[ 1828.569121] WARNING: at 
/build/buildd-linux-2.6_3.2.17-1~bpo60+1-amd64-CJo7Ex/linux-2.6-3.2.17/debian/build/source_amd64_none/fs/ceph/caps.c:590
 ceph_add_cap+0x38e/0x49e [ceph]()
[ 1828.569139] Modules linked in: cryptd aes_x86_64 aes_generic cbc ceph 
libceph crc32c libcrc32c evdev snd_pcm snd_timer snd soundcore snd_page_alloc 
pcspkr ext3 jbd mbcache xen_netfront xen_blkfront
[ 1828.569182] Pid: 18, comm: kworker/0:1 Tainted: GW
3.2.0-0.bpo.2-amd64 #1
[ 1828.569193] Call Trace:
[ 1828.569207]  [] ? warn_slowpath_common+0x78/0x8c
[ 1828.569221]  [] ? ceph_add_cap+0x38e/0x49e [ceph]
[ 1828.569233]  [] ? fill_inode+0x4eb/0x602 [ceph]
[ 1828.569244]  [] ? ceph_dentry_lru_touch+0x2a/0x68 [ceph]
[ 1828.569258]  [] ? ceph_readdir_prepopulate+0x2de/0x375 
[ceph]
[ 1828.569271]  [] ? dispatch+0xa35/0xef2 [ceph]
[ 1828.569286]  [] ? ceph_tcp_recvmsg+0x43/0x4f [libceph]
[ 1828.569297]  [] ? con_work+0x1070/0x13b8 [libceph]
[ 1828.569308]  [] ? update_curr+0xbc/0x160
[ 1828.569319]  [] ? try_write+0xbe1/0xbe1 [libceph]
[ 1828.569332]  [] ? process_one_work+0x1cc/0x2ea
[ 1828.569342]  [] ? worker_thread+0x12d/0x247
[ 1828.569353]  [] ? process_one_work+0x2ea/0x2ea
[ 1828.569361]  [] ? process_one_work+0x2ea/0x2ea
[ 1828.569372]  [] ? kthread+0x7a/0x82
[ 1828.569384]  [] ? kernel_thread_helper+0x4/0x10
[ 1828.569395]  [] ? int_ret_from_sys_call+0x7/0x1b
[ 1828.569406]  [] ? retint_restore_args+0x5/0x6
[ 1828.569417]  [] ? gs_change+0x13/0x13
[ 1828.569423] ---[ end trace 98770cddb79a6a55 ]---
[ 1828.569433] ceph: ceph_add_cap: couldn't find snap realm 100
[ 1828.569442] [ cut here ]
[ 1828.569452] WARNING: at 
/build/buildd-linux-2.6_3.2.17-1~bpo60+1-amd64-CJo7Ex/linux-2.6-3.2.17/debian/build/source_amd64_none/fs/ceph/caps.c:590
 ceph_add_cap+0x38e/0x49e [ceph]()
[ 1828.569467] Modules linked in: cryptd aes_x86_64 aes_generic cbc ceph 
libceph crc32c libcrc32c evdev snd_pcm snd_timer snd soundcore snd_page_alloc 
pcspkr ext3 jbd mbcache xen_netfront xen_blkfront
[ 1828.569500] Pid: 18, comm: kworker/0:1 Tainted: GW
3.2.0-0.bpo.2-amd64 #1
[ 1828.569508] Call Trace:
[ 1828.569513]  [] ? warn_slowpath_common+0x78/0x8c
[ 1828.569523]  [] ? ceph_add_cap+0x38e/0x49e [ceph]
[ 1828.569533]  [] ? fill_inode+0x4eb/0x602 [ceph]
[ 1828.569543]  [] ? ceph_dentry_lru_touch+0x2a/0x68 [ceph]
[ 1828.569552]  [] ? ceph_readdir_prepopulate+0x2de/0x375 
[ceph]
[ 1828.569563]  [] ? dispatch+0xa35/0xef2 [ceph]
[ 1828.569573]  [] ? ceph_tcp_recvmsg+0x43/0x4f [libceph]
[ 1828.569583]  [] ? con_work+0x1070/0x13b8 [libceph]
[ 1828.569590]  [] ? update_curr+0xbc/0x160
[ 1828.569599]  [] ? try_write+0xbe1/0xbe1 [libceph]
[ 1828.569607]  [] ? process_one_work+0x1cc/0x2ea
[ 1828.569615]  [] ? worker_thread+0x12d/0x247
[ 1828.569622]  [] ? process_one_work+0x2ea/0x2ea
[ 1828.569630]  [] ? process_one_work+0x2ea/0x2ea
[ 1828.569637]  [] ? kthread+0x7a/0x82
[ 1828.569644]  [] ? kernel_thread_helper+0x4/0x10
[ 1828.569652]  [] ? int_ret_from_sys_call+0x7/0x1b
[ 1828.569660]  [] ? retint_restore_args+0x5/0x6
[ 1828.569667]  [] ? gs_change+0x13/0x13
[ 1828.569673] ---[ end trace 98770cddb79a6a56 ]---
Then I see some folders.

Is there a way to remove these broken directories, or is there a reason / bug
why I get these messages?

The folder that I am trying to remove had a similar problem to the one above;
I managed to remove all visible files.

 /Regards Martin