Files disappearing

2012-09-17 Thread Székelyi Szabolcs
Hi, I have a problem where newly copied files disappear from the FS after a few minutes. The only suspicious log entries look like this: 2012-09-17 15:45:40.251610 7f7024f25700 0 mds.0.server missing 1000818 #1/vmdir/2414/images/deployment.0 (mine), will load later I see messages like
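
A way to get more detail on such entries is to raise the MDS debug level; the levels below are the ones suggested later in this archive (in the Stale NFS file handle thread), and the section placement is only a sketch:

    # ceph.conf fragment on the MDS host -- verbose MDS and messenger logging
    [mds]
        debug mds = 20
        debug ms = 1

After restarting the MDS, its log (by default somewhere under /var/log/ceph/) can be searched for the "missing ... will load later" entries around the time a file vanishes.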

Re: ceph init script does not start

2012-07-15 Thread Székelyi Szabolcs
On 2012. July 14. 07:42:34 Sage Weil wrote: On Sat, 14 Jul 2012, Xiaopong Tran wrote: I'm getting this funny issue. I had set up two test clusters, and mkcephfs and the ceph startup script worked just fine. We are now ready to go to production; we have 6 nodes, with 10 disks each, and one
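
For a mkcephfs-based deployment of that vintage, each disk usually gets its own [osd.N] section in ceph.conf; a rough sketch with made-up hostnames, paths and numbering:

    ; ceph.conf fragment -- one OSD per disk, 10 per host (illustrative only)
    [osd]
        osd data = /srv/osd.$id
        osd journal = /srv/osd.$id/journal
    [osd.0]
        host = node1
        devs = /dev/sdb
    [osd.1]
        host = node1
        devs = /dev/sdc
    ; ... continuing through osd.59 across the six hosts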

core dumps

2012-07-10 Thread Székelyi Szabolcs
Hi, I usually find core dumps belonging to Ceph daemons in my root folders. Last night two of my three monitors dumped core at the exact same moment. Are you interested in them? And in general, if I find such core files, should I send them to you? Thanks, -- cc
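
If the developers do want to look at such crashes, a backtrace is often easier to post than the core file itself; a sketch of pulling one with gdb (binary and core paths are just examples, and the matching debug symbol packages should be installed first):

    gdb /usr/bin/ceph-mon /core.12345
    (gdb) thread apply all bt full
    (gdb) quit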

Re: core dumps

2012-07-10 Thread Székelyi Szabolcs
On 2012. July 10. 12:28:24 Székelyi Szabolcs wrote: I usually find core dumps belonging to Ceph daemons in my root folders. Last night two of my three monitors dumped core at the exact same moment. Are you interested in them? And in general, if I find such core files, should I send them

Re: Keys caps

2012-07-10 Thread Székelyi Szabolcs
On 2012. July 10. 13:09:10 Gregory Farnum wrote: On Mon, Jul 9, 2012 at 10:27 AM, Székelyi Szabolcs szeke...@niif.hu wrote: On 2012. July 9. 09:33:22 Sage Weil wrote: On Mon, 9 Jul 2012, Székelyi Szabolcs wrote: so far I've accessed my Ceph (0.48) FS with the client.admin key, but I'd

Re: Keys caps

2012-07-10 Thread Székelyi Szabolcs
On 2012. July 10. 16:25:47 Sage Weil wrote: On Wed, 11 Jul 2012, Székelyi Szabolcs wrote: The problem is that the mount.ceph command doesn't understand keyrings; it only understands secret= and secretfile=. There is a longstanding feature bug open for this
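
Given that limitation, the usual workaround is to extract the bare key from the keyring into a secret file and point the kernel mount at it; a sketch with illustrative names and paths:

    # print just the base64 key for the client and store it in a root-only file
    ceph-authtool -p -n client.foo /etc/ceph/keyring.client.foo > /etc/ceph/client.foo.secret
    chmod 600 /etc/ceph/client.foo.secret

    # kernel mount using secretfile= instead of a keyring
    mount -t ceph mon1:6789:/ /mnt/ceph -o name=foo,secretfile=/etc/ceph/client.foo.secret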

Keys caps

2012-07-09 Thread Székelyi Szabolcs
Hello, so far I've accessed my Ceph (0.48) FS with the client.admin key, but I'd like to change that since I don't want to allow clients to control the cluster. I thought I should create a new key, give it some caps (I don't know exactly which ones), and distribute it to clients. Here are some
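
For reference, the era-appropriate way to mint such a key is to generate a keyring with limited caps and register it with the monitors; the client name and the exact caps below are only an illustration, not a recommendation:

    # generate a new key with restricted capabilities
    ceph-authtool --create-keyring /etc/ceph/keyring.client.foo --gen-key -n client.foo \
        --cap mon 'allow r' --cap osd 'allow rw' --cap mds 'allow'

    # register it with the cluster, then copy the keyring to the clients
    ceph auth add client.foo -i /etc/ceph/keyring.client.foo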

Re: Keys caps

2012-07-09 Thread Székelyi Szabolcs
On 2012. July 9. 09:33:22 Sage Weil wrote: On Mon, 9 Jul 2012, Székelyi Szabolcs wrote: so far I've accessed my Ceph (0.48) FS with the client.admin key, but I'd like to change that since I don't want to allow clients to control the cluster. I thought I should create a new key, give

Re: OSD doesn't start

2012-07-08 Thread Székelyi Szabolcs
On 2012. July 6. 01:33:13 Székelyi Szabolcs wrote: On 2012. July 5. 16:12:42 Székelyi Szabolcs wrote: On 2012. July 4. 09:34:04 Gregory Farnum wrote: Hrm, it looks like the OSD data directory got a little busted somehow. How did you perform your upgrade? (That is, how did you kill your

Re: OSD doesn't start

2012-07-08 Thread Székelyi Szabolcs
On 2012. July 4. 09:34:04 Gregory Farnum wrote: Hrm, it looks like the OSD data directory got a little busted somehow. How did you perform your upgrade? (That is, how did you kill your daemons, in what order, and when did you bring them back up.) Just to make sure: what's the recommended
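
For what it's worth, the approach usually described is a rolling restart with the stock init script, one daemon type at a time, after upgrading the packages on each node; daemon names and order below are only a sketch, not an official procedure:

    service ceph restart mon.a      # monitors first, one at a time, waiting for quorum
    service ceph restart osd.0      # then each OSD, letting the cluster settle in between
    ceph -s                         # check health between steps
    service ceph restart mds.a      # MDSes last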

Re: OSD doesn't start

2012-07-06 Thread Székelyi Szabolcs
On 2012. July 5. 16:12:42 Székelyi Szabolcs wrote: On 2012. July 4. 09:34:04 Gregory Farnum wrote: Hrm, it looks like the OSD data directory got a little busted somehow. How did you perform your upgrade? (That is, how did you kill your daemons, in what order, and when did you bring them

Re: OSD doesn't start

2012-07-05 Thread Székelyi Szabolcs
The data seems to be accessible at the moment, but I'm afraid that my production cluster will end up in a similar situation after the upgrade, so I don't dare to touch it. Do you have any suggestions about what I should check? Thanks, -- cc On Wednesday, July 4, 2012 at 8:31 AM, Székelyi Szabolcs wrote

Re: global_init_daemonize: BUG: there are 1 child threads already started that will now die!

2012-05-17 Thread Székelyi Szabolcs
On 2012. May 17. 09:02:45 Sage Weil wrote: On Thu, 17 May 2012, Székelyi Szabolcs wrote: I get the $subject message when starting Ceph with the init script. I have to try it 15-20 times until the start succeeds. I've seen this message emitted by the monitor and MDS daemons, but never by OSDs

Re: [WRN] map e### wrongly marked me down or wrong addr

2012-02-28 Thread Székelyi Szabolcs
On 2012. February 27. 09:03:11 Sage Weil wrote: On Mon, 27 Feb 2012, Székelyi Szabolcs wrote: whenever I restart osd.0 I see a pair of messages like 2012-02-27 17:26:00.132666 mon.0 osd_1_ip:6789/0 106 : [INF] osd.0 osd_0_ip:6801/29931 failed (by osd.1 osd_1_ip:6806/20125) 2012-02-27

Re: [WRN] map e### wrongly marked me down or wrong addr

2012-02-28 Thread Székelyi Szabolcs
On 2012. February 28. 08:16:34 Gregory Farnum wrote: 2012/2/28 Székelyi Szabolcs szeke...@niif.hu: On 2012. February 27. 09:03:11 Sage Weil wrote: On Mon, 27 Feb 2012, Székelyi Szabolcs wrote: whenever I restart osd.0 I see a pair of messages like 2012-02-27 17:26:00.132666 mon.0

Re: Stale NFS file handle

2012-02-24 Thread Székelyi Szabolcs
On 2012. February 23. 10:43:02 Tommi Virtanen wrote: 2012/2/13 Székelyi Szabolcs szeke...@niif.hu: Okay, that sounds like a bug then. The two interesting things would be a ceph-fuse log (--debug-client 10 --debug-ms 1 --log-file /path/to/log) and an mds log (debug mds = 20, debug ms = 1
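
Spelled out as a command line, the client side of that suggestion would look roughly like this (monitor address, mount point and log path are placeholders):

    ceph-fuse -m mon1:6789 /mnt/ceph --debug-client 10 --debug-ms 1 --log-file /var/log/ceph/client.log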

Re: Stale NFS file handle

2012-02-14 Thread Székelyi Szabolcs
On 2012. February 13. 17:04:27 Tommi Virtanen wrote: 2012/2/13 Székelyi Szabolcs szeke...@niif.hu: I'm using Ceph 0.41 with the FUSE client. After a while I get stale NFS file handle errors when trying to read a file or list a directory. Logs and scrubbing don't show any errors or suspicious

Stale NFS file handle

2012-02-13 Thread Székelyi Szabolcs
Hi, I'm using Ceph 0.41 with the FUSE client. After a while I get stale NFS file handle errors when trying to read a file or list a directory. Logs and scrubbing don't show any errors or suspicious entries. After remounting the filesystem, either by restarting the cluster, thus forcing the clients to
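
As a point of reference, remounting only the FUSE client, rather than restarting the whole cluster, also forces it to re-establish its session; mount point and monitor address below are examples:

    fusermount -u /mnt/ceph
    ceph-fuse -m mon1:6789 /mnt/ceph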

Re: Stale NFS file handle

2012-02-13 Thread Székelyi Szabolcs
On 2012. February 13. 15:34:13 Sage Weil wrote: On Tue, 14 Feb 2012, Székelyi Szabolcs wrote: I'm using Ceph 0.41 with the FUSE client. After a while I get stale NFS file handle errors when trying to read a file or list a directory. Logs and scrubbing don't show any errors or suspicious entries

Re: Stale NFS file handle

2012-02-13 Thread Székelyi Szabolcs
On 2012. February 13. 15:54:39 Sage Weil wrote: On Tue, 14 Feb 2012, Székelyi Szabolcs wrote: No, there's no NFS in the picture. The OSDs' backend storage is on a local filesystem. I think it's the FUSE client telling me this. Okay, that sounds like a bug then. The two interesting things

Nagios plugin

2012-01-30 Thread Székelyi Szabolcs
Hi Dallas, I've seen in #709 that you've been working on the Nagios plugin for Ceph. I want to monitor Ceph from Nagios, too, and avoid duplicating your work, but I can't find the result anywhere. Can you point me to the right place, if it's available to the public? Thanks, -- cc
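
In the absence of that plugin, a very small Nagios-style check can be wrapped around ceph health; this is only a rough sketch, not Dallas's plugin, and the output strings it matches are assumptions about the ceph health format:

    #!/bin/sh
    # check_ceph_health -- map 'ceph health' output to Nagios exit codes
    STATUS=$(ceph health 2>/dev/null)
    case "$STATUS" in
        HEALTH_OK*)   echo "OK - $STATUS";       exit 0 ;;
        HEALTH_WARN*) echo "WARNING - $STATUS";  exit 1 ;;
        HEALTH_ERR*)  echo "CRITICAL - $STATUS"; exit 2 ;;
        *)            echo "UNKNOWN - $STATUS";  exit 3 ;;
    esac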

Monitors calling for elections all the time under load

2011-08-08 Thread Székelyi Szabolcs
Hello, when I put my cluster under a little stress (doing performance measurements with fio from one client), I see messages like this when watching the cluster with ceph -w: 2011-08-08 11:12:55.448460 log 2011-08-08 11:12:36.912896 mon1 193.225.36.16:6789/0 10 : [INF] mon.iscsigw1 calling

Re: Monitors fallen apart

2011-08-05 Thread Székelyi Szabolcs
On 2011. August 4. 20:14:54 Yehuda Sadeh Weinraub wrote: 2011/8/3 Székelyi Szabolcs szeke...@niif.hu: I'm running ceph 0.32, and for a while now it looks like if a monitor fails, the cluster doesn't find a new one. I have three nodes, two with cmds+cosd+cmon, and one with cmds+cmon

Monitors fallen apart

2011-08-03 Thread Székelyi Szabolcs
Hello, I'm running ceph 0.32, and for a while now it looks like if a monitor fails, the cluster doesn't find a new one. I have three nodes, two with cmds+cosd+cmon, and one with cmds+cmon, which is also running the client. If I stop one of the cmds+cosd+cmon nodes, ceph -w run on the
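
For context, with three monitors a majority (two of them) has to stay up for the cluster to keep a quorum; which monitors are currently in can be checked with something like the following (output format varies by version):

    ceph -s
    ceph mon stat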

Degraded PGs blocking open()?

2011-06-06 Thread Székelyi Szabolcs
Hi all, I have a three-node ceph setup, two nodes playing all three roles (OSD, MDS, MON), and one being just a monitor (which happens to be the client I'm using the filesystem from). I want to achieve high availability by mirroring all data between the OSDs and being able to still access
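
For reference, how many copies of the data exist across the two OSDs is a per-pool setting, and degraded placement groups can be listed while one OSD is down; pool names are the defaults of that era and the exact commands may differ by version:

    # keep two copies of everything, one per OSD
    ceph osd pool set data size 2
    ceph osd pool set metadata size 2

    # with one OSD down, see how many PGs are degraded
    ceph -s
    ceph pg dump | grep degraded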