Question about simulation with crushtool

2012-11-22 Thread Nam Dang
Dear all,

I am trying to run some small experiments with crushtool by simulating
different CRUSH map variants.
However, I am running into problems with crushtool because of its sparse
documentation.

What is the command to simulate placement in a system with 32 devices
in a single bucket (only 1 bucket)? And how can I display the resulting
data placement from the simulation?
What I've figured out so far is:

> crushtool -t --min_x 3 --build --num_osds 32 root uniform 0

The output that I get is:

crushtool successfully built or modified map.  Use '-o ' to write it out.
rule 0 (data), x = 0..1023, numrep = 2..2
2012-11-23 00:13:21.676296 7fa31f813780  0 layer 1  root bucket type uniform  0
2012-11-23 00:13:21.676313 7fa31f813780  0 lower_items
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31]
2012-11-23 00:13:21.676320 7fa31f813780  0 lower_weights
[65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536,65536]
2012-11-23 00:13:21.676374 7fa31f813780  0   item 0 weight 65536
2012-11-23 00:13:21.676375 7fa31f813780  0   item 1 weight 65536
2012-11-23 00:13:21.676376 7fa31f813780  0   item 2 weight 65536
2012-11-23 00:13:21.676376 7fa31f813780  0   item 3 weight 65536
2012-11-23 00:13:21.676377 7fa31f813780  0   item 4 weight 65536
2012-11-23 00:13:21.676378 7fa31f813780  0   item 5 weight 65536
2012-11-23 00:13:21.676378 7fa31f813780  0   item 6 weight 65536
2012-11-23 00:13:21.676379 7fa31f813780  0   item 7 weight 65536
2012-11-23 00:13:21.676380 7fa31f813780  0   item 8 weight 65536
2012-11-23 00:13:21.676380 7fa31f813780  0   item 9 weight 65536
2012-11-23 00:13:21.676381 7fa31f813780  0   item 10 weight 65536
2012-11-23 00:13:21.676382 7fa31f813780  0   item 11 weight 65536
rule 0 (data) num_rep 2 result size == 0:   1024/1024
2012-11-23 00:13:21.676382 7fa31f813780  0   item 12 weight 65536
2012-11-23 00:13:21.676404 7fa31f813780  0   item 13 weight 65536
2012-11-23 00:13:21.676407 7fa31f813780  0   item 14 weight 65536
2012-11-23 00:13:21.676407 7fa31f813780  0   item 15 weight 65536
2012-11-23 00:13:21.676408 7fa31f813780  0   item 16 weight 65536
2012-11-23 00:13:21.676409 7fa31f813780  0   item 17 weight 65536
2012-11-23 00:13:21.676412 7fa31f813780  0   item 18 weight 65536
2012-11-23 00:13:21.676413 7fa31f813780  0   item 19 weight 65536
2012-11-23 00:13:21.676414 7fa31f813780  0   item 20 weight 65536
2012-11-23 00:13:21.676414 7fa31f813780  0   item 21 weight 65536
2012-11-23 00:13:21.676415 7fa31f813780  0   item 22 weight 65536
2012-11-23 00:13:21.676416 7fa31f813780  0   item 23 weight 65536
2012-11-23 00:13:21.676417 7fa31f813780  0   item 24 weight 65536
2012-11-23 00:13:21.676417 7fa31f813780  0   item 25 weight 65536
2012-11-23 00:13:21.676671 7fa31f813780  0   item 26 weight 65536
2012-11-23 00:13:21.676673 7fa31f813780  0   item 27 weight 65536
2012-11-23 00:13:21.676674 7fa31f813780  0   item 28 weight 65536
2012-11-23 00:13:21.676675 7fa31f813780  0   item 29 weight 65536
2012-11-23 00:13:21.676676 7fa31f813780  0   item 30 weight 65536
2012-11-23 00:13:21.676678 7fa31f813780  0   item 31 weight 65536
2012-11-23 00:13:21.676731 7fa31f813780  0  in bucket -1 'root' size
32 weight 2097152


I can't seem to add parameters like
"--show_utilization_all"; crushtool keeps complaining that
"layers must be specified with 3-tuples of (name, buckettype, size)",
and I have no idea why.

When I modified the source code to force crushtool to print out the
placement of all the data items (for 32 devices crushtool seems to
allocate only 32 items), the list entries are just blank. Basically it
appears that CRUSH fails to find an appropriate location for ALL of the
data, even though the devices all have the same weight.

I hope someone here can tell me more about how to use crushtool for
simulations like this. Thank you.

Best regards,
Nam Dang

Email: n...@de.cs.titech.ac.jp
HP: (+81) 080-4465-1587
Yokota Lab, Dept. of Computer Science
Tokyo Institute of Technology
Tokyo, Japan


Re: SIGSEGV in cephfs-java, but probably in Ceph

2012-06-05 Thread Nam Dang
Hi Noah,

Thank you for the push. I no longer get the SIGSEGV. By the way,
is there any way to move the ceph_is_mounted() check into the Java
library's code instead of putting it in the main Ceph code?
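
Roughly, this is the kind of guard I have in mind on the Java side (just a
sketch; the "mounted" flag and ensureMounted() helper are names I made up,
not part of the actual wip-java-cephfs API):

// Sketch only: a wrapper-side guard so native calls fail in Java instead
// of segfaulting in JNI. The mounted flag and ensureMounted() are made up;
// native_ceph_mkdirs matches the signature seen in the crash log.
package com.ceph.fs;

import java.io.IOException;

public class CephMount {
    private long instancePtr;          // native handle
    private volatile boolean mounted;  // set true only after a successful mount

    private void ensureMounted() throws IOException {
        if (!mounted) {
            throw new IOException("client is not mounted");
        }
    }

    public void mkdirs(String path, int mode) throws IOException {
        ensureMounted();  // fail with a Java exception instead of a segfault
        native_ceph_mkdirs(instancePtr, path, mode);
    }

    private static native int native_ceph_mkdirs(long ptr, String path, int mode);
}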

> The basic idea was that threads in Java did not map 1:1 with kernel
> threads (think co-routines), which would break a lot of stuff,
> especially futex. Looking at some documentation, old JVMs had
> something called Green Threads, but have now been abandoned in favor
> of native threads. So maybe this theory is now irrelevant, and
> evidence seems to suggest you're right and Java is using native
> threads.

I checked the error without multithreading and it was the same. And I'm
using Java 6, so threads should be native threads.
My guess is that it's related to some internal locking mechanism.

Best regards,
Nam Dang


Re: SIGSEGV in cephfs-java, but probably in Ceph

2012-05-31 Thread Nam Dang
I made a mistake in the previous email. As Noah said, this problem is
due to the wrappers being used with a client that failed to mount.
However, I think that if the mount fails, the wrapper should throw an
exception instead of letting the client continue.
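
For example, something along these lines in the wrapper's mount() would be
enough (a sketch only; the native_ceph_mount name and its negative-errno
return convention are assumptions on my part):

// Sketch: surface a failed native mount as a Java exception instead of
// leaving the object in an unusable state. native_ceph_mount and its
// return convention are assumed here, not confirmed API.
import java.io.IOException;

class MountFailureSketch {
    private long instancePtr;          // native handle
    private volatile boolean mounted;

    void mount(String root) throws IOException {
        int ret = native_ceph_mount(instancePtr, root);
        if (ret < 0) {
            throw new IOException("ceph_mount(" + root + ") failed: error " + ret);
        }
        mounted = true;  // mark mounted only on success
    }

    private static native int native_ceph_mount(long ptr, String root);
}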

Best regards,
Nam Dang
Tokyo Institute of Technology
Tokyo, Japan


On Fri, Jun 1, 2012 at 1:44 PM, Nam Dang  wrote:
> I pulled the Java lib from 
> https://github.com/noahdesu/ceph/tree/wip-java-cephfs
> However, I use ceph 0.47.1 installed directly from Ubuntu's repository
> with apt-get, not the one that I built with the java library. I
> assumed that would be fine, since the java lib is just a wrapper.
>
>>>There are only two segfaults that I've ever encountered, one in which the C 
>>>wrappers are used with an unmounted client, and the error Nam is seeing 
>>>(although they
>>> could be related). I will re-submit an updated patch for the former, which 
>>> should rule that out as the culprit.
>
> No, this occurs when I call mount(null) with the monitor being taken
> down. The library should throw an Exception instead, but since the SIGSEGV
> originates from libcephfs.so, I guess it's more related to Ceph's
> internal code.
>
> Best regards,
>
> Nam Dang
> Tokyo Institute of Technology
> Tokyo, Japan
>
>
> On Fri, Jun 1, 2012 at 8:58 AM, Noah Watkins  wrote:
>>
>> On May 31, 2012, at 3:39 PM, Greg Farnum wrote:
>>>>
>>>> Nevermind to my last comment. Hmm, I've seen this, but very rarely.
>>> Noah, do you have any leads on this? Do you think it's a bug in your Java 
>>> code or in the C/++ libraries?
>>
>> I _think_ this is because the JVM uses its own threading library, and Ceph 
>> assumes pthreads and pthread compatible mutexes--is that assumption about 
>> Ceph correct? Hence the error that looks like Mutex::lock(bool) being 
>> referenced for context during the segfault. To verify this, all that is needed 
>> is some synchronization added to the Java code.
>>
>> There are only two segfaults that I've ever encountered, one in which the C 
>> wrappers are used with an unmounted client, and the error Nam is seeing 
>> (although they could be related). I will re-submit an updated patch for the 
>> former, which should rule that out as the culprit.
>>
>> Nam: where are you grabbing the Java patches from? I'll push some updates.
>>
>>
>> The only other scenario that comes to mind is related to signaling:
>>
>> The RADOS Java wrappers suffered from an interaction between the JVM and 
>> RADOS client signal handlers, in which either the JVM or RADOS would replace 
>> the handlers for the other (not sure which order). Anyway, the solution was 
>> to link in the JVM libjsig.so signal chaining library. This might be the 
>> same thing we are seeing here, but I'm betting it is the first theory I 
>> mentioned.
>>
>> - Noah


Re: SIGSEGV in cephfs-java, but probably in Ceph

2012-05-31 Thread Nam Dang
I pulled the Java lib from https://github.com/noahdesu/ceph/tree/wip-java-cephfs
However, I use ceph 0.47.1 installed directly from Ubuntu's repository
with apt-get, not the one that I built with the java library. I
assumed that would be fine, since the java lib is just a wrapper.

>>There are only two segfaults that I've ever encountered, one in which the C 
>>wrappers are used with an unmounted client, and the error Nam is seeing 
>>(although they
>> could be related). I will re-submit an updated patch for the former, which 
>> should rule that out as the culprit.

No, this occurs when I call mount(null) with the monitor being taken
down. The library should throw an Exception instead, but since the SIGSEGV
originates from libcephfs.so, I guess it's more related to Ceph's
internal code.

Best regards,

Nam Dang
Tokyo Institute of Technology
Tokyo, Japan


On Fri, Jun 1, 2012 at 8:58 AM, Noah Watkins  wrote:
>
> On May 31, 2012, at 3:39 PM, Greg Farnum wrote:
>>>
>>> Nevermind to my last comment. Hmm, I've seen this, but very rarely.
>> Noah, do you have any leads on this? Do you think it's a bug in your Java 
>> code or in the C/++ libraries?
>
> I _think_ this is because the JVM uses its own threading library, and Ceph 
> assumes pthreads and pthread compatible mutexes--is that assumption about 
> Ceph correct? Hence the error that looks like Mutex::lock(bool) being 
> referenced for context during the segfault. To verify this, all that is needed 
> is some synchronization added to the Java code.
>
> There are only two segfaults that I've ever encountered, one in which the C 
> wrappers are used with an unmounted client, and the error Nam is seeing 
> (although they could be related). I will re-submit an updated patch for the 
> former, which should rule that out as the culprit.
>
> Nam: where are you grabbing the Java patches from? I'll push some updates.
>
>
> The only other scenario that comes to mind is related to signaling:
>
> The RADOS Java wrappers suffered from an interaction between the JVM and 
> RADOS client signal handlers, in which either the JVM or RADOS would replace 
> the handlers for the other (not sure which order). Anyway, the solution was 
> to link in the JVM libjsig.so signal chaining library. This might be the same 
> thing we are seeing here, but I'm betting it is the first theory I mentioned.
>
> - Noah


Re: SIGSEGV in cephfs-java, but probably in Ceph

2012-05-31 Thread Nam Dang
Hi Noah,

By the way, the test suite of cephfs-java has a bug. You should pass
the permission value as 0777 instead of 777, since the literal has to be
octal. With decimal 777 I got directories with weird permission
settings.
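
A quick illustration of the difference (plain Java integer literals; the
commented-out mkdirs calls reuse the cephfs-java method already used by the
test suite):

// In Java an integer literal with a leading 0 is octal; without it, decimal.
// Passing decimal 777 as a mode means octal 1411, hence the odd permissions.
public class OctalModeExample {
    public static void main(String[] args) {
        System.out.println(Integer.toOctalString(777));   // prints 1411
        System.out.println(Integer.toOctalString(0777));  // prints 777
        // mount.mkdirs("/some/dir", 0777);  // correct: rwxrwxrwx
        // mount.mkdirs("/some/dir", 777);   // wrong: mode 01411
    }
}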

Best regards
Nam Dang
Tokyo Institute of Technology
Tokyo, Japan


On Thu, May 31, 2012 at 11:43 PM, Noah Watkins  wrote:
>
> On May 31, 2012, at 6:20 AM, Nam Dang wrote:
>
>>> Stack: [0x7ff6aa828000,0x7ff6aa929000],
>>> sp=0x7ff6aa9274f0,  free space=1021k
>>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
>>> code)
>>> C  [libcephfs.so.1+0x139d39]  Mutex::Lock(bool)+0x9
>>>
>>> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
>>> j  com.ceph.fs.CephMount.native_ceph_mkdirs(JLjava/lang/String;I)I+0
>>> j  com.ceph.fs.CephMount.mkdirs(Ljava/lang/String;I)V+6
>>> j  
>>> Benchmark$CreateFileStats.executeOp(IILjava/lang/String;Lcom/ceph/fs/CephMount;)J+37
>>> j  Benchmark$StatsDaemon.benchmarkOne()V+22
>>> j  Benchmark$StatsDaemon.run()V+26
>>> v  ~StubRoutines::call_stub
>
> Nevermind to my last comment. Hmm, I've seen this, but very rarely.
>
> - Noah
>


Re: SIGSEGV in cephfs-java, but probably in Ceph

2012-05-31 Thread Nam Dang
It turned out my monitor went down without my knowing.
So my bad, it wasn't because of Ceph.

Best regards,

Nam Dang
Tokyo Institute of Technology
Tokyo, Japan


On Thu, May 31, 2012 at 10:08 PM, Nam Dang  wrote:
> Dear all,
>
> I am running a small benchmark for Ceph with multithreading and cephfs-java 
> API.
> I encountered this issue even with only two threads, and I only use
> open-file and create-directory operations.
>
> The piece of code is simply:
> String parent = filePath.substring(0, filePath.lastIndexOf('/'));
> mount.mkdirs(parent, 0755); // create parents if the path does not exist
> int fileID = mount.open(filePath, CephConstants.O_CREAT, 0666); // open the file
>
> Each thread mounts its own ceph mounting point (using
> mount.mount(null)) and I don't have any interlocking mechanism across
> the threads at all.
> It appears the error is a SIGSEGV sent from libcephfs. The message is as 
> follows:
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7ff6af978d39, pid=14063, tid=140697400411904
> #
> # JRE version: 6.0_26-b03
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [libcephfs.so.1+0x139d39]  Mutex::Lock(bool)+0x9
> #
> # An error report file with more information is saved as:
> # /home/namd/cephBench/hs_err_pid14063.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://java.sun.com/webapps/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
>
> I have also attached the hs_err_pid14063.log for your reference.
> An excerpt from the file:
>
> Stack: [0x7ff6aa828000,0x7ff6aa929000],
> sp=0x7ff6aa9274f0,  free space=1021k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> C  [libcephfs.so.1+0x139d39]  Mutex::Lock(bool)+0x9
>
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j  com.ceph.fs.CephMount.native_ceph_mkdirs(JLjava/lang/String;I)I+0
> j  com.ceph.fs.CephMount.mkdirs(Ljava/lang/String;I)V+6
> j  
> Benchmark$CreateFileStats.executeOp(IILjava/lang/String;Lcom/ceph/fs/CephMount;)J+37
> j  Benchmark$StatsDaemon.benchmarkOne()V+22
> j  Benchmark$StatsDaemon.run()V+26
> v  ~StubRoutines::call_stub
>
> So I think the problem may be due to Ceph's internal locking
> mechanism. But Dr. Weil previously answered my email stating that the
> mounting is done independently, so multithreading should not lead to
> this problem. If there is any way to work around this, please let me
> know.
>
> Best regards,
>
> Nam Dang
> Email: n...@de.cs.titech.ac.jp
> HP: (+81) 080-4465-1587
> Yokota Lab, Dept. of Computer Science
> Tokyo Institute of Technology
> Tokyo, Japan


SIGSEGV in cephfs-java, but probably in Ceph

2012-05-31 Thread Nam Dang
Dear all,

I am running a small benchmark for Ceph using multithreading and the cephfs-java API.
I encountered this issue even with only two threads, and I only use
open-file and create-directory operations.

The piece of code is simply:
String parent = filePath.substring(0, filePath.lastIndexOf('/'));
mount.mkdirs(parent, 0755); // create parents if the path does not exist
int fileID = mount.open(filePath, CephConstants.O_CREAT, 0666); // open the file

Each thread mounts its own Ceph mount point (using
mount.mount(null)), and I don't have any locking across the threads at
all.
It appears the error is a SIGSEGV sent from libcephfs. The message is as follows:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7ff6af978d39, pid=14063, tid=140697400411904
#
# JRE version: 6.0_26-b03
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode
linux-amd64 compressed oops)
# Problematic frame:
# C  [libcephfs.so.1+0x139d39]  Mutex::Lock(bool)+0x9
#
# An error report file with more information is saved as:
# /home/namd/cephBench/hs_err_pid14063.log
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.

I have also attached the hs_err_pid14063.log for your reference.
An excerpt from the file:

Stack: [0x7ff6aa828000,0x7ff6aa929000],
sp=0x7ff6aa9274f0,  free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libcephfs.so.1+0x139d39]  Mutex::Lock(bool)+0x9

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  com.ceph.fs.CephMount.native_ceph_mkdirs(JLjava/lang/String;I)I+0
j  com.ceph.fs.CephMount.mkdirs(Ljava/lang/String;I)V+6
j  
Benchmark$CreateFileStats.executeOp(IILjava/lang/String;Lcom/ceph/fs/CephMount;)J+37
j  Benchmark$StatsDaemon.benchmarkOne()V+22
j  Benchmark$StatsDaemon.run()V+26
v  ~StubRoutines::call_stub

So I think the problem may be due to Ceph's internal locking
mechanism. But Dr. Weil previously answered my email stating that the
mounting is done independently, so multithreading should not lead to
this problem. If there is any way to work around this, please let me
know.

Best regards,

Nam Dang
Email: n...@de.cs.titech.ac.jp
HP: (+81) 080-4465-1587
Yokota Lab, Dept. of Computer Science
Tokyo Institute of Technology
Tokyo, Japan


hs_err_pid14063.log
Description: Binary data


Question about libcephfs mounting mechanism

2012-05-30 Thread Nam Dang
Dear all,

I want to inquire about Ceph's internal mounting mechanism. I am using
the wip-java-cephfs API to access the Ceph file system directly.
I want to create several threads that access CephFS simultaneously and
independently, i.e. the mount in each thread is independent of the
others, and there is no shared underlying data structure. As far as I
know, the Java bindings do not use a shared structure, but I'm not so
sure about the underlying code in libcephfs. I'm worried that it may be
similar to the Virtual File System layer, which multiple threads can
access concurrently but which internally shares the mount point. I hope
somebody with experience with Ceph can help me answer this question.
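
To make the intent concrete, each thread would do roughly the following (a
sketch only; I'm assuming the CephMount constructor takes a client id and
that there is an unmount() call, so please read the names loosely):

// Sketch: every thread creates and mounts its own CephMount instance and
// shares nothing at the Java level. The constructor argument and unmount()
// are assumptions about the API; mount(null) and mkdirs() are as I use them.
import com.ceph.fs.CephMount;

public class PerThreadMountSketch implements Runnable {
    public void run() {
        try {
            CephMount mount = new CephMount("admin"); // one instance per thread
            mount.mount(null);                        // mount the default root
            mount.mkdirs("/test/thread-" + Thread.currentThread().getId(), 0755);
            mount.unmount();                          // assumed cleanup call
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(new PerThreadMountSketch());
        Thread t2 = new Thread(new PerThreadMountSketch());
        t1.start(); t2.start();
        t1.join(); t2.join();
    }
}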

Thank you very much,

Best regards,
Nam Dang
Tokyo Institute of Technology
Tokyo, Japan


Re: Problems when benchmarking Ceph

2012-05-30 Thread Nam Dang
Hi,

I just figured out one problem in my benchmark: all the concurrent
threads go through the same file layer provided by the OS, so this can
become a bottleneck as the number of threads increases.
I wonder if I can connect directly to the MDS and access the
underlying file system through some library? Sorry for my inexperience,
but I haven't found any mention of file I/O operations in the
API. Did I miss something?
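
For what it's worth, the direction I have in mind is the cephfs-java bindings
(wip-java-cephfs), which go through libcephfs directly instead of the kernel
mount. A rough sketch of the create path follows (the constructor argument,
the O_CREAT value, and the close()/unmount() calls are assumptions on my part):

// Sketch: creating files through libcephfs via the Java bindings instead
// of java.io on a kernel mount. mkdirs()/open() follow the usage in my
// benchmark code; the rest is assumed and only illustrative.
import com.ceph.fs.CephMount;

public class DirectCreateSketch {
    // O_CREAT as defined on Linux (octal 0100); my benchmark keeps such
    // flags in a small CephConstants helper class of its own.
    private static final int O_CREAT = 0100;

    public static void main(String[] args) throws Exception {
        CephMount mount = new CephMount("admin");   // constructor arg assumed
        mount.mount(null);
        mount.mkdirs("/bench/dir0", 0755);
        int fd = mount.open("/bench/dir0/file0", O_CREAT, 0666);
        mount.close(fd);                            // assumed close call
        mount.unmount();                            // assumed unmount call
    }
}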

Best regards,

Nam Dang
Email: n...@de.cs.titech.ac.jp
Tokyo Institute of Technology
Tokyo, Japan


On Wed, May 30, 2012 at 7:28 PM, Nam Dang  wrote:
> Dear all,
>
> I am using Ceph as a baseline for Hadoop. In Hadoop there is a
> NNThroughputBenchmark, which tries to test the upper limit of the
> namenode (a.k.a MDS in Ceph).
> This NNThroughputBenchmark basically creates a master node, and
> creates many threads that send as many requests to the master node as
> possible. This approach minimizes the communication overhead that would
> come with employing actual clients. The code can be found here:
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.19/src/test/org/apache/hadoop/hdfs/NNThroughputBenchmark.java
>
> So I'm testing Ceph in similar manner:
> - Mount Ceph to a folder
> - Create many threads and send requests to the MDS on the MDS (so no
> network communication)
> - I do not write any data to the files: just mere file creation.
>
> However, I notice very poor performance on Ceph (only about 485
> ops/sec, as opposed to 8000 ops/sec in Hadoop), and I'm not sure why. I
> also noticed that when I tried to remove the folder created by an
> interrupted run of the benchmark mentioned above, it took so long that I
> had to Ctrl+Break out of the rm program. I'm thinking that the reason
> could be that I'm using Java IO instead of Ceph direct data
> manipulation code. Also, I didn't write any data so there shouldn't be
> any overhead of communicating with the OSDs (or is my assumption
> wrong?)
>
> So do you have any idea on this?
>
> My configuration at the moment:
> - Ceph 0.47.1
> - Intel Xeon 5 2.4Ghz, 4x2 cores
> - 24GB of RAM
> - One node for Monitor, One for MDS, 5 for OSD (of the same configuration)
> - I mount Ceph to a folder on the MDS and run the simulation on that
> folder (creating, opening, deleting files) - Right now I'm just
> working on creating files so I haven't tested with others.
>
> And I'm wondering if there is any way I can use the API to manipulate
> the file system directly instead of mounting through the OS and use
> the OS's basic file manipulation layer.
> I checked the API doc at http://ceph.com/docs/master/api/librados/ and
> it appears that there is no clear way of accessing the Ceph file
> system directly, only the object-based storage system.
>
> Thank you very much for your help!
>
> Below is the configuration of my Ceph installation:
>
> ; disable authentication
> [mon]
>        mon data = /home/namd/ceph/mon
>
> [osd]
>        osd data = /home/namd/ceph/osd
>        osd journal = /home/namd/ceph/osd.journal
>        osd journal size = 1000
>        osd min rep = 3
>        osd max rep = 3
>        ; the following line is for ext4 partition
>        filestore xattr use omap = true
>
> [mds.1]
>        host=sakura09
>
> [mds.2]
>        host=sakura10
>
> [mds.3]
>        host=sakura11
>
> [mds.4]
>        host=sakura12
>
> [mon.0]
>        host=sakura08
>        mon addr=192.168.172.178:6789
>
> [osd.0]
>        host=sakura13
>
> [osd.1]
>        host=sakura14
>
> [osd.2]
>        host=sakura15
>
> [osd.3]
>        host=sakura16
>
> [osd.4]
>        host=sakura17
>
> [osd.5]
>        host=sakura18
>
>
>
> Best regards,
>
> Nam Dang
> Email: n...@de.cs.titech.ac.jp
> Tokyo Institute of Technology
> Tokyo, Japan


Problems when benchmarking Ceph

2012-05-30 Thread Nam Dang
Dear all,

I am using Ceph as a baseline for Hadoop. In Hadoop there is a
NNThroughputBenchmark, which tries to test the upper limit of the
namenode (a.k.a MDS in Ceph).
This NNThroughputBenchmark basically creates a master node, and
creates many threads that send as many requests to the master node as
possible. This approach minimizes the communication overhead that would
come with employing actual clients. The code can be found here:
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.19/src/test/org/apache/hadoop/hdfs/NNThroughputBenchmark.java

So I'm testing Ceph in a similar manner:
- Mount Ceph to a folder
- Create many threads on the MDS node and send requests to the MDS
locally (so no network communication)
- I do not write any data to the files: just mere file creation.

However, I notice very poor performance on Ceph (only about 485
ops/sec, as opposed to 8000 ops/sec in Hadoop), and I'm not sure why. I
also noticed that when I tried to remove the folder created by an
interrupted run of the benchmark mentioned above, it took so long that I
had to Ctrl+Break out of the rm program. I'm thinking that the reason
could be that I'm using Java IO instead of Ceph direct data
manipulation code. Also, I didn't write any data so there shouldn't be
any overhead of communicating with the OSDs (or is my assumption
wrong?)
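
To be concrete, each benchmark thread currently does roughly the following
through the mounted folder (a simplified sketch, not the exact benchmark
code; the mount path and loop count are made up):

// Simplified sketch of the per-thread create loop, using plain java.io
// on the kernel-mounted Ceph folder. Paths and counts are illustrative.
import java.io.File;
import java.io.IOException;

public class CreateLoopSketch implements Runnable {
    private static final String MOUNT_DIR = "/mnt/ceph/bench"; // mounted Ceph folder
    private final int threadId;

    public CreateLoopSketch(int threadId) { this.threadId = threadId; }

    public void run() {
        for (int i = 0; i < 10000; i++) {
            File f = new File(MOUNT_DIR + "/t" + threadId, "file" + i);
            f.getParentFile().mkdirs();   // create parent directories
            try {
                f.createNewFile();        // empty file, no data written
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    public static void main(String[] args) {
        new Thread(new CreateLoopSketch(0)).start();
        new Thread(new CreateLoopSketch(1)).start();
    }
}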

So do you have any idea on this?

My configuration at the moment:
- Ceph 0.47.1
- Intel Xeon 5 2.4Ghz, 4x2 cores
- 24GB of RAM
- One node for Monitor, One for MDS, 5 for OSD (of the same configuration)
- I mount Ceph to a folder on the MDS and run the simulation on that
folder (creating, opening, deleting files) - Right now I'm just
working on creating files so I haven't tested with others.

And I'm wondering if there is any way I can use the API to manipulate
the file system directly instead of mounting through the OS and use
the OS's basic file manipulation layer.
I checked the API doc at http://ceph.com/docs/master/api/librados/ and
it appears that there is no clear way of accessing the Ceph file
system directly, only the object-based storage system.

Thank you very much for your help!

Below is the configuration of my Ceph installation:

; disable authentication
[mon]
mon data = /home/namd/ceph/mon

[osd]
osd data = /home/namd/ceph/osd
osd journal = /home/namd/ceph/osd.journal
osd journal size = 1000
osd min rep = 3
osd max rep = 3
; the following line is for ext4 partition
filestore xattr use omap = true

[mds.1]
host=sakura09

[mds.2]
host=sakura10

[mds.3]
host=sakura11

[mds.4]
host=sakura12

[mon.0]
host=sakura08
mon addr=192.168.172.178:6789

[osd.0]
host=sakura13

[osd.1]
host=sakura14

[osd.2]
host=sakura15

[osd.3]
host=sakura16

[osd.4]
host=sakura17

[osd.5]
host=sakura18



Best regards,

Nam Dang
Email: n...@de.cs.titech.ac.jp
Tokyo Institute of Technology
Tokyo, Japan


Error 5 when trying to mount Ceph 0.47.1

2012-05-24 Thread Nam Dang
Hi,

I started working with Ceph just a couple of weeks ago. At the
moment, I'm trying to set up a small cluster with 1 monitor, 1 MDS and
6 OSDs. However, I cannot mount Ceph no matter which node I execute the
mount command on.

My nodes run Ubuntu 11.10 with kernel 3.0.0-12.
Since some other people have faced similar problems, I've attached the
output of running ceph -s as follows:

2012-05-25 23:52:17.802590    pg v434: 1152 pgs: 189 active+clean, 963
stale+active+clean; 8730 bytes data, 3667 MB used, 844 GB / 893 GB
avail
2012-05-25 23:52:17.806759   mds e12: 1/1/1 up {0=1=up:replay}
2012-05-25 23:52:17.806827   osd e30: 6 osds: 1 up, 1 in
2012-05-25 23:52:17.806966   log 2012-05-25 23:44:14.584879 mon.0
192.168.172.178:6789/0 2 : [INF] mds.? 192.168.172.179:6800/6515
up:boot
2012-05-25 23:52:17.807086   mon e1: 1 mons at {0=192.168.172.178:6789/0}

I tried to use mount -t ceph node:port:/ [destination], but I keep
getting "mount error 5 = Input/output error".

I also checked whether the firewall is blocking anything with nmap -sT -p
6789 [monNode].

My Ceph version is 0.47.1, installed with sudo apt-get on the system.
I've spent a couple of days googling to no avail, and the
documentation does not address this issue at all.

Thank you very much for your help,

Best regards,
Nam Dang
Tokyo Institute of Technology
Tokyo, Japan