Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2013-01-10 Thread Gregory Farnum
On Wed, Jan 9, 2013 at 8:00 AM, Noah Watkins noah.watk...@inktank.com wrote: Hi Jutta, On Wed, Jan 9, 2013 at 7:11 AM, Lachfeld, Jutta jutta.lachf...@ts.fujitsu.com wrote: the current content of the web page http://ceph.com/docs/master/cephfs/hadoop shows a configuration parameter

RE: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2013-01-09 Thread Lachfeld, Jutta
, December 13, 2012 9:33 PM To: Gregory Farnum Cc: Cameron Bahar; Sage Weil; Lachfeld, Jutta; ceph-devel@vger.kernel.org; Noah Watkins; Joe Buck Subject: Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue The bindings use the default Hadoop settings

Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2013-01-04 Thread Gregory Farnum
Sorry for the delay; I've been out on vacation... On Fri, Dec 14, 2012 at 6:09 AM, Lachfeld, Jutta jutta.lachf...@ts.fujitsu.com wrote: I do not have the full output of ceph pg dump for that specific TeraSort run, but here is a typical output after automatically preparing CEPH for a

RE: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-14 Thread Lachfeld, Jutta
From: Noah Watkins [mailto:jayh...@cs.ucsc.edu] Sent: Thursday, December 13, 2012 9:33 PM To: Gregory Farnum Cc: Cameron Bahar; Sage Weil; Lachfeld, Jutta; ceph-devel@vger.kernel.org; Noah Watkins; Joe Buck Subject: Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance

Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-14 Thread Mark Nelson
On 12/13/2012 08:54 AM, Lachfeld, Jutta wrote: Hi all, Hi! Sorry to send this a bit late; it looks like the reply I authored yesterday from my phone got eaten by vger. I am currently doing some comparisons between CEPH FS and HDFS as a file system for Hadoop using Hadoop's integrated

Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-13 Thread Lachfeld, Jutta
Hi all, I am currently doing some comparisons between CEPH FS and HDFS as a file system for Hadoop using Hadoop's integrated benchmark TeraSort. This benchmark first generates the specified amount of data in the file system used by Hadoop, e.g. 1TB of data, and then sorts the data via the

Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-13 Thread Sage Weil
Hi Jutta, On Thu, 13 Dec 2012, Lachfeld, Jutta wrote: Hi all, I am currently doing some comparisons between CEPH FS and HDFS as a file system for Hadoop using Hadoop's integrated benchmark TeraSort. This benchmark first generates the specified amount of data in the file system used by

Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-13 Thread Gregory Farnum
On Thu, Dec 13, 2012 at 9:27 AM, Sage Weil s...@inktank.com wrote: Hi Jutta, On Thu, 13 Dec 2012, Lachfeld, Jutta wrote: Hi all, I am currently doing some comparisons between CEPH FS and HDFS as a file system for Hadoop using Hadoop's integrated benchmark TeraSort. This benchmark first

Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-13 Thread Cameron Bahar
Is the chunk size tunable in a Ceph cluster? I don't mean dynamic, but even statically configurable when a cluster is first installed? Thanks, Cameron Sent from my iPhone On Dec 13, 2012, at 9:41 AM, Gregory Farnum g...@inktank.com wrote: On Thu, Dec 13, 2012 at 9:27 AM, Sage Weil

Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-13 Thread Gregory Farnum
On Thu, Dec 13, 2012 at 12:23 PM, Cameron Bahar cba...@gmail.com wrote: Is the chunk size tunable in a Ceph cluster? I don't mean dynamic, but even statically configurable when a cluster is first installed? Yeah. You can set chunk size on a per-file basis; you just can't change it once the

Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark performance comparison issue

2012-12-13 Thread Noah Watkins
The bindings use the default Hadoop settings (e.g. 64 or 128 MB chunks) when creating new files. The chunk size can also be specified on a per-file basis using the same interface as Hadoop. Additionally, while Hadoop doesn't provide an interface to configuration parameters beyond chunk size, we