
Re: splitting of big files?

2008-05-29 Thread Doug Cutting

Erik Paulson wrote:
> When reading from HDFS, how big are the network read requests, and what controls that? Or, more concretely, if I store files using 64 MB blocks in HDFS and run the simple word count example, and I get the default of one FileSplit/Map task per 64 MB block, how many bytes into

Re: splitting of big files?

2008-05-28 Thread Erik Paulson

On Tue, May 27, 2008 at 10:49:38AM -0700, Ted Dunning wrote:
> There is a good tutorial on the wiki about this.
>
> Your problem here is that you have conflated two concepts. The first is the
> splitting of files into blocks for storage purposes. This has nothing to do
> with what data a prog

Re: splitting of big files?

2008-05-27 Thread Ted Dunning

The input format chosen determines the semantics of the input file.

On 5/27/08 9:46 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> How does the application know that the file is 'text' though (i.e. when is new
> line a special character)? Or are all files assumed to be text?
>
> And even

Re: splitting of big files?

2008-05-27 Thread Ted Dunning

There is a good tutorial on the wiki about this. Your problem here is that you have conflated two concepts. The first is the splitting of files into blocks for storage purposes. This has nothing to do with what data a program can read any more than splitting a file into blocks on a disk in a co

RE: Re: splitting of big files?

2008-05-27 Thread Andreas Kostyrka

It's text lines for streaming, which is just another Map/Reduce app. How the lines are interpreted by your app is up to your input class.

Andreas

On Tuesday, 27.05.2008, at 16:46 +, [EMAIL PROTECTED] wrote:
> From: Doug Cutting

RE: Re: splitting of big files?

2008-05-27 Thread koara

> From: Doug Cutting
>
> Each split (except the first) contains the first line starting after
> its start position through the first line ending after its end
> position. So if you have a file with:

Aha, very nice, in my browsing around t
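To make the quoted rule concrete, here is a minimal Python sketch of how a line-oriented reader could assign lines to byte-range splits. It is an illustration only, not Hadoop's actual record reader: `read_split` and `make_splits` are hypothetical names, the file is modelled as an in-memory `bytes` object, and the sketch assumes a line belongs to the split in which its first byte falls.

```python
def read_split(data: bytes, start: int, end: int) -> list[bytes]:
    """Return the lines owned by the byte range [start, end) of data.

    Assumption for this sketch: a line is owned by the split that
    contains its first byte, so each line is read exactly once.
    """
    pos = start
    if start > 0:
        # Not the first split: skip ahead to the first line that starts
        # at or after `start`. Searching from start - 1 keeps a line that
        # begins exactly on the split boundary.
        nl = data.find(b"\n", start - 1)
        pos = len(data) if nl == -1 else nl + 1

    lines = []
    # Keep reading while the next line starts inside this split. The last
    # line read may run past `end`; it is still returned in full.
    while pos < end and pos < len(data):
        nl = data.find(b"\n", pos)
        stop = len(data) if nl == -1 else nl
        lines.append(data[pos:stop])
        pos = stop + 1
    return lines


def make_splits(length: int, split_size: int) -> list[tuple[int, int]]:
    """Cut [0, length) into consecutive byte ranges of at most split_size."""
    return [(s, min(s + split_size, length)) for s in range(0, length, split_size)]


if __name__ == "__main__":
    data = b"aaa\nbbbbbb\ncc\n"
    for start, end in make_splits(len(data), 5):
        print((start, end), read_split(data, start, end))
```

With 5-byte splits over this 14-byte input, the line `bbbbbb` starts in the first split and crosses into the second, so the first split reads it whole and the second split yields nothing. Split boundaries therefore never cut a record in half, whatever the block size used for storage.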
