Please don't use attachments. They should be stripped by the Apache
mailer. There are a bunch of mail archiver sites which don't save
attachments.
Lance
On Sun, Dec 26, 2010 at 8:20 AM, Harsh J wrote:
> Hi,
>
> On Sun, Dec 26, 2010 at 6:29 PM, Black, Michael (IS)
> wrote:
>> I assume there's a
Hi,
On Sun, Dec 26, 2010 at 6:29 PM, Black, Michael (IS)
wrote:
> I assume there's a way to make a specific # of splits and add each document
> to the separate splits...but I'll be darned if I can find the docs or an
> example to show this.
Would CombineFileInputFormat and CombineFileSplit be
Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Northrop Grumman Information Systems
From: ?? [mailto:toppi...@gmail.com]
Sent: Sat 12/25/2010 10:32 AM
To: common-user@hadoop.apache.org
Subject: EXTERNAL:Re: Custom input split
What is the file you have attached? It is not safe.
I don't know the format of lucene index, would you please give an example?
On Sat, Dec 25, 2010 at 12:34 AM, Black, Michael (IS) <
michael.bla...@ngc.com> wrote:
> Using hadoop-0.20
>
>
> I'm doing custom input splits from a Lucene index.
>
>
Using hadoop-0.20
I'm doing custom input splits from a Lucene index.
I want to split the document IDs across N mappers (I'm testing the
scalability of the problem across 4 nodes and 8 cores).
So the key is the document number, and the keys are not sequential.
At this point I'm using splits.add to add eac
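
The idea above, spreading a list of non-sequential document IDs across N
mappers, can be sketched in plain Java without any Hadoop classes. This is
only an illustration under assumptions: the class name DocIdPartitioner, the
method partition, and the sample IDs are hypothetical and are not from the
original post. In a real job, each resulting group would presumably be wrapped
in a custom InputSplit and returned from the InputFormat's getSplits method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: not part of Hadoop or Lucene.
public class DocIdPartitioner {

    /**
     * Distributes document IDs round-robin into numSplits groups, so the
     * group sizes differ by at most one even when the IDs are not sequential.
     */
    public static List<List<Integer>> partition(List<Integer> docIds, int numSplits) {
        if (numSplits <= 0) {
            throw new IllegalArgumentException("numSplits must be positive");
        }
        List<List<Integer>> groups = new ArrayList<List<Integer>>();
        for (int i = 0; i < numSplits; i++) {
            groups.add(new ArrayList<Integer>());
        }
        // Assign by position in the list, not by ID value, so gaps in the
        // ID sequence do not unbalance the groups.
        for (int i = 0; i < docIds.size(); i++) {
            groups.get(i % numSplits).add(docIds.get(i));
        }
        return groups;
    }

    public static void main(String[] args) {
        // Made-up, non-sequential document IDs for illustration only.
        List<Integer> docIds = Arrays.asList(3, 7, 8, 15, 21, 40, 41, 99, 120, 305);
        List<List<Integer>> groups = partition(docIds, 4);
        for (int i = 0; i < groups.size(); i++) {
            System.out.println("split " + i + ": " + groups.get(i));
        }
    }
}
```

Assigning by list position rather than by ID value keeps the groups balanced
regardless of gaps in the IDs; whether that is the right trade-off depends on
how expensive each document is to process.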