Hi All,
I am running a job with between 1300 and 1400 map tasks. Some
map tasks fail due to an error. When 4 such maps fail, the job
gets killed. How can I ignore the failed tasks and go on executing the
other map tasks? I am okay with losing some data for the failed t
Hi,
I have a set of items and a list of pairwise similar items. I want to group
together items that are mutually similar.
For example, if *A B C D E F G* are the items
and I have the following pairwise similar items:
*A B*
*A C*
*B C*
*D E*
*C G*
*E F*
I want the output as
*A B C G*
*D E F*
Can someone su
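
One way to read the expected output above is as the connected components of the similarity graph (G joins the first group only through C). Below is a minimal single-machine sketch using a union-find structure, assuming the pairs fit in memory. The function name `group_items` is hypothetical and not taken from any library.

```python
def group_items(pairs):
    """Group items that are linked, directly or indirectly, by similar pairs."""
    parent = {}

    def find(x):
        # Register unseen items as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), []).append(item)
    return sorted(sorted(group) for group in groups.values())


pairs = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "E"), ("C", "G"), ("E", "F")]
for group in group_items(pairs):
    print(" ".join(group))
# prints:
# A B C G
# D E F
```

If "mutually similar" is instead meant to require every pair in a group to be directly similar, this sketch does not apply as written.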
Hi,
I have a set of hashes. Each hash is a 32-bit integer. Two hashes
are similar if their Hamming distance is less than or equal to 2.
I need to group together hashes that are mutually similar to one another,
i.e., each line of the output file should contain mutually similar k
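
The similarity test described above can be sketched as follows. This is only an illustration that assumes the hashes fit in memory and compares all pairs, which is quadratic in the number of hashes. The names `hamming_distance` and `similar_pairs` are hypothetical.

```python
from itertools import combinations


def hamming_distance(a, b):
    # XOR leaves a 1 bit wherever the two hashes differ; mask to 32 bits.
    return bin((a ^ b) & 0xFFFFFFFF).count("1")


def similar_pairs(hashes, max_distance=2):
    # Brute-force comparison of every pair of hashes.
    return [
        (a, b)
        for a, b in combinations(hashes, 2)
        if hamming_distance(a, b) <= max_distance
    ]


hashes = [0b0000, 0b0011, 0b0111, 0xFFFF0000]
print(similar_pairs(hashes))
# prints: [(0, 3), (3, 7)]
```

Note that in this sample 0 and 7 are each similar to 3 but not to each other (their distance is 3), so whether they belong in the same output line depends on how "mutually similar" is defined.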
Hi,
I have an input file where each line is of the form:
URLs whose number is within a threshold are considered similar. My
task is to group together all similar URLs. For this I wrote a *custom
writable* where I implemented the threshold check in the
*compareTo* meth
An input file in which each line corresponds to a document. Each document is
identified by some fingerprints. For example, a line in the input file
is of the following form:
input:
DOCID1 HASH1 HASH2 HASH3 HASH4
DOCID2 HASH5 HASH3 HASH1 HASH4
The output of the MapReduce job
Consider an input file of the following format:
Input file:
1 2
2 3
3 4
6 7
7 9
10 11
The output should be as follows:
1 2 3 4
6 7 9
10 11
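
The grouping above can be sketched with minimum-label propagation, where every node repeatedly adopts the smallest label seen across its edges until nothing changes. Each pass over the edge list here corresponds only loosely to one round of an iterative MapReduce job; this is a single-machine illustration, not a Hadoop implementation, and `propagate_labels` is a hypothetical name.

```python
def propagate_labels(edges):
    """Return groups of nodes that are connected through the given edges."""
    # Every node starts out labelled with its own id.
    label = {}
    for a, b in edges:
        label.setdefault(a, a)
        label.setdefault(b, b)

    # Repeat passes over the edges until no label changes.
    changed = True
    while changed:
        changed = False
        for a, b in edges:
            smallest = min(label[a], label[b])
            if label[a] != smallest or label[b] != smallest:
                label[a] = label[b] = smallest
                changed = True

    # Nodes that ended up with the same label form one group.
    groups = {}
    for node, lab in label.items():
        groups.setdefault(lab, []).append(node)
    return [sorted(group) for _, group in sorted(groups.items())]


edges = [(1, 2), (2, 3), (3, 4), (6, 7), (7, 9), (10, 11)]
for group in propagate_labels(edges):
    print(" ".join(str(node) for node in group))
# prints:
# 1 2 3 4
# 6 7 9
# 10 11
```

The number of passes grows with the length of the longest chain in the input, which is the main cost to weigh if each pass becomes a separate job.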
Hi,
I need to read an existing Lucene index in a map. Can someone point
me in the right direction?
Thanks,
Parnab