Re: Computing overlap of two files with hadoop

2011-06-27 Thread Claus Stadler
-files-with-ha Best regards, Claus On 06/24/2011 12:44 PM, Claus Stadler wrote: Hi, My problem is as follows: I have two input files, and I want to determine a) The number of lines which only occur in file 1 b) The number of lines which only occur in file 2 c) The number of lines common to both

Computing overlap of two files with hadoop

2011-06-24 Thread Claus Stadler
Hi, My problem is as follows: I have two input files, and I want to determine a) The number of lines which only occur in file 1 b) The number of lines which only occur in file 2 c) The number of lines common to both (e.g. in regard to string equality) Example: File 1: a b c File 2: a d
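
One possible way to approach the question above, sketched in plain Python rather than as an actual Hadoop job: tag every line with the file it came from (the "map" step), group the tags by line content, and classify each distinct line by the set of tags it collected (the "reduce" step). The function names `map_phase`, `reduce_phase`, and `overlap_counts` are hypothetical, and the sketch assumes that the example files contain one letter per line and that duplicate lines within a file count once.

```python
from collections import defaultdict


def map_phase(tag, lines):
    """Emit (line, tag) pairs, where tag identifies the source file (1 or 2)."""
    for line in lines:
        yield line.rstrip("\n"), tag


def reduce_phase(pairs):
    """Group tags by line content and count lines per category."""
    tags_by_line = defaultdict(set)
    for line, tag in pairs:
        tags_by_line[line].add(tag)

    counts = {"only_1": 0, "only_2": 0, "common": 0}
    for tags in tags_by_line.values():
        if tags == {1, 2}:
            counts["common"] += 1
        elif tags == {1}:
            counts["only_1"] += 1
        else:
            counts["only_2"] += 1
    return counts


def overlap_counts(file1_lines, file2_lines):
    """Run the map and reduce steps in memory (assumption: inputs fit in RAM)."""
    pairs = list(map_phase(1, file1_lines)) + list(map_phase(2, file2_lines))
    return reduce_phase(pairs)


# The example from the post, assuming one letter per line.
print(overlap_counts(["a", "b", "c"], ["a", "d"]))
```

In a real MapReduce job the grouping by line content would be done by the shuffle phase, with the line as the key and the file tag as the value; the in-memory dictionary here only stands in for that step.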

Re: ClassCastException with LineRecordReader (hadoop release version 0.21.0)

2011-06-02 Thread Claus Stadler
On 04/26/2011 12:52 AM, Claus Stadler wrote: Hi, Thank you for the reply. However, I now have the question of what the recommended way is to get hadoop working with this fix. Is there documentation for this? So some of my questions right now are: . Does the svn head revision usually work

Re: ClassCastException with LineRecordReader (hadoop release version 0.21.0)

2011-04-25 Thread Claus Stadler
Hi, Thank you for the reply. However, I now have the question of what the recommended way is to get hadoop working with this fix. Is there documentation for this? So some of my questions right now are: . Does the svn head revision usually work? . If not, is there a specific revision that is

ClassCastException with LineRecordReader (hadoop release version 0.21.0)

2011-04-20 Thread Claus Stadler
Hi, I guess I am not the first one to see the following exception when trying to initialize a LineRecordReader. However, so far I couldn't figure out a workaround for this problem. I saw that this problem was fixed in the svn, but when I checked out one of the 0.23.0 versions (I can't