Re: Map tasks processing some files multiple times

2012-12-06 Thread Hemanth Yamijala
as not-splittable? This seems to be a bug in Hadoop code if I'm right. David From: Raj Vishwanathan [mailto:rajv...@yahoo.com] Sent: Thursday, December 06, 2012 1:45 PM To: user@hadoop.apache.org Subject: Re: Map tasks processing some files multiple times

Re: Map tasks processing some files multiple times

2012-12-06 Thread Hemanth Yamijala
out what I had done. Dave From: Hemanth Yamijala [mailto:yhema...@thoughtworks.com] Sent: Thursday, December 06, 2012 3:25 PM To: user@hadoop.apache.org Subject: Re: Map tasks processing some files multiple times David, You

Map tasks processing some files multiple times

2012-12-05 Thread David Parks
I've got a job that reads in 167 files from S3, but 2 of the files are being mapped twice and 1 of the files is mapped 3 times. This is the code I use to set up the mapper: Path lsDir = new Path("s3n://fruggmapreduce/input/catalogs/linkshare_catalogs/*~*"); for(FileStatus f :
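One possible cause of a file being mapped more than once, not confirmed by the truncated excerpt above, is that the same path gets registered as a job input more than once while looping over the listing. A minimal sketch of a check for that, using only the Java standard library: the class `InputPathCheck`, the method `findDuplicates`, and the example bucket paths are hypothetical names for illustration, not part of Hadoop or of the original job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not a Hadoop API): reports every input path that
// appears more than once in the list that is about to be added to a job.
class InputPathCheck {

    // Returns each duplicated path once, in the order its first repeat was seen.
    static List<String> findDuplicates(List<String> paths) {
        Set<String> seen = new LinkedHashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String p : paths) {
            // Set.add returns false when the element is already present.
            if (!seen.add(p)) {
                duplicates.add(p);
            }
        }
        return new ArrayList<>(duplicates);
    }

    public static void main(String[] args) {
        // Made-up paths; in a real job these would be the strings of the
        // paths collected before they are registered as inputs.
        List<String> inputs = Arrays.asList(
            "s3n://example-bucket/input/a.txt",
            "s3n://example-bucket/input/b.txt",
            "s3n://example-bucket/input/a.txt");
        System.out.println("Duplicate inputs: " + findDuplicates(inputs));
    }
}
```

Running such a check on the collected paths before submitting the job would show whether the repeated map tasks come from the input list itself rather than from the framework.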

Re: Map tasks processing some files multiple times

2012-12-05 Thread Raj Vishwanathan
Could it be due to spec-ex? Does it make a difference in the end? Raj From: David Parks davidpark...@yahoo.com To: user@hadoop.apache.org Sent: Wednesday, December 5, 2012 10:15 PM Subject: Map tasks processing some files multiple times I've got a job

RE: Map tasks processing some files multiple times

2012-12-05 Thread David Parks
@hadoop.apache.org Subject: Re: Map tasks processing some files multiple times Could it be due to spec-ex? Does it make a difference in the end? Raj From: David Parks davidpark...@yahoo.com To: user@hadoop.apache.org Sent: Wednesday, December 5, 2012 10:15 PM Subject: Map tasks