Bharath, what would a Pig query look like?
Thank you,
Mark

On Sun, Jun 26, 2011 at 5:12 PM, Bharath Mundlapudi <bharathw...@yahoo.com> wrote:

> If you have Serde or PigLoader for your log format, probably Pig or Hive
> will be a quicker solution with the join.
>
> -Bharath
>
> ________________________________
> From: Mark Kerzner <markkerz...@gmail.com>
> To: Hadoop Discussion Group <core-u...@hadoop.apache.org>
> Sent: Saturday, June 25, 2011 9:39 PM
> Subject: Comparing two logs, finding missing records
>
> Hi,
>
> I have two logs which should have all the records for the same record_id, in
> other words, if this record_id is found in the first log, it should also be
> found in the second one. However, I suspect that the second log is filtered
> out, and I need to find the missing records. Anything is allowed: MapReduce
> job, Hive, Pig, and even a NoSQL database.
>
> Thank you.
>
> It is also a good time to express my thanks to all the members of the group
> who are always very helpful.
>
> Sincerely,
> Mark
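
For reference, the join suggested above amounts to a left outer join of the first log against the second on record_id, keeping only the rows with no match on the right side. The following is a minimal Python sketch of that logic for logs small enough to test locally, not a Pig script. It assumes, hypothetically, that each log line is tab-separated with record_id as the first field and that the IDs of the second log fit in memory; the function name find_missing_ids and the delimiter parameter are illustrative and do not come from the thread.

```python
def find_missing_ids(first_log_lines, second_log_lines, delimiter="\t"):
    """Return record_ids that occur in the first log but not in the second.

    Assumption (not from the thread): record_id is the first
    delimiter-separated field of every non-blank line.
    """

    def record_id(line):
        # Take everything before the first delimiter as the record_id.
        return line.rstrip("\n").split(delimiter, 1)[0]

    # Build the lookup side of the join from the second log.
    seen_in_second = {
        record_id(line) for line in second_log_lines if line.strip()
    }

    missing = []
    reported = set()
    for line in first_log_lines:
        if not line.strip():
            continue  # skip blank lines
        rid = record_id(line)
        # Keep IDs with no match in the second log, once each,
        # in the order they first appear in the first log.
        if rid not in seen_in_second and rid not in reported:
            reported.add(rid)
            missing.append(rid)
    return missing


if __name__ == "__main__":
    first = ["1\tlogin\n", "2\tclick\n", "3\tlogout\n"]
    second = ["1\tlogin\n", "3\tlogout\n"]
    print(find_missing_ids(first, second))
```

On data too large for one machine, the same idea would be expressed as an outer join followed by a null filter in Pig or Hive, or as a reduce-side join in MapReduce where the reducer emits a record_id only when it sees no record from the second log.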