Re: Deduplication Effort in Hadoop

2011-07-14 Thread C.V.Krishnakumar Iyer
Hi, I guess by "system" you meant HDFS. In that case HBase might help. HBase needs to have unique keys. They are just bytes, so I guess you can just concatenate multiple columns in your primary key (if you have a primary key spanning more than one column) to have a key for HBase, so that duplicates don
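
The key-concatenation idea in the message above can be sketched as follows. This is only a minimal illustration, not HBase client code: a plain Python dict stands in for the table, and `make_row_key`, the sample records, and the length-prefix encoding are all assumptions made for the example. The length prefix is added so that ("ab", "c") and ("a", "bc") do not collapse into the same bytes; the original message only says to concatenate the columns.

```python
def make_row_key(*columns: str) -> bytes:
    """Build one byte-string key from several column values.

    Each value is prefixed with its 4-byte length (an assumption of this
    sketch) so that different column splits never produce the same key.
    """
    parts = []
    for value in columns:
        encoded = value.encode("utf-8")
        parts.append(len(encoded).to_bytes(4, "big") + encoded)
    return b"".join(parts)


# Hypothetical records: (user, day, action). The primary key spans
# the first two columns.
records = [
    ("alice", "2011-07-14", "click"),
    ("bob", "2011-07-14", "view"),
    ("alice", "2011-07-14", "click"),  # duplicate of the first record
]

# A dict stands in for a key-value table: writing the same key again
# overwrites the earlier value instead of adding a second row.
table = {}
for user, day, action in records:
    table[make_row_key(user, day)] = action

print(len(table))
```

Because the key is derived only from the primary-key columns, a repeated record maps to an existing key and replaces the stored value rather than creating a new entry.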

Command Line Arguments for Client

2011-02-22 Thread C.V.Krishnakumar Iyer
Hi, Could anyone tell me how to set the command-line arguments (like -Xmx and -Xms) for the client (not for the map/reduce tasks) from the command that is usually used to launch the job? Thanks, Krishnakumar

Re: libjars options

2011-01-11 Thread C.V.Krishnakumar Iyer
file (can view >> via a JT Web UI)? >> >> -- >> Alex Kozlov >> Solutions Architect >> Cloudera, Inc >> twitter: alexvk2009 >> <http://www.cloudera.com/company/press-center/hadoop-world-nyc/> >> >> >> On Tue, Jan 11, 2011 a

Re: libjars options

2011-01-11 Thread C.V.Krishnakumar Iyer
>> Solutions Architect >> Cloudera, Inc >> twitter: alexvk2009 >> <http://www.cloudera.com/company/press-center/hadoop-world-nyc/> >> >> >> On Tue, Jan 11, 2011 at 11:49 AM, C.V.Krishnakumar Iyer < >> f2004...@gmail.com> wrote: >

Re: libjars options

2011-01-11 Thread C.V.Krishnakumar Iyer
Hi, I have tried that as well, using -files. But it still gives the exact same error. Is there anything else that I could try? Thanks, Krishna. On Jan 11, 2011, at 10:23 AM, Ted Yu wrote: > Refer to Alex Kozlov's answer on 12/11/10 > > On Tue, Jan 11, 2011 at 10:10 AM, C.V.Kri

libjars options

2011-01-11 Thread C.V.Krishnakumar Iyer
Hi, Could anyone please guide me as to how to use the -libjars option in HDFS? I have added the necessary jar file (the hbase jar, to be precise) to the classpath of the node where I am starting the job. The following is the command that I am invoking: bin/hadoop jar -libjars bin/hadoo

-libjars option

2011-01-10 Thread C.V.Krishnakumar Iyer
Hi, Could anyone please guide me as to how to use the -libjars option in HDFS? I have added the necessary jar file (the hbase jar, to be precise) to the classpath of the node where I am starting the job. The following is the command that I am invoking: bin/hadoop jar -libjars bin/hadoo