Re: Loading Data to HDFS

2012-10-30 Thread Ranjith
Along the lines of the email below, have any libraries been built to copy files into the cluster in parallel, using some sort of byte-offset technique, etc.? Thanks, Ranjith On Oct 30, 2012, at 9:24 AM, "M. C. Srivas" wrote: > Loading a petabyte from a single machine will t

Re: running select count in hive keeps on pending

2012-08-09 Thread Ranjith
The file loaded last could be corrupted. Try to decompress the file and see if you get any errors. Thanks, Ranjith On Aug 9, 2012, at 8:07 PM, rei125 wrote: > s.
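
The decompression check suggested above can be sketched as follows. This is a minimal illustration, not something from the thread: it assumes the file is gzip-compressed and has been copied to local disk, and the helper name `check_gzip` is hypothetical. It streams through the whole file and reports the first decompression error, if any.

```python
import gzip
import zlib


def check_gzip(path, chunk_size=1 << 20):
    """Return None if the gzip file decompresses cleanly, else an error message.

    Assumption: the file is gzip-compressed; other codecs need their own reader.
    """
    try:
        with gzip.open(path, "rb") as f:
            # Read to the end in chunks so large files are not held in memory.
            while f.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error) as exc:
        # Bad header/CRC raises OSError (BadGzipFile), truncation raises EOFError,
        # and a damaged deflate stream raises zlib.error.
        return f"{type(exc).__name__}: {exc}"
    return None
```

A result of `None` only means the file decompresses without error; it does not rule out other causes for the pending query.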

MR job tuning

2012-05-31 Thread Ranjith
, record and the data buffers. Given that my JVM for map tasks is 700m and the space left after taking out the space used for buffers is 400m, what is stored in this 400m? Thanks, Ranjith

Re: CopyFromLocal

2012-05-22 Thread Ranjith
Harsh, Thanks for the response bud. Appreciate it! Thanks, Ranjith On May 21, 2012, at 11:09 PM, Harsh J wrote: > Ranjith, > > MapReduce and HDFS are two different things. MapReduce uses HDFS (and > can use any other FS as well) to do some efficient work, but HDFS does > no

Re: CopyFromLocal

2012-05-21 Thread Ranjith
Thanks, Harsh. So when it connects directly to the data nodes, it does not fire off any mappers. So how does it get the data over? Is it just a block-by-block copy? Thanks, Ranjith On May 21, 2012, at 9:22 PM, Harsh J wrote: > Ranjith, > > Are you speaking of DistC

CopyFromLocal

2012-05-21 Thread Ranjith
I have always wondered about this and am not sure what explains it. When I fire a MapReduce job to copy data over in a distributed fashion, I would expect to see mappers executing the copy. What happens with a copy command from hadoop fs? Thanks, Ranjith

Re: problem setting up multi-user cluster using locally-mounted shared filesystem

2012-05-17 Thread Ranjith
user and >> permissions must be rwx-- Thanks, Ranjith On May 17, 2012, at 5:37 AM, Luca Pireddu wrote: > Hello all, > > we're trying to set up a multi-user MapReduce cluster that doesn't use HDFS. > The idea is to use a central, shared JobTracker to which

Re: How to load raw log file into HDFS?

2012-05-14 Thread Ranjith
hear from the rest of the community about this to see if it is consistent with what they have seen. Thanks, Ranjith On May 14, 2012, at 8:45 PM, "Manish Bhoge" wrote: > You first need to copy data using copyFromLocal to your HDFS and then you can > utilize PIG and Hive program for f