Re: FileUtil Copy example

2014-09-25 Thread Susheel Kumar Gadalay
Yes, but I am generating multiple output file names with prefix. So I know their prefix size. On 9/25/14, João Alves j...@5dlab.com wrote: Hey Susheel, Maybe it would be a good idea to check if the string is long enough for your substring operation. (i.e.

Re: FileUtil Copy example

2014-09-25 Thread João Alves
Hey Susheel, Maybe it would be a good idea to check if the string is long enough for your substring operation. (i.e. fstatus[i].getPath().getName().length() >= 10) On 25 Sep 2014, at 07:14, Susheel Kumar Gadalay skgada...@gmail.com wrote: I solved it like this. This will move a file from one
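
A minimal Java sketch of the length check suggested above. It works on a plain file name string so that it runs without Hadoop on the classpath; in the actual job the name would presumably come from fstatus[i].getPath().getName(). The class and method names (PrefixNames, stripPrefix) are hypothetical, and the prefix length of 10 is only the value used in this message.

```java
import java.util.Optional;

// Hypothetical helper, not part of Hadoop or of the original poster's code.
final class PrefixNames {
    private PrefixNames() {}

    // Returns the part of fileName that follows the first prefixLength
    // characters, or Optional.empty() when the name is too short for
    // substring(prefixLength) to be called safely.
    static Optional<String> stripPrefix(String fileName, int prefixLength) {
        if (prefixLength < 0) {
            throw new IllegalArgumentException("prefixLength must be >= 0");
        }
        if (fileName == null || fileName.length() < prefixLength) {
            return Optional.empty();
        }
        return Optional.of(fileName.substring(prefixLength));
    }
}
```

A caller could skip files for which stripPrefix(name, 10) is empty instead of hitting a StringIndexOutOfBoundsException. If every generated name is known to carry the prefix, as the follow-up reply says, the check is only a safeguard against unexpected files in the directory.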

Which script is generated launch_container.sh file?

2014-09-25 Thread Nur Kholis Majid
Hi, Which script generates the launch_container.sh file when we run yarn jar? The generated script is located in ${yarn.nodemanager.local-dirs}/nm-local-dir/usercache/${username}/appcache/application_*/container_*/ and I need to insert a command to change the LOG_DIRS permissions. Thanks.

Re: Unable to use transfer data using distcp between EC2-classic cluster and VPC cluster

2014-09-25 Thread Jameel Al-Aziz
Hi Ankit, Thanks for the info. However, we are not using EMR. We are using our own cluster. We have tried everything listed before in an attempt to get the nodes to register with their public DNS name. The frustration comes from the fact that Hadoop/HDFS seemingly ignores our attempts and

Hive query not working

2014-09-25 Thread Aditya exalter
Hi all, I have a Hive table partitioned by date (d) as a string, but while running the query I am getting the following exception. SELECT * FROM click WHERE d=2014-09-25 ; FAILED: SemanticException MetaException(message:javax.jdo.JDOException: Invocation of method substring on StringExpression

Hadoop shuffling traffic

2014-09-25 Thread Abdul Navaz
Hello, I have a Hadoop cluster with 1 name node and 3 data nodes. I am running a sample word count job on a 1GB file which is distributed across HDFS. When I run the MapReduce job, the reduce phase starts before the map phase has reached 100%, e.g. map 40%, reduce 10%. I would like to

YARN Newbie Question in ApplicationMaster

2014-09-25 Thread Dhanasekaran Anbalagan
Hi Guys, I am new to the YARN framework. For example, I have a YARN cluster and I am planning to submit MapReduce, Storm, and Spark applications. Please correct me if I am wrong: when I submit each job, it creates its own instance of the MapReduce, Spark, or Storm application master on different nodes.

Re: Hadoop shuffling traffic

2014-09-25 Thread Bing Jiang
See mapreduce.job.reduce.slowstart.completedmaps. It hints at when reduce tasks can kick off. 2014-09-26 8:36 GMT+08:00 Abdul Navaz navaz@gmail.com: Hello, I am having a Hadoop cluster with 1 name node and 3 data nodes. I running sample word count job on 1GB of file which is
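
A rough Java sketch of how the mapreduce.job.reduce.slowstart.completedmaps fraction could translate into a number of finished map tasks. The property name comes from this reply. The class and method names are hypothetical, and rounding up with Math.ceil is an assumption about how the threshold is applied, not code taken from Hadoop. To my knowledge the shipped default is 0.05, which would explain reduce progress showing up while maps are still running.

```java
// Hypothetical illustration only; this is not Hadoop's scheduler code.
final class SlowstartSketch {
    private SlowstartSketch() {}

    // Number of map tasks that must complete before reduce tasks may be
    // scheduled, given the slowstart fraction in the range [0.0, 1.0].
    // Assumption: the fractional threshold is rounded up.
    static int mapsNeededBeforeReduces(double slowstart, int totalMaps) {
        if (slowstart < 0.0 || slowstart > 1.0) {
            throw new IllegalArgumentException("slowstart must be in [0, 1]");
        }
        if (totalMaps < 0) {
            throw new IllegalArgumentException("totalMaps must be >= 0");
        }
        return (int) Math.ceil(slowstart * totalMaps);
    }

    // True once enough maps have completed for reduces to begin fetching.
    static boolean reducesMayStart(int completedMaps, int totalMaps, double slowstart) {
        return completedMaps >= mapsNeededBeforeReduces(slowstart, totalMaps);
    }
}
```

Under these assumptions, a job with 8 map tasks and a fraction of 0.05 may start reduces after the first map completes, while a fraction of 1.0 holds them back until all 8 maps are done. The reduce function itself still runs only after every map output has been fetched; what starts early is the shuffle (copy) phase.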

Re: Hadoop shuffling traffic

2014-09-25 Thread karthikeyan S
The reducer starts as soon as it has data available from any one of the mappers. The reducer keeps polling the AM and asks if any mapper has completed processing. If so it fetches data from that mapper. So it's not necessary for all the mappers of a task to complete for the reducer to start