It would be greatly appreciated if you could give more details about why the
runJob function cannot be called in getPreferredLocations().
The NewHadoopRDD and HadoopRDD classes get their location information from
the InputSplit. But there may be an issue in NewHadoopRDD, because it
generates
Hi all,
Can anyone tell me how I can modify the heartbeat between the RM and the AM?
I need to add new requests from the RM to the AM.
These requests are basically values calculated by the RM, to be used by the
AM online.
Thanks,
Sultan
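For the interval itself: in the MapReduce AM, how often the AM heartbeats the RM through allocate() is configurable. A minimal sketch, assuming a Hadoop 2.x mapred-site.xml (note this only tunes the frequency; carrying new custom values on that heartbeat would still require changes to the AM/RM protocol handlers themselves):

```xml
<!-- mapred-site.xml: how often the MapReduce AM heartbeats the RM -->
<property>
  <name>yarn.app.mapreduce.am.scheduler.heartbeat.interval-ms</name>
  <value>1000</value> <!-- default is 1000 ms -->
</property>
```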
Dear all,
I tried to customize my own RDD. In the getPreferredLocations() function, I
used the following code to query another RDD, which was used as an input to
initialize this customized RDD:
val results: Array[Array[DataChunkPartition]] =
context.runJob(partitionsRDD,
That is a good idea.
I tried adding the following code to the getPreferredLocations() function:
val results: Array[Array[DataChunkPartition]] =
  context.runJob(partitionsRDD,
    (context: TaskContext, partIter: Iterator[DataChunkPartition]) => partIter.toArray,
    dd, allowLocal = true)
But it
Hello,
Here is one of the counters I am taking as an example:
Name                  Map     Reduce  Total
GC time elapsed (ms)  36,573  1,320   37,893
e.g.
${hadoop:counters('my-task-mr')['org.apache.hadoop.mapred.Task$Counter']['GC_TIME_MILLIS']}
gives me the total value, 37893.
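The Total column in that counter group is simply the Map value plus the Reduce value, which is easy to sanity-check:

```shell
# Map + Reduce GC time from the counter table above
echo $((36573 + 1320))   # 37893, matching the Total column
```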
Hello Piyush,
I would typically accomplish this sort of thing by using
CombineFileInputFormat, which is capable of combining multiple small files
into a single input split.
http://hadoop.apache.org/docs/r2.7.3/api/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.html
This prevents
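For an existing job, this can often be tried without code changes by overriding the input format and capping the combined split size on the command line. A hedged sketch: the jar and main class below are placeholders, the property names are from Hadoop 2.x, and it assumes the driver goes through ToolRunner/GenericOptionsParser and does not set an input format explicitly in code:

```shell
hadoop jar my-app.jar com.example.MyJob \
  -D mapreduce.job.inputformat.class=org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat \
  -D mapreduce.input.fileinputformat.split.maxsize=134217728 \
  /input/small-files /output
```

If the driver does configure the job in code, the equivalent is to call job.setInputFormatClass(CombineTextInputFormat.class) there instead.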
Hello,
It wasn't clear to me if you were asking for more details about
setuid/setgid in general, or just information about how it relates to HDFS,
so I'll try to answer both.
This statement is a reference to the POSIX notion of setuid and setgid,
which allow you to set up an executable file that
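For the general half of the question, the bits themselves are easy to poke at from any Linux shell; the file below is just a throwaway scratch file, nothing HDFS-specific:

```shell
# Minimal setuid/setgid demo on a scratch file
f=$(mktemp)
chmod 0755 "$f"
chmod u+s "$f"            # setuid: the program runs with the file owner's uid
chmod g+s "$f"            # setgid: the program runs with the file's group gid
ls -l "$f" | cut -c1-10   # mode string now reads -rwsr-sr-x
stat -c '%a' "$f"         # octal mode: 6755
rm -f "$f"
```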