It would be greatly appreciated if you could give more details about why the
runJob function cannot be called in getPreferredLocations().
The NewHadoopRDD and HadoopRDD classes get their location information from
the InputSplit. But there may be an issue in NewHadoopRDD, because it
generates
Hi all,
Can anyone tell me how I can modify the heartbeat between the RM and the AM?
I need to add new requests from the RM to the AM.
These requests are basically values calculated by the RM, to be used by the
AM online.
Thanks,
Sultan
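For the interval itself: in the MapReduce AM, how often the AM heartbeats the RM through allocate() is configurable. A minimal sketch, assuming a Hadoop 2.x mapred-site.xml (note this only tunes the frequency; carrying new custom values on that heartbeat would still require changes to the AM/RM protocol handlers themselves):

```xml
<!-- mapred-site.xml: how often the MapReduce AM heartbeats the RM -->
<property>
  <name>yarn.app.mapreduce.am.scheduler.heartbeat.interval-ms</name>
  <value>1000</value> <!-- default is 1000 ms -->
</property>
```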
Dear all,
I tried to customize my own RDD. In the getPreferredLocations() function, I
used the following code to query another RDD, which was used as an input to
initialize this customized RDD:
val results: Array[Array[DataChunkPartition]] =
context.runJob(partitionsRDD,
That is a good idea.
I tried adding the following code to the getPreferredLocations() function:
val results: Array[Array[DataChunkPartition]] =
  context.runJob(partitionsRDD,
    (context: TaskContext, partIter: Iterator[DataChunkPartition]) => partIter.toArray,
    dd, allowLocal = true)
But it
Hello,
Here is one of the counters I am taking as an example:
Name                  Map     Reduce  Total
GC time elapsed (ms)  36,573  1,320   37,893
e.g.
${hadoop:counters('my-task-mr')['org.apache.hadoop.mapred.Task$Counter']['GC_TIME_MILLIS']}
gives me the total value, 37893.
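The Total column in that counter group is simply the Map value plus the Reduce value, which is easy to sanity-check:

```shell
# Map + Reduce GC time from the counter table above
echo $((36573 + 1320))   # 37893, matching the Total column
```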
Hello Piyush,
I would typically accomplish this sort of thing by using
CombineFileInputFormat, which is capable of combining multiple small files
into a single input split.
http://hadoop.apache.org/docs/r2.7.3/api/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.html
This prevents
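For an existing job, this can often be tried without code changes by overriding the input format and capping the combined split size on the command line. A hedged sketch: the jar and main class below are placeholders, the property names are from Hadoop 2.x, and it assumes the driver goes through ToolRunner/GenericOptionsParser and does not set an input format explicitly in code:

```shell
hadoop jar my-app.jar com.example.MyJob \
  -D mapreduce.job.inputformat.class=org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat \
  -D mapreduce.input.fileinputformat.split.maxsize=134217728 \
  /input/small-files /output
```

If the driver does configure the job in code, the equivalent is to call job.setInputFormatClass(CombineTextInputFormat.class) there instead.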
Hello,
It wasn't clear to me if you were asking for more details about
setuid/setgid in general, or just information about how it relates to HDFS,
so I'll try to answer both.
This statement is a reference to the POSIX notion of setuid and setgid,
which allow you to set up an executable file that
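For the general half of the question, the bits themselves are easy to poke at from any Linux shell; the file below is just a throwaway scratch file, nothing HDFS-specific:

```shell
# Minimal setuid/setgid demo on a scratch file
f=$(mktemp)
chmod 0755 "$f"
chmod u+s "$f"            # setuid: the program runs with the file owner's uid
chmod g+s "$f"            # setgid: the program runs with the file's group gid
ls -l "$f" | cut -c1-10   # mode string now reads -rwsr-sr-x
stat -c '%a' "$f"         # octal mode: 6755
rm -f "$f"
```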