Equally split an RDD partition into two partitions on the same node

2017-01-14 Thread Fei Hu
Dear all,

I want to split an RDD partition into two equal partitions: the first half of
the elements in the parent partition would form one new partition, and the
second half would form the other. The two new partitions need to stay on the
same node as their parent partition, which helps achieve high data locality.

Does anyone know how to implement this, or have any hints?
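
For illustration, one possible direction is a custom RDD that maps each parent partition
to two child partitions and reports the parent's preferred locations for both halves. The
sketch below is rough and untested; SplitInTwoRDD and HalfPartition are made-up names, and
buffering the parent partition into an array is a simplification:

import scala.reflect.ClassTag

import org.apache.spark.{NarrowDependency, Partition, TaskContext}
import org.apache.spark.rdd.RDD

// Child partition that remembers its parent partition and which half it holds.
class HalfPartition(override val index: Int, val parent: Partition, val firstHalf: Boolean)
  extends Partition

class SplitInTwoRDD[T: ClassTag](prev: RDD[T])
  extends RDD[T](prev.context, Seq(new NarrowDependency[T](prev) {
    // Child partitions 2i and 2i+1 both depend on parent partition i.
    override def getParents(partitionId: Int): Seq[Int] = Seq(partitionId / 2)
  })) {

  override def getPartitions: Array[Partition] =
    prev.partitions.flatMap { p =>
      Seq[Partition](new HalfPartition(2 * p.index, p, firstHalf = true),
                     new HalfPartition(2 * p.index + 1, p, firstHalf = false))
    }

  // Both halves report the parent's preferred locations, so they stay on the same node.
  override def getPreferredLocations(split: Partition): Seq[String] =
    prev.preferredLocations(split.asInstanceOf[HalfPartition].parent)

  override def compute(split: Partition, context: TaskContext): Iterator[T] = {
    val half = split.asInstanceOf[HalfPartition]
    // Simplification: buffer the parent partition so it can be split at the midpoint.
    val elements = firstParent[T].iterator(half.parent, context).toArray
    val (first, second) = elements.splitAt(elements.length / 2)
    if (half.firstHalf) first.iterator else second.iterator
  }
}

Note that preferred locations are only hints to the scheduler; strict colocation is not
guaranteed.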

Thanks in advance,
Fei


Re: RDD Location

2016-12-30 Thread Fei Hu
It would be much appreciated if you could give more details about why the
runJob function cannot be called inside getPreferredLocations().

The NewHadoopRDD and HadoopRDD classes get their location information from the
InputSplits. But there may be an issue in NewHadoopRDD: it generates all of the
InputSplits on the driver (master) node, so only a single node is used to
generate and filter the InputSplits even when their number is huge. Could this
become a performance bottleneck?
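
For illustration (a rough, untested sketch; LocatedRDD and hostsByPartition are made-up
names): the per-partition host hints could be computed on the driver before the custom
RDD is constructed, for example by running an ordinary action such as
partitionsRDD.mapPartitionsWithIndex(...).collect() first, and then simply returned from
getPreferredLocations(), so that no job has to run inside that callback:

import scala.reflect.ClassTag

import org.apache.spark.{Partition, TaskContext}
import org.apache.spark.rdd.RDD

// The host hints are computed up front on the driver and passed in as a plain Map,
// so getPreferredLocations() never launches a job.
class LocatedRDD[T: ClassTag](parent: RDD[T], hostsByPartition: Map[Int, Seq[String]])
  extends RDD[T](parent) {

  override def getPartitions: Array[Partition] = firstParent[T].partitions

  override def compute(split: Partition, context: TaskContext): Iterator[T] =
    firstParent[T].iterator(split, context)

  override def getPreferredLocations(split: Partition): Seq[String] =
    hostsByPartition.getOrElse(split.index, Nil)
}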

Thanks,
Fei





On Fri, Dec 30, 2016 at 10:41 PM, Sun Rui <sunrise_...@163.com> wrote:

> You can’t call runJob inside getPreferredLocations().
> You can take a look at the source code of HadoopRDD to help you implement
> getPreferredLocations() appropriately.
>
> On Dec 31, 2016, at 09:48, Fei Hu <hufe...@gmail.com> wrote:
>
> That is a good idea.
>
> I tried adding the following code to the getPreferredLocations() function:
>
> val results: Array[Array[DataChunkPartition]] = context.runJob(
>   partitionsRDD,
>   (context: TaskContext, partIter: Iterator[DataChunkPartition]) => partIter.toArray,
>   dd, allowLocal = true)
>
> But execution seems to hang inside this function. If I move the code
> somewhere else, such as the main() function, it runs fine.
>
> What is the reason for it?
>
> Thanks,
> Fei
>
> On Fri, Dec 30, 2016 at 2:38 AM, Sun Rui <sunrise_...@163.com> wrote:
>
>> Maybe you can create your own subclass of RDD and override
>> getPreferredLocations() to implement the logic for dynamically changing the
>> locations.
>> > On Dec 30, 2016, at 12:06, Fei Hu <hufe...@gmail.com> wrote:
>> >
>> > Dear all,
>> >
>> > Is there any way to change the host location for a certain partition of
>> RDD?
>> >
>> > "protected def getPreferredLocations(split: Partition)" can be used to
>> initialize the location, but how to change it after the initialization?
>> >
>> >
>> > Thanks,
>> > Fei
>> >
>> >
>>
>>
>>
>
>


context.runJob() hangs in the getPreferredLocations() function

2016-12-30 Thread Fei Hu
Dear all,

I tried to customize my own RDD. In its getPreferredLocations() function, I
used the following code to query another RDD, which was used as an input to
initialize this customized RDD:

val results: Array[Array[DataChunkPartition]] = context.runJob(
  partitionsRDD,
  (context: TaskContext, partIter: Iterator[DataChunkPartition]) => partIter.toArray,
  partitions, allowLocal = true)

The problem is that when the above code executes, the task seems to hang: the
job just stops at this code, with no errors and no output.

What could be the reason?

Thanks,
Fei


Re: RDD Location

2016-12-30 Thread Fei Hu
That is a good idea.

I tried adding the following code to the getPreferredLocations() function:

val results: Array[Array[DataChunkPartition]] = context.runJob(
  partitionsRDD,
  (context: TaskContext, partIter: Iterator[DataChunkPartition]) => partIter.toArray,
  dd, allowLocal = true)

But execution seems to hang inside this function. If I move the code somewhere
else, such as the main() function, it runs fine.

What could be the reason?

Thanks,
Fei

On Fri, Dec 30, 2016 at 2:38 AM, Sun Rui <sunrise_...@163.com> wrote:

> Maybe you can create your own subclass of RDD and override
> getPreferredLocations() to implement the logic for dynamically changing the
> locations.
> > On Dec 30, 2016, at 12:06, Fei Hu <hufe...@gmail.com> wrote:
> >
> > Dear all,
> >
> > Is there any way to change the host location for a certain partition of
> RDD?
> >
> > "protected def getPreferredLocations(split: Partition)" can be used to
> initialize the location, but how to change it after the initialization?
> >
> >
> > Thanks,
> > Fei
> >
> >
>
>
>


RDD Location

2016-12-29 Thread Fei Hu
Dear all,

Is there any way to change the host location for a certain partition of an RDD?

"protected def getPreferredLocations(split: Partition)" can be used to
initialize the locations, but how can they be changed after initialization?
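
For illustration only, one way to realize the "dynamic" idea suggested in the replies
above is to back getPreferredLocations() with a driver-side map that can be updated
between jobs. A rough, untested sketch (all names hypothetical):

import scala.collection.mutable
import scala.reflect.ClassTag

import org.apache.spark.{Partition, TaskContext}
import org.apache.spark.rdd.RDD

class DynamicLocationRDD[T: ClassTag](parent: RDD[T]) extends RDD[T](parent) {

  // Updated on the driver between jobs, e.g. rdd.hostsByPartition(0) = Seq("node-a").
  val hostsByPartition = mutable.Map.empty[Int, Seq[String]]

  override def getPartitions: Array[Partition] = firstParent[T].partitions

  override def compute(split: Partition, context: TaskContext): Iterator[T] =
    firstParent[T].iterator(split, context)

  // Consulted by the scheduler on the driver, so it sees the latest map contents.
  override def getPreferredLocations(split: Partition): Seq[String] =
    hostsByPartition.getOrElse(split.index, Nil)
}

Changes only affect jobs submitted after the map is updated, and the locations remain
hints rather than hard constraints.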


Thanks,
Fei


Kryo on Zeppelin

2016-10-10 Thread Fei Hu
Hi All,

I am running some Spark Scala code in Zeppelin on CDH 5.5.1 (Spark version
1.5.0). I customized the Spark interpreter to use
org.apache.spark.serializer.KryoSerializer as spark.serializer, and in the
dependencies I added Kryo 3.0.3 as follows:
 com.esotericsoftware:kryo:3.0.3
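
For comparison, a minimal sketch of the same serializer setting expressed in a standalone
Spark 1.x program (rather than through the interpreter settings); the application name and
the registered classes below are only examples:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("kryo-example") // example name only
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Optionally register application classes so Kryo can emit compact class identifiers.
  .registerKryoClasses(Array(classOf[Array[Double]], classOf[Seq[String]]))

val sc = new SparkContext(conf)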


When I write the Scala notebook and run the program, I get the following
errors. However, if I compile the same code into a jar and run it on the
cluster with spark-submit, it works without any errors.

WARN [2016-10-10 23:43:40,801] ({task-result-getter-1} Logging.scala[logWarning]:71) - Lost task 0.0 in stage 3.0 (TID 9, svr-A3-A-U20): java.io.EOFException

        at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:196)
        at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:217)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:178)
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1175)
        at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:165)
        at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:88)
        at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        at org.apache.spark.scheduler.Task.run(Task.scala:88)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)


There were also some errors when I ran the Zeppelin Tutorial:

Caused by: java.io.IOException: java.lang.NullPointerException

        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1163)
        at org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:70)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
        at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
        at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
        ... 3 more

Caused by: java.lang.NullPointerException

        at com.twitter.chill.WrappedArraySerializer.read(WrappedArraySerializer.scala:38)
        at com.twitter.chill.WrappedArraySerializer.read(WrappedArraySerializer.scala:23)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
        at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:192)
        at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1$$anonfun$apply$mcV$sp$2.apply(ParallelCollectionRDD.scala:80)
        at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1$$anonfun$apply$mcV$sp$2.apply(ParallelCollectionRDD.scala:80)
        at org.apache.spark.util.Utils$.deserializeViaNestedStream(Utils.scala:142)
        at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1.apply$mcV$sp(ParallelCollectionRDD.scala:80)
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1160)

Does anyone know why this happens?

Thanks in advance,
Fei


Spark application Runtime Measurement

2016-07-09 Thread Fei Hu
Dear all,

I have a question about how to measure the runtime of a Spark application.
Here is an example:


   - On the Spark UI, the total duration is 2.0 minutes = 120 seconds, as
   shown below.

[image: Screen Shot 2016-07-09 at 11.45.44 PM.png]

   - However, when I check the jobs launched by the application, the time
   is 13s + 0.8s + 4s = 17.8 seconds, which is much less than 120 seconds. I
   am not sure which time I should choose to measure the performance of the
   Spark application.

[image: Screen Shot 2016-07-09 at 11.48.26 PM.png]

   - I also checked the event timeline, shown below. There is a big gap
   between the second job and the third job, and I do not know what happened
   during that gap.

[image: Screen Shot 2016-07-09 at 11.53.29 PM.png]

Could anyone help explain which time is the right one to use when measuring
the performance of a Spark application?
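
As a rough illustration rather than an authoritative answer, one common fallback is to
measure wall-clock time around the driver code itself, since the total duration shown on
the UI spans the whole application (including driver-side work between jobs) while the
per-job times cover only the jobs:

// Driver-side wall-clock timing around the part of the application being measured.
val start = System.nanoTime()

// ... build RDDs and run the actions of interest here ...

val elapsedSeconds = (System.nanoTime() - start) / 1e9
println(f"Measured wall-clock time: $elapsedSeconds%.1f s")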

Thanks in advance,
Fei


Remotely submit a job to Yarn on CDH5.4

2015-08-18 Thread Fei Hu
Hi,

I want to remotely submit a job to YARN on CDH 5.4. Below are the WordCount
code and the error report. Does anyone know how to solve it?

Thanks in advance,
Fei



INFO: Job job_1439867352386_0025 failed with state FAILED due to: Application 
application_1439867352386_0025 failed 2 times due to AM Container for 
appattempt_1439867352386_0025_02 exited with  exitCode: 1
For more detailed output, check application tracking 
page:http://compute-04:8088/proxy/application_1439867352386_0025/Then, click on 
links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1439867352386_0025_02_01
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.


public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    System.setProperty("HADOOP_USER_NAME", "hdfs");
    conf.set("hadoop.job.ugi", "supergroup");

    conf.set("mapreduce.framework.name", "yarn");
    conf.set("fs.defaultFS", "hdfs://compute-04:8020");
    conf.set("mapreduce.map.java.opts", "-Xmx1024M");
    conf.set("mapreduce.reduce.java.opts", "-Xmx1024M");

    conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
    conf.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
    conf.set("yarn.resourcemanager.address", "199.25.200.134:8032");
    conf.set("yarn.resourcemanager.resource-tracker.address", "199.25.200.134:8031");
    conf.set("yarn.resourcemanager.scheduler.address", "199.25.200.134:8030");
    conf.set("yarn.resourcemanager.admin.address", "199.25.200.134:8033");

    conf.set("yarn.nodemanager.aux-services", "mapreduce_shuffle");

    conf.set("yarn.application.classpath",
            "/etc/hadoop/conf.cloudera.hdfs,"
            + "/etc/hadoop/conf.cloudera.yarn,"
            + "/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop/*,"
            + "/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop/lib/*,"
            + "/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-hdfs/*,"
            + "/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-hdfs/lib/*,"
            + "/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-yarn/*,"
            + "/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hadoop-yarn/lib/*");

    GenericOptionsParser optionParser = new GenericOptionsParser(conf, args);
    String[] remainingArgs = optionParser.getRemainingArgs();
    if (remainingArgs.length != 2 && remainingArgs.length != 4) {
        System.err.println("Usage: wordcount <in> <out> [-skip skipPatternFile]");
        System.exit(2);
    }
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount2.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    List<String> otherArgs = new ArrayList<String>();
    for (int i = 0; i < remainingArgs.length; ++i) {
        if ("-skip".equals(remainingArgs[i])) {
            job.addCacheFile(new Path(remainingArgs[++i]).toUri());
            job.getConfiguration().setBoolean("wordcount.skip.patterns", true);
        } else {
            otherArgs.add(remainingArgs[i]);
        }
    }
    FileInputFormat.addInputPath(job, new Path(otherArgs.get(0)));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs.get(1)));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
}


Re: Container beyond virtual memory limits

2015-03-23 Thread Fei Hu
Thank you for your help. It is useful.

Best,
Fei
 On Mar 23, 2015, at 1:09 AM, Drake민영근 drake@nexr.com wrote:
 
 Hi,
 
 See 6. Killing of Tasks Due to Virtual Memory Usage in  
 http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/
  
 http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/
  
 
 Drake 민영근 Ph.D
 kt NexR
 
 On Sun, Mar 22, 2015 at 12:43 PM, Fei Hu hufe...@gmail.com 
 mailto:hufe...@gmail.com wrote:
 Hi,
 
 I just test my yarn installation, and run a Wordcount program. But it always 
 report the following error, who knows how to solve it? Thank you in advance.
 
 Container [pid=7954,containerID=container_1426992254950_0002_01_05] is 
 running beyond virtual memory limits. Current usage: 13.6 MB of 1 GB physical 
 memory used; 4.3 GB of 2.1 GB virtual memory used. Killing container.
 Dump of the process-tree for container_1426992254950_0002_01_05 :
   |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
   |- 7960 7954 7954 7954 (java) 5 0 4576591872 3199 
 /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java 
 -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN 1638 
 -Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1426992254950_0002/container_1426992254950_0002_01_05/tmp
  -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.container.log.dir=/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05
  -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 org.apache.hadoop.mapred.YarnChild 199.26.254.140 36542 
 attempt_1426992254950_0002_m_03_0 5 
   |- 7954 7949 7954 7954 (bash) 0 0 65421312 275 /bin/bash -c 
 /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java 
 -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  1638 
 -Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1426992254950_0002/container_1426992254950_0002_01_05/tmp
  -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.container.log.dir=/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05
  -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 org.apache.hadoop.mapred.YarnChild 199.26.254.140 36542 
 attempt_1426992254950_0002_m_03_0 5 
 1/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05/stdout
  
 2/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05/stderr
   
 
 Exception from container-launch.
 Container id: container_1426992254950_0002_01_05
 Exit code: 1
 Stack trace: ExitCodeException exitCode=1: 
   at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
   at org.apache.hadoop.util.Shell.run(Shell.java:455)
   at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
   at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:679)
 
 Thanks,
 Fei
 



Re: Container beyond virtual memory limits

2015-03-23 Thread Fei Hu
Thank you. It works.

Best,
Fei
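
For reference, the ratio Gaurav mentions in the quoted reply below corresponds to the
yarn.nodemanager.vmem-pmem-ratio property (default 2.1), set in yarn-site.xml on the
NodeManagers; the value 5 here only mirrors the suggestion made in this thread:

<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>5</value>
</property>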


 On Mar 23, 2015, at 11:27 AM, Gaurav Gupta gaurav.gopi...@gmail.com wrote:
 
 Increasing vmem:pmem ratio should help you out. Default value is 2:1, change 
 it to 5:1
 
 On Sun, Mar 22, 2015 at 10:09 PM, Drake민영근 drake@nexr.com 
 mailto:drake@nexr.com wrote:
 Hi,
 
 See 6. Killing of Tasks Due to Virtual Memory Usage in  
 http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/
  
 http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/
  
 
 Drake 민영근 Ph.D
 kt NexR
 
 On Sun, Mar 22, 2015 at 12:43 PM, Fei Hu hufe...@gmail.com 
 mailto:hufe...@gmail.com wrote:
 Hi,
 
 I just test my yarn installation, and run a Wordcount program. But it always 
 report the following error, who knows how to solve it? Thank you in advance.
 
 Container [pid=7954,containerID=container_1426992254950_0002_01_05] is 
 running beyond virtual memory limits. Current usage: 13.6 MB of 1 GB physical 
 memory used; 4.3 GB of 2.1 GB virtual memory used. Killing container.
 Dump of the process-tree for container_1426992254950_0002_01_05 :
   |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
   |- 7960 7954 7954 7954 (java) 5 0 4576591872 3199 
 /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java 
 -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN 1638 
 -Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1426992254950_0002/container_1426992254950_0002_01_05/tmp
  -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.container.log.dir=/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05
  -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 org.apache.hadoop.mapred.YarnChild 199.26.254.140 36542 
 attempt_1426992254950_0002_m_03_0 5 
   |- 7954 7949 7954 7954 (bash) 0 0 65421312 275 /bin/bash -c 
 /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java 
 -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  1638 
 -Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1426992254950_0002/container_1426992254950_0002_01_05/tmp
  -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.container.log.dir=/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05
  -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 org.apache.hadoop.mapred.YarnChild 199.26.254.140 36542 
 attempt_1426992254950_0002_m_03_0 5 
 1/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05/stdout
  
 2/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05/stderr
   
 
 Exception from container-launch.
 Container id: container_1426992254950_0002_01_05
 Exit code: 1
 Stack trace: ExitCodeException exitCode=1: 
   at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
   at org.apache.hadoop.util.Shell.run(Shell.java:455)
   at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
   at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:679)
 
 Thanks,
 Fei
 
 



Container beyond virtual memory limits

2015-03-21 Thread Fei Hu
Hi,

I just tested my YARN installation and ran a WordCount program, but it always
reports the following error. Does anyone know how to solve it? Thank you in
advance.

Container [pid=7954,containerID=container_1426992254950_0002_01_05] is 
running beyond virtual memory limits. Current usage: 13.6 MB of 1 GB physical 
memory used; 4.3 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1426992254950_0002_01_05 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 7960 7954 7954 7954 (java) 5 0 4576591872 3199 
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java 
-Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN 1638 
-Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1426992254950_0002/container_1426992254950_0002_01_05/tmp
 -Dlog4j.configuration=container-log4j.properties 
-Dyarn.app.container.log.dir=/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05
 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
org.apache.hadoop.mapred.YarnChild 199.26.254.140 36542 
attempt_1426992254950_0002_m_03_0 5 
|- 7954 7949 7954 7954 (bash) 0 0 65421312 275 /bin/bash -c 
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java 
-Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  1638 
-Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1426992254950_0002/container_1426992254950_0002_01_05/tmp
 -Dlog4j.configuration=container-log4j.properties 
-Dyarn.app.container.log.dir=/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05
 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
org.apache.hadoop.mapred.YarnChild 199.26.254.140 36542 
attempt_1426992254950_0002_m_03_0 5 
1>/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05/stdout 
2>/home/hadoop-lzl/hadoop-2.6.0/logs/userlogs/application_1426992254950_0002/container_1426992254950_0002_01_05/stderr 

Exception from container-launch.
Container id: container_1426992254950_0002_01_05
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)

Thanks,
Fei

Re: Prune out data to a specific reduce task

2015-03-12 Thread Fei Hu
Maybe you could use Partitioner.class to solve your problem.
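
A minimal sketch of that idea, written in Scala purely for illustration (the Text and
IntWritable key/value types are just examples): a custom Partitioner that routes every
record to reduce task 0, so the other reduce task receives no input.

import org.apache.hadoop.io.{IntWritable, Text}
import org.apache.hadoop.mapreduce.Partitioner

// Every record goes to reduce task 0; reduce task 1 gets nothing.
class SingleReducerPartitioner extends Partitioner[Text, IntWritable] {
  override def getPartition(key: Text, value: IntWritable, numReduceTasks: Int): Int = 0
}

// Registered on the job with: job.setPartitionerClass(classOf[SingleReducerPartitioner])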

 On Mar 11, 2015, at 6:28 AM, xeonmailinglist-gmail xeonmailingl...@gmail.com 
 mailto:xeonmailingl...@gmail.com wrote:
 
 Hi,
 
 I have this job that has 3 map tasks and 2 reduce tasks. But, I want to 
 excludes data that will go to the reduce task 2. This means that, only 
 reducer 1 will produce data, and the other one will be empty, or even it 
 doesn't execute.
 
 How can I do this in MapReduce?
 
 ExampleJobExecution.png
 
 
 Thanks,
 
 -- 
 --



Re: Prune out data to a specific reduce task

2015-03-12 Thread Fei Hu
In the Reducer.class, you could ignore the data that you want to exclude based 
on the key or value.


 On Mar 12, 2015, at 12:47 PM, xeonmailinglist-gmail 
 xeonmailingl...@gmail.com wrote:
 
 If I use the partitioner, I must be able to tell map reduce to not execute 
 values from a certain reduce tasks.
 
 The method public int
   getPartition(K key, V value, int numReduceTasks) must always return 
 a partition. I can’t return -1. Thus, I don’ t know how to tell Mapreduce to 
 not execute data from a partition. Any suggestion?
 
  Forwarded Message 
 
 Subject: Re: Prune out data to a specific reduce task
 
 Date: Thu, 12 Mar 2015 12:40:04 -0400
 
 From: Fei Hu hufe...@gmail.com http://mailto:hufe...@gmail.com/
 Reply-To: user@hadoop.apache.org mailto:user@hadoop.apache.org
 To: user@hadoop.apache.org mailto:user@hadoop.apache.org
 Maybe you could use Partitioner.class to solve your problem.
 
 
 
 On Mar 11, 2015, at 6:28 AM, xeonmailinglist-gmail 
 xeonmailingl...@gmail.com mailto:xeonmailingl...@gmail.com wrote:
 
 Hi,
 
 I have this job that has 3 map tasks and 2 reduce tasks. But, I want to 
 excludes data that will go to the reduce task 2. This means that, only 
 reducer 1 will produce data, and the other one will be empty, or even it 
 doesn't execute.
 
 How can I do this in MapReduce?
 
 ExampleJobExecution.png
 
 
 Thanks,
 
 -- 
 --
 
 



Monitor data transformation

2015-03-02 Thread Fei Hu
Hi All,

I developed a scheduler for data locality. Now I want to test the performance 
of the scheduler, so I need to monitor how much data is read remotely. Is 
there any tool for monitoring the volume of data moved around the cluster?

Thanks,
Fei

Data locality

2015-03-02 Thread Fei Hu
Hi All,

I developed a scheduler for data locality. Now I want to test the performance 
of the scheduler, so I need to monitor how much data is read remotely. Is 
there any tool for monitoring the volume of data moved around the cluster?

Thanks,
Fei

Re: Data Placement Strategy for HDFS

2015-01-14 Thread Fei Hu
Hi,

Thank you for your help.

I searched HDFS-7613 by Google and the link 
https://issues.apache.org/jira/issues/?jql=project%20%3D%20HDFS%20AND%20text%20~%20%227613%22,
 but I could not find it.

Could you email me the link? Thank you very much.

Sincerely,
Fei 


 On Jan 14, 2015, at 5:43 PM, Ted Yu yuzhih...@gmail.com wrote:
 
 Fei:
 You can watch this issue:
 HDFS-7613 Block placement policy for erasure coding groups
 
 the solution there would be helpful to us.
 
 Cheers
 
 On Wed, Jan 14, 2015 at 11:04 AM, Fei Hu hufe...@gmail.com 
 mailto:hufe...@gmail.com wrote:
 Thank you for your quick response.
 
 After reading the materials you recommended, my conclusion is that Hadoop 
 does not provide interface to customize the data placement policy. We need to 
 add some codes to the source package of HDFS. Is that right?
 
 Thanks,
 Fei
 
 On Tue Jan 13 2015 at 10:42:27 PM Ted Yu yuzhih...@gmail.com 
 mailto:yuzhih...@gmail.com wrote:
 See this thread: http://search-hadoop.com/m/lAh9i28K7 
 http://search-hadoop.com/m/lAh9i28K7
 
 See also HDFS-7228
 
 Cheers
 
 On Tue, Jan 13, 2015 at 7:33 PM, Fei Hu hufe...@gmail.com 
 mailto:hufe...@gmail.com wrote:
 Hi,
 
 I want to customize the data placement strategy rather than using the default 
 strategy in HDFS. Is there any way to control which datanode the replica is 
 delivered to?
 
 Thank you in advance.
 
 Best regards,
 Fei
 
 



Re: Data Placement Strategy for HDFS

2015-01-14 Thread Fei Hu
Thank you very much.

Fei

 On Jan 14, 2015, at 7:36 PM, Ted Yu yuzhih...@gmail.com wrote:
 
 https://issues.apache.org/jira/browse/HDFS-7613 
 https://issues.apache.org/jira/browse/HDFS-7613
 
 Cheers
 
 On Wed, Jan 14, 2015 at 4:35 PM, Fei Hu hufe...@gmail.com 
 mailto:hufe...@gmail.com wrote:
 Hi,
 
 Thank you for your help.
 
 I searched HDFS-7613 by Google and the link 
 https://issues.apache.org/jira/issues/?jql=project%20%3D%20HDFS%20AND%20text%20~%20%227613%22
  
 https://issues.apache.org/jira/issues/?jql=project%20=%20HDFS%20AND%20text%20~%20%227613%22,
  but I could not find it.
 
 Could you email me the link? Thank you very much.
 
 Sincerely,
 Fei 
 
 
 On Jan 14, 2015, at 5:43 PM, Ted Yu yuzhih...@gmail.com 
 mailto:yuzhih...@gmail.com wrote:
 
 Fei:
 You can watch this issue:
 HDFS-7613 Block placement policy for erasure coding groups
 
 the solution there would be helpful to us.
 
 Cheers
 
 On Wed, Jan 14, 2015 at 11:04 AM, Fei Hu hufe...@gmail.com 
 mailto:hufe...@gmail.com wrote:
 Thank you for your quick response.
 
 After reading the materials you recommended, my conclusion is that Hadoop 
 does not provide interface to customize the data placement policy. We need 
 to add some codes to the source package of HDFS. Is that right?
 
 Thanks,
 Fei
 
 On Tue Jan 13 2015 at 10:42:27 PM Ted Yu yuzhih...@gmail.com 
 mailto:yuzhih...@gmail.com wrote:
 See this thread: http://search-hadoop.com/m/lAh9i28K7 
 http://search-hadoop.com/m/lAh9i28K7
 
 See also HDFS-7228
 
 Cheers
 
 On Tue, Jan 13, 2015 at 7:33 PM, Fei Hu hufe...@gmail.com 
 mailto:hufe...@gmail.com wrote:
 Hi,
 
 I want to customize the data placement strategy rather than using the 
 default strategy in HDFS. Is there any way to control which datanode the 
 replica is delivered to?
 
 Thank you in advance.
 
 Best regards,
 Fei
 
 
 
 



Data Placement Strategy for HDFS

2015-01-13 Thread Fei Hu
Hi,

I want to customize the data placement strategy rather than using the default 
strategy in HDFS. Is there any way to control which datanode the replica is 
delivered to?

Thank you in advance.

Best regards,
Fei

Re: Hadoop Installation on Multihomed Networks

2014-12-12 Thread Fei Hu
I solved the problem by changing the hosts file as follows:
10.10.0.10 10.5.0.10 yngcr10nc01

Thanks,
Fei



 On Nov 11, 2014, at 11:58 AM, daemeon reiydelle daeme...@gmail.com wrote:
 
 You may want to consider configuring host names that embed the subnet in the 
 host name itself (e.g. foo50, foo40, for foo via each of the 50 and 40 
 subnets). ssh key file contents  etc may have to be fiddled with a bit
 
 
 ...
 “The race is not to the swift,
 nor the battle to the strong,
 but to those who can see it coming and jump aside.” - Hunter Thompson
 
 Daemeon C.M. Reiydelle
 USA (+1) 415.501.0198
 London (+44) (0) 20 8144 9872
 
 On Tue, Nov 11, 2014 at 7:47 AM, Fei Hu hufe...@gmail.com 
 mailto:hufe...@gmail.com wrote:
 Hi,
 
 I am trying to install Hadoop1.0.4 on multihomed networks. I have done it as 
 the link 
 http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html
  
 http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html.
  But it still doesn’t work.
 
 The datanode could not work. Its log is as the following. In my 
 hdfs-site.xml, I set the ip:10.50.0.10 for datanode, but in the following 
 report, host = yngcr10nc01/10.10.0.10 http://10.10.0.10/.I think it may be 
 because in /etc/hosts file, I add the pair of ip and hostname:10.10.0.10 
 YNGCR10NC01 before. 
 
 The problem now is that I could not add 10.50.0.10 YNGCR10NC01 into hosts 
 file, because 10.10.0.10 YNGCR10NC01 is necessary for another program.
 
 Is there any way to solve the problem on multihomed networks?
 
 Thanks in advance,
 Fei Hu
 
 
 
 
 
 2014-11-11 04:20:28,228 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
 STARTUP_MSG: 
 /
 STARTUP_MSG: Starting DataNode
 STARTUP_MSG:   host = yngcr10nc01/10.10.0.10 http://10.10.0.10/
 STARTUP_MSG:   args = []
 STARTUP_MSG:   version = 1.0.4
 STARTUP_MSG:   build = 
 https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 
 https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 
 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
 /
 2014-11-11 04:20:28,436 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: 
 loaded properties from hadoop-metrics2.properties
 2014-11-11 04:20:28,446 INFO 
 org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source 
 MetricsSystem,sub=Stats registered.
 2014-11-11 04:20:28,447 INFO 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period 
 at 10 second(s).
 2014-11-11 04:20:28,447 INFO 
 org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system 
 started
 2014-11-11 04:20:28,572 INFO 
 org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi 
 registered.
 2014-11-11 04:20:28,607 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded 
 the native-hadoop library
 2014-11-11 04:20:29,870 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: yngcr11hm01/10.50.0.5:9000 http://10.50.0.5:9000/. Already tried 
 0 time(s).
 2014-11-11 04:20:30,871 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: yngcr11hm01/10.50.0.5:9000 http://10.50.0.5:9000/. Already tried 
 1 time(s).
 2014-11-11 04:20:31,872 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: yngcr11hm01/10.50.0.5:9000 http://10.50.0.5:9000/. Already tried 
 2 time(s).
 2014-11-11 04:20:32,873 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: yngcr11hm01/10.50.0.5:9000 http://10.50.0.5:9000/. Already tried 
 3 time(s).
 2014-11-11 04:20:33,874 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: yngcr11hm01/10.50.0.5:9000 http://10.50.0.5:9000/. Already tried 
 4 time(s).
 2014-11-11 04:20:34,875 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: yngcr11hm01/10.50.0.5:9000 http://10.50.0.5:9000/. Already tried 
 5 time(s).
 2014-11-11 04:20:35,876 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: yngcr11hm01/10.50.0.5:9000 http://10.50.0.5:9000/. Already tried 
 6 time(s).
 2014-11-11 04:20:36,877 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: yngcr11hm01/10.50.0.5:9000 http://10.50.0.5:9000/. Already tried 
 7 time(s).
 2014-11-11 04:20:37,877 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: yngcr11hm01/10.50.0.5:9000 http://10.50.0.5:9000/. Already tried 
 8 time(s).
 2014-11-11 04:20:38,879 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: yngcr11hm01/10.50.0.5:9000 http://10.50.0.5:9000/. Already tried 
 9 time(s).
 2014-11-11 04:20:38,884 ERROR 
 org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to 
 yngcr11hm01/10.50.0.5:9000 http://10.50.0.5:9000/ failed on local 
 exception: java.net.NoRouteToHostException: No route to host
   at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107

Hadoop Installation on Multihomed Networks

2014-11-11 Thread Fei Hu
Hi,

I am trying to install Hadoop 1.0.4 on multihomed networks. I followed the instructions at 
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html,
 but it still doesn’t work.

The datanode does not work; its log is shown below. In hdfs-site.xml I set the IP 
10.50.0.10 for the datanode, but in the log below the host is reported as 
yngcr10nc01/10.10.0.10. I think this may be because I had previously added the IP/hostname 
pair 10.10.0.10 YNGCR10NC01 to the /etc/hosts file.

The problem now is that I cannot add 10.50.0.10 YNGCR10NC01 to the hosts file, because 
10.10.0.10 YNGCR10NC01 is needed by another program.

Is there any way to solve the problem on multihomed networks?

Thanks in advance,
Fei Hu





2014-11-11 04:20:28,228 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
STARTUP_MSG: 
/
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = yngcr10nc01/10.10.0.10
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.4
STARTUP_MSG:   build = 
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; 
compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
/
2014-11-11 04:20:28,436 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: 
loaded properties from hadoop-metrics2.properties
2014-11-11 04:20:28,446 INFO 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source 
MetricsSystem,sub=Stats registered.
2014-11-11 04:20:28,447 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
Scheduled snapshot period at 10 second(s).
2014-11-11 04:20:28,447 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
DataNode metrics system started
2014-11-11 04:20:28,572 INFO 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi 
registered.
2014-11-11 04:20:28,607 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded 
the native-hadoop library
2014-11-11 04:20:29,870 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: yngcr11hm01/10.50.0.5:9000. Already tried 0 time(s).
2014-11-11 04:20:30,871 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: yngcr11hm01/10.50.0.5:9000. Already tried 1 time(s).
2014-11-11 04:20:31,872 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: yngcr11hm01/10.50.0.5:9000. Already tried 2 time(s).
2014-11-11 04:20:32,873 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: yngcr11hm01/10.50.0.5:9000. Already tried 3 time(s).
2014-11-11 04:20:33,874 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: yngcr11hm01/10.50.0.5:9000. Already tried 4 time(s).
2014-11-11 04:20:34,875 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: yngcr11hm01/10.50.0.5:9000. Already tried 5 time(s).
2014-11-11 04:20:35,876 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: yngcr11hm01/10.50.0.5:9000. Already tried 6 time(s).
2014-11-11 04:20:36,877 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: yngcr11hm01/10.50.0.5:9000. Already tried 7 time(s).
2014-11-11 04:20:37,877 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: yngcr11hm01/10.50.0.5:9000. Already tried 8 time(s).
2014-11-11 04:20:38,879 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: yngcr11hm01/10.50.0.5:9000. Already tried 9 time(s).
2014-11-11 04:20:38,884 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
java.io.IOException: Call to yngcr11hm01/10.50.0.5:9000 failed on local 
exception: java.net.NoRouteToHostException: No route to host
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
at org.apache.hadoop.ipc.Client.call(Client.java:1075)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at com.sun.proxy.$Proxy5.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:370)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:429)
at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:331)
at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:296)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:356)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:299)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
Caused by: java.net.NoRouteToHostException: No route to host

Datanode could not work because its IP is not the same as specified in hdfs-site.xml

2014-11-10 Thread Fei Hu
Hi,

I am installing Hadoop 1.0.4 on our cluster, and I have run into a problem with the IP 
setting for the datanode. It may be a multihomed-network issue. I have tried to solve the 
problem by following 
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html,
 but it still does not work.

There are two networks on each computer. For example: 
  on the computer whose hostname is YNGCR10NC01, it has two ip:
brpub Link encap:Ethernet  HWaddr EC:F4:BB:C4:86:28  
  inet addr:10.10.0.10  Bcast:10.10.255.255  Mask:255.255.0.0

em3   Link encap:Ethernet  HWaddr EC:F4:BB:C4:86:2C  
  inet addr:10.50.0.10  Bcast:10.50.0.255  Mask:255.255.255.0

Now I want to use the IP 10.50.0.10 for the DataNode. In hdfs-site.xml, I changed 
some properties as follows:

<property>
  <name>dfs.datanode.address</name>
  <value>10.50.0.10:50010</value>
</property>

<property>
  <name>dfs.datanode.http.address</name>
  <value>10.50.0.10:50075</value>
</property>

But because I am using the IP 10.10.0.10 for another job, I previously added it to 
/etc/hosts as follows:

127.0.0.1   localhost localhost.localdomain localhost4 
localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 
localhost6.localdomain6
10.10.0.10 YNGCR10NC01   //This is occupied by another program, so I 
could not add 10.50.0.10 YNGCR10NC01 to the hosts file

10.50.0.5 yngcr11hm01//This is a master node

Therefore, I could not add 10.50.0.10 YNGCR10NC01 to the hosts file

After I start hadoop, the datanode log reports the following errors:
ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: 
Call to yngcr11hm01/10.50.0.5:9000 failed on local exception: 
java.net.NoRouteToHostException: No route to host

The beginning of the datanode log shows:
/
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = yngcr10nc01/10.10.0.10
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.4
STARTUP_MSG:   build = 
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; 
compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
/
I don’t know why the host IP is still 10.10.0.10; I want it to be 10.50.0.10. Maybe it is 
caused by the hosts file, but I cannot change the hosts file now, because the pair 
10.10.0.10 YNGCR10NC01 is being used by another program.

Is there any way to solve this problem?

Thank you very much in advance!

All the best,
Fei Hu