Waiting for others to give best practices. I think you can use Eclipse to
manage the Maven build and see the full dependency hierarchy; if some jar
(for example, guava) exists in both the Hadoop dependency chain and your
own requirements, set your own dependency's scope to "provided".
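For example, a minimal pom.xml sketch (the versions here are illustrative;
use whatever your CDH5 beta cluster actually ships):

  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.3.0</version>
    <!-- supplied by the cluster at runtime, not bundled in your jar -->
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>11.0.2</version>
    <!-- defer to the guava already on the cluster classpath -->
    <scope>provided</scope>
  </dependency>

Outside Eclipse, "mvn dependency:tree -Dverbose" also shows where each
conflicting jar is pulled in from.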
Regards,
Stanley Shi,
First of all, I should state that I'm using the CDH5 beta and managing the
project with Maven, and I have googled and read a lot, e.g.
https://issues.apache.org/jira/browse/MAPREDUCE-1700
http://www.datasalt.com/2011/05/handling-dependencies-and-configuration-in-java-hadoop-projects-efficiently/
I believe the p
Can you run MR jobs (not Pig jobs) which take LZO files as input?
If you cannot run MR jobs, you may want to check the LZO compression
configuration in core-site.xml. Make sure the dynamic library is in
HADOOP_HOME/lib/native/.
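For reference, the relevant core-site.xml entries usually look something
like this (a sketch assuming the common hadoop-lzo codec classes; the
native library they need is typically libgplcompression.so):

  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
  </property>
  <property>
    <!-- which class backs the generic "lzo" codec name -->
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>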
Here is an FAQ about how to configure LZO:
https://code.google.com/a/a
Hi, have you solved your problem? I have the same problem; it seems that
the cache behavior has not been triggered.
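For what it's worth, this is how I am checking whether the directives
actually get cached (the pool and path names below are just examples):

  # register a pool and a directive for the file
  hdfs cacheadmin -addPool testPool
  hdfs cacheadmin -addDirective -path /logs/access.log -pool testPool
  # BYTES_CACHED should approach BYTES_NEEDED once the DataNodes cache the blocks
  hdfs cacheadmin -listDirectives -stats

Note that caching silently does nothing if dfs.datanode.max.locked.memory
(in hdfs-site.xml) is 0 or exceeds the DataNode user's "ulimit -l".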
2014-03-07 23:37 GMT+08:00 hwpstorage :
> Hello,
>
> It looks like the HDFS caching does not work well.
> The cached log file is around 200 MB. The Hadoop cluster has 3 nodes, each