[
https://issues.apache.org/jira/browse/PIG-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145775#comment-14145775
]
liyunzhang_intel commented on PIG-4168:
---------------------------------------
Hi [~rohini],
thanks a lot for your comments!
I have updated the patch (PIG-4168_1.patch) and addressed the points you
raised:
1. *Previous*: New ExecTypes are pluggable using ServiceLoader. Please do not
add them to the ExecType class.
*In this patch*: TestSpark now instantiates SparkExecType directly instead of
adding it to the ExecType class:
{code}
// TestSpark#setUp
public void setUp() throws Exception {
    pigServer = new PigServer(new SparkExecType(), cluster.getProperties());
}
{code}
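For context, the ServiceLoader mechanism referred to above works roughly as
sketched below. The interface and class names here are simplified stand-ins
(Pig's real ExecType contract is richer); the point is that a provider jar
such as the Spark backend registers its implementation class in a
META-INF/services file on the classpath rather than editing ExecType itself:
{code}
import java.util.ServiceLoader;

// Minimal sketch of ServiceLoader-based plugin lookup; names are
// illustrative, not Pig's actual API.
interface PluggableExecType {
    String name();
}

public class ExecTypeLookup {
    // A provider jar opts in by listing its implementation's fully
    // qualified class name in META-INF/services/<interface FQN>.
    public static PluggableExecType lookup(String name) {
        for (PluggableExecType t : ServiceLoader.load(PluggableExecType.class)) {
            if (t.name().equalsIgnoreCase(name)) {
                return t;
            }
        }
        throw new IllegalArgumentException("Unknown exec type: " + name);
    }
}
{code}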
2. *Previous*:
{code:xml}<copy file="${basedir}/test/core-site.xml"
tofile="${test.build.classes}/core-site.xml"/>{code} Why do you have to create
an empty core-site.xml and copy to build dir?
*In this patch*: I created a SparkMiniCluster class that writes
build/classes/hadoop-site.xml programmatically, so build.xml no longer needs
to copy an empty core-site.xml. The file is needed because of the check in
HExecutionEngine#getExecConf.
{code}
// SparkMiniCluster#setupMiniDfsAndMrClusters
private static final File CONF_DIR = new File("build/classes");
private static final File CONF_FILE = new File(CONF_DIR, "hadoop-site.xml");

@Override
protected void setupMiniDfsAndMrClusters() {
    try {
        CONF_DIR.mkdirs();
        if (CONF_FILE.exists()) {
            CONF_FILE.delete();
        }
        m_conf = new Configuration();
        m_conf.set("io.sort.mb", "1");
        m_conf.writeXml(new FileOutputStream(CONF_FILE));
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
{code}
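As a standalone sketch of why writing under build/classes works (the class
name below is hypothetical; only the path, file name, and the io.sort.mb
setting come from the snippet above): build/classes is on the test classpath,
so a Configuration serialized there becomes visible as the hadoop-site.xml
resource that getExecConf looks for. The round trip looks like this:
{code}
import java.io.File;
import java.io.FileOutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

// Hypothetical standalone demo of the Configuration write/read round trip.
public class HadoopSiteRoundTrip {
    public static void main(String[] args) throws Exception {
        File confDir = new File("build/classes");
        confDir.mkdirs();
        File confFile = new File(confDir, "hadoop-site.xml");

        Configuration out = new Configuration();
        out.set("io.sort.mb", "1");
        try (FileOutputStream fos = new FileOutputStream(confFile)) {
            out.writeXml(fos); // serializes all properties as <configuration> XML
        }

        // Load it back explicitly; on the classpath it would be picked up
        // as a resource named "hadoop-site.xml".
        Configuration in = new Configuration(false);
        in.addResource(new Path(confFile.getAbsolutePath()));
        System.out.println("io.sort.mb = " + in.get("io.sort.mb")); // prints 1
    }
}
{code}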
{code}
// HExecutionEngine#getExecConf
public JobConf getExecConf(Properties properties) throws ExecException {
    JobConf jc = null;
    // Check existence of user-provided configs
    String isHadoopConfigsOverriden =
            properties.getProperty("pig.use.overriden.hadoop.configs");
    if (isHadoopConfigsOverriden != null && isHadoopConfigsOverriden.equals("true")) {
        jc = new JobConf(ConfigurationUtil.toConfiguration(properties));
    } else {
        // Check existence of hadoop-site.xml or core-site.xml in
        // classpath if user-provided confs are not being used
        Configuration testConf = new Configuration();
        ClassLoader cl = testConf.getClassLoader();
        URL hadoop_site = cl.getResource(HADOOP_SITE);
        URL core_site = cl.getResource(CORE_SITE);
        if (hadoop_site == null && core_site == null) {
            throw new ExecException(
                    "Cannot find hadoop configurations in classpath "
                            + "(neither hadoop-site.xml nor core-site.xml was found in the classpath)."
                            + " If you plan to use local mode, please put -x local option in command line",
                    4010);
        }
        jc = new JobConf();
    }
    jc.addResource("pig-cluster-hadoop-site.xml");
    jc.addResource(YARN_SITE);
    return jc;
}
{code}
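For what it's worth, a quick illustrative check (not part of the patch) that
the generated hadoop-site.xml is visible as a classpath resource the way this
code expects:
{code}
import java.net.URL;
import org.apache.hadoop.conf.Configuration;

// Illustrative check: resolves hadoop-site.xml through the same class
// loader that getExecConf consults above.
public class ClasspathCheck {
    public static void main(String[] args) {
        URL u = new Configuration().getClassLoader().getResource("hadoop-site.xml");
        System.out.println(u != null ? "found: " + u : "not on classpath");
    }
}
{code}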
> Initial implementation of unit tests for Pig on Spark
> -----------------------------------------------------
>
> Key: PIG-4168
> URL: https://issues.apache.org/jira/browse/PIG-4168
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: Praveen Rachabattuni
> Assignee: liyunzhang_intel
> Attachments: PIG-4168.patch
>
>
> 1. ant clean jar; pig-0.14.0-SNAPSHOT-core-h1.jar will be generated by this
> command.
> 2. export SPARK_PIG_JAR=$PIG_HOME/pig-0.14.0-SNAPSHOT-core-h1.jar
> 3. Build the hadoop1 and spark environments; spark runs in local mode.
> jps:
> 11647 Master      # spark master runs
> 6457 DataNode     # hadoop datanode runs
> 22374 Jps
> 11705 Worker      # spark worker runs
> 27009 JobTracker  # hadoop jobtracker runs
> 26602 NameNode    # hadoop namenode runs
> 29486 org.eclipse.equinox.launcher_1.3.0.v20120522-1813.jar
> 19692 Main
>
> 4. ant test-spark
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)