[
https://issues.apache.org/jira/browse/PIG-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145775#comment-14145775
]
liyunzhang_intel commented on PIG-4168:
---------------------------------------
Hi [~rohini],
thanks a lot for your comments!
I have updated the patch (PIG-4168_1.patch) and addressed the points you
raised:
1. *Previous*: New ExecTypes are pluggable using ServiceLoader. Please do not
add them to the ExecType class.
*In this patch*: TestSpark now instantiates SparkExecType directly instead of
adding it to the ExecType class:
{code}
// TestSpark#setUp
public void setUp() throws Exception {
    pigServer = new PigServer(new SparkExecType(), cluster.getProperties());
}
{code}
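For context, the ServiceLoader mechanism referred to above works roughly as
sketched below. The interface and class names here are simplified stand-ins
(Pig's real ExecType contract is richer); the point is that a provider jar
such as the Spark backend registers its implementation class in a
META-INF/services file on the classpath rather than editing ExecType itself:
{code}
import java.util.ServiceLoader;

// Minimal sketch of ServiceLoader-based plugin lookup; names are
// illustrative, not Pig's actual API.
interface PluggableExecType {
    String name();
}

public class ExecTypeLookup {
    // A provider jar opts in by listing its implementation's fully
    // qualified class name in META-INF/services/<interface FQN>.
    public static PluggableExecType lookup(String name) {
        for (PluggableExecType t : ServiceLoader.load(PluggableExecType.class)) {
            if (t.name().equalsIgnoreCase(name)) {
                return t;
            }
        }
        throw new IllegalArgumentException("Unknown exec type: " + name);
    }
}
{code}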
2. *Previous*:
{code:xml}<copy file="${basedir}/test/core-site.xml"
tofile="${test.build.classes}/core-site.xml"/>{code} Why do you have to create
an empty core-site.xml and copy to build dir?
*In this patch*: I created a SparkMiniCluster class that writes
build/classes/hadoop-site.xml programmatically, so build.xml no longer needs
to copy an empty core-site.xml. The file is needed because of the check in
HExecutionEngine#getExecConf.
{code}
// SparkMiniCluster#setupMiniDfsAndMrClusters
private static final File CONF_DIR = new File("build/classes");
private static final File CONF_FILE = new File(CONF_DIR, "hadoop-site.xml");

@Override
protected void setupMiniDfsAndMrClusters() {
    try {
        CONF_DIR.mkdirs();
        if (CONF_FILE.exists()) {
            CONF_FILE.delete();
        }
        m_conf = new Configuration();
        m_conf.set("io.sort.mb", "1");
        m_conf.writeXml(new FileOutputStream(CONF_FILE));
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
{code}
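As a standalone sketch of why writing under build/classes works (the class
name below is hypothetical; only the path, file name, and the io.sort.mb
setting come from the snippet above): build/classes is on the test classpath,
so a Configuration serialized there becomes visible as the hadoop-site.xml
resource that getExecConf looks for. The round trip looks like this:
{code}
import java.io.File;
import java.io.FileOutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

// Hypothetical standalone demo of the Configuration write/read round trip.
public class HadoopSiteRoundTrip {
    public static void main(String[] args) throws Exception {
        File confDir = new File("build/classes");
        confDir.mkdirs();
        File confFile = new File(confDir, "hadoop-site.xml");

        Configuration out = new Configuration();
        out.set("io.sort.mb", "1");
        try (FileOutputStream fos = new FileOutputStream(confFile)) {
            out.writeXml(fos); // serializes all properties as <configuration> XML
        }

        // Load it back explicitly; on the classpath it would be picked up
        // as a resource named "hadoop-site.xml".
        Configuration in = new Configuration(false);
        in.addResource(new Path(confFile.getAbsolutePath()));
        System.out.println("io.sort.mb = " + in.get("io.sort.mb")); // prints 1
    }
}
{code}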
{code}
// HExecutionEngine#getExecConf
public JobConf getExecConf(Properties properties) throws ExecException {
    JobConf jc = null;
    // Check existence of user-provided configs
    String isHadoopConfigsOverriden =
            properties.getProperty("pig.use.overriden.hadoop.configs");
    if (isHadoopConfigsOverriden != null && isHadoopConfigsOverriden.equals("true")) {
        jc = new JobConf(ConfigurationUtil.toConfiguration(properties));
    } else {
        // Check existence of hadoop-site.xml or core-site.xml in
        // classpath if user-provided confs are not being used
        Configuration testConf = new Configuration();
        ClassLoader cl = testConf.getClassLoader();
        URL hadoop_site = cl.getResource(HADOOP_SITE);
        URL core_site = cl.getResource(CORE_SITE);
        if (hadoop_site == null && core_site == null) {
            throw new ExecException(
                    "Cannot find hadoop configurations in classpath "
                            + "(neither hadoop-site.xml nor core-site.xml was found in the classpath)."
                            + " If you plan to use local mode, please put -x local option in command line",
                    4010);
        }
        jc = new JobConf();
    }
    jc.addResource("pig-cluster-hadoop-site.xml");
    jc.addResource(YARN_SITE);
    return jc;
}
{code}
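For what it's worth, a quick illustrative check (not part of the patch) that
the generated hadoop-site.xml is visible as a classpath resource the way this
code expects:
{code}
import java.net.URL;
import org.apache.hadoop.conf.Configuration;

// Illustrative check: resolves hadoop-site.xml through the same class
// loader that getExecConf consults above.
public class ClasspathCheck {
    public static void main(String[] args) {
        URL u = new Configuration().getClassLoader().getResource("hadoop-site.xml");
        System.out.println(u != null ? "found: " + u : "not on classpath");
    }
}
{code}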
> Initial implementation of unit tests for Pig on Spark
> -----------------------------------------------------
>
> Key: PIG-4168
> URL: https://issues.apache.org/jira/browse/PIG-4168
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: Praveen Rachabattuni
> Assignee: liyunzhang_intel
> Attachments: PIG-4168.patch
>
>
> 1. ant clean jar; pig-0.14.0-SNAPSHOT-core-h1.jar will be generated by this
> command.
> 2. export SPARK_PIG_JAR=$PIG_HOME/pig-0.14.0-SNAPSHOT-core-h1.jar
> 3. Build the hadoop1 and spark environments; spark runs in local mode.
> jps:
> 11647 Master      # spark master runs
> 6457 DataNode     # hadoop datanode runs
> 22374 Jps
> 11705 Worker      # spark worker runs
> 27009 JobTracker  # hadoop jobtracker runs
> 26602 NameNode    # hadoop namenode runs
> 29486 org.eclipse.equinox.launcher_1.3.0.v20120522-1813.jar
> 19692 Main
>
> 4. ant test-spark
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)