Comments on the expect script:

At the point where hod is spawned:
spawn -ignore {SIGHUP} /users/grad/craigm/src/pig/FROMApache/hod0.2/hod.0.2.2/bin/hod -n [lindex $args 0 ] [lindex $args 1] [lindex $args 2] [lindex $args 3] [lindex $args 4] [lindex $args 5] [lindex $args 6 ] [lindex $args 7] [lindex $args 8] [lindex $args 9] [lindex $args 10]

If there are fewer than 11 arguments to the expect script, expect will still create the extra argv entries (as empty strings) when calling exec(3). This confuses the Python command-line parser, which expects 0 leftover command-line arguments: empty entries still make "len(args) > 0" true. I worked around this by patching around line 71 of hod.0.2.2/hodlib/Common/cfg.py:

     options, args = op.parse_args(argv[1:])
+    argsNoblanks = []
+    for a in args:
+      if len(a) > 0:
+        argsNoblanks.append(a)
-    if len(args) > 0:
+    if len(argsNoblanks) > 0:
+      print "\nunrecognised argument(s): "
+      print argsNoblanks
       op.print_help()
       sys.exit(1)

A better solution could probably be made by fixing the expect script, but I tried and failed.
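
The blank-argument filtering in the cfg.py patch can be shown in isolation. This is only a minimal sketch in current Python, not the actual HOD code: the function name leftover_args and the lone "-n" option are made up for illustration, and the real parser in cfg.py defines many more options.

```python
from optparse import OptionParser

def leftover_args(argv):
    """Return the positional arguments left over after option parsing,
    ignoring the empty strings that expect passes for missing arguments."""
    op = OptionParser()
    # Hypothetical option, only so the sketch has something to parse.
    op.add_option("-n", dest="nodes")
    options, args = op.parse_args(argv[1:])
    # Empty strings are still entries in args, so drop them explicitly.
    return [a for a in args if len(a) > 0]

# Simulates the expect script padding the command line with empty arguments.
print(leftover_args(["hod", "-n", "4", "", "", ""]))
# A genuinely unexpected argument still shows up after filtering.
print(leftover_args(["hod", "-n", "4", "extra", ""]))
```

With this filter in place, only non-empty leftovers would trigger the "unrecognised argument(s)" branch.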

There are some rather odd Yahoo-specific bits in places:
@@ -349,9 +350,9 @@
    }
    private String fixUpDomain(String hostPort) throws UnknownHostException {
        String parts[] = hostPort.split(":");
-        if (parts[0].indexOf('.') == -1) {
-            parts[0] = parts[0] + ".inktomisearch.com";
-        }
+        //if (parts[0].indexOf('.') == -1) {
+        //    parts[0] = parts[0] + ".inktomisearch.com";
+        //}
        InetAddress.getByName(parts[0]);
        return parts[0] + ":" + parts[1];
    }

Also, a NullPointerException occurs here if the yinst.cluster property is not set:
@@ -250,7 +250,7 @@
             cmd.append('/');
             cmd.append(System.getProperty("hod.command"));
             //String cmd = System.getProperty("hod.command", "/home/breed/startHOD.expect");
-            String cluster = System.getProperty("yinst.cluster");
+            String cluster = System.getProperty("yinst.cluster"); //NPE here if property not set
             if (cluster.length() > 0 && !cluster.startsWith("kryptonite")) {
                 cmd.append(" --config=");
                 cmd.append(System.getProperty("hod.config.dir"));


Thanks

Craig


Benjamin Reed wrote:
Ah yes, sorry about that. We had a problem with HOD not working well with piped inputs and outputs, so we actually use an expect script to interface to hod. (We should open an issue on this.)

I'm attaching the script that we use.

ben

On Wednesday 28 November 2007 11:38:09 Craig Macdonald wrote:
Hi all,

I've been trying to set up Pig using Hadoop on Demand. Using some
hackery, my incantation now looks like

PATH=/users/tr.craigm/OF_tools/python/bin/:$PATH ROOT=$PWD
scripts/pig.pl -Dlog4j.level=debug -Dhod.server=local
-Dhod.expect.root=$PWD -Dhod.command=hod/bin/hod
-Dhod.expect.uselatest=hodrc/released -Dyinst.cluster=
-Dhadoop.root.logger=DEBUG,console  --cluster hodrc

(the name of my hodrc file is hodrc).

However, the HOD connection code in PigContext mystifies me. Does it
correspond to any released version of HOD?
It seems to connect to HOD, and parse the response.

PIG-18 (https://issues.apache.org/jira/browse/PIG-18) states that Pig
needs to be fixed to work with hod 4.
So I presume that Pig does not work with the HOD version
hod-open-4.tar.gz  attached to
https://issues.apache.org/jira/browse/HADOOP-1301

However, it doesn't look like Pig works with the other version of HOD
attached to the same JIRA issue: hod.0.2.2.tar.gz

PigContext.java looks for output from HOD in the form of lines starting:
hdfsUI:
hdfs:
mapredUI:
mapred:
hadoopConf:

I can't find any source in either version of HOD that resembles this.
Does anyone know if Pig will work with any openly available version
of HOD?
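
To make clear what kind of output I mean, here is a rough sketch of picking those lines out of HOD's output. It is not the PigContext code: the function name parse_hod_output is made up, and I am only assuming that each value follows its prefix on the same line.

```python
# Prefixes taken from the list above; the "prefix value" layout is an assumption.
PREFIXES = ("hdfsUI:", "hdfs:", "mapredUI:", "mapred:", "hadoopConf:")

def parse_hod_output(text):
    """Collect the value following each known prefix, keyed by the
    prefix without its trailing colon. Other lines are ignored."""
    found = {}
    for line in text.splitlines():
        line = line.strip()
        for prefix in PREFIXES:
            if line.startswith(prefix):
                found[prefix[:-1]] = line[len(prefix):].strip()
                break
    return found

# Hypothetical sample output; the host names and ports are invented.
sample = "starting cluster\nhdfs: host1:9000\nmapred: host2:9001\n"
print(parse_hod_output(sample))
```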

Thanks in advance

Craig


