Re: Why does Generic Options Parser only take the first -D option?

2010-09-03 Thread Edward Capriolo
On Thu, Sep 2, 2010 at 2:29 PM, Ted Yu  wrote:
> I checked GenericOptionsParser from 0.20.2
> processGeneralOptions() should be able to process all -D options:
>
>    if (line.hasOption('D')) {
> *      String[] property = line.getOptionValues('D');
> *      for(String prop : property) {
>        String[] keyval = prop.split("=", 2);
>        if (keyval.length == 2) {
>          conf.set(keyval[0], keyval[1]);
>        }
>      }
>    }
> You can add a log after the bold line to verify that all -D options are
> returned.
>
> On Thu, Sep 2, 2010 at 10:09 AM, Edward Capriolo wrote:
>
>> This is 0.20.0
>> I have an eclipse run configuration passing these as arguments
>> -D hive2rdbms.jdbc.driver="com.mysql.jdbc.Driver" -D
>> hive2rdbms.connection.url="jdbc:mysql://localhost:3306/test" -D
>> hive2rdbms.data.query="SELECT id,name FROM name WHERE $CONDITIONS" -D
>> hive2rdbms.bounding.query="SELECT min(id),max(id) FROM name" -D
>> hive2rdbms.output.strategy=HDFS -D hive2rdbms.ouput.hdfs.path="/tmp/a"
>>
>> My code does this:
>>        public int run(String[] args) throws Exception {
>>
>>                conf = getConf();
>>                GenericOptionsParser parser = new
>> GenericOptionsParser(conf,args);
>>
>>                for (String arg: parser.getRemainingArgs()){
>>                  System.out.println(arg);
>>                }
>>
>> hive2rdbms.connection.url=jdbc:mysql://localhost:3306/test
>> -D
>> hive2rdbms.data.query=SELECT id,name FROM name WHERE $CONDITIONS
>> -D
>> hive2rdbms.bounding.query=SELECT min(id),max(id) FROM name
>> -D
>> hive2rdbms.output.strategy=HDFS
>> -D
>> hive2rdbms.ouput.hdfs.path=/tmp/a
>> 10/09/02 13:04:04 INFO jvm.JvmMetrics: Initializing JVM Metrics with
>> processName=JobTracker, sessionId=
>> Exception in thread "main" java.io.IOException:
>> hive2rdbms.connection.url not specified
>>        at
>> com.media6.hive2rdbms.job.Rdbms2Hive.checkArgs(Rdbms2Hive.java:70)
>>        at com.media6.hive2rdbms.job.Rdbms2Hive.run(Rdbms2Hive.java:46)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>        at com.media6.hive2rdbms.job.Rdbms2Hive.main(Rdbms2Hive.java:145)
>>
>> So what gives does GenericOptionsParser only take hadoop arguments
>> like mapred.map.tasks? If so how come it sucks up my first -D argument
>> and considers the other ones "Remaining Arguments".
>>
>> Any ideas?
>>
>

Thanks Ted,

I think the problem is that if you run a program standalone, inside an
IDE, and not inside a test case, the GenericOptionsParser assumes the
first arguments are "Hadoop jar", which they are not in this case.

Edward


Re: Why does Generic Options Parser only take the first -D option?

2010-09-02 Thread Ted Yu
I checked GenericOptionsParser from 0.20.2.
processGeneralOptions() should be able to process all -D options:

    if (line.hasOption('D')) {
      String[] property = line.getOptionValues('D');
      for (String prop : property) {
        String[] keyval = prop.split("=", 2);
        if (keyval.length == 2) {
          conf.set(keyval[0], keyval[1]);
        }
      }
    }

You can add a log statement after the line.getOptionValues('D') call to
verify that all -D options are returned.
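
For comparison, here is a minimal plain-Java sketch of what the loop above is
expected to do when several -D pairs are passed. It does not use Hadoop or
commons-cli at all: the class name DashDSketch, the parse method, and the Map
standing in for the Configuration are made up for illustration, and the real
parser may treat edge cases differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the -D loop quoted above; this is not Hadoop code.
class DashDSketch {

    // Copies every "-D key=value" pair into conf and returns all other args.
    static List<String> parse(String[] args, Map<String, String> conf) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if ("-D".equals(args[i]) && i + 1 < args.length) {
                // Limit 2 splits on the first '=' only, so values may contain '='.
                String[] keyval = args[++i].split("=", 2);
                if (keyval.length == 2) {
                    conf.put(keyval[0], keyval[1]);
                }
                // Like the quoted loop, a pair without '=' is silently ignored.
            } else {
                remaining.add(args[i]);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        List<String> remaining = parse(args, conf);
        System.out.println("conf = " + conf);
        System.out.println("remaining = " + remaining);
    }
}
```

Run with the six -D pairs from the question, this sketch stores all six keys and
leaves the remaining list empty. If GenericOptionsParser gives you something
different for the same arguments, the difference is probably in how the options
are tokenized before this loop runs, not in the loop itself.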

On Thu, Sep 2, 2010 at 10:09 AM, Edward Capriolo wrote:

> This is 0.20.0
> I have an eclipse run configuration passing these as arguments
> -D hive2rdbms.jdbc.driver="com.mysql.jdbc.Driver" -D
> hive2rdbms.connection.url="jdbc:mysql://localhost:3306/test" -D
> hive2rdbms.data.query="SELECT id,name FROM name WHERE $CONDITIONS" -D
> hive2rdbms.bounding.query="SELECT min(id),max(id) FROM name" -D
> hive2rdbms.output.strategy=HDFS -D hive2rdbms.ouput.hdfs.path="/tmp/a"
>
> My code does this:
>public int run(String[] args) throws Exception {
>
>conf = getConf();
>GenericOptionsParser parser = new
> GenericOptionsParser(conf,args);
>
>for (String arg: parser.getRemainingArgs()){
>  System.out.println(arg);
>}
>
> hive2rdbms.connection.url=jdbc:mysql://localhost:3306/test
> -D
> hive2rdbms.data.query=SELECT id,name FROM name WHERE $CONDITIONS
> -D
> hive2rdbms.bounding.query=SELECT min(id),max(id) FROM name
> -D
> hive2rdbms.output.strategy=HDFS
> -D
> hive2rdbms.ouput.hdfs.path=/tmp/a
> 10/09/02 13:04:04 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> processName=JobTracker, sessionId=
> Exception in thread "main" java.io.IOException:
> hive2rdbms.connection.url not specified
>at
> com.media6.hive2rdbms.job.Rdbms2Hive.checkArgs(Rdbms2Hive.java:70)
>at com.media6.hive2rdbms.job.Rdbms2Hive.run(Rdbms2Hive.java:46)
>at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>at com.media6.hive2rdbms.job.Rdbms2Hive.main(Rdbms2Hive.java:145)
>
> So what gives does GenericOptionsParser only take hadoop arguments
> like mapred.map.tasks? If so how come it sucks up my first -D argument
> and considers the other ones "Remaining Arguments".
>
> Any ideas?
>


Why does Generic Options Parser only take the first -D option?

2010-09-02 Thread Edward Capriolo
This is 0.20.0.
I have an Eclipse run configuration passing these as arguments:
-D hive2rdbms.jdbc.driver="com.mysql.jdbc.Driver" -D
hive2rdbms.connection.url="jdbc:mysql://localhost:3306/test" -D
hive2rdbms.data.query="SELECT id,name FROM name WHERE $CONDITIONS" -D
hive2rdbms.bounding.query="SELECT min(id),max(id) FROM name" -D
hive2rdbms.output.strategy=HDFS -D hive2rdbms.ouput.hdfs.path="/tmp/a"

My code does this:

    public int run(String[] args) throws Exception {

        conf = getConf();
        GenericOptionsParser parser = new GenericOptionsParser(conf, args);

        for (String arg : parser.getRemainingArgs()) {
            System.out.println(arg);
        }

Output:

hive2rdbms.connection.url=jdbc:mysql://localhost:3306/test
-D
hive2rdbms.data.query=SELECT id,name FROM name WHERE $CONDITIONS
-D
hive2rdbms.bounding.query=SELECT min(id),max(id) FROM name
-D
hive2rdbms.output.strategy=HDFS
-D
hive2rdbms.ouput.hdfs.path=/tmp/a
10/09/02 13:04:04 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
Exception in thread "main" java.io.IOException: hive2rdbms.connection.url not specified
        at com.media6.hive2rdbms.job.Rdbms2Hive.checkArgs(Rdbms2Hive.java:70)
        at com.media6.hive2rdbms.job.Rdbms2Hive.run(Rdbms2Hive.java:46)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at com.media6.hive2rdbms.job.Rdbms2Hive.main(Rdbms2Hive.java:145)

So what gives? Does GenericOptionsParser only take Hadoop arguments
like mapred.map.tasks? If so, how come it sucks up my first -D argument
and considers the other ones "Remaining Arguments"?

Any ideas?