Re: Writable questions

2010-09-02 Thread Lance Norskog
Wait- you want a print-to-user method or a 'serialize/deserialize' method?

On Tue, Aug 31, 2010 at 2:42 PM, David Rosenstrauch dar...@darose.net wrote:
 On 08/31/2010 02:09 PM, Mark wrote:

 On 8/31/10 10:07 AM, David Rosenstrauch wrote:

 On 08/31/2010 12:58 PM, Mark wrote:

 I have a question regarding outputting Writable objects. I thought all
 Writables know how to serialize themselves to output.

 For example I have an ArrayWritable of strings (or Texts) but when I
 output it to a file it shows up as
 'org.apache.hadoop.io.ArrayWritable@21f7186f'

 Am I missing something? I would have expected it to output String1
 String2 String3 etc. If I am going about this the wrong way can someone
 explain the proper way for my reduce phase to output a key and a list of
 values. Thanks

 Writables know how to serialize and deserialize themselves (i.e., to a
 binary I/O stream). But that doesn't necessarily mean that they have a
 toString method for generating human-readable output.

 DR

 Ok that makes sense. How would I go about outputting an ArrayWritable
 then? Use a StringBuilder?

 Hmmm 

 Maybe something like this?

 Arrays.toString((TheArrayElementClass[]) myArrayWritable.toArray())

 HTH,

 DR
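
A minimal sketch of that suggestion, assuming the array was built over Text
elements (the variable names here are illustrative, not from the thread):

    import java.util.Arrays;
    import org.apache.hadoop.io.ArrayWritable;
    import org.apache.hadoop.io.Text;

    ArrayWritable aw = new ArrayWritable(Text.class,
        new Text[] { new Text("String1"), new Text("String2"), new Text("String3") });
    // toArray() returns the backing array typed to the value class the
    // ArrayWritable was constructed with, so the cast below holds here.
    String readable = Arrays.toString((Text[]) aw.toArray());
    // readable == "[String1, String2, String3]"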




-- 
Lance Norskog
goks...@gmail.com


Re: DataDrivenInputFormat setInput with boundingQuery

2010-09-02 Thread Lance Norskog
Thank you for mentioning this problem- it's something fairly mysterious to me.



On Tue, Aug 31, 2010 at 8:06 PM, Edward Capriolo edlinuxg...@gmail.com wrote:
 On Tue, Aug 31, 2010 at 10:32 PM, Edward Capriolo edlinuxg...@gmail.com 
 wrote:
 I am working with DataDrivenDBInputFormat from trunk. None of the unit
 tests seem to test the bounding queries.

 Configuration conf = new Configuration();
 Job job = new Job(conf);
 job.setJarByClass(TestZ.class);

 job.setInputFormatClass(DataDrivenDBInputFormat.class);
 job.setMapperClass(PrintlnMapper.class);
 job.setOutputFormatClass(NullOutputFormat.class);
 job.setMapOutputKeyClass(NullWritable.class);
 job.setMapOutputValueClass(NullDBWritable.class);
 job.setOutputKeyClass(NullWritable.class);
 job.setOutputValueClass(NullWritable.class);

 job.setNumReduceTasks(0);

 job.getConfiguration().setInt("mapreduce.map.tasks", 2);

 DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver",
     "jdbc:mysql://localhost:3306/test", null, null);

 DataDrivenDBInputFormat.setInput(job, NullDBWritable.class,
     "SELECT * FROM name WHERE $CONDITIONS",
     "SELECT MIN(id),MAX(id) FROM name");
 int ret = job.waitForCompletion(true) ? 0 : 1;

 Exception in thread "main" java.lang.RuntimeException:
 java.lang.RuntimeException: java.lang.NullPointerException
        at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.setConf(DBInputFormat.java:165)

 Can someone tell me what I am missing here?
 Thanks,
 Edward


 Nevermind:

        DBConfiguration.configureDB(job.getConfiguration(),
            "com.mysql.jdbc.Driver",
            "jdbc:mysql://localhost:3306/test", null, null);

 That is 4 hours of my life I won't get back.
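
The gotcha, as I read it: new Job(conf) takes a copy of the Configuration,
so anything set on the original conf afterwards never reaches the job. A
sketch of the distinction:

    Configuration conf = new Configuration();
    Job job = new Job(conf);                       // Job copies conf here
    conf.set("some.key", "x");                     // too late: the job never sees this
    job.getConfiguration().set("some.key", "x");   // this is the copy the job actually uses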




-- 
Lance Norskog
goks...@gmail.com


Over-replication in Hadoop

2010-09-02 Thread Santiago Pérez

I configured a Hadoop cluster (1 master and 2 slaves) with a replication
factor of 3:

<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>

Since there are only 2 datanodes, is Hadoop aware of the over-replication,
effectively setting the real replication factor to 2?

Thanks 
-- 
View this message in context: 
http://old.nabble.com/Over-replication-in-Hadoop-tp29602656p29602656.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
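
One way to check the replication actually achieved, as opposed to the
requested factor (a general tip, not an answer from this thread), is fsck:

    ./bin/hadoop fsck / -files -blocks

Each block line reports the number of replicas actually placed. With only 2
datanodes, HDFS can place at most 2 replicas and will report the blocks as
under-replicated until a third datanode joins.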



Do I need to write a RawComparator if my custom writable is not used as a Key?

2010-09-02 Thread Vitaliy Semochkin
Hello,

Do I need to write a RawComparator to improve performance if my custom
Writable is not used as a key?

Regards,
Vitaliy S


Re: Do I need to write a RawComparator if my custom writable is not used as a Key?

2010-09-02 Thread Owen O'Malley
No, RawComparator is only needed for Keys. 

-- Owen

On Sep 2, 2010, at 3:35, Vitaliy Semochkin vitaliy...@gmail.com wrote:

 Hello,
 
 Do I need to write a RawComparator to improve performance if my custom
 Writable is not used as a key?
 
 Regards,
 Vitaliy S
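
For reference, when a custom Writable is used as a key, the usual pattern is
to extend WritableComparator so the shuffle can compare serialized bytes
directly. A hedged sketch with a hypothetical key type (not from this thread):

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.WritableComparable;
    import org.apache.hadoop.io.WritableComparator;

    // Hypothetical key: a single long, serialized first (and only).
    public class MyKey implements WritableComparable<MyKey> {
      long id;

      public void write(DataOutput out) throws IOException { out.writeLong(id); }
      public void readFields(DataInput in) throws IOException { id = in.readLong(); }
      public int compareTo(MyKey o) { return id < o.id ? -1 : (id == o.id ? 0 : 1); }

      // Raw comparator: compares the serialized bytes directly, skipping
      // deserialization during the sort/shuffle.
      public static class Comparator extends WritableComparator {
        public Comparator() { super(MyKey.class); }
        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
          long v1 = readLong(b1, s1);  // the key is one long, serialized first
          long v2 = readLong(b2, s2);
          return v1 < v2 ? -1 : (v1 == v2 ? 0 : 1);
        }
      }

      static {  // register it as MyKey's default comparator
        WritableComparator.define(MyKey.class, new Comparator());
      }
    }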


Re: Writable questions

2010-09-02 Thread Mark

 On 9/1/10 11:28 PM, Lance Norskog wrote:

Wait- you want a print-to-user method or a 'serialize/deserialize' method?


I wanted to output an ArrayWritable from my reducer in a human-readable
format... something like this:

ArrayWritable values = new ArrayWritable(new String[] { "value1",
"value2", "value3" });

context.write(new Text("My Key"), values);

I would have thought it would output the values with some configurable
delimiter.
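
Not from the thread, but one hedged way to get that behavior: TextOutputFormat
writes values by calling toString(), so a small ArrayWritable subclass (name
and delimiter choice mine) can supply the delimiter:

    import org.apache.hadoop.io.ArrayWritable;
    import org.apache.hadoop.io.Text;

    // Hypothetical subclass: same wire format, human-readable toString().
    public class DelimitedArrayWritable extends ArrayWritable {
      public DelimitedArrayWritable() { super(Text.class); }

      public DelimitedArrayWritable(Text[] values) { super(Text.class, values); }

      @Override
      public String toString() {
        StringBuilder sb = new StringBuilder();
        for (String s : toStrings()) {          // toStrings() is inherited from ArrayWritable
          if (sb.length() > 0) sb.append('\t'); // tab as the delimiter; edit to taste
          sb.append(s);
        }
        return sb.toString();
      }
    }

    // usage in the reducer:
    context.write(new Text("My Key"), new DelimitedArrayWritable(
        new Text[] { new Text("value1"), new Text("value2"), new Text("value3") }));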


Why does Generic Options Parser only take the first -D option?

2010-09-02 Thread Edward Capriolo
This is 0.20.0
I have an eclipse run configuration passing these as arguments
-D hive2rdbms.jdbc.driver=com.mysql.jdbc.Driver -D
hive2rdbms.connection.url=jdbc:mysql://localhost:3306/test -D
hive2rdbms.data.query=SELECT id,name FROM name WHERE $CONDITIONS -D
hive2rdbms.bounding.query=SELECT min(id),max(id) FROM name -D
hive2rdbms.output.strategy=HDFS -D hive2rdbms.ouput.hdfs.path=/tmp/a

My code does this:
public int run(String[] args) throws Exception {

    conf = getConf();
    GenericOptionsParser parser = new GenericOptionsParser(conf, args);

    for (String arg : parser.getRemainingArgs()) {
        System.out.println(arg);
    }

hive2rdbms.connection.url=jdbc:mysql://localhost:3306/test
-D
hive2rdbms.data.query=SELECT id,name FROM name WHERE $CONDITIONS
-D
hive2rdbms.bounding.query=SELECT min(id),max(id) FROM name
-D
hive2rdbms.output.strategy=HDFS
-D
hive2rdbms.ouput.hdfs.path=/tmp/a
10/09/02 13:04:04 INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId=
Exception in thread "main" java.io.IOException:
hive2rdbms.connection.url not specified
at com.media6.hive2rdbms.job.Rdbms2Hive.checkArgs(Rdbms2Hive.java:70)
at com.media6.hive2rdbms.job.Rdbms2Hive.run(Rdbms2Hive.java:46)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at com.media6.hive2rdbms.job.Rdbms2Hive.main(Rdbms2Hive.java:145)

So what gives? Does GenericOptionsParser only take Hadoop arguments
like mapred.map.tasks? If so, how come it sucks up my first -D argument
and considers the other ones remaining arguments?

Any ideas?


Re: Why does Generic Options Parser only take the first -D option?

2010-09-02 Thread Ted Yu
I checked GenericOptionsParser from 0.20.2.
processGeneralOptions() should be able to process all -D options:

if (line.hasOption('D')) {
  String[] property = line.getOptionValues('D');
  for (String prop : property) {
    String[] keyval = prop.split("=", 2);
    if (keyval.length == 2) {
      conf.set(keyval[0], keyval[1]);
    }
  }
}

You can add a log after the getOptionValues('D') line to verify that all -D
options are returned.
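
A hedged way to double-check from your own code (class name hypothetical; run
it with the same -D arguments): ToolRunner applies GenericOptionsParser before
calling run(), so anything the parser consumed shows up in conf and everything
else in args:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    // Hypothetical debugging tool: prints what GenericOptionsParser put into
    // the Configuration versus what it left as remaining arguments.
    public class DOptionCheck extends Configured implements Tool {
      public int run(String[] args) throws Exception {
        Configuration conf = getConf(); // already processed by GenericOptionsParser
        System.out.println("hive2rdbms.jdbc.driver = " + conf.get("hive2rdbms.jdbc.driver"));
        System.out.println("hive2rdbms.connection.url = " + conf.get("hive2rdbms.connection.url"));
        for (String arg : args) {
          System.out.println("remaining: " + arg);
        }
        return 0;
      }

      public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new DOptionCheck(), args));
      }
    }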




Exception while archiving

2010-09-02 Thread Ranjib Dey
Hi,
I am trying to archive a folder containing text files using the following
command, from the hadoop home dir:

./bin/hadoop archive -archiveName xxx.har /root/new/*  /root/

and receiving the following output:
___
Exception
null
__

Any idea what is going wrong? Is there any log I can check for
details?

regards
ranjib
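
A hedged note, not an answer from the thread: in this vintage the usage is

    ./bin/hadoop archive -archiveName NAME <src>* <dest>

where the sources and destination are paths in HDFS, so a local glob like
/root/new/* (expanded by the local shell before hadoop sees it) is probably
not what the archiver expects. Archiving also runs as a MapReduce job, so the
task logs reachable from the JobTracker web UI are a good place to look for
the underlying stack trace.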


Re: Writable questions

2010-09-02 Thread Steve Hoffman
You could use the standard List.toString() method, which does a nice
job of printing something like this:

[A1, A2, A3]

assuming the objects contained in the list implement toString() as
something you'd want to see.

Use it in conjunction with java.util.Arrays.asList() and
ArrayWritable.toStrings(), like this:

ArrayWritable values = new ArrayWritable(new String[] { "value1",
"value2", "value3" });
String humanReadableString = Arrays.asList(values.toStrings()).toString();
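
To emit that from a reducer, a hedged one-liner wrapping the result in a
Text value:

    context.write(new Text("My Key"), new Text(humanReadableString));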

Steve

On Thu, Sep 2, 2010 at 9:34 AM, Mark static.void@gmail.com wrote:
 I wanted to output an ArrayWritable from my reducer in a human-readable
 format... something like this:

 ArrayWritable values = new ArrayWritable(new String[] { "value1", "value2",
 "value3" });
 context.write(new Text("My Key"), values);

 I would have thought it would output the values with some configurable
 delimiter.