Re: rack awareness unexpected behaviour

2013-10-03 Thread Michel Segel
And that's the rub.
Rack awareness is an artificial construct.

If you want to fix it and match the real world, you need to balance the racks
physically.
Otherwise you need to rewrite the load balancing to take into consideration the
number and power of the nodes in each rack.

The short answer: it's easier to fudge the values in the topology script.
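
As an illustration of how artificial the construct is, the mapping can even be supplied as a Java class instead of a script (via the 0.20-era topology.node.switch.mapping.impl property). A rough sketch that reports a physical 2+4 layout as a balanced 3+3, with hypothetical host names:

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.net.DNSToSwitchMapping;

// A deliberately "fudged" mapping: hosts are assigned to logical racks so the
// racks come out balanced, regardless of where the machines physically sit.
// Sketched against the old (0.20-era) interface, which only required resolve();
// newer releases add reload methods to the interface.
public class BalancedRackMapping implements DNSToSwitchMapping {

    public List<String> resolve(List<String> names) {
        List<String> racks = new ArrayList<String>(names.size());
        for (String host : names) {
            // Hypothetical hosts: node1/node2 really sit in rack 1, node3..node6 in rack 2,
            // but node3 is reported in /rack1 so each logical rack holds 3 nodes.
            if (host.startsWith("node1") || host.startsWith("node2")
                    || host.startsWith("node3")) {
                racks.add("/rack1");
            } else {
                racks.add("/rack2");
            }
        }
        return racks;
    }
}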

Sent from a remote device. Please excuse any typos...

Mike Segel

 On Oct 3, 2013, at 8:58 AM, Marc Sturlese marc.sturl...@gmail.com wrote:
 
 Doing that will balance the block writing but I think you lose the
 concept of physical rack awareness.
 Let's say you have 2 physical racks, one with 2 servers and one with 4. If
 you artificially tell Hadoop that each rack has 3 servers, you
 are losing the concept of rack awareness. You're not guaranteeing that each
 physical rack contains at least one replica of each block.
 
 So if you have 2 racks with a different number of servers, it's not possible
 to do proper rack awareness without filling the disks of the rack with fewer
 servers first. Am I right or am I missing something?
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/rack-awareness-unexpected-behaviour-tp4086029p4093337.html
 Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
 


Re: rack awareness unexpected behaviour

2013-08-22 Thread Michel Segel
Rack awareness is an artificial concept.
Meaning you can define where a node is regardless of its real position in the
rack.

Going from memory, and it's probably been changed in later versions of the
code...

Isn't the replication... one copy on the local node, a copy on the same rack, and a third copy on
a different rack?

Or has this been improved upon?

Sent from a remote device. Please excuse any typos...

Mike Segel

On Aug 22, 2013, at 5:14 AM, Harsh J ha...@cloudera.com wrote:

 I'm not aware of a bug in 0.20.2 that would not honor the Rack
 Awareness, but have you done the two below checks as well?
 
 1. Ensuring JT has the same rack awareness scripts and configuration
 so it can use it for scheduling, and,
 2. Checking if the map and reduce tasks are being evenly spread across
 both racks.
 
 On Thu, Aug 22, 2013 at 2:50 PM, Marc Sturlese marc.sturl...@gmail.com 
 wrote:
 I'm on cdh3u4 (0.20.2), gonna try to read a bit on this bug
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/rack-awareness-unexpected-behaviour-tp4086029p4086049.html
 Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
 
 
 
 -- 
 Harsh J
 


Re: DN cannot talk to NN using Kerberos on secured hdfs

2012-09-16 Thread Michel Segel
That should be a bug. All host names should be case insensitive. 

Sent from a remote device. Please excuse any typos...

Mike Segel

On Sep 12, 2012, at 12:25 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com 
wrote:

 
 This is because Java only supports AES-128 by default. To support AES-256,
 you will need to install the unlimited-strength JCE policy jars from
 http://www.oracle.com/technetwork/java/javase/downloads/index.html
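
 A quick way to verify on a given node whether the unlimited-strength policy files are actually in effect is to ask the JVM for the maximum allowed AES key length; a minimal sketch:

 import javax.crypto.Cipher;

 // Prints the maximum AES key length this JVM allows. 128 means the default
 // (limited) JCE policy is installed; a value >= 256 (often Integer.MAX_VALUE)
 // means the unlimited-strength policy jars are in place and AES-256 Kerberos
 // tickets can be decrypted.
 public class JcePolicyCheck {
     public static void main(String[] args) throws Exception {
         int maxKeyLen = Cipher.getMaxAllowedKeyLength("AES");
         System.out.println("Max allowed AES key length: " + maxKeyLen);
     }
 }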
 
 Also, there is another case of Kerberos having issues with hostnames with 
 some/all letters in caps. If that is the case, you should try tweaking your 
 host-names to all lower-case.
 
 Thanks,
 +Vinod Kumar Vavilapalli
 Hortonworks Inc.
 http://hortonworks.com/
 
 On Sep 12, 2012, at 9:47 AM, Shumin Wu wrote:
 
 Hi,
 
 I am setting up a secured HDFS using Kerberos. I got the NN and 2NN working just
 fine. However, the DN cannot talk to the NN and throws the following exception. I
 disabled AES256 in the keytab, which in theory should make it fall back to
 AES128, or whatever encryption type is at the top of the list, but it still
 complains about the same thing. Any help, suggestion, or comment is highly
 appreciated.
 
 *Apache Hadoop version: *
 2.0.0
 
 *Security configuration Snippet of DN:*
 ...
 <property>
   <name>dfs.datanode.data.dir.perm</name>
   <value>700</value>
 </property>

 <property>
   <name>dfs.datanode.address</name>
   <value>0.0.0.0:1004</value>
 </property>

 <property>
   <name>dfs.datanode.http.address</name>
   <value>0.0.0.0:1006</value>
 </property>

 <property>
   <name>dfs.datanode.keytab.file</name>
   <value>/etc/hadoop/conf/hdfs.keytab</value>
 </property>

 <property>
   <name>dfs.datanode.kerberos.principal</name>
   <value>hdfs/_HOST@REALM</value>
 </property>
 ...
 
 *Exceptions in Log:*
 
 javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure
 unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS
 mode with HMAC SHA1-96 is not supported/enabled)]
at
 com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:159)
at
 org.apache.hadoop.ipc.Server$Connection.saslReadAndProcess(Server.java:1199)
at
 org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1393)
at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:710)
at
 org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:509)
at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:484)
 Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism
 level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not
 supported/enabled)
at
 sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:741)
at
 sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:323)
at
 sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:267)
at
 com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:137)
... 5 more
 Caused by: KrbException: Encryption type AES256 CTS mode with HMAC SHA1-96
 is not supported/enabled
 
 
 Thanks,
 Shumin Wu
 


Re: Hive error when loading csv data.

2012-06-26 Thread Michel Segel
Yup. I just didn't add the quotes.

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jun 26, 2012, at 4:30 PM, Sandeep Reddy P sandeepreddy.3...@gmail.com 
wrote:

 Thanks for the reply.
 I didn't get that, Michael. My f2 should be abc,def
 
 On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel 
 michael_se...@hotmail.comwrote:
 
 Alternatively you could write a simple script to convert the csv to a pipe
 delimited file so that "abc,def" will be abc,def.
 
 On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
 
 Hive's delimited-fields-format record reader does not handle quoted
 text that carry the same delimiter within them. Excel supports such
 records, so it reads it fine.
 
 You will need to create your table with a custom InputFormat class
 that can handle this (Try using OpenCSV readers, they support this),
 instead of relying on Hive to do this for you. If you're successful in
 your approach, please also consider contributing something back to
 Hive/Pig to help others.
 
 On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
 sandeepreddy.3...@gmail.com wrote:
 
 
 Hi all,
 I have a csv file with 46 columns but I'm getting an error when I do some
 analysis on that data. For simplification I have taken 3 columns and
 now my csv is like
 c,zxy,xyz
 d,"abc,def",abcd
 
 I have created a table for this data using:
 hive> create table test3(
 f1 string,
 f2 string,
 f3 string)
 row format delimited
 fields terminated by ",";
 OK
 Time taken: 0.143 seconds
 hive> load data local inpath '/home/training/a.csv'
 into table test3;
 Copying data from file:/home/training/a.csv
 Copying file: file:/home/training/a.csv
 Loading data to table default.test3
 OK
 Time taken: 0.276 seconds
 hive> select * from test3;
 OK
 c   zxy xyz
 d   abcdef
 Time taken: 0.156 seconds
 
 When I do select f2 from test3;
 my results are,
 OK
 zxy
 abc
 but this should be abc,def.
 When I open the same csv file with Microsoft Excel I got abc,def.
 How should I solve this error?
 
 
 
 --
 Thanks,
 sandeep
 
 --
 
 
 
 
 
 
 --
 Harsh J
 
 
 
 
 
 -- 
 Thanks,
 sandeep


Re: Hive error when loading csv data.

2012-06-26 Thread Michel Segel
What I am suggesting is to write a simple script, maybe using Python, where
you replace the commas that are used as the field delimiter.

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jun 26, 2012, at 8:58 PM, Sandeep Reddy P sandeepreddy.3...@gmail.com 
wrote:

 If I do that my data will be d|abc|def|abcd, so my problem is not solved
 
 On Tue, Jun 26, 2012 at 6:48 PM, Michel Segel 
 michael_se...@hotmail.comwrote:
 
 Yup. I just didn't add the quotes.
 
 Sent from a remote device. Please excuse any typos...
 
 Mike Segel
 
 On Jun 26, 2012, at 4:30 PM, Sandeep Reddy P sandeepreddy.3...@gmail.com
 wrote:
 
 Thanks for the reply.
 I didn't get that, Michael. My f2 should be abc,def
 
 On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel 
 michael_se...@hotmail.comwrote:
 
 Alternatively you could write a simple script to convert the csv to a
 pipe
 delimited file so that "abc,def" will be abc,def.
 
 On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
 
 Hive's delimited-fields-format record reader does not handle quoted
 text that carry the same delimiter within them. Excel supports such
 records, so it reads it fine.
 
 You will need to create your table with a custom InputFormat class
 that can handle this (Try using OpenCSV readers, they support this),
 instead of relying on Hive to do this for you. If you're successful in
 your approach, please also consider contributing something back to
 Hive/Pig to help others.
 
 On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
 sandeepreddy.3...@gmail.com wrote:
 
 
 Hi all,
 I have a csv file with 46 columns but I'm getting an error when I do some
 analysis on that data. For simplification I have taken 3 columns and
 now my csv is like
 c,zxy,xyz
 d,"abc,def",abcd
 
 I have created a table for this data using:
 hive> create table test3(
 f1 string,
 f2 string,
 f3 string)
 row format delimited
 fields terminated by ",";
 OK
 Time taken: 0.143 seconds
 hive> load data local inpath '/home/training/a.csv'
 into table test3;
 Copying data from file:/home/training/a.csv
 Copying file: file:/home/training/a.csv
 Loading data to table default.test3
 OK
 Time taken: 0.276 seconds
 hive> select * from test3;
 OK
 c   zxy xyz
 d   abcdef
 Time taken: 0.156 seconds
 
 When I do select f2 from test3;
 my results are,
 OK
 zxy
 abc
 but this should be abc,def.
 When I open the same csv file with Microsoft Excel I got abc,def.
 How should I solve this error?
 
 
 
 --
 Thanks,
 sandeep
 
 --
 
 
 
 
 
 
 --
 Harsh J
 
 
 
 
 
 --
 Thanks,
 sandeep
 
 
 
 
 -- 
 Thanks,
 sandeep


Re: Hive error when loading csv data.

2012-06-26 Thread Michel Segel
Sorry,
I was saying that you can write a Python script that replaces the delimiter
with a | and ignores the commas within quotes.
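
A minimal sketch of that idea, written here in Java rather than Python and assuming simple quoting (no embedded newlines or escaped quotes):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Converts a quoted CSV line to pipe-delimited output: commas inside double
// quotes are kept, commas outside quotes become '|', and the quotes are dropped.
public class CsvToPipe {

    static String convert(String line) {
        StringBuilder out = new StringBuilder();
        boolean inQuotes = false;
        for (char c : line.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;      // toggle quoted state, drop the quote itself
            } else if (c == ',' && !inQuotes) {
                out.append('|');           // delimiter outside quotes
            } else {
                out.append(c);             // everything else passes through
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new FileReader(args[0]));
        String line;
        while ((line = in.readLine()) != null) {
            // e.g. d,"abc,def",abcd  ->  d|abc,def|abcd
            System.out.println(convert(line));
        }
        in.close();
    }
}

The converted file can then be loaded into a table declared with fields terminated by '|'.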


Sent from a remote device. Please excuse any typos...

Mike Segel

On Jun 26, 2012, at 8:58 PM, Sandeep Reddy P sandeepreddy.3...@gmail.com 
wrote:

 If I do that my data will be d|abc|def|abcd, so my problem is not solved
 
 On Tue, Jun 26, 2012 at 6:48 PM, Michel Segel 
 michael_se...@hotmail.comwrote:
 
 Yup. I just didn't add the quotes.
 
 Sent from a remote device. Please excuse any typos...
 
 Mike Segel
 
 On Jun 26, 2012, at 4:30 PM, Sandeep Reddy P sandeepreddy.3...@gmail.com
 wrote:
 
 Thanks for the reply.
 I didn't get that, Michael. My f2 should be abc,def
 
 On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel 
 michael_se...@hotmail.comwrote:
 
 Alternatively you could write a simple script to convert the csv to a
 pipe
 delimited file so that "abc,def" will be abc,def.
 
 On Jun 26, 2012, at 2:51 PM, Harsh J wrote:
 
 Hive's delimited-fields-format record reader does not handle quoted
 text that carry the same delimiter within them. Excel supports such
 records, so it reads it fine.
 
 You will need to create your table with a custom InputFormat class
 that can handle this (Try using OpenCSV readers, they support this),
 instead of relying on Hive to do this for you. If you're successful in
 your approach, please also consider contributing something back to
 Hive/Pig to help others.
 
 On Wed, Jun 27, 2012 at 12:37 AM, Sandeep Reddy P
 sandeepreddy.3...@gmail.com wrote:
 
 
 Hi all,
 I have a csv file with 46 columns but I'm getting an error when I do some
 analysis on that data. For simplification I have taken 3 columns and
 now my csv is like
 c,zxy,xyz
 d,"abc,def",abcd
 
 I have created a table for this data using:
 hive> create table test3(
 f1 string,
 f2 string,
 f3 string)
 row format delimited
 fields terminated by ",";
 OK
 Time taken: 0.143 seconds
 hive> load data local inpath '/home/training/a.csv'
 into table test3;
 Copying data from file:/home/training/a.csv
 Copying file: file:/home/training/a.csv
 Loading data to table default.test3
 OK
 Time taken: 0.276 seconds
 hive> select * from test3;
 OK
 c   zxy xyz
 d   abcdef
 Time taken: 0.156 seconds
 
 When I do select f2 from test3;
 my results are,
 OK
 zxy
 abc
 but this should be abc,def.
 When I open the same csv file with Microsoft Excel I got abc,def.
 How should I solve this error?
 
 
 
 --
 Thanks,
 sandeep
 
 --
 
 
 
 
 
 
 --
 Harsh J
 
 
 
 
 
 --
 Thanks,
 sandeep
 
 
 
 
 -- 
 Thanks,
 sandeep


Re: How to mapreduce in the scenario

2012-05-29 Thread Michel Segel
Hive?
Sure, assuming you mean that the id is a FK common amongst the tables...
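
If you would rather not bring Hive into it, the same thing can be sketched as a reduce-side join in plain MapReduce: tag each record with its source file, group by id, and stitch the two sides together in the reducer. A rough sketch only (hypothetical class and path names; assumes comma-separated input with the id as the first field):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class IdJoin {

    // Emit (id, "A:rest") for records from a.txt and (id, "B:rest") for b.txt.
    public static class TagMapper extends Mapper<Object, Text, Text, Text> {
        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String fileName = ((FileSplit) context.getInputSplit()).getPath().getName();
            String tag = fileName.startsWith("a") ? "A" : "B";   // pick the tag from the file name
            String line = value.toString();
            int comma = line.indexOf(',');
            if (comma < 0) return;                               // skip malformed lines
            context.write(new Text(line.substring(0, comma)),
                          new Text(tag + ":" + line.substring(comma + 1)));
        }
    }

    // For each id, combine the a.txt fields with the b.txt fields (inner join).
    public static class JoinReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text id, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            String fromA = null;
            String fromB = null;
            for (Text v : values) {
                String s = v.toString();
                if (s.startsWith("A:")) fromA = s.substring(2);
                else fromB = s.substring(2);
            }
            if (fromA != null && fromB != null) {
                context.write(id, new Text(fromA + "," + fromB));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "id join");   // Job.getInstance(conf, ...) on newer releases
        job.setJarByClass(IdJoin.class);
        job.setMapperClass(TagMapper.class);
        job.setReducerClass(JoinReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path("a.txt"));
        FileInputFormat.addInputPath(job, new Path("b.txt"));
        FileOutputFormat.setOutputPath(job, new Path("c_out"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}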

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 29, 2012, at 5:29 AM, liuzhg liu...@cernet.com wrote:

 Hi,
 
 I wonder if Hadoop can effectively solve the following question:
 
 ==
 input file: a.txt, b.txt
 result: c.txt
 
 a.txt:
 id1,name1,age1,...
 id2,name2,age2,...
 id3,name3,age3,...
 id4,name4,age4,...
 
 b.txt: 
 id1,address1,...
 id2,address2,...
 id3,address3,...
 
 c.txt
 id1,name1,age1,address1,...
 id2,name2,age2,address2,...
 
 
 I know that it can be done well by a database.
 But I want to handle it with Hadoop if possible.
 Can Hadoop meet the requirement?
 
 Any suggestion can help me. Thank you very much!
 
 Best Regards,
 
 Gump
 
 
 


Re: How to Integrate LDAP in Hadoop ?

2012-05-29 Thread Michel Segel
Which release? Version?
I believe there are variables in the *-site.xml that allow LDAP integration ...



Sent from a remote device. Please excuse any typos...

Mike Segel

On May 26, 2012, at 7:40 AM, samir das mohapatra samir.help...@gmail.com 
wrote:

 Hi All,
 
   Did anyone work on Hadoop with LDAP integration?
   Please help me with the same.
 
 Thanks
  samir


Re: Reduce Hangs at 66%

2012-05-03 Thread Michel Segel
Well...
Lots of information but still missing some of the basics...

Which release and version?
What are your ulimits set to?
How much free disk space do you have?
What are you attempting to do?

Stuff like that.



Sent from a remote device. Please excuse any typos...

Mike Segel

On May 2, 2012, at 4:49 PM, Keith Thompson kthom...@binghamton.edu wrote:

 I am running a task which gets to 66% of the Reduce step and then hangs
 indefinitely. Here is the log file (I apologize if I am putting too much
 here but I am not exactly sure what is relevant):
 
 2012-05-02 16:42:52,975 INFO org.apache.hadoop.mapred.JobTracker:
 Adding task (REDUCE) 'attempt_201202240659_6433_r_00_0' to tip
 task_201202240659_6433_r_00, for tracker
 'tracker_analytix7:localhost.localdomain/127.0.0.1:56515'
 2012-05-02 16:42:53,584 INFO org.apache.hadoop.mapred.JobInProgress:
 Task 'attempt_201202240659_6433_m_01_0' has completed
 task_201202240659_6433_m_01 successfully.
 2012-05-02 17:00:47,546 INFO org.apache.hadoop.mapred.TaskInProgress:
 Error from attempt_201202240659_6432_r_00_0: Task
 attempt_201202240659_6432_r_00_0 failed to report status for 1800
 seconds. Killing!
 2012-05-02 17:00:47,546 INFO org.apache.hadoop.mapred.JobTracker:
 Removing task 'attempt_201202240659_6432_r_00_0'
 2012-05-02 17:00:47,546 INFO org.apache.hadoop.mapred.JobTracker:
 Adding task (TASK_CLEANUP) 'attempt_201202240659_6432_r_00_0' to
 tip task_201202240659_6432_r_00, for tracker
 'tracker_analytix4:localhost.localdomain/127.0.0.1:44204'
 2012-05-02 17:00:48,763 INFO org.apache.hadoop.mapred.JobTracker:
 Removing task 'attempt_201202240659_6432_r_00_0'
 2012-05-02 17:00:48,957 INFO org.apache.hadoop.mapred.JobTracker:
 Adding task (REDUCE) 'attempt_201202240659_6432_r_00_1' to tip
 task_201202240659_6432_r_00, for tracker
 'tracker_analytix5:localhost.localdomain/127.0.0.1:59117'
 2012-05-02 17:00:56,559 INFO org.apache.hadoop.mapred.TaskInProgress:
 Error from attempt_201202240659_6432_r_00_1: java.io.IOException:
 The temporary job-output directory
 hdfs://analytix1:9000/thompson/outputDensity/density1/_temporary
 doesn't exist!
at 
 org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
at 
 org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:240)
at 
 org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:438)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
 
 2012-05-02 17:00:59,903 INFO org.apache.hadoop.mapred.JobTracker:
 Removing task 'attempt_201202240659_6432_r_00_1'
 2012-05-02 17:00:59,906 INFO org.apache.hadoop.mapred.JobTracker:
 Adding task (REDUCE) 'attempt_201202240659_6432_r_00_2' to tip
 task_201202240659_6432_r_00, for tracker
 'tracker_analytix3:localhost.localdomain/127.0.0.1:39980'
 2012-05-02 17:01:07,200 INFO org.apache.hadoop.mapred.TaskInProgress:
 Error from attempt_201202240659_6432_r_00_2: java.io.IOException:
 The temporary job-output directory
 hdfs://analytix1:9000/thompson/outputDensity/density1/_temporary
 doesn't exist!
at 
 org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
at 
 org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:240)
at 
 org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:438)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
 
 2012-05-02 17:01:10,239 INFO org.apache.hadoop.mapred.JobTracker:
 Removing task 'attempt_201202240659_6432_r_00_2'
 2012-05-02 17:01:10,283 INFO org.apache.hadoop.mapred.JobTracker:
 Adding task (REDUCE) 'attempt_201202240659_6432_r_00_3' to tip
 task_201202240659_6432_r_00, for tracker
 'tracker_analytix2:localhost.localdomain/127.0.0.1:33297'
 2012-05-02 17:01:18,188 INFO org.apache.hadoop.mapred.TaskInProgress:
 Error from attempt_201202240659_6432_r_00_3: java.io.IOException:
 The temporary job-output directory
 hdfs://analytix1:9000/thompson/outputDensity/density1/_temporary
 doesn't exist!
at 
 

Re: Best practice to migrate HDFS from 0.20.205 to CDH3u3

2012-05-03 Thread Michel Segel
Well, you've kind of painted yourself into a corner...
Not sure why you didn't get a response from the Cloudera lists, but it's a
generic question...

8 out of 10 TB. Are you talking effective storage or actual disks?
And please tell me you've already ordered more hardware... Right?

And please tell me this isn't your production cluster...

(Strong hint to Strata and Cloudera... You really want to accept my upcoming
proposal talk... ;-)


Sent from a remote device. Please excuse any typos...

Mike Segel

On May 3, 2012, at 5:25 AM, Austin Chungath austi...@gmail.com wrote:

 Yes. This was first posted on the cloudera mailing list. There were no
 responses.
 
 But this is not related to cloudera as such.
 
 cdh3 is based on apache hadoop 0.20 as the base. My data is in apache
 hadoop 0.20.205
 
 There is an upgrade namenode option when we are migrating to a higher
 version say from 0.20 to 0.20.205
 but here I am downgrading from 0.20.205 to 0.20 (cdh3)
 Is this possible?
 
 
 On Thu, May 3, 2012 at 3:25 PM, Prashant Kommireddi 
 prash1...@gmail.comwrote:
 
 Seems like a matter of upgrade. I am not a Cloudera user so would not know
 much, but you might find some help moving this to Cloudera mailing list.
 
 On Thu, May 3, 2012 at 2:51 AM, Austin Chungath austi...@gmail.com
 wrote:
 
 There is only one cluster. I am not copying between clusters.
 
 Say I have a cluster running apache 0.20.205 with 10 TB storage capacity
 and has about 8 TB of data.
 Now how can I migrate the same cluster to use cdh3 and use that same 8 TB
 of data.
 
 I can't copy 8 TB of data using distcp because I have only 2 TB of free
 space
 
 
 On Thu, May 3, 2012 at 3:12 PM, Nitin Pawar nitinpawar...@gmail.com
 wrote:
 
 you can actually look at the distcp
 
 http://hadoop.apache.org/common/docs/r0.20.0/distcp.html
 
 but this means that you have two different set of clusters available to
 do
 the migration
 
 On Thu, May 3, 2012 at 12:51 PM, Austin Chungath austi...@gmail.com
 wrote:
 
 Thanks for the suggestions,
 My concerns are that I can't actually copyToLocal from the dfs
 because
 the
 data is huge.
 
 Say if my hadoop was 0.20 and I am upgrading to 0.20.205 I can do a
 namenode upgrade. I don't have to copy data out of dfs.
 
 But here I am having Apache hadoop 0.20.205 and I want to use CDH3
 now,
 which is based on 0.20
 Now it is actually a downgrade as 0.20.205's namenode info has to be
 used
 by 0.20's namenode.
 
 Any idea how I can achieve what I am trying to do?
 
 Thanks.
 
 On Thu, May 3, 2012 at 12:23 PM, Nitin Pawar 
 nitinpawar...@gmail.com
 wrote:
 
 i can think of following options
 
 1) write a simple get and put code which gets the data from DFS and
 loads
 it in dfs
 2) see if the distcp  between both versions are compatible
 3) this is what I had done (and my data was hardly few hundred GB)
 ..
 did a
 dfs -copyToLocal and then in the new grid did a copyFromLocal
 
 On Thu, May 3, 2012 at 11:41 AM, Austin Chungath 
 austi...@gmail.com
 
 wrote:
 
 Hi,
 I am migrating from Apache hadoop 0.20.205 to CDH3u3.
 I don't want to lose the data that is in the HDFS of Apache
 hadoop
 0.20.205.
 How do I migrate to CDH3u3 but keep the data that I have on
 0.20.205.
 What is the best practice/ techniques to do this?
 
 Thanks & Regards,
 Austin
 
 
 
 
 --
 Nitin Pawar
 
 
 
 
 
 --
 Nitin Pawar
 
 
 


Re: Best practice to migrate HDFS from 0.20.205 to CDH3u3

2012-05-03 Thread Michel Segel
Ok... When you get your new hardware...

Set up one server as your new NN, JT, SN.
Set up the others as a DN.
(Cloudera CDH3u3)

On your existing cluster... 
Remove your old log files and temp files on HDFS, anything you don't need.
This should give you some more space.
Start copying some of the directories/files to the new cluster.
As you gain space, decommission a node, rebalance, and add the node to the new cluster...

It's a slow process.

Should I remind you to make sure you up your bandwidth setting, and to clean up
the HDFS directories when you repurpose the nodes?

Does this make sense?

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 3, 2012, at 5:46 AM, Austin Chungath austi...@gmail.com wrote:

 Yeah I know :-)
 and this is not a production cluster ;-) and yes there is more hardware
 coming :-)
 
 On Thu, May 3, 2012 at 4:10 PM, Michel Segel michael_se...@hotmail.comwrote:
 
 Well, you've kind of painted yourself in to a corner...
 Not sure why you didn't get a response from the Cloudera lists, but it's a
 generic question...
 
 8 out of 10 TB. Are you talking effective storage or actual disks?
 And please tell me you've already ordered more hardware.. Right?
 
 And please tell me this isn't your production cluster...
 
 (Strong hint to Strata and Cloudea... You really want to accept my
 upcoming proposal talk... ;-)
 
 
 Sent from a remote device. Please excuse any typos...
 
 Mike Segel
 
 On May 3, 2012, at 5:25 AM, Austin Chungath austi...@gmail.com wrote:
 
 Yes. This was first posted on the cloudera mailing list. There were no
 responses.
 
 But this is not related to cloudera as such.
 
 cdh3 is based on apache hadoop 0.20 as the base. My data is in apache
 hadoop 0.20.205
 
 There is an upgrade namenode option when we are migrating to a higher
 version say from 0.20 to 0.20.205
 but here I am downgrading from 0.20.205 to 0.20 (cdh3)
 Is this possible?
 
 
 On Thu, May 3, 2012 at 3:25 PM, Prashant Kommireddi prash1...@gmail.com
 wrote:
 
 Seems like a matter of upgrade. I am not a Cloudera user so would not
 know
 much, but you might find some help moving this to Cloudera mailing list.
 
 On Thu, May 3, 2012 at 2:51 AM, Austin Chungath austi...@gmail.com
 wrote:
 
 There is only one cluster. I am not copying between clusters.
 
 Say I have a cluster running apache 0.20.205 with 10 TB storage
 capacity
 and has about 8 TB of data.
 Now how can I migrate the same cluster to use cdh3 and use that same 8
 TB
 of data.
 
 I can't copy 8 TB of data using distcp because I have only 2 TB of free
 space
 
 
 On Thu, May 3, 2012 at 3:12 PM, Nitin Pawar nitinpawar...@gmail.com
 wrote:
 
 you can actually look at the distcp
 
 http://hadoop.apache.org/common/docs/r0.20.0/distcp.html
 
 but this means that you have two different set of clusters available
 to
 do
 the migration
 
 On Thu, May 3, 2012 at 12:51 PM, Austin Chungath austi...@gmail.com
 wrote:
 
 Thanks for the suggestions,
 My concerns are that I can't actually copyToLocal from the dfs
 because
 the
 data is huge.
 
 Say if my hadoop was 0.20 and I am upgrading to 0.20.205 I can do a
 namenode upgrade. I don't have to copy data out of dfs.
 
 But here I am having Apache hadoop 0.20.205 and I want to use CDH3
 now,
 which is based on 0.20
 Now it is actually a downgrade as 0.20.205's namenode info has to be
 used
 by 0.20's namenode.
 
 Any idea how I can achieve what I am trying to do?
 
 Thanks.
 
 On Thu, May 3, 2012 at 12:23 PM, Nitin Pawar 
 nitinpawar...@gmail.com
 wrote:
 
 i can think of following options
 
 1) write a simple get and put code which gets the data from DFS and
 loads
 it in dfs
 2) see if the distcp  between both versions are compatible
 3) this is what I had done (and my data was hardly few hundred GB)
 ..
 did a
 dfs -copyToLocal and then in the new grid did a copyFromLocal
 
 On Thu, May 3, 2012 at 11:41 AM, Austin Chungath 
 austi...@gmail.com
 
 wrote:
 
 Hi,
 I am migrating from Apache hadoop 0.20.205 to CDH3u3.
 I don't want to lose the data that is in the HDFS of Apache
 hadoop
 0.20.205.
 How do I migrate to CDH3u3 but keep the data that I have on
 0.20.205.
 What is the best practice/ techniques to do this?
 
  Thanks & Regards,
 Austin
 
 
 
 
 --
 Nitin Pawar
 
 
 
 
 
 --
 Nitin Pawar
 
 
 
 


Re: Best practice to migrate HDFS from 0.20.205 to CDH3u3

2012-05-03 Thread Michel Segel
Ok... So riddle me this...
I currently have a replication factor of 3.
I reset it to two.

What do you have to do to get the replication factor of 3 down to 2?
Do I just try to rebalance the nodes?

The point is that you are looking at a very small cluster.
You may want to start the new cluster with a replication factor of 2 and then,
when the data is moved over, increase it to a factor of 3. Or maybe not.

I do a distcp to copy the data, and after each distcp I do an fsck for a
sanity check and then remove the files I copied. As I gain more room, I can
then slowly drop nodes, do an fsck, rebalance and then repeat.

Even though this is a dev cluster, the OP wants to retain the data.

There are other options depending on the amount and size of new hardware.
I mean make one machine a RAID 5 machine and copy data to it, clearing off the
cluster.

If 8 TB was the amount of disk used, that would be roughly 2.7 TB of actual data
(8 TB divided by a replication factor of 3). Let's say 3 TB. Going RAID 5, how much
disk is that? So you could fit it on one machine, depending on hardware, or maybe
2 machines... Now you can rebuild the initial cluster and then move data back.
Then rebuild those machines. Lots of options... ;-)

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 3, 2012, at 11:26 AM, Suresh Srinivas sur...@hortonworks.com wrote:

 This probably is a more relevant question in CDH mailing lists. That said,
 what Edward is suggesting seems reasonable. Reduce replication factor,
 decommission some of the nodes and create a new cluster with those nodes
 and do distcp.
 
 Could you share with us the reasons you want to migrate from Apache 205?
 
 Regards,
 Suresh
 
 On Thu, May 3, 2012 at 8:25 AM, Edward Capriolo edlinuxg...@gmail.comwrote:
 
 Honestly that is a hassle; going from 205 to cdh3u3 is probably more
 of a cross-grade than an upgrade or downgrade. I would just stick it
 out. But yes, like Michael said, two clusters on the same gear and
 distcp. If you are using RF=3 you could also lower your replication to
 rf=2 ('hadoop dfs -setrep 2') to clear headroom as you are moving
 stuff.
 
 
 On Thu, May 3, 2012 at 7:25 AM, Michel Segel michael_se...@hotmail.com
 wrote:
 Ok... When you get your new hardware...
 
 Set up one server as your new NN, JT, SN.
 Set up the others as a DN.
 (Cloudera CDH3u3)
 
 On your existing cluster...
 Remove your old log files, temp files on HDFS anything you don't need.
 This should give you some more space.
 Start copying some of the directories/files to the new cluster.
 As you gain space, decommission a node, rebalance, add node to new
 cluster...
 
 It's a slow process.
 
 Should I remind you to make sure you up you bandwidth setting, and to
 clean up the hdfs directories when you repurpose the nodes?
 
 Does this make sense?
 
 Sent from a remote device. Please excuse any typos...
 
 Mike Segel
 
 On May 3, 2012, at 5:46 AM, Austin Chungath austi...@gmail.com wrote:
 
 Yeah I know :-)
 and this is not a production cluster ;-) and yes there is more hardware
 coming :-)
 
 On Thu, May 3, 2012 at 4:10 PM, Michel Segel michael_se...@hotmail.com
 wrote:
 
 Well, you've kind of painted yourself in to a corner...
 Not sure why you didn't get a response from the Cloudera lists, but
 it's a
 generic question...
 
 8 out of 10 TB. Are you talking effective storage or actual disks?
 And please tell me you've already ordered more hardware.. Right?
 
 And please tell me this isn't your production cluster...
 
 (Strong hint to Strata and Cloudea... You really want to accept my
 upcoming proposal talk... ;-)
 
 
 Sent from a remote device. Please excuse any typos...
 
 Mike Segel
 
 On May 3, 2012, at 5:25 AM, Austin Chungath austi...@gmail.com
 wrote:
 
 Yes. This was first posted on the cloudera mailing list. There were no
 responses.
 
 But this is not related to cloudera as such.
 
 cdh3 is based on apache hadoop 0.20 as the base. My data is in apache
 hadoop 0.20.205
 
 There is an upgrade namenode option when we are migrating to a higher
 version say from 0.20 to 0.20.205
 but here I am downgrading from 0.20.205 to 0.20 (cdh3)
 Is this possible?
 
 
 On Thu, May 3, 2012 at 3:25 PM, Prashant Kommireddi 
 prash1...@gmail.com
 wrote:
 
 Seems like a matter of upgrade. I am not a Cloudera user so would not
 know
 much, but you might find some help moving this to Cloudera mailing
 list.
 
 On Thu, May 3, 2012 at 2:51 AM, Austin Chungath austi...@gmail.com
 wrote:
 
 There is only one cluster. I am not copying between clusters.
 
 Say I have a cluster running apache 0.20.205 with 10 TB storage
 capacity
 and has about 8 TB of data.
 Now how can I migrate the same cluster to use cdh3 and use that
 same 8
 TB
 of data.
 
 I can't copy 8 TB of data using distcp because I have only 2 TB of
 free
 space
 
 
 On Thu, May 3, 2012 at 3:12 PM, Nitin Pawar 
 nitinpawar...@gmail.com
 wrote:
 
 you can actually look at the distcp
 
 http://hadoop.apache.org/common/docs/r0.20.0/distcp.html
 
 but this means that you have two different

Re: Accessing global Counters

2012-04-20 Thread Michel Segel
Actually it's easier to use dynamic counters...
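
A dynamic counter is just a (group, name) pair of strings, so nothing has to be declared up front. A minimal sketch against the new (mapreduce) API, with hypothetical group and counter names:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// In the mapper: increment a counter identified purely by strings.
public class FruitMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        if (value.toString().contains("Apple")) {
            context.getCounter("Fruits", "Apple").increment(1);   // dynamic counter
        }
    }
}

// In the driver, after job.waitForCompletion(true), the same strings read it back:
//   long apples = job.getCounters().findCounter("Fruits", "Apple").getValue();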

Sent from a remote device. Please excuse any typos...

Mike Segel

On Apr 20, 2012, at 11:36 AM, Amith D K amit...@huawei.com wrote:

 Yes, you can use a user-defined counter as Jagat suggested.
 
 Counters can be enums as Jagat described, or arbitrary strings, which are called dynamic
 counters.
 
 It is easier to use enum counters than dynamic counters; in the end it depends on
 your use case :)
 
 Amith
 
 From: Jagat [jagatsi...@gmail.com]
 Sent: Saturday, April 21, 2012 12:25 AM
 To: common-user@hadoop.apache.org
 Subject: Re: Accessing global Counters
 
 Hi
 
 You can create your own counters like
 
 enum CountFruits {
 Apple,
 Mango,
 Banana
 }
 
 
 And in your mapper class, when you see the condition to increment, you can use
 the Reporter incrCounter method to do the same.
 
 http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/Reporter.html#incrCounter(java.lang.Enum,%20long)
 
 e.g
 // I saw Apple increment it by one
 reporter.incrCounter(CountFruits.Apple,1);
 
 Now you can access them using job.getCounters
 
 http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Job.html#getCounters()
 
 Hope this helps
 
 Regards,
 
 Jagat Singh
 
 
 On Fri, Apr 20, 2012 at 9:43 PM, Gayatri Rao rgayat...@gmail.com wrote:
 
 Hi All,
 
 Is there a way for me to set global counters in the mapper and access them from
 the reducer?
 Could you suggest how I can achieve this?
 
 Thanks
 Gayatri
 
 


Re: How to rebuild NameNode from DataNode.

2012-04-19 Thread Michel Segel
Switch to MapR M5?
:-)

Just kidding. 
A simple way of solving this pre-CDH4...
NFS mount a directory from your SN and add it to your list of checkpoint
directories.
You may lose some data, but you should be able to rebuild.

There is more to this, but it's the basic idea of how to get a copy of your
metadata.


Sent from a remote device. Please excuse any typos...

Mike Segel

On Apr 18, 2012, at 11:48 PM, Saburo Fujioka fuji...@do-it.co.jp wrote:

 Hello,
 
 I am currently drafting a plan of countermeasures for operational
 failures of a system.
 
 I am investigating how to rebuild the NameNode from the remaining
 DataNodes if the NameNode is lost, but I don't know how to do it
 at the moment.
 
 I confirmed that the namespaceID in dfs/name/current/VERSION is
 consistent with that of the DataNodes, but the fsimage is not
 rebuilt, and the listing did not display anything
 in hadoop dfs -ls.
 
 The risk of losing the NameNode is low because it is protected by
 Corosync + Pacemaker + DRBD.
 But even for that rare case, it is necessary to clarify the means
 to reconstruct the NameNode from the DataNodes.
 
 Do you know how to do this?
 
 
 I am using hadoop 1.0.1.
 
 Thank you very much,
 
 


Re: Setting a timeout for one Map() input processing

2012-04-18 Thread Michel Segel
Multiple threads within the mapper, where you have the main thread starting a
timeout thread and a process thread. Take the result of the thread that
finishes first, ignoring the other and killing it, all within the Mapper.map()
method? Sure, it seems possible.
(Your output from the timed-out thread is a NOOP...)
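
A rough sketch of that idea using an ExecutorService inside the mapper (new API; processLine() is a hypothetical stand-in for the real per-line work, and the 30-second budget is arbitrary):

import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TimeLimitedMapper extends Mapper<LongWritable, Text, Text, Text> {

    private final ExecutorService pool = Executors.newSingleThreadExecutor();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        final String line = value.toString();          // copy: Hadoop reuses the Text object
        Future<String> work = pool.submit(new Callable<String>() {
            public String call() throws Exception {
                return processLine(line);               // the potentially slow work
            }
        });
        try {
            String result = work.get(30, TimeUnit.SECONDS);   // wait at most 30s per line
            context.write(new Text("ok"), new Text(result));
        } catch (TimeoutException e) {
            work.cancel(true);                          // interrupt the worker; output is a no-op
            context.getCounter("TimeLimitedMapper", "skippedLines").increment(1);
        } catch (Exception e) {
            throw new IOException(e);
        }
    }

    @Override
    protected void cleanup(Context context) {
        pool.shutdownNow();
    }

    private String processLine(String line) {
        // Placeholder for the real per-line processing (hypothetical).
        return line.toUpperCase();
    }
}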

Sent from a remote device. Please excuse any typos...

Mike Segel

On Apr 18, 2012, at 7:39 AM, Ondřej Klimpera klimp...@fit.cvut.cz wrote:

 Hello, I'd like to ask if there is a possibility of setting a timeout for
 processing one line of text input in the mapper function.
 
 The idea is that if processing of one line takes too long, Hadoop will cut this
 process off and continue processing the next input line.
 
 Thank you for your answer.
 
 Ondrej Klimpera
 


Re: Control Resources / Cores assigned for a job

2012-03-16 Thread Michel Segel
Fair scheduler?


Sent from a remote device. Please excuse any typos...

Mike Segel

On Mar 16, 2012, at 5:54 PM, Deepak Nettem deepaknet...@gmail.com wrote:

 Hi,
 
 I want to be able to control the number of nodes assigned to a MR job on
 the cluster. For example, I want the job to not execute more than 10
 Mappers at a time, irrespective of whether there are more nodes available
 or not.
 
 I don't wish to control the number of mappers that are created by the job.
 The number of mappers is tied to the problem size / input data and on my
 input splits. I want it to remain that way.
 
 Is this possible? If so, what's the best way to do this?
 
 Deepak


Re: Should splittable Gzip be a core hadoop feature?

2012-03-01 Thread Michel Segel

 I do agree that a GitHub project is the way to go unless you could convince
Cloudera, Hortonworks or MapR to pick it up and support it. They have enough
committers.

Is this potentially worthwhile? Maybe; it depends on how the cluster is
integrated into the overall environment. Companies that have standardized on
using gzip would find it useful.



Sent from a remote device. Please excuse any typos...

Mike Segel

On Feb 29, 2012, at 3:17 PM, Niels Basjes ni...@basjes.nl wrote:

 Hi,
 
 On Wed, Feb 29, 2012 at 19:13, Robert Evans ev...@yahoo-inc.com wrote:
 
 
 What I really want to know is how well does this new CompressionCodec
 perform in comparison to the regular gzip codec in
 
 various different conditions and what type of impact does it have on
 network traffic and datanode load.  My gut feeling is that
 
 the speedup is going to be relatively small except when there is a lot of
 computation happening in the mapper
 
 
 I agree, I made the same assesment.
 In the javadoc I wrote under When is this useful?
 *Assume you have a heavy map phase for which the input is a 1GiB Apache
 httpd logfile. Now assume this map takes 60 minutes of CPU time to run.*
 
 
 and the added load and network traffic outweighs the speedup in most
 cases,
 
 
 No, the trick to solve that one is to upload the gzipped files with an HDFS
 blocksize equal to (or 1 byte larger than) the filesize.
 This setting will help in speeding up gzipped input files in any situation
 (no more network overhead).
 From there the HDFS file replication factor of the file dictates the
 optimal number of splits for this codec.
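
 For reference, a small sketch of uploading a file with a per-file block size at least as large as the file, so the gzipped file lands in a single HDFS block (hypothetical paths; some releases also enforce a minimum block size):

 import java.io.File;
 import java.io.FileInputStream;

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FSDataOutputStream;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.io.IOUtils;

 public class UploadAsSingleBlock {
     public static void main(String[] args) throws Exception {
         File local = new File("access_log.gz");       // hypothetical local file
         Configuration conf = new Configuration();
         FileSystem fs = FileSystem.get(conf);

         // Block size must cover the whole file and be a multiple of the 512-byte checksum chunk.
         long blockSize = ((local.length() / 512) + 1) * 512;
         short replication = 3;
         int bufferSize = conf.getInt("io.file.buffer.size", 4096);

         FSDataOutputStream out = fs.create(new Path("/logs/access_log.gz"),
                 true, bufferSize, replication, blockSize);
         FileInputStream in = new FileInputStream(local);
         IOUtils.copyBytes(in, out, conf, true);       // copies and closes both streams
     }
 }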
 
 
 but like all performance on a complex system gut feelings are
 
 almost worthless and hard numbers are what is needed to make a judgment
 call.
 
 
 Yes
 
 
 Niels, I assume you have tested this on your cluster(s).  Can you share
 with us some of the numbers?
 
 
 No I haven't tested it beyond a multiple core system.
 The simple reason for that is that when this was under review last summer
 the whole Yarn thing happened
 and I was unable to run it at all for a long time.
 I only got it running again last december when the restructuring of the
 source tree was mostly done.
 
 At this moment I'm building a experimentation setup at work that can be
 used for various things.
 Given the current state of Hadoop 2.0 I think it's time to produce some
 actual results.
 
 -- 
 Best regards / Met vriendelijke groeten,
 
 Niels Basjes


Re: Should splittable Gzip be a core hadoop feature?

2012-02-29 Thread Michel Segel
Let's play devil's advocate for a second?

Why? Snappy exists.
The only advantage is that you don't have to convert from gzip to snappy and 
can process gzip files natively.

Next question is how large are the gzip files in the first place?

I don't disagree, I just want to have a solid argument in favor of it...




Sent from a remote device. Please excuse any typos...

Mike Segel

On Feb 28, 2012, at 9:50 AM, Niels Basjes ni...@basjes.nl wrote:

 Hi,
 
 Some time ago I had an idea and implemented it.
 
 Normally you can only run a single gzipped input file through a single
 mapper and thus only on a single CPU core.
 What I created makes it possible to process a Gzipped file in such a way
 that it can run on several mappers in parallel.
 
 I've put the javadoc I created on my homepage so you can read more about
 the details.
 http://howto.basjes.nl/hadoop/javadoc-for-skipseeksplittablegzipcodec
 
 Now the question that was raised by one of the people reviewing this code
 was: Should this implementation be part of the core Hadoop feature set?
 The main reason that was given is that this needs a bit more understanding
 on what is happening and as such cannot be enabled by default.
 
 I would like to hear from the Hadoop Core/Map reduce users what you think.
 
 Should this be
 - a part of the default Hadoop feature set so that anyone can simply enable
 it by setting the right configuration?
 - a separate library?
 - a nice idea I had fun building but that no one needs?
 - ... ?
 
 -- 
 Best regards / Met vriendelijke groeten,
 
 Niels Basjes


Re: working with SAS

2012-02-06 Thread Michel Segel
Both responses assume replacing SAS with a Hadoop cluster.
I would agree that going to EC2 might make sense in terms of a PoC before
investing in a physical cluster, but we need to know more about the underlying
problem.

First, can the problem be broken down into something that can be accomplished
in parallel subtasks? Second... how much data? It could be a good use case
for Whirr...

Sent from a remote device. Please excuse any typos...

Mike Segel

On Feb 6, 2012, at 2:32 AM, Prashant Sharma prashan...@imaginea.com wrote:

 + You will not necessarily need vertical systems for speeding up
 things (it totally depends on your query). Give a thought to having commodity
 hardware (much cheaper); with Hadoop being suited for it, *I hope* your
 infrastructure can be cheaper in terms of price to performance ratio.
 Having said that, I do not mean you have to throw away your existing
 infrastructure, because it is ideal for certain requirements.
 
 Your solution can be like writing a mapreduce job which does what the query is
 supposed to do and running it on a cluster of what size? Depends! (How fast do you
 want things to be done, and at what scale?) In case your query is ad hoc and has to be
 run frequently, you might want to consider HBase and Hive as solutions with a
 lot of expensive vertical nodes ;).
 
 BTW, is your query iterative? A little more detail on your type of query
 can attract guys with more wisdom to help.
 
 HTH
 
 
 On Mon, Feb 6, 2012 at 1:46 PM, alo alt wget.n...@googlemail.com wrote:
 
 Hi,
 
 Hadoop runs on a Linux box (mostly) and can run in a standalone
 installation for testing only. If you decide to use Hadoop with Hive or
 HBase you have to face a lot more tasks:
 
 - installation (Whirr and Amazon EC2, for example)
 - write your own mapreduce job or use Hive / HBase
 - set up Sqoop with the Teradata driver
 
 You can easily set up parts 1 and 2 with Amazon's EC2; I think you can also
 book Windows Server there. For a single query that is the best option, I think,
 before you install a hadoop cluster.
 
 best,
 Alex
 
 
 --
 Alexander Lorenz
 http://mapredit.blogspot.com
 
 On Feb 6, 2012, at 8:11 AM, Ali Jooan Rizvi wrote:
 
 Hi,
 
 
 
 I would like to know if Hadoop will be of help to me. Let me explain
 my scenario:
 
 
 
 I have a windows server based single machine server having 16 Cores and
 48
 GB of Physical Memory. In addition, I have 120 GB of virtual memory.
 
 
 
 I am running a query with statistical calculation on large data of over 1
 billion rows, on SAS. In this case, SAS is acting like a database on
 which
 both source and target tables are residing. For storage, I can keep the
 source and target data on Teradata as well but the query containing a
 patent
 can only be run on SAS interface.
 
 
 
 The problem is that SAS is taking many days (25 days) to run it (a single
 query with a statistical function), and not all cores were used all the time;
 rather, merely 5% CPU was utilized on average. However, memory
 utilization
 was high, very high, and that's why large virtual memory was used.
 
 
 
 Can I have a Hadoop interface in place to do it all so that I may end up
 running the query in less time, that is, in 1 or 2 days? Anything
 squeezing
 my run time will be very helpful.
 
 
 
 Thanks
 
 
 
 Ali Jooan Rizvi
 
 
 


Re: Too many open files Error

2012-01-26 Thread Michel Segel
Sorry, going from memory...
As the hadoop, mapred, or hdfs user, what do you see when you do a ulimit -a?
That should give you the number of open files allowed for a single user...


Sent from a remote device. Please excuse any typos...

Mike Segel

On Jan 26, 2012, at 5:13 AM, Mark question markq2...@gmail.com wrote:

 Hi guys,
 
   I get this error from a job trying to process 3Million records.
 
 java.io.IOException: Bad connect ack with firstBadLink 192.168.1.20:50010
at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2903)
at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2826)
at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
 
 When I checked the logfile of the datanode-20, I see :
 
 2012-01-26 03:00:11,827 ERROR
 org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
 192.168.1.20:50010, storageID=DS-97608578-192.168.1.20-50010-1327575205369,
 infoPort=50075, ipcPort=50020):DataXceiver
 java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
at sun.nio.ch.IOUtil.read(IOUtil.java:175)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
at
 org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
at
 org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at
 org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
at
 org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
at java.io.DataInputStream.read(DataInputStream.java:132)
at
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:262)
at
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:309)
at
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:373)
at
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:525)
at
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
at
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
at java.lang.Thread.run(Thread.java:662)
 
 
 Which is because I'm running 10 maps per taskTracker on a 20 node cluster,
 each map opens about 300 files so that should give 6000 opened files at the
 same time ... why is this a problem? the maximum # of files per process on
 one machine is:
 
  cat /proc/sys/fs/file-max   --- 2403545
 
 
 Any suggestions?
 
 Thanks,
 Mark


Re: Problems with timeout when a Hadoop job generates a large number of key-value pairs

2012-01-20 Thread Michel Segel
Steve,
If you want me to debug your code, I'll be glad to set up a billable 
contract... ;-)

What I am willing to do is to help you to debug your code...

Did you time how long it takes in the Mapper.map() method?
The reason I asked this is to first confirm that you are failing within a map() 
method.
It could be that you're just not updating your status...

You said that you are writing many output records for a single input. 

So let's take a look at your code.
Are all writes of the same length? Meaning that in each iteration of
Mapper.map() you will always write K rows?

If so, ask yourself why some iterations are taking longer and longer?

Note: I'm assuming that the time for each iteration is taking longer than the 
previous...

Or am I missing something?
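
One low-tech way to confirm that, sketched here rather than taken from Steve's code: wrap the body of map() with a timer and push the measurements into counters, so the job's counter page shows where the time goes:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Times each map() call and counts the slow ones via job counters.
public class TimedMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        long start = System.currentTimeMillis();

        // ... the existing per-record work and context.write() calls go here ...

        long elapsed = System.currentTimeMillis() - start;
        context.getCounter("MapTiming", "totalMillis").increment(elapsed);
        if (elapsed > 10000) {                        // flag any single record over 10 seconds
            context.getCounter("MapTiming", "slowRecords").increment(1);
        }
    }
}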

-Mike

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jan 20, 2012, at 11:16 AM, Steve Lewis lordjoe2...@gmail.com wrote:

 We have been having problems with mappers timing out after 600 sec when the
 mapper writes many more, say thousands of records for every
 input record - even when the code in the mapper is small and fast. I have
 no idea what could cause the system to be so slow and am reluctant to raise
 the 600 sec limit without understanding why there should be a timeout when
 all MY code is very fast.
 
 I am enclosing a small sample which illustrates the problem. It will
 generate a 4GB text file on hdfs if the input file does not exist or is not
 at least that size and this will take some time (hours in my configuration)
 - then the code is essentially wordcount but instead of finding and
 emitting words - the mapper emits all substrings of the input data - this
 generates a much larger output data and number of output records than
 wordcount generates.
 Still, the amount of data emitted is no larger than other data sets I know
 Hadoop can handle.
 
 All mappers on my 8 node cluster eventually timeout after 600 sec - even
 though I see nothing in the code which is even a little slow and suspect
 that any slow behavior is in the  called Hadoop code. This is similar to a
 problem we have in bioinformatics where a  colleague saw timeouts on his 50
 node cluster.
 
 I would appreciate any help from the group. Note - if you have a text file
 at least 4 GB the program will take that as an imput without trying to
 create its own file.
 /*
 
 */
 import org.apache.hadoop.conf.*;
 import org.apache.hadoop.fs.*;
 import org.apache.hadoop.io.*;
 import org.apache.hadoop.mapreduce.*;
 import org.apache.hadoop.mapreduce.lib.input.*;
 import org.apache.hadoop.mapreduce.lib.output.*;
 import org.apache.hadoop.util.*;
 
 import java.io.*;
 import java.util.*;
 /**
 * org.systemsbiology.hadoop.SubstringGenerator
  *
  * This illustrates an issue we are having where a mapper generating a
 much larger volume of
  * data ans number of records times out even though the code is small,
 simple and fast
  *
  * NOTE!!! as written the program will generate a 4GB file in hdfs with
 good input data -
  * this is done only if the file does not exist but may take several
 hours. It will only be
  * done once. After that the failure is fairly fast
  *
 * What this will do is count  unique Substrings of lines of length
 * between MIN_SUBSTRING_LENGTH and MAX_SUBSTRING_LENGTH by generatin all
 * substrings and  then using the word could algorithm
 * What is interesting is that the number and volume or writes in the
  * map phase is MUCH larger than the number of reads and the volume of
 read data
  *
  * The example is artificial but similar the some real BioInformatics
 problems -
  *  for example finding all substrings in a gemome can be important for
 the design of
  *  microarrays.
  *
  *  While the real problem is more complex - the problem is that
  *  when the input file is large enough the mappers time out failing to
 report after
  *  600 sec. There is nothing slow in any of the application code and
 nothing I
 */
 public class SubstringCount  implements Tool   {
 
 
public static final long ONE_MEG = 1024 * 1024;
public static final long ONE_GIG = 1024 * ONE_MEG;
public static final int LINE_LENGTH = 100;
public static final Random RND = new Random();
 
   // NOTE - edit this line to be a sensible location in the current file
 system
public static final String INPUT_FILE_PATH = BigInputLines.txt;
   // NOTE - edit this line to be a sensible location in the current file
 system
public static final String OUTPUT_FILE_PATH = output;
 // NOTE - edit this line to be the input file size - 4 * ONE_GIG
 should be large but not a problem
public static final long DESIRED_LENGTH = 4 * ONE_GIG;
// NOTE - limits on substring length
public static final int MINIMUM_LENGTH = 5;
public static final int MAXIMUM_LENGTH = 32;
 
 
/**
 * create an input file to read
 * @param fs !null file system
 

Re: I am trying to run a large job and it is consistently failing with timeout - nothing happens for 600 sec

2012-01-19 Thread Michel Segel
Timeout errors don't usually occur outside of the Mapper.map() 'phase'.

When we've seen this error it has had to do with M/R going against HBase.

The OP sees the error when he does a bulk 'write', but it stops when he
reduces the number of writes... That kind of suggests where the problem occurs
...

Unless of course I missed something...

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jan 18, 2012, at 9:28 PM, Raj Vishwanthan rajv...@yahoo.com wrote:

 You can try the following:
 - make it into a map-only job (for debug purposes)
 - start your shuffle phase after all the maps are complete (there is a
 parameter for this)
 - characterize your disks for performance
 
 Raj 
 
 
 Sent from Samsung Mobile
 
 Steve Lewis lordjoe2...@gmail.com wrote:
 
In my hands the problem occurs in all map jobs. An associate with a
different cluster (mine has 8 nodes, his 40) reports 80% of map tasks fail
with a few succeeding.
I suspect some kind of an I/O wait but fail to see how it gets to 600 sec.
 
 On Wed, Jan 18, 2012 at 4:50 PM, Raj V rajv...@yahoo.com wrote:
 Steve
 
 Does the timeout happen for all the map jobs? Are you using some kind of 
 shared storage for map outputs? Any problems with the physical disks? If the 
 shuffle phase has started could the disks be I/O waiting between the read and 
 write?
 
 Raj
 
 
 
 
 From: Steve Lewis lordjoe2...@gmail.com
 To: common-user@hadoop.apache.org
 Sent: Wednesday, January 18, 2012 4:21 PM
 Subject: Re: I am trying to run a large job and it is consistently failing 
 with timeout - nothing happens for 600 sec
 
 1) I do a lot of progress reporting
 2) Why would the job succeed when the only change in the code is
   if(NumberWrites++ % 100 == 0)
   context.write(key,value);
 comment out the test  allowing full writes and the job fails
 Since every write is a report I assume that something in the write code or
 other hadoop code for dealing with output if failing. I do increment a
 counter for every write or in the case of the above code potential write
 What I am seeing is that wherever the timeout occurs it is not in a place
 where I am capable of inserting more reporting
 
 
 
 On Wed, Jan 18, 2012 at 4:01 PM, Leonardo Urbina lurb...@mit.edu wrote:
 
 Perhaps you are not reporting progress throughout your task. If you
 happen to run a large enough job you hit the default timeout,
 mapred.task.timeout (which defaults to 10 min). Perhaps you should
 consider reporting progress in your mapper/reducer by calling
 progress() on the Reporter object. Check tip 7 of this link:
 
 http://www.cloudera.com/blog/2009/05/10-mapreduce-tips/
 
 Hope that helps,
 -Leo
 
 Sent from my phone
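A hedged old-API sketch of that tip (class and counter names are placeholders): ping the framework every so many records so that a long-running map() call is not killed after mapred.task.timeout.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class ReportingMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, LongWritable> {

        private long recordsSeen = 0;

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, LongWritable> output, Reporter reporter)
                throws IOException {
            // ... expensive per-record work and output.collect(...) calls go here ...
            if (++recordsSeen % 1000 == 0) {
                reporter.progress();                          // resets the timeout clock
                reporter.incrCounter("app", "records", 1000); // counter updates also count as progress
            }
        }
    }
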
 
 On Jan 18, 2012, at 6:46 PM, Steve Lewis lordjoe2...@gmail.com wrote:
 
 I KNOW it is a task timeout - what I do NOT know is WHY merely cutting
 the
 number of writes causes it to go away. It seems to imply that some
 context.write operation or something downstream from that is taking a
 huge
 amount of time and that is all hadoop internal code - not mine so my
 question is why should increasing the number and volume of wriotes cause
 a
 task to time out
 
 On Wed, Jan 18, 2012 at 2:33 PM, Tom Melendez t...@supertom.com wrote:
 
 Sounds like mapred.task.timeout?  The default is 10 minutes.
 
 http://hadoop.apache.org/common/docs/current/mapred-default.html
 
 Thanks,
 
 Tom
 
 On Wed, Jan 18, 2012 at 2:05 PM, Steve Lewis lordjoe2...@gmail.com
 wrote:
 The map tasks fail timing out after 600 sec.
 I am processing one 9 GB file with 16,000,000 records. Each record
 (think
 of it as a line)  generates hundreds of key value pairs.
 The job is unusual in that the output of the mapper in terms of records
 or
 bytes is orders of magnitude larger than the input.
 I have no idea what is slowing down the job except that the problem is
 in
 the writes.
 
 If I change the job to merely bypass a fraction of the context.write
 statements the job succeeds.
 This is one map task that failed and one that succeeded - I cannot
 understand how a write can take so long
 or what else the mapper might be doing
 
 JOB FAILED WITH TIMEOUT
 
 Parser: TotalProteins 90,103; NumberFragments 10,933,089
 FileSystemCounters: HDFS_BYTES_READ 67,245,605; FILE_BYTES_WRITTEN 444,054,807
 Map-Reduce Framework: Combine output records 10,033,499; Map input records 90,103;
 Spilled Records 10,032,836; Map output bytes 3,520,182,794;
 Combine input records 10,844,881; Map output records 10,933,089
 
 Same code but fewer writes
 JOB SUCCEEDED
 
 Parser: TotalProteins 90,103; NumberFragments 206,658,758
 FileSystemCounters: FILE_BYTES_READ 111,578,253; HDFS_BYTES_READ 67,245,607;
 FILE_BYTES_WRITTEN 220,169,922
 Map-Reduce Framework: Combine output records 4,046,128; Map input records 90,103;
 Spilled Records 4,046,128; Map output bytes 662,354,413;
 Combine input records 4,098,609; Map output records 2,066,588
 Any bright ideas
 --
 Steven M. Lewis PhD
 4221 105th Ave NE
 Kirkland, WA 

Re: Dynamically adding nodes in Hadoop

2011-12-17 Thread Michel Segel
Actually I would recommend avoiding /etc/hosts and using DNS if this is going 
to be a production grade cluster...

Sent from a remote device. Please excuse any typos...

Mike Segel

On Dec 17, 2011, at 5:40 AM, alo alt wget.n...@googlemail.com wrote:

 Hi,
 
  in the slaves file too. /etc/hosts is also recommended to avoid DNS
  issues. After adding it to slaves the new node has to be started and
  should quickly appear in the web UI. If you don't need the nodes all the
  time you can set up an exclude file and refresh your cluster
 (http://wiki.apache.org/hadoop/FAQ#I_want_to_make_a_large_cluster_smaller_by_taking_out_a_bunch_of_nodes_simultaneously._How_can_this_be_done.3F)
 
 - Alex
 
 On Sat, Dec 17, 2011 at 12:06 PM, madhu phatak phatak@gmail.com wrote:
 Hi,
   I am trying to add nodes dynamically to a running hadoop cluster. I started
  the tasktracker and datanode on the node. It works fine. But when some node
  tries to fetch values (for the reduce phase) it fails with an unknown host exception.
  When I add a node to a running cluster do I have to add its hostname to the
  /etc/hosts file on all nodes (slaves + master)? Or is there some other way?
 
 
 --
 Join me at http://hadoopworkshop.eventbrite.com/
 
 
 
 -- 
 Alexander Lorenz
 http://mapredit.blogspot.com
 
  Think of the environment: please don't print this email unless you
  really need to.
 


Re: More cores Vs More Nodes ?

2011-12-17 Thread Michel Segel
Brad you said 64 core allocations.
So how many cores are lost due to the overhead of virtualization?
Isn't it 1 core per VM?
So you end up losing 8 cores when you create 8 vms... Right?

Sent from a remote device. Please excuse any typos...

Mike Segel

On Dec 13, 2011, at 7:15 PM, Brad Sarsfield b...@bing.com wrote:

 Hi Prashant,
 
 In each case I had a single tasktracker per node. I oversubscribed the total 
 tasks per tasktracker/node by 1.5 x # of cores.
 
 So for the 64 core allocation comparison.
In A: 8 cores; Each machine had a single tasktracker with 8 maps / 4 
 reduce slots for 12 task slots total per machine x 8 machines (including head 
 node)
In B: 2 cores; Each machine had a single tasktracker with 2 maps / 1 
 reduce slots for 3 slots total per machines x 29 machines (including head 
 node which was running 8 cores)
 
 The experiment was done in a cloud hosted environment running set of VMs.
 
 ~Brad
 
 -Original Message-
 From: Prashant Kommireddi [mailto:prash1...@gmail.com] 
 Sent: Tuesday, December 13, 2011 9:46 AM
 To: common-user@hadoop.apache.org
 Subject: Re: More cores Vs More Nodes ?
 
 Hi Brad, how many taskstrackers did you have on each node in both cases?
 
 Thanks,
 Prashant
 
 Sent from my iPhone
 
 On Dec 13, 2011, at 9:42 AM, Brad Sarsfield b...@bing.com wrote:
 
 Praveenesh,
 
 Your question is not naïve; in fact, optimal hardware design can ultimately 
 be a very difficult question to answer on what would be better. If you 
 made me pick one without much information I'd go for more machines.  But...
 
 It all depends; and there is no right answer :)
 
 More machines
   +May run your workload faster
   +Will give you a higher degree of reliability protection from node / 
 hardware / hard drive failure.
   +More aggregate IO capabilities
    - capex / opex may be higher than allocating more cores
  More cores
    +May run your workload faster
    +More cores may allow for more tasks to run on the same machine
    +More cores/tasks may reduce network contention and increase
  task to task data flow performance.
 
 Notice May run your workload faster is in both; as it can be very workload 
 dependant.
 
 My Experience:
 I did a recent experiment and found that given the same number of cores (64) 
 with the exact same network / machine configuration;
   A: I had 8 machines with 8 cores
   B: I had 28 machines with 2 cores (and 1x8 core head node)
 
 B was able to outperform A by 2x using teragen and terasort. These machines 
 were running in a virtualized environment; where some of the IO capabilities 
 behind the scenes were being regulated to 400Mbps per node when running in 
 the 2 core configuration vs 1Gbps on the 8 core.  So I would expect the 
 non-throttled scenario to work even better.
 
 ~Brad
 
 
 -Original Message-
 From: praveenesh kumar [mailto:praveen...@gmail.com]
 Sent: Monday, December 12, 2011 8:51 PM
 To: common-user@hadoop.apache.org
 Subject: More cores Vs More Nodes ?
 
 Hey Guys,
 
 So I have a very naive question in my mind regarding Hadoop cluster nodes ?
 
 more cores or more nodes - shall I spend money on going from 2- to 4-core 
 machines, or spend money on buying more nodes with fewer cores, e.g. say 2 machines 
 of 2 cores each?
 
 Thanks,
 Praveenesh
 
 
 


Re: Routing and region deletes

2011-12-08 Thread Michel Segel
Per Steffensen,

I would urge you to step away from the keyboard and rethink your design.
It sounds like you want to replicate a date partition model similar to what you 
would do if you were attempting this with HBase.

HBase is not a relational database and you have a different way of doing things.

You could put the date/time stamp in the key such that your data is sorted by 
date.
However, this would cause hot spots.  Think about how you access the data. It 
sounds like you access the more recent data more frequently than historical 
data.  This is a bad idea in HBase.
(note: it may still make sense to do this ... You have to think more about the 
data and consider alternatives.)

I personally would hash the key for even distribution, again depending on the 
data access pattern.  (hashed data means you can't do range queries but again, 
it depends on what you are doing...)

You also have to think about how you purge the data. You don't just drop a 
region. Doing a full table scan once a month to delete may not be a bad thing. 
Again it depends on what you are doing...

Just my opinion. Others will have their own... Now I'm stepping away from the 
keyboard to get my morning coffee...
:-)


Sent from a remote device. Please excuse any typos...

Mike Segel
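A minimal sketch of the hash-the-key idea (the bucket count and key layout are assumptions for illustration, not a recommendation for this particular schema):

    import org.apache.hadoop.hbase.util.Bytes;

    // Prefix each row key with a small hash-derived salt so writes spread across
    // pre-split regions instead of hot-spotting on the current date. As noted
    // above, straight time-range scans are lost once you do this.
    public class SaltedKeys {
        private static final int BUCKETS = 16;   // assumption: 16 pre-split regions

        public static byte[] saltedRowKey(String naturalKey, long timestamp) {
            int bucket = (naturalKey.hashCode() & Integer.MAX_VALUE) % BUCKETS;
            return Bytes.add(new byte[] { (byte) bucket },   // 1-byte salt
                             Bytes.toBytes(naturalKey),
                             Bytes.toBytes(timestamp));
        }
    }
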

On Dec 8, 2011, at 7:13 AM, Per Steffensen st...@designware.dk wrote:

 Hi
 
 The system we are going to work on will receive 50mio+ new datarecords every 
 day. We need to keep a history of 2 years of data (that's 35+ billion 
 datarecords in the storage all in all), and that basically means that we also 
 need to delete 50mio+ datarecords every day, or e.g. 1,5 billion every month. 
 We plan to store the datarecords in HBase.
 
 Is it somehow possible to tell HBase to put (route) all datarecords belonging 
 to a specific date or month to a designated set of regions (and route nothing 
 else there), so that deleting all data belonging to that day/month i 
 basically deleting those regions entirely? And is explicit deletion of entire 
 regions possible at all?
 
 The reason I want to do this is that I expect it to be much faster than doing 
 explicit deletion record by record of 50mio+ records every day.
 
 Regards, Per Steffensen
 
 
 


Re: Issue with DistributedCache

2011-11-24 Thread Michel Segel
Silly question... Why do you need to use the distributed cache for the word 
count program?
 What are you trying to accomplish?

I've only had to play with it for one project where we had to push out a bunch 
of c++ code to the nodes as part of a job...

Sent from a remote device. Please excuse any typos...

Mike Segel
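For reference, a hedged sketch of what the cache is normally used for - shipping a side file that already sits in HDFS out to every task's local disk (the path is a placeholder):

    import java.net.URI;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.mapred.JobConf;

    public class CacheSetup {
        // Placeholder path; the point is only the API shape. Tasks can later read
        // their local copies via DistributedCache.getLocalCacheFiles(conf).
        public static void addSideFile(JobConf conf) throws Exception {
            DistributedCache.addCacheFile(
                new URI("hdfs://namenode:8020/apps/lookup/stopwords.txt"), conf);
        }
    }
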

On Nov 24, 2011, at 7:05 AM, Denis Kreis de.kr...@gmail.com wrote:

 Hi Bejoy
 
 1. Old API:
 The Map and Reduce classes are the same as in the example, the main
 method is as follows
 
 public static void main(String[] args) throws IOException,
 InterruptedException {
        UserGroupInformation ugi =
            UserGroupInformation.createProxyUser("remote user name",
                UserGroupInformation.getLoginUser());
        ugi.doAs(new PrivilegedExceptionAction<Void>() {
            public Void run() throws Exception {

                JobConf conf = new JobConf(WordCount.class);
                conf.setJobName("wordcount");

                conf.setOutputKeyClass(Text.class);
                conf.setOutputValueClass(IntWritable.class);

                conf.setMapperClass(Map.class);
                conf.setCombinerClass(Reduce.class);
                conf.setReducerClass(Reduce.class);

                conf.setInputFormat(TextInputFormat.class);
                conf.setOutputFormat(TextOutputFormat.class);

                FileInputFormat.setInputPaths(conf, new Path("path to input dir"));
                FileOutputFormat.setOutputPath(conf, new Path("path to output dir"));

                conf.set("mapred.job.tracker", "ip:8021");

                FileSystem fs = FileSystem.get(new URI("hdfs://ip:8020"),
                    new Configuration());
                fs.mkdirs(new Path("remote path"));
                fs.copyFromLocalFile(new Path("local path/test.jar"),
                    new Path("remote path"));

 


Re: MapReduce Examples

2011-11-24 Thread Michel Segel
I think that there are... Since you work in databases, have you looked at HBase?
The reason the word count example is nice is that anyone can get a bunch of 
word / text documents freely from the web.

Having completed the word count, pi approximation, historical temp 
calculation... Do you have access to actual data?

I mean if you were Sears, Kroger, HomeDepot, IKEA,... Large retailers with lots 
of PoS data, you could try to find out whats the big holiday sellers... Or I 
guess if you can get your hands on twitter data... ( I dont know... I don't 
tweet) you can start to think of writing your own examples...

Actually that would probably be a better way to learn it...
Oh and one last idea... There's a book written by some guys from the university 
of Maryland ...
Maybe another good source of ideas...


HTH

-Mike

Sent from a remote device. Please excuse any typos...

Mike Segel

On Nov 24, 2011, at 6:02 AM, Sloot, Hans-Peter hans-peter.sl...@atos.net 
wrote:

 Hi ,
 
  Are there any somewhat more sophisticated examples of MapReduce than the 
  word counting available?
  I work for a database department and just the word count is not very 
  impressive.
 
 Regards Hans-Peter
 
 
 
 
 
 
 This e-mail and the documents attached are confidential and intended solely 
 for the addressee; it may also be privileged. If you receive this e-mail in 
 error, please notify the sender immediately and destroy it. As its integrity 
 cannot be secured on the Internet, the Atos Nederland B.V. group liability 
 cannot be triggered for the message content. Although the sender endeavours 
 to maintain a computer virus-free network, the sender does not warrant that 
 this transmission is virus-free and will not be liable for any damages 
 resulting from any virus transmitted. On all offers and agreements under 
 which Atos Nederland B.V. supplies goods and/or services of whatever nature, 
 the Terms of Delivery from Atos Nederland B.V. exclusively apply. The Terms 
 of Delivery shall be promptly submitted to you on your request.
 
 Atos Nederland B.V. / Utrecht
 KvK Utrecht 30132762
 
 


Re: Issue with DistributedCache

2011-11-24 Thread Michel Segel
Denis...

Sorry, you lost me.

Just to make sure we're using the same terminology...
The cluster is comprised of two types of nodes...
The data nodes which run DN,TT, and if you have HBase, RS.
Then there are control nodes which run your NN, SN, JT and, if you run HBase, HM 
and ZKs ...

Outside of the cluster we have machines set up with Hadoop installed but are 
not running any of the processes. They are where our users launch there jobs. 
We call them edge nodes. ( it's not a good idea to let users directly on the 
actual cluster.)

Ok, having said all of that... You launch your job from the edge nodes... Your 
data sits in HDFS so you don't need the distributed cache at all. Does that make 
sense?
Your job will run on the local machine, connect to the JT and then run.

We set up the edge nodes so that all of the jars, config files are already set 
up for the users and we can better control access...

Sent from a remote device. Please excuse any typos...

Mike Segel

On Nov 24, 2011, at 7:22 AM, Denis Kreis de.kr...@gmail.com wrote:

 Without using the distributed cache i'm getting the same error. It's
 because i start the job from a remote client / programmatically
 
 2011/11/24 Michel Segel michael_se...@hotmail.com:
 Silly question... Why do you need to use the distributed cache for the word 
 count program?
  What are you trying to accomplish?
 
 I've only had to play with it for one project where we had to push out a 
 bunch of c++ code to the nodes as part of a job...
 
 Sent from a remote device. Please excuse any typos...
 
 Mike Segel
 
 On Nov 24, 2011, at 7:05 AM, Denis Kreis de.kr...@gmail.com wrote:
 
 Hi Bejoy
 
 1. Old API:
 The Map and Reduce classes are the same as in the example, the main
 method is as follows
 
 public static void main(String[] args) throws IOException,
 InterruptedException {
         UserGroupInformation ugi =
             UserGroupInformation.createProxyUser("remote user name",
                 UserGroupInformation.getLoginUser());
         ugi.doAs(new PrivilegedExceptionAction<Void>() {
             public Void run() throws Exception {
  
                 JobConf conf = new JobConf(WordCount.class);
                 conf.setJobName("wordcount");
  
                 conf.setOutputKeyClass(Text.class);
                 conf.setOutputValueClass(IntWritable.class);
  
                 conf.setMapperClass(Map.class);
                 conf.setCombinerClass(Reduce.class);
                 conf.setReducerClass(Reduce.class);
  
                 conf.setInputFormat(TextInputFormat.class);
                 conf.setOutputFormat(TextOutputFormat.class);
  
                 FileInputFormat.setInputPaths(conf, new Path("path to input dir"));
                 FileOutputFormat.setOutputPath(conf, new Path("path to output dir"));
  
                 conf.set("mapred.job.tracker", "ip:8021");
  
                 FileSystem fs = FileSystem.get(new URI("hdfs://ip:8020"),
                     new Configuration());
                 fs.mkdirs(new Path("remote path"));
                 fs.copyFromLocalFile(new Path("local path/test.jar"),
                     new Path("remote path"));
 
 
 
 


Re: Matrix multiplication in Hadoop

2011-11-19 Thread Michel Segel
You really don't need to wait...

If you're going to go down this path you can use a jni wrapper to do the c/c++ 
code for the gpu...
You can do that now...

If you want to go beyond the 1D you can do it but you have to get a bit 
creative... but it's doable...


Sent from a remote device. Please excuse any typos...

Mike Segel

On Nov 19, 2011, at 10:53 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

 Sounds like a job for next gen map reduce native libraries and gpu's. A
 modern day Dr frankenstein for sure.
 
 On Saturday, November 19, 2011, Tim Broberg tim.brob...@exar.com wrote:
 Perhaps this is a good candidate for a native library, then?
 
 
 From: Mike Davis [xmikeda...@gmail.com]
 Sent: Friday, November 18, 2011 7:39 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Matrix multiplication in Hadoop
 
 On Friday, November 18, 2011, Mike Spreitzer mspre...@us.ibm.com wrote:
 Why is matrix multiplication ill-suited for Hadoop?
 
 IMHO, a huge issue here is the JVM's inability to fully support cpu vendor
 specific SIMD instructions and, by extension, optimized BLAS routines.
 Running a large MM task using intel's MKL rather than relying on generic
 compiler optimization is orders of magnitude faster on a single multicore
 processor. I see almost no way that Hadoop could win such a CPU intensive
 task against an mpi cluster with even a tenth of the nodes running with a
 decently tuned BLAS library. Racing even against a single CPU might be
 difficult, given the i/o overhead.
 
 Still, it's a reasonably common problem and we shouldn't murder the good
 in
 favor of the best. I'm certain a MM/LinAlg Hadoop library with even
 mediocre performance, wrt C, would get used.
 
 --
 Mike Davis
 
 The information and any attached documents contained in this message
 may be confidential and/or legally privileged.  The message is
 intended solely for the addressee(s).  If you are not the intended
 recipient, you are hereby notified that any use, dissemination, or
 reproduction is strictly prohibited and may be unlawful.  If you are
 not the intended recipient, please contact the sender immediately by
 return e-mail and destroy all copies of the original message.
 


Re: Matrix multiplication in Hadoop

2011-11-18 Thread Michel Segel
Is Hadoop the best tool for doing large matrix math?
Sure you can do it, but, aren't there better tools for these types of problems?


Sent from a remote device. Please excuse any typos...

Mike Segel

On Nov 18, 2011, at 10:59 AM, Mike Spreitzer mspre...@us.ibm.com wrote:

 Who is doing multiplication of large dense matrices using Hadoop?  What is 
 a good way to do that computation using Hadoop?
 
 Thanks,
 Mike


Re: Data locality for a custom input format

2011-11-12 Thread Michel Segel
There's this article on InfoQ that deals with this issue... ;-)

http://www.infoq.com/articles/HadoopInputFormat

Sent from a remote device. Please excuse any typos...

Mike Segel

On Nov 12, 2011, at 7:51 AM, Harsh J ha...@cloudera.com wrote:

 Tharindu,
 
 InputSplit#getLocations()  i.e.,
 http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapreduce/InputSplit.html#getLocations()
 is used to decide locality of a task. You need your custom InputFormat
 to prepare the right array of these objects. The # of objects == # of
 map tasks, and the locations array gets used by the scheduler for
 local assignment.
 
 For a FileSplit preparation, this is as easy as passing the block
  locations obtained from the NameNode. For other types of splits, you
  need to fill them in yourself.
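
 A bare-bones sketch of such a split (illustrative; real host lookup and error handling omitted):

     import java.io.DataInput;
     import java.io.DataOutput;
     import java.io.IOException;
     import org.apache.hadoop.io.Writable;
     import org.apache.hadoop.mapreduce.InputSplit;

     // A custom split that simply carries the hostnames where its data lives;
     // the scheduler reads them via getLocations() for local task assignment.
     public class MySplit extends InputSplit implements Writable {
         private String[] hosts = new String[0];   // e.g. taken from block locations
         private long length;

         public MySplit() {}                       // required for deserialization

         public MySplit(long length, String[] hosts) {
             this.length = length;
             this.hosts = hosts;
         }

         @Override public long getLength() { return length; }

         @Override public String[] getLocations() { return hosts; }  // drives locality

         @Override public void write(DataOutput out) throws IOException {
             out.writeLong(length);    // locations are only needed client-side, so
         }                             // (like FileSplit) they are not serialized

         @Override public void readFields(DataInput in) throws IOException {
             length = in.readLong();
             hosts = new String[0];
         }
     }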
 
 On Sat, Nov 12, 2011 at 7:12 PM, Tharindu Mathew mcclou...@gmail.com wrote:
 Hi hadoop devs,
 
 I'm implementing a custom input format and want to understand how to make
 use of data locality.
 
 AFAIU, only file input format makes use of data locality since the job
 tracker picks data locality based on the block location defined in the file
 input split.
 
 So, the job tracker code is partly responsible for this. So providing data
 locality for a custom input format would be to either either extend file
 input format or modify job tracker code (if that makes sense even).
 
 Is my understanding correct?
 
 --
 Regards,
 
 Tharindu
 
 blog: http://mackiemathew.com/
 
 
 
 
 -- 
 Harsh J
 


Re: Map-Reduce in memory

2011-11-04 Thread Michel Segel
Hi, 
First, you have 8 physical cores. Hyper threading makes the machine think that 
it has 16. The trouble is that you really don't have 16 cores so you need to be 
a little more conservative.

You don't mention HBase, so I'm going to assume that you don't have it 
installed.
So in terms of tasks, allocate a core each to DN and TT leaving 6 cores or 12 
hyper threaded cores. This leaves a little headroom for the other linux 
processes...

Now you can split the number of remaining cores however you want.
You can even overlap a bit since you are not going to be running all of your 
reducers at the same time.
So let's say 10 mappers and 4 reducers to start.

Since you have all that memory, you can bump up your DN and TT allocations.

With respect to your tuning... You need to change them one at a time...

Sent from a remote device. Please excuse any typos...

Mike Segel
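For reference, a hedged per-job sketch of the knobs discussed further down in this thread, using the 0.20/CDH3-era property names; the TaskTracker slot maximums are daemon-side settings and are not shown here:

    import org.apache.hadoop.mapred.JobConf;

    public class InMemoryTuning {
        public static JobConf tune(JobConf conf) {
            conf.set("mapred.child.java.opts", "-Xmx6144m");          // heap is passed as a JVM flag
            conf.setInt("io.sort.mb", 2000);                          // map-side sort buffer
            conf.setInt("io.sort.factor", 200);                       // streams merged at once
            conf.setFloat("mapred.job.shuffle.merge.percent", 0.66f); // the value that unblocked the job below
            return conf;
        }
    }
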

On Nov 4, 2011, at 1:46 AM, N.N. Gesli nnge...@gmail.com wrote:

 Thank you very much for your replies.
 
 Michel, disk is 3TB (6x550GB, 50 GB from each disk is reserved for local 
 basically for mapred.local.dir). You are right on the CPU; it is 8 core but 
 shows as 16. Is that mean it can handle 16 JVMs at a time? CPU is a little 
 overloaded, but that is not a huge problem at this point.
 
 I made io.sort.factor 200 and io.sort.mb 2000. Still got the same 
 error/timeout. I played with all related conf settings one by one. At last, 
 changing mapred.job.shuffle.merge.percent from 1.0 back to 0.66 solved the 
 problem.
 
 However, the job is still taking long time. There are 84 reducers, but only 
 one of them takes a very long time. I attached the log file of that reduce 
 task. Majority of the data gets spilled to disk. Even if I set 
 mapred.child.java.opts to 6144, the reduce task log shows
 ShuffleRamManager: MemoryLimit=1503238528, MaxSingleShuffleLimit=375809632
 as if memory is 2GB (70% of 2GB=1503238528b). In the same log file later 
 there is also this line:
 INFO ExecReducer: maximum memory = 6414139392
 I am not using memory monitoring. Tasktrackers have this line in the log:
 TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is 
 disabled.
 Why is ShuffleRamManager is finding that number as if the max memory is 2GB?
 Why am I still getting that much spill even with these aggressive memory 
 settings?
 Why only one reducer taking that long?
 What else I can change to make this job processed in the memory and finish 
 faster?
 
 Thank you.
 -N.N.Gesli
 
 On Fri, Oct 28, 2011 at 2:14 AM, Michel Segel michael_se...@hotmail.com 
 wrote:
 Uhm...
 He has plenty of memory... Depending on what sort of m/r tasks... He could 
 push it.
 Didn't say how much disk...
 
  I wouldn't start that high... Try 10 mappers and 2 reducers. Granted it is a 
 bit asymmetric and you can bump up the reducers...
 
 Watch your jobs in ganglia and see what is happening...
 
 Harsh, assuming he is using intel, each core is hyper threaded so the box 
 sees this as 2x CPUs.
 8 cores looks like 16.
 
 
 Sent from a remote device. Please excuse any typos...
 
 Mike Segel
 
 On Oct 28, 2011, at 3:08 AM, Harsh J ha...@cloudera.com wrote:
 
  Hey N.N. Gesli,
 
  (Inline)
 
  On Fri, Oct 28, 2011 at 12:38 PM, N.N. Gesli nnge...@gmail.com wrote:
  Hello,
 
  We have 12 node Hadoop Cluster that is running Hadoop 0.20.2-cdh3u0. Each
  node has 8 core and 144GB RAM (don't ask). So, I want to take advantage of
  this huge RAM and run the map-reduce jobs mostly in memory with no spill, 
  if
  possible. We use Hive for most of the processes. I have set:
  mapred.tasktracker.map.tasks.maximum = 16
  mapred.tasktracker.reduce.tasks.maximum = 8
 
  This is *crazy* for an 8 core machine. Try to keep M+R slots well
  below 8 instead - You're probably CPU-thrashed in this setup once
  large number of tasks get booted.
 
  mapred.child.java.opts = 6144
 
  You can also raise io.sort.mb to 2000, and tweak io.sort.factor.
 
  The child opts raise to 6~ GB looks a bit unnecessary since most of
  your tasks work on record basis and would not care much about total
  RAM. Perhaps use all that RAM for a service like HBase which can
  leverage caching nicely!
 
  One of my Hive queries is producing 6 stage map-reduce jobs. On the third
  stage when it queries from a 200GB table, the last 14 reducers hang. I
  changed mapred.task.timeout to 0 to see if they really hang. It has been 5
  hours, so something terribly wrong in my setup. Parts of the log is below.
 
  It is probably just your slot settings. You may be massively
  over-subscribing your CPU resources with 16 map task slots + 8 reduce
  tasks slots. At worst case, it would mean 24 total JVMs competing over
  8 available physical processors. Doesn't make sense to me at least -
  Make it more like 7 M / 2 R or so :)
 
  My questions:
  * What should be my configurations to make reducers to run in the memory?
  * Why it keeps waiting for map outputs?
 
  It has to fetch map outputs to get some data to start

Re: Map-Reduce in memory

2011-10-28 Thread Michel Segel
Uhm...
He has plenty of memory... Depending on what sort of m/r tasks... He could push 
it.
Didn't say how much disk...

I wouldn't start that high... Try 10 mappers and 2 reducers. Granted it is a 
bit asymmetric and you can bump up the reducers...

Watch your jobs in ganglia and see what is happening... 

Harsh, assuming he is using intel, each core is hyper threaded so the box sees 
this as 2x CPUs.
8 cores looks like 16. 


Sent from a remote device. Please excuse any typos...

Mike Segel

On Oct 28, 2011, at 3:08 AM, Harsh J ha...@cloudera.com wrote:

 Hey N.N. Gesli,
 
 (Inline)
 
 On Fri, Oct 28, 2011 at 12:38 PM, N.N. Gesli nnge...@gmail.com wrote:
 Hello,
 
 We have 12 node Hadoop Cluster that is running Hadoop 0.20.2-cdh3u0. Each
 node has 8 core and 144GB RAM (don't ask). So, I want to take advantage of
 this huge RAM and run the map-reduce jobs mostly in memory with no spill, if
 possible. We use Hive for most of the processes. I have set:
 mapred.tasktracker.map.tasks.maximum = 16
 mapred.tasktracker.reduce.tasks.maximum = 8
 
 This is *crazy* for an 8 core machine. Try to keep M+R slots well
 below 8 instead - You're probably CPU-thrashed in this setup once
 large number of tasks get booted.
 
 mapred.child.java.opts = 6144
 
 You can also raise io.sort.mb to 2000, and tweak io.sort.factor.
 
 The child opts raise to 6~ GB looks a bit unnecessary since most of
 your tasks work on record basis and would not care much about total
 RAM. Perhaps use all that RAM for a service like HBase which can
 leverage caching nicely!
 
 One of my Hive queries is producing 6 stage map-reduce jobs. On the third
 stage when it queries from a 200GB table, the last 14 reducers hang. I
 changed mapred.task.timeout to 0 to see if they really hang. It has been 5
 hours, so something terribly wrong in my setup. Parts of the log is below.
 
 It is probably just your slot settings. You may be massively
 over-subscribing your CPU resources with 16 map task slots + 8 reduce
 tasks slots. At worst case, it would mean 24 total JVMs competing over
 8 available physical processors. Doesn't make sense to me at least -
 Make it more like 7 M / 2 R or so :)
 
 My questions:
 * What should be my configurations to make reducers to run in the memory?
 * Why it keeps waiting for map outputs?
 
 It has to fetch map outputs to get some data to start with. And it
 pulls the map outputs a few at a time - to not overload the network
 during shuffle phases of several reducers across the cluster.
 
 * What does it mean dup hosts?
 
 Duplicate hosts. Hosts it already knows about and has already
 scheduled fetch work upon.
 
 snip
 
 -- 
 Harsh J
 


Re: Job Submission schedule, one file at a time ?

2011-10-25 Thread Michel Segel
Not sure what you are attempting to do...
If you submit the directory name... You get a single m/r job to process all. 
( but it doesn't sound like that is what you want...)

You could use Oozie, or just a simple shell script that will walk down a list 
of files in the directory and then launch a Hadoop task...

Or did you want something else?


Sent from a remote device. Please excuse any typos...

Mike Segel
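A hedged sketch of the simple-script approach in plain Java (paths and job classes are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    // Walk the input folder and run one blocking job per file, so each of the
    // 50 files gets its own output directory.
    public class OneJobPerFile {
        public static void main(String[] args) throws Exception {
            Configuration base = new Configuration();
            FileSystem fs = FileSystem.get(base);
            for (FileStatus stat : fs.listStatus(new Path("/user/daniel/input"))) {
                if (stat.isDir()) continue;                    // isDir() in the 0.20 API
                JobConf conf = new JobConf(base, OneJobPerFile.class);
                conf.setJobName("per-file-" + stat.getPath().getName());
                // conf.setMapperClass(...); conf.setReducerClass(...);  // application classes
                FileInputFormat.setInputPaths(conf, stat.getPath());
                FileOutputFormat.setOutputPath(conf,
                        new Path("/user/daniel/output/" + stat.getPath().getName()));
                JobClient.runJob(conf);                        // blocks until this job finishes
            }
        }
    }
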

On Oct 25, 2011, at 3:53 PM, Daniel Yehdego dtyehd...@miners.utep.edu wrote:

 
 Hi, 
  I have a folder with 50 different files and I want to submit a Hadoop 
  MapReduce job using each file as an input. My Map/Reduce programs basically do 
  the same job for each of my files, but I want to schedule and submit a job one 
  file at a time. It's like submitting a job with one file input, waiting until the 
  job completes, and submitting the second job (second file) right after. I want to 
  have 50 different MapReduce outputs for the 50 input files. 
  Looking forward to your input. Thanks.
 Regards, 
 
 
 


Re: Question on mainframe files

2011-10-15 Thread Michel Segel
You may not want to do this...
Does the data contain any packed or zoned decimals?
If so, the dd conversion will corrupt your data.


Sent from a remote device. Please excuse any typos...

Mike Segel

On Oct 15, 2011, at 3:51 AM, SRINIVAS SURASANI vas...@gmail.com wrote:

 Hi,
 
 I'm downloading mainframe files using FTP in binary mode on to local file
 system. These files are now seen as EBCDIC. The information about these
 files is:
 (a) fixed in length (each field in a record has a fixed length).
 (b) each record is of some KB (this size is fixed for each record).
 
 Now I'm able to convert these EBCDIC files to ASCII files within the unix
 file system using the following command:
 dd if=INPUTFILENAME of=OUTPUTFILENAME conv=ascii,unblock cbs=150
 150 being  the record size.
 
 So, here I want this conversion to be done in Hadoop to leverage the use of
 parallel processing. I was wondering whether there is any record reader available
 for this kind of file, and also how to convert Packed Decimal COMP(3)
 files to ASCII.
 
 Any suggestions on how this can be done.
 
 
 Srinivas


Re: Question on mainframe files

2011-10-15 Thread Michel Segel
Ok...
Your best bet is to bite the bullet and forego using dd. What you end up doing is 
staging the files twice ... raw EBCDIC, converted ASCII, then move to HDFS...

You would be better off passing in the data definition of the record and 
convert the files on the fly... Pretty straight forward...

Not sure why you would want to do this in a m/r unless you're going to do more 
processing than a simple conversion...


Sent from a remote device. Please excuse any typos...

Mike Segel
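For illustration, a hedged sketch of the character-only part of such a conversion, assuming a record reader that already hands each map() call one fixed 150-byte record as a BytesWritable. Cp037 is the JDK's US EBCDIC codepage; packed/zoned decimal fields would still need field-by-field handling, as noted above.

    import java.io.IOException;
    import java.nio.charset.Charset;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class EbcdicToAsciiMapper
            extends Mapper<LongWritable, BytesWritable, NullWritable, Text> {

        private static final Charset EBCDIC = Charset.forName("Cp037");
        private final Text out = new Text();

        @Override
        protected void map(LongWritable key, BytesWritable record, Context context)
                throws IOException, InterruptedException {
            byte[] bytes = record.getBytes();    // the backing array may be padded,
            out.set(new String(bytes, 0, record.getLength(), EBCDIC));  // so use getLength()
            context.write(NullWritable.get(), out);
        }
    }
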

On Oct 15, 2011, at 10:30 AM, SRINIVAS SURASANI vas...@gmail.com wrote:

 Yes, we have files which are only EBCDIC and files containing  EBCDIC +
  PACKED DECIMALS. As a first step I started working on the EBCDIC-only files. The dd
  command is working fine but the intention is to do this conversion within HDFS
  to leverage parallel processing.
 
 On Sat, Oct 15, 2011 at 10:58 AM, Michel Segel 
 michael_se...@hotmail.comwrote:
 
 You may not want to do this...
 Does the data contain and packed or zoned decimals?
 If so, the dd conversion will corrupt your data.
 
 
 Sent from a remote device. Please excuse any typos...
 
 Mike Segel
 
 On Oct 15, 2011, at 3:51 AM, SRINIVAS SURASANI vas...@gmail.com wrote:
 
 Hi,
 
 I'm downloading mainframe files using FTP in binary mode on to local file
 system. These files are now seen as EBCDIC. The information about these
  files is:
  (a) fixed in length (each field in a record has a fixed length).
  (b) each record is of some KB (this size is fixed for each record).
  
  Now I'm able to convert these EBCDIC files to ASCII files within the unix
  file system using the following command:
  dd if=INPUTFILENAME of=OUTPUTFILENAME conv=ascii,unblock cbs=150
 150 being  the record size.
 
 So, here I want this conversion to be done in Hadoop to leverage the use
 of
 parallel processing. I was wondering is there any record reader available
 for this kind of files and also about how to convert Packed Decimals
 COMP(3)
 files to ASCII.
 
 Any suggestions on how this can be done.
 
 
 Srinivas
 


Re: Hadoop cluster optimization

2011-08-21 Thread Michel Segel
Avi,
First why 32 bit OS?
You have a 64 bit processor that has 4 cores hyper threaded looks like 8cpus.

With only 1.7 GB you're going to be limited on the number of slots you can 
configure. 
I'd say run ganglia but that would take resources away from you.  It sounds 
like the default parameters are a pretty good fit.


Sent from a remote device. Please excuse any typos...

Mike Segel

On Aug 21, 2011, at 6:57 AM, Avi Vaknin avivakni...@gmail.com wrote:

 Hi all !
 How are you?
 
 My name is Avi and I have been fascinated by Apache Hadoop for the last few
 months.
 I am spending the last two weeks trying to optimize my configuration files
 and environment.
 I have been going through many Hadoop's configuration properties and it
 seems that none
 of them is  making a big difference (+- 3 minutes of a total job run time).
 
  In Hadoop terms my cluster is considered extremely small (260 GB of
  text files, while every job goes through only +- 8 GB).
 I have one server acting as NameNode and JobTracker, and another 5 servers
 acting as DataNodes and TaskTreckers.
 Right now Hadoop's configurations are set to default, beside the DFS Block
 Size which is set to 256 MB since every file on my cluster takes 155 - 250
 MB.
 
 All of the above servers are exactly the same and having the following
 hardware and software:
 1.7 GB memory
 1 Intel(R) Xeon(R) CPU E5507 @ 2.27GHz
 Ubuntu Server 10.10 , 32-bit platform
 Cloudera CDH3 Manual Hadoop Installation
 (for the ones who are familiar with Amazon Web Services, I am talking about
 Small EC2 Instances/Servers)
 
 Total job run time is +-15 minutes (+-50 files/blocks/mapTasks of up to 250
 MB and 10 reduce tasks).
 
  Based on the above information, can anyone recommend a best practice
  configuration?
  Do you think that when dealing with such a small cluster, and when
  processing such a small amount of data,
  it is even possible to optimize jobs so they would run much faster? 
 
 By the way, it seems like none of the nodes are having a hardware
 performance issues (CPU/Memory) while running the job.
  That's true unless I have a bottleneck somewhere else (it seems like
  network bandwidth is not the issue).
  That issue is a little confusing because the NameNode process and the
  JobTracker process should allocate 1GB of memory each,
  which means that my hardware starting point is insufficient, and in that case
  why am I not seeing full memory utilization using the 'top' 
  command on the NameNode & JobTracker server? 
  How would you recommend measuring/monitoring different Hadoop properties to
  find out where the bottleneck is?
 
 Thanks for your help!!
 
 Avi
 
 
 


Re: Namenode Scalability

2011-08-10 Thread Michel Segel
So many questions, why stop there?

First question... What would cause the name node to have a GC issue?
Second question... You're streaming 1PB a day. Is this a single stream of data?
Are you writing this to one file before processing, or are you processing the 
data directly on the ingestion stream?

Are you also filtering the data so that you are not saving all of the data?

This sounds like a homework assignment than a real world problem.

I guess people don't race cars against trains or have two trains traveling in 
different directions anymore... :-)


Sent from a remote device. Please excuse any typos...

Mike Segel

On Aug 10, 2011, at 12:07 PM, jagaran das jagaran_...@yahoo.co.in wrote:

 To be precise, the projected data is around 1 PB.
 But the publishing rate is also around 1GBPS.
 
 Please suggest.
 
 
 
 From: jagaran das jagaran_...@yahoo.co.in
 To: common-user@hadoop.apache.org common-user@hadoop.apache.org
 Sent: Wednesday, 10 August 2011 12:58 AM
 Subject: Namenode Scalability
 
  In my current project we are planning to stream data to the Namenode (20 
  node cluster).
  Data volume would be around 1 PB per day.
  But there are applications which can publish data at 1GBPS.
 
 Few queries:
 
  1. Can a single Namenode handle such high speed writes? Or does it become 
  unresponsive when a GC cycle kicks in?
  2. Can we have multiple federated Namenodes sharing the same slaves so that 
  we can distribute the writes accordingly?
  3. Can multiple region servers of HBase help us?
 
 Please suggest how we can design the streaming part to handle such scale of 
 data. 
 
 Regards,
 Jagaran Das 


Re: Sanity check re: value of 10GbE NICs for Hadoop?

2011-06-29 Thread Michel Segel
I'm not sure which point you are trying to make.
To answer your question...

With respect to price... 10GBe is cost effective.
You have to consider 1GbE is not only your port speed but also there is going to 
be the speed of the uplink or trunk.

So if you continue to build out, you run in to bandwidth issues between racks. 
So you end up doing 1GBe ports and then higher speed by either port bonding or 
bigger bandwidth for uplinks only. These switches are more expensive than 
simple 1GBe switches, but less than full 10GBe.

Depending on vendor, number of ports, discount, you can get the switch for 
approx 10,000 and up. Think $550 to $600 a port for 10GBe.

With Sandy Bridge, you will start to see 10GBe on the mother boards.

If you're following discussion on the performance gains, you'll start to see 
the network being the bottleneck.

If you are planning to build a new cluster... You should plan on 10gbe.







Sent from a remote device. Please excuse any typos...

Mike Segel

On Jun 29, 2011, at 1:07 AM, Bharath Mundlapudi bharathw...@yahoo.com wrote:
 One could argue that its too early for 10Gb NIC in Hadoop Cluster. Certainly 
 having extra bandwidth is good but at what price?
 
 
 Please note that all the points you mentioned can work with 1Gb NICs today. 
 Unless if you can back with price/performance data. Typically, Map output is 
 compressed. If system is hitting peak network utilization, one can select 
 high compression rate algorithms at the cost of CPU.  Most of these machines 
 comes with dual NIC cards, so one could do link bonding to push more bits.
 
 
 One area may have good benefit of 10Gb NIC is High Density Systems - 24 core 
 and 3x12TB disks. This is the trend now and will continue. These systems can 
 saturate the 1Gb NICs. 
 
 
 -Bharath
 
 
 
 
 From: Saqib Jang -- Margalla Communications saq...@margallacomm.com
 To: common-user@hadoop.apache.org
 Sent: Tuesday, June 28, 2011 10:16 AM
 Subject: Sanity check re: value of 10GbE NICs for Hadoop?
 
 Folks,
 
 I've been digging into the potential benefits of using 
 
 10 Gigabit Ethernet (10GbE) NIC server connections for
 
 Hadoop and wanted to run what I've come up with
 
 through initial research by the list for 'sanity check'
 
 feedback. I'd very much appreciate your input on
 
 the importance (or lack of it) of the following potential benefits of
 
 10GbE server connectivity as well as other thoughts regarding
 
 10GbE and Hadoop (My interest is specifically in the value
 
 of 10GbE server connections and 10GbE switching infrastructure, 
 
 over scenarios such as bonded 1GbE server connections with 
 
 10GbE switching).
 
 
 
 1.   HDFS Data Loading. The higher throughput enabled by 10GbE
 
 server and switching infrastructure allows faster processing and 
 
 distribution of data.
 
 2.   Hadoop Cluster Scalability. High-performance for initial data
 processing
 
 and distribution directly impacts the degree of parallelism or scalability
 supported
 
 by the cluster.
 
 3.   HDFS Replication. Higher speed server connections allows faster
 file replication.
 
 4.   Map/Reduce Shuffle Phase. Improved end-to-end throughput and
 latency directly impact the 
 
 shuffle phase of a data set reduction especially for tasks that are at the
 document level 
 
 (including large documents) and lots of metadata generated by those
 documents as well as video analytics and images.
 
  5.   Data Reporting. 10GbE server network performance can 
 
 improve data reporting performance, especially if the Hadoop cluster is
 running 
 
 multiple data reductions. 
 
 6.   Support of Cluster File Systems.  With 10 GbE NICs, Hadoop could be
 reorganized 
 
 to use a cluster or network file system. This would allow Hadoop even with
 its Java implementation 
 
 to have higher performance I/O and not have to be so concerned with disk
 drive density in the same server.
 
 7.   Others?
 
 
 
 
 
 thanks,
 
 Saqib
 
 
 
 Saqib Jang
 
 Principal/Founder
 
 Margalla Communications, Inc.
 
 1339 Portola Road, Woodside, CA 94062
 
 (650) 274 8745
 
 www.margallacomm.com


Re: Using own InputSplit

2011-05-29 Thread Michel Segel
You can add that sometimes the input file is too small and you don't get the 
desired parallelism. 

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 27, 2011, at 12:25 PM, Harsh J ha...@cloudera.com wrote:

 Mohit,
 
 On Fri, May 27, 2011 at 10:44 PM, Mohit Anchlia mohitanch...@gmail.com 
 wrote:
 Actually this link confused me
 
 http://hadoop.apache.org/common/docs/current/mapred_tutorial.html#Job+Input
 
 Clearly, logical splits based on input-size is insufficient for many
 applications since record boundaries must be respected. In such cases,
 the application should implement a RecordReader, who is responsible
 for respecting record-boundaries and presents a record-oriented view
 of the logical InputSplit to the individual task.
 
 But it looks like application doesn't need to do that since it's done
 default? Or am I misinterpreting this entirely?
 
 For any type of InputFormat Hadoop provides along with itself, it
 should already handle this for you (Text Files (say, \n-ended),
 Sequence Files, Avro Datafiles). If you have a custom file format that
 defines its own record delimiter character(s); you would surely need
 to write your own InputFormat that splits across properly (the wiki
 still helps on how to manage the reads across the first split and the
 subsequents).
 
 -- 
 Harsh J
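
A small sketch of that boundary rule (delimiter and stream handling are simplified): every reader except the first skips ahead to the first record boundary at or after its split start, and a reader is allowed to run past its split end to finish the record it started.

    import java.io.IOException;
    import org.apache.hadoop.fs.FSDataInputStream;

    public class SplitBoundary {
        // 'delimiter' is whatever byte ends a record in the custom format.
        public static long skipPartialFirstRecord(FSDataInputStream in, long splitStart,
                                                  byte delimiter) throws IOException {
            if (splitStart == 0) {
                return 0;                     // the first split owns its first record
            }
            in.seek(splitStart);
            int b;
            while ((b = in.read()) != -1) {   // the previous reader will finish this record
                if ((byte) b == delimiter) {
                    break;
                }
            }
            return in.getPos();               // first byte this reader actually owns
        }
    }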
 


Re: Reducer granularity and starvation

2011-05-19 Thread Michel Segel
Fair scheduler won't help unless you set it to allow preemptive executions 
which may not be a good thing...

Fair scheduler will wait until the current task completes before assigning a 
new task to the open slot.  So if you have a long running job... You're SOL.

A combiner will definitely help but you will still have the issue of long 
running jobs. You could put your job in a queue that limits the number of 
slots... But then you will definitely increase the time to run your job.

If you could suspend a task... But that's a non-trivial solution...

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 18, 2011, at 5:04 PM, W.P. McNeill bill...@gmail.com wrote:

 I'm using fair scheduler and JVM reuse.  It is just plain a big job.
 
 I'm not using a combiner right now, but that's something to look at.
 
 What about bumping the mapred.reduce.tasks up to some huge number?  I think
 that shouldn't make a difference, but I'm hearing conflicting information on
 this.


Re: Suggestions for swapping issue

2011-05-11 Thread Michel Segel
You have to do the math...
If you have 2gb per mapper, and run 10 mappers per node... That means 20gb of 
memory.
Then you have TT and DN running which also take memory... 

What did you set as the number of mappers/reducers per node?

What do you see in ganglia or when you run top?

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 11, 2011, at 12:31 PM, Adi adi.pan...@gmail.com wrote:

 Hello Hadoop Gurus,
 We are running a 4-node cluster. We just upgraded the RAM to 48 GB. We have
 allocated around 33-34 GB per node for hadoop processes. Leaving the rest of
 the 14-15 GB memory for OS and as buffer. There are no other processes
 running on these nodes.
 Most of the lighter jobs run successfully but one big job is de-stabilizing
 the cluster. One node starts swapping and runs out of swap space and goes
 offline. We tracked the processes on that node and noticed that it ends up
 with more than expected hadoop-java processes.
 The other 3 nodes were running 10 or 11 processes and this node ends up with
 36. After killing the job we find these processes still show up and we have
 to kill them manually.
 We have tried reducing the swappiness to 6 but saw the same results. It also
 looks like hadoop stays well within the memory limits allocated and still
 starts swapping.
 
 Some other suggestions we have seen are:
 1) Increase swap size. Current size is 6 GB. The most quoted size is 'tons
  of swap' but not sure how much it translates to in numbers. Should it be 16
 or 24 GB
 2) Increase overcommit ratio. Not sure if this helps as a few blog comments
 mentioned it didn't help
 
 Any other hadoop or linux config suggestions are welcome.
 
 Thanks.
 
 -Adi


Re: Cluster hardware question

2011-04-26 Thread Michel Segel
Hi,
Actually if you have 2 4-core Xeon CPUs... you will become I/O bound with 
4 drives.
The rule of thumb tends to be 2 disks per core so you would want 16 drives per 
node... At least in theory.

24 1TB drives would be interesting, but I'm not sure what sort of problems you 
could expect to encounter until you had to expand the cluster...


Sent from a remote device. Please excuse any typos...

Mike Segel

On Apr 26, 2011, at 8:55 AM, Xiaobo Gu guxiaobo1...@gmail.com wrote:

 Hi,
 
  People say a balanced server configuration is as follows:
 
 2 4 Core CPU, 24G RAM, 4 1TB SATA Disks
 
  But we are used to using storage servers with 24 1TB SATA disks;
  we are wondering whether Hadoop will be CPU bound if this kind of server
  is used. Does anybody have experience with Hadoop running on servers
 with so many disks.
 
 Regards,
 
 Xiaobo G