Re: Enabling fair scheduler using Bootstrap is failing

2012-10-30 Thread Chunky Gupta
Hi,

Today I enabled logging while creating a new job. The errors I see in the log
files are:

ERROR org.apache.hadoop.security.UserGroupInformation (IPC Server handler
12 on 9000): PriviledgedActionException as:hadoop
cause:java.io.IOException: File
/mnt/var/lib/hadoop/tmp/mapred/system/jobtracker.info could only be
replicated to 0 nodes, instead of 1

and

2012-10-30 06:14:26,527 WARN org.apache.hadoop.hdfs.DFSClient (Thread-18):
Error Recovery for block null bad datanode[0] nodes == null
2012-10-30 06:14:26,527 WARN org.apache.hadoop.hdfs.DFSClient (Thread-18):
Could not get block locations. Source file
/mnt/var/lib/hadoop/tmp/mapred/system/jobtracker.info - Aborting...
2012-10-30 06:14:26,527 WARN org.apache.hadoop.mapred.JobTracker (main):
Writing to file
hdfs://10.92.235.20:9000/mnt/var/lib/hadoop/tmp/mapred/system/jobtracker.info failed!
2012-10-30 06:14:26,528 WARN org.apache.hadoop.mapred.JobTracker (main):
FileSystem is not ready yet!
2012-10-30 06:14:26,534 WARN org.apache.hadoop.mapred.JobTracker (main):
Failed to initialize recovery manager.

Also, from the default mapred-site.xml that I upload at bootstrap, I am
removing this configuration:
<property>
<name>mapred.job.tracker</name>
<value>ip-10-116-159-127.ec2.internal:9001</value>
</property>

Please suggest any solution for this.

Thanks,
Chunky.


On Mon, Oct 29, 2012 at 7:10 PM, Chunky Gupta chunky.gu...@vizury.com wrote:

 Hi,

 I also tried this in the optional arguments: --site-config-file
 s3://viz-emr-hive/config/mapred-site.xml -m
 mapred.jobtracker.taskScheduler=org.apache.hadoop.mapred.FairScheduler

 This time it went to the Bootstrapping state and then failed.

 Let me know what changes I can make to get it working.

 Thanks,
 Chunky.


 On Mon, Oct 29, 2012 at 6:37 PM, Chunky Gupta chunky.gu...@vizury.com wrote:

 Hi,

 I am trying to enable the fair scheduler on my EMR cluster at bootstrap. The
 steps I am following are:

 1. Creating a job flow from the AWS console via Create New Job Flow, with
 Job Type set to Hive program.
 2. Selecting Start an Interactive Hive Session.
 3. Selecting the master and core instance groups and the Amazon EC2 key pair.
 4. Selecting Configure your Bootstrap Actions, with the action type
 Configure Hadoop.
 5. Uploading a mapred-site.xml to S3 with these parameters for enabling
 the fair scheduler:
   <property>
   <name>mapred.fairscheduler.allocation.file</name>
   <value>conf/pools.xml</value>
   </property>
   <property>
   <name>mapred.jobtracker.taskScheduler</name>
   <value>org.apache.hadoop.mapred.FairScheduler</value>
   </property>
   <property>
   <name>mapred.fairscheduler.assignmultiple</name>
   <value>true</value>
   </property>
   <property>
   <name>mapred.fairscheduler.eventlog.enabled</name>
   <value>false</value>
   </property>

 6. In the optional arguments I tried --site-mapred-site,s3://XXX(where I
 uploaded)/mapred-site.xml to load this XML file onto my cluster.

 Finally, the creation of the machine fails with the error "On the master
 instance (xxx), bootstrap action 1 returned a non-zero return code."

 I think I am passing something wrong in the optional arguments. Please help
 me with this.

 Thanks,
 Chunky.
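
Editorial sketch: the property blocks listed in step 5 can be generated and
checked for well-formedness before the file is uploaded. This is a minimal
Python sketch, not a diagnosis of the bootstrap failure. The helper names
build_site_xml and check_site_xml are hypothetical, and wrapping the
properties in a <configuration> root element is the usual Hadoop site-file
layout; the message does not show whether the uploaded file had one.

```python
import xml.etree.ElementTree as ET

# Property names and values copied from step 5 of the message above.
FAIR_SCHEDULER_PROPS = {
    "mapred.fairscheduler.allocation.file": "conf/pools.xml",
    "mapred.jobtracker.taskScheduler": "org.apache.hadoop.mapred.FairScheduler",
    "mapred.fairscheduler.assignmultiple": "true",
    "mapred.fairscheduler.eventlog.enabled": "false",
}


def build_site_xml(props):
    """Return a site file as a string (hypothetical helper)."""
    # Assumption: a <configuration> root wraps the <property> blocks.
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def check_site_xml(text):
    """Parse the text back into {name: value}; raises ParseError if malformed."""
    root = ET.fromstring(text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.findall("property")}


if __name__ == "__main__":
    xml_text = build_site_xml(FAIR_SCHEDULER_PROPS)
    # Round-trip check: what we wrote is what a parser reads back.
    assert check_site_xml(xml_text) == FAIR_SCHEDULER_PROPS
    print(xml_text)
```

Running check_site_xml on the file that was actually uploaded would at least
rule out malformed XML as a cause; it says nothing about whether the
bootstrap arguments themselves are correct.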





Re: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

2012-10-30 Thread Robin Verlangen
Thank you for pointing me to the /tmp/root/hive.log, forgot about that one.
The problem was caused by:

*Caused by: java.sql.SQLException: Binary logging not possible. Message:
Transaction level 'READ-COMMITTED' in InnoDB is not safe for binlog mode
'STATEMENT'*

This happened because the MySQL binlog had been enabled for replication. I
didn't know that could affect Hive. I have just disabled it for now. Thank you
all for your time.

Best regards,

Robin Verlangen
*Software engineer*
W http://www.robinverlangen.nl
E ro...@us2.nl

http://goo.gl/Lt7BC

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.
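
Editorial sketch: the exception above names a specific pair of settings, the
READ-COMMITTED isolation level together with the STATEMENT binlog format. The
short Python sketch below encodes that rule so a given pair of settings can be
checked. The function name binlog_conflict is hypothetical, and the inputs are
assumed to be strings as the MySQL server reports them (for example the value
of its binlog_format variable).

```python
# Isolation levels that InnoDB will not log in STATEMENT binlog format.
UNSAFE_WITH_STATEMENT = {"READ-COMMITTED", "READ-UNCOMMITTED"}


def binlog_conflict(tx_isolation, binlog_format):
    """Return True if this pair triggers the 'not safe for binlog mode' error.

    Hypothetical helper: it only mirrors the rule stated in the exception
    and does not talk to a MySQL server.
    """
    # Normalise spelling variants such as "read committed" or "READ_COMMITTED".
    level = tx_isolation.strip().upper().replace("_", "-").replace(" ", "-")
    fmt = binlog_format.strip().upper()
    return fmt == "STATEMENT" and level in UNSAFE_WITH_STATEMENT


if __name__ == "__main__":
    # The combination reported in hive.log.
    print(binlog_conflict("READ-COMMITTED", "STATEMENT"))
    # Row-based logging does not hit this restriction.
    print(binlog_conflict("READ-COMMITTED", "ROW"))
```

If replication is needed again later, switching the binlog format to ROW or
MIXED may be an alternative to disabling the binlog altogether; that was not
tried in this thread.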



2012/10/30 Shreepadma Venugopalan shreepa...@cloudera.com

 Hi Robin,

 Can you attach the execution logs? The logs contain the exception stack.

 Thanks,
 Shreepadma


 On Mon, Oct 29, 2012 at 10:19 AM, Chen Song chen.song...@gmail.com wrote:

 Is there anything interesting on hive.log?

 If you are running Hive shell with root user, by default, the hive.log
 should be in /tmp/hive.log.

 BTW, are you using MySQL as your metastore?

 On Mon, Oct 29, 2012 at 12:22 PM, Robin Verlangen ro...@us2.nl wrote:

 I have to add, while trying to do other operations on hive by hand, it
 also fails.

 For example:
 *hive> drop table my_other_table;*
 *FAILED: Error in metadata:
 java.lang.reflect.UndeclaredThrowableException*
 *FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.DDLTask*


 Best regards,

 Robin Verlangen
 *Software engineer*
 W http://www.robinverlangen.nl
 E ro...@us2.nl

 http://goo.gl/Lt7BC




 2012/10/29 Robin Verlangen ro...@us2.nl

 Hi Chen,

 The user that ran this job is root and all hdfs folders are also owned
 by root.

 Best regards,

 Robin Verlangen
 *Software engineer*
 W http://www.robinverlangen.nl
 E ro...@us2.nl

 http://goo.gl/Lt7BC




 2012/10/29 Chen Song chen.song...@gmail.com

 This looks like a permission issue to me.

 Can you check if the user (which ran the hive query) has write
 permission on */user/hive/warehouse/mydatabase.db/mytable?*

 On Mon, Oct 29, 2012 at 8:38 AM, Robin Verlangen ro...@us2.nl wrote:

 Hi there,

 As of today our Hive jobs suddenly fail (nothing has actually changed).
 The end looks like this:

 *MapReduce Total cumulative CPU time: 38 minutes 31 seconds 260 msec*
 *Ended Job = job_201210291304_0015*
 *Loading data to table mydatabase.mytable*
 *rmr: DEPRECATED: Please use 'rm -r' instead.*
 *Deleted /user/hive/warehouse/mydatabase.db/mytable*
 *Failed with exception null*
 *FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.MoveTask*
 *MapReduce Jobs Launched:*
 *Job 0: Map: 196  Reduce: 3   Cumulative CPU: 2311.26 sec   HDFS
 Read: 0 HDFS Write: 0 SUCCESS*
 *Total MapReduce CPU Time Spent: 38 minutes 31 seconds 260 msec*

 Does anyone have a clue how to resolve this?

 Best regards,

 Robin Verlangen
 *Software engineer*
 W http://www.robinverlangen.nl
 E ro...@us2.nl

 http://goo.gl/Lt7BC





 --
 Chen Song







 --
 Chen Song






Re: Help in hive query

2012-10-30 Thread John Meagher
The WHERE part in the approvals can be moved up to be an IF in the SELECT...


SELECT client_id,receive_dd,receive_hh,
  receive_hh+1,
  COUNT(1) AS transaction_count,
  SUM( IF ( response=00, 1, 0) ) AS approval_count,
  SUM( IF ( response=00, 1, 0) ) / COUNT(1) * 100 AS percent
FROM sale_test
GROUP BY client_id,receive_dd,receive_hh,receive_hh+1;
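
Editorial sketch: the percentages this query should produce can be
cross-checked by hand against the sample counts in the original post quoted
below. The Python snippet does only that arithmetic; the helper name
approval_percent is hypothetical, and the counts are copied from the two
result sets in the question.

```python
# (client_id, receive_dd, receive_hh) -> counts, copied from the original post.
TOTALS = {
    ("GETREGREOS", 23, 16): 5969,
    ("GETREGREOS", 23, 21): 2602,
    ("GETREGREOS", 24, 3): 114,
}
APPROVALS = {
    ("GETREGREOS", 23, 16): 5775,
    ("GETREGREOS", 23, 21): 2515,
    ("GETREGREOS", 24, 3): 103,
}


def approval_percent(approved, total):
    """(approvals in the hour / transactions in the hour) * 100."""
    return approved / total * 100


if __name__ == "__main__":
    for key, total in TOTALS.items():
        pct = approval_percent(APPROVALS[key], total)
        print(key, round(pct, 2))
```

For the three sample hours this gives roughly 96.75, 96.66 and 90.35, which
are the values to expect in the query's percent column for those rows.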



On Tue, Oct 30, 2012 at 1:51 AM, dyuti a hadoop.hiv...@gmail.com wrote:
 Hi All,

 I want to compute (number of approvals in an hour / number of transactions
 in that hour) * 100.

 //COUNT(1) AS cnt  gives total transactions in an hour
 SELECT client_id,receive_dd,receive_hh,receive_hh+1,COUNT(1) AS cnt FROM
 sale_test group by client_id,receive_dd,receive_hh,receive_hh+1;

 GETREGREOS  23  16  17  5969
 GETREGREOS  23  21  22  2602
 GETREGREOS  24  3   4   114

 //Approved transactions where response=00
 SELECT client_id,receive_dd,receive_hh,receive_hh+1,COUNT(1) AS cnt FROM
 sale_test where response=00 group by
 client_id,receive_dd,receive_hh,receive_hh+1;
 GETREGREOS  23  16  17  5775
 GETREGREOS  23  21  22  2515
 GETREGREOS  24  3   4   103


 I want to compute 100 * (5775/5969), 100 * (2515/2602), 100 * (103/114),
 and likewise for all other clients for each hour, i.e. (number of approvals
 in an hour / number of transactions in that hour) * 100.

 Please help me with how to achieve this in Hive.


 Thanks & Regards,
 dti