Re: Job counters limit exceeded exception

2013-01-02 Thread Alexander Alten-Lorenz
Hi,

This happens when operators are used in queries (Hive operators). Hive creates
4 counters per operator, up to a maximum of 1000, plus a few additional counters
for file reads/writes, partitions and tables. Hence the number of counters
required depends on the query.

Running EXPLAIN EXTENDED on the query and piping its output through
grep -i operator | wc -l prints the number of operators used. Use this value
to tweak the MR settings carefully.
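
For example, something along these lines (the query here is only a placeholder,
and the grep pattern may need tweaking for your Hive version, since operators
appear in the plan as lines like 'Select Operator' or 'Group By Operator'):

  hive -e "EXPLAIN EXTENDED SELECT dt, count(*) FROM my_table GROUP BY dt" > plan.txt
  grep -i operator plan.txt | wc -l

The number printed is roughly the operator count to plug into the rule of thumb below.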

Praveen has a good explanation about counters online:
http://www.thecloudavenue.com/2011/12/limiting-usage-counters-in-hadoop.html

Rule of thumb for Hive:
count of operators * 4 + n (n for file ops and other stuff).
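
For example (numbers purely illustrative): a plan with 28 operators needs roughly
28 * 4 = 112 counters, plus ~10 more for file reads/writes and framework counters,
i.e. about 122 in total. That already exceeds the default limit of 120 shown in the
error below, so mapreduce.job.counters.limit would need to be raised to, say, 150.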

cheers,
 Alex 


On Jan 2, 2013, at 10:35 AM, Krishna Rao krishnanj...@gmail.com wrote:

 A particular query that I run fails with the following error:
 
 ***
 Job 18: Map: 2  Reduce: 1   Cumulative CPU: 3.67 sec   HDFS Read: 0 HDFS Write: 0 SUCCESS
 Exception in thread main
 org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 121 max=120
 ...
 ***
 
 Googling suggests that I should increase mapreduce.job.counters.limit,
 and that the number of counters a job uses affects the memory used by the
 JobTracker, so I shouldn't increase this number too high.
 
 Is there a rule of thumb for what this number should be as a function of
 JobTracker memory? That is, should I be cautious and increase it by 5 at a
 time, or could I just double it?
 
 Cheers,
 
 Krishna

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF



Regarding Backup of MySQL if we have that as Metadata store for HIVE

2013-01-02 Thread Ramasubramanian Narayanan
Hi,

If we have MySQL as the metadata store for Hive,

1) Do we need to back it up every day?

2) Is there any automatic way in Hadoop to keep a replicated copy of MySQL
as well?

3) Is there any way to rebuild the metadata if we lose the information in MySQL?


regards,
Rams


Re: Regarding Backup of MySQL if we have that as Metadata store for HIVE

2013-01-02 Thread Nitin Pawar
If you are using HCatalog for the metadata store, then you can just set up a
daily cron job to take a data backup with mysqldump.

From what I understand, keep the metadata backup setup separate from your
Hadoop operations, as you don't really need to complicate things.

Just write a mysqldump command and, assuming you don't have a huge metadata
store, run it hourly or twice a day and store the output in files; you can
then easily restore from the dump files.
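
A minimal sketch of that kind of setup (the database name, user, password and
paths here are assumptions, adjust them to your metastore):

  # crontab entry: dump the metastore DB twice a day, at 00:00 and 12:00
  0 0,12 * * * mysqldump -u hive -pHIVEPASS --single-transaction metastore > /backups/hive_metastore_$(date +\%F_\%H).sql

  # restoring from one of the dump files later:
  mysql -u hive -pHIVEPASS metastore < /backups/hive_metastore_2013-01-02_00.sql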

Others may have better solutions, but this is how I did it some time back.


On Wed, Jan 2, 2013 at 7:51 PM, Ramasubramanian Narayanan 
ramasubramanian.naraya...@gmail.com wrote:

 Hi,

 If we have MySQL as the metadata store for Hive,

 1) Do we need to back it up every day?

 2) Is there any automatic way in Hadoop to keep a replicated copy of MySQL
 as well?

 3) Is there any way to rebuild the metadata if we lose the information in MySQL?


 regards,
 Rams




-- 
Nitin Pawar