Add keys to column family in HBase using Python

2015-04-15 Thread Manoj Venkatesh
Dear Hadoop experts,

I have a Hadoop cluster with Hive, HBase, and other Hadoop components installed. I am
currently exploring ways to automate a data-migration process from Hive to HBase in
which new columns of data are added every so often. I was able to create an HBase
table from Hive and load data into it. Along the same lines, I tried to add columns
to the HBase table (from Hive) using the ALTER TABLE syntax, and got the error
message: "ALTER TABLE cannot be used for a non-native table temp_testing".

As an alternative, I am also trying to do this programmatically in Python. I have
explored the libraries HappyBase (https://happybase.readthedocs.org/en/latest/index.html)
and starbase (http://pythonhosted.org/starbase/). These libraries support creating,
deleting, and other operations, but none of them provides an option to add a key to a
column family. Does anybody know of a better way of achieving this with Python,
whether through other libraries or by other means?
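One point worth noting: HBase is schemaless below the column-family level, so adding a new column qualifier ("key") to an existing column family usually needs no ALTER at all; writing a cell under that family creates the column implicitly, and HappyBase's Table.put() can do exactly that. A minimal sketch (the host, table, and helper function here are my own placeholders, not part of HappyBase):

```python
# Sketch: in HBase, a new column qualifier under an existing family is
# created simply by writing a cell -- no schema change required.

def encode_cells(family, cells):
    """Build the {b'family:qualifier': b'value'} mapping that
    happybase's Table.put() expects."""
    return {
        f"{family}:{qualifier}".encode(): str(value).encode()
        for qualifier, value in cells.items()
    }

# Against a live cluster this would be (untested placeholder names):
#   import happybase
#   conn = happybase.Connection('hbase-host')
#   conn.table('temp_testing').put(b'row1', encode_cells('cf', {'new_col': 42}))

print(encode_cells('cf', {'new_col': 42}))
```

If a genuinely new column *family* is needed (a real schema change), that is an admin operation; to my knowledge it can be done from the HBase shell with `alter`, which could be scripted if no Python library exposes it.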

Thanks in advance,
Manoj

The information transmitted in this email is intended only for the person or 
entity to which it is addressed, and may contain material confidential to Xoom 
Corporation, and/or its subsidiary, buyindiaonline.com Inc. Any review, 
retransmission, dissemination or other use of, or taking of any action in 
reliance upon, this information by persons or entities other than the intended 
recipient(s) is prohibited. If you received this email in error, please contact 
the sender and delete the material from your files.


Cluster utilization when using fair scheduler

2015-04-15 Thread Thon, Ingo
Hi,

I'm using the fair scheduler for YARN. I have not specified any pools, so the
fair-scheduler.xml is basically empty. However, only one third of the cluster is
utilized.

On the scheduler page I see a single queue, root, which is shown as 33.3% used.
This 33.3% is independent of the number of jobs. Currently I have a single job
running, and the following is reported for this application pipeline:
Used Resources: memory:196608, vCores:48
Num Active Applications:21
Num Pending Applications:   0
Min Resources:  memory:0, vCores:0
Max Resources:  memory:589824, vCores:48
Fair Share: memory:45372, vCores:0

The number of vCores appears to be too low as well; the number of cores should be
6 x 16 = 96.
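One arithmetic sanity check on the figures above (my reading, not a confirmed diagnosis): the 33.3% matches used memory over the queue's max memory exactly, while the vCores are fully consumed, which suggests the queue is vCore-capped rather than memory-capped — worth checking yarn.nodemanager.resource.cpu-vcores on each node.

```python
# Sanity check on the scheduler-page numbers (assumption: "% used" is
# used memory over the queue's max memory).
used_mem, max_mem = 196608, 589824        # MB, from the scheduler page
used_vcores, max_vcores = 48, 48          # from the scheduler page

mem_pct = 100 * used_mem / max_mem
print(f"memory used: {mem_pct:.1f}%")              # matches the UI's 33.3%
print(f"vcores used: {used_vcores}/{max_vcores}")  # fully consumed
```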

Any suggestions what to check?

Cheers Ingo



Restriction of disk space on HDFS

2015-04-15 Thread Vijayadarshan REDDY
Hi Guys

Quick question - Using the fair scheduler we can restrict access to map tasks,
reduce tasks, and overall system resources for each queue. Using the same
mechanism, we don't see any parameter to allocate disk usage by queue.

Can you please let us know if there is a way to do this in CDH5?


Thanks and Regards,
Vijayadarshan Reddy
TO-Core BI Technology
DBS Bank Ltd
Email : vijayadars...@dbs.com
Mobile : +65 83157090

CONFIDENTIAL NOTE:
The information contained in this email is intended only for the use of the 
individual or entity named above and may contain information that is 
privileged, confidential and exempt from disclosure under applicable law. If 
the reader of this message is not the intended recipient, you are hereby 
notified that any dissemination, distribution or copying of this communication 
is strictly prohibited. If you have received this message in error, please 
immediately notify the sender and delete the mail. Thank you.


RE: Mapreduce job got stuck

2015-04-15 Thread Rohith Sharma K S
Hi Vandana

From the configurations, it looks like none of the NodeManagers are registered
with the RM because of an issue in the "yarn.resourcemanager.resource- tracker.address"
configuration. Maybe you can confirm whether any NMs are registered with the RM.

In the config below, there is a space after "resource-", but "resource-tracker"
is a single token with no space. Check again after removing the space:
<name>yarn.resourcemanager.resource- tracker.address</name>

Similarly, the same issue appears in "yarn.nodemanager.aux- services.mapreduce.shuffle.class",
where there is a space after "aux-".
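To catch this class of typo mechanically, here is a small stdlib sketch (my own helper, not a Hadoop tool) that flags property names containing internal whitespace in a *-site.xml:

```python
# Sketch: scan a Hadoop-style *-site.xml for property names containing
# stray whitespace, like the one discussed above. Pure stdlib.
import xml.etree.ElementTree as ET

def names_with_spaces(xml_text):
    """Return property names that contain internal whitespace."""
    root = ET.fromstring(xml_text)
    return [
        prop.findtext("name")
        for prop in root.iter("property")
        if any(ch.isspace() for ch in (prop.findtext("name") or "").strip())
    ]

sample = """<configuration>
  <property>
    <name>yarn.resourcemanager.resource- tracker.address</name>
    <value>kirti:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>kirti:8040</value>
  </property>
</configuration>"""

print(names_with_spaces(sample))  # flags only the broken name
```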

Hope this helps you resolve the issue.

Thanks & Regards
Rohith Sharma K S

From: Vandana kumari [mailto:kvandana1...@gmail.com]
Sent: 15 April 2015 15:33
To: user@hadoop.apache.org
Subject: Mapreduce job got stuck

I set up a 3-node Hadoop cluster on CentOS 6.5, but the NodeManager is not
running on the master while it is running on the slave nodes. Also, when I submit
a job, it gets stuck; the same job runs fine on a single-node setup. I am unable
to figure out the problem. Attaching all the configuration files.
Any help will be highly appreciated.

--
Thanks and regards
  Vandana kumari


RE: Change in fair-scheduler.xml

2015-04-15 Thread Rohith Sharma K S
Hi


1 - Is there a document on what the default settings in the XML file should be
for, say, a 96 GB, 48-core system with say 4 queues?

You can refer to the doc below for configuring the fair scheduler:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html

2 - When we change the file, does the YARN service need to be bounced for the
changed values to get reflected?

YARN supports refreshing queues at runtime without restarting the ResourceManager.
This can be done with the "$HADOOP_HOME/bin/yarn rmadmin -refreshQueues" CLI command.
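For question 1 there is no single recommended default; as a starting point, a minimal allocation file for four equally weighted queues on a 96 GB / 48-core cluster might look like this (the numbers are illustrative assumptions, not tuned values):

```xml
<?xml version="1.0"?>
<allocations>
  <queue name="queue1">
    <minResources>24576 mb,12 vcores</minResources>
    <maxResources>98304 mb,48 vcores</maxResources>
    <weight>1.0</weight>
  </queue>
  <!-- queue2, queue3, queue4: analogous blocks -->
</allocations>
```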


Thanks & Regards
Rohith Sharma K S

From: Manish Maheshwari [mailto:mylogi...@gmail.com]
Sent: 15 April 2015 15:43
To: user@hadoop.apache.org
Subject: Change in fair-scheduler.xml




Re: Restriction of disk space on HDFS

2015-04-15 Thread Nitin Pawar
HDFS and job-scheduling queues are entirely different systems.

HDFS disk quotas are set at the directory level; you can then restrict that
directory's permissions to a group, which indirectly gives the group a disk quota.
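To complement this: HDFS space quotas are applied per directory with the dfsadmin CLI (`hdfs dfsadmin -setSpaceQuota <quota> <dirname>`). A sketch that only builds the command, since running it requires HDFS superuser rights on a cluster node (the path and size are hypothetical examples):

```python
# Sketch: build the dfsadmin invocation that sets a space quota on a
# per-queue directory. The command is returned, not executed.

def set_space_quota_cmd(path, quota):
    """Return the argv for `hdfs dfsadmin -setSpaceQuota <quota> <path>`."""
    return ["hdfs", "dfsadmin", "-setSpaceQuota", quota, path]

cmd = set_space_quota_cmd("/user/bi_queue", "500g")  # hypothetical path/size
print(" ".join(cmd))
# On a cluster node this could be applied with: subprocess.run(cmd, check=True)
```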

-- 
Nitin Pawar


Re: Mapreduce job got stuck

2015-04-15 Thread Vandana kumari
I have attached the NodeManager log from the master and the modified
yarn-site.xml file.


-- 
Thanks and regards
  Vandana kumari
2015-04-15 17:15:52,022 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NodeManager
STARTUP_MSG:   host = kirti/172.17.14.22
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 2.2.0
STARTUP_MSG:   classpath = 

RE: Restriction of disk space on HDFS

2015-04-15 Thread Ravi Shankar
Please refer

https://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/YarnCommands.html#resourcemanager

Best regards,

Nair

From: Vijayadarshan REDDY [mailto:vijayadars...@dbs.com]
Sent: Wednesday, April 15, 2015 6:25 AM
To: user@hadoop.apache.org
Subject: Restriction of disk space on HDFS

 




Re: Mapreduce job got stuck

2015-04-15 Thread shashwat shriparv
Please check the error logs and send them.



On Wed, Apr 15, 2015 at 3:33 PM, Vandana kumari kvandana1...@gmail.com
wrote:

 nodemanager






*Warm Regards,*
*Shashwat Shriparv*
http://bit.ly/14cHpad
http://goo.gl/rxz0z8
http://goo.gl/RKyqO8
https://www.linkedin.com/pub/shashwat-shriparv/19/214/2a9
https://twitter.com/shriparv
https://www.facebook.com/shriparv
http://google.com/+ShashwatShriparv
http://www.youtube.com/user/sShriparv/videos
shrip...@yahoo.com


Mapreduce job got stuck

2015-04-15 Thread Vandana kumari
I set up a 3-node Hadoop cluster on CentOS 6.5, but the NodeManager is not
running on the master while it is running on the slave nodes. Also, when I submit
a job, it gets stuck; the same job runs fine on a single-node setup. I am unable
to figure out the problem. Attaching all the configuration files.
Any help will be highly appreciated.

-- 
Thanks and regards
  Vandana kumari
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- core-site.xml -->
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://kirti:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-hadoopuser</value>
  </property>
</configuration>
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- hdfs-site.xml -->
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>file:///home/hadoopuser/hadoopspace/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:///home/hadoopuser/hadoopspace/hdfs/datanode</value>
  </property>
</configuration>
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- mapred-site.xml -->
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>kirti:54311</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.http.address</name>
    <value>0.0.0.0:50030</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>0.0.0.0:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>0.0.0.0:19888</value>
  </property>
</configuration>
<?xml version="1.0"?>
<!-- yarn-site.xml (as attached; note the stray spaces in two property
     names, which are discussed in this thread) -->
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux- services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource- tracker.address</name>
    <value>kirti:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>kirti:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>kirti:8040</value>
  </property>

</configuration>
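As Rohith points out in his replies in this thread, two of the property names above contain a stray space after the hyphen. A corrected fragment would look like this (same values, names joined):

```xml
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>kirti:8025</value>
</property>
```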


Change in fair-scheduler.xml

2015-04-15 Thread Manish Maheshwari
Hi, we are trying to change the fair scheduler settings.

1 - Is there a document on what the default settings in the XML file should be
for, say, a 96 GB, 48-core system with say 4 queues?

2 - When we change the file, does the YARN service need to be bounced for the
changed values to get reflected?

Thanks
Manish


Re: Mapreduce job got stuck

2015-04-15 Thread shashwat shriparv
What is your yarn.nodemanager.address?



*Warm Regards,*
*Shashwat Shriparv*




RE: Mapreduce job got stuck

2015-04-15 Thread Rohith Sharma K S
Hi,

On the master machine, the NodeManager is not running because of "Caused by:
java.net.BindException: Problem binding to [kirti:8040]", as seen in the logs.

Port 8040 is already in use; configure an available port number.
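A small way to reproduce that check outside the NodeManager (a sketch; run it on the master host itself, since 127.0.0.1 here is just a stand-in for the configured address):

```python
# Sketch: check whether the configured port can be bound, mirroring the
# BindException the NodeManager hit on kirti:8040.
import socket

def port_free(host: str, port: int) -> bool:
    """Return True if (host, port) can be bound right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError:
            return False

print(port_free("127.0.0.1", 8040))  # False would explain the BindException
```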


Thanks & Regards
Rohith Sharma K S

From: Vandana kumari [mailto:kvandana1...@gmail.com]
Sent: 15 April 2015 16:29
To: user@hadoop.apache.org; Rohith Sharma K S
Subject: Re: Mapreduce job got stuck

When I made the changes Rohith specified, my job runs, but only on the slave
nodes (amit & yashbir), not on the master node (kirti), and still no NodeManager
is running on the master node.




--
Thanks and regards
  Vandana kumari