Hi,
I'm a newbie when it comes to Spark and the Hadoop ecosystem in general. Our
team has been predominantly a Microsoft shop that uses the MS stack for most of
its BI needs. So we are talking SQL Server for storing relational data
and SQL Server Analysis Services for building MOLAP cubes for
Thanks for sharing another viewpoint, I will look into it.
But it seems we might need something more specific.
On Thu, Feb 26, 2015 at 2:43 PM, Rohith Sharma K S
rohithsharm...@huawei.com wrote:
Hi
If you are using CapacityScheduler, can you try using
DominantResourceCalculator
Hi,
Scenario: Read data from hbase table and store as csv in hdfs.
Problem: I have hdfs and hbase secured with kerberos. Both are secured with
different keytab files and with different user names.
If I do *kinit* with the hdfs keytab, it launches the mappers but is not able
to read from the hbase table.
Error:
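For illustration only, and not from the original thread: a minimal Java sketch of one
common approach to the two-keytab question above, assuming a TableMapReduceUtil-based
MapReduce job. The principal names and keytab path are placeholders; the idea is to log
in programmatically and let HBase delegation tokens be added to the job credentials
before submission.

import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.security.UserGroupInformation;

public class SecureHBaseToHdfsJob {
    public static void main(String[] args) throws Exception {
        final Configuration conf = HBaseConfiguration.create();
        UserGroupInformation.setConfiguration(conf);

        // Placeholder principal and keytab: log in explicitly instead of relying on
        // whichever ticket the last kinit left in the credential cache.
        UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
                "hbaseuser@EXAMPLE.COM", "/etc/security/keytabs/hbaseuser.keytab");

        ugi.doAs(new PrivilegedExceptionAction<Void>() {
            public Void run() throws Exception {
                Job job = Job.getInstance(conf, "hbase-to-csv");
                // Adds an HBase delegation token to the job credentials so the
                // mappers can scan the table without needing the keytab themselves.
                TableMapReduceUtil.initCredentials(job);
                // ... configure TableInputFormat, the mapper, and the HDFS output path here ...
                job.waitForCompletion(true);
                return null;
            }
        });
    }
}

Exact principals and HDFS permissions depend on your setup, so treat this only as a
starting point.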
Hello Krishna,
The exception seems to be IP specific. It might have occurred due to the
unavailability of an IP address in the system to assign. Double-check the IP
address availability and re-run the job.
Thanks,
S.RagavendraGanesh
ViSolve Hadoop Support Team
ViSolve Inc. | San Jose, California
Hi,
we occasionally run into a BindException that causes long-running jobs to
fail.
The stacktrace is below.
Any ideas what this could be caused by?
Cheers,
Krishna
Stacktrace:
379969 [Thread-980] ERROR org.apache.hadoop.hive.ql.exec.Task - Job
Submission failed with exception
Team,
I am using Ambari to install a cluster which now needs to be deleted and
re-installed.
Is there a clean way to uninstall the cluster, clean up all the binaries
from all the nodes, and do a fresh install?
There is no data on the cluster, so nothing to worry about.
Thanks in advance
Please take a look at:
https://issues.apache.org/jira/browse/AMBARI-249
The Ambari mailing list seems to be a better place to ask this question.
Cheers
On Thu, Feb 26, 2015 at 7:21 AM, Steve Edison sediso...@gmail.com wrote:
Team,
I am using Ambari to install a cluster which now needs to be
Steve,
I wrote up that howto some months ago; it might be worth testing:
http://mapredit.blogspot.de/2014/06/remove-hdp-and-ambari-completely.html
R,
Alexander
On 26 Feb 2015, at 16:21, Steve Edison sediso...@gmail.com wrote:
Hi folks,
The MEAN (MongoDB, Express, Angular.js, Node.js) tech stack is the
recent buzz in the market for building web applications. My confusion is:
why is MongoDB considered the best fit compared to other NoSQL databases
like HBase and Cassandra? If MongoDB is best suited, then why? Also, can
There are many excellent ways to build web-applications -- there is no
single correct way. Learn the strengths and weaknesses of your tools, and
apply the right tools to solve your problem, i.e., let the problem you're
solving inform your choice of tools.
And ignore folks who insist on being
Thanks Jan. I did the following:
1) Manually set the timezone of all the nodes using sudo
dpkg-reconfigure tzdata
2) Re-booted the nodes
Still having the same exception.
How can I configure NTP?
Regards,
Tariq
On Thu, Feb 26, 2015 at 5:33 PM, Jan van Bemmelen j...@tokyoeye.net wrote:
Could you check for any time differences between your servers? If so, please
install and run NTP, and retry your job.
Regards,
Jan
On 26 Feb 2015, at 17:57, tesm...@gmail.com wrote:
I am getting Unauthorized request to start container. This token is
expired.
How do I resolve it? The
Is there a way to clone a *org.apache.hadoop.mapreduce.Job* that was
created by a user?
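One possible approach, offered here only as an assumption rather than an answer from
the thread: copy the job's Configuration into a fresh Job instance, since Job itself
does not expose a public clone method. A hypothetical helper:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class JobCloner {
    // Hypothetical helper: builds a new Job backed by a copy of the original's configuration.
    public static Job cloneJob(Job original, String newJobName) throws IOException {
        // Copy the configuration so later changes do not affect the original job.
        Configuration copy = new Configuration(original.getConfiguration());
        return Job.getInstance(copy, newJobName);
    }
}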
Hi,
Impala is a product of Cloudera. You might want to request help at:
https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user
BR,
Alex
On 26 Feb 2015, at 17:15, Vitale, Tom thomas.vit...@credit-suisse.com wrote:
I
Hi Tariq,
You seem to be using Debian or Ubuntu. The documentation here will guide you
through setting up NTP:
http://www.cyberciti.biz/faq/debian-ubuntu-linux-install-ntpd/. When you
have finished these steps you can check the
Please take a look at:
http://www.tldp.org/LDP/sag/html/basic-ntp-config.html
On Thu, Feb 26, 2015 at 10:19 AM, tesm...@gmail.com tesm...@gmail.com
wrote:
Thanks Jan. I did the following:
1) Manually set the timezone of all the nodes using sudo
dpkg-reconfigure tzdata
2) Re-booted the
Thanks Jan,
I followed the link and rebooted the node.
Still no success.
Time on this node is about 13 minutes behind the other nodes. Any other
suggestions, please?
This node is working as my namenode.
On Thu, Feb 26, 2015 at 6:31 PM, Jan van Bemmelen j...@tokyoeye.net wrote:
Hi Tariq,
Hi Tariq,
So this is not really a Hadoop issue, but more a general Linux time question.
Here’s how to manually get the time synchronised:
/etc/init.d/ntp stop (or whatever way you prefer to kill ntpd)
ntpdate 0.centos.pool.ntp.org
This should sync time with the
Hi,
As per my understanding, we don't take backups of a Hadoop cluster as the size
is generally very large.
However, if somebody has dropped a table by mistake, how should
we recover the data?
How do we take backups of the individual Hadoop ecosystem components?
Thanks
Krish
Hey, Tariq:
Definitely this is the time problem.
But if this is not a production cluster, then to unblock your progress
you could set
yarn.resourcemanager.rm.container-allocation.expiry-interval-ms to a larger
number in your yarn-site.xml. The current default number is 60 which is
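For reference only, and not from the thread: a sketch of how such an override might
look in yarn-site.xml. The 1800000 ms (30 minute) value is just an illustrative
placeholder, not a recommendation.

<property>
  <name>yarn.resourcemanager.rm.container-allocation.expiry-interval-ms</name>
  <!-- Placeholder value: how long an allocated container may wait to be launched, in ms -->
  <value>1800000</value>
</property>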
There are several approaches. I would check the HDFS trash folder of the user
who deleted the file. Expiration of items in the trash is controlled by the
fs.trash.interval property in core-site.xml.
Artem Ervits
On Feb 26, 2015 1:31 PM, Krish Donald gotomyp...@gmail.com wrote:
Hi,
As per my understanding we
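Following up on the fs.trash.interval suggestion above, a sketch (not from the thread)
of enabling trash in core-site.xml; the 1440-minute (one day) retention is only an
example value.

<property>
  <name>fs.trash.interval</name>
  <!-- Minutes to keep deleted files in the user's .Trash directory; 0 disables trash -->
  <value>1440</value>
</property>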
Hi
If you are using the CapacityScheduler, can you try using the
DominantResourceCalculator, i.e. configuring the below property value in the
capacity-scheduler.xml file?
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
I am working with YarnClient for the first time. My goal is to get and
display the applications running on YARN using Java. My project setup is as
follows:
public static void main(String[] args) throws IOException, YarnException {
// Create yarnClient
YarnConfiguration conf = new YarnConfiguration();
Hi,
I have to run two kinds of applications: one requiring fewer cores but more
memory (Application_High_Mem) and another which requires more
cores but less memory (Application_High_Core).
I can use specific queues to submit them to, but that can lead to one node
contributing to only
A simple way to meet your goal: you can add the Hadoop jars into the project
classpath. I.e. if you have the Hadoop package, extract it and add all the jars
into the project classpath.
Then you change the Java code as below:
YarnConfiguration conf = new YarnConfiguration();
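For illustration only, a minimal sketch (not from the original thread) of the complete
program, assuming hadoop-yarn-client and its dependencies are on the classpath and
yarn-site.xml is available; the class name is just an example.

import java.io.IOException;
import java.util.List;

import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.exceptions.YarnException;

public class ListYarnApplications {
    public static void main(String[] args) throws IOException, YarnException {
        // Create and start the YARN client (reads yarn-site.xml from the classpath)
        YarnConfiguration conf = new YarnConfiguration();
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(conf);
        yarnClient.start();

        // Ask the ResourceManager for a report of all known applications and print them
        List<ApplicationReport> apps = yarnClient.getApplications();
        for (ApplicationReport app : apps) {
            System.out.println(app.getApplicationId() + "\t"
                    + app.getName() + "\t"
                    + app.getYarnApplicationState());
        }

        yarnClient.stop();
    }
}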
Hello,
I am trying to write a Hadoop program that handles JSON and hence wrote a
custom InputFormat to handle the data. The custom format's RecordReader
overrides the nextKeyValue() method.
However, this doesn't solve the problem when one JSON object is split
across two
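For illustration only (not from the thread): a minimal sketch that assumes the input is
line-delimited JSON, one object per line. LineRecordReader already reads past the end of
an input split to finish the current line and skips the partial first line of later
splits, so wrapping it sidesteps the object-split-across-splits problem for that layout.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.LineRecordReader;

public class JsonLineRecordReader extends RecordReader<LongWritable, Text> {
    private final LineRecordReader lineReader = new LineRecordReader();

    @Override
    public void initialize(InputSplit split, TaskAttemptContext context)
            throws IOException, InterruptedException {
        lineReader.initialize(split, context);
    }

    @Override
    public boolean nextKeyValue() throws IOException {
        // Each call advances to the next complete line, i.e. the next whole JSON object.
        return lineReader.nextKeyValue();
    }

    @Override
    public LongWritable getCurrentKey() {
        return lineReader.getCurrentKey();
    }

    @Override
    public Text getCurrentValue() {
        return lineReader.getCurrentValue();
    }

    @Override
    public float getProgress() throws IOException {
        return lineReader.getProgress();
    }

    @Override
    public void close() throws IOException {
        lineReader.close();
    }
}

If the JSON objects are pretty-printed across multiple lines, a different approach would
be needed, for example making the files non-splittable or scanning forward to the next
object boundary.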
Hi Xuan,
I applied the patch for YARN-3103 and the issue hasn't occurred since. Thanks
for your help! :)
Regards,
Rahul Chhiber
From: Xuan Gong [mailto:xg...@hortonworks.com]
Sent: Friday, February 06, 2015 5:53 AM
To: user@hadoop.apache.org
Subject: Re: Application Master fails due to Invalid
On a Kerberos-based Hadoop cluster, a kinit is done and then the oozie command
is executed. This works every time (thus no setup issues), except once it
failed with the following error.
Error: AUTHENTICATION : Could not authenticate, GSSException: No valid
credentials provided (Mechanism level: Generic