Re: Transferring security tokens to remote machines

2015-02-12 Thread Robert Metzger
Hi, thank you for the quick reply. I'll look into the links to see if we can implement a similar mechanism. Robert On Thu, Feb 12, 2015 at 6:19 PM, Alexander Alten-Lorenz wget.n...@gmail.com wrote: Hi Robert, forgive me if I’m wrong, but so far as I understand Flink uses nearly the same

Interview Questions asked for Hadoop Admin

2015-02-12 Thread Krish Donald
Hi, does anybody have interview questions that were asked during their interview for a Hadoop admin role? I found a few on the internet, but if somebody who has attended such an interview can give us an idea, that would be great. Thanks, Krish

Re: Neural Network in hadoop

2015-02-12 Thread Ted Dunning
That is a really old paper that basically pre-dates all of the recent important work in neural networks. You should look for works on Rectified Linear Units (ReLU), drop-out regularization, parameter servers (downpour sgd) and deep learning. Map-reduce as you have used it will not produce

Re: Interview Questions asked for Hadoop Admin

2015-02-12 Thread jay vyas
Hi Krish. I'm going to interpret this as "What is a real-world Hadoop project workload I can run to study for my upcoming job interview?" :) ... You could look here: https://github.com/apache/bigtop/tree/master/bigtop-bigpetstore/bigpetstore-mapreduce If you understand that application, you will do

Neural Network in hadoop

2015-02-12 Thread unmesha sreeveni
I am trying to implement a neural network in MapReduce. Apache Mahout refers to this paper: http://www.cs.stanford.edu/people/ang/papers/nips06-mapreducemulticore.pdf Neural Network (NN): We focus on backpropagation. By defining a network structure (we use a three layer network with two output

journal node shared edits directory should be present on HDFS or NAS or anything else?

2015-02-12 Thread Chandrashekhar Kotekar
Hi, I am trying to configure name node HA and I want to further configure automatic failover. I am confused about the '*dfs.namenode.shared.edits.dir*' configuration. The documentation says that the active name node writes to shared storage. I would like to know if this means that name nodes write it on

Re: journal node shared edits directory should be present on HDFS or NAS or anything else?

2015-02-12 Thread Chandrashekhar Kotekar
Hi Brahma Reddy, thanks for the quick answer. It explains a lot, but I have one more question. Maybe it is a stupid question, but does "required shared storage" mean that the active name node will write to its local disk? Do I need to configure or use any shared storage like a NAS or SAN array or S3 storage for

RE: journal node shared edits directory should be present on HDFS or NAS or anything else?

2015-02-12 Thread Brahma Reddy Battula
Hello Chandrashekhar, the active namenode will write to the required shared storage and will not write to HDFS. Please check the following docs for reference. When the shared storage is a JournalNode: <property><name>dfs.namenode.shared.edits.dir</name>

Re: Neural Network in hadoop

2015-02-12 Thread Alpha Bagus Sunggono
In my opinion: - This is just for 1 iteration. Batch gradient means finding all the deltas and then updating all the weights, so I think it is improper if each one has its weight updated. The weight update should happen after the reduce. - Backpropagation can be done after the reduce. - This iteration should be repeated and
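
Editor's sketch of the point made above (all deltas are found first, and the weights are updated only after the reduce). This is a toy single-process illustration, assuming a linear model with squared error in place of a full network; the function names are hypothetical and it is not Hadoop or Mahout code.

```python
from functools import reduce


def map_gradient(weights, record):
    """Map step: squared-error gradient contributed by one (features, target) record."""
    features, target = record
    prediction = sum(w * x for w, x in zip(weights, features))
    error = prediction - target
    return [error * x for x in features]


def reduce_gradients(a, b):
    """Reduce step: element-wise sum of two partial gradients."""
    return [x + y for x, y in zip(a, b)]


def batch_iteration(weights, records, learning_rate=0.1):
    """One iteration: map every record, reduce, then update the weights once."""
    partials = [map_gradient(weights, record) for record in records]
    total = reduce(reduce_gradients, partials)
    n = len(records)
    # The weights are untouched until the reduced (summed) gradient is known.
    return [w - learning_rate * g / n for w, g in zip(weights, total)]


def train(records, iterations=200):
    """Repeat the map/reduce/update cycle, one job per iteration."""
    weights = [0.0] * len(records[0][0])
    for _ in range(iterations):
        weights = batch_iteration(weights, records)
    return weights
```

In a real cluster each call to `batch_iteration` would correspond to one MapReduce job, with the updated weights shipped to the mappers of the next job.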

RE: hadoop cluster with non-uniform disk spec

2015-02-12 Thread Brahma Reddy Battula
Hello daemeon reiydelle. Is the policy set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy? Yes, you need to set this policy, which will balance among the disks. @Chen Song: the following settings control what percentage of new block allocations will be sent
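
Editor's sketch of how the policy named above could be written out as an hdfs-site.xml fragment. The property key `dfs.datanode.fsdataset.volume.choosing.policy` is the standard HDFS key for this class; the helper `render_properties` is hypothetical, and the tuning settings that the message goes on to list are cut off in this archive, so they are not reproduced here.

```python
import xml.etree.ElementTree as ET

# Policy class taken from the message above; the key is the standard HDFS property.
VOLUME_POLICY = {
    "dfs.datanode.fsdataset.volume.choosing.policy":
        "org.apache.hadoop.hdfs.server.datanode.fsdataset."
        "AvailableSpaceVolumeChoosingPolicy",
}


def render_properties(props):
    """Render a name -> value mapping as a Hadoop <configuration> element."""
    config = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(config, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(config, encoding="unicode")


if __name__ == "__main__":
    print(render_properties(VOLUME_POLICY))
```

The output would be merged into hdfs-site.xml on each DataNode, which would presumably need a restart to pick up the change.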

RE: Building for Windows

2015-02-12 Thread Kiran Kumar.M.R
Hi Lucio, you need to install any of the following on your build machine. These are one-time installs and do not need an internet connection. You can download the ISO of VS and install it. These tools are necessary to build the native C++ code in hadoop-common and hadoop-hdfs. 1. Windows 7.1 SDK (Along with

RE: Error with winutils.sln

2015-02-12 Thread Kiran Kumar.M.R
Hi Venkat, I checked the log file. Below is the particular error appearing at line 635 in the output.txt you attached: [ERROR] Could not find project to resume reactor build from: :hadoop-yarn-web-proxy vs [MavenProject: org.apache.hadoop:hadoop-main:2.6.0 @ D:\h\HADOOP~2.0-S\pom.xml, MavenProject:

installation with Ambari

2015-02-12 Thread Adaryl Bob Wakefield, MBA
I’m trying to set up a Hadoop cluster but Ambari is giving me issues. At the screen where it asks me to confirm hosts, I get: 1. A warning that I’m not inputting a fully qualified domain name. 2. The host that the Ambari instance is actually sitting on is not even registering. When run hostname

Re: installation with Ambari

2015-02-12 Thread Yusaku Sako
Hi Adaryl, Ambari expects FQDNs to be set on the hosts. On your hosts, you want to make sure that hostname -f returns the FQDN (with the domain name, like c6401.ambari.apache.org). Your /etc/hosts should look something like below (note that for each host, there's the FQDN followed by the short
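
Editor's sketch of a check for the /etc/hosts layout described above (FQDN listed first on each line). The helper names and the sample IP address are hypothetical; only the host name c6401.ambari.apache.org comes from the message.

```python
def parse_hosts(text):
    """Parse hosts-file text into (ip, [names]) tuples, skipping comments and blanks."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        entries.append((fields[0], fields[1:]))
    return entries


def is_loopback(ip):
    """Loopback entries such as 'localhost' are not expected to carry an FQDN."""
    return ip.startswith("127.") or ip == "::1"


def hosts_missing_fqdn(text):
    """Return the IPs whose first listed name has no domain part."""
    missing = []
    for ip, names in parse_hosts(text):
        if is_loopback(ip):
            continue
        if not names or "." not in names[0]:
            missing.append(ip)
    return missing


SAMPLE = """
127.0.0.1       localhost
192.168.64.101  c6401.ambari.apache.org c6401   # FQDN first: fine
192.168.64.102  c6402                           # short name only: flagged
"""

if __name__ == "__main__":
    print(hosts_missing_fqdn(SAMPLE))
```

A host flagged here is one where `hostname -f` is unlikely to return a fully qualified name.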

Re: installation with Ambari

2015-02-12 Thread Ted Yu
Looks like you may get a good answer from the Ambari mailing list: http://ambari.apache.org/mail-lists.html On Thu, Feb 12, 2015 at 9:24 PM, Adaryl Bob Wakefield, MBA adaryl.wakefi...@hotmail.com wrote: I’m trying to set up a Hadoop cluster but Ambari is giving me issues. At the screen where it

Re: Neural Network in hadoop

2015-02-12 Thread unmesha sreeveni
On Thu, Feb 12, 2015 at 4:13 PM, Alpha Bagus Sunggono bagusa...@gmail.com wrote: In my opinion: - This is just for 1 iteration. Batch gradient means finding all the deltas and then updating all the weights, so I think it is improper if each one has its weight updated. The weight update should happen after the reduce.

Re: FileSystem Vs ZKStateStore for RM recovery

2015-02-12 Thread Tsuyoshi Ozawa
I think ZooKeeper can handle thousands of updates, I meant thousands of updates per second. Thanks, - Tsuyoshi On Fri, Feb 13, 2015 at 3:59 PM, Tsuyoshi Ozawa oz...@apache.org wrote: Hi Suma, I think ZooKeeper can handle thousands of updates, so thousands of jobs can be launched at the

Re: installation with Ambari

2015-02-12 Thread Yusaku Sako
When setting up /etc/hosts, you can use whatever domain name you would like (just pick one arbitrarily). For example, host01.hadoop, host02.hadoop, etc., where hadoop is the chosen domain name. Yusaku From: MBA adaryl.wakefi...@hotmail.com Reply-To:

Re: installation with Ambari

2015-02-12 Thread Adaryl Bob Wakefield, MBA
This is turning into less about Ambari and more general computing. I’m trying to set up Hadoop on a home network. Not work, not on EC2; just a simple three node cluster in my personal computer lab. My machines don’t belong to a domain. Everything I read says that in this situation, the computer

Re: FileSystem Vs ZKStateStore for RM recovery

2015-02-12 Thread Tsuyoshi Ozawa
Hi Suma, I think ZooKeeper can handle thousands of updates, so thousands of jobs can be launched at the same time. More jobs can be running at the same time since the number of updates against ZooKeeper is less than the number of jobs. Please feel free to ask us if you face the scalability or

Transferring security tokens to remote machines

2015-02-12 Thread Robert Metzger
Hi, I'm a committer at the Apache Flink project. One of our users asked for support for reading from a secured HDFS cluster. Flink has a master-worker model. Since it's not really feasible for users to log in with their Kerberos credentials on all workers, I wanted to acquire the security

RE: journal node shared edits directory should be present on HDFS or NAS or anything else?

2015-02-12 Thread Brahma Reddy Battula
Hello Chandrashekhar, yes, you need to configure the shared storage (the active namenode writes to shared storage and the standby NN reads from it). Please check the following mail for configuration. Shared storage can be a JournalNode (which is one process and comes along with the Hadoop package, check following

RE: Failed to start datanode due to bind exception

2015-02-12 Thread Brahma Reddy Battula
Hello Rajesh, I think you might have configured dfs.domain.socket.path as /var/run/hdfs-sockets/datanode. Actually, this is a path to a UNIX domain socket that will be used for communication between the DataNode and local HDFS clients. If the string _PORT is present in this path, it will be

Re: Failed to start datanode due to bind exception

2015-02-12 Thread Alexander Alten-Lorenz
/var/run/hdfs-sockets has to have the right permissions. By default: 755 hdfs:hdfs. BR, Alexander On 10 Feb 2015, at 19:39, Rajesh Thallam rajesh.thal...@gmail.com wrote: There are no contents in the hdfs-sockets directory. Apache Hadoop base version is 2.5.0 (using CDH 5.3.0) On Tue, Feb
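
Editor's sketch of a quick check for the 755 mode mentioned above. The helper `mode_matches` is hypothetical; it looks only at the permission bits and does not verify the hdfs:hdfs ownership, which would need a lookup of the local user and group databases.

```python
import os
import stat


def mode_matches(path, expected=0o755):
    """Return True when the permission bits of `path` equal `expected`."""
    return stat.S_IMODE(os.stat(path).st_mode) == expected


if __name__ == "__main__":
    # Path taken from the message above; it may not exist on every machine.
    socket_dir = "/var/run/hdfs-sockets"
    if os.path.isdir(socket_dir):
        print(socket_dir, "mode is 755:", mode_matches(socket_dir))
    else:
        print(socket_dir, "does not exist")
```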

Re: commodity hardware

2015-02-12 Thread Gaurav Sharma
Indeed, another example would be the Dell r620. On Feb 12, 2015, at 08:51, Don Hilborn dhilb...@hortonworks.com wrote: Super Micro is a good example of commodity hardware. http://www.supermicro.com/index_home.cfm From: t...@bentzn.com t...@bentzn.com Sent: Thursday, February 12, 2015 10:50

Re: commodity hardware

2015-02-12 Thread Alexander Alten-Lorenz
Typically that term means standard hardware that should be present by default in an enterprise, without any extras like RAID, high-speed NICs, dual power supplies and so on. But that is changing more and more, since some new independent frameworks and tools are entering the market, like Spark, Kafka, Storm

Re: commodity hardware

2015-02-12 Thread William Temperley
I'd say hardware is commodity when it's purchased to maximize the performance-to-price ratio, as opposed to just going for optimum performance, which will always cost a boat-load. E.g. a 15000 RPM SAS drive is not commodity, but a 7200RPM SATA drive is. On 12 February 2015 at 17:45, Adaryl

Re: Transferring security tokens to remote machines

2015-02-12 Thread Alexander Alten-Lorenz
Hi Robert, forgive me if I’m wrong, but as far as I understand, Flink uses nearly the same model as HDFS (not at all). That means the master receives an action and distributes it to the workers (more or less ;)). HDFS, for example, does not use a push mechanism; the DN clients fetch the token from the NN

commodity hardware

2015-02-12 Thread Adaryl Wakefield
Does anybody have a good definition of commodity hardware? I'm having a hard time explaining it to people. I have no idea when a piece of HW is commodity or whatever the opposite of commodity is. B.

Re: commodity hardware

2015-02-12 Thread th
If you can buy it in a shop from a shelf somewhere it's 'commodity' :) /th -Original Message- From: Adaryl Wakefield adaryl.wakefi...@hotmail.com To: user@hadoop.apache.org Date: 12-02-2015 16:45 Subject: commodity hardware Does anybody have a good definition of commodity hardware? I'm

Re: commodity hardware

2015-02-12 Thread Don Hilborn
Super Micro is a good example of commodity hardware. http://www.supermicro.com/index_home.cfm From: t...@bentzn.com t...@bentzn.com Sent: Thursday, February 12, 2015 10:50 AM To: user@hadoop.apache.org Subject: Re: commodity hardware If you can buy it in a

Re: commodity hardware

2015-02-12 Thread Mathew Thomas
My idea of commodity: quad-core i5 or higher; 16 GB RAM; SSD hard drive (non-RAID, JBOD, size may vary); fast network. On Thu, Feb 12, 2015 at 11:50 AM, t...@bentzn.com wrote: If you can buy it in a shop from a shelf somewhere it's 'commodity' :) /th --

Interview Questions asked

2015-02-12 Thread Krish Donald
Hi, does anybody have interview questions that were asked during their interview on Hadoop? I found a few on the internet, but if somebody who has attended such an interview can give us an idea, that would be great. Thanks, Krish

Re: Interview Questions asked

2015-02-12 Thread Russell Jurney
Diagram/code a MapReduce join. On Thursday, February 12, 2015, Krish Donald gotomyp...@gmail.com wrote: Hi, does anybody have interview questions that were asked during their interview on Hadoop? I found a few on the internet, but if somebody who has attended such an interview can give us an idea ,
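
Editor's sketch of one possible answer to the exercise above: a reduce-side join, simulated in a single process. This is an assumption about what the question is after, not the poster's own solution; the function names, table tags and sample rows are hypothetical, and a real job would use Hadoop Mapper and Reducer classes instead.

```python
from collections import defaultdict


def map_tagged(table_name, rows, key_index):
    """Map step: emit (join_key, (table_tag, row)) for every input row."""
    for row in rows:
        yield row[key_index], (table_name, row)


def shuffle(pairs):
    """Shuffle step: group the mapped values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_join(key, tagged_rows, left_name, right_name):
    """Reduce step: pair every left row with every right row sharing the key."""
    left = [row for tag, row in tagged_rows if tag == left_name]
    right = [row for tag, row in tagged_rows if tag == right_name]
    for left_row in left:
        for right_row in right:
            yield key, left_row, right_row


def join(left_rows, right_rows, left_key=0, right_key=0):
    """Inner join of two row lists, built from the map/shuffle/reduce steps."""
    pairs = list(map_tagged("L", left_rows, left_key))
    pairs += list(map_tagged("R", right_rows, right_key))
    joined = []
    for key, values in sorted(shuffle(pairs).items()):
        joined.extend(reduce_join(key, values, "L", "R"))
    return joined
```

Tagging each row with its source table is what lets a single reducer tell the two sides apart once the shuffle has mixed them under one key.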